Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is a Master Inventor and Senior IT Specialist for the IBM System Storage product line at the
IBM Executive Briefing Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2011, Tony celebrated his 25th year anniversary with IBM Storage on the same day as the IBM's Centennial. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services. You can also follow him on Twitter @az990tony.
(Short URL for this blog: ibm.co/Pearson
Yesterday morning, the entire country of Colombia suffered their worst black-out (power outage) in 22 years. 98% of the country was out for 4 1/2 hours.This is just 5 months after an outage that hit 25% of the country, December 7, 2006.Ironically, this one happened the week I am here explaining the need for Business Continuity plans to IBM Business Partners from Argentina, Peru, Velenzuela, Ecuador and Colombia. As is oftenthe case, people often need a real example to recognize the need for planning is important.
It reminded me of the Northeast Black-out of 2003 that impacted USA and Canada. I was speaking to a crowd of 800 people at the SHARE conference in Washington D.C. when it happened, and hundreds of pagers and cell-phones went off all at the same time. Although we were outside the effected area and had plenty of lighting, we ended up canceling therest of my talk, and many people left immediately to help execute their business continuity plans.Of course, terrorism was immediately assumed, but a final report showed that it was initiated in Ohiodue to overgrown trees, and then propagated due to a software bug to hundreds of other plants.
According to this morning's Bogota newspaper, "El Tiempo", nobody knows the root cause of yesterday's outage. Immediately, the country's leftist rebels were blamed, but now the leading theory is that it was initiated byoperator error (a technician touching something he shouldn't have), and then propagated by a faulty distribution system.
Another example of the need for a robust and resilient infrastructure, and appropropriate business continuity plans.
Today,Apple and EMI announced that EMI’s entire music and video catalog will be available in May without any digital rights management (DRM) protection.Not only with the music be higher quality, but can be played on any player, presumably using MP3 format instead ofApple's proprietary AAC format. Being locked into any single vendor solution is undesirable. Similar issues abound for Microsoft Office 2007 file formats.
On my iPod, I ripped all my CDs into MP3 format, not AAC. I love my iPod, but if I ever decided to chose a different MP3 player, I did not want to go through the time-consuming process or re-ripping them again.
A blog by Seth Godin feels this Apple-EMI announcement means thatDRM is dead.
Back when music labels added value by producing and distributing music in physical form, it made sense for them to take a cut. Mass-producing CDs and distributing them out to music stores across the country costs lots of money. However, for online music, music labels don't have these same overhead costs, but continue the process of paying the artists only a few pennies per dollar. Some artists have file lawsuits to get their fair share.
This process applies to any published work. For example, you can purchase Kevin Kelly's book in various formats, at different prices, from different distributors. For example:
In PDF for $2, directly from the author via PayPal
black-and-white hardcover, for $20, from Amazon
color softcopy, for $30, from Lulu
Each nets the author $1.50 in royalties per copy. You can decide how much in production and distribution costs you want to pay.
This week (actually April 29 to May 2) is IBM'sPartnerWorld 2007 conference.Over the past 10 years, IBM's shift to rely more heavily on business partners has proven to be a smart decision. IBM Business Partners can often focus on a specific region or industry much better, with laser-like focus.
We had a great event today! This was a first-of-a-kind product launch, using Second Life as the medium. We invited IBM Business Partners, industry analysts and reporters from the Press to have their "avatars" in-world to watch us launch new tape systems, archive and retention systems, and disk systems announced this month.
Andy Monshaw, IBM System Storage General Manager, welcomed everyone to the event, and introduced our three speakers.He mentioned that this was a great innovative way to meet, collaborate and forge relationships without the carbon pollution associated with travel required by a more traditional face-to-face meeting. We had attendees from the USA, UK, Germany, Sweden, Italy, Colombia, and Brazil.
All the attendees were given a "goody bag" that contained IBM BP-logo clothing, animations and gestures to be used during the meeting.
Eric Buckley, one of our marketing managers for tape systems, introduced our complete line of LTO 4 tape systems, as wellas the TS7520 Virtualization Engine, a virtual tape library for Windows, UNIX and Linux servers. Eric had a virtual 3-Dversion of an LTO cartridge that is photo-realistic and dimensionally correct.
Funda Eceral, our solutions manager for archive and retention offerings, presented the new version of the IBM System Storage DR550, the DR550 file system gateway, and the IBM System Storage Multilevel Grid Archive Manager. At first we thought we would "pass the microphone" from speaker to speaker, but it turned out to be easier just to give all three speakers their own microphone.
Last, but not least, was David Tareen, marketing manager for disk systems, covering the entry-level DS3000 Express disk system bundles designed for our SMB client. David used a black-and-brown pointer stick to point out specific things on the charts.
After the presentations, Kristie Bell, VP of Marketing for IBM System Storage, hosted a Question & Answer (Q&A) panel.Avatars rose their left hand to indicate they had a question.
We thought it would be a good idea to have a few minutes at the end to socialize over a cup of coffee. This involved making a "coffee machine" that dispensed coffee, and the appropriate animations and gestures so that everyone could sip the coffee, and hold the coffee at waist level when they were talking.
The event was held upstairs in one of the conference rooms of the IBM Briefing Center, located on "IBM 8" island.Many people went to the ground floor to look at the many IBM System Storage products on display. Unlike a picture on a web-page, Second Life gives you a 3-D view that you can walk around each product, and get a feel for the size and shape of the hardware.
We had four photographers and camera-persons on hand to capture still shots, video, audio, and chat text, and are working now to combine them for marketing collateral. I want to thank the builders, script programmers, animators, clothing designers, speakers, editors, and channel enablement team for making this event such a great success!
IBM had some big announcements today. The theme for today's announcement was "Protected Information", as there are many reasons to protect your most strategic asset, your information. Let's do a quick run-down of a few of them.
IBM LTO generation 4
LTO 4 provides encryption at the drive level, and supports WORM cartridges similar to LTO 3. It continues the LTO consortium's strategy for higher capacity and faster performance. If you have LTO 1 or LTO 2, now is a good time to consider upgrading your tape technology. The combination of encryption and WORM protects your information against unauthorized access, and unethical tampering of the data. The support is from our largest automated tapelibrary (TS3500),to our smallest drives.
TS7520 Virtualization Engine
The TS7520replaces the TS7510, providing enhanced Virtual Tape Library (VTL) capability. When you hear "storage virtualization" you often think disk, but IBM invented "tape storage virtualization" and this product continues that leadership.
Support for Half-high LTO 3 drives
The TS3100 and TS3200 now support half-high LTO 3 drives, which means you can have twice the number of drives in each unit. LTO 4 drives can read and write to LTO 3 media, so this provides additional investment protection.
IBM System Storage DR550 File System Gateway
This new offering provides much-needed CIFS and NFS access to the DR550, the worlds most flexible compliance-and-retention storage available. Already there is a large body of ISVs that support the DR550 today, and with this new gateway, the list is even longer. The DR550 provides encryption for both disk and tape data, as well as policy-based non-erasable, non-rewriteable enforcement, designed for compliance with government regulations like Sarbanes-Oxley Act, HIPAA, and many others.
IBM System Storage SAN32B-3 switch
This is the first major deliverable from Brocade since their acquisition of McDATA. A powerful switch packs 4 Gbps support in a small 1U form factor. Start with 16 ports, then add in increments of 8 ports to a maximum of 32 ports.
I've provided all the links, so that you can delve deeply into all the data sheets.
It is perhaps coincidence that I learned that two people have blogs today.
Dinh Phan is an IBM Field Technical Sales Specialist (FTSS) for IBM System Storage for the Western region of United States. He contributes to the Solsie.com blog from Costa Mesa, California. While this blog is focused on Mobile technology, Dinh has told me he plans to contribute postings about storage. One posting I found heart-warming was about IBM's historic donation of technology.Find his blog entries at:http://solsie.com/
Most businesses in Latin America would be considered "Small and Medium-size" businesses, which we shorten to SMB, but in some places is shortened to SME for "Small and Medium sized Enterprises." The problem with SME is that we often use this to refer to "subject-matter experts," so it can be confusing.
The problem with many acronyms is that in other countries, the letters are re-arranged, based on the syntax of the language.ISO is actually the International Organization for Standards.
Today, we learned about PYME. In Spanish, this stands for pequeñas y medianas empresas, which is literally "small" and "medium" businesses. Of course, most of my colleagues had not recognized PYME, and most of the people we talked to did not understand SMB. Once we equated one to the other, things went smoothly.
For those not familiar with Latin America, I suggest the movieRomancing The Stone, starring Michael Douglas and Kathleen Turner.
SNW wrapped up Thursday. As is often the case, a lot of people have left already.
I saw two presentations worth discussing here in this blog.
Angus MacDonald, CEO of Mathon Systems,presented "Litigation Readiness: How prepared are you for the demands of eDiscovery?"
The process of eDiscovery is to take a large volume of data and get the small bits of relevance, as it relatesto a case, investigation or litigation. In 2004, there were 64 billion emails per day, and this is expected to be 103 billion by 2008. There are growing concerns about the "spoliation" of evidence, which I thought was a typo,until I looked it up. He encouraged everyone to check out the Electronic Discovery Reference Model, which is trying to standardize the wayIT and legal communication with each other.
The problem is often miscommunication over semantics and terminology. For example, in eDiscovery, the term"production" describes the delivery of relevant documents to a judge or opposing party. This may involve printingthem out on paper, delivering them electronically in their original format, or converting to a more standardelectronic format like Adobe PDF. The judge or opposing party reserves the right to request how they want thedocuments produced. Of course, in any format other than the original format, authenticity needs to be affirmed.
He gave two example lawsuits related to this.
In Zubulake v. UBS Warburg, Zubulake was awarded $29 million because UBS stored old emails on backup tapes, rather than an archiving system, and could not locate seven of these backup tapes. This is not the first time I have seen some IT department, or some legal department, think that keeping backups of email repositories for many years is the same as keeping an "archive".
In Coleman Holdings v. Morgan Stanley, Coleman was awarded $1.45 billion because the judge felt that Morgan Stanley failed to do proper eDiscovery. This was after they tried to reconstruct their email system from 5000 old backup tapes.
Angus suggests identifying the types of documents most often requested, and start planning from there.In an interesting twist, the CEO/CFO/CIO might go to jail if the IT department doesn't do something correctly, so perhaps IT managers will now get the respect/funding/technology they need to get the job done.
Bruce Kornfeld, Compellent Technologies, presented "Building Systems that Scale: Imagining the one Petabyte per Admin management ratio."
Bruce did a good job staying generic, and not mentioning his company's products too much. Specifically, Compellentmakes a frame similar to what IBM used to call the "SAN Integration Server". Back in 2003, IBM introduced the SAN Volume Controller, which had no disk, and the "SAN Integration Server" which had controller + disk. What IBM learned was that customers prefer the diskless model, minimizing the amount of disk that has to be purchased from the original vendor, and instead opting to have the freedom to choose any vendor they like for the managed capacity.
An interesting feature of the Compellent solution is that they chop up the virtual disk into 2MB pieces, and allow these pieces to be moved automatically from high-speed (FC) to low-speed (SATA) disk, based on their reference frequency. This is similar to HSM, but at the block level, rather than the file level.
Every advantage Bruce listed for his box already exists from IBM: improved capacity planning, improved performance, ease of data migration, flexible volumes, and a single pane of glass GUI administration tool.
Perhaps more interesting were the questions from the audience:
Q1. Do you have any customers that have 1PB of your solution? No, we have several in the 200-500TB range.
Q2. You only have a single two-node cluster, can we have more clusters? No, that is all we support, but if you need that you would have to go to one of the major storage vendors (like IBM).
Q3. Do we have to buy Compellent storage to go with the Compellent controllers? Yes, it is designed so it is an integrated solution. If you need to virtualize your existing storage, you have to go to one of the major storage vendors (like IBM).
Q4. Having data migrate automatically from FC to SATA behind the scenes lowers performance and raises the risk of disk failure? Our box is designed for inactive data, so performance is not an issue.
Q5. How do you protect against double-disk failures? We don't, and these would be even more detrimental to our solution than traditional solutions. Other vendors offer RAID6, but we don't have that yet.
It was a fun week, and good to see people I have communicated with, but never met in person.
Continuing my coverage of SNW Spring 2007, Ron and Vincent kicked off Wednesday main tent sessions with more survey questions:
Q1. How secure is your storage network?
27% Redundant, 100% able to withstand physical failures
28% Able to withstand hackers, but not physical failures
37% Weak on both fronts
Q2. What was the cause of most downtime in last 12 months?
1% Natural disasters
13% Network outages
14% Server failures
9% Telecom provider outage
22% IT resource upgrades
33% Human error
Thornton May, futurist and columnist for ComputerWorld, presented "Storage 3.0: What Comes After, What Comes Next."I have seen several "futurists" present at conferences like this. They all feel the need to explain what their job is, and what it takes to be one. This time, Thornton indicated he was "ridiculously well-travelled, amazingly well-connected, pathologically observant, and brutally honest." His insights:
At current rates, in 15 years every molecule on earth will have its own IP address.
"What's NOT good enough changes." -- Clayton Christensen
Gabriel Broner, General Manager of the newly created "Storage Solutions" division of Microsoft, presented "The Drive to Unified Storage". The people sitting around me asked "What does Microsoft have to do with storage?" He defined "Unified Storage" the way we use it for IBM Sytstem Storage N series "a storage unit that provides both file and block level protocol support." Microsoft is using "e-mail" as the model for data access, identifying the need to have "off-line" copies on your PC or laptop that are synced up with "on-line" sources. Features that were typically only available for high-end applications are now being made available to the masses, like "Volume Snapshot" capability in Windows Vista. On the home front, Microsoft recognizes that typically one person acts as the "IT manager" for the family.
Their survey of storage spend of Fortune 1000 companies. It was not clear if this was for Windows environments, or how the data was collected. These numbers don't match what we hear from our UNIX or mainframe customers.
Microsoft is implementing application changes, such as Office 2007, to simplify storage issues. Storage virtualization is the key for the future, he says, stating that Microsoft's "iSCSI target" software support makes files look like block-oriented volumes. Virtualization is now mainstream, and deploying software on standard hardware is the new storage business model. The end goal is to simplify provisioning, device and resource management, without reducing functionality, narrowing the gap between general IT tasks and specific storage tasks.
Craig Lau, NBC Olympic coverage, presented their success story. Look at the number of "hours" of TV Olympic coverage over the years:
1996 Atlanta -- 175 hours
2000 Sydney -- 441 hours
2004 Athens -- 1210 hours
NBC now is able to deliver 70 hours of TV programs per day, shown across their seven channels (NBC, CNBC, MSNBC, Brave, USA Network, Telemundo, and HD-tv). The Olympics in Torino, Italy generated 25,000 tapes in 17 days. Their 100,000 tape Olympic repository is starting to deteriorate, and they need to consider conversion to digital format. Their challenge was that footage was difficult to find and producers needed immediate access to time sensitive/critical content.
Their solution was Digital Asset Management, automating indexing and logging, using an IP-based workflows that reduces the number of people at the Olympics location, and allowing content to be sent back to USA for remote editing.The facilities at Torino involved:
2850 people, most hired just the week prior to the Olympic event
250TB of disk storage
135 High-Definition cameras
212 Video Tape Recorders
4000 hours of content on 1700 tapes
NBC is frustrated by the lack of compatability and interoperability in the video format industry. They have been testing MPEG-1 (1.5 Mbps) formats, and plan to deploy a new system using 1080i for the upcoming 2008 Olympics in Beijing. With the new system, they can index footage by athlete, by event, and by human emotional reaction. They can review and edit footage within 30-45 seconds of live coverage, allowing rough edits to be documented as "Edit Decision Lists" that can be e-mailed or put on USB key for others to review.
Although I missed Anil Gupta's "Blogger Event" on Monday, several bloggers did stop by to visit me at the IBMbooth.
I survived my first day at SNW Spring 2007.This is my first time at SNW, but it is very much like many of the other conferences I have been to.It officially started Monday morning with pre-conferencetutorials and primer break-outsessions that covered storage fundamentals, but I didn't arrive until late Monday night due to highwind conditions at the Phoenix airport that delayed my travel.
Tuesday started out with main tent sessions. Ron Milton, VP of ComputerWorld that puts on this conference,and Vincent Franceschini, Chairman of the Board for SNIA, kicked off the event.It didn't take them long to get into the alphabet soup: ILM, ITIL, SMI-S, XAM, IMA, MMA, DDF,MF, DMF, IPSF, SSIF, and SRM.Several hundred people had "voting devices" so that they could participate in "informal" surveys.
Q1. What was the greatest need?
37% Storage Resource Management (SRM) tools
19% Storage Virtualization
19% Information Lifecycle Management (ILM)
14% Integration with other management tools
11% Compliance storage for regulations
Q2. What are people doing to address storage infrastructure complexity?
33% Deploying new SRM and SAN management tools
26% Adopting "Storage as a Service" methodology
22% Deploying new storage virtualization technologies
8% Hiring more staff
9% (complexity was not an issue)
The first keynote speaker was Cora Carmody, CIO of SAIC. In the late 1980s and early 1990s, I did a lot of work with SAIC here in San Diego, and so IBM sent me to San Diego quite frequentlyfor face-to-face meetings with them. Her talk was cryptically titled "Jumbo Shrimp, InformationManagement, and the Mark of the Beast." Coming up with good titles is important. Some of herkey points:
"Information management" was as much an oxymoron as "jumbo shrimp" or "military intelligence".(SAIC is a general contractor for the US Military, so this was especially funny).
Computer data needs both "ownership" and "stewardship".
Gartner analyst reports that 50% of digital information for a business resides in personal files onindividual PCs.
PAN-StaRRs project is ingesting 10TB per week of astronomical data.
TeraTEXT(R) project is a non-relational database that supports a large mix of structured and unstructured content.
The next "Y2K" crisis for the USA is changing from 3-digit to 4-digit area codes for our telephone numbers.
Battery size and life have not advanced as fast as we need
There has been little progress in "User Interface" ease of use
Formats and standards are picked for the most part by the winning vendors, and it is the silence of themarketplace that lets them get away with this.
We are overly reliant on an inherently insecure medium.
The "mark of the beast" refers to exciting new technologies based on "presence awareness". For example,some hotels now are able to check you into the hotel as you drive up in your car, based on your car's licenseplate. Some 24-hour gyms use your fingerprint as your entry credentials, eliminating the need to staff peopleat the front desk.
IBM's own Barry Rudolph, presented "Storage in an Age of Inconvenient Truths", dressed up like Oscar-winner andformer USA Vice President Al Gore. Barry's focus was on the growingconcern of over environmental Power and Cooling issues in the data center. According to IDC, the cost of power and cooling an individual server, over its lifetime, now exceeds its acquisition cost. Storage devices are not as bad as servers in this regard. Data centers now consume 1.2% of the worlds energy.
Over lunch, I heard Tony Asaro from ESG present "The Need for Highly Virtualized Storage Systems withina Virtualized Data Center." His concern is that there is still a "heavy touch" required to manage storage.Without virtualization, your data center is less than the sum of its parts. Although IBM has been doingstorage virtualization since 1974, Tony mentioned that most storage vendors were "late to the party".He argues that "internal virtualization" inside storage arrays is not enough, you need "external virtualization"(like the IBM System Storage SAN Volume Controller) to virtualize your entire infrastructure.What storage administrators would like is for storage to have consumer levels of "ease of use", and today'snon-virtualized storage environments are nowhere near that.
"The great advantage [the telephone] possesses over every other form of electrical apparatus consists in the fact that it requires no skill to operate the instrument." - Alexander Graham Bell, 1878
I attended a few break-out sessions in the afternoon.
Ralph presented "Crisis of Capacity" which covered the drastic actions he had to take to handle power and coolingin their expanding data center during their summer months, where temperatures peak up to 105 degrees. This included creating "hot" and "cold" aisles onhis raised floor by re-organizing the perforated floor tiles, and doing a better job standardizing how cables areconnected to the back of racks and up through the ceiling to maximize airflow. An amp-meter on each power strip was used to measure the powerused at each rack, which allowed them to better prioritize their efforts. Their Air Conditioning unit was only 12inches from the concrete floor, and raising it to 18 inches greatly reduced noise and vibration. Adding a second AC unit made a world of difference. Finally, they eliminatedKVMs, because people who use KVMs break other parts of thedata center. His rule of thumb: the cooling requirements will be 50% of the rated power requirements for equipment.
Terry Yoshi, Intel internal IT department, as a member of the SNIA's end user council
Terry presented "Taming the SAN Complexity". The problem with "complexity" as a concept is that it is very subjective, difficult to quantify, and therefore difficult to manage. He presented complexity in four areas:Organizational structure of the company as a whole; skill sets required of the IT staff; business process andprocedures; and technology. Dealing with complexity is a battle between Old School (because we've always doneit this way) and New School (because it is new and different technology). Storage Area Networks are inherentlya "shared resource", and the increased complexity is a direct result of the low reliability of the componentsand devices it is composed of. People should focus on the "Total Cost of Ownership" (TCO) for a SAN, and not just the initial acquisitionprice of SAN gear.He was not a fan of the "dual/multiple" vendor strategy that many companies employto reduce costs. His suggestion that things should be tried out first on your "test SAN" caused some chuckles,as few have such a thing. Finally, he suggested not only documenting "Best Practices" and "Best Known Methods"but also things that have been found not to work, his do-not-try-this-at-home list.
Tony Antony, Cisco marketing manager for Optical products
This was an overview of the technologies available for long distance connections for disaster recovery,business continuity, and resilience. He covered three levels.
IP - Fibre Channel of IP (FCIP) offers the greatest "global" distance but forces people into asynchronous mirroring.
SONET/SDH - SONET is what we call it in the USA, and SDH is what it is called in other countries. This provides state-to-state or "out-of-region" distances, which is ideal to meet certain government regulations for homeland defense. He suggests this is offered when dark fiber or DWDM is not available.
DWDM/CWDM - this is using a prism to run multiple colors of light through a single fiber optic cable. CWDM ischeaper, but only handles 8 signals per cable. DWDM can handle 32 to 160 signals per cable, but is more expensive.
His rule of thumb: one buffer credit for every kilometer at 2Gbps speed (for every 2km at 1Gbps).
The day ended at the "Expo". I hung out at the IBM booth to help answer questions and network with others.
Last year in Beijing, China, one of my colleagues told me "When it rains here, cabs dry up". Normally, there are enough taxi cabs to handle normal conditions, but when it rains, people who normally walk now want to take a cab instead, and the demand goes up, resulting in being more difficult to find one when you need one.
I'm wrapping up my week here in Chicago, and it snowed yesterday. Cabs were scarce. I walked. Many others walked too, about half with umbrellas to protect themselves against the snowflakes.
Most systems are designed to handle typical average conditions. Taxi cabs in a city, for example, handle typicalamounts of traffic.
IT is different. In many cases, IT infrastructures are designed for the peaks, not the averages. Peaks can be where you need performance the most, and failure to design for peaks can be disastrous. As with any business decision, this represents a trade-off. Design for the average, and suffer through the peaks, or design for the peak, and be over-allocated and under-utilized most of the time otherwise.
Yesterday, I went to the Bodyworlds exhibition. Here the anatomy of real human cadavers are on display, in full detail, thanks to a process calledPlastination.This was a great way to present anatomy in a 3-D visual way that can be easily understood and appreciated.I was glad to see so many children were there, although I warn parents that some sections of the exhibit maybe a bit shocking. I heard people speaking French and German, and it was great that anyone can be fascinatedby the human body, without having to read or understand English.
In the exhibit, you got to see the bones, nerves, muscles, digestive tract and other organs.Some in action poses, like swinging a baseball bat or ice skating, while others were stretched into specific poses to help emphasize one part or another.
In some cases, they would show side by side healthy and unhealthy organs, for example, the lungs of someone that smokes tobacco cigarettes, compared to the lungs of a normal person. Quite a difference!
Visualization can be an effective way to understand and gain insight from information. Presenting information in a visually stunning manner can be challenging, but often worth the effort. It reminded me of Edward Tufte, who has written several books on this subject.
The concept that there should be a linear "Storage Administrators per TB" rule-of-thumb has been around for a while.Back in 1992, I went to visit a customer in Germany who had FIVE storage admins for 90 GB (yes, GB, not TB) disk array.I told them they only needed 3 admins, but they cited German laws that prohibited "overtime" work on evenings and weekends.
Later, in 1996, I visited an insurance company in Ohio to talk about IBM Tivoli Storage Manager. They had TWO admins to manage 7TB on their mainframe, and another 45 people managing the 7TB across their distributed systems running Linux, UNIX, and Windows. My first question, why TWO? Only one would be needed for the mainframe, but they responded that they back each other up when one takes a 2-week vacation. My second question to the rest of the audience was... "When was the last time you guys took a 2-week vacation?"
Today, admins manage many TBs of storage. But TBs are turning out not to be a fair ruler to estimate the number of admins you need. It's a moving target, and other factors have more influence that sheer quantity of data.Let's take a look at some of those factors, which we call "the three V's":
Variety of information types
In the beginning, there were just flat text files. In today's world, we have structured databases, semi-structured e-mail systems, hypertext documents, composite applications, audio and video formats that require streaming, and so on. Variety adds to the complexity of the environment. Different data requires different treatment, different handling, and perhaps even different storage technologies.
Volume of data
Data on disk and tape is growing 60% year on year. It's growing on paper also. It's growing on film like photos and X-rays. The problem is not the amount, but the rate of growth. Imagine if population and traffic in your city or town increased 60% in one year, most likely people would suffer because most governments just aren't prepared for that level of growth.
Velocity of change
Back in the 1950's and 1960's, people only had to make updates once a year, scheduling time during holidays. Now, people are making changes every month, sometimes every weekend. One customer we spoke with recently said they do about 8000 changes PER WEEKEND!
So, the key is that there is no simple rule-of-thumb. Fewer admins are need per TB on mainframe than distributed systems data. Fewer admins per TB are needed when you deploy productivity software, like IBM TotalStorage Productivity Center. Fewer admins per TB are needed when you deploy storage virtualization, like IBM SAN Volume Controller or IBM virtual tape libraries.
The "corporate bloggers" from the various storage vendors often mention their opinions about IBM products. Sometimes, they say something nice, and other times they poke fun. It's good to read the various opinions. Most are well-thought and well-written.
EMC blogger Chuck Hollis has a post about the various categories that industry analyst IDC used for external controller-based disk in their most recentQ4 Storage Scorecard.I agree with Chuck that it is good to have independent analysts that take an objective look across all storage vendors to provide the facts on various makes and models. Both IBM and EMC took marketshare in 4Q, so we cancongratulate ourselves and each other for the efforts needed to make this happen.
Chuck mentions that while EMC and HDS high-end boxes are similar, perhaps IBM's "DS" series is different enough to question putting it in the same "high-end" category. It's not clear if Chuck is poking fun at the fact that theIBM DS family spans multiple categories; or an admission thatthe IBM DS8300 Turbo is faster than the EMC DMX-3 and HDS USP offerings. Perhaps we need a new categorycalled "super high-end"?
IDC doesn't publish their data by price band, but we can infer from the products in each how they decidedwhich products were grouped into which categories. Let's examine the entire IBM DS family in the various categories.
Our newest offering is the IBM System Storage DS3000 series. Some analysts call this category "low end", but IBM prefers using "entry level". These have an attractivelow acquisition price, very easy to set up, and are intended for the Intel and AMD servers, such as IBMBladeCenter, System x, as well as servers from HP and Dell. Disk arrays in this category typically have listprices below $50,000 USD.
Our midrange offering is the IBM System Storage DS4000 series. These are designed for Linux, UNIX and Windows based workloads.Some call these server platforms "open systems", or sometimes "distributed systems". The DS4000 systems are rack-optimized modularunits, providing plenty of options and trade-offs between price and performance for price-sensitive customers.The "high end" model of the DS4000 series is the DS4800, and has very impressive performance characteristics.Disk arrays in this category typically have list prices in the $50,000 to $299,000 USD range.
IBM System Storage DS6000 seriesis one of our enterprise class offerings. DS6000 offers mainframe attachment comparable to what EMC DMX or HDS USP offer for their "enterprise class" or "high end" models, but uses substantially less power and in a much more compact modular rack-optimized packaging. Disk arrays in this category typically have list prices at $300,000 USD and above.
Super High End
Perhaps IBM and EMC can work together to petition IDC to adopt this as a new category, based on performance,rather than list price. Is the storage marketplace ready for a fourth category?As Chuck mentioned on his blog, IBM is #1 for mainframe disk storage, and perhaps it is because the IBM System Storage DS8000 Turbo series does so well on most mainframe workloads. No offering from EMC or HDS meet or beat the SPC benchmarks for the DS8000 Turbo. You can see the results in the Executive Summary or read the Full Report.
Thanks to IBM's innovative Adaptive ReplacementCache algorithm, IBM DS8000 performance shines best handling read-intensive random-access workloads that mainframes do most often. These types of workload are modeled by the SPC-1 benchmark. In cases of write-intensive, sequential processing, the differences are less substantial, as disk arrays from all manufacturers drop down to the native performance capabilities of the 10K and 15K RPM drives.
I'll give you a real example. Not long ago, I waspart of a team to help resolve a performance bottleneck on-site at the customer location. The customer had an interesting "composite application" where data was processed on AIX platform (IBM System p), which passed the data to a Linux partition running on the IBM System z mainframe,which in turn used Java SQL to post updates to a DB2 database on z/OS partition, which then wrote out through FICON adapters to an HDS USP device. IBM and HDS worked together to help the customer figure out why they weregetting disappointing throughput and response times. IBM brought in experts on AIX, TCP/IP, Java, Linux, z/OS and FICON. HDS had their experts too, and tried to improve performance by quadrupling the storage capacity, and spreading the data out across more spindles. That didn't work. As it turns out, HDS disk just couldn't deliver the performance required. The software and mainframe were all well tuned. They replaced the HDS withan IBM DS8000 array, and it met all the service level requirements. Problem solved.
The problem with having this new "super high end" category, of course, is that only IBM plays in it, so it wouldn'toffer the marketplace much of a comparison. For now, we'll just have to settle for being the fastest in the samecategory as EMC DMX and HDS USP.
Storage is a competitive marketplace.Both EMC and HDS are reputable companies that make quality products that attach to IBM System z mainframe servers. Not all workloads are mission-critical or performance-sensitive. For less critical workloads, perhaps you may find EMC or HDS performance is "good enough".
But if performance is important to you, you should consider IBM on your list of vendors for your next purchase decision. Let IBM help you prove it to yourself, running your specific workloads side by side with your existing equipment.
The terms "information" and "data" are often used interchangeably in regular usage, but for the storageindustry, there are significant differences between the two, as different as "fact" from "meaning".
For example, if you are walking down the street, and see a pole with red and white stripes, the data of red and white stripes may not have much meaning, unless you recognize the information is that you are in front of a barber shop.I thought of this when someone pointed me to theStrip Generator Tool website, which can helpyou generate various stripes for use on the tiled background of web pages. (Or if you aredesigning neckties for your Second Life avatar).
Many national flags are based on simple stripes of different colors.For example, look at the national flags of France, Russia, and the Netherlands. These consist of a red, white, and blue stripe, justin different sequence and orientation.Again, the data of these colors, the width of their lines, and the way they are placed on the flag are all data, but the information they convey is significantly more than that.One person might walk right by the flag, not knowing which country it belongs to, while anotherperson might get emotional memories of their homeland.
For those of us in the storage industry, data is just binary 1's and 0's on disk and tape media, and canbe treated like packages at the post office in brown wrapping paper. Just as post office employees don't have to know the contents to ship them to the final destination, servers and storage devices don't need to knowthe informational content of the data that they process and store.
Converting information to data is easy. Let's take an example of taking a digital photo. The photo could be a picture of you and your spouseon your last vacation trip, but you would never know that from just looking at a series of 1's and 0's. For this reason, you create photo albums, you write captions below indicating where and when the photowas taken. This additional "context" is often called "metadata" or just simply "indexing".
Both the information captured (the photo in this case) and its metadata (the caption), can be storedas 1's and 0's on storage media. These bits can be compressed, encrypted, or represented in a variety of formats.
Information is copied from one data file to another. In the traditional sense, one piece of informationcould exist in the primary production copy, as well as multiple archive or backup copies. One piece ofinformation, stored on multiple copies of data. In a sense, this is similar to genetic information storedon each human being (data copy). Richard Dawkins, author of The Selfish Gene, reminds us that genes outlive individual humans. In storage, we remind people that data outlivesthe media it is initally written to, and the information outlives the initial data copy stored.
Converting data back to information is not always as simple.Not all sequences of 1's and 0's are obvious what they represent. To display a digital photo, you need to know the format the photo is in, and have an appropriate application that can display it back to something a human person can recognize. If the bits were compressed, the application needs to handlethat, or you need to de-compress the data before handing it to the application. For encrypted data,you need to have the decryption key. The process of converting a single file of data back to information is called "rendering".
One of the big problems with keeping information for long periods of time, isthat you may not have the equipment, decryption key, or applications needed to render the data back to usable information. You've kept the data, but you can't make any sense of it, as if it went through an episode of Will it Blend?
A good example is how the current version of Microsoft Office application is unable to interpret andrender data documents that were stored in WORD 1.0 format. IBM and others have developed "rendering tools" that can help decipher the bits, and bring back the information. To help address this challenge, the new Microsoft Office 2007 haschosen the OOXML format, but will continue to support some of the older legacy formats. IBM and the rest of the world are focused instead on Open Document Format (ODF) open standard. Those of usstill using older versions of Microsoft Office might need the Office 2007 Compatibility Pack.
Another way to get information from data is "data mining", an important part of "business intelligence". Here you are gleaning information notfrom individual details, but from patterns in the data, averages, statistics, totals, that havebroader meaning than individual transactions or events.
For many applications, DLM is just fine. Let's consider e-mail, for example. For most employees,deleting e-mails larger than 1 MB, after 90 days, regardless of content, is probably a reasonable DLM policy. All data is treated the same, based purely on the size and date markings on the outer brown wrapper.
For more sensitive content, DLM is not enough. The e-mails that are to or from the president of thecompany, or between top executives, or that contain certain pieces of information relevant for lawsuitsor other investigations, may not be treatedthe same as other e-mails. In this case, you need ILM technologies, managing based on the informational content of the data, and not just the size and date last referenced.
Of course, IBM supports both, and can help you decide the right solution for each workload.