Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Systems Client Experience Center in Tucson, Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
For those of us in the northern hemisphere, yesterday was this year's Winter Solstice, representing the shortest amount of daylight between sunrise and sunset. So today, I thought I would blog on my thoughts of managing scarcity.
Earlier in my career, I had the pleasure to serve as "administrative assistant" to Nora Denzel for the week at a storage conference. My job was to make her look good at the conference, which, if you know Nora, doesn't take much. Later, she left IBM to work at HP, and I got to hear her speak at a conference. The one thing that I remember most was her statement that the whole point of "management" was to manage scarcity, as in not enough money in the budget, not enough people to implement change, or not enough resources to accomplish a task. (Nora, I have no idea where you are today, so if you are reading this, send me a note.)
Of course, the flip-side to this is that resources that are in abundance are generally taken for granted. Priorities are focused on what is most scarce. Let's examine some of the resources involved in an IT storage environment:
Capacity - while everyone complains that they are "running out of space", the truth is that most external disk attached to Linux, UNIX, or Windows systems contains only 20-40% data. Many years ago, I visited an insurance company to talk about a new product called IBM Tivoli Storage Manager. This company had 7TB of disk on their mainframe, and another 7TB of disk scattered on various UNIX and Windows machines. In the room were TWO storage admins for the mainframe, and 45 storage admins for the distributed systems. My first question was "why so many people for the mainframe, certainly one of you could manage all of it yourself, perhaps on Wednesday afternoons?" Their response was that they acted as each other's backup, in case one goes on vacation for two weeks. My follow-up question to the rest of the audience was: "When was the last time you took two weeks vacation?" Mainframes fill their disk and tape storage comfortably to 80-90% full of data, primarily because they have a more mature, robust set of management software, like DFSMS.
Labor - by this I mean skilled labor able to manage storage for a corporation. Some companies I have visited keep their new-hires off production systems for the first two years, working only on test or development systems until then. Of course, labor is more expensive in some countries than others. Last year, I was doing a whiteboard session on-site for a client in China, and the last dry-erase pen ran out of ink. I asked for another pen, and they instead sent someone to go re-fill it. I asked wouldn't it be cheaper just to buy another pen, and they said "No, labor is cheap, but ink is expensive." Despite this, China does complain that there is a shortage of a skilled IT labor force, so if you are looking for a job, start learning Mandarin.
Power and Cooling - Most data centers are located on raised floors, with large trunks of electrical power and huge air conditioning systems to deal with all the heat generated from each machine. I have visited the data centers of clients that are forced now to make decisions on storage based on power and cooling consumption, because the costs to upgrade their aging buildings are too high. Leading the charge is IBM, with technology advancements in chips, cards, and complete systems that use less power, and generate less heat. While energy is still fairly cheap in the grand scheme of things, fears of Global Warming and declining oil supplies have put the costs of power and cooling in the news lately. In 1956, Hubbert predicted the US would reach peak oil supplies by 1965-1970 (it happened in 1971), and this year Simmons estimated that world-wide oil production began its decline already in 2005. Smart companies like Google have moved their server farms to places like Oregon in the Pacific Northwest for cheaper hydroelectric power.
Bandwidth - Last year IBM introduced 4Gbps Fibre Channel and FICON SAN networking gear, along with the servers and storage needed to complete the solution. 4Gbps equates to about 400 MB/sec in data throughput. By comparison, iSCSI is typically run on 1Gbps Ethernet, but has so much overhead that you only get about 80 MB/sec. Next year, we may see both 8 Gbps SAN, and 10 GbE iSCSI, to provide 800 MB/sec throughputs. My experience is that the SAN is not the bottleneck; instead, people run out of bandwidth at the server or storage end first. They may not have a million dollars to buy the fastest IBM System p5 servers, or may not have enough host adapters at the storage system end.
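To see where the bandwidth figures above come from, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and the 10-bits-per-byte figure assumes the 8b/10b line coding used on Fibre Channel links of this generation; the 80 MB/sec iSCSI number is simply the observed figure quoted above, not something the code derives.

```python
def fc_throughput_mb_s(line_rate_gbps, bits_per_byte_on_wire=10):
    """Approximate payload throughput of a Fibre Channel link.

    Assumes 8b/10b line coding, so every data byte costs 10 bits on the wire.
    """
    return line_rate_gbps * 1000 / bits_per_byte_on_wire


def link_efficiency(observed_mb_s, line_rate_gbps):
    """Fraction of the raw line rate (8 bits per byte) actually delivered."""
    raw_mb_s = line_rate_gbps * 1000 / 8
    return observed_mb_s / raw_mb_s


fc_4g = fc_throughput_mb_s(4)        # 4Gbps Fibre Channel
fc_8g = fc_throughput_mb_s(8)        # 8Gbps Fibre Channel
iscsi_1g = link_efficiency(80, 1)    # 80 MB/sec observed on 1Gbps Ethernet

print(f"4Gbps FC: about {fc_4g:.0f} MB/sec")
print(f"8Gbps FC: about {fc_8g:.0f} MB/sec")
print(f"iSCSI on 1GbE delivers about {iscsi_1g:.0%} of the raw line rate")
```

In other words, the rule of thumb is roughly 100 MB/sec per Gbps of Fibre Channel, while the iSCSI figure above works out to about two thirds of what 1Gbps Ethernet could carry in theory.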
Floorspace - I end with floorspace because it reminds me that many "shortages" are temporary or artificially created. Floorspace is only in short supply because you don't want to knock down a wall, or build a new building, to handle your additional storage requirements. In 1997, Tihamer Toth-Fejel wrote an article for the National Space Society newsletter that estimated that "... Everybody on Earth could live comfortably in the USA on only 15% of our land area, with a population density between that of Chicago and San Francisco. Using agricultural yields attained widely now, the rest of the U.S. would be sufficient to grow enough food for everyone. The rest of the planet, 93.7% of it, would be completely empty." Of course, back in 1997 the world population was only 5.9 billion, and this year it is over 6.5 billion.
This last point brings me back to the concept of food, and I am not talking about doughnuts in the conference room, or pizza while making year-end storage upgrades. I'm talking about the food you work so hard to provide for yourself and your family. The folks at Oxfam came up with a simple analogy. If 20 people sit down at your table, representing the world’s population:
3 would be served a gourmet, multi-course meal, while sitting at a decorated table in a cushioned chair.
5 would eat rice and beans with a fork and sit on a simple cushion.
12 would wait in line to receive a small portion of rice that they would eat with their hands while sitting on the floor.
So for those of you planning a special meal next Monday, be thankful you are one of the lucky three, and hopeful that IBM will continue to lead the IT industry to help out the other seventeen.
'Those who cannot remember the past are condemned to repeat it.' --- George Santayana
This last week of 2006 seems like a good time to recap the past year, and review the upcoming new year. That said, a good start is PC World's Top 21 Tech Screwups of 2006.
Laptops made the news this year in a variety of ways. #1 was exploding batteries, and #6 was the stolen laptops that exposed private personal information. Someone I know was listed in one of these stolen databases, so this last one hits close to home. Security is becoming a bigger issue now, and IBM was the first to deliver device-based encryption with the TS1120 enterprise tape drive.
IBM makes the chips used in all the major game consoles: Microsoft's Xbox 360, Nintendo's Wii, and Sony's PlayStation 3. Being all based on IBM technology doesn't make the games interoperable or compatible, and in the case of Sony, it made #8 for being incompatible with their own PlayStation 2. Sadly, Nintendo's Wii had its own set of problems, and I found this parody of a safety video on YouTube that you might enjoy.
Microsoft had #5 (not understanding the holiday shopping season ends in December), #12 (not understanding people who use PCs prefer privacy), and #17 (not understanding how people use MP3 music players). At least they delivered their latest Xbox with minimal problems. As an engineer, taking on a market strategy role involved reading books and taking classes on marketing. I learned that it is all about understanding the marketplace well enough so that your prospects "know, like, and trust" your company. Perhaps Microsoft should take a refresher course.
A few companies showed off their brilliant customer service. Comcast is represented in a video on #7, and AOL in a taped phone conversation on #15. Many of our clients are afraid of vendor lock-in, and how difficult it might be to undo the deployment of new storage technology. Fortunately, IBM is committed to open standards, making it easier for our clients to make the right choice and feel good about it.
Hopefully, we can all learn from the mistakes of others, and not repeat them in 2007.
This year I resolve to be more consistent in my blogging, and my goal is to give you one to five entries per week, every week, based on the advice from Glenn Wolsey, Jennette Banks, and others. On some weeks, I will have a running theme, so rather than writing super-long entries to cover everything I can think of on a topic, I will keep the entries short and readable. This week is a good time to review last year's "New Year's Resolutions" and to make new ones for 2007. I will discuss actions that companies can adopt for their data centers.
A common resolution is to lose weight, as in this Dilbert comic. Last year, I resolved to lose weight in 2006, and am delighted with myself that I lost eight pounds. When people ask for the secret of my success, I whisper in their ear "Eat less, exercise more." In general, people (and companies) know what to do, but just don't do it, which Pfeffer and Sutton document in their book The Knowing-Doing Gap. In my case, it involved lifestyle change: I exercised at a gym three times per week in Tucson, with a personal trainer, and revamped my diet.
Not everyone subscribes to the "eat less exercise more" philosophy. For example, Ric Watson argues in his blog that you can eat fewer calories, but eat more in actual volume, by choosing the right foods. This brings up the issues of "metrics" that most data centers are familiar with. Last year, I read the book "You: On a Diet" which explains that it is better to focus on "waist reduction" as measured in inches around your mid-section at the belly button, than "weight reduction" as measured in pounds. This year, I resolve to get down to 35 inches by the end of 2007.
The problem with measuring "weight" is that you are weighing bones, muscle and fat. A person can gain ten pounds of muscle, lose ten pounds of fat, and the scale would indicate no progress. The same problem occurs in data centers. How many TB of data do you have? Storage admins can easily tell you, but can they tell how much of this is bone (data needed for operating infrastructure), muscle (data used in daily operations that generates revenue) or fat (obsolete or orphaned data)?
We at IBM often state that "Information Lifecycle Management (ILM)" is more of a lifestyle change than a "fad diet". Figuring out what data you should capture in the first place, where to place it, when to move it, and when to get rid of it, is more important than just buying different tiers of storage hardware. So, for those looking to make new data center resolutions, I suggest the following actions:
Re-evaluate the metrics you now use, and determine if they are helpful in making decisions and taking action.
Come up with new ones that are more focused to solve the issues you face.
Consider storage infrastructure software, such as IBM TotalStorage Productivity Center, to help you gather the information about your SAN, disk and tape systems, calculate the metrics, and automate the appropriate actions.
Continuing this week's theme of New Year's Resolutions for the data center, today we'll talk about one that many people make for their own personal lives: staying on a budget.
Often, when faced with tightening budgets, we try to make more use of what we already have. Tell someone they are only using 10 percent of their brain, and they immediately believe you; but tell them they are only using 30 percent of their storage, and they ask for a whitepaper, magazine article, or clarification on how that percentage is calculated. I actually visited a customer that was only using 6 percent of the storage attached to their Windows servers!
So, to help those of you making data center resolutions to stay on budget, the terms to remember are "Reduce", "Reuse" and "Recycle".
When people come to request storage, are they being reasonable about what they need today, or are they asking for what they might need over the next three years? They might need 50GB, but they ask for 100GB, in case they grow, and a year later, you find they have only 15GB of data on it. On the flipside, the person asks for what they need but some storage admins give out more, just so they don't have to be bothered so often when growth happens. Finally, I have seen this formalized into fixed-size LUNs: all the disk is carved into big 100GB pieces, so if you need 20GB, here's one big enough with plenty of room to grow.
If you are going to keep on a budget, remember that storage next year will cost about 30 percent less than storage today. That is the average annual drop for both disk and tape on a dollar-per-MB basis. If there is any way to postpone giving out storage until it is actually needed, you can save a bundle of money. Timing is everything! In the event of a disaster, getting immediate replacement for disk can be very expensive, but if you can wait just two weeks, you can negotiate a better deal. I thought of this while going to the movie theatre yesterday. A "hot dog" and a bottle of water was $8.00, but if you are able to wait two hours and eat after the movie, you can get a much better meal for less.
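To put rough numbers on the value of waiting, here is a small Python sketch. The $10-per-GB price and the function name are purely hypothetical; the only figure taken from the discussion above is the roughly 30 percent yearly price decline, and real pricing will of course vary.

```python
def purchase_cost(price_per_gb_today, gb_now, gb_later=0.0,
                  years_later=1, annual_decline=0.30):
    """Total spend if some capacity is bought today and the rest is deferred.

    Assumes the price per GB falls by `annual_decline` every year.
    """
    future_price = price_per_gb_today * (1 - annual_decline) ** years_later
    return gb_now * price_per_gb_today + gb_later * future_price


price = 10.0  # hypothetical dollars per GB today

all_up_front = purchase_cost(price, gb_now=100)
half_deferred = purchase_cost(price, gb_now=50, gb_later=50, years_later=1)

print(f"Buy 100GB today:             ${all_up_front:,.2f}")
print(f"Buy 50GB now, 50GB next year: ${half_deferred:,.2f}")
print(f"Savings from waiting:         ${all_up_front - half_deferred:,.2f}")
```

With these made-up numbers, deferring half of the purchase by a year saves about 15 percent of the total, and that is before counting the capacity that turns out never to be needed at all.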
A lot of companies buy new storage because their existing storage isn't fast enough, or doesn't have the latest copy services. This can easily be solved with an IBM SAN Volume Controller (SVC). The SVC can virtualize slower, functionless storage, and present to your application hosts virtual disks that are faster, and with all the latest disk-to-disk copy services like FlashCopy, Metro Mirror, and Global Mirror.
Chances are, you have unused disk capacity spread across all your storage today, but perhaps it is formatted into small LUNs. The SVC can combine the capacity, and let you carve up big LUNs at the sizes you need. This is like taking all those tiny pieces of soap in your shower and forming a new bar of soap, or taking all the crumbs at the bottom of your bread box, and making a new slice of bread. And, the virtual LUNs are dynamically expandable, so give out only the amount they need today, as it is simple to expand them to larger sizes later.
Of my 13 patents, the first will always be my favorite, on a function called "RECYCLE" for the Data Facility Storage Management Subsystem Hierarchical Storage Manager (DFSMShsm) product, which is now a component of the IBM z/OS operating system. Basically, tapes could contain hundreds or thousands of files, such as backup versions or archive copies, and these expired on different dates. As a result, a tape would be written 100 percent full, and then over time, decrease in valid data to 80, 60, 40, 20 until it hit 0 percent. In some cases, a single file could hold an entire tape hostage. RECYCLE was able to read the valid data off tapes that were perhaps less than 20 percent full, and consolidate them onto fewer tapes. As a result, a whole bunch of tapes could be returned to the scratch pool, and reused immediately for other workloads. This also helps in moving to newer, higher capacity cartridges, such as the new 700GB cartridge that IBM co-developed with FujiFilm. (This RECYCLE function exists in our IBM Tivoli Storage Manager software, as well as our Virtual Tape Server, but is called "reclamation" instead, to avoid confusion on searches.)
When evaluating your use of tape, determine if you are making best use of the tapes you have now, and perhaps a RECYCLE (or reclamation) scheme may be in order. Fewer tapes can save money in many ways, such as reduced storage costs, and reduced courier costs to send the tapes offsite. Tape media can still be 10-20 times less expensive than disk, based on full capacity.
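For readers who think better in code, the general idea behind RECYCLE (or reclamation) can be sketched in a few lines of Python. This is only an illustration of the concept described above, not how DFSMShsm or Tivoli Storage Manager actually implement it; the `Tape` class, the `recycle` function, and the sample tapes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tape:
    label: str
    capacity_gb: float
    valid_files: dict = field(default_factory=dict)  # file name -> size in GB

    def used_gb(self):
        return sum(self.valid_files.values())

    def percent_valid(self):
        return 100.0 * self.used_gb() / self.capacity_gb


def recycle(tapes, scratch_pool, threshold_percent=20.0):
    """Consolidate valid data from mostly-empty tapes onto fewer tapes.

    Returns (output tapes written, input tapes freed for the scratch pool).
    """
    candidates = [t for t in tapes if t.percent_valid() < threshold_percent]
    outputs = []
    for tape in candidates:
        for name, size_gb in tape.valid_files.items():
            # First fit: reuse an output tape that still has room.
            target = next(
                (o for o in outputs if o.used_gb() + size_gb <= o.capacity_gb),
                None,
            )
            if target is None:
                target = scratch_pool.pop()  # take a fresh scratch tape
                outputs.append(target)
            target.valid_files[name] = size_gb
        tape.valid_files = {}  # the input tape is now empty
    return outputs, candidates


tapes = [
    Tape("A", 100.0, {"backup1": 15.0}),
    Tape("B", 100.0, {"backup2": 10.0}),
    Tape("C", 100.0, {"archive1": 5.0}),
    Tape("D", 100.0, {"backup3": 60.0}),  # above the threshold, left alone
]
scratch_pool = [Tape("S1", 100.0), Tape("S2", 100.0)]

outputs, freed = recycle(tapes, scratch_pool)
print(f"Tapes freed: {len(freed)}, tapes written: {len(outputs)}")
```

In this toy example, three nearly empty tapes are consolidated onto a single scratch tape, for a net gain of two cartridges. The threshold is the knob to tune: set it too high and you spend drive time copying data that would have expired on its own shortly anyway.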
Continuing this week's theme of New Year's Resolutions for the data center, today we'll talk about one that people don't always think about on a personal level, that is to hone your tools and skills.
A long time ago, I used to be a regular speaker at the SHARE user group conference. One of the most attended sessions was Sam Golob presenting the latest CBT Tape set of tools. Over time, this large collection of "mainframe shareware" was handed out on 3480 tape cartridges, then on CDs, and finally made downloadable off the web. Sam's main point, which I remember to this day, was that everyone who has a job should figure out what tools they use, keep those tools functioning properly, and learn to use them well.
Later, I took some cooking classes at a culinary school. Among other things, we learned:
A sharp knife is safer and easier to use than a dull one, resulting in fewer accidents
Knowing what you are doing is the difference between food that is "simply awful" and that which is "awfully simple" to prepare.
A well trained chef can prepare most meals with just a sharp knife and wooden spoon.
The same could be said about software tools. What tools do you use in your job? Do you feel you know how to take full advantage of their power and capabilities? If you develop software, do you know all the features for your debugging tools? If you develop advertising or marketing materials, do you know all the features of your photo or video editing software? If you manage storage in a data center, do you know all the tools for managing your storage area network (SAN), disk systems, tape libraries, and reporting tools to identify all of your files and databases across your entire IT environment? I would not be surprised if you could replace a whole mess of tools with just one, such as the IBM TotalStorage Productivity Center.
Stephen Colbert, of The Colbert Report, explains the name changes in recent mergers of the Telecommunications industry. A discussion on "changing names" and how that impacts storage seems like a good way to wrap up the week's theme on naming conventions.
Name changes are sometimes painful, but oftentimes done for a purpose, such as to promote a family. In the US, when a man and woman marry, the woman often changes her family name to match her husband's, and the kids all adopt the father's family name. I say "often" because there are times where the woman keeps her name, or adds to it in a hyphenated way. ABC News reported that a Man Fights to Take Wife's Name in Marriage. KipEsquire, a lawyer, writes about it in his blog A stitch in haste.
The IT industry sometimes changes the names of products that people knew as something else. Other times, it re-uses an existing name, when really the product is or should be different from the original. Last year, I took on the job of helping transition from our brand "TotalStorage" to the "System Storage" product line under the new "IBM Systems" brand. I help decide what stays the same name or what changes, when it should change, and how to announce that change.
On the disk side, IBM renamed Fibre Array Storage Technology, or FAStT, which was pronounced exactly like "fast", to DS4000 series. This was a big improvement, as people couldn't seem to spell it properly, with variations like "FastT". Nor could people pronounce it properly, saying "fast-tee" instead. The advantage of "DS" is that it is both easy to spell, and easy to pronounce. The DS4000 series continues to be "fast", providing excellent performance for its midrange price category.
IBM's Enterprise Storage Server (ESS) line went from model E10, to F20, to 750 and 800. When IBM came out with its replacement, the IBM TotalStorage DS8000, some people asked why it wasn't named the ESS 900, for example. The DS8000 is quite different internally, new hardware design and implementation, but is highly compatible with the ESS line, and shares much of the same functionality from microcode. Last year, it was replaced by the IBM System Storage DS8000 Turbo. Again, newer hardware, so it was easy to justify the new name change from "TotalStorage" to "System Storage".
Renaming a product risks losing its certifications and awards. For example, IBM spent a lot of time and money getting the OS/390 operating system certified as a "UNIX" platform. When it was renamed to z/OS, IBM had to do it all over again. Learning from this experience, IBM decided not to rename the SAN Volume Controller to a new designation like "DS5750", as it enjoys the "number one" spot on both the SPC-1 and SPC-2 performance benchmarks, and is recognized as the leader in the disk storage virtualization marketplace. Renaming this product would mean losing that collateral.
IBM's "other disk systems", the N series, posed another set of challenges. The current DS line already has entry-level (DS3000), midrange (DS4000) and enterprise-class (DS6000 and DS8000) products. The OEM agreement that IBM has with Network Appliance (NetApp) resulted in a new set of entry-level, midrange, and enterprise-class products. But these didn't fit nicely into the DS3000-to-DS8000 continuum. Instead, IBM decided to go with N series, using N3000 for entry-level, N5000 for midrange, and N7000 for enterprise-class. These are different than the numbers used by NetApp for their comparable, but not identical, offerings.
On the tape side, IBM decided to put tape drives in the TS1000 and TS2000 range, tape libraries and automation in the TS3000 range, and tape virtualization in the TS7000 range. A lot of tape products already had 3000 numbering that had to change to fit this new scheme. This is why IBM's popular 3592 tape drive was renamed to the TS1120. The replacement to the 3494 Virtual Tape Server was named TS7700 Virtualization Engine.
Obviously, you can't change the names of products that are currently in the field, but what about existing software with minor updates? IBM decided to leave "TotalStorage Productivity Center" under the "TotalStorage" brand until it has a significant version upgrade. Many people say "TPC" as a convenient acronym when referring to this product, but TPC is a registered trademark of the Professional Golfers Association (PGA) to refer to its "Tournament Players Club".
How can anyone confuse "managing storage" with "playing golf"? One activity is full of frustration that takes years or decades to master, involving the need to understand a variety of equipment and techniques to use each properly to accomplish your goals; and the other is an enjoyable activity, immediately productive in front of a single pane of glass managing all of your DAS, SAN and NAS storage, from reporting on your files and databases to managing storage networks and tape libraries.
This week I am in Japan, so my week's theme will center around travel, speaking at conferences, and Japan itself. I first travelled to Japan in the late 1980s, to visit a college friend who was working for Ford Motor Company, on assignment in Japan as liaison to Mazda Corp.
Back then, the only Japanese phrase I knew was "Wakarimashita" which means "I know" or "I understand". If you only know one phrase in a foreign language, this possibly could be the worst to know.
My second trip, I was better prepared. I learned three "survival phrases":
sumimasen - "I'm sorry/excuse me"
hanashimasen - "I don't speak"
wakarimasen - "I don't know / I don't understand"
These are great phrases to know individually, but even more powerful strung all together, to emphasize that you will begin speaking English, but at least with good reason (and perhaps a bit of irony.)
I've been to Japan many times since, and have picked up more of the language. When travelling to Japan, or anywhere for that matter, it is important to "pack light". I'll be gone for two weeks, but all I bring is a laptop bag and one carry-on piece of luggage.
I went on a trip to Prague (Czech Republic) with a female co-worker who brought FOUR pieces of luggage. One was just for shoes. Another piece was just for hair styling gel, make-up, face creams and finger nail polish. Today, the rules are different, and the TSA allows only a single quart-size plastic bag containing little jars of 3 ounces or less of liquids or gels. I didn't have any "quart-size" bags, so I used a smaller sandwich-size bag.
What does all this have to do with storage? I've helped many clients move data centers, and this involves moving their servers, their networks, and their storage. Servers and networks are easy to move, but storage presents some challenges. In many cases, the entire company is shut down, the storage is moved, and then the company is operational again. Needless to say, it is best to do this over a weekend.
I tell clients to "pack light" and figure out what data they really need in the move. What do you really need to operate your business? Bring just that, the rest can arrive later.
This same concept applies for Business Continuity and Disaster Recovery planning. What do you really need after a disaster occurs? Can you run your business for a few weeks on that data, until the rest of the data is restored? If you can't run your entire business on that data, can you run your most important parts of your business?
If you run a bank, perhaps keeping your ATM cash machines running is more important than making out new loans. In Japan, if a bank has any outages that impact their ATM machines, they put out a full page advertisement in the local papers to apologize for the inconvenience.
Business Continuity is one of the nine "Infrastructure Solutions" that IBM can help clients with. If you are interested in learning more on how IBM can help you with your Business Continuity, click here.
Well, I have left Japan, and while everyone else is enjoying the Super Bowl, I am now in Australia, at another conference. Today I had the pleasure to hear filmmakers talk about their successes, and how IBM helps the movie industry.
At one extreme was Khoa Do, independent filmmaker. After acting in movies alongside Michael Caine and Billy Zane, he decided to become his own director. He started a project to help seven disadvantaged youths from a poor drug-ridden section of Sydney, by having them act in his first full-length film. Armed with only an IBM laptop and small budget, he made the film called "The Finished People" that had critical acclaim.
The film was a success, and many of the disadvantaged youths have gone on to act in other movies. In 2005, Khoa Do was named "Young Australian of the Year".
Thanks to IBM technology, filmmaking is now accessible to a wider number of aspiring wanna-be directors. It is no longer necessary to be part of a large film studio with a multi-million dollar budget to tell your story.
At the other extreme was Xavier Desdoigts, director of technical operations at Animal Logic, the Computer Graphics (CG) arthouse that produced special effects for movies like "The Matrix", "House of Flying Daggers" and "World Trade Center". They started with producing digital effects for TV commercials, like this one for Carlton Draught Beer.
With the support of a large film studio and multi-million dollar budget, Animal Logic now boasts the 86th most powerful "Supercomputer" based on IBM BladeCenter technology, with over 4000 servers connected into a cluster, for making the movie "Happy Feet". The movie took four years to make, with over 500 people, of 27 different nationalities. It was the first CG movie made in Australia, and has been well-received by audiences worldwide.
Mr. Desdoigts gave out some interesting facts and figures about the movie:
While visually stunning on the big screen, each frame is only 1.4 Megapixel, about the same resolution as most camera phones.
In one scene, there are 427,086 penguins all appearing in frame.
Mumble, the lovable lead character, is made up of over 6 million feathers.
As many as 17 dancers were "motion-captured" to choreograph the tap-dancing and character interaction segments.
Only one system admin was needed to manage this entire server farm. (IBM Systems Director technology makes this possible)
The movie consumed 103 TB of disk space, backed up to 595 LTO tape cartridges.
An estimated 17 million CPU-hours were needed for all the processing and rendering.
Rather than talking about technology for technology's sake, these filmmakers showed how technology could be put to use, in a practical sense, to provide the world something of value.
It's official! IBM System Storage TS1120 tape drive takes home the gold award, the product of the year, announced by Storage magazine.
I spent 18 hours traveling from Australia to China yesterday, and we were partially delayed due to weather, but I felt that it was necessary to discuss the innovative use of encryption on this drive.
While most consider the TS1120 an "Enterprise-class" tape technology for the mainframe, it is also attachable to the smallest distributed systems running Windows, Linux, or various flavors of UNIX. Rather than limit users with an Encryption Key Manager that only ran on z/OS, IBM instead chose to implement it in Java, that can be run on anything from z/OS to Linux, Unix and Windows platforms, giving clients choice and flexibility in their deployment.
The design is quite clever and elegant. In the encryption world, there are two ways to encrypt.
Symmetric encryption - This is very fast, because it uses a single key for both encryption and decryption, and can be incorporated on a chip. The problem is that anyone with the key can read the sensitive data.
Asymmetric (public-key) encryption - This is slower, but more secure, using two separate keys. The public "encryption" key takes clear data and encrypts it. Anyone can be freely given this key, as they cannot use it to decrypt any other data. The private "decryption" key is able to decrypt the data, so that one is kept secret. If two businesses plan to exchange lots of tapes, they can exchange their "encryption" keys with each other.
So, let's say that Green, Inc. wants to send a tape to Blue, Co. Blue has already provided its public "encryption" key to Green, so Green does the following:
Generate a unique data key, which we'll call the "red key"; there is one for each tape. It is a standard AES 256-bit symmetric key that can be processed with less than one percent overhead on the tape drive. All the data is encrypted with this key.
Store the red key on the tape. How does Green give Blue the red key? Green encrypts it with Blue's RSA 2048-bit public "encryption" key. This is stored in three places on the tape cartridge, one in the cartridge memory, and the other two on the media itself.
Send the tape over to Blue Co.
When it arrives on the dock at Blue Co., they do the following:
Mount the tape and decrypt the "red key" using Blue's super-secret private decryption key.
Pass the "red key" to the tape drive, and have it read, append or re-write the tape.
If the super-secret private key is ever compromised, all you have to do is mount the tape, unlock the red key with the old private key, and re-lock the red key with a new public key. Since the red key doesn't change, the rest of the data can be left intact. The whole process takes less than 5 minutes, compared to Sun Microsystems' method, which could take 1-2 hours per cartridge, having to decrypt and re-encrypt the entire data stream.
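For those who like to see it in code, here is a toy sketch of the Green-to-Blue flow in Python. It is not IBM's implementation: the helper names are made up for illustration, the "symmetric cipher" is a SHA-256 keystream standing in for AES, and the key pairs are textbook RSA built from well-known Mersenne primes, which is not secure. It only shows the shape of the idea: the bulk data is locked with the red key, and only the small wrapped copy of the red key changes when Blue's key pair is replaced.

```python
import hashlib
import secrets

# TOY stand-ins, for illustration only. The real drive uses AES-256 and
# RSA-2048; textbook RSA over Mersenne primes is NOT secure.

def toy_keypair(p, q, e=65537):
    """Build a textbook RSA key pair from two primes (hypothetical helper)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return (n, e), (n, d)

def toy_cipher(key, data):
    """Symmetric toy cipher: XOR the data with a SHA-256 keystream.
    The same call encrypts and decrypts, as with any symmetric key."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], pad))
    return bytes(out)

def wrap(data_key, public):
    """Lock the red key with a public "encryption" key."""
    n, e = public
    return pow(int.from_bytes(data_key, "big"), e, n)

def unwrap(wrapped, private):
    """Unlock the red key with the matching private "decryption" key."""
    n, d = private
    return pow(wrapped, d, n).to_bytes(32, "big")

# Blue's original key pair, and a replacement pair for after a compromise.
blue_pub, blue_priv = toy_keypair(2**127 - 1, 2**521 - 1)
new_pub, new_priv = toy_keypair(2**107 - 1, 2**607 - 1)

# Green: one random 256-bit red key per tape, used for all the data.
red_key = secrets.token_bytes(32)
plain = b"quarterly results " * 1000
tape_data = toy_cipher(red_key, plain)
tape_header = wrap(red_key, blue_pub)   # small, stored on the cartridge

# Blue: unlock the red key, then read the tape.
recovered = toy_cipher(unwrap(tape_header, blue_priv), tape_data)

# Re-key after a compromise: only the header is rewritten, not the data.
rekeyed_header = wrap(unwrap(tape_header, blue_priv), new_pub)
```

Notice that the re-key step never touches `tape_data`, which is why it takes minutes rather than hours.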
Federal Rules of Civil Procedure (FRCP) will increase adoption of unstructured data classification, email archive systems and CAS.
CAS continues to flounder, but the rest I can agree with. Regulations are being adopted worldwide. Japan has its own Sarbanes-Oxley (SOX) style legislation going into effect in 2008. IBM TotalStorage Productivity Center for Data is a great tool to help classify unstructured file systems. IBM CommonStore for email supports both Microsoft Exchange and Lotus Domino, and can be connected to IBM System Storage DR550 for compliance storage.
Unified storage systems (combined file and block storage target systems) will become increasingly attractive in 2007, because of their ease of use and simplicity.
I agree with this one also. Our sales of IBM N series in 2006 were great, and we are looking to continue that strong growth in 2007. The IBM N series brings together FCP, iSCSI and NAS protocols into one disk system. With the SnapLock(tm) feature, N series can store both re-writable data and non-erasable, non-rewritable data on the same box. Combine the N series gateway on the front-end with SAN Volume Controller on the back-end, and you have an even more powerful combination.
Distributed ROBO backup to disk will emerge as the fastest growing data protection solution in 2007.
IDC had a similar prediction for 2006. ROBO refers to "Remote Office/Branch Office", and so ROBO backup deals with how to back up data that is out in the various remote locations. Do you back it up locally, or send it to a central location? Fortunately, IBM Tivoli Storage Manager (TSM) supports both ways, and IBM has introduced small disk and tape drives and auto-loaders that can be used in smaller environments like this. I don't know whether "backup to disk" will be the fastest growing, but I certainly agree that a variety of ROBO-related issues will be of interest this year.
2007 will be remembered as the year iSCSI SAN took off because of the much reduced pricing for 10 Gbit iSCSI and the continued deployment of 10 Gbit iSCSI targets.
While I agree that iSCSI is important, I can't say 2007 will be remembered for anything. We have terrible memory for these things. Ask someone what year Personal Computers (PCs) took off, and they will tell you about Apple's famous 1984 commercial. Ask someone when the Internet took off, when cell phones took off, and so on, and I suspect you will get widely different answers, most likely based on each person's own experience.
For the longest time, I resisted getting a cell phone. I had a roll of quarters in my car, and when I needed to make a call, I stopped at the nearby pay-phone, and made the call. In 1998, pay phones disappeared. You can't find them anymore. That was the year cell phones took off, at least for me.
Back to iSCSI, now that you can intermix iSCSI and SAN on the same infrastructure, either through intelligent multi-protocol switches available from your local IBM rep, or through an N series gateway, you can bring iSCSI technology in slowly and gradually. Low-cost copper wiring for 10 Gbps Ethernet makes all this very practical.
Another up-and-coming technology is AoE, or ATA-over-Ethernet. Same idea as iSCSI, but taken down to the ATA level.
CDP will emerge as an important feature on comprehensive data protection products instead of a separate managed product.
Here, CDP stands for Continuous Data Protection. Normal backups work like a point-and-shoot camera, taking a picture of the data once every midnight, for example. CDP can record all the little changes like a video camera, with the option to rewind or fast-forward to a specific point in the day. IBM Tivoli CDP for Files, for example, is an excellent complement to IBM Tivoli Storage Manager.
The technology is not really new, as it has been implemented as "logs" or "journals" on databases like DB2 and Oracle, as well as business applications like SAP R/3.
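To make the camera analogy concrete, here is a minimal sketch in Python of the journal idea: start from the last nightly "snapshot" and replay recorded changes up to the moment you want. The journal layout, file names and `restore_as_of` function are all hypothetical, and real products track far more than this.

```python
from datetime import datetime

# Hypothetical change journal: (timestamp, file name, new contents).
journal = [
    (datetime(2007, 1, 9, 9, 15), "plan.txt", "draft 1"),
    (datetime(2007, 1, 9, 11, 40), "plan.txt", "draft 2"),
    (datetime(2007, 1, 9, 14, 5), "budget.txt", "v1"),
    (datetime(2007, 1, 9, 16, 30), "plan.txt", "draft 3 (corrupted)"),
]

def restore_as_of(baseline, journal, point_in_time):
    """Replay journal entries on top of the last backup, in time order,
    stopping at the requested point in time."""
    state = dict(baseline)
    for stamp, name, contents in sorted(journal, key=lambda entry: entry[0]):
        if stamp > point_in_time:
            break
        state[name] = contents
    return state

# The midnight backup is the "photograph"; the journal is the "video".
midnight_backup = {"plan.txt": "draft 0"}
restored = restore_as_of(midnight_backup, journal, datetime(2007, 1, 9, 15, 0))
```

Rewinding to 3:00pm picks up the second draft of the plan and the new budget file, but not the corrupted 4:30pm change.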
The prediction here, however, relates to packaging. Will vendors "package" CDP into existing backup products, possibly as a separately priced feature, or will they leave it as a separate product that perhaps, as in IBM's case, is already well integrated?
The VTL market growth will continue at a much reduced rate as backup products provide equivalent features directly to disk. Deduplication will extend the VTL market temporarily in 2007.
VTL here refers to Virtual Tape Library, such as the IBM TS7700 or TS7510 Virtualization Engine. IBM introduced the first one in 1997, the IBM 3494 Virtual Tape Server, and we have remained number one in market share for virtual tape ever since. I find it amusing that people are only now looking at VTL technology to help with their Disk-to-Disk-to-Tape (D2D2T) efforts, when IBM Tivoli Storage Manager has had the capability to back up to disk, then move to tape, since 1993.
As for deduplication, if you need the end-target box to deduplicate your backups, then perhaps you should investigate why you are doing this in the first place. People take full-volume backups, and keep too many copies of them, when more sophisticated backup software like Tivoli Storage Manager can implement backup policies to avoid this with a progressive backup scheme. Or maybe you need to investigate why you store multiple copies of the same data on disk; perhaps NAS or a clustered file system like IBM General Parallel File System (GPFS) could provide you a single copy accessible to many servers instead.
The reason you don't see deduplication on the mainframe is that DFSMS for z/OS already allows multiple servers to share a single instance of data, and has been doing so since the early 1980s. I often joke with clients at the Tucson Executive Briefing Center that you can run a business with a million data sets on the mainframe, and that there are probably a million files on just the laptops in the room, but few would attempt to run their business that way.
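Here is a rough sketch in Python of what "progressive" means, compared with taking a full copy every night. This is my own simplified illustration, not how Tivoli Storage Manager is implemented: the function, the catalog layout and the use of a content hash to detect changes are all assumptions.

```python
import hashlib

def fingerprint(contents):
    """Content hash used here to detect that a file has changed."""
    return hashlib.sha256(contents).hexdigest()

def progressive_backup(files, catalog, store):
    """Send only files that are new or changed since the last run.

    files:   name -> current contents (bytes)
    catalog: name -> fingerprint of the last version backed up
    store:   list of (name, contents) versions kept on the backup server
    """
    sent = []
    for name, contents in files.items():
        digest = fingerprint(contents)
        if catalog.get(name) != digest:
            store.append((name, contents))
            catalog[name] = digest
            sent.append(name)
    return sent

catalog, store = {}, []
monday = {"a.doc": b"alpha", "b.xls": b"beta", "c.ppt": b"gamma"}
tuesday = {**monday, "b.xls": b"beta v2"}

first = progressive_backup(monday, catalog, store)    # everything is new
second = progressive_backup(tuesday, catalog, store)  # only the changed file
```

After two nights the server holds four file versions, where two full backups would have stored six, so there are no duplicate copies for a downstream box to squeeze out.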
Optical storage that looks, feels and acts like NAS and puts archive data online, will make dramatic inroads in 2007.
Marc says he's going out on a limb here, and it's good to make at least one risky prediction. IBM used to have an optical library that emulated disk, called the IBM 3995. Lack of interest and advancement in technology encouraged IBM to withdraw it. A small backlash ensued, so IBM now offers the IBM 3996 for the System p and System i clients that really, really want optical.
As for optical making data available "online", it takes about 20 seconds to load an optical cartridge, so I would consider this more "nearline" than online. Tape is still in the 40-60 second range to load and position to data, so optical is still at an advantage.
Optical eliminates the "hassles of tape"? Tape data is good for 20 years, and optical for 100 years, but nobody keeps drives around that long anyways. In general, our clients change drives every 6-8 years, and migrate the data from old to new. This is only a hassle if you didn't plan for this inevitable movement. IBM Tivoli Storage Manager, IBM System Storage Archive Manager, and the IBM System Storage DR550 all make this migration very simple and easy, and can do it with either optical or tape.
The Blu-ray vs. HD DVD debate will continue through 2007 in the consumer world. I don't see this being a major player in more conservative data centers where a big investment in the wrong choice could be costly, even if the price-per-TB is temporarily in line with current tape technologies. IBM and others are investing a lot of Research and Development funding to continue the downward price curve for tape, and I'm not sure that optical can keep up that pace.
Well, that's my take. It is a sunny day here in China, and I have more meetings to attend.
In case you haven't noticed, IBM System Storage makes most of their announcements on Tuesdays. IBM announced a lot today, so here is a quick run-down.
Cisco storage networking products
IBM continues to resell Cisco switches and directors, but now can offer these with a 1-year IBM warranty.
The entry-level Cisco 9124 offers 8 to 24 ports. For IBM BladeCenter, IBM now offers the Cisco 10-port and 20-port modules that slide into the back of the chassis, and are functionally equivalent to the 9124. The original BladeCenter came with a 16-port module with 14 internal, but only 2 external, which severely hampered bandwidth connectivity to external storage. These new modules provide more external ports to relieve that constraint.
The midrange Cisco 9200 switches have two models, both with 16 fixed ports, with the option for a blade that can provide 12, 24 or 48 additional ports. The 9216A has 16 FCP ports, and the 9216i has 14 FCP ports, and 2 GbE ports to act as a router, such as to connect to a remote location for business continuity using Metro Mirror or Global Mirror.
The enterprise-class Cisco 9500 directors can support up to 528 ports.
TS3400 Tape Library
The new TS3400 library is a small entry-level library, supporting the enterprise-class TS1120 drive, providing interoperability with the larger tape libraries, with all the support for tape encryption.
In addition to Linux, UNIX, and Windows, the TS1120 can now be connected to System i servers. In the past, the only IBM tape drives available to System i were the LTO models. There are a lot of businesses that need to comply with government regulations that are looking for tape encryption, and now IBM has made it accessible to more clients.
300GB drives at 15K RPM
The DS8000 can now support new drives with 300GB capacity at 15,000 RPM (15K). These can be up to 30 percent faster than the 10,000 RPM drives for typical workloads.
IBM continues its market leadership with this new set of features and offerings!
Well, this week I am in Maryland, just outside of Washington DC. It's a bit cold here.
Robin Harris over at StorageMojo put out this Open Letter to Seagate, Hitachi GST, EMC, HP, NetApp, IBM and Sun about the results of two academic papers, one from Google, and another from Carnegie Mellon University (CMU). The papers imply that the disk drive module (DDM) manufacturers have perhaps misrepresented their reliability estimates, and the letter asks major vendors to respond. So far, NetApp and EMC have responded.
I will not bother to repeat what others have said already, but will make just a few points. Robin, you are free to consider this "my" official response if you would like to post it on your blog, or point to mine, whatever is easier for you. Given that IBM no longer manufactures the DDMs we use inside our disk systems, there may not be any reason for a more formal response.
Coke and Pepsi buy sugar, Nutrasweet and Splenda from the same sources
Somehow, this doesn't surprise anyone. Coke and Pepsi don't own their own sugar cane fields, and even their bottlers are separate companies. Their job is to assemble the components using super-secret recipes to make something that tastes good.
IBM, EMC and NetApp don't make the DDMs that are mentioned in either academic study. Different IBM storage systems use one or more of the following DDM suppliers:
Seagate (including Maxtor, which they acquired)
Hitachi Global Storage Technologies, HGST (former IBM division sold off to Hitachi)
In the past, corporations like IBM were very "vertically-integrated", making every component of every system delivered. IBM was the first to bring disk systems to market, and led the major enhancements that exist in nearly all disk drives manufactured today. Today, however, our value-add is to take standard components, and use our super-secret recipe to make something that provides unique value to the marketplace. Not surprisingly, EMC, HP, Sun and NetApp also don't make their own DDMs. Hitachi is perhaps the last major disk systems vendor that also has a DDM manufacturing division.
So, my point is that disk systems are the next layer up. Everyone knows that individual components fail. Unlike CPUs or Memory, disks actually have moving parts, so you would expect them to fail more often compared to just "chips".
If you don't feel the MTBF or AFR estimates posted by these suppliers are valid, go after them, not the disk systems vendors that use their supplies. While IBM does qualify DDM suppliers for each purpose, we are basically purchasing them from the same major vendors as all of our competitors. I suspect you won't get much more than the responses you posted from Seagate and HGST.
American car owners replace their cars every 59 months
According to a frequently cited auto market research firm, the average time before the original owner transfers their vehicle -- purchased or leased -- is currently 59 months. Both studies mention that customers have a different "definition" of failure than manufacturers, and often replace the drives before they are completely kaput. The same is true for cars. Americans give various reasons why they trade in their less-than-five-year-old cars for newer models. Disk technologies advance at a faster pace, so it makes sense to change drives for other business reasons, for speed and capacity improvements, lower power consumption, and so on.
The CMU study indicated that 43 percent of drives were replaced before they were completely dead. So, if General Motors estimated their cars lasted 9 years, and Toyota estimated 11 years, people still replace them sooner, for other reasons.
At IBM, we remind people that "data outlives the media". True for disk, and true for tape. Neither is "permanent storage", but rather a temporary resting point until the data is transferred to the next media. For this reason, IBM is focused on solutions and disk systems that plan for this inevitable migration process. IBM System Storage SAN Volume Controller is able to move active data from one disk system to another; IBM Tivoli Storage Manager is able to move backup copies from one tape to another; and IBM System Storage DR550 is able to move archive copies from disk and tape to newer disk and tape.
If you had only one car, then having that one and only vehicle die could be quite disruptive. However, companies that have fleet cars, like Hertz Car Rentals, don't wait for their cars to completely stop running either; they replace them well before that happens. For a large company with a large fleet of cars, regularly scheduled replacement is just part of doing business.
This brings us to the subject of RAID. No question that RAID 5 provides better reliability than having just a bunch of disks (JBOD). Certainly, three copies of data across separate disks, a variation of RAID 1, will provide even more protection, but for a price.
Robin mentions the "Auto-correlation" effect. Disk failures bunch up, so one recent failure might mean another DDM, somewhere in the environment, will probably fail soon also. For it to make a difference, it would (a) have to be a DDM in the same RAID 5 rank, and (b) have to occur during the time the first drive is being rebuilt to a spare volume.
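To put conditions (a) and (b) in perspective, here is a back-of-envelope sketch in Python. The rank size, failure rate and rebuild time are hypothetical numbers I picked for illustration, and the formula assumes drives fail independently at a constant rate. If failures really do bunch up as the papers suggest, this simple model understates the risk.

```python
def second_failure_probability(drives_in_rank, annual_failure_rate, rebuild_hours):
    """Chance that at least one surviving drive in the same rank fails
    during the rebuild window, assuming independent, constant-rate failures."""
    hours_per_year = 24 * 365
    p_one_drive = annual_failure_rate * rebuild_hours / hours_per_year
    survivors = drives_in_rank - 1
    return 1 - (1 - p_one_drive) ** survivors

# Hypothetical example: 8-drive RAID 5 rank, 3 percent AFR, 6-hour rebuild.
p = second_failure_probability(drives_in_rank=8,
                               annual_failure_rate=0.03,
                               rebuild_hours=6)
print(f"Chance of a second failure during rebuild: {p:.6f}")
```

With these made-up inputs the answer works out to roughly 1 in 7,000 per rebuild. Longer rebuilds and larger ranks push that number up.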
The human body replaces skin cells every day
So there are individual DDMs, manufactured by the suppliers above; disk systems, manufactured by IBM and others, and then your entire IT infrastructure. Beyond the disk system, you probably have redundant fabrics, clustered servers and multiple data paths, because eventually hardware fails.
People might realize that the human body replaces skin cells every day. Other cells are replaced frequently, within seven days, and others less frequently, taking a year or so to be replaced. I'm over 40 years old, but most of my cells are less than 9 years old. This is possible because information, data in the form of DNA, is moved from old cells to new cells, keeping the infrastructure (my body) alive.
Our clients should approach this in a more holistic view. You will replace disks in less than 3-5 years. While tape cartridges can retain their data for 20 years, most people change their tape drives every 7-9 years, and so tape data needs to be moved from old to new cartridges. Focus on your information, not individual DDMs.
What does this mean for DDM failures? When one happens, the disk system re-routes requests to a spare disk, rebuilding the data from RAID 5 parity, giving storage admins time to replace the failed unit. During the few hours this process takes, you are either taking a backup, or crossing your fingers. Note: for RAID 5 the time to rebuild is proportional to the number of disks in the rank, so smaller ranks can be rebuilt faster than larger ranks. To make matters worse, the slower RPM speeds and higher capacities of ATA disks mean that the rebuild process could take longer than for smaller capacity, higher speed FC/SCSI disks.
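Rebuilding from parity sounds like magic, but it is just XOR. Here is a minimal sketch in Python for a single stripe across three data drives and one parity block. The block contents are made up, and a real RAID 5 array rotates parity across all the drives, which this toy ignores.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks, plus the parity computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Drive 1 dies. XOR of every surviving block and the parity gives it back.
lost = 1
survivors = [block for i, block in enumerate(data) if i != lost] + [parity]
rebuilt = xor_blocks(survivors)
```

Every surviving drive in the rank has to be read to rebuild each stripe, which is why the rebuild time grows with the size of the rank.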
According to the Google study, a large portion of the DDM replacements had no SMART errors to warn that it was going to happen. To protect your infrastructure, you need to make sure you have current backups of all your data. IBM TotalStorage Productivity Center can help identify all the data that is "at risk", those files that have no backup, no copy, and no current backup since the file was most recently changed. A well-run shop keeps their "at risk" files below 3 percent.
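As a rough illustration of what "at risk" means, here is a small Python sketch that flags files with no backup at all, or whose last backup is older than their last change. The inventory and the `at_risk` function are hypothetical, and this is not how TotalStorage Productivity Center does its analysis.

```python
from datetime import datetime

# Hypothetical inventory: file name -> (last modified, last backup or None).
inventory = {
    "orders.db":   (datetime(2007, 2, 20, 9, 0),  datetime(2007, 2, 20, 23, 0)),
    "payroll.xls": (datetime(2007, 2, 21, 10, 0), datetime(2007, 2, 20, 23, 0)),
    "notes.txt":   (datetime(2007, 2, 19, 8, 0),  None),
    "logo.png":    (datetime(2006, 11, 1, 12, 0), datetime(2007, 2, 20, 23, 0)),
}

def at_risk(inventory):
    """Files never backed up, or changed since their most recent backup."""
    return sorted(
        name
        for name, (modified, backed_up) in inventory.items()
        if backed_up is None or backed_up < modified
    )

exposed = at_risk(inventory)
percent_at_risk = 100 * len(exposed) / len(inventory)
```

In this tiny example half the files are exposed, a long way from the 3 percent a well-run shop aims for.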
So, where does that leave us?
ATA drives are probably as reliable as FC/SCSI disk. Customers should choose which to use based on performance and workload characteristics. FC/SCSI drives are more expensive because they are designed to run at faster speeds, required by some enterprises for some workloads. IBM offers both, and has tools to help estimate which products are the best match to your requirements.
RAID 5 is just one of the many choices of trade-offs between cost and protection of data. For some data, JBOD might be enough. For other data that is more mission critical, you might choose to keep two or three copies. Data protection is more than just using RAID; you also need to consider point-in-time copies, synchronous or asynchronous disk mirroring, continuous data protection (CDP), and backup to tape media. IBM can help show you how.
Disk systems, and IT environments in general, are higher-level concepts to transcend the failures of individual components. DDM components will fail. Cache memory will fail. CPUs will fail. Choose a disk systems vendor that combines technologies in unique and innovative ways that take these possibilities into account, designed for no single point of failure, and no single point of repair.
So, Robin, from IBM's perspective, our hands are clean. Thank you for bringing this to our attention and for giving me the opportunity to highlight IBM's superiority at the systems level.
On the news today, they mentioned it was "Happy Pi Day". Today is the 14th day of the 3rd month, and "pi" is about 3.14159, the ratio of the circumference of a circle to its diameter. So, in Tucson it is celebrated on 3/14, at 1:59pm MST.
The ratio has a lot to do with storage.
Tape wrapped around a hub. Tape is thin, but it still has thickness, so wrapping hundreds of meters of tape changes the diameter of the spool. This impacts the rotational velocity needed to keep the linear meters-per-second on the tape media consistent as the diameter changes when you wind down from a full spindle toward the hub. IBM has variable speed motors and other clever technologies to handle this adjustment.
Disks spin at consistent speeds, but tracks on the outside edge travel faster across the head than the inside tracks. Currently, the top speed for disk is 15,000 revolutions per minute (RPM). As faster rotational speeds are investigated, researchers find they need to make the diameters smaller to compensate.
The diameters of disks were based on "U", the unit height of standard 19" racks. A "U" is 1.75 inches, and standard floppy diskettes were 5.25 inch (3U) and 3.5 inch (2U). For those who have a difficult time remembering how many inches a "U" is, it is roughly the thickness of a standard two-by-four (2x4) piece of lumber, which actually measures about 1.5 inches.
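Here is where "pi" shows up in the first two examples, as a short Python sketch. One turn moves pi times the diameter past the head, so linear speed is pi x diameter x revolutions per second. The tape speed and the reel and platter diameters below are made-up round numbers, not specifications of any IBM product.

```python
import math

def reel_rpm(tape_speed_m_per_s, diameter_m):
    """RPM a reel must turn so tape comes off at the given linear speed."""
    return tape_speed_m_per_s / (math.pi * diameter_m) * 60

def track_speed_m_per_s(rpm, diameter_m):
    """Linear speed of a disk track of the given diameter under the head."""
    return math.pi * diameter_m * rpm / 60

# Tape: hypothetical 5 m/s tape speed, reel winds down from 10 cm to 5 cm.
full_reel = reel_rpm(5.0, 0.10)
near_hub = reel_rpm(5.0, 0.05)

# Disk: 15,000 RPM, hypothetical 6.5 cm outer and 3 cm inner track diameters.
outer = track_speed_m_per_s(15000, 0.065)
inner = track_speed_m_per_s(15000, 0.030)

print(f"reel: {full_reel:.0f} RPM when full, {near_hub:.0f} RPM near the hub")
print(f"disk: {outer:.1f} m/s outer track, {inner:.1f} m/s inner track")
```

Halve the diameter of the reel and the motor has to spin twice as fast to keep the tape moving at the same speed. On the disk, the RPM is fixed, so the outer track flies past the head in proportion to its larger diameter.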
The value of "pi" has been calculated to over a billion significant digits. Here is a cute applet to use if you ever need the value to any level of accuracy.
Michael Scott, one of my "Second Life" builder/scripters, for demonstrating client-focused dedication to IBM's corporate values.
Our site manager, Terri Mitchell, did a recap of all our recent awards and accomplishments. Of the nine Design Innovation awards won by IBM this year at the CeBIT conference, eight were for IBM System Storage products!
The IBM System Storage EXP3000: an entry-level data storage server that is optimized for cost-sensitive and space-limited environments and employs a user-centered design that enables ease of use and simple tool-less installation and removal of all components.
The IBM System Storage N7000 Series: a modular disk storage system that delivers high-end enterprise storage and data management value ideal for large-scale applications, while helping to anticipate growth, maintain data availability and reduce costs.
The IBM System Storage N5000 Series: a modular disk storage system designed to address the entire spectrum of data availability challenges while offering value in price and scalability. Built-in enterprise serviceability and manageability features support efforts to increase reliability and simplify storage infrastructure and maintenance.
The IBM System Storage N3700: a filer that integrates storage and storage processing into a single unit, facilitating affordable network deployments.
The IBM System Storage DS4700: a NEBS-compliant disk storage server designed to address requirements for companies in the telecommunications industry, as well as other segments, such as oil and gas, meeting standards for electromagnetic compatibility, thermal robustness, and earthquake and office vibration resistance, and providing protection for the product components from airborne contaminants.
The IBM System Storage EXP810: a data storage expansion unit capable of 4.8 Terabytes of physical storage, with a user-centered and tool-less design featuring redundant power, cooling, and disk modules for ease of use and simple serviceability.
The IBM System Storage TS3400: an affordable, space-friendly tape library for users in remote locations that supports enterprise-class technology and encryption capabilities.
A representative from Tucson's Brewster Center presented Terri an award, thanking IBM for its strong support for the community through various charity initiatives.
The final speaker was a new IBM client, Tony Casella, the IT Director of the town of Marana. Recently, the town of Marana's selection of IBM products made big news. Arizona is the fastest growing state in the USA, and the town of Marana, just north of Tucson, is one of the fastest growing communities in Arizona. The town is growing so large that it will soon spill over from Pima into Pinal county, and will be the first town in Arizona authorized to span county boundaries.
In case you missed it, IBM unveiled a new digital video surveillance service yesterday. This "marks an important shift in the industry's approach to security, applying advanced analytics to video data and signaling the ability to converge physical and information technology (IT) security."
The IBM Smart Surveillance Solution is designed to provide the unique capability to carry out efficient data analysis of video sequences either in real time or from recordings. These recordings can be on disk or tape storage.
The problem with today's existing "analog" surveillance is that the analog cameras record onto traditional VHS tapes, and these are rotated through, re-written after a few hours or days. To review tapes often involves human intervention, and must be done before the VHS tapes are re-used. Many shoplifters, thieves, and other law-breakers take a chance that their actions will not be caught on tape, or that they will be long gone by the time the video is analyzed.
The IBM Smart Surveillance Solution can provide a number of advantages over traditional video solutions, including:
Real-time alerts that can help anticipate incidents by identifying suspicious behaviors.
Forensic capabilities are enhanced by utilizing unique indexing and attribute-based search of video events to classify objects into categories such as people and cars.
Situational awareness of the location, identity and activity of objects in a monitored space including license plate recognition and face capture.
With real-time analytics capabilities, the new DVS service can open up a wide array of new applications that go far beyond the traditional security aspects of surveillance systems. Early adopter industries in this rapidly evolving market include retail, public sector and financial services. The retail industry estimates nearly $50 billion is lost annually to fraud, theft and administrative errors.
Once in digital format, video surveillance can be sent further, processed quicker, and stored for longer periods of time than traditional media makes practical today.