Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
A lot of people ask me about IBM branding, as we have recently changed brands. In the past we had two separate brands, one for servers (eServer) and one for storage (TotalStorage). These would be fine if we wanted to promote their independence, but customers today want synergy between servers and storage, they want systems that work well together.
Last year, in response to market feedback, we crated a new brand, "IBM Systems" and put all the server and storage product lines under one roof. Over time, we will transition from TotalStorage to System Storage naming. This will occur with new products, and major versions of existing products.
Two other phrases you will hear in the names of our offerings are "Virtualization Engine" and "Express". These are portfolio identifiers. The Virtualization Engine identifier was created to emphasize our leadership in system virtualization, and we have products that span product lines with this identifier.
The Express identifier was created to emphasize our focus on Small and Medium sized business (SMB). It spans not just servers and storage, but across other offerings from other IBM divisions.
Of course, just renaming products and services isn't enough. Systems don't work together just because they have similar names, are covered in similar "Apple white" plastic, or have similar black bezels. Obviously, thoughtful and collaborative design are needed, with the appropriate amounts of engineering and testing. IBM is aligning its server and storage development so that the IBM Systems brand keeps its promise.
It has always been the case in fast pace technology areas that you can't tell the players without a program card, andthis is especially true for storage.
When analyzing each acquistion move, you need to think of what is driving it. What are the motives?Having been in the storage business 20 years now, and seen my share of acquisitions, both from within IBM,as well as competition, I have come up with the following list of motives.
Although slavery was abolished in the US back in the 1800's, and centuries earlier everywhere else, many acquisitionsseem to be focused on acquiring the people themselves, rather than the products or client list. I have seen statistics such as "We retained 98% of the people!" In reality, these retentions usually involve costly incentives,sign-in bonuses, stock options, and the like. Desptie this, people leave after a few years, often because ofpersonality or "corporate culture" clash. For example, many former STK employees seem to be leaving after their company was acquired by Sun Microsystems.
If you can't beat them, join them. Acquisitions can often be used by one company to raise its ranking in marketshare, eliminating smaller competitors. And now that you have acquired their client list, perhaps you can sellthem more of your original set of products!
Symantec had acquired Veritas, which in turn had acquired a variety of other smaller players, and the end result is that they are now #1 backup software provider, even though none of theirproducts holds a candle to IBM's Tivoli Storage Manager. Meanwhile, EMC acquired Avamar to try to get more into the backup/recovery game, but most analysts still find EMC down in the #4 or #5 place in this category.
Next month,Brocade's acquisition of McData should take effect, furthering its marketshare in SAN switch equipment.
Prior to my current role as "brand market strategist" for System Storage, I was a "portfolio manager" where wetried to make sure that our storage product line investments were balanced. This was a tough job, as the investmentshad to balance the right development investments into different technologies, including patent portfolios.Despite IBM's huge research budget, I am not surprised that some clever inventions of new technologies comefrom smaller companies, that then get acquired once their results appear viable.
The last motive is value shift. This is where companies try to re-invent themselves, or find that they are stuck in acommodity market rut, and wish to expand into more profitable areas.
LSI Logic acquisition of StoreAge is a good exampleof this. Most of the major storage vendors have already shifted to software and services to provide customer value,as predicted in 1990's by Clayton Christensen in his book "The Innovator's Dilemma". The rest are still strugglingto develop the right strategy, but leaning in this general direction.
A few weeks ago, my Tivo(R) digital video recorder (DVR) died. All of my digital clocks in my house were flashing 12:00 so I suspect it wasa power strike while I was at the office. The only other item to die was the surge protector,and so it did what it was supposed to do, give up its own life to protect the rest of myequipment. Although somehow, it did not protect my Tivo.
I opened a problem ticket with Sony, and they sent me instructions on how to send itover to another state to get it repaired.Amusingly, the instructions included "Please make a backup of the drive contents beforesending the unit in for repair." Excuse me? How am I supposed to do that, exactly?
My model has only a single 80GB drive, and so my friend and I removed the drive and attachedit to one of our other systems to see if anything was salvageable. It failed every diagnostictest. There was just not enough to read to be usable elsewhere.
This is typical of many home systems. They are not designed for robust usage, high availability, nor any form of backup/recovery process. Some of the newer models havetwo drives in a RAID-1 mode configuration, but most have many single points of failure.
And certainly, it is not mission critical data. Life goes on without the last few episodesof Jack Bauer on "24", or the various Food Network shows that I recorded for items I planto bake some day. For the past few weeks, I have spent more time listening to the radioand reading books. Somehow, even though my television runs fine without my Tivo, watchingTV in "real time" just isn't the same.
I suspect that if you gave someone a method to do the backup, most would not bother to useit. People are now relying more and more heavily on their home-basedinformation storage systems, digital music, video and cherished photographs. Perhaps experiencing a "loss" will help them appreciate backup/recovery systems so much more than they do today.
Continuing this week's theme on Business Continuity, I will use this post to discuss this week'sIBM solid state disk announcement.This new offering provides a new way to separate programs from data, to help minimizedowntime and outages normally associated with disk drive failures.
Until now, the method most people used to minimize the amount of data on internalstorage was to use disk-less servers with Boot-Over-SAN, however, not all operating systems, and not all disk systems, supported this.
Windows, however, is not supported, because of the small 4GB size and USB protocol limitations. For Windows, you would add a SAS drive, you boot from this hard drive, and use the 4GB Flash drive for data only.
So what's new this time? Here's a quick recap of July 17 announcement. For the IBM BladeCenter HS21 XM blade servers, new models of internal "disk" storage:
Single drive model
A single 15.8GB solid-state disk drive, based on SATA protocol. In addition to theLinux operating systems mentioned above, the capacity and SATA protocols allowsyou to boot 32-bit and 64-bit versions of Windows 2003 Server R2, with plans in placeto other platforms in the future, such as VMware. I am able to run my laptop Windows with only 15GB of C: drive, separating my data to a separate D: partition, so this appears to be a reasonable size.
Dual drive model
The dual drive fits in the space of a single 2.5-inch HDD drive bay.You can combine these in either RAID 0 or RAID 1 mode.
RAID 0 gives you a total of 31.6GB, but is riskier. If you lose either drive,you lose all your data. Michael Horowitz of Cnet covers the risks of RAID zerohere andhere.However, if you are just storing your operating system and application, easily re-loadable from CD or DVD in the case of loss, then perhaps that is a reasonable risk/benefit trade-off.
RAID 1 keeps the capacity at 15.8GB, but provides added protection. If you loseeither drive, the server keeps running on the surviving drive, allowing you to schedule repair actions when convenient and appropriate. This would be the configuration I would recommend for most applications.
Until recently, solid state storage was available at a price premium only. Flash prices have dropped 50% annually while capacities have doubled. This trend is expected to continue through 2009.
According to recent studies from Google and Carnegie Mellon, hard drives fail more oftenthan expected. By one account, conventional hard disk drives internal to the server account for as much as 20-50% of component replacements.IBM analysis indicates that the replacement rate of a solid state drive on a typical blade server configuration is only about 1% per year, vs. 3% or more mentionedin the these studies for traditional disk drives.
Flash drives use non-volatile memory instead of moving parts, so less likely to break down during high external environmental stress conditions, like vibration and shock, or extreme temperature ranges (-0C° to +70°C) that would make traditional hard disks prone to failure.This is especially important for our telecommunications clients, who are always looking for solutions that are NEBS Level 3 compliant.
As with any SATA drive, performance depends on workload.Solid state drives perform best as OS boot devices, taking only a few secondslonger to boot an OS than from a traditional 73GB SAS drive. Flash drives also excel in applications featuring random read workloads, such as web servers. For random and sequential write workloads, use SAS drives instead for higher levels of performance.
Part of IBM's Project Big Green, these flash drives are very energy efficient. Thanks to sophisticated power management software, the power requirement of the solid state drive can be 95 percent better than that of a traditional 73GB hard disk drive. These 15.8GB drives use only 2W per drive versus as much as 10W per 2.5” hard drive and 16W per 3.5” hard drive. The resulting power savings can be up to 1,512 watts per server rack, with 50% heat reduction.
So, even though this is not part of the System Storage product line, I am very excitedfor IBM. To find out if this will work in your environment, go to the IBM Server Provenwebsite that lists compatability with hardware, applications and middleware, or review the latest Configuration and Options Guide (COG).
I have created blog categories, based on our System Storage offering matrix, which you can track individually:
Disk systems, including the IBM System Storage DS Family of products, SAN Volume Controller, N series, as well as features unique to these products, such as FlashCopy, MetroMirror, or SnapLock. Tape
Tape systems, including the IBM System Storage TS Family of products, tape-related products in the Virtualization Engine portfolio, drives, libraries and even tape media.
Storage Networking offerings, from Brocade, McData, Cisco and others, such as switches, routers and directors.
Infrastructure management, including IBM TotalStorage Productivity Center software, IBM Tivoli Provisioning Manager, IBM Tivoli Intelligent Orchestrator, and IBM Tivoli Storage Process Manager.
Business Continuity, including IBM Tivoli Storage Manager, Tivoli CDP for Files, Productivity Center for Replication software component, Continuous Availability for Windows (CAW), Continuous Availability for AIX (CAA).
Lifecycle and Retention offerings, including our IBM System Storage DR550, DR550 Express, GPFS, Tivoli Storage Manager Space Management for UNIX, Tivoli Storage Manager HSM for Windows, and DFSMS.
Storage services, including consulting, assessments, design, deployment, management and outsourcing.
For those in the US, last friday, the day after Thanksgiving, marks the official start of the Holiday shopping season. This has been called [Black Friday] as some stores open as early as 4am in the morning, when it is still dark outside, to offer special discount prices. Some shoppers camp out in sleeping bags and lawn chairs in front of stores overnight to be the first to get in.
Not surprisingly, some folks don't care for this approach to shopping, and prefer instead shopping online. Since 2005, the Monday after Thanksgiving (yesterday) has been called [Cyber Monday].USA Today newspaper reports [Cyber Monday really clicks with customers]. Many of the major online shopping websites indicated a 37 percent increase in sales yesterday over last year's Cyber Monday.
On Deadline dispels the hype on both counts:[Cyber Monday: Don't Believe the Hype?"], indicating that Black Friday is not the peak shopping for bricks-and-mortar shops, andthat Cyber Monday is not the busiest online shopping day of the year, either.
A flood of new video and other Web content could overwhelm the Internet by 2010 unless backbone providers invest up to US$137 billion in new capacity, more than double what service providers plan to invest, according to the study, by Nemertes Research Group, an independent analysis firm. In North America alone, backbone investments of $42 billion to $55 billion will be needed in the next three to five years to keep up with demand, Nemertes said.
Internet users will create 161 exabytes of new data this year, and this exaflood is a positive development for Internet users and businesses, IIA says.
If the "161 Exabytes" figure sounds familiar, it is probably from the IDC Whitepaper [The Expanding Digital Universe] that estimated the 161 Exabytes created, captured or replicated in 2006 will increase six-fold to 988 Exabytes by the year 2010. This is not just video captured for YouTube by internet users, but also corporate data captured by employees, and all of the many replicated copies. The IDC whitepaper was based on an earlier University of California Berkeley's often-cited 2003[How Much Info?] study, which not only looked at magnetic storage (disk and tape), but also optical, film, print, and transmissions over the air like TV and Radio.
A key difference was that while UC Berkeley focused on newly created information, the IDC study focused on digitized versions of this information, and included theadded impact of replication.It is not unusual for a large corporate databases to be replicated many times over. This is done for business continuity, disaster recovery, decision support systems, data mining, application testing, and IT administrator training. Companies often also make two or three copies of backups or archives on tape or optical media, to storethem in separate locations.
Likewise, it should be no surprise that internet companies maintain multiple copies of data to improve performance.How fast a search engine can deliver a list of matches can be a competitive advantage. Content providers may offer the same information translated into several languages.Many people replicate their personal and corporate email onto their local hard drives, to improve access performance, as well as to work offline.
The big question is whether we can assume that an increased amount of information created, captured and replicated will have a direct linear relation to the growth of what is transmitted over the internet. Three fourths of the U.S. internet users watched an average of 158 minutes of online video in May 2007, is this also expected to grow six-fold by 2010? That would be fifteen hours a month, at current video densities, or more likely it would be the same 158 minutes but of much higher quality video.
On the other hand, much of what is transmitted is never stored, or stored for only very short periods of time.Some of these transmissions are live broadcasts, you are either their to watch and listen to them when they happen, or you are not. Online video games are a good example. The internet can be used to allow multiple players to participate in real time, but much of this is never stored long-term. An interesting feature of the Xbox 360 is to allow you to replay "highlight" videos of the game just played, but I do not know if these can be stored away or transferred to longer term storage.
Of course, there will always be people who will save whatever they can get their hands on. Wired Magazine has anarticle [Downloading Is a Packrat's Dream], explaining that many [traditional packrats] are now also "digital packrats", and this might account for some of this growth. If you think you might be a digital packrat,Zen Habits offers a [3-step Cure].
In any case, the trends for both increased storage demand, and increased transmission bandwidth requirements, are definitely being felt. Hopefully, the infrastructure required will be there when needed.
Continuing this week's theme of New Year's Resolutions for the data center, today we'll talk about one that many people make for their own personal lives: staying on a budget.
Often, when faced with a tightening budgets, we try to make more use of what we already have. Tell someone they are only using 10 percent of their brain, and they immediatelybelieve you; but tell them they are only using 30 percent of their storage, and they ask for a whitepaper,magazine article, or clarification on how that percentage is calculated. I actually visiteda customer that was only using6 percent of the storage attached to their Windows servers!
So, to help those of you making data center resolutions to stay on budget, the terms to remember are "Reduce", "Reuse" and "Recycle".
When people come to request storage, are they being reasonable about what they need today, or are they asking for what they might need over the next three years? They might need 50GB, but they ask for 100GB, in case they grow, and a year later, you find they have only 15GB of data on it. On the flipside, the person asks for what they need but some storage admins give out more, just so they don't have to be bothered so often when growth happens. Finally, I have seen this formalized into fixed size LUNs, all the disk is carved into big huge 100GB pieces, so if you need 20GB, here's one big enough with plenty of room to grow.
If you are going to keep on a budget, remember that storage today is 30% more expensive than storage next year. That is the average drop in both disk and tape on a dollar-per-MB basis. If there is any way to postpone giving out storage until it is actually needed, you can save a bundle of money. Timing is everything! In the event of a disaster, getting immediate replacement for disk can be very expensive, but if you can wait just two weeks, you can negotiate a better deal. I thought of this while going to the movie theatre yesterday. A "hot dog" and a bottle of water was $8.00, but if you are able to wait two hours and eat after the movie, you can get a much better meal for less.
A lot of companies buy new storage because their existing storage isn't fast enough, or doesn't have the latest copy services. This can easily be solved with an IBM SAN Volume Controller (SVC). The SVC can virtualize slower, functionless storage, and present to your application hosts virtual disks that are faster, and with all the latest disk-to-disk copy services like FlashCopy, Metro Mirror, and Global Mirror.
Chances are, you have unused disk capacity spread across all your storage today, but perhaps they are formatted into small LUNs. The SVC can combine the capacity, and let you carve up big LUNs at the sizes you need.This is like taking all those tiny pieces of soap in your shower and forming a new bar of soap, or taking all the crumbs at the bottom of your bread box, and making a new slice of bread. And, the virtual LUNs are dynamically expandable,so give out only the amount they need today, as it is simple to expand them to larger sizes later.
Of my 13 patents, the first will always be my favorite, on a function called "RECYCLE" for the Data Facility Storage Management Subsystem Hierarchical Storage Manager (DFSMShsm) product, which is now a component of the IBM z/OS operating system. Basically, tapes could contain hundreds or thousands of files, such as backup versions or archive copies, and these expired on different dates. As a result, a tape would be written100 percent full, and then over time, decrease in valid data to 80, 60, 40, 20 until it hit 0 percent. In some cases, a single filecould hold an entire tape hostage. RECYCLE was able to read the valid data off tapes that were perhaps less than 20 percent full, and consolidate them onto fewer tapes. As a result, a whole bunch of tapes could be returned to the scratch pool, and reused immediately for other workloads. This also helps in moving to newer, higher capacity cartridges, such as the new 700GB cartridge that IBM co-developed with FujiFilm.(This RECYCLE function exists in our IBM Tivoli Storage Manager software, as well as our Virtual Tape Server, but is called "reclamation" instead, to avoid confusion on searches.)
When evaluating your use of tape, determine if you are making best use of the tapes you have now, and perhaps a RECYCLE (or reclamation) scheme may be in order. Fewer tapes can save money in many ways, such as reduced storage costs, and reduced courier costs to send the tapes offsite. Tape media can still be 10-20 times less expensive than disk, based on full capacity.
The IBM Storage and Storage Networking Symposium in Las Vegas continues ...
N series and VMware
Jeff Barnett presented how VMware manages disk image files in its VMfs repository, and how N series offersa better alternative. Virtual machines can access N series volumes directly.
Business Continuity with System i
Allison Pate presented the various Business Continuity options for System i. Many customersuse internal storage for System i, but this then hampers Business Continuity efforts. Instead,you can have IBM System Storage DS8000 or DS6000 series disk systems provide disk mirroringbetween clustered systems.
There was a lot of interest in DR550, one of our many compliance storage solutions. Ron Henkhauspresented an overview of our DR550 and DR550 Express offerings. Unlike the competitive disk-onlysolutions, such as the EMC Centera, the DR550 allows you to attach an automated tape library, managing large amounts of fixed content data at a much lower cost point. It also has encryption, for both diskand tape data.
Open Systems Disk Management
Siebo Friesenborg presented the various steps needed to troubleshoot performance problemswith open systems, including the use of "iostat" on AIX systems as an example, and the stepsyou can take to make formal Service Level Agreements (SLA) between the IT department and thevarious lines of business.
IBM Encryption - TS1120 and LTO-4 encryption comparison
Tony Abete presented TS1120 and LTO-4 encryption techniques. Deploying encryption is more thanjust choosing a tape drive. There are a variety of factors involved, such as whether to managethe keys from the application, the operating system, or the library manager. You need policiesto decided when to encrypt tapes and when not to, generating your keys, storing them, and sharingthem with your business partners, suppliers and service providers with which you send tapes.
I can tell that many people are feeling like they are "drinking from a firehose".IBM's success in storage reaches out to so many different aspects of information management,a variety of industries, and disciplines as varied as regulatory compliance and medical imaging.
The IBM Storage and Storage Networking Symposium concludes today. As typical for manysuch conferences, it ended at noon, so that people can catch airline flights.
TS1120 Tape Encryption - Customer Experiences
Jonathan Barney had implemented many deployments of tape encryption, and shared hisexperiences at two customer locations.
The first company had decided to implement their EKM servers on dedicated 64-bitWindows servers. They had three sites, one in Chicago, Alphareta, and New York City,each with two EKM servers. Each library had a single TS3500 tape library, and pointedto four EKM servers, two local, and two remote.
The clever trick was managing the keystore. They decided that EKM-1 was their trustedsource, made all changes to that, and then copied it to the other five EKM servers.His team deployed one site at a time, which turned out to be ok, but he would notrecommend it. Better to design your complete solution, and make sure that all librariescan access all EKM servers.
This company decided to have a single key-label/key-pair for all three locations, but change it every 6 months. You have to keep the old keys for as long as you have tapesencrypted with those keys, perhaps 10-20 years.The customer found the IBM encryption implementation "elegant" and it can be easily replicated to a fourth site if needed.
The second company had both z/OS and Sun Solaris. Initially they planned to have botha hardware-based keystore on System z, and software-based keystore on Sun, but they realized that System z version was so much more secure and reliable, that it made nosense to have anything on the Sun Solaris platform.
On System z, they had two EKM images, and used VIPA to ensure load balancing fromthe library. Tapes written from z/OS used DFSMS Data Class to determine which tapesare encrypted and which aren't. All Tapes written from Sun Solaris were encryptied, written to a separate logical library partition of the TS3500, which in turn contactedthe System z for the EKM management to provide the keys to use for the encryption.
The "gotcha" for this case was that when they tested Disaster Recovery, they had torecover the two EKM servers first, before any other restores could take place, and thistook way too long. Instead, they developed a scaled-down 10-volume "rescue recovery" z/OS image that would contain the RACF database and all EKM related software to actas the keystore during a disaster recovery. Anytime they make updates, they only haveto dump 10 volumes to tape. Restore time is down to only 2 hours.
He gave this advice to deploy tape encryption:
Some third party z/OS security products, like Computer Associates Top Secret orACF2, require some PTFs to work with the EKM. The latest IBM RACF is good to go.
Getting IP support from IOS to OMVS requires IPL.
At one customer, an OMVS monitor software program killed the EKM because it wasn'tin their list of "acceptable Java programs". They updated the list and EKM ran fine.
DO not update EKM properties file while EKM is running. EKM keeps a lot of stuffin memory, and when it is recycled, copies this back to the EKM properties file, reversing any changes you may have done. It is best to shut down EKM, update theproperties file, then start up EKM back up again. This is why you should always haveat least two EKM servers for redundancy.
TSM for Linux on System z
Randy Larson from our Tivoli group presented this session.There is a lot of interest in deploying IBM Tivoli Storage Manager backup and archivesoftware on Linux for System z. Many customers are already invested in a mainframeinfrastructure, may have TSM for z/OS or z/VM, and want the newer features and functions that are available for TSM on Linux.
TSM has special support for Lotus Domino, Oracle, DB2 and WebSphere Application Servers.TSM clients can send backup data to a TSM server internally via Hipersockets, a virtualLAN feature on the System z platform that uses shared memory to emulate TCP/IP stack.
One of the big questions is whether to run Linux as guests under z/VM, or natively onLPAR. The general deployment is to carve an LPAR and run Linux natively untilyour server and storage administration staff have taken z/VM training classes. Oncetrained, they can easily move native LPAR images to z/VM guests. Unlike VMware that takesa hefty 40% overhead on x86 platforms to manage guests, z/VM only takes 5-10% overhead.
For the TSM database and disk storage pools, Randy recommends FC/SCSI disk, with ext3 file system, combined with LVM2 into logical volumes. ECKD disk and reiserfsworks too. Avoid use of z/VM minidisks. Under LVM2, consider 32KB stripes for the TSM database, and 256KB stripes for the disk storage pools. For multipathing, usefailover rather than multibus method. Read IC45459 before you activate "directio".
The TSM for Linux on z is very much like the TSM on AIX or Windows, and not like theTSM for z/OS. For tape, TSM for Linux on z does not support ESCON/FICON attached tape,you need to use FC/SCSI attached tape and tape libraries. TSM owns the library anddrives it uses, so give it a logical library partition separate from z/OS. ForSun/StorageTek customers, TSM works with or without the Gersham Enterprise Distrbu-Tape(EDT) software. Use the IBM-provided drivers for IBM tape. For non-IBM tape, TSM providessome drivers that you can use instead.
That wraps up my week. This was a great conference! If you missed it, look for the one in Montpelier, France this October. Check out the list of IBM Technical Conferencesto find others that might interest you.
Lagasse, Inc. sells janitorial supplies, such as mops, cleaning chemicals, waste receptacles, and garbage can liners. Of the 1000 employees of Lagasse nationwide, about 200 associates were located in New Orleans at their main Headquarters, primary customer care center, and primary IT computing center.
Amazingly, Lagasse did not have a formally documented BCP (Business Continuity Plan) but more of aBCI (Business Continuity Idea). They chose to take a ["donut tire"] approach, putting older previous-generation equipment at their DR site. They knew that in the event of a disaster,they would not be processing as many transactions per second. That was a business trade-offthey could accept.
Evaluating all the different threat scenarios for impact and likelihood, and focused on hurricanes and floods.They had experienced previous hurricanes, learning from each,with the most recent being 2004 Hurricane Ivan and 2005 Hurricane Dennis. From this, they wereable to categorize three levels of DR recovery:
Tier 1 - The most mission-critical, which for them related to picking, packing and shipping products.
Tier 2 - The next most important, focused on maintaining good customer service
Tier 3 - Everything else, including reporting and administrative functions
The time-line of events went as follows:
The US Government issues warning that a hurricane may hit New Orleans
August 27 - 7pm
Lagasse declares a disaster, starts recovery procedures to an existing IT facility in Chicago, owned by their parent company. A temporary "Southeast" Headquarters were set up in Atlanta.Remote call centers were identified in Dallas, Atlanta, San Antonio, and Miami.
August 28 - just after midnight
In just five hours, they recovered their "Tier 1" applications.
August 28 - 7:30pm
In just over 24 hours, they recovered their "Tier 2" applications.
August 29 - 6am
The Hurricane hits land. With 73 levees breached, the city of New Orleans was flooded.
The following week
Lagasse was fully operational, and recorded their second and third best sales days ever.
I was quite impressed with their company's policy for how they treat their employees during a disaster. For many companies, people during a disaster prioritize on their families, not their jobs.If any associate was asked to work during a disaster, the company would take care of:
The safety of their family
The safety of their pets. (In the weeks following this hurricane, I sponsored people in Tucson to go to New Orleans to attend to lost and stray dogs and cats, many of which were left behind when rescuers picked up people from their rooftops.)
Any emergency repairs to secure the home they leave behind
Marshall felt that if you don't know the names of the spouse and kids of your key employees, you are not emotionally-invested enough to be successful during a disaster.
For communications, cell phones were useless. They could call out on them, but anyone with acell phone with 504 area code had difficulty receiving calls, as the calls had to be processedthrough New Orleans. Instead, they used Voice over IP (VoIP) to redirect calls to whichever remote call center each associate went to. Laptops, Citrix, VPN and email were considered powerful tools during this process. They did not have Instant Messaging (IM) at the time.
While the disk and tapes needed to recover Tiers 1 and 2 were already in Chicago, the tapes for Tier 3 were stored locally by a third-party provider. When Lagasse asked for thier DR tapes back, the third-party refused, based on their [force majeure] clause. Force majeure is a common clause in many business contracts to free parties from liabilityduring major disasters.Marshall advised everyone to strike out any "force majeure" clauses out of any future third-party DR protection contracts.
Hurricane Katrina hit the US hard, killing over 1400 people, and America still has not fully recovered. The recovery of thecity of New Orleans has been slow. Massive relocations has caused a deficit of talent inthe area, not just IT talent, but also in the areas of medicine, education and other professions. The result has been degraded social services, encouraging others to relocate as well. Some have called it the "liberation effect", a major event that causespeople to move to a new location or take on a new career in a different field.
On a personal note, I was in New Orleans for a conference the week prior to landfall, and helped clients with their recoveries the weeks after. For more on how IBM Business Continuity Recovery Services (BCRS) helped clients during Hurricane Katrina, see the following [media coverage].
This wraps up my week in Las Vegas for the 27th Annual [Data Center Conference]. This conference follows the common approach of ending at noon on Friday, so that attendees can get home to their families for the weekend, or start their weekend in Las Vegas early to watch the 50th annual Wrangler National Finals Rodeo.
I attended the last few sessions. Here is my recap:
Where, When and Why do I need a Solid-State Drive?
The internet provides transport of digital data between any devices. All other uses have evolved from this aim. Increasing data storage on any node on the Web therefore increases the possibilities at every other point. We are just now beginning to recognize the implications of this. The two speakers co-presented this session to cover how Solid State Disk (SSD) may participate.
Some electronic surveys of the audience provided some insight. Only 12 percent are deploying SSD now. 59 percent are evaluating the technology. A whopping 89 percent did not understand SSD technology, or how it would apply to their data center. Here is the expected time linefor SSD adoption:
17 percent - within 1 year
60 percent - around 3 years from now
21 percent - 5 years or later
The main reasons cited for adopting SSD were increasing IOPS, reducing power and floorspace requirements, and expanding global networks. Here's a side-by-side comparison between HDD and SSD:
Disk array with 120 HDD, 73GB drives
Disk array with 120 SSD, 32GB drives
Per 73GB drive
Per 32GB drive
100MB/sec per drive
Read 250 MB/sec per drive Write 170 MB/sec per drive
300 IOPS per drive
35,000 IOPS per drive
12 Watts per drive
2.4 Watts per drive
However, the cost-per-GB for SSD is still 25x over traditional spinning disk, andthe analysts expected SSD to continue to be 10-20x for a while. For now, they estimatethat SSD will be mostly found in blade servers, enterprise-class disk systems, andhigh-end network directors.
The speakers gave examples such as Sun's ZFS Hybrid, and other products from NetApp,Compellent, Rackable, Violin, and Verari Systems.
Taking fear out of IT Disaster Recovery Exercises
The analyst presented best practices for disaster recovery testing with a "Pay Now or Pay Later"pre-emptive approach. Here were some of the suggestions:
Schedule adequate time for DR exercises
Build DR considerations into change control procedures and project lifecycle planning
Document interdependencies between applications and business processes
Bring in the "crisis team" on even the smallest incidents to keep skill sharp
Present the "State of Disaster Recovery" to Senior Management annually
The speaker gave examples of different "tiers" for recovery, with appropriate RPO and RTOlevels, and how often these should be tested per year. A survey of the audience found that70 percent already have a tiered recovery approach.
In addition to IT staff, you might want to consider inviting others to the DR exerciseas reviewers for oversight, including: Line of Business folks, Facilities/Operations, Human Resources, Legal/Compliance officers, even members of government agencies.
DR exercises can be performed at a variety of scope and objectives:
Tabletop Test - IBM calls these "walk-throughs", where people merely sit around the table and discuss what actions they would take in the event of a hypothetical scenario. This is a good way to explore all kinds of scenarios from power outages, denial of service attacks, or pandemic diseases.
Checklist Review - Here a physical inventory is taken of all the equipment needed at the DR site.
Stand-alone Test - Sometimes called a "component test" or "unit test", a single application is recovered and tested.
End-to-End simulation - All applications for a business process are recovered for a full simulation.
Full Rehearsal - Business is suspended to perform this over a weekend.
Production Cut-Over - If you are moving data center locations, this is a good time to consider testing some procedures. Other times, production is cut-over for a week over to the DR site and then returned back to the primary site.
Mock Disaster - Management calls this unexpectedly to the IT staff, certain IT staff are told to participate, and others are told not to. This helps to identify critical resources, how well procedures are documented, and members of the team are adequately cross-trained.
For exercise, set the appropriate scope and objectives, score the results, and then identifyaction plans to address the gaps uncovered. Scoring can be as simple as "Not addressed","Needs Improvement" and "Met Criteria".
Full Speed Ahead for iSCSI
The analyst presented this final session of the conference. He recognized IBM's early leadership in this area back in 1999, with the IP200i disk system. Today, there are many storage vendors that provide iSCSI solutions, the top three being:
23 percent - Dell/EqualLogic
15 percent - EMC
14 percent - HP/LeftHand Networks
This protocol has been mostly adopted for Windows, Linux and VMware, but has been largelyignored by the UNIX community. The primary value proposition is to offer SAN-like functionality at lower cost. When using the existing NICs that come built-in on most servers, iSCSI canbe 30-50 percent less expensive than FC-based SANs. Even if you install TCP-Offload-Engine (TOE) cards into the servers, iSCSI can still represent a 16-19 percent cost savings. ManyIBM servers now have TOE functionality built-in.
Since lower costs are the primary motivator, most iSCSI deployments are on 1GbE. The new10Gbps Ethernet is still too expensive for most iSCSI configurations. For servers runninga single application, 2 1GbE NICs is sufficient. For servers running virtualization with multiple workloads might need 4 or 5 NICs (1GbE), or consider 2 10GbE NICs if 10Gbps is available.
The iSCSI protocol has been most successful for small and medium sized businesses (SMB) lookingfor one-stop shopping. Buying iSCSI storage from the same vendor as your servers makes a lot of sense: EqualLogic with Dell servers, LeftHand software with HP servers, and IBM's DS3300 or N series with IBM System x servers.The average iSCSI unit was 10TB for about $24,000 US dollars.
Security and Management software for iSCSI is not as fully developed as for FC-based SANs.For this reason, most network vendors suggest having IP SANs isolated from your regular LAN.If that is not possible, consider VPN or encryption to provide added security.Issues of security and management imply that iSCSI won't dominate the large enteprise data center. Instead, many arewatching closely the adoption of Fibre Channel over Ethernet (FCoE), based on revised standardsfor 10Gbps Ethernet. FCoE standards probably won't be finalized till mid-2009, with productsfrom major vendors by 2010, and perhaps taking as much as 10 percent marketshare by 2011.
I hope you have enjoyed this series of posts. In addition to the sessions I attended, theconference has provided me with 67 presentations for me to review. Those who attended couldpurchase all the audio recordings and proceedings of every session for $295 US dollars, and those who missed the event can purchase these for $595 US dollars. These are reasonable prices, when you realize that the average Las Vegas visitor spends 13.9 hours gambling, losing an average of $626 US dollars per visit. The audio recordings and proceedings can provide more than 13.9 hours of excitement for less money!
I am back at "the Office" for a single day today. This happens often enough I need a name for it.Air Force pilots that practice landing and take-offs call them "Touch and Go", but I think I needsomething better. If you can think of a better phrase, let me know.
This week, I was in Hartford, CT, Somers, NY and our Corporate Headquarters in Armonk, in a varietyof meetings, some with editors of magazines, others with IBMers I have only spoken to over the phone andfinally got a chance to meet face to face.
I got back to Tucson last night, had meetings this morning in Second Life, then presented "InformationLifecycle Management" in Spanish to a group of customers from Mexico, Chile, and Brazil. We have a great Tucson Executive Briefing Center, and plenty of foreign-language speakers to draw from our localemployees here at the lab site.
Sunday, I leave for Las Vegas for our upcoming IBM Storage and Storage Networking Symposium. We will cover the latest in our disk, tape, storage networking and related software.Do you have your tickets? If you plan to attend, and want to meet up with me, let me know.
Continuing my week's theme on travel, conferences, and Japan, I saw two items in the newsthat seem to follow a common theme.
According to the "The Daily Yomiuri", a local Japanese paper, "double happy weddings" arebecoming more and more popular in Japan. These would be called "stotgun" weddings in the US, butin Japan, couples pay extra to have a wedding between the fifth and seventh month ofpregnancy. As Dave Barry would say, I am not making this up. 27% of couples in Japan got married while or after pregnant. The logic is that they can celebrate both events with one ceremony. Many couples believe that the primary purpose of marriage is to have children, and somethat fail to have children suffer terrible anguish or divorce. Waiting untilbeing pregnant helps ensure the couple will be "successful" in this regard.
IBM acquires Softek, a software company that develops a product called Transparent Data MoverFacility (TDMF) to move mainframe data from one disk system to another, while applicationsare running. This can be used, for example, to move data from outdated disk systems to IBMdisk systems. This is not to be confused with IBM's archive and retention software partner,Princeton Softech.
Softek is the software spin-off of Fujitsu (a Japanese computer hardware manufacturer). Fora while, Fujitsu made IBM-compatible mainframe servers, but was not successful at developingits own system software, relying heavily on IBM for this. Unable to compete against IBM, it stoppedmaking mainframe servers, but continues making other kinds of hardware equipment.
With TDMF, the process of moving data is simple. The software runs on z/OS and intercepts all writes intendedfor a source volumes on the old array, and re-directs a copy to destination volumes on the new device.Systems can run with old and new equipment side by side for a few weeks, with the new devicestaying in-sync with the old. When the client is ready to cross over, the systems arepointed to the new disk, and the old disk systems are detached and removed from the sysplex.
Afraid that installing TDMF will mess with your applications? IBM Global Technology Services (GTS)is able to roll-in a separate mainframe, move the data, than disconnect it along with the old storage.
(For customers running Linux, UNIX or Windows on other platforms, IBM offers SAN Volume Controller (SVC).While SVC is not marketed as a "data migration device", per se, it does have this capability.Many clients were able to cost-justify purchase of an SVCto move data from old storage to new in similar fashion to how TDMF works on the mainframe.)
What do these stories have to do with one another, other than both relating to Japan? IBM has beenusing TDMF for years as part of a service offering to move data from one disk system to another.Since Sam Palmiasano took over in 2002, IBM has acquired 51 companies, 31 of them software companies.Often, these have been "successful" turning quickly profitable because IBM was already well familiar with the companies they acquire, in much the same way that husbandsare well familiar with their brides-to-be at a "double happy wedding".
So, welcome Softek! It looks like its time to celebrate again!
On Tuesday, I covered much of the Feb 26 announcements, but left the IBM System Storage DS8000 for today so that it can haveits own special focus.
Many of the enhancements relate to z/OS Global Mirror, which we formerly called eXtended Remote Copy or "XRC", not to be confused with our "regular" Global Mirror that applies to all data. For those not familiar with z/OS Global Mirror, here is how it works. The production mainframe writes updates to the DS8000, and the DS8000 keeps track of these in cache until a "reader" can pull them over to the secondary location.The "reader" is called System Data Mover (SDM) which runs in its own address space under z/OS operating system. Thanks to some work my team did several years ago, z/OS Global Mirror was able to extend beyond z/OS volumes and include Linux on System z data. Linux on System z can use a "Compatible Disk Layout" (CDL) format (now the default) that meetsall the requirements to be included in the copy session.
IBM has over 300 deployments of z/OS Global Mirror, mostly banks, brokerages and insurance companies. The feature can keep tens of thousands of volumes in one big "consistency group" and asynchronously mirror them to any distance on the planet, with the secondary copy recovery point objective (RPO) only a few seconds behind the primary.
Extended Distance FICON
Extended Distance FICON is an enhancement to the industry-standard FICON architecture (FC-SB-3) that can help avoid degradation of performance at extended distances by implementing a new protocol for "persistent" Information Unit (IU) pacing. This deals with the number of packets in flight between servers and storage separated by long distances, andcan keep a link fully utilized at 4Gpbs FICON up to 50 kilometers. This is particularly important for z/OS GlobalMirror "reader" System Data Mover (SDM). By having many "reads" in flight, this enhancementcan help reduce the need for spoofing or channel-extender equipment, or allow you to choose lower-costchannel extenders based on "frame-forwarding" technology. All of this helps reduce your total cost of ownership (TCO)for a complete end-to-end solution.
This feature will be available in March as a no-charge update to the DS8000 microcode.For more details, see the [IBM Press Release]
z/OS Global Mirror process offload to zIIP processors
To understand this one, you need to understand the different "specialty engines" available on the System z.
On distributed systems where you run a single application on a single piece of server hardware, you mightpay "per server", "per processor" or lately "per core" for dual-core and quad-core processors. Software vendors were looking for a way to charge smaller companies less, and larger companies more. However, you might end up paying the same whether you use 1GHz Intelor 4GHz Intel processor, even though the latter can do four times more work per unit time.
The mainframe has a few processors for hundreds or thousands of business applications.In the beginning, all engines on a mainframe were general-purpose "Central Processor" or CP engines. Based on theircycle rate, IBM was able to publish the number of Million Instructions per Second (MIPS) that a machine witha given number of CP engines can do. With the introduction of side co-processors, this was changed to "Millionsof Service Units" or MSU. Software licensing can charge per MSU, and this allows applications running in aslittle as one percent of a processor to get appropriately charged.
One of the first specialty engines was the IFL, the "Integrated Facility for Linux". This was a CP designatedto only run z/VM and Linux on the mainframe. You could "buy" an IFL on your mainframe much cheaper than a CP,and none of your z/OS application software would count it in the MSU calculations because z/OS can't run on theIFL. This made it very practical to run new Linux workloads.
In 2004, IBM introduced "z Application Assist Processor" (zAAP) engines to run Java, and in 2006, the "z Integrated Information Processor" (zIIP) engines to run database and background data movement activities.By not having these counted in the MSU number for business applications, it greatly reduced the cost for mainframe software.
Tuesday's announcement is that the SDM "reader" will now run in a zIIP engine, reducing the costs for applicationsthat run on that machine. Note that the CP, IFL, zAAP and zIIP engines are all identical cores. The z10 EC hasup to 64 of these (16 quad-core) and you can designate any core as any of these engine types.
Faster z/OS Global Mirror Incremental Resync
One way to set up a 3-site disaster recovery protection is to have your production synchronously mirrored to a second site nearby, and at the same time asynchronously mirrored to a remote location. On the System z,you can have site "A" using synchronous IBM System Storage Metro Mirror over to nearby site "B", and alsohave site "A" sending data over to size "C" using z/OS Global Mirror. This is called "Metro z/OS Global Mirror"or "MzGM" for short.
In the past, if the disk in site A failed, you would switch over to site B, and then send all the data all over again. This is because site B was not tracking what the SDM reader had or had not yet processed.With Tuesday's announcement, IBM has developed an "incremental resync" where site B figures out what theincremental delta is to connect to the z/OS Global Mirror at site "C", and this is 95% faster than sendingall the data over.
IBM Basic HyperSwap for z/OS
What if you are sending all of your data from one location to another, and one disk system fails? Do you declare a disaster and switch over entirely? With HyperSwap, you only switch over the disk systems, but leave therest of the servers alone. In the past, this involved hiring IBM Global Technology Services to implementa Geographically Dispersed Parallel Sysplex (GDPS) with software that monitors the situation and updates thez/OS operating system when a HyperSwap had occurred. All application I/O that were writing to the primary locationare automatically re-routed to the disks at the secondary location. HyperSwap can do this for all the disk systems involved,allowing applications at the primary location to continue running uninterrupted.
HyperSwap is a very popular feature, but not everyone has implemented the advanced GDPS capabilities.To address this, IBM now offers "Basic HyperSwap", which is actually going to be shipped as IBMTotalStorage Productivity Center for Replication Basic Edition for System z. This will run in a z/OSaddress space, and use either the DB2 RDBMS you already have, or provide you Apache Derby database for thosefew out there who don't have DB2 on their mainframe already.
Update: There has been some confusion on this last point, so let me explain the keydifferences between the different levels of service:
Basic HyperSwap: single-site high availability for the disk systems only
GDPS/PPRC HyperSwap Manager: single- or multi-site high availability for the disk systems, plus some entry-level disaster recovery capability
GDPS/PPRC: highly automated end-to-end disaster recovery solution for servers, storage and networks
I apologize to all my colleagues who thought I implied that Basic HyperSwap was a full replacement for the morefull-function GDPS service offerings.
Extended Address Volumes (EAV)
Up until now, the largest volume you could have was only 54 GB in size, and many customers still are using 3 GB and 9 GB volume sizes. Now, IBM will introduce 223 GB volumes. You can have any kind of data set on these volumes,but only VSAM data sets can reside on cylinders beyond the first 65,280. That is because many applications still thinkthat 65,280 is the largest cylinder number you can have.
This is important because a mainframe, or a set of mainframes clustered together, can only have about 60,000disk volumes total. The 60,000 is actually the Unit Control Block (UCB) limit, and besides disk volumes, youcan have "virtual" PAVs that serve as an alias to existing volumes to provide concurrent access.
Aside from the first item, the Extended Distance FICON, the other enhancements are "preview announcements" which means that IBM has not yet worked out the final details of price, packaging or delivery date. In many cases, the work is done, has been tested in our labs, or running beta in select client locations, but for completeness I am required to make the following disclaimer:
All statements regarding IBM's plans, directions, and intent are subject to change or withdrawal without notice. Availability, prices, ordering information, and terms and conditions will be provided when the product is announced for general availability.
I am always amused in the manner the IT industry tries to solve problems. Take, for example, theprocess of backups. The simplest approach is to backup everything, and keep "n" versions of that.Simple enough for a small customer who has only a handful of machines, but does not scale well. Inmy post [Times a Million],I coined the phrase "laptop mentality", referring to people's inability to think through solutions in large scale.
Apparently, I am not alone.Steve Duplessie (ESG) wrote in his post[Random Thoughts]:
"I may even get to stop yelling at people to stop doing full backups every week on non-changing data (which is 80 %+) just because that's how they used to do it. They won't have a choice. You can't back up 5X your current data the way you do (or don't) today."
Hu Yoshida (HDS) does a great job explaining that thereare three ways to perform deduplication for backups:
Pre-processing. Have the backup software not backup unchanged data.
Inline processing. Have an index to filter the output of the backup as it sends data to storage.
Post-processing. Have the receiving storage detect duplicates and handle them accordingly.
"A full backup of 1TB data base tablespace is taken on day one. The next day another full backup is taken and only 2GB of that backup has any changes.
Using traditional full backup approaches after 2 nights, the backup capacity required is 2 x 1TB = 2TB
One method of calculating de-duplication ratios could yield a low ratio:
Total de-duplicated backup capacity used = 1TB + 2GB = 1.002TB
If the de-duplication ratio compares the amount of total physical storage used to the total amount that would have been used by traditional backup methods, the ratio = 2TB / 1.002TB = approximately 2:1
Another method of calculating de-duplication ratios could yield a high ratio:
Total de-duplicated backup capacity used still = 1.002TB
If the de-duplication ratio compares the amount of data stored in the most recent (second) backup to the amount that would have been used by traditional backup methods, the ratio 1TB / 2GB = 1000GB / 2GB = 500:1"
While IBM also offers deduplication in the IBM System Storage N series disk systems, I find that for backup, itis often more effective to apply best practices via IBM Tivoli Storage Manager (TSM). Let's take a look at some:
Exclude Operating System files
Why take full backups of your operating system every day? Yes, deduplication will find a lot to reduce fromthis, but best practices would exclude these. TSM has an include/exclude list, and the default version excludesall the operating system files that would be recovered from "bare machine recovery" or "new system install"procedures. Often, if the replacement machine has different gear inside, your OS backups aren't what you need,and a fresh OS install may determine this and install different drivers or different settings.
Exclude Application programs
Again, yes if there are several machines running the same application, you probably have opportunity for deduplication. However, unless you match these up with the appropriate registry or settings buried down in theoperating system, recovering just application program files may render an unusable system. Applications are bestinstalled from a common source that are either "pushed" through software distribution, or "pulled" from an application installation space.
If you have TB-sized databases, and are only doing full backups daily to protect it, have I got a solution for you.IBM and others have software that are "application-aware" and "database-aware" enough to determine what haschanged since the last backup and copy only that delta. Taking advantage of the TSM Application ProgrammingInterface (API) allows for both IBM and third party tools to take these delta backups correctly.
Which leaves us with user files, which are often unique enough on their own from the files of other users,that would not benefit from file-level deduplication. Backing up changed data only, as TSM does with its patented ["progressive incremental backup"] method, generally gets most of the benefits described by deduplication, without having to purchase storage hardware features.
Of course, if two or more users have identical files, the question might be why these are not stored on acommon file share. NAS file share repositories can greatly reduce each user keeping their own set of duplicates.It is interesting that some block-oriented deduplication,such as that found in the IBM System Storage N series, can get some benefit because some user files are oftenderivatives of other files, and there might be some 4 KB blocks of data in common.
Last November, I visited a customer in Canada. All of their problems were a direct result of taking full backupsevery weekend. It put a strain on their network; it used up too many disk and tape resources; and it took too long tocomplete. They asked about virtual tape libraries, deduplication, and anything else that could help them. The answer was simple: switch to IBM Tivoli Storage Manager and apply best practices.
One of the differences between IBM and the other storage vendors is that IBM is also in the business of middleware, application-aware backup software, and advanced copy services. This allows IBM to put togethersolutions that work to address specific challenges for our clients.
IBM has written a whitepaper on a cleverVSS Snapshot Backup for Exchange using IBM Tivoli Storage Manager and the point-in-time copy capabilities of IBM System Storage disk systems.
A problem in the past was that each vendor's point-in-time copy method had its own unique proprietary interface.Microsoft Developed Volume Shadow Copy Services (VSS) as a common interface front-end to resolve this concern.IBM Tivoli Storage Manager for Mail can invoke standard VSS interfaces, and this in turn can invoke FlashCopyon the IBM System Storage SAN Volume Controller, DS8000 series, or DS6000 series disk system.
You might be thinking: Wouldn't it have been less effort to just have TSM for Mail invoke IBM proprietary interfaces,rather than having to put full VSS support into TSM for mail, and then full VSS support into IBM's various disksystems? Perhaps, but IBM doesn't decide to do things because it is the cheapest way, we focus on what is theright way, and in this case, customers now have more choices, then can use TSM for Mail with IBM or non-IBM disksystems that support the VSS interface, and IBM disk systems can be employed into other uses for VSS snapshot.
Of course, we would like our clients to consider both TSM and IBM System Storage disk systems for a combined solution,not because they are required to make the solution work, but because both are best-of-breed, and whitepapers likethis show how they can provide synergy working together.
Well, this week I am in Maryland, just outside of Washington DC. It's a bit cold here.
Robin Harris over at StorageMojo put out this Open Letter to Seagate, Hitachi GST, EMC, HP, NetApp, IBM and Sun about the results of two academic papers, one from Google, and another from Carnegie Mellon University (CMU). The papers imply that the disk drive module (DDM) manufacturers have perhaps misrepresented their reliability estimates, and asks major vendors to respond. So far, NetAppand EMC have responded.
I will not bother to re-iterate or repeat what others have said already, but make just a few points. Robin, you are free to consider this "my" official response if you like to post it on your blog, or point to mine, whatever is easier for you. Given that IBM no longer manufacturers the DDMs we use inside our disk systems, there may not be any reason for a more formal response.
Coke and Pepsi buy sugar, Nutrasweet and Splenda from the same sources
Somehow, this doesn't surprise anyone. Coke and Pepsi don't own their own sugar cane fields, and even their bottlers are separate companies. Their job is to assemble the components using super-secret recipes to make something that tastes good.
IBM, EMC and NetApp don't make DDMs that are mentioned in either academic study. Different IBM storage systems uses one or more of the following DDM suppliers:
Seagate (including Maxstor they acquired)
Hitachi Global Storage Technologies, HGST (former IBM division sold off to Hitachi)
In the past, corporations like IBM was very "vertically-integrated", making every component of every system delivered.IBM was the first to bring disk systems to market, and led the major enhancements that exist in nearly all disk drives manufactured today. Today, however, our value-add is to take standard components, and use our super-secret recipe to make something that provides unique value to the marketplace. Not surprisingly, EMC, HP, Sun and NetApp also don't make their own DDMs. Hitachi is perhaps the last major disk systems vendor that also has a DDM manufacturing division.
So, my point is that disk systems are the next layer up. Everyone knows that individual components fail. Unlike CPUs or Memory, disks actually have moving parts, so you would expect them to fail more often compared to just "chips".
If you don't feel the MTBF or AFR estimates posted by these suppliers are valid, go after them, not the disk systems vendors that use their supplies. While IBM does qualify DDM suppliers for each purpose, we are basically purchasing them from the same major vendors as all of our competitors. I suspect you won't get much more than the responses you posted from Seagate and HGST.
American car owners replace their cars every 59 months
According to a frequently cited auto market research firm, the average time before the original owner transfers their vehicle -- purchased or leased -- is currently 59 months.Both studies mention that customers have a different "definition" of failure than manufacturers, and often replace the drives before they are completely kaput. The same is true for cars. Americans give various reasons why they trade in their less-than-five-year cars for newer models. Disk technologies advance at a faster pace, so it makes sense to change drives for other business reasons, for speed and capacity improvements, lower power consumption, and so on.
The CMU study indicated that 43 percent of drives were replaced before they were completely dead.So, if General Motors estimated their cars lasted 9 years, and Toyota estimated 11 years, people still replace them sooner, for other reasons.
At IBM, we remind people that "data outlives the media". True for disk, and true for tape. Neither is "permanent storage", but rather a temporary resting point until the data is transferred to the next media. For this reason, IBM is focused on solutions and disk systems that plan for this inevitable migration process. IBM System Storage SAN Volume Controller is able to move active data from one disk system to another; IBM Tivoli Storage Manager is able to move backup copies from one tape to another; and IBM System Storage DR550 is able to move archive copies from disk and tape to newer disk and tape.
If you had only one car, then having that one and only vehicle die could be quite disrupting. However, companies that have fleet cars, like Hertz Car Rentals, don't wait for their cars to completely stop running either, they replace them well before that happens. For a large company with a large fleet of cars, regularly scheduled replacement is just part of doing business.
This brings us to the subject of RAID. No question that RAID 5 provides better reliability than having just a bunch of disks (JBOD). Certainly, three copies of data across separate disks, a variation of RAID 1, will provide even more protection, but for a price.
Robin mentions the "Auto-correlation" effect. Disk failures bunch up, so one recent failure might mean another DDM, somewhere in the environment, will probably fail soon also. For it to make a difference, it would (a) have to be a DDM in the same RAID 5 rank, and (b) have to occur during the time the first drive is being rebuilt to a spare volume.
The human body replaces skin cells every day
So there are individual DDMs, manufactured by the suppliers above; disk systems, manufactured by IBM and others, and then your entire IT infrastructure. Beyond the disk system, you probably have redundant fabrics, clustered servers and multiple data paths, because eventually hardware fails.
People might realize that the human body replaces skin cells every day. Other cells are replaced frequently, within seven days, and others less frequently, taking a year or so to be replaced. I'm over 40 years old, but most of my cells are less than 9 years old. This is possible because information, data in the form of DNA, is moved from old cells to new cells, keeping the infrastructure (my body) alive.
Our clients should approach this in a more holistic view. You will replace disks in less than 3-5 years. While tape cartridges can retain their data for 20 years, most people change their tape drives every 7-9 years, and so tape data needs to be moved from old to new cartridges. Focus on your information, not individual DDMs.
What does this mean for DDM failures. When it happens, the disk system re-routes requests to a spare disk, rebuilding the data from RAID 5 parity, giving storage admins time to replace the failed unit. During the few hours this process takes place, you are either taking a backup, or crossing your fingers.Note: for RAID5 the time to rebuild is proportional to the number of disks in the rank, so smaller ranks can be rebuilt faster than larger ranks. To make matters worse, the slower RPM speeds and higher capacities of ATA disks means that the rebuild process could take longer than smaller capacity, higher speed FC/SCSI disk.
According to the Google study, a large portion of the DDM replacements had no SMART errors to warn that it was going to happen. To protect your infrastructure, you need to make sure you have current backups of all your data. IBM TotalStorage Productivity Center can help identify all the data that is "at risk", those files that have no backup, no copy, and no current backup since the file was most recently changed. A well-run shop keeps their "at risk" files below 3 percent.
So, where does that leave us?
ATA drives are probably as reliable as FC/SCSI disk. Customers should chose which to use based on performance and workload characteristics. FC/SCSI drives are more expensive because they are designed to run at faster speeds, required by some enterprises for some workloads. IBM offers both, and has tools to help estimate which products are the best match to your requirements.
RAID 5 is just one of the many choices of trade-offs between cost and protection of data. For some data, JBOD might be enough. For other data that is more mission critical, you might choose keeping two or three copies. Data protection is more than just using RAID, you need to also consider point-in-time copies, synchronous or asynchronous disk mirroring, continuous data protection (CDP), and backup to tape media. IBM can help show you how.
Disk systems, and IT environments in general, are higher-level concepts to transcend the failures of individual components. DDM components will fail. Cache memory will fail. CPUs will fail. Choose a disk systems vendor that combines technologies in unique and innovative ways that take these possibilities into account, designed for no single point of failure, and no single point of repair.
So, Robin, from IBM's perspective, our hands are clean. Thank you for bringing this to our attention and for giving me the opportunity to highlight IBM's superiority at the systems level.
I've blogged about some of these videos already, but since there are probably a few out there buying the brand new Apple iPhone looking for YouTube videos to play on them, these links might provide some exampleentertainment on your new handheld device.
Next week has "Fourth of July" Independence Day holiday in the USA smack in the middle of the week, so I suspect the blogosphereto quiet down a bit. So whether you are working next week or not, in the USA or elsewhere, take some time to enjoy your friends and family.
I didn't really have a theme this week, still recovering from jet-lag from my travels through Japan, Australia, China.
Gary Diskman has an amusing blog entry about a Funny disaster recovery job posting. It is not clear if he is being completely tongue-in-cheek, or a bit cynical. However, it rings true that you get what you measure, and some managers look for easy metrics, even if there are unintended consequences.
Western medicine works this way. Rather than paying your doctor to keep you healthy, you pay him per visit, to get refills on prescriptions, check-ups on medical conditions, surgeries and so on. While Eastern medicine is focused on keeping people healthy, Western medicine profits more from resolving "situations".
I have seen similar situations on the "health" of the data center. In one case, the admins were measured on how quickly they bring back up their web-servers after a crash. They had this process down to a science, because they were measured on how quickly they resolved the situation. I suggested switching from Windows to Linux, a much more reliable operating system for web-serving, and showed examples of web-servers running Linux that have been up for 1000 days or more. Management changed the metrics to "average up-time in days" and magically the re-boots all but disappeared, thanks to Linux, but also thanks in part to shifting the incentive structure. Perhaps some of those earlier situations were "artificially created"?
Back in the 1980s, I was working on a small software project that was about 5000 lines of code. In those days, testers were measured by the number of "successful" testcases that ran without incident. Testcases that uncovered an error were labeled as "failures" to be re-run after the developers fixed the code. When I declared my code ready for test, the test team ran 110 testcases, all successfully, and they were all rewarded for meeting their schedule. I, on the other hand, did not accept these results, met with them and told them I would give them $100 each if they could find a bug in my code in the next week. Nobody writes 5000 lines of code without some error along the way, not even me. (As one author put it, more people have left earth's gravity to orbit the planet than have written perfect code that did not require subsequent review or testing. It's so true. Good software is difficult to write.)
The test team accepted the challenge, and found 6 problems, more than I expected, but at least I felt more confident of the code quality after fixing these. As I suspected, the unintended consequence of counting "successful" testcases was that testers would write the most simple, basic, least-likely-to-challenge-boundaries testcases to ensure they meet their numbers. My experiment was costly to me, but more importantly was a wake-up call for the test management, and they realized they needed to re-evaluate their test procedures, metrics and terminology. This was a long time ago, and I am glad to see that the overall "software engineering" practice has matured much over the past 20 years.
So, my advice is to determine metrics that have the intended consequences you want, while avoiding any negative unintented consequences that might undermine your eventual success. People will quickly figure out how to maximize the results, and if you can align their goals to company goals, then everybody benefits.
Well, I'll be blogging from Mexico next week (yes, it is a business trip!). Enjoy the weekend.
Many people have asked me if there was any logic with the IBM naming convention of IBM Systems branded servers. Here's your quick and easy cheat sheet:
System x -- "x" for cross-platform architecture. Technologies from our mainframe and UNIX servers were brought into chips that sit next to the Intel or AMD processors to provide a more reliable x86 server experience. For example, some models have a POWER processor-based Remote Supervisor Adapter (RSA).
System p -- "p" for POWER architecture.
System z -- "z" for Zero-downtime, zero-exposures. Our lawyers prefer "near-zero", but this is about as close as you get to ["six-nines" availability] in our industry, with the highest level of security and encryption, no other vendor comes close, so you get the idea.
But what about the "i" for System i? Officially, it stands for "Integrated" in that it could integrate different applications running on different operating systems onto a [COMMON] platform. Options were available to insert Intel-based processor cards that ran Windows, or attach special cables that allowed separate System x servers running Windows to attach to a System i. Both allowed Windows applications to share the internal LAN and SAN inside the System i machine. Later, IBM allowed [AIX on System i] and [Linux on Power] operating systems to run as well.
From a storage perspective, we often joked that the "i" stood for "island", as most System i machines used internal disk, or attached externally to only a fewselected models of disk from IBM and EMC that had special support for i5/OS using a special, non-standard 520-byte disk block size. This meant only our popular IBM System Storage DS6000 and DS8000 series disk systems were available. This block size requirement only applies to disk. For tape, i5/OS supports both IBM TS1120 and LTO tape systems. For the most part,System i machines stood separate from the mainframe, and the rest of the Linux, UNIX and Windows distributed serverson the data center floor.
Often, when I am talking to customers, they ask when will product xyz be supported on System z or System i?I explained that IBM's strategy is not to make all storage devices connect via ESCON/FICON or support non-standard block sizes, but rather to get the servers to use standard 512-byte block size, Fibre Channel and other standard protocols.(The old adage applies: If you can't get Mohamed to move to the mountain, get the mountain to move to Mohamed).
On the System z mainframe, we are 60 percent there, allowing three of the five operating systems (z/VM, z/VSE and Linux) to access FCP-based disk and tape devices. (Four out of six if you include [OpenSolaris for the mainframe])But what about System i? As the characters on the popular television show [LOST] would say: It's time to get off the island!
Last week, IBM announced the new [i5/OS V6R1 operating system] with features that will greatly improve the use of external storage on this platform. Check this out:
POWER6-based System i 570 model server
Our latest, most powerful POWER processor brought to the System i platform. The 570 model will be the first in the System i family of servers to make use of new processing technology, using up to 16 (sixteen!) POWER6 processors (running at 4.7GHZ) in each machine.The advantage of the new processors is the increased commercial processing workload (CPW) rating, 31 percent greater than the POWER5+ version and 72 percent greater than the POWER5 version. CPW is the "MIPS" or "TeraFlops" rating for comparing System i servers.Here is the[Announcement Letter].
Fibre Channel Adapter for System i hardware
That's right, these are [Smart IOAs], so an I/O Processor (IOP) is no longer required! You can even boot the Initial Program Load (IPL) direclty from SAN-attached tape.This brings System i to the 21st century for Business Continuity options.
Virtual I/O Server (VIOS)
[VirtualI/O Server] has been around for System p machines, but now available on System i as well. This allows multiplelogical partitions (LPARs) to access resources like Ethernet cards and FCP host bus adapters. In the case of storage, the VIOS handles the 520-byte to 512-byte conversion, so that i5/OS systems can now read and write to standard FCP devices like the IBM System Storage DS4800 and DS4700 disk systems.
IBM System Storage DS4000 series
Initially, we have certified DS4700 and DS4800 disk systems to work with i5/OS, but more devices are in plan.This means that you can now share your DS4700 between i5/OS and your other Linux, UNIX and Windowsservers, take advantage of a mix of FC and SATA disk capacities, RAID6 protection, and so on.
To call [IBM PowerVM] the "VMware for the POWER architecture" would not do it quite justice. In combination with VIOS, IBM PowerVM is able to run a variety of AIX, Linux and i5/OS guest images.The "Live Partition Mobility" feature allows you to easily move guest images from one system to another, while they are running, just like VMotion for x86 machines.
And while we are on the topic of x86, PowerVM is also able to represent a Linux-x86 emulation base to run x86-compiled applications. While many Linux applications could be re-complied from source code for the POWER architecture "as is", others required perhaps 1-2 percent modification to port them over, and that was too much for some software development houses. Now, we can run most x86-compiled Linux application binaries in their original form on POWER architecture servers.
BladeCenter JS22 Express
The POWER6-based [JS22 Express blade] can run i5/OS, taking advantage of PowerVM and VIOS to access all of the BladeCenterresources. The BladeCenter lets you mix and match POWER and x86-based blades in the same chassis, providing theultimate in flexibility.