“In times of universal deceit, telling the truth will be a revolutionary act.”
-- George Orwell
Well, it has been over two years since I first covered IBM's acquisition of the XIV company. Amazingly, I still see a lot of misperceptions out in the blogosphere, especially those regarding double drive failures for the XIV storage system. Despite various attempts to [explain XIV resiliency] and to [dispel the rumors], there are still competitors making stuff up, putting fear, uncertainty and doubt into the minds of prospective XIV clients.
Clients love the IBM XIV storage system! In this economy, companies are not stupid. Before buying any enterprise-class disk system, they ask the tough questions, run evaluation tests, and perform all the other due diligence often referred to as "kicking the tires". Here is what some IBM clients have said about their XIV systems:
“3-5 minutes vs. 8-10 hours rebuild time...”
-- satisfied XIV client
“...we tested an entire module failure - all data is re-distributed in under 6 hours...only 3-5% performance degradation during rebuild...”
-- excited XIV client
“Not only did XIV meet our expectations, it greatly exceeded them...”
-- delighted XIV client
In this blog post, I hope to set the record straight. It is not my intent to embarrass anyone in particular, so instead I will focus on a fact-based approach.
- Fact: IBM has sold THOUSANDS of XIV systems
XIV is "proven" technology with thousands of XIV systems in company data centers. And by systems, I mean full disk systems with 6 to 15 modules in a single rack, twelve drives per module. That equates to hundreds of thousands of disk drives in production TODAY, comparable to the number of disk drives studied by [Google], and [Carnegie Mellon University] that I discussed in my blog post [Fleet Cars and Skin Cells].
- Fact: To date, no customer has lost data as a result of a Double Drive Failure on an XIV storage system
This has always been true, both when XIV was a stand-alone company and since the IBM acquisition two years ago. When examining the resilience of an array to any single or multiple component failures, it's important to understand the architecture and the design of the system and not assume all systems are alike. At its core, XIV is a grid-based storage system. IBM XIV does not use traditional RAID-5 or RAID-10 methods; instead, data is distributed across loosely connected data modules which act as independent building blocks. XIV divides each LUN into 1MB "chunks", and stores two copies of each chunk on separate drives in separate modules. We call this "RAID-X".
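As a rough illustration of the placement rule just described (two copies of every 1MB chunk, on separate drives in separate modules), here is a minimal Python sketch. The uniform pseudo-random choice and the `place_chunks` helper are my own assumptions for illustration; this post does not describe the actual XIV distribution algorithm.

```python
import random

MODULES = 15            # fully configured rack, per the text
DRIVES_PER_MODULE = 12  # per the text
CHUNK_MB = 1            # chunk size, per the text


def place_chunks(lun_size_mb, seed=0):
    """Return {chunk_index: ((module, drive), (module, drive))}.

    ASSUMPTION: uniform pseudo-random placement, for illustration only.
    The real XIV distribution algorithm is not described in this post.
    """
    rng = random.Random(seed)
    placement = {}
    for chunk in range(lun_size_mb // CHUNK_MB):
        # Pick two different modules, then one drive inside each,
        # so the two copies never share a drive or a module.
        mod_a, mod_b = rng.sample(range(MODULES), 2)
        primary = (mod_a, rng.randrange(DRIVES_PER_MODULE))
        secondary = (mod_b, rng.randrange(DRIVES_PER_MODULE))
        placement[chunk] = (primary, secondary)
    return placement


layout = place_chunks(lun_size_mb=1024)  # a hypothetical 1 GB LUN
```

The only property the sketch tries to capture is that no chunk ever has both copies inside the same module.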
Spreading all the data across many drives is not unique to XIV. Many disk systems, including EMC CLARiiON-based V-Max, HP EVA, and Hitachi Data Systems (HDS) USP-V, allow customers to get XIV-like performance by spreading LUNs across multiple RAID ranks. This is known in the industry as "wide-striping". Some vendors use the terms "metavolumes" or "extent pools" to refer to their implementations of wide-striping. Clients have coined their own phrases, such as "stripes across stripes", "plaid stripes", or "RAID 500". It is highly unlikely that an XIV will experience a double drive failure that ultimately requires recovery of files or LUNs, and XIV is substantially less vulnerable to data loss than an EVA, USP-V or V-Max configured in RAID-5. Fellow blogger Keith Stevenson (IBM) compared XIV's RAID-X design to other forms of RAID in his post [RAID in the 21st Century].
- Fact: IBM XIV is designed to minimize the likelihood and impact of a double drive failure
The independent failure of two drives is a rare occurrence. More data has been lost from hash collisions on EMC Centera than from double drive failures on XIV, and hash collisions are also very rare. While the published worst-case time to re-protect from a 1TB drive failure for a fully-configured XIV is 30 minutes, field experience shows XIV regaining full redundancy on average in 12 minutes. That window is at least 40 times shorter than the typical 8-10 hour rebuild window for a RAID-5 configuration, so a second drive failure is correspondingly less likely to land inside it.
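The comparison above is simple arithmetic on the rebuild windows. A minimal sketch, assuming the chance of a second independent failure is proportional to how long the array stays unprotected (the `exposure_ratio` helper is a name I made up for this example):

```python
def exposure_ratio(raid5_rebuild_hours, xiv_rebuild_minutes):
    """How many times longer the RAID-5 rebuild window is than the XIV one."""
    return (raid5_rebuild_hours * 60) / xiv_rebuild_minutes


low = exposure_ratio(8, 12)     # 8-hour rebuild vs 12-minute field average
high = exposure_ratio(10, 12)   # 10-hour rebuild vs 12-minute field average
worst = exposure_ratio(8, 30)   # 8-hour rebuild vs published 30-minute worst case

print(low, high, worst)  # 40.0 50.0 16.0
```

Even against the published 30-minute worst case, the unprotected window is 16 times shorter than an 8-hour RAID-5 rebuild.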
A lot of bad things can happen in those 8-10 hours of traditional RAID rebuild. Performance can be seriously degraded. Other components may be affected, as they share cache, are connected to the same backplane or bus, or are co-dependent in some other manner. An engineer supporting the customer onsite during a RAID-5 rebuild might pull the wrong drive, thereby causing the very double drive failure they were hoping to avoid. Having IBM XIV rebuild in only a few minutes addresses this "human factor".
In his post [XIV drive management], fellow blogger Jim Kelly (IBM) covers a variety of reasons why storage admins feel double drive failures are more than just random chance. XIV avoids load stress normally associated with traditional RAID rebuild by evenly spreading out the workload across all drives. This is known in the industry as "wear-leveling". When the first drive fails, the recovery is spread across the remaining 179 drives, so that each drive only processes about 1 percent of the data. The [Ultrastar A7K1000] 1TB SATA disk drives that IBM uses from HGST have a specified 1.2 million hours mean-time-between-failures [MTBF], which would average about one drive failing every nine months in a 180-drive XIV system. However, field experience shows that an XIV system will experience, on average, one drive failure per 13 months, comparable to what companies experience with more robust Fibre Channel drives. That's innovative XIV wear-leveling at work!
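The nine-month figure follows directly from the specified MTBF. A back-of-the-envelope sketch, assuming a constant failure rate and independent failures (the usual simplification when reading an MTBF specification); the function name is mine:

```python
HOURS_PER_MONTH = 24 * 365 / 12  # 730.0


def months_between_failures(mtbf_hours, drive_count):
    """Expected months between drive failures across a whole system.

    ASSUMPTION: constant failure rate and independent drives.
    """
    return mtbf_hours / drive_count / HOURS_PER_MONTH


expected = months_between_failures(1_200_000, 180)
rebuild_share = 1 / 179  # share of the rebuild handled by each surviving drive

print(round(expected, 1))             # 9.1
print(round(rebuild_share * 100, 2))  # 0.56 (percent)
```

Each surviving drive therefore handles a bit more than half a percent of the rebuild, which is the "about 1 percent" mentioned above.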
- Fact: In the highly unlikely event that a DDF were to occur, you will have full read/write access to nearly all of your data on the XIV, all but a few GB.
Even though it has NEVER happened in the field, some clients and prospects are curious what a double drive failure on an XIV would look like. First, a critical alert message would be sent to both the client and IBM, and a "union list" is generated, identifying all the chunks in common. The worst case on a 15-module XIV fully loaded with 79TB data is approximately 9000 chunks, or 9GB of data. The remaining 78.991 TB of unaffected data are fully accessible for read or write. Any I/O requests for the chunks in the "union list" will have no response yet, so there is no way for host applications to access outdated information or cause any corruption.
(One blogger compared losing data on the XIV to drilling a hole through the phone book. Mathematically, the drill bit would be only 1/16th of an inch, or 1.59 millimeters for you folks outside the USA. Enough to knock out perhaps one character from a name or phone number on each page. If you have ever seen an actor in the movies look up a phone number in a telephone booth then yank out a page from the phone book, the XIV equivalent would be cutting out 1/8th of a page from an 1100 page phone book. In both cases, all of the rest of the unaffected information is fully accessible, and it is easy to identify which information is missing.)
If the second drive failed several minutes after the first drive, the process for full redundancy is already well under way. This means the union list is considerably shorter or completely empty, and substantially fewer chunks are impacted. Contrast this with RAID-5, where being 99 percent complete on the rebuild when the second drive fails is just as catastrophic as having both drives fail simultaneously.
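To make the "union list" idea concrete, here is a toy Python sketch: the list is simply the set of chunks whose two copies both sit on the two failed drives, minus anything already re-protected before the second failure. The drive names, the tiny layout and the `union_list` helper are all hypothetical; only the 9000-chunk and 79TB figures come from the text above.

```python
def union_list(placement, failed_a, failed_b, reprotected=frozenset()):
    """Chunks whose two copies sit on the two failed drives.

    placement:   {chunk_id: (drive_of_copy_1, drive_of_copy_2)}
    reprotected: chunk ids already re-copied elsewhere before the
                 second failure (hypothetical bookkeeping).
    """
    failed = {failed_a, failed_b}
    return sorted(
        chunk for chunk, drives in placement.items()
        if set(drives) == failed and chunk not in reprotected
    )


# Toy layout with drives named d0..d3 (illustrative only).
toy = {0: ("d0", "d1"), 1: ("d0", "d2"), 2: ("d1", "d0"), 3: ("d2", "d3")}

simultaneous = union_list(toy, "d0", "d1")        # both drives fail at once
after_partial = union_list(toy, "d0", "d1", {0})  # chunk 0 already re-protected

print(simultaneous, after_partial)  # [0, 2] [2]

# Worst-case figures quoted in the text: 9000 chunks of 1 MB each.
affected_gb = 9000 * 1 / 1000            # 9.0 GB
remaining_tb = 79 - affected_gb / 1000   # about 78.991 TB
```

The second call shows why a later second failure hurts less: every chunk re-protected in the meantime drops off the list.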
- Fact: After a DDF event, the files on these few GB can be identified for recovery.
Once IBM receives notification of a critical event, an IBM engineer immediately connects to the XIV using the remote service support method. There is no need to send someone physically onsite; the repair actions can be done remotely. The IBM engineer has tools from HGST to recover, in most cases, all of the data.
Any "union" chunk that the HGST tools are unable to recover will be set to "media error" mode. The IBM engineer can provide the client a list of the XIV LUNs and LBAs that are on the "media error" list. From this list, the client can determine which hosts these LUNs are attached to, and run a file scan utility against the file systems that these LUNs represent. Files that get a media error during this scan will be listed as needing recovery. A chunk could contain several small files, or the chunk could be just part of a large file. To minimize time, the scans and recoveries can all be prioritized and performed in parallel across host systems zoned to these LUNs.
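The post does not name a specific file scan utility, so the following is only a generic sketch of the idea in Python: read every file end to end and note the ones where the read fails. It assumes a media error surfaces to the host as an I/O error on read; `find_unreadable_files` is a hypothetical helper, not an IBM tool.

```python
import os


def find_unreadable_files(root, block_size=1024 * 1024):
    """Read every file under root; return the paths that raise an I/O error.

    ASSUMPTION: a media error shows up as an OSError on read. OSError also
    covers other problems (such as permission errors), so treat the result
    as a list of candidates to inspect, not a final verdict.
    """
    damaged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    # Read the whole file in 1 MB blocks, discarding the data.
                    while handle.read(block_size):
                        pass
            except OSError:
                damaged.append(path)
    return damaged
```

Running one such scan per affected file system, on each host zoned to the listed LUNs, is one way to get the parallelism described above.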
As with any file or volume recovery, keep in mind that these might be part of a larger consistency group, and that your recovery procedures should make sense for the applications involved. In any case, you are probably going to be up-and-running in less time with XIV than recovery from a RAID-5 double failure would take, and certainly nowhere near the "beyond repair" state that other vendors might have you believe.
- Fact: This does not mean you can eliminate all Disaster Recovery planning!
To put this in perspective, you are more likely to lose XIV data from an earthquake, hurricane, fire or flood than from a double drive failure. As with any unlikely disaster, it is better to have a disaster recovery plan than to hope it never happens. All disk systems that sit on a single datacenter floor are vulnerable to such disasters.
For mission-critical applications, IBM recommends using the disk mirroring capability. The IBM XIV storage system offers synchronous and asynchronous mirroring natively, both included at no additional charge.
For more about IBM XIV reliability, read this whitepaper [IBM XIV© Storage System: Reliability Reinvented]. To find out why so many clients LOVE their XIV, contact your local IBM storage sales rep or IBM Business Partner.
technorati tags: IBM, XIV, DDF, RAID-5, RAID-10, RAID-X, RAID-6, RAID-DP, HP, EVA, HDS, USP-V, EMC, CLARiiON, V-Max, Disaster Recovery, HGST, UltraStar, A7K1000
Well, it's Tuesday again, and you know what that means? IBM Announcements!
(OK, yes, today is Friday, but I was busy getting married on Tuesday, so IBM pushed the announcements out one day to Wednesday, and technically I am writing this blog post during my honeymoon vacation, so the IBM marketing team and my new wife both cut me some slack. Work/Life balance is all about compromises, right?)
- IBM DS8880 Storage System
The IBM DS8880 comes in three models, the DS8884 entry level, the DS8886 enterprise level, and the DS8888 all-flash array. IBM offers 1, 2, 3 and 4 year warranties.
The new High Performance Flash Enclosure (HPFE) Gen2 delivers more capacity than Gen1. The 2U flash enclosures are configured in pairs with each enclosure supporting up to twenty-four 2.5-inch flash cards in capacities 400 GB, 800 GB, 1.6 TB and 3.2 TB.
The HPFE Gen2 enclosures are currently available for both the DS8884 and DS8886 models. The maximum flash capacity for the DS8886 increases from 96 TB to 614.4 TB, delivering reduced storage costs through lower cost per IOPS with this new flash enclosure. IBM has made a statement of direction to offer these HPFE Gen2 enclosures on the DS8888 as well.
To improve security, IBM DS8880 now supports customer-defined digital certificates for authentication, and configurable Hardware Management Console (HMC) firewall support.
For IBM's mainframe clients, IBM now offers "Extents-level" space release support for z/OS®, DSCLI (Command Line Interface) support for z/OS environment, and FICON® Information Unit (IU) pacing improvements.
My blog post [Re-Evaluating RAID-5 and RAID-6 for slower larger drives] helped to convince upper management to make RAID-6 the default protection level in the R8.2 release.
To learn more, see the [IBM DS8880 storage family delivers a series of flash-enclosure models] press release.
- IBM Spectrum Virtualize™ V7.8
IBM Spectrum Virtualize™ V7.8 delivers support for the latest SAN Volume Controller, FlashSystem V9000 and Storwize® product family, and adds new software functionality and improvements.
In conjunction with [IBM Spectrum Copy Data Management], Spectrum Virtualize v7.8 offers flexible data protection with transparent cloud tiering to leverage the cloud as FlashCopy targets and restore these snapshots from the cloud on select platforms.
In my September blog post [IBM Edge 2016 Day 3 Wednesday Breakout Sessions] I gave a quick recap of how IBM Spectrum Virtualize offers data-at-rest encryption for both internal and external drives.
However, the encryption keys are kept on USB thumb drives, which are either left in the USB ports on the back of the hardware, or locked away in a safe, only to be retrieved as needed when rebooting the systems or upgrading the firmware.
Now, IBM Spectrum Virtualize v7.8 supports the IBM Security Key Lifecycle Manager (SKLM) to manage encryption keys. IBM continues to support USB thumb drives if you prefer, but SKLM is used to manage keys for most of the rest of IBM products, and provides centralized management.
To learn more, see the [IBM Spectrum Virtualize Software V7.8] press release.
- IBM SAN Volume Controller and Storwize
The SVC and Storwize models can directly attach via 12Gb SAS to expansion drawers. Until now, we supported 2U-high 12-bay drawers that support Large Form Factor (LFF) 3.5-inch Nearline (7200 rpm) drives, and 2U-high 24-bay drawers that support the Small Form Factor (SFF) 2.5-inch drives (SSD, 15K, 10K and 7200 rpm).
With Spectrum Virtualize v7.8, IBM now offers a third option, the 5U-high 92-bay that supports both LFF and SFF drives. This new expansion can be attached to Storwize V5000 Gen2, Storwize V7000 (models 524/Gen2 and 624/Gen2+), and SVC (models DH8 and SV1).
For the 12-bay and 92-bay, IBM now supports 10TB capacity 3.5-inch Nearline drives. For the 24-bay and 92-bay, IBM now supports 7.68 TB and 15.36 TB capacity Solid State Drives (SSD).
For those concerned about the phrase "lower endurance" in the press release, let me explain. SSDs have a bit of extra capacity included. If you write the full capacity of the drive every day for a year, you will "burn up" about one percent of the capacity.
To handle ten "Full Drive Writes per Day" (10 FDWP) over the course of five years, IBM adds 50 percent extra spare capacity above the 400 GB, 800 GB, 1.6 TB and 3.2 TB capacities. So, a 400GB full-endurance drive is really 600 GB inside. These were sometimes referred to as "Enterprise" SSD.
For the larger device sizes, the IT industry has determined that 1 FDWP is sufficient, so instead of 50 percent spare capacity, IBM adds only 5 percent extra. The 7.68 TB is really 8.06 TB inside. These were earlier referred to as "Read-Intensive" SSD. These come in 1.92 TB, 3.84 TB, 7.68 TB and 15.36 TB capacities.
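The spare-capacity figures in the last three paragraphs can be reproduced with a few lines of Python. This is only a sketch of the rule of thumb stated above (about one percent of capacity per year of daily full-drive writes); `raw_capacity` is a name I chose for the example.

```python
WEAR_PER_FULL_WRITE_YEAR = 0.01  # per the text: ~1% per year of daily full-drive writes


def raw_capacity(usable, writes_per_day, years=5):
    """Usable capacity plus the spare needed to absorb the expected wear."""
    spare_fraction = WEAR_PER_FULL_WRITE_YEAR * writes_per_day * years
    return usable * (1 + spare_fraction)


full_endurance = raw_capacity(400, 10)  # 400 GB drive at 10 full writes per day
read_intensive = raw_capacity(7.68, 1)  # 7.68 TB drive at 1 full write per day

print(round(full_endurance, 1))  # 600.0 (GB)
print(round(read_intensive, 3))  # 8.064 (TB)
```

The same function gives 50 percent spare for the 10-writes-per-day drives and 5 percent for the 1-write-per-day drives, matching the numbers quoted above.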
IBM is also offering non-disruptive model conversions. Storwize V5010 can now be converted to V5020, and V5020 can be converted to V5030. The Storwize V7000 Model 524 (Gen2) can be converted to model 624 (Gen2+).
To learn more, see the [IBM SAN Volume Controller and Storwize family high-density expansion] press release.
- IBM FlashSystem V9000
The IBM FlashSystem V9000 will also support its own version of 5U-high, 92-bay, but to simplify ordering, will only support the following drive types:
- High-capacity, archival-class Nearline disk drives in 8 TB and 10 TB 7,200 rpm
- Flash drives in 1.92 TB, 3.84 TB, 7.68 TB, and 15.36 TB
To learn more, see the [IBM FlashSystem V9000 HD Expansion Enclosure Model 92F] press release.
- IBM DeepFlash Elastic Storage Server (ESS)
The DeepFlash 150 is the perfect JBOF addition to the ESS family. The existing ESS models have either 2U-high enclosures with 24 drive bays, or 4U-high enclosures with 60 drive bays. This new model is 3U-high with 64 high-capacity (8 TB) Board Solid State Drives (BSSD).
The ESS includes all the features of IBM Spectrum Scale, including both 8+2 and 8+3 Erasure Coding data protection. This provides file and object access to data, including POSIX compliance for Windows, Linux and AIX operating systems, as well as HDFS-compliant access for big data analytics.
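For readers comparing the two protection schemes, here is a small sketch of the standard erasure-coding arithmetic: with k data strips and m parity strips, the usable fraction is k/(k+m) and up to m strips can be lost. The `ec_profile` helper is illustrative and not part of Spectrum Scale.

```python
def ec_profile(data_strips, parity_strips):
    """Usable fraction and failure tolerance of a k+m erasure code."""
    total = data_strips + parity_strips
    return {
        "efficiency": data_strips / total,
        "tolerated_failures": parity_strips,
    }


eight_plus_two = ec_profile(8, 2)
eight_plus_three = ec_profile(8, 3)

print(round(eight_plus_two["efficiency"], 3))    # 0.8
print(round(eight_plus_three["efficiency"], 3))  # 0.727
```

So 8+3 gives up roughly seven percentage points of usable capacity compared with 8+2 in exchange for surviving one more concurrent failure.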
To learn more, see [IBM DeepFlash Elastic Storage Server] landing page.
By now, there are a multitude of news articles on these announcements, so I recommend you go look for them.
technorati tags: IBM, DS8880, DS8884, DS8886, High Performance Flash Enclosure, HPFE, DSCLI, FICON, Spectrum Virtualize, Data-at-Rest Encryption, Spectrum Copy Data Management, Transparent Cloud Tiering, SAS, NL-SAS, LFF, SFF, FDWP, FlashSystem V9000, DeepFlash 150, DeepFlash ESS, Elastic Storage Server, Erasure Coding, Windows, Linux, AIX, BSSD
I am presenting a Webinar on Monday, October 31, 1:00pm CDT!
Go to [http://event.on24.com/wcc/r/1286360/BAB1D141B103646D019328D53361D830] website to register!
Modified by TonyPearson
SAP HANA is an in-memory, relational database management system supported on Linux for x86 and POWER servers. The "HANA" acronym is short for "High-Performance Analytic Appliance" software. By keeping the data in memory, analytics and queries can be performed much faster than from traditional disk repositories.
Server memory, however, is volatile storage, so the data needs to be stored on persistent storage such as flash or disk drives. SAP has certified several configurations, some of which involve IBM Spectrum Scale solutions. I will use the following graphic to explain the three configurations.
- Linux on x86-64 with Spectrum Scale FPO
With SAP HANA on Lenovo x86-64 servers, SAP has certified internal flash or disk drives running IBM Spectrum Scale in "File Placement Optimization" (FPO) mode. FPO provides a shared-nothing architecture that matches the SAP HANA architecture. IBM Spectrum Protect can back up this configuration, providing data protection and disaster recovery support.
- Linux on POWER with Elastic Storage Server
With SAP HANA on POWER servers, SAP has certified external Elastic Storage Server (ESS). Not only is POWER a better platform than x86-64 for running SAP HANA, but Elastic Storage Server also uses erasure coding to provide excellent rebuild times and storage efficiency.
The ESS is a pre-built system that combines IBM Spectrum Scale software with server and storage hardware. IBM Spectrum Protect can also back up this configuration, providing data protection and disaster recovery support.
- Block-level Storage over Storage Area Network (SAN)
Various IBM block-level devices are supported for SAP HANA on both Linux on x86-64 and Linux on POWER. Unfortunately, SAP only has certified (to date) the use of the XFS file system. The problem many clients mention about this configuration is the lack of end-to-end backup and disaster recovery. This is solved by the Spectrum Scale configurations in the previous two examples.
Other combinations, such as SAP HANA on POWER with Spectrum Scale FPO, or on x86-64 servers with Elastic Storage Server, are either not SAP-certified, or not directly supported by SAP without their approval.
IBM and SAP have worked closely together for many years, and I am glad to see SAP HANA and IBM Spectrum Scale based solutions continue this tradition.
technorati tags: IBM, SAP, SAP HANA, x86, x86-64, Linux, POWER, Spectrum Scale, FPO, Elastic Storage Server, ESS, Spectrum Protect, Disaster Recovery
Well, it's Tuesday again, and you know what that means? IBM Announcements!
Last week, IBM announced a variety of tape system enhancements.
- IBM TS7760 Virtual Tape System
The IBM TS7760 combines the benefits of the previous TS7720 and TS7740 offerings. Those with IBM z System mainframes will recognize both. The TS7740 has a small amount of disk that pretends to be a tape library, with enough capacity to hold a few hours' to a few days' worth of data. After that, the data is moved to physical tape. The TS7720 is an all-disk solution, with up to 1 PB of disk to hold weeks' or months' worth of data, but did not have tape attachment. Previously, IBM announced the TS7720T, a high-capacity offering with tape attachment. The new TS7760 is now the replacement for all three of these, powered by the latest POWER8 processor.
In addition to all the features available in the former models, the new TS7760 uses 4TB drives instead of 3TB drives, resulting in a maximum of 1.3PB of disk capacity before compression. The disks are encrypted and protected by distributed RAID-6 referred to as "Dynamic Disk Pooling". While tape attachment is still optional, it supports both IBM TS3500 and TS4500 tape libraries.
To learn more, see the [TS7700 R4.0 delivers faster performance and larger maximum capacity with the new TS7760 offering] press release.
- new Rack-mount Kit for TS1140 and TS1150 tape drives
Previously, the IBM tape drives had a rack-mount kit that took up 10U, and only worked with racks that were 28 inches deep, so two drives took up nearly one-fourth of a full rack. These new rack-mount kits take up only 3U for one or two drives, so they are more space-efficient, and can work with any rack that is 28 to 44.5 inches deep. To learn more, see the [IBM TS1140 and TS1150 Tape Drive rack mount kit features support RoHS compliance in a 3U form factor] press release.
- IBM TS3500 Tape Library
The IBM TS3500 has been enhanced to support the new 16Gb FC attachments for the TS7700 virtual tape systems, including the new TS7760 I mentioned above. To learn more, see the [IBM TS3500 Tape Library supports new switch options] press release.
- IBM TS4500 Tape Library
The IBM TS4500 now can attach to IBM TS7720, TS7720T, TS7740 and TS7760 Virtual Tape Systems for z Systems mainframe attachment, with some amazing enhancements over its TS3500 predecessor:
- Up to 60 percent reduction in floor space costs
- Up to two times faster access to data
- Up to 25 percent higher bandwidth per frame over TS3500
- z Systems synergy with support for 16 Gb Fibre Channel switch for up to 100 PB of z System Data storage
To learn more, see the [IBM TS4500 Tape Library supports TS7700 attachment] press release.
I am at the airport headed to Chicago for the IBM Technical University. If you are in the Chicago area, consider attending!
technorati tags: IBM, TS7700, TS7720, TS7720T, TS7740, TS7760, TS1140, TS1150, TS3500, TS4500