Well, this week I am in Maryland, just outside of Washington DC. It's a bit cold here.
Robin Harris over at StorageMojo put out this Open Letter to Seagate, Hitachi GST, EMC, HP, NetApp, IBM and Sun about the results of two academic papers, one from Google, and another from Carnegie Mellon University (CMU). The papers imply that the disk drive module (DDM) manufacturers have perhaps misrepresented their reliability estimates, and the letter asks the major vendors to respond. So far, NetApp and EMC have responded.
I will not bother to repeat what others have already said, but will make just a few points. Robin, you are free to consider this "my" official response if you would like to post it on your blog, or point to mine, whichever is easier for you. Given that IBM no longer manufactures the DDMs we use inside our disk systems, there may not be any reason for a more formal response.
- Coke and Pepsi buy sugar, Nutrasweet and Splenda from the same sources
Somehow, this doesn't surprise anyone. Coke and Pepsi don't own their own sugar cane fields, and even their bottlers are separate companies. Their job is to assemble the components using super-secret recipes to make something that tastes good.
IBM, EMC and NetApp don't make the DDMs that are mentioned in either academic study. Different IBM storage systems use one or more of the following DDM suppliers:
- Seagate (including Maxtor, which they acquired)
- Hitachi Global Storage Technologies, HGST (former IBM division sold off to Hitachi)
In the past, corporations like IBM were very "vertically-integrated", making every component of every system delivered. IBM was the first to bring disk systems to market, and led the major enhancements that exist in nearly all disk drives manufactured today. Today, however, our value-add is to take standard components, and use our super-secret recipe to make something that provides unique value to the marketplace. Not surprisingly, EMC, HP, Sun and NetApp also don't make their own DDMs. Hitachi is perhaps the last major disk systems vendor that also has a DDM manufacturing division.
So, my point is that disk systems are the next layer up. Everyone knows that individual components fail. Unlike CPUs or Memory, disks actually have moving parts, so you would expect them to fail more often compared to just "chips".
If you don't feel the MTBF or AFR estimates posted by these suppliers are valid, go after them, not the disk systems vendors that use their supplies. While IBM does qualify DDM suppliers for each purpose, we are basically purchasing them from the same major vendors as all of our competitors. I suspect you won't get much more than the responses you posted from Seagate and HGST.
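For anyone comparing a supplier's MTBF figure against a field-observed AFR, here is a rough sketch of how the two are usually related. It assumes a constant (exponential) failure rate and 24x7 operation, and the input values are hypothetical, not taken from either paper or from any supplier's datasheet.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 power-on hours, assuming 24x7 operation


def afr_from_mtbf(mtbf_hours: float) -> float:
    """AFR implied by an MTBF, assuming a constant (exponential) failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def mtbf_from_afr(afr: float) -> float:
    """Inverse calculation: the MTBF that would produce an observed AFR."""
    return -HOURS_PER_YEAR / math.log(1.0 - afr)


# Hypothetical values, chosen only for illustration.
print(f"MTBF 1,000,000 h -> AFR {afr_from_mtbf(1_000_000):.2%}")
print(f"AFR 3% -> MTBF {mtbf_from_afr(0.03):,.0f} h")
```

Under this simple model, a 1,000,000-hour MTBF corresponds to an AFR of a little under 1 percent, while a 3 percent AFR corresponds to an MTBF under 300,000 hours.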
- American car owners replace their cars every 59 months
According to a frequently cited auto market research firm, the average time before the original owner transfers their vehicle -- purchased or leased -- is currently 59 months. Both studies mention that customers have a different "definition" of failure than manufacturers, and often replace the drives before they are completely kaput. The same is true for cars. Americans give various reasons why they trade in their less-than-five-year-old cars for newer models. Disk technologies advance at a faster pace, so it makes sense to change drives for other business reasons, for speed and capacity improvements, lower power consumption, and so on.
The CMU study indicated that 43 percent of drives were replaced before they were completely dead. So, if General Motors estimated their cars lasted 9 years, and Toyota estimated 11 years, people still replace them sooner, for other reasons.
At IBM, we remind people that "data outlives the media". True for disk, and true for tape. Neither is "permanent storage", but rather a temporary resting point until the data is transferred to the next media. For this reason, IBM is focused on solutions and disk systems that plan for this inevitable migration process. IBM System Storage SAN Volume Controller is able to move active data from one disk system to another; IBM Tivoli Storage Manager is able to move backup copies from one tape to another; and IBM System Storage DR550 is able to move archive copies from disk and tape to newer disk and tape.
If you had only one car, then having that one and only vehicle die could be quite disruptive. However, companies that have fleet cars, like Hertz Car Rentals, don't wait for their cars to completely stop running either; they replace them well before that happens. For a large company with a large fleet of cars, regularly scheduled replacement is just part of doing business.
This brings us to the subject of RAID. No question that RAID 5 provides better reliability than having just a bunch of disks (JBOD). Certainly, three copies of data across separate disks, a variation of RAID 1, will provide even more protection, but for a price.
Robin mentions the "Auto-correlation" effect. Disk failures bunch up, so one recent failure might mean another DDM, somewhere in the environment, will probably fail soon also. For it to make a difference, it would (a) have to be a DDM in the same RAID 5 rank, and (b) have to occur during the time the first drive is being rebuilt to a spare volume.
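To put conditions (a) and (b) in perspective, here is a back-of-the-envelope sketch of the chance that a second drive in the same rank fails during the rebuild window. It assumes independent drives with a constant failure rate, which is exactly the assumption the auto-correlation finding calls into question, so treat the result as a floor rather than a prediction. The rank size, AFR and rebuild time are hypothetical.

```python
import math

HOURS_PER_YEAR = 24 * 365


def second_failure_probability(drives_in_rank: int, afr: float, rebuild_hours: float) -> float:
    """Chance that at least one surviving drive in the rank fails before the rebuild ends.

    Assumes independent drives with a constant failure rate; correlated
    failures would make the real figure higher.
    """
    survivors = drives_in_rank - 1                       # one drive has already failed
    hourly_rate = -math.log(1.0 - afr) / HOURS_PER_YEAR  # per-drive failure rate per hour
    return 1.0 - math.exp(-survivors * hourly_rate * rebuild_hours)


# Hypothetical inputs: 8-drive rank, 3% AFR, 6-hour rebuild.
p = second_failure_probability(drives_in_rank=8, afr=0.03, rebuild_hours=6)
print(f"Probability of a second failure during rebuild: {p:.4%}")
```

The same function shows why smaller ranks and shorter rebuilds help: both the number of surviving drives and the rebuild time multiply the exposure.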
- The human body replaces skin cells every day
So there are individual DDMs, manufactured by the suppliers above; disk systems, manufactured by IBM and others, and then your entire IT infrastructure. Beyond the disk system, you probably have redundant fabrics, clustered servers and multiple data paths, because eventually hardware fails.
People might realize that the human body replaces skin cells every day. Other cells are replaced frequently, within seven days, and others less frequently, taking a year or so to be replaced. I'm over 40 years old, but most of my cells are less than 9 years old. This is possible because information, data in the form of DNA, is moved from old cells to new cells, keeping the infrastructure (my body) alive.
Our clients should approach this in a more holistic view. You will replace disks in less than 3-5 years. While tape cartridges can retain their data for 20 years, most people change their tape drives every 7-9 years, and so tape data needs to be moved from old to new cartridges. Focus on your information, not individual DDMs.
What does this mean for DDM failures? When one happens, the disk system re-routes requests to a spare disk, rebuilding the data from RAID 5 parity, giving storage admins time to replace the failed unit. During the few hours this process takes, you are either taking a backup, or crossing your fingers. Note: for RAID 5 the time to rebuild is proportional to the number of disks in the rank, so smaller ranks can be rebuilt faster than larger ranks. To make matters worse, the slower RPM speeds and higher capacities of ATA disks mean that the rebuild process could take longer than for smaller capacity, higher speed FC/SCSI disk.
According to the Google study, a large portion of the DDM replacements had no SMART errors to warn that it was going to happen. To protect your infrastructure, you need to make sure you have current backups of all your data. IBM TotalStorage Productivity Center can help identify all the data that is "at risk", those files that have no backup, no copy, and no current backup since the file was most recently changed. A well-run shop keeps their "at risk" files below 3 percent.
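The "at risk" definition above is easy to express in code. This is only a sketch of the rule as described, not how IBM TotalStorage Productivity Center is implemented; the `FileRecord` layout, the file paths and the dates are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class FileRecord:
    path: str
    last_modified: datetime
    last_backup: Optional[datetime]  # None means the file has never been backed up


def is_at_risk(record: FileRecord) -> bool:
    """A file is at risk if it has no backup, or it changed after its latest backup."""
    return record.last_backup is None or record.last_backup < record.last_modified


def at_risk_percent(records: List[FileRecord]) -> float:
    """Percentage of files that are currently at risk."""
    if not records:
        return 0.0
    return 100.0 * sum(is_at_risk(r) for r in records) / len(records)


# Hypothetical inventory.
records = [
    FileRecord("/data/a.db", datetime(2007, 2, 20), datetime(2007, 2, 21)),   # protected
    FileRecord("/data/b.doc", datetime(2007, 2, 22), datetime(2007, 2, 21)),  # changed since backup
    FileRecord("/data/c.tmp", datetime(2007, 2, 19), None),                   # never backed up
    FileRecord("/data/d.xls", datetime(2007, 2, 1), datetime(2007, 2, 1)),    # backed up at last change
]
print(f"{at_risk_percent(records):.0f}% of files at risk")
```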
So, where does that leave us?
- ATA drives are probably as reliable as FC/SCSI disk. Customers should choose which to use based on performance and workload characteristics. FC/SCSI drives are more expensive because they are designed to run at faster speeds, required by some enterprises for some workloads. IBM offers both, and has tools to help estimate which products are the best match to your requirements.
- RAID 5 is just one of the many choices of trade-offs between cost and protection of data. For some data, JBOD might be enough. For other data that is more mission critical, you might choose keeping two or three copies. Data protection is more than just using RAID, you need to also consider point-in-time copies, synchronous or asynchronous disk mirroring, continuous data protection (CDP), and backup to tape media. IBM can help show you how.
- Disk systems, and IT environments in general, are higher-level concepts to transcend the failures of individual components. DDM components will fail. Cache memory will fail. CPUs will fail. Choose a disk systems vendor that combines technologies in unique and innovative ways that take these possibilities into account, designed for no single point of failure, and no single point of repair.
So, Robin, from IBM's perspective, our hands are clean. Thank you for bringing this to our attention and for giving me the opportunity to highlight IBM's superiority at the systems level.
Well, it's Tuesday again, and you know what that means? IBM Announcements! After a much-needed vacation in Cancun, Mexico, and in Lake Havasu and Sedona, Arizona, I am glad to be back at work! This week, I was visiting clients in the Los Angeles area.
- IBM FlashSystem 9100
IBM's latest addition to its lineup of All-Flash Arrays is the FlashSystem 9100.
There are actually two models: the 9110 (model AF7) has 8-core processors, and the 9150 (model AF8) has 14-core processors. Both models are 2U 19-inch shelves with 24 drives on the front, with two control node canisters in the back. The term "FlashSystem 9100" applies to both 9110 and 9150 models.
Each canister has two processors, 64GB to 768GB of cache memory, an on-board 1GbE port for management, four 10GbE ports for Ethernet, and three HIC slots for I/O adapters, which can be any mix of quad-port FC cards, dual-port 25GbE Ethernet cards, or 12Gb SAS cards for expansion drawers.
For drives, you can have any mix of FlashCore Modules (FCM) or Industry-Standard NVMe (ISN) drives. The FlashCore modules are similar to the FlashCore boards in the FlashSystem 900, including Variable-Striped RAID, advanced flash management, heat binning, health separation, hardware-embedded encryption and compression.
These FCM are packaged into standard NVMe SSD form-factor, with 4.8, 9.6 and 19.2 TB capacities. The Industry-Standard NVMe drives come in 1.92, 3.84, 7.68 and 15.36 TB capacities to offer additional price/capacity options to clients.
A fully populated system with twenty-four 19.2TB FlashCore Modules represents approximately 400TB of usable capacity. Combined with 5:1 data footprint reduction from deduplication and compression, that can provide up to an effective 2PB in as little as 2U of rack space!
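For those who want to check the arithmetic, here is a small sketch. The drive count, module size, usable capacity and 5:1 ratio come from the paragraph above; the overhead percentage is simply what those numbers imply, not a published specification, and decimal units (1PB = 1000TB) are assumed.

```python
DRIVES = 24
FCM_TB = 19.2          # largest FlashCore Module capacity
USABLE_TB = 400        # approximate usable figure quoted above
REDUCTION_RATIO = 5    # 5:1 data footprint reduction quoted above

raw_tb = DRIVES * FCM_TB                            # capacity before RAID and spare overhead
overhead = 1 - USABLE_TB / raw_tb                   # fraction implied by the quoted usable figure
effective_pb = USABLE_TB * REDUCTION_RATIO / 1000   # decimal units: 1 PB = 1000 TB

print(f"Raw: {raw_tb:.1f} TB, implied overhead: {overhead:.0%}, effective: {effective_pb:.0f} PB")
```

Keep in mind that the effective figure depends entirely on how well your data deduplicates and compresses.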
The NVMe and FlashCore technology truly accelerates performance. Latencies as low as 100 microseconds are 2.5x lower than competitive offerings. Each control enclosure can deliver up to 2.5 Million IOPS, and a four-way cluster up to 10 million IOPS in just 8U!
You can mix and match FCM and ISN drives in the same controller, but FCM and ISN have to be in their own separate RAID groups. To use Distributed RAID6 (DRAID6), you need at least six drives.
IBM has made a "Statement of Direction" that these models are NVMe-OF hardware ready and will support both FC-NVMe and NVMe-OF over Ethernet by year end. Part of this involves changes to server-side software, including various operating systems, device drivers, and multi-pathing drivers.
The FlashSystem 9100 supports up to 40U of expansion drawers, over 12Gb SAS, in two sizes: a 2U drawer for 24 SFF drives, and a 5U drawer for 92 SFF/LFF drives. Each FlashSystem 9100 can support up to 760 drives. These expansion drawers are not NVMe, so the Solid-State Drives (SSD) inside them use standard SAS. Consider using Easy Tier sub-LUN automated tiering to move fast data up to the FCM/ISN drives, and slower data to these SAS-based SSD.
Even though it doesn't have a "V" in its name, the FlashSystem 9100 runs Spectrum Virtualize, so you can also virtualize other storage behind it. Over 400 different storage devices from leading storage vendors are supported. The FlashSystem 9100 can be virtualized behind SVC or FlashSystem V9000.
FlashSystem 9100 can also cluster with Gen2 and Gen2+ models of the Storwize V7000 and V7000F controllers. You can connect up to four of any of these into a single cluster, supporting up to 3,040 drives.
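The 760-drive and 3,040-drive figures can be reproduced from the drawer sizes mentioned earlier. The sketch below assumes the 40U of expansion is filled entirely with one drawer type, which is my reading of the numbers rather than a statement of the supported configuration rules.

```python
CONTROL_ENCLOSURE_DRIVES = 24
EXPANSION_RACK_UNITS = 40
DRAWERS = {"2U": (2, 24), "5U": (5, 92)}  # name -> (height in U, drive slots)
MAX_CLUSTER_SYSTEMS = 4


def max_drives(drawer: str) -> int:
    """Drives per system if all 40U of expansion uses a single drawer type."""
    height, slots = DRAWERS[drawer]
    return CONTROL_ENCLOSURE_DRIVES + (EXPANSION_RACK_UNITS // height) * slots


per_system = max(max_drives(name) for name in DRAWERS)
print(max_drives("2U"), max_drives("5U"), per_system * MAX_CLUSTER_SYSTEMS)
```

Eight 5U drawers plus the 24 drives in the control enclosure give 760, and four such systems give 3,040.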
The FlashSystem 9100 offers all of the features you have come to love from the rest of the Spectrum Virtualize products: data deduplication and compression, encryption, high-availability guarantee, data footprint reduction guarantee, hardware refresh option after three years, storage utility pricing, and IBM Storage Insights support.
IBM has no plans to withdraw either the existing FlashSystem V9000 or the Storwize V7000/F models anytime soon. They continue to be available for purchase.
To learn more, see [IBM FlashSystem 9100] announcement letter, and fellow blogger Barry Whyte's post [Introducing the FlashSystem 9100 NVMe with FCM].
- IBM FlashSystem 9100 Multi-Cloud solutions
To complement the hardware features of the FlashSystem 9100, IBM has come up with three Multi-cloud solutions.
- Multi-Cloud Solution for Data Reuse, Protection and Efficiency - this combines Spectrum CDM with Spectrum Protect Plus to take snapshots of volumes on FlashSystem 9100. These snapshots are not just for data protection, but can also be "reused" for other purposes, like dev/test, DevOPS, or analytics.
- Multi-Cloud Solution for Business Continuity and Data Reuse - combines Spectrum CDM with Spectrum Virtualize in the Public Cloud, allowing you to take snapshots to the IBM Cloud for disaster recovery. The snapshots can be used in the cloud, or copied back to the same or different data center.
- Multi-Cloud Solution for Private Cloud Flexibility and Data Protection - combines IBM Cloud Private, Spectrum CDM, and Spectrum Connect to support clients' efforts to re-factor their applications with Docker containers and Kubernetes. IBM FlashSystem 9100 can be used as persistent storage for containerized applications.
To learn more, see [IBM Multi-Cloud solutions] announcement letter.
- IBM Spectrum Virtualize 8.2 release
This release applies only to the Storwize V7000/F and the new FlashSystem 9100 models, and provides support for iSCSI Extensions over RDMA (iSER) on the 25GbE NIC cards. If you want to cluster existing Storwize V7000/F models to the new FlashSystem 9100 models, you need all of them to be at least v8.2.0 release.
Lower latencies and higher bandwidth requirements can be addressed by using RDMA to implement iSCSI. iSER is a new interconnect protocol that allows iSCSI to run on top of RDMA technology. RDMA can be implemented by using RoCE (RDMA over Converged Ethernet) or iWARP (Internet Wide-area RDMA Protocol). iSER enables iSCSI to run on top of it regardless of which of these technologies is used underneath.
To learn more, see [IBM Spectrum Virtualize Software V8.2] announcement letter.
- IBM Storage Utility Pricing
The "Storage Utility" pricing available for many of IBM's other products has been extended to include the IBM FlashSystem 9100 and IBM Cloud Object Storage.
Basically, this is a variable-priced, usage-based lease. Let's say you lease 500TB of capacity, but only use 150TB. For the first few months you pay only for 150TB. A bit later, you use more, say 200TB, and start paying more each month. The price can go up or down. At the end of the lease, typically 36 or 60 months, you have a choice: give the equipment back, or pay the difference.
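Here is a minimal sketch of the monthly billing idea described above. The price per TB is a made-up number, and real contracts may include minimum commitments or other terms that this sketch ignores.

```python
from typing import List


def monthly_charges(used_tb_by_month: List[float], price_per_tb: float, leased_tb: float) -> List[float]:
    """Bill each month for capacity actually used, capped at the leased capacity."""
    return [min(used, leased_tb) * price_per_tb for used in used_tb_by_month]


# Hypothetical usage on a 500TB lease at a hypothetical $20 per TB per month.
usage = [150, 150, 150, 200, 180]
charges = monthly_charges(usage, price_per_tb=20.0, leased_tb=500)
print(charges)
```

Note that the last month's charge drops along with usage, which is the "price can go up or down" part.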
To learn more, see [IBM Storage Utility offerings for IBM Cloud Object Storage] announcement letter.
IBM is pleased to be on the leading edge of NVMe technology!
Well, it's Tuesday again, and you know what that means? IBM announcements!
Today's announcements are all about the Storwize family, IBM's market-leading Software Defined Storage offerings. Having sold over 55,000 systems, and managing over 1.6 Exabytes of data, IBM continues to be the #1 leader in storage virtualization solutions. The Storwize family consists of the SAN Volume Controller (SVC), Storwize V7000, Storwize V7000 Unified, Flex System V7000, Storwize V5000, Storwize V3700 and V3500.
SAN Volume Controller 2145-DH8
The new 2145-DH8 model is a complete repackaging of this popular storage system. The previous model, the 2145-CG8, was a 1U-high x86 server per node, and each node required a separate 1U-high UPS to provide battery protection for its cache. Nobody liked this. The new 2145-DH8 instead is a 2U-high node with two hot-swappable batteries, eliminating the need for a UPS altogether. Thus, an SVC node-pair using the 2145-DH8 models takes up the same 4U space, but with fewer cables. The SVC can now also support standard office 110/240 voltage sources.
The new model sports an 8-core processor with 32GB RAM. Since these are 2-socket servers, IBM offers the option to add a second 8-core processor and an additional 32GB RAM to help boost Real-time Compression. Each node can optionally have one or two hardware-assisted compression cards which use the Intel QuickAssist chip to boost compression performance.
While the Real-time Compression was in fact, real-time, performed in-line to the read/write I/O process, at latency comparable to uncompressed data for applications, the compression process on older models was entirely software-based, consuming some of the CPU resources, which lowered the maximum IOPS of the solution. With the added cores, added RAM, and hardware-assisted compression chips, IBM resolves that concern. In fact, the new 2145-DH8 with compression can provide more IOPS than an older 2145-CG8 without compression.
The previous model 2145-CG8 allowed you to put up to 4 small SSD drives in the node itself, which were treated the same as external Flash drives for purposes of having a high-speed storage pool for select volumes, or automated sub-LUN tiering with Easy Tier. The new model 2145-DH8 allows you to attach up to 48 Solid State Drives (SSD) via 12Gb SAS cables. These are housed in the new 2U-high 24F enclosures that can offer up to 38.4 TB of Flash per SVC I/O group.
IBM also re-designed the host/device ports to use Hardware Interface Card (HIC) slots. In the 2145-CG8, you had four FCP ports, two 1GbE Ethernet ports, with options to add two 10GbE Ethernet ports or four additional FCP ports. If you had mostly an FCoE or iSCSI environment, you didn't need the FCP, and if you were mostly an FCP Storage Area Network (SAN) environment, then most of the Ethernet ports went unused. To solve this, the 2145-DH8 allows you to have up to six HIC cards that are either FCP, Ethernet, or SAS. There are three 1GbE fixed Ethernet ports which can be used for iSCSI and administration.
If you have SVC today, you can upgrade non-disruptively by either swapping out your current SVC engines with the new 2145-DH8 engines, or you can add the new 2145-DH8 engines to your existing SVC cluster. Either way, there is no outage to your applications!
To learn more, see the [Announcement letter: SAN Volume Controller Storage Engine DH8].
New Storwize V7000 hardware
This is the next generation of the popular Storwize V7000. The previous generation had a 4-core processor and 8GB RAM per canister. The new model has an 8-core processor with 32GB of RAM per canister, with the option to double these to boost Real-time compression. There are two canisters per control enclosure, which gives you 64GB to 128GB of RAM per Storwize V7000 I/O group.
The new Storwize V7000 comes with one hardware-assisted compression chip on the mother board of each canister, with the option to add a second chip per canister.
Each canister offers three HIC slots, which can be used for the additional hardware-assist compression chip, FCP or Ethernet ports.
To accommodate these HIC slots, new canisters were needed. Instead of the flat wide style top and bottom, we now have taller, thinner canisters that sit side to side. This side-to-side design is similar to our existing Storwize V5000 and V3700 models.
The previous model could support up to 9 expansion enclosures per control enclosure. The Storwize V7000 can have up to 24 drives in its control enclosure, and now attach up to 20 expansion enclosures, which allows up to 504 drives per control enclosure, and up to a maximum of 1,056 drives per Storwize cluster.
If you have previous models of Storwize V7000, you can add the new Storwize V7000 into the same cluster, or virtualize the previous storage for migration purposes.
To learn more, see the [Announcement letter: New Storwize V7000].
IBM Storwize Family Software V7.3.0
The new software applies new capabilities to both new generation hardware as well as the older models, so people with existing gear can benefit as well.
In prior releases, the sub-LUN automated tiering was limited to two levels: Flash and HDD. This lumped all 15K, 10K and 7200 RPM drives into a common HDD category. In the new v7.3.0 code, you can now have three levels: Flash, Enterprise HDD, and Nearline HDD, or two HDD levels: Enterprise and Nearline. The Enterprise level combines 15K and 10K RPM drives, similar to what is done on the IBM System Storage DS8000 disk systems.
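The grouping rule described above can be written down in a few lines. This is just a sketch of the classification as I have described it, not the Easy Tier code itself; the function names are invented, and representing Flash as `rpm=None` is a convention chosen for the example.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    FLASH = "Flash"
    ENTERPRISE = "Enterprise HDD"
    NEARLINE = "Nearline HDD"


def tier_v730(rpm: Optional[int]) -> Tier:
    """Three-level grouping: Flash, Enterprise (15K and 10K RPM), Nearline (7200 RPM)."""
    if rpm is None:          # no spindle speed means Flash
        return Tier.FLASH
    if rpm >= 10_000:
        return Tier.ENTERPRISE
    return Tier.NEARLINE


def tier_prior(rpm: Optional[int]) -> str:
    """Two-level grouping used in prior releases: every spinning drive is just HDD."""
    return "Flash" if rpm is None else "HDD"


for rpm in (None, 15_000, 10_000, 7_200):
    print(rpm, tier_prior(rpm), "->", tier_v730(rpm).value)
```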
The new code is also able to balance your storage pools, and can be used with uniform or mixed storage pools to eliminate performance hot spots.
The new code has been enhanced to detect the hardware-assisted compression chip on the new SVC and Storwize V7000 models, and use those if available.
For the Storwize V3700 and V5000 models, the new code allows up to nine expansion enclosures per control enclosure. In the previous models, the V3700 allowed only four expansions, and the V5000 only six expansions per control enclosure. The V3700 can now support up to 240 drives, and the V5000 can support up to 480 drives.
To learn more, see the [Announcement letter: Storwize Family Software v7.3.0].
IBM Storwize V7000 Unified File Module software v1.5
For Storwize V7000 Unified clients, there is new software for the File Modules that provide NFS, CIFS, FTP, HTTPS and SCP protocol capability. The new v1.5 code now adds NFS v4 and SMB 2.1 levels of support. Most NFS users are still on NFSv3, but about 20 percent of NFS users are using NFS v4 which offers stateful access. The SMB 2.1 for CIFS was introduced by Microsoft in Windows 7 and Windows Server 2008 R2.
Deterministic ID mapping allows you to map Windows userids to UNIX/Linux group and owner id numbers. In the past, the problem was that this mapping was different on each machine, so people often had to stand up a Windows System for Unix Services (SFU) server to provide consistent ID mapping. Now, with v1.5 code, you will no longer have to do this. The deterministic ID mapping can now replicate the mapping to each machine without an SFU server.
Active Cloud Engine allows up to ten Storwize V7000 Unified systems to be connected across distance to form a single global name space. WAN caching, however, was restricted to a single site having write capabilities, while the others were read-only. In the v1.5 release, IBM now supports multiple independent writers at different locations on the same fileset.
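To illustrate what "deterministic" buys you, here is a sketch of one common approach: compute the UNIX id arithmetically from the relative identifier (RID) at the end of the Windows SID, so that every machine arrives at the same answer without consulting a shared lookup server. This is not necessarily the algorithm the File Module code uses; the function name, the id range and the sample SID are all assumptions for illustration.

```python
def sid_to_unix_id(sid: str, range_low: int = 10_000_000, range_size: int = 1_000_000) -> int:
    """Derive a UNIX uid/gid from a Windows SID by offsetting its RID.

    Because the result depends only on the SID and two fixed constants,
    any node configured with the same constants computes the same id.
    """
    rid = int(sid.rsplit("-", 1)[1])  # the RID is the last dash-separated field
    if not 0 <= rid < range_size:
        raise ValueError(f"RID {rid} does not fit in the configured id range")
    return range_low + rid


# Hypothetical SID for a domain user.
print(sid_to_unix_id("S-1-5-21-1004336348-1177238915-682003330-1104"))
```

A scheme like this needs a separate, non-overlapping range per Windows domain, since RIDs are only unique within a domain.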
Security enhancements include multi-tenancy, configurable password policies, session policies, and hardened boot and SSH configurations. With NFS v3/v4, you can now use [Kerberos] for security.
Finally, I am pleased to see that we now have Cinder support for files on the Storwize V7000 Unified on the OpenStack Havana release that just came out last month. The OpenStack Cinder interface can assign LUNs to virtual machines, but the new Havana release allows NAS systems to dole out files that act as LUNs, such as OVA or VMDK files. The advantage is that these files can be managed by Active Cloud Engine, cached locally across a global name space, and placed on appropriate storage tiers by policy, and inactive Virtual Machine images can be migrated to less expensive disk or tape.
To learn more, see the [Announcement letter: Storwize Family Software v7.3.0].
You can learn more about the Storwize family at the [IBM Edge Conference], May 19-23, at Las Vegas. I'll be there!
Happy Winter Solstice everyone! The Mayan calendar flipped over yesterday, and everything continued as normal.
The next date to watch out for is ... drumroll please ... April 8, 2014. This is the date Microsoft has decided to [drop support for Windows XP].
While many large corporations are actively planning to get off Windows XP, there are still many homes and individuals that are running on this platform.
When [Windows XP] was introduced in 2001, it could support systems with as little as 64MB of RAM. Nowadays, the latest versions of Windows require a minimum of 1GB for 32-bit systems, with 2GB or 3GB recommended.
That leaves Windows XP users on older hardware few choices:
- Continue to run Windows XP, but without support (and hope for the best)
- Upgrade their hardware with more RAM (and possibly more disk space) needed to run a newer level of Windows
- Install a different operating system like Linux
- Put the hardware in the recycle bin, and buy a new computer
Here is a personal example. A long time ago, I gave my sister a Thinkpad R31 laptop so that she could work from home. When she got a newer one, she passed this down to her daughter for doing homework. When my niece got a newer one, she passed this old laptop to her grandma.
Grandma is fairly happy with her modern PC running Windows XP. She plays all kinds of games, scans photographs, sends emails, listens to music on iTunes, and even uses Skype to talk to relatives. Her problem is that this PC is located upstairs, in her bedroom, and she wanted something portable so that she could play music downstairs when she is playing cards with her friends.
"Why not use the laptop you have?" I asked. Her response: "It runs very slow. Perhaps it has a virus. Can you fix that?" I was up for the challenge, so I agreed.
(The Challenge: Update the Thinkpad R31 so that grandma can simply turn it on, launch iTunes or similar application, and just press a "play" button to listen to her music. It will be plugged in to an electrical outlet wherever she takes it, and she already has her collection of MP3 music files. My hope is to have something that is (a) simple to use, (b) starts up quickly, and (c) will not require a lot of on-going maintenance issues.)
Here are the relevant specifications of the Thinkpad R31 laptop:
|CPU||Intel Celeron 1.13GHz Pentium-III|
|Display||13.3-inch TFT, 1024x768 XGA|
|Memory (RAM)||384 MB @133MHz, upgradeable only to 1GB|
|Disk storage||20.0 GB|
|Optical Drive||CD-ROM drive|
|BIOS boot options||Hard drive or CD-ROM only|
|External attachment||2 USB ports, but no USB boot option|
|Network||Wired 10/100 Mbps Ethernet, 56 Kbps phone modem|
The system was pre-installed with Windows XP, but was terribly down-level. I updated to Windows XP SP3 level, downloaded the latest anti-virus signatures, and installed iTunes. A full scan found no viruses. All this software takes up 14GB, leaving less than 6GB for MP3 music files.
The time it took from hitting the "Power-on" button to hearing the first note of music was over 14 minutes! Unacceptable!
If you can suggest what my next steps should be, please comment below or send me an email!
Have you ever noticed that sometimes two movies come out that seem eerily similar to each other, released by different studios within months or weeks of each other? My sister used to review film scripts for a living. She would read ten of them and have to pick her top three favorites, and she tells me that scripts for nearly identical concepts came in all the time. Here are a few of my favorite examples:
- 1994: [Wyatt Earp] and [Tombstone] were Westerns recounting the famed gunfight at the O.K. Corral. Tombstone, Arizona is near Tucson, and the gunfight is recreated fairly often for tourists.
- 1998: [Armageddon] and [Deep Impact] were a pair of disaster movies dealing with a large rock heading to destroy all life on earth. I was in Mazatlan, Mexico to see the latter, dubbed in Spanish as "Impacto Profundo".
- 1998: [A Bug's Life] and [Antz] were computer-animated tales of the struggle of one individual ant in an ant colony.
- 2000: [Mission to Mars] and [Red Planet] were sci-fi pics exploring what a manned mission to our neighboring planet might entail.
- 2009: [Paul Blart: Mall Cop] and [Observe and Report] were comedies dealing with challenges of security at a shopping mall.
(I think I made my point with just a few examples. A more complete list can be found on [Sam Greenspan's 11 Points website].)
This is different from copy-cat movies that are re-made or re-imagined many years later based on the previous successes of an original. Ever since my blog post [VPLEX: EMC's Latest Wheel is Round] in 2010, which discussed EMC's copy-cat product that came out seven years after IBM's SAN Volume Controller (SVC), I've noticed EMC doesn't talk about VPLEX that much anymore.
This week, IBM announced [XIV Gen3 Solid-State Drive support] and our friends over at EMC announced [VFCache SSD-based PCIe cards]. Neither of these should be a surprise to anyone who follows the IT industry, as IBM had announced its XIV Gen3 as "SSD-Ready" last year specifically for this purpose, and EMC has been touting its "Project Lightning" since last May.
Fellow blogger Chris Mellor from The Register has a series of articles to cover this, including [EMC crashes the server flash party], [NetApp slaps down Lightning with multi-card Flash flush], [HP may be going the server flash route], and [Now HDS joins the server flash party].
Fellow blogger Chuck Hollis from EMC has a blog post [VFCache means Very Fast Cache indeed] that provides additional detail. Chuck claims the VFCache is faster than popular [Fusion-IO PCIe cards] available for IBM servers. I haven't seen the performance spec sheets, but typically SSD is four to five times slower than the DRAM cache used in the XIV Gen3. The VFCache's SSD is probably similar in performance to the SSD supported in the IBM XIV Gen3, DS8000, DS5000, SVC, N series, and Storwize V7000 disk systems.
Nonetheless, I've been asked my opinions on the comparison between these two announcements, as they both deal with improving application performance through the use of Solid-State Drives as an added layer of read cache.
(FTC Disclosure: I am both a full-time employee and stockholder of the IBM Corporation. The U.S. Federal Trade Commission may consider this blog post as a paid celebrity endorsement of IBM servers and storage systems. This blog post is based on my interpretation and opinions of publicly-available information, as I have no hands-on access to any of these third-party PCIe cards. I have no financial interest in EMC, Fusion-IO, Texas Memory Systems, or any other third party vendor of PCIe cards designed to fit inside IBM servers, and I have not been paid by anyone to mention their name, brands or products on this blog post.)
The solutions are different in that IBM XIV Gen3 the SSD is "storage-side" in the external storage device, and EMC VFCache is "server-side" as a PCI Express [PCIe] card. Aside from that, both implement SSD as an additional read cache layer in front of spinning disk to boost performance. Neither is an industry first, as IBM has offered server-side SSD since 2007, and IBM and EMC have offered storage-side SSD in many of their other external storage devices. The use of SSD as read cache has already been available in IBM N series using [Performance Accelerator Module (PAM)] cards.
IBM has offered cooperative caching synergy between its servers and its storage arrays for some time now. The predecessors to today's POWER7-based servers were the iSeries i5 servers that used PCI-X IOP cards with cache to connect i5/OS applications to IBM's external disk and tape systems. To compete in this space, EMC created their own PCI-X cards to attach their own disk systems. In 2006, IBM did the right thing for our clients and fostered competition by entering into a [Landmark agreement] with EMC to [license the i5 interfaces]. Today, VIOS on IBM POWER systems allows a much broader choice of disk options for IBM i clients, including the IBM SVC, Storwize V7000 and XIV storage systems.
EMC is not the first to manufacture an SSD-based PCIe card. Last summer, my friends at Texas Memory Systems [TMS] gave away a [RAMsan-70 PCIe card] at an after-party on [Day 2 of the IBM System Storage University].
Can a little SSD really help performance? Yes! An IBM client running a [DB2 Universal Database] cluster across eight System x servers was able to replace an 800-drive EMC Symmetrix by putting eight Fusion-IO SSD cards in each server, for a total of 64 Solid-State drives, saving money and improving performance. DB2 has the Data Partitioning Feature, which lets multi-system DB2 configurations use a Grid-like architecture similar to how XIV is designed. Most IBM System x and BladeCenter servers support internal SSD storage options, and many offer PCIe slots for third-party SSD cards. Sadly, you can't do this with a VFCache card: you can have only one VFCache card in each server, the data is unprotected, and the card is suitable only for ephemeral data like transaction logs or other temporary data. With multiple Fusion-IO cards in an IBM server, you can configure a RAID rank across the SSD, and use it for persistent storage like DB2 databases.
Here then is my side-by-side comparison:
|Category||EMC VFCache||IBM XIV Gen3 SSD Caching|
|Servers supported||Selected x86-based models of Cisco UCS, Dell PowerEdge, HP ProLiant DL, and IBM xSeries and System x servers||All of these, plus any other blade or rack-optimized server currently supported by XIV Gen3, including Oracle SPARC, HP Itanium, IBM POWER systems, and even IBM System z mainframes running Linux|
|Operating System support||Linux RHEL 5.6 and 5.7, VMware vSphere 4.1 and 5.0, and Windows 2008 x64 and R2.||All of these, plus all the other operating systems supported by XIV Gen3, including AIX, IBM i, Solaris, HP-UX, and Mac OS X|
|Protocol support||FCP||FCP and iSCSI|
|Vendor-supplied driver required on the server||Yes, the VFCache driver must be installed to use this feature.||No, IBM XIV Gen3 uses native OS-based multi-pathing drivers.|
|External disk storage systems required||None, it appears the VFCache has no direct interaction with the back-end disk array, so in theory the benefits are the same whether you use this VFCache card in front of EMC storage or IBM storage||XIV Gen3 is required, as the SSD slots are not available on older models of IBM XIV.|
|Shared disk support||No, VFCache has to be disabled and removed for vMotion to take place.||Yes! XIV Gen3 SSD caching works with shared disk, so it supports VMware vMotion and Live Partition Mobility.|
|Support for multiple servers||No||An advantage of the XIV Gen3 SSD caching approach is that the cache can be dynamically allocated to the busiest data from any server or servers.|
|Support for active/active server clusters||No||Yes!|
|Aware of changes made to back-end disk||No, it appears the VFCache has no direct interaction with the back-end disk array, so any changes to the data on the box itself are not communicated back to the VFCache card itself to invalidate the cache contents.||Yes!|
|Sequential-access detection||None identified. However, VFCache only caches blocks 64KB or smaller, so any sequential processing with larger blocks will bypass the VFCache.||Yes! XIV algorithms detect sequential access and avoid polluting the SSD with these blocks of data.|
|Number of SSD supported||One, which seems odd as IBM supports multiple Fusion-IO cards for its servers. However, this is not really a single point of failure (SPOF), as an application experiencing a VFCache failure merely drops down to external disk array speed. No data is lost, since it is only read cache.||6 to 15 (one per XIV module) for high availability.|
|Pin data in SSD cache||Yes, using split-card mode, you can designate a portion of the 300GB to serve as Direct-attached storage (DAS). All data written to the DAS portion will be kept in SSD. However, since only one card is supported per server and the data is unprotected, this should only be used for ephemeral data like logs and temp files.||No, there is no option to designate an XIV Gen3 volume to be SSD-only. Consider using Fusion-IO PCIe card as a DAS alternative, or another IBM storage system for that requirement.|
|Pre-sales Estimating tools||None identified||Yes! CDF and Disk Magic tools are available to help cost-justify the purchase of SSD based on workload performance analysis.|
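Two rows in the table above are easier to see in code than in prose: the 64KB size cut-off, and what happens when a server-side cache never hears about changes made on the back-end array. The sketch below is a toy model of my own and not VFCache's actual logic; the `ServerSideCache` class and the dictionary standing in for the disk array are hypothetical, and the only figure taken from the table is the 64KB limit.

```python
MAX_CACHED_IO = 64 * 1024  # bytes; per the table, larger reads bypass the cache


class ServerSideCache:
    """Toy server-side read cache with no notification channel from the array."""

    def __init__(self, array):
        self.array = array   # dict: volume offset -> bytes, standing in for the disk array
        self.cached = {}     # offset -> bytes held on the (hypothetical) card

    def read(self, offset):
        if offset in self.cached:
            return self.cached[offset]       # served from the card, array not consulted
        data = self.array[offset]
        if len(data) <= MAX_CACHED_IO:       # only small reads are kept
            self.cached[offset] = data
        return data


array = {0: b"v1", 4096: b"x" * (128 * 1024)}
card = ServerSideCache(array)

first = card.read(0)        # small read: cached on the card
card.read(4096)             # 128KB read: bypasses the cache

# A change made on the array itself, for example by another server
# or a snapshot restore, is never reported back to the card.
array[0] = b"v2"
stale = card.read(0)
print(first, stale, 4096 in card.cached)
```

In this toy model, the second read of offset 0 still returns the old contents, which is the scenario the "Aware of changes made to back-end disk" row is concerned with. A storage-side cache sits where those changes happen, so it can invalidate the affected blocks itself.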
IBM has the advantage that it designs and manufactures both servers and storage, and can design optimal solutions for our clients in that regard.