Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Systems Client Experience Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Have you ever noticed that sometimes two movies come out that seem eerily similar to each other, released by different studios within months or weeks of each other? My sister used to review film scripts for a living, she would read ten of them and have to pick her top three favorites, and tells me that scripts for nearly identical concepts came all the time. Here are a few of my favorite examples:
1994: [Wyatt Earp] and [Tombstone] were Westerns recounting the famed gunfight at the O.K. Corral. Tombstone, Arizona is near Tucson, and the gunfight is recreated fairly often for tourists.
1998: [Armageddon] and [Deep Impact] were a pair of disaster movies dealing with a large rock heading to destroy all life on earth. I was in Mazatlan, Mexico to see the latter, dubbed in Spanish as "Impacto Profundo".
1998: [A Bug's Life] and [Antz] were computer-animated tales of the struggle of one individual ant in an ant colony.
2000: [Mission to Mars] and [Red Planet] were sci-fi pics exploring what a manned mission to our neighboring planet might entail.
This is different than copy-cat movies that are re-made or re-imagined many years later based on the previous successes of an original. Ever since my blog post [VPLEX: EMC's Latest Wheel is Round] in 2010 comparing EMC's copy-cat product that came our seven years after IBM's SAN Volume Controller (SVC), I've noticed EMC doesn't talk about VPLEX that much anymore.
This week, IBM announced [XIV Gen3 Solid-State Drive support] and our friends over at EMC announced [VFCache SSD-based PCIe cards]. Neither of these should be a surprise to anyone who follows the IT industry, as IBM had announced its XIV Gen3 as "SSD-Ready" last year specifically for this purpose, and EMC has been touting its "Project Lightning" since last May.
Fellow blogger Chuck Hollis from EMC has a blog post [VFCache means Very Fast Cache indeed] that provides additional detail. Chuck claims the VFCache is faster than popular [Fusion-IO PCIe cards] available for IBM servers. I haven't seen the performance spec sheets, but typically SSD is four to five times slower than the DRAM cache used in the XIV Gen3. The VFCache's SSD is probably similar in performance to the SSD supported in the IBM XIV Gen3, DS8000, DS5000, SVC, N series, and Storwize V7000 disk systems.
Nonetheless, I've been asked my opinions on the comparison between these two announcements, as they both deal with improving application performance through the use of Solid-State Drives as an added layer of read cache.
(FTC Disclosure: I am both a full-time employee and stockholder of the IBM Corporation. The U.S. Federal Trade Commission may consider this blog post as a paid celebrity endorsement of IBM servers and storage systems. This blog post is based on my interpretation and opinions of publicly-available information, as I have no hands-on access to any of these third-party PCIe cards. I have no financial interest in EMC, Fusion-IO, Texas Memory Systems, or any other third party vendor of PCIe cards designed to fit inside IBM servers, and I have not been paid by anyone to mention their name, brands or products on this blog post.)
The solutions are different in that IBM XIV Gen3 the SSD is "storage-side" in the external storage device, and EMC VFCache is "server-side" as a PCI Express [PCIe] card. Aside from that, both implement SSD as an additional read cache layer in front of spinning disk to boost performance. Neither is an industry first, as IBM has offered server-side SSD since 2007, and IBM and EMC have offered storage-side SSD in many of their other external storage devices. The use of SSD as read cache has already been available in IBM N series using [Performance Accelerator Module (PAM)] cards.
IBM has offered cooperative caching synergy between its servers and its storage arrays for some time now. The predecessor to today's POWER7-based were the iSeries i5 servers that used PCI-X IOP cards with cache to connect i5/OS applications to IBM's external disk and tape systems. To compete in this space, EMC created their own PCI-X cards to attach their own disk systems. In 2006, IBM did the right thing for our clients and fostered competition by entering in a [Landmark agreement] with EMC to [license the i5 interfaces]. Today, VIOS on IBM POWER systems allows a much broader choice of disk options for IBM i clients, including the IBM SVC, Storwize V7000 and XIV storage systems.
Can a little SSD really help performance? Yes! An IBM client running a [DB2 Universal Database] cluster across eight System x servers was able to replace an 800-drive EMC Symmetrix by putting eight SSD Fusion-IO cards in each server, for a total of 64 Solid-State drives, saving money and improving performance. DB2 has the Data Partitioning Feature that has multi-system DB2 configurations using a Grid-like architecture similar to how XIV is designed. Most IBM System x and BladeCenter servers support internal SSD storage options, and many offer PCIe slots for third-party SSD cards. Sadly, you can't do this with a VFCache card, since you can have only one VFCache card in each server, the data is unprotected, and only for ephemeral data like transaction logs or other temporary data. With multiple Fusion-IO cards in an IBM server, you can configure a RAID rank across the SSD, and use it for persistent storage like DB2 databases.
Here then is my side-by-side comparison:
IBM XIV Gen3 SSD Caching
Selected x86-based models of Cisco UCS, Dell PowerEdge, HP ProLiant DL, and IBM xSeries and System x servers
All of these, plus any other blade or rack-optimized server currently supported by XIV Gen3, including Oracle SPARC, HP Titanium, IBM POWER systems, and even IBM System z mainframes running Linux
Operating System support
Linux RHEL 5.6 and 5.7, VMware vSphere 4.1 and 5.0, and Windows 2008 x64 and R2.
All of these, plus all the other operating systems supported by XIV Gen3, including AIX, IBM i, Solaris, HP-UX, and Mac OS X
FCP and iSCSI
Vendor-supplied driver required on the server
Yes, the VFCache driver must be installed to use this feature.
No, IBM XIV Gen3 uses native OS-based multi-pathing drivers.
External disk storage systems required
None, it appears the VFCache has no direct interaction with the back-end disk array, so in theory the benefits are the same whether you use this VFCache card in front of EMC storage or IBM storage
XIV Gen3 is required, as the SSD slots are not available on older models of IBM XIV.
Shared disk support
No, VFCache has to be disabled and removed for vMotion to take place.
Yes! XIV Gen3 SSD caching shared disk supports VMware vMotion and Live Partition Mobility.
Support for multiple servers
An advantage of the XIV Gen3 SSD caching approach is that the cache can be dynamically allocated to the busiest data from any server or servers.
Support for active/active server clusters
Aware of changes made to back-end disk
No, it appears the VFCache has no direct interaction with the back-end disk array, so any changes to the data on the box itself are not communicated back to the VFCache card itself to invalidate the cache contents.
None identified. However, VFCache only caches blocks 64KB or smaller, so any sequential processing with larger blocks will bypass the VFCache.
Yes! XIV algorithms detect sequential access and avoid polluting the SSD with these blocks of data.
Number of SSD supported
One, which seems odd as IBM supports multiple Fusion-IO cards for its servers. However, this is not really a single point of failure (SPOF) as an application experiencing a VFCache failure merely drops down to external disk array speed, no data is lost since it is only read cache.
6 to 15 (one per XIV module) for high availability.
Pin data in SSD cache
Yes, using split-card mode, you can designate a portion of the 300GB to serve as Direct-attached storage (DAS). All data written to the DAS portion will be kept in SSD. However, since only one card is supported per server and the data is unprotected, this should only be used for ephemeral data like logs and temp files.
No, there is no option to designate an XIV Gen3 volume to be SSD-only. Consider using Fusion-IO PCIe card as a DAS alternative, or another IBM storage system for that requirement.
Pre-sales Estimating tools
Yes! CDF and Disk Magic tools are available to help cost-justify the purchase of SSD based on workload performance analysis.
IBM has the advantage that it designs and manufactures both servers and storage, and can design optimal solutions for our clients in that regard.
If you store your VMware bits on external SAN or NAS-based disk storage systems, this post is for you. The subject of the post, VM Volumes, is a potential storage management game changer!
Fellow blogger Stephen Foskett mentioned VM Volumes in his [Introducing VMware vSphere Storage Features] presentation at IBM Edge 2012 conference. His session on VMware's storage features included VMware APIs for Array Integration (VAAI), VMware Array Storage Awareness (VASA), vCenter plug-ins, and a new concept he called "vVol", now more formally known as VM Volumes. This post provides a follow-up to this, describing the VM Volumes concepts, architecture, and value proposition.
"VM Volumes" is a future architecture that VMware is developing in collaboration with IBM and other major storage system vendors. So far, very little information about VM Volumes has been released. At VMworld 2012 Barcelona, VMware highlights VM Volumes for the first time and IBM demonstrates VM Volumes with the IBM XIV Storage System (more about this demo below). VM Volumes is worth your attention -- when it becomes generally available, everyone using storage arrays will have to reconsider their storage management practices in a VMware environment -- no exaggeration!
But enough drama. What is this all about?
(Note: for the sake of clarity, this post refers to block storage only. However, the VM Volumes feature applies to NAS systems as well. Special thanks to Yossi Siles and the XIV development team for their help on this post!)
The VM Volumes concept is simple: VM disks are mapped directly to special volumes on a storage array system, as opposed to storing VMDK files on a vSphere datastore.
The following images illustrate the differences between the two storage management paradigms.
You may still be asking yourself: bottom line, how will I benefit from VM Volumes?
Well, take a VM snapshot for example. With VM Volumes, vSphere can simply offload the operation by invoking a hardware snapshot of the hardware volume. This has significant implications:
VM-Granularity: Only the right VMs are copied (with datastores, backing up or cloning individual-VM portions of hardware snapshot of a datastore would require more complex configuration, tools and work)
Hardware Offload: No ESXi server resources are consumed
XIV advantage: With XIV, snapshots consume no space upfront and are completed instantly.
Here's the first takeaway: With VM Volumes, advanced storage services (which cost a lot when you buy a storage array), will become available at an individual VM level. In a cloud world, this means that applications can be provisioned easily with advanced storage services, such as snapshots and mirroring.
Now, let's take a closer look at another relevant scenario where VM Volumes will make a lot of difference - provisioning an application with special mirroring requirements:
VM Volumes case: The application is ordered via the private cloud portal. The requestor checks a box requesting an asynchronous mirror. He changes the default RPO for his needs. When the request is submitted, the process wraps up automatically: Volumes are created on one of the storage arrays, configured with a mirror and RPO exactly as specified. A few minutes later, the requestor receives an automatic mail pointing to the application virtual machine.
Datastores case #1: As may be expected, a datastore that is mirrored with the special RPO does not exist. As a result, the automated workflow sets a pending status on the request, creates an urgent ticket to a VMware administrator and aborts. When the VMware admin handles that ticket, she re-assigns the ticket to the storage administrator, asking for a new volume which is mirrored with the special RPO, and mapped to the right ESXi cluster. The next day, the volume is created; the ticket is re-assigned to the storage admin, with the new LUN being pointed to. The VMware administrator follows and creates the datastore on top of it. Since the automated workflow was aborted, the admin re-assigns the ticket to the cloud administrator, who sometime later completes the application provisioning manually.
Datastores case #2: Luckily for the requestor, a datastore that is mirrored with the special RPO does exist. However, that particular datastore is consuming space from a high performance XIV Gen3 system with SSD caching, while the application does not require that level of performance, so the workflow requires a storage administrator approval. The approval is given to save time, but the storage administrator opens a ticket for himself to create a new volume on another array, as well as a follow-up ticket for the VMware admin to create a new datastore using the new volume and migrate the application to the other datastore. In this case, provisioning was relatively rapid, but required manual follow up, involving the two administrators.
Here's the second takeaway: With VM Volumes, management is simplified, and end-to-end automation is much more applicable. The reason is that there are no datastores. Datastores physically group VMs that may otherwise be totally unrelated, and require close coordination between storage and VMware administrators.
Now, the above mainly focuses on the VMware or cloud administrator perspective. How does VM Volumes impact storage management?
VM's are the new hosts: Today, storage administrators have visibility of physical hosts in their management environment. In a non-virtualized environment, this visibility is very helpful. The storage administrator knows exactly which applications in a data center are storage-provisioned or affected by storage management operations because the applications are running on well-known hosts. However, in virtualized environments the association of an application to a physical host is temporary. To keep at least the same level of visibility as in physical environments, VMs should become part of the storage management environment, like hosts. Hosts are still interesting, for example to manage physical storage mapping, but without VM visibility, storage administrators will know less about their operation than they are used to, or need to. VM Volumes enables such visibility, because volumes are provided to individual VMs. The XIV VM Volumes demonstration at VMworld Barcelona, although experimental, shows a view of VM volumes, in XIV's management GUI.
Here's a screenshot:
That's not all!
Storage Profiles and Storage Containers: A Storage Profile is a vSphere specification of a set of storage services. A storage profile can include properties like thin or thick provisioning, mirroring definition, snapshot policy, minimum IOPS, etc.
Storage administrators define a portfolio of supported storage services, maintained as a set of storage profiles, and published (via VASA integration) to vSphere.
VMware or cloud administrators define the required storage profiles for specific applications
VMware and storage administrators need to coordinate the typical storage requirements and the automatically-available storage services. When a request to provision an application is made, the associated storage profiles are matched against the published set of available storage profiles. The matching published profiles will be used to create volumes, which will be bound to the application VMs. All that will happen automatically.
Note that when a VM is created today, a datastore must be specified. With VM Volumes, a new management entity called Storage Container (also known as Capacity Pool) replaces the use of datastore as a management object. Each Storage Container exposes a subset of the available storage profiles, as appropriate. The storage container also has a capacity quota.
Here are some more takeaways:
New way to interface vSphere and storage management: Storage administrators structure and publish storage services to vSphere via storage profiles and storage containers.
Automated provisioning, out of the box: The provisioning process automatically matches application-required storage profiles against storage profiles available from the specified storage containers. There is no need to build custom scripts and custom processes to automate storage provisioning to applications
The XIV advantage:
XIV services are very simple to define and publish. The typical number of available storage profiles would be low. It would also be easy to define application storage profiles.
XIV provides consistent high performance, up to very high capacity utilization levels, without any maintenance. As a result, automated provisioning (which inherently implies less human attention) will not create an elevated risk of reduced performance.
Note: A storage vendor VASA provider is required to support VM Volumes, storage profiles, storage containers and automated provisioning. The IBM Storage VASA provider runs as a standalone service that needs to be deployed on a server.
To summarize the VM Volumes value proposition:
Streamline cloud operation by providing storage services at VM and application level, enabling end-to-end provisioning automation, and unifying VMware and storage administration around volumes and VMs.
Increase storage array ROI, improve vSphere scalability and response time, and reduce cloud provisioning lag, by offloading VM-level provisioning, failover, backup, storage migration, storage space recycling, monitoring, and more, to the storage array, using advanced storage operations such as mirroring and snapshots.
Simplify the adoption of VM Volumes using XIV, with smaller and simpler sets of storage profiles. Apply XIV's supreme fast cloning to individual VMs, and keep automation risks at bay with XIV's consistent high performance.
Until you can get your hands on a VM Volumes-capable environment, the VMware and IBM developer groups will be collaborating and working hard to realize this game-changing feature. The above information is definitely expected to trigger your questions or comments, and our development teams are eager to learn from them and respond. Enter your comments below, and I will try to answer them, and help shape the next post on this subject. There's much more to be told.
Am I dreaming? On his Storagezilla blog, fellow blogger Mark Twomey (EMC) brags about EMC's standard benchmark results, in his post titled [Love Life. Love CIFS.]. Here is my take:
A Full 180 degree reversal
For the past several years, EMC bloggers have argued, both in comments on this blog, and on their own blogs, that standard benchmarks are useless and should not be used to influence purchase decisions. While we all agree that "your mileage may vary", I find standard benchmarks are useful as part of an overall approach in comparing and selecting which vendors to work with, and which architectures or solution approaches to adopt, and which products or services to deploy. I am glad to see that EMC has finally joined the rest of the planet on this. I find it funny this reversal sounds a lot like their reversal from "Tape is Dead" to "What? We never said tape was dead!"
Impressive CIFS Results
The Standard Performance Evaluation Corporation (SPEC) has developed a series of NFS benchmarks, the latest, [SPECsfs2008] added support for CIFS. So, on the CIFS side, EMC's benchmarks compare favorably against previous CIFS tests from other vendors.
On the NFS side, however, EMC is still behind Avere, BlueArc, Exanet, and IBM/NetApp. For example, EMC's combination of Celerra gateways in front of V-Max disk systems resulted in 110,621 OPS with overall response time of 2.32 milliseconds. By comparison, the IBM N series N7900 (tested by NetApp under their own brand, FAS6080) was able to do 120,011 OPS with 1.95 msec response time.
Even though Sun invented the NFS protocol in the early 1980s, they take an EMC-like approach against standard benchmarks to measure it. Last year, fellow blogger Bryan Cantrill (Sun) gives his [Eulogy for a Benchmark]. I was going to make points about this, but fellow blogger Mike Eisler (NetApp) [already took care of it]. We can all learn from this. Companies that don't believe in standard benchmarks can either reverse course (as EMC has done), or continue their downhill decline until they are acquired by someone else.
(My condolences to those at Sun getting laid off. Those of you who hire on with IBM can get re-united with your former StorageTek buddies! Back then, StorageTek people left Sun in droves, knowing that Sun didn't understand the mainframe tape marketplace that StorageTek focused on. Likewise, many question how well Oracle will understand Sun's hardware business in servers and storage.)
What's in a Protocol?
Both CIFS and NFS have been around for decades, and comparisons can sometimes sound like religious debates. Traditionally, CIFS was used to share files between Windows systems, and NFS for Linux and UNIX platforms. However, Windows can also handle NFS, while Linux and UNIX systems can use CIFS. If you are using a recent level of VMware, you can use either NFS or CIFS as an alternative to Fibre Channel SAN to store your external disk VMDK files.
The Bigger Picture
There is a significant shift going on from traditional database repositories to unstructured file content. Today, as much as [80 percent of data is unstructured]. Shipments this year are expected to grow 60 percent for file-based storage, and only 15 percent for block-based storage. With the focus on private and public clouds, NAS solutions will be the battleground for 2010.
So, I am glad to see EMC starting to cite standard benchmarks. Hopefully, SPC-1 and SPC-2 benchmarks are forthcoming?
A lot was announced this week, so I decided to break it up into several separate posts. This is part 3 in my 3-part series, focusing on our Tivoli Storage products.
To read the rest of the series, see:
The latest release of FlashCopy Manager now supports NetApp and IBM N series storage devices. This provides application-aware snapshots, coordinated with applications like SAP, DB2 and Oracle.
FlashCopy Manager now integrates with Metro and Global Mirror capabilities, so that application-consistent copies are available at remote sites for disaster recovery, or to off-load the FlashCopy destination copy from disk to Tivoli Storage Manager storage pools.
Tivoli Storage Manager v6.4
IBM Tivoli Storage Manager is part of IBM's Unifed Recovery Management. Here are some highlights:
Enhanced Reporting. Cognos reporting to monitor backup and archive environments.
TSM for ERP. I remember when these were called "Tivoli Data Protection" modules. We still refer to them as "TDPs". The TSM for ERP provides backup capability for SAP environments, and this latest release adds support for in-memory SAP HANA databases.
TSM for Virtualization Environments IBM TSM is famous for its patented "Progressive Incremental Backup" which is far more efficient than full+incrementals or full+differentials. IBM now extends this method to VM images. With people consolidating more and more VMs onto fewer host servers, TSM-VE now offers multiple backup streams in parallel. TSM-VE can now take application-aware backups of Microsoft Exchange, SQL Server, and Active Directory running in VMs. TSM-VE will also support vApp and VM templates. If it takes you [a day and a half to build a VMware template], you would want to make sure all that work was backed up, right?
Enhanced Security. Complex password support and improved user authentication and management by integration with Lightweight Directory Access Protocol (LDAP)
This week, I am in Taipei, teaching Top Gun class. There was concern that another typhoon would hit the island of Taiwan later this week, but it looks like it is now headed for Hong Kong instead.
Elsewhere in the world, there are several events going on next week, so I thought I would bring them to your attention.
ECTY - South Africa
Next week, Jerry Kluck, IBM Global Sales Executive for Storage Optimization and Integration Services, will be the keynote speaker at "Edge Comes to You" (ECTY) conference in South Africa. This is a one-day event, similar to the [ECTY event in Moscow, Russia] that I spoke at last June.
Here is the schedule for South Africa next week:
Monday, August 20, 2012 - Johannesburg
Wednesday, August 22, 2012 - Cape Town
(I have been to both Jo'burg and Cape Town back in 1994. A month after Apartheid had just ended, I was part of a small group of IBMers sent to re-establish IBM's business operations there. I would have liked to have attended the events next week, not just to hear Jerry speak, but also to see how much the country has changed over the past 18 years, but I could not get a work permit in time.)
If you are interested in attending either of these next week, contact your local IBM Business Partner or sales rep to attend.
Forrester's Total Economic Impact Study of Virtualized Storage
Virtualized storage can help organizations stretch their storage investment dollar and storage administration and management resources. Jon Erickson from Forrester Research will review the latest findings from IBM SAN Volume Control (SVC) users studied as part of the recently completed Forrester Total Economic Impact Study of IBM System Storage SAN Volume Controller.
Date: Tuesday, August 21, 2012
Time: 10:00 AM PDT / 1:00 PM EDT
Duration: 60 minutes
Among the findings, users were able to:
Avoid the capital cost of additional storage
Increase IT productivity
Provide greater end user data availability
The second presenter is Chris Saul, IBM Storage Virtualization Manager, who will explain how SVC can manage heterogeneous disk from a single point of control, autonomously manage tiered disk storage and can store up to five times as much data on your existing disk using IBM Real-time Compression.
Not all virtualization solutions are created equal! That's true for storage virtualization, like the SAN Volume Controller mentioned above, and it's true for server virtualization as well.
This webcast discusses the real-world impact on businesses that deploy IBM's PowerVM®
virtualization technology as compared to those using Oracle® VM for SPARC (OVM SPARC), Microsoft® Hyper-V, VMware® vSphere or other competing products.
Date: Wednesday, August 22, 2012
Time: 10:00 AM PDT / 1:00 PM EDT
Duration: 60 minutes
This webcast will include findings from a [Solitaire Interglobal] study of over 61,000 customer sites on the value of virtualization from a business perspective and how IBM's PowerVM provides real business value.
Other key discussion points that will be covered during this webcast include:
Behavioral characteristics of server virtualization technologies that were examined and analyzed from survey participant's environments
How IT colleagues were able to obtain a faster time-to-market for business initiatives when using IBM PowerVM
Why the learning curve time for PowerVM is as much as 2.58 times faster than for other offerings
Why VM reboot comparisons for PowerVM vs competitive platforms resulted in downtime of 5.5 times less than with other options
A TCO reduction of up to 71.4% for PowerVM compared to alternative options
This webcast will also feature an in-depth discussion on the IBM PowerVM solution from an IBM product expert who will share the unique virtualization features available when PowerVM is utilized within the IBM Power Systems™ environment.