This week, I am in Las Vegas for [Edge 2016], IBM's premier IT infrastructure conference of the year.
Day 4, the last day of the conference, is only a partial day, and many people opted to leave on Wednesday evening or Thursday morning instead. The breakfast and lunch meals had fewer people than on previous days. Here is my recap of the Day 4 Thursday breakout sessions.
Building Hyperconverged Infrastructure for Next-Generation Workloads
Supermicro is more than happy to customize these, upgrading the CPU, RAM, disk or networking connectivity as needed. This solution is roughly half the price of Nutanix, and offers a better Next-Business-Day/9am-to-5pm support package.
Tony Pearson Presents IBM's Cloud Storage Options
The last time I was in Las Vegas, I presented this topic at the [IBM InterConnect conference]. Back then, I was given only 20 minutes and was placed on the Solutions EXPO showroom floor, competing with the noise and traffic of attendees heading to lunch.
This time it was much better: a large room, and a bigger-than-expected audience given that it was scheduled on Thursday morning.
Cloud storage comes in four flavors: persistent, ephemeral, hosted, and reference. The first two I refer to as "Storage for the Computer Cloud" and the latter two I refer to as "Storage as the Storage Cloud".
I also explained the differences between block, file and object access, and why different Cloud storage types use different access methods. I wrapped up the session covering the various storage solutions that IBM offers for all four Cloud Storage types.
IBM Storwize and IBM FlashSystem with VersaStack versus NetApp FlexPod
Norm Patten, part of the IBM Competitive Project Office Storage Team, presented a competitive comparison between VersaStack with IBM storage, versus FlexPod with NetApp storage.
Commodity Solid State Drives (SSD) and Shingled Magnetic Recording [SMR] offer low-cost, high-capacity storage.
However, they have their own set of problems, so IBM is developing software that can be included in IBM Spectrum Accelerate, Spectrum Scale, and Spectrum Virtualize to optimize their utility.
The concept of Log-Structured Array has been around since 1988. The IBM RAMAC Virtual Array back in the 1990s used it. NetApp's Write-Anywhere File System (WAFL) is an implementation of the [Log-Structured File System] general concept.
SALSA combines Log-Structured Array with enhancements borrowed from the IBM FlashSystem design, which I covered in my Monday and Wednesday presentations, to improve write endurance by as much as 4.6 times!
This was an NDA session, so I cannot blog any of the details.
World-class Flash-optimized Data Reduction and Efficiency with IBM FlashSystem A9000 and A9000R
Tomer Carmeli, IBM Offering Manager for the A9000 and A9000R, presented. He gave an overview of these models on Monday, so this session focused on the data footprint reduction technologies.
Basically, it is a three-step process. First, all "standard patterns" are removed. IBM has identified some 260 standard patterns of 8KB in length, such as all zeros, all ones, or all spaces, and immediately replaces these blocks with a pattern token.
Second, [SHA-1] 20-byte hash codes are computed on 8KB pieces on a rolling 4KB alignment boundary. In other words, if a 64KB block of data is written, bytes 0-to-8KB are hashed and compared to existing hash codes. If there is no match, then bytes 4KB-to-12KB are hashed, and so on. This approach nearly doubles the likelihood of finding duplicates. When a matching block is found, the algorithm replaces it with a pointer and a reference count.
Third, any unique data that still remains is compressed using the Lempel-Ziv algorithm. This is done with the [Intel® QuickAssist] co-processor, which can compress data 20 times faster than software algorithms running on general-purpose x86 processors.
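To make this flow concrete, here is a minimal Python sketch of the three-step pipeline. This is my own illustration, not IBM's code: a three-entry pattern table, an in-memory dictionary, and zlib stand in for the real 260-pattern table, the hash metadata store, and the QuickAssist hardware.

```python
import hashlib
import zlib

CHUNK = 8 * 1024   # 8KB hashing window
ALIGN = 4 * 1024   # rolling 4KB alignment boundary

# Stand-ins for a few of IBM's ~260 well-known 8KB patterns.
STANDARD_PATTERNS = {b"\x00" * CHUNK: 0, b"\xff" * CHUNK: 1, b" " * CHUNK: 2}

hash_store = {}    # 20-byte SHA-1 digest -> (block pointer, reference count)

def reduce_write(data: bytes) -> list:
    """Reduce one write to pattern tokens, dedupe pointers, and compressed spans."""
    out, offset, unique_from = [], 0, 0

    def flush_unique(end):
        # Step 3: anything that matched no pattern or duplicate is LZ-compressed.
        if end > unique_from:
            out.append(("compressed", zlib.compress(data[unique_from:end])))

    while offset + CHUNK <= len(data):
        chunk = data[offset:offset + CHUNK]
        if chunk in STANDARD_PATTERNS:             # Step 1: pattern removal
            flush_unique(offset)
            out.append(("pattern", STANDARD_PATTERNS[chunk]))
            offset = unique_from = offset + CHUNK
            continue
        digest = hashlib.sha1(chunk).digest()      # Step 2: rolling dedupe hash
        if digest in hash_store:
            ptr, refs = hash_store[digest]
            hash_store[digest] = (ptr, refs + 1)   # duplicate: pointer + ref count
            flush_unique(offset)
            out.append(("dedupe", ptr))
            offset = unique_from = offset + CHUNK
        else:
            hash_store[digest] = (len(hash_store), 1)
            offset += ALIGN                        # slide the window 4KB, try again
    flush_unique(len(data))
    return out
```

On a second write of the same (pattern-free) buffer, every 8KB window hits the hash store, so the function returns only dedupe pointers, which is the effect the rolling 4KB alignment is designed to maximize.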
Do you want an estimate of how much "reduction ratio" you might achieve? IBM has developed two estimator tools to help. The first tool is a complete scan for data expected to be dedupe-friendly, such as Virtual Desktop Infrastructure images or backup copies. It is a slow process, taking 8 hours per TB.
The second tool is the well-known [Comprestimator] that IBM has offered for a while to help estimate compression savings for IBM Spectrum Virtualize storage solutions like SVC, Storwize and FlashSystem V9000. This tool is very fast, looking at only a statistically valid subset of the data.
The results of both tools are merged, and the combined estimate is accurate to within five percent. This allows IBM to offer guidance on which data to place on these new A9000 and A9000R models, as well as offer a "reduction ratio" guarantee.
A client asked me why I bother to attend other sessions, when I probably know most of the material they present. I explained that I can always learn from others. I can honestly say that I learned something new and useful at every session I attended.
This week, I am in Las Vegas for [Edge 2016], IBM's premier IT infrastructure conference of the year. Here is my recap of Day 3 Wednesday.
Become your own Storage Consultant
Gary Graham, IBM Field Technical Specialist for Storage, and Brian Pioreck, IBM Client Technical Specialist for Storage, co-presented this session. This session explained how to use IBM's 30-day free trial of IBM Spectrum Control Storage Insights, a cloud-based services offering.
(Note: 15 years ago, I was the chief architect of version 1 of what we now call IBM Spectrum Control. I am pleased to see how well this product has evolved over the years.)
Storage Insights provides a reporting-only subset of the popular IBM Spectrum Control Standard and Advanced editions. It reports on IBM storage devices, as well as any non-IBM devices that are virtualized behind IBM Spectrum Virtualize products like SAN Volume Controller (SVC), Storwize, and FlashSystem V9000.
If you are a storage administrator, consider trying this out for 30 days to get some immediate results. Since it is cloud-based, you only need a Windows, Linux or AIX system on site to install a "collector". This collector sends data up to the Cloud at one of IBM's SoftLayer facilities. The installation process takes only 30 minutes, and you can download the code from the Internet.
If you find Storage Insights valuable, helping you reclaim some unused space or providing other insights that save your company money, consider buying the service, for only 250 US dollars per 50 TB monitored. If you want more than just monitoring and reporting, consider one of the on-premises solutions like IBM Spectrum Control Standard or Advanced edition, which provide provisioning and configuration capabilities as well.
Enhance your Security posture with At-Rest Encryption using the latest IBM Spectrum Virtualize
All of the IBM Spectrum Virtualize products support Data-at-Rest Encryption. For direct-attached storage, the 12Gb SAS controller performs hardware-assisted encryption.
For SAN-attached storage via FCP, FCoE or iSCSI back-end devices, IBM uses the [AES-NI instruction set] built into certain Intel processors.
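To illustrate what sector-level data-at-rest encryption involves (my own sketch, not IBM's implementation), here is the general technique in Python using the cryptography package: AES in XTS mode, the usual choice for encrypting storage, with a per-sector tweak that ties each ciphertext to its location. On real controllers, these same AES rounds are what the SAS chip or the AES-NI instructions accelerate.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

# Illustrative data key; real systems wrap and manage keys externally.
# The 64-byte XTS key is actually two AES-256 keys used together.
data_key = os.urandom(64)

def encrypt_sector(plaintext: bytes, sector_number: int) -> bytes:
    """Encrypt one sector; the tweak binds the ciphertext to its sector number."""
    tweak = sector_number.to_bytes(16, "little")
    enc = Cipher(algorithms.AES(data_key), modes.XTS(tweak)).encryptor()
    return enc.update(plaintext) + enc.finalize()

def decrypt_sector(ciphertext: bytes, sector_number: int) -> bytes:
    tweak = sector_number.to_bytes(16, "little")
    dec = Cipher(algorithms.AES(data_key), modes.XTS(tweak)).decryptor()
    return dec.update(ciphertext) + dec.finalize()

sector = encrypt_sector(b"A" * 512, 42)
assert decrypt_sector(sector, 42) == b"A" * 512
```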
Tony Pearson Presents IBM Cloud Object Storage System and Its Applications
In November 2015, [IBM acquired Cleversafe] for $1.3 billion US dollars, because Cleversafe has the brand-name recognition as the #1 Object Storage vendor two years running (2014 and 2015). On July 1 of this year, the transition was complete, and their flagship product was officially renamed the IBM Cloud Object Storage System, which some abbreviate informally as IBM COS.
Since then, IBM has been busy integrating IBM COS into the rest of the storage portfolio. I explained how IBM COS can be used for all kinds of static-and-stable data, but it is not suited for frequently changed data, such as virtual machines or databases.
Object storage can be accessed via NFS or SMB NAS protocols using a gateway product, like IBM Spectrum Scale, or those from third-party partners like Ctera, Avere, Nasuni or Panzura. It can also be used as an alternative to tape for backup copies, and is already supported by major backup software like IBM Spectrum Protect, Commvault Simpana, and Veritas NetBackup.
The Pendulum Swings Back: Tony Pearson Explains Converged and Hyperconverged Environments
A few years ago, I explained to a client that Converged and Hyperconverged were like a pendulum swinging back. Over the past few decades, we have gone from internal disk, to externally attached disk, to SAN and LAN networks.
Each time, we gained more flexibility, greater connectivity and longer distances. Converged and Hyperconverged are like going backwards, the pendulum swinging back to the days of internal and direct-attached storage. The analogy was a hit, and thus this session was born!
IBM offers multiple Converged Systems. IBM PureSystems, PureData, PurePower and PureApplication solutions offer racks of compute, storage and network gear. Last year, IBM collaborated with Cisco to create VersaStack, a converged system that combines Cisco's x86 blade servers and switches with IBM FlashSystem and Storwize products.
IBM also offers Hyperconverged solutions. IBM Spectrum Accelerate allows the compute, storage and network functions to run on 3 to 15 VMware ESXi hosts to form a cluster. The cluster can then make iSCSI-based volumes available to other virtual machines running on these same hosts. The volumes can also be made available to servers outside the cluster, such as bare-metal servers or other hypervisors. This is available as software-only, or as a pre-built system called the Supermicro Hyperconvergence Appliance.
IBM Spectrum Scale provides a clustered file system that allows the compute, storage and network functions to run on 3 to 16,000 machines. Formerly called General Parallel File System (GPFS), IBM Spectrum Scale has been around for over 18 years. Over 200 of the world's largest "Top 500" supercomputers run IBM Spectrum Scale today.
IBM Spectrum Virtualize and IBM Storwize Birds-of-a-Feather
Barry Whyte, fellow blogger and IBM Master Inventor, presented an overview of the latest features, and where IBM is headed in 2017 for the IBM Spectrum Virtualize family of products. Barry now works in Advanced Technical Skills for Storage Virtualization Asia/Pacific Region.
The group then moved to another room, offering delicious food and drink, as Eric Stouffer, IBM Director, Storwize Offering Manager and Business Line Executive, presented the future areas that IBM is considering for this product family.
All of this was done under Non-Disclosure Agreements (NDA), preventing me from blogging any details. Back in 2003, Las Vegas started a marketing campaign ["What Happens in Vegas, Stays in Vegas"]. Coincidentally, this is the same year IBM introduced the IBM SAN Volume Controller, the first product in the IBM Spectrum Virtualize family.
This was a long day, but I was pleased with the large audiences I had at my sessions.
This week, I am in Las Vegas for [Edge 2016], IBM's premier IT infrastructure conference of the year. Here is my recap of breakout sessions on Day 2.
Introducing IBM FlashSystem A9000 and A9000R: Grid Architecture Designed for the Hybrid Cloud
Tomer Carmeli, IBM Offering Manager for the A9000 and A9000R, presented. Both models offer data-at-rest encryption, snapshots, remote mirroring, and data footprint reduction assuming a 5.26:1 ratio, a combination of pattern removal, data deduplication and hardware-assisted Real-time Compression.
The A9000 is an 8U-high pod that can fit into existing racks. It comes in 60TB, 150TB and 300TB effective capacities.
The A9000R includes its own 42U rack. The rack is organized as two to six "grid elements" combined with two InfiniBand switches. Grid elements come in 150TB and 300TB effective capacities, giving you up to a whopping 1.8 PB in a single rack!
Similar to the IBM XIV and IBM Spectrum Accelerate offerings, the A9000 and A9000R support Hyper-Scale features. Hyper-Scale Manager lets you manage up to 144 devices from a single pane of glass. Hyper-Scale Mobility lets you move volumes (LUNs) non-disruptively from one device to another.
Different data compresses or dedupes at different ratios; your mileage may vary. Unless you are evaluating a JBOF (just a bunch of flash) device, there is a great difference between raw, usable, and effective capacity. Raw capacity is the size of each chip times the number of chips. Usable capacity factors out RAID overhead and any spare capacity set aside for RAID rebuild and garbage collection. Effective capacity indicates the amount of information that can be stored by taking advantage of data footprint reduction technologies, such as compression or data deduplication.
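A quick worked example may help. All of the numbers below are made up for illustration; they are not the specifications of any particular model:

```python
# Hypothetical flash enclosure -- illustrative numbers only, not a real model's specs.
chip_size_gb = 256
chips = 512
raw_tb = chip_size_gb * chips / 1000        # ~131 TB raw

raid_overhead = 0.20                        # parity plus spare space for rebuilds
usable_tb = raw_tb * (1 - raid_overhead)    # ~105 TB usable

reduction_ratio = 5.26                      # pattern removal + dedupe + compression
effective_tb = usable_tb * reduction_ratio  # ~551 TB effective

print(f"raw={raw_tb:.0f} TB  usable={usable_tb:.0f} TB  effective={effective_tb:.0f} TB")
```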
IBM offers three options:
Measured Estimate -- IBM has a set of data reduction estimator tools that can scan your existing data, and estimate your reduction ratio, within 5 percent accuracy.
Competitive Match -- If a competitor had run their own set of estimator tools, IBM might be able to match the reduction ratio, without repeating the analysis, by just reviewing the competitor results.
"Sight unseen" -- without analyzing your actual data, reduction ratio is determine by the type of data (DB2, Oracle, SQL server, etc.), based on experience with similar data at other data centers.
Both A9000 and A9000R models are rated at 250 microseconds of latency, about 30 times faster than traditional spinning disk (30 x 250 microseconds is 7.5 ms, a typical service time for a spinning drive), although some workloads can actually run even faster than that. Assuming 5.26:1 reduction, these sell for about $1.50 per effective GB.
Flash Primer - Ready to move from disk storage?
Patricia Crowell, IBM Worldwide FlashSystem Enablement Manager, presented an interesting timeline:
First Solid-State Drive (SSD)
First Flash card, such as for digital cameras
First USB stick
Flash used in specialized IT appliances
Flash for the enterprise - Microsoft and UCSD paper on SSD
In 2012, Microsoft Research and the University of California San Diego published ["The Bleak Future of NAND Flash Memory"], an 8-page paper by Laura M. Grupp, John D. Davis, and Steven Swanson. Here is an excerpt:
"The technology trends we have described put SSDs in an unusual position for a cutting-edge technology: SSDs will continue to improve by some metrics (notably density and cost per bit), but everything else about them is poised to get worse. This makes the future of SSDs cloudy: While the growing capacity of SSDs and high IOP rates will make them attractive in many applications, the reduction in performance that is necessary to increase capacity while keeping costs in check may make it difficult for SSDs to scale as a viable technology for some applications"
IBM disagreed with this bleak assessment, announced it was investing $1 billion US Dollars into this technology, acquired Texas Memory Systems, and has deployed flash throughout its product line. For the past three years, IBM has been the #1 vendor for Flash storage systems.
Patricia offered the following example: what would it take to deliver 20 million IOPS? With 15K rpm disk drives, at roughly 200 IOPS each, you would need on the order of 100,000 drives; with 7200 rpm drives, at roughly 80 IOPS each, around 250,000 drives. A much smaller number of FlashSystem units can do the same job in a fraction of the floor space and power.
How to migrate from SONAS to IBM Spectrum Scale/ESS using Active File Manager
Paul Schena, IBM Senior IT Specialist, presented his experiences migrating existing SONAS data to new IBM Spectrum Scale or Elastic Storage Server (ESS) deployments. SONAS is going End-of-Service (EOS) on April 30, 2018, so it is never too soon to start this migration.
Paul gave two different methodologies. The first used Active File Management (AFM):
Set up an IBM Spectrum Scale "Gateway Node" in "Independent-Writer" AFM mode. Paul recommends 10 threads per gateway node.
Issue an AFM pre-fetch, disabling the "cache eviction" feature to ensure data remains. AFM transfers the directory structure and the file data, including sparse files, Access Control Lists (ACLs), and extended attributes.
Define your exports with no-root-squash and move your user mounts to the new systems
Once all the data is moved, convert the cache filesets to regular filesets
Define your quotas, export settings, ILM policies and rules
Decommission the SONAS
The second methodology used Robocopy and Rsync, which may be required if a high-latency, long-distance connection prevents proper AFM connections:
Configure the IBM Spectrum Scale CES servers with the appropriate NFS and/or SMB protocols
Use Robocopy and/or Rsync as appropriate to move the data to the new system
Decommission the SONAS
Having it all: Hybrid Cloud Storage Services for Block, Power and Backup
Clint Parish, Director of Enterprise Solutions and Services for VSS, and Marc Théberge, Business Development for Supermicro, co-presented this session.
VSS offers POWER8-based Cloud services. They consider themselves a "boutique" with POWER8 servers, able to run AIX, IBM i and Linux on POWER applications, but not at the scale and size of larger x86-based clouds like Amazon Web Services or Microsoft Azure.
For IBM i, they attach to IBM Storwize V7000. For AIX and Linux on POWER, they use IBM Storwize V7000 and/or Supermicro Hyperconverged Appliance, a pre-built system based on IBM Spectrum Accelerate.
Supermicro offers three "tee-shirt sizes": their small systems have six nodes, medium nine nodes, and large 15 nodes. Unlike other Hyperconverged systems, the ones from Supermicro include a rack, and are pre-cabled with all the Ethernet switches necessary to make a complete solution.
To offer backup services, VSS uses IBM Spectrum Protect with the Supermicro appliances.
In the evening, we were treated to a concert by Train, known for songs like "Meet Virginia", "Hey Soul Sister", "Calling All Angels" and "Drops of Jupiter". They played all of these, plus covered some songs by Led Zeppelin, Journey, Queen and Aerosmith.
This week, I am in Las Vegas for [Edge 2016], IBM's premier IT infrastructure conference of the year. Here is my recap of breakout sessions for Monday, Sep 19, 2016:
How do you store a Zettabyte? IBM and Microsoft Know...
A [Zettabyte] is a million Petabytes, or a billion Terabytes, of data. Most clients I deal with have less than 10 PB of centralized storage in their data center, but there are a few that have much larger data repositories.
Ed Childers, IBM STSM and manager for Tape and LTFS development, and Aaron Ogus, Microsoft Architect, discussed different solutions developed by IBM and Microsoft. IBM's solution has been productized, and is available as IBM Spectrum Scale and IBM Spectrum Archive. Microsoft's solution is not productized, but is being "operationalized" to be used within Microsoft's Azure Cloud.
Not surprisingly, to be able to store a Zettabyte of data, you have to be creative and cost-effective with storage media. The current winner is magnetic tape, which continues to be 20 times less expensive than disk. IBM developed the Linear Tape File System (LTFS) and then shared it with other leading IT vendors. Ed also covered some future storage media developments, from using Macro-molecular strands of DNA, to Phase Change Memory (PCM).
All Flash is not Created Equal - Contrasting IBM FlashSystem with Solid State Drives (SSD)
Many IBM FlashSystem presentations focus on the product, but don't explain the underlying technology, specifically what differentiates IBM FlashSystem from substantially slower competitive alternatives, like EMC XtremIO and Pure Storage, that are based instead on fallible commodity Solid State Drives (SSD).
By working closely with our chip vendor, Micron, IBM was able to improve the write endurance of these Multi-level cell (MLC) chips by 9.4x, and reduce write amplification by 45 percent.
I explained IBM's clever asymmetrical wear-level balancing, heat segregation, read disturb mitigation, voltage level shifting, and health binning, all of which contribute to the performance and reliability of this solution. IBM's innovative Error Correcting Code provides LDPC-like correction strength, but at much faster, BCH-like latency.
This was a popular session. Despite being moved to a much larger room, they still had to turn people away, so I will be repeating this session on Wednesday at 11:00am.
Real-time Compression: Bendigo and Adelaide Bank's Perspective
James Harris, Senior Storage Systems Specialist for [Bendigo and Adelaide Bank], presented his success story with the use of Real-time Compression. Oracle RAC databases got 60-70 percent savings. SQL databases got 70-80 percent savings. VMware VMFS datastores averaged 50 percent savings. For IBM i, he is getting 60-70 percent savings for SYSBAS, and over 70 percent savings on the rest of his IBM i production data.
As a result, the bank has not had to make any Capital Expenditures (CAPEX) for disk for 2-3 years since they started compressing in 2014.
Storage Options for Big Data and Analytics: IBM FlashSystem or Traditional Disk Systems?
Eric Sperley, IBM Software Defined Storage Architect, presented the basics of Hadoop and the Hadoop File System (HDFS), then explained how IBM Spectrum Scale, when combined with the right tiers of flash and disk technology, could be used to optimize an environment for big data analytics.
The Solutions EXPO is open all day for people to visit the booths between sessions. I stopped in for the evening reception. This is a great way to catch up on the latest products, re-connect with clients or colleagues that I haven't seen in person for a while, and meet new friends.
Shown here is Angie Welchert, who started working for IBM just a few years ago! I took her around to introduce her to some IBM executives at the Solutions EXPO.
This week, I am in Las Vegas for [Edge 2016], IBM's premier IT infrastructure conference of the year.
General Session - Outthink Status Quo
This week's motto is "Outthink the Status Quo... before the Status Quo disrupts your business!"
Tom Rosamilia, IBM Senior VP for IBM Systems (and my fifth-line manager), kicked off the event. There are about 5,500 people at this event. He mentioned that just like a picture is worth a thousand words, "a prototype is worth a thousand meetings."
He showed a video of our client "Plenty of Fish" [POF], which is a dating site. They have 100 million members, of which 4 million access their site every day. IBM FlashSystem paid for itself, with an ROI payback period of 2 months.
Jason Pontin, Editor in Chief and Publisher of [MIT Technology Review], mentioned three major areas to watch:
Explosive innovation in Artificial Intelligence (AI), including IBM Watson, machine learning, etc.
Pervasive computing, including augmented reality or virtual reality, what IBM calls Internet of Things (IoT)
Re-writing life, directly editing genomes for healthcare and agriculture
Jason feels there are two major challenges for humans. First, what is the "future of work"? People are no longer working for the same company for their entire career. Rather, they come and go, moving in and out of companies. Second, how will we deliver food and water to the 9.6 billion people expected by 2050, with the added challenge of climate change?

Ed Walsh, IBM General Manager for Storage and Software Defined Infrastructure, presented next. Last year, I was asked to throw my hat in the ring to be the next General Manager of IBM Storage. I was up against some strong competition, and in the end upper management selected Ed Walsh instead. He is a good choice, and I support his efforts.
Matt Cadieux, CIO for [Red Bull Racing], presented on the IT challenges of designing, building and racing Formula One racing cars. They have 21 races per year, and each race has slightly different specifications, forcing Red Bull Racing to break down and rebuild their cars for each race.
Michael Lawley, Senior IT Vice President for [HealthPlan Services], explained how his business grew 300 percent in the past four years. Their workloads are very "spiky", so it is good that they can scale up or down their IT infrastructure 3-4x as needed, within minutes.
Jacob Yundt, CIO for University of Pittsburgh Medical Center [UPMC], explained the importance of genomics as the next frontier of medicine. Genomics allows for more accurate cancer determinations, which helps target specific treatments. They moved from x86-based clusters to those based on Power LC models from IBM. For analytics, they chose IBM Power8 S822L servers with Elastic Storage Server (ESS) and the Hadoop Transparency Layer.
Lastly, Terri Virnig welcomed two technology partners to the stage for some major announcements. First, Jim Totton from Red Hat announced RHEV v4 (based on Linux KVM) for the POWER platform. Second, Scott Gnau, CTO for [Hortonworks], announced that Hortonworks will run on the POWER platform, as part of the IBM and Hortonworks Open Data Platform [ODP] initiative.
Trends & Directions: The Future of Storage in the Cloud and Cognitive Era
Eric Herzog, IBM Vice President, Product Marketing and Management Software Defined Infrastructure, served as emcee for this session.
Ed Walsh, IBM General Manager for IBM Storage and Software Defined Infrastructure, marveled at IBM's "storied history in storage innovation". He suggests clients modernize and transform their business with IBM's storage portfolio, the broadest in the IT industry.
Clod Barrera, IBM Engineer and the Chief Technical Strategist for IBM Systems Storage, explained that in the past 60 years of disk systems, areal density has improved by a factor of one billion. Unfortunately, that pace is slowing, and we won't see such improvements going forward.
Bina Hallman, IBM Vice President, Software Defined Storage Solutions Offering Management, hosted a panel of clients, including:
Bob Osterlin, from [Nuance], which has 5-10 PB of data on IBM Spectrum Scale for voice recognition software.
Rich Spurlock, from [Cobalt Iron], which provides Backup-as-a-Service using IBM Spectrum Protect. Their clients experience an 80 percent reduction in operating expenditures (OPEX) using Spectrum Protect.
Moshe Perez, from [RR Media], which distributes television channels like ESPN and BBC to other countries. They use IBM Spectrum Accelerate to handle demand peaks, such as the Olympics.
Mike Kuhn, IBM Vice President for Storage Solutions Offering Management, also hosted a panel of clients, including:
Kevin Muha, from [UPMC], managing 13 PB of storage, across a variety of IBM storage devices, including 700 TB of FlashSystem V9000.
Bill Reed, CTO for [Arizona State Land Department], that uses VersaStack with IBM FlashSystem V9000 for geographic information system [GIS] applications. They manage over 9.2 million acres to help fund K-12 schools in Arizona.
Owen Morley, from Plenty of Fish [POF] dating website, evaluated nearly every flash device in the market, and chose IBM FlashSystem. "The one metric that matters is Latency!"
These were the two main keynote sessions on Monday morning. During the rest of the week there will be over 285 storage-related breakout sessions, dozens of labs, and 7 panels.
This week, I am in Las Vegas for [Edge 2016], IBM's premier IT infrastructure conference of the year. In previous years, this conference was held in May, June or July, but this year it was moved back to September, to coincide with the 60th anniversary of IBM Disk Systems.
I have arrived safely in Las Vegas, and checked in at the Edge 2016 Conference Registration.
This year, the Solutions EXPO opens early, on Sunday, with a reception. This gives people a chance to go to booth #330 to make appointments for one-on-one meetings with various IBM Executives!
I was able to catch up with co-workers I have not seen in a while! There is a whole section on IBM storage products such as the IBM DS8888 All-Flash Array, as well as software products like IBM Spectrum Protect and IBM Spectrum Control.
On Monday, my session "All Flash is Not Created Equal: Tony Pearson Contrasts IBM FlashSystem and SSD" has moved from the tiny room to a much larger room "Studio A". There was a lot of demand for this session, so I have agreed to present this again, as a repeat session, on Wednesday.
Edge will be different in many ways this year. The past few years we had separate "Executive Edge" for C-level executives, "Winning Edge" for IBM Business Partners, and "Technical Edge" for server, network and storage administrators.
This year, all 1,000 sessions are combined back into one, but with clever hints in the titles. The words "General Session", "Outthink" or "Cognitive" are used to indicate C-level executive talks. Those that use the terms "Winning" or "Community" target IBM Business Partners, Managed Service Providers and Cloud Service Providers. Those that mention z Systems, POWER servers, or Storage solutions, often adding the term "Deep-Dive", are technical.
(Unlike other sessions that might appeal to one portion of the audience or another, mine are suitable for everyone, from C-level executives and IBM Business Partners to storage administrators. To help people find them under the new naming scheme, I have added "Tony Pearson Presents", or words to that effect.)
About 260 breakout sessions relate to IBM Storage, but there are only 20 or so time slots, so obviously you can't see them all in person.
I strongly suggest you pick about three to five topics per time slot, so that you are not overwhelmed by the dozens of choices during the event. This allows you to make a quick decision on which one to attend in each time slot.
Occasionally, a session might get canceled, postponed, or be so full that nobody else is allowed in, so having three to five topics selected allows you to choose an alternate.
Here is my schedule for next week at Edge 2016.
Trends & Directions: The Future of Storage in the Cloud and Cognitive Era
All Flash is Not Created Equal: Tony Pearson Contrasts IBM FlashSystem and SSD
MGM Grand - Studio 9
Solution EXPO: Reception
Edge at Night: Poolside Reception and Concert "Train"
Tony Pearson Presents IBM Cloud Object Storage System and Its Applications
MGM Grand - Room 114
The Pendulum Swings Back: Tony Pearson Explains Converged and Hyperconverged Environments
MGM Grand - Room 113
Solution EXPO: Reception
Tony Pearson Presents IBM's Cloud Storage Options
MGM Grand - Room 116
My colleagues Dave Dabney and Adam Bergren will be at the WW Systems Client Centers booth (#125) in the Solution EXPO.
If you are active in Social Media, consider using the hashtags #IBMedge, #IBMstorage, and #IBMcloud. You can follow me on Twitter; my handle is @az990tony.
For those interested in a one-on-one meeting with me, over breakfast, lunch or dinner, or some other time, I have several slots still available. Fill out a request form on BriefingSource at: [https://briefingsource.dst.ibm.com/]
SAP HANA is an in-memory, relational database management system supported on Linux for x86 and POWER servers. The "HANA" acronym is short for "High-Performance Analytic Appliance". By keeping the data in memory, analytics and queries can be performed much faster than from traditional disk repositories.
Server memory, however, is volatile, so the data also needs to be stored on persistent storage such as flash or disk drives. SAP has certified several configurations, some involving IBM Spectrum Scale solutions. Let me explain the three configurations.
Linux on x86-64 with Spectrum Scale FPO
With SAP HANA on Lenovo x86-64 servers, SAP has certified internal flash or disk drives running IBM Spectrum Scale in "File Placement Optimization" (FPO) mode. FPO provides a shared-nothing architecture that matches the SAP HANA architecture. IBM Spectrum Protect can back up this configuration, providing data protection and disaster recovery support.
Linux on POWER with Elastic Storage Server
With SAP HANA on POWER servers, SAP has certified the external Elastic Storage Server (ESS). Not only is POWER a better platform than x86-64 to run SAP HANA, but the Elastic Storage Server's erasure coding provides excellent rebuild times and storage efficiency.
The ESS is a pre-built system that combines IBM Spectrum Scale software with server and storage hardware. IBM Spectrum Protect can also back up this configuration, providing data protection and disaster recovery support.
Block-level Storage over Storage Area Network (SAN)
Various IBM block-level devices are supported for SAP HANA on both Linux on x86-64 and Linux on POWER. Unfortunately, SAP has only certified (to date) the use of the XFS file system. The problem many clients mention about this configuration is the lack of end-to-end backup and disaster recovery. This is solved by the Spectrum Scale configurations in the previous two examples.
Other combinations, such as SAP HANA on POWER with Spectrum Scale FPO, or on x86-64 servers with the Elastic Storage Server, are not SAP-certified and are not directly supported without SAP's approval.
IBM and SAP have worked closely together for many years, and I am glad to see SAP HANA and IBM Spectrum Scale based solutions continue this tradition.
As we get to larger and larger flash and spinning disk drives, a common question I get is whether to use RAID-5 versus RAID-6. Here is my take on the matter.
A quick review of basic probability statistics
Failure rates are based on probabilities. Take, for example, a traditional six-sided die, with the numbers one through six represented as dots on each face. What are the chances of rolling the die several times in a row without ever rolling a six? You might think that if there is a 1/6 (16.6 percent) chance of rolling a six on each roll, then you would be guaranteed to hit a six within six rolls. That is not the case.
# of Rolls    Probability of no sixes (percent)
1             83.3
2             69.4
6             33.5
12            11.2
24            1.26

So, even after 24 rolls, there is more than a 1 percent chance of not rolling a six at all. The formula is (1 - 1/6) to the 24th power.
Let's say that rolling one through five is a success, and rolling a six is a failure. Being successful requires that no sixes appear in a sequence of events. This is the concept I will use for the rest of this post. If you don't care for the math, jump down to the "Summary of Results" section below.
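If you would rather let a computer do the arithmetic, here is a small Python helper (my own, not from any vendor tool) that computes these survival probabilities; the same one-liner drives every calculation in the rest of this post:

```python
import math

def prob_no_failure(p, n):
    """Probability that an event with per-trial probability p never occurs in n trials."""
    # log1p/exp keep the result accurate even when p is as tiny as 1E-16.
    return math.exp(n * math.log1p(-p))

print(f"{prob_no_failure(1/6, 24):.2%}")   # ~1.26% chance of no sixes in 24 rolls
```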
Error Correcting Codes (ECC) and Unrecoverable Read Errors (URE)
When I speak to my travel agent, I have to provide my six-character [Record Locator] code. Pronouncing individual letters can be error prone, so we use a "spelling alphabet".
The International Radiotelephony Spelling Alphabet, sometimes known as the [NATO phonetic alphabet], has 26 code words assigned to the 26 letters of the English alphabet in alphabetical order as follows: Alfa, Bravo, Charlie, Delta, Echo, Foxtrot, Golf, Hotel, India, Juliett, Kilo, Lima, Mike, November, Oscar, Papa, Quebec, Romeo, Sierra, Tango, Uniform, Victor, Whiskey, X-ray, Yankee, Zulu.
Sent:     Foxtrot Golf Mike Oscar Victor Whiskey
Heard as: Foxtrot Gold Mine Oscar Vector Whisker
Heard as: Boxcart Golf Miko Boxcart Victor Whiskey
Having five or so characters to represent a single character may seem excessive, but you can see how helpful this is when the communications link has static, or the background noise is loud, as is often the case at the airport!
If spelling words are misheard, either (a) they are close enough, like "Gold" for "Golf" or "Whisker" for "Whiskey", that the correct word is known, or (b) not close enough, such that "Boxcart" could refer to either "Foxtrot" or "Oscar", but we can at least detect that a failure occurred.
For data transfers, or data that is written, and later read back, the functional equivalent is an Error Correcting Code [ECC], used in transmission and storage of data. Some basic ECC can correct a single bit error, and detect double bit errors as failures. More sophisticated ECC can correct multiple bit errors up to a certain number of bits, and detect most anything worse.
When reading a block, sector or page of data from a storage device, if the ECC detects an error, but is unable to correct the bits involved, we call this an "Unrecoverable Read Error", or URE for short.
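To make "correct one bit, detect anything worse" concrete, here is a toy Hamming(7,4) decoder in Python. This is a teaching example of my own, not what any storage device actually runs; real storage ECC such as BCH or LDPC is far stronger, but the principle is the same. Three parity bits pinpoint any single flipped bit, and adding one more overall parity bit (the classic SECDED arrangement) would also detect double-bit errors.

```python
def hamming74_encode(d):
    """Encode 4 data bits into 7 bits by adding 3 parity bits."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4          # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Return (data bits, position of corrected bit, or 0 if the codeword was clean)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3    # 0 = clean, otherwise the 1-based error position
    if syndrome:
        c[syndrome - 1] ^= 1           # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], syndrome

code = hamming74_encode([1, 0, 1, 1])
code[4] ^= 1                           # simulate a single-bit error in transit
data, fixed_at = hamming74_decode(code)
print(data, "corrected position", fixed_at)   # [1, 0, 1, 1] corrected position 5
```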
Bit Error Rate (BER)
Different storage devices have different block, sector or page sizes. Some use 512 bytes, 4096 bytes or 8192 bytes, for example. To normalize the likelihood of errors, the industry has simplified this to a single Bit Error Rate (BER), often represented as a power of 10.
Device type                          Bit Error Rate per bit read (BER)
DVD and optical media                1 in 1E13
Consumer HDD (PC/laptops)            1 in 1E14
Enterprise HDD (15K/10K/7200 rpm)    1 in 1E16
Solid-State and Flash                1 in 1E16 to 1E17
LTO-7 tape                           1 in 1E19
IBM TS1150 tape                      1 in 1E20
In other words, the chance that a bit is unreadable on optical media is 1 in 10 trillion (1E13), on enterprise 15k drives is 1 in 10 quadrillion, and on LTO-7 tape is 1 in 10 quintillion.
There are eight bits per byte, so reading 1 GB of data is like rolling the die eight billion times. The chance of successfully reading 1 GB from DVD would then be (1 - 1/1E13) to the 8-billionth power, or 99.92 percent; conversely, there is a 0.08 percent chance of failure.
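A quick sketch of that DVD example, using the 1-in-1E13 BER from the table above:

```python
import math

bits = 1e9 * 8                                    # 1 GB read = 8 billion bit "rolls"
p_fail = -math.expm1(bits * math.log1p(-1e-13))   # BER of 1 in 1E13
print(f"chance of failing to read 1 GB from DVD: {p_fail:.2%}")   # ~0.08%
```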
In their 2007 paper [Failure Trends in a Large Disk Drive Population], Google studied drive failure using an "Annual Failure Rate" or AFR. Two graphs from this paper are worth describing.
The first graph shows AFR by age. Some drives fail in their first 3-6 months, often called "infant mortality". Then they are fairly reliable for a few years, with AFR down to 1.7 percent; as they get older, they start to fail more often, up to 8.3 percent.
The second graph factors in how busy the drives are. Dividing the drive set into quartiles, "Low" represents the least busy drives (the bottom quartile), "Medium" represents the middle two quartiles, and "High" represents the busiest drives, the top quartile. Not surprisingly, the busiest drives tend to fail more often than medium-busy drives.
Given an AFR, what are the chances a drive will fail in the next hour? There are 8,766 hours per year, so the success of a drive over the course of a year is like rolling the die 8,766 times. This allows us to calculate a "Drive Error Rate" or DER:
AFR (percent)    Drive Error Rate per hour (DER)
1                1 in 872,200
3                1 in 287,800
5                1 in 170,900
10               1 in 83,200
For example, an AFR=3 drive has a 1 in 287,800 chance of failing in a particular hour. The probability this drive will fail in the next 24 hours would be like rolling the die 24 times. The formula is (1-1/287,800) to the 24th power, resulting in a failure rate of roughly 0.008 percent.
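Here is how those per-hour figures fall out of the AFR. Again, this is my own sketch of the math, not a vendor formula:

```python
import math

HOURS_PER_YEAR = 8766

def der(afr_percent):
    """Per-hour failure probability implied by an Annual Failure Rate."""
    return -math.expm1(math.log1p(-afr_percent / 100) / HOURS_PER_YEAR)

for afr in (1, 3, 5, 10):
    print(f"AFR={afr:>2}%  ->  1 in {1 / der(afr):,.0f} per hour")
# AFR= 3%  ->  1 in 287,795 per hour, matching the table above
```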
Let's take a typical RAID-5 rank with 600GB drives at 15K rpm, in a 7+P RAID-5 configuration.
During normal processing, if a URE occurs on an individual drive, RAID comes to the rescue. The system can rebuild the data from parity and correct the broken block of data.
When a drive fails, however, we don't have this rescue, so a URE that occurs during the rebuild process is catastrophic. How likely is this? Data is read from the other seven drives and written to a spare empty drive. At 8 bits per byte, reading 4200 GB of data is rolling the die 33.6 trillion times. The formula is then (1 - 1/1E16) to the 33.6-trillionth power, or approximately a 0.372 percent chance of a URE during the rebuild process.
The time to perform the rebuild depends heavily on the speed of the drive, and how busy the RAID rank is with other work. Under heavy load, the rebuild might only run at 25 MB/sec, and under no workload perhaps 90 MB/sec. If we take a moderate rebuild rate of 60 MB/sec, then it would take 10,000 seconds, or nearly 3 hours. The chance that any of the seven drives fails during these three hours, at AFR=10 (rolling the DER die 7 x 3 = 21 times), is a 0.025 percent chance of failure.
It is nearly 15 times more likely to get a URE failure than a second drive failure. A rebuild fails with either of these, for a combined probability of 0.397 percent.
The situation gets worse with higher-capacity Nearline drives. Let's do a RAID-5 rank with 6TB Nearline drives at 7200 rpm, in a 7+P configuration. The likelihood of a URE reading 42 TB of data is rolling the die 336 trillion times, or approximately a 3.66 percent chance of URE failure. Yikes!
The rebuild is also going to take longer. A moderate rebuild rate might only be 30 MB/sec, so rebuilding a 6TB drive would take 55 hours. The chance that one of the other seven drives fails during these 55 hours, assuming again AFR=10, is 0.462 percent.
This time, a URE failure is nearly eight times more likely than a double drive failure. The chance of a rebuild failure is 4.12 percent. Good thing you backed up to tape or object storage!
The math can be done easily using modern spreadsheet software, or with a few lines of code, as I sketch at the end of the RAID-6 section below. The URE failure rate is based on the quantity of data read from the remaining drives, so a 4+P with 600GB drives is the same as an 8+P with 300GB drives. Both read 2.4 TB of data to recalculate from parity. The double-drive failure rate is based on the number of drives being read times the number of hours during the rebuild. Slower, higher-capacity drives take longer to rebuild. However, in both the 15K and 7200rpm examples, a URE failure was 8 to 15 times more likely than a double drive failure.
Many of the problems associated with RAID-5 above can be mitigated with RAID-6.
After a single drive fails, any URE during rebuild can be corrected from parity. However, if a second drive fails during the rebuild process, then a URE on the remaining drives would be a problem.
Let's start with the 600GB 15K drives in a 6+P+Q RAID-6 configuration. The chance of a second drive failing is 0.0252 percent, as we calculated above. The likelihood of a URE is then based on the remaining six drives, 3600 GB of data. Doing the math, that is a 0.319 percent chance. So, the chance of a URE during a RAID-6 rebuild is the probability of both occurring, roughly 0.0000806 percent. Far more reliable than RAID-5!
Likewise, we can calculate the probability of a triple drive failure. After the second drive fails, the likelihood of a third drive failing, at AFR=10, results in 0.00000546 percent.
Combining these, the chance of a rebuild failure is 0.0000861 percent.
Switching to 6 TB Nearline drives, in a 6+P+Q RAID-6 configuration, we can do the math in the same manner. The likelihood of URE and two drives failing is 0.0145 percent, and for triple drive failure is 0.00183 percent. Chance of rebuild failure is 0.0163 percent.
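Putting it all together, the sketch below reproduces the whole comparison in Python. This is my reconstruction of the arithmetic in this post, so its outputs land near, but not exactly on, the figures quoted above; the small gaps come from rounding choices, such as using the exact rebuild hours instead of a rounded 3 or 55.

```python
import math

HOURS_PER_YEAR, BER, AFR = 8766, 1e-16, 10    # 1-in-1E16 bit errors, 10 percent AFR

def p_any(p, n):
    """P(at least one event in n independent trials of per-trial probability p)."""
    return -math.expm1(n * math.log1p(-p))

def rebuild_failure(drive_tb, data_drives, rebuild_mb_s):
    der = -math.expm1(math.log1p(-AFR / 100) / HOURS_PER_YEAR)  # per-hour failure prob
    hours = drive_tb * 1e12 / (rebuild_mb_s * 1e6) / 3600       # rebuild time, one drive
    bits = lambda n: n * drive_tb * 1e12 * 8                    # bits read from n drives

    # RAID-5: after one failure, a URE on the surviving drives OR a second
    # drive failure during the rebuild window loses data.
    second = p_any(der, data_drives * hours)
    raid5 = p_any(BER, bits(data_drives)) + second

    # RAID-6: data is lost only if a second drive fails AND then either a URE
    # hits the remaining drives or a third drive fails as well.
    raid6 = second * (p_any(BER, bits(data_drives - 1)) +
                      p_any(der, (data_drives - 1) * hours))
    return raid5 * 100, raid6 * 100                             # as percentages

for label, tb, speed in (("600GB 15K rpm", 0.6, 60), ("6 TB 7200 rpm", 6.0, 30)):
    r5, r6 = rebuild_failure(tb, 7, speed)
    print(f"{label}:  RAID-5 {r5:.3g}%   RAID-6 {r6:.3g}%")
```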
Summary of Results
Putting all the results in a table, we have the following:

Drive type        RAID-5 rebuild failure (percent)    RAID-6 rebuild failure (percent)
600GB 15K rpm     0.397                               0.0000861
6 TB 7200 rpm     4.12                                0.0163
Hopefully, I have shown you how to calculate these yourself, so that you can plug in your own drive sizes, rebuild rates, and other parameters to convince yourself of this.
In all cases, RAID-6 drastically reduces the probability of rebuild failure. With modern cache-based systems, the write penalty associated with the additional parity generally does not impact application performance. As clients transition from faster 15K drives to slower, higher-capacity 10K and 7200 rpm drives, I highly recommend using RAID-6 instead of RAID-5 in all cases.
As I have mentioned before, I started this blog on September 1, 2006 as part of IBM's big ["50 Years of Disk Systems Innovation"] campaign. IBM introduced the first commercial disk system on September 13, 1956 and so the 50th anniversary was in 2006. That means this month, IBM celebrates the "Diamond" anniversary, 60 years of Disk Systems!
"For those who missed it, IBM announced last Tuesday encryption capability for the TS1120 drive, our enterprise tape drive that reads and writes 3592 cartridges. Do you need special cartridges for this? No! Use the same ones you have already been using!
You can read more about it at www.ibm.com/storage/tape."
Short and sweet, but it got me started, and I ended up writing 21 blog posts that first month. You can read blog posts from all 10 years by looking at the left panel of my blog under "Archive".
While traditional disk and tape storage are still very important and relevant in today's environment, IBM has also expanded into other technologies:
In 2012, IBM [acquired Texas Memory Systems]. In 2014, IBM shipped 62PB of Flash, more capacity than any other vendor. In 2015, IBM continued its #1 status, shipping 170PB of Flash, again more than any other vendor.
IBM has flash everywhere, from the advanced FlashSystem 900, V9000, A9000 and A9000R models, to other all-flash array and hybrid flash-and-disk systems with various sets of features and functions to meet a variety of workload requirements.
The DS8888 all-flash array, and the DS8886 and DS8884 hybrid flash-and-disk systems, round out the latest in the DS8000 storage systems family. The SAN Volume Controller and Storwize family of products, based on IBM Spectrum Virtualize software, also have all-flash and hybrid configurations, the most recent being the Gen2+ models of the Storwize V7000F and V5030F. The latest solution is the DeepFlash 150, designed for analytics and unstructured data.
Between internally-developed IBM Spectrum Scale and IBM Spectrum Archive, and IBM's [acquisition of Cleversafe], IBM is ranked #1 in Object Storage. IBM Cloud Object Storage System, IBM's new name for Cleversafe's flagship product, is available as software-only, pre-built systems, or in the IBM SoftLayer cloud.
Software-Defined Storage (SDS) with IBM Spectrum Storage
Last year, IBM re-branded its various storage software products under the "IBM Spectrum Storage" family. Earlier this year, IBM announced the new [IBM Spectrum Storage Suite license], which makes it even easier to procure: either a perpetual software license, elastic monthly licensing, or a utility license that combines some of each.
IBM is ranked #1 in Software-Defined Storage, with over 40 percent marketshare, offering solutions as Software-only, pre-built systems, and in IBM SoftLayer cloud.