Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Jim is an IBM Fellow for IBM Systems and Technology Group. There are only 73 IBM Fellows currently working for IBM, and this is the highest honor IBM can bestow on an employee. He has been working with IBM since 1968.
He is tasked with predicting the future of IT, and help drive strategic direction for IBM. Cost pressures, requirements for growth, accelerating innovation and changing business needs help influence this direction.
IBM's approach is to integrate four different "IT building blocks":
Scale-up Systems, like the IBM System Storage DS8000 and TS3500 Tape Library
Resource Pools, such as IBM Storage Pools formed from managed disks by IBM SAN Volume Controller (SVC)
Integrated stacks and appliances, integrated software and hardware stacks, from Storwize V7000 to full rack systems like IBM Smart Analytics Server or CloudBurst.
Mobility of workloads and resources requires unified end-to-end service management. Fortunately, IBM is the #1 leader in IT Service Management solutions.
Jim addressed three myths:
Myth 1: IT Infrastructures will be homogenous.
Jim feels that innovations are happening too rapidly for this to ever happen, and is not a desirable end-goal. Instead, a focus to find the right balance of the IT building blocks might be a better approach.
Myth 2: All of your problems can be solved by replacing everything with product X.
Jim feels that the days of "rip-and-replace" are fading away. As IBM Executive Steve Mills said, "It isn't about the next new thing, but how well new things integrate with established applications and processes."
Myth 3: All IT will move to the Cloud model.
Jim feels a substantial portion of IT will move to the Cloud, but not all of it. There will always be exceptions where the old traditional ways of doing things might be appropriate. Clouds are just one of the many building blocks to choose from.
Jim's focus lately has been finding new ways to take advantage of virtualization concepts. Server, storage and network virtualization are helping address these challenges through four key methods:
Sharing - virtualization that allows a single resource to be used by multiple users. For example, hypervisors allow several guest VM operating systems share common hardware on a single physical server.
Aggregation - virtualization that allows multiple resources to be managed as a single pool. For example, SAN Volume Controller can virtualize the storage of multiple disk arrays and create a single storage pool.
Emulation - virtualization that allows one set of resources to look and feel like a different set of resources. Some hypervisors can emulate different kinds of CPU processors, for example.
Insulation - virtualization that hides the complexity from the end-user application or other higher levels of infrastructure, making it easier to make changes of the underlying managed resources. For example, both SONAS and SAN Volume Controller allow disk capacity to be removed and replaced without disruption to the application.
In today's economy, IT transformation costs must be low enough to yield near-term benefits. The long-term benefits are real, but near-term benefits are needed for projects to get started.
What set's IBM ahead of the pack? Here was Jim's list:
100 Years of Innovation, including being the U.S. Patent leader for the last 18 years in a row
IBM's huge investment in IBM Research, with labs all over the globe
Leadership products in a broad portfolio
Workload-optimized designs with integration from middleware all the way down to underlying hardware
Comprehensive management software for IBM and non-IBM equipment
Clod is an IBM Distinguished Engineer and Chief Technical Strategist for IBM System Storage. His presentation focused on trends and directions in the IT storage industry. Clod started with five workload categories:
To address these unique workload categories, IBM will offer workload-optimized systems. The four drivers on the design for these are performance, efficiency, scalability, and integration. For example, to address performance, companies can adopt Solid-State Drives (SSD). Unfortunately, these are 20 times more expensive dollar-per-GB than spinning disk, and the complexity involved in deciding what data to place on SSD was daunting. IBM solved this with an elegant solution called IBM System Storage Easy Tier, which provides automated data tiering for IBM DS8000, SAN Volume Controller (SVC) and Storwize V7000.
For scalability, IBM has adopted Scale-Out architectures, as seen in the XIV, SVC, and SONAS. SONAS is based on the highly scalable IBM General Parallel File System (GPFS). File systems are like wine, they get better with age. GPFS was introduced 15 years ago, and is more mature than many of the other "scalable file systems" from our competition.
Areal Density advancements on Hard Disk Drives (HDD) are slowing down. During the 1990s, the IT industry enjoyed 60 to 100 percent annual improvement in areal density (bits per square inch). In the 2000s, this dropped to 25 to 40 percent, as engineers are starting to hit various physical limitations.
Storage Efficiency features like compression have been around for a while, but are being deployed in new ways. For example, IBM invented WAN compression needed for Mainframe HASP. WAN compression became industry standard. Then IBM introduced compression on tape, and now compression on tape is an industry standard. ProtecTIER and Information Archive are able to combine compression with data deduplication to store backups and archive copies. Lastly, IBM now offers compression on primary data, through the IBM Real-Time Compression appliance.
For the rest of this decade, IBM predicts that tape will continue to enjoy (at least) 10 times lower dollar-per-GB than the least expensive spinning disk. Disk and Tape share common technologies, so all of the R&D investment for these products apply to both types of storage media.
For integration, IBM is leading the effort to help companies converge their SAN and LAN networks. By 2015, Clod predicts that there will be more FCoE purchased than FCP. IBM is also driving integration between hypervisors and storage virtualization. For example, IBM already supports VMware API for Array Integration (VAAI) in various storage products, including XIV, SVC and Storwize V7000.
Lastly, Clod could not finish a presentation without mentioning Cloud Computing. Cloud storage is expected to grow 32 percent CAGR from year 2010 to 2015. Roughly 10 percent of all servers and storage will be in some type of cloud by 2015.
As is often the case, I am torn between getting short posts out in a timely manner versus spending some more time to improve the length and quality of information, but posted much later. I will spread out the blog posts in consumable amounts throughout the next week or two, to achieve this balance.
Continuing my week in Washington DC for the annual [2010 System Storage Technical University], I presented a session on Storage for the Green Data Center, and attended a System x session on Greening the Data Center. Since they were related, I thought I would cover both in this post.
Storage for the Green Data Center
I presented this topic in four general categories:
Drivers and Metrics - I explained the three key drivers for consuming less energy, and the two key metrics: Power Usage Effectiveness (PUE) and Data Center Infrastructure Efficiency (DCiE).
Storage Technologies - I compared the four key storage media types: Solid State Drives (SSD), high-speed (15K RPM) FC and SAS hard disk, slower (7200 RPM) SATA disk, and tape. I had comparison slides that showed how IBM disk was more energy efficient than competition, for example DS8700 consumes less energy than EMC Symmetrix when compared with the exact same number and type of physical drives. Likewise, IBM LTO-5 and TS1130 tape drives consume less energy than comparable HP or Oracle/Sun tape drives.
Integrated Systems - IBM combines multiple storage tiers in a set of integrated systems managed by smart software. For example, the IBM DS8700 offers [Easy Tier] to offer smart data placement and movement across Solid-State drives and spinning disk. I also covered several blended disk-and-tape solutions, such as the Information Archive and SONAS.
Actions and Next Steps - I wrapped up the talk with actions that data center managers can take to help them be more energy efficient, from deploying the IBM Rear Door Heat Exchanger, or improving the management of their data.
Greening of the Data Center
Janet Beaver, IBM Senior Manager of Americas Group facilities for Infrastructure and Facilities, presented on IBM's success in becoming more energy efficient. The price of electricity has gone up 10 percent per year, and in some locations, 30 percent. For every 1 Watt used by IT equipment, there are an additional 27 Watts for power, cooling and other uses to keep the IT equipment comfortable. At IBM, data centers represent only 6 percent of total floor space, but 45 percent of all energy consumption. Janet covered two specific data centers, Boulder and Raleigh.
At Boulder, IBM keeps 48 hours reserve of gasoline (to generate electricity in case of outage from the power company) and 48 hours of chilled water. Many power outages are less than 10 minutes, which can easily be handled by the UPS systems. At least 25 percent of the Computer Room Air Conditioners (CRAC) are also on UPS as well, so that there is some cooling during those minutes, within the ASHRAE guidelines of 72-80 degrees Fahrenheit. Since gasoline gets stale, IBM runs the generators once a month, which serves as a monthly test of the system, and clears out the lines to make room for fresh fuel.
The IBM Boulder data center is the largest in the company: 300,000 square feet (the equivalent of five football fields)! Because of its location in Colorado, IBM enjoys "free cooling" using outside air temperature 63 percent of the year, resulting in a PUE of 1.3 rating. Electricity is only 4.5 US cents per kWh. The center also uses 1 Million KwH per year of wind energy.
The Raleigh data center is only 100,000 Square feet, with a PUE 1.4 rating. The Raleigh area enjoys 44 percent "free cooling" and electricity costs at 5.7 US cents per kWh. The Leadership in Energy and Environmental Design [LEED] has been updated to certify data centers. The IBM Boulder data center has achieved LEED Silver certification, and IBM Raleigh data center has LEED Gold certification.
Free cooling, electricity costs, and disaster susceptibility are just three of the 25 criteria IBM uses to locate its data centers. In addition to the 7 data centers it manages for its own operations, and 5 data centers for web hosting, IBM manages over 400 data centers of other clients.
It seems that Green IT initiatives are more important to the storage-oriented attendees than the x86-oriented folks. I suspect that is because many System x servers are deployed in small and medium businesses that do not have data centers, per se.
The "Basic" offering includes a single IBM Storwize V7000 controller enclosure, and three year warranty package that includes software licenses for IBM Tivoli Storage FlashCopy Manager (FCM) and IBM Tivoli Storage Productivity Center for Disk - Midrange Edition (MRE). Planning, configuration and testing services for the software are included and can be performed by either IBM or an IBM Business Partner.
The "Standard" offering allows for multiple IBM Storwize V7000 enclosures, provides three year warranty package for the FCM and MRE software, and includes implementation services for both the hardware and the software components. These services can be performed by IBM or an IBM Business Partner.
Why bundle? Here are the key advantages for these offerings:
Increased storage utilization! First introduced in 2003, IBM SAN Volume Controller is able to improve storage utilization by 30 percent through virtualization and thin provisioning. IBM Storwize V7000 carries on this tradition. Space-efficient FlashCopy is included in this bundle at no additional charge and can reduce the amount of storage normally required for snapshots by 75 percent or more. IBM Tivoli Storage FlashCopy Manager can manage these FlashCopy targets easily.
Improved storage administrator productivity! The new IBM Storwize V7000 Graphical User Interface can help improve administrator productivity up to 2 times compared to other midrange disk solutions. The IBM Tivoli Storage Productivity Center for Disk - Midrange Edition provides real-time performance monitoring for faster analysis time.
Increased application performance! This bundle includes the "Easy Tier" feature at no additional charge. Easy Tier is IBM's implementation of sub-LUN automated tiering between Solid-State Drives (SSD) and spinning disk. Easy Tier can help improve application throughput up to 3 times, and improve response time up to 60 percent. Easy Tier can help meet or exceed application performance levels with its internal "hot spot" analytics.
Increased application availability! IBM Tivoli Storage FlashCopy Manager provides easy integration with existing applications like SAP, Microsoft Exchange, IBM DB2, Oracle, and Microsoft SQL Server. Reduce application downtime to just seconds with backups and restores using FlashCopy. The built-in online migration feature, included at no additional charge, allows you to seamlessly migrate data from your old disk to the new IBM Storwize V7000.
Significantly reduced implementation time! This bundle will help you cut implementation time in half, with little or no impact to storage administrator staff. This will help you realize your return on investment (ROI) much sooner.
Continuing my coverage of last week's Data Center Conference 2009, my last breakout session of the week was an analyst presentation on Solid State Drive (SSD) technology. There are two different classes of SSD, consumer grade multi-level cell (MLC) running currently at $2 US dollars per GB, and Enterprise grade single-level cell (SLC) running at $4.50 US dollars per GB. Roughly 80 to 90 percent of the SSD is used in consumer use cases, such as digital cameras, cell phones, mobile devices, USB sticks, camcorders, media players, gaming devices and automotive.
While the two classes are different, the large R&D budgets spent on consumer grade MLC carry forward to help out enterprise grade SLC as well. SLC means there is a single level for each cell, so each cell can only hold a single bit of data, a one or a zero. MLC means the cell can hold multiple levels of charge, each representing a different value. Typically MLC can hold 3 to 4 bits of data per cell.
Back in 1997, SLC Enterprise Grade SSD cost roughly $7870 per GB. By 2013, Consumer Grade 4-bit MLC is expected to be only 24 cents per GB. Engineers are working on trade-offs between endurance cycles and retention periods. FLASH management software is the key differentiator, such as clever wear-leveling algorithms.
SSD is 10-15 times more expensive than spinning hard disk drives (HDD), and this price difference is expected to continue for a while. This is because of production volumes. In 4Q09, manufacturers will manufacturer 50 Exabytes of HDD, but only 2 Exabytes of SSD. The analyst thinks that SSD will only be roughly 2 percent of the total SAN storage deployed over the next few years.
How well did the audience know about SSD technology?
4 percent not at all
30 percent some awareness
30 percent enough to make purchase decision
21 percent able to quantify benefits and trade-offs
15 percent experts
SSD does not change the design objectives of disk systems. We want disk systems that are more scalable and have higher performance. We want to fully utilize our investment. We want intelligent self-management similar to caching algorithms. We want an extensible architecture.
What will happen to fast Fibre Channel drives? Take out your Mayan calendar. Already 84mm 10K RPM drives are end of life (EOL) in 2009. The analyst expects 67mm and 70mm 10K drives will EOL in 2010, and that 15K will EOL by 2012. A lot of this is because HDD performance has not kept up with CPU advancements, resulting in an I/O bottleneck. SSD is roughly 10x slower than DRAM, and some architectures use SSD as a cache extension. The IBM N series PAM II card and Sun 7000 series being two examples.
Let's take a look at a disk system with 120 drives, comparing 73GB HDD's versus 32GB SSD's.
per HDD drive
per SSD drive
There are various use cases for SSD. These include internal DAS, stand-alone Tier 0 storage, replace or complement HDD in disk arrays, and as an extension of read cache or write cache. The analyst believes there will be mixed MLC/SLC devices that will allow for mixed workloads. His recommendations:
Use SSD to eliminate performance and throughput bottlenecks
Consolidate workloads to maximize value
Use SLAs to identify workload candidates
Evaluate emerging technologies along with established vendors
Do not expect SSD to drastically reduce power/cooling
SSD will continue to complement HDD, primarily SATA disk
Trust but verify, check out customer references offered by storage vendors
Well, it's Tuesday again, and you know what that means! IBM Announcements! Typically, IBM System Storage has three to five major product launches per year. Making announcements every Tuesday would have been two frequent, and having one big announcement every two or three years would be too far apart. Worldwide combined revenues for storage hardware and software grew double digits last year, comparing full-year 2011 to the prior 2010 year, and I am sure that 2012 will also be a good year for IBM as well! This week we have announcements for both disk and tape, but since 2012 is the 60th Diamond Anniversary for tape, I will start with tape systems first.
TS1140 support for JA/JJ tape cartridges
The TS1140 enterprise tape drive was announced at the [Storage Innovation Executive Summit] last May. It supported a new E07 format on three different new tape cartridges. Models "JC" was 4.0TB standard re-writeable tapes, "JY" was 4.0TB WORM tapes, and "JK" were 500GB economy tapes that were less expensive, but offered faster random access.
Generally, IBM has adopted an N-2 read, N-1 write [backward compatibility]. This means that the TS1140 could read E05 and E06 formatted tapes on JB and JX media, and could write E06 format on JB and JX media. However, there are a lot of older JA and JJ media, especially as part of TS7740 environments, so IBM now supports TS1140 drives to read J1A formatted JA and JJ media. This is not just for TS7740 environments, any TS1140 in stand-alone or tape library configurations will support this as well.
TS7700 R2.1 enhancements
IBM is a leader in tape virtualization with or without physical tape as back-end media. There are two hardware models of the [IBM Virtualization Engine TS7700 family] for the IBM System z mainframe. These virtual libraries are referred to as "clusters" in IBM literature.
The TS7740 Virtual Tape Library supports putting virtual tape images on disk first, then move less-active data to physical tape, which I covered in my blog post [IBM Announcements - July 2007].
A unique feature of the TS7700 series is support for a Grid configuration, which allows up to six different TS7700 clusters to be grouped into a single instance image. These clusters can be in local or remote locations, connected via WAN or LAN connections.
R2.1 is the latest software release of this successful IBM's TS7700 series.
True Sync Mode Copy. Before R2.1, the TS7700 offered "immediate mode copy". An application would write to a virtual tape, and when it was done with the tape and performed an unmount, the TS7700 would then replicate the tape contents to a secondary cluster on the grid. With True Sync Mode, data contents are replicated per implicit or explicit SYNC points. This is another IBM first in the IT tape industry.
Remote Mount Fail-over. When you have two or more TS7700 clusters in a grid configuration, you can do remote mounts. We've added fail-over multi-pathing up to four paths, so that if a link to a remote cluster is down, it will try one of the others instead.
Parallel Copies and Pre-Migration. On of my 19 patents is for the pre-migration feature for the IBM 3494 Virtual Tape Server (VTS) that carries forward into the TS7700, and is also used in the SONAS and Information Archive products. However, when the grid architecture was introduced, the engineers decided not to allow pre-migration and copies to secondary clusters to occur concurrently. Now these two operations can be done in parallel.
Merge two grids into one grid. Now that we can support up to six clusters into a single grid, we have people with 2-cluster and 3-cluster grids looking to merge them into one. Of course, all the logical and physical volume serials (VOLSER) must be unique!
Accelerate off JA/JJ Media. There are a lot of older JA and JJ media still in TS7700 libraries. This feature allows customers to speed up the transition to newer physical tape media.
Copy Export to E06 format on JB media. This one is clever, and I have to say I would have never thought about it. Let's say you have a TS7740 with TS1140 drives, but you want to export some virtual tapes to physical media to be sent to someone who only has a TS7740 connected with older TS1130 drives. These older drives can't read new JC media nor make sense of the E07 format. This feature will let you export to older JB media in E06 format so that it will be fully readable at the new location on the TS1130 drives.
Copy Export Merge service offering. Thanks to mergers and acquisitions, it is sometimes necessary to split off a portion of data from a TS7700 grid. In the past, IBM supported sending this export to a completely empty TS7700 library, but this new service offerings allows the export to be merged into an existing TS7700 that already contains data.
LTFS-SDE support for Mac OS X 10.7 Lion
How do people still not yet know about the Linear Tape File System [LTFS]? I mentioned this in my blogs back in 2010 in [April], [September], and [November]. Last year, LTFS was the [NAB Show Pick Hits Award] and an [Emmy] for revolutionizing the use of digital tape in Television broadcasting.
In layman's terms, the Single Drive Edition [LTFS-SDE] allows a tape cartridge to be treated like USB memory stick. It is supported on the LTO5 tape drives for systems running various levels of Windows, Linux and Mac OS X. Prior to this announcement, IBM supported Snow Leopard (10.5.6) and Leopard (10.6), and now supports Mac OS X 10.7 "Lion" release.
IBM first introduced Solid-State Drives (SSD) back in 2007 where it made sense the most, in [drive-for-drive replacements on blade servers in the IBM BladeCenter]. Blade servers typically only have a single drive, and SSD are both faster and use less energy on a drive-for-drive comparison, so this provided immediate benefit. Today, SSD are available on a variety of System x and POWER system servers.
In 2008, IBM rocked the world by being the first to reach [1 Million IOPS with Project Quicksilver]. This was an all-SSD configuration which many considered unrealistic (at the time), but it showed the potential for solid state drives.
When the [XIV Gen3 was Announced - July 2011], each module included an 1.8-inch "SSD-Ready" slot in the back. IBM made a Statement of Direction that IBM would someday offer SSD drives to put in these slots. Today's announcement is that IBM has finalized the qualification process, so now XIV Gen3 clients can have 400GB of usable non-volatile SSD read cache added to each module. This SSD can be added to existing XIV Gen3 boxes in the field, or it can be factory-installed in new shipments. If you have a 15-module XIV, that's 6TB of additional read cache! This SSD is entirely managed by the XIV Gen3, so you won't have to spend weeks reading manuals or specifying configuration parameters.
When you carve volumes on the XIV, you now have an option to enable or disable use of the SSD cache for each volume. Since XIV is being used in private and public cloud deployments, this offers the ability to offer premium performance at premium prices. The use of SSD is complementary to IBM XIV Quality of Service (QoS) performance levels, which are determined by host instead.
Well, that's the first major IBM System Storage launch of 2012. Let me know what you think in the comment section below.
Continuing my coverage of the 30th annual [Data Center Conference]. Here is a recap of more of the Tuesday afternoon sessions:
IBM CIOs and Storage
Barry Becker, IBM Manager of Global Strategic Outsourcing Enablement for Data Center Services, presented this session on Storage Infrastructure Optimization (SIO).
A bit of context might help. I started my career in DFHSM which moved data from disk to tape to reduce storage costs. Over the years, I wouuld visit clients, analyze their disk and tape environment, and provide a set of recommendations on how to run their operations better. In 2004, this was formalized into week-long "Information Lifecycle Management (ILM) Assessments", and I spent 18 months in the field training a group of folks on how to perform them. The IBM Global Technology Services team have taken a cross-brand approach, expanding this ILM approach to include evaluations of the application workloads and data types. These SIO studies take 3-4 weeks to complete.
Over the next decade, there will only be 50 percent more IT professionals than we have today, so new approaches will be needed for governance and automation to deal with the explosive growth of information.
SIO deals with both the demand and supply of data growth in five specific areas:
Data reclamation, rationalization and planning
Virtualization and tiering
Backup, business continuity and disaster recovery
Storage process and governance
Archive, Retention and Compliance
The process involves gathering data and interview business, financial and technical stakeholders like storage administrators and application owners. The interviews take less than one hour per person.
Over the past two years, the SIO team has uncovered disturbing trends. A big part of the problem is that 70 percent of data stored on disk has not been accessed in the past 90 days, and is unlikely to be accessed at all in the near future, so would probably be better to store on lower cost storage tiers.
Storage Resource Management (SRM) is also a mess, with over 85 percent of clients having serious reporting issues. Even rudimentary "Showback" systems to report back what every individual, group or department were using resulted in significant improvement.
Archive is not universally implemented mostly because retention requirements are often misunderstood. Barry attributed this to lack of collaboration between storage IT personnel, compliance officers, and application owners. A "service catalog" that identifies specific storage and data types can help address many of these concerns.
The results were impressive. Clients that follow SIO recommendations save on average 20 to 25 percent after one year, and 50 percent after three to five years. Implementing storage virtualization averaged 22 percent lower CAPEX costs. Those that implemented a "service catalog" saved on average $1.9 million US dollars. Internally, IBM's own operations have saved $13 million dollars implementing these recommendations over the past three years.
Reshaping Storage for Virtualization and Big Data
The two analysts presenting this topic acknowledged there is no downturn on the demand for storage. To address this, they recommend companies identify storage inefficiencies, develop better forecasting methodologies, implement ILM, and follow vendor management best practices during acquisition and outsourcing.
To deal with new challenges like virtualization and Big Data, companies must decide to keep, replace or supplement their SRM tools, and build a scalable infrastructure.
One suggestion to get upper management to accept new technologies like data deduplication, thin provisioning, and compression is to refer to them as "Green" technologies, as they help reduce energy costs as well. Thin provisioning can help drive up storage utilization to rates as high as you dare, typically 60 to 70 percent is what most people are comfortable with.
A poll of the audience found that top three initiatives for 2012 are to implement data deduplication, 10Gb Ethernet, and Solid-State drives (SSD).
The analysts explained that there are two different types of cloud storage. The first kind is storage "for" the cloud, used for cloud compute instances (aka Virtual Machines), such as Amazon EBS for EC2. The second kind is storage "as" the cloud, storage as a data service, such as Amazon S3, Azure Blob and AT&T Synaptic.
The analysts feel that cloud storage deployments will be mostly private clouds, bursting as needed to public cloud storage. This creates the need for a concept called "Cloud Storage Gateways" that manage this hybrid of some local storage and some remote storage. IBM's SONAS Active Cloud Engine provides long-distance caching in this manner. Other smaller startups include cTera, Nasuni, Panzura, Riverbed, StorSimple, and TwinStrata.
A variation of this are "storage gateways" for backup and archive providers as a staging area for data to be subsequently sent on to the remote location.
New projects like virtualization, Cloud computing and Big Data are giving companies a new opportunity to re-evaluate their strategies for storage, process and governance.
IBM has announced it has entered into a definitive agreement to acquire Texas Memory Systems, Inc. (TMS), a privately held Houston, Texas-based company with about 100 employees, that focuses on solid-state flash optimized systems and solutions, including the RamSan family of external rack-mounted storage, as well as PCIe cards for internal storage that fit inside servers.
I've mentioned Solid-State Drive storage quite a few times over the past few years in this blog, which included some great interactions with my friends over at Texas Memory Systems. Here's a quick look:
In my now infamous blog post [Hybrid, Solid State and the future of RAID], I resort to a deck of [Tarot cards] in an effort to fight [writer's block] responding to query about combining solid-state with spinning disk. In the original post, I poked fun at Texas Memory Systems having the slogan "World's Fastest Storage". Woody Hutsell, then VP of marketing for Texas Memory Systems, explained that the reason that TMS did not have faster benchmark results was because it did not have a million dollars to buy the fastest IBM UNIX server.
In my post [Good News and Bad News], I mentioned that Texas Memory Systems has an impressive SPC benchmark result. The Storage Performance Council [SPC] publishes the benchmarking industry standard by which all block-based storage devices are measured. It looks like the TMS performance test department finally got the million-dollar IBM server they needed for this.
My colleagues in marketing were not amused, afraid that mentioning small companies like TMS would give them a huge boost in marketing awareness, above and beyond what TMS could do on their own modest marketing budget, similar to the [Colbert Bump]. I could call it the Pearson Bump. If you first heard of Texas Memory Systems from my blog, or bought TMS products based on my discussion, please post a comment below!
IBM made history as the first major storage vendor to [break the 1 million IOPS barrier with Solid State Disk]. The project was known as "Quicksilver", and was able to demonstrate that a product like SAN Volume Controller with Solid-State Drives (SSD) can indeed provide a significant boost in performance to external disk arrays. The IBM 2145-CF8 and 2145-CG8 models allow up to four SSD in each node. I was asked not to blog the entire month of August, so that our upcoming September announcements would get more notice, but I couldn't resist covering Quicksilver. The original post had mentioned Texas Memory Systems, but were later removed to avoid the "Pearson Bump".
In my post [Day 2 IBM Storage University - Solutions Expo - TMS After-party], I mentioned that I attended the TMS after-party. Texas Memory Systems had just been qualified as Solid-State Drive (SSD) storage behind the IBM SAN Volume Controller, and the two products work extremely well together for IBM Easy Tier, the sub-volume automated tiering capability to optimize storage performance. I was able to catch up with my friend Erik Eyberg, and meet CEO and Founder Holly Frost.
Nearly half (43 percent) of IT decision makers say they have plans to use SSD technology in the future or are already using it in their datacenter. Solid-state can refer to both volatile Random Access Memory (RAM) and non-volatile Flash, and Texas Memory Systems has built solutions around both types. The survey question referred to non-volatile Flash Solid-State Drives (SSD) that do not require a battery to keep the data from fading away after the power goes out. Nearly all storage in the datacenter has volatile Random Access Memory (RAM).
Speeding delivery of data was the motivation behind 75 percent of respondents who plan to use or already use SSD technology. I would have thought this would have been 100 percent, but the other options included reduced energy consumption, and improved drive reliability, which are both also true with Solid-State Drives.
However, for those who were not using SSD today, the major factor was cost, according to 71 percent of respondents. On a Dollar-per-GB basis, Solid-State Drives continue to be anywhere from 10 to 25 times more expensive spinning disk. Last year's tsunami in Japan, and the floods in Thailand, have caused spinning disk prices to rise to cover component shortages, thereby shrinking the price gap between SSD and spinning disk.
Nearly half (48 percent) say they plan on increasing storage investments in the area of virtualization, cloud (26 percent) and flash memory/solid state (24 percent) and analytics (22 percent).
My series last week on IBM Watson (which you can read [here], [here], [here], and [here]) brought attention to IBM's Scale-Out Network Attached Storage [SONAS]. IBM Watson used a customized version of SONAS technology for its internal storage, and like most of the components of IBM Watson, IBM SONAS is commercially available as a stand-alone product.
Like many IBM products, SONAS has gone through various name changes. First introduced by Linda Sanford at an IBM SHARE conference in 2000 under the IBM Research codename Storage Tank, it was then delivered as a software-only offering SAN File System, then as a services offering Scale-out File Services (SoFS), and now as an integrated system appliance, SONAS, in IBM's Cloud Services and Systems portfolio.
If you are not familiar with SONAS, here are a few of my previous posts that go into more detail:
This week, IBM announces that SONAS has set a world record benchmark for performance, [a whopping 403,326 IOPS for a single file system]. The results are based on comparisons of publicly available information from Standard Performance Evaluation Corporation [SPEC], a prominent performance standardization organization with more than 60 member companies. SPEC publishes hundreds of different performance results each quarter covering a wide range of system performance disciplines (CPU, memory, power, and many more). SPECsfs2008_nfs.v3 is the industry-standard benchmark for NAS systems using the NFS protocol.
(Disclaimer: Your mileage may vary. As with any performance benchmark, the SPECsfs benchmark does not replicate any single workload or particular application. Rather, it encapsulates scores of typical activities on a NAS storage system. SPECsfs is based on a compilation of workload data submitted to the SPEC organization, aggregated from tens of thousands of fileservers, using a wide variety of environments and applications. As a result, it is comprised of typical workloads and with typical proportions of data and metadata use as seen in real production environments.)
The configuration tested involves SONAS Release 1.2 on 10 Interface Nodes and 8 Storage Pods, resulting a single file system over 900TB usable capacity.
10 Interface Nodes; each with:
Maximum 144 GB of memory
One active 10GbE port
8 Storage Pods; each with:
2 Storage nodes and 240 drives
Drive type: 15K RPM SAS hard drives
Data Protection using RAID-5 (8+P) ranks
Six spare drives per Storage Pod
IBM wanted a realistic "no compromises" configuration to be tested, by choosing:
Regular 15K RPM SAS drives, rather than a silly configuration full of super-expensive Solid State Drives (SSD) to plump up the results.
Moderate size, typical of what clients are asking for today. The Goldilocks rule applies. This SONAS is not a small configuration under 100TB, and nowhere close to the maximum supported configuration of 7,200 disks across 30 Interface Nodes and 30 Storage Pods.
Single file system, often referred to as a global name space, rather than using an aggregate of smaller file systems added together that would be more complicated to manage. Having multiple file systems often requires changes to applications to take advantage of the aggregate peformance. It is also more difficult to load-balance your performance and capacity across multiple file systems. Of course, SONAS can support up to 256 separate file systems if you have a business need for this complexity.
The results are stunning. IBM SONAS handled three times more workload for a single file system than the next leading contender. All of the major players are there as well, including NetApp, EMC and HP.
This week I got a comment on my blog post [IBM Announces another SSD Disk offering!]. The exchange involved Solid State Disk storage inside the BladeCenter and System x server line. Sandeep offered his amazing performance results, but we have no way to get in contact with him. So, for those interested, I have posted on SlideShare.net a quick five-chart presentation on recent tests with various SSD offerings on the eX5 product line here:
If you store your VMware bits on external SAN or NAS-based disk storage systems, this post is for you. The subject of the post, VM Volumes, is a potential storage management game changer!
Fellow blogger Stephen Foskett mentioned VM Volumes in his [Introducing VMware vSphere Storage Features] presentation at IBM Edge 2012 conference. His session on VMware's storage features included VMware APIs for Array Integration (VAAI), VMware Array Storage Awareness (VASA), vCenter plug-ins, and a new concept he called "vVol", now more formally known as VM Volumes. This post provides a follow-up to this, describing the VM Volumes concepts, architecture, and value proposition.
"VM Volumes" is a future architecture that VMware is developing in collaboration with IBM and other major storage system vendors. So far, very little information about VM Volumes has been released. At VMworld 2012 Barcelona, VMware highlights VM Volumes for the first time and IBM demonstrates VM Volumes with the IBM XIV Storage System (more about this demo below). VM Volumes is worth your attention -- when it becomes generally available, everyone using storage arrays will have to reconsider their storage management practices in a VMware environment -- no exaggeration!
But enough drama. What is this all about?
(Note: for the sake of clarity, this post refers to block storage only. However, the VM Volumes feature applies to NAS systems as well. Special thanks to Yossi Siles and the XIV development team for their help on this post!)
The VM Volumes concept is simple: VM disks are mapped directly to special volumes on a storage array system, as opposed to storing VMDK files on a vSphere datastore.
The following images illustrate the differences between the two storage management paradigms.
You may still be asking yourself: bottom line, how will I benefit from VM Volumes?
Well, take a VM snapshot for example. With VM Volumes, vSphere can simply offload the operation by invoking a hardware snapshot of the hardware volume. This has significant implications:
VM-Granularity: Only the right VMs are copied (with datastores, backing up or cloning individual-VM portions of hardware snapshot of a datastore would require more complex configuration, tools and work)
Hardware Offload: No ESXi server resources are consumed
XIV advantage: With XIV, snapshots consume no space upfront and are completed instantly.
Here's the first takeaway: With VM Volumes, advanced storage services (which cost a lot when you buy a storage array), will become available at an individual VM level. In a cloud world, this means that applications can be provisioned easily with advanced storage services, such as snapshots and mirroring.
Now, let's take a closer look at another relevant scenario where VM Volumes will make a lot of difference - provisioning an application with special mirroring requirements:
VM Volumes case: The application is ordered via the private cloud portal. The requestor checks a box requesting an asynchronous mirror. He changes the default RPO for his needs. When the request is submitted, the process wraps up automatically: Volumes are created on one of the storage arrays, configured with a mirror and RPO exactly as specified. A few minutes later, the requestor receives an automatic mail pointing to the application virtual machine.
Datastores case #1: As may be expected, a datastore that is mirrored with the special RPO does not exist. As a result, the automated workflow sets a pending status on the request, creates an urgent ticket to a VMware administrator and aborts. When the VMware admin handles that ticket, she re-assigns the ticket to the storage administrator, asking for a new volume which is mirrored with the special RPO, and mapped to the right ESXi cluster. The next day, the volume is created; the ticket is re-assigned to the storage admin, with the new LUN being pointed to. The VMware administrator follows and creates the datastore on top of it. Since the automated workflow was aborted, the admin re-assigns the ticket to the cloud administrator, who sometime later completes the application provisioning manually.
Datastores case #2: Luckily for the requestor, a datastore that is mirrored with the special RPO does exist. However, that particular datastore is consuming space from a high performance XIV Gen3 system with SSD caching, while the application does not require that level of performance, so the workflow requires a storage administrator approval. The approval is given to save time, but the storage administrator opens a ticket for himself to create a new volume on another array, as well as a follow-up ticket for the VMware admin to create a new datastore using the new volume and migrate the application to the other datastore. In this case, provisioning was relatively rapid, but required manual follow up, involving the two administrators.
Here's the second takeaway: With VM Volumes, management is simplified, and end-to-end automation is much more applicable. The reason is that there are no datastores. Datastores physically group VMs that may otherwise be totally unrelated, and require close coordination between storage and VMware administrators.
Now, the above mainly focuses on the VMware or cloud administrator perspective. How does VM Volumes impact storage management?
VM's are the new hosts: Today, storage administrators have visibility of physical hosts in their management environment. In a non-virtualized environment, this visibility is very helpful. The storage administrator knows exactly which applications in a data center are storage-provisioned or affected by storage management operations because the applications are running on well-known hosts. However, in virtualized environments the association of an application to a physical host is temporary. To keep at least the same level of visibility as in physical environments, VMs should become part of the storage management environment, like hosts. Hosts are still interesting, for example to manage physical storage mapping, but without VM visibility, storage administrators will know less about their operation than they are used to, or need to. VM Volumes enables such visibility, because volumes are provided to individual VMs. The XIV VM Volumes demonstration at VMworld Barcelona, although experimental, shows a view of VM volumes, in XIV's management GUI.
Here's a screenshot:
That's not all!
Storage Profiles and Storage Containers: A Storage Profile is a vSphere specification of a set of storage services. A storage profile can include properties like thin or thick provisioning, mirroring definition, snapshot policy, minimum IOPS, etc.
Storage administrators define a portfolio of supported storage services, maintained as a set of storage profiles, and published (via VASA integration) to vSphere.
VMware or cloud administrators define the required storage profiles for specific applications
VMware and storage administrators need to coordinate the typical storage requirements and the automatically-available storage services. When a request to provision an application is made, the associated storage profiles are matched against the published set of available storage profiles. The matching published profiles will be used to create volumes, which will be bound to the application VMs. All that will happen automatically.
Note that when a VM is created today, a datastore must be specified. With VM Volumes, a new management entity called Storage Container (also known as Capacity Pool) replaces the use of datastore as a management object. Each Storage Container exposes a subset of the available storage profiles, as appropriate. The storage container also has a capacity quota.
Here are some more takeaways:
New way to interface vSphere and storage management: Storage administrators structure and publish storage services to vSphere via storage profiles and storage containers.
Automated provisioning, out of the box: The provisioning process automatically matches application-required storage profiles against storage profiles available from the specified storage containers. There is no need to build custom scripts and custom processes to automate storage provisioning to applications
The XIV advantage:
XIV services are very simple to define and publish. The typical number of available storage profiles would be low. It would also be easy to define application storage profiles.
XIV provides consistent high performance, up to very high capacity utilization levels, without any maintenance. As a result, automated provisioning (which inherently implies less human attention) will not create an elevated risk of reduced performance.
Note: A storage vendor VASA provider is required to support VM Volumes, storage profiles, storage containers and automated provisioning. The IBM Storage VASA provider runs as a standalone service that needs to be deployed on a server.
To summarize the VM Volumes value proposition:
Streamline cloud operation by providing storage services at VM and application level, enabling end-to-end provisioning automation, and unifying VMware and storage administration around volumes and VMs.
Increase storage array ROI, improve vSphere scalability and response time, and reduce cloud provisioning lag, by offloading VM-level provisioning, failover, backup, storage migration, storage space recycling, monitoring, and more, to the storage array, using advanced storage operations such as mirroring and snapshots.
Simplify the adoption of VM Volumes using XIV, with smaller and simpler sets of storage profiles. Apply XIV's supreme fast cloning to individual VMs, and keep automation risks at bay with XIV's consistent high performance.
Until you can get your hands on a VM Volumes-capable environment, the VMware and IBM developer groups will be collaborating and working hard to realize this game-changing feature. The above information is definitely expected to trigger your questions or comments, and our development teams are eager to learn from them and respond. Enter your comments below, and I will try to answer them, and help shape the next post on this subject. There's much more to be told.