Well, it's Tuesday again, and you know what that means... IBM announcements! Today, IBM announces that next Monday marks the 60th anniversary of the first commercial digital tape storage system! I am on the East Coast this week visiting clients, but plan to be back in Tucson in time for the cake and fireworks next Monday.
Note that I said first commercial tape system, because tape itself, in various forms, [has been around since 4000 B.C.]. A little historical context might help:
- 1845 - surgical tape
- 1921 - the Band-Aid, self-adhesive bandage
- 1925 - masking tape (which 3M sold under its newly announced Scotch® brand)
- 1930 - clear cellulose-based tape (today, when people say Scotch tape, they usually are referring to the cellulose version)
- 1935 - Allgemeine Elektrizitatsgesellschaft (AEG) presents Magnetophon K1, audio recording on analog tape
- 1942 - Duct tape
- 1947 - Bing Crosby adopts audio recording for his radio program. This saved him from doing the same program live twice per day, perhaps the first example of using technology for "deduplication".
According to the IBM Archives the [IBM 726 tape drive was formally announced May 21, 1952]. It was the size of a refrigerator, and the tape reel was the size of a large pizza. The next time you pull a frozen pizza from your fridge, you can remember this month's celebration!
When I first joined IBM in 1986, there were three kinds of IBM tape: the round reel called 3420, the square cartridge called 3480, and the tubes that contained a wide swath of tape stored in honeycomb shelves called the [IBM 3850 Mass Storage System].
My first job at IBM was to work on DFHSM, which was specifically started in 1977 to manage the IBM 3850, and later renamed to the DFSMShsm component of the DFSMS element of the z/OS operating system. This software was instrumental in keeping disk and tape at high 80-95 percent utilization rates on mainframe servers.
While visiting a client in Detroit, I learned that the client loved their StorageTek tape automation silo, but didn't like that the StorageTek drives inside were incompatible with IBM formats. They wanted to put IBM drives into the StorageTek silos. I agreed it was a good idea, and brought this back to the attention of development. In a contentious meeting with management and engineers, I presented this feedback from the client.
Everyone in the room said IBM couldn't do that. I asked "Why not?" The software engineers I spoke to already said they could support it. With StorageTek at the brink of Chapter 11 bankruptcy, I argued that IBM drives in their tape automation would ease the transition of our mainframe customers to an all-IBM environment.
Was the reason related to business/legal concerns, or was there a hardware issue? It turned out to be a little of both. On the business side, IBM had to agree to work with StorageTek on service and support to its mutual clients in mixed environments. On the technical side, the drive had to be tilted 12 degrees to line up with the robotic hand. A few years later, the IBM silo-compatible 3592 drive was commercially available.
Rather than putting StorageTek completely out of business, this had the opposite effect. Now that IBM drives could be put in StorageTek libraries, everyone wanted one, basically bringing StorageTek back to life. This forced IBM to offer its own tape automation libraries.
In 1993, I filed my first patent. It was for the RECYCLE function in DFHSM to consolidate valid data from partial tapes to fresh new tapes. Before my patent, the RECYCLE function selected tapes alphabetically, by volume serial (VOLSER). My patent evaluated all tapes based on how full they were, and sorted them least-full to most-full, to maximize the return of cartridges.
Different tape cartridges can hold different amounts of data, especially with different formats on the same media type, with or without compression, so calculating the percentage full turned out to be a tricky algorithm that continues to be used in mainframe environments today.
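The selection order described above can be sketched in a few lines. This is only an illustration, not the actual DFHSM logic: the `Tape` fields, the sample VOLSERs and the idea of a single "effective capacity" per tape are assumptions made for the example. The point is that percent-full has to be computed against each tape's own capacity before sorting.

```python
from dataclasses import dataclass


@dataclass
class Tape:
    volser: str         # volume serial (hypothetical sample values below)
    valid_mb: float     # valid data still needed on this tape
    capacity_mb: float  # assumed effective capacity for this tape's format


def percent_full(tape: Tape) -> float:
    """Percentage of the tape's own capacity occupied by valid data."""
    return 100.0 * tape.valid_mb / tape.capacity_mb


def recycle_order(tapes):
    """Sort least-full to most-full, so the emptiest tapes are recycled first."""
    return sorted(tapes, key=percent_full)


tapes = [
    Tape("A00001", valid_mb=180.0, capacity_mb=200.0),  # 90 percent full
    Tape("B00002", valid_mb=40.0, capacity_mb=800.0),   # 5 percent full
    Tape("C00003", valid_mb=100.0, capacity_mb=400.0),  # 25 percent full
]
order = [t.volser for t in recycle_order(tapes)]
```

Note that an alphabetical pass would start with A00001, the fullest tape, which returns the fewest cartridges for the most data movement.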
The patent was popular for cross-licensing, and IBM has since filed additional patents for this invention in other countries to further increase its license revenue for intellectual property.
In 1997, IBM launched the IBM 3494 Virtual Tape Server (VTS), the first virtual tape storage device, blending disk and tape to optimal effect. This was based on the IBM 3850 Mass Storage System, the first virtual disk system, which used 3380 disk and tape to emulate the older 3350 disk systems.
In the VTS, tape volume images would be emulated as files on a disk system, then later moved to physical tape. We would call the disk the "Tape Volume Cache", and use caching algorithms to decide how long to keep data in cache, versus destage to tape. However, there were only a few tape drives, and sometimes when the VTS was busy, there were no tape drives available to destage the older images, and the cache would fill up.
I had already solved this problem in DFHSM, with a function called pre-migration. The idea was to pre-emptively copy data to tape, but leave it also on disk, so that when it needed to be destaged, all we had to do was delete the disk copy and activate the tape copy. We patented using this idea for the VTS, and it is still used in the successor models of IBM System Storage TS7740 virtual tape libraries today.
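Here is a minimal sketch of the pre-migration idea, assuming a toy cache that only tracks which volumes have a disk copy and which have a tape copy. The class and method names are hypothetical and are not taken from DFHSM or the VTS.

```python
class TapeVolumeCache:
    """Toy model: tracks where each virtual volume currently has a copy."""

    def __init__(self):
        self.on_disk = set()
        self.on_tape = set()

    def write(self, volume):
        # New virtual volumes always land in the disk cache first.
        self.on_disk.add(volume)

    def premigrate(self, volume):
        # Copy to tape while a drive is free, but keep the disk copy.
        if volume in self.on_disk:
            self.on_tape.add(volume)

    def destage(self, volume):
        """Free cache space. Returns True if no tape drive was needed."""
        already_on_tape = volume in self.on_tape
        if not already_on_tape:
            # Without pre-migration, this step has to wait for a tape drive.
            self.on_tape.add(volume)
        self.on_disk.discard(volume)
        return already_on_tape


cache = TapeVolumeCache()
cache.write("VOL001")
cache.write("VOL002")
cache.premigrate("VOL001")
fast = cache.destage("VOL001")  # tape copy already exists: just drop the disk copy
slow = cache.destage("VOL002")  # never pre-migrated: needs a drive at destage time
```

The design choice is to spend tape drive time early, when drives are idle, so that freeing cache later is only a bookkeeping change.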
Today, tape continues to be the least expensive storage medium, about 15 to 25 times less expensive, dollar-per-GB, than disk technologies. A dollar of today's LTO-5 tape can hold 22 days worth of MP3 music at 192 Kbps recording. A full TS1140 tape cartridge can hold 2 million copies of the book "War and Peace".
(If you have not read the book, Woody Allen took a speed reading course and read the entire novel in just 20 minutes. He summed up the novel in three words: "It involves Russia." By comparison, in the same 20 minutes, at 650MB/sec, the TS1140 drive can read this novel over and over 390,000 times.)
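For those who like to check the math, here is a quick sketch. The 2 MB size for the novel and the 4 TB cartridge capacity are my assumptions, chosen because they are consistent with the figures quoted above, and "Kbps" is taken to mean 1,000 bits per second.

```python
NOVEL_MB = 2                 # assumed size of "War and Peace" as plain text
DRIVE_MB_PER_SEC = 650       # TS1140 read rate quoted above
CARTRIDGE_MB = 4_000_000     # assumed 4 TB cartridge capacity

# How many times the drive can read the novel in 20 minutes
reads_in_20_min = DRIVE_MB_PER_SEC * 20 * 60 // NOVEL_MB

# How many copies of the novel fit on one full cartridge
copies_per_cartridge = CARTRIDGE_MB // NOVEL_MB

# How much storage 22 days of 192 Kbps MP3 audio needs, in GB
mp3_bytes_per_sec = 192_000 / 8
mp3_gb = mp3_bytes_per_sec * 86_400 * 22 / 1e9
```

Under these assumptions the drive reads the novel 390,000 times, a cartridge holds 2,000,000 copies, and 22 days of music comes to roughly 45.6 GB.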
If you have your own "war stories" about tape, I would love to hear them. Please consider posting a comment below.
technorati tags: IBM, AEG, Bing Crosby, Duct+Tape, Band Aid, DFHSM, RECYCLE, DFSMShsm, z/OS, StorageTek, VTS, VTL, LTO-5, TS1140, LTFS, Woody Allen
Tonight PBS plans to air Season 38, Episode 6 of NOVA, titled [Smartest Machine On Earth]. Here is an excerpt from the station listing:
"What's so special about human intelligence and will scientists ever build a computer that rivals the flexibility and power of a human brain? In "Artificial Intelligence," NOVA takes viewers inside an IBM lab where a crack team has been working for nearly three years to perfect a machine that can answer any question. The scientists hope their machine will be able to beat expert contestants in one of the USA's most challenging TV quiz shows -- Jeopardy, which has entertained viewers for over four decades. "Artificial Intelligence" presents the exclusive inside story of how the IBM team developed the world's smartest computer from scratch. Now they're racing to finish it for a special Jeopardy airdate in February 2011. They've built an exact replica of the studio at its research lab near New York and invited past champions to compete against the machine, a big black box code-named Watson after IBM's founder, Thomas J. Watson. But will Watson be able to beat out its human competition?"
Craig Rhinehart offers [10 Things You Need to Know About the Technology Behind Watson].
An artist has come up with this clever
Dr. Jon Lenchner from IBM Research has a series of posts on [How Watson "sees", "hears", and "speaks"] and [Selected Nuances].
Like most supercomputers, Watson runs the Linux operating system. The system runs 2,880 cores (90 IBM Power 750 servers, four sockets each, eight cores per socket) to achieve 80 [TeraFlops]. TeraFlops is the unit of measure for supercomputers, representing a trillion floating point operations per second. By comparison, Hans Moravec, principal research scientist at the Robotics Institute of Carnegie Mellon University (CMU), estimates that the [human brain is about 100 TeraFlops]. So, in the three seconds that Watson gets to calculate its response, it would have processed 240 trillion operations.
Several readers of my blog have asked for details on the storage aspects of Watson. Basically, it is a modified version of IBM Scale-Out NAS [SONAS] that IBM offers commercially, but running Linux on POWER instead of Linux-x86. System p expansion drawers of SAS 15K RPM 450GB drives, 12 drives each, are dual-connected to two storage nodes, for a total of 21.6TB of raw disk capacity. The storage nodes use IBM's General Parallel File System (GPFS) to provide clustered NFS access to the rest of the system. Each Power 750 has minimal internal storage mostly to hold the Linux operating system and programs.
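The compute and storage figures above are easy to sanity-check. In this sketch, the drive and drawer counts are inferred from the quoted raw capacity and are not stated figures.

```python
# Compute: cores and operations per response
servers, sockets_per_server, cores_per_socket = 90, 4, 8
total_cores = servers * sockets_per_server * cores_per_socket

teraflops = 80
response_seconds = 3
trillion_ops_per_response = teraflops * response_seconds

# Storage: how many 450GB drives make up 21.6TB raw (inferred, not stated)
raw_capacity_gb = 21_600
drive_gb = 450
drives_per_drawer = 12
drive_count = raw_capacity_gb // drive_gb
drawer_count = drive_count // drives_per_drawer
```

That works out to 2,880 cores, 240 trillion operations in three seconds, and 48 drives, which would fill four 12-drive expansion drawers.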
When Watson is booted up, the 15TB of total RAM are loaded up, and thereafter the DeepQA processing is all done from memory. According to IBM Research, "The actual size of the data (analyzed and indexed text, knowledge bases, etc.) used for candidate answer generation and evidence evaluation is under 1TB." For performance reasons, various subsets of the data are replicated in RAM on different functional groups of cluster nodes. The entire system is self-contained. Watson is NOT going to the internet searching for answers.
On ZDnet, Steven J. Vaughan-Nichols welcomes our new [Linux Penguin Jeopardy overlords]. I have to say I share his enthusiasm!
technorati tags: IBM, Nova, Watson, #ibmwatson, Jeopardy, POWER7, p750, supercomputer, TeraFlops, disk, SONAS, GPFS, SAS, Craig Rhinehart, Jon Lenchner, Hans Morvec, Carnegie Mellon University, CMU
“In times of universal deceit, telling the truth will be a revolutionary act.”
-- George Orwell
Well, it has been over two years since I first covered IBM's acquisition of the XIV company. Amazingly, I still see a lot of misperceptions out in the blogosphere, especially those regarding double drive failures for the XIV storage system. Despite various attempts to [explain XIV resiliency] and to [dispel the rumors], there are still competitors making stuff up, putting fear, uncertainty and doubt into the minds of prospective XIV clients.
Clients love the IBM XIV storage system! In this economy, companies are not stupid. Before buying any enterprise-class disk system, they ask the tough questions, run evaluation tests, and all the other due diligence often referred to as "kicking the tires". Here is what some IBM clients have said about their XIV systems:
“3-5 minutes vs. 8-10 hours rebuild time...”
-- satisfied XIV client
“...we tested an entire module failure - all data is re-distributed in under 6 hours...only 3-5% performance degradation during rebuild...”
-- excited XIV client
“Not only did XIV meet our expectations, it greatly exceeded them...”
-- delighted XIV client
In this blog post, I hope to set the record straight. It is not my intent to embarrass anyone in particular, so instead will focus on a fact-based approach.
- Fact: IBM has sold THOUSANDS of XIV systems
XIV is "proven" technology with thousands of XIV systems in company data centers. And by systems, I mean full disk systems with 6 to 15 modules in a single rack, twelve drives per module. That equates to hundreds of thousands of disk drives in production TODAY, comparable to the number of disk drives studied by [Google], and [Carnegie Mellon University] that I discussed in my blog post [Fleet Cars and Skin Cells].
- Fact: To date, no customer has lost data as a result of a Double Drive Failure on XIV storage system
This has always been true, both when XIV was a stand-alone company and since the IBM acquisition two years ago. When examining the resilience of an array to any single or multiple component failures, it's important to understand the architecture and the design of the system and not assume all systems are alike. At its core, XIV is a grid-based storage system. IBM XIV does not use traditional RAID-5 or RAID-10 methods, but instead data is distributed across loosely connected data modules which act as independent building blocks. XIV divides each LUN into 1MB "chunks", and stores two copies of each chunk on separate drives in separate modules. We call this "RAID-X".
Spreading all the data across many drives is not unique to XIV. Many disk systems, including EMC CLARiiON-based V-Max, HP EVA, and Hitachi Data Systems (HDS) USP-V, allow customers to get XIV-like performance by spreading LUNs across multiple RAID ranks. This is known in the industry as "wide-striping". Some vendors use the terms "metavolumes" or "extent pools" to refer to their implementations of wide-striping. Clients have coined their own phrases, such as "stripes across stripes", "plaid stripes", or "RAID 500". It is highly unlikely that an XIV will experience a double drive failure that ultimately requires recovery of files or LUNs, and XIV is substantially less vulnerable to data loss than an EVA, USP-V or V-Max configured in RAID-5. Fellow blogger Keith Stevenson (IBM) compared XIV's RAID-X design to other forms of RAID in his post [RAID in the 21st Century].
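To make the RAID-X placement rule concrete, here is a toy sketch. It is not XIV's actual distribution algorithm, which is not described here. I simply assume a pseudo-random choice that honors the one stated constraint: the two copies of each 1MB chunk go to drives in different modules.

```python
import random

MODULES = 15            # fully configured rack
DRIVES_PER_MODULE = 12


def place_chunk(rng):
    """Pick (module, drive) locations for both copies, in different modules."""
    first_module, second_module = rng.sample(range(MODULES), 2)
    primary = (first_module, rng.randrange(DRIVES_PER_MODULE))
    secondary = (second_module, rng.randrange(DRIVES_PER_MODULE))
    return primary, secondary


def place_lun(size_mb, seed=0):
    """One placement per 1MB chunk of the LUN (seeded for repeatability)."""
    rng = random.Random(seed)
    return [place_chunk(rng) for _ in range(size_mb)]


layout = place_lun(size_mb=1024)  # a 1GB LUN becomes 1,024 chunks
```

Because every chunk picks its own pair of modules, the copies of any one drive's chunks end up scattered across the rest of the system rather than sitting on a single mirror partner.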
- Fact: IBM XIV is designed to minimize the likelihood and impact of a double drive failure
The independent failure of two drives is a rare occurrence. More data has been lost from hash collisions on EMC Centera than from double drive failures on XIV, and hash collisions are also very rare. While the published worst-case time to re-protect from a 1TB drive failure for a fully-configured XIV is 30 minutes, field experience shows XIV regaining full redundancy on average in 12 minutes. That makes a second failure during the rebuild about 40 times less likely than during a typical 8-10 hour window for a RAID-5 configuration.
A lot of bad things can happen in those 8-10 hours of traditional RAID rebuild. Performance can be seriously degraded. Other components may be affected, as they share cache, connected to the same backplane or bus, or co-dependent in some other manner. An engineer supporting the customer onsite during a RAID-5 rebuild might pull the wrong drive, thereby causing a double drive failure they were hoping to avoid. Having IBM XIV rebuild in only a few minutes addresses this "human factor".
In his post [XIV drive management], fellow blogger Jim Kelly (IBM) covers a variety of reasons why storage admins feel double drive failures are more than just random chance. XIV avoids load stress normally associated with traditional RAID rebuild by evenly spreading out the workload across all drives. This is known in the industry as "wear-leveling". When the first drive fails, the recovery is spread across the remaining 179 drives, so that each drive only processes about 1 percent of the data. The [Ultrastar A7K1000] 1TB SATA disk drives that IBM uses from HGST have a specified 1.2 million hours mean-time-between-failures [MTBF], which would average about one drive failing every nine months in a 180-drive XIV system. However, field experience shows that an XIV system will experience, on average, one drive failure per 13 months, comparable to what companies experience with more robust Fibre Channel drives. That's innovative XIV wear-leveling at work!
- Fact: In the highly unlikely event that a double drive failure (DDF) were to occur, you will have full read/write access to nearly all of your data on the XIV, all but a few GB.
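The numbers in this section follow from simple arithmetic, sketched below. I am assuming 730 hours per month (8,760 hours per year divided by 12) and taking the low end of the 8-10 hour RAID-5 rebuild window.

```python
MTBF_HOURS = 1_200_000   # specified per-drive MTBF
DRIVES = 180             # fully configured XIV
HOURS_PER_MONTH = 730    # assumption: 8,760 hours per year / 12

# Expected time between drive failures somewhere in the system
hours_between_failures = MTBF_HOURS / DRIVES
months_between_failures = hours_between_failures / HOURS_PER_MONTH

# Exposure window: how long the system runs without full redundancy
xiv_window_minutes = 12
raid5_window_minutes = 8 * 60
exposure_ratio = raid5_window_minutes / xiv_window_minutes
```

This gives roughly 6,667 hours, or about nine months, between drive failures per the spec sheet, and a RAID-5 exposure window 40 times longer than the XIV field average.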
Even though it has NEVER happened in the field, some clients and prospects are curious what a double drive failure on an XIV would look like. First, a critical alert message would be sent to both the client and IBM, and a "union list" is generated, identifying all the chunks in common. The worst case on a 15-module XIV fully loaded with 79TB data is approximately 9000 chunks, or 9GB of data. The remaining 78.991 TB of unaffected data are fully accessible for read or write. Any I/O requests for the chunks in the "union list" will have no response yet, so there is no way for host applications to access outdated information or cause any corruption.
(One blogger compared losing data on the XIV to drilling a hole through the phone book. Mathematically, the drill bit would be only 1/16th of an inch, or 1.60 millimeters for you folks outside the USA. Enough to knock out perhaps one character from a name or phone number on each page. If you have ever seen an actor in the movies look up a phone number in a telephone booth then yank out a page from the phone book, the XIV equivalent would be cutting out 1/8th of a page from an 1100 page phone book. In both cases, all of the rest of the unaffected information is fully accessible, and it is easy to identify which information is missing.)
If the second drive failed several minutes after the first drive, the process for full redundancy is already well under way. This means the union list is considerably shorter or completely empty, and substantially fewer chunks are impacted. Contrast this with RAID-5, where being 99 percent complete on the rebuild when the second drive fails is just as catastrophic as having both drives fail simultaneously.
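The "union list" can be pictured as a set operation. Despite the name, the chunks at risk are the ones the two failed drives have in common, minus any that have already been re-protected elsewhere. The function and the sample chunk IDs below are hypothetical, just to illustrate the idea.

```python
def union_list(chunks_on_first, chunks_on_second, reprotected=frozenset()):
    """Chunks whose two copies both sat on the failed drives and are not yet re-copied."""
    in_common = set(chunks_on_first) & set(chunks_on_second)
    return in_common - set(reprotected)


first_drive = {1, 2, 3, 4, 5}   # chunk IDs with a copy on the first failed drive
second_drive = {4, 5, 6, 7}     # chunk IDs with a copy on the second failed drive

# Both drives fail at the same moment: nothing has been re-protected yet
simultaneous = union_list(first_drive, second_drive)

# Second drive fails minutes later: chunk 4 has already been copied elsewhere
delayed = union_list(first_drive, second_drive, reprotected={4})

# Worst case quoted above: about 9,000 chunks of 1MB each
worst_case_mb = 9000 * 1
```

The longer the gap between the two failures, the more chunks land in `reprotected` and the shorter the list gets, which is exactly the contrast with RAID-5 drawn above.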
- Fact: After a DDF event, the files on these few GB can be identified for recovery.
Once IBM receives notification of a critical event, an IBM engineer immediately connects to the XIV using the remote service support method. There is no need to send someone physically onsite, as the repair actions can be done remotely. The IBM engineer has tools from HGST to recover, in most cases, all of the data.
Any "union" chunk that the HGST tools are unable to recover will be set to "media error" mode. The IBM engineer can provide the client a list of the XIV LUNs and LBAs that are on the "media error" list. From this list, the client can determine which hosts these LUNs are attached to, and run a file scan utility against the file systems that these LUNs represent. Files that get a media error during this scan will be listed as needing recovery. A chunk could contain several small files, or the chunk could be just part of a large file. To minimize time, the scans and recoveries can all be prioritized and performed in parallel across host systems zoned to these LUNs.
As with any file or volume recovery, keep in mind that these might be part of a larger consistency group, and that your recovery procedures should make sense for the applications involved. In any case, you are probably going to be up-and-running in less time with XIV than recovery from a RAID-5 double failure would take, and certainly nowhere near the "beyond repair" scenario that other vendors might have you believe.
- Fact: This does not mean you can eliminate all Disaster Recovery planning!
To put this in perspective, you are more likely to lose XIV data from an earthquake, hurricane, fire or flood than from a double drive failure. As with any unlikely disaster, it is better to have a disaster recovery plan than to hope it never happens. All disk systems that sit on a single datacenter floor are vulnerable to such disasters.
For mission-critical applications, IBM recommends using disk mirroring capability. IBM XIV storage system offers synchronous and asynchronous mirroring natively, both included at no additional charge.
For more about IBM XIV reliability, read this whitepaper [IBM XIV© Storage System: Reliability Reinvented]. To find out why so many clients LOVE their XIV, contact your local IBM storage sales rep or IBM Business Partner.
technorati tags: IBM, XIV, DDF, RAID-5, RAID-10, RAID-X, RAID-6, RAID-DP, HP, EVA, HDS, USP-V, EMC, CLARiiON, V-Max, Disaster Recovery, HGST, UltraStar, A7K1000
Well, it's Tuesday again, and you know what that means? IBM Announcements!
Last week, IBM announced a variety of tape system enhancements.
- IBM TS7760 Virtual Tape System
The IBM TS7760 combines the benefits of the previous TS7720 and TS7740 offerings. Those with IBM z Systems mainframes will recognize both. The TS7740 has a small amount of disk that pretends to be a tape library, with enough capacity to hold a few hours to a few days worth of data. After that, the data is moved to physical tape. The TS7720 is an all-disk solution, with up to 1 PB of disk to hold weeks or months worth of data, but it did not have tape attachment. Previously, IBM announced the TS7720T, a high-capacity offering with tape attachment. The new TS7760 is now the replacement for all three of these, powered by the latest POWER8 processor.
In addition to all the features available in the former models, the new TS7760 uses 4TB drives instead of 3TB drives, resulting in a maximum of 1.3PB of disk capacity before compression. The disks are encrypted and protected by distributed RAID-6 referred to as "Dynamic Disk Pooling". While tape attachment is still optional, it supports both IBM TS3500 and TS4500 tape libraries.
To learn more, see the [TS7700 R4.0 delivers faster performance and larger maximum capacity with the new TS7760 offering] press release.
- new Rack-mount Kit for TS1140 and TS1150 tape drives
Previously, the IBM tape drives had a rack-mount kit that took up 10U, and only worked with racks that were 28 inches deep, so two drives took up nearly one-fourth of a full rack. These new rack-mount kits take up only 3U for one or two drives, so they are more space-efficient, and can work with any rack that is 28 to 44.5 inches deep. To learn more, see the [IBM TS1140 and TS1150 Tape Drive rack mount kit features support RoHS compliance in a 3U form factor] press release.
- IBM TS3500 Tape Library
The IBM TS3500 has been enhanced to support the new 16Gb FC attachments for the TS7700 virtual tape systems, including the new TS7760 I mentioned above. To learn more, see the [IBM TS3500 Tape Library supports new switch options] press release.
- IBM TS4500 Tape Library
The IBM TS4500 can now attach to IBM TS7720, TS7720T, TS7740 and TS7760 Virtual Tape Systems for z Systems mainframe attachment, with some amazing enhancements over its TS3500 predecessor:
- Up to 60 percent reduction in floor space costs
- Up to two times faster access to data
- Up to 25 percent higher bandwidth per frame over TS3500
- z Systems synergy with support for 16 Gb Fibre Channel switch for up to 100 PB of z System Data storage
To learn more, see the [IBM TS4500 Tape Library supports TS7700 attachment] press release.
I am at the airport headed to Chicago for the IBM Technical University. If you are in the Chicago area, consider attending!
technorati tags: IBM, TS7700, TS7720, TS7720T, TS7740, TS7760, TS1140, TS1150, TS3500, TS4500
Modified by TonyPearson
This week, I am attending the [InterConnect Conference] in Las Vegas, Feb 21-25, 2016. This is IBM's premier Cloud & Mobile conference for the year.
Monday afternoon, I attended various break-out sessions.
- 1441A Data Resiliency: Data-Driven Analytics and Beyond
Ramani Routray and B.J. Klingenberg, both from IBM, co-presented. Aggressive and differentiated Recovery Point Objectives (RPOs) and Recovery Time Objectives (RTOs) create data protection silos. Resiliency for an enterprise data center is often achieved via redundant components, periodic backup, continuous replication and/or highly available architectures. With the emergence of cloud delivery models, Backup-as-a-Service and DR-as-a-Service have gained wide acceptance. This uniquely challenges service providers to quickly analyze all the metadata from these environments to enable problem determination, fault isolation, capacity management, SLA violation, etc. Learn about a big data analytics framework that analyzes millions of resiliency metadata tuples in near real-time to generate actionable insights.
- 1267A Prudential and IBM: Integrating Application and Storage Management to Drive Cloud Service Levels
This was a 50/50 presentation, with the first half covered by clients OJ Dua, supported by his boss, Scott Singerline, both from Prudential Financial.
Prudential explored their successful approach for optimizing storage and improving service. Experts from Prudential Financial described their experiences integrating IBM Spectrum Control v5.2 (formerly IBM Tivoli Storage Productivity Center) inventory, availability, and performance data with Tivoli Application Dependency Discovery Manager (TADDM) and Netcool OMNIbus to improve services for core business applications.
(Over 10 years ago, I was the chief architect for IBM TotalStorage Productivity Center v1. The clients from Prudential could not emphasize enough how much better Spectrum Control v5.2 was compared to their experiences with the prior versions. It has come a long way, baby!)
The second half was covered by Brian Sherman, IBM Distinguished Engineer. He described how related IBM Spectrum Storage solutions are transforming storage. IBM Spectrum Storage solutions deliver reliable, flexible service levels at a significantly lower cost than traditional storage.
- 6523A VersaStack: Because Time and Cost are of the Essence for Cloud Service Providers
This was more of a 25/75 presentation. Ian Shave, IBM Business Line Executive for Spectrum Virtualize and VersaStack, kicked off the session with a quick overview of VersaStack, which combines Cisco UCS x86 blade servers and Cisco network switches with IBM Spectrum Virtualize storage solutions. This is often referred to as "Integrated Infrastructure" or "Converged Systems". While Integrated Infrastructure adoption is growing 15 percent, storage within Integrated Infrastructure solutions is growing faster at 44 percent.
VersaStack can be implemented as follows:
- Cisco UCS Mini with Storwize V5000, either iSCSI or FCP
- Cisco UCS with Storwize V7000 (block-only) or V7000 Unified (file and block access)
- Cisco UCS with FlashSystem V9000, for high-speed, low-latency application requirements
John Buskermolen and Dan Simunic, both from i-Virtualize, covered their experiences with VersaStack. Founded in 2009, i-Virtualize is a Managed Services Provider (MSP), Cloud Service Provider (CSP) and value-added reseller, for clients in both USA and Canada, growing 41 percent year over year.
They reduced the time to market from weeks to days, cut new environment provisioning time from days to minutes, and simplified management when they implemented VersaStack, an integrated infrastructure solution that combines Cisco UCS Integrated Infrastructure with IBM storage solutions built with IBM Spectrum Virtualize to deliver extraordinary levels of performance and efficiency.
Why did i-Virtualize choose VersaStack?
- 79 percent reduced provisioning time
- 60 percent lower costs
- 10x performance acceleration
- Higher flexibility, with clustered systems that scale up and out
- Lets i-Virtualize administrators and management sleep at night
- 47 percent capacity savings with Real-time Compression
- IBM Spectrum Virtualize HyperSwap for high availability
- Storage-based replication across multiple datacenters
- Cisco UCS director provides single-pane-of-glass management
Their latest project is called VIXO, a Cloud Managed Services Console which stacks Cloud Foundry, Docker, OpenStack, VMware and other 3rd party services on top of their VersaStack. This is a collaboration with Oxbury Group.
VersaStack is an ideal solution for Cloud Service Providers (CSP) or for any client interested in "cloud-in-a-box."
- 3690A Meet the Experts on IBM Cloud Storage Services
Ann Corrao and Mike Fork, both from IBM, presented IBM's various storage capabilities on SoftLayer and Cloud Managed Services (CMS). Of IBM's 43 Cloud datacenters, 28 are SoftLayer, and the other 15 are CMS.
For block-based volume storage, SoftLayer offers "Endurance" and "Performance". These are backed by multi-pathed iSCSI volumes.
- With the "Endurance" option, you purchase a fixed I/O density, either 0.5 IOPS/GB, 1 IOPS/GB or 4 IOPS/GB. If you choose a 100 GB volume at 4 IOPS/GB, you are guaranteed 400 IOPS. Typical business applications like database or email consume about 0.7 IOPS/GB.
- With the "Performance" option, you pick the IOPS for your volume, up to 6,000 IOPS, and then pick the size to match your needs, say 100 GB. This is best suited for clients who know their application well enough to specify this.
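The two options are really the same arithmetic approached from opposite directions, as this small sketch shows. The function names are mine and are not part of any SoftLayer API.

```python
ENDURANCE_TIERS = (0.5, 1, 4)   # IOPS per GB, as listed above
PERFORMANCE_MAX_IOPS = 6000


def endurance_iops(size_gb, iops_per_gb):
    """Endurance: pick a tier and a size; the IOPS follow from the two."""
    if iops_per_gb not in ENDURANCE_TIERS:
        raise ValueError("tier must be one of %s" % (ENDURANCE_TIERS,))
    return size_gb * iops_per_gb


def performance_density(iops, size_gb):
    """Performance: pick the IOPS and the size; the I/O density follows."""
    if iops > PERFORMANCE_MAX_IOPS:
        raise ValueError("at most %d IOPS per volume" % PERFORMANCE_MAX_IOPS)
    return iops / size_gb


top_tier = endurance_iops(100, 4)           # 100 GB at 4 IOPS/GB
low_tier = endurance_iops(100, 0.5)         # 100 GB at 0.5 IOPS/GB
dense = performance_density(6000, 100)      # 6,000 IOPS on a 100 GB volume
```

A 100 GB Endurance volume tops out at 400 IOPS, while the Performance option lets the same 100 GB reach 60 IOPS/GB, far beyond the 0.7 IOPS/GB of a typical business application.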
IBM Bluemix also has a block service, based on OpenStack Cinder drivers. These are backed by internal disk on storage-rich servers. IBM SoftLayer can pack 4 drives into a 1U server, 12 drives into a 2U server and 36 drives into a 3U server.
For object store, IBM SoftLayer supports OpenStack Swift. They support content expiration, versioning and metadata search.
(When asked if this was Cleversafe or something else, Mike was quick to point out that IBM SoftLayer focuses on the "Service Level Agreement (SLA), the client experience, and the APIs" so however they chose to back this storage is internally determined. The client should not have to specify product xyz in their contract.)
An extra feature for object store is "Content Delivery Network" (CDN) which uses EdgeCast to cache content at the edges of the network to improve performance delivery. You designate which object containers you want to accelerate performance, and you pay for the amount of bandwidth consumed.
For file space, IBM SoftLayer supports NFS and SFTP only. Supporting CIFS, or rather its replacement SMB, is a known requirement. In the meantime, there are a variety of 3rd party "Cloud Gateway" solutions, like NetApp AltaVault, Panzura global namespace, or CTERA.
For file sync-and-share, IBM has partnered with Box to provide Enterprise-class service.
How do clients ingest data into their IBM SoftLayer account? One option is to use Aspera, a recent IBM acquisition that is 3x faster than traditional SCP. Another option is to ship disk or tape cartridges to an IBM SoftLayer facility.
technorati tags: IBM, #InterConnect, #IBMSystems, #IBMStorage, Data Resiliency, IBM Resiliency Services, Ramani Routray, Bernhard Klingenberg, Prudential Financial, OJ Dua, Scott Singerline, Tivoli Storage Productivity Center, Spectrum Control, TADDM, Tivoli Application Dependency Discovery Manager, Netcool OMNIbus, Brian Sherman, Spectrum Storage, VersaStack, i-Virtualize, Ian Shave, John Buskermolen, Dan Simunic, Managed Service Provider, MSP, Cloud Service Provider, CSP, Cloud Foundry, Docker, OpenStack, VMware, VIXO, Ann Corrao, Mike Fork, SoftLayer, Cloud Managed Services, CMS, OpenStack Cinder, OpenStack Swift, Content Delivery Network, CDN, NetApp AltaVault, Panzura, CTERA, IBM and Box, Aspera
Well, it's Tuesday again, and you know what that means? IBM Announcements!
This week, IBM announces the second generation of Storwize V5000 flash and disk storage systems. There are the V5000F All-flash configurations, as well as the V5000 that can support a variety of flash and spinning disk drives.
There are three models:
- Storwize V5010
The V5010 has dual 2-core/2-thread processors and 16GB of cache. It supports thin provisioning, FlashCopy, Easy Tier, and remote mirroring. The base unit includes 1GbE ports for iSCSI host connectivity, with options to add 16Gb Fibre Channel, 12Gb SAS, and 10GbE iSCSI/FCoE as well.
The 2U controllers and expansion enclosures can hold either 24 small 2.5-inch drives, or 12 larger 3.5-inch drives. A single control enclosure has two active/active IBM Spectrum Virtualize nodes, and can attach up to 10 expansion enclosures for a maximum of 264 drives.
- Storwize V5020
The V5020 unit has dual 2-core/4-thread processors and up to 32GB of cache. It supports everything the V5010 does, plus encryption. The encryption is done via the Intel AES-NI instruction set to eliminate the need for special "self-encrypting drives" (SED) that other storage devices may require.
- Storwize V5030
The V5030 has dual 6-core/4-thread processors and up to 64GB of cache. It supports everything the V5010 and V5020 do, plus Real-time Compression and external virtualization. The Real-time Compression can achieve up to 80 percent space savings, representing a 5:1 compression ratio.
Each control enclosure can attach to 20 expansion enclosures, which can support 504 internal drives per controller, and up to 1,008 with two controllers (four Spectrum Virtualize nodes) clustered together. This is in addition to the drives in external storage systems virtualized.
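The maximum drive counts and the compression ratio quoted for these models can be reproduced with a couple of one-line formulas. This sketch assumes every enclosure is the 24-drive, 2.5-inch variant.

```python
DRIVES_PER_ENCLOSURE = 24   # 2.5-inch enclosures


def max_drives(expansion_enclosures, control_enclosures=1):
    """Each control enclosure holds drives too, plus its expansion enclosures."""
    per_controller = (1 + expansion_enclosures) * DRIVES_PER_ENCLOSURE
    return per_controller * control_enclosures


def compression_ratio(savings_percent):
    """Convert space savings into an N:1 ratio, e.g. 80 percent -> 5.0."""
    return 100 / (100 - savings_percent)


v5010_max = max_drives(10)                              # one controller, 10 expansions
v5030_max = max_drives(20)                              # one controller, 20 expansions
v5030_clustered = max_drives(20, control_enclosures=2)  # two controllers clustered
ratio = compression_ratio(80)
```

These give 264, 504 and 1,008 drives respectively, and a 5:1 ratio for 80 percent savings.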
To learn more, read the [Storwize V5000 Gen2 announcement letter].
technorati tags: IBM, Storwize, Spectrum Virtualize, Storwize V5000F, Storwize V5000, Storwize V5010, Storwize V5020, Storwize V5030, Thin Provisioning, FlashCopy, Easy Tier, Remote Mirroring, Metro Mirror, Global Mirror, iSCSI, Fibre Channel, SAS, FCoE, Encryption, Intel AES-NI, SED, Real-time Compression
Modified by TonyPearson
Well it's Tuesday again, and you know what that means? IBM Announcements!
(FTC Disclosure: This official launch also includes October 6 announcements. In any case, the usual disclaimer applies: I currently work for IBM, and this blog post can be considered a "paid celebrity endorsement" of the IBM products mentioned below.)
IBM announced various updates to its Spectrum Storage product line. Here is a quick recap.
- IBM Spectrum Virtualize 7.6
Spectrum Virtualize is the new name of the "storage hypervisor" code that resides in IBM SAN Volume Controller (SVC) and Storwize family products. When you buy an SVC, you will license Spectrum Virtualize software on it. It is NOT available separately as software-only that you can install on any other hardware. There are three major improvements:
- Software-based Data-at-Rest Encryption
Earlier this year, IBM delivered data-at-rest encryption for the Storwize V7000 and V7000 Unified. This week, IBM extends this support to other storage hypervisors.
Since this feature is based on the Intel processor that supports the Advanced Encryption Standard New Instructions (AES-NI), it applies only to the newer hardware: SAN Volume Controller 2145-DH8, the Storwize V7000 Gen2, FlashSystem V9000, and VersaStack converged systems that contain these. You can run Spectrum Virtualize v7.6 on older hardware models, but the encryption feature will be disabled.
Basically, by taking advantage of AES-NI commands, IBM can now offer data-at-rest encryption on any virtualized flash or disk arrays, eliminating the need for special "Self-Encrypting Drives", or SED.
The encryption keys are kept on USB memory sticks, which you can either leave in the machine or stash away in a vault or safe somewhere.
- Distributed RAID
The other improvement is distributed RAID. Distributed RAID has been hugely popular on IBM XIV products, and has since found its way into the DCS3700, DCS3860 and Elastic Storage Server models.
With this new enhancement, storage admins can select "Distributed RAID-5" or "Distributed RAID-6" as alternate choices to traditional RAID ranks.
Why use it? All the drives are now active. That eliminates idle spare drives that sit collecting dust and cobwebs while waiting for an opportunity to spin up, and that become a terrible bottleneck when they are finally used for a rebuild. Since all drives are reading and writing, the rebuild rate is an order of magnitude (5 to 10x) faster!
For those clients nervous about large 8TB drives and the number of days it would take to perform a traditional RAID rebuild, this should calm all of your fears.
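To put rough numbers on that, here is a back-of-the-envelope sketch. The 50 MB/sec sustained rebuild rate is purely an assumed figure for a single spare drive on a busy array, not a measured or published value; only the 8TB drive size and the 5 to 10x range come from the text above.

```python
def rebuild_hours(drive_tb, rebuild_mb_per_sec):
    """Hours to rewrite one failed drive's worth of data at a sustained rate."""
    megabytes = drive_tb * 1_000_000  # decimal TB -> MB
    return megabytes / rebuild_mb_per_sec / 3600


ASSUMED_RATE = 50  # MB/sec, hypothetical rate for one busy spare drive

traditional = rebuild_hours(8, ASSUMED_RATE)
for speedup in (5, 10):  # the 5 to 10x range quoted above
    distributed = traditional / speedup
    print(f"{speedup}x faster: {traditional:.1f} h -> {distributed:.1f} h")
```

Plug in your own rebuild rate to see how the gap changes for your drives.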
- IP-based Quorum
This is one of those line items that we have been telling clients was "just around the corner" and "coming soon, watch this space", and finally it is available. For clients using Stretched Cluster or HyperSwap across two buildings, best practice suggests keeping the quorum disk in a third building. This often meant having to dedicate a single 2U disk system in a closet somewhere, with expensive Fibre Channel cables connecting to the other two buildings.
To address this, IBM now allows the quorum disk to be based on Internet Protocol (the IP portion of TCP/IP), hosted on any bare-metal or virtual machine that is LAN or WAN attached. The "quorum disk" is just a little Java program. This can run on any cloud service provider as well, such as IBM SoftLayer, to which both buildings have connectivity.
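For readers unfamiliar with what a quorum tie-breaker does, here is a generic sketch of the idea. It is not IBM's algorithm or the actual Java program; the class and site names are hypothetical. When the link between the two buildings fails, each site asks the third-site tie-breaker for permission to continue, and only the first to ask is granted it, which avoids a "split brain".

```python
class TieBreaker:
    """Toy third-site arbiter: the first site to claim wins, the other must stop."""

    def __init__(self):
        self.winner = None

    def claim(self, site):
        # The first claim is recorded; later claims succeed only for the same site.
        if self.winner is None:
            self.winner = site
        return self.winner == site


arbiter = TieBreaker()
print(arbiter.claim("building-1"))  # True: keeps serving I/O
print(arbiter.claim("building-2"))  # False: must stop to avoid split brain
```

A real implementation also has to handle resets after the link heals, and sites that cannot reach the arbiter at all.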
A minor improvement worth mentioning is that the IBM "Comprestimator" tool that estimates the capacity savings of Real-time Compression is now integrated into Spectrum Virtualize v7.6 command line interface (CLI), allowing you to run the tool on demand, as needed, on any virtual volume.
- IBM Spectrum Scale v4.2
IBM plans to offer all of its solutions in any of three flavors: software-only that you can deploy on your own server hardware, pre-built system appliances, and cloud services on IBM SoftLayer, IBM Cloud Managed Services or third-party cloud providers. Spectrum Scale is the software-only flavor, and Elastic Storage Server and Storwize V7000 Unified are pre-built systems based on that software.
- File and Object access
IBM previously published a "Redbook" on how to implement OpenStack Swift and Amazon S3 interfaces to an existing Spectrum Scale deployment. IBM supported it, but it was basically a do-it-yourself (DIY) implementation. This has now been resolved, with full integration of OpenStack Swift and Amazon S3 object-protocol interfaces.
(For those unfamiliar with "Object storage", think of it like valet parking for your data. Before working for IBM, I was previously employed as a valet attendant, so I feel qualified to make this analogy.
If you park your car in a 10-story high parking structure, you have to remember where you parked to go find the car again. With valet parking, you hand over the keys to the valet attendant, the car gets parked, and you get a claim stub that you then use to get your car back. In the meantime, you don't know where your car is parked, and you don't care either!
Storing files in volume-level or file-level storage is like that 10-story high parking structure. You have to remember where you put it, which LUN or which sub-directory. With object storage, the system provides a "claim stub" in the form of a Uniform Resource Identifier, or URI, and simple HTTP commands like PUT and GET can be used to upload and download the content.)
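The valet analogy can be sketched in a few lines. This is a toy in-memory model, not the OpenStack Swift or Amazon S3 API; the class name and the example.com base URI are made up for illustration. The point is that the caller never chooses or learns the location, and only keeps the claim stub.

```python
import uuid


class ValetObjectStore:
    """Toy object store: hand over data, get back a URI 'claim stub'."""

    def __init__(self, base="https://objects.example.com/v1"):
        self._base = base
        self._objects = {}  # where things are "parked" is hidden from the caller

    def put(self, data: bytes) -> str:
        uri = f"{self._base}/{uuid.uuid4().hex}"
        self._objects[uri] = data
        return uri

    def get(self, uri: str) -> bytes:
        return self._objects[uri]


store = ValetObjectStore()
stub = store.put(b"quarterly-report")
print(stub)             # the claim stub; the random part differs on each run
print(store.get(stub))  # b'quarterly-report'
```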
- Policy-driven Compression and Quality of Service (QoS)
If you want to differentiate the levels of service provided to files and objects stored in your infrastructure, look no further. A simple SQL-like language is used to set up policies that are invoked when needed.
- Hadoop Connector for File and Objects
The IBM Hadoop Connector allows Hadoop and Spark analytics applications to treat Spectrum Scale as a 100 percent compatible alternative to Hadoop File Systems (HDFS). Previously, this was only available for files, but now it has been extended to include objects as well.
- Advanced Graphical User Interface (GUI)
Based on the award-winning GUI that has been used for IBM XIV, SVC, Storwize and various other members of the IBM System Storage family, IBM announces an HTML5-based web-browser GUI for configuring and managing Spectrum Scale and Elastic Storage Server (ESS).
- Storwize V7000 Unified
The "file modules" that run IBM Spectrum Scale will get updated to R1.6 level, which supports SMB 3.0 and NFS 4.1 protocols. SMB support will now include both internal and externally-virtualized storage. You will also be able to use Active File Management to migrate to other Spectrum Scale implementations.
- IBM Spectrum Control
As the former chief architect of IBM Tivoli Storage Productivity Center v1, I have been a big fan of the advancements and evolution of Spectrum Control. IBM offers three levels. The first level is "Basic Edition", entitled at no additional charge for IBM storage hardware clients. The second level is "Standard Edition" which offers configuration, provisioning and performance monitoring. The third level is "Advanced Edition", which includes advanced storage analytics, file-level reporting, storage tiering and data placement optimization.
You can imagine my skepticism when I was told that Spectrum Control was going to be enhanced to support Spectrum Scale. What could it offer? IBM Spectrum Scale already has built-in storage tiering and data placement optimization!
It turns out that effective "management tools" were the #1 thing clients said they needed in order to implement and deploy Spectrum Scale. Since 1998, back when it was called General Parallel File System, or GPFS, the target market was High Performance Computing (HPC) users familiar with the Command Line Interface (CLI).
But IBM wants to broaden the reach of IBM Spectrum Scale to financial services, health care and life sciences, government and education, and a variety of other industries. They won't tolerate being limited to a CLI.
For clients with multiple Spectrum Scale clusters, Spectrum Control can offer the following:
- Visibility across the capacity utilization (file systems, pools, file sets, quotas) and cluster health across all Spectrum Scale clusters in the data center
- Ability to specify alerts which are applied across all Spectrum Scale clusters, for things like relative or absolute free space in a file system, or inodes used, nodes going down, etc.
- Understand the cross-cluster relationships established by remote cluster mounts, and seamlessly navigate between them
- If external SAN storage is used, Spectrum Control shows the correlation between Spectrum Scale Network Shared Disks (NSD) and their corresponding SAN volumes, again with the ability to navigate between them; also it can provide performance monitoring for the volumes backing the NSD
- Ability to monitor file capacity usage in the context of applications, by adding Spectrum Scale "file set containers" to application groups defined in Spectrum Control
- Compare file system activity across Spectrum Scale clusters, with the ability to drill into file system and node performance charts
- Support for object storage on Spectrum Scale, determine which object-enabled clusters are closest to running out of free space
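To make the alerting and "closest to running out of free space" ideas concrete, here is a small sketch. It does not use the Spectrum Control API; the data class, function names and inventory figures are all hypothetical, and it only illustrates applying relative and absolute free-space thresholds across several clusters at once.

```python
from dataclasses import dataclass


@dataclass
class FileSystem:
    cluster: str
    name: str
    capacity_tb: float
    free_tb: float

    @property
    def free_pct(self):
        return 100 * self.free_tb / self.capacity_tb


def free_space_alerts(filesystems, min_free_pct=None, min_free_tb=None):
    """File systems that violate a relative or an absolute free-space threshold."""
    hits = []
    for fs in filesystems:
        low_relative = min_free_pct is not None and fs.free_pct < min_free_pct
        low_absolute = min_free_tb is not None and fs.free_tb < min_free_tb
        if low_relative or low_absolute:
            hits.append(fs)
    return hits


def closest_to_full(filesystems, count=1):
    """File systems with the smallest percentage of free space, worst first."""
    return sorted(filesystems, key=lambda fs: fs.free_pct)[:count]


# Hypothetical inventory spanning three clusters
inventory = [
    FileSystem("cluster-a", "gpfs0", 500, 200),
    FileSystem("cluster-b", "gpfs1", 1000, 50),
    FileSystem("cluster-c", "gpfs2", 100, 8),
]

print([fs.name for fs in free_space_alerts(inventory, min_free_pct=10)])
print([fs.name for fs in free_space_alerts(inventory, min_free_tb=10)])
print(closest_to_full(inventory)[0].name)
```

Note that the relative and absolute thresholds flag different file systems: a large file system can be low on percentage while still having plenty of terabytes free.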
While the basic built-in GUI is great for smaller deployments, if you have a dozen or more Spectrum Scale clusters, or have Spectrum Scale clusters intermixed with traditional block-level and NAS storage devices, then Spectrum Control is for you!
It used to take weeks to deploy the original versions of Tivoli Storage Productivity Center, but Spectrum Control is now offered in the cloud, and you can deploy it in as little as 30 minutes.
Want to check it out? You can explore the Spectrum Control Storage Insights cloud service as a [Live Demo], or [Start your free trial]! The reporting capabilities for Spectrum Scale are identical between the on-premises version of Spectrum Control and this cloud service offering.
Here's a great quote from a leading IT industry analyst:
"In multi-petabyte, multivendor installations, overall storage costs of ownership for use of IBM Spectrum Storage solutions averaged 73 percent less than EMC, and 61 percent less than Hitachi equivalents" -- Brian Jeffery, Managing Director, International Technology Group, Naples, FL
As IBM continues its transition from a hardware-oriented company founded over a century ago, manufacturing meat scales and cheese slicers, to one more focused on higher value-add software and services, the Spectrum Storage software family will play a critical role in this transformation!
technorati tags: IBM, Spectrum Virtualize, data-at-rest, encryption, SVC, Storwize, Storwize V7000, FlashSystem V9000, VersaStack, storage hypervisor, distributed RAID, RAID-5, RAID-6, Spectrum Scale, Elastic Storage Server, OpenStack, OpenStack Swift, Amazon S3, HTTP, Compression, Quality of Service, QoS, Hadoop, Spark, Hadoop Connector, HDFS, GUI, XIV, DCS3700, DCS3860, Spectrum Control, Tivoli Storage Productivity Center, TPC, CLI, NAS, Storage Insights, SoftLayer, IBM Cloud Managed Services
It's Tuesday, and you know what that means? IBM Announcements! This week I am in beautiful Orlando, Florida for the [IBM Systems Technical University] conference.
This week, IBM announced its latest tape offerings for the seventh generation of Linear Tape Open (LTO-7), providing huge gains in performance and capacity.
For capacity, the new LTO-7 cartridges can hold up to 6TB native capacity, or 15TB effective capacity with the 2.5x compression that is typical for most data. That is 2.4x larger than the 2.5TB cartridges available with LTO-6. Performance is also nearly doubled, with a native throughput of 315 MB/sec, or 780 MB/sec effective throughput with 2.5x compression. The LTO consortium, of which IBM is a founding member, has published the roadmap for LTO generations to LTO-8, LTO-9 and LTO-10.
IBM will offer both half-height and full-height LTO-7 tape drives. All the features you love from LTO-6 like WORM, partitioning and encryption carry forward. These drives will be supported on a variety of distributed operating systems, including Linux on z Systems mainframes, and the IBM i platform on POWER Systems.
The Linear Tape File System (LTFS) can be used to treat LTO-7 cartridges in much the same way as Compact Discs or USB memory sticks, allowing one person to create content on an LTO-7 tape cartridge, and pass that cartridge to the next employee, or to another company. LTFS is also the basis for IBM Spectrum Archive that allows tape data to be part of a global namespace with IBM Spectrum Scale.
LTO-7 will be supported on the TS2900 auto-loader, as well as all of IBM's tape libraries: TS3100, TS3200, TS3310, TS3500 and TS4500. You can connect up to 15 TS3500 tape libraries together with shuttle connectors, for a maximum of 2,700 drives serving 300,000 cartridges, and a maximum capacity of 1.8 Exabytes of data in a single system environment.
In addition to LTO-7 support, the IBM TS4500 tape library was also enhanced. You can now grow it up to 18 frames, and have up to 128 drives serving 23,170 cartridges, for a maximum capacity of 139 PB of data. You can now also intermix LTO and 3592 frames in the same TS4500 tape library.
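The capacity figures in this post can be checked with a few lines of arithmetic. This sketch assumes decimal units (1 PB = 1,000 TB) and that the library maximums are quoted at native, uncompressed LTO-7 capacity; the function names are illustrative only.

```python
NATIVE_TB = 6       # LTO-7 native capacity per cartridge
COMPRESSION = 2.5   # compression ratio quoted for typical data
LTO6_TB = 2.5       # LTO-6 native capacity per cartridge


def effective_tb(native_tb=NATIVE_TB, compression=COMPRESSION):
    """Effective cartridge capacity after compression."""
    return native_tb * compression


def library_pb(cartridges, native_tb=NATIVE_TB):
    """Native library capacity in petabytes (1 PB = 1,000 TB)."""
    return cartridges * native_tb / 1000


print(effective_tb())                # 15.0 TB per cartridge
print(NATIVE_TB / LTO6_TB)           # 2.4x larger than LTO-6
print(library_pb(300_000))           # 1800.0 PB, i.e. 1.8 EB across 15 TS3500s
print(round(library_pb(23_170), 2))  # 139.02 PB for a full TS4500
```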
For compatibility, LTO-7 drives can read existing LTO-5 and LTO-6 tape cartridges, and can write to LTO-6 media, to help clients with the transition.
technorati tags: IBM, #ibmtechu, LTO, LTO-7, TS2900, TS2270, TS1070, TS3100, TS3200, TS3500, TS3310, TS4500
This post was originally written as a guest post for VMware for the VMworld 2015 conference. Read the full blog post [IBM Storage and the Beauty and Benefits of VVol]. The following is an excerpt:
Back in 2012, I had mentioned that VMware was cooking up an exciting new feature called VVol, short for VMware vSphere Virtual Volume.
Officially, the VVol concept was still just a "technology preview" in 2012, to be fleshed out over the next few years through extensive collaboration between VMware and all the major players: IBM, HP, Dell, NetApp and EMC.
In 2013 and 2014, IBM attended VMworld with live demonstrations of VVol support. VMware vSphere v6 was not yet available, but when it was, we assured them, IBM would be one of the first vendors with support!
When vSphere v6 was finally made available earlier this year, [only four vendors support VVols on Day 1 of vSphere 6 GA]! Keeping true to its promises, IBM was indeed one of them.
To understand why VVol is such a game-changer, you have to understand a major problem with VMware version 4 and version 5, namely their Virtual Machine File System, or [VMFS].
Here is a picture to help illustrate:
On the left, we see that VMFS datastore is a set of LUNs from the storage admin perspective, and a set of VMDK and related files from the vCenter admin perspective.
If there was a storage-related problem, such as bandwidth performance or latency, how would the two admins communicate to perform troubleshooting? For many disk systems, it is not obvious which VMDK file sits on which LUN.
There are also a variety of hardware capabilities that work at the LUN level, such as snapshots, clones or remote distance mirroring, and this would apply to all the VMDK files in the data store across the set of LUNs, which may not be what you want.
There are two ways to address this in vSphere v4 and v5:
- The first method is to have fewer VMDK files per datastore. By defining smaller datastores with just a few VMs associated with each, you can then have a closer mapping of VMDK files to datastore LUNs. Unfortunately, VMware ESXi has a limit of 256 on the number of different datastores that can be attached, so this method has its own limitations.
- The other method around this is "Raw Device Mapping" (RDM) which allowed Virtual Machines to be attached to specific LUNs. Some of the earlier restrictions and limitations for RDMs have since been relaxed over the releases, but your disk system still needs to expose the SCSI identifiers of each LUN to make this work, and additional setup is required if you plan to cluster two or more systems together, such as for a Microsoft Cluster Server (MSCS).
On the right side of the picture, using VMware v6, vCenter admins can now allocate VVols, which are mapped to specific "VVol Storage Containers" on specific storage systems. The storage admin knows exactly which VVol is in which container, so they can now communicate and collaborate on troubleshooting!
The vSphere ESXi host communicates to storage arrays via a new "virtual LUN id" called a "Protocol Endpoint". This is to allow FCP, iSCSI and FCoE traffic to flow correctly through SAN or LAN switches. For NFS, the Protocol Endpoint represents a "virtual mount point", so that traffic can be routed through LAN switches correctly.
Storage Policies can help determine which attributes or characteristics you want for your VVol. For example, you may want your VVol to be on a storage container that supports snapshots at the hardware level. The vCenter server can learn which storage arrays, and which storage containers in those arrays, offer those capabilities through the VMware API for Storage Awareness, or VASA.
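Conceptually, policy-based placement boils down to matching the capabilities a policy requires against the capabilities each storage container advertises. The sketch below is a toy illustration of that matching step only. It does not use the VASA or vSphere APIs, and the container names and capability strings are hypothetical.

```python
def matching_containers(containers, required):
    """Names of containers whose advertised capabilities cover the policy."""
    return sorted(
        name for name, capabilities in containers.items()
        if required <= capabilities  # subset test: every requirement is offered
    )


# Hypothetical containers and the capabilities they advertise
containers = {
    "container-gold": {"snapshot", "mirroring", "encryption"},
    "container-silver": {"snapshot"},
    "container-bronze": set(),
}

policy = {"snapshot", "encryption"}
print(matching_containers(containers, policy))        # ['container-gold']
print(matching_containers(containers, {"snapshot"}))  # ['container-gold', 'container-silver']
```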
Different storage manufacturers can implement their VASA provider in different ways. IBM has opted to have a single VASA provider for all of its supported devices, so as to provide a consistent client experience. When you purchase any VVol-supported storage system from IBM, you are entitled to download the IBM VASA provider at no additional charge!
Initially, the IBM VASA provider will focus on IBM XIV Storage System, an ideal platform for your VVol needs. The XIV is a grid-based storage system, utilizing unique algorithms that give optimal data placement for every LUN or VVol created, and virtually guarantees there will be no hot spots. The XIV provides an impressive selection of Enterprise-class features, including snapshot, mirroring, thin provisioning, real-time compression, data-at-rest encryption, performance monitoring, multi-tenancy and data migration capabilities.
With the XIV 11.6 firmware level, you can define up to 12,000 VVols across one or more storage containers in a single XIV system. For more details, see IBM Redbook [Enabling VMware Virtual Volumes with IBM XIV Storage System].
Let me give some real world examples from Paul Braren, an IBM XIV and FlashSystem Storage Technical Advisor from Connecticut, who has been working directly with clients over the past five years:
"Many of my customers have clearly said they really want the ability to have a granular snapshot that grabs a moment in time of just one VM, rather than all the VMs that happen to be on the same LUN. They also want to delete VMs, and have the storage array automatically present that newly available space. Even better, with VVol, these SAN related tasks appear to be executed nearly instantly, leaving behind those legacy shared VMFS datastore limitations and overhead.
The same benefits of VVol are evident when cloning or deploying VMs. Imagine being able to create a Windows Server VM with a 400GB thick-provisioned drive in under 20 seconds. Well, you don't have to imagine it! I recorded video of this actually happening over at IBM's European Storage Competence Center, featured in this 8-minute video: [IBM XIV Storage System and VMware vSphere Virtual Volumes (VVol). An ideal combination!]"
-- Paul Braren
In addition to XIV, all of IBM's Spectrum Virtualize products also support VVols, including SAN Volume Controller, Storwize (including the Storwize in VersaStack), and FlashSystem V9000.
I am not in San Francisco this week for VMworld, but lots of my IBM colleagues are, so please, stop by the IBM booth and tell them I sent you!
Every year, March 31 marks "World Backup Day". Sadly, many people forget the importance of backing up their critical information. This is not just true for businesses, non-profit organizations and government agencies, but also for all of your personal information that you keep on computer devices.
My friends over at Cloudwards have developed an awesome infographic related to World Backup Day. Here it is.
(FTC Disclosure: I work for IBM, which has no business relationship with Cloudwards. Cloudwards does not itself provide backup services, but rather reviews services provided by others. This post should not be considered an endorsement of Cloudwards or their reviews.)
Courtesy of: Cloudwards.net
I hope you find this information helpful and informative!