Fellow Blogger BarryB mentions "chunk size" in his post [Blinded by the light
],as it relates to Symmetrix Virtual Provisioning capability. Here is an excerpt:
I mean, seriously, who else but someone who's already implemented thin provisioning would really understand the implications of "chunk" size enough to care?
For those of you who don't know what the heck "chunk size" means (now listen up you folks over at IBM who have yet to implement thin provisioning on your own storage products), a "chunk" is the term used (and I think even trademarked by 3PAR) to refer to the unit of actual storage capacity that is assigned to a thin device when it receives a write to a previously unallocated region of the device.For reference, Hitachi USP-V uses I think a 42MB chunk, XIV NEXTRA is definitely 1MB, and 3PAR uses 16K or 256K (depending upon how you look at it).
Thin Provisioning currently offered in IBM System Storage N serieswas technically "implemented" by NetApp, and that the Thin Provisioning that will be offered in our IBM XIV Nextrasystems will have been acquired from XIV. Lest I remind you that many of EMC's products were developed by other companies first, then later acquired by EMC, so no need for you to throw rocks from your glass houses in Hopkington.
"Thin provisioning" was first introduced by StorageTek in the 1990's and sold by IBM under the name of RAMAC Virtual Array (RVA). An alternative approach is "Dynamic Volume Expansion" (DVE). Rather than giving the host application a huge 2TB LUN but actually only use 50GB for data, DVE was based on the idea that you only give out 50GB they need now, but could expand in place as more space was required. This was specifically designed to avoid the biggest problem with "Thin Provisioning" which back then was called "Net Capacity Load" on the IBM RVA, but today is now referred to as "over-subscription". It gave Storage Administrators greater control over their environment with no surprises.
In the same manner as Thin Provisioning, DVE requires a "chunk size" to work with. Let's take a look:
- DS4000 series
On the DS4000 series, we use the term "segment size", and indicate that the choice of a segment size can have some influence on performance in both IOPS and throughput. Smaller segment sizes increase the request rate (IOPS) by allowing multiple disk drives to respond to multiple requests. Large segment sizes increase the data transfer rate(Mbps) by allowing multiple disk drives to participate in one I/O request. The segment size does not actually change what is stored in cache, just what is stored on the disk itself.It turns out in practice there is no advantage in using smaller sizes with RAID 1; only in a few instances does this help with RAID-5 if you can writea full stripe at once to calculate parity on outgoing data. For most business workloads, 64KB or 128KB are recommended. DVE expands by the same number of segments across all disks in the RAID rank, so for example in a 12+P rank using 128KB segment sizes, the chunk size would be thirteen segments, about 1.6MB in size.
- SAN Volume Controller
On the SAN Volume Controller, we call this "extent size" and allow it to be various values 64MB to 512MB. Initially,IBM only managed four million extents, so this table was used to explain the maximum amount that could be managedby an SVC system (up to 8 nodes) depending on extent size selected.
|Extent Size||Maximum Addressable|
IBM thought that since we externalized "segment size" on the DS4000, we should do the same for the SANVolume Controller. As it turned out, SVC is so fast up in the cache, that we could not measure any noticeable performance difference based on extent size. We did have a few problems. First, clients who chose 16MB andthen grew beyond the 64TB maximum addressable discovered that perhaps they should have chosen something larger.Second, clients called in our help desk to ask what size to choose and how to determine the size that was rightfor them. Third, we allowed people to choose different extent sizes per managed disk group, but that preventsmovement or copies between groups. You can only copy between groups that use the same extent size. The generalrecommendation now is to specify 256MB size, and use that for all managed disk groups across the data center.
The latest SVC expanded maximum addressability to 8PB, still more than most people have today in their shops.
- DS8000 series
Getting smarter each time we introduce new function, we chose 1GB chunks for the DS8000. Based on a mainframebackground, most CKD volumes are 3GB, 9GB, or 27GB in size, and so 1GB chunks simplified this approach. Spreadingthese 1GB chunks across multiple RAID ranks greatly reduced hot-spots that afflict other RAID-based systems.(Rather than fix the problem by re-designing the architecture, EMC will offer to sell you software to help you manually move data around inside the Symmetrix after the hot-spot is identified)
Unlike EMC's virtual positioning, IBM DS8000 dynamic volume expansion does work on CKD volumes for our System z mainframe customers.
The trade-off in each case was between granularity and table space. Smaller chunks allow finer control on the exact amount allocated for a LUN or volume, but larger chunks reduced the number of chunks managed. With our advanced caching algorithms, changes in chunk size did not noticeably impact performance. It is best just to come up with a convenient size, and either configure it as fixed in the architecture, or externalize it as a parameter with a good default value.
Meanwhile, back at EMC, BarryB indicates that they haven't determined the "optimal" chunk size for their newfunction. They plan to run tests and experiments to determine which size offers the best performance, and thenmake that a fixed value configured into the DMX-4. I find this funny coming from the same EMC that won't participate in [standardized SPC benchmarks] because they feel that performance is a personal and private matter between a customer and their trusted storage vendor, that all workloads are different, and you get the idea. Here's another excerpt:
Back at the office, they've taking to calling these "chunks" Thin Device Extents (note the linkage back to EMC's mainframe roots), and the big secret about the actual Extent size is...(wait for it...w.a.i.t...for....it...)...the engineers haven't decided yet!
That's right...being the smart bunch they are, they have implemented Symmetrix Virtual Provisioning in a manner that allows the Extent size to be configured so that they can test the impact on performance and utilization of different sizes with different applications, file systems and databases. Of course, they will choose the optimal setting before the product ships, but until then, there will be a lot of modeling, simulation, and real-world testing to ensure the setting is "optimal."
Finally, BarryB wraps up this section poking fun at the chunk sizes chosen by other disk manufacturers. I don't knowwhy HDS chose 42MB for their chunk size, but it has a great[Hitchiker's Guide to the Galaxy]sound to it, answering the ultimate question to life, the universe and everything. Hitachi probably went to theirDeep Thought computer and asked how big should their "chunk size" be for their USP-V, and the computer said: 42.Makes sense to me.
I have to agree that anything smaller than 1MB is probably too small. Here's the last excerpt:
Now, many customers and analysts I've spoken to have in fact noted that Hitachi's "chunk" size is almost ridiculously large; others have suggested that 3PAR's chunks are so small as to create performance problems (I've seen data that supports that theory, by the way).
Well, here's the thing: the "right" chunk size is extremely dependent upon the internal architecture of the implementation, and the intersection of that ideal with the actual write distribution pattern of the host/application/file system/database.
So my suggestion to EMC is, please, please, please take as much time as you need to come up with the perfect"chunk size" for this, one that handles all workloads across a variety of operating systems and applications, from solid-state Flash drives to 1TB SATA disk. Take months or years, as long as it takes. The rest of the world is in no hurry, as thin provisioning or dynamic volume expansion is readily available on most other disk systems today.
Maybe if you ask HDS nicely, they might let you ask their computer.
technorati tags: IBM, thin provisioning, XIV, Nextra, N series, chunk size, BarryB, EMC, Symmetrix, virtual provisioning, 3PAR, Hitachi, HDS, USP-V, StorageTek, RAMAC Virtual Array, RVA, dynamic volume expansion, DVE, 42MB, Hitchhiker's Guide, CKD, System z, mainframe, SATA, DS8000, DS4000, SAN Volume Controller, SVC
This week, Hitachi Ltd. announced their next generation disk storage virtualization array, the Virtual Storage Platform, following on the success of its USP V line. It didn't take long for fellow blogger Chuck Hollis (EMC) to comment on this in his blog post [Hitachi's New VSP: Separating The Wheat From The Chaff]. Here are some excerpts:
"Well, we all knew that Hitachi (through HDS and HP) would be announcing some sort of refresh to their high-end storage platform sooner or later.
As EMC is Hitachi's only viable competitor in this part of the market, I think people are expecting me to say something.
If you're a high-end storage kind of person, your universe is basically a binary star: EMC and Hitachi orbiting each other, with the interesting occasional sideshow from other vendors trying to claim relevance in this space."
Chuck implies that neither Hewlett-Packard (HP) nor Hitachi Data Systems (HDS) as vendors provide any value-add from the box manufactured by Hitachi Ltd. so combines them into a single category. I suspect the HP and HDS folks might disagree with that opinion.
When I reminded Chuck that IBM was also a major player in the high-end disk space, his response included the following gem:
"Many of us in the storage industry believe that IBM currently does not field a competitive high-end storage platform. IDC market share numbers bear out this assertion, as you probably know."
While Chuck is certainly entitled to his own beliefs and opinions, believing the world is flat does not make it so. Certainly, I doubt IDC or any other market research firm has put out a survey asking "Do you think IBM offers a competitive high-end disk storage platform?" Of course, if Chuck is basing his opinion on anecdotal conversations with existing EMC customers, I can certainly see how he might have formed this misperception. However, IDC market share numbers don't support Chuck's assertion at all.
There is no industry-standard definition of what is a "high-end" or "enterprise-class" disk system. Some define high-end as having the option for mainframe attachment via ESCON and/or FICON protocol. Others might focus on features, functionality, scalability and high 99.999+ percent availability. Others insist high-end requires block-oriented protocols like FC and iSCSI, rather than file-based protocols like NAS and CIFS.
For the most demanding mission-critical mix of random and sequential workloads, IBM offers the [IBM System Storage DS8000 series] high-end disk system which connects to mainframes and distributed servers, via FCP and FICON attachment, and supports a variety of drive types and RAID levels. The features that HP and HDS are touting today for the VSP are already available on the IBM DS8000, including sub-LUN automatic tiering between Solid-State drives and spinning disk, called [Easy Tier], thin provisioning, wide striping, point-in-time copies, and long distance synchronous and asynchronous replication.
There are lots of analysts that track market share for the IT storage industry, but since Chuck mentions [IDC] specifically, I reviewed the most recent IDC data, published a few weeks ago in their "IDC Worldwide Quarter Disk Storage Tracker" for 2Q 2010, representing April 1 to June 30, 2010 sales. Just in case any of the rankings have changed over time, I also looked at the previous four quarters: 2Q 2009, 3Q 2009, 4Q 2009 and 1Q 2010.
(Note: IDC considers its analysis proprietary, out of respect for their business model I will not publish any of the actual facts and figures they have collected. If you would like to get any of the IDC data to form your own opinion, contact them directly.)
In the case of IDC, they divide the disk systems into three storage classes: entry-level, midrange and high-end. Their definition of "high-end" is external RAID-protected disk storage that sells for $250,000 USD or more, representing roughly 25 to 30 percent of the external disk storage market overall. Here are IDC's rankings of the four major players for high-end disk systems:
By either measure of market share, units (disk systems) or revenue (US dollars), IDC reports that IBM high-end disk outsold both HDS and HP combined. This has been true for the past five quarters. If a smaller start-up vendor has single digit percent market share, I could accept it being counted as part of Chuck's "occasional sideshow from other vendors trying to claim relevance", but IBM high-end disk has consistently had 20 to 30 percent market share over the past five quarters!
Not all of these high-end disk systems are connected to mainframes. According to IDC data, only about 15 to 25 percent of these boxes are counted under their "Mainframe" topology.
Chuck further writes:
"It's reasonable to expect IBM to sell a respectable amount of storage with their mainframes using a protocol of their own design -- although IBM's two competitors in this rather proprietary space (notably EMC and Hitachi) sell more together than does IBM."
The IDC data doesn't support that claim either, Chuck. By either measure of market share, units (disk systems) or revenue (US dollars), IDC reports that IBM disk for mainframes outsold all other vendors (including EMC, HDS, and HP) combined. And again, this has been true for the past five quarters. Here is the IDC ranking for mainframe disk storage:
IBM has over 50 percent market share in this case, primarily because IBM System Storage DS8000 is the industry leader in mainframe-related features and functions, and offers synergy with the rest of the z/Architecture stack.
So Chuck, I am not picking a fight with you or asking you to retract or correct your blog post. Your main theme, that the new VSP presents serious competition to EMC's VMAX high-end disk arrays, is certainly something I can agree with. Congratulations to HDS and HP for putting forth what looks like a viable alternative to EMC's VMAX.
To learn more about IBM's upcoming products, register for next week's webcast "Taming the Information Explosion with IBM Storage" featuring Dan Galvan, IBM Vice President, and Steve Duplessie, Senior Analyst and Founder of Enterprise Storage Group (ESG).
technorati tags: IBM, DS8000, EMC, Chuck Hollis, Hitachi, HDS, Virtual Storage Platform, VSP, USP-V, HP, P9500, Easy Tier, high-end, enterprise-class, IDC, marketshare
Continuing coverage of my week in Washington DC for the annual [2010 System Storage Technical University], I attended several XIV sessions throughout the week. There were many XIV sessions. I could not attend all of them. Jack Arnold, one of my colleagues at the IBM Tucson Executive Briefing Center, often presents XIV to clients and Business Partners. He covered all the basics of XIV architecture, configuration, and features like snapshots and migration. Carlos Lizarralde presented "Solving VMware Challenges with XIV". Ola Mayer presented "XIV Active Data Migration and Disaster Recovery".
Here is my quick recap of two in particular that I attended:
- XIV Client Success Stories - Randy Arseneau
Randy reported that IBM had its best quarter ever for the XIV, reflecting an unexpected surge shortly after my blog post debunking the DDF myth last April. He presented successful case studies of client deployments. Many followed a familiar pattern. First, the client would only purchase one or two XIV units. Second, the client would beat the crap out of them, putting all kinds of stress from different workloads. Third, the client would discover that the XIV is really as amazing as IBM and IBM Business Partners have told them. Finally, in the fourth phase, the client would deploy the XIV for mission-critical production applications.
- A large US bank holding company managed to get 5.3 GB/sec from a pair of XIV boxes for their analytics environment. They now have 14 XIV boxes deployed in mission-critical applications.
- A large equipment manufacturer compared the offerings among seven different storage vendors, and IBM XIV came out the winner. They now have 11 XIV boxes in production and another four boxes for development/test. They have moved their entire VMware infrastructure to IBM XIV, running over 12,000 guest instances.
- A financial services company bought their first XIV in early 2009 and now has 34 XIV units in production attached to a variety of Windows, Solaris, AIX, Linux servers and VMware hosts. Their entire Microsoft Exchange was moved from HP and EMC disk to IBM XIV, and experienced noticeable performance improvement.
- When a University health system replaced two competitive disk systems with XIV, their data center temperature dropped from 74 to 68 degrees Fahrenheit. In general, XIV systems are 20 to 30 percent more energy efficient per usable TB than traditional disk systems.
- A service provider that had used EMC disk systems for over 10 years evaluated the IBM XIV versus upgrading to EMC V-Max. The three year total cost of ownership (TCO) of EMC's V-Max was $7 Million US dollars higher, so EMC counter-proposed CLARiiON CX4 instead. But, in the end, IBM XIV proved to be the better fit, and now the customer is happy having made the switch.
- The manager of an information communications technology service provider was impressed that the XIV was up and running in just a couple of days. They now have over two dozen XIV systems.
- Another XIV client had lost all of their Computer Room Air Conditioning (CRAC) units for several hours. The data center heated up to 126 degrees Fahrenheit, but the customer did not lose any data on either of their two XIV boxes, which continued to run in these extreme conditions.
- Optimizing XIV Performance - Brian Cormody
This session was an update from the [one presented last year] by Izhar Sharon. Brian presented various best practices for optimizing the performance when using specific application workloads with IBM XIV disk systems.
- Oracle ASM: Many people allocate lots of small LUNs, because this made sense a long time ago when all you had was just a bunch of disks (JBOD). In fact, many of the practices that DBAs use to configure databases across disks become unnecessary with XIV. Wth XIV, you are better off allocating a few number of very large LUNs from the XIV. The best option was a 1-volume ASM pool with 8MB AU stripe. A single LUN can contain multiple Oracle databases. A single LUN can be used to store all of the logs.
- VMware: Over 70 percent of XIV customers use it with VMware. For VMFS, IBM recommends allocating a few number of large LUNs. You can specify the maximum of 2181 GB. Do not use VMware's internal LUN extension capability, as IBM XIV already has thin provisioning and works better to allow XIV to do this for you. XIV Snapshots provide crash-consistent copies without all the VMware overhead of VMware Snapshots.
- SAP: For planning purposes, the "SAPS" unit equates roughly to 0.4 IOPS for ERP OLTP workloads, and 0.6 IOPS for BW/BI OLAP workloads. In general, an XIV can deliver 25-30,000 IOPS at 10-15 msec response time, and 60,000 IOPS at 30 msec response time. With SAP, our clients have managed to get 60,000 IOPS at less than 15 msec.
- Microsoft Exchange: Even my friends in Redmond could not believe how awesome XIV was during ESRP testing. Five Exchange 2010 servers connected two a pair of XIV boxes using the new 2TB drawers managed 40,000 mailboxes at the high profile (0.15 IOPS per mailbox). Another client found four XIV boxes (720 drives) was able to handle 60,000 mailboxes (5GB max), which would have taken over 4000 drives if internal disk drives were used instead. Who said SANs are obsolete for MS Exchange?
- Asynchronous Replication: IBM now has an "Async Calculator" to model and help design an XIV async replication solution. In general, dark fiber works best, and MPLS clouds had the worst results. The latest 10.2.2 microcode for the IBM XIV can now handle 10 Mbps at less than 250 msec roundtrip. During the initial sync between locations, IBM recommends setting the "schedule=never" to consume as much bandwidth as possible. If you don't trust the bandwidth measurements your telco provider is reporting, consider testing the bandwidth yourself with [iPerf] open source tool.
Several members of the XIV team thanked me for my April 5th post [Double Drive Failure Debunked: XIV Two Years Later]. Since April 5th, IBM has sold more XIV units this quarter than any prior quarters. I am glad to have helped!
technorati tags: IBM, Technical University, XIV, HP, EMC, CLARiiON, VMAX, TCO, CRAC, JBOD, SAP, Oracle, ASM, Microsoft Exchange, ESRP
This week, I am in beautiful Sao Paulo, Brazil, teaching Top Gun class to IBM Business Partners and sales reps. Traditionally, we have "Tape Thursday" where we focus on our tape systems, from tape drives, to physical and virtual tape libraries. IBM is the number #1 tape vendor, and has been for the past eight years.
(The alliteration doesn't translate well here in Brazil. The Portuguese word for tape is "fita", and Thursday here is "quinta-feira", but "fita-quinta-feira" just doesn't have the same ring to it.)
In the class, we discussed how to handle common misperceptions and myths about tape. Here are a few examples:
- Myth 1: Tape processing is manually intensive
In my July 2007 blog post [Times a Million], I coined the phrase "Laptop Mentality" to describe the problem most people have dealing with data center decisions. Many folks extend linearly their experiences using their PCs, workstations or laptops to apply to the data center, unable to comprehend large numbers or solutions that take advantage of the economies of scale.
For many, the only experience dealing with tape was manual. In the 1980s, we made "mix tapes" on little cassettes, and in the 1990s we recorded our favorite television shows on VHS tapes in the VCR. Today, we have playlists on flash or disk-based music players, and record TV shows on disk-based video recorders like Tivo. The conclusion is that tapes are manual, and disk are not.
Manual processing of tapes ended in 1987, with the introduction of a silo-like tape library from StorageTek. IBM quickly responded with its own IBM 3495 Tape Library Data Server in 1992. Today, clients have many tape automation choices, from the smallest IBM TS2900 Tape Autoloader that has one drive and nine cartridges, all the way to the largest IBM TS3500 multiple-library shuttle complex that can hold exabytes of data. These tape automation systems eliminate most of the manual handling of cartridges in day-to-day operations.
- Myth 2: Tape media is less reliable than disk media
For any storage media to be unreliable is to return the wrong information that is different than what was originally stored. There are only two ways for this to happen: if you write a "zero" but read back a "one", or write a "one" and read a "zero". This is called a bit error. Every storage media has a "bit error rate" that is the average likelihood for some large amount of data written.
According to the latest [LTO Bit Error rates, 2012 March], today's tape expects only 1 bit error per 10E17 bits written (about 100 Petabytes). This is 10 times more reliable than Enterprise SAS disk (1 bit per 10E16), and 100 times more reliable than Enterprise-class SATA disk (1 bit per 10E15).
Tape is the media used in "black boxes" for airplanes. When an airplane crashes, the black box is retrieved and used to investigate the causes of the crash. In 1986, the Space Shuttle Challenger exploded 73 seconds after take-off. The tapes in the black box sat on the ocean floor for six weeks before being recovered. Amazingly, IBM was able to successfully restore [90 percent of the block data, and 100 percent of voice data].
- Myth 3: Most tape restores fail
Why do people still believe that most tape restores fail? Curtis Preston, on his Backup Central blog, has a great post [Gartner Never Said 71 percent of Tape Restores Fail].
Analysts are quite upset when they are quoted out of context, but in this case, Gartner never said anything closely similar to this. Nor did the other analysts that Curtis investigated for similar claims. What Garnter did say was that disk provides an attractive alternative storage media for backup which can increase the performance of the recovery process.
Back in the 1990s, Savur Rao and I developed a patent to help backup DB2 for z/OS by using the FlashCopy feature of IBM's high-end disk system. The software method to coordinate the FlashCopy snapshots with the database application and maintain multiple versions was implemented in the DFSMShsm component of DFSMS. A few years later, this was part of a set of patents IBM cross-licensed to Microsoft for them to implement a similar software for Windows called Data Protection Manager (DPM). IBM has since introduced its own version for distributed systems called IBM Tivoli FlashCopy Manager that runs not just on Windows, but also AIX, Linux, HP-UX and Solaris operating systems.
Curtis suspects the "71 percent" citation may have been propogated by an ambitious product manager of Microsoft's Data Protection Manager, back in 2006, perhaps to help drive up business to their new disk-based backup product. Certainly, Microsoft was not the only vendor to disparage tape in this manner.
A few years ago, an [EMC failure brought down the State of Virginia] due to not just a component failure it its production disk system, but then made it worse by failing to recover from the disk-based remote mirror copy. Fortunately, the data was able to be restored from tape over the next four days. If you wonder why nobody at EMC says "Tape is Dead" anymore, perhaps it is because tape saved their butts that week.
(FTC Disclosure: I work for IBM and this post can be considered a paid, celebrity endorsement for all of the IBM tape and software products mentioned on this post. I own shares of stock in both IBM and Google, and use Google's Gmail for my personal email, as well as many other Google services. While IBM, Google and Microsoft can be considered competitors to each other in some areas, IBM has working relationships with both companies on various projects. References in this post to other companies like EMC are merely to provide illustrative examples only, based on publicly available information. IBM is part of the Linear Tape Open (LTO) consortium.)
Last year, Google lost the email data for half a million Gmail accounts due to a software error. Once again, tape came to the rescue, with [Google restoring lost Gmail data from tape backups].
- Myth 4: Vendors and Manufacturers are no longer investing in tape technology
IBM and others are still investing Research and Development (R&D) dollars to improve tape technology. What people don't realize is that much of the R&D spent on magnetic media can be applied across both disk and tape, such as IBM's development of the Giant Magnetoresistance read/write head, or [GMR] for short.
Most recently, IBM made another major advancement with tape with the introduction of the Linear Tape File Systems (LTFS). This allows greater portability to share data between users, and between companies, but treating tape cartridges much like USB memory sticks or pen drives. You can read more in my post [IBM and Fox win an Emmy for LTFS technology]!
Next month, IBM celebrates the 60th anniversary for tape. It is good to see that tape continues to be a vibrant part of the IT industry, and to IBM's storage business!
technorati tags: IBM, Google, Microsoft, EMC, Brazil, LTO, TS2900, TS3500, Space Shuttle, Challenger
Well it's Tuesday again, and you know what that means... IBM announcements! Yesterday, at the IBM Edge conference here in Orlando, Florida, IBM announced its new apporach to storage, and a whole bunch of storage products, enhancements, and services. I will focus on some key ones here, and save the rest for next week.
- IBM SAN Volume Controller (SVC) v6.4
The SVC is IBM's enterprise-class storage hypervisor. The latest software release, v6.4, can be installed on any SVC hardware, from the 2145-8F2 introduced back in 2005, to newer models like the 2145-CG8. Here are the key features:
- Fibre Channel over Ethernet (FCoE) -- This is complete end-to-end support. For SVC units with 10GbE ports, these ports can be now be used for FCoE. This allows hosts to attach to SVC via FCoE, allows SVC node-to-node communication for clustering, and allows SVC to communicate to back-end devices via FCoE.
- Real-Time Compression -- IBM ported over the patent Random Access Compression Engine (RACE) from the Real-Time Compression Appliances to SVC v6.4. This allows primary data, accessed via block-based protocols, to be compressed up to 80 percent. This feature is an extra priced feature by TB.
- Non-Disruptive Volume move between I/O Groups -- If you don't already have SVC, you don't need to worry about this. For existing SVC customers, this allows volumes to be associated with two or more I/O groups, and that you can add or remove I/O groups non-disruptively. For example, if you want to move a volume from IOG1 to IOG2, then you add IOG2 to the list of I/O groups for the volume, let the multi-pathing software discover the additional paths, the remove IOG1, which then marks the previous IOG1 paths inactive. All this can be done while applications read and write data.
- Dedicate FCP ports for Replication -- If you activate the two 10GbE Ethernet ports for FCoE, you can free up two FCP ports that you can dedicate for long-distance Metro Mirror or Global Mirror.
If you have SVC today, but are running an old release like v4.3 or v5.1, I recommennd you upgrade up to at least v6.2.05 release now. This release has been out for a year and is very stable, and serves as a great platform for a later upgrade to SVC v6.4.
- IBM Storwize V7000 v6.4
The Storwize V7000 is IBM's midrange storage hypervisor. The latest software release, v6.4, can be installed on existing block-only Storwize V7000 units in the field. The Storwize V7000 v6.4 gets all the features listed above, as well as the following:
- Four-way clustering -- Previously, you could cluster two Storwize V7000 controller enclosures together (4 canisters total). To cluster three or four controllers required an RPQ. Now, IBM supports up to four Storwize V7000 controller enclosures (8 canisters) without an RPQ.
- Direct Fibre Channel attach -- A lot of people are using Storwize V7000 inside single-rack configurations, so it makes sense not to require a SAN switch for just a few Windows, Linux or VMware servers. An RPQ is now available to allow this to happen.
- IBM Tivoli Storage Productivity Center (TPC) v5.1
TPC is already ranked one of the best Storage Infrastructure Management software in the market, and this release will just solidify its lead. Key features include:
- Upward integration to higher level management systems
- A new, intuitive, easy-to-use web-based GUI inspired by the XIV GUI
- Integration of COGNOS to be able to generate and customize reports
- Support for SONAS systems
There are several presentations on TPC this week that will go into more detail. Check out the [TPC Facebook page].
- My latest book Inside System Storage: Volume IV is now available!
Yes, can you believe it? I have published my fourth volume in my "Inside System Storage" series! It is available in three formats:
- Hardcover with dust jacket
- eBook (Adobe Acrobat PDF)
You can order this, and all my other books, in all formats, directly from my [Author Spotlight] page. The paperback will also be available soon from other online booksellers, search for ISBN 978-1-105-72213-4.
- IBM DS3500 Express
The DS3500 is our entry-level block-based device, designed specifically for random I/O workloads. This includes databases, email repositories, traditional business applications, and on-line transactional workloads. Here are the new features:
- Dynamic Disk Pooling, similar to what XIV does to reduce disk rebuild times, but using a RAID-6 like approach per chunk of data.
- Thin Provisioning using Dynamic Disk Pooling
- Asynchronous Logical Unit Access (ALUA) failover
- Enhanced FlashCopy, improved scalability, consistency groups and rollback support
- VMware API for Array Integration (VAAI) support. This includes Write Same, Extended Copy, and Atomic Test & Set.
The DS3500 replaces the previous models of DS3200, DS3300 and DS3400 models.
- IBM DCS3700
The DCS3700 is our entry-level/midrange block-based device, replacing the DCS9900 model, designed specifically for sequential I/O workloads. This includes Big Data analytics, Hadoop, High Performance Computing (HPC), video surveillance, and television broadcasting. It holds 60 drives in a 4U controller enclosure.
For more on any of these announcements, see the [June 4th Announcement Page], or follow the Twitter tag #transformITnow.
technorati tags: IBM, SVC, Storwize V7000, Tivoli Storage, Productivity Center, TPC, DS3500, DCS37000
I'm down here in Australia, where the government is a bit stalled for the past two weeks at the moment, known formally as being managed by the [Caretaker government]. Apparently, there is a gap between the outgoing administration and the incoming administration, and the caretaker government is doing as little as possible until the new regime takes over. They are still counting votes, including in some cases dummy ballots known as "donkey votes", the Australian version of the hanging chad. Three independent parties are also trying to decide which major party they will support to finalize the process.
While we are on the topic of a government stalled, I feel bad for the state of Virginia in the United States. Apparently, one of their supposedly high-end enterprise class EMC Symmetrix DMX storage systems, supporting 26 different state agencies in Virginia, crashed on August 25th and now more than a week later, many of those agencies are still down, including the Department of Motor Vehicles and the Department of Taxation and Revenue.
Many of the articles in the press on this event have focused on what this means for the reputation of EMC. Not surprisingly, EMC says that this failure is unprecedented, but really this is just one in a long series of failures from EMC. It reminds me of the last time EMC had a public failure with a dual-controller CLARiiON a few months ago that stopped another company from their operations. There is nothing unique in the physical equipment itself, all IT gear can break or be taken down by some outside force, such as a natural disaster. The real question, though, is why haven’t EMC and the State Government been able to restore operations many days after the hardware was fixed?
In the Boston Globe, Zeus Kerravala, a data storage analyst at Yankee Group in Boston, is quoted as saying that such a high-profile breakdown could undermine EMC’s credibility with large businesses and government agencies. “I think it’s extremely important for them,’’ said Kerravala. “When you see a failure of this magnitude, and their inability to get a customer like the state of Virginia up and running almost immediately, all companies ought to look at that and raise their eyebrows.’’
Was the backup and disaster recovery solution capable of the scale and service level requirements needed by vital state
agencies? Had they tested their backups to ensure they were running correctly, and had they tested their recovery plans? Were they monitoring the success of recent backup operations?
Eventually, the systems will be back up and running, fines and penalties will be paid, and perhaps the guy who chose to go with EMC might feel bad enough to give back that new set of golf clubs, or whatever ridiculously expensive gift EMC reps might offer to government officials these days to influence the purchase decision making process.
(Note: I am not accusing any government employee in particular working at the state of Virginia of any wrongdoing, and mention this only as a possibility of what might have happened. I am sure the media will dig into that possibility soon enough during their investigations, so no sense in me discussing that process any further.)
So what lessons can we learn from this?
- Lesson 1: You don't just buy technology, you also are choosing to work with a particular vendor
IBM stands behind its products. Choosing a product strictly on its speeds and feeds misses the point. A study IBM and Mercer Consulting Group conducted back in 2007 found that only 20 percent of the purchase decision for storage was from the technical capabilities. The other 80 percent were called "wrapper attributes", such as who the vendor was, their reputation, the service, support and warranty options.
- Lesson 2: Losing a single disk system is a disaster, so disaster recovery plans should apply
IBM has a strong Business Continuity and Recovery Services (BCRS) services group to help companies and government agencies develop their BC/DR plans. In the planning process, various possible incidents are identified, recovery point objectives (RPO) and recovery time objectives (RTO) and then appropriate action plans are documentede on how to deal with them. For example, if the state of Virginia had an RPO of 48 hours, and an RTO of 5 days, then when the failure occurred on August 25, they could have recovered up to August 23 level data(48 hours prior to the incident) and be up and running by August 30 (five days after the incident). I don't personally know what RPO and RTO they planned for, but certainly it seems like they missed it by now already.
- Lesson 3: BC/DR Plans only work if you practice them often enough
Sadly, many companies and government agencies make plans, but never practice them, so they have no idea if the plans will work as expected, or if they are fundamentally flawed. Just as we often have fire drills that force everyone to stop what they are doing and vacate the office building, anyone with an IT department needs to practice BC/DR plans often enough so that you can ensure the plan itself is solid, but also so that the people involved know what to do and their respective roles in the recovery process.
- Lesson 4: This can serve as a wake-up call to consider Cloud Computing as an alternative option
Are you still doing IT in your own organization? Do you feel all of the IT staff have been adequately trained for the job? If your biggest disk system completely failed, not just a minor single or double drive failure, but a huge EMC-like failure, would your IT department know how to recover in less than five days? Perhaps this will serve as a wake-up call to consider alternative IT delivery options. The advantage of big Cloud Service Providers (Microsoft, Google, Yahoo, Amazon, SalesForce.com and of course, IBM) is that they are big enough to have worked out all the BC/DR procedures, and have enough resources to switch over to in case any individual disk system fails.
To learn more on this event, see the following articles: Washington
technorati tags: EMC, Symmetrix, DMX, Failure, Virginia, State Government, DMV, IBM, Business Continuity, Disaster Recovery
This week I got a comment on my blog post [IBM Announces another SSD Disk offering!]. The exchange involved Solid State Disk storage inside the BladeCenter and System x server line. Sandeep offered his amazing performance results, but we have no way to get in contact with him. So, for those interested, I have posted on SlideShare.net a quick five-chart presentation on recent tests with various SSD offerings on the eX5 product line here:
Sandeep, if you see this, we would also be interested in seeing your results as well.
technorati tags: , IBM, BladeCenter, eX5, server, solid state disk, SSD, PCIe
This week I was aboard the Queen Mary in Long Beach, California! This was a business event organized by [Key Info Systems], a valued IBM Business Partner. Key Info resells IBM servers, storage and switches.
The Queen Mary retired in 1967, and has been converted into a hotel and events venue. The locals just parked their car and walked on board, but I got to stay Tuesday through Thursday in one of the cabins. It was long and narrow, with round windows! There were four dials for the bathtub: Cold Salt, Hot Fresh, Cold Fresh, and Hot Salt.
Stepping on the boat was like walking back in time through history! If you decide to go see it, check out the [Art Deco bar at the front of the Promenade deck. The ship is still in the water, but is permanently docked. It is sectioned off to prevent the ocean waves from affecting it, so we did not have the nauseous moving back and forth normally associated with cruise ships.
(It is with a bit of irony that we are on the Queen Mary just days after the tragedy of the [Costa Concordia], the largest Italian cruise ship that ran aground near Isola de Giglio. The captain will have to explain how he [fell into a lifeboat] before he had a chance to wait for everyone else to get safely off the shipwreck. He was certainly no [Captain Sulley]! I am thankful that most of the 4,200 people survived the incident.)
- Executive Welcome
Lief Morin, Founder and Chief Executive for Key Info Systems, kicked off the meeting with highlights of 2011 successes. I have known Lief for years, as Key Info comes to the Tucson EBC on a frequent basis. This event was designed to give his sellers an update of what is the latest for each product line, and what to look forward to in the next 12-18 months.
- Power Systems
My colleague Pat O'Rourke from Austin EBC presented [IBM Power Systems], from the smallest [POWER-based blade servers for BladeCenter] to the largest [Power 795 server]. When it comes to UNIX servers, IBM is [kickin' butt and takin' names], leaving HP and Oracle/Sun to fend for scraps.
- Vision Solutions
The next speaker was from Vision Solutions that provides High Availability solutions for IBM i on Power Systems. In 2010, their company nearly doubled in size with the acquisition of Double-Take, which provides data replication for x86 servers running Windows, Linux, VMware, Hyper-V and other hypervisors. The capabilities of Double-Take sounded similar to what IBM offers with [Tivoli Storage Manager FastBack] and [Tivoli Storage Manager for Virtual Environments].
- Dinner at Sir Winston's
Rather than take the "Ghosts and Legends" tour, I opted for dinner at the Queen Mary's signature restaurant, Sir Winston's. This is a fancy place, so dress accordingly. If you want the Raspberry soufflé, order it early as it takes 30 minutes to prepare!
- System Storage
I presented on a variety of storage topics:
Storage is an important part of the Key Info Systems revenue stream, so I was glad to have lots of questions and interactions from the audience.
- Murder Mystery Dinner
The acting troupe from [Dinner Detective] put on quite the show for us! With all that is going on in the world, it is good to laugh out loud every now and then.
In other murder mystery dinners I have participated in, each person is assigned a "character" and given a script of what to say and when to say it. This was different, we got to pick our own characters. I chose "Doctor Watson", from the Sherlock Holmes series. Several attendees thought it was a double meaning with [IBM Watson], the computer that figured out the clues on Jeopardy! television game show, and has since been [put to work at Wellpoint] to help out the Healthcare industry.
After the "murder" happened, two actors portraying policemen selected members of the audience to answer questions. We didn't get a script of what to say, so everyone had to "ad lib". I was singled out as a suspect, and had fun playing along in character. One of the attendees afterwards said he was impressed that I was able to fabricate such amusing and elaborate responses to their personal and embarassing questions. As a public speaker for IBM, I have had a lot of practice thinking quickly on my feet.
- Fibre Channel and Ethernet Switches
The next two speakers gave us an update on Fibre Channel and Ethernet switches, and their thoughts on the inevitability of Fibre Channel over Ethernet (FCoE). One of the exciting new developments is the [Brocade Network Subscription] which creates a flexible pay-per-use Ethernet port rental model for customers. This is especially timely given the Financial Accounting Standards Board proposed [FASB Change 13] that affects operating leases in the balance sheet.
With the Brocade Network Subscription, you pay monthly for the ports you are using. Need more ports, Brocade will install the added gear. Use fewer ports, Brocade will take the equipment back. There is no term endpoint or residual value like tradtional leasing, so when you are done using the equipment, give it back any time. This is ideal for companies that may need to have a lot of Ethernet ports for the next 2-3 years, but then plan to taper down, and don't want to get stuck with a long-term commitment or capital depreciation.
The last speaker was from VMware. IBM is the #1 reseller of VMware, and VMware commands an impressive 81 percent marketshare in the x86 virtualization space. The speaker presented VMware's strategy going forward, which aligns well with IBM's own strategy, to help companies Cloud-enable their existing IT infrastructures, in preparation for eventual moves to Hybrid or Public cloud deployments.
Special thanks to Lief Morin for sponsoring this event, Raquel Hernandez from IBM for coordinating my travel, and Pete, Christina and Kendrell from Key Info Systems for organizing the activities!
technorati tags: IBM, Queen Mary, Key Info, Art Deco, Costa Concordia, Lief Morin, Pat O'Rourke, Power 795, DS8000, XIV, SONAS, Tape, TS1140, LTFS, Storwize V7000, Unified storage, FCoE, BNS, VMware, Cloud Computing
Well, it's Tuesday, and you all know what that means... IBM announcements!
This week, IBM announced the IBM Tivoli Storage Productivity Center for Disk Midrange Edition, affectionately referred to as "MRE". This is basically TPC for Disk but with two key differences:
- A special license that covers only DS3000, DS4000, DS5000 series, whether natively attached or virtualized behind SAN Volume Controller.
- A new pricing model based on the number on controllers and drawers, rather than by TB managed. For example, if you have a DS5300 and two expansion drawers, then you pay for three units of MRE. As you upgrade from smaller capacity disks to larger capacity disks, your license costs won't increase. This eliminates the quarterly hassle to "true up" your software licenses to match actual capacity that is required on TB-based licensing.
This includes the [new DS3500 model] that was announced last month. This was part of the set of solutions to [help midsized businesses].
A fresh new blogger on the scene, Anthony Vandewerdt (IBM), covers [10 things I like about the IBM DS3500], saving me the trouble.
For more information on Tivoli Storage Productivity Center for Disk Midrange Edition, see the IBM [Announcement Letter].
technorati tags: IBM, Tivoli, Productivity Center, Midrange+Edition, MRE, TPC, DS3000, DS3500, DS4000, DS5000, Anthony Vandewerdt
Back in Februray, my blog post [A Box Full of Floppies] mentioned that I uncovered some diskettes compressed with OS/2 Stacker. Jokingly, I suggested that I may have to stand up an OS/2 machine just to check out what is actually on those floppies. Each floppy contains only three files: README.STC, STACKER.EXE and a hidden STACKVOL.DSK file. The README.STC explains that the disk is compressed by Stacker, a program developed by [Stac Electronics, Inc.]. The STACKER.EXE would not run on Windows XP, Vista or Windows 7. The STACKVOL.DSK is just a huge binary file, like a ZIP file, compressed with [Lempel-Ziv-Stac] algorithm that combines Lempel-Ziv with Huffman coding.
In my follow-up post [Like Sands in an Hourglass], I explained how there are many ways I could have tackled this project. I could either use the Emulation approach and try to build an OS/2 guest image under a hypervisor like VMware, KVM or VirtualBox, or just take the Museum approach and try taking one of my half dozen old machines, wipe it clean and stand up OS/2 on it bare metal. This turned out to be more challenging than I expected. The systems I have that are modern and powerful enough to run hypervisors don't have floppy drives, so I opted for the Museum approach.
(A quick [history of OS/2] might be helpful. IBM and Microsoft jointly developed OS/2 back in 1985. By 1990, Microsoft decided it's own Windows operating system was more popular with the ladies, and decided to break off with IBM. In 1992, IBM release OS/2 version 2.0, touted as "a better DOS than DOS and a better Windows than Windows!" Both parties maintained ownership rights, Microsoft renamed OS/2 to Windows NT. The "NT" stood for New Technology, the basis for all of the enterprise-class Windows servers used today. IBM named its version of OS/2 version 3 and 4 "WARP", with the last version 4.52 released in 2001. In its heyday, OS/2 ran the majority of Automated Teller Machines (ATMs), was used for hardware management consoles (HMC), and was used worldwide to run various Railway systems. After 2001, IBM encouraged people to transition from Windows or OS/2 over to Java and Linux. For those that can't or won't leave OS/2, IBM partnered with Serenity Systems to continue OS/2 under the brand [eComStation].)
Working with an IBM [ThinkCentre 8195-E2U Pentium 4 machine] with 640MB RAM and 80GB hard disk, a CD-rom and one 3.5-inch floppy drive, I first discovered that OS/2 is limited to very small amounts of hard disk. There are limits on [file systems and partition sizes] as well as the infamous [1024-cylinder limit] for bootable operating systems. Having a completely empty drive didn't work, as the size of the disk was too big. Carving out a big partition out of this also failed, as it exceeded the various limits. Each time, it felt the partition table was corrupted because the values were so huge. Even modern Disk Partitioning tools ([SysRescueCD] or [PartedMagic]) didn't work, as these create partitions not recognizable to OS/2.
The next obstacle I knew I would encounter would be device drivers. OS/2 comes as a set of three floppy diskettes and a CD-rom. The bootable installation disk was referred to affectionately as "Disk 0", then Disk 1, then Disk 2. Once all drivers have been loaded into memory, then it can start looking at the CDrom, and continue with the installation. In searching for updated drivers, I came across [Updated OS/2 Warp 4 Installation Diskettes] to address problems with newer display monitors. It also addresses the 8.4GB volume limit.
The updates were in the form of EXE files that only execute in a running DOS or OS/2 environment, expanded onto a floppy diskette. It seemed like [Catch-22], I need a working DOS or OS/2 system to run the update programs to create the diskettes, but need the diskettes to build a working system.
To get around this, I decided to take a "scaffolding" approach. Using DOS 6 bootable floppy, I was able to re-partition the drive with FDISK into two small 1.9GB partitions. I have the full five-floppy IBM DOS 6 set, I hid the first partition for OS/2, and install the DOS 6 GUI on the second partition. I went ahead and added a few new subdirectories: BOOT to hold Grub2, PERSONAL to hold the data I decompress from the floppies, and UTILS to hold additional utilities. This little DOS system worked, and I now have new OS/2 "Disk 1" and "Disk 2" for the installation process.
(If you don't have a full set of DOS installation diskettes, you can make due with "FORMAT C: /S" from a [DOS boot disk], and then just copy over all the files from the boot disk to your C: drive. You won't have a nice DOS GUI, but the command line prompt will be enough to proceed.)
Like DOS, OS/2 expects to be installed on the C: drive. I hid the second partition (DOS), and marked the first partition installable and bootable. The OS/2 installation involves a lot of reboots, and the hard drive is not natively bootable in the intermediate stages. This means having to boot from Disk 0, then putting in Disk 1, then disk 2, before continuing the next phase of the installation. I tried to keep the installation as "Plain Vanilla" as possible.
I had to figure out what to include, and what to exclude, and this involved a lot of trial and error. For example, one of the choices was for "external diskette support". Since I had an "internal diskette drive", I didn't think I needed it. But after a full install, I discovered that it would not read or write floppy diskettes, so it appears that I do indeed need this support.
OS/2 supports two different file systems, FAT16 and the High Performance File System (HPFS). Since my partition was only 1.9GB in size, I chose just to use FAT16. HPFS supported larger disk partitions, longer file names, and faster performance, none of which I need for these purposes.
I thought it would be nice to get TCP/IP networking to work with my Ethernet card. However, after many attempts, I decided against this. I needed to focus on my mission, which was to decompress floppy diskettes. It was amusing to see that OS/2 supported all kinds of networking, including Token Ring, System Management, Remote Access, Mobile Access Services, File and Print.
Once all the options are chosen, OS/2 installation then proceeds to unpack and copy all the programs to the C: drive. During this process, IBM had informational splash screens. Here's one that caught my eye, titled "IBM Means Three Things" that listed three reasons to partner with IBM:
- Providing global solutions for a small planet
- Creating and Applying advanced technologies to improve with which customers run their businesses
- Constantly improving customer service with the products and services we provide
You might wonder how these OS/2 splash screens, written over 10 years ago, can appear almost identical to IBM's current [Smarter Planet] campaign. Actually, it is not that odd. IBM has been keeping to these same core principles since 1911, only the words to describe and promote these core values have changed.
To access both OS/2 and DOS partitions, I installed Grand Unified Bootloader [Grub2] on the DOS partition under C:/BOOT/GRUB directory. However, when I boot OS/2, I cannot see the DOS partition. And when I boot DOS, I cannot see the OS/2 partition. Each operating system thinks its C: drive is the only partition on the system.
Now that I had OS/2 running, I was then able to install Stacker from two floppy diskettes. With this installed, I can compress and decompress data on either the hard disk, or on floppy diskettes. Most of the files were flat text documents and digital photos. After copying the data off the compressed disks onto my hard drive, I now can copy them off to a safe place.
To finish this project, I installed Ubuntu Linux on the remaining 76GB of disk space, which can access both the OS/2 and DOS drives FAT16 file systems natively. This allows me to copy files from OS/2 to DOS or vice versa.
Now that I know what data types are on the diskettes, I determined that I could have decompressed the data in just a few steps:
- Set up a DOS partition on C: drive
- Insert one of the compressed diskettes into the floppy drive
- Copy the STACKER.EXE program from the floppy to the C: drive
- Run "STACKER A:" to decompress the floppy diskette
However, now that I have a working DOS and OS/2 system, I can possibly review the rest of my floppy diskettes, some of which may require running programs natively on OS/2 or DOS. This brings me to an important lesson. If you are going to keep archive data for long-term retention, you need to choose file formats that can be read by current operating systems and programs. Installing older operating systems and programs to access proprietary formats can be quite time-consuming, and may not always be possible or desirable.
technorati tags: IBM, Stac, Stacker, OS/2, DOS, Microsoft, Windows, ThinkCentre, Compression, Liv-Zempel
My colleagues, Harley Puckett (left) and Jack Arnold (right) were highlighted in today's Arizona Daily Star, our local newspaper, as part of an article on IBM's success and leadership in the IT storage industry. At 1400 employees here in Tucson, IBM is Southern Arizona's 36th largest employer.
Highlighted in the article:
- DS8700 with the new Easy Tier feature
- TS7650 ProtecTIER virtual tape library with data deduplication capability
- LTO-5 tape and the new Long Term File System (LTFS)
- XIV with the new 2TB drive, for a maximum per-rack usable capacity of 161 TB.
Read the full article [IBMers Crank Out 4 New Offerings To Handle Data Deluge]
technorati tags: , Arizona Daily Star, IBM Tucson, DS8700, Easy Tier, ProtecTIER, Deduplication, LTO-5, LTFS, XIV, IBM, Tucson, Arizona
Well, it's Tuesday again, and you know what that means! IBM Announcements! Typically, IBM System Storage has three to five major product launches per year. Making announcements every Tuesday would have been two frequent, and having one big announcement every two or three years would be too far apart. Worldwide combined revenues for storage hardware and software grew double digits last year, comparing full-year 2011 to the prior 2010 year, and I am sure that 2012 will also be a good year for IBM as well! This week we have announcements for both disk and tape, but since 2012 is the 60th Diamond Anniversary for tape, I will start with tape systems first.
- TS1140 support for JA/JJ tape cartridges
The TS1140 enterprise tape drive was announced at the [Storage Innovation Executive Summit] last May. It supported a new E07 format on three different new tape cartridges. Models "JC" was 4.0TB standard re-writeable tapes, "JY" was 4.0TB WORM tapes, and "JK" were 500GB economy tapes that were less expensive, but offered faster random access.
Generally, IBM has adopted an N-2 read, N-1 write [backward compatibility]. This means that the TS1140 could read E05 and E06 formatted tapes on JB and JX media, and could write E06 format on JB and JX media. However, there are a lot of older JA and JJ media, especially as part of TS7740 environments, so IBM now supports TS1140 drives to read J1A formatted JA and JJ media. This is not just for TS7740 environments, any TS1140 in stand-alone or tape library configurations will support this as well.
- TS7700 R2.1 enhancements
IBM is a leader in tape virtualization with or without physical tape as back-end media. There are two hardware models of the [IBM Virtualization Engine TS7700 family] for the IBM System z mainframe. These virtual libraries are referred to as "clusters" in IBM literature.
A unique feature of the TS7700 series is support for a Grid configuration, which allows up to six different TS7700 clusters to be grouped into a single instance image. These clusters can be in local or remote locations, connected via WAN or LAN connections.
R2.1 is the latest software release of this successful IBM's TS7700 series.
- True Sync Mode Copy. Before R2.1, the TS7700 offered "immediate mode copy". An application would write to a virtual tape, and when it was done with the tape and performed an unmount, the TS7700 would then replicate the tape contents to a secondary cluster on the grid. With True Sync Mode, data contents are replicated per implicit or explicit SYNC points. This is another IBM first in the IT tape industry.
- Remote Mount Fail-over. When you have two or more TS7700 clusters in a grid configuration, you can do remote mounts. We've added fail-over multi-pathing up to four paths, so that if a link to a remote cluster is down, it will try one of the others instead.
- Parallel Copies and Pre-Migration. On of my 19 patents is for the pre-migration feature for the IBM 3494 Virtual Tape Server (VTS) that carries forward into the TS7700, and is also used in the SONAS and Information Archive products. However, when the grid architecture was introduced, the engineers decided not to allow pre-migration and copies to secondary clusters to occur concurrently. Now these two operations can be done in parallel.
- Merge two grids into one grid. Now that we can support up to six clusters into a single grid, we have people with 2-cluster and 3-cluster grids looking to merge them into one. Of course, all the logical and physical volume serials (VOLSER) must be unique!
- Accelerate off JA/JJ Media. There are a lot of older JA and JJ media still in TS7700 libraries. This feature allows customers to speed up the transition to newer physical tape media.
- Copy Export to E06 format on JB media. This one is clever, and I have to say I would have never thought about it. Let's say you have a TS7740 with TS1140 drives, but you want to export some virtual tapes to physical media to be sent to someone who only has a TS7740 connected with older TS1130 drives. These older drives can't read new JC media nor make sense of the E07 format. This feature will let you export to older JB media in E06 format so that it will be fully readable at the new location on the TS1130 drives.
- Copy Export Merge service offering. Thanks to mergers and acquisitions, it is sometimes necessary to split off a portion of data from a TS7700 grid. In the past, IBM supported sending this export to a completely empty TS7700 library, but this new service offerings allows the export to be merged into an existing TS7700 that already contains data.
- LTFS-SDE support for Mac OS X 10.7 Lion
How do people still not yet know about the Linear Tape File System [LTFS]? I mentioned this in my blogs back in 2010 in [April], [September], and [November]. Last year, LTFS was the [NAB Show Pick Hits Award] and an [Emmy] for revolutionizing the use of digital tape in Television broadcasting.
In layman's terms, the Single Drive Edition [LTFS-SDE] allows a tape cartridge to be treated like USB memory stick. It is supported on the LTO5 tape drives for systems running various levels of Windows, Linux and Mac OS X. Prior to this announcement, IBM supported Snow Leopard (10.5.6) and Leopard (10.6), and now supports Mac OS X 10.7 "Lion" release.
- XIV Gen3 Solid-State Drives (SSD)
The [IBM XIV Gen3 storage system] now supports Solid State Drives! I thought I would provide some context from a historic perspective.
IBM first introduced Solid-State Drives (SSD) back in 2007 where it made sense the most, in [drive-for-drive replacements on blade servers in the IBM BladeCenter]. Blade servers typically only have a single drive, and SSD are both faster and use less energy on a drive-for-drive comparison, so this provided immediate benefit. Today, SSD are available on a variety of System x and POWER system servers.
In 2008, IBM rocked the world by being the first to reach [1 Million IOPS with Project Quicksilver]. This was an all-SSD configuration which many considered unrealistic (at the time), but it showed the potential for solid state drives.
In 2010, IBM announced [DS8700 Easy Tier with Sub-LUN automated tiering], and followed it up with similar support for [SVC 6.1 and Storwize V7000] that provides this enterprise-class functionality to midrange and externally-virtualized storage systems.
In 2011, IBM was able to [scan 10 billion files in 43 minutes] using the GPFS file system in support of Big Data analytics. This, of course, was done with Solid State Drives.
When the [XIV Gen3 was Announced - July 2011], each module included an 1.8-inch "SSD-Ready" slot in the back. IBM made a Statement of Direction that IBM would someday offer SSD drives to put in these slots. Today's announcement is that IBM has finalized the qualification process, so now XIV Gen3 clients can have 400GB of usable non-volatile SSD read cache added to each module. This SSD can be added to existing XIV Gen3 boxes in the field, or it can be factory-installed in new shipments. If you have a 15-module XIV, that's 6TB of additional read cache! This SSD is entirely managed by the XIV Gen3, so you won't have to spend weeks reading manuals or specifying configuration parameters.
My colleague Elisabeth Stahl covers this from a performance angle in her blog post [Performance in a Flash: New IBM XIV SSD Caching].
When you carve volumes on the XIV, you now have an option to enable or disable use of the SSD cache for each volume. Since XIV is being used in private and public cloud deployments, this offers the ability to offer premium performance at premium prices. The use of SSD is complementary to IBM XIV Quality of Service (QoS) performance levels, which are determined by host instead.
Well, that's the first major IBM System Storage launch of 2012. Let me know what you think in the comment section below.
technorati tags: IBM, TS1140, TS7700, TS7720, TS7740, LTFS, LTFS-SDE, LTO5, Mac OS X, Sync, XIV, Gen3, Elisabeth Stahl, SSD
Modified by TonyPearson
If Eskimos have 37 words for "snow", then EMC has perhaps a similar number of names for "failure". I have already covered a few of their past attempts, including [ATMOS], [Invista], and [VPLEX]. Last week, EMC introduced its latest, called XtremeIO.
But rather than focus on XtremeIO's many shortcomings, I thought it would be better to point out the highlights of IBM's All-Flash array, IBM FlashSystem.
But first, a quick story.
Two years ago, I worked the booth at [Oracle OpenWorld 2011]. After a conference attendee had visited the booths of Violin Memory and Pure Storage, he asked me why IBM did not have an all-Flash array.
Of course IBM did, and I showed him the [Storwize V7000]. For example, a 2U model with 18 SSD drives of 400GB each, configured in two RAID-5 ranks 7+P+S could offer 5.6 TB of space, running up to 250,000 IOPS at sub-millisecond response times.
Why didn't IBM advertise the Storwize V7000 as an all-Flash array? I though the question was silly at the time, since the Storwize V7000 supported SSD, 15K, 10K and 7200 RPM spinning disk, it seemed obvious that it could be configured with only SSD if you chose.
Since then, IBM has added 800GB support to the Storwize V7000, doubling the capacity. More importantly, IBM acquired Texas Memory Systems, and offers a much better all-Flash array.
Flash can be deployed in three levels. The first is in the server itself, such as with PCiE cards containing Flash chips, limited to applications running on that server only.
The second option is a hybrid disk system, that can intermix Flash-based Solid State Drives (SSD) with regular spinning hard disk drives (HDD). These can be attached to many servers.
The problem with this approach is that when Flash is packaged to pretend to be spinning disk, it undermines some of the performance benefits. Traditional disk system architectures using SCSI commands over Device adapter loops can introduce added latency.
The third fits snuggly in the middle: all-Flash arrays designed from the ground up to be only Flash.
Whereas SSD can typically achieve an I/O latency in the 300 to 1000 microseconds range, IBM FlashSystem can process I/O in the 25 to 110 microsecond range. That is a huge difference!
(FTC Disclosure: The U.S. Federal Trade Commission requires that I mention that I am an IBM employee, and that this post may be considered a paid, celebrity endorsement of both the IBM FlashSystem and IBM Storwize family of products. I have no financial interest in EMC, do not endorse the XtremeIO mentioned here, and was not paid to mention their company or products in any manner.)
Fellow blogger and IBM Master Inventor Barry Whyte has a great comparison table in his blog post [Extreme Blogging]. I thought I would add an added column for the Storwize V7000 with 18 Solid State drives.
IBM FlashSystem 820
IBM Storwize V7000 with SSD
20 Terabytes: 1U
11 Terabytes: 2U
7 Terabytes: 6U
I/O latency (microseconds)
110us (~5x faster)
Maximum I/O per second
NAND Flash type
While it is easy to show that EMC's XtremeIO does not hold a candle to IBM FlashSystems, I think it is more amusing that it is not even as good as a Storwize V7000 with SSD that IBM offered two years ago, long before [EMC acquired XtremeIO company] back in May 2012.
But don't just take my word for it, fellow blogger Robin Harris, on his StorageMojo blog, has several posts, including [EMC's Xtreme embarrassment] and [ XtremLY late XtremIO launch]. Both are worth a read.
Earlier this year, [IBM announced it is investing $1 Billion USD in Flash technology]. EMC's announcement last week shows that they are at least 18 months behind IBM in Flash technology solutions.
technorati tags: IBM, FlashSystem, Storwize V7000, Flash, Solid-State Drives, SSD, EMC, Atmos, Invista, VPLEX, XtremeIO, all-Flash array, Barry Whyte, Robin Harris, StorageMojo
Wrapping up this week's theme on the XO laptop, I decided to take on thechallenge of printing. I managed to print from my XO laptop to my laserjet printer.I checked the One Laptop Per Child [OLPC
] website,and found there is no built-in support for printers, but there have been several peopleasking how to print from the XO, so here are the steps I did to make it happen.
(Note: I did all of these steps successfully on my Qemu-emulated system first, and then performed them on my XO laptop)
- Step 1: Determine if you have an acceptable printer
The XO laptop can only connect to a printer via USB cable or over the network.Check your printer to see if it supports either of these two options. In my case, my printer is connected to my Linksys hub that offers Wi-Fi in my home.
The XO runs a modified version of Red Hat's Fedora 7, so we need to also determineif the printer is supported on Linux.Check the [Open Printing Database]for the level of support. This database has come up with the following ranking system.Printers are categorized according to how well they work under Linux and Unix. The ratings do not pertain to whether or not the printer will be auto-recognized or auto-configured, but merely to the highest level of functionality achieved.
- Perfectly - everything the printer can do is working also under Linux
- Mostly - work almost perfectly - funny enhanced resolution modes may be missing, or the color is a bit off, but nothing that would make the printouts not useful
- Partially - mostly don't work; you may be able to print only in black and white on a color printer, or the printouts look horrible
- Paperweight - These printers don't work at all. They may work in the future, but don't count on it
If your printer only supports a parallel cable connection, or does not have a high enough ranking above, go buy another printer. The [Linux Foundation] websiteoffers a list of suggested printers and tutorials.
In my case, I have a Brother HL5250-DN black-and-white laserjet printer connected over a network to Windows XP, OS X and my other Linux systems. It is rated as supporting Linux perfectly, so I decided to use this for my XO laptop.
- Step 2: Install Common UNIX Printing System (CUPS)
Technically, Linux is not UNIX, but for our purposes, close enough. Start the Terminalactivity, use "su" to change to root, and then use "yum" to install CUPS. Yum will automatically determine what other packages are needed, in this case paps and tmpwatch. Once installed, use "/usr/sbin/cupsd" to get the CUPS daemon started, and add this to the end ofrc.local so that it gets started every time you reboot.
Click graphic on the left to see larger view
[olpc@xo-10-CC-6F ~]$ subash-3.2# yum install cups...Total download size = 3.0 MIs this OK [y/N]? y
bash-3.2# /usr/sbin/cupsdbash-3.2# echo "/usr/sbin/cupsd" >> /etc/rc.d/rc.localbash-3.2# exit[olpc@xo-10-CC-6F ~]$
- Step 3: Install Opera or Firefox browser
To download the appropriate drivers, you may need a browser that can handle file downloads. I have triedto do this with the built-in Browse activity (aka Gecko) but encountered problems. I have both Opera and Firefox installed, but I will focus on Opera for this effort.I also installed the older18.104.22.168 version of the Flash player (worked better than the latest 22.214.171.124 version) and Java JRE.Follow the OLPC Wiki instructions for [Opera, Adobe Flash,and Sun Java] installation, thenverify with the following [Java and Flash] testers.
- Step 4: Download drivers and packages unique for your printer
In my case, I used Opera to get to the [Brother Linux Driver Homepage], and downloaded the RPM's for LPR and CUPS wrapper. These are the ones listed under "Drivers for Red Hat, Mandrake (Mandriva), SuSE". I saved these under "/home/olpc" directory.
[olpc@xo-10-CC-6F ~]$ subash-3.2# cd /home/olpcbash-3.2# rpm -vi brhl5250dnlpr-2.0.1-1.i386.rpmbash-3.2# rpm -vi cupswrapperHL5250DN-2.0.1-1.i386.rpmbash-3.2# exit[olpc@xo-10-CC-6F ~]$
- Step 5: Create a "root" password
By default, the root user has no password. However, you will need it to be something for later steps,so here is the process to create a root password. I set mine to "tony" which normallywould be considered too simple a password, but ignore those messages and continue.We will remove it in step 8 (below) to put things back to normal.
[olpc@xo-10-CC-6F ~]$ subash-3.2# passwdChanging password for user root.New UNIX password: tonyBAD PASSWORD: it is too shortRetype new UNIX password: tonypasswd: all authentication tokens updated successfullybash-3.2# exit[olpc@xo-10-CC-6F ~]$
- Step 6: Launch CUPS administration
Here I followed the instructions in Robert Spotswood's [Printing In Linux with CUPS] tutorial.Launch the Opera browser, and enter "http://localhost:631/admin" as the URL. The localhostrefers to the laptop itself, and 631 is the special port that CUPS listens to from browsers. You can alsouse 127.0.0.1 as a shortcut for "localhost", and can be used interchangeably.
In my case, it detected both of my networked printers, so I selected the HL5250DN, entered thelocation of my PPD file "/usr/share/cups/model/HL5250DN.ppd" that was created in Step 4. I set the URI to "lpd://192.168.0.75/binary_p1" per the instructions [Network Setting in CUPS based Linux system] in the Brother FAQ page. I chage the page size from "A4" to "Letter".I set this printer as the default printer. When it asks for userid and password, that is whereyou would enter "root" for the user, and "tony" or whatever you decided to set your root password to.
Select "Print a Test Page" to verify that everything is working.
- Step 7: Printing actual files
Sadly, I don't know Opera well enough to know how to print from there. So, I went over to my trustedFirefox browser. Select File->Page Setup to specify the settings, File->Print Preview tosee what it will look like, and then File->Print to send it to the printer.
To print the file "out.txt" that is in your /home/olpc directory, for example, enter"file:///home/olpc/out.txt" as the URL of the firefox browser. This will show the file,which you can then print to your printer. I had to specify 200% scaling otherwise the fontswere too small to read.
- Step 8: Remove the "root" password
If you want to remove the root password, here are the steps.
[olpc@xo-10-CC-6F ~]$ suPassword: tonybash-3.2# passwd -d rootRemoving password for user root.passwd: Successbash-3.2# exit[olpc@xo-10-CC-6F ~]$
Now the problem is that there is no way to print stuff from any of the Sugar activities. The best place toput in print support would be the Journal
activity. Along the bottom where the mounted USB keys arelocated could be an icon for a printer, and dragging a file down to the printer ojbect could cause it tobe send to the printer.
The alternative is to write some scripts invocable from the Terminal activity to determine what isin the journal, and send them to LPR with the appropriate parameters.
I did not have time to do either of these, but perhaps someone out there can take on that as a project.
technorati tags: OLPC, XO, printing, printer, linux, Opera, Firefox, Java, Flash
Since so many personal and corporate users are still on [Windows XP], Microsoft announced that it would provide [Extended Support until 2014]. A ComputerWorld article back in 2007 offered tips on [How to make Windows XP last for the next seven years]. From May 2009 to April 2014, all support is fee-based and non-security hotfixes are produced only for corporate customers.
If we have learned anything from last decade's Y2K crisis, is that we should not wait for the last minute to take action. Now is the time to start thinking about weaning ourselves off Windows XP. IBM has 400,000 employees, so this is not a trivial matter.
Already, IBM has taken some bold steps:
Last July, IBM announced that it was switching from Internet Explorer (IE6) to [Mozilla Firefox as its standard browser]. IBM has been contributing to this open source project for years, including support for open standards, and to make it [more accessible to handicapped employees with visual and motor impairments]. I use Firefox already on Windows, Mac and Linux, so there was no learning curve for me. Before this announcement, if some web-based application did not work on Firefox, our Helpdesk told us to switch back to Internet Explorer. Those days are over. Now, if a web-based application doesn't work on Firefox, we either stop using it, or it gets fixed.
IBM also announced the latest [IBM Lotus Symphony 3] software, which replaces Microsoft Office for Powerpoint, Excel and Word applications. Symphony also works across Mac, Windows and Linux. It is based on the OpenOffice open source project, and handles open-standard document formats (ODF). Support for Microsoft Office 2003 will also run out in the year 2014, so moving off proprietary formats to open standards makes sense.
I am not going to wait for IBM to decide how to proceed next, so I am starting my own migrations. In my case, I need to do it twice, on my IBM-provided laptop as well as my personal PC at home.
- IBM-provided laptop
Last summer, IBM sent me a new laptop, we get a new one every 3-4 years. It was pre-installed with Windows XP, but powerful enough to run a 64-bit operating system in the future. Here are my series of blog posts on that:
I decided to try out Red Hat Enterprise Linux 6.1 with its KVM-based Red Hat Enterprise Virtualization to run Windows XP as a guest OS. I will try to run as much as I can on native Linux, but will have Windows XP guest as a next option, and if that still doesn't work, reboot the system in native Windows XP mode.
Here's is how I have configured my laptop:
|/dev/sda1||35GB||NTFS||C:||Windows XP SP3 operating system and programs|
|/dev/sda2||15GB||ext4||/(root)||Ubuntu Desktop 10.10, SystemRescueCD, Clonezilla, Parted Magic|
|/dev/sda3||55GB||ext4||/(root)||RHEL 6.1 with KVM to run Windows XP as guest OS|
|/dev/sda6||130GB||NTFS||D:||My Documents, Lotus Notes and other data|
|/dev/sda7||70GB||NTFS||E:||Extras and Archives|
Basically, this is a multi-boot system. I use Ubuntu to hold all my Linux utilities, including [SystemRescueCD], [Clonezilla], and [Parted Magic]. The new [Grub2 loader] makes this easy.
Here is what my initial boot screen looks like:
So far, I am pleased that I can do nearly everything my job requires natively in Red Hat Linux, including accessing my Lotus Notes for email and databases, edit and present documents with Lotus Symphony, and so on. I have made RHEL 6.1 my default when I boot up. Setting up Windows XP under KVM was relatively simple, involving an 8-line shell script and 54-line XML file. Here is what I have encountered:
- We use a wonderful tool called "iSpring Pro" which merges Powerpoint slides with voice recordings for each page into a Shockwave Flash video. I have not yet found a Linux equivalent for this yet.
- To avoid having to duplicate files between systems, I use instead symbolic links. For example, my Lotus Notes local email repository sits on D: drive, but I can access it directly with a link from /home/tpearson/notes/data.
- While my native Ubuntu and RHEL Linux can access my C:, D: and E: drives in native NTFS file system format, the irony is that my Windows XP guest OS under KVM cannot. This means moving something from NTFS over to Ext4, just so that I can access it from the Windows XP guest application.
- For whatever reason, "Password Safe" did not run on the Windows XP guest. I launch it, but it takes forever to load and never brings up the GUI. Fortunately, there is a Linux version [MyPasswordSafe] that seems to work just fine to keep track of all my passwords.
- Personal home PC
My Windows XP system at home gave up the ghost last month, so I bought a new system with Windows 7 Professional, quad-core Intel processor and 6GB of memory. There are [various editions of Windows 7], but I chose Windows 7 Professional to support running Windows XP as a guest image.
Here's is how I have configured my personal computer:
|/dev/sda1||104MB||NTFS||C:||Windows 7 Loader|
|/dev/sda2||10GB||ext4||/(root)||Ubuntu Desktop 10.10, SystemRescueCD, Clonezilla, Parted Magic|
|/dev/sda6||60GB||NTFS||C:||Windows 7 OS and programs|
|/dev/sda7||230GB||NTFS||D:||My Documents, Lotus Notes and other data|
|/dev/sda8||250GB||NTFS||E:||Extras and Archives|
I actually found it more time-consuming to implement the "Virtual PC" feature of Windows 7 to get Windows XP mode working than KVM on Red Hat Linux. I am amazed how many of my Windows XP programs DO NOT RUN AT ALL natively on Windows 7. I now have native 64-bit versions of Lotus Notes and Symphony 3, which will do well enough for me for now.
I went ahead and put Red Hat Linux on my home system as well, but since I have Windows XP running as a guest under Windows 7, no need to duplicate KVM setup there. At least if I have problems with Windows 7, I can reboot in RHEL6 Linux at home and use that for Linux-native applications.
Hopefully, this will position me well in case IBM decides to either go with Windows 7 or Linux as the replacement OS for Windows XP.
technorati tags: IBM, Windows, Windows XP, Windows 7, Linux, Ubuntu, RedHat, RHEL, RHEL6, Lotus, Lotus Notes, Lotus Symphony
Well, I'm back safely from my tour of Asia. I am glad to report that Tokyo, Beijing and Kuala Lumpur are pretty much how I remember them from the last time I was there in each city. I have since been fighting jet lag by watching the last thirteen episodes of LOST season 6 and the series finale.
Recently, I have started seeing a lot of buzz on the term "Storage Federation". The concept is not new, but rather based on the work in database federation, first introduced in 1985 by [A federated architecture for information management] by Heimbigner and McLeod. For those not familiar with database federation, you can take several independent autonomous databases, and treat them as one big federated system. For example, this would allow you to issue a single query and get results across all the databases in the federated system. The advantage is that it is often easier to federate several disparate heterogeneous databases than to merge them into a single database. [IBM Infosphere Federation Server] is a market leader in this space, with the capability to federate DB2, Oracle and SQL Server databases.
Fellow blogger and BFF, Marc Farley (3PAR) has an excellent post [Zeroing in on a definition for federated storage]. Here's an excerpt:
- Storage expansion: You want to increase the storage capacity of an existing storage system that cannot accommodate the total amount of capacity desired. Storage Federation allows you to add additional storage capacity by adding a whole new system.
- Storage migration: You want to migrate from an aging storage system to a new one. Storage Federation allows the joining of the two systems and the evacuation from storage resources on the first onto the second and then the first system is removed.
- Safe system upgrades: System upgrades can be problematic for a number of reasons. Storage Federation allows a system to be removed from the federation and be re-inserted again after the successful completion of the upgrade.
- Load balancing: Similar to storage expansion, but on the performance axis, you might want to add additional storage systems to a Storage Federation in order to spread the workload across multiple systems.
- Storage tiering: In a similar light, storage systems in a Storage Federation could have different capacity/performance ratios that you could use for tiering data. This is similar to the idea of dynamically re-striping data across the disk drives within a single storage system, such as with 3PAR's Dynamic Optimization software, but extends the concept to cross storage system boundaries.
To some extent, IBM SAN Volume Controller (SVC), XIV, Scale-Out NAS (SONAS), and Information Archive (IA) offer most, if not all, of these capabilities. EMC claims its VPLEX will be able to offer storage federation, but only with other VPLEX clusters, which brings up a good question. What about heterogenous storage federation? Before anyone accuses me of throwing stones at glass houses, let's take a look at each IBM solution:
- IBM SAN Volume Controller
The IBM SAN Volume Controller has been doing storage federation since 2003. Not only can IBM SAN Volume Controller bring together storage from a variety of heterogenous storage, the SVC cluster itself can be a mix of different hardware models. You can have a 2145-8A4 node pair, 2145-8G4 node pair, and the new 2145-CF8 node pair, all combined together into a single SVC cluster. Upgrading SVC hardware nodes in an SVC cluster is always non-disruptive.
- IBM XIV storage system
The IBM XIV has two kinds of independent modules. Data modules have processor, cache and 12 disks. Interface modules are data modules with additional processor, FC and Ethernet (iSCSI) adapters. Because these two modules play different roles in an XIV "colony", that number of each type is predetermined. Entry-level six-module systems have 2 interface and 4 data modules. Full 15-module systems have 6 interface and 9 data modules. Individual modules can be added or removed non-disruptively in an XIV.
- IBM Scale-Out NAS
The SONAS is comprised of three kinds of nodes that work together in concert. A management node, one or more interface nodes, and two or more storage nodes. The storage nodes are paired to manage up to 240 nodes in a storage pod. Individual interface or data nodes can be added or removed non-disruptively in the SONAS. The underlying technology, the General Parallel File System, has been doing storage federation since 1996 for some of the largest top 500 supercomputers in the world.
- IBM Information Archive (IA)
For the IA, there are 1, 2 or 3 nodes, which manages a set of collections. A collection can either be file-based using industry-standard NAS protocols, or object-based using the popular System Storage™ Archive Manager (SSAM) interface. Normally, you have as many collections as you have nodes, but nodes are powerful enough to manage two collections to provide N-1 availability. This allows a node to be removed, and a new node added into the IA "colony", in a non-disruptive manner.
Even in an ant colony, there are only a few types of ants, with typically one queen, several males, and lots of workers. But all the ants are red. You don't see colonies that mix between different species of ants. For databases, federation was a way to avoid the much harder task of merging databases from different platforms. For storage, I am surprised people have latched on to the term "federation", given our mixed results in the other "federations" we have formed, which I have conveniently (IMHO) ranked from least effective to most effective:
- The Union of Soviet Socialist Republics (USSR)
My father used to say, "If the Soviet Union were in charge of the Sahara desert, they would run out of sand in 50 years." The [Soviet Union] actually lasted 68 years, from 1922 to 1991.
- The United Nations (UN)
After the previous League of Nations failed, the UN was formed in 1945 to facilitate cooperation in international law, international security, economic development, social progress, human rights, and the achieving of world peace by stopping wars between countries, and to provide a platform for dialogue.
- The European Union (EU)
With the collapse of the Greek economy, and the [rapid growth of debt] in the UK, Spain and France, there are concerns that the EU might not last past 2020.
- The United States of America (USA)
My own country is a federation of states, each with its own government. California's financial crisis was compared to the one in Greece. My own state of Arizona is under boycott from other states because of its recent [immigration law]. However, I think the US has managed better than the EU because it has evolved over the past 200 years.
- The Organization of the Petroleum Exporting Countries [OPEC]
Technically, OPEC is not a federation of cooperating countries, but rather a cartel of competing countries that have agreed on total industry output of oil to increase individual members' profits. Note that it was a non-OPEC company, BP, that could not "control their output" in what has now become the worst oil spill in US history. OPEC was formed in 1960, and is expected to collapse sometime around 2030 when the world's oil reserves run out. Matt Savinar has a nice article on [Life After the Oil Crash].
- United Federation of Planets
The [Federation] fictitiously described in the Star Trek series appears to work well, an optimistic view of what federations could become if you let them evolve long enough.
Given the mixed results with "federation", I think I will avoid using the term for storage, and stick to the original term "scale-out architecture".
technorati tags: , LOST, storage, federation, IBM, DB2, Oracle, SQL, 3PAR, Marc Farley, SVC, XIV, SONAS, IA, EMC, VPLEX, USSR, United Nations, OPEC, Star Trek
In my presentations in Australia and New Zealand, I mentioned that people were re-discovering the benefits of removable media. While floppy diskettes were convenient way of passing information from one person to another, they unfortunately did not have enough capacity. In today's world, you may need Gigabytes or Terabytes of re-writeable storage with a file system interface that can easily be passed from one person to another. In this post, I explore three options.
- Cirago CDD2000 Docking Station
The good folks over at [Cirago International Ltd.] sent me a cute little [CDD2000 docking station] for evalution.
(FCC Disclaimer: I work for IBM, and IBM has no business relationship with Cirago at the time of this writing. Cirago has not paid me to mention their product, but instead provided me a free loaner that I promised to return to them after my evaluation is completed. This post should not be considered an endorsement for Cirago's products. List prices for Cirago and IBM products were determined from publicly available sources for the United States, and may vary in different countries. The views expressed herein may not necessarily reflect the views and opinions of either IBM or Cirago.)
I took a few photos so you can see what exactly this device looks like. Basically, it is a plastic box that holds a single naked disk drive. It has four little rubber feet so that it does not slip on your desk surface.
The inside is quite simple. The power and SATA connections match those of either a standard 3.5 inch drive, or the smaller form factor (SFF) 2.5 inch drive. However, to my dismay, it does not handle EIDE drives which I have a ton of. After taking apart six different computer systems, I found only one had SATA drives for me to try this unit out with.
The unit comes with a USB cable and AC/DC power adapter. In my case, I found the USB 3.0 cable too short for my liking. My tower systems are under my desk, but I like keeping docking stations like this on the top of the desk, within easy reach, but that wasn't going to happen because the USB cable was not long enough.
Instead, I ended up putting it half-way in between, behind my desk, sitting on another spare system. Not ideal, but in theory there are USB-extension cables that probably could fix this.
Here it is with the drive inside. I had a 3.5 inch Western Digital [1600AAJS drive] 160 GB, SATA 3 Gbps, 8 MB Cache, 7200 RPM.
To compare the performance, I used a dual-core AMD [Athlon X2] system that I had built for my 2008 [One Laptop Per Child] project. To compare the performance, I ran with the drive externally in the Cirago docking station, then ran the same tests with the same drive internally on the native SATA controller. Although the Cirago documentation indicated that Windows was required, I used Ubuntu Linux 10.04 LTS just fine, using the flexible I/O [fio] benchmarking tool against an ext3 file system.
- Sequential Write - a common use for external disk drive is backup.
- Random read - randomly read files ranging from 5KB to 10MB in size.
- Random mixed - randomly read/write files (50/50 mix) ranging from 5KB to 10MB in size.
|Sequential Write||Throughput IOPS||1119||1044|
| ||Latency (msec)||0.866 ms||0.948 ms|
| ||Bandwidth (KB/s)||16900||14400|
|Random Read||Throughput (IOPS)||164||119|
| ||Latency (msec)||6.06 ms||8.36 ms|
| ||Bandwidth (KB/s)||658||477|
|Random Mixed (50/50)||Throughput (IOPS)||112||81|
| ||Latency (msec) read||8.78 ms||12.1 ms|
| ||Latency (msec) write||0.0983 ms||0.120 ms|
| ||Bandwidth (KB/s) read||557||328|
| ||Bandwidth (KB/s) write||556||337|
For sequential write, the Cirago performed well, only about 15 percent slower than native SATA. For random workloads, however, it was 30-40 percent slower. If you are wondering why I did not get USB 3.0 speeds, there are several factors involved here. First, with overheads, 5 Gbps USB 3.0 is expected to get only about 400 MB/sec. My SATA 2.0 controller maxes out at 375 MB/sec, and my USB 2.0 ports on my system are rated for 57 MB/sec, but with overheads will only get 20-25 MB/sec. Most spinning drives only get 75 to 110 MB/sec. Even solid-state drives top out at 250 MB/sec for sustained activity. Despite all that, my internal SATA drive only got 16 MB/sec, and externally with the Cirago 14 MB/sec in sustained write activity.
Here is the mess that is inside my system. The slot for drive 2 was blocked by cables, memory chips and the heat sink for my processor. It is possible to damage a system just trying to squeeze between these obstacles.
However, the point of this post is "removable media". Having to open up the case and insert the second drive and wire it up to the correct SATA port was a pain, and certainly a more difficult challenge than the average PC user wishes to tackle.
Price-wise, the Cirago lists for $49 USD, and the 160GB drive I used lists for $69, so the combination $118 is about what you would pay for a fully integrated external USB drive. However, if you had lots of loose drives, then this could be more convenient and start to save you some money.
- IBM RDX disk backup system
Another problem with the Cirago approach is that the disk drives are naked, with printed circuit board (PCB) exposed. When not in the docking station, where do you put your drive? Did you keep the [anti-static ESD bag] that it came in when you bought it? And once inside the bag, now what? Do you want to just stack it up in a pile with your other pieces of equipment?
To solve this, IBM offers the RDX backup system. These are fully compatible with other RDX sytems from Dell, HP, Imation, NEC, Quantum, and Tandberg Data. The concept is to have a docking station that takes removable, rugged plastic-coated disk-enclosed cartridges. The docking station can be part of the PC itself, similar to how CD/DVD drives are installed, or as a stand-alone USB 2.0 system, capable of processing data up to 25 MB/sec.
The idea is not new, about 10 years ago we had [Iomega "zip" drives] that offered disk-enclosed cartridges with capacities of 100, 250 and 750MB in size. Iomega had its fair share of problems with the zip drive, which were ranked in 2006 as the 15th worst technology product of all time, and were eventually were bought out by EMC two years later (as if EMC has not had enough failures on its own!)
The problem with zip drives was that they did not hold as much as CD or DVD media, and were more expensive. By comparison, IBM RDX cartridges come in 160GB to 750GB in size, at list prices starting at $127 USD.
- IBM LTO tape with Long-Term File System
Removable media is not just for backup. Disk cartridges, like the IBM RDX above, had the advantage of being random access, but most tape are accessed sequentially. IBM has solved this also, with the new IBM Long Term File System [LTFS], available for LTO-5 tape cartridges.
With LFTS, the LTO-5 tape cartridge now can act as a super-large USB memory stick for passing information from one person to the next. The LTO-5 cartridge can handle up to 3TB of compressed data at up to SAS speeds of 140 MB/sec. An LTO-5 tape cartridge lists for only $87 USD.
The LTO-5 drives, such as the IBM [TS2250 drive] can read LTO-3, LTO-4 and LTO-5cartridges, and can write LTO-4 and LTO-5 cartridges, in a manner that is fully compatible with LTO drives from HP or Quantum. LTO-3, LTO-4 and LTO-5 cartridges are available in WORM or rewriteable formats. LTO-4 and LTO-5 cartridges can be encrypted with 256-bit AES built-in encryption. With three drive manufacturers, and seven cartridge manufacturers, there is no threat of vendor lock-in with this approach.
These three options offer various trade-offs in price, performance, security and convenience. Not surprisingly, tape continues to be the cheapest option.
technorati tags: IBM, Cirago, CDD2000, RDX, Ubuntu, Linux, LTO, LTO-5, LTFS, SATA, USB, fio
Well, it's Tuesday, and you know what that means... IBM announcements!
In today's environment, clients expect more from their storage, and from their storage provider. The announcements span the gamut, from helping to use Business Analytics to analyze Big Data for trends, insights and patterns, to managing private, public and hybrid cloud environments, all with systems that are optimized for their particular workloads.
There are over a dozen different announcements, so I will split these up into separate posts. Here is part 1.
- IBM Scale Out Network Attach Storage (SONAS) R1.3
I have covered [IBM SONAS] for quite some time now. Based on IBM's General Parallel File System (GPFS), this integrated system combines servers, storage and software into a fully functional scale-out NAS solution that support NFS, CIFS, FTP/SFTP, HTTP/HTTPS, and SCP protocols. IBM continues its technical leadership in the scale-out NAS marketplace with new hardware and software features.
The hardware adds new disk options, with 900GB SAS 15K RPM drives, and 3TB NL-SAS 7200 RPM drives. These come in 4U drawers of 60 drives each, six ranks of ten drives each. So, with the high-performance SAS drives that would be about 43TB usable capacity per drawer, and with the high-capacity NL-SAS drives about 144TB usable. You can have any mix of high-performance drawers and high-capacity drawers, up to 7200 drives, for a maximum usable capacity of 17PB usable (21PB for those who prefer it raw). This makes it the largest commercial scale-out NAS in the industry. This capacity can be made into one big file system, or divided up to 256 smaller file systems.
In addition to snapshots of each file system, you can divide the file system up into smaller tree branches and snapshot these independently as well. The tree branches are called fileset containers. Furthermore, you can now make writeable clones of individual files, which provides a space-efficient way to create copies for testing, training or whatever.
Performance is improved in many areas. The interface nodes now can support a second dual-port 10GbE, and replication performance is improved by 10x.
SONAS supports access-based enumeration, which means that if there are 100 different subdirectories, but you only have authority to access five of them, then that's all you see, those five directories. You don't even know the other 95 directories exist.
I saved the coolest feature for last, it is called Active Cloud Engine™ that offers both local and global file management. Locally, Active Cloud Engine placement rules to decide what type of disk a new file should be placed on. Management rules that will move the files from one disk type to another, or even migrates the data to tape or other externally-managed storage! A high-speed scan engine can rip through 10 million files per node, to identify files that need to be moved, backed up or expired.
Globally, Active Cloud Engine makes the global namespace truly global, allowing the file system to span multiple geographic locations. Built-in intelligence moves individual files to where they are closest to the users that use them most. This includes an intelligent push-over-WAN write cache, on-demand pull-from-WAN cache for reads, and will even pre-fetch subsets of files.
No other scale-out NAS solution from any other storage vendor offers this amazing and awesome capability!
- IBM® Storwize® V7000
Last year, we introduced the [IBM Storwize V7000], a midrange disk system with block-level access via FCP and iSCSI protocols. The 2U-high control enclosure held two cannister nodes, a 12-drive or 24-drive bay, and a pair of power-supply/battery UPS modules. The controller could attach up to nine expansion enclosures for more capacity, as well as virtualize other storage systems. This has been one of our most successful products ever, selling over 100PB in the past 12 months to over 2,500 delighted customers.
The 12-drive enclosure now supports both 2TB and 3TB NL-SAS drives. The 24-drive enclosures support 200/300/400GB Solid-State Drives (SSD), 146 and 300GB 15K RPM drives, 300/450/600GB 10K RPM drives, and a new 1TB NL-SAS drive option. For those who want to set up "Flash-and-Stash" in a single 2U drawer, now you can combine SSD and NL-SAS in the 24-drive enclosure! This is the perfect platform for IBM's Easy Tier sub-LUN automated tiering. IBM's Easy Tier is substantially more powerful and easier to use than EMC's FAST-VP or HDS's Dynamic Tiering.
Last week, at Oracle OpenWorld, there were various vendors hawking their DRAM/SSD-only disk systems, including my friends at Texas Memory Systems, Pure Storage, and Violin Memory Systems. When people came to the IBM booth to ask what IBM offers, I explained that both the IBM DS8000 and the Storwize V7000 can be outfitted in this manner. With the Storwize V7000, you can buy as much or little SSD as you like. You do not have to buy these drives in groups of 8 or 16 at a time.
The Storwize V7000 is the sister product of the IBM SAN Volume Controller, so you can replicate between one and the other. I see two use cases for this. First, you might have a SVC at a primary location, and decide to replicate just the subset of mission-critical production data to a remote location, and use the Storwize V7000 as the target device. Secondly, you could have three remote or branch offices (ROBO) that replicate to a centralized data center SAN Volume Controller.
Lastly, like the SVC, the Storwize V7000 now supports clustering so that you can now combine multiple control enclosures together to make a single system.
- IBM® Storwize® V7000 Unified
Do you remember how IBM combined the best of SAN Volume Controller, XIV and DS8000 RAID into the Storwize V7000? Well, IBM did it again, combining the best of the Storwize V7000 with the common NAS software base developed for SONAS into the new "Storwize V7000 Unified".
You can upgrade your block-only Storwize V7000 into a file-and-block "Storwize V7000 Unified" storage system. This is a 6U-high system, consisting of a pair of 2U-high file modules connected to a standard 2U-high control enclosure. Like the block-only version, the control enclosure can attach up to nine expansion enclosures, as well as all the same support to virtualize external disk systems. The file modules combine the management node, interface node and storage node functionality that SONAS R1.3 offers.
What exactly does that mean for you? In addition to FCP and iSCSI for block-level LUNs, you can carve out file systems that support NFS, CIFS, FTP/SFTP, HTTP/HTTPS, and SCP protocols. All the same support as SONAS for anti-virus checking, access-based enumeration, integrated TSM backup and HSM functionality to migrate data to tape, NDMP backup support for other backup software, and Active Cloud Engine's local file management are all included!
- IBM SAN Volume Controller V6.3
The SAN Volume Controller [SVC] increases its stretched cluster to distances up to 300km. This is 3x further than EMC's VPLEX offering. This allows identical copies of data to be kept identical in both locations, and allows for Live Partition Mobility or VMware vMotion to move workloads seamlessly from one data center to another. Combining two data centers with an SVC stretch cluster is often referred to as "Data Center Federation".
The SVC also introduces a low-bandwidth option for Global Mirror. We actually borrowed this concept from our XIV disk system. Normally, SVC's Global Mirror will consume all the bandwidth it can to keep the destination copy of the data within a few seconds of currency behind the source copy. But do you always need to be that current? Can you afford the bandwidth requirements needed to keep up with that? If you answered "No!" to either of these, then the low-bandwidth option is you. Basically, a FlashCopy is done on the source copy, this copy is then sent over to the destination, and a FlashCopy is made of that. The process is then repeated on a scheduled basis, like every four hours. This greatly reduces the amount of bandwidth required, and for many workloads, having currency in hours, rather than seconds, is good enough.
I am very excited about all these announcements! It is a good time to be working for IBM, and look forward to sharing these exciting enhancements with clients at the Tucson EBC.
technorati tags: IBM, SONAS, GPFS, SAS, NL-SAS, Active Cloud Engine, Global+Namespace, Storwize+V7000, V7000U, V7000 Unified, block-only, block-and-file, SVC, SSD, Easy Tier, Flash-and-Stash, Texas Memory Systems, Pure Storage, Violin Memory
Continuing my coverage of the annual [2010 System Storage Technical University], I attended some sessions from the System x and Federal track side of this conference.
- Grid, SOA and Cloud Computing
Bill Bauman, IBM System x Field Technical Support Specialist and System x University celebrity, presented the differences between Grid, SOA and Cloud Computing. I thought this was an odd combination to compare and contrast, but his presentation was well attended.
- Grid - this is when two or more independently owned and managed computers are brought together to solve a problem. Some research facilities do this. IBM helped four hospitals connect their computers together into a grid to help analyze breast cancer. IBM also supports the [World Community Grid] which allows your personal computer to be connected to the grid and help process calculations.
- SOA - SOA, which stands for Service Oriented Architecture, is an approach to building business applications as a combination of loosely-coupled black-box components orchestrated to deliver a well-defined level of service by linking together business processes. I often explain SOA as the the business version of Web 2.0. You can download a free copy of the eBook "SOA for Dummies" at the [IBM Smart SOA] landing page.
- Cloud - A Cloud is a dynamic, scalable, expandable, and completely contractible architecture. It may consist of multiple, disparate, on-premise and off-premise hardware and virtualized platforms hosting legacy, fully installed, stateless, or virtualized instances of operating systems and application workloads.
Bill has his own blog, and has an interesting post [Cloud Computing, What it Is, and What it is Not] that appears to be the basis of this presentation.
- Chaos to Cloud
Tom Vezina, IBM Advanced Technical Sales Specialist, presented "Chaos to Cloud Computing". Survey results show that roughly 70 percent of cloud spend will be for private clouds, and 30 percent for public, hybrid or community clouds. Of the key motivations for public cloud, 77 percent or respondents cited reducing costs, 72 percent time to value, and 50 percent improving reliability.
Tom ran over 500 "server utilization" studies for x86 deployments during the past eight years. Of these, the worst was 0.52 percent CPU utilization, the best was 13.4 percent, and the average was 6.8 percent. When IBM mentions that 85 percent of server capacity is idle, it is mostly due to x86 servers. At this rate, it seems easy to put five to 20 guest images onto a machine. However, many companies encounter "VM stall" where they get stuck after only 25 percent of their operating system images virtualized.
He feels the problem is with the fact most Physical-to-Virtual (P2V) migrations are manual efforts. There are tools available like Novell [PlateSpin Recon] to help automate and reduce the total number of hours spent per migration.
- System x KVM Solutions
Boy, I walked into this one. Many of IBM's cloud offerings are based on the Linux hypervisor called Kernel-based Virtual Machine [a href="http://www.linux-kvm.org/page/Main_Page">KVM] instead of VMware or Microsoft Hyper-V. However, this session was about the "other KVM": keyboard video and mouse switches, which thankfully, IBM has renamed to Console Managers to avoid confusion. Presenters Ben Hilmus (IBM) and Steve Hahn (Avocent) presented IBM's line of Local Console Managers (LCM) and Global Console Managers (GCM) products.
LCM are the traditional KVM switches that people are familiar with. A single keyboard, video and mouse can select among hundreds of servers to perform maintenance or check on status. GCM adds KVM-over-IP capabilities, which means that now you can access selected systems over the Ethernet from a laptop or personal computer. Both LCM and GCM allow for two-level tiering, which means that you can have an LCM in each rack, and an LCM or GCM that points to each rack, greatly increasing the number of servers that can be managed from a single pane of glass.
Many severs have a "service processor" to manage the rest of the machine. IBM RSA II, HP iLO, and Dell DRAC4 are some examples. These allow you to turn on and off selected servers. IBM BladeCenter offers an Management Module that allows the chassis to be connected to a Console Manager and select a specific blade server inside. These can also be used with VMware viewer, Virtual Network Computing (VNC), or Remote Desktop Protocol (RDP).
IBM's offerings are unique it that you can have an optical CD/DVD drive or USB external storage attached at the LCM or GCM, and make it look like the storage is attached to the selected server. This can be used to install or upgrade software, transfer log files, and so on. Another great use, and apparently the motivation for having this session in the "Federal Track", is that the USB can be used to attach a reader for a smart card, known as a Common Access Card [CAC] used by various government agencies. This provides two-factor authentication [TFA]. For example, to log into the system, you enter your password (something you know) and swipe your employee badge smart card (something you have). The combination are validated at the selected server to provide access.
I find it amusing that server people limit themselves to server sessions, and storage people to storage sessions. Sometimes, you have to step "outside your comfort zone" and learn something new, something different. Open your eyes and look around a bit. You might just be surprised what you find.
(FTC note: I work for IBM. IBM considers Novell a strategic Linux partner. Novell did not provide me a copy of Platespin Recon, I have no experience using it, and I mention it only in context of the presentation made. IBM resells Avocent solutions, and we use LCM gear in the Tucson Executive Briefing Center.)
technorati tags: IBM, Technical University, Grid, SOA, Cloud Computing, P2V, VMware, Novell, Platespin, x86, KVM, LCM, GCM, Avocent, CAC, TFA
Continuing my catch-up on past posts, Jon Toigo on his DrunkenData
blog, posted a ["bleg"
] for information aboutdeduplication. The responses come from the "who's who" of the storage industry, so I will provide IBM'sview. (Jon, as always, you have my permission to post this on your blog!)
- Please provide the name of your company and the de-dupe product(s) you sell. Please summarize what you think are the key values and differentiators of your wares.
IBM offers two different forms of deduplication. The first is IBM System Storage N series disk system with Advanced Single Instance Storage (A-SIS), and the second is IBM Diligent ProtecTier software. Larry Freeman from NetApp already explains A-SIS in the [comments on Jon's post], so I will focus on the Diligent offering in this post. The key differentiators for Diligent are:
- Data agnostic. Diligent does not require content-awareness, format-awareness nor identification of backup software used to send the data. No special client or agent software is required on servers sending data to an IBM Diligent deployment.
- Inline processing. Diligent does not require temporarily storing data on back-end disk to post-process later.
- Scalability. Up to 1PB of back-end disk managed with an in-memory dictionary.
- Data Integrity. All data is diff-compared for full 100 percent integrity. No data is accidentally discarded based on assumptions about the rarity of hash collisions.
- InfoPro has said that de-dupe is the number one technology that companies are seeking today — well ahead of even server or storage virtualization. Is there any appeal beyond squeezing more undifferentiated data into the storage junk drawer?
Diligent is focused on backup workloads, which has the best opportunity for deduplication benefits. The two main benefits are:
- Keeping more backup data available online for fast recovery.
- Mirroring the backup data to another remote location for added protection. With inline processing, only the deduplicated data is sent to the back-end disk, and this greatly reduces the amount of data sent over the wire to the remote location.
- Every vendor seems to have its own secret sauce de-dupe algorithm and implementation. One, Diligent Technologies (just acquired by IBM), claims that their’s is best because it collapses two functions — de-dupe then ingest — into one inline function, achieving great throughput in the process. What should be the gating factors in selecting the right de-dupe technology?
As with any storage offering, the three gating factors are typically:
- Will this meet my current business requirements?
- Will this meet my future requirements for the next 3-5 years that I plan to use this solution?
- What is the Total Cost of Ownership (TCO) for the next 3-5 years?
Assuming you already have backup software operational in your existing environment, it is possible to determine thenecessary ingest rate. How many "Terabytes per Hour" (TB/h) must be received, processed and stored from the backup software during the backup window. IBM intends to document its performance test results of specific software/hardwarecombinations to provide guidance to clients' purchase and planning decisions.
For post-process deployments, such as the IBM N series A-SIS feature, the "ingest rate" during the backup only has to receive and store the data, and the rest of the 24-hour period can be spent doing the post-processing to find duplicates. This might be fine now, but as your data grows, you might find your backup window growing, and that leaves less time for post-processing to catch up. IBM Diligent does the processing inline, so is unaffected by an expansion of the backup window.
IBM Diligent can scale up to 1PB of back-end data, and the ingest rate does not suffer as more data is managed.
As for TCO, post-process solutions must have additional back-end storage to temporarily hold the data until the duplicates can be found. With IBM Diligent's inline methodology, only deduplicated data is stored, so less disk space is required for the same workloads.
- Despite the nuances, it seems that all block level de-dupe technology does the same thing: removes bit string patterns and substitutes a stub. Is this technically accurate or does your product do things differently?
IBM Diligent emulates a tape library, so the incoming data appears as files to be written sequentially to tape. A file is a string of bytes. Unlike block-level algorithms that divide files up into fixed chunks, IBM Diligent performs diff-compares of incoming data with existing data, and identifies ranges of bytes that duplicate what already is stored on the back-end disk. The file is then a sequence of "extents" representing either unique data or existing data. The file is represented as a sequence of pointers to these extents. An extent can vary from2KB to 16MB in size.
- De-dupe is changing data. To return data to its original state (pre-de-dupe) seems to require access to the original algorithm plus stubs/pointers to bit patterns that have been removed to deflate data. If I am correct in this assumption, please explain how data recovery is accomplished if there is a disaster. Do I need to backup your wares and store them off site, or do I need another copy of your appliance or software at a recovery center?
For IBM Diligent, all of the data needed to reconstitute the data is stored on back-end disks. Assuming that all of your back-end disks are available after the disaster, either the original or mirrored copy, then you only need the IBM Diligent software to make sense of the bytes written to reconstitute the data. If the data was written by backup software, you would also need compatible backup software to recover the original data.
- De-dupe changes data. Is there any possibility that this will get me into trouble with the regulators or legal eagles when I respond to a subpoena or discovery request? Does de-dupe conflict with the non-repudiation requirements of certain laws?
I am not a lawyer, and certainly there are aspects of[non-repudiation] that may or may not apply to specific cases.
What I can say is that storage is expected to return back a "bit-perfect" copy of the data that was written. Thereare laws against changing the format. For example, an original document was in Microsoft Word format, but is converted and saved instead as an Adobe PDF file. In many conversions, it would be difficult to recreate the bit-perfect copy. Certainly, it would be difficult to recreate the bit-perfect MS Word format from a PDF file. Laws in France and Germany specifically require that the original bit-perfect format be kept.
Based on that, IBM Diligent is able to return a bit-perfect copy of what was written, same as if it were written to regular disk or tape storage, because all data is diff-compared byte-for-byte with existing data.
In contrast, other solutions based on hash codes have collisions that result in presenting a completely different set of data on retrieval. If the data you are trying to store happens to have the same hash code calculation as completely different data already stored on a solution, then it might just discard the new data as "duplicate". The chance for collisions might be rare, but could be enough to put doubt in the minds of a jury. For this reason, IBM N series A-SIS, that does perform hash code calculations, will do a full byte-for-byte comparison of data to ensure that data is indeed a duplicate of an existing block stored.
- Some say that de-dupe obviates the need for encryption. What do you think?
I disagree. I've been to enough [Black Hat] conferences to know that it would be possible to read thedata off the back-end disk, using a variety of forensic tools, and piece together strings of personal information,such as names, social security numbers, or bank account codes.
Currently, IBM provides encryption on real tape (both TS1120 and LTO-4 generation drives), and is working withopen industry standards bodies and disk drive module suppliers to bring similar technology to disk-based storage systems.Until then, clients concerned about encryption should consider OS-based or application-based encryption from thebackup software. IBM Tivoli Storage Manager (TSM), for example, can encrypt the data before sending it to the IBMDiligent offering, but this might reduce the number of duplicates found if different encryption keys are used.
- Some say that de-duped data is inappropriate for tape backup, that data should be re-inflated prior to write to tape. Yet, one vendor is planning to enable an “NDMP-like” tape backup around his de-dupe system at the request of his customers. Is this smart?
Re-constituting the data back to the original format on tape allows the original backup software to interpret the tape data directly to recover individual files. For example, IBM TSM software can write its primary backup copies to an IBM Diligent offering onsite, and have a "copy pool" on physical tape stored at a remote location. The physical tapes can be used for recovery without any IBM Diligent software in the event of a disaster. If the IBM Diligent back-end disk images are lost, corrupted, or destroyed, IBM TSM software can point to the "copy pool" and be fully operational. Individual files or servers could be restored from just a few of these tapes.
An NDMP-like tape backup of a deduplicated back-end disk would require that all the tapes are in-tact, available, and fully restored to new back-end disk before the deduplication software could do anything. If a single cartridge fromthis set was unreadable or misplaced, it might impact the access to many TBs of data, or render the entire systemunusable.
In the case of a 1PB of back-end disk for IBM Diligent, you would be having to recover over a thousand tapes back to disk before you could recover any individual data from your backup software. Even with dozens of tape drives in parallel, could take you several days for the complete process.This represents a longer "Recovery Time Objective" (RTO) than most people are willing to accept.
- Some vendors are claiming de-dupe is “green” — do you see it as such?
Certainly, "deduplicated disk" is greener than "non-deduplicated" disk, but I have argued in past posts, supportedby Analyst reports, that it is not as green as storing the same data on "non-deduplicated" physical tape.
- De-dupe and VTL seem to be joined at the hip in a lot of vendor discussions: Use de-dupe to store a lot of archival data on line in less space for fast retrieval in the event of the accidental loss of files or data sets on primary storage. Are there other applications for de-duplication besides compressing data in a nearline storage repository?
Deduplication can be applied to primary data, as in the case of the IBM System Storage N series A-SIS. As Larrysuggests, MS Exchange and SharePoint could be good use cases that represent the possible savings for squeezing outduplicates. On the mainframe, many master-in/master-out tape applications could also benefit from deduplication.
I do not believe that deduplication products will run efficiently with “update in place” applications, that is high levels of random writes for non-appending updates. OLTP and Database workloads would not benefit from deduplication.
- Just suggested by a reader: What do you see as the advantages/disadvantages of software based deduplication vs. hardware (chip-based) deduplication? Will this be a differentiating feature in the future… especially now that Hifn is pushing their Compression/DeDupe card to OEMs?
In general, new technologies are introduced on software first, and then as implementations mature, get hardware-based to improve performance. The same was true for RAID, compression, encryption, etc. The Hifn card does "hash code" calculations that do not benefit the current IBM Diligent implementation. Currently, IBM Diligent performsLZH compression through software, but certainly IBM could provide hardware-based compression with an integrated hardware/software offering in the future. Since IBM Diligent's inline process is so efficient, the bottleneck in performance is often the speed of the back-end disk. IBM Diligent can get improved "ingest rate" using FC instead of SATA disk.
Sorry, Jon, that it took so long to get back to you on this, but since IBM had just acquired Diligent when you posted, it took me a while to investigate and research all the answers.
technorati tags: IBM, Diligent, Jon Toigo, DrunkenData, bleg, deduplication, A-SIS, NetApp, ProtecTier, inline, post-process, back-end, disk, data integrity, hash, collision, ingest rate, VTL, non-repudiation, extent, bit-perfect, Microsoft Word, Adobe PDF, diff, Black Hat, encryption, compression, Hifn, FC, SATA