The technology industry is full of trade-offs. Take for example solar cells that convert sunlight to electricity. Every hour, more energy hits the Earth in the form of sunlight than the entire planet consumes in an entire year. The general trade-off is between energy conversion efficiency versus abundance of materials:
- Get 9-11 percent efficiency using rare materials like indium (In), gallium (Ga) or cadmium (Cd).
- Get only 6.7 percent efficiency using abundant materials like copper (Cu), tin (Sn), zinc (Zn), sulfur (S), and selenium (Se)
IBM has eliminated this trade-off with a record-setting breakthrough last week, demonstrating 9.6 percent efficiency [thin film solar cells using earth-abundant materials].
A second trade-off is exemplified by EMC's recent GeoProtect announcement. This appears similar to the geographic dispersal method introduced by a company called [CleverSafe]. The trade-off is between the amount of space to store one or more copies of data and the protection of data in the event of disaster. Here's an excerpt from fellow blogger Chuck Hollis (EMC) titled ["Cloud Storage Evolves"]:
"Imagine a average-sized Atmos network of 9 nodes, all in different time zones around the world. And imagine that we were using, say, a 6+3 protection scheme.
The implication is clear: any 3 nodes could be completely lost: failed, destroyed, seized by the government, etc.
-- and the information could be completely recovered from the surviving nodes."
For organizations worried about their information falling into the wrong hands (whether criminal or government sponsored!), any subset of the nodes would yield nothing of value -- not only would the information be presumably encrypted, but only a few slices of a far bigger picture would be lost.
Seized by the government? falling into the wrong hands? Is EMC positioning ATMOS as "Storage for Terrorists"? I can certainly appreciate the value of being able to protect 6PB of data with only 9PB of storage capacity, instead of keeping two copies of 6PB each, the trade-off means that you will be accessing the majority of your data across your intranet, which could impact performance. But, if you are in an illicit or illegal business that could have a third of your facilities "seized by the government", then perhaps you shouldn't house your data centers there in the first place. Having two copies of 6PB each, in two "friendly nations", might make more sense.
(In reality, companies often keep way more than just two copies of data. It is not unheard of for companies to keep three to five copies scattered across two or three locations. Facebook keeps SIX copies of photographs you upload to their website.)
ChuckH argues that the governments that seize the three nodes won't have a complete copy of the data. However, merely having pieces of data is enough for governments to capture terrorists. Even if the striping is done at the smallest 512-byte block level, those 512 bytes of data might contain names, phone numbers, email addresses, credit cards or social security numbers. Hackers and computer forensics professionals take advantage of this.
You might ask yourself, "Why not just encrypt the data instead?" That brings me to the third trade-off, protection versus application performance. Over the past 30 years, companies had a choice, they could encrypt and decrypt the data as needed, using server CPU cycles, but this would slow down application processing. Every time you wanted to read or update a database record, more cycles would be consumed. This forced companies to be very selective on what data they encrypted, which columns or fields within a database, which email attachments, and other documents or spreadsheets.
An initial attempt to address this was to introduce an outboard appliance between the server and the storage device. For example, the server would write to the appliance with data in the clear, the appliance would encrypt the data, and pass it along to the tape drive. When retrieving data, the appliance would read the encrypted data from tape, decrypt it, and pass the data in the clear back to the server. However, this had the unintended consequences of using 2x to 3x more tape cartridges. Why? Because the encrypted data does not compress well, so tape drives with built-in compression capabilities would not be able to shrink down the data onto fewer tapes.
(I covered the importance of compressing data before encryption in my previous blog post
[Sock Sock Shoe Shoe].)
Like the trade-off between energy efficiency and abundant materials, IBM eliminated the trade-off by offering compression and encryption on the tape drive itself. This is standard 256-bit AES encryption implemented on a chip, able to process the data as it arrives at near line speed. So now, instead of having to choose between protecting your data or running your applications with acceptable performance, you can now do both, encrypt all of your data without having to be selective. This approach has been extended over to disk drives, so that disk systems like the IBM System Storage DS8000 and DS5000 can support full-disk-encryption [FDE] drives.
Certainly, something to think about!
technorati tags: , sunlight, solar cells, electricity, indium, gallium, cadmium, copper, tin, zinc, sulfur, selenium, thin+film, efficiency, EMC, Chuck Hollis, GeoProtect, Cleversafe, governement, seizure, Facebook, terrorists, encryption, forensics, hackers, protection, performance, disk, tape
Have you ever noticed that sometimes two movies come out that seem eerily similar to each other, released by different studios within months or weeks of each other? My sister used to review film scripts for a living, she would read ten of them and have to pick her top three favorites, and tells me that scripts for nearly identical concepts came all the time. Here are a few of my favorite examples:
- 1994: [Wyatt Earp] and [Tombstone] were Westerns recounting the famed gunfight at the O.K. Corral. Tombstone, Arizona is near Tucson, and the gunfight is recreated fairly often for tourists.
- 1998: [Armageddon] and [Deep Impact] were a pair of disaster movies dealing with a large rock heading to destroy all life on earth. I was in Mazatlan, Mexico to see the latter, dubbed in Spanish as "Impacto Profundo".
- 1998: [A Bug's Life] and [Antz] were computer-animated tales of the struggle of one individual ant in an ant colony.
- 2000: [Mission to Mars] and [Red Planet] were sci-fi pics exploring what a manned mission to our neighboring planet might entail.
- 2009: [Paul Blart: Mall Cop] and [Observe and Report] were comedies dealing with challenges of security at a shopping mall.
(I think I made my point with just a few examples. A more complete list can be found on [Sam Greenspan's 11 Points website].)
This is different than copy-cat movies that are re-made or re-imagined many years later based on the previous successes of an original. Ever since my blog post [VPLEX: EMC's Latest Wheel is Round] in 2010 comparing EMC's copy-cat product that came our seven years after IBM's SAN Volume Controller (SVC), I've noticed EMC doesn't talk about VPLEX that much anymore.
This week, IBM announced [XIV Gen3 Solid-State Drive support] and our friends over at EMC announced [VFCache SSD-based PCIe cards]. Neither of these should be a surprise to anyone who follows the IT industry, as IBM had announced its XIV Gen3 as "SSD-Ready" last year specifically for this purpose, and EMC has been touting its "Project Lightning" since last May.
Fellow blogger Chris Mellor from The Register has a series of articles to cover this, including [EMC crashes the server flash party], [NetApp slaps down Lightning with multi-card Flash flush], [HP may be going the server flash route], and [Now HDS joins the server flash party].
Fellow blogger Chuck Hollis from EMC has a blog post [VFCache means Very Fast Cache indeed] that provides additional detail. Chuck claims the VFCache is faster than popular [Fusion-IO PCIe cards] available for IBM servers. I haven't seen the performance spec sheets, but typically SSD is four to five times slower than the DRAM cache used in the XIV Gen3. The VFCache's SSD is probably similar in performance to the SSD supported in the IBM XIV Gen3, DS8000, DS5000, SVC, N series, and Storwize V7000 disk systems.
Nonetheless, I've been asked my opinions on the comparison between these two announcements, as they both deal with improving application performance through the use of Solid-State Drives as an added layer of read cache.
(FTC Disclosure: I am both a full-time employee and stockholder of the IBM Corporation. The U.S. Federal Trade Commission may consider this blog post as a paid celebrity endorsement of IBM servers and storage systems. This blog post is based on my interpretation and opinions of publicly-available information, as I have no hands-on access to any of these third-party PCIe cards. I have no financial interest in EMC, Fusion-IO, Texas Memory Systems, or any other third party vendor of PCIe cards designed to fit inside IBM servers, and I have not been paid by anyone to mention their name, brands or products on this blog post.)
The solutions are different in that IBM XIV Gen3 the SSD is "storage-side" in the external storage device, and EMC VFCache is "server-side" as a PCI Express [PCIe] card. Aside from that, both implement SSD as an additional read cache layer in front of spinning disk to boost performance. Neither is an industry first, as IBM has offered server-side SSD since 2007, and IBM and EMC have offered storage-side SSD in many of their other external storage devices. The use of SSD as read cache has already been available in IBM N series using [Performance Accelerator Module (PAM)] cards.
IBM has offered cooperative caching synergy between its servers and its storage arrays for some time now. The predecessor to today's POWER7-based were the iSeries i5 servers that used PCI-X IOP cards with cache to connect i5/OS applications to IBM's external disk and tape systems. To compete in this space, EMC created their own PCI-X cards to attach their own disk systems. In 2006, IBM did the right thing for our clients and fostered competition by entering in a [Landmark agreement] with EMC to [license the i5 interfaces]. Today, VIOS on IBM POWER systems allows a much broader choice of disk options for IBM i clients, including the IBM SVC, Storwize V7000 and XIV storage systems.
EMC is not the first to manufacture an SSD-based PCIe card. Last summer, my friends at Texas Memory Systems [TMS] gave away a [RAMsan-70 PCIe card] at an after-party on [Day 2 of the IBM System Storage University].
Can a little SSD really help performance? Yes! An IBM client running a [DB2 Universal Database] cluster across eight System x servers was able to replace an 800-drive EMC Symmetrix by putting eight SSD Fusion-IO cards in each server, for a total of 64 Solid-State drives, saving money and improving performance. DB2 has the Data Partitioning Feature that has multi-system DB2 configurations using a Grid-like architecture similar to how XIV is designed. Most IBM System x and BladeCenter servers support internal SSD storage options, and many offer PCIe slots for third-party SSD cards. Sadly, you can't do this with a VFCache card, since you can have only one VFCache card in each server, the data is unprotected, and only for ephemeral data like transaction logs or other temporary data. With multiple Fusion-IO cards in an IBM server, you can configure a RAID rank across the SSD, and use it for persistent storage like DB2 databases.
Here then is my side-by-side comparison:
|Category||EMC VFCache||IBM XIV Gen3 SSD Caching|
|Servers supported||Selected x86-based models of Cisco UCS, Dell PowerEdge, HP ProLiant DL, and IBM xSeries and System x servers||All of these, plus any other blade or rack-optimized server currently supported by XIV Gen3, including Oracle SPARC, HP Titanium, IBM POWER systems, and even IBM System z mainframes running Linux|
|Operating System support||Linux RHEL 5.6 and 5.7, VMware vSphere 4.1 and 5.0, and Windows 2008 x64 and R2.||All of these, plus all the other operating systems supported by XIV Gen3, including AIX, IBM i, Solaris, HP-UX, and Mac OS X|
|Protocol support||FCP||FCP and iSCSI|
|Vendor-supplied driver required on the server||Yes, the VFCache driver must be installed to use this feature.||No, IBM XIV Gen3 uses native OS-based multi-pathing drivers.|
|External disk storage systems required||None, it appears the VFCache has no direct interaction with the back-end disk array, so in theory the benefits are the same whether you use this VFCache card in front of EMC storage or IBM storage||XIV Gen3 is required, as the SSD slots are not available on older models of IBM XIV.|
|Shared disk support||No, VFCache has to be disabled and removed for vMotion to take place.||Yes! XIV Gen3 SSD caching shared disk supports VMware vMotion and Live Partition Mobility.|
|Support for multiple servers||No||An advantage of the XIV Gen3 SSD caching approach is that the cache can be dynamically allocated to the busiest data from any server or servers.|
|Support for active/active server clusters||No||Yes!|
|Aware of changes made to back-end disk||No, it appears the VFCache has no direct interaction with the back-end disk array, so any changes to the data on the box itself are not communicated back to the VFCache card itself to invalidate the cache contents.||Yes!|
|Sequential-access detection||None identified. However, VFCache only caches blocks 64KB or smaller, so any sequential processing with larger blocks will bypass the VFCache.||Yes! XIV algorithms detect sequential access and avoid polluting the SSD with these blocks of data.|
|Number of SSD supported||One, which seems odd as IBM supports multiple Fusion-IO cards for its servers. However, this is not really a single point of failure (SPOF) as an application experiencing a VFCache failure merely drops down to external disk array speed, no data is lost since it is only read cache.||6 to 15 (one per XIV module) for high availability.|
|Pin data in SSD cache||Yes, using split-card mode, you can designate a portion of the 300GB to serve as Direct-attached storage (DAS). All data written to the DAS portion will be kept in SSD. However, since only one card is supported per server and the data is unprotected, this should only be used for ephemeral data like logs and temp files.||No, there is no option to designate an XIV Gen3 volume to be SSD-only. Consider using Fusion-IO PCIe card as a DAS alternative, or another IBM storage system for that requirement.|
|Pre-sales Estimating tools||None identified||Yes! CDF and Disk Magic tools are available to help cost-justify the purchase of SSD based on workload performance analysis.|
IBM has the advantage that it designs and manufactures both servers and storage, and can design optimal solutions for our clients in that regard.
technorati tags: IBM, XIV, Gen3, SSD, cache, EMC, VFCache, Project Lightning, SVC, Solid State Drives, Fusion-IO, Texas Memory Systems, RAMSan, System+x, POWER systems, VIOS, DRAM, VMware, Vmotion, Live Partition Mobility, AIX, IBM i, PCIe, PCI-X
Has EMC stooped so low that they have to resort to Hitachi math for their latest performance claims?
Readers might remember that just a few months ago, I had a blog post [Is this what HDS tells our mainframe clients?] pointing out the outlandish comparison Hitachi was using in their presentations. Their response was to cover it up, forcing me to follow up with my post [The Cover-up is worse than the original crime]. To their credit, they eventually removed the false and misleading information from their materials.
Now an avid reader of my blog has brought this to my attention. Apparently,
EMC has been showing customers a presentation
[Accelerating Storage Transformation with VMAX and VPLEX] with false and misleading comparison claims between IBM DS8000, HDS VSP and EMC VMAX 40K disk system performance.
(FTC Disclosure: This would be a good time to remind my readers that I work for IBM and own IBM stock. I do not endorse any of the EMC or HDS products mentioned in this post, and have no financial affiliation or investments directly with either EMC nor HDS. I am basing my information solely on the presentation posted on the internet and other sources publicly available, and not on any misrepresentations from EMC speakers at the various conferences where these charts might have been shown.)
The problem with misinformation is that it is not always obvious. The EMC presentation is quite pretty and professional-looking. It is the typical slick, attention-getting, low-content, over-simplified marketing puffery you have come to expect from EMC. There are two slides in particular that I have issue with.
This first graphic implies that IBM and HDS are nearly tied in performance, but that EMC VMAX 40K has nearly triple that bandwidth. Overall the slide has very little detail. That makes it difficult to determine what exactly is being claimed and whether a fair comparison is being made.
- The title claims that VMAX 40K is "#1 in High Bandwidth Apps". Only three disk systems are shown so the claim appears to be relative to only the three systems. The wording "High Bandwidth Apps" is confusing considering the cited numbers are for disk systems and no application is identified. By comparison, IBM SONAS can drive up to 105 GB/sec sequential bandwidth, nearly double what EMC claims for its VMAX 40K, so EMC is certainly not even close to #1.
- Is the workload random or sequential? That is not easy to determine. The use of "GB/s" along with the large block size of 128KB implies the I/O workload is sequential, which is great for some workloads like high performance computing, technical computing and video broadcasts. Random workloads, on the other hand, are usually measured in I/Os per second (IOPS) with a block size ranging 4KB to 64KB. (I am assuming the 128K blocks refers to 128KB block size, and not reading the same block of cache 128,000 times.)
- The slide states "Maximum Sustainable RRH Bandwidth 128K Blocks". The acronym "RRH" is not defined; but I suspect this refers to "random read hits". For random workloads, 100 percent random read hits from cache represents one corner of the infamous "four corners" test. Real-world workloads have a mix of reads and writes, and a mix of cache hits and cache misses. It is also unclear whether the hits are from standard data cache or from internal buffers in adapters (perhaps accessing the same blocks repeatedly) or something else. So is this really for a random workload, or a sequential workload?
(The term "Hitachi Math" was coined by an EMC blogger precisely to slam Hitachi Data Systems for their blatant use of four-corners results, claiming that spouting ridiculously large, but equally unrealistic, 100 percent random read hit results don't provide any useful information. I agree. There are much better industry-standard benchmarks available, such as SPC-1 for random workloads, SPC-2 for sequential workloads, and even benchmarks for specific applications, that represent real-world IT environments. To shame HDS for their use of four-corners results, only for EMC themselves to use similar figures in their own presentation is truly hypocritical of them!)
- The IBM system is identified as "DS8000". DS8000 is a generic family name that applies to multiple generations of systems first introduced in 2004. The specific model is not identified, but that is critical information. Is this a first generation DS8100, or the latest DS8800, or something in between?
The slide says "Full System Configs", but that is not defined and configuration details are not identified. Configuration details, also critical information in assessing system performance capabilities, are not specified. If the EMC box costs seven times more than IBM or HDS, would you really buy it to get 3x more performance? Is the EMC packed with the maximum amount of SSD? Were there any SSD in the IBM or HDS boxes to match?
The source of the claimed IBM DS8000 performance numbers is not identified. Did they run their own tests? While I cannot tell, the VMAX may have been configured with 64 Fibre Channel 8Gbps host connections. In that case each channel is theoretically capable of supporting about 800 MB/s at 100% channel utilization. Multiplying 64 x 800MB/s = 51.2GB/s, so did EMC just do the performance comparison on the back of a napkin, assuming there are no other bottlenecks in the system? Even then, I would not round up 51.2 to 52!
- Response times were not identified. For random I/Os, response time is a very important metric. It is possible that the Symmetrix was operating with some resources at 100% utilization to get the highest GB/s result, but that would likely make I/O response times unacceptable for real-world random I/O workloads.
IBM and HDS have both published Storage Performance Council [SPC] industry-standard performance benchmarks. EMC has not published any SPC benchmarks for VMAX systems. If EMC is interested in providing customers with audited, detailed performance information along with detailed configuration information, all based on benchmarks designed to represent real-world workloads, EMC can always publish SPC benchmark results as IBM and other vendors have done. In past blog fights, EMC resorts to the excuse that SPC isn't perfect, but can they really argue that vague and unrealistic claims cited in its presentation are better?
The second graphic is so absurd, you would think it came directly from Larry Ellison at an Oracle OpenWorld keynote session. EMC is comparing a configuration with VMAX 40K plus an EMC VFCache host-side flash memory cache card to a configuration with an IBM and HDS disk system without host-side flash memory cache also configured. The comparison is clearly apples-to-oranges. Other disk system configuration details are also omitted.
FAST VP is EMC's name for its sub-volume drive tiering feature, comparable to IBM Easy Tier and Hitachi's Dynamic Tiering. The graph implies that IBM and HDS can only achieve a modest increment improvement from their sub-volume tiering. I beg to differ. I have seen various cases where a small amount of SSD on IBM DS8000 series can drastically improve performance 200 to 400 percent.
The "DBClassify" shown on the graph is a tool run as part of an EMC professional services offering called Database Performance Tiering Assessment, makes recommendations for storing various database objects on different drive tiers based on object usage and importance. Do you really need to pay for professional services? With IBM Easy Tier, you just turn it on, and it works. No analysis required, no tools, no professional services, and no additional charge!
- VFCache is an optional product from EMC that currently has no integration whatsoever with VMAX. A fair comparison would have included a host-side flash memory cache (from any vendor) when the IBM or HDS storage system was configured. Or leave it out altogether and just focus on the sub-volume tiering comparison.
Keep in mind that EMC's VFCache supports only selected x86-based hosts. IBM has published a [Statement of Direction] indicating that it will also offer this for Power systems running AIX and Linux host-side flash memory cache integrated with DS8000 Easy Tier.
I feel EMC's claims about IBM DS8000 performance are vague and misleading. EMC appears to lack the kind of technical marketing integrity that IBM strives to attain.
Since EMC is not able or willing to publish fair and meaningful performance comparisons, it is up to me to set the record straight and point out EMC's failings in this matter.
Reminder: It's not to late to register for my Webcast "Solving the Storage Capacity Crisis" on Tuesday, September 25. See my blog post [Upcoming events in September] to register!
technorati tags: IBM, DS8000, DS8800, HDS, VSP, EMC, VMAX, Symmetrix, VFCache, Easy Tier, FAST VP
A long time ago, perhaps in the early 1990s, I was an architect on the component known today as DFSMShsm on z/OS mainframe operationg system. One of my job responsibilities was to attend the biannual [SHARE conference to listen to the requirements of the attendees on what they would like added or changed to the DFSMS, and ask enough questions so that I can accurately present the reasoning to the rest of the architects and software designers on my team. One person requested that the DFSMShsm RELEASE HARDCOPY should release "all" the hardcopy. This command sends all the activity logs to the designated SYSOUT printer. I asked what he meant by "all", and the entire audience of 120 some attendees nearly fell on the floor laughing. He complained that some clever programmer wrote code to test if the activity log contained only "Starting" and "Ending" message, but no error messages, and skip those from being sent to SYSOUT. I explained that this was done to save paper, good for the environment, and so on. Again, howls of laughter. Most customers reroute the SYSOUT from DFSMS from a physical printer to a logical one that saves the logs as data sets, with date and time stamps, so having any "skipped" leaves gaps in the sequence. The client wanted a complete set of data sets for his records. Fair enough.
When I returned to Tucson, I presented the list of requests, and the immediate reaction when I presented the one above was, "What did he mean by ALL? Doesn't it release ALL of the logs already?" I then had to recap our entire dialogue, and then it all made sense to the rest of the team. At the following SHARE conference six months later, I was presented with my own official "All" tee-shirt that listed, and I am not kidding, some 33 definitions for the word "all", in small font covering the front of the shirt.
I am reminded of this story because of the challenges explaining complicated IT concepts using the English language which is so full of overloaded words that have multiple meanings. Take for example the word "protect". What does it mean when a client asks for a solution or system to "protect my data" or "protect my information". Let's take a look at three different meanings:
- Unethical Tampering
The first meaning is to protect the integrity of the data from within, especially from executives or accountants that might want to "fudge the numbers" to make quarterly results look better than they are, or to "change the terms of the contract" after agreements have been signed. Clients need to make sure that the people authorized to read/write data can be trusted to do so, and to store data in Non-Erasable, Non-Rewriteable (NENR) protected storage for added confidence. NENR storage includes Write-Once, Read-Many (WORM) tape and optical media, disk and disk-and-tape blended solutions such as the IBM Grid Medical Archive Solution (GMAS) and IBM Information Archive integrated system.
- Unauthorized Access
The second meaning is to protect access from without, especially hackers or other criminals that might want to gather personally-identifiably information (PII) such as social security numbers, health records, or credit card numbers and use these for identity theft. This is why it is so important to encrypt your data. As I mentioned in my post [Eliminating Technology Trade-Offs], IBM supports hardware-based encryption FDE drives in its IBM System Storage DS8000 and DS5000 series. These FDE drives have an AES-128 bit encryption built-in to perform the encryption in real-time. Neither HDS or EMC support these drives (yet). Fellow blogger Hu Yoshida (HDS) indicates that their USP-V has implemented data-at-rest in their array differently, using backend directors instead. I am told EMC relies on the consumption of CPU-cycles on the host servers to perform software-based encryption, either as MIPS consumed on the mainframe, or using their Powerpath multi-pathing driver on distributed systems.
There is also concern about internal employees have the right "need-to-know" of various research projects or upcoming acquisitions. On SANs, this is normally handled with zoning, and on NAS with appropriate group/owner bits and access control lists. That's fine for LUNs and files, but what about databases? IBM's DB2 offers Label-Based Access Control [LBAC] that provides a finer level of granularity, down to the row or column level. For example, if a hospital database contained patient information, the doctors and nurses would not see the columns containing credit card details, the accountants would not see the columnts containing healthcare details, and the individual patients, if they had any access at all, would only be able to access the rows related to their own records, and possibly the records of their children or other family members.
- Unexpected Loss
The third meaning is to protect against the unexpected. There are lots of ways to lose data: physical failure, theft or even incorrect application logic. Whatever the way, you can protect against this by having multiple copies of the data. You can either have multiple copies of the data in its entirety, or use RAID or similar encoding scheme to store parts of the data in multiple separate locations. For example, with RAID-5 rank containing 6+P+S configuration, you would have six parts of data and one part parity code scattered across seven drives. If you lost one of the disk drives, the data can be rebuilt from the remaining portions and written to the spare disk set aside for this purpose.
But what if the drive is stolen? Someone can walk up to a disk system, snap out the hot-swappable drive, and walk off with it. Since it contains only part of the data, the thief would not have the entire copy of the data, so no reason to encrypt it, right? Wrong! Even with part of the data, people can get enough information to cause your company or customers harm, lose business, or otherwise get you in hot water. Encryption of the data at rest can help protect against unauthorized access to the data, even in the case when the data is scattered in this manner across multiple drives.
To protect against site-wide loss, such as from a natural disaster, fire, flood, earthquake and so on, you might consider having data replicated to remote locations. For example, IBM's DS8000 offers two-site and three-site mirroring. Two-site options include Metro Mirror (synchronous) and Global Mirror (asynchronous). The three-site is cascaded Metro/Global Mirror with the second site nearby (within 300km) and the third site far away. For example, you can have two copies of your data at site 1, a third copy at nearby site 2, and two more copies at site 3. Five copies of data in three locations. IBM DS8000 can send this data over from one box to another with only a single round trip (sending the data out, and getting an acknowledgment back). By comparison, EMC SRDF/S (synchronous) takes one or two trips depending on blocksize, for example blocks larger than 32KB require two trips, and EMC SRDF/A (asynchronous) always takes two trips. This is important because for many companies, disk is cheap but long-distance bandwidth is quite expensive. Having five copies in three locations could be less expensive than four copies in four locations.
Fellow blogger BarryB (EMC Storage Anarchist) felt I was unfair pointing out that their EMC Atmos GeoProtect feature only protects against "unexpected loss" and does not eliminate the need for encryption or appropriate access control lists to protect against "unauthorized access" or "unethical tampering".
(It appears I stepped too far on to ChuckH's lawn, as his Rottweiler BarryB came out barking, both in the [comments on my own blog post], as well as his latest titled [IBM dumbs down IBM marketing (again)]. Before I get another rash of comments, I want to emphasize this is a metaphor only, and that I am not accusing BarryB of having any canine DNA running through his veins, nor that Chuck Hollis has a lawn.)
As far as I know, the EMC Atmos does not support FDE disks that do this encryption for you, so you might need to find another way to encrypt the data and set up the appropriate access control lists. I agree with BarryB that "erasure codes" have been around for a while and that there is nothing unsafe about using them in this manner. All forms of RAID-5, RAID-6 and even RAID-X on the IBM XIV storage system can be considered a form of such encoding as well. As for the amount of long-distance bandwidth that Atmos GeoProtect would consume to provide this protection against loss, you might question any cost savings from this space-efficient solution. As always, you should consider both space and bandwidth costs in your total cost of ownership calculations.
Of course, if saving money is your main concern, you should consider tape, which can be ten to twenty times cheaper than disk, affording you to keep a dozen or more copies, in as many time zones, at substantially lower cost. These can be encrypted and written to WORM media for even more thorough protection.
If these three methods of protection sound familiar, I mentioned them in my post about [Pulse conference, Data Protection Strategies] back in May 2008.
Did IBM XIV force EMC's hand to announce VMAXe? Let's take a stroll down memory lane.
In 2008, IBM XIV showed the world that it could ship a Tier-1, high-end, enterprise-class system using commodity parts. Technically, prior to its acquisition by IBM, the XIV team had boxes out in production since 2005. EMC incorrectly argued this announcement meant the death of the IBM DS8000. Just because EMC was unable to figure out how to have more than one high-end disk product, doesn't mean IBM or other storage vendors were equally challenged. Both IBM XIV and DS8000 are Tier-1, high-end, enterprise-class storage systems, as are the IBM N series N7900 and the IBM Scale-Out Network Attached Storage (SONAS).
In April 2009, EMC followed IBM's lead with their own V-Max system, based on Symmetrix Engenuity code, but on commodity x86 processors. Nobody at EMC suggested that the V-Max meant the death of their other Symmetrix box, the DMX-4, which means that EMC proved to themselves that a storage vendor could offer multiple high-end disk systems. Hitachi Data Systems (HDS) would later offer the VSP, which also includes some commodity hardware as well.
In July 2009, analysts at International Technology Group published their TCO findings that IBM XIV was 63 percent less expensive than EMC V-Max, in a whitepaper titled [COST/BENEFIT CASE
FOR IBM XIV STORAGE SYSTEM Comparing Costs for IBM XIV and EMC V-Max Systems]. Not surprisingly, EMC cried foul, feeling that EMC V-Max had not yet been successful in the field, it was too soon to compare newly minted EMC gear with a mature product like XIV that had been in production accounts for several years. Big companies like to wait for "Generation 1" of any new product to mature a bit before they purchase.
To compete against IBM XIV's very low TCO, EMC was forced to either deeply discount their Symmetrix, or counter-offer with lower-cost CLARiiON, their midrange disk offering. An ex-EMCer that now works for IBM on the XIV sales team put it in EMC terms -- "the IBM XIV provides a Symmetrix-like product at CLARiiON-like prices."
(Note: Somewhere in 2010, EMC dropped the hyphen, changing the name from V-Max to VMAX. I didn't see this formally announced anywhere, but it seems that the new spelling is the officially correct usage. A common marketing rule is that you should only rename failed products, so perhaps dropping the hyphen was EMC's way of preventing people from searching older reviews of the V-Max product.)
This month, IBM introduced the IBM XIV Gen3 model 114. The analysts at ITG updated their analysis, as there are now more customers that have either or both products, to provide a more thorough comparison. Their latest whitepaper, titled [Cost/Benefit Case for IBM XIV Systems: Comparing Cost
Structures for IBM XIV and EMC VMAX Systems], shows that IBM maintains its substantial cost savings advantage, representing 69 percent less Total Cost of Ownership (TCO) than EMC, on average, over the course of three years.
In response, EMC announced its new VMAXe, following the naming convention EMC established for VNX and VNXe. Customers cannot upgrade VNXe to VNX, nor VMAXe to VMAX, so at least EMC was consistent in that regard. Like the IBM XIV and XIV Gen3, the new EMC VMAXe eliminated "unnecessary distractions" like CKD volumes and FICON attachment needed for the IBM z/OS operating system on IBM System z mainframes. Fellow blogger Barry Burke from EMC explains everything about the VMAXe in his blog post [a big thing in a small package].
So, you have to wonder, did IBM XIV force EMC's hand into offering this new VMAXe storage unit? Surely, EMC sales reps will continue to lead with the more profitable DMX-4 or VMAX, and then only offer the VMAXe when the prospective customer mentions that the IBM XIV Gen3 is 69 percent less expensive. I haven't seen any list or street prices for the VMAXe yet, but I suspect it is less expensive than VMAX, on a dollar-per-GB basis, so that EMC will not have to discount it as much to compete against IBM.
technorati tags: IBM, XIV, Gen3, EMC, DMX-4, VMAX, V-Max, HDS, N7900, SONAS, DS8000, CKD, FICON, TCO
“In times of universal deceit, telling the truth will be a revolutionary act.”
-- George Orwell
Well, it has been over two years since I first covered IBM's acquisition of the XIV company. Amazingly, I still see a lot of misperceptions out in the blogosphere, especially those regarding double drive failures for the XIV storage system. Despite various attempts to [explain XIV resiliency] and to [dispel the rumors], there are still competitors making stuff up, putting fear, uncertainty and doubt into the minds of prospective XIV clients.
Clients love the IBM XIV storage system! In this economy, companies are not stupid. Before buying any enterprise-class disk system, they ask the tough questions, run evaluation tests, and all the other due diligence often referred to as "kicking the tires". Here is what some IBM clients have said about their XIV systems:
“3-5 minutes vs. 8-10 hours rebuild time...”
-- satisfied XIV client
“...we tested an entire module failure - all data is re-distributed in under 6 hours...only 3-5% performance degradation during rebuild...”
-- excited XIV client
“Not only did XIV meet our expectations, it greatly exceeded them...”
-- delighted XIV client
In this blog post, I hope to set the record straight. It is not my intent to embarrass anyone in particular, so instead will focus on a fact-based approach.
- Fact: IBM has sold THOUSANDS of XIV systems
XIV is "proven" technology with thousands of XIV systems in company data centers. And by systems, I mean full disk systems with 6 to 15 modules in a single rack, twelve drives per module. That equates to hundreds of thousands of disk drives in production TODAY, comparable to the number of disk drives studied by [Google], and [Carnegie Mellon University] that I discussed in my blog post [Fleet Cars and Skin Cells].
- Fact: To date, no customer has lost data as a result of a Double Drive Failure on XIV storage system
This has always been true, both when XIV was a stand-alone company and since the IBM acquisition two years ago. When examining the resilience of an array to any single or multiple component failures, it's important to understand the architecture and the design of the system and not assume all systems are alike. At it's core, XIV is a grid-based storage system. IBM XIV does not use traditional RAID-5 or RAID-10 method, but instead data is distributed across loosely connected data modules which act as independent building blocks. XIV divides each LUN into 1MB "chunks", and stores two copies of each chunk on separate drives in separate modules. We call this "RAID-X".
Spreading all the data across many drives is not unique to XIV. Many disk systems, including EMC CLARiiON-based V-Max, HP EVA, and Hitachi Data Systems (HDS) USP-V, allow customers to get XIV-like performance by spreading LUNs across multiple RAID ranks. This is known in the industry as "wide-striping". Some vendors use the terms "metavolumes" or "extent pools" to refer to their implementations of wide-striping. Clients have coined their own phrases, such as "stripes across stripes", "plaid stripes", or "RAID 500". It is highly unlikely that an XIV will experience a double drive failure that ultimately requires recovery of files or LUNs, and is substantially less vulnerable to data loss than an EVA, USP-V or V-Max configured in RAID-5. Fellow blogger Keith Stevenson (IBM) compared XIV's RAID-X design to other forms of RAID in his post [RAID in the 21st Centure].
- Fact: IBM XIV is designed to minimize the likelihood and impact of a double drive failure
The independent failure of two drives is a rare occurrence. More data has been lost from hash collisions on EMC Centera than from double drive failures on XIV, and hash collisions are also very rare. While the published worst-case time to re-protect from a 1TB drive failure for a fully-configured XIV is 30 minutes, field experience shows XIV regaining full redundancy on average in 12 minutes. That is 40 times less likely than a typical 8-10 hour window for a RAID-5 configuration.
A lot of bad things can happen in those 8-10 hours of traditional RAID rebuild. Performance can be seriously degraded. Other components may be affected, as they share cache, connected to the same backplane or bus, or co-dependent in some other manner. An engineer supporting the customer onsite during a RAID-5 rebuild might pull the wrong drive, thereby causing a double drive failure they were hoping to avoid. Having IBM XIV rebuild in only a few minutes addresses this "human factor".
In his post [XIV drive management], fellow blogger Jim Kelly (IBM) covers a variety of reasons why storage admins feel double drive failures are more than just random chance. XIV avoids load stress normally associated with traditional RAID rebuild by evenly spreading out the workload across all drives. This is known in the industry as "wear-leveling". When the first drive fails, the recovery is spread across the remaining 179 drives, so that each drive only processes about 1 percent of the data. The [Ultrastar A7K1000] 1TB SATA disk drives that IBM uses from HGST have specified 1.2 million hours mean-time-between-failures [MTBF] would average about one drive failing every nine months in a 180-drive XIV system. However, field experience shows that an XIV system will experience, on average, one drive failure per 13 months, comparable to what companies experience with more robust Fibre Channel drives. That's innovative XIV wear-leveling at work!
- Fact: In the highly unlikely event that a DDF were to occur, you will have full read/write access to nearly all of your data on the XIV, all but a few GB.
Even though it has NEVER happened in the field, some clients and prospects are curious what a double drive failure on an XIV would look like. First, a critical alert message would be sent to both the client and IBM, and a "union list" is generated, identifying all the chunks in common. The worst case on a 15-module XIV fully loaded with 79TB data is approximately 9000 chunks, or 9GB of data. The remaining 78.991 TB of unaffected data are fully accessible for read or write. Any I/O requests for the chunks in the "union list" will have no response yet, so there is no way for host applications to access outdated information or cause any corruption.
(One blogger compared losing data on the XIV to drilling a hole through the phone book. Mathematically, the drill bit would be only 1/16th of an inch, or 1.60 millimeters for you folks outside the USA. Enough to knock out perhaps one character from a name or phone number on each page. If you have ever seen an actor in the movies look up a phone number in a telephone booth then yank out a page from the phone book, the XIV equivalent would be cutting out 1/8th of a page from an 1100 page phone book. In both cases, all of the rest of the unaffected information is full accessible, and it is easy to identify which information is missing.)
If the second drive failed several minutes after the first drive, the process for full redundancy is already well under way. This means the union list is considerably shorter or completely empty, and substantially fewer chunks are impacted. Contrast this with RAID-5, where being 99 percent complete on the rebuild when the second drive fails is just as catastrophic as having both drives fail simultaneously.
- Fact: After a DDF event, the files on these few GB can be identified for recovery.
Once IBM receives notification of a critical event, an IBM engineer immediately connects to the XIV using remote service support method. There is no need to send someone physically onsite, the repair actions can be done remotely. The IBM engineer has tools from HGST to recover, in most cases, all of the data.
Any "union" chunk that the HGST tools are unable to recover will be set to "media error" mode. The IBM engineer can provide the client a list of the XIV LUNs and LBAs that are on the "media error" list. From this list, the client can determine which hosts these LUNs are attached to, and run file scan utility to the file systems that these LUNs represent. Files that get a media error during this scan will be listed as needing recovery. A chunk could contain several small files, or the chunk could be just part of a large file. To minimize time, the scans and recoveries can all be prioritized and performed in parallel across host systems zoned to these LUNs.
As with any file or volume recovery, keep in mind that these might be part of a larger consistency group, and that your recovery procedures should make sense for the applications involved. In any case, you are probably going to be up-and-running in less time with XIV than recovery from a RAID-5 double failure would take, and certainly nowhere near "beyond repair" that other vendors might have you believe.
- Fact: This does not mean you can eliminate all Disaster Recovery planning!
To put this in perspective, you are more likely to lose XIV data from an earthquake, hurricane, fire or flood than from a double drive failure. As with any unlikely disaster, it is best to have a disaster recovery plan than to hope it never happens. All disk systems that sit on a single datacenter floor are vulnerable to such disasters.
For mission-critical applications, IBM recommends using disk mirroring capability. IBM XIV storage system offers synchronous and asynchronous mirroring natively, both included at no additional charge.
For more about IBM XIV reliability, read this whitepaper [IBM XIV© Storage System: Reliability Reinvented]. To find out why so many clients LOVE their XIV, contact your local IBM storage sales rep or IBM Business Partner.
technorati tags: IBM, XIV, DDF, RAID-5, RAID-10, RAID-X, RAID-6, RAID-DP, HP, EVA, HDS, USP-V, EMC, CLARiiON, V-Max, Disaster Recovery, HGST, UltraStar, A7K1000
This week, I am in beautiful Sao Paulo, Brazil, teaching Top Gun class to IBM Business Partners and sales reps. Traditionally, we have "Tape Thursday" where we focus on our tape systems, from tape drives, to physical and virtual tape libraries. IBM is the number #1 tape vendor, and has been for the past eight years.
(The alliteration doesn't translate well here in Brazil. The Portuguese word for tape is "fita", and Thursday here is "quinta-feira", but "fita-quinta-feira" just doesn't have the same ring to it.)
In the class, we discussed how to handle common misperceptions and myths about tape. Here are a few examples:
- Myth 1: Tape processing is manually intensive
In my July 2007 blog post [Times a Million], I coined the phrase "Laptop Mentality" to describe the problem most people have dealing with data center decisions. Many folks extend linearly their experiences using their PCs, workstations or laptops to apply to the data center, unable to comprehend large numbers or solutions that take advantage of the economies of scale.
For many, the only experience dealing with tape was manual. In the 1980s, we made "mix tapes" on little cassettes, and in the 1990s we recorded our favorite television shows on VHS tapes in the VCR. Today, we have playlists on flash or disk-based music players, and record TV shows on disk-based video recorders like Tivo. The conclusion is that tapes are manual, and disk are not.
Manual processing of tapes ended in 1987, with the introduction of a silo-like tape library from StorageTek. IBM quickly responded with its own IBM 3495 Tape Library Data Server in 1992. Today, clients have many tape automation choices, from the smallest IBM TS2900 Tape Autoloader that has one drive and nine cartridges, all the way to the largest IBM TS3500 multiple-library shuttle complex that can hold exabytes of data. These tape automation systems eliminate most of the manual handling of cartridges in day-to-day operations.
- Myth 2: Tape media is less reliable than disk media
For any storage media to be unreliable is to return the wrong information that is different than what was originally stored. There are only two ways for this to happen: if you write a "zero" but read back a "one", or write a "one" and read a "zero". This is called a bit error. Every storage media has a "bit error rate" that is the average likelihood for some large amount of data written.
According to the latest [LTO Bit Error rates, 2012 March], today's tape expects only 1 bit error per 10E17 bits written (about 100 Petabytes). This is 10 times more reliable than Enterprise SAS disk (1 bit per 10E16), and 100 times more reliable than Enterprise-class SATA disk (1 bit per 10E15).
Tape is the media used in "black boxes" for airplanes. When an airplane crashes, the black box is retrieved and used to investigate the causes of the crash. In 1986, the Space Shuttle Challenger exploded 73 seconds after take-off. The tapes in the black box sat on the ocean floor for six weeks before being recovered. Amazingly, IBM was able to successfully restore [90 percent of the block data, and 100 percent of voice data].
- Myth 3: Most tape restores fail
Why do people still believe that most tape restores fail? Curtis Preston, on his Backup Central blog, has a great post [Gartner Never Said 71 percent of Tape Restores Fail].
Analysts are quite upset when they are quoted out of context, but in this case, Gartner never said anything closely similar to this. Nor did the other analysts that Curtis investigated for similar claims. What Garnter did say was that disk provides an attractive alternative storage media for backup which can increase the performance of the recovery process.
Back in the 1990s, Savur Rao and I developed a patent to help backup DB2 for z/OS by using the FlashCopy feature of IBM's high-end disk system. The software method to coordinate the FlashCopy snapshots with the database application and maintain multiple versions was implemented in the DFSMShsm component of DFSMS. A few years later, this was part of a set of patents IBM cross-licensed to Microsoft for them to implement a similar software for Windows called Data Protection Manager (DPM). IBM has since introduced its own version for distributed systems called IBM Tivoli FlashCopy Manager that runs not just on Windows, but also AIX, Linux, HP-UX and Solaris operating systems.
Curtis suspects the "71 percent" citation may have been propogated by an ambitious product manager of Microsoft's Data Protection Manager, back in 2006, perhaps to help drive up business to their new disk-based backup product. Certainly, Microsoft was not the only vendor to disparage tape in this manner.
A few years ago, an [EMC failure brought down the State of Virginia] due to not just a component failure it its production disk system, but then made it worse by failing to recover from the disk-based remote mirror copy. Fortunately, the data was able to be restored from tape over the next four days. If you wonder why nobody at EMC says "Tape is Dead" anymore, perhaps it is because tape saved their butts that week.
(FTC Disclosure: I work for IBM and this post can be considered a paid, celebrity endorsement for all of the IBM tape and software products mentioned on this post. I own shares of stock in both IBM and Google, and use Google's Gmail for my personal email, as well as many other Google services. While IBM, Google and Microsoft can be considered competitors to each other in some areas, IBM has working relationships with both companies on various projects. References in this post to other companies like EMC are merely to provide illustrative examples only, based on publicly available information. IBM is part of the Linear Tape Open (LTO) consortium.)
Last year, Google lost the email data for half a million Gmail accounts due to a software error. Once again, tape came to the rescue, with [Google restoring lost Gmail data from tape backups].
- Myth 4: Vendors and Manufacturers are no longer investing in tape technology
IBM and others are still investing Research and Development (R&D) dollars to improve tape technology. What people don't realize is that much of the R&D spent on magnetic media can be applied across both disk and tape, such as IBM's development of the Giant Magnetoresistance read/write head, or [GMR] for short.
Most recently, IBM made another major advancement with tape with the introduction of the Linear Tape File Systems (LTFS). This allows greater portability to share data between users, and between companies, but treating tape cartridges much like USB memory sticks or pen drives. You can read more in my post [IBM and Fox win an Emmy for LTFS technology]!
Next month, IBM celebrates the 60th anniversary for tape. It is good to see that tape continues to be a vibrant part of the IT industry, and to IBM's storage business!
technorati tags: IBM, Google, Microsoft, EMC, Brazil, LTO, TS2900, TS3500, Space Shuttle, Challenger
Normally, when EMC fails, it is worth a giggle. Companies are run by humans, and nobody is perfect. However, their latest one, failing to defend their RSA SecurID two-factor website, is no laughing matter. Breaches like this undermine the trust needed for business and commerce to be done with Information Technology, so it affects the entire IT industry.
(FTC Disclosure: I do not work or have any financial investments in either EMC nor ENC Security Systems. Neither EMC nor ENC Security Systems paid me to mention them on this blog. Their mention in this blog is not an endorsement of either company or their products. Information about EMC was based solely on publicly available information made available by EMC and others. My friends at ENC Security Systems provided me an evaluation license for their latest software release so that I could confirm the use cases posed in this post.)
Of course, EMC did the right thing by making this breach public in an [Open Letter to RSA Customers]. While this may affect their revenues, as clients question whether they should do business with EMC, or affect their stock price, as investors question whether they should invest in EMC, they were very clear and public that the breach occurred. As far as I know, none of the executives of the RSA security division have stepped down. The disclosure of the breach was the right thing to do, and required by law from the [US Securities Exchange Commission]. This law was created to prevent companies from trying to hide breaches that expose external client information.
The breach does not affect RSA public/private key pairs used by IBM and most every other large company. Rather, this breach was targeted to RSA SecurID two-factor authentication. I explained two-factor authentication in my blog post [Day 5 Grid, SOA and Cloud Computing - System x KVM solutions], but basically it is an added level of security, requiring something you know (your password) with something you have (such as a magnetic card or key fob). Both are required to gain access to the system.
Breaches happen. Recently, [Hackers found vulnerabilities in the McAfee.com website]. Last month, fellow blogger Chuck Hollis from EMC had a blog post on [Understanding Advanced Persistent Threats (APT)] in the week leading up to their RSA Conference. It was precisely an APT that hit RSA, so the irony of this breach was not lost on the blogosphere. Perhaps Chuck's blog post gave hackers the idea to do this, like saying "I hope terrorists don't bomb this building that hold all of our chemical weapons..." or "I hope bank robbers don't rob this repository where we keep all the cash..."
(The sinister counter-theory, that EMC staged this breach as a marketing stunt to undermine trust in hybrid or public cloud offerings, such as those offered by IBM, Amazon or Salesforce.com, offers an interesting twist. While computer breaches in general are fodder for [Luddites] to argue we should not use computers at all, this particular breach could be used by EMC salesmen to encourage their customers to choose private cloud over hybrid cloud or public cloud deployments. Given all the extra work that RSA SecurID customers have to now do to harden their environments, that would be in bad taste.)
Over on Mashable, Simon Crosby argues [Why the Cloud Is Actually the Safest Place for Your Data]. I am sure we have not heard the last of the implications of this RSA breach. For now, I have two recommendations for you.
- Validate Backup Methodology
Today, March 31, is World Backup Day. This is because many viruses are triggered to operate on April 1. Just like checking the batteries in your smoke alarms every year, you should ensure that your backup methodology remains valid.
Back in 2008, I was a volunteer for the One Laptop Per Child (OLPC) initiative, and built an XS server to be used for Uruguay. I shipped [this baby off to school] to be the central server that all the student and teacher laptops connected to. It was the gateway to the Internet, as well as the [repository for the blogs of each student]. The blogs were accessible to the public, so that parents could read what their students were writing.
Unfortunately, this public access resulted in my little XS server being attacked by hackers, with IP addresses in Russia and China. Why anyone from either of those two countries wanted to ruin the hopes and dreams of small school children in Uruguay was beyond me. Fortunately, I had planned for remote administration. Backups were taken by me weekly to a second drive that was only mounted when I was dialed in to take the backup. The rest of the time, it was offline, so as not to be written to by hackers.
I also shipped along with the server a bootable DVD that contained a modified version of [System Rescue CD], scripts to start up SSHD daemon, and pre-populated for use with public/private RSA keys for me and eight other administrators located in various countries. To effect repairs, the local operator would reboot to the DVD, and then I could login via "ssh" and restore the operating system, programs and data. Sadly, this meant that the students might have lost some of their most recent blog posts since the last backup.
Please consider reviewing your own backup strategies. If your security were compromised, data was corrupted or lost, would you be able to recover from your backups?
- Use Encryption where Appropriate
If you plan to travel this Summer, you may want to consider encryption to protect yourself. ENC Security Systems has just released their latest [Encrypt Stick] which is a USB memory stick pre-loaded with software that provides three features:
- Encryption for your files
- A secure web browser for accessing sensitive websites
- Secure password manager
- Hotel Lobby
Many hotels now offer computers for use by the guests. These are typically running some flavor of Windows operating system. Encrypt Stick comes with an EXE file that you can run to browse the web securely, and have access to your encrypted files and passwords, leaving no trace on the hotel lobby computer.
- Friends and Family
What if you are visiting friends and family, and they have a Mac instead? No problem, as Encrypt Stick has a DMG file to use on Mac OS X operating system. While you may not be worried about your siblings hacking into your bank account, you may not want them necessarily seeing what sites you visited.
- Airport Lounge
I have been to several airport lounges now that use Linux for their public computers. Makes sense to me, as there are fewer viruses for Linux, and updating Linux is relatively straightforward. However, Encrypt Stick does not support Linux. For my Linux-knowledgeable readers, you can build your own with [Unetbootin] bootable USB memory stick to launch your favorite Linux browser in memory on whatever system you are using. The [Gparted Magic] utility rescue tool includes [TrueCrypt] to encrypt your files. Lastly, you can use [MyPasswordSafe] to hold all of your passwords securely.
Several clients have asked if any of the IBM data-at-rest encrypted disks or tapes are affected by this breach. IBM uses AES encryption for the actual disk and tape media, but we do use RSA keys to encrypt the generated keys used on the TS1120 and TS1130 drives. However, these were not affected by the RSA SecurID breach, and your tapes are safely protected.
Advanced Persistent Threats, viruses and other malware are no laughing matter. If you are concerned about security, contact IBM to help you assess your current environment and help you plan a robust protection strategy.
technorati tags: IBM, EMC, ENC Security Systems, EncryptStick, RSA, SecurID, breach, APT, Chuck Hollis, OLPC, SysRescCD, UnetBootin, TrueCrypt, Gparted, TS1120, TS1130, AES
My series last week on IBM Watson (which you can read [here], [here], [here], and [here]) brought attention to IBM's Scale-Out Network Attached Storage [SONAS]. IBM Watson used a customized version of SONAS technology for its internal storage, and like most of the components of IBM Watson, IBM SONAS is commercially available as a stand-alone product.
Like many IBM products, SONAS has gone through various name changes. First introduced by Linda Sanford at an IBM SHARE conference in 2000 under the IBM Research codename Storage Tank, it was then delivered as a software-only offering SAN File System, then as a services offering Scale-out File Services (SoFS), and now as an integrated system appliance, SONAS, in IBM's Cloud Services and Systems portfolio.
If you are not familiar with SONAS, here are a few of my previous posts that go into more detail:
This week, IBM announces that SONAS has set a world record benchmark for performance, [a whopping 403,326 IOPS for a single file system]. The results are based on comparisons of publicly available information from Standard Performance Evaluation Corporation [SPEC], a prominent performance standardization organization with more than 60 member companies. SPEC publishes hundreds of different performance results each quarter covering a wide range of system performance disciplines (CPU, memory, power, and many more). SPECsfs2008_nfs.v3 is the industry-standard benchmark for NAS systems using the NFS protocol.
(Disclaimer: Your mileage may vary. As with any performance benchmark, the SPECsfs benchmark does not replicate any single workload or particular application. Rather, it encapsulates scores of typical activities on a NAS storage system. SPECsfs is based on a compilation of workload data submitted to the SPEC organization, aggregated from tens of thousands of fileservers, using a wide variety of environments and applications. As a result, it is comprised of typical workloads and with typical proportions of data and metadata use as seen in real production environments.)
The configuration tested involves SONAS Release 1.2 on 10 Interface Nodes and 8 Storage Pods, resulting a single file system over 900TB usable capacity.
- 10 Interface Nodes; each with:
- Maximum 144 GB of memory
- One active 10GbE port
- 8 Storage Pods; each with:
- 2 Storage nodes and 240 drives
- Drive type: 15K RPM SAS hard drives
- Data Protection using RAID-5 (8+P) ranks
- Six spare drives per Storage Pod
IBM wanted a realistic "no compromises" configuration to be tested, by choosing:
- Regular 15K RPM SAS drives, rather than a silly configuration full of super-expensive Solid State Drives (SSD) to plump up the results.
- Moderate size, typical of what clients are asking for today. The Goldilocks rule applies. This SONAS is not a small configuration under 100TB, and nowhere close to the maximum supported configuration of 7,200 disks across 30 Interface Nodes and 30 Storage Pods.
- Single file system, often referred to as a global name space, rather than using an aggregate of smaller file systems added together that would be more complicated to manage. Having multiple file systems often requires changes to applications to take advantage of the aggregate peformance. It is also more difficult to load-balance your performance and capacity across multiple file systems. Of course, SONAS can support up to 256 separate file systems if you have a business need for this complexity.
The results are stunning. IBM SONAS handled three times more workload for a single file system than the next leading contender. All of the major players are there as well, including NetApp, EMC and HP.
Congratulations to the SONAS development and test teams! Scale-Out NAS is a competitive space. SONAS can handle not only large streaming files but also small random I/O workloads extraordinarily well. Just in the last two years, to compete against IBM's leadership in this realm, [HP acquired Ibrix], [EMC acquired Isilon] and [Dell has acquired what's left of Exanet's assets], THey have a lot of catching up to do!
technorati tags: IBM, SONAS, Watson, Storage Tank, SFS, SoFS, SBSC, SSD, SAS, , IOPS, SPEC, SPECsfs, SPECsfs2008, SPECsfs2008_nfs.v3, EMC, Isilon, HP, Ibrix, Dell, Exanet, Global Name Space, scale-out,, Watson, IBM Watson, benchmark, performance, record performance, world record, filesystem, file+system, nfs, EMC, NetApp, VNX, Isilon, storage, storage+system, NAS
This week, July 26-30, 2010, I am in Washington DC for the annual [2010 System Storage Technical University]. As with last year, we have joined forces with the System x team. Since we are in Washington DC this time, IBM added a "Federal Track" to focus on government challenges and solutions. So, basically, offering attendees the option to attend three conferences for one low price.
This conference was previously called the "Symposium", but IBM changed the name to "Technical University" to emphasize the technical nature of the conference. No marketing puffery like "Journey to the Private Cloud" here! Instead, this is bona fide technical training, qualifying attendees to count this towards their Continuing Professional Education (CPE).
(Note to my readers:The blogosphere is like a playground. In the center are four-year-olds throwing sand into each other's faces, while mature adults sit on benches watching the action, and only jumping in as needed. For example, fellow blogger Chuck Hollis (EMC) got sand in his face for promising to resign if EMC ever offered a tacky storage guarantee, and then [failed to follow through on his promise] when it happened.
Several of my readers asked me to respond to another EMC blogger's latest [fistful of sand].
A few months ago, fellow blogger Barry Burke (EMC) committed to [stick to facts] in posts on his Storage Anarchist blog. That didn't last long! BarryB apparently has fallen in line with EMC's over-promise-then-under-deliver approach. Unfortunately, I will be busy covering the conference and IBM's robust portfolio of offerings, so won't have time to address BarryB's stinking pile of rumor and hearsay until next week or later. I am sorry to disappoint.)
This conference is designed to help IT professionals make their business and IT infrastructure more dynamic and, in the process, help reduce costs, mitigate risks, and improve service. This technical conference event is geared to IT and Business Managers, Data Center Managers, Project Managers, System Programmers, Server and Storage Administrators, Database Administrators, Business Continuity and Capacity Planners, IBM Business Partners and other IT Professionals. This week will offer over 300 different sessions and hands-on labs, certification exams, and a Solutions Center.
For those who want a quick stroll through memory lane, here are my posts from past events:
- 2007 Storage Symposium: [Day 1,
- 2009 Storage Symposium: [
Day 2-Server Virtualization,
Day 3-Extraordinary Networks,
Day 5-Meet the Experts]
In keeping up with IBM's leadership in Social Media, IBM Systems Lab Services and Training team running this event have their own [Facebook Fan Page] and
[blog]. IBM Technical University has a Twitter account [@ibmtechconfs], and hashtag #ibmtechu. You can also follow me on Twitter [@az990tony].
technorati tags: IBM, Technical University, Federal, System Storage, System x, Washington DC, CPE, EMC, Facebook, Twitter