This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is with IBM Cloud Private and he will be delivering and supporting sessions at Think2019, and Storage Technical University on the Value of IBM storage in this high value IBM solution a part of the IBM Cloud strategy. Lloyd maintains a Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
The developerWorks Connections Platform is now in read-only mode and content is only available for viewing. No new wiki pages, posts, or messages may be added. Please see our FAQ for more information. The developerWorks Connections platform will officially shut down on March 31, 2020 and content will no longer be available. More details available on our FAQ. (Read in Japanese.)
"When Watson is booted up, the 15TB of total RAM are loaded up, and thereafter the DeepQA processing is all done from memory. According to IBM Research, the actual size of the data (analyzed and indexed text, knowledge bases, etc.) used for candidate answer generation and evidence evaluation is under 1 Terabyte (TB). For performance reasons, various subsets of the data are replicated in RAM on different functional groups of cluster nodes. The entire system is self-contained, Watson is NOT going to the internet searching for answers."
I had several readers ask me to explain the significance of the "Terabyte". I'll work my way up.
A bit is simply a zero (0) or one (1). This could answer a Yes/No or True/False question.
Most computers have standardized a byte as a collection of 8 bits. There are 256 unique combinations of ones and zeros possible, so a byte could be used to storage a 2-digit integer, or a single upper or lower case character in the English alphabet. In pratical terms, a byte could store your age in years, or your middle initial.
The Kilobyte is a thousand bytes, enough to hold a few paragraphs of text. A typical written page could be held in 4 KB, for example.
The IBM Challenge to play on Jeopardy! is being compared to the historic 1969 moon landing. To land on the moon, Apollo 11 had the "Apollo Guidance Computer" (AGC) which had 74KB of fixed read-only memory, and 2KB of re-writeable memory. Over [3500 IBM employees were involved] to get the astronauts to the moon and safely back to earth again.
The importance of this computer was highlighted in a [lecture by astronaut David Scott] who said: "If you have a basketball and a baseball 14 feet apart, where the baseball represents the moon and the basketball represents the Earth, and you take a piece of paper sideways, the thinness of the paper would be the corridor you have to hit when you come back."
The Megabyte is a thousand KB, or a million bytes. The 3.5-inch floppy diskette, mentioned in my post [A Boxfull of Floppies] could hold 1.44MB, or about 360 pages of text.
In the article [Wikipedia as a printed book], the printing of a select 400 articles resulted in a book 29 inches thick. Those 5,000 pages would consume about 20 MB of space.
One of my favorite resources I use to search is the Internet Movie Data Base [IMDB]. Leaving out the photos and videos, the [text-only portion of the IMDB database is just over 600 MB], representing nearly all of the actors, awards, nominations, television shows and movies. A standard CD-ROM can hold 700MB, so the text portion of the IMDB could easily fit on a single CD.
The Gigabyte is a thousand MB, or a billion bytes. My Thinkpad T410 laptop has 4GB of RAM and 320GB of hard disk space. My laptop comes with a DVD burner, and each DVD can hold up to 4.7GB of information.
The popular Wikipedia now has some 17 million articles, of which 3.5 million are in English language. It would only take [14GB of space to hold the entire English portion] of Wikipedia. That is small enough to fit on twenty CDs, three DVDs, an Apple iPad or my cellphone (a Samsung Galaxy S Vibrant).
Perhaps you are thinking, "Someone should offer Wikipedia pre-installed on a small handheld!" Too late. The [The Humane Reader] is able to offer 5,000 books and Wikipedia in a small device that connects to your television. This would be great for people who do not have access to the internet, or for parents who want their kids to do their homework, but not be online while they are doing it.
In the latest 2009 report of [How Much Information?] from the University of California, San Diego, the average American consumes 34 GB of information. This includes all the information from radio, television, newspapers, magazines, books and the internet that a person might look at or listen to throughout the day. This project is sponsored by IBM and others to help people understand the nature of our information-consuption habits.
Back in 1992, I visited a client in Germany. Their 90 GB of disk storage attached to their mainframe was the size of three refrigerators, and took five full-time storage administrators to manage.
The Terabyte is a thousand GB, or a trillion bytes. It is now possible to buy external USB drive for your laptop or personal computer that holds 1TB or more. However, at 40MB/sec speeds that USB 2.0 is capable of, it would take seven hours to do a bulk transfer in or out of the device.
IBM offers 1TB and 2TB disk drives in many of our disk systems. In 2008, IBM was preparing to announce the first 1TB tape drive. However, Sun Microsystems announced their own 1TB drive the day before our big announcement, so IBM had to rephrase the TS1130 announcement to [The World's Fastest 1TB tape drive!]
A typical academic research library will hold about 2TB of information. For the [US Library of Congress] print collection is considered to be about 10TB, and their web capture team has collected 160TB of digital data. If you are ever in the Washington DC, I strongly recommend a visit to the Library of Congress. It is truly stunning!
Full-length computer animated movies, like [Happy Feet], consume about 100TB of disk storage during production. IBM offers disk systems that can hold this much data. For example, the IBM XIV can hold up to 151 TB of usable disk space in the size of one refrigerator.
A Key Performance Indicator (KPI) for some larger companies is the number of TB that can be managed by a full-time employee, referred to as TB/FTE. Discussions about TB/FTE are available from IT analysts including [Forrester Research] and [The Info Pro].
The website [Ancestry.com] claims to have over 540 million names in its genealogical database, with a storage of 600TB, with the inclusion of [US census data from 1790 to 1930]. The US government took nine years to process the 1880 census, so for the 1890 census, it rented equipment from Herman Hollerith's Tabulating Machine Company. This company would later merge with two others in 1911 to form what is now called IBM.
A Petabyte is thousand TB, or a quadrillion bytes. It is estimated that all printed materials on Earth would represent approximately 200 PB of information.
IBM's largest disk system, the Scale-Out Network Attach Storage (SONAS) comprised of up to 7,200 disk drives, which can hold over 11 PB of information. A smaller 10-frame model, the same size as IBM Watson, with six interface nodes and 19 storage pods, could hold over 7 PB of information.
For those of us in the IT industry, 1TB is small potatoes. I for one, was expecting it to be much bigger. But for everyone else, the equivalent of 200 million pages of text that IBM Watson has loaded inside is an incredibly large repository of information. I suspect IBM Watson probably contains the complete works of Shakespeare as well as other fiction writers, the IMDB database, all 3.5 million articles of Wikipedia, religious texts like the Bible and the Quran, famous documents like the Magna Carta and the US Constitution, and reference books like a Dictionary, a Thesaurus, and "Gray's Anatomy". And, of course, lots and lots of lists.
For those on Twitter, follow [@ibmwatson] these next three days during the challenge.
Wrapping up my coverage of the annual [2010 System Storage Technical University], I attended what might be perhaps the best session of the conference. Jim Nolting, IBM Semiconductor Manufacturing Engineer, presented the new IBM zEnterprise mainframe, "A New Dimension in Computing", under the Federal track.
The zEnterprises debunks the "one processor fits all" myth. For some I/O-intensive workloads, the mainframe continues to be the most cost-effective platform. However, there are other workloads where a memory-rich Intel or AMD x86 instance might be the best fit, and yet other workloads where the high number of parallel threads of reduced instruction set computing [RISC] such as IBM's POWER7 processor is more cost-effective. The IBM zEnterprise combines all three processor types into a single system, so that you can now run each workload on the processor that is optimized for that workload.
IBM zEnterprise z196 Central Processing Complex (CPC)
Let's start with the new mainframe z196 central processing complex (CPC). Many thought this would be called the z11, but that didn't happen. Basically, the z196 machine has a maximum 96 cores versus z10's 64 core maximum, and each core runs 5.2GHz instead of z10's cores running at 4.7GHz. It is available in air-cooled and water-cooled models. The primary operating system that runs on this is called "z/OS", which when used with its integrated UNIX System Services subsystem, is fully UNIX-certified. The z196 server can also run z/VM, z/VSE, z/TPF and Linux on z, which is just Linux recompiled for the z/Architecture chip set. In my June 2008 post [Yes, Jon, there is a mainframe that can help replace 1500 servers], I mentioned the z10 mainframe had a top speed of nearly 30,000 MIPS (Million Instructions per Second). The new z196 machine can do 50,000 MIPS, a 60 percent increase!
The z196 runs a hypervisor called PR/SM that allows the box to be divided into dozens of logical partitions (LPAR), and the z/VM operating system can also act as a hypervisor running hundreds or thousands of guest OS images. Each core can be assigned a specialty engine "personality": GP for general processor, IFL for z/VM and Linux, zAAP for Java and XML processing, and zIIP for database, communications and remote disk mirroring. Like the z9 and z10, the z196 can attach to external disk and tape storage via ESCON, FICON or FCP protocols, and through NFS via 1GbE and 10GbE Ethernet.
IBM zEnterprise BladeCenter Extension (zBX)
There is a new frame called the zBX that basically holds two IBM BladeCenter chassis, each capable of 14 blades, so total of 28 blades per zBX frame. For now, only select blade servers are supported inside, but IBM plans to expand this to include more as testing continues. The POWER-based blades can run native AIX, IBM's other UNIX operating system, and the x86-based blades can run Linux-x86 workloads, for example. Each of these blade servers can run a single OS natively, or run a hypervisor to have multiple guest OS images. IBM plans to look into running other POWER and x86-based operating systems in the future.
If you are already familiar with IBM's BladeCenter, then you can skip this paragraph. Basically, you have a chassis that holds 14 blades connected to a "mid-plane". On the back of the chassis, you have hot-swappable modules that snap into the other side of the mid-plane. There are modules for FCP, FCoE and Ethernet connectivity, which allows blades to talk to each other, as well as external storage. BladeCenter Management modules serve as both the service processor as well as the keyboard, video and mouse Local Console Manager (LCM). All of the IBM storage options available to IBM BladeCenter apply to zBX as well.
Besides general purpose blades, IBM will offer "accelerator" blades that will offload work from the z196. For example, let's say an OLAP-style query is issued via SQL to DB2 on z/OS. In the process of parsing the complicated query, it creates a Materialized Query Table (MQT) to temporarily hold some data. This MQT contains just the columnar data required, which can then be transferred to a set of blade servers known as the Smart Analytics Optimizer (SAO), then processes the request and sends the results back. The Smart Analytics Optimizer comes in various sizes, from small (7 blades) to extra large (56 blades, 28 in each of two zBX frames). A 14-blade configuration can hold about 1TB of compressed DB2 data in memory for processing.
IBM zEnterprise Unified Resource Manager
You can have up to eight z196 machines and up to four zBX frames connected together into a monstrously large system. There are two internal networks. The Inter-ensemble data network (IEDN) is a 10GbE that connects all the OS images together, and can be further subdivided into separate virtual LANs (VLAN). The Inter-node management network (INMN) is a 1000 Mbps Base-T Ethernet that connects all the host servers together to be managed under a single pane of glass known as the Unified Resource Manager. It is based on IBM Systems Director.
By integrating service management, the Unified Resource Manager can handle Operations, Energy Management, Hypervisor Management, Virtual Server Lifecycle Management, Platform Performance Management, and Network Management, all from one place.
IBM Rational Developer for System z Unit Test (RDz)
But what about developers and testers, such as those Independent Software Vendors (ISV) that produce mainframe software. How can IBM make their lives easier?
Phil Smith on z/Journal provides a history of [IBM Mainframe Emulation]. Back in 2007, three emulation options were in use in various shops:
Open Mainframe, from Platform Solutions, Inc. (PSI)
FLEX-ES, from Fundamental Software, Inc.
Hercules, which is an open source package
None of these are viable options today. Nobody wanted to pay IBM for its Intellectual Property on the z/Architecture or license the use of the z/OS operating system. To fill the void, IBM put out an officially-supported emulation environment called IBM System z Professional Development Tool (zPDT) available to IBM employees, IBM Business Partners and ISVs that register through IBM Partnerworld. To help out developers and testers who work at clients that run mainframes, IBM now offers IBM Rational Developer for System z Unit Test, which is a modified version of zPDT that can run on a x86-based laptop or shared IBM System x server. Based on the open source [Eclipse IDE], the RDz emulates GP, IFL, zAAP and zIIP engines on a Linux-x86 base. A four-core x86 server can emulate a 3-engine mainframe.
With RDz, a developer can write code, compile and unit test all without consuming any mainframe MIPS. The interface is similar to Rational Application Developer (RAD), and so similar skills, tools and interfaces used to write Java, C/C++ and Fortran code can also be used for JCL, CICS, IMS, COBOL and PL/I on the mainframe. An IBM study ["Benchmarking IDE Efficiency"] found that developers using RDz were 30 percent more productive than using native z/OS ISPF. (I mention the use of RAD in my post [Three Things to do on the IBM Cloud]).
What does this all mean for the IT industry? First, the zEnterprise is perfectly positioned for [three-tier architecture] applications. A typical example could be a client-facing web-server on x86, talking to business logic running on POWER7, which in turn talks to database on z/OS in the z196 mainframe. Second, the zEnterprise is well-positioned for government agencies looking to modernize their operations and significantly reduce costs, corporations looking to consolidate data centers, and service providers looking to deploy public cloud offerings. Third, IBM storage is a great fit for the zEnterprise, with the IBM DS8000 series, XIV, SONAS and Information Archive accessible from both z196 and zBX servers.
Well, it's Tuesday again, and you know what that means? IBM announcements!
Today's announcements are all about the Storwize family, IBM's market-leading Software Defined Storage offerings. Having sold over 55,000 systems, and managing over 1.6 Exabytes of data, IBM continues to be the #1 leader in storage virtualization solutions. The Storwize family consists of the SAN Volume Controller (SVC), Storwize V7000, Storwize V7000 Unified, Flex System V7000, Storwize V5000, Storwize V3700 and V3500.
SAN Volume Controller 2145-DH8
The new 2145-DH8 model is a complete repackaging of this popular storage system. The previous model, the 2145-CG8, was 1U-high x86 server per node, and each node required a separate 1U-high UPS to provide battery protection for its cache. Nobody liked this. The new 2145-DH8 instead is a 2U-high node with two hot-swappable batteries, eliminating the need for UPS altogether. Thus, an SVC node-pair using the 2145-DH8 models takes up the same 4U space, but with fewer cables. The SVC can now also support standard office 110/240 voltage sources.
The new model sports an 8-core processor with 32GB RAM. Since these are 2-socket servers, IBM offers that option to add a second 8-core processor and additional 32GB RAM to help boost Real-time Compression. Each node can have optionally one or two hardware-assisted compression cards which use the Intel QuickAssist chip to boost compression performance.
While the Real-time Compression was in fact, real-time, performed in-line to the read/write I/O process, at latency comparable to uncompressed data for applications, the compression process on older models was entirely software-based, consuming some of the CPU resources, which lowered the maximum IOPS of the solution. With the added cores, added RAM, and hardware-assisted compression chips, IBM resolves that concern. In fact, the new 2145-DH8 with compression can provide more IOPS than an older 2145-CG8 without compression.
The previous model 2145-CG8 allowed you to put up to 4 small SSD drives in the node itself, which were treated the same as externally Flash drives for purposes of having a high-speed storage pool for select volumes, or automated sub-LUN tiering with Easy Tier. The new model 2145-DH8 allows you to attach up to 48 Solid State Drives (SSD) via 12Gb SAS cables. These are housed in the new 2U-high 24F enclosures that can offer up to 38.4 TB of Flash per SVC I/O group.
IBM also re-designed the host/device ports to use Hardware Interface Card (HIC) slots. In the 2145-CG8, you had four FCP ports, two 1GbE Ethernet ports, with options to add two 10GbE Ethernet ports or four additional FCP ports. If you had mostly an FCoE or iSCSI environment, you didn't need the FCP, and if you were mostly a FCP Storage Area Network (SAN) environment, then most of the Ethernet ports went unused. To solve this, the 2145-DH8 can allow you to have up to six HIC cards that are either FCP, Ethernet, or SAS. There are three 1GbE fixed Ethernet ports which can be used for iSCSI and administration.
If you have SVC today, you can upgrade non-disruptively by either swapping out your current SVC engines with the new 2145-DH8 engines, or you can add the new 2145-DH8 engines to your existing SVC cluster. Either way, there is no outage to your applications!
This is the next generation of the popular Storwize V7000. The previous generation had a 4-core processor and 8GB RAM per canister. The new model has an 8-core processor with 32GB of RAM per canister, with the option to double these to boost Real-time compression. There are two canisters per control enclosure, which gives you 64GB to 128GB of RAM per Storwize V7000 I/O group.
The new Storwize V7000 comes with one hardware-assisted compression chip on the mother board of each canister, with the option to add a second chip per canister.
Each canister offers three HIC slots, which can be used for the additional hardware-assist compression chip, FCP or Ethernet ports.
To accommodate these HIC slots, new canisters were needed. Instead of the flat wide style top and bottom, we now have taller, thinner canisters that sit side to side. This side-to-side design is similar to our existing Storwize V5000 and V3700 models.
The previous model could support up to 9 expansion enclosures per control enclosure. The Storwize V7000 can have up to 24 drives in its control enclosure, and now attach up to 20 expansion enclosures, which allows up to 504 drives per control enclosure, and up to a maximum of 1,056 drives per Storwize cluster.
If you have previous models of Storwize V7000, you can add the new Storwize V7000 into the same cluster, or virtualize the previous storage for migration purposes.
The new software applies new capabilities to both new generation hardware as well as the older models, so people with existing gear can benefit as well.
In prior releases, the sub-LUN automated tiering was limited to two levels: Flash and HDD. This lumped all 15K, 10K and 7200 RPM drives into a common HDD category. In the new v7.3.0 code, you can now have three levels: Flash, Enterprise HDD, and Nearline HDD, or two HDD levels: Enterprise and Nearline. The Enterprise level combines 15K and 10K RPM drives, similar to what is done on the IBM System Storage DS8000 disk systems.
The new code is also able balance your storage pools, and can be used with uniform or mixed storage pools to eliminate performance hot spots.
The new code has been enhanced to detect the hardware-assisted compression chip on the new SVC and Storwize V7000 models, and use those if available.
For the Storwize V3700 and V5000 models, the new code allows up to nine expansion enclosures per control enclosure. In the previous models, the V3700 allowed only four expansions, and the V6000 only six expansions per control enclosure. The V3700 can now support up to 240 drives, and the V5000 can support up to 480 drives.
IBM Storwize V7000 Unified File Module software v1.5
For Storwize V7000 Unified clients, there is new software for the File Modules that provide NFS, CIFS, FTP, HTTPS and SCP protocol capability. The new v1.5 code now adds NFS v4 and SMB 2.1 levels of support. Most NFS users are still on NFSv3, but about 20 percent of NFS users are using NFS v4 which offers stateful access. The SMB 2.1 for CIFS was introduced by Microsoft in Windows 7 and Windows Server 2008 R2.
Deterministic ID mapping allows you to map Windows userids to UNIX/Linux group and owner id numbers. In the past, the problem is that this mapping is different on each machine, so people often had to stand up a Windows System for Unix Services (SFU) server to provide consistent ID mapping. Now, with v1.5 code, you will no longer have to do this. The deterministic ID mapping will can now replicate the mapping to each machine without an SFU server.
Active Cloud Engine allows up to ten Storwize V7000 Unified to be connected across distance to form a single global name space. WAN caching, however, was restricted to a single site having write capabilities, while the others were read-only. In v1.5 release, IBM now supports multiple independent writers at different locations on the same fileset.
Security enhancements include multi-tenancy, configurable password policies, session policies, and hardened boot and SSH configurations. With NFS v3/v4, you can now use [Kerberos] for security.
Finally, I am please to see that we now have Cinder support for files on the Storwize V7000 Unified on the OpenStack Havana release that just came out last month. The OpenStack Cinder interface can assign LUNs to virtual machines, but the new Havana release allows NAS systems to dole out files that act as LUNs, such as OVA or VMDK files. The advantage is that these files can managed by Active Cloud Engine, cached locally across global name space, have policies place them on appropriate storage tiers, and inactive Virtual Machine images can be migrated to less expensive disk or tape.
Well, it's Tuesday again, and you know what that means? IBM Announcements!
This week, IBM announces the second generation of Storwize V5000 flash and disk storage systems. There are the V5000F All-flash configurations, as well as the V5000 that can support a variety of flash and spinning disk drives.
There are three models:
The V5010 has dual 2-core/2-thread processors and 16GB of cache. It supports thin provisioning, FlashCopy, Easy Tier, and remote mirroring. The base unit includes 1 GbE Ethernet ports for iSCSI host connectivity, with options to add 16GB Fibre Channel, 12Gb SAS, and 10GbE iSCSI/FCoE as well.
The 2U controllers and expansion enclosures can hold either 24 small 2.5-inch drives, or 12 larger 3.5-inch drives. A single control enclosure has two active/active IBM Spectrum Virtualize nodes, and can attach up to 10 expansion enclosures for a maximum of 264 drives.
The V5020 unit has dual 2-core/4-thread processors and up to 32GB of cache. It supports everything the V5010 does, plus encryption. The encryption is done via the Intel AES-NI instruction set to eliminate the need for special "self-encrypting drives" (SED) that other storage devices may require.
The V5030 has dual 6-core/4-thread processors and up to 64GB of cache. It supports everything the V5010 and V5020 do, plus Real-time Compression and external virtualization. The Real-time Compression can achieve up to 80 percent space savings, representing a 5:1 compression ratio.
Each control enclosure can attach to 20 expansion enclosures, which can support 504 internal drives per controller, and up to 1,008 with two controllers (four Spectrum Virtualize nodes) clustered together. This is in addition to the drives in external storage systems virtualized.
Continuing my rant from Monday's post [Time for a New Laptop], I got my new laptop Wednesday afternoon. I was hoping the transition would be quick, but that was not the case. Here were my initial steps prior to connecting my two laptops together for the big file transfer:
Document what my old workstation has
Back in 2007, I wrote a blog post on how to [Separate Programs from Data]. I have since added a Linux partition for dual-boot on my ThinkPad T60.
Windows XP SP3 operating system and programs
Red Hat Enterprise Linux 5.4
My Documents and other data
I also created a spreadsheet of all my tools, utilities and applications. I combined and deduplicated the list from the following sources:
Control Panel -> Add/Remove programs
Start -> Programs panels
Program taskbar at bottom of screen
The last one was critical. Over the years, I have gotten in the habit of saving those ZIP or EXE files that self-install programs into a separate directory, D:/Install-Files, so that if I had to unintsall an application, due to conflicts or compatability issues, I could re-install it without having to download them again.
So, I have a total of 134 applications, which I have put into the following rough categories:
AV - editing and manipulating audio, video or graphics
Files - backup, copy or manipulate disks, files and file systems
Browser - Internet Explorer, Firefox, Opera and Google Chrome
Communications - Lotus Notes and Lotus Sametime
Connect - programs to connect to different Web and Wi-Fi services
Demo - programs I demonstrate to clients at briefings
Drivers - attach or sync to external devices, cell phones, PDAs
Games - not much here, the basic solitaire, mindsweeper and pinball
Help Desk - programs to diagnose, test and gather system information
Projects - special projects like Second Life or Lego Mindstorms
Lookup - programs to lookup information, like American Airlines TravelDesk
Meeting - I have FIVE different webinar conferencing tools
Office - presentations, spreadsheets and documents
Platform - Java, Adobe Air and other application runtime environments
Player - do I really need SIXTEEN different audio/video players?
Printer - print drivers and printer management software
Scanners - programs that scan for viruses, malware and adware
Tools - calculators, configurators, sizing tools, and estimators
Uploaders - programs to upload photos or files to various Web services
Backup my new workstation
My new ThinkPad T410 has a dual-core i5 64-bit Intel processor, so I burned a 64-bit version of [Clonezilla LiveCD] and booted the new system with that. The new system has the following configuration:
Windows XP SP3 operating system, programs and data
There were only 14.4GB of data, it took 10 minutes to backup to an external USB disk. I ran it twice: first, using the option to dump the entire disk, and the second to dump the selected partition. The results were roughly the same.
Run Workstation Setup Wizard
The Workstation Setup Wizard asks for all the pertinent location information, time zone, userid/password, needed to complete the installation.
I made two small changes to connect C: to D: drive.
Changed "My Documents" to point to D:\Documents which will move the files over from C: to D: to accomodate its new target location. See [Microsoft procedure] for details.
Edited C:\notes\notes.ini to point to D:\notes\data to store all the local replicas of my email and databases.
Install Ubuntu Desktop 10.04 LTS
My plan is to run Windows and Linux guests through virtualization. I decided to try out Ubuntu Desktop 10.04 LTS, affectionately known as Lucid Lynx, which can support a variety of different virtualization tools, including KVM, VirtualBox-OSE and Xen. I have two identical 15GB partitions (sda2 and sda3) that I can use to hold two different systems, or one can be a subdirectory of the other. For now, I'll leave sda3 empty.
Take another backup of my new workstation
I took a fresh new backup of paritions (sda1, sda2, sda6) with Clonezilla.
The next step involved a cross-over Ethernet cable, which I don't have. So that will have to wait until Thursday morning.
Tonight PBS plans to air Season 38, Episode 6 of NOVA, titled [Smartest Machine On Earth]. Here is an excerpt from the station listing:
"What's so special about human intelligence and will scientists ever build a computer that rivals the flexibility and power of a human brain? In "Artificial Intelligence," NOVA takes viewers inside an IBM lab where a crack team has been working for nearly three years to perfect a machine that can answer any question. The scientists hope their machine will be able to beat expert contestants in one of the USA's most challenging TV quiz shows -- Jeopardy, which has entertained viewers for over four decades. "Artificial Intelligence" presents the exclusive inside story of how the IBM team developed the world's smartest computer from scratch. Now they're racing to finish it for a special Jeopardy airdate in February 2011. They've built an exact replica of the studio at its research lab near New York and invited past champions to compete against the machine, a big black box code -- named Watson after IBM's founder, Thomas J. Watson. But will Watson be able to beat out its human competition?"
Like most supercomputers, Watson runs the Linux operating system. The system runs 2,880 cores (90 IBM Power 750 servers, four sockets each, eight cores per socket) to achieve 80 [TeraFlops]. TeraFlops is the unit of measure for supercomputers, representing a trillion floating point operations. By comparison, Hans Morvec, principal research scientist at the Robotics Institute of Carnegie Mellon University (CMU) estimates that the [human brain is about 100 TeraFlops]. So, in the three seconds that Watson gets to calculate its response, it would have processed 240 trillion operations.
Several readers of my blog have asked for details on the storage aspects of Watson. Basically, it is a modified version of IBM Scale-Out NAS [SONAS] that IBM offers commercially, but running Linux on POWER instead of Linux-x86. System p expansion drawers of SAS 15K RPM 450GB drives, 12 drives each, are dual-connected to two storage nodes, for a total of 21.6TB of raw disk capacity. The storage nodes use IBM's General Parallel File System (GPFS) to provide clustered NFS access to the rest of the system. Each Power 750 has minimal internal storage mostly to hold the Linux operating system and programs.
When Watson is booted up, the 15TB of total RAM are loaded up, and thereafter the DeepQA processing is all done from memory. According to IBM Research, "The actual size of the data (analyzed and indexed text, knowledge bases, etc.) used for candidate answer generation and evidence evaluation is under 1TB." For performance reasons, various subsets of the data are replicated in RAM on different functional groups of cluster nodes. The entire system is self-contained, Watson is NOT going to the internet searching for answers.
If you store your VMware bits on external SAN or NAS-based disk storage systems, this post is for you. The subject of the post, VM Volumes, is a potential storage management game changer!
Fellow blogger Stephen Foskett mentioned VM Volumes in his [Introducing VMware vSphere Storage Features] presentation at IBM Edge 2012 conference. His session on VMware's storage features included VMware APIs for Array Integration (VAAI), VMware Array Storage Awareness (VASA), vCenter plug-ins, and a new concept he called "vVol", now more formally known as VM Volumes. This post provides a follow-up to this, describing the VM Volumes concepts, architecture, and value proposition.
"VM Volumes" is a future architecture that VMware is developing in collaboration with IBM and other major storage system vendors. So far, very little information about VM Volumes has been released. At VMworld 2012 Barcelona, VMware highlights VM Volumes for the first time and IBM demonstrates VM Volumes with the IBM XIV Storage System (more about this demo below). VM Volumes is worth your attention -- when it becomes generally available, everyone using storage arrays will have to reconsider their storage management practices in a VMware environment -- no exaggeration!
But enough drama. What is this all about?
(Note: for the sake of clarity, this post refers to block storage only. However, the VM Volumes feature applies to NAS systems as well. Special thanks to Yossi Siles and the XIV development team for their help on this post!)
The VM Volumes concept is simple: VM disks are mapped directly to special volumes on a storage array system, as opposed to storing VMDK files on a vSphere datastore.
The following images illustrate the differences between the two storage management paradigms.
You may still be asking yourself: bottom line, how will I benefit from VM Volumes?
Well, take a VM snapshot for example. With VM Volumes, vSphere can simply offload the operation by invoking a hardware snapshot of the hardware volume. This has significant implications:
VM-Granularity: Only the right VMs are copied (with datastores, backing up or cloning individual-VM portions of hardware snapshot of a datastore would require more complex configuration, tools and work)
Hardware Offload: No ESXi server resources are consumed
XIV advantage: With XIV, snapshots consume no space upfront and are completed instantly.
Here's the first takeaway: With VM Volumes, advanced storage services (which cost a lot when you buy a storage array), will become available at an individual VM level. In a cloud world, this means that applications can be provisioned easily with advanced storage services, such as snapshots and mirroring.
Now, let's take a closer look at another relevant scenario where VM Volumes will make a lot of difference - provisioning an application with special mirroring requirements:
VM Volumes case: The application is ordered via the private cloud portal. The requestor checks a box requesting an asynchronous mirror. He changes the default RPO for his needs. When the request is submitted, the process wraps up automatically: Volumes are created on one of the storage arrays, configured with a mirror and RPO exactly as specified. A few minutes later, the requestor receives an automatic mail pointing to the application virtual machine.
Datastores case #1: As may be expected, a datastore that is mirrored with the special RPO does not exist. As a result, the automated workflow sets a pending status on the request, creates an urgent ticket to a VMware administrator and aborts. When the VMware admin handles that ticket, she re-assigns the ticket to the storage administrator, asking for a new volume which is mirrored with the special RPO, and mapped to the right ESXi cluster. The next day, the volume is created; the ticket is re-assigned to the storage admin, with the new LUN being pointed to. The VMware administrator follows and creates the datastore on top of it. Since the automated workflow was aborted, the admin re-assigns the ticket to the cloud administrator, who sometime later completes the application provisioning manually.
Datastores case #2: Luckily for the requestor, a datastore that is mirrored with the special RPO does exist. However, that particular datastore is consuming space from a high performance XIV Gen3 system with SSD caching, while the application does not require that level of performance, so the workflow requires a storage administrator approval. The approval is given to save time, but the storage administrator opens a ticket for himself to create a new volume on another array, as well as a follow-up ticket for the VMware admin to create a new datastore using the new volume and migrate the application to the other datastore. In this case, provisioning was relatively rapid, but required manual follow up, involving the two administrators.
Here's the second takeaway: With VM Volumes, management is simplified, and end-to-end automation is much more applicable. The reason is that there are no datastores. Datastores physically group VMs that may otherwise be totally unrelated, and require close coordination between storage and VMware administrators.
Now, the above mainly focuses on the VMware or cloud administrator perspective. How does VM Volumes impact storage management?
VM's are the new hosts: Today, storage administrators have visibility of physical hosts in their management environment. In a non-virtualized environment, this visibility is very helpful. The storage administrator knows exactly which applications in a data center are storage-provisioned or affected by storage management operations because the applications are running on well-known hosts. However, in virtualized environments the association of an application to a physical host is temporary. To keep at least the same level of visibility as in physical environments, VMs should become part of the storage management environment, like hosts. Hosts are still interesting, for example to manage physical storage mapping, but without VM visibility, storage administrators will know less about their operation than they are used to, or need to. VM Volumes enables such visibility, because volumes are provided to individual VMs. The XIV VM Volumes demonstration at VMworld Barcelona, although experimental, shows a view of VM volumes, in XIV's management GUI.
Here's a screenshot:
That's not all!
Storage Profiles and Storage Containers: A Storage Profile is a vSphere specification of a set of storage services. A storage profile can include properties like thin or thick provisioning, mirroring definition, snapshot policy, minimum IOPS, etc.
Storage administrators define a portfolio of supported storage services, maintained as a set of storage profiles, and published (via VASA integration) to vSphere.
VMware or cloud administrators define the required storage profiles for specific applications
VMware and storage administrators need to coordinate the typical storage requirements and the automatically-available storage services. When a request to provision an application is made, the associated storage profiles are matched against the published set of available storage profiles. The matching published profiles will be used to create volumes, which will be bound to the application VMs. All that will happen automatically.
Note that when a VM is created today, a datastore must be specified. With VM Volumes, a new management entity called Storage Container (also known as Capacity Pool) replaces the use of datastore as a management object. Each Storage Container exposes a subset of the available storage profiles, as appropriate. The storage container also has a capacity quota.
Here are some more takeaways:
New way to interface vSphere and storage management: Storage administrators structure and publish storage services to vSphere via storage profiles and storage containers.
Automated provisioning, out of the box: The provisioning process automatically matches application-required storage profiles against storage profiles available from the specified storage containers. There is no need to build custom scripts and custom processes to automate storage provisioning to applications
The XIV advantage:
XIV services are very simple to define and publish. The typical number of available storage profiles would be low. It would also be easy to define application storage profiles.
XIV provides consistent high performance, up to very high capacity utilization levels, without any maintenance. As a result, automated provisioning (which inherently implies less human attention) will not create an elevated risk of reduced performance.
Note: A storage vendor VASA provider is required to support VM Volumes, storage profiles, storage containers and automated provisioning. The IBM Storage VASA provider runs as a standalone service that needs to be deployed on a server.
To summarize the VM Volumes value proposition:
Streamline cloud operation by providing storage services at VM and application level, enabling end-to-end provisioning automation, and unifying VMware and storage administration around volumes and VMs.
Increase storage array ROI, improve vSphere scalability and response time, and reduce cloud provisioning lag, by offloading VM-level provisioning, failover, backup, storage migration, storage space recycling, monitoring, and more, to the storage array, using advanced storage operations such as mirroring and snapshots.
Simplify the adoption of VM Volumes using XIV, with smaller and simpler sets of storage profiles. Apply XIV's supreme fast cloning to individual VMs, and keep automation risks at bay with XIV's consistent high performance.
Until you can get your hands on a VM Volumes-capable environment, the VMware and IBM developer groups will be collaborating and working hard to realize this game-changing feature. The above information is definitely expected to trigger your questions or comments, and our development teams are eager to learn from them and respond. Enter your comments below, and I will try to answer them, and help shape the next post on this subject. There's much more to be told.
Now an avid reader of my blog has brought this to my attention. Apparently,
EMC has been showing customers a presentation
[Accelerating Storage Transformation with VMAX and VPLEX] with false and misleading comparison claims between IBM DS8000, HDS VSP and EMC VMAX 40K disk system performance.
(FTC Disclosure: This would be a good time to remind my readers that I work for IBM and own IBM stock. I do not endorse any of the EMC or HDS products mentioned in this post, and have no financial affiliation or investments directly with either EMC nor HDS. I am basing my information solely on the presentation posted on the internet and other sources publicly available, and not on any misrepresentations from EMC speakers at the various conferences where these charts might have been shown.)
The problem with misinformation is that it is not always obvious. The EMC presentation is quite pretty and professional-looking. It is the typical slick, attention-getting, low-content, over-simplified marketing puffery you have come to expect from EMC. There are two slides in particular that I have issue with.
This first graphic implies that IBM and HDS are nearly tied in performance, but that EMC VMAX 40K has nearly triple that bandwidth. Overall the slide has very little detail. That makes it difficult to determine what exactly is being claimed and whether a fair comparison is being made.
The title claims that VMAX 40K is "#1 in High Bandwidth Apps". Only three disk systems are shown so the claim appears to be relative to only the three systems. The wording "High Bandwidth Apps" is confusing considering the cited numbers are for disk systems and no application is identified. By comparison, IBM SONAS can drive up to 105 GB/sec sequential bandwidth, nearly double what EMC claims for its VMAX 40K, so EMC is certainly not even close to #1.
Is the workload random or sequential? That is not easy to determine. The use of "GB/s" along with the large block size of 128KB implies the I/O workload is sequential, which is great for some workloads like high performance computing, technical computing and video broadcasts. Random workloads, on the other hand, are usually measured in I/Os per second (IOPS) with a block size ranging 4KB to 64KB. (I am assuming the 128K blocks refers to 128KB block size, and not reading the same block of cache 128,000 times.)
The slide states "Maximum Sustainable RRH Bandwidth 128K Blocks". The acronym "RRH" is not defined; but I suspect this refers to "random read hits". For random workloads, 100 percent random read hits from cache represents one corner of the infamous "four corners" test. Real-world workloads have a mix of reads and writes, and a mix of cache hits and cache misses. It is also unclear whether the hits are from standard data cache or from internal buffers in adapters (perhaps accessing the same blocks repeatedly) or something else. So is this really for a random workload, or a sequential workload?
(The term "Hitachi Math" was coined by an EMC blogger precisely to slam Hitachi Data Systems for their blatant use of four-corners results, claiming that spouting ridiculously large, but equally unrealistic, 100 percent random read hit results don't provide any useful information. I agree. There are much better industry-standard benchmarks available, such as SPC-1 for random workloads, SPC-2 for sequential workloads, and even benchmarks for specific applications, that represent real-world IT environments. To shame HDS for their use of four-corners results, only for EMC themselves to use similar figures in their own presentation is truly hypocritical of them!)
The IBM system is identified as "DS8000". DS8000 is a generic family name that applies to multiple generations of systems first introduced in 2004. The specific model is not identified, but that is critical information. Is this a first generation DS8100, or the latest DS8800, or something in between?
The slide says "Full System Configs", but that is not defined and configuration details are not identified. Configuration details, also critical information in assessing system performance capabilities, are not specified. If the EMC box costs seven times more than IBM or HDS, would you really buy it to get 3x more performance? Is the EMC packed with the maximum amount of SSD? Were there any SSD in the IBM or HDS boxes to match?
The source of the claimed IBM DS8000 performance numbers is not identified. Did they run their own tests? While I cannot tell, the VMAX may have been configured with 64 Fibre Channel 8Gbps host connections. In that case each channel is theoretically capable of supporting about 800 MB/s at 100% channel utilization. Multiplying 64 x 800MB/s = 51.2GB/s, so did EMC just do the performance comparison on the back of a napkin, assuming there are no other bottlenecks in the system? Even then, I would not round up 51.2 to 52!
Response times were not identified. For random I/Os, response time is a very important metric. It is possible that the Symmetrix was operating with some resources at 100% utilization to get the highest GB/s result, but that would likely make I/O response times unacceptable for real-world random I/O workloads.
IBM and HDS have both published Storage Performance Council [SPC] industry-standard performance benchmarks. EMC has not published any SPC benchmarks for VMAX systems. If EMC is interested in providing customers with audited, detailed performance information along with detailed configuration information, all based on benchmarks designed to represent real-world workloads, EMC can always publish SPC benchmark results as IBM and other vendors have done. In past blog fights, EMC resorts to the excuse that SPC isn't perfect, but can they really argue that vague and unrealistic claims cited in its presentation are better?
The second graphic is so absurd, you would think it came directly from Larry Ellison at an Oracle OpenWorld keynote session. EMC is comparing a configuration with VMAX 40K plus an EMC VFCache host-side flash memory cache card to a configuration with an IBM and HDS disk system without host-side flash memory cache also configured. The comparison is clearly apples-to-oranges. Other disk system configuration details are also omitted.
FAST VP is EMC's name for its sub-volume drive tiering feature, comparable to IBM Easy Tier and Hitachi's Dynamic Tiering. The graph implies that IBM and HDS can only achieve a modest increment improvement from their sub-volume tiering. I beg to differ. I have seen various cases where a small amount of SSD on IBM DS8000 series can drastically improve performance 200 to 400 percent.
The "DBClassify" shown on the graph is a tool run as part of an EMC professional services offering called Database Performance Tiering Assessment, makes recommendations for storing various database objects on different drive tiers based on object usage and importance. Do you really need to pay for professional services? With IBM Easy Tier, you just turn it on, and it works. No analysis required, no tools, no professional services, and no additional charge!
VFCache is an optional product from EMC that currently has no integration whatsoever with VMAX. A fair comparison would have included a host-side flash memory cache (from any vendor) when the IBM or HDS storage system was configured. Or leave it out altogether and just focus on the sub-volume tiering comparison.
Keep in mind that EMC's VFCache supports only selected x86-based hosts. IBM has published a [Statement of Direction] indicating that it will also offer this for Power systems running AIX and Linux host-side flash memory cache integrated with DS8000 Easy Tier.
I feel EMC's claims about IBM DS8000 performance are vague and misleading. EMC appears to lack the kind of technical marketing integrity that IBM strives to attain.
Since EMC is not able or willing to publish fair and meaningful performance comparisons, it is up to me to set the record straight and point out EMC's failings in this matter.
Reminder: It's not to late to register for my Webcast "Solving the Storage Capacity Crisis" on Tuesday, September 25. See my blog post [Upcoming events in September] to register!
A reader from New Zealand expressed concern some corporate bloggers were [using the earthquake for marketing]. He lost someone close to him in Christchurch, and is unable to reach a friend living in Japan, so I am sorry for his loss. I plan to be in Australia and New Zealand to teach a Top Gun class May 15-27, so hopefully I will be able to meet him in person when I am down there.
"Earmarking funds is a really good way of hobbling relief organizations and ensuring that they have to leave large piles of money unspent in one place while facing urgent needs in other places. ... Meanwhile, the smaller and less visible emergencies where NGOs can do the most good are left unfunded.
In the specific case of Japan, there's all the more reason not to donate money. Japan is a wealthy country which is responding to the disaster, among other things, by printing hundreds of billions of dollars' worth of new money."
Another reader mentioned that the last surviving American WW-II vet died the same week. WTF? IBM and Japan have been allies for quite a while now, and there is no reason to bring up past wars except to compare the scope and magnitude of the cleanup effort. (Update: Frank Buckles was the last surviving WW-I vet, but also served in WW-II).
Many readers felt that charity begins at home, and there are plenty of worthy causes right here in the USA to donate to instead. Inspired by last year's movie [Waiting for Superman], my girlfriend started a project called [Centers for My Super Stars] for her first grade class on DonorsChoose.org. For those not familiar with this website, DonorsChoose.org uses the cloud to connect school teachers in need of supplies with rich people to donate funds towards these projects. If you want to contribute to her project, [donate here].
"And speaking of class, there just happens to be a baseball team in Sendai, Japan. The Golden Eagles. Their stadium was severely damaged from the earthquake. Wouldn't you think some of them lug nuts who run American baseball would bring the Golden Eagles and their opponents over to the United States when the Japanese season starts -- play some games over here and raise money to help the Japanese? Wouldn't you think they could just once stop that national pastime stuff and help the international pastime?"
As you can see, different readers have different opinions on this. We are all on this world together, and both our economy and our ecology are more interconnected than you might think. Let's build a smarter planet.
A long time ago, perhaps in the early 1990s, I was an architect on the component known today as DFSMShsm on z/OS mainframe operationg system. One of my job responsibilities was to attend the biannual [SHARE conference to listen to the requirements of the attendees on what they would like added or changed to the DFSMS, and ask enough questions so that I can accurately present the reasoning to the rest of the architects and software designers on my team. One person requested that the DFSMShsm RELEASE HARDCOPY should release "all" the hardcopy. This command sends all the activity logs to the designated SYSOUT printer. I asked what he meant by "all", and the entire audience of 120 some attendees nearly fell on the floor laughing. He complained that some clever programmer wrote code to test if the activity log contained only "Starting" and "Ending" message, but no error messages, and skip those from being sent to SYSOUT. I explained that this was done to save paper, good for the environment, and so on. Again, howls of laughter. Most customers reroute the SYSOUT from DFSMS from a physical printer to a logical one that saves the logs as data sets, with date and time stamps, so having any "skipped" leaves gaps in the sequence. The client wanted a complete set of data sets for his records. Fair enough.
When I returned to Tucson, I presented the list of requests, and the immediate reaction when I presented the one above was, "What did he mean by ALL? Doesn't it release ALL of the logs already?" I then had to recap our entire dialogue, and then it all made sense to the rest of the team. At the following SHARE conference six months later, I was presented with my own official "All" tee-shirt that listed, and I am not kidding, some 33 definitions for the word "all", in small font covering the front of the shirt.
I am reminded of this story because of the challenges explaining complicated IT concepts using the English language which is so full of overloaded words that have multiple meanings. Take for example the word "protect". What does it mean when a client asks for a solution or system to "protect my data" or "protect my information". Let's take a look at three different meanings:
The first meaning is to protect the integrity of the data from within, especially from executives or accountants that might want to "fudge the numbers" to make quarterly results look better than they are, or to "change the terms of the contract" after agreements have been signed. Clients need to make sure that the people authorized to read/write data can be trusted to do so, and to store data in Non-Erasable, Non-Rewriteable (NENR) protected storage for added confidence. NENR storage includes Write-Once, Read-Many (WORM) tape and optical media, disk and disk-and-tape blended solutions such as the IBM Grid Medical Archive Solution (GMAS) and IBM Information Archive integrated system.
The second meaning is to protect access from without, especially hackers or other criminals that might want to gather personally-identifiably information (PII) such as social security numbers, health records, or credit card numbers and use these for identity theft. This is why it is so important to encrypt your data. As I mentioned in my post [Eliminating Technology Trade-Offs], IBM supports hardware-based encryption FDE drives in its IBM System Storage DS8000 and DS5000 series. These FDE drives have an AES-128 bit encryption built-in to perform the encryption in real-time. Neither HDS or EMC support these drives (yet). Fellow blogger Hu Yoshida (HDS) indicates that their USP-V has implemented data-at-rest in their array differently, using backend directors instead. I am told EMC relies on the consumption of CPU-cycles on the host servers to perform software-based encryption, either as MIPS consumed on the mainframe, or using their Powerpath multi-pathing driver on distributed systems.
There is also concern about internal employees have the right "need-to-know" of various research projects or upcoming acquisitions. On SANs, this is normally handled with zoning, and on NAS with appropriate group/owner bits and access control lists. That's fine for LUNs and files, but what about databases? IBM's DB2 offers Label-Based Access Control [LBAC] that provides a finer level of granularity, down to the row or column level. For example, if a hospital database contained patient information, the doctors and nurses would not see the columns containing credit card details, the accountants would not see the columnts containing healthcare details, and the individual patients, if they had any access at all, would only be able to access the rows related to their own records, and possibly the records of their children or other family members.
The third meaning is to protect against the unexpected. There are lots of ways to lose data: physical failure, theft or even incorrect application logic. Whatever the way, you can protect against this by having multiple copies of the data. You can either have multiple copies of the data in its entirety, or use RAID or similar encoding scheme to store parts of the data in multiple separate locations. For example, with RAID-5 rank containing 6+P+S configuration, you would have six parts of data and one part parity code scattered across seven drives. If you lost one of the disk drives, the data can be rebuilt from the remaining portions and written to the spare disk set aside for this purpose.
But what if the drive is stolen? Someone can walk up to a disk system, snap out the hot-swappable drive, and walk off with it. Since it contains only part of the data, the thief would not have the entire copy of the data, so no reason to encrypt it, right? Wrong! Even with part of the data, people can get enough information to cause your company or customers harm, lose business, or otherwise get you in hot water. Encryption of the data at rest can help protect against unauthorized access to the data, even in the case when the data is scattered in this manner across multiple drives.
To protect against site-wide loss, such as from a natural disaster, fire, flood, earthquake and so on, you might consider having data replicated to remote locations. For example, IBM's DS8000 offers two-site and three-site mirroring. Two-site options include Metro Mirror (synchronous) and Global Mirror (asynchronous). The three-site is cascaded Metro/Global Mirror with the second site nearby (within 300km) and the third site far away. For example, you can have two copies of your data at site 1, a third copy at nearby site 2, and two more copies at site 3. Five copies of data in three locations. IBM DS8000 can send this data over from one box to another with only a single round trip (sending the data out, and getting an acknowledgment back). By comparison, EMC SRDF/S (synchronous) takes one or two trips depending on blocksize, for example blocks larger than 32KB require two trips, and EMC SRDF/A (asynchronous) always takes two trips. This is important because for many companies, disk is cheap but long-distance bandwidth is quite expensive. Having five copies in three locations could be less expensive than four copies in four locations.
Fellow blogger BarryB (EMC Storage Anarchist) felt I was unfair pointing out that their EMC Atmos GeoProtect feature only protects against "unexpected loss" and does not eliminate the need for encryption or appropriate access control lists to protect against "unauthorized access" or "unethical tampering".
(It appears I stepped too far on to ChuckH's lawn, as his Rottweiler BarryB came out barking, both in the [comments on my own blog post], as well as his latest titled [IBM dumbs down IBM marketing (again)]. Before I get another rash of comments, I want to emphasize this is a metaphor only, and that I am not accusing BarryB of having any canine DNA running through his veins, nor that Chuck Hollis has a lawn.)
As far as I know, the EMC Atmos does not support FDE disks that do this encryption for you, so you might need to find another way to encrypt the data and set up the appropriate access control lists. I agree with BarryB that "erasure codes" have been around for a while and that there is nothing unsafe about using them in this manner. All forms of RAID-5, RAID-6 and even RAID-X on the IBM XIV storage system can be considered a form of such encoding as well. As for the amount of long-distance bandwidth that Atmos GeoProtect would consume to provide this protection against loss, you might question any cost savings from this space-efficient solution. As always, you should consider both space and bandwidth costs in your total cost of ownership calculations.
Of course, if saving money is your main concern, you should consider tape, which can be ten to twenty times cheaper than disk, affording you to keep a dozen or more copies, in as many time zones, at substantially lower cost. These can be encrypted and written to WORM media for even more thorough protection.