Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Systems Client Experience Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
If Eskimos have 37 words for "snow", then EMC has perhaps a similar number of names for "failure". I have already covered a few of their past attempts, including [ATMOS], [Invista], and [VPLEX]. Last week, EMC introduced its latest, called XtremeIO.
But rather than focus on XtremeIO's many shortcomings, I thought it would be better to point out the highlights of IBM's All-Flash array, IBM FlashSystem.
But first, a quick story.
Two years ago, I worked the booth at [Oracle OpenWorld 2011]. After a conference attendee had visited the booths of Violin Memory and Pure Storage, he asked me why IBM did not have an all-Flash array.
Of course IBM did, and I showed him the [Storwize V7000]. For example, a 2U model with 18 SSD drives of 400GB each, configured in two RAID-5 ranks 7+P+S could offer 5.6 TB of space, running up to 250,000 IOPS at sub-millisecond response times.
Why didn't IBM advertise the Storwize V7000 as an all-Flash array? I though the question was silly at the time, since the Storwize V7000 supported SSD, 15K, 10K and 7200 RPM spinning disk, it seemed obvious that it could be configured with only SSD if you chose.
Since then, IBM has added 800GB support to the Storwize V7000, doubling the capacity. More importantly, IBM acquired Texas Memory Systems, and offers a much better all-Flash array.
Flash can be deployed in three levels. The first is in the server itself, such as with PCiE cards containing Flash chips, limited to applications running on that server only.
The second option is a hybrid disk system, that can intermix Flash-based Solid State Drives (SSD) with regular spinning hard disk drives (HDD). These can be attached to many servers.
The problem with this approach is that when Flash is packaged to pretend to be spinning disk, it undermines some of the performance benefits. Traditional disk system architectures using SCSI commands over Device adapter loops can introduce added latency.
The third fits snuggly in the middle: all-Flash arrays designed from the ground up to be only Flash.
Whereas SSD can typically achieve an I/O latency in the 300 to 1000 microseconds range, IBM FlashSystem can process I/O in the 25 to 110 microsecond range. That is a huge difference!
(FTC Disclosure: The U.S. Federal Trade Commission requires that I mention that I am an IBM employee, and that this post may be considered a paid, celebrity endorsement of both the IBM FlashSystem and IBM Storwize family of products. I have no financial interest in EMC, do not endorse the XtremeIO mentioned here, and was not paid to mention their company or products in any manner.)
Fellow blogger and IBM Master Inventor Barry Whyte has a great comparison table in his blog post [Extreme Blogging]. I thought I would add an added column for the Storwize V7000 with 18 Solid State drives.
IBM FlashSystem 820
IBM Storwize V7000 with SSD
20 Terabytes: 1U
11 Terabytes: 2U
7 Terabytes: 6U
I/O latency (microseconds)
110us (~5x faster)
Maximum I/O per second
NAND Flash type
While it is easy to show that EMC's XtremeIO does not hold a candle to IBM FlashSystems, I think it is more amusing that it is not even as good as a Storwize V7000 with SSD that IBM offered two years ago, long before [EMC acquired XtremeIO company] back in May 2012.
Continuing my rant from Monday's post [Time for a New Laptop], I got my new laptop Wednesday afternoon. I was hoping the transition would be quick, but that was not the case. Here were my initial steps prior to connecting my two laptops together for the big file transfer:
Document what my old workstation has
Back in 2007, I wrote a blog post on how to [Separate Programs from Data]. I have since added a Linux partition for dual-boot on my ThinkPad T60.
Windows XP SP3 operating system and programs
Red Hat Enterprise Linux 5.4
My Documents and other data
I also created a spreadsheet of all my tools, utilities and applications. I combined and deduplicated the list from the following sources:
Control Panel -> Add/Remove programs
Start -> Programs panels
Program taskbar at bottom of screen
The last one was critical. Over the years, I have gotten in the habit of saving those ZIP or EXE files that self-install programs into a separate directory, D:/Install-Files, so that if I had to unintsall an application, due to conflicts or compatability issues, I could re-install it without having to download them again.
So, I have a total of 134 applications, which I have put into the following rough categories:
AV - editing and manipulating audio, video or graphics
Files - backup, copy or manipulate disks, files and file systems
Browser - Internet Explorer, Firefox, Opera and Google Chrome
Communications - Lotus Notes and Lotus Sametime
Connect - programs to connect to different Web and Wi-Fi services
Demo - programs I demonstrate to clients at briefings
Drivers - attach or sync to external devices, cell phones, PDAs
Games - not much here, the basic solitaire, mindsweeper and pinball
Help Desk - programs to diagnose, test and gather system information
Projects - special projects like Second Life or Lego Mindstorms
Lookup - programs to lookup information, like American Airlines TravelDesk
Meeting - I have FIVE different webinar conferencing tools
Office - presentations, spreadsheets and documents
Platform - Java, Adobe Air and other application runtime environments
Player - do I really need SIXTEEN different audio/video players?
Printer - print drivers and printer management software
Scanners - programs that scan for viruses, malware and adware
Tools - calculators, configurators, sizing tools, and estimators
Uploaders - programs to upload photos or files to various Web services
Backup my new workstation
My new ThinkPad T410 has a dual-core i5 64-bit Intel processor, so I burned a 64-bit version of [Clonezilla LiveCD] and booted the new system with that. The new system has the following configuration:
Windows XP SP3 operating system, programs and data
There were only 14.4GB of data, it took 10 minutes to backup to an external USB disk. I ran it twice: first, using the option to dump the entire disk, and the second to dump the selected partition. The results were roughly the same.
Run Workstation Setup Wizard
The Workstation Setup Wizard asks for all the pertinent location information, time zone, userid/password, needed to complete the installation.
I made two small changes to connect C: to D: drive.
Changed "My Documents" to point to D:\Documents which will move the files over from C: to D: to accomodate its new target location. See [Microsoft procedure] for details.
Edited C:\notes\notes.ini to point to D:\notes\data to store all the local replicas of my email and databases.
Install Ubuntu Desktop 10.04 LTS
My plan is to run Windows and Linux guests through virtualization. I decided to try out Ubuntu Desktop 10.04 LTS, affectionately known as Lucid Lynx, which can support a variety of different virtualization tools, including KVM, VirtualBox-OSE and Xen. I have two identical 15GB partitions (sda2 and sda3) that I can use to hold two different systems, or one can be a subdirectory of the other. For now, I'll leave sda3 empty.
Take another backup of my new workstation
I took a fresh new backup of paritions (sda1, sda2, sda6) with Clonezilla.
The next step involved a cross-over Ethernet cable, which I don't have. So that will have to wait until Thursday morning.
(FTC Disclosure: I do not work or have any financial investments in ENC Security Systems. ENC Security Systems did not paid me to mention them on this blog. Their mention in this blog is not an endorsement of either their company or any of their products. Information about EncryptStick was based solely on publicly available information and my own personal experiences. My friends at ENC Security Systems provided me a full-version pre-loaded stick for this review.)
The EncryptStick software comes in two flavors, a free/trial version, and the full/paid version. The free trial version has [limits on capacity and time] but provides enough glimpse of the product to decide before you buy the full version. You can download the software yourself and put in on your own USB device, or purchase the pre-loaded stick that comes with the full-version license.
Whichever you choose, the EncryptStick offers three nice protection features:
Encryption for data organized in "storage vaults", which can be either on the stick itself, or on any other machine the stick is connected to. That is a nice feature, because you are not limited to the capacity of the USB stick.
Encrypted password list for all your websites and programs.
A secure browser, that prevents any key-logging or malware that might be on the host Windows machine.
I have tried out all three functions and everything works as advertised. However, there is always room for improvement, so here are my suggestions.
The first problem is that the pre-loaded stick looks like it is worth a million dollars. It is in a shiny bronze color with "EncryptStick" emblazoned on it. This is NOT subtle advertising! This 8GB capacity stick looks like it would be worth stealing solely on being a nice piece of jewelry, and then the added bonus that there might be "valuable secrets" just makes that possibility even more likely.
If you want to keep your information secure, it would help to have "plausible deniability" that there is nothing of value on a stick. Either have some corporate logo on it, of have the stick look like a cute animal, like these pig or chicken USB sticks.
It reminds me how the first Apple iPod's were in bright [Mug-me White]. I use black headphones with my black iPod to avoid this problem.
Of course, you can always install the downloadable version of EncryptStick software onto a less conspicuous stick if you are concerned about theft. The full/paid version of EncryptStick offers an option for "lost key recovery" which would allow you to backup the contents of the stick and be able to retrieve them on a newly purchased stick in the event your first one is lost or stolen.
Imagine how "unlucky" I felt when I notice that I had lost my "rabbits feet" on this cute animal-themed USB stick.
I sense trouble for losing the cap on my EncryptStick as well. This might seem trivial, but is a pet-peeve of mine that USB sticks should plan for this. Not only is there nothing to keep the cap on (it slides on and off quite smoothly), but there is no loop to attach the cap to anything if you wanted to.
Since then, I got smart and try to look for ways to keep the cap connected. Some designs, like this IBM-logoed stick shown above, just rotate around an axle, giving you access when you need it, and protection when it is folded closed.
Alternatively, get a little chain that allows you to attach the cap to the main stick. In the case of the pig and chicken, the memory section had a hole pre-drilled and a chain to put through it. I drilled an extra hole in the cap section of each USB stick, and connected the chain through both pieces.
(Warning: Kids, be sure to ask for assistance from your parents before using any power tools on small plastic objects.)
The EncryptStick can run on either Microsoft Windows or Mac OS. The instructions indicate that you can install both versions of download software onto a single stick, so why not do that for the pre-loaded full version? The stick I have had only the Windows version pre-loaded. I don't know if the Windows and Mac OS versions can unlock the same "storage vaults" on the stick.
Certainly, I have been to many companies where either everyone runs Windows or everyone runs Mac OS. If the primary target audience is to use this stick at work in one of those places, then no changes are required. However, at IBM, we have employees using Windows, Mac OS and Linux. In my case, I have all three! Ideally, I would like a version of EncryptStick that I could take on trips with me that would allow me to use it regardless of the Operating System I encountered.
Since there isn't a Linux-version of EncryptStick software, I decided to modify my stick to support booting Linux. I am finding more and more Linux kiosks when I travel, especially at airports and high-traffic locations, so having a stick that works both in Windows or Linux would be useful. Here are some suggestions if you want to try this at home:
Use fdisk to change the FAT32 partition type from "b" to "c". Apparently, Grub2 requires type "c", but the pre-loaded EncryptStick was set to "b". The Windows version of EncryptStick> seems to work fine in either mode, so this is a harmless change.
Install Grub2 with "grub-install" from a working Linux system.
Once Grub2 is installed, you can boot ISO images of various Linux Rescue CDs, like [PartedMagic] which includes the open-source [TrueCrypt] encryption software that you could use for Linux purposes.
This USB stick could also be used to help repair a damaged or compromised Windows system. Consider installing [Ophcrack] or [Avira].
Certainly, 8GB is big enough to run a full Linux distribution. The latest 32-bit version of [Ubuntu] could run on any 32-bit or 64-bit Intel or AMD x86 machine, and have enough room to store an [encrypted home directory].
Since the stick is formatted FAT32, you should be able to run your original Windows or Mac OS version of EncryptStick with these changes.
Depending on where you are, you may not have the luxury to reboot a system from the USB memory stick. Certainly, this may require changes to the boot sequence in the BIOS and/or hitting the right keys at the right time during the boot sequence. I have been to some "Internet Cafes" that frown on this, or have blocked this altogether, forcing you to boot only from the hard drive.
Well, those are my suggestions. Whether you go on a trip with or without your laptop, it can't hurt to take this EncryptStick along. If you get a virus on your laptop, or have your laptop stolen, then it could be handy to have around. If you don't bring your laptop, you can use this at Internet cafes, hotel business centers, libraries, or other places where public computers are available.
It's official! My "blook" Inside System Storage - Volume I is now available.
This blog-based book, or “blook”, comprises the first twelve months of posts from this Inside System Storage blog,165 posts in all, from September 1, 2006 to August 31, 2007. Foreword by Jennifer Jones. 404 pages.
IT storage and storage networking concepts
IBM strategy, hardware, software and services
Disk systems, Tape systems, and storage networking
Storage and infrastructure management software
Second Life, Facebook, and other Web 2.0 platforms
IBM’s many alliances, partners and competitors
How IT storage impacts society and industry
You can choose between hardcover (with dust jacket) or paperback versions:
This is not the first time I've been published. I have authored articles for storage industry magazines, written large sections of IBM publications and manuals, submitted presentations and whitepapers to conference proceedings, and even had a short story published with illustrations by the famous cartoon writer[Ted Rall].
But I can say this is my first blook, and as far as I can tell, the first blook from IBM's many bloggers on DeveloperWorks, and the first blook about the IT storage industry.I got the idea when I saw [Lulu Publishing] run a "blook" contest. The Lulu Blooker Prize is the world's first literary prize devoted to "blooks"--books based on blogs or other websites, including webcomics. The [Lulu Blooker Blog] lists past year winners. Lulu is one of the new innovative "print-on-demand" publishers. Rather than printing hundredsor thousands of books in advance, as other publishers require, Lulu doesn't print them until you order them.
I considered cute titles like A Year of Living Dangerously, orAn Engineer in Marketing La-La land, or Around the World in 165 Posts, but settled on a title that matched closely the name of the blog.
In addition to my blog posts, I provide additional insights and behind-the-scenes commentary. If you go to the Luluwebsite above, you can preview an entire chapter in its entirety before purchase. I have added a hefty 56-page Glossary of Acronyms and Terms (GOAT) with over 900 storage-related terms defined, which also doubles as an index back to the post (or posts) that use or further explain each term.
So who might be interested in this blook?
Business Partners and Sales Reps looking to give a nice gift to their best clients and colleagues
Managers looking to reward early-tenure employees and retain the best talent
IT specialists and technicians wanting a marketing perspective of the storage industry
Mentors interested in providing motivation and encouragement to their proteges
Educators looking to provide books for their classroom or library collection
Authors looking to write a blook themselves, to see how to format and structure a finished product
Marketing personnel that want to better understand Web 2.0, Second Life and social networking
Analysts and journalists looking to understand how storage impacts the IT industry, and society overall
College graduates and others interested in a career as a storage administrator
And yes, according to Lulu, if you order soon, you can have it by December 25.
A long time ago, perhaps in the early 1990s, I was an architect on the component known today as DFSMShsm on z/OS mainframe operationg system. One of my job responsibilities was to attend the biannual [SHARE conference to listen to the requirements of the attendees on what they would like added or changed to the DFSMS, and ask enough questions so that I can accurately present the reasoning to the rest of the architects and software designers on my team. One person requested that the DFSMShsm RELEASE HARDCOPY should release "all" the hardcopy. This command sends all the activity logs to the designated SYSOUT printer. I asked what he meant by "all", and the entire audience of 120 some attendees nearly fell on the floor laughing. He complained that some clever programmer wrote code to test if the activity log contained only "Starting" and "Ending" message, but no error messages, and skip those from being sent to SYSOUT. I explained that this was done to save paper, good for the environment, and so on. Again, howls of laughter. Most customers reroute the SYSOUT from DFSMS from a physical printer to a logical one that saves the logs as data sets, with date and time stamps, so having any "skipped" leaves gaps in the sequence. The client wanted a complete set of data sets for his records. Fair enough.
When I returned to Tucson, I presented the list of requests, and the immediate reaction when I presented the one above was, "What did he mean by ALL? Doesn't it release ALL of the logs already?" I then had to recap our entire dialogue, and then it all made sense to the rest of the team. At the following SHARE conference six months later, I was presented with my own official "All" tee-shirt that listed, and I am not kidding, some 33 definitions for the word "all", in small font covering the front of the shirt.
I am reminded of this story because of the challenges explaining complicated IT concepts using the English language which is so full of overloaded words that have multiple meanings. Take for example the word "protect". What does it mean when a client asks for a solution or system to "protect my data" or "protect my information". Let's take a look at three different meanings:
The first meaning is to protect the integrity of the data from within, especially from executives or accountants that might want to "fudge the numbers" to make quarterly results look better than they are, or to "change the terms of the contract" after agreements have been signed. Clients need to make sure that the people authorized to read/write data can be trusted to do so, and to store data in Non-Erasable, Non-Rewriteable (NENR) protected storage for added confidence. NENR storage includes Write-Once, Read-Many (WORM) tape and optical media, disk and disk-and-tape blended solutions such as the IBM Grid Medical Archive Solution (GMAS) and IBM Information Archive integrated system.
The second meaning is to protect access from without, especially hackers or other criminals that might want to gather personally-identifiably information (PII) such as social security numbers, health records, or credit card numbers and use these for identity theft. This is why it is so important to encrypt your data. As I mentioned in my post [Eliminating Technology Trade-Offs], IBM supports hardware-based encryption FDE drives in its IBM System Storage DS8000 and DS5000 series. These FDE drives have an AES-128 bit encryption built-in to perform the encryption in real-time. Neither HDS or EMC support these drives (yet). Fellow blogger Hu Yoshida (HDS) indicates that their USP-V has implemented data-at-rest in their array differently, using backend directors instead. I am told EMC relies on the consumption of CPU-cycles on the host servers to perform software-based encryption, either as MIPS consumed on the mainframe, or using their Powerpath multi-pathing driver on distributed systems.
There is also concern about internal employees have the right "need-to-know" of various research projects or upcoming acquisitions. On SANs, this is normally handled with zoning, and on NAS with appropriate group/owner bits and access control lists. That's fine for LUNs and files, but what about databases? IBM's DB2 offers Label-Based Access Control [LBAC] that provides a finer level of granularity, down to the row or column level. For example, if a hospital database contained patient information, the doctors and nurses would not see the columns containing credit card details, the accountants would not see the columnts containing healthcare details, and the individual patients, if they had any access at all, would only be able to access the rows related to their own records, and possibly the records of their children or other family members.
The third meaning is to protect against the unexpected. There are lots of ways to lose data: physical failure, theft or even incorrect application logic. Whatever the way, you can protect against this by having multiple copies of the data. You can either have multiple copies of the data in its entirety, or use RAID or similar encoding scheme to store parts of the data in multiple separate locations. For example, with RAID-5 rank containing 6+P+S configuration, you would have six parts of data and one part parity code scattered across seven drives. If you lost one of the disk drives, the data can be rebuilt from the remaining portions and written to the spare disk set aside for this purpose.
But what if the drive is stolen? Someone can walk up to a disk system, snap out the hot-swappable drive, and walk off with it. Since it contains only part of the data, the thief would not have the entire copy of the data, so no reason to encrypt it, right? Wrong! Even with part of the data, people can get enough information to cause your company or customers harm, lose business, or otherwise get you in hot water. Encryption of the data at rest can help protect against unauthorized access to the data, even in the case when the data is scattered in this manner across multiple drives.
To protect against site-wide loss, such as from a natural disaster, fire, flood, earthquake and so on, you might consider having data replicated to remote locations. For example, IBM's DS8000 offers two-site and three-site mirroring. Two-site options include Metro Mirror (synchronous) and Global Mirror (asynchronous). The three-site is cascaded Metro/Global Mirror with the second site nearby (within 300km) and the third site far away. For example, you can have two copies of your data at site 1, a third copy at nearby site 2, and two more copies at site 3. Five copies of data in three locations. IBM DS8000 can send this data over from one box to another with only a single round trip (sending the data out, and getting an acknowledgment back). By comparison, EMC SRDF/S (synchronous) takes one or two trips depending on blocksize, for example blocks larger than 32KB require two trips, and EMC SRDF/A (asynchronous) always takes two trips. This is important because for many companies, disk is cheap but long-distance bandwidth is quite expensive. Having five copies in three locations could be less expensive than four copies in four locations.
Fellow blogger BarryB (EMC Storage Anarchist) felt I was unfair pointing out that their EMC Atmos GeoProtect feature only protects against "unexpected loss" and does not eliminate the need for encryption or appropriate access control lists to protect against "unauthorized access" or "unethical tampering".
(It appears I stepped too far on to ChuckH's lawn, as his Rottweiler BarryB came out barking, both in the [comments on my own blog post], as well as his latest titled [IBM dumbs down IBM marketing (again)]. Before I get another rash of comments, I want to emphasize this is a metaphor only, and that I am not accusing BarryB of having any canine DNA running through his veins, nor that Chuck Hollis has a lawn.)
As far as I know, the EMC Atmos does not support FDE disks that do this encryption for you, so you might need to find another way to encrypt the data and set up the appropriate access control lists. I agree with BarryB that "erasure codes" have been around for a while and that there is nothing unsafe about using them in this manner. All forms of RAID-5, RAID-6 and even RAID-X on the IBM XIV storage system can be considered a form of such encoding as well. As for the amount of long-distance bandwidth that Atmos GeoProtect would consume to provide this protection against loss, you might question any cost savings from this space-efficient solution. As always, you should consider both space and bandwidth costs in your total cost of ownership calculations.
Of course, if saving money is your main concern, you should consider tape, which can be ten to twenty times cheaper than disk, affording you to keep a dozen or more copies, in as many time zones, at substantially lower cost. These can be encrypted and written to WORM media for even more thorough protection.
Five years ago, I sprayed coffee all over my screen from something I read on a blog post from fellow blogger Hu Yoshida from HDS. You can read what cased my reaction in my now infamous post [Hu Yoshida should know better]. Subsequently, over the years, I have disagreed with Hu on a variety of of topics, as documented in my 2010 blog post [Hu Yoshida Does It Again].
(Apparently, I am not alone, as the process of spraying one's coffee onto one's computer screen while reading other blog posts has been referred to as "Pulling a Tony" or "Doing a Tony" by other bloggers!)
Fortunately, my IBM colleague David Sacks doesn't drink coffee. Last month, David noticed that Hu had posted a graph in a recent blog entry titled [Additional Storage Performance Efficiencies for Mainframes], comparing the performance of HDS's Virtual Storage Platform (VSP) to IBM's DS8000.
For those not familiar with disk performance graphs, flatter is better, lower response time and larger IOPS are always desired. This graph implies that the HDS disk system is astonishingly faster than IBM's DS8000 series disk system. Certainly, the HDS VSP qualifies as a member of the elite [Super High-End club] with impressive SPC benchmark numbers, and is generally recognized as a device that works in IBM mainframe environments. But this new comparison graph is just ridiculous!
(Note: While SPC benchmarks are useful for making purchase decisions, different disk systems respond differently to different workloads. As the former lead architect of DFSMS for z/OS, I am often brought in to consult on mainframe performance issues in complex situations. Several times, we have fixed performance problems for our mainframe clients by replacing their HDS systems with IBM DS8000 series!)
Since Hu's blog entry contained very little information about the performance test used to generate the graph, David submitted a comment directly to Hu's blog asking a few simple questions to help IBM and Hu's readers determine whether the test was fair. Here is David's comment as submitted:
(Disclosure: I work for IBM. This comment is my own.)
I was quite surprised by the performance shown for the IBM DS8000 in the graph in your blog. Unfortunately, you provided very little detail about the benchmark. That makes it rather difficult (to say the least) to identify factors behind the results shown and to determine whether the comparison was a fair one.
Of the little information provided, an attribute that somewhat stands out is that the test appears to be limited to a single volume at least, that's my interpretation of "LDEV: 1*3390-3"? IBM's internal tests for this kind of case show far better response time and I/Os per second than the graph you published.
Here are a few examples of details you could provide to help readers determine whether the benchmark was fair and whether the results have any relevance to their environment.
What DS8000 model was the test run on? (the DS8000 is a family of systems with generations going back 8 years. The latest and fastest model is the DS8800.)
What were the hardware and software configurations of the DS8000 and VSP systems, including the number and speed of performance-related components?
What were the I/O workload characteristics (e.g., read:write ratio and block size(s))?
What was the data capacity of each volume? (Allocated and used capacity.)
What were the cache sizes and cache hit ratios for each system? (The average I/O response times under 1.5 milliseconds for each system imply the cache hit ratios were relatively high.)
How many physical drives were volumes striped across in each system?"
Unlike my blog on IBM, HDS bloggers like Hu are allowed to reject or deny comments before they appear on his blog post. We were disappointed that HDS never posted David's comment nor responded to it. That certainly raises questions about the quality of the comparison.
So, perhaps this is yet another case of [Hitachi Math], a phrase coined by fellow blogger Barry Burke from EMC back in 2007 in reference to outlandish HDS claims. My earliest mention was in my blog post [Not letting the Wookie Win].
By the way, since the test was about z/OS Extended Address Volumes (EAV), it is worth mentioning that IBM's DS8700 and DS8800 support 3390 volume capacities up to 1 TB each, while the HDS VSP is limited to only 223 GB per volume. Larger volume capacities help support ease-of-growth and help reduce the number of volumes storage administrators need to manage; that's just one example of how the DS8000 series continues to provide the best storage system support for z/OS environments.
Personally, I am all for running both IBM and HDS boxes side-by-side and publishing the methodology, the workload characteristics, the configuration details, and the results. Sunshine is always the best disinfectant!
Last week, I got the following comment from Bob Swann:
I am looking for the IBM VM Poster or a picture of the IBM VM "Catch the Wave"
Do you know where I might find it?
Well, Bob, I made some phone calls. The company that published these posters no longer exists, butI found a coworker at the Poughkeepsie Briefing Center who still had the poster on his wall, and he was kind enough to take a picture of it for you.
VM: The Wave of the Future (click thumbnail at left to see larger image)
Some may recognize this as a [mash-up] using as a base the famous Japanese 10-inch by 15-inch block print[The Great Wave off Kanagawa] byartist [Katsushika Hokusai]. I had this as my laptop'swallpaper screen image until last year when I was presenting in Kuala Lumpur, Malaysia. I was told that it reminded people about the horrible tsunami caused by the [Indian Ocean earthquake] back in 2004.I was actually scheduled to fly the last week of December 2004 to Jakarta, Indonesia, but at the last minute ourclient team changed plans. I would have been on route over the Pacific ocean when the tsunami hit, and probably stranded over there for weeks or months until the airports re-opened.
The Wave theme was in part to honor the IBM users group called World Alliance VSE VM and Linux (WAVV) which is havingtheir next meeting [April 18-22, 2008] in Chattanooga, Tennessee. I presentedat this conference back in 1996 in Green Bay, Wisconsin, as part of the IBM Linux for S/390 team. It started onthe Sunday that Wisconsin switched their clocks for [DaylightSaving Time], and the few of us from Arizona or other places that don't both with this, all showed up forbreakfast an hour early.
When I was in Australia last year, I was told the wave that sports fans do, by raising their hands in coordinatedsequence, was called the [Mexican Wave]in most other countries. When I was there, Melbourne was trying to outlaw this practice at their cricket matches.
The "wave" represents a powerful metaphor, from z/VM operating system on System z mainframes to VMware and Xenon Intel-based processor machines, as the direction of virtualization that we are heading for future data centers.The Mexican wave represents a glimpse of what humans can accomplish with collaboration on a globalscale. It can also represent the tidal wave of data arising from nearly 60 percent annual growth instorage capacity. (I had to mention storage eventually, to avoid being completely off-topic on this post!)
I hope this is the graphic you were looking for Bob. If anyone else has wave-themed posters they would like to contribute, please post a comment below.
I've gotten suggestions to upgrade the memory and disk storage, and how to fine-tune the Microsoft Windows XP operating system. Others suggested replacing the OS with Linux, and to use the Cloud to avoid some of the storage space limitations.
But first, I have to mention the latest in our series of "Enterprise Systems" videos. The first was being [Data Ready]. The second was being [Security Ready]. The now the third in the series: the 3-minute
[Cloud Ready] video.
So I decided to try different Cloud-oriented Operating Systems, to see if any would be a good fit. Here is what I found:
(FTC Disclosure: I work for IBM and own IBM stock. This blog post is not meant to endorse one OS over another. I have financial interests in, and/or have friends and family who work at some of the various companies mentioned in this post. Some of these companies also have business relationships with IBM.)
Jolicloud and Joli OS 1.2
I gave this OS a try. This is based on Linux, but with an interesting approach. First, you have to be on-line all the time, and this OS is designed for 15-25 year-olds who are on social media websites like Facebook. By having a Jolicloud account, you can access this from any browser on any system, or run the Joli OS operating system, or buy the already pre-installed Jolibook netbook computer.
The Joli OS 1.2 LiveCD ran fine on my T410 with 4GB or RAM, giving me a chance to check it out, but sadly did not run on grandma's Thinkpad R31 with 384MB of RAM. According to the [Jolicloud specifications], Joli OS should run in as little as 384MB of RAM and 2GB of disk storage space, but it didn't for me.
Google Chrome and Chromium OS Vanilla
Like the Jolibook, Google has come out with a $249 Chromebook laptop that runs their "Chrome OS". This is only available via OEM install on desginated hardware, but the open source version is available called Chromium OS. These are also based on Linux.
Rather than compiling from source, Hexxeh has made nightly builds available. You can download [Chromium OS Vanilla] zip file, unzip the image file, and copy it to a 4GB USB memory stick. The compressed image is about 300MB, but uncompressed about 2.5GB, so too big to fit on a CD. The image on the USB stick is actually two partitions, and cannot be run from DVD either.
If you don't have a 4GB USB stick handy, and want to see what all the fuss is about, just install the Google Chrome browser on your Windows or Linux system, and then maximize the browser window. That's it. That is basically what Chromium OS is all about.
Files can be stored locally, or out on your Google Drive. Documents can be edited using "Google Docs" in the Cloud. You can run in "off-line" mode, for example, read your Gmail notes when not connected to the Internet. Music and video files can be played using the "Files" app.
If you really need to get out of the browser, you can hit the right combination of keys to get to the "crosh" command line shell.
Like Joli OS, I was able to run this from my Thinkpad T410 with 4GB of RAM, but not on grandma's Thinkpad R31. It appears that Chromium requires at least 1GB of RAM to run properly.
Android for x86
While researching the Chromium OS, I found that there is an open source community porting [Android to the x86] platform. Android is based on Linux, and would allow your laptop or netbook to run very much like a smartphone or tablet. Most of the apps available to Android should work here as well.
Unfortunately, the project has focused only on selected hardware:
ASUS Eee PCs/Laptops
Viewsonic Viewpad 10
Dell Inspiron Mini Duo
Lenovo ThinkPad x61 Tablet
I tried running the Thinkpad x61 version on both my Thinkpad T410 and grandma's Thinkpad R31, but with no success.
Peppermint OS Three
Next up was Peppermint OS, which claims to be a blend of Linux Mint, Lubuntu, and Xfce, but with a "twist" of aspiring to be a Cloud-oriented OS.
Rather than traditional apps to write documents or maintain a calendar, this OS offers a "Single-Site Browser" (SSB) experience, where you can configure "apps" by pointing to their respective URL. For documents, launch GWoffice, the client for Google Docs. For calendar, launch Google Calendar.
Most Linux distros have both a number and a project name associated with them. For example, Ubuntu 10.04 LTS is known as "Lucid Lynx". The Peppermint OS team avoided this by just calling their latest version "Three" which serves as both its number and its name.
The browser is Chromium, similar to Google Chrome OS above, and uses the "DuckDuckGo" search engine. This is how the Peppermint OS folks make their money to defray the costs of this effort.
Peppermint OS claims to run in systems as little as 192MB or RAM, and only 4GB of disk space. The LiveCD ran well on both my Thinkpad T410, as well as grandma's Thinkpad R31. More importantly, when I installed on the hard drive, it ran well.
The music app "Guayadeque" that came pre-installed was awful. It couldn't play MP3 music out-of-the-box. I had to install the Codec plugins from various "ubuntu-restricted-extras" libraries. I also installed the music app "Rhythmbox", and that worked great. Time from power-on to first-note was less than 2 minutes! However, the problems with the Guayadeque gave me the impression this OS might not be ready for primetime.
I contacted grandma to ask if she has Wi-Fi in her home, and sure enough, she doesn't. Her PC upstairs is direct attached to the cable modem. So, while the Cloud suggestion was worthy of investigation, I will continue to pursue other options that do not require being connected. I certainly do not want to spend any time and effort getting Wi-Fi installed there.
Are you tired of hearing about Cloud Computing without having any hands-on experience? Here's your chance. IBM has recently launched its IBM Development and Test Cloud beta. This gives you a "sandbox" to play in. Here's a few steps to get started:
Generate a "key pair". There are two keys. A "public" key that will reside in the cloud, and a "private" key that you download to your personal computer. Don't lose this key.
Request an IP address. This step is optional, but I went ahead and got a static IP, so I don't have to type in long hostnames like "vm353.developer.ihost.com".
Request storage space. Again, this step is optional, but you can request a 50GB, 100GB and 200GB LUN. I picked a 200GB LUN. Note that each instance comes with some 10 to 30GB storage already. The advantage to a storage LUN is that it is persistent, and you can mount it to different instances.
Start an "instance". An "instance" is a virtual machine, pre-installed with whatever software you chose from the "asset catalog". These are Linux images running under Red Hat Enterprise Virtualization (RHEV) which is based on Linux's kernel virtual machine (KVM). When you start an instance, you get to decide its size (small, medium, or large), whether to use your static IP address, and where to mount your storage LUN. On the examples below, I had each instance with a static IP and mounted the storage LUN to /media/storage subdirectory. The process takes a few minutes.
So, now that you are ready to go, what instance should you pick from the catalog? Here are three examples to get you started:
IBM WebSphere sMASH Application Builder
Base OS server to run LAMP stack
Next, I decided to try out one of the base OS images. There are a lot of books on Linux, Apache, MySQL and PHP (LAMP) which represents nearly 70 percent of the web sites on the internet. This instance let's you install all the software from scratch. Between Red Hat and Novell SUSE distributions of Linux, Red Hat is focused on being the Hypervisor of choice, and SUSE is focusing on being the Guest OS of choice. Most of the images on the "asset catalog" are based on SLES 10 SP2. However, there was a base OS image of Red Hat Enterprise Linux (RHEL) 5.4, so I chose that.
To install software, you either have to find the appropriate RPM package, or download a tarball and compile from source. To try both methods out, I downloaded tarballs of Apache Web Server and PHP, and got the RPM packages for MySQL. If you just want to learn SQL, there are instances on the asset catalog with DB2 and DB2 Express-C already pre-installed. However, if you are already an expert in MySQL, or are following a tutorial or examples based on MySQL from a classroom textbook, or just want a development and test environment that matches what your company uses in production, then by all means install MySQL.
This is where my SSH client comes in handy. I am able to login to my instance and use "wget" to fetch the appropriate files. An alternative is to use "SCP" (also part of PuTTY) to do a secure copy from your personal computer up to the instance. You will need to do everything via command line interface, including editing files, so I found this [VI cheat sheet] useful. I copied all of the tarballs and RPMs on my storage LUN ( /media/storage ) so as not to have to download them again.
Compiling and configuring them is a different matter. By default, you login as an end user, "idcuser" (which stands for IBM Developer Cloud user). However, sometimes you need "root" level access. Use "sudo bash" to get into root level mode, and this allows you to put the files where they need to be. If you haven't done a configure/make/make install in awhile, here's your chance to relive those "glory days".
In the end, I was able to confirm that Apache, MySQL and PHP were all running correctly. I wrote a simple index.php that invoked phpinfo() to show all the settings were set correctly. I rebooted the instance to ensure that all of the services started at boot time.
Rational Application Developer over VDI
This last example, I started an instance pre-installed with Rational Application Developer (RAD), which is a full Integrated Development Environment (IDE) for Java and J2EE applications. I used the "NX Client" to launch a virtual desktop image (VDI) which in this case was Gnome on SLES 10 SP2. You might want to increase the screen resolution on your personal computer so that the VDI does not take up the entire screen.
From this VDI, you can launch any of the programs, just as if it were your own personal computer. Launch RAD, and you get the familiar environment. I created a short Java program and launched it on the internal WebSphere Application Server test image to confirm it was working correctly.
If you are thinking, "This is too good to be true!" there is a small catch. The instances are only up and running for 7 days. After that, they go away, and you have to start up another one. This includes any files you had on the local disk drive. You have a few options to save your work:
Copy the files you want to save to your storage LUN. This storage LUN appears persistent, and continues to exist after the instance goes away.
Take an "image" of your "instance", a function provided in the IBM Developer and Test Cloud. If you start a project Monday morning, work on it all week, then on Friday afternoon, take an "image". This will shutdown your instance, and backup all of the files to your own personal "asset catalog" so that the next time you request an instance, you can chose that "image" as the starting point.
Another option is to request an "extension" which gives you another 7 days for that instance. You can request up to five unique instances running at the same time, so if you wanted to develop and test a multi-host application, perhaps one host that acts as the front-end web server, another host that does some kind of processing, and a third host that manages the database, this is all possible. As far as I can tell, you can do all the above from either a Windows, Mac or Linux personal computer.
Getting hands-on access to Cloud Computing really helps to understand this technology!
I am still wiping the coffee off my computer screen, inadvertently sprayed when I took a sip while reading HDS' uber-blogger Hu Yoshida's post on storage virtualization and vendor lock-in.
HDS is a major vendor for disk storage virtualization, and Hu Yoshida has been around for a while, so I felt it was fair to disagree with some of the generalizations he made to set the record straight. He's been more careful ever since.
However, his latest post [The Greening of IT: Oxymoron or Journey to a New Reality] mentions an expert panel at SNW that includedMark O’Gara Vice President of Infrastructure Management at Highmark. I was not at the SNW conference last week in Orlando, so I will just give the excerpt from Hu's account of what happened:
"Later I had the opportunity to have lunch with Mark O’Gara. Mark is a West Point graduate so he takes a very disciplined approach to addressing the greening of IT. He emphasized the need for measurements and setting targets. When he started out he did an analysis of power consumption based on vendor specifications and came up with a number of 513 KW for his data center infrastructure....
The physical measurements showed that the biggest consumers of power were in order: Business Intelligence Servers, SAN Storage, Robotic tape Library, and Virtual tape servers....
Another surprise may be that tape libraries are such large consumers of power. Since tape is not spinning most of the time they should consume much less power than spinning disk - right? Apparently not if they are sitting in a robotic tape library with a lot of mechanical moving parts and tape drives that have to accelerate and decelerate at tremendous speeds. A Virtual Tape Library with de-duplication factor of 25:1 and large capacity disks may draw significantly less power than a robotic tape library for a given amount of capacity.
Obviously, I know better than to sip coffee whenever reading Hu's blog. I am down here in South America this week, the coffee is very hot and very delicious, so I am glad I didn't waste any on my laptop screen this time, especially reading that last sentence!
In that report, a 5-year comparison found that a repository based on SATA disk was 23 times more expensive overall, and consumed 290 times more energy, than a tape library based on LTO-4 tape technology. The analysts even considered a disk-based Virtual Tape Library (VTL). Focusing just on backups, at a 20:1 deduplication ratio, the VTL solution was still 5 times per expensive than the tape library. If you use the 25:1 ratio that Hu Yoshida mentions in his post above, that would still be 4 times more than a tape library.
I am not disputing Mark O'Gara's disciplined approach. It is possible that Highmark is using a poorly written backup program, taking full backups every day, to an older non-IBM tape library, in a manner that causes no end of activity to the poor tape robotics inside. But rather than changing over to a VTL, perhaps Mark might be better off investigating the use of IBM Tivoli Storage Manager, using progressive backup techniques, appropriate policies, parameters and settings, to a more energy-efficient IBM tape library.In well tuned backup workloads, the robotics are not very busy. The robot mounts the tape, and then the backup runs for a long time filling up that tape, all the meanwhile the robot is idle waiting for another request.
(Update: My apologies to Mark and his colleagues at Highmark. The above paragraph implied that Mark was using badproducts or configured them incorrectly, and was inappropriate. Mark, my full apology [here])
If you do decide to go with a Virtual Tape Library, for reasons other than energy consumption, doesn't it make sense to buy it from a vendor that understands tape systems, rather than buying it from one that focuses on disk systems? Tape system vendors like IBM, HP or Sun understand tape workloads as well as related backup and archive software, and can provide better guidance and recommendations based on years of experience. Asking advice abouttape systems, including Virtual Tape Libraries, from a disk vendor is like asking for advice on different types of bread from your butcher, or advice about various cuts of meat at the bakery.
The butchers and bakers might give you answers, but it may not be the best advice.