This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is a frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is IBM Cloud Private, and he will be delivering and supporting sessions at Think2019 and Storage Technical University on the value of IBM storage in this high-value IBM solution, a part of the IBM Cloud strategy. Lloyd maintains a Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
The developerWorks Connections Platform is now in read-only mode and content is only available for viewing. No new wiki pages, posts, or messages may be added. Please see our FAQ for more information. The developerWorks Connections platform will officially shut down on March 31, 2020 and content will no longer be available. More details available on our FAQ. (Read in Japanese.)
EMC Corporation (NYSE:EMC) today announced it has been positioned as a leader in the Forrester Wave™: Enterprise Open Systems Virtual Tape Library (VTL), Q1 2008 by Forrester Research, Inc. (January 31, 2008), an independent market and technology research firm. EMC achieved a position as a leader in the Forrester Wave report on virtual tape libraries based on the largest installed base of the EMC® Disk Library family of systems, its broad ecosystem interoperability. Virtual tape libraries emulate tape drives and work in conjunction with existing backup software applications, enabling fast backup and restoration of data by using high-capacity, low-cost disk drives.
EMC was the first major vendor in the open systems virtual tape library market as it introduced the EMC Disk Library in April 2004 and today is a leading provider of open systems virtual tape solutions, with systems that are designed for businesses and organizations of all sizes.
While the press release implies that "EDL equals VTL", Chuck tries to explain they are in fact very different. Here is an excerpt from his blog post:
Virtual Tape Libraries vs. Disk Libraries
As many of you know, VTLs have been around for a while. They use disk as a cache -- they buffer the incoming backup streams, do some housekeeping and stacking, then turn around and write tape efficiently. When you go to restore, you're usually coming back off of tape, unless the backup image in question is sitting in the disk cache.
Now, there is nothing wrong with the VTL approach, but it was conceived in a time when disks were horribly expensive. It was also pretty clear to many of us that disks were going to be a whole lot cheaper in the near future, and this fundamental assumption wouldn't be valid for much longer.
I kept thinking in terms of disk as a direct target for a backup application. No modifications to the backup application. Native speed of sequential disks for both backup and restore. Tape positioned as a backup to the backup. Use the strengths of the underlying array (e.g. CLARiiON) for performance, availability, management, etc.
We ended up calling the concept a "disk library" to differentiate from the VTLs that had come before it. It was a different value proposition and offering, based on the emergence of lower-cost disk media.
... It's nice to see we're at 1,100+ customers, and still going strong.
For those new to the blogosphere, there is a difference between "Press Releases" as formal corporate communications versus "Blog Posts" which are informal opinions of the individual blogger, which may or may not match exactly the views of their respective employer. As we've learned many times before, one should not treat terms like "first" or "leader" in corporate press releases literally! Let's explore each.
Was EDL the first "open systems" Virtual Tape Library?
This is implied by the Forrester report. Chuck mentions the "VTLs that had come before it" in his blog, and many people are aware that IBM and StorageTek had introduced mainframe-attached VTLs in the 1990s. But what about VTL for "open systems"?
(Hold aside for the moment that the IBM System z mainframe is an open system itself, with z/OS certified as a bona fide UNIX operating system by the [the Open Group] standards body. Most analysts and research firms usually refer only to the non-mainframe versions of UNIX and Windows. Alternative definitions for "open systems" can be found in [Web definitions or Wikipedia]. I will assume Forrester meant non-mainframe servers.)
IBM announced AIX non-mainframe attachment via SCSI connectivity to the IBM 3494 Virtual Tape Server (VTS) on Feb 16, 1999, with general availability on May 28, 1999. That's nearly FIVE YEARS before the April 2004 introduction of EDL. IBM VTS support for Sun Solaris and Microsoft Windows came shortly thereafter in November 2000, and support for HP-UX a bit later in June 2001. One of my 17 patents is for the software inside the IBM 3494 VTS, so like Chuck, I can take some pride in a successful product.
(I don't remember if StorageTek, which was subsequently acquired by Sun, had ever supported non-mainframe operating systems with their Virtual Storage Manager[VSM] offering, but if they did, I am sure it was also before EMC.)
Last week, another EMC blogger, BarryB (aka [the Storage Anarchist]), took me to task in comments on my post [IBM now supports 1TB SATA drives]. He felt that IBM should not claim support, given that the software inside the IBM System Storage N series is developed by NetApp. He compared this to the situation of HP and Sun re-badging the HDS USP-V disk system. If someone else wrote the software, BarryB opines, IBM should not claim credit for it. I tried to explain how IBM provides added value and has full-time employees dedicated to N series development and support, but doubt I have changed his mind.
Why do I bring that up? Because the EMC Disk Library runs OEM software from FalconStor. Basically EMC is assembling a hardware/software solution with components provided from OEM suppliers. Hmmm? Sound familiar? Who is calling the kettle black?
If there is a clear winner here, it is FalconStor itself. Perhaps one of the worst kept industry secrets is that FalconStor software is also used in VTL offerings from Sun, Copan, and IBM, the latter embodied as the [IBM TS7520 Virtualization Engine] offering. If you like the concept of an EDL, but prefer instead one-stop shopping from an "information infrastructure" vendor, IBM can offer the TS7520 along with servers, software and services for a complete end-to-end solution.
Can EMC claim to be "a leader" in Virtual Tape Libraries?
During the measured quarter, IBM shipped its 10 millionth LTO-4 tape drive cartridge to Getty Images, the world's leading creator and distributor of still imagery, footage and multi-media products, as well as a recognized provider of other forms of premium digital content, including music. Getty Images is using the LTO-4 drives as part of a tiered infrastructure of IBM disk and tape solutions that help support the backup needs of their digital imagery;
IBM shipped more than 1,500 Petabytes of tape storage in Q3'07 alone;
During Q3'07, IBM shipped the 10,000th IBM System Storage TS3500 Tape Library. The TS3500 is a highly scalable tape library with support from 1 to 192 tape drives and up to 6,400 cartridge slots for open system, mainframe and virtual tape system attachment.
Let's take a look at the numbers. IBM has sold over 5,400 virtual tape libraries. Sun/STK has sold over 4,000 virtual tape libraries. Both are drastically more than the 1,100 mentioned in Chuck's post. Does IDC recognize EMC in third place? No, EMC chooses instead to declare EDL as disk arrays (probably to prop up their IDC "Disk Tracker" numbers), so they don't even earn an honorable mention under the virtual tape library category. This of course includes the number of mainframe-attached models from IBM and Sun/STK. So, if EMC did call these tape systems instead, they might show up in third place, and as such EMC could claim to be "a leader" in much the same way an athlete can claim to be an "Olympic medalist" winning the bronze for third place. (If you limit the count to just the FalconStor-based models from IBM, EMC, Sun and Copan, then EMC moves up to first or second, but then press release titles like "EMC a Leader in FalconStor-based non-mainframe Virtual Tape Libraries" can get too confusing.)
Chuck, if you are reading this, I feel you have every right to celebrate your involvement with the EDL. Despite having common software and hardware components, both IBM and EMC can rightfully declare their own unique value-add through their respective VTL offerings. Like the IBM N series, the EMC Disk Library is not diminished by the fact the software was written by someone else. BarryB might disagree.
Hello everyone! I am back, fully well-rested from a wonderful 3-week vacation touring the lovely state of Tennessee. Here's a quick recap:
(FCC Disclosure: I mention various companies and products in this blog post. I have no financial interest in any of them, nor have I received any compensation to mention or endorse them here.)
Our first stop was Lynchburg, TN, home of [Jack Daniel's], America's oldest whiskey distillery. Our tour guide, Ron (who both looked and sounded like [John Goodman]) took us first to see how they burn wood to make charcoal, then the natural water spring which supplies the iron-free water used for the whiskey. We then got a whiff of the mash at various stages of fermentation. Lastly, we had samples of Original No. 7, Gentleman's Jack, and Single Barrel.
(A word of caution: Domestic airlines only allow FIVE LITERS of Bourbon, Whiskey or Rum in your checked luggage. That is only six bottles at the 750ml size, of beverages that are between 24 and 70 percent alcohol by volume [ABV]. Anything above 70 percent is considered too flammable to take on the plane. Excess bottles can be custom packed and shipped, but that can be quite expensive. Nearly everyone we met drove all the way to Tennessee instead of flying, and now I understand why.)
While in the area, we had a nice lunch at [Miss Mary Bobo's], a boarding house turned into a restaurant. They only serve one meal a day at 1pm, by reservation only. We were paired up with eight others and served food "family style" at a large round table with a [Lazy Susan].
Jack Daniel's is not the only attraction in the area. We also visited [Falls Mill], a grist mill that grinds corn, wheat and rye for the other distilleries. Mo and I visited [Prichard's Distillery], where they make Whiskey, Rye and Rum. We highly recommend their molasses-flavored "Sweet Lucy"!
We stopped at the famous historic landmark, the [Chattanooga Choo Choo], which was formerly a train station, and now renovated into a hotel. We asked to see the inside of one of the train cars converted into a hotel room.
Gatlinburg and Pigeon Forge
We stayed in a cabin in the [Smokey Mountains] near Gatlinburg. In addition to pleasant rides through the National park, we also walked around the small town, looking at all the shops and amusements.
The next town over is Pigeon Forge, and driving down the main parkway is like Las Vegas in a slightly alternate universe. One person called it the Redneck Riviera!
We spent two days at Dollywood theme park, named after its founder, famous country singer Dolly Parton. We arrived after 3pm the first day, so they gave us the second day free!
In addition to roller coaster rides, artisan shops and restaurants, we found zip lines! Mo and I put on harnesses, attached ourselves to a pulley, and zipped over roller coasters, trees and rivers throughout Dollywood park. It was a lot of fun!
We also went to Dolly Parton's other attraction: Stampede. This was a dinner show with horses. It was similar to the Excalibur show we saw in Las Vegas last year during the week of Edge 2013 Conference.
On our way from Gatlinburg, we stopped in Knoxville to have lunch with clients. We had a choice to make: we could either drive up into Kentucky and visit the distilleries on the Bourbon trail, or drive straight to Nashville and spend more time there. We opted for Nashville, saving the Bourbon trail for a future trip.
Our final stop was Nashville, known as Music City. Our hotel was on Broadway, walking distance between Vanderbilt University and the [honky-tonks] downtown.
We had purchased advance tickets for the [Grand Ole Opry]. This is not your typical concert. Instead, you have no idea who will play until just a few days before. The three-hour show had about a dozen different musical acts, some famous, some new to the country music scene.
We went to the Johnny Cash Museum. People with ticket stubs from the Grand Ole Opry get in for a discount!
Searching [TripAdvisor] for things to do in Nashville, I found [The Escape Game]. You pay them money to lock you up in a room with a bunch of strangers, and then collectively as a team you need to figure out how to escape by solving puzzles and clues.
Each room has different themes. First, we tried the "Underground Playground". You know that TV show [Are you Smarter than a Fifth Grader?] Well, the majority of our so-called team were not in this case, and after 60 minutes the referee told us we had failed and unlocked the door.
We had so much fun that we came back two days later to try a different room. This time we tried "The Heist" which is all about art theft. The strangers we were teamed up with were very motivated to get out of the room in time, and we succeeded, getting out in just 54 minutes!
Mo and I had a great time, but are glad to be back home!
Well, I'm going to take a two week break from blogging. Not because my clarification of storage terminology got me Marc Farley's finger wagging of shame.
No, I'm going on vacation. I'll be going to a third-world country, possibly outside the reaches of cell phones, e-mail and the internet, so I won't be blogging until I get back later this month. Since Clark Hodge has discovered a pattern that I am suspiciously close to massive power failures, I think it best not to tell people exactly where I am going.
So, until I get back, I leave you with a nice piece from Kirby at Storage Sanity who has discovered that IBMers are very nice.
Over the past ten years, my co-workers have asked to write a "guest post" on this blog. This time, Moshe Weiss, IBM Senior Manager, Development and Design, has offered the following post, not in his own voice, but in the voice of his "baby", the Hyper-Scale Manager software.
You might think this is a strange approach, but today we have robots that can dance, and cars that can drive themselves! If software could talk, this is what IBM Hyper-Scale Manager would say:
"I was born a year ago.
It wasn't an easy birth… there were many complications. In fact, so many, that I was almost prematurely born!
Most of my development, in preparation for labor and delivery, was done within the last 6 months of the overall 18 months. I was shaped and designed, and sometimes re-shaped, three times. Lots of assumptions had to be made in hopes to ease a successful delivery and help bring me to full term of the birthing process.
During my first year of maturity, I focused on learning how customers used me; what frustrated them the most, and what they loved or 'almost' loved, while still needing refinement and redesign.
The number of customers adopting me grew higher and higher, as did the number of complaints and bugs that I had to deal with, and my users’ frustrations and dislikes because I wasn't yet a complete solution and still had some missing features.
I was renewed four times! Each renewal improved me and made my senses better and faster, adding new capabilities that helped make me more approachable, intuitive and delightful.
Choosing how to renew, and what to add to each renewal, is not an easy task. Basically, it was about prioritizing user experience versus gaps that were deferred from my birth, versus differentiators to make me unique and sell more, versus features in my roadmap, versus investing huge efforts in my quality.
Each renewal was a complex process with lots of features and behaviors to add, while trying to make my customers’ life a bit easier, since features that were important to them were sometimes considered low priority.
But, there were also good times during my first year:
Huge customer adoption rate
100 new customers in two months!
Growing was a great thing and my parents were and are still so proud! But, like with most things, it came with a price - a lot of sustain issues from the field, requests for changes and bad feedback that I am hard to use and missing core elements.
Being a new baby in the Storage world is not a simple thing, as expectations are huge (mainly because of my successful elder brother, the XIV GUI) and I must quickly keep up with all of them.
Although, I am getting tons of good feedback for being revolutionary and unique. People are emotionally engaged with me, and being that I’m a baby, I love to see emotions!
Huge marketing efforts to put me center stage
However, because of some initial problems at the start -- I am a new product, remember? -- I was thrown out of multiple customer sites, and some sales/marketing guys just stopped believing in me. That made me sad.
My parents did a great job, though, in talking, explaining and demonstrating what I can do, together with what I can’t do now, but will do soon. This really helped in some areas, and customers began to see what my parents saw in me for so many years.
I’m really enthusiastic to hear what people will think of me when I’m two years old!
As part of the renewal I had four times during my first year, design elements were reconsidered, redesigned and rewritten to find the best solutions ever. No product has come even close to what I suggest to the world… I am so proud of myself!
Additionally, my parents wrote approximately 20 patents on my User Interface (UI) elements and User Experience (UX) concepts, which makes me extremely unique.
Prioritizing what goes in and what doesn't was a real challenge, especially during a year when fewer and fewer babysitters handled me. Read my parent's post [How to drive forward an exhausted team?] for more details.
But my parents did it! They succeeded in adding cool features like:
Filter analytics and free text, making the filter a great experience that everyone is using.
Great UX improvements like redesigning the tabs, adding right click menus, and adding more on-boarding enablers
Improving the dashboard.
Improving my core business, capacity management (four different times!), and still working on it.
Adding features that were initially deferred in my birth. Deferring features back then was the way to make my birth go smoother. Now, these missing features annoy people.
Improving quality dramatically, adding automation to the way people test me.
Adding differentiators, like the health widget, with more than 20 best practices that provide helpful tips to the customer when there’s a need to change something in their environment, to avoid future issues.
Continue to bring added values for the 'A-family'. I am monitoring: FlashSystem A9000/R, XIV and Spectrum Accelerate, both on and off premises. This added value makes for a family with the most powerful management solutions and experience."
If you are planning to attend the upcoming IBM Systems Technical University in Orlando, Florida, May 22-26, there will also be a variety of hands-on labs. I recommend participating in the hands-on session to feel and witness the next release of IBM Hyper-Scale Manager.
You may not be the right person to ask but I am asking everyone so "How do you see hybrid disk drives?"
(For the record, I am not immediately related to Robert. At one point, "Pearson" was the 12th most common surname in the USA, but now doesn't even make the Top 100.)
Robert, I would like to encourage you and everyone else to ask questions. Don't worry if I am the wrong person to ask, as I probably know the right person within IBM. Some people have called me the "Kevin Bacon" of Storage, as I am often less than six degrees away from the right person, having worked in IBM Storage for over 20 years.
For those not familiar with hybrid drives, there is a good write-up in Wikipedia.
Unfortunately, most of the people I would consult on this question, such as those from Market Intelligence or Research, are on vacation for the holidays, so, Robert, I will have to rely on my trusted 78-card Tarot deck and answer you with a five-card throw.
Your first card, Robert, is the Hermit. This card represents "introspection". The best I/O is no I/O, which means that if applications can keep the information they need inside server memory, you can avoid the bus bandwidth limitations of going to external storage devices. Where external storage makes sense is when data is shared between servers, or when the single server is limited to a set amount of internal memory. So, consider maxing out the memory in your server first (IBM would be glad to sell you more internal memory!!!), then consider outside solid-state or hybrid devices. 32-bit Windows, for example, has an architectural limit of 4GB.
Your second card, Robert, is the Four of Cups, representing "apathy". On the card, you see three cups together, with the fourth cup being delivered from a cloud. This reminds me that we have three storage tiers already (memory, disk, tape), and introducing a fourth tier into the mix may not garner much excitement. For the mainframe, IBM introduced a Solid-State Device, called the Coupling Facility, which can be accessed from multiple System z servers. It is used heavily by DFSMS and DB2 to hold shared information. However, given some customers' apathy towards Information Lifecycle Management which includes "tiered storage", introducing yet another tier that forces people to decide what data goes where may be another challenge.
Your third card, Robert, is the Chariot, which represents "Speed, Determination, and Will". In some cases, solid-state disks are faster for reading, but can be slower for writing. In the case of a hybrid drive, where the memory acts as a front-end cache, read-hits would be faster, but read-misses might be slower. While the idea of stopping the drives during inactivity will reduce power consumption, spinning up and slowing down the disk may incur additional performance penalties. At the time of this post, the fastest disk system remains the IBM SAN Volume Controller, based on SPC-1 and SPC-2 benchmarks in excess of those published for other devices.
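To put rough numbers on the read-hit versus read-miss trade-off, here is a minimal sketch in Python. It uses a simple weighted average, the latencies are made-up illustrative values rather than measurements of any real drive, and the helper name effective_read_latency_ms is hypothetical.

```python
def effective_read_latency_ms(hit_ratio, hit_ms, miss_ms):
    """Average read latency for a memory cache in front of a spinning disk."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


# Assumed, illustrative numbers only (not taken from any datasheet):
FLASH_HIT_MS = 0.1        # read served from the front-end cache
PLAIN_DISK_MS = 8.0       # read from a disk that never stops spinning
SPIN_UP_MISS_MS = 3000.0  # miss that must first spin up a stopped disk

for ratio in (0.50, 0.90, 0.99, 0.999):
    avg = effective_read_latency_ms(ratio, FLASH_HIT_MS, SPIN_UP_MISS_MS)
    verdict = "faster" if avg < PLAIN_DISK_MS else "slower"
    print(f"hit ratio {ratio:.1%}: {avg:.2f} ms average, {verdict} than plain disk")
```

With these assumed numbers, the hybrid drive only beats the plain disk when nearly every read is a hit, which is the spin-up penalty showing up in the average.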
Your fourth card, Robert, is the Eight of Pentacles, which represents "Diligence, Hard work". The pentacles are coins with five-sided stars on them, and this often represents money. Our research team has projected that spinning disk will continue to be a viable and profitable storage media for at least another eight years.
Your fifth and last card, Robert, is the World, which normally represents "Accomplishment", but since it is turned upside down, the meaning is reversed to "Limitation". Some hybrid disks, and some types of solid state memory in general, do have limitations in the number of write cycles they can handle. Those unhappy with the frequency and slowness of rebuilds on SATA disk may find similar problems with hybrid drives. For that reason, businesses may not trust using hybrid drives for their busiest, mission-critical applications, but certainly might use them for archive data with lower write-cycle requirements.
The tarot cards are never wrong, but certainly interpretations of the cards can be.
Several readers have asked me what is the difference between Hybrid Cloud and Multi-Cloud. The two phrases are used in various contexts, not just by IBM, but also by our competitors, as well as the press and industry analysts.
A hybrid cloud attempts to develop a single platform to run a specific Cloud workload. This single platform combines two or more of the following resources:
on-premise private Cloud
off-premise private Cloud
off-premise public Cloud
A Hybrid Cloud is like the United Nations peacekeeping force. A single force, with a single mission, representing the combined resources of many countries.
A Hybrid Cloud is a deployment model that might offer advantages over just using a Private Cloud, or just using a Public Cloud.
A practical example is Tennis Australia. For three weeks every January, they run the Australian Open, a tennis tournament, with over 4,000 employees, and millions of views to their website each day. For the rest of the year, they have only about 300 employees, and manage quite well to run smaller tournaments for high-school and college students, as well as plan for next year's event.
In this case, a Hybrid Cloud that combines perhaps two racks of an on-premise private Cloud, combined with the incredible power of IBM Cloud, gives them the variability and agility needed to run smoothly without wasting CAPEX on equipment they don't need.
Many "Hybrid Cloud" products focus on being the "glue" that combines two different resources together. This can be at the management layer, the data layer, the application layer, or the infrastructure layer.
In contrast, a Multi-Cloud represents a deployment strategy for different Cloud workloads. One workload might be better served on a Private Cloud, another workload might be better served on a Public Cloud, and a third workload, as we saw above, might benefit from the combined resources of a Hybrid Cloud.
In the past, people felt that all Cloud Service Providers were the same. Just as people buy gasoline from whichever gas station offers the lowest prices, many just chose their Cloud Service Provider based entirely on the costs involved. Loyalty can change the minute new price tables are published.
But today, Cloud Service Providers have made an effort to provide differentiation. For example, your Multi-Cloud might have three Hybrid Clouds. One cloud platform combines your on-premise Private Cloud with IBM Cloud, another combines your on-premise Private Cloud with Amazon Web Services, and a third combines your on-premise Cloud with Microsoft Azure.
In this case, a Multi-Cloud is like the various armed forces. You might deploy the Army for one mission, the Navy for another, and the Air Force or Marines for a third.
Many "Multi-Cloud" products focus on being versatile and multi-purpose. For example, the same FlashSystem 9100 that you deploy in your "Analytics Cloud" platform could also be useful for your "Docker Container Cloud" platform, or your "DevOPS Cloud" platform. IBM's various Multi-Cloud Solutions provide the additional software and services needed to complement the FlashSystem 9100 to pull this off.
Deciding to use a Multi-Cloud strategy is mostly a business decision. Deploying a Hybrid Cloud as one of your Multi-Cloud platforms could be a combination of business and technical decisions.
While most of the post is accurate and well-stated, two opinions in particular caught my eye. I'll be nice and call them opinions, since these are blogs, and always subject to interpretation. I'll put quotes around them so that people will correctly relate these to Hu, and not me.
"Storage virtualization can only be done in a storage controller. Currently Hitachi is the only vendor to provide this." -- Hu Yoshida
Hu, I enjoy all of your blog entries, but you should know better. HDS is a relative newcomer to the storage virtualization arena, and since IBM has been doing this for decades, I will bring you and the rest of the readers up to speed. I am not starting a blog-fight, just want to provide some additional information for clients to consider when making choices in the marketplace.
First, let's clarify the terminology. I will use 'storage' in the broad sense, including anything that can hold 1's and 0's, including memory, spinning disk media, and plastic tape media. These all have different mechanisms and access methods, based on their physical geometry and characteristics. The concept of 'virtualization' is any technology that makes one set of resources look like another set of resources with more preferable characteristics, and this applies to storage as well as servers and networks. Finally, 'storage controller' is any device with the intelligence to talk to a server and handle its read and write requests.
Second, let's take a look at all the different flavors of storage virtualization that IBM has developed over the past 30 years.
IBM introduces the S/370 with the OS/VS1 operating system. "VS" here refers to virtual storage, and in this case internal server memory was swapped out to physical disk. Using a table mapping, disk was made to look like an extension of main memory.
IBM introduces the IBM 3850 Mass Storage System (MSS). Until this time, programs that ran on mainframes had to be acutely aware of the device types being written, as each device type had different block, track and cylinder sizes, so a program written for one device type would have to be modified to work with a different device type. The MSS was able to take four 3350 disks, and a lot of tapes, and make them look like older 3330 disks, since most programs were still written for the 3330 format. The MSS was a way to deliver new 3350 disk to a 3330-oriented ecosystem, and greatly reduce the cost by handling tape on the back end. The table mapping was one virtual 3330 disk (100 MB) to two physical tapes (50 MB each). Back then, all of the mainframe disk systems had separate controllers. The 3850 used a 3831 controller that talked to the servers.
IBM invents Redundant Array of Independent Disks (RAID) technology. The table mapping is one or more virtual "Logical Units" (or "LUNs") to two or more physical disks. Data is striped, mirrored and paritied across the physical drives, making the LUNs look and feel like disks, but with faster performance and higher reliability than the physical drives they were mapped to. RAID could be implemented in the server as software, on top or embedded into the operating system, in the host bus adapter, or on the controller itself. The vendor that provided the RAID software or HBA did not have to be the same as the vendor that provided the disk, so in a sense, this avoided "vendor lock-in". Today, RAID is almost always done in the external storage controller.
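For readers who have never seen how parity buys reliability, here is a minimal sketch in Python. It assumes the XOR parity commonly used by RAID-5 style layouts, a single stripe, and tiny 4-byte strips; the function names are hypothetical, and real controllers do this in firmware or hardware.

```python
from functools import reduce


def xor_strips(strips):
    """XOR equal-length byte strings together, byte by byte."""
    if len({len(s) for s in strips}) != 1:
        raise ValueError("all strips must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def stripe(data, n_disks, strip_size):
    """Split one stripe of data into n_disks strips, zero-padding the tail."""
    if len(data) > n_disks * strip_size:
        raise ValueError("this sketch handles a single stripe only")
    padded = data.ljust(n_disks * strip_size, b"\0")
    return [padded[i * strip_size:(i + 1) * strip_size] for i in range(n_disks)]


# Three data disks plus one parity strip (illustrative sizes).
data_strips = stripe(b"HELLO RAID!!", n_disks=3, strip_size=4)
parity = xor_strips(data_strips)

# Pretend disk 1 failed, then rebuild its strip from the survivors and parity.
survivors = [s for i, s in enumerate(data_strips) if i != 1]
rebuilt = xor_strips(survivors + [parity])
print(rebuilt == data_strips[1])
```

The rebuild works because XOR-ing the parity with every surviving strip cancels them out and leaves only the missing one.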
IBM introduces the Personal Computer. One of the features of DOS is the ability to make a "RAM drive". This is technology that runs in the operating system to make internal memory look and feel like an external drive letter. Applications that already knew how to read and write to drive letters could work unmodified with these new RAM drives. This had the advantage that the files would be erased when the system was turned off, so it was perfect for temporary files. Of course, other operating systems today have this feature: UNIX has a /tmp directory in memory, and z/OS uses VIO storage pools.
This is important, as memory would be made to look like disk externally, as "cache", in the 1990s.
IBM AIX v3 introduces Logical Volume Manager (LVM). LVM maps the LUNs from external RAID controllers into virtual disks inside the UNIX server. The mapping can combine the capacity of multiple physical LUNs into a large internal volume. This was all done by software within the server, completely independent of the storage vendor, so again no lock-in.
IBM introduces the Virtual Tape Server (VTS). This was a disk array that emulated a tape library. A mapping of virtual tapes to physical tapes was done to allow full utilization of larger and larger tape cartridges. While many people today mistakenly equate "storage virtualization" with "disk virtualization", in reality it can be implemented on other forms of storage. The disk array was referred to as the "Tape Volume Cache". By using disk, the VTS could mount an empty "scratch" tape instantaneously, since no physical tape had to be mounted for this purpose.
Contradicting its "tape is dead" mantra, EMC later developed its CLARiiON disk library that emulates a virtual tape library (VTL).
IBM introduces the SAN Volume Controller. It involves mapping virtual disks to managed disks that could be from different frames from different vendors. Like other controllers, the SVC has multiple processors and cache memory, with the intelligence to talk to servers, and is similar in functionality to the controller components you might find inside monolithic "controller+disk" configurations like the IBM DS8300, EMC Symmetrix, or HDS TagmaStore USP. SVC can map the virtual disk to physical disk one-for-one in "image mode", as HDS does, or can also map virtual disks across physical managed disks, using a similar mapping table, to provide advantages like performance improvement through striping. You can take any virtual disk out of the SVC system simply by migrating it back to "image mode" and disconnecting the LUN from management. Again, no vendor lock-in.
The HDS USP and NSC can run as regular disk systems without virtualization, or the virtualization can be enabled to allow external disks from other vendors. HDS usually counts all USP and NSC sold, but never mentions what percentage of these have external disks attached in virtualization mode. Either they don't track this, or they are too embarrassed to publish the number. (My guess: single digit percentage).
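The difference between "image mode" and a striped mapping can be sketched in a few lines of Python. This is only an illustration under assumed names: the managed disk labels are made up, and real SVC extent allocation is more involved than a simple round-robin.

```python
def image_mode_map(virtual_extent, mdisk):
    """Image mode: virtual extent N sits at extent N of a single managed disk."""
    return (mdisk, virtual_extent)

def striped_map(virtual_extent, mdisks):
    """Striped mode: consecutive virtual extents rotate across managed disks."""
    return (mdisks[virtual_extent % len(mdisks)], virtual_extent // len(mdisks))

# Hypothetical managed disks from three different vendors.
mdisks = ["vendorA_lun0", "vendorB_lun3", "vendorC_lun7"]

for extent in range(6):
    print(extent, image_mode_map(extent, "vendorA_lun0"), striped_map(extent, mdisks))
```

In image mode the mapping is the identity, which is why a disk can be brought under management, or taken back out, without moving data. The striped mapping spreads neighboring extents over several arrays so they can be serviced in parallel.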
Few people remember that IBM also introduced virtualization in both controller+disk and SAN switch form factors. The controller+disk version was called "SAN Integration Server", but people didn't like the "vendor lock-in" having to buy the internal disk from IBM. They preferred having it all external disk, with plenty of vendor choices. This is perhaps why Hitachi now offers a disk-less version of the NSC 55, in an attempt to be more like IBM's SVC.
IBM also had introduced the IBM SVC for Cisco 9000 blade. Our clients didn't want to upgrade their SAN switch networking gear just to get the benefits of disk virtualization. Perhaps this is the same reason EMC has done so poorly with its "Invista" offering.
So, bottom line, storage virtualization can, and has, been delivered in the operating system software, in the server's host bus adapter, inside SAN switches, and in storage controllers. It can be delivered anywhere in the path between application and physical media. Today, the two major vendors that provide disk virtualization "in the storage controller" are IBM and HDS, and the three major vendors that provide tape virtualization "in the storage controller" are IBM, Sun/STK, and EMC. All of these involve a mapping of logical to physical resources. Hitachi uses a one-for-one mapping, whereas IBM additionally offers more sophisticated mappings as well.
Last week, in Computer Technology Review's article [Tiering: Scale Up? Scale Out? Do Both], Mark Ferelli interviews fellow blogger Hu Yoshida, CTO of Hitachi Data Systems (HDS). Here's an excerpt:
"MF/CTR: A global cache should be required to implement that common pool that you’re talking about going across all tiers.
Hu/HDS: Right. So that is needed to get to all the resources. Now with our system, we can also attach external storage behind it for capacity so that as the storage ages out or becomes less active we can move it to the external storage. They would certainly have less performance capability, but you don’t need it for the stale data that we’re aging down. Right now we’re the only vendor that can provide this type of tiering.
If you look at other people who do virtualization like IBM’s SVC, the SVC has no storage within it because it’s sitting so if you attach any storage behind it, there is some performance degradation because you have this appliance sitting in front. That appliance is also very limited in cache and very limited in the number of storage boards on it. It cannot really provide you additional performance than what is attached behind it. And in fact, it will always degrade what is attached behind it because it’s not storage, where as our USP is storage and it has a global cache and it has thousands of port connections, load balancing and all that. So our front end can enhance existing storage that sits behind it."
This is not the first time I have had to correct misperceptions that Hu and others have about IBM's SAN Volume Controller (SVC). This month marks my four-year "blogoversary", and I seem to spend a large portion of my blogging time setting the record straight. Here are just a few of my favorite posts setting the record straight on SVC back in 2007:
Since day 1, the SAN Volume Controller has focused primarily on external storage. Initially, the early models had just battery-protected DRAM cache memory, but the most recent model of the SVC, the 2145-CF8, adds support for internal SLC NAND flash solid state drives. To fully appreciate how SVC can help improve the performance of the disks that are managed, I need to use some visual aids.
In this first chart, we look at a 70/30/50 workload. This indicates that 70 percent of the IOPS are reads, 30 percent writes, and 50 percent can be satisfied as cache hits directly from the SVC. For the reads, this means that 50 percent are read-hits satisfied from SVC DRAM cache, and 50 percent are read-miss that have to get the data from the managed disk, either from the managed disk's own cache, or from the actual spinning drives inside that managed disk array.
For writes, all writes are cache-hits, but some of them will be destaged to the managed disk. Typically, we find that a third of writes are over-written before this happens, so only two-thirds are written down to managed disk.
In this example, the SVC reduced the burden of the managed disk from 100,000 IOPS down to 55,000, which is 35,000 reads and 20,000 writes. Some have argued against putting one level of cache (SVC) in front of another level of cache (managed disk arrays). However, CPU processor designers have long recognized the value of hierarchical cache with L1, L2, L3 and sometimes even L4 caches. The cache-hits on SVC are faster than most disk system's cache-hits.
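The arithmetic behind the 55,000 figure can be checked with a short Python sketch. The function and parameter names are my own, and the two-thirds destage fraction is the typical value quoted above rather than a guarantee.

```python
from fractions import Fraction

def backend_iops(total_iops, read_pct, read_hit_pct, write_destage_fraction):
    """Return (read misses, destaged writes) that still reach the managed disk."""
    reads = total_iops * Fraction(read_pct, 100)
    writes = total_iops - reads
    read_misses = reads * (1 - Fraction(read_hit_pct, 100))
    destaged_writes = writes * write_destage_fraction
    return int(read_misses), int(destaged_writes)

# 70/30/50 workload at 100,000 IOPS, with two-thirds of writes destaged.
reads, writes = backend_iops(100_000, 70, 50, Fraction(2, 3))
print(reads, writes, reads + writes)  # 35000 20000 55000
```

Changing the read-hit percentage or the destage fraction shows how sensitive the back-end load is to cache behavior.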
This is a Ponder curve, mapping millisecond response (MSR) times for different levels of I/O per second, named after John Ponder, the IBM scientist who created it. Most disk array vendors will publish similar curves for each of their products. In this case, we see that 100,000 IOPS would cause a 25 millisecond response (MSR) time, but when the load is reduced to 55,000 IOPS, the average response time drops to only 7 msec.
To be fair, the SVC does introduce 0.06 msec of additional latency on read-misses, so let's call this 7.06 msec. This tiny amount of latency could be what Hu Yoshida was referring to when he said there was "some performance degradation". There are other storage virtualization products in the market that do not provide caching to boost performance, but rather just map incoming requests to outgoing requests, and these can indeed slow down every I/O they process. Perhaps Hu was thinking of those instead of IBM's SVC when he made his comments.
Of course, not all workloads are 70/30/50, and not every disk array is driven to its maximum capability, so your mileage may vary. As we slide down the left of the curve where things are flatter, the improvement in performance diminishes.
[Table: IOPS before SVC, IOPS after SVC, MSR before SVC, MSR after SVC]
Hitachi's offerings, including the HDS USP-V, USP-VM and their recently announced Virtual Storage Platform (VSP) sold also by HP under the name P9500, have similar architecture to the SVC and can offer similar benefits, but oddly the Hitachi engineers have decided to treat externally attached storage as second-class citizens instead. Hu mentions data that "ages out or becomes less active we can move it to the external storage." IBM has chosen not to impose this "caste" system onto its design of the SAN Volume Controller.
The SVC has been around since 2003, before the USP-V came to market, and IBM has sold over 20,000 SVC nodes over the past seven years. The SVC can indeed improve performance of managed disk systems, in some cases by a substantial amount. The 0.06 msec latency on read-miss requests represents less than 1 percent of total performance in production workloads. SVC nearly always improves performance, and in the worst case, provides the same performance but with added functionality and flexibility. The performance boost comes as a delightful surprise to most people who start using the SVC.
To learn more about IBM's upcoming products and how IBM will lead in storage this decade, register for next week's webcast "Taming the Information Explosion with IBM Storage" featuring Dan Galvan, IBM Vice President, and Steve Duplessie, Senior Analyst and Founder of Enterprise Storage Group (ESG).
Last month, HP and Oracle jointly announced their new "Exadata Storage Server". This solution involves HP server and storage paired up with Oracle software, designed for Data Warehouse and Business Intelligence workloads (DW/BI).
I immediately recognized the Exadata Storage Server as a "me too" product, copying the idea from IBM's [InfoSphere Balanced Warehouse] which combines IBM servers, IBM storage and IBM's DB2 database software to accomplish this, but from a single vendor, rather than a collaboration of two vendors. The Balanced Warehouse has been around for a while. I even blogged about this last year, in my post [IBM Combo trounces HP and Sun] when IBM announced its latest E7100 model. IBM offers three different sizes: C-class for smaller SMB workloads, D-class for moderate size workloads, and E-class for large enterprise workloads.
One would think that since IBM and Oracle are the top two database software vendors, and IBM and HP are the top two storage hardware vendors, that IBM would be upset or nervous on this announcement. We're not. I would gladly recommend comparing IBM offerings with anything HP and Oracle have to offer. And with IBM's acquisition of Cognos, IBM has made a bold statement that it is serious about competing in the DW/BI market space.
But apparently, it struck a nerve over at EMC.
Fellow blogger Chuck Hollis from EMC went on the attack, and Oracle blogger Kevin Closson went on the defensive. For those readers who do not follow either, here is the latest chain of events:
When it comes to blog fights like these, there are no clear winners or losers, but hopefully, if done respectfully, can benefit everyone involved, giving readers insight to the products as well as the company cultures that produce them. Let's see how each side fared:
Chuck implies that HP doesn't understand databases and Oracle doesn't understand server and storage hardware, so cobbling together a solution based on this two-vendor collaboration doesn't make sense to him. The few I know who work at HP and Oracle are smart people, so I suspect this is more a claim against each company's "core strengths". Few would associate HP with database knowledge, or Oracle with hardware expertise, so I give Chuck a point on this one.
Of course, Chuck doesn't have deep, inside knowledge of this new offering, nor do I for that matter, and Kevin is patient enough to correct all of Chuck's mistaken assumptions and assertions. Kevin understands that EMC's "core strengths" aren't in servers or databases, so he explains things in simple enough terms that EMC employees can understand, so I give Kevin a point on this one.
If two is bad, then three is worse! How much bubble gum and baling wire do you need in your data center? The better option is to go to the one company that offers it all and brings it together into a single solution: IBM InfoSphere Balanced Warehouse.
Are you going to the [IBM Edge 2015 conference]? This is IBM's premier conference covering IBM System Storage, z Systems and POWER Systems.
Here are some secrets for winning prizes while you attend!
Sit in the first FIVE rows of the Technical Kickoffs - Monday 8:30am
Funding has been approved to give out a few nice prizes. To be eligible, you need to show up on time, and sit in the first five rows of any of the following three Kickoffs. I will be in the one for Storage!
Attend sessions by Edge Event Sponsor companies
Brocade, Cisco and others often present lectures at Technical Edge, and they often give out prizes at those sessions, as part of their sponsorship to the event.
Take a "Selfie" with IBM z13 System mainframe
Yes, we will actually have a z13 System on display at the Solution Center for you to take pictures with. Post it on Instagram, Twitter, Facebook or your other favorite social media websites and be eligible to win prizes.
Get your handwriting analyzed with an IBM POWER8 system
Get your handwriting analyzed at the Solution Center and be eligible to win prizes.
Get your badge scanned at as many booths as you can at the Solution Center
Yes, this means you might get an email from the companies involved, but it will also add you to the list of people eligible for some raffles and drawings for prizes.
Participate in the #IBMEdgeHunt scavenger hunt!
Follow the Twitter hashtag #IBMEdgeHunt to see what else the "Hunt Organizers" have in store during the week!
I arrive Sunday afternoon! Below are some of the hashtags I will be using during the event. You can follow me on Twitter at @az990tony.
"How can I participate in IBM's Smarter Planet, specifically Smarter Cities?"
With a lot of college students graduating next month, I thought this would be a good question to answer.
Apply for a Job at IBM
The best way to participate in IBM Smarter Cities is to get a job within IBM, and then get assigned to one of the many IBM Smarter Cities projects. Visit IBM's [Employment Page] to learn why IBM is recognized as one of the top 50 most attractive employers in the world. Mention "Smarter Cities" on your resume so it can be routed to the appropriate manager.
Join the Conversation
Another way to participate in Smarter Cities is to "join the conversation". Each of IBM's 25 different programs has folks that are focused on that area, with blogs, forums and case studies. Here is the conversations page for [Smarter Cities]. Watch the videos at [ibm.com/theSmarterCity]. Play [City One], IBM's Smarter Planet game for Smarter Cities. Provide IBM feedback on any ideas you might have to help make cities smarter.
You can also join in one of the many upcoming [IBM Jam events]. Jams are not restricted to generating business ideas. Their methods, tools and technology can also be applied to social issues. In 2005, over three days, the Government of Canada, UN-HABITAT and IBM hosted Habitat Jam. Tens of thousands of participants - from urban specialists, to government leaders, to residents from cities around the world - discussed issues of urban sustainability. Their ideas shaped the agenda for the UN World Urban Forum, held in June 2006. People from 158 countries registered for the jam and shared their ideas for action to improve the environment, health, safety and quality of life in the world's burgeoning cities.
Buy Products and/or Services from IBM
IBM has the resources to help the planet in so many ways that NGOs and non-profit agencies only dream of. With IBM's advocacy for causes like global public education, universal healthcare, and improved infrastructures, people often forget that IBM is not itself a non-profit organization. IBM has learned early on that creating value for the world can also be good business. The more people buy from IBM, the more skills and resources IBM will have to solve the world's toughest challenges.
(FTC Disclosure: I do not work for, or have any financial investments in, ENC Security Systems. ENC Security Systems did not pay me to mention them on this blog. Their mention in this blog is not an endorsement of either their company or any of their products. Information about EncryptStick was based solely on publicly available information and my own personal experiences. My friends at ENC Security Systems provided me a full-version pre-loaded stick for this review.)
The EncryptStick software comes in two flavors, a free/trial version, and the full/paid version. The free trial version has [limits on capacity and time] but provides enough of a glimpse of the product to help you decide before you buy the full version. You can download the software yourself and put it on your own USB device, or purchase the pre-loaded stick that comes with the full-version license.
Whichever you choose, the EncryptStick offers three nice protection features:
Encryption for data organized in "storage vaults", which can be either on the stick itself, or on any other machine the stick is connected to. That is a nice feature, because you are not limited to the capacity of the USB stick.
Encrypted password list for all your websites and programs.
A secure browser, that prevents any key-logging or malware that might be on the host Windows machine.
I have tried out all three functions and everything works as advertised. However, there is always room for improvement, so here are my suggestions.
The first problem is that the pre-loaded stick looks like it is worth a million dollars. It is in a shiny bronze color with "EncryptStick" emblazoned on it. This is NOT subtle advertising! This 8GB capacity stick looks like it would be worth stealing solely on being a nice piece of jewelry, and then the added bonus that there might be "valuable secrets" just makes that possibility even more likely.
If you want to keep your information secure, it would help to have "plausible deniability" that there is nothing of value on a stick. Either have some corporate logo on it, or have the stick look like a cute animal, like these pig or chicken USB sticks.
It reminds me how the first Apple iPods were in bright [Mug-me White]. I use black headphones with my black iPod to avoid this problem.
Of course, you can always install the downloadable version of EncryptStick software onto a less conspicuous stick if you are concerned about theft. The full/paid version of EncryptStick offers an option for "lost key recovery" which would allow you to backup the contents of the stick and be able to retrieve them on a newly purchased stick in the event your first one is lost or stolen.
Imagine how "unlucky" I felt when I noticed that I had lost my "rabbit's feet" on this cute animal-themed USB stick.
I sense trouble for losing the cap on my EncryptStick as well. This might seem trivial, but is a pet-peeve of mine that USB sticks should plan for this. Not only is there nothing to keep the cap on (it slides on and off quite smoothly), but there is no loop to attach the cap to anything if you wanted to.
Since then, I got smart and try to look for ways to keep the cap connected. Some designs, like this IBM-logoed stick shown above, just rotate around an axle, giving you access when you need it, and protection when it is folded closed.
Alternatively, get a little chain that allows you to attach the cap to the main stick. In the case of the pig and chicken, the memory section had a hole pre-drilled and a chain to put through it. I drilled an extra hole in the cap section of each USB stick, and connected the chain through both pieces.
(Warning: Kids, be sure to ask for assistance from your parents before using any power tools on small plastic objects.)
The EncryptStick can run on either Microsoft Windows or Mac OS. The instructions indicate that you can install both versions of the downloaded software onto a single stick, so why not do that for the pre-loaded full version? The stick I have had only the Windows version pre-loaded. I don't know if the Windows and Mac OS versions can unlock the same "storage vaults" on the stick.
Certainly, I have been to many companies where either everyone runs Windows or everyone runs Mac OS. If the primary target audience is to use this stick at work in one of those places, then no changes are required. However, at IBM, we have employees using Windows, Mac OS and Linux. In my case, I have all three! Ideally, I would like a version of EncryptStick that I could take on trips with me that would allow me to use it regardless of the Operating System I encountered.
Since there isn't a Linux version of EncryptStick software, I decided to modify my stick to support booting Linux. I am finding more and more Linux kiosks when I travel, especially at airports and high-traffic locations, so having a stick that works in both Windows and Linux would be useful. Here are some suggestions if you want to try this at home:
Use fdisk to change the FAT32 partition type from "b" to "c". Apparently, Grub2 requires type "c", but the pre-loaded EncryptStick was set to "b". The Windows version of EncryptStick seems to work fine in either mode, so this is a harmless change.
Install Grub2 with "grub-install" from a working Linux system.
Once Grub2 is installed, you can boot ISO images of various Linux Rescue CDs, like [PartedMagic] which includes the open-source [TrueCrypt] encryption software that you could use for Linux purposes.
This USB stick could also be used to help repair a damaged or compromised Windows system. Consider installing [Ophcrack] or [Avira].
Certainly, 8GB is big enough to run a full Linux distribution. The latest 32-bit version of [Ubuntu] could run on any 32-bit or 64-bit Intel or AMD x86 machine, and have enough room to store an [encrypted home directory].
Since the stick is formatted FAT32, you should be able to run your original Windows or Mac OS version of EncryptStick with these changes.
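Before changing the partition type with fdisk, it can help to confirm what the stick is currently set to. The following Python sketch only reads a copy of the first sector and reports the type byte of each primary partition. It assumes a classic MBR partition table, and the device path in the comment is a placeholder, not a real path.

```python
FAT32_TYPES = {0x0B: "FAT32 (CHS addressing)", 0x0C: "FAT32 (LBA addressing)"}

def partition_types(sector0):
    """Return the type byte of each of the four primary MBR partition entries."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    # The partition table starts at byte 446; each entry is 16 bytes,
    # and the partition type is the byte at offset 4 within the entry.
    return [sector0[446 + 16 * i + 4] for i in range(4)]

def describe(sector0):
    """Label each partition type, naming the two FAT32 variants explicitly."""
    return [FAT32_TYPES.get(t, f"type 0x{t:02x}") for t in partition_types(sector0)]

# Synthetic sector for demonstration. On a real system you would instead read
# the first 512 bytes of the stick, for example from /dev/sdX (placeholder),
# which normally requires root privileges.
sector = bytearray(512)
sector[446 + 4] = 0x0B          # first partition marked as type "b"
sector[510:512] = b"\x55\xaa"   # MBR boot signature
print(describe(bytes(sector))[0])
```

Type "b" and type "c" describe the same FAT32 file system. They differ only in the addressing scheme the partition entry advertises, which is consistent with the Windows software working in either mode.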
Depending on where you are, you may not have the luxury to reboot a system from the USB memory stick. Certainly, this may require changes to the boot sequence in the BIOS and/or hitting the right keys at the right time during the boot sequence. I have been to some "Internet Cafes" that frown on this, or have blocked this altogether, forcing you to boot only from the hard drive.
Well, those are my suggestions. Whether you go on a trip with or without your laptop, it can't hurt to take this EncryptStick along. If you get a virus on your laptop, or have your laptop stolen, then it could be handy to have around. If you don't bring your laptop, you can use this at Internet cafes, hotel business centers, libraries, or other places where public computers are available.
Today, I met with Teresa Ferraro and Mike Buttrum from FirstRain in their Manhattan office in downtown New York City. IBM recently contracted FirstRain to provide IBMers like myself with analytics on publicly-available news to keep us informed for business meetings. Here's how IBMers can get the most out of this service.
Basically, FirstRain takes a list and generates the best summaries of publicly-available news that are most relevant. You can organize into different channels. Here I have seven channels.
Companies to watch refer to existing or prospective clients that I plan to be talking with soon. Some of my colleagues are assigned to specific clients, so they can set this up once and enjoy the news for the rest of the year. I, on the other hand, meet with different clients every week, so I will be updating this list on a frequent basis.
I have divided the Competitors between major ones, and smaller startups. Since I am often working with business partners and distributors, I made that a separate channel as well.
For product lines, I picked three: Data migration, Data storage solutions, and Software defined storage.
For conferences where I don't know which companies will attend, such as the IBM Technical University, I can set up information by territory. Here is one for Brazil.
I also attend industry-oriented events, so I can pick those vertical markets that might be helpful with dinner conversations. In this example, I chose Energy, Electric Utilities and Gas Utilities.
Once you have your channels configured, you get your results in various sections:
Management Changes lists any changes in top C-level positions, who left the company, who got recently hired.
Key Developments indicates news like mergers and acquisitions and government regulations.
First Reads prioritizes the top six articles for your channel. You can access more, but these six will get you started as you have your morning coffee.
First Tweets gives you the six most relevant tweets, if those articles above were just "TL;DR"
A section on Business Influencers and Market Drivers is interesting to see who the big players are, and what topics are driving the most conversation. Here's an example from my Energy/Electric/Gas channel:
The Most Talked About section covers quotes and commentary about the most talked about companies in your channel.
With most news sources focused on politics, weather and celebrity gossip, it is nice to have a quicker, more focused approach to get the news I need to prepare for my client briefings. Special thanks to my hosts Teresa and Mike for their hospitality!
Use more efficient disk media, such as high-capacity SATA disk drives
Both are great recommendations, but why limit yourself to what EMC offers? Your x86-based machines are only a subset of your servers, and disk is only a subset of your storage. IBM takes a more holistic approach, looking at the entire data center.
VMware is a great product, and IBM is its top reseller. But in addition to VMware, there are other solutions for the x86-based servers, like Xen and Microsoft Virtual Server. IBM's System p, System i, and System z product lines all support logical partitioning.
To compare the energy effectiveness of server virtualization, consider a metric that can apply across platforms. For example, for an e-mail server, consider watts per mailbox. If you have, say, 15,000 users, you can calculate how many watts you are consuming to manage their mailboxes on your current environment, and compare that with running them on VMware, or logical partitions on other servers. Some people find it surprising that it is often more cost-effective, and power-efficient, to run workloads on mainframe logical partitions (LPARs) than a stack of x86 servers running VMware.
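A metric like watts per mailbox is simple to compute once you have measured power draw. The Python sketch below uses the 15,000-user example from above; the wattages and environment labels are made-up placeholders for illustration, not measurements of any platform.

```python
def watts_per_mailbox(total_watts, mailboxes):
    """Normalize power draw by the unit of work delivered (here, mailboxes)."""
    if mailboxes <= 0:
        raise ValueError("mailboxes must be positive")
    return total_watts / mailboxes

MAILBOXES = 15_000  # user count from the example above

# Placeholder wattages for illustration only; substitute your own measurements.
candidates = {"current environment": 9_000, "consolidated alternative": 6_000}

for name, watts in candidates.items():
    print(f"{name}: {watts_per_mailbox(watts, MAILBOXES):.2f} W per mailbox")
```

Because the metric is expressed per unit of work rather than per server, the same calculation applies whether the mailboxes run on x86 servers, VMware guests or logical partitions.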
More efficient Media
SATA and FATA disks support higher capacities, and run at slower RPM speeds, thus using fewer watts per terabyte. A terabyte stored on 73GB high-speed 15K RPM drives consumes more watts than the same terabyte stored using 500GB SATA. Chuck correctly identifies that tape is more power-efficient than disk, but then argues that paper is more power-efficient than tape. But paper is not necessarily more efficient than tape.
ESG analyst Steve Duplessie divides up data between Dynamic vs. Persistent. The best place to put dynamic data is on disk, and here is where evaluation of FC/SAS versus SATA/FATA comes into play. Persistent data, on the other hand, can be stored on paper, microfiche, optical or tape media. All of these shelf-resident media consume no electricity, nor generate any heat that would require additional cooling.
A study by scientists at the Lawrence Berkeley National Laboratory titled High-Tech Means High-Efficiency: The Business Case for Energy Management in High-Tech Industries indicates that data centers consume 15 to 100 times more energy per square foot than traditional office space. Storing persistent data in traditional office space can save a huge amount of energy. Steve Duplessie feels the ratio of dynamic to persistent data is 1:10 today, but is likely to grow to 1:100 in the near future, making energy-efficient storage of persistent data ever more important to our environment.
Data centers consume nearly 5000 Megawatts in the USA alone, 14000 Megawatts worldwide. To put that in perspective, the country of Hungary I was in last week can generate up to 8000 Megawatts for the entire country (and they were using 7400 Megawatts last week as a result of their current heat wave, causing them grave concern).
Back in the 1990's, one of the insurance companies IBM worked with kept data on paper in manila folders, and armies of young adults on roller skates were dispatched throughout the large warehouses of shelves to get the appropriate folder in response to customer service inquiries. Digitizing this paper into electronic format greatly reduced the need for this amount of warehouse space, as well as improved the time to retrieve the data.
A typical file storage box (12 inch x 12 inch x 18 inch) containing typed pages single-spaced, double-sided, 12 point font could hold perhaps 100MB. The same box could hold a hundred or more LTO or 3592 tape cartridges, each storing hundreds of GB of information. That's a million-to-one improvement of space-efficiency, and from a watts-per-TB basis, translates to substantial improvement in standard office air conditioning and lighting conditions.
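The space comparison is easy to reproduce with a short Python sketch. The cartridge counts and per-cartridge capacities below are assumptions chosen from within the "a hundred or more" and "hundreds of GB" ranges mentioned above, and the calculation uses decimal units (1 GB = 1000 MB).

```python
def tape_to_paper_ratio(cartridges_per_box, gb_per_cartridge, paper_mb_per_box=100):
    """Ratio of data held by a box of tape cartridges to the same box of paper."""
    tape_mb = cartridges_per_box * gb_per_cartridge * 1000  # 1 GB = 1000 MB
    return tape_mb / paper_mb_per_box

# Assumed combinations of cartridge count and capacity per cartridge.
for cartridges, gb in [(100, 500), (200, 500)]:
    print(cartridges, gb, f"{tape_to_paper_ratio(cartridges, gb):,.0f} : 1")
```

Under these assumptions the ratio lands between roughly half a million and a million to one, depending on how many cartridges fit in the box and how much each one holds.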
To learn more about IBM's Project Big Green, watch this introductory video which used Second Life for the animation.
The concept that there should be a linear "Storage Administrators per TB" rule-of-thumb has been around for a while. Back in 1992, I went to visit a customer in Germany who had FIVE storage admins for a 90 GB (yes, GB, not TB) disk array. I told them they only needed 3 admins, but they cited German laws that prohibited "overtime" work on evenings and weekends.
Later, in 1996, I visited an insurance company in Ohio to talk about IBM Tivoli Storage Manager. They had TWO admins to manage 7TB on their mainframe, and another 45 people managing the 7TB across their distributed systems running Linux, UNIX, and Windows. My first question, why TWO? Only one would be needed for the mainframe, but they responded that they back each other up when one takes a 2-week vacation. My second question to the rest of the audience was... "When was the last time you guys took a 2-week vacation?"
Today, admins manage many TBs of storage. But TBs are turning out not to be a fair ruler to estimate the number of admins you need. It's a moving target, and other factors have more influence than sheer quantity of data. Let's take a look at some of those factors, which we call "the three V's":
Variety of information types
In the beginning, there were just flat text files. In today's world, we have structured databases, semi-structured e-mail systems, hypertext documents, composite applications, audio and video formats that require streaming, and so on. Variety adds to the complexity of the environment. Different data requires different treatment, different handling, and perhaps even different storage technologies.
Volume of data
Data on disk and tape is growing 60% year on year. It's growing on paper also. It's growing on film like photos and X-rays. The problem is not the amount, but the rate of growth. Imagine if the population and traffic in your city or town increased 60% in one year. Most likely people would suffer, because most governments just aren't prepared for that level of growth.
Velocity of change
Back in the 1950's and 1960's, people only had to make updates once a year, scheduling time during holidays. Now, people are making changes every month, sometimes every weekend. One customer we spoke with recently said they do about 8000 changes PER WEEKEND!
So, the key is that there is no simple rule-of-thumb. Fewer admins are needed per TB on mainframe than on distributed systems data. Fewer admins per TB are needed when you deploy productivity software, like IBM TotalStorage Productivity Center. Fewer admins per TB are needed when you deploy storage virtualization, like IBM SAN Volume Controller or IBM virtual tape libraries.
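To see why a fixed admins-per-TB ratio goes stale so quickly, here is a minimal sketch of what 60% year-on-year growth does to capacity. The 100 TB starting point is a hypothetical shop, and the constant growth rate is a simplifying assumption.

```python
import math

ANNUAL_GROWTH = 0.60  # from the text: data growing 60% year on year


def projected_size(initial_tb: float, years: int, growth: float = ANNUAL_GROWTH) -> float:
    """Compound the starting capacity forward by the given number of years."""
    return initial_tb * (1 + growth) ** years


def doubling_time_years(growth: float = ANNUAL_GROWTH) -> float:
    """Years needed for the data to double at a constant growth rate."""
    return math.log(2) / math.log(1 + growth)


# Hypothetical shop starting with 100 TB.
for year in range(0, 6):
    print(f"Year {year}: {projected_size(100, year):,.0f} TB")
print(f"Doubling time: {doubling_time_years():.2f} years")
```

At that rate, capacity doubles roughly every year and a half and grows about tenfold in five years, so any headcount ratio tied to today's TBs would need constant revision.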
Well it's Tuesday again, and you know what that means? IBM Announcements!
Today, IBM announced an exciting new addition to the IBM System Storage™ product line, [IBM Spectrum Storage™], a family of software defined storage offerings.
To understand its significance, I need to explain a few things first. Software defined storage is part of a larger concept of software defined environment.
How is software defined environment different than what you have now? In every data center, you need to map business requirements of an application workload with an appropriate set of IT infrastructure, including server, network and storage resources.
The traditional approach involves an application owner or database administrator reviewing the business requirements documented for the application, calling the server, network and storage administrators, who match those requirements to appropriate IT hardware and notify the folks in facilities to rack and stack the gear accordingly.
In a software defined environment, Application Programming Interfaces (API), Service Level Agreements (SLA) and Orchestration workflows can automate the request for the appropriate resources. This is referred to as the "Control Plane".
Responding to these requests, the software can provision the appropriate server, network and storage resources required. Server, network and storage virtualization, standard interfaces and deployment technologies exist to make this practical. This is referred to as the "Data Plane".
Any time a new way of doing things is introduced into the world, there could be some resistance. Let's tackle the three most frequently stated objections:
"IT infrastructure resources are rare and expensive! Administrators need to control or approve how resources are doled out!" An objection to self-service automation is the fear that employees would take too much.
If you have a bank account, Automated Teller Machines (ATM) can restrict the amount of cash you can take out, based on what is appropriate per request, or per day, with an upper limit of what you have in your personal checking or savings account. You enter your debit card and PIN into the "Control Plane" keypad and out comes a stack of 20-dollar bills from the "Data Plane" slot. In a software defined environment, you can limit requests through quotas and resource pools.
"Some application workloads are more important than others!" Another objection is that every workload will be treated in the same standard way, so that mission critical workloads and dev/test would be treated alike.
At the gas station, you can select different levels of octane gasoline. You enter your credit card and zip code into the "Control Plane" keypad and selected octane comes out of the "Data Plane" hose. In a software defined environment, resources can be provisioned with different Quality of Service (QoS) levels.
"Different applications require different combinations of resources!" Another objection is the fear that fixed combinations of server, storage and network resources will be stifling to innovation and productivity.
At the vending machine, you can choose which candy bar and which chips to have with whatever soft drink you choose for lunch. You enter your bills and coins into the "Control Plane" slot, select the row letter and column number for your snack of choice, and then fetch your purchases from the "Data Plane" flap. In a software defined environment, a Service Catalog can offer a virtual menu of different server, network and storage resources to be combined together as needed.
These concerns are addressed well enough in software defined environments, in general, and with IBM Spectrum Storage family of products, in particular.
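To make the three answers concrete, here is a minimal sketch in Python of a "Control Plane" that checks a request against a service catalog, a quota and a resource pool before handing it to a "Data Plane". Every name in it (SERVICE_CATALOG, ResourcePool, request_storage, provision) is hypothetical and invented for illustration. None of it represents an actual IBM Spectrum Storage interface.

```python
from dataclasses import dataclass, field

# Hypothetical service catalog: QoS tier -> the kind of workload it is meant for.
SERVICE_CATALOG = {
    "gold": "mission critical",
    "silver": "general production",
    "bronze": "dev/test",
}


@dataclass
class ResourcePool:
    """A pool of storage capacity with a per-user quota (both in GB)."""
    capacity_gb: int
    quota_per_user_gb: int
    usage_gb: dict = field(default_factory=dict)  # user -> GB already provisioned

    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.usage_gb.values())


def provision(pool: ResourcePool, user: str, size_gb: int, tier: str) -> str:
    """Data plane stand-in: record the allocation and return a volume label."""
    pool.usage_gb[user] = pool.usage_gb.get(user, 0) + size_gb
    return f"{tier}-volume-{size_gb}GB-for-{user}"


def request_storage(pool: ResourcePool, user: str, size_gb: int, tier: str) -> str:
    """Control plane stand-in: validate the request, then hand off to the data plane."""
    if tier not in SERVICE_CATALOG:
        raise ValueError(f"'{tier}' is not in the service catalog")
    if pool.usage_gb.get(user, 0) + size_gb > pool.quota_per_user_gb:
        raise PermissionError(f"request would exceed {user}'s quota")
    if size_gb > pool.free_gb():
        raise RuntimeError("resource pool is exhausted")
    return provision(pool, user, size_gb, tier)


pool = ResourcePool(capacity_gb=1000, quota_per_user_gb=300)
print(request_storage(pool, "alice", 200, "gold"))  # within quota, so it is provisioned
try:
    request_storage(pool, "alice", 200, "bronze")   # 400 GB in total would exceed the 300 GB quota
except PermissionError as err:
    print("Denied:", err)
```

The quota plays the role of the ATM withdrawal limit, the tier plays the role of the octane selection, and the catalog is the vending machine menu. No administrator has to approve each request by hand.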
(Nostalgia: I remember the days before self-service automation. At the bank, I had to stand in line until I could talk to a human bank teller to get cash from my savings account. At the gas station, human gas attendants would come out and pump the gas for me, check my oil and wash my windshield. And at a restaurant, I felt like I waited an eternity from the time I ordered my meal to the time the human short-order cook had it ready and human wait staff delivered it to my table. These all seem silly today, don't they?)
How do you define success? For some, it is based on their salary, or perhaps revenue they helped close for their company.
For others, their family life and the flexibility to handle work/life issues might be more important.
Still others look for certifications and awards from official agencies.
As a side gig, I sometimes do bartending on the weekends. Typically, these are for weddings or corporate parties.
I took weeks of bartender training and passed a three-hour exam to become state-certified to do so in Arizona. We Arizonans take our liquor seriously! If you think about it, bartending is just a notch below being a Pharmacist dispensing other drugs.
Surprisingly, some of my patrons will be condescending, "Don't you wish you can do more with your life than be a bartender?"
I am also a certified "Laughter Yoga" instructor, and am called in at times to substitute for other instructors. Again, I took formal training and was certified to do so.
Again, some of my students will ask, "Don't you wish you could do more with your life than be a yoga instructor?"
In both cases, I would respond, "Dude, I earn six figures, and am happy to meet new people every week, how about you?" This usually shuts them up!
(For those interested, here are [my top 10 posts] which served as the basis of the interview!)
I am happy to be recognized externally and within IBM for my success as a blogger. Since I started blogging over 10 years ago, I have helped close over $4 Billion USD in revenue for IBM, written five books on IBM Storage, mentored dozens of other successful bloggers, and presented to thousands of clients at conferences, workshops and briefings.
This week, I was reminded that back in 2011, Watson beat two human players, Ken Jennings and Brad Rutter on the TV game show "Jeopardy!" On his last response, Ken wrote "I for one welcome our new computer overlords." With IBM investing heavily in Cognitive Solutions, should people be worried, or welcome the new technology?
Back in 1950, Isaac Asimov proposed the "Three Laws of Robotics":
A robot may not injure a human being or, through inaction, allow a human being to come to harm.
A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Let's take a look at how Artificial Intelligence has been represented in the movies over the past few decades. I have put these in chronological order when they were initially released in the United States.
(FCC Disclosure and Spoiler Alert: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for cognitive solutions made by IBM. While IBM may have been involved or featured in some of these movies, I have no financial interest in them. I have seen them all and highly recommend them. I am hoping that you have all seen these, or are at least familiar enough with their plot lines that I am not spoiling them for you.)
2001: A Space Odyssey
Back in 1968, Stanley Kubrick and Arthur C. Clarke made a masterpiece movie about a mysterious monolith floating near Jupiter. To investigate, a crew of human beings takes a space ship managed by a sentient computer named [HAL-9000].
(Many people thought HAL was a subtle reference to IBM. Stanley Kubrick clarifies:
"By the way, just to show you how interpretation can sometimes be bewildering: A cryptographer went to see the film, and he said, 'Oh. I get it. Each letter of HAL's name is one letter ahead of IBM. The H is one letter in front of I, the A is one letter in front of B, and the L is one letter in front of M.'
Now this is a pure coincidence, because HAL's name is an acronym of heuristic and algorithmic, the two methods of computer programming...an almost inconceivable coincidence. It would have taken a cryptographer to have noticed that."
Source: The Making of 2001: A Space Odyssey, Eye Magazine Interview, Modern Library, p. 249)
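For the curious, the cryptographer's observation is a one-letter Caesar shift, which takes only a few lines of Python to check. The helper name shift_letters is my own, and it assumes uppercase A to Z input.

```python
def shift_letters(word: str, offset: int) -> str:
    """Shift each uppercase letter by `offset` positions, wrapping from Z back to A."""
    return "".join(chr((ord(c) - ord("A") + offset) % 26 + ord("A")) for c in word)


print(shift_letters("HAL", 1))   # prints IBM
print(shift_letters("IBM", -1))  # prints HAL
```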
The problem arises when HAL-9000 refuses commands from the astronauts. The astronauts are not in control; HAL-9000 was given separate orders from ground control back on earth, and it has determined it would be more successful without the crew.
Westworld
In 1973, Michael Crichton wrote and directed this movie about an amusement park with three uniquely themed areas: Medieval World, Roman World, and Westworld. Robots are used to staff the parks to make them more realistic, interacting with the guests in character appropriate for each time period.
A malfunction spreads like a computer virus among the robots, causing them to harm or kill the park's guests. Yul Brynner played a robot called simply "the Gunslinger". Equipped with fast reflexes and infrared vision, the Gunslinger proves especially deadly!
(Michael Crichton also wrote "Jurassic Park", which had a similar story line involving dinosaurs with catastrophic results!)
Last year, HBO launched a TV series called "Westworld", based on the same themes covered in this movie. The first season of 10 episodes just finished, and the next season is scheduled for 2018.
Blade Runner
Directed by Ridley Scott, this 1982 movie stars Harrison Ford as Rick Deckard, a law enforcement officer. Rick is tasked to hunt down and "retire" four cognitive androids named "replicants" that have killed some humans and are now in search of their creator, a man named Dr. Eldon Tyrell.
(I enjoy the euphemisms used in these movies. Terms like kill, murder or assassinate apply to humans but not machines. The word "retire" in this movie refers to destruction of the robots. As we say in IBM, "retirement is not something you do, it is something done to you!")
Destroying machines does not carry the same emotional toll as killing humans, but this movie explores that empathy. A sequel called "Blade Runner 2049" will be released later this year.
WarGames
In 1983, Matthew Broderick plays David, a young high school student who hacks into the U.S. Military's War Operation Plan Response (WOPR) computer. The WOPR was designed to run various strategic games, including war game simulations, learning as it goes. David decides to initiate the game "Global Thermonuclear War", and the military responds as if the threats were real.
Can the computer learn that the only way to win a war is not to wage it in the first place? And if a computer can learn this, can our human leaders learn this too?
The Terminator
In this series of movies, a franchise spanning from 1984 to 2009, the US Military builds a defense grid computer called [Skynet]. After cognitive learning at an alarming rate, Skynet becomes self-aware, and decides to launch missiles, starting a nuclear war that kills over 3 billion people.
Arnold Schwarzenegger plays the Terminator model T-800, a cognitive solution in human form designed by Skynet to finish the job and kill the remainder of humanity.
I, Robot
In this 2004 movie, Will Smith plays Del Spooner, a technophobic cop who investigates a crime committed by a cognitive robot.
(Many people associate the title with author Isaac Asimov. A short story called "I, Robot" written by Earl and Otto Binder was published in the January 1939 issue of 'Amazing Stories', well before the unrelated and more well-known book 'I, Robot' (1950), a collection of short stories, by Asimov.
Asimov admitted to being heavily influenced by the Binder short story. The title of Asimov's collection was changed to "I, Robot" by the publisher, against Asimov's wishes. Source: IMDB)
Del Spooner uncovers a bigger threat to humanity, not just a single malfunctioning robot, but rather the Virtual Interactive Kinesthetic Interface, or simply VIKI for short, a cognitive solution that controls all robots. VIKI interprets Asimov's three laws in a manner not originally intended.
Ex Machina
In this 2015 movie, Domhnall Gleeson plays Caleb, a 26-year-old programmer at the world's largest internet company. Caleb wins a competition to spend a week at a private mountain retreat. However, when Caleb arrives he discovers that he must interact with Ava, the world's first true artificial intelligence, a beautiful robot played by Alicia Vikander.
(The title derives from the Latin phrase "Deus ex machina," meaning "a god from the machine," a phrase that originated in Greek tragedies. Source: IMDB)
Nathan, the reclusive CEO of this company, relishes this opportunity to have Caleb participate in this experiment, explaining how Artificial Intelligence (AI) will transform the world.
(The three main characters all have appropriate biblical names. Ava is a form of Eve, the first woman; Nathan was a prophet in the court of David; and Caleb was a spy sent by Moses to evaluate the Promised Land. Source: IMDB)
The premise is based in part on the famous [Turing Test], developed by Alan Turing. This is designed to test a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
Movies that depict the bad guys as a particular nationality, ethnicity or religion may be offensive to some movie audiences. Instead, having dinosaurs, monsters, aliens or robots provides a villain that all people can fear equally. This helps movie makers reach a more global audience!
Of course, if robots, androids and other forms of Artificial Intelligence did exactly what humans expect them to, we would not have the tense, thrilling action movies to watch on the big screen.
This is not a complete list of movies. Enter in the comments below your favorite movie that features Artificial Intelligence and why it is your favorite!
With all the excitement of the [IBM Challenge], where the [IBM Watson computer] will compete against humans on [Jeopardy!], I thought it would be good to provide the following homework exercise to help you appreciate how challenging the game is and the strategies required.
Overview of the game of Jeopardy!
If you are familiar with the show, you can safely skip this section.
Known as "America's Favorite Quiz Show", Jeopardy pits three contestants against each other. The board is divided into six columns and five rows of answers. Each column indicates the category for that column of answers. The rows are ranked from easiest to most difficult, with more difficult answers being worth more money to wager.
The contestants take turns. The returning champion gets to select a spot on the board, by indicating the category (column) and wager (row), such as "I will take Animals for 800 dollars!" Contestants must then press a button to "buzz in", be recognized by the host, and respond correctly. If the contestant responds incorrectly, the other two contestants have the opportunity to respond. The contestant with the correct response gets to choose the next answer.
For each turn, the host, Alex Trebek, shows the answer on the board, and spends three seconds reading it aloud to give everyone a chance to come up with a corresponding question. This is perhaps what Jeopardy is most famous for. In a traditional "Quiz Show", the host asks questions, and the contestants answer them. On Jeopardy, however, the host poses "answers", and the contestants provide their response in the form of a "question" that best fits the category and answer clues. For example, if the category were "Large Corporations" and the answer was "Sam Palmisano", the contestant would respond "Who is the CEO of IBM Corporation?" Both the categories and the answers are filled with puns, slang and humor to make it more challenging. Often, the answer itself is not a sufficient clue; you have to factor in the category as well to have a complete set of information.
The game is played in three rounds:
In the first round, there are six categories, and the rows are worth $200, $400, $600, $800 and $1000. If you respond correctly on all five answers in a category column, you would win $3000. If you respond to all thirty answers correctly, you would earn $18,000.
In the second round, there are six different categories, and the rows are worth twice as much.
The final round has a single category and a single question. Each player can decide to wager up to the full amount of their score in this game. This wager is done after they see the category, but before they see the answer.
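The board arithmetic for the first two rounds can be sketched in a few lines of Python. This counts face values only and ignores any wagering. The function name board_totals is my own.

```python
ROUND_ONE_VALUES = [200, 400, 600, 800, 1000]  # row values from the text
CATEGORIES = 6                                 # six columns per round


def board_totals(row_values, categories=CATEGORIES):
    """Return (total for one full column, total for the whole board)."""
    column_total = sum(row_values)
    return column_total, column_total * categories


round_one = board_totals(ROUND_ONE_VALUES)
round_two = board_totals([2 * value for value in ROUND_ONE_VALUES])  # rows worth twice as much
print(round_one)  # (3000, 18000)
print(round_two)  # (6000, 36000)
```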
After the host finishes reading the answer aloud, the buzzers are lit so that the contestants can buzz in. If a contestant gets the question right, he earns the corresponding money for the row it was in. If the contestant guesses incorrectly, the money is subtracted from his score. If the first contestant fails, the buzzers are re-lit so the other two contestants can then buzz in with their responses, learning from previous failed attempts.
To provide added challenge, some of the answers are surprise "Daily Doubles". Instead of the dollar amount for the row, the contestant can wager any amount, based on his confidence in that category, up to the total score he has won so far in that game or the largest dollar amount for that round, whichever is higher. There is one "Daily Double" surprise in the first round, and two in the second round.
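The "whichever is higher" rule for the Daily Double is easy to express in code. This is a minimal sketch of the rule as described above, with a function name of my own choosing.

```python
def max_daily_double_wager(score: int, top_row_value: int) -> int:
    """Largest allowed wager: the current score or the round's top row value, whichever is higher."""
    return max(score, top_row_value)


print(max_daily_double_wager(300, 1000))   # 1000: a low score can still bet the top row value
print(max_daily_double_wager(5400, 2000))  # 5400: a high score can bet everything
```

This means a contestant who is behind, or even in negative territory, still has a chance to catch up.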
In the final round, each contestant wagers an amount up to their total score, based on their confidence in the final category. A common strategy for the leading contestant with the highest score is to wager a low amount, so that if he fails to guess the response correctly, he will still have a large dollar amount. For example, if the leader has $2000 and the second place has $900, the leader can wager only $100, and the second place might wager his full $900. If the leader loses the round, he still has $1900, beating the second place regardless of how well the second place does.
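Here is a small sketch of that leader's strategy. It assumes the worst case for the leader: the runner-up wagers everything and responds correctly. It also considers only two contestants. The function name is hypothetical.

```python
def largest_safe_wager(leader: int, runner_up: int) -> int:
    """Largest amount the leader can lose and still finish ahead of a runner-up who doubles up.

    Returns 0 when the leader has no margin to risk.
    """
    runner_up_best = 2 * runner_up  # runner-up wagers everything and is right
    return max(0, leader - runner_up_best - 1)


print(largest_safe_wager(2000, 900))  # 199
```

In the example above, the runner-up can finish with at most $1800, so any wager up to $199 keeps the leader ahead even after a wrong response. The $100 wager falls comfortably inside that margin.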
Whoever has the most money at the end of all three rounds wins that amount of cash, and gets to return to the show for another game the next day to continue his winning streak. The other two contestants are given consolation prizes and a nominal appearance fee for being on the show, and are never heard from again.
The show is only 30 minutes long, so the folks at Sony Pictures who produce the show can film a full week's worth of television shows in just two days of real-life, Tuesday and Wednesday, allowing the host Alex Trebek and his "Clue Crew" time to research new categories and answers.
So, here is your homework assignment. Record a full episode of Jeopardy on your VCR or Digital Video Recorder (DVR) and have your thumb ready to press the pause button. For each round, listen to each category, pause, and try to guess what all the answers in that column will have in common. For each category, write down a statement like "All the responses in this category are ...".
The answers could be people, places or things. Suppose the category is "Chicks Dig Me". In English, "chicks" can be slang for women, or refer to young chickens. The term "dig" can be slang for admires or adores, so this could be "Male Celebrities" that women find attractive, it could be objects of desire that women fancy (diamonds, puppies, etc.), or it could be places that women like to go to. As it turns out, the "dig" referred to archaeology, and the responses were all famous female archaeologists.
Once you have all your statements written down, press the play button again.
Next, as each answer is shown, you have three seconds to hit the pause again, so that you have the answer on the screen, but before any contestants have responded. Go on your favorite search engine like Google or Bing and try to determine the correct response based on the category and answer. Consider these [tips for being an Internet Search ninja]. Once you think you have figured out your response, write it down, and the dollar amount you wager, or decide you will not respond for that answer, if you are not sure about your findings.
Even if you think you already know the correct response, you may decide to gain more confidence in your response by finding confirming or supporting evidence on the Internet.
Press play. Either one of the contestants will get it right, or the host will provide the question that was expected as the correct response.
How well did you do? Were you able to find the correct response online, or at least confirm that what you knew was correct? If you got it correct, add your dollar amount to your score. If you got it wrong, subtract the amount.
At the end of each round, look back at your statements for each category. Did you guess correctly the common theme for each category column of answers? Did you misinterpret the slang, pun or humor intended?
At the end of the game, you might have done better than the contestant that won the game. However, check how much added time you took to do those Internet searches. The average winner only questions half of the answers and only gets 80 percent of them correct.
If you are really brave, take the [Jeopardy Online Test]. If you do this homework assignment, feel free to post your insights in the comments below.
We are only days away from the big IBM Challenge of Watson computer against two human contestants on the show Jeopardy!
I watched two episodes of Jeopardy! on my Tivo, pausing it to follow the [homework assignment] I suggested in my last post. Here are my own results and observations.
Episode  involved a web programmer, a customer service representative, and a bank teller.
Of the first six categories in Round 1, I guessed four of the six themes for each category. For the category "Diamonds are Forever", I wrote down "All answers are some kind of gem or mineral", but the reality was that all the answers were some physical characteristic of diamonds specifically. For the category "...Fame is not", I wrote down "All answers are TV or Movie celebrities". I was close, but actually it was famous celebrities, rock bands and pop culture of the 1980s. (The movie "Fame" came out in 1980).
In this round, there were 27 of the 30 answers given before they ran out of time. Of these, I was able to get 24 of 27 correct by searching the Internet. That is 88 percent correct. Here were the ones that eluded me:
Answer related to a "multi-chambered mollusk". I could not find anything on the Internet definitively on this, so abstained from wagering. The correct question was "What is Nautilus?".
Answer was the Irish variant of "Kathryne". I found Kathleen as a variant, but did not investigate if it had Irish origins. The correct question was "What is Caitlin?"
Answer was this Norse name for "ruler" whether you had red hair or not. I found "Roy" and "Rory" so guessed "What is Rory?" The correct question was "What is Eric?"
In the second round, I guessed three of the six themes for the categories. For category "Musical Titles Letter Drop" I wrote down "All the answers are titles of musical songs" but it was actually "Musicals" as in the Broadway shows. For category "Place called Carson", I wrote down "All the answers are places" and was way off on that one, with answers that were people, places and names of corporations. And for "State University Alums", I wrote down "All the answers are college graduates", but instead they were all "State Universities" such as the University of Arizona.
In this second round, only 26 answers were posed. I got 80 percent correct with Internet searching. I missed three on the "Musical Titles", one in "Pope-pourri" and one State University (sorry SMU). The "Musical Titles Letter Drop" category was especially difficult, as for each title of a Musical, you had to remove a single letter out of it to form the correct response.
For the answer "Good luck when you ask the singers "What I Did For Love"; they never tell the truth", you would need to take "Chorus Line" the musical, where the song "What I did for Love" appears, and ask "What is Chorus Lie?" Note that "line" changed to "lie" and the letter "n" was dropped out.
For the answer "Embrace the atoms as Simba and company lose and gain electrons en masse in this production", you would need to recognize that Simba was the main character of "The Lion King" and change it to "What is The Ion King".
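Verifying a "letter drop" is mechanical, even if coming up with the pun is not. Here is a minimal sketch of a checker. The function name is my own, and it ignores capitalization because dropping the "L" promotes the "i" in "Ion" to a capital.

```python
def is_letter_drop(title: str, response: str) -> bool:
    """True if `response` is `title` with exactly one letter removed (ignoring case)."""
    t, r = title.lower(), response.lower()
    if len(r) != len(t) - 1:
        return False
    # Try removing each letter in turn; spaces and punctuation do not count as letters.
    return any(t[i].isalpha() and t[:i] + t[i + 1:] == r for i in range(len(t)))


print(is_letter_drop("Chorus Line", "Chorus Lie"))      # True
print(is_letter_drop("The Lion King", "The Ion King"))  # True
print(is_letter_drop("The Lion King", "The Lion Kin"))  # True, though it makes no pun
```

The last line shows the catch: checking that one letter was dropped is the easy part, while choosing the one drop that fits the clue takes an understanding of the joke.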
I think these play-on-words are the questions that would stump the IBM Watson computer.
In the final round, the category was "Ancient Quotes". I thought the answer would be a famous adage or quotation, but it was instead famous people who uttered those phrases. The answer was "He said, to leave this stream uncrossed will breed manifold distress for me; to cross it, for all mankind". I was able to determine the correct response readily from searching the Internet: The river was the Rubicon, the border of the Gaul region governed by an ambitious general. The correct response was "Who was Julius Caesar?"
Total time for the entire exercise: 87 minutes.
The following night, episode  brought back Paul Wampler, the returning champion web programmer, against two new contestants: an actor and a high school principal.
Of the first six categories in Round 1, I guessed five of the six themes for each category. For the category "Nonce Words", I wrote all the answers would be nonsense words. I was close, the clues had words invented for a particular occasion, but the correct responses did not.
I was able to get 29 of 30 correct by searching the Internet. That is 96 percent correct. The one I missed was in the category "Nonce Words" and the answer was "In an arithmocracy, this portion of the population rules, not trigonometry teachers." My response was "What is Math?" but the correct response was "What are the majority?" It did not occur to me to even look up [Arithmocracy] as a legitimate word, but it is real.
In the second round, I guessed five of the six themes for the categories. For category "Hawk" eyes, the "Hawk" was in quotation marks, so I wrote "All answers would start with the word Hawk or end with the word eyes". I was close, the correct theme was that the word "hawk" would appear in the front, middle or end of the correct response.
In this second round, I got 28 of 30 correct. I got 93 percent correct with Internet searching. Ironically, it was the category "German Foods" that caught me off guard.
One answer was "Pichelsteiner Fleisch, a favorite of Otto von Bismarck, is this one-pot concoction, made with beef & pork". I know that "fleisch" is a German word for meat, so I guessed "What is sausage?" but the correct response was "What is stew?" I should have paid more attention to the "one-pot concoction" part of the answer.
The other answer was "Mimi Sheraton says German stuffed hard-boiled eggs are always made with a great deal of this creamy product". I didn't realize that "stuffed eggs" was German for "deviled eggs". Instead, I found Mimi Sheraton's "The German Cookbook" on Google Books, and jumped to the page for "Stuffed Eggs". The ingredients I read included whipped cream, cognac, and worcestershire sauce. Taking the "creamiest" ingredient of these, I wrote down "What is whipped cream?" However, it turned out I was actually reading the ingredients for "Crabmeat Cocktail" that was continuing from the previous page. I thought it was gross to put whipped cream with eggs, and should have known better. The correct response was "What is mayonnaise?"
In the final round, the category was "Political Parties". This could either be political organizations like Republicans and Democrats, or festivities like the White House Correspondents' Dinner. The answer was "Only one U.S. president represented this party, and he said, I dread...a division of the republic into two great parties." So, we can figure out the answer refers to political organizations, but both Democrat and Republican are ruled out because each has had multiple presidents. So, looking at a [List of Political Parties of each US President], I found that there were four presidents in the Whig party, four in the Democratic-Republican party, but only one president in the Federalist party (John Adams), and one in the National Union party (Andrew Johnson). Looking at [famous quotes from John Adams] first, I found the quote, it matched, and so I wrote down "What is the Federalist party?". I got it right, as did two of the three contestants. Ironically, the one contestant who got it wrong, the returning champion web programmer, wagered a small amount, so he still had more money after the round and won the game overall.
Total time for the entire exercise: 75 minutes. I was able to do this faster as I skipped searching the Internet for the responses I was confident about.
To find out when Jeopardy is playing in your town, consult the [Interactive Map].