This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is with IBM Cloud Private and he will be delivering and supporting sessions at Think2019, and Storage Technical University on the Value of IBM storage in this high value IBM solution a part of the IBM Cloud strategy. Lloyd maintains a Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
The developerWorks Connections platform will be sunset on December 31, 2019. On January 1, 2020, this blog will no longer be available. More details available on our FAQ.
While clients and IBM executives were in meetings today, in and around the Scottsdale Fairmont resort here in Scottsdale, Arizona, I helped to set up the "Solutions Showcase". There were three stations:
David Ayd and I manned this one, covering storage and server systems. From left to right: a fully-populated 15-module XIV storage system, my laptop running the XIV GUI; two-socket 16-core POWER p770 server, a solid-state drive, PS702 POWER blade, my book Inside System Storage: Volume I, HX5 x86 blade, and four-socket 16-core x3850 M3 server with MAX5 memory extension; David's laptop with various POWER and System x presentations, and our Kaon V-Osk interactive plasma screen display.
Eric Kern manned the Smarter Clouds station. He had live guest images on the IBM Developer and Test cloud, which one the "Best of Interop" award up in Las Vegas this week. I covered IBM's cloud offering in my post [Three Things To Do on the IBM Cloud].
Smarter Data Centers
Ken Schneebeli manned the "Smarter Data Centers" station. He directed people out to the parking lot to see Brian Canney and the Portable Modular Data Center (PMDC). The one here is 8.5 feet by 8.5 feet by 40 feet in size and can be configured and deployed in 12-14 weeks to any location. We can fit any mix of IBM and non-IBM equipment, provided it meets physical dimensions. Want a DS8700 disk system? The PMDC can hold up to 3-frame configurations of the DS8700. Want an eclectic mix of Sun, HP and Dell servers with HDS and EMC disk in your PMDC? IBM can do that too.
After we finished setup, we joined the clients at the "Welcome Reception" on the Lagoon Lawn. The weather was quite pleasant.
Special thanks to Jasdeep Purdhani, Lisa Gates, and Kelly Olson for their help organizing this event.
The Tucson Executive Briefing Center hosted 20 dignitaries from local companies and academia.
This is a historic competition, an exhibition match pitting a computer against the top two celebrated Jeopardy champions:
Brad Rutter, won $3.2 million USD on Jeopardy!, winning 5 days on the show, and then three later tournamets.
Ken Jennings, winning $2.5 million in a 74-day winning streak on Jeopardy!
One of the members of the audience had never seen an episode of Jeopardy! in his life.
(Note: there are NO SPOILERS in this blog post. If you have not yet watched the show, you are safe to continue reading the rest of this post. I will not
disclose the correct responses to any of the clues nor how well each contestant scored.)
Calline Sanchez, IBM Director, Systems Storage Development for Data Protection and Retention, kicked off today's ceremonies.
The IBM Watson computer, named after IBM founder Thomas J. Watson, has been developed over the past 4 years by a team of IBM scientists who set out to accomplish a grand challenge - build a computing system that rivals a human's ability to answer questions posed in natural language with speed, accuracy and confidence. IBM Research labs in the United States, Japan, China and Israel [collaborated with Artificial Intelligence (AI) experts at eight universities], including Massachusetts Institute of Technology (MIT), University of Texas (UT) at Austin, University of Southern California (USC), Rensselaer Polytechnic Institute (RPI), University at Albany (UAlbany), University of Trento (Italy), University of Massachusetts Amherst, and Carnegie Mellon University.
(Disclaimer: I attended the University of Texas at Austin. My father attended Carnegie Mellon University.)
Last week, NOVA on PBS had a special episode on the making of IBM Watson, you can [watch it online] on their website. Delaney Turner, IBM Social Media Communications Manager for Business Analytics Software, has posted [his observations of Nova].
Since IBM Watson is the size of 10 refrigerators and weighs over 14,000 pounds, it was easier to design the Jeopardy! set at the TJ Watson Research lab in Yorktown Heights, NY, than to ship it over to California where the show is normally recorded. Two of the visual designers that worked on this set, as well as on the visual appearance of Watson, live in Tucson and were part of our audience today.
The IBM Challenge consists of a two-game tournament, where the scores of both games will be added to determine winner rankings. The producers of Jeopardy! will give $1 million dollars USD to first place, $300,000 to second place, and $200,000 to third place. Regardless of outcome, [IBM will donate all of its winings to charity]. The two human contestants plan to donate half of their earnings to their favorite charities as well.
Jeopardy! The IBM Challenge
Alex Trebek introduces IBM Watson, explaining that it can neither hear nor see. It will receive all information electronically. Categories and clues will be sent as text files via TCP/IP over Ethernet at the same time the two human contestants see them so that all have the same time to think about the right answer.
Watson has two rows of five racks, back to back. This was done so that cold air could rise up from holes in the tile floors around the unit, and all the hot air would be forced into the center and up to the ceiling return. This technique is known as "hot aisle/cold aisle" design. Alex Trebek opens one of the rack doors to show a series of 4U-high IBM Power 750 servers.
The avatar is a representation of Watson, as the machine itself is too big to fit behind the podium. The avatar is IBM's "Smarter Planet" logo with orbiting streaks and circles. It shows "Green" when it has high confidence, and orange when it gets an answer wrong. When busy thinking, the streaks and circles speed up, the closest we will see to "watching a computer sweat."
During the show, an "Answer panel" shows Watson's top three candidate responses, with confidence level compared to its current "buzz threshold".
Watson knows what it knows, and knows what it doesn't know. Here is an [Interactive Watson Game] on New York Times website to give you an idea of how the answer panel works. I was impressed with how close all three candidate answers were. In a question about Olympic swimmers, all three candidates are Olympic swimmers. In a question about the novel "Les Miserables", all three candidates were characters of that novel.
Well, IBM Watson did well, but missed answered some questions incorrectly. This [parody Slate video] pokes fun at this. Here were some discussions we had after the show ended:
IBM did not do well in categories that required [abductive reasoning]. For example, to identify two or three things that happened in different years, and then postulate that what they all have in common is a specific decade (such as the 1950s) is difficult.
Watson does not hear the wrong answers from the two human contestants. For one question, Ken buzzes in first, guesses wrong, then Watson buzzes in with the same exact response. Alex Trebek rebukes Watson with "No, Ken just said that!" Brad would learn from their mistakes and guess correctly for the score.
Watson is provided the correct answer after a contestant guesses it correctly, or if nobody does, when Alex provides the correct response. This is sent as a text message to Watson immediately, so that it can use this information to adjust its algorithms and machine-learning for future clues in that same category. This was evident in the "Answer panel" on the fourth and fifth attempts on the category of "Decades".
With this demonstration, IBM Research has advanced science by leaps and bounds for the Articial Intelligence community. IBM is a leader in Business Analytics, and this technology will find uses in a variety of industries. The average knowledge worker spends 30 percent of her time looking for information on corporate data repositories. By demonstrating a computer that can provide answers quickly, employees will be more productive, make stronger business decisions, and have greater insight.
Day 1 was only able to cover the first round of Game 1. This allowed more time to talk about the history and technology of IBM Watson. Tomorrow, the contestants will finish Game 1 and head into Game 2.
The last keynote session of the [Oracle OpenWorld 2011] conference was Oracle making a few major announcements.
Steve Miranda, Senior VP for Oracle Applications, explained the new "Fusion 11g Apps" which are now generally available. Basically, they took all the scattered applications they have from acquisitions of PeopleSoft, JD Edwards, Siebel and so on, and re-wrote them to industry-standard Java so that they would all run either on-premise or in the Cloud. The Enterprise Apps come in seven categories: Financials like General Ledger and Payroll; Human Capital Management (HCM) formerly known as Human Resources; Supply Chain Management (SCM); Customer Relationship Management (CRM); Governance Risk and Compliance (GRC); Procurement; and Project/Portfolio Management (PPM). Oracle also has "Industry Apps" for specific verticals.
All of these apps have "embedded BI" (business intelligence), such as dashboards, multi-dimensional calculations, decision support, and real-time optimization. This is intended to help the end-user answer four questions:
What do you need to do today?
How to get it done?
What you need to know today?
Who can help you?
Larry Ellison, Oracle CEO, said that it took six years to rewrite all the Fusion Apps. They used an "agile" development model with over 200 early adopters to ensure that these applications were successful. They were under a "controlled release program" but now that is over, and the applications are generally available. Larry indicates that these applications were developed under the concepts of Service Oriented Architecture [SOA], which neither Salesforce.com nor SAP R3 have.
(This made me chuckle. SOA was initially developed by IBM and Microsoft, but is now industry standard. There is no reason not to develop software that isn't SOA.)
Following the IBM model, Oracle has built-in the security at the OS, Database and Middleware layer, rather than in each application. As IBM has understood for several decades, a secure infrastructure is the way to go so that all applications are secure.
With all these Fusion Apps now re-written so that they work on industry-standard Java (J2EE, actually), allowing them to run either on-premise or out on the Cloud, Larry Ellison said "I guess we need a Cloud!" This started his announcement of the "Oracle Public Cloud" [OPC]. OPC has both PaaS and SaaS. The PaaS would offer VM instances with support for database and Java services. The SaaS would be all the Fusion Apps rented on the "as-a-service" model. Rather than force everyone to Oracle 11g, you can run any Oracle database on OPC, and you can run any Java or J2EE application on the OPC.
Your data is portable. Larry is pro-choice, and wants people to be able to move from any cloud to any cloud. Since it's based on industry-standard Java, applications can move seamlessly between OPC, Amazon EC2 and IBM SmartCloud. IBM has been a major force behind [Open Cloud Standards], so it is always good that other major vendors follow suit.
He quoted [someone as saying "Beware of False Clouds"] This was Salesforce.com CEO Marc Benioff's attack against all "Private Cloud" IT vendors. Larry twisted this to say he agrees, "True Clouds" are based on open industry standards, and "False Clouds" are vendor-lockin. OPC is based on Java, J2EE, XML, BPEL and Ruby on Rails, whereas Salesforce.com is based on proprietary Heroku and APEX. He called Salesforce.com the "Roach Motel of Cloud Computing" .. you can check in, but you can't check out.
OPC plans to offer some "data sources", including Dun&Bradstreet news feed, Twitter, Facebook and other social networks. It is based on a monthly subscription using a self-service portal. The resources are elastic, with capacity delivered on demand. He claims that Salesforce.com is rate-limited, and cancels long-running jobs if they are consuming too many resources. Larry said OPC would never do that.
Larry said that there are private-only offerings like SAP R3, and public-only offerings like Salesforce.com, Workday, and Taleo, but Oracle instead has adopted the IBM model of supporting choice between private, public and hybrid clouds.
Larry then attacked "Multi-tenancy", specifically, the idea that SaaS providers often use a single database instance, but then create a column to identify which records belong to which tenants. He said this was state-of-the-art 15 years ago, but is a bad idea now. Too risky. Instead, Larry's OPC has unique database instances for each tenant through virtualization.
Larry also announced the Oracle Social Network (OSN). This is a corporate-version of Facebook, that supports collaboration and file-sharing, similar to IBM [LotusLive], Google Docs, or Microsoft Office365. All of the Fusion Apps are written to interface directly with the OSN or any of these other social networks through APIs. This includes navigation and integrated social networking. He also indicated that all Fusion Apps run on mobile devices. He showed the SAP R3 GUI, and said it reminded him of "the fins on a 1968 Cadillac!"
Larry said that other CRM SaaS focus on helping sales managers track their employees, but Oracle's CRM helps sellers sell more.
He then gave an example of a mythical sales manager Bob, and his sales employee Julian, selling two Exadata boxes for $4.8 Million USD. A "safe harbor" statement was shown at the beginning of this keynote, to make sure nobody asks to buy Exadata boxes this cheap.
Can you believe it has been five years since I started blogging?
(If you absolutely abhor the navel-gazing associated with blogging-about-blogging posts, then by all means stop reading now!)
Back in July 2005, IBM decided to merge together two brands, IBM eServer and IBM TotalStorage, into a single all-encompassing "IBM Systems" brand. Thus TotalStorage brand became the "IBM System Storage" product line of the "IBM Systems" brand. The next six months was spent renaming some (not all) of the products. The following January, I was named the Marketing Strategist for this new product line, with the mission to help promote the new naming convention.
We looked at possibly doing a regularly-scheduled podcast, but nobody back then, including myself, were familar with audio editing tools. Instead, we chose a blog. Most blogs at IBM are internal, safely hidden behind the firewall, accessible only to IBM employees. I wanted mine to be different, to be accessible to the public, clients, prospects, IBM Business Partners, and yes, even those working for IBM's various competitors. One thing I like about blogs is that if you have a typo, or make a mistake, you can go back and correct it after it has posted.
Marketing through social media is quite different than traditional marketing techniques. Management was supportive, but legal wanted to review and approval everything I wrote before I posted it onto my blog. Official IBM Press Releases, for example, go through a dozen reviews before they are finally made public. I refused. This kind of review and approval would ruin the blogging process.
Fortunately, this blog was not my first attempt at technical writing. Our legal counsel reviewed my past trip reports from various conferences, and decided to let me blog without review. Occasionally, someone will reivew my blog once already posted, and ask me to make some corrections. It reminds me of my favorite saying used heavily within IBM:
Despite these delays, we managed to launch this blog in September 2006, just in time to celebrate the 50th anniversary of disk systems. IBM introduced the industry's first commercial disk system on September 13, 1956.
Over the years, this blog has helped sales reps and IBM Business Partners close deals, and address the FUD their prospects heard from competition. I have helped my readers get in touch with the right people within IBM. And, I have "sent the elevator back down", helping other IBMers launch their own blogs, including [Barry Whyte], [Elisabeth Stahl], and [Anthony Vandewerdt].
Today, bloggers have a profound impact on the world. Not everyone has a positive view on this. Bloggers and other users of social media have been seen as whistle-blowers for fraudulent corporations, as activists against corrupt governments and dictators, and as subject matter experts and fact checkers referenced during television and radio newscasts. In a recent movie, one of the major characters was a trouble-making blogger, and another character describes his blogging as nothing more than "graffiti with punctuation."
I want to thank all of my readers for making this the #1 most influential blog on IBM DeveloperWorks in 2011! This blog has been [published in a series of books], Inside System Storage Volume I and Volume II. And yes, before you all ask in the comments below, I am actively working on Volume III.
For a bit of nostalgia, I invite you to read my first 21 blog posts that I posted back in [September 2006].
Continuing my coverage of the 30th annual [Data Center Conference]. Here is a recap of some of the Tuesday afternoon sessions:
Brocade: Maximizing Your Cloud: How Data Centers Must Evolve
This was a session sponsored by Brocade to promote their concept of the "Ethernet Fabric". The first speaker, John McHugh, was from Brocade, and the second speaker was a client testimonial, Jamie Shepard, EVP for International Computerware, Inc.
John had an interesting take on today's network challenges. He feels that most LANs are organized for "North-South" traffic, referring to upload/downloads between clients and servers. However, the networks of tomorrow will need to focus on "East-West" traffic, referring to servers talking to other servers.
John was also opposed to integrated stacks that combine servers, storage and networking into a single appliance, as this prevents independent scaling of resources.
The Future of Backup is Not Backup
Primary data is growing at 40 to 60 percent compound annual growth rate (CAGR), but backup data is growing faster. Why? Because data that was not backed up before are now being backed up, including test data, development data, and mobile application data.
Backup costs are 19x more expensive than production software costs. There is an enormous gap in data protection because companies fail to factor this into their budgets. It is not uncommon for IT departments to use multiple backup tools, for example one tool for VMs, and another tool for servers, and a third product for desktops.
part of the problem is identifying who "buys" the backup software. The server team might focus on the operating systems supported. The storage team focuses on the disk and tape media supported. The application owners focus on the features and capabilities for backup that minimize impact to their application.
The analyst organized these issues into three "C's" of backup concerns: Cost, Capability and Complexity. Cost is not just the software license fee for the backup software, but the cost of backup media, courier fees, and transmisison bandwidth. Capability refers to the features and functions, and IT folks are tired of having to augment their backup solution with additional tools and scripts to compensate for lack of capability. Complexity refers to the challenges trying to get existing backup software to tackle new sources like Virtual Machines, Mobile apps, and so on.
Has everyone moved to a tape-less backup system? Polling results found that people are shifting back to tape, either in a tape-only environment, or to supplement their disk or disk-based virtual tape library (VTL). Here are the polling results:
The poll also showed the top three backup software vendors were Symantec, IBM and Commvault, which is consistent with marketshare. However, the analyst feels that by 2014, an estimated 30 percent of companies will change their backup softwar vendor out of frustration over cost, capability and/or complexity.
There are a lot new backup software products specific to dealing with Virtual Machines. Some are focused exclusively on VMware. When asked what tool people used to backup their VMs, the polling results showed the following. NOte that 20 percent for Other includes products from major vendors, like IBM Tivoli Storage Manager for Virtual Environments, as the analyst was more interested in the uptake of backup software from startups.
Some companies are considering Cloud Computing for backup. This is one area where having the cloud service provider at a distance is an actual advantage for added protection. A poll asking whether some or most data is backed up to the Cloud, either already today, or plans for the near future within the next 12 or 24 months, showed the following:
In addition to backup service providers, there are now several startups that offer file sharing, and some are adding "versioning" to this that can serve as an alternative to backup. These include DropBox, SugarSync, iCloud, SpiderOak and ShareFile.
The final topic was Snapshot and Disk Replication. These tend to be hardware-based, so they may not have options for versioning, scheduling, or application-aware capabilities normally associated with backup software. Space-efficient snapshots, which point unchanged data back to the original source, may not provide full data protection that disparate backup copies would provide. Here were polling results on whether snapshot/replication was used to augment or replace some or most of their backups:
Some of his observations and recommendations:
Maintenance is more expensive than acquisition cost. Don't focus on the tip of the iceberg. Some backup software is more efficient for bandwidth and media which will save tons of money in the long run.
Try to optimize what you have. He calls this the "Starbuck's effect". If you just need one coffee, then paying $4.50 for a cup makes sense. But if you need 100 coffees, you might be better off buying the beans.
Design backups to meet service level agreements (SLAs). In the past, backup was treated as one-size-fits-all, but today you can now focus on a workload by workload basis.
Be conservative in adopting new technologies until you have your backup procedures in place to handle data protection.
Backup is for operational recovery, not long-term retention of data. A poll showed two-thirds of the audience kept backup versions for longer than 60 days! Re-evaluate how long you keep backups, and how many versions you keep. If you need long-term retention, use archive process instead.
Recovery testing is a dying art. Practice recovery procedures so that you can do it safely and correctly when it matters most.
The analyst had a series of awesome pictures of large structures, the pyramids of Giza, the Chrysler building, and so on, and how they would look without their foundations in place. Backup is a foundation and should be treated as such in all IT planning purposes.
IT is evolving, but some basic needs like networking and backup procedures don't change. As companies re-evaluate their IT operations for Big Data, Cloud Computing and other new technologies, it is best to remember that some basic needs must be met as part of those evaluations.
Since the [IBM System Storage Technical University 2011] runs concurrently with the System x Technical University, attendees are allowed to mix-and-match. I attended several presentations regarding server virtualization and hypervisors.
Matt Archibald is an IT Management Consultant in IBM's Systems Agenda Delivery team. He started with a history of hypervisors, from IBM's early CP/CMS in 1967, through the latest VMware Vsphere 5 just announced.
He explained that there are three types of Hypervisor architectures today:
Type 1 - often referred to as "Bare Metal" runs directly on the server host hardware, and allows different operating system virtual machines to run as guests. IBM's System z [PR/SM] and [PowerVM] as well as the popular VMware ESXi are examples of this type.
Type 2 - often referred to as "Hosted" runs above an existing operating system, and allows different operating system virtual machines to run as guests. The popular [Oracle/Sun VirtualBox] is an example of this type.
OS Containers - runs above an existing operating system base, and allows multiple "guests" that all run the same operating system as the base. This affords some isolation between applications. [Parallels Virtuozzo Containers] is an example of this type.
The dominant architecture is Type 1. For x86, IBM is the number one reseller of VMware. VMware recently announced [Vsphere 5], which changes its licensing model from CPU-based to memory-based. For example, a virtual machine with 32 virtual CPUs and 1TB of virtual RAM (VRAM) would cost over $73,000 per year to license the VMware "Enterprise Plus" software. The only plus-side to this new licensing is that the "memory" entitlement transfers during Disaster Recovery to the remote location.
"Xen is dead." was the way Matt introduced the section discussing Hybrid Type-1 hypervisors like Xen and Hyper-V. These run bare-metal, but require networking and storage I/O to be processed by a single bottleneck partition referred to as "Dom 0". As such, this hybrid approach does not scale well on larger multi-sock host servers. So, his Xen-is-dead message was referring to all Hybrid-based Hypervisors including Hyper-V, not just those based on Xen itself.
The new up-and-comer is "Linux KVM". Last year, in my blog post about [System x KVM solutions], I mentioned the confusion over KVM acronym used with two different meanings. Many people use KVM to refer to Keyboard-Video-Mouse switches that allow access to multiple machines. IBM has renamed these switches to Local Console Managers (LCM) and Global Console Manager (GCM). This year, the System x team have adopted the use of "Linux KVM" to refer to the second meaning, the [Kernel-based Virtual Machine] hypervisor.
Linux KVM is not a product, but an open-source project. As such, it is built into every Linux kernel. Red Hat has created two specific deliverables under the name Red Hat Enterprise Virtualization (RHEV):
RHEV-H, a tiny ESXi-like bare-metal hypervisor that fits in 78MB, making it small enough to be on a USB stick, CD-rom or memory chip.
RHEV-M, a vCenter-like management software to manage multiple virtual machines across multiple hosts.
Personally, I run RHEL 6.1 with KVM on my IBM laptop as my primary operating system, with a Windows XP guest image to run a few Windows-specific applications.
A complaint of the current RHEV 2.2 release from Linux fanboys is that RHEV-M requires a Windows server, and uses Windows Powershell for scripting. The next release of RHEV is likely to provide a Linux-based option for management server.
Of the various hypervisors evaluated, KVM appears to be poised to offer the best scalability for multi-socket host machines. The next release is expected to support up to 4096 threads, 64TB of RAM, and over 2000 virtual machines. Compare that to VMware Vsphere 5 that supports only 160 threads, 2TB of RAM and up to 512 virtual machines.
Linux KVM Overview
Matt also presented a session focused on Linux KVM. While IBM is the leading reseller of VMware for the x86 server platform, it has chosen Linux KVM to run all of its internal x86 Cloud Computing facilities, as it can offer 40 to 80 percent savings, based on Total Cost of Ownership (TCO).
Linux KVM can run unmodified Windows and Linux guest operating systems as guest images with less than 5 percent overhead. Since KVM is built into the Linux kernel, any certification testing automatically benefits KVM as well. KVM takes advantage of modern CPU extensions like Intel's VT and AMD's AMD-V.
For high availability, in the event that a host fails, KVM can restart the guest images on other KVM hosts. RHEV offers "prioritized restart order" which allows mision-critical images to be started before less important ones.
RHEV also provides "Virtual Desktop Infrastructure", known as VDI. This allows a lightweight client with a browser to access an OS image running on a KVM host. Matt was able to demonstrate this with Firefox browser running on his Android-based Nexus One smartphone.
RHEV also adds features that make it ideal for cloud deployments, including hot-pluggable CPU, network and storage; service Level Agreement monitoring for CPU, memory and I/O resources; storage live migrations to move the raw image files while guests are running; and a self-service user portal.
IBM has been doing server virtualization for decades. When I first started at IBM in 1986, I was doing z/OS development and testing on z/VM guest images. Later, around 1999, I started working with the "Linux on z" team, running multiple Linux images under PR/SM and z/VM. While the server virtualization solutions most people are familiar with (VMware, Hyper-V, Xen) have only been around the last five years or so, IBM has a much deeper and robust understanding and long heritage. This helps to set IBM apart from the competition when helping clients.
For the past three decades, IBM has offered security solutions to protect against unauthorized access. Let's take a look at three different approaches available today for the encryption of data.
Approach 1: Server-based
Server-based encryption has been around for a while. This can be implemented in the operating system itself, such as z/OS on the System z mainframe platform, or with an applicaiton, such as IBM Tivoli Storage Manager for backup and archive.
While this has the advantage that you can selectively encrypt individual files, data sets, or columns in databases, it has several drawbacks. First, you consume server resources to perform the encryption. Secondly, as I mention in the video above, if you only encrypt selected data, the data you forget to, or choose not to, encrypt may result in data exposure. Third, you have to manage your encryption keys on a server-by-server basis. Fourth, you need encryption capability in the operating system or application. And fifth, encrypting the data first will undermine any storage or network compression capability down-line.
Approach 2: Network-based
Network-based solutions perform the encryption between the server and the storage device. Last year, when I was in Auckland, New Zealand, I covered the IBM SAN32B-E4 switch in my presentation [Understanding IBM's Storage Encryption Options]. This switch receives data from the server, encrypts it, and sends it on down to the storage device.
This has several advantages over the server-based approach. First, we offload the server resources to the switch. Second, you can encrypt all the files on the volume. You can select which volumes get encrypted, so there is still the risk that you encrypt only some volumes, and not others, and accidently expose your data. Third, the SAN32B-E4 can centralized the encryption key management to the IBM Tivoli Key Lifecycle Manager (TKLM). This is also operating system and application agnostic. However, network-based encryption has the same problem of undermining any storage device compression capability, and often has a limit on the amount of data bandwidth it can process. The SAN32B-E4 can handle 48 GB/sec, with a turbo-mode option to double this to 96 GB/sec.
Approach 3: Device-based
Device-based solutions perform the encryption at the storage device itself. Back in 2006, IBM was the first to introduce this method on its [TS1120 tape drive]. Later, it was offered on Linear Tape Open (LTO-4) drives. IBM was also first to introduce Full Disk Encryption (FDE) on its IBM System Storage DS8000. See my blog post [1Q09 Disk Announcements] for details.
As with the network-based approach, the device-based method offloads server resources, allows you to encrypt all the files on each volume, can centrally manage all of your keys with TKLM, and is agnostic to operating system and application used. The device can compress the data first, then encrypt, resulting in fewer tape cartridges or less disk capacity consumed. IBM's device-based approach scales nicely. IBM has an encryption chip is placed in each tape drive or disk drive. No matter how many drives you have, you will have all the encryption horsepower you need to scale up.
Not all device-based solutions use an encryption chip per drive. Some of our competitors encrypt in the controller instead, which operates much like the network-based approach. As more and more disk drives are added to your storage system, the controller may get overwhelmed to perform the encryption.
The need for security grows every year. Enterprise Systems are Security-ready to protect your most mission critical application data.
Special thanks to Anthony Vandewerdt, who sent me his version of this presentation that he planned to present in Australia next week. I "smartened it up" (or whatever the appropriate phrase is the opposite of "dumbed it down") for the technical audience.
Recovery procedures for single and double drive failures. A double drive failure on an XIV typically involves less recovery effort than traditional RAID5-based disk systems, and in many cases results in no data loss whatsoever. I provided details on this in my blog post [Double Drive Failure Debunked: XIV Two Years Later], so no need to repeat myself here.
Replacing the Automatic Transfer Switch (ATS) non-disruptively. To support either single-phase and triple-phase power sources, the XIV uses an ATS to take two independent power feeds, and distribute this out to the three Uninterruptible Power Supplies (UPS).
Built-in Migration capability to copy data off other disk systems over to the XIV.
Configuring Synchronous and Asynchronous mirroring using either the Fibre Channel or Internet Protocol ports.
Optimizing the use of XIV for VMware, AIX and other operating systems.
The IBM XIV Storage System is quite popular in New Zealand, with four times more boxes sold per capita than the other countries in the Asia Pacific region. I covered both the A14 model as well as the new Gen3 model.
Business Continuity/Disaster Recovery (BC/DR) Update: Lessons, Planning, Solutions
My colleague Vic Peltz from IBM Almaden presented on lessons learned from Hurricane Katrina and various other natural disasters. Unlike tradtional presentations the focus on technology, Vic took a different approach, focusing on people and procedures. I was here last year when the earthquake hit Christchurch on the south island, so I was well aware that BC/DR was top of mind for many of the attendees. Throughout this week, I have felt tremors, and many of the locals told me that these happen all the time.
Introduction to IBM Storwize V7000
I knew I was in trouble when the request for me to present Storwize sounded like something from [Mission Impossible]:
"Good morning, Mr. Pearson. Your mission, should you choose to accept it, involves presenting Storwize V7000 in Auckland, New Zealand. You may also present the Storwize V7000 Unified, but it is essential that you not cover the SAN Volume Controller or SONAS products from which they are based upon, as you will not have enough time. The audience is very technical, so be careful. As always, should any questions come up that you cannot answer, the conference coordinators will disavow all knowledge of your actions, nor reimburse your laundry charges. This message will self-destruct in five seconds."
Well, I accomplished my mission in 75 minutes. I was able to cover the block-only version of the IBM Storwize V7000, with support for clustering the control enclosures, expansion drawers and external storage virtualization. I then spent a few minutes on the block-and-file Storwize V7000 Unified, which adds support for CIFS, NFS, HTTPS, FTP and SCP protocols through two new "file modules", with integrated support for backup and anti-virus checking. I covered both IBM Easy Tier for sub-LUN automated tiering between Solid-State Drives (SSD) and spinning disk, as well as Active Cloud Engine for file-based movement between disk and tape.
It seems everyone is talking about stacks, appliances and clouds.
On StorageBod, fellow blogger Martin Glassborow has a post titled [Pancakes!] He feels that everyone from Hitachi to Oracle is turning into the IT equivalent of the International House of Pancakes [IHOP] offering integrated stacks of software, servers and storage.
Cisco introduced its "Unified Computing System" about a year ago, [reinventing the datacenter with an all-Ethernet approach]. Cisco does not offer its own hypervisor software nor storage, so there are two choices. First, Cisco has entered a joint venture, called Acadia, with VMware and EMC, to form the Virtual Computing Environment (VCE) coalition. The resulting stack was named Vblock, which one blogger had hyphenated as Vb-lock to raise awareness to the proprietary vendor lock-in nature of this stack. Second, Cisco, VMware and NetApp had a similar set of [Barney press releases] to announce a viable storage alternative to those not married to EMC.
"Only when it makes sense. Oracle/Sun has the better argument: when you know exactly what you want from your database, we’ll sell you an integrated appliance that will do exactly that. And it’s fine if you roll your own.
But those are industry-wide issues. There are UCS/VCE specific issue as well:
Cost. All the integration work among 3 different companies costs money. They aren’t replacing existing costs – they are adding costs. Without, in theory, charging more.
Lock-in. UCS/Vblock is, effectively, a mainframe with a network backplane.
Barriers to entry. Are there any? Cisco flagged hypervisor bypass and large memory support as unique value-add – and neither seems any more than a medium-term advantage.
BOT? Build, Operate, Transfer. In theory Vblocks are easier and faster to install and manage. But customers are asking that Acadia BOT their new Vblocks. The customer benefit over current integrator practice? Lower BOT costs? Or?
Price. The 3 most expensive IT vendors banding together?
Longevity. Industry “partnerships” don’t have a good record of long-term success. Each of these companies has its own competitive stresses and financial imperatives, and while the stars may be aligned today, where will they be in 3 years? Unless Cisco is piloting an eventual takeover."
Fellow blogger Bob Sutor (IBM) has an excellent post titled
[Appliances and Linux]. Here is an excerpt:
"In your kitchen you have special appliances that, presumably, do individual things well. Your refrigerator keeps things cold, your oven makes them hot, and your blender purees and liquifies them. There is room in a kitchen for each of these. They work individually but when you are making a meal they each have a role to play in creating the whole.
You could go out and buy the metal, glass, wires, electrical gadgets, and so on that you would need to make each appliance but it is is faster, cheaper, and undoubtably safer to buy them already manufactured. For each device you have a choice of providers and you can pay more for additional features and quality.
In the IT world it is far more common to buy the bits and pieces that make up a final solution. That is, you might separately order the hardware components, the operating system, and the applications, and then have someone put them all together for you. If you have an existing configuration you might add more blades or more storage devices.
You don’t have to do this, however, in every situation. Just from a hardware perspective, you can buy a ready-made machine just waiting for the on switch to be flicked and the software installed. Conversely, you might get a pre-made software image with operating system and applications in place, ready to be provisioned to your choice of hardware. We can get even fancier in that the software image might be deployable onto a virtual machine and so be a ready made solution runnable on a cloud.
Thus in the IT world we can talk about hardware-only appliances, software-only appliances (often called virtual software appliances), and complete hardware and software combinations. The last is most comparable to that refrigerator or oven in your kitchen."
If your company was a restaurant, how many employees would you have on hand to produce your own electricity from gas generators, pump your own water from a well, and assemble your own toasters and blenders from wires and motors? I think this is why companies are re-thinking the way they do their own IT.
Rather than business-as-usual, perhaps a mix of pre-configured appliances, consisting of software, server and storage stacked to meet a specific workload, connected to public cloud utility companies, might be the better approach. By 2013, some analysts feel that as many as 20 percent of companies might not even have a traditional IT datacenter anymore.
“By employing techniques like virtualization, automated management, and utility-billing models, IT managers can evolve the internal datacenter into a ‘private cloud’ that offers many of the performance, scalability, and cost-saving benefits associated with public clouds. Microsoft provides the foundation for private clouds with infrastructure solutions to match a range of customer sizes, needs and geographies.
The public cloud:
“Cloud computing is expanding the traditional web-hosting model to a point where enterprises are able to off-load commodity applications to third-party service providers (hosters) and, in the near future, the Microsoft Azure Services Platform. Using Microsoft infrastructure software and Web-based applications, the public cloud allows companies to move applications between private and public clouds.”
Finally, I saw this from fellow blogger, Barry Burke(EMC), aka the Storage Anarchist, titled [a walk through the clouds] which is really a two-part post.
The first part describes a possible future for EMC customers written by EMC employee David Meiri, envisioning a wonderful world with "No more Metas, Hypers, BIN Files...."
The vision is a pleasant one, and not far from reality. While EMC prefers to use the term "private cloud" to refer to both on-premises and off-premises-but-only-your-employees-can-VPN-to-it-and-your-IT-staff-still-manages-it flavors, the overall vision is available today from a variety of Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) providers.
A good analogy for "private cloud" might be a corporate "intranet" that is accessible only within the company's firewall. This allowed internal websites where information to be disseminated to employees could be posted, using standard HTML and standard web browsers that are already deployed on most PCs and workstations. Web pages running on an intranet can easily be moved to an external-facing website without too much rework or trouble.
The second part has Barry claiming that EMC has made progress towards a "Virtual Storage Server" that might be announced at next month's EMC World conference.
When people hear "Storage Virtualization" most immediately think of the two market leaders, IBM SAN Volume Controller and Hitachi Data Systems (HDS) Universal Storage Platform (USP) products. Those with a tape bent might throw in IBM's TS7000 virtual tape libraries or Oracle/Sun's Virtual Storage Manager (VSM). And those focused on software-only solutions might recall Symantec's Veritas Volume Manager (VxVM), DataCore's SANsymphony, or FalconStor's IPStor products.
But what about EMC's failed attempt at storage virtualization, the Invista? After five years of failing to deliver value, EMC has so far only publicised ONE customer reference account, and I estimate that perhaps only a few dozen actual customers are still running on this platform. Compare that to IBM selling tens of thousands of SAN Volume Controllers, and HDS selling thousands of their various USP-V and USP-VM products, and you quickly realize that EMC has a lot of catching up to do. EMC's first delivered Invista about 18 months after IBM SAN Volume Controller, similar to their introduction of Atmos being 18 months after our Scale-Out File Services (SoFS) and their latest CLARiiON-based V-Max coming out 18 months after IBM's XIV storage system.
So what will EMC's Invista follow-on "Virtual Storage Server" product look like? No idea. It might be another five years before you actually hear about a customer using it. But why wait for EMC to get their act together?
IBM offers solutions TODAY that can make life as easy as envisioned here. IBM offers integrated systems sold as ready-to-use appliances, customized "stacks" that can be built to handle particular workloads, residing on-premises or hosted at an IBM facility, and public cloud "as-a-service" offerings on the IBM Cloud.
Well, it's Tuesday, and that means IBM announcements! Today is bigger, as there are a lot of Dynamic Infrastructure announcements throughout the company with a common theme, cloud computing and smart business systems that support the new way of doing things. Today, IBM announced its new "IBM Smart Archive" strategy that integrates software, storage, servers and services into solutions that help meet the challenges of today and tomorrow. IBM has been spending the past few years working across its various divisions and acquisitions to ensure that our clients have complete end-to-end solutions.
IBM is introducing new "Smart Business Systems" that can be used on-premises for private-cloud configurations, as well as by cloud-computing companies to offer IT as a service.
IBM [Information Archive] is the first to be unveiled, a disk-only or blended disk-and-tape Information Infrastructure solution that offers a "unified storage" approach with amazing flexibility for dealing with various archive requirements:
For those with applications using the IBM Tivoli Storage Manager (TSM) or IBM System Storage Archive Manager (SSAM) API of the IBM System Storage DR550 data retention solution, the Information Archive will provide a direct migration, supporting this API for existing applications.
For those with IBM N series using SnapLock or the File System Gateway of the DR550, the Information Archive will support various NAS protocols, deployed in stages, including NFS, CIFS, HTTP and FTP access, with Non-Erasable, Non-Rewriteable (NENR) enforcement that are compatible with current IBM N series SnapLock usage.
For those using NAS devices with PACS applications to store X-rays and other medical images, the Information Archive will provide similar NAS protocol interfaces. Information Archive will support both read-only data such as X-rays, as well as read/write data such as Electronic Medical Records.
Information Archive is not just for compliance data that was previously sent to WORM optical media. Instead, it can handle all kinds of data, rewriteable data, read-only data, and data that needs to be locked down for tamper protection. It can handle structured databases, emails, videos and unstructured files, as well as objects stored through the SSAM API.
The Information Archive has all the server, storage and software integrated together into a single machine type/model number. It is based on IBM's General Parallel File System (GPFS) to provide incredible scalability, the same clustered file system used by many of the top 500 supercomputers. Initially, Information Archive will support up to 304TB raw capacity of disk and Petabytes of tape. You can read the [Spec Sheet] for other technical details.
For those who prefer a more "customized" approach, similar to IBM Scale-Out File Services (SoFS), IBM has [Smart Business Storage Cloud]. IBM Global Services can customize a solution that is best for you, using many of the same technologies. In fact, IBM Global Services announced a variety of new cloud-computing services to help enterprises determine the best approach.
In a related announcement, IBM announced [LotusLive iNotes], which you can think of as a "business-ready" version of Google's GoogleApps, Gmail and GoogleCalendar. IBM is focused on security and reliability but leaves out the advertising and data mining that people have been forced to tolerate from consumer-oriented Web 2.0-based solutions. IBM's clients that are already familiar with on-premises version of Lotus Notes will have no trouble using LotusLive iNotes.
There was actually a lot more announced today, which I will try to get to in later posts.
Last Tuesday, we had our official "Grand Opening" for the new Tucson Executive Briefing Center!
We sent out fancy invitations to all the IBM executives who supported this center, local dignitaries from the Tucson and State of Arizona level, and all of the IBM employees on the Tucson campus.
Since our new center is significantly cozier (5700 square feet versus our previous 15,000 square feet), we split the day into two separate events. The first for the IBM executives and local VIPs, and the second for the rest of the IBM employees on campus.
Of course, there is no free lunch. The day started out with a series of speeches. My manager, Doug Davies, was the master of ceremonies to introduce each speaker.
Alistair Symon, IBM Vice President of Enterprise Storage, explained how important storage affects everyone's lives. If you use an ATM machine to withdraw money, for example, you are most probably using IBM System Storage behind the scenes. Nearly all of the IBM disk and tape storage products are designed here in Tucson.
Bruce Wright (shown here) directs the University of Arizona's Office of University Research Parks, serves as CEO of the UA Tech Park, and the founder and president of the Arizona Center for Innovation. Bruce said a few words on how please he was that IBM decided to reverse its July 2011 decision to leave Tucson. The UofA owns all the property, renting back four of the eleven buildings back to IBM, so is effectively our landlord. Next year will mark the 20th anniversary of IBM's sale of the technology park to the University.
Tucson Councilwoman Shirley Scott talked about the improtance of high-paying jobs to the local economy. While IBMers in Tucson are paid less than our counterparts in San Jose, Austin, Raleigh or Poughkeepsie, we are certainly [paid more than the average Tucsonan], thus helping to raise the standard of living here.
Dr. Michael Varney, president and CEO of the local Tucson Metropolitan Chamber of Commerce, praised IBM for its strong reputation in ethics and diversity.
My new second-line manager, Karl Duvalsaint, and my new third-line manager, Doug Dreyer, emphasized the importance of co-locating Briefing Centers in sites that have Research and Development activity. It is important for clients to interact directly with developers, and it is also good for developers to understand directly from clients their needs, preferences and requirements. Worldwide, the IBM Systems and Technology Group has only twelve Executive Briefing Centers, and the Tucson EBC is one of them.
This is not to say that IBM does not have centers in other locations. Our newest client center in Singapore is a shining example. Of course, if they want experts to speak to clients there, they need to be flown in. Doug Dreyer mentioned that IBM plans to launch six such centers in Africa as well.
Next was the ribbon cutting. From left to right, Lee Olguin (our Gunny Sargeant), Tucson Councilwoman Shirley Scott, UofA's Bruce Wright, IBM VP of Program Management Calline Sanchez, My second-line manager Karl Duvalsaint, IBM VP Allistair Simon, my first-line manager Doug Davies, Tucson Chamber of Commerce President Dr. Michael Varney, and my third-line manager Doug Dreyer. We had a member of the local high school band do the drum roll.
Once the ribbon was cut, the IBM Executves and local VIPs were brought in to see the new facility, which has two large rooms, one common dining area, an 800-square foot green data center to showcase our products, our own set of restrooms, a galley to stage up the food and beverage service, and two smaller rooms for private conversations or conference calls. A local high school band provided live music throughout the day.
This week I am in Orlando, Florida for the IBM Edge conference. This is the last day, so it ends early for people who want to get home to their datacenters (er.. families) for the weekend.
How Real-Time Compression Can Maximize Storage Efficiency for Production Applications
This was a split session with two speakers. First, Ian Rimmer, Senior IT Engineer and Architect at iBurst, presented their experience with the IBM Real-Time Compression Appliance in front of NetApp NAS storage arrays. Second, Jerry Haigh, IBM offering manager for IBM System Storage, presented the new Real-Time compression feature announced this week on IBM SAN Volume Controller (SVC) and Storwize V7000.
iBurst is the #1 Wireless Telecom for South Africa. The also offer cable broadband and VOIP. They have 200 employees servicing 120,000 subscriber/households. They need to keep five years' worth of text files, and have chosen real-time compression of their NAS storage. This was before IBM acquired the Storwize company, as they have been using it for the past six years.
The monetary savings from compression was used to purchase Performance Accelerator Modules (PAM) cards for their NetApp NAS gear, which benefit from the compression (more data stored in SSD to improve performance).
For backup, they use NDMP with Symantec NetBackup that keeps data in its compressed form as it is written to tape. They have an IBM TS3100 library with LTO tape as the backup repository.
Jerry Haigh presented Real-Time compression for primary disk data. Unlike the competition, this is designed to be used with primary data, including databases, and does this real-time, not post-process. In some performance tests, DB2 compressed on 48 drives out-performed the same data uncompressed on 96 drives. In another test focused on VMware Vmark benchmark, the compressed data was able to be same or better performance as uncompressed. In a third test with SVC virtualizing XIV running Oracle ORION test, the Oracle databases compressed 50 to 64 percent, and had better performance.
For those who already have SVC or Storwize V7000, consider a 45-day trial to check out compression for yourself.
NAS File Systems: Access and Authentication
Mark Taylor, IBM Technical Specialist for SONAS, N series and Storwize V7000 Unified, presented the nuances of authentication and authorization for NAS file systems. The differences between these two are:
Authentication - Yes, you are who you are.
Authorization - Yes, you are permitted to do what you are trying to do
(Prior to working with SONAS, my only experience with access and authentication in NAS was setting up my LAN at home, which I have connecting my Mac, Linux and Windows machines. I have both N series and SONAS at the IBM Executive Briefing Center in Tucson, Arizona, so I know first-hand how complicated NAS access and authentication systems can be.
A few months ago, I taught "Intro to NAS" as one of my topics at the Top Gun class in Argentina and Brazil. Several of the students had mentioned they thought they knew NAS solutions but had not realized all the technical issues with access and authentication that I discussed in my presentation.)
Mark explained the differences between Windows NTFS-style System identifiers (SID), versus UNIX-style user and group identifiers (UID, GID). For NAS solutions that support both CIFS and NFS, there are four options:
Microsoft Active Director (AD) extended with Identity Management for UNIX, formerly known as Services for UNIX (SFU). AD servers normally store SID information, but the extensions add extra columns to hold UID/GID mappings.
AD with Network Information Service (NIS) server. The problem with this approach is that AD and NIS are separate databases, and you need to coordinate updates to them, and their backups.
Lightweight Directory Access Protocol (LDAP) with SAMBA extensions. LDAP holds UID/GID information, and the SAMBA extensions adds extra columns to hold SID mapping.
Local mapping. The dangerous part of local mapping is that the storage admin is also the security admin, and you may want different people doing these roles.
Of these four methods, Mark recommends the first and third as best practices for multi-protocol authentication.
SID-to-UID mapping, UID-to-SID mapping
SONAS and Storwize V7000
SID-to-UID/GID mapping, NFS v4 ACLs
NFS v4 ACLs
Mark then explained how NFS v4 ACLs work, basically an ordered collection of "Access Control Elements" or ACEs. Each ACE on the ACL may "allow" or "deny" the request. You want to avoid "Inheritance" as that can cause problems and unxpected results.
That's it folks. Next week, I am spending time with my research buddies at the Almaden Research Center near San Jose, California, and then it is off to Moscow, Russia to kick off a series of IBM events called "Edge Comes to You" (ECTY).
The ECTY conferences will be a smaller subset of the Edge conference here in Orlando, but offered in other countries for those who were unable to travel to the United States.
This year marks the 10 year anniversary of IBM's introduction of LTO tape technology. IBM is a member of the Linear Tape Open consortium which consists of IBM, HP and Quantum, referred to as "Technology Provider Companies" or TPCs. In an earlier job role, I was the "portfolio manager" for both LTO and Enterprise tape product lines.
Today, we held a celebration in Tucson, with cake and refreshments.
IBM Executives Doug Balog, IBM VP of Storage Platform, and Sanjay Tripathi, the new IBM Director and Business Line Executive for Tape, VTL and Archive systems, presented the successes of LTO tape over the past 10 years.
To date over 3.5 million LTO tape drives, and over 150 million LTO tape media cartridges have been shipped which is a testament to the remarkable marketplace acceptance of the technology.
In honor of this event, I decided to interview Bruce Master, IBM Senior Program Manager for Data Protection Systems, about this 10 year anniversary.
10 years of LTO technology is a great milestone. How is this especially significant to IBM and its clients?
According to IDC data, IBM has held the #1 leader position in market share for total world wide branded tape revenue for over 7 years and that IBM is still #1 in branded midrange tape revenue which includes the LTO tape technologies. IBM was the first drive manufacturer to deliver LTO-1 drives, back in September 2000, the first to deliver tape drive encryption to the marketplace on LTO-4 drives, and is shipping LTO generation 5 drives and libraries. IBM is the author of the new Linear Tape File System (LTFS) specification that has been adopted by the TPCs. This file system revolutionizes how tape can be used as if it were a giant 1.5 terabyte removable USB memory stick with the capability to be accessed with directory tree structures and drag and drop functionality. With LTO's built-in real-time compression, a single tape cartridge can hold up to 3TB of data.
The Linear Tape File System has been getting a lot of attention. Where can we learn more about it?
Why is tape still a critical part of a storage infrastructure?
Tape is low cost and provides critical off-line portable storage to help protect data from attacks that can occur with on-line data. For instance, on-line data is at risk of attack from a virus, hacker, system error, disgruntled employee, and more. Since tape is off-line, not accessible by the system, it protects against these forms of corruption. LTO technology also provides write-once read-many (WORM) tape media to help address compliance issues that specify non-erasable, non-rewriteable (NENR) storage, hardware encryption to secure data, as well as a low cost long term archive media. When data cools off, or becomes infrequently accessed, why keep it on spinning disk? Move it to tape where it is much greener and lower cost. A tape in a slot on a shelf consumes minimal energy.
So tape is not dead?
Ha! Far from it. Seems like disk-only "specialty shop" storage vendors that don’t have tape in their sales portfolio are the ones that propagate that myth. In reality, storage managers are tasked with meeting complex objectives for performance, compliance, security, data protection, archive and total cost of ownership. Optimally, a blend of disk and tape in a tiered infrastructure can best address these objectives. You can’t build a house with just a hammer. IBM has a rich tool kit of storage offerings including disk, tape, software, services and deduplication technologies to help clients address their needs.
Do you have an example of a client who was saved by tape?
Yes indeed. Estes Express, a large trucking firm, was hit by a hurricane that flooded their data center and destroyed all systems. Fortunately the company survived because the night before they had backed up all data on to IBM tape and moved the cartridges offsite! The company survived and has since implemented a best practices data protection strategy with a combination of disk-to-disk-to-tape (D2D2T) using LTO tape at the primary site, and a remote global mirrored site that is also backed up to LTO tape.
So tape saved the day. What is the outlook for tape innovation in the future?
The future is bright for tape. Earlier this year, IBM and Fujifilm were able to [demonstrate a tape density achievement] that could enable a native 35TB tape cartridge capacity! This shows a long roadmap ahead for tape and a continued good night’s sleep for storage managers knowing that their precious data will be safe.
Of course, LTO tape is just one of the many reasons IBM is a successful and profitable leader in the IT storage industry. Doug Balog talked about his experiences in London for the [October 7th launch] of IBM DS8800, Storwize V7000 and SAN Volume Controller 6.1. Sanjay Tripathi showed recent successes with IBM's ProtecTIER Data Deduplication Solution and Information Archive products.
I would like to thank Bruce Master for his time in completing this interview. To learn more about IBM tape and storage offerings, visit [ibm.com/storage].
The "Basic" offering includes a single IBM Storwize V7000 controller enclosure, and three year warranty package that includes software licenses for IBM Tivoli Storage FlashCopy Manager (FCM) and IBM Tivoli Storage Productivity Center for Disk - Midrange Edition (MRE). Planning, configuration and testing services for the software are included and can be performed by either IBM or an IBM Business Partner.
The "Standard" offering allows for multiple IBM Storwize V7000 enclosures, provides three year warranty package for the FCM and MRE software, and includes implementation services for both the hardware and the software components. These services can be performed by IBM or an IBM Business Partner.
Why bundle? Here are the key advantages for these offerings:
Increased storage utilization! First introduced in 2003, IBM SAN Volume Controller is able to improve storage utilization by 30 percent through virtualization and thin provisioning. IBM Storwize V7000 carries on this tradition. Space-efficient FlashCopy is included in this bundle at no additional charge and can reduce the amount of storage normally required for snapshots by 75 percent or more. IBM Tivoli Storage FlashCopy Manager can manage these FlashCopy targets easily.
Improved storage administrator productivity! The new IBM Storwize V7000 Graphical User Interface can help improve administrator productivity up to 2 times compared to other midrange disk solutions. The IBM Tivoli Storage Productivity Center for Disk - Midrange Edition provides real-time performance monitoring for faster analysis time.
Increased application performance! This bundle includes the "Easy Tier" feature at no additional charge. Easy Tier is IBM's implementation of sub-LUN automated tiering between Solid-State Drives (SSD) and spinning disk. Easy Tier can help improve application throughput up to 3 times, and improve response time up to 60 percent. Easy Tier can help meet or exceed application performance levels with its internal "hot spot" analytics.
Increased application availability! IBM Tivoli Storage FlashCopy Manager provides easy integration with existing applications like SAP, Microsoft Exchange, IBM DB2, Oracle, and Microsoft SQL Server. Reduce application downtime to just seconds with backups and restores using FlashCopy. The built-in online migration feature, included at no additional charge, allows you to seamlessly migrate data from your old disk to the new IBM Storwize V7000.
Significantly reduced implementation time! This bundle will help you cut implementation time in half, with little or no impact to storage administrator staff. This will help you realize your return on investment (ROI) much sooner.
Every January, we look back into the past as well as look into the future for trends to watch for the upcoming year. Ray Lucchesi of Silverton Consulting has a great post looking back at the [Top 10 storage technologies over the last decade]. I am glad to see that IBM has been involved with and instrumental in all ten technologies.
Looking into the future, Mark Cox of eChannel has an article [Storage Trends to Watch in 2011], based on his interviews with two fellow IBM executives: Steve Wojtowecz, VP of storage software development, and Clod Barrera, distinguished engineer and CTO for storage. Let's review the four key trends:
Cloud Storage and Cloud Computing
No question: Cloud Computing will be the battleground of the IT industry this decade. I am amused by the latest spate of Microsoft commercials where problems are solved with someone saying "...to the cloud". Riding on the coat tails of this is "Cloud Storage", the ability to store data across an Internet Protocol (IP) network, such as 10GbE Ethernet, in support of Cloud Computing applications. Cloud Storage protocols in the running include NFS, CIFS, iSCSI and FCoE.
Mark writes "..vendors who aren't investing in cloud storage solutions will fall behind the curve."
Economic Downturn forces Innovation
The old British adage applies: "Necessity is the mother of invention." The status quo won't do. In these difficult economic times, IT departments are running on constrained budgets and staff. This forces people to evaluate innovative technologies for storage efficiency like real-time compression and data deduplication to make better use of what they currently have. It also is forcing people to take a "good enough" attitude, instead of paying premium prices for best-of-breed they don't really need and can't really afford.
IT Service Management
Companies are getting away from managing individual pieces of IT kit, and are focusing instead on the delivery of information, from the magnetic surface of disk and tape media, to the eyes and ears of the end users. The deployment mix of private, hybrid and public clouds makes this even more important to measure and manage IT as a set of services that are delivered to the business. IT Service Management software can be the glue, helping companies implement ITIL v3 best practices and management disciplines.
Smarter Data Placement
A recent survey by "The Info Pro" analysts indicates that "managing storage growth" is considered more critical than "managing storage costs" or "managing storage complexity".
This tells me that companies are willing to spend a bit extra to deploy a tiered information infrastructure if it will help them manage storage growth, which typically ranges around 40 to 60 percent per year. While I have discussed the concept of "Information Lifecycle Management" (ILM), for the past four years on this blog, I am glad to see it has gone mainstream, helped in part with automated storage tiering features like IBM System Storage Easy Tier feature on the IBM DS8000, SAN Volume Controller and Storwize V7000 disk systems. Not all data is created equal, so the smart placement of data, based on the business value of the information contained, makes a lot of sense.
These trends are influencing what solutions the various different vendors will offer, and will influence what companies purchase and deploy.
Last July, IBM and EMC traded blog postings over SPC-1 benchmark results. Fellow EMC bloggerChuck Hollis wrote his post [Does Anyone Take The SPC Seriously?]. Here is an excerpt:
I think most storage users have figured this out. We've never done an SPC test, and probably will never do one. Anyone is free, however, to download the SPC code, lash it up to their CLARiiON, and have at it.
I responded with [Getting Under EMC Skin], and then followed up with a series explaining IBM SVC and SPC benchmarks here:
So what is the good news?Yesterday, our friends at NetApp took up Chuck's challenge and posted results on their FAS3040 as well as their EMC CLARiiON devices. IBM sells the FAS3040 under the name IBM System Storage N5300 disk system. Knowing that NetApp maintains excellent performance when it is doing point-in-time copies, NetApp ran both with and without on both boxes. I include DS4700 and DS4800 as well for comparison purposes, but only have them without FlashCopy running.
NetApp FAS3040 (IBM N5300)
NetApp FAS3040 (IBM N5300)
EMC CLARiiON CX3-40
IBM DS4700 Express
EMC CLARiiON CX3-40
One would expect some performance degradation with a box running point-in-time copies at the same time it is reading and writing data, but NetApp/IBM N5300 does not degrade by much, but EMC's drops a significant amount.
So what is the bad news? Last October, I welcomed HDS USP-V to the [Super High-End Club], but now we need to invite Texas Memory Systems as well.In 2006, I posted [Hybrid, Solid State and the future of RAID], and poked fun at Texas Memory Systems using the slogan "World's Fastest Storage", which at the time that honor belonged to IBM SAN Volume Controller instead.The VP of Texas Memory Systems, Woody Hutsell, explained the only reason their solid-state disk system, RAMSAN-320, didn't have faster results is that they didn't have the fastest IBM server to run against it. It may not surprise you that nearly everyone's SPC benchmarks use IBM servers because IBM has the fastest servers as well. I didn't have a million-dollar System p UNIX server to send Woody for this, but it looks like they have finally gotten one, and a new RAMSAN-400 device, as they have posted their latest results.
Texas Memory Systems RAMSAN-400
IBM SAN Volume Controller 4.2
EMC doesn't publish numbers for their Symmetrix box, despite their announcement of faster SSD drives. They claim that SSD drives make their overall disk system performance faster, but without SPC benchmarks, we will never know. If you have a Symmetrix, this YouTube video may help you decide where it belongs:
This week was the IBM Pulse 2011 converence in Las Vegas, Nevada, with over 7,000 attendees. I wasn't there, and my on-the-scene correspondent was too busy running the hands-on lab to get out and attend sessions. Fortunately, I was able to watch some of the [IBM Software live stream], and here are my thoughts and observations.
Fellow inventor [Dean Kamen] was the keynote speaker. His inventions help people, making the world a better place. Here are three examples I found interesting during his talk:
Helping third world countries
Dean started out with his favorite quote:
"A problem well defined is a problem half-solved." - John Dewey
Dean mentioned that we are fortunate, having both potable drinking water and a reliable supply of electricity, but 2 to 4 billion people on the planet do not. Sponsored by Coca-Cola, Dean and his team of innovators were able to come up with small units that can be placed in a village or town. One unit takes in wet liquid and produces potable drinking water. The other unit takes combustible materials, like cow dung, and products electricity. Each unit is roughly the size of half a standard server rack. What does Coca-Cola get out of this? New "vending machines"! By combining drinking water with flavored syrups, they can create soft drinks on demand.
Dean's opinion was that if you want something done, you need to work with large corporations, as governments are mired in beauracracy and rules. I agree. When I first joined IBM, I was introduced to [TRIZ] which was a systematic method for solving problems. IBM's best and brightest are working to solve some of the toughest computer science challenges. For more on TRIZ, see this blog post about [TRIZ in BusinessWeek].
Helping injured veterans
Dean Kamen is well known for inventing the two-wheeled [Segway Personal Transporter], but his company, [DEKA], makes all kinds of things, mostly medical equipment. To help wounded soldiers returning from Iraq or Afghanistan without one or both arms, Dean and his team developed a robotic arm that has enough motor dexterity to pick up a raisin or grape off the table without dropping or squashing it. Dean has appeared several times on the Colbert Report, and here is a video of the robotic arm:
I have myself enjoyed riding a Segway. A local place in Tucson uses them to lead tourists through downtown Tucson and the University of Arizona campus.
Helping young students to learn science and technology
Dean wrapped up his talking by talking about his passion about "For Inspiration and Recognition of Science and Technology" or [FIRST]. Modeled after sports competitions, FIRST encourages teams of kids to build robots that perform specific tasks. Every year, companies and universities sponsor teams by purchasing robot kits from FIRST. Teams compete in regional competitions, and then the best of those go on to compete in a stadium in Atlanta, Georgia, hosting 76,000 people cheering for their teams.
Unlike other school sports (Football, Basketball, Baseball, etc.) where a student is more likely to win the lottery than get a successful career as a professional athlete, every student involved in FIRST competitions can "go pro". A study of FIRST success tracked students who participated in competitions, and found a substantial improvement in percentage of those students attending college and working as science and engineering professionals.
I am a big fan of encouraging kids of all ages to learn more about science, technology, engineering and math [STEM]. Back in 2009, I blogged about my involvement with [One Laptop Per Child] and [Junior FIRST Lego League]. I've gotten a great reaction to my latest challenge, to build a Watson Jr. in your own basement, based on my [step-by-step] instructions.
If you attended IBM Pulse this week, please comment on your thoughts and observations!
IBM Information Archive for email, files and eDiscovery
Not too many people have heard of IBM's Smart Archive strategy and the storage products IBM offers to meet compliance regulations. This session covered the following:
The differences between backup and archive, including a few of my own personal horror stories helping companies who had foolishly thought that keeping backup copies for years would adequately serve as their archive strategy
The differences between optical media, Write-Once Read-Many (WORM) media, and Non-Erasable, Non-Rewriteable (NENR) storage options.
Why putting a [space heater] on your data center floor is a bad idea, driving up power and cooling costs for little business value to the enterprise once the unit is full of rarely accessed read-only data.
An overview of the [IBM Information Archive], an integrated stack of servers, storage and software that replaces previous offerings such as the IBM System Storage DR550 and the IBM Grid Medical Archive Solution (GMAS).
The marketing bundle known as the [Information Archive for Email, Files and eDiscovery] that combines the Information Archive storage appliance with Content Collectors for email and file systems, as well as eDiscovery tools, and implementation services for a solution that can support a small or medium size business, up to 1400 employees.
IBM Tivoli Storage Productivity Center v4.2 Overview and Update
Many of the concerns raised when I [presented v4.1 at this conference last year] were addressed this year in v4.2, including full performance statistics for IBM XIV storage system, storage resource agent support for HP-UX and Solaris, and a variety of other issues.
I presented this overview in stages:
"Productivity Center Basic Edition" that comes pre-installed on the IBM System Storage Productivity Center hardware console, that provides discover of devices, basic configuration, and a clever topology viewer of what is connected to what.
"Productivity Center for Disk" and "Productivity Center for Disk Midrange Edition (MRE)" that provides real-time and historical performance monitoring, asset and capacity reporting.
"Productivity Center for Replication" which supports monitoring, failover and failback for FlashCopy, Metro Mirror and Global Mirror on the SVC, Storwize V7000, DS8000, DS6000 and ESS 800.
"Productivity Center for Data" which supports reporting on files, file systems and databases on DAS, SAN and NAS attached storage from a Operating System viewpoint.
"Productivity Center Standard Edition" which includes all of the above except "Replication", and adds performance monitoring of SAN Fabric gear, and some very clever analytics to improver performance and problem determination.
One of the questions that came up was "How big does my company have to be to consider using Productivity Center?" which I answered as follows:
"If you are a small company, and the "IT Person" has responsibilities outside the IT, and managing the few pieces of kit is just part of his job, then consider just using the web-based GUI through a Firefox or similar browser. If you are a medium sized company with dedicated IT personnel, but mostly run by system admins or database admins that manage storage and networks on the side, you might want to consider the "Storage Control" plug-in for IBM Systems Director. But if you are larger shop, and there are employees with the title "Storage Administrator" and/or "SAN Administrator", then Tivoli Storage Productivity Center is for you."
Tivoli Storage Productivity Center has matured into a fine piece of software that truly can help medium and large sized data centers manage their storage and storage networking infrastructure.
I like speaking the first day of these events. Often people come in just to hear the keynote speakers, and stay the afternoon to hear a few break-out sessions before they leave Tuesday or Wednesday for other meetings.
Down the street, in Times Square, IBM made it on the big board.
Continuous Data Availability
Jeanine Cotter, IBM VP for Data Center Services, started out with a video about Sabre. IBM developed this revolutionary airline reservation system to handle the huge volume of transactions. Today, 18 percent of organizations consider downtime unacceptable for their tier-1 applications, and 53 percent would be seriously impacted by an outage lasting an hour or more.
Eventually, companies cross the "Continuous Availability" threshold, the point where they discover that the possibility of downtime is too costly to ignore. IBM has clients using 3-site Metro/Global Mirror that can fail-over an entire data center in just five mouse clicks.
Jeanine also mentioned Euronics, which is using SAN Volume Controller's Stretched Cluster capability, which allows them to easily vMotion virtual guest images from one data center to another. SVC has had this capability for a while, but now, with full VMcenter plug-in and VAAI support, the capability is fully integrated with VMware.
A final example was a mid-sized University, they are using IBM Storwize V7000 with Metro Mirror. The primary location's Storwize V7000 manages Solid-state drives with Easy Tier. The secondary location's Storwize V7000 has high-capacity SATA drives and FlashCopy.
Customer Testimonial - University of Rochester Medical Center
Rick Haverty, Director of IT infrastructure at University of Rochester Medical Center [URMC] provided the next client testimonial. The mission of the URMC is to use science, education and technology to improve health. URMC gets over $400 million USD in NIH grants, which puts them around 23rd largest University-based academic medical centers in the country. They have over 900 doctors, general practice and specialists.
URMC has an IBM BlueGene supercomputer, a Cisco network over 45,000 ports, and over 7.5 million square feet of Wi-Fi wireless internet coverage. They have three datacenters. The first is 7500 square feet, the second is 6000 square feet, and the third is just 800 square feet to hold their "off-site tapes".
URMC has digitized all of their records, including Electronic Medical Records (EMR) system, medical dosage history, imaging "priors", calibration of infusion pumps, RFID monitoring, and even provide IT support while the patient is on the operating table. RFID monitoring ensures all of the refrigerators are keeping medications at the right temperature. A single failed refrigerator can lose $20,000 dollars worth of medication.
When is a good time for downtime? At URMC, they handle 90,000 Emergency Room vists per year, so the answer is never. When is the ER busiest? Monday morning. (not what I expected!)
URMC's EMR software (Epic) runs on clustered POWER7 servers, with DS8700 disk systems using Metro Mirror to secondary location. They also keep a third "shadow" POWER7 for read-only purposes, and a separate system that provides web-based read-only access. Finally, they have 90 stand-alone Personal Computers (PCs) that contain information for all the patients that have reservations this week, just in case all the other systems fail.
The exploding volume of data comes from medical imaging. For radiology (X-rays), each image is called a "study" takes 20-30 MB each, and they have 650,000 studies per year. This represents about 16TB storage per year, with 3 second response time access. These must be kept for 7 years since last view, or until the patient reaches the age of 18 years old, which ever is later.
But radiology is just one discipline. Healthcare has a whole bunch of "ologies". Another is "Pathology" which looks at cells between glass slides in a microscope. Each study consumes 10-20GB, and URMC does about 100,000 pathology studies per year, representing 150TB per year.
URMC has identified that they have 42 mission-critical applications. The data for these are stored on DS8000, XIV, Storwize V7000 and DS5000, all managed behind SAN Volume Controller.
IBM makes another breakthrough today with an announcement about tape data density. Unlike hard disk drive technologies that are hitting physical limits, IBM is proving that tape technology still has plenty of life in its future.
When I first started working for IBM in Tucson, back in 1986, a 3420 tape reel held only 180MB of data, and a 3480 tape cartridge improved this to 200MB of data. Today's enterprise tapes, like 3592 cartridges for the TS1130 drives, or LTO4 cartridges for the IBM TS1040 drives, are half-inch wide, half-mile long, and can store 1 TB or more of data per cartridge, depending on how well the data can compress. To increase cartridge capacity, designers can make changes in three dimensions:
Wider tape: The film industry tried this, going from 35mm to 70mm film, only to find that most cinemas did not want to upgrade their equipment. Keeping the media dimensions to half inch wide allows much of the engineering hardware to continue unchanged.
Longer tape: The problem with longer tape is that either the reel inside gets fatter, or you need to develop flatter media to fit within the existing cartridge dimensions. Wider reels means a bigger tape cartridge external dimensions, forcing changes to shelving units, cartridge trays, and carrying units. The media just can't get any flatter without risking getting more brittle.
Denser bit recording: once a convenient width and length were established, improving bit density turned out to be the best way to increase cartridge capacity.
Working with FujiFilm Corporation of Japan, my colleagues at IBM Research facility in Zurich were able to demonstrate an incredible 29.5 Gigabits per square inch, nearly 40 times more dense than today's commercial tape technology. In the near future, we will be able to hold a 35TB tape cartridge in our hand. There was actually a lot to make this happen, improved giant magentoresistive read/write heads, better servo patterns to stay on track, thinner tracks less than a micron thick, and better signal-to-noise processing to accomplish this. To learn more, you can read the [Press Release] or watch this quick [4-minute YouTube video].