Tony Pearson is a Master Inventor and Senior Software Engineer for the IBM Storage product line at the IBM Executive Briefing Center in Tucson, Arizona, and a featured contributor to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is the author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services. You can also follow him on Twitter @az990tony.
(Short URL for this blog: ibm.co/Pearson)
Continuing my week in Washington DC for the annual [2010 System Storage Technical University], I presented a session on Storage for the Green Data Center, and attended a System x session on Greening the Data Center. Since they were related, I thought I would cover both in this post.
Storage for the Green Data Center
I presented this topic in four general categories:
Drivers and Metrics - I explained the three key drivers for consuming less energy, and the two key metrics: Power Usage Effectiveness (PUE) and Data Center Infrastructure Efficiency (DCiE).
Storage Technologies - I compared the four key storage media types: Solid State Drives (SSD), high-speed (15K RPM) FC and SAS hard disk, slower (7200 RPM) SATA disk, and tape. I had comparison slides that showed how IBM disk was more energy efficient than the competition; for example, the DS8700 consumes less energy than EMC Symmetrix when configured with the exact same number and type of physical drives. Likewise, IBM LTO-5 and TS1130 tape drives consume less energy than comparable HP or Oracle/Sun tape drives.
Integrated Systems - IBM combines multiple storage tiers in a set of integrated systems managed by smart software. For example, the IBM DS8700 offers [Easy Tier] to provide smart data placement and movement across solid-state drives and spinning disk. I also covered several blended disk-and-tape solutions, such as the Information Archive and SONAS.
Actions and Next Steps - I wrapped up the talk with actions that data center managers can take to be more energy efficient, ranging from deploying the IBM Rear Door Heat Exchanger to improving the management of their data.
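To make the two metrics from the first category concrete, here is a minimal sketch of how they are usually computed: PUE is total facility power divided by IT equipment power, and DCiE is its reciprocal expressed as a percentage. The facility numbers below are hypothetical and were not part of the presentation.

```python
def pue(total_facility_watts: float, it_equipment_watts: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT power."""
    if it_equipment_watts <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_watts / it_equipment_watts


def dcie(total_facility_watts: float, it_equipment_watts: float) -> float:
    """Data Center Infrastructure Efficiency: IT power as a fraction of total (1 / PUE)."""
    if total_facility_watts <= 0:
        raise ValueError("total facility power must be positive")
    return it_equipment_watts / total_facility_watts


# Hypothetical facility (assumption): 1,000 kW of IT load plus
# 500 kW of cooling, power distribution and other overhead.
it_kw = 1000.0
overhead_kw = 500.0
total_kw = it_kw + overhead_kw

print(f"PUE  = {pue(total_kw, it_kw):.2f}")
print(f"DCiE = {dcie(total_kw, it_kw):.1%}")
```

A lower PUE (closer to 1.0) and a higher DCiE (closer to 100 percent) both mean that less energy is spent on overhead per watt of IT load.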
Greening of the Data Center
Janet Beaver, IBM Senior Manager of Americas Group facilities for Infrastructure and Facilities, presented on IBM's success in becoming more energy efficient. The price of electricity has gone up 10 percent per year, and in some locations, 30 percent. For every 1 Watt used by IT equipment, there are an additional 27 Watts for power, cooling and other uses to keep the IT equipment comfortable. At IBM, data centers represent only 6 percent of total floor space, but 45 percent of all energy consumption. Janet covered two specific data centers, Boulder and Raleigh.
At Boulder, IBM keeps a 48-hour reserve of gasoline (to generate electricity in case of an outage from the power company) and 48 hours of chilled water. Many power outages last less than 10 minutes, which can easily be handled by the UPS systems. At least 25 percent of the Computer Room Air Conditioners (CRAC) are also on UPS, so that there is some cooling during those minutes, within the ASHRAE guidelines of 72-80 degrees Fahrenheit. Since gasoline gets stale, IBM runs the generators once a month, which serves as a monthly test of the system, and clears out the lines to make room for fresh fuel.
The IBM Boulder data center is the largest in the company: 300,000 square feet (the equivalent of five football fields)! Because of its location in Colorado, IBM enjoys "free cooling" using outside air temperature 63 percent of the year, resulting in a PUE rating of 1.3. Electricity is only 4.5 US cents per kWh. The center also uses 1 million kWh per year of wind energy.
The Raleigh data center is only 100,000 square feet, with a PUE rating of 1.4. The Raleigh area enjoys 44 percent "free cooling" and electricity costs 5.7 US cents per kWh. The Leadership in Energy and Environmental Design [LEED] program has been updated to certify data centers. The IBM Boulder data center has achieved LEED Silver certification, and the IBM Raleigh data center has LEED Gold certification.
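As a rough illustration of what these PUE ratings and electricity prices mean in dollars, the sketch below estimates a yearly electricity bill for each site. The PUE and price figures come from the text above; the 1,000 kW IT load is purely an assumption for comparison and is not a figure from the session.

```python
HOURS_PER_YEAR = 24 * 365

# PUE and price per kWh as quoted above for each site.
SITES = {
    "Boulder": {"pue": 1.3, "cents_per_kwh": 4.5},
    "Raleigh": {"pue": 1.4, "cents_per_kwh": 5.7},
}

# Assumption for illustration only: the same steady IT load at both sites.
ASSUMED_IT_LOAD_KW = 1000.0


def annual_cost_usd(it_load_kw: float, pue: float, cents_per_kwh: float) -> float:
    """Yearly electricity cost: IT load scaled up by PUE, times hours, times price."""
    total_kw = it_load_kw * pue
    return total_kw * HOURS_PER_YEAR * cents_per_kwh / 100.0


for name, site in SITES.items():
    cost = annual_cost_usd(ASSUMED_IT_LOAD_KW, site["pue"], site["cents_per_kwh"])
    dcie = 1.0 / site["pue"]
    print(f"{name}: DCiE {dcie:.1%}, about ${cost:,.0f} per year")
```

Under that assumed load, the difference between the two sites comes mostly from the price per kWh rather than from the PUE gap.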
Free cooling, electricity costs, and disaster susceptibility are just three of the 25 criteria IBM uses to locate its data centers. In addition to the 7 data centers it manages for its own operations, and 5 data centers for web hosting, IBM manages over 400 data centers for other clients.
It seems that Green IT initiatives are more important to the storage-oriented attendees than the x86-oriented folks. I suspect that is because many System x servers are deployed in small and medium businesses that do not have data centers, per se.
Michael Scott, one of my "Second Life" builder/scripters, for demonstrating client-focused dedication to IBM's corporate values.
Our site manager, Terri Mitchell, did a recap of all our recent awards and accomplishments. Of the nine Design Innovation awards won by IBM this year at the CeBIT conference, eight were for IBM System Storage products!
The IBM System Storage EXP3000: an entry-level data storage server that is optimized for cost-sensitive and space-limited environments and employs a user-centered design that enables ease of use and simple tool-less installation and removal of all components.
The IBM System Storage N7000 Series: a modular disk storage system that delivers high-end enterprise storage and data management value ideal for large-scale applications, while helping to anticipate growth, maintain data availability and reduce costs.
The IBM System Storage N5000 Series: a modular disk storage system designed to address the entire spectrum of data availability challenges while offering value in price and scalability. Built-in enterprise serviceability and manageability features support efforts to increase reliability and simplify storage infrastructure and maintenance.
The IBM System Storage N3700: a filer that integrates storage and storage processing into a single unit, facilitating affordable network deployments.
The IBM System Storage DS4700: a NEBS-compliant disk storage server designed to address requirements for companies in the telecommunications industry, as well as other segments, such as oil and gas, meeting standards for electromagnetic compatibility, thermal robustness, and earthquake and office vibration resistance, and providing protection for the product components from airborne contaminants.
The IBM System Storage EXP810: a data storage expansion unit capable of 4.8 Terabytes of physical storage, with a user-centered and tool-less design featuring redundant power, cooling, and disk modules for ease of use and simple serviceability.
The IBM System Storage TS3400: an affordable, space-friendly tape library for users in remote locations that supports enterprise-class technology and encryption capabilities.
A representative from Tucson's Brewster Center presented Terri an award, thanking IBM for its strong support for the community through various charity initiatives.
The final speaker was a new IBM client, Tony Casella, the IT Director of the town of Marana. Recently, the town of Marana made big news when it selected IBM products. Arizona is the fastest growing state in the USA, and the town of Marana, just north of Tucson, is one of the fastest growing communities in Arizona. The town is growing so large that it will soon spill over from Pima into Pinal county, and will be the first town in Arizona authorized to span county boundaries.
Well, it's Tuesday, and so it is "announcement day" again! Actually, for me it is Wednesday morning here in Mumbai, India, but since I was "press embargoed" until 4pm EDT in talking about these enhancements, I had to wait until Wednesday morning here to talk about them.
World's Fastest 1TB tape drive
IBM announced its new enterprise [TS1130 tape drive] and corresponding [TS3500 tape library support]. This one has a funny back-story. Last week while we were preparing the Press Release, we debated whether to compare the 1TB-per-cartridge capacity against Sun's Enterprise T10000 (500GB), which it doubles, or against LTO-4 (800GB). The problem changed when Sun announced on Monday they too had a 1TB tape drive, so now instead of saying that we had the "World's First 1TB tape drive", we quickly changed this to the "World's Fastest 1TB tape drive" instead. At 160MB/sec top speed, IBM's TS1130 is 33 percent faster than Sun's latest announcement. Sun was rather vague about when they will actually ship their new units, so IBM may still end up being first to deliver as well.
While EMC and other disk-only vendors have stopped claiming that "tape is dead", these recent announcements from IBM and Sun indicate that indeed tape is alive and well. IBM is able to borrow technologies from disk, such as the Giant Magneto Resistive (GMR) head, and apply them to its tape offerings, which means much of the R&D for disk applies to tape, keeping both forms of storage well invested. Tape continues to be the "greenest" storage option, more energy efficient than disk, optical, film, microfiche and even paper.
On the LTO front, IBM enhanced the reporting capabilities of its [TS3310] midrange tape library. This includes identifying the resource utilization of the drives, reporting on media integrity, and improved diagnostics to support library-managed encryption.
IBM System Storage DR550
As a blended disk-and-tape solution, the [IBM System Storage DR550] easily replaces the EMC Centera to meet compliance storage requirements. IBM announced that we have greatly expanded its scalability: it now supports 1TB disk drives and can attach to either IBM's or Sun's 1TB tape drives.
Massive Array of Idle Disks (MAID)
IBM now offers a "Sleep Mode" in the firmware of the [IBM System Storage DCS9550], which is often called "Massive Array of Idle Disks" (MAID) or spin-down capability. This can reduce the amount of power consumed during idle times.
That's a lot of exciting stuff. I'm off to breakfast now.
The "Storage Symposium Mexico - 2008" conference was a great success this week!
Day 1 - The plan was for me to arrive for the Wednesday night reception. Each attendee was given a copy of my latest book [Inside System Storage: Volume I] and I was planning to sign them. I thought perhaps we should have a "book signing" table like all of the other published authors have.
Things didn't go according to plan. Thunderstorms at the Mexico City airport forced our pilot to find an alternate airport. Nearby Acapulco airport was the logical choice, but was full from all the other flights, so the plane ended up in a tiny town called McAllen, Texas. I did not arrive until the morning of Day 2, so ended up signing the books throughout Thursday and Friday, during breaks and meals, wherever they could find me!
Special thanks to fellow IBMer Ian Henderson who picked me up from the airport at such an awkward hour and drove me all the way to Cuernavaca!
All of us, IBMers, Business Partners and clients alike, donned black tee-shirts with a white eightbar logo for a group photo with one of those "wide lens" cameras. While we were being assembled onto the bleachers, I took this quick snapshot of myself and some of the guys behind me.
I was originally scheduled to be first to speak, but with my flight delays, was moved to a time slot after lunch. After a big Mexican lunch, the conference coordinators were afraid the attendees might fall asleep, a Mexican tradition called [siesta], so I was instructed to WAKE THEM UP! Fortunately, my topic was Information Lifecycle Management, a topic I have been very passionate about since my days working on DFSMS on the mainframe. With a 30 percent reduction in hardware capital expenditures, a 30 percent reduction in operational costs, and typical payback periods between 15 and 24 months, the presentation got everyone's attention.
Of course, a lot happens outside of the formal meetings. We had a Japanese theme dinner, where we wore Japanese Hachimaki [headbands] with the eightbar logo. For those not familiar with Japanese culture, hachimaki are worn today not so much for the practical purpose of catching perspiration but rather for mental stimulation, to express one's determination. Some students wear hachimaki when they study to put themselves in the right spirit and frame of mind.
Shown here are presenters Mike Griese (Infrastructure Management with IBM TotalStorage Productivity Center), Dave Larimer (Backup and Storage Management with IBM Tivoli Storage Manager), myself, and John Hamano (Unified Storage with IBM System Storage N series).
Day 3 - Wrapping up the week, I presented two more times.
First, I covered IBM Disk Virtualization with IBM SAN Volume Controller. One interesting question was whether the SAN Volume Controller could be made to look like a Virtual Tape Library. I explained that this was never part of the original design, but that if you want to combine SVC with a VTL into a blended disk-and-tape solution, consider using the IBM product called Scale-Out File Services [SoFS], which I covered in my post [More details about IBM clustered scalable NAS].
During one of the breaks, I took a picture of the behind-the-scenes staff that put this together. They had created these huge blocks representing puzzle pieces, emphasizing how IBM is one of the few IT vendors that can bring all the pieces together for a complete solution.
Shown here are Mike Griese (presenter), Cyntia Martinez, Claudia Aviles, Cesar Campos (IBM Business Unit Executive for System Storage in Mexico), and Claudia Lopez. Each day the staff wore matching shirts so that it was easy to find them.
Later, I covered Archive and Compliance Solutions to highlight our complete end-to-end set of solutions. When asked to compare and contrast the architectures of the IBM System Storage DR550 with EMC Centera, I explained that the DR550 optimizes the use of online disk access for the most recent data. For example, if you are going to keep data for 10 years, maybe you keep the most recent 12 months on disk, and the rest is moved, using policy-based automation, to a tape library for the remaining nine years. This means that the disk inside the DR550 is always being used to read and write the most recent data, the data you are most likely to retrieve from an archive system. Data older than a year is still accessible, but might take a minute or two for the tape library robot to fetch. The EMC Centera, on the other hand, is a disk-only solution. It offers no option to move older data to tape, nor the option to spin down the drives to conserve power. It fills up after the same 12 months or so, and then you get to watch it the remaining nine years, consuming electricity and heating your data center.
I don't know about you, but I have never seen anyone purposely put "space heaters" into their data center, yet a full EMC Centera certainly does little else. Both devices use SATA drives and support disk mirroring between locations, but IBM DR550 offers dual-parity RAID-6, and supports encryption of the data on both the disk and the tape in the DR550. EMC Centera still uses only RAID-5, and has not yet, as far as I know, offered any level of encryption. IBM System Storage DR550 was clocked at about three times faster than Centera at ingesting new archive objects over a 1GbE Ethernet connection.
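The 10-year retention example above can be sketched as a simple age-based placement rule. This is only an illustration of the idea, not the DR550's actual policy engine; the function names are hypothetical, and the steady-state share assumes data is ingested at a constant rate.

```python
RETENTION_MONTHS = 10 * 12   # keep data for 10 years (from the example above)
DISK_MONTHS = 12             # most recent 12 months stay on disk


def tier_for(age_months: int) -> str:
    """Return where an archive object of the given age would live."""
    if age_months < 0:
        raise ValueError("age cannot be negative")
    if age_months >= RETENTION_MONTHS:
        return "expired"
    if age_months < DISK_MONTHS:
        return "disk"
    return "tape"


def steady_state_disk_share() -> float:
    """Fraction of retained data on disk, assuming a constant ingest rate."""
    return DISK_MONTHS / RETENTION_MONTHS


for age in (3, 12, 60, 119, 120):
    print(f"{age:>3} months old -> {tier_for(age)}")
print(f"Share of retained data on disk: {steady_state_disk_share():.0%}")
```

Under that constant-ingest assumption, only about a tenth of the retained data needs to sit on spinning disk at any time.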
This last photo is me and fellow IBMer Adriana Mondragón. She was one of my students in the [System Storage Portfolio Top Gun class], last February in Guadalajara, Mexico. She graduated in the top 10 percent of her group, earning her the prestigious title of "Top Gun" storage sales specialist.
The conference wrapped up with a Mexican lunch with a traditional Mariachi band. I took pictures, but figured you all already know what [Mariachi players] look like, and I didn't want to detract from the otherwise serious tone of this blog post! This was the first System Storage Symposium in Mexico, but based on its success, we might continue these annually.
A lot of people ask me about IBM branding, as we have recently changed brands. In the past we had two separate brands, one for servers (eServer) and one for storage (TotalStorage). These would be fine if we wanted to promote their independence, but customers today want synergy between servers and storage; they want systems that work well together.
Last year, in response to market feedback, we created a new brand, "IBM Systems", and put all the server and storage product lines under one roof. Over time, we will transition from TotalStorage to System Storage naming. This will occur with new products, and major versions of existing products.
Two other phrases you will hear in the names of our offerings are "Virtualization Engine" and "Express". These are portfolio identifiers. The Virtualization Engine identifier was created to emphasize our leadership in system virtualization, and we have products that span product lines with this identifier.
The Express identifier was created to emphasize our focus on Small and Medium sized business (SMB). It spans not just servers and storage, but also offerings from other IBM divisions.
Of course, just renaming products and services isn't enough. Systems don't work together just because they have similar names, are covered in similar "Apple white" plastic, or have similar black bezels. Obviously, thoughtful and collaborative design is needed, with the appropriate amounts of engineering and testing. IBM is aligning its server and storage development so that the IBM Systems brand keeps its promise.
With all the announcements we had in June, it is easy for some of the more subtle enhancements to get overlooked. While I was in Orlando for the IBM Edge conference, I was able to blog about some of the key featured announcements. Later, when I got back from Orlando to Tucson, I was able to blog about [More IBM Storage Announcements]. For IBM's Scale-Out Network Attached Storage (SONAS), I had simply:
"SONAS v1.3.2 adds support for management by the newly announced IBM Tivoli Storage Productivity Center v5.1 release. Also, IBM now officially supports Gateway configurations that have the storage nodes connected to XIV or Storwize V7000 disk systems. These gateway configurations offer new flexible choices and options for our ever-expanding set of clients."
In my defense, IBM numbers its software releases with version.release.modification, so 1.3.2 is Version 1, Release 3, Modification 2. Generally, modification announcements don't get much attention. The big announcement for v1.3.0 of SONAS happened last October; see my blog post [October 2011 Announcements - Part I] or the nice summary post [IBM Scale-out Network Attached Storage 1.3.0] from fellow blogger Roger Luethy.
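The version.release.modification scheme is easy to handle in code. Here is a minimal sketch that splits a string such as "1.3.2" into its three parts and checks whether one level is only a modification bump over another; the helper names are hypothetical and are not part of any IBM tool.

```python
from typing import NamedTuple


class Vrm(NamedTuple):
    """Version.Release.Modification, e.g. 1.3.2."""
    version: int
    release: int
    modification: int


def parse_vrm(text: str) -> Vrm:
    """Parse strings like '1.3.2' or 'v1.3.2' into a Vrm tuple."""
    parts = text.strip().lstrip("vV").split(".")
    if len(parts) != 3:
        raise ValueError(f"expected version.release.modification, got {text!r}")
    version, release, modification = (int(p) for p in parts)
    return Vrm(version, release, modification)


def is_modification_only(old: Vrm, new: Vrm) -> bool:
    """True when only the modification level increased."""
    same_release = (old.version, old.release) == (new.version, new.release)
    return same_release and new.modification > old.modification


october = parse_vrm("v1.3.0")
june = parse_vrm("v1.3.2")
print(june)
print(is_modification_only(october, june))
```

Because `Vrm` is a tuple, levels also compare in the natural order, so `october < june` holds.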
Here is a diagram showing the three configurations of SONAS.
I have covered the SONAS Appliance model in depth in previous blogs, with options for fast and slow disk speeds, choice of RAID protection levels, a collection of enterprise-class software features provided at no additional charge, and interfaces to support a variety of third party backup and anti-virus checking software.
The basics haven't changed. The SONAS appliance consists of 2 to 32 interface nodes, 2 to 60 storage nodes, and up to 7,200 disk drives. The maximum configuration takes up 17 frames and holds 21.6PB of raw disk capacity, which is about 17PB of usable space when RAID6 is configured. An interface node has one or two hex-core processors with up to 144GB of RAM to offer up to 3.5GB/sec performance each. This makes IBM SONAS the fastest performing and most scalable disk system in IBM's System Storage product line.
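The capacity figures above can be checked with a few lines of arithmetic. In this sketch, the 3TB drive size is inferred from 21.6PB across 7,200 drives, and the 8 data + 2 parity RAID6 layout is an assumption chosen because it lands close to the "about 17PB" quoted; neither is stated explicitly in this post.

```python
MAX_DRIVES = 7200
TB_PER_DRIVE = 3          # inferred: 21.6PB / 7,200 drives
TB_PER_PB = 1000          # decimal units, as drive vendors count them

# Assumption: RAID6 arrays built as 8 data drives + 2 parity drives.
RAID6_DATA_DRIVES = 8
RAID6_PARITY_DRIVES = 2

raw_pb = MAX_DRIVES * TB_PER_DRIVE / TB_PER_PB
data_fraction = RAID6_DATA_DRIVES / (RAID6_DATA_DRIVES + RAID6_PARITY_DRIVES)
usable_pb = raw_pb * data_fraction

print(f"Raw capacity:    {raw_pb:.1f} PB")
print(f"Usable (RAID6):  {usable_pb:.2f} PB before file system overhead")
```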
I thought I would go a bit deeper on the gateway models. These models support up to ten storage nodes, organized in pairs. The key difference is that instead of internal disk controllers, the storage nodes connect to external disk systems. There is enough space in the base SONAS rack to hold up to six interface nodes, or you can add a second rack if you need more interface nodes for increased performance.
SONAS with XIV gateway
XIV offers a clever approach to storage that allows for incredibly fast access to data on relatively slow 7200 RPM drives. By scattering data across all drives and taking advantage of parallel processing, rebuild times for a failed 3TB drive are less than 75 minutes. Compare that to typical rebuild times for 3TB drives that could take as much as 9-10 hours under active I/O loads!
In this configuration, each pair of storage nodes can connect to external SAN Fabric switches that then connect to one or two XIV storage systems. How simple is that? These can be the original XIV systems that support 1TB and 2TB drives, or the new XIV Gen3 systems that support 400GB Solid-state drives (SSD) and 3TB spinning disk drives. In both cases, you can acquire additional storage capacity in increments as small as 12 drives at a time (one XIV module holds 12 drives).
The maximum configuration of ten XIV boxes could hold 1,800 drives. At 3TB per drive, that would be 2.4PB usable capacity.
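The numbers above imply a few things that are worth working out: how many drives and modules each XIV system holds, and how usable capacity compares to raw. The sketch below derives them from the quoted figures only, using decimal units (1PB = 1,000TB) as an assumption.

```python
DRIVES_PER_MODULE = 12    # one XIV module holds 12 drives
XIV_SYSTEMS = 10          # maximum gateway configuration
MAX_DRIVES = 1800
TB_PER_DRIVE = 3
QUOTED_USABLE_PB = 2.4

drives_per_system = MAX_DRIVES // XIV_SYSTEMS
modules_per_system = drives_per_system // DRIVES_PER_MODULE
raw_pb = MAX_DRIVES * TB_PER_DRIVE / 1000
usable_fraction = QUOTED_USABLE_PB / raw_pb

print(f"Drives per XIV system:  {drives_per_system}")
print(f"Modules per XIV system: {modules_per_system}")
print(f"Raw capacity:           {raw_pb:.1f} PB")
print(f"Usable / raw:           {usable_fraction:.0%}")
```

The usable figure comes out below half of raw, which is what you would expect from a design that keeps redundant copies of data and reserves spare capacity for rebuilds.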
The SONAS with XIV gateway does not require the XIV devices to be dedicated for SONAS purposes. Rather, you can assign some XIV storage space for the SONAS, and the rest is available for other servers. In this manner, SONAS just looks like another set of Linux-based servers to the XIV storage system. This in effect gives you "Unified Storage", with a full complement of NAS protocols from the SONAS side (NFS, CIFS, FTP, HTTPS, SCP) as well as block-based protocols directly from the XIV (FCP, iSCSI).
SONAS with Storwize V7000 gateway
The other gateway offering is the SONAS with Storwize V7000. Like the SONAS with XIV gateway model, you connect a pair of SONAS storage nodes to 1 or 2 Storwize V7000 disk systems. However, you do not need a SAN Fabric switch in between. You can instead connect the SONAS storage nodes directly to the Storwize V7000 control enclosures.
To acquire additional storage capacity, you can purchase a single drive at a time. That's right. Not 12 drives, or 60 drives, at a time, but one at a time. The Storwize V7000 supports a wide range of SSD, SAS and NL-SAS drives at different sizes, speeds and capacities. The drives can be configured into various RAID protection levels: RAID 0, 1, 3, 5, 6 and 10.
Each Storwize V7000 control enclosure can have up to nine expansion drawers. If you choose the 2.5-inch 24-bay models, you can have up to 480 drives per storage node pair, for a total of 2,400 drives. If you choose the 3.5-inch 12-bay models, you can have up to 240 drives per node pair, 1,200 drives total. At 3TB per drive, this could be 3.6PB of raw capacity. The usable PB would depend on which RAID level you selected. Of course, you don't have to limit yourself all to one size or the other. Feel free to mix 2.5-inch and 3.5-inch drawers to provide different storage pool capabilities.
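The drive counts above follow from simple multiplication, sketched here. It assumes the control enclosure holds the same number of drives as each of its expansion drawers, and uses the maximum of ten storage nodes (five pairs) mentioned earlier for gateway models.

```python
ENCLOSURES_PER_V7000 = 1 + 9     # control enclosure plus nine expansion drawers
V7000_PER_NODE_PAIR = 2          # each storage node pair connects to 1 or 2 systems
NODE_PAIRS = 10 // 2             # up to ten storage nodes, organized in pairs
TB_PER_DRIVE = 3


def max_drives(bays_per_enclosure: int) -> tuple[int, int]:
    """Return (drives per storage node pair, drives in the full configuration)."""
    per_pair = bays_per_enclosure * ENCLOSURES_PER_V7000 * V7000_PER_NODE_PAIR
    return per_pair, per_pair * NODE_PAIRS


for label, bays in (("2.5-inch 24-bay", 24), ("3.5-inch 12-bay", 12)):
    per_pair, total = max_drives(bays)
    print(f"{label}: {per_pair} drives per node pair, {total} total")

_, total_large_form = max_drives(12)
print(f"Raw capacity with 3TB drives: {total_large_form * TB_PER_DRIVE / 1000:.1f} PB")
```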
All three SONAS configurations support Active Cloud Engine. This is a collection of features that differentiate SONAS from the other scale-out NAS wannabes in the marketplace:
Policy-driven Data Placement -- Different files can be directed to different storage pools. You no longer have to associate certain file systems to certain storage technologies.
High-speed Scan Engine -- SONAS can scan 10 million files per minute, per node. These scans can be used to drive data migration, backups, expirations, or replications, for example. It is over 100 times faster than traditional walk-the-directory-tree approaches employed by other NAS solutions.
Policy-driven Migration -- You can migrate files from one storage pool to another, based on age, days since last reference, size, and other criteria. The files can be moved from disk to disk, or moved out of SONAS and stored on external media, such as tape or a virtual tape library. A lot of data stored on NAS systems is dormant, with little or no likelihood of being looked at again. Why waste money keeping that kind of data on expensive disk? With SONAS, moving those files to tape can save lots of money. The files are stubbed in the SONAS file system, so that an access request to a file will automatically trigger a recall to fetch the data from tape back to the SONAS system.
Policy-driven Expiration -- SONAS can help you keep your system clean, by helping you decide what files should be deleted. This is especially useful for things like logs and traces that tend to just hang around until someone deletes them manually.
WAN Caching -- This allows one SONAS to act as a "Cloud Storage Gateway" for another SONAS at a remote location connected by Wide Area Network (WAN). Let's say your main data center has a large SONAS repository of files, and a small branch office has a smaller SONAS. This allows all locations to have a "Global" view of all the interconnected SONAS systems, with a high-speed user experience for local LAN-based access to the most recent and frequently used files.
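To illustrate how a scan feeds policy decisions, here is a small sketch in Python. It is not the SONAS policy language; the record fields, thresholds, and function names are all hypothetical. It shows a single pass over file metadata producing migration and expiration actions, plus a back-of-the-envelope scan time based on the 10 million files per minute, per node figure quoted above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical metadata record produced by a scan."""
    path: str
    size_bytes: int
    days_since_access: int
    pool: str


def plan_actions(
    files: Iterable[FileInfo],
    migrate_after_days: int = 90,      # assumed threshold
    expire_logs_after_days: int = 365, # assumed threshold
) -> Iterator[Tuple[str, str]]:
    """Yield (action, path) pairs for files that match a policy rule."""
    for f in files:
        is_log = f.path.endswith((".log", ".trc"))
        if is_log and f.days_since_access >= expire_logs_after_days:
            yield ("expire", f.path)
        elif f.pool == "disk" and f.days_since_access >= migrate_after_days:
            yield ("migrate_to_tape", f.path)


def scan_minutes(total_files: int, nodes: int,
                 files_per_minute_per_node: int = 10_000_000) -> float:
    """Estimated scan time if the quoted rate scales linearly with nodes."""
    return total_files / (files_per_minute_per_node * nodes)


sample = [
    FileInfo("/proj/report.doc", 2_000_000, 400, "disk"),
    FileInfo("/var/app/trace.trc", 50_000, 500, "disk"),
    FileInfo("/proj/active.xls", 80_000, 2, "disk"),
]
for action, path in plan_actions(sample):
    print(action, path)
print(f"{scan_minutes(1_000_000_000, nodes=10):.0f} minutes for a billion files on 10 nodes")
```

The point of the design is that one metadata scan can drive several policies at once, rather than walking the directory tree separately for each task.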
If you want to learn more, see the [IBM SONAS landing page]. Next week, I will be across the Pacific Ocean in [Taipei], to teach IBM Top Gun class to sales reps and IBM Business Partners. "Selling SONAS" will be one of the topics I will be covering!
It has always been the case in fast-paced technology areas that you can't tell the players without a program card, and this is especially true for storage.
When analyzing each acquisition move, you need to think of what is driving it. What are the motives? Having been in the storage business 20 years now, and seen my share of acquisitions, both from within IBM, as well as competition, I have come up with the following list of motives.
Although slavery was abolished in the US back in the 1800's, and centuries earlier everywhere else, many acquisitions seem to be focused on acquiring the people themselves, rather than the products or client list. I have seen statistics such as "We retained 98% of the people!" In reality, these retentions usually involve costly incentives, sign-on bonuses, stock options, and the like. Despite this, people leave after a few years, often because of personality or "corporate culture" clash. For example, many former STK employees seem to be leaving after their company was acquired by Sun Microsystems.
If you can't beat them, join them. Acquisitions can often be used by one company to raise its ranking in market share, eliminating smaller competitors. And now that you have acquired their client list, perhaps you can sell them more of your original set of products!
Symantec had acquired Veritas, which in turn had acquired a variety of other smaller players, and the end result is that they are now the #1 backup software provider, even though none of their products holds a candle to IBM's Tivoli Storage Manager. Meanwhile, EMC acquired Avamar to try to get more into the backup/recovery game, but most analysts still find EMC down in the #4 or #5 place in this category.
Next month, Brocade's acquisition of McData should take effect, furthering its market share in SAN switch equipment.
Prior to my current role as "brand market strategist" for System Storage, I was a "portfolio manager" where we tried to make sure that our storage product line investments were balanced. This was a tough job, as we had to balance development investments across different technologies, including patent portfolios. Despite IBM's huge research budget, I am not surprised that some clever inventions of new technologies come from smaller companies, which then get acquired once their results appear viable.
The last motive is value shift. This is where companies try to re-invent themselves, or find that they are stuck in a commodity market rut, and wish to expand into more profitable areas.
LSI Logic's acquisition of StoreAge is a good example of this. Most of the major storage vendors have already shifted to software and services to provide customer value, as predicted in the 1990s by Clayton Christensen in his book "The Innovator's Dilemma". The rest are still struggling to develop the right strategy, but leaning in this general direction.
Jon Toigo over at DrunkenData writes in his post [A Wink and a Nod] about the benefits of the new IBM System z10 Enterprise Class mainframe. Here's an excerpt about storage:
"The other key point worth making about this scenario is that storage behind a z10 must conform to IBM DASD rules. That means no more BS standards wars between knuckle-draggers in the storage world who continue to mitigate the heterogeneous interoperability and manageability of distributed systems storage using proprietary lock in technologies designed as much to lock in the consumer and lock out the competition as to deliver any real value. That has got to be worth something."
For z/OS and TPF operating systems, disk must support CCW commands over ESCON or FICON connections, or NFS commands over the Local Area Network. However, most of the workloads that are being ported over from x86 platforms will probably be running Linux on System z images, and Linux supports both CCW and SCSI protocols, the latter over native FCP connections through a Storage Area Network (SAN) or via iSCSI over the Local Area Network. Many SAN directors support both FCP and FICON, and the z10 also supports both 1Gbps and 10Gbps Ethernet, so you may not have to invest in any new networking gear.
The best part is that you may not have to migrate your data. The IBM System Storage SAN Volume Controller is supported for Linux on System z, and with "image mode" you can leave the data in its original format on its original disk array. Many file systems are now supported by Linux, including Windows NTFS with the latest NTFS-3G driver.
If your data is already on NAS storage, such as the IBM System Storage N series disk systems, then the IBM z10 can access it directly, from z/OS, z/VM or Linux.
Have lots of LTO tape data? Linux on System z supports LTO as well.
Jon continues his rant with a question about porting Microsoft Windows applications. Here's another excerpt:
"For one, what do we do with all the Microsoft servers. There is no Redmond-sanctioned approach to my knowledge for virtualizing Microsoft SQL Server or Exchange Server in a mainframe partition."
Yes, it is possible to run Windows on a mainframe through emulation, but I feel that's the wrong approach. Instead, the focus should be on running "functionally equivalent" programs on the native mainframe operating systems, and again Linux is often the best choice for this. Switching from Windows to Linux may not be "Redmond-sanctioned", but it gets the job done.
Instead of SQL Server, consider something functionally equivalent like IBM's DB2 Universal Database, or perhaps an open source database like MySQL, PostgreSQL or Apache Derby. Well-written applications use standard SQL calls, so if the application does not try to use unique, proprietary features of MS SQL Server, you are in good shape.
In my discussion last November on [Microsoft Exchange email server], I mentioned that Bynari makes a functionally equivalent email server on Linux that works with your existing Microsoft Outlook clients. Your end-users wouldn't know you migrated to a mainframe! (well, they might notice their email runs faster)
So if your data center has three or more racks of Sun, Dell or HP "pizza box" or "blade" x86 servers, chances are you can migrate the processing over to a shiny new IBM z10 EC mainframe, save some money in the process, without too much impact to your existing Ethernet, SAN or storage system infrastructure. IBM can even help you dispose of the old x86 machines so that their toxic chemicals don't end up in any landfill.
Two European scientists, Albert Fert (France) and Peter Grünberg (Germany), have won the 2007 Nobel Prize for physics for their research into Giant Magnetoresistance, or GMR. GMR read/write heads are used in IBM disk systems.
New high-density dual-coated particulate magnetic tape: Developed by Fuji Photo Film Co., Ltd., in Japan in collaboration with IBM Almaden researchers, this next-generation version of its NANOCUBIC™ tape uses a new barium-ferrite magnetic media that enables high-density data recording without using expensive metal sputtering or evaporation coating methods.
More sensitive read-write head: For the first time, magnetic tape technology employs the sensitive giant-magnetoresistive (GMR) head materials and structures used to sense very small magnetic fields in hard disk drives.
GMR servo reader: New GMR servo-reading elements, software and fast-and-precise positioning devices provide an active feedback system with unprecedented 0.35-micron accuracy in monitoring and positioning the read-write head over the 1.5-micron-wide residual data track.
Improved tape-handling features: Flangeless, grooved rollers permit smoother high-speed passage of the tape, which also enhances the ability of the head to write and read high-density data.
Innovative signal processing algorithms for the read data channel: An advanced read channel used new "noise-predictive, maximum-likelihood" (NPML) software developed at IBM's Zurich Research Laboratory to process the captured data faster and more accurately than would have been possible with existing methods.
IBM often leverages the research done in one part of its business over to other parts of its business. In this manner, advances in disk translate into advances in tape, keeping tape a viable medium for at least the next 8-10 years.
Am I dreaming? On his Storagezilla blog, fellow blogger Mark Twomey (EMC) brags about EMC's standard benchmark results, in his post titled [Love Life. Love CIFS.]. Here is my take:
A Full 180 degree reversal
For the past several years, EMC bloggers have argued, both in comments on this blog, and on their own blogs, that standard benchmarks are useless and should not be used to influence purchase decisions. While we all agree that "your mileage may vary", I find standard benchmarks are useful as part of an overall approach in comparing and selecting which vendors to work with, and which architectures or solution approaches to adopt, and which products or services to deploy. I am glad to see that EMC has finally joined the rest of the planet on this. I find it funny this reversal sounds a lot like their reversal from "Tape is Dead" to "What? We never said tape was dead!"
Impressive CIFS Results
The Standard Performance Evaluation Corporation (SPEC) has developed a series of NFS benchmarks, and the latest, [SPECsfs2008], added support for CIFS. So, on the CIFS side, EMC's benchmarks compare favorably against previous CIFS tests from other vendors.
On the NFS side, however, EMC is still behind Avere, BlueArc, Exanet, and IBM/NetApp. For example, EMC's combination of Celerra gateways in front of V-Max disk systems resulted in 110,621 OPS with overall response time of 2.32 milliseconds. By comparison, the IBM N series N7900 (tested by NetApp under their own brand, FAS6080) was able to do 120,011 OPS with 1.95 msec response time.
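For those who like to see the gap as percentages, here is a small sketch in Python that works only from the four figures quoted above. The function and variable names are my own, chosen for this illustration.

```python
def percent_change(new, old):
    """Relative change of new versus old, as a percentage."""
    return (new - old) / old * 100.0


# Figures quoted above from the SPECsfs2008 NFS results.
emc_ops, emc_ms = 110_621, 2.32
n7900_ops, n7900_ms = 120_011, 1.95

throughput_gain = percent_change(n7900_ops, emc_ops)  # positive means more ops/sec
latency_change = percent_change(n7900_ms, emc_ms)     # negative means faster response

print(f"Throughput: {throughput_gain:+.1f}%")
print(f"Response time: {latency_change:+.1f}%")
```

In other words, the N7900 result is roughly 8.5 percent higher on throughput and roughly 16 percent lower on overall response time than the EMC result.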
Even though Sun invented the NFS protocol in the early 1980s, they take an EMC-like approach against standard benchmarks to measure it. Last year, fellow blogger Bryan Cantrill (Sun) gave his [Eulogy for a Benchmark]. I was going to make points about this, but fellow blogger Mike Eisler (NetApp) [already took care of it]. We can all learn from this. Companies that don't believe in standard benchmarks can either reverse course (as EMC has done), or continue their downhill decline until they are acquired by someone else.
(My condolences to those at Sun getting laid off. Those of you who hire on with IBM can get re-united with your former StorageTek buddies! Back then, StorageTek people left Sun in droves, knowing that Sun didn't understand the mainframe tape marketplace that StorageTek focused on. Likewise, many question how well Oracle will understand Sun's hardware business in servers and storage.)
What's in a Protocol?
Both CIFS and NFS have been around for decades, and comparisons can sometimes sound like religious debates. Traditionally, CIFS was used to share files between Windows systems, and NFS for Linux and UNIX platforms. However, Windows can also handle NFS, while Linux and UNIX systems can use CIFS. If you are using a recent level of VMware, you can use either NFS or CIFS as an alternative to Fibre Channel SAN to store your external disk VMDK files.
The Bigger Picture
There is a significant shift going on from traditional database repositories to unstructured file content. Today, as much as [80 percent of data is unstructured]. Shipments this year are expected to grow 60 percent for file-based storage, and only 15 percent for block-based storage. With the focus on private and public clouds, NAS solutions will be the battleground for 2010.
So, I am glad to see EMC starting to cite standard benchmarks. Hopefully, SPC-1 and SPC-2 benchmarks are forthcoming?
Last week's focus was on tape libraries, both virtual and real, leading up to our IBM announcement of acquiring Diligent Technologies. I was focused on HDS blogger Hu Yoshida's post about his conversation with Mark, who was on an expert panel about these topics. Mark discovered that of the top energy consumers in his datacenter, his tape library was in the top five, a surprising result. Hu suggested that switching to a VTL with deduplication technology was a potential alternative, and I pointed to a whitepaper from the Clipper Group that suggested otherwise.
My response was that perhaps Highmark's choice of backup software was poorly written, or that they had set it up with the wrong parameters, and just changing hardware might not be the right answer. I went too far given that I didn't know which software they had, which parameters they were using, or which tape technology was involved. This came across wrong. I meant to poke fun at Hu's response. I did not mean to imply that Mark and his staff had made poor choices, or that they should automatically reject Hu's advice to consider other hardware alternatives.
I have discussed the situation with Mark, and agree that I should know his situation better before offering suggestions of my own.
Well, it's Tuesday again, and we had several announcements this month, so here is a quick recap. We had some things announced May 13, and then some more announcements today, but since I was busy with conferences, I will combine them into one post for the entire month of May 2008.
This time, I thought I would go "audio" with a recording from Charlie Andrews, IBM director of product marketing for IBM System Storage:
Well, it's Tuesday, and that means more IBM announcements!!!
Let's do a quick recap of what was announced for storage:
We now support 1000GB SATA-II drives in the DS4000 series. This is available for the DS4200 model 7V, DS4700, DS4800 as well as the expansion drawers EXP420 and EXP810. When I asked our marketing team why we weren't going to say "1TB" like everyone else, they thought 1000GB sounds bigger. I guess I should not have asked that on April Fool's day. For more details, see the IBM press releases for the [DS4200/EXP420 and DS4700/DS4800/EXP810].
IBM announced new machine code Release 1.4a for the IBM Virtualization Engine™ TS7700 virtual tape library for our System z mainframe customers. Various features come with this new level of machine code. See the IBM [Press Release] for more details.
Load balancing across the grid
Host control over the copy of logical volumes on a cluster by cluster basis
Option to gracefully remove an individual cluster from an existing grid
Initial-state reset for TS7700 database for cluster cleanup
Option to upgrade single-cache to dual-cache configuration
Also announced were updates to the 7214 model 1U2. Technically this is not in the IBM System Storage product line, but instead is designed specifically for our System p server line. This is a "media drawer" that allows you to have tape on one side, and optical on the other, in a single enclosure. IBM announced that you can now have DAT160 80GB drives that are read-write compatible with DAT72 and DDS4 drives, and half-high LTO-4 drives that can read LTO-2 media and are read-write compatible with LTO-3 media. Read the IBM [Press Release] for details.
Finally, if you are in the United States, Canada or the Caribbean, there is a special discount promotion for tape libraries purchased before June 20, 2008. This includes IBM TS3100, TS3200, TS3310 and TS3500 libraries. See the [Promotion Details] for eligibility.
IBM has added capability to the IBM TotalStorage Productivity Center for Replication. Here is a quick review of the different options for this component.
Base replication (uni-directional from primary to disaster site)
Two-site replication (bi-directional, including failover and failback)
Three-site replication (site awareness for all the copy sessions between all three sites in all situations)
Productivity Center for Replication supported all these levels for DS8000, DS6000 and ESS 800 disk models, but for SVC it only supported FlashCopy and Metro Mirror for the uni-directional base. IBM announced version 3.4 today that has added support for SVC for Global Mirror (asynchronous disk mirroring) and bi-directional failover/failback. This support lets you have "practice volumes" that allow IT managers to perform "disaster recovery exercises" without disrupting production workloads.
Also, for the DS8000, there is support for the new Space Efficient FlashCopy and Dynamic Volume Expansion features. Here is the IBM
The Productivity Center for Replication server can run on either a Windows/Linux-x86 server or a z/OS mainframe server. The Productivity Center for Replication on System z offers all the same new support for SVC and DS8000, as well as incorporated Basic HyperSwap capability that I mentioned in my post last February [DS8000 Enhancements for the IBM System z10 EC].
Here are the IBM press releases for the TotalStorage Productivity Center for Replication on [Windows/Linux-x86 and System z] servers.
I'm at a Business Partner conference today, discussing these announcements and other topics, so need to go back to those festivities.
Well it's Tuesday, which means it's time to look at recent announcements. While I was on vacation last week, IBM made a lot of storage announcements October 23. Josh Krischer gives his summary on WikiBon [October 2007 Review]. Austin Modine of The Register went so far as to say that [IBM goes crazy with storage system updates].
IBM System Storage DS8000 series
This is "Release 3" software/microcode upgrades on our existing "Turbo" hardware.
IBM FlashCopy SE -- Here "SE" stands for Space Efficient. Rather than allocating a full 100% of the space for the FlashCopy destination, you can set aside just a fraction, and this will hold all the changed blocks, similar to what IBM already offers on the DS4000 series.
Dynamic Volume Expansion -- In the past, if you needed more space for a LUN, you had to carve out a newer one elsewhere, and then copy the data over from the old to the new, leaving the old LUN around to be re-used or left stranded. With this enhancement, you can just upgrade the LUN in place, making it bigger as needed, similar to what IBM already offers on the DS4000 series and SAN Volume Controller. This applies to CKD volumes for the System z mainframe users out there as well.
Storage Pool Striping -- striping volumes across RAID ranks to eliminate or reduce hot-spots, and provide better load balancing. Many used SAN Volume Controller in front of the DS8000 to do this, but now you can do it natively in the DS8000 itself.
z/OS Global Mirror Multiple Reader -- for System z customers, "z/OS Global Mirror" is the new name for XRC. This enhancement improves the throughput of sending updates to the remote disaster recovery location.
DS Storage Manager enhancements -- the element manager software has been enhanced, and is pre-installed on the new IBM System Storage Productivity Center, which I will talk about below.
Intermix of DS8000 machine types -- this is especially useful to allow new frames to have co-terminating warranties with the base units. In other words, as you expand your system, you can ensure that the entire chunk of iron runs out of warranty all at the same time, to simplify your decision making process to upgrade or contract for extended service.
One of the biggest complaints about IBM TotalStorage Productivity Center is that it is software that needs to be installed on its own server, and that this installation process can take a day or two. Why wait? Now you can have a hardware console that has the DS8000 Storage Manager software, SVC Admin Console software, and IBM TotalStorage Productivity Center "Basic Edition" pre-installed. Here are the key features.
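The "space efficient" idea in the FlashCopy SE item above is easier to picture with a toy model. The sketch below is written in Python and shows a generic copy-on-write snapshot that saves a block only the first time it changes. It is an illustration of the general technique under my own assumptions, not a description of how the DS8000 microcode works, and the class and method names are made up for this example.

```python
class SpaceEfficientSnapshot:
    """Toy point-in-time copy that stores only blocks changed since creation."""

    def __init__(self, source):
        self.source = source  # live volume: a list of block contents
        self.saved = {}       # block index -> content at snapshot time

    def write(self, index, data):
        """Write to the live volume, preserving the old block on first change."""
        if index not in self.saved:
            self.saved[index] = self.source[index]
        self.source[index] = data

    def read_snapshot(self, index):
        """Read a block as it looked when the snapshot was taken."""
        return self.saved.get(index, self.source[index])

    def space_used(self):
        """Number of blocks the snapshot repository actually holds."""
        return len(self.saved)


volume = ["a", "b", "c", "d"]
snap = SpaceEfficientSnapshot(volume)
snap.write(1, "B")
snap.write(1, "BB")  # second write to the same block saves nothing new
```

After these two writes the live volume has changed, the snapshot still reads back the original four blocks, and the repository holds just one saved block instead of a full copy of the volume. That is why you only need to set aside a fraction of the space.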
Pre-installed and tested console
DS8000 R3 GUI integration
Cohabitation of SVC 4.2.1 GUI and CIMOM
Automated device discovery
Asset and capacity reporting, including tape library support
Our "Release 9" applies across the board, from N3000 to N5000 to N7000 series models, including new host bus adapters, and the new Data OnTAP 7.2.4 release level.
The Virtual File Manager (VFM) was announced as one of our latest [Storage Virtualization Solutions]. VFM provides a global namespace that aggregates the file systems from Linux, UNIX, and Windows file servers, as well as N series storage, into a consolidated environment.
IBM's virtual tape library (VTL) for the distributed systems platform, has been enhanced to provide:
Up to 12TB of disk cache, using 750GB SATA disk.
F05 Tape Frames installed as TS7520 base units through a 32 port fibre channel switch
Support for LTO generation 4 tape drives, both as virtual tape drives and as physical tape drives within IBM automated tape libraries attached to the TS7520. This allows you to use Encryption capabilities of LTO4.
DS3000 series now supports SATA disk, and can be attached to AIX and Linux on System p servers. This applies to the DS3200, DS3300 and DS3400 models. See the [DS3000 Announcement Letter] for more details.
It's Tuesday, which means IBM makes its announcements. We had several for the IBM System Storage product line. Here's a quick recap.
The IBM System Storage DS3000 now offers DC power models. New DC powered models of the DS3200, DS3400, and EXP3000 are well suited for Telco industry environments, as these are NEBS and ETSI compliant and are powered by an industry standard 48 volt DC power source.
Also, the IBM System Storage N series now supports 750GB SATA drives available for the EXN1000 drawer.
IBM Virtualization Engine TS7740 now supports 3-cluster grids. Unlike 3-way replication on disk mirroring, such as IBM Metro/Global Mirror for the DS8000 that enforces a primary, secondary and tertiary copy, the grid implementation of TS7740 tape virtualization allows for any-to-any mirroring. Existing standalone TS7740 clusters can be converted to grid-enabled. A "Copy Export" feature allows virtual tapes to be exported onto physical tape. And in keeping with our theme of "enabling business flexibility", performance throughput can now be purchased in 100 MB/sec increments, up to 600 MB/sec, to match your workload bandwidth requirements.
The IBM System Storage TS1120 drives installed in the IBM System Storage™ TS3400 Tape Library can now be attached to System z platforms using the IBM System Storage™ TS1120 Tape Controller. Before this, the TS3400 could only be attached to UNIX, Windows and Linux systems.
The IBM System Storage TS2230 Express is offered as an external stand-alone or rack-mountable unit. This model incorporates the new LTO IBM Ultrium 3 Serial Attached SCSI (SAS) Half-High Tape Drive, and a 3 Gbps single port SAS interface for a connection to a wide spectrum of distributed system servers that support Microsoft Windows and Linux systems.
IBM has added the Cisco MDS 9124 for IBM System Storage entry-level fabric switch as an Express offering and part of the IBM Express Advantage Program. Express offerings are specifically created for mid-market companies and are well suited for workgroup storage applications like e-mail serving, collaborative databases and web serving. They bring enterprise-class performance, scalability and features to small and medium-sized companies and are easy to use, highly scalable, and cost-effective. This will make it easier for IBM Business Partners to provide fabric switch connectivity for:
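The sizing arithmetic for those throughput increments is simple enough to sketch in Python. The 100 MB/sec step and the 600 MB/sec ceiling come from the announcement above, while the function name and the error handling are my own assumptions for illustration.

```python
import math

INCREMENT_MB_S = 100  # purchasable step, per the announcement
MAX_MB_S = 600        # ceiling, per the announcement


def increments_needed(workload_mb_s):
    """Smallest number of 100 MB/sec increments that covers the workload."""
    if workload_mb_s <= 0:
        raise ValueError("workload must be positive")
    if workload_mb_s > MAX_MB_S:
        raise ValueError("workload exceeds the 600 MB/sec maximum")
    return math.ceil(workload_mb_s / INCREMENT_MB_S)
```

For example, a workload that peaks at 250 MB/sec would need three increments rather than paying for the full 600 MB/sec.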
Storage consolidation solutions with IBM System Storage™ DS4000 Express disk arrays, especially the DS4700 Express.
Backup / restore solutions with IBM System Storage™ TS3000 Tape Libraries, such as the TS3200.
Archive and Retention
Ordering large configurations of the IBM System Storage Grid Access Manager just got a lot easier. New features enable configurations greater than 500 TB to be submitted as a single order. No change in the actual product, just an improvement in the ordering process.
For System p and System i servers, the IBM 3996 Optical library now supports Gen 2 60GB optical cartridges. These can be read/write or WORM cartridges.
I'm off to Denver, Colorado this week. I hope it is cooler there than it is down here in Tucson, Arizona.
Next Monday, September 1, 2008, marks my two year "blogoversary" for this blog!
I won't be blogging on Monday, of course, because that is [Labor Day] holiday here in the United States.
(From a Canadian colleague: US is not the only country who celebrates Labor Day on the first weekend in September. Canada also celebrates Labour Day on the first weekend in September. It's the only holiday (other than Christmas/New Years) where we are in sync with US. Our Thanksgiving Days are different as is your July 4 vs our July 1. But for Labour Day we are one with the Borg...)
(From an Australian colleague: each province of Australia has its own day to celebrate Labor Day, see [Australia Public Holidays])
The rest of the world celebrates Labor Day on May 1, but the USA celebrates this on the first Monday of September, which this year lands on September 1. Originally, the day was intended to be a "day off for working citizens", but IBM is kind enough to let managers and marketing personnel have the day off also. (Not that anyone is going to notice no press releases next Monday, right?)
I started this blog on September 1, 2006 as part of IBM's big ["50 Years of Disk Systems Innovation"] campaign. IBM introduced the first commercial disk system on September 13, 1956 and so the 50th anniversary was in 2006. Last year, IBM celebrated the 55th anniversary of tape systems.
Several readers have asked me why I haven't talked about recent current events, such as the Olympic Games in Beijing, or the U.S. National Conventions for the race for U.S. President. I have to remind them of one of the key precepts of IBM blogging guidelines:
8. Respect your audience. Don’t use ethnic slurs, personal insults, obscenity, or engage in any conduct that would not be acceptable in IBM’s workplace. You should also show proper consideration for others’ privacy and for topics that may be considered objectionable or inflammatory - such as politics and religion.
I made subtle references to my senator from Arizona, John McCain, in my post [ILM for my iPod], and to Barack Obama in my post [Searching for matching information]. I don't think anyone would mind that I send a "Happy Birthday!" wish to both of them. Senator McCain turns 72 years old today, and Senator Obama turned 47 years old earlier this month.
And lastly, Tucson itself [celebrates this entire month] its 233rd birthday. That's right, Tucson, the 32nd largest city of the USA, and headquarters for IBM System Storage, is older than the USA itself. While the Tucson area has been continuously inhabited by humans for over 3500 years, it officially became Tucson on August 20, 1775.
Fellow blogger Justin Thorp has opined that [blogging is like jogging]. Some days, you are just too busy to do it, and other days, you make time for it, because you know it is important. For the record, it is not my job to blog for IBM, that ended last September 2007. I continue to blog anyways because I have benefited from it, both personally and professionally. I want to thank all of you readers out there for making this blog a great success! Being named one of the top 10 blogs of the IT storage industry by Network World, two back-to-back Brand Impact awards from Liquid Agency, and recently earning a "31" Technorati ranking, has really helped keep me going.
So, I look forward to next month, and beginning my third year on this blog. I am sure there will be lots of surprises and announcements you can all look forward to in the next coming weeks and months that I will have plenty to write about.
Yesterday (September 7, 2006) the Eclipse Foundation announced that it has approved the creation of the Aperi Storage Management Framework Project.
There's been a lot of confusion out there about Aperi, so I thought I would post some facts and opinions about this exciting new project. A few years ago, I was the lead architect for IBM TotalStorage Productivity Center, IBM's infrastructure management product that helped launch the creation of Aperi.
From the Latin word for "open", Aperi is an open source project that aims to simplify the management of storage environments, using the Storage Management Initiative - Specification (SMI-S) open standard to promote interoperability and eliminate complexity in today’s storage environments.
Aperi should provide immediate value upon install with basic storage management capabilities, rather than just simply a collection of components that require costly integration. We've discussed requirements for functions such as:
Resource discovery, monitoring, and reporting
Fabric Topology mapping
Disk / Tape management
Device configuration & LUN assignment
SAN fabric management
Basic asset management
The big confusion most people have is Aperi's relation to SMI-S and the Storage Networking Industry Association (SNIA) open standards group. The best way to explain this is to go back to your High School SAT college-entrance exams. Remember questions like this?
(The answer: a crumb is to bread like a splinter is to wood.)
Aperi is an implementation of the SMI-S standard, similar to how MySQL and PostgreSQL are open standard relational database implementations of Structured Query Language (SQL). These compete with proprietary database implementations such as IBM DB2 Universal Database, Oracle Database, Microsoft SQL Server, or Sybase.
Aperi: SMI-S :: PostgreSQL : Structured Query Language (SQL)
It is often the case that the folks writing the code are different than the folks defining the standards. This is the case between the members of Aperi writing code, and the members of the SNIA writing standards. IBM happens to have employees writing Aperi code, and other employees helping define SMI-S standards. What can I say, IBM is a big company and a leader in many areas.
A good analogy is how the Apache community has developed an awesome web server, and the Firefox Mozilla community has developed an awesome web browser, both of which are implementations of the HTTP/HTML standards adopted by the World Wide Web Consortium. Apache and Firefox compete with proprietary implementations, such as Microsoft Internet Information Services (IIS) web server and Internet Explorer web browser.
Aperi: SNIA :: Apache : World Wide Web (WWW) Consortium
With this arrangement, Aperi and the SNIA will have very complementary roles in defining and driving standards across the entire storage market. To that end, Aperi will make extensive use of the SNIA’s Technology Center and SNIA’s “plugfests” to test the interoperability of the Aperi framework with the variety of 3rd-party storage offerings available. By providing a tested implementation of SMI-S, Aperi will drive broader industry availability of SMI-S, as well as offer the many benefits of an industry-backed open source community.
Check out this vote of confidence:
"Eclipse's Aperi Project will further advance the adoption of SNIA's SMI-S, benefiting the entire storage industry and IT community. Furthermore, the SNIA and Aperi will define plans to collaborate on new storage standards, standards testing programs, and storage interoperability programs." --- Wayne M. Adams, chair, SNIA Board of directors
So, both proprietary and open source implementations have their place in the world. Proprietary products are needed for advanced, unique value-add, and open source projects are for basic support focused on interoperability and flexibility. These can be combined, for example, proprietary "plug-ins" built on an open source base. The more choices the client has, the better.
Storage vendors benefit too. Vendors are tired of being in the "Y.A.C." business, building "Yet Another Configurator" for each new device developed, with basic functions to carve LUNs, read performance stats, and so on. By shipping Aperi instead, storage vendors like IBM can invest their development dollars in real innovations, things that matter for the customer.
As a consultant, I am often asked to help design the architecture for the information infrastructure. A useful analogy to gather requirements and preferences is the difference between area rugs and wall-to-wall carpeting. Area rugs are not secured to the floor and cover only a portion of the floor area. Carpets are generally tacked or cemented to the floor, often with an underlay of cushion padding, stretched across the entire floor surface, out to all four walls of each room.
Each has its pros and cons, and often is a matter of preference. Some people like area rugs because they can choose a different style for each room, match the decor and color scheme of furniture, and use these to define each living space. Ever since paleolithic man put animal skins on the floor of their cave, people have recognized that cold, hard and ugly floors could be covered up with something soft and more attractive. Others prefer wall-to-wall carpeting because they want to walk around the house barefoot, have their young children crawl on their hands and knees, and give the entire house a unified look and feel. This is often an inexpensive option when compared against the cost of individual rugs.
The same is true for an information infrastructure. For some, they prefer the "area rug" approach: this style of storage for their email, this other type of storage for their databases, and perhaps a third for their unstructured file systems. When customers ask what storage I would recommend for their SAP application, or their Microsoft Exchange email environment, or their Business Intelligence (BI) software, I recognize they are taking this "area rug" approach.
Like area rugs, having different storage can focus on specific attributes of the workload characteristics. It also insulates against company-wide changes, the dreaded "rip-and-replace" of replacing all of your storage with something from a different vendor. With "area rug" storage, you can support a dual-vendor or multi-vendor strategy, and upgrade or replace each on its own schedule.
Thanks to open standards and industry-standard benchmarks, changing out one storage solution for another is as simple as rolling up an area rug, and putting another one in its place that is similar in size dimensions.
Others may prefer the "wall-to-wall carpeting" approach: one disk system type, one tape library type, one network type, that provides unified management and minimizes the need for unique skills. Generally, the choice of NAS, SAN or iSCSI infrastructure is done company-wide, and might strongly influence the set of products that will support that decision. For example, those with a mix of mainframe and distributed servers looking for SAN-attached storage may look at an [IBM System Storage DS8000] and [TS3500 tape library] that can provide support for FICON and FCP.
Those looking at NAS or iSCSI might consider the IBM System Storage N series products, "unified storage" supporting iSCSI, FCP and NAS protocols. If you want the "wall-to-wall" to stretch across all the sites in your globally integrated enterprise, IBM's scalable NAS product, Scale-Out File Services [SoFS], provides a global name space in combination with a clustered file system that provides incredible scalability and performance based on field-proven technology used by the majority of the [Top 100 supercomputer] deployments.
IBM can help you design an information infrastructure that fits either approach.
For those who missed it, IBM announced last Tuesday encryption capability for the TS1120 drive, our enterprise tape drive that reads and writes 3592 cartridges. Do you need special cartridges for this? No! Use the same ones you have already been using!
For those of you worried about my mysterious absence on the blogosphere, I am getting better. Sorry for not posting much lately, I have had more serious issues to worry about. I am awaiting results on whether I have Dengue fever from Brazil, Avian flu from Thailand, Malaria from Kenya, or perhaps it is just food poisoning from the otherwise fabulous French cuisine I ate last week in the South Pacific. Well, I am back in town for a while, and hopefully will recover to full health, and have some time to reflect my thoughts on storage topics.
Speaking of which, a lot has happened while I was out. Let's take a quick look.
Following our introduction of the world's first encryption-capable tape drive, the TS1120, IBM now offers higher capacity 700GB cartridges, in standard 3592 format.
The DS8000 Turbo disk system now is being offered with a flexible choice of warranty periods, 1-year, 2-year, 3-year and 4-year. Since IBM was the only one to offer 4-year warranties, it was sometimes difficult to compare apples-to-apples with our competition that offered lesser warranty periods. Now, we can match the warranty period you need, so the focus can shift on the added value the DS8000 Turbo provides at the right price.
IBM's newest low-end half-high tape drive is the TS2230 Tape Drive Express Model H3L, part of our Express portfolio of offerings designed for small and medium-sized businesses (SMB). It supports the latest LTO Generation 3 specification, so it is fully compatible with our larger tape systems, as well as the LTO-based gear from HP and Quantum.
'Those who cannot remember the past are condemned to repeat it.' --- George Santayana
This last week of 2006 seems like a good time to recap the past year, and review the upcoming new year. That said, a good start is PC World's Top 21 Tech Screwups of 2006.
Laptops made the news this year in a variety of ways. #1 was exploding batteries, and #6 were the stolen laptops that exposed private personal information. Someone I know was listed in one of these stolen databases, so this last one hits close to home. Security is becoming a bigger issue now, and IBM was the first to deliver device-based encryption with the TS1120 enterprise tape drive.
IBM makes the chips used in all the major game consoles: Microsoft's Xbox 360, Nintendo's Wii, and Sony's PlayStation 3. Being all based on IBM technology doesn't make the games interoperable or compatible, and in the case of Sony, it made #8 for being incompatible with their own PlayStation 2. Sadly, Nintendo's Wii had its own set of problems, and I found this parody of a safety video on YouTube you might enjoy.
Microsoft had #5 (not understanding the holiday shopping season ends in December), #12 (not understanding people who use PCs prefer privacy), and #17 (not understanding how people use MP3 music players). At least they delivered their latest Xbox with minimal problems. As an engineer, taking on a market strategy role involved reading books and taking classes on marketing. I learned that it is all about understanding the marketplace well enough so that your prospects "know, like, and trust" your company. Perhaps Microsoft should take a refresher course.
A few companies showed off their brilliant customer service. Comcast is represented in a video on #7, and AOL in a taped phone conversation on #15. Many of our clients are afraid of vendor lock-in, and how difficult it might be to undo the deployment of new storage technology. Fortunately, IBM is committed to open standards, making it easier for our clients to make the right choice and feel good about it.
Hopefully, we can all learn from the mistakes of others, and not repeat them in 2007.
IBM makes another breakthrough today with an announcement about tape data density. Unlike hard disk drive technologies that are hitting physical limits, IBM is proving that tape technology still has plenty of life in its future.
When I first started working for IBM in Tucson, back in 1986, a 3420 tape reel held only 180MB of data, and a 3480 tape cartridge improved this to 200MB of data. Today's enterprise tapes, like 3592 cartridges for the TS1130 drives, or LTO4 cartridges for the IBM TS1040 drives, are half-inch wide, half-mile long, and can store 1 TB or more of data per cartridge, depending on how well the data can compress. To increase cartridge capacity, designers can make changes in three dimensions:
Wider tape: The film industry tried this, going from 35mm to 70mm film, only to find that most cinemas did not want to upgrade their equipment. Keeping the media dimensions to half inch wide allows much of the engineering hardware to continue unchanged.
Longer tape: The problem with longer tape is that either the reel inside gets fatter, or you need to develop thinner media to fit within the existing cartridge dimensions. A fatter reel means bigger external cartridge dimensions, forcing changes to shelving units, cartridge trays, and carrying units. The media just can't get any thinner without risking getting more brittle.
Denser bit recording: once a convenient width and length were established, improving bit density turned out to be the best way to increase cartridge capacity.
Working with FujiFilm Corporation of Japan, my colleagues at the IBM Research facility in Zurich were able to demonstrate an incredible 29.5 Gigabits per square inch, nearly 40 times more dense than today's commercial tape technology. In the near future, we will be able to hold a 35TB tape cartridge in our hand. It took a lot to make this happen: improved giant magnetoresistive read/write heads, better servo patterns to stay on track, narrower tracks less than a micron wide, and better signal-to-noise processing. To learn more, you can read the [Press Release] or watch this quick [4-minute YouTube video].
I have created blog categories, based on our System Storage offering matrix, which you can track individually:
Disk systems, including the IBM System Storage DS Family of products, SAN Volume Controller, N series, as well as features unique to these products, such as FlashCopy, MetroMirror, or SnapLock.
Tape systems, including the IBM System Storage TS Family of products, tape-related products in the Virtualization Engine portfolio, drives, libraries and even tape media.
Storage Networking offerings, from Brocade, McData, Cisco and others, such as switches, routers and directors.
Infrastructure management, including IBM TotalStorage Productivity Center software, IBM Tivoli Provisioning Manager, IBM Tivoli Intelligent Orchestrator, and IBM Tivoli Storage Process Manager.
Business Continuity, including IBM Tivoli Storage Manager, Tivoli CDP for Files, Productivity Center for Replication software component, Continuous Availability for Windows (CAW), Continuous Availability for AIX (CAA).
Lifecycle and Retention offerings, including our IBM System Storage DR550, DR550 Express, GPFS, Tivoli Storage Manager Space Management for UNIX, Tivoli Storage Manager HSM for Windows, and DFSMS.
Storage services, including consulting, assessments, design, deployment, management and outsourcing.
This year I resolve to be more consistent in my blogging, and my goal is to give you one to five entries per week, every week, based on the advice from Glenn Wolsey, Jennette Banks, and others. On some weeks, I will have a running theme, so rather than writing super-long entries to cover everything I can think of on a topic, I will make the entries short and readable. This week is a good time to review last year's "New Year's Resolutions" and to make new ones for 2007. I will discuss actions that companies can adopt for their data centers.
A common resolution is to lose weight, as in this Dilbert comic. Last year, I resolved to lose weight in 2006, and am delighted with myself that I lost eight pounds. When people ask for the secret of my success, I whisper in their ear "Eat less, exercise more." In general, people (and companies) know what to do, but just don't do it, which Pfeffer and Sutton document in their book The Knowing-Doing Gap. In my case, it involved lifestyle change: I exercised at a gym three times per week in Tucson, with a personal trainer, and revamped my diet.
Not everyone subscribes to the "eat less exercise more" philosophy. For example, Ric Watson argues in his blog that you can eat fewer calories, but eat more in actual volume, by choosing the right foods. This brings up the issues of "metrics" that most data centers are familiar with. Last year, I read the book "You: On a Diet" which explains that it is better to focus on "waist reduction" as measured in inches around your mid-section at the belly button, than "weight reduction" as measured in pounds. This year, I resolve to get down to 35 inches by the end of 2007.
The problem with measuring "weight" is that you are weighing bones, muscle and fat. A person can gain ten pounds of muscle, lose ten pounds of fat, and the scale would indicate no progress. The same problem occurs in data centers. How many TB of data do you have? Storage admins can easily tell you, but can they tell how much of this is bone (data needed for operating infrastructure), muscle (data used in daily operations that generates revenue) or fat (obsolete or orphaned data)?
We at IBM often state that "Information Lifecycle Management (ILM)" is more of a lifestyle change than a "fad diet". Figuring out what data you should capture in the first place, where to place it, when to move it, and when to get rid of it, is more important than just buying different tiers of storage hardware. So, for those looking to make new data center resolutions, I suggest the following actions:
Re-evaluate the metrics you now use, and determine if they are helpful in making decisions and taking action.
Come up with new ones that are more focused to solve the issues you face.
Consider storage infrastructure software, such as IBM TotalStorage Productivity Center, to help you gather the information about your SAN, disk and tape systems, calculate the metrics, and automate the appropriate actions.
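As a rough illustration of the "bone, muscle, fat" metric suggested above, here is a minimal Python sketch that breaks total capacity down by category instead of reporting a single TB figure. The dataset names, sizes, and category labels are invented for the example; in practice the inventory would come from whatever reporting tool you already use.

```python
from collections import defaultdict

# Hypothetical inventory: (dataset name, size in GB, category).
# Names and sizes are made up purely for illustration.
INVENTORY = [
    ("os_images", 400, "bone"),        # needed to run the infrastructure
    ("order_db", 2500, "muscle"),      # used daily, generates revenue
    ("web_logs_2004", 900, "fat"),     # obsolete data
    ("orphaned_luns", 1200, "fat"),    # orphaned data
]


def capacity_by_category(inventory):
    """Return {category: (gigabytes, percent_of_total)}."""
    totals = defaultdict(int)
    for _name, size_gb, category in inventory:
        totals[category] += size_gb
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        category: (gb, round(100.0 * gb / grand_total, 1))
        for category, gb in totals.items()
    }


if __name__ == "__main__":
    for category, (gb, pct) in capacity_by_category(INVENTORY).items():
        print(f"{category:<7}{gb:>6} GB  {pct:>5}%")
```

A report like this makes it obvious when the "fat" share is growing even though the total TB number looks flat.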
Continuing this week's theme of New Year's Resolutions for the data center, today we'll talk about one that people don't always think about on a personal level, that is to hone your tools and skills.
A long time ago, I used to be a regular speaker at the SHARE user group conference. One of the most attended sessions was Sam Golob presenting the latest CBT Tape set of tools. Over time, this large collection of "mainframe shareware" was handed out on 3480 tape cartridges, then on CDs, and finally made downloadable off the web. Sam's main point, which I remember to this day, was that everyone who has a job should figure out what tools they use, keep those tools functioning properly, and learn to use them well.
Later, I took some cooking classes at a culinary school. Among other things, we learned:
A sharp knife is safer and easier to use than a dull one, resulting in fewer accidents
Knowing what you are doing is the difference between food that is "simply awful" and food that is "awfully simple" to prepare.
A well trained chef can prepare most meals with just a sharp knife and wooden spoon.
The same could be said about software tools. What tools do you use in your job? Do you feel you know how to take full advantage of their power and capabilities? If you develop software, do you know all the features for your debugging tools? If you develop advertising or marketing materials, do you know all the features of your photo or video editing software? If you manage storage in a data center, do you know all the tools for managing your storage area network (SAN), disk systems, tape libraries, and reporting tools to identify all of your files and databases across your entire IT environment? I would not be surprised if you could replace a whole mess of tools with just one, such as the IBM TotalStorage Productivity Center.
Continuing this week's theme of New Year's Resolutions for the data center, today we'll talk about one that many people make for their own personal lives: staying on a budget.
Often, when faced with tightening budgets, we try to make more use of what we already have. Tell someone they are only using 10 percent of their brain, and they immediately believe you; but tell them they are only using 30 percent of their storage, and they ask for a whitepaper, magazine article, or clarification on how that percentage is calculated. I actually visited a customer that was only using 6 percent of the storage attached to their Windows servers!
So, to help those of you making data center resolutions to stay on budget, the terms to remember are "Reduce", "Reuse" and "Recycle".
When people come to request storage, are they being reasonable about what they need today, or are they asking for what they might need over the next three years? They might need 50GB, but they ask for 100GB, in case they grow, and a year later, you find they have only 15GB of data on it. On the flipside, sometimes the person asks for what they need, but the storage admin gives out more, just to avoid being bothered so often when growth happens. Finally, I have seen this formalized into fixed-size LUNs: all the disk is carved into big 100GB pieces, so if you need 20GB, here's one big enough with plenty of room to grow.
If you are going to keep on a budget, remember that storage next year will cost about 30 percent less than storage today. That is the average annual drop for both disk and tape on a dollar-per-MB basis. If there is any way to postpone giving out storage until it is actually needed, you can save a bundle of money. Timing is everything! In the event of a disaster, getting immediate replacement for disk can be very expensive, but if you can wait just two weeks, you can negotiate a better deal. I thought of this while going to the movie theatre yesterday. A "hot dog" and a bottle of water was $8.00, but if you are able to wait two hours and eat after the movie, you can get a much better meal for less.
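To put a number on the savings from postponing purchases, here is a small Python sketch. It assumes a constant 30 percent yearly price decline and a hypothetical starting price of 1.00 per GB; both the function names and the purchase schedules are illustrative only.

```python
ANNUAL_PRICE_DROP = 0.30  # assumed constant yearly decline in cost per unit


def price_after(years, price_today=1.0, drop=ANNUAL_PRICE_DROP):
    """Cost per unit of capacity after `years`, given a constant yearly drop."""
    return price_today * (1.0 - drop) ** years


def purchase_cost(schedule, price_today=1.0):
    """Total cost of a schedule given as {years_from_now: capacity_bought}."""
    return sum(
        capacity * price_after(year, price_today)
        for year, capacity in schedule.items()
    )


if __name__ == "__main__":
    buy_everything_now = purchase_cost({0: 100})       # 100 GB today
    buy_as_needed = purchase_cost({0: 50, 1: 50})      # 50 GB now, 50 GB next year
    print(f"All up front: {buy_everything_now:.2f}")
    print(f"Deferred:     {buy_as_needed:.2f}")
```

Under these assumptions, splitting the 100GB request into 50GB now and 50GB a year later costs 85 instead of 100, a 15 percent saving before counting power and floor space for the idle capacity.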
A lot of companies buy new storage because their existing storage isn't fast enough, or doesn't have the latest copy services. This can easily be solved with an IBM SAN Volume Controller (SVC). The SVC can virtualize slower, functionless storage, and present to your application hosts virtual disks that are faster, and with all the latest disk-to-disk copy services like FlashCopy, Metro Mirror, and Global Mirror.
Chances are, you have unused disk capacity spread across all your storage today, but perhaps it is formatted into small LUNs. The SVC can combine the capacity, and let you carve up big LUNs at the sizes you need. This is like taking all those tiny pieces of soap in your shower and forming a new bar of soap, or taking all the crumbs at the bottom of your bread box, and making a new slice of bread. And, the virtual LUNs are dynamically expandable, so give out only the amount they need today, as it is simple to expand them to larger sizes later.
Of my 13 patents, the first will always be my favorite, on a function called "RECYCLE" for the Data Facility Storage Management Subsystem Hierarchical Storage Manager (DFSMShsm) product, which is now a component of the IBM z/OS operating system. Basically, tapes could contain hundreds or thousands of files, such as backup versions or archive copies, and these expired on different dates. As a result, a tape would be written 100 percent full, and then over time, decrease in valid data to 80, 60, 40, 20 until it hit 0 percent. In some cases, a single file could hold an entire tape hostage. RECYCLE was able to read the valid data off tapes that were perhaps less than 20 percent full, and consolidate them onto fewer tapes. As a result, a whole bunch of tapes could be returned to the scratch pool, and reused immediately for other workloads. This also helps in moving to newer, higher capacity cartridges, such as the new 700GB cartridge that IBM co-developed with FujiFilm. (This RECYCLE function exists in our IBM Tivoli Storage Manager software, as well as our Virtual Tape Server, but is called "reclamation" instead, to avoid confusion on searches.)
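The general idea of reclamation can be sketched in a few lines of Python. This is not the DFSMShsm or Tivoli Storage Manager implementation; the 20 percent threshold comes from the description above, while the data layout, function name, and the first-fit-decreasing packing strategy are assumptions made for the example.

```python
def reclaim(tapes, capacity_gb, threshold=0.20):
    """Consolidate sparsely used tapes onto fewer output tapes.

    tapes: {volser: [sizes in GB of still-valid files]}
    Returns (kept, new_tapes, scratch):
      kept      - tapes left untouched (at or above the threshold)
      new_tapes - list of file-size lists written to fresh output tapes
      scratch   - volsers emptied and returned to the scratch pool
    """
    kept, scratch, to_move = {}, [], []
    for volser, files in tapes.items():
        if sum(files) / capacity_gb < threshold:
            to_move.extend(files)      # valid data is read off this tape
            scratch.append(volser)     # the tape itself becomes reusable
        else:
            kept[volser] = files

    # Pack the moved files with a simple first-fit-decreasing heuristic
    # (an assumption; real products use their own placement rules).
    new_tapes = []
    for size in sorted(to_move, reverse=True):
        for tape in new_tapes:
            if sum(tape) + size <= capacity_gb:
                tape.append(size)
                break
        else:
            new_tapes.append([size])
    return kept, new_tapes, scratch


if __name__ == "__main__":
    tapes = {"A": [90], "B": [5, 10], "C": [15], "D": [2], "E": [60, 15]}
    kept, new_tapes, scratch = reclaim(tapes, capacity_gb=100)
    print("Kept:", sorted(kept))
    print("Output tapes:", new_tapes)
    print("Returned to scratch:", scratch)
```

In this made-up example, three nearly empty tapes are emptied and their valid data fits on a single output tape, for a net gain of two scratch tapes.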
When evaluating your use of tape, determine if you are making best use of the tapes you have now, and perhaps a RECYCLE (or reclamation) scheme may be in order. Fewer tapes can save money in many ways, such as reduced storage costs, and reduced courier costs to send the tapes offsite. Tape media can still be 10-20 times less expensive than disk, based on full capacity.
My IBM colleague Marissa Benekos brought her hand-held video camera to the [Storage Networking World] conference in Orlando, Florida. I am not there, as I had a conflict with another conference going on here in Tucson, so I am relying on Marissa to feed me information to blog about.
In this segment, she interviews "booth babe" David Bricker. I've known David a long time, and if you are there at the conference, tell him I sent you to visit him at the IBM booth.
David Bricker shows off some of the IBM System Storage product line at SNW in this YouTube video (2 minutes)
Sadly, I can't be in two places at once. SNW is a great conference to attend!
Continuing this week in Las Vegas, we had a great set of sessions today.
Fibre Channel Overview
I like the manner in which Jim Robinson presented this "basics" session on how Fibre Channel works, why it is spelled "Fibre" not "Fiber", and how all the different layers work in the protocol.
IBM Virtualization Engine TS7700 series
Jim Fisher from the IBM Tucson lab presented the TS7700 series, which replaces our Virtual Tape Server (VTS). He had performance numbers to show that it was faster in various measurements against the B20 model of the VTS. It is supported on the z/OS, z/VM, z/VSE, TPF and z/TPF operating systems.
IBM E-mail Archiving and Storage solution
Ron Henkhaus provided an overview of IBM's E-mail Archive and Storage appliance. The solution combines an IBM BladeCenter server blade, DS4200 series with SATA disk, and pre-installed software: IBM Content Manager, IBM Records Manager, IBM CommonStore for Lotus Domino and Microsoft Exchange, and IBM System Storage Archive Manager. Services are included to get it connected to your e-mail environment.
Lee La Frese from our Tucson performance lab presented various performance features of the IBM System Storage DS8000 series, and how they compare to competition.
First, some interesting statistics.
Back in 2002, the average high-end Enterprise Storage Server (ESS) model F20 was configured only for 4 Terabytes (TB). In 2004, the average ESS was up to 12 TB. Today, the average DS8100 is 17.4 TB and the average DS8300 is 41.5 TB.
51 percent of DS8000 series are configured for FCP only (Linux, UNIX, Windows, i5/OS), 35 percent for FICON only (System z mainframe), and 14 percent have a mix of both.
Average I/O density has stabilized to about 0.6 IOPS per GB. This means that for every TB of business data, you can expect most applications to issue 600 Input/Output requests per second.
While IBM SAN Volume Controller has the fastest SPC-1 and SPC-2 benchmarks, the DS8000 also has good results. Looking at just the monolithic "scale-up" systems, DS8000 has the fastest SPC-1, and second place for SPC-2.
Compared against the EMC DMX-3, the IBM DS8000 series has superior performance. For example, comparing 2Gbps port performance on each, DMX-3 is able to do 20 IOPS per port, compared to DS8000 with 38 IOPS per port. Compared against HDS USP, the response time for 60,000 IOPS for HDS averaged 10.5 milliseconds (msec), compared to IBM DS8000 at less than 6.5 msec.
There are some unique features of the DS8000 to optimize performance. Two are Adaptive Multi-stream Prefetching (AMP), which helps improve processing of database queries, and HyperPAV, which helps on mainframe workloads.
For FATA disks, performance of sequential reads and writes is only 20 percent less than 15K RPM FC disks, but a whopping 50 percent less for random access. Consider using FATA for audio/video streaming, surveillance data, seismic recordings, and medical imaging.
Comparing 146GB 10K versus 300GB 15K from a capacity perspective was interesting. 37TB of 300GB 15K drives had 20 percent better response time, but 25 percent less maximum throughput, than 37TB of 146GB drives. Depending on your workload, this can help you decide which to choose.
Lee also covered RAID rebuild performance. When an individual HDD that is part of a RAID group fails, the DS8000 performs a rebuild onto a spare drive. A RAID-5 rebuild is processed at 52 MB/sec, compared to RAID-10 at 56 MB/sec. Rebuild processing is low priority, so any other workload will take higher priority to avoid impacting application performance. Compared to EMC, the IBM DS8000 can rebuild a RAID-5 73GB 15K RPM drive in only 24 minutes, but it takes 37 minutes to do this on a DMX-3. That is 13 minutes of additional exposure where a second drive failure might cause you to lose all your data in that RAID group!
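Two of the figures above lend themselves to a quick back-of-the-envelope check. The Python sketch below assumes decimal units (1 TB = 1000 GB, 1 GB = 1000 MB) and assumes that rebuild time is simply drive capacity divided by the sustained rebuild rate with no competing workload; the function names are illustrative.

```python
def expected_iops(capacity_tb, iops_per_gb=0.6, gb_per_tb=1000):
    """Estimate application I/O demand from capacity and I/O density."""
    return capacity_tb * gb_per_tb * iops_per_gb


def rebuild_minutes(drive_gb, rate_mb_per_sec, mb_per_gb=1000):
    """Estimate rebuild time for one drive at a sustained rebuild rate."""
    return drive_gb * mb_per_gb / rate_mb_per_sec / 60


if __name__ == "__main__":
    for name, tb in [("1 TB", 1), ("average DS8100", 17.4), ("average DS8300", 41.5)]:
        print(f"{name}: about {expected_iops(tb):,.0f} IOPS")
    print(f"RAID-5 rebuild of 73GB:  {rebuild_minutes(73, 52):.1f} minutes")
    print(f"RAID-10 rebuild of 73GB: {rebuild_minutes(73, 56):.1f} minutes")
```

At 52 MB/sec, a 73GB drive works out to a little over 23 minutes, which is in line with the 24 minutes quoted for the DS8000.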
N series ILM and Business Continuity
James Goodwin from our Advanced Technical Support team presented IBM System Storage N series features that relate to ILM and Business Continuity. He covered features like SnapShot, SnapLock, SnapVault and LockVault.
The IBM Storage and Storage Networking Symposium in Las Vegas continues ...
N series and VMware
Jeff Barnett presented how VMware manages disk image files in its VMFS repository, and how N series offers a better alternative. Virtual machines can access N series volumes directly.
Business Continuity with System i
Allison Pate presented the various Business Continuity options for System i. Many customers use internal storage for System i, but this then hampers Business Continuity efforts. Instead, you can have IBM System Storage DS8000 or DS6000 series disk systems provide disk mirroring between clustered systems.
There was a lot of interest in DR550, one of our many compliance storage solutions. Ron Henkhaus presented an overview of our DR550 and DR550 Express offerings. Unlike the competitive disk-only solutions, such as the EMC Centera, the DR550 allows you to attach an automated tape library, managing large amounts of fixed content data at a much lower cost point. It also has encryption, for both disk and tape data.
Open Systems Disk Management
Siebo Friesenborg presented the various steps needed to troubleshoot performance problems with open systems, including the use of "iostat" on AIX systems as an example, and the steps you can take to make formal Service Level Agreements (SLA) between the IT department and the various lines of business.
IBM Encryption - TS1120 and LTO-4 encryption comparison
Tony Abete presented TS1120 and LTO-4 encryption techniques. Deploying encryption is more than just choosing a tape drive. There are a variety of factors involved, such as whether to manage the keys from the application, the operating system, or the library manager. You need policies to decide when to encrypt tapes and when not to, as well as for generating your keys, storing them, and sharing them with the business partners, suppliers and service providers to which you send tapes.
I can tell that many people are feeling like they are "drinking from a firehose". IBM's success in storage reaches out to so many different aspects of information management, a variety of industries, and disciplines as varied as regulatory compliance and medical imaging.
The IBM Storage and Storage Networking Symposium concludes today. As is typical for many such conferences, it ended at noon, so that people can catch airline flights.
TS1120 Tape Encryption - Customer Experiences
Jonathan Barney has implemented many deployments of tape encryption, and shared his experiences at two customer locations.
The first company had decided to implement their EKM servers on dedicated 64-bit Windows servers. They had three sites, in Chicago, Alpharetta, and New York City, each with two EKM servers. Each site had a single TS3500 tape library, which pointed to four EKM servers, two local and two remote.
The clever trick was managing the keystore. They decided that EKM-1 was their trusted source, made all changes to that, and then copied it to the other five EKM servers. His team deployed one site at a time, which turned out to be ok, but he would not recommend it. Better to design your complete solution, and make sure that all libraries can access all EKM servers.
This company decided to have a single key-label/key-pair for all three locations, but change it every 6 months. You have to keep the old keys for as long as you have tapes encrypted with those keys, perhaps 10-20 years. The customer found the IBM encryption implementation "elegant" and it can be easily replicated to a fourth site if needed.
The second company had both z/OS and Sun Solaris. Initially they planned to have both a hardware-based keystore on System z, and a software-based keystore on Sun, but they realized that the System z version was so much more secure and reliable, that it made no sense to have anything on the Sun Solaris platform.
On System z, they had two EKM images, and used VIPA to ensure load balancing from the library. Tapes written from z/OS used DFSMS Data Class to determine which tapes are encrypted and which aren't. All tapes written from Sun Solaris were encrypted, written to a separate logical library partition of the TS3500, which in turn contacted the System z for the EKM management to provide the keys to use for the encryption.
The "gotcha" for this case was that when they tested Disaster Recovery, they had to recover the two EKM servers first, before any other restores could take place, and this took way too long. Instead, they developed a scaled-down 10-volume "rescue recovery" z/OS image that would contain the RACF database and all EKM related software to act as the keystore during a disaster recovery. Anytime they make updates, they only have to dump 10 volumes to tape. Restore time is down to only 2 hours.
He gave this advice for deploying tape encryption:
Some third party z/OS security products, like Computer Associates Top Secret or ACF2, require some PTFs to work with the EKM. The latest IBM RACF is good to go.
Getting IP support from IOS to OMVS requires IPL.
At one customer, an OMVS monitor software program killed the EKM because it wasn't in their list of "acceptable Java programs". They updated the list and EKM ran fine.
Do not update the EKM properties file while EKM is running. EKM keeps a lot of stuff in memory, and when it is recycled, copies this back to the EKM properties file, reversing any changes you may have made. It is best to shut down EKM, update the properties file, then start EKM back up again. This is why you should always have at least two EKM servers for redundancy.
TSM for Linux on System z
Randy Larson from our Tivoli group presented this session. There is a lot of interest in deploying IBM Tivoli Storage Manager backup and archive software on Linux for System z. Many customers are already invested in a mainframe infrastructure, may have TSM for z/OS or z/VM, and want the newer features and functions that are available for TSM on Linux.
TSM has special support for Lotus Domino, Oracle, DB2 and WebSphere Application Servers. TSM clients can send backup data to a TSM server internally via Hipersockets, a virtual LAN feature on the System z platform that uses shared memory to emulate a TCP/IP stack.
One of the big questions is whether to run Linux as guests under z/VM, or natively on LPAR. The general deployment is to carve an LPAR and run Linux natively until your server and storage administration staff have taken z/VM training classes. Once trained, they can easily move native LPAR images to z/VM guests. Unlike VMware that takes a hefty 40% overhead on x86 platforms to manage guests, z/VM only takes 5-10% overhead.
For the TSM database and disk storage pools, Randy recommends FC/SCSI disk, with ext3 file system, combined with LVM2 into logical volumes. ECKD disk and reiserfs work too. Avoid use of z/VM minidisks. Under LVM2, consider 32KB stripes for the TSM database, and 256KB stripes for the disk storage pools. For multipathing, use the failover rather than the multibus method. Read IC45459 before you activate "directio".
The TSM for Linux on z is very much like the TSM on AIX or Windows, and not like the TSM for z/OS. For tape, TSM for Linux on z does not support ESCON/FICON attached tape; you need to use FC/SCSI attached tape and tape libraries. TSM owns the library and drives it uses, so give it a logical library partition separate from z/OS. For Sun/StorageTek customers, TSM works with or without the Gresham Enterprise DistribuTape (EDT) software. Use the IBM-provided drivers for IBM tape. For non-IBM tape, TSM provides some drivers that you can use instead.
That wraps up my week. This was a great conference! If you missed it, look for the one in Montpellier, France this October. Check out the list of IBM Technical Conferences to find others that might interest you.
The title of this post is inspired by Baxter Black's [latest book]. Rather than a recap of the break-out sessions, I thought I would comment on a few sentences, phrases or comments I heard in the afternoon and evening.
Stop buying storage from EMC or NetApp
The lunch was sponsored by Symantec. Rod Soderbery presented "Taking the cost out of cost savings", explaining some ideas to reduce IT costs immediately.
First, he suggested that people "stop buying storage" from EMC or NetApp, which charge a premium for tier-one products. Instead, Rod suggested that people should "think like a Web company" and buy only storage products based on commodity hardware to save money, and use SRM software to identify areas of poor storage utilization. IBM's TotalStorage Productivity Center software is often used to help with this analysis.
His other suggestions were to adopt thin provisioning, data deduplication, and virtualization. The discussion at my table started with someone asking, "How do we adopt those functions without buying new storage capacity with those features already built-in?" I explained that IBM's SAN Volume Controller (SVC), N series gateways, and TS7650G ProtecTIER virtual tape gateway can all provide one or more of these features to your existing disk storage capacity.
IBM and HP are leaders in blade servers
In the session "Future of Server and OS: Disappearing Boundaries", the audience confirmed by electronic survey that IBM and HP are the leaders in blade servers, although blades represent only 8-10 percent of the overall server market.
Interestingly, 22 percent of the audience has deployed both x86 and non-x86 (POWER, SPARC, etc.) blade servers. The presenters considered this an interesting insight.
Another survey of the audience found that 3 percent considered Sun/STK as their primary storage vendor. One of the presenters was delighted that Sun is still hanging in there.
IBM Business Partners deliver the best of IBM and mask the worst
Elaine Lennox, IBM VP, and Mark Wyllie, CEO of Flagship Solutions Group, Inc. presented IBM-sponsored back-to-back sessions. Elaine presented IBM's vision, the New Enterprise Data Center, and the challenges that demand a smarter planet.
Mark focused on his company's experience working with IBM through Innovation Workshops. These are assessments that can help you identify where you are now and where you want to be, and then build action plans to address the gaps.
Cats and Dogs, Oil and Water, Microsoft Windows and Mission-critical applications, what do all of these have in common?
NEC Corporation of America sponsored some sessions on some x86-based solutions they have to offer. The first part, titled "Rats Nests, Snow Drifts and Trailers", focused on unified storage, and the second part, presented by Michael Nixon, focused on how to bring Microsoft Windows servers into the data center for mission-critical applications.
The Economy might be slowing, but storage is still growing
Two analysts co-presented "The Enterprise Storage Scenario". Unlike computing capacity, there is no on/off switch for storage, not from applications nor from end-users. The cost of power for storage is expected to triple by 2013. Virtual servers, including VMware and Microsoft's Hyper-V, will drive the need for shared external disk storage. A survey of the audience found 20 percent were expecting to purchase additional storage capacity in 4Q08.
When someone reaches age 52, they expect to coast the rest of their career
At dinner with analysts, the discussion of financial meltdown and bailouts is unavoidable, including everyone's views about the proposed bailout of the Big 3 automakers. I can't defend Ford, GM and Chrysler paying their people $70 per hour, when their US counterparts at Toyota or Honda are only paid $45 to $50 per hour.
However, I have a close friend who retired after 20 years working for the fire department, and a cousin who retired after 20 years serving in the Navy (the US Navy, not the Bolivian Navy), and both are still in their forties. A long time ago, IT professionals retired after 30 years, in some cases with 50 to 60 percent of their base pay as their pension for the rest of their lives. A 52-year-old who has worked 30 years might expect to enjoy the rest of his old age playing golf and pursuing other hobbies. This is not "coasting", it is called "retirement". The few of my colleagues that I have seen who worked 35 to 40 years did so because they enjoyed the challenge of work at IBM. They enjoyed solving tough engineering problems and helping customers. As long as they were having fun on the job, IBM was glad to keep their wealth of experience on board and actively engaged.
Unfortunately, many people rely on their own investments in the stock market for retirement, rather than company pensions. With the current financial crisis, I suspect many people my age are reconsidering their previous retirement plans.
We're going to need more trains!
I took the monorail back to my hotel. The ride includes funny announcements and statistics, including this gem:
"Since 1940, Las Vegas has doubled in population every ten years, which means that by the year 2230, we will have over 1 trillion people calling Las Vegas home. We're going to need more trains!"
That wraps up Tuesday, Day 2 of my attendance here! Now for some sleep.
Lagasse, Inc. sells janitorial supplies, such as mops, cleaning chemicals, waste receptacles, and garbage can liners. Of the 1000 employees of Lagasse nationwide, about 200 associates were located in New Orleans at their main Headquarters, primary customer care center, and primary IT computing center.
Amazingly, Lagasse did not have a formally documented BCP (Business Continuity Plan) but more of a BCI (Business Continuity Idea). They chose to take a ["donut tire"] approach, putting older previous-generation equipment at their DR site. They knew that in the event of a disaster, they would not be processing as many transactions per second. That was a business trade-off they could accept.
They evaluated all the different threat scenarios for impact and likelihood, and focused on hurricanes and floods. They had experienced previous hurricanes, learning from each, with the most recent being 2004 Hurricane Ivan and 2005 Hurricane Dennis. From this, they were able to categorize three levels of DR recovery:
Tier 1 - The most mission-critical, which for them related to picking, packing and shipping products.
Tier 2 - The next most important, focused on maintaining good customer service
Tier 3 - Everything else, including reporting and administrative functions
The time-line of events went as follows:
The US Government issues warning that a hurricane may hit New Orleans
August 27 - 7pm
Lagasse declares a disaster, starts recovery procedures to an existing IT facility in Chicago, owned by their parent company. A temporary "Southeast" Headquarters was set up in Atlanta. Remote call centers were identified in Dallas, Atlanta, San Antonio, and Miami.
August 28 - just after midnight
In just five hours, they recovered their "Tier 1" applications.
August 28 - 7:30pm
In just over 24 hours, they recovered their "Tier 2" applications.
August 29 - 6am
The Hurricane hits land. With 73 levees breached, the city of New Orleans was flooded.
The following week
Lagasse was fully operational, and recorded their second and third best sales days ever.
I was quite impressed with their company's policy for how they treat their employees during a disaster. At many companies, people prioritize their families, not their jobs, during a disaster. If any associate was asked to work during a disaster, the company would take care of:
The safety of their family
The safety of their pets. (In the weeks following this hurricane, I sponsored people in Tucson to go to New Orleans to attend to lost and stray dogs and cats, many of which were left behind when rescuers picked up people from their rooftops.)
Any emergency repairs to secure the home they leave behind
Marshall felt that if you don't know the names of the spouse and kids of your key employees, you are not emotionally-invested enough to be successful during a disaster.
For communications, cell phones were useless. They could call out on them, but anyone with a cell phone with 504 area code had difficulty receiving calls, as the calls had to be processed through New Orleans. Instead, they used Voice over IP (VoIP) to redirect calls to whichever remote call center each associate went to. Laptops, Citrix, VPN and email were considered powerful tools during this process. They did not have Instant Messaging (IM) at the time.
While the disk and tapes needed to recover Tiers 1 and 2 were already in Chicago, the tapes for Tier 3 were stored locally by a third-party provider. When Lagasse asked for their DR tapes back, the third-party refused, based on their [force majeure] clause. Force majeure is a common clause in many business contracts to free parties from liability during major disasters. Marshall advised everyone to strike any "force majeure" clauses from any future third-party DR protection contracts.
Hurricane Katrina hit the US hard, killing over 1400 people, and America still has not fully recovered. The recovery of the city of New Orleans has been slow. Massive relocations have caused a deficit of talent in the area, not just IT talent, but also in the areas of medicine, education and other professions. The result has been degraded social services, encouraging others to relocate as well. Some have called it the "liberation effect", a major event that causes people to move to a new location or take on a new career in a different field.
On a personal note, I was in New Orleans for a conference the week prior to landfall, and helped clients with their recoveries the weeks after. For more on how IBM Business Continuity Recovery Services (BCRS) helped clients during Hurricane Katrina, see the following [media coverage].
A [recent survey] conducted by Fleishman-Hillard Research indicates that the majority of disk-only customers are now looking at adding tape back into their infrastructure. Here are some excerpts:
"Over two thirds of surveyed businesses said they were looking to add tape storage back into their overall network infrastructure and of those respondents, over 80-percent plan to add tape storage solutions within the next 12 months. The survey, which was taken in the fourth quarter of 2007, focused on the views of more than 200 network administrators and mid-level tech specialists at mid-size to large companies throughout the United States.
The integration of tape storage into a tiered information infrastructure is highly strategic for customers, due to its low cost of ownership, low energy consumption and portability for data protection, said Cindy Grossman, Vice President of Tape Storage Systems, IBM. LTO tape technology is a perfect choice for enterprise and mid-sized customer with its proven reliability, high capacity, high performance and ability to address data security with built-in encryption and data retention requirements for the evolving data center.
According to the survey, 58-percent of the respondents use a combination of disk and tape for long term archiving, 24-percent use tape exclusively, and 18-percent employ a disk-only approach. In this group, 68-percent of the current disk-only users plan to start using tape for long-term archiving, and over half (58-percent) plan to add tape for short-term data protection. The survey findings suggest that disk-only users may be experiencing a bit of buyer's remorse, said David Geddes, senior vice president at Fleishman-Hillard Research, who oversaw the study. We found that a wide majority of companies that employ purely disk-based approaches are looking to quickly include tape in their backup and archiving strategies."
While disk provides online data access and availability, tape provides additional data protection and security, lower total cost of ownership (TCO), lower energy consumption (Tape is more "green"), and can be an important part of a long term data retention and compliance strategy.
Disk is more costly and more energy hungry, and some data, although it must be retained, may seldom, if ever, be looked at, so why keep it spinning?
Speaking of TCO, a recent 5-year TCO analysis by the Clipper Group titled [“Disk and Tape Square Off Again”] compared storing 2.4PB of data long term on SATA disk and on an LTO tape library. The disk system was 23 times more costly and used 290 times as much energy as tape. Even with a data dedupe system like IBM System Storage N series, disk was still 5 times more costly than the tape system.
The Linear Tape Open (LTO) consortium --consisting of IBM, Hewlett-Packard (HP) and Quantum-- just released its "LTO-5" plans. With 2:1 compression, you will be able to pack up to 3TB of data on a single tape cartridge. And while the dollar-per-GB decline for disk is slowing down to 25-30 percent per year, tape continues to decline at a healthy 40 percent rate, so the price gap between disk and tape will actually widen even further over the next few years.
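To see why the gap widens, compare the yearly multipliers implied by the decline rates quoted above. The sketch below is only arithmetic on those percentages. The function name and the three-year horizon are assumptions chosen for illustration, not figures from the LTO consortium.

```python
def gap_growth(disk_decline, tape_decline, years):
    """Factor by which the disk-to-tape price ratio grows over `years`.

    Each year disk keeps (1 - disk_decline) of its price per GB and tape
    keeps (1 - tape_decline), so the ratio is multiplied by their quotient.
    """
    yearly_factor = (1 - disk_decline) / (1 - tape_decline)
    return yearly_factor ** years


# Decline rates quoted in the post: disk 25-30 percent, tape 40 percent.
for disk_decline in (0.25, 0.30):
    growth = gap_growth(disk_decline, 0.40, years=3)
    print(f"disk declining {disk_decline:.0%}/yr: ratio grows {growth:.2f}x in 3 years")
```

Under these rates, whatever the disk-to-tape price ratio is today, it would be roughly 1.6 to 1.95 times larger three years out.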
I am back at "the Office" for a single day today. This happens often enough I need a name for it. Air Force pilots that practice landing and take-offs call them "Touch and Go", but I think I need something better. If you can think of a better phrase, let me know.
This week, I was in Hartford, CT, Somers, NY and our Corporate Headquarters in Armonk, in a variety of meetings, some with editors of magazines, others with IBMers I have only spoken to over the phone and finally got a chance to meet face to face.
I got back to Tucson last night, had meetings this morning in Second Life, then presented "Information Lifecycle Management" in Spanish to a group of customers from Mexico, Chile, and Brazil. We have a great Tucson Executive Briefing Center, and plenty of foreign-language speakers to draw from our local employees here at the lab site.
Sunday, I leave for Las Vegas for our upcoming IBM Storage and Storage Networking Symposium. We will cover the latest in our disk, tape, storage networking and related software. Do you have your tickets? If you plan to attend, and want to meet up with me, let me know.
It's Tuesday, and you know what that means-- IBM makes its announcements.
Today, IBM announced a variety of storage offerings, but I am going to focus this post on just the new DR550 models. The DR550 is the leading disk-and-tape solution for storing non-erasable, non-rewriteable (NENR) data. This type of data, often called fixed-content or compliance data, was previously written to Write-Once-Read-Many (WORM) optical media. However, optical technology has not advanced as fast as magnetic recording, so disk and tape have taken over this role. While there are still a few laws on the books that mandate "optical media" as the storage solution, new laws like SEC 17a-4 and Sarbanes-Oxley (SOX) allow for NENR solutions based on magnetic disk or tape instead.
As we had done for the IBM SAN Volume Controller (SVC), the DR550 was based on "off the shelf" components. The File System Gateway (FSG) was based on a System x server, and the DR550 hardware on a System p server and DS4000 disk arrays, with "hardened" versions of AIX, DS4000 Storage Manager and IBM Tivoli Storage Manager (TSM), the last of which we renamed the IBM System Storage Archive Manager (SSAM).
The DR550 is Ethernet-based, so it can be used with all IBM server platforms, from System x and BladeCenter, to System i, and System p, and even System z mainframe customers, as well as non-IBM platforms from Sun, HP and others. There are two ways to get data stored onto the DR550:
Sending archive objects via the SSAM archive API. This is an API based on the XBSA open standard that many applications have coded to.
Writing files via standard CIFS and NFS protocols through the File System Gateway (FSG), an optional priced feature that you can have incorporated into the DR550.
Generally, business applications like SAP or Microsoft Exchange don't do this directly, but rather you have an "archive management application" that acts as the go-between broker. IBM offers IBM Content Manager, IBM CommonStore for eMail (Exchange and Lotus Domino), and IBM CommonStore for SAP. IBM also recently acquired FileNet and Princeton Softech that provide additional support. Third party products like Zantaz and Symantec KVS Enterprise Vault have also passed System Storage Proven certification for the DR550. These go-between applications understand the underlying storage structure of their respective applications, and can apply policies to extract database rows, individual emails, or other attachments, as appropriate, and either move or copy them into the DR550.
The DR550 has built in support to move data from disk to tape, through policy-based automation behind the scenes. This is the key differentiator from disk-only solutions. Rather than filling up an EMC Centera, and watching it sit there idle burning energy for five to seven years, or however long you are required to keep the data, you can instead use the disk for the most recent months' worth of data on a DR550. The DR550 attaches to tape drives or libraries, not just IBM TS1120 or LTO based models, but hundreds of systems from other vendors as well. You can combine this with either rewriteable or WORM tape cartridge media, depending on your circumstances. This can be directly cabled, or through a SAN fabric environment. Storing the bulk of this rarely-referenced data on tape makes the DR550 substantially more affordable and more green than disk-only alternatives.
Let's take a look at the specific models:
IBM System Storage DR550 DR1
The DR1 machine-type-model replaces the "DR550 Express" for small and medium size business workloads. This is a single System p server with anywhere from 1 to 36 TB of raw disk capacity in a nice lockable 25U cabinet. On the original DR550 Express, the 25U cabinet was optional, but so many people opted for it that we made it a standard feature. You can add the File System Gateway, which is a System x running Linux with NFS and CIFS protocols converted to SSAM API calls.
IBM System Storage DR550 DR2
The DR2 machine-type-model replaces the larger "DR550" for enterprise workloads. This can be either a single or dual node System p configuration, anywhere from 6 to 168 TB in raw disk capacity, in a lockable 36U cabinet. This also allows for an optional File System Gateway, and in the case of the dual node configuration, you can have two System p servers, and two System x servers with two Ethernet and two SAN switches for complete redundancy.
Common Information Model (CIM) and SMI-S interfaces have been added so that IBM Director can provide a "single pane of glass" to manage all of the components of the DR550.
The system is based on high-capacity 750GB SATA drives, installed in half-drawer (eight drives, 6 TB) and full-drawer (16 drives, 12 TB) increments. Your choices will be 7+P RAID5 or 6+P+Q RAID6. Here is an Intel article that explains [RAID6 P+Q]. In the future, as new disk technologies are introduced, the DR550 supports moving the disk data from old to new seamlessly, without disrupting the data retention policies enforcement.
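The raw figures above are before RAID parity is taken out. As a rough sketch, assuming one RAID array per eight-drive half-drawer and ignoring formatting or spare overhead (both are my simplifications, not published DR550 specifications), the usable capacity of each layout works out as follows:

```python
DRIVE_TB = 0.75  # one 750GB drive, in decimal terabytes as the post uses them


def array_capacity(data_drives, parity_drives):
    """Return (raw_tb, usable_tb) for a single RAID array.

    Simplified model: usable space is just the data drives; formatting
    and hot-spare overhead are ignored.
    """
    raw = (data_drives + parity_drives) * DRIVE_TB
    usable = data_drives * DRIVE_TB
    return raw, usable


raid5 = array_capacity(data_drives=7, parity_drives=1)  # 7+P
raid6 = array_capacity(data_drives=6, parity_drives=2)  # 6+P+Q
print("RAID5 7+P   raw/usable TB:", raid5)
print("RAID6 6+P+Q raw/usable TB:", raid6)
```

In this model, choosing RAID6 over RAID5 gives up 0.75 TB of usable space per eight-drive array in exchange for surviving a second drive failure.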
For more information, here is a [6-page brochure] that has specifications for both the DR1 and DR2 models.
These disk capacities can have up to 25 times their effective capacity with IBM's HyperFactor in-line deduplication capability. So the smallest 7TB model could be as effective as 175TB of traditional disk storage.
IBM Tivoli Storage Manager (TSM) v6
After years and years in development, IBM announces [TSM v6]. Here's a quick summary of the key features:
DB2 instead of an internal database
For years, people have complained that IBM used its own internal relational database. This was because when TSM was first launched back in 1993, DB2 did not have all the features that TSM needed on all of the various server platforms. Today, DB2 is the leading relational database on all the key platforms that TSM server runs on, and therefore good enough for use within Tivoli Storage Manager. If you don't already have DB2, it is included for use with TSM v6.1 at no additional charge. Do you have to become a DB2 expert to use TSM? No! The TSM administration commands have been updated to hide all the complexity of DB2 away, behind the scenes. You now just use TSM commands to administer the database, as you did before. IBM will provide conversion utilities to help existing TSM customers migrate to this new database environment.
Better Operational Reporting
Another big complaint was that TSM had fixed reporting, and administrators that wanted customized reports often had to resort to purchasing third party products. With the change over to DB2, TSM now enables you to create your own reports using Eclipse's Business Intelligence and Reporting Tools [BIRT]! If you haven't used BIRT, you can download a free open source copy and start playing around with its capabilities. This is combined with a revamped GUI that provides a customizable dashboard using IBM's Integrated Solutions Console (ISC) infrastructure.
Lastly, IBM has incorporated deduplication capability within the TSM v6.1 software for its own disk storage pools. This is done in a post-process manner so as to dedupe all of your legacy backup data as well, not just the new stuff, without impacting the current TSM server performance.
At this point, you might be thinking "Wait, what about IBM TS7650 ProtecTIER deduplication?" which is really two questions.
Can I use TSM v6.1 with IBM TS7650 ProtecTIER?
Yes. However, since TSM's progressive incremental method is vastly more efficient than other backup products like Veritas NetBackup or EMC Legato NetWorker, the TS7650 may only get 10x reduction of TSM backups, versus up to 25x with full-backups-every-night backup schemes. TSM only dedupes its disk storage pools, so it won't dedupe data directed at tape systems like the TS7650 or other tape libraries. This avoids the "double dedupe" concern.
When should I use TSM's software version versus TS7650's hardware deduplication?
This is a positioning question. For now, the cut-over point is about 10TB per night backup processing. If you back up more than 10TB per night, TS7650 hardware may be the better approach. If you are a smaller customer nowhere near that volume of data, then using TSM v6.1 software deduplication may be a more cost-effective solution. If you start small, and grow beyond 10TB per night, it is easy to bring a TS7650 into an existing TSM environment and migrate the data over.
If you run TSM server on a logical partition (LPAR) or virtual guest OS under VMware ESX, Xen or Microsoft's Hyper-V environment, why should you have to license it for the whole box? With TSM v6.1, you now can pay for only the number of processors you use, down to a single core even. If you currently run TSM v5 on z/OS, you can migrate over to TSM v6.1 server for Linux on System z to take advantage of cost savings using IFL engines.
IBM Tivoli Key Lifecycle Manager (TKLM) v1.0
Don't let the "v1.0" scare you. This is the successor to IBM's Encryption Key Manager (EKM), which thousands of clients use today with IBM encrypting tape drives. The new TKLM adds support for full disk encryption (FDE) drives--like those for the DS8000 I mentioned in [yesterday's post]--as well as new features to support key rotation for compliance and business controls.
IBM Tivoli Storage Productivity Center
Last, but not least, we have IBM Tivoli Storage Productivity Center [TSPC]. No, that is not a typo. IBM is renaming IBM TotalStorage Productivity Center to Tivoli Storage Productivity Center to avoid trademark conflicts with the [Professional Golfer's Association].
This is not just a renaming of an existing product. Here are some key improvements:
TSPC brings back together Productivity Center Standard Edition (Disk, Tape, SAN and Data) with Productivity Center for Replication, which were separated at birth a few years ago.
TSPC adds support for IBM's Storage Enterprise Resource Planner [SERP] from the NovusCG acquisition.
End-to-end view for EMC storage devices connected to supported servers via EMC Powerpath multipathing driver. As customers switch away from EMC Control Center over to IBM's Productivity Center, IBM can continue to provide support for existing EMC gear.
Of course, IBM will still offer IBM System Storage Productivity Center [SSPC] which is a piece of hardware pre-installed with Productivity Center software.
Hopefully, you can now see why I had to split up all these announcements into separate posts across multiple days!
Continuing my catch-up on past posts, Jon Toigo on his DrunkenData blog, posted a ["bleg"] for information about deduplication. The responses come from the "who's who" of the storage industry, so I will provide IBM's view. (Jon, as always, you have my permission to post this on your blog!)
Please provide the name of your company and the de-dupe product(s) you sell. Please summarize what you think are the key values and differentiators of your wares.
IBM offers two different forms of deduplication. The first is IBM System Storage N series disk system with Advanced Single Instance Storage (A-SIS), and the second is IBM Diligent ProtecTier software. Larry Freeman from NetApp already explains A-SIS in the [comments on Jon's post], so I will focus on the Diligent offering in this post. The key differentiators for Diligent are:
Data agnostic. Diligent does not require content-awareness, format-awareness nor identification of backup software used to send the data. No special client or agent software is required on servers sending data to an IBM Diligent deployment.
Inline processing. Diligent does not require temporarily storing data on back-end disk to post-process later.
Scalability. Up to 1PB of back-end disk managed with an in-memory dictionary.
Data Integrity. All data is diff-compared for full 100 percent integrity. No data is accidentally discarded based on assumptions about the rarity of hash collisions.
InfoPro has said that de-dupe is the number one technology that companies are seeking today — well ahead of even server or storage virtualization. Is there any appeal beyond squeezing more undifferentiated data into the storage junk drawer?
Diligent is focused on backup workloads, which have the best opportunity for deduplication benefits. The two main benefits are:
Keeping more backup data available online for fast recovery.
Mirroring the backup data to another remote location for added protection. With inline processing, only the deduplicated data is sent to the back-end disk, and this greatly reduces the amount of data sent over the wire to the remote location.
Every vendor seems to have its own secret sauce de-dupe algorithm and implementation. One, Diligent Technologies (just acquired by IBM), claims that their’s is best because it collapses two functions — de-dupe then ingest — into one inline function, achieving great throughput in the process. What should be the gating factors in selecting the right de-dupe technology?
As with any storage offering, the three gating factors are typically:
Will this meet my current business requirements?
Will this meet my future requirements for the next 3-5 years that I plan to use this solution?
What is the Total Cost of Ownership (TCO) for the next 3-5 years?
Assuming you already have backup software operational in your existing environment, it is possible to determine the necessary ingest rate: how many "Terabytes per Hour" (TB/h) must be received, processed and stored from the backup software during the backup window. IBM intends to document its performance test results of specific software/hardware combinations to provide guidance to clients' purchase and planning decisions.
For post-process deployments, such as the IBM N series A-SIS feature, the "ingest rate" during the backup only has to receive and store the data, and the rest of the 24-hour period can be spent doing the post-processing to find duplicates. This might be fine now, but as your data grows, you might find your backup window growing, and that leaves less time for post-processing to catch up. IBM Diligent does the processing inline, so is unaffected by an expansion of the backup window.
IBM Diligent can scale up to 1PB of back-end data, and the ingest rate does not suffer as more data is managed.
As for TCO, post-process solutions must have additional back-end storage to temporarily hold the data until the duplicates can be found. With IBM Diligent's inline methodology, only deduplicated data is stored, so less disk space is required for the same workloads.
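The three gating factors above reduce to simple arithmetic once you know your nightly backup volume and window. Here is a minimal sketch under stated assumptions: the 12TB nightly volume and the 8 and 12 hour windows are hypothetical, and the staging model (a post-process system must land one full night of un-deduplicated data) is a simplification of my own, not a sizing rule from IBM.

```python
def required_ingest_rate(nightly_tb, window_hours):
    """TB per hour the target must absorb to finish inside the window."""
    return nightly_tb / window_hours


def post_process_hours(window_hours):
    """Hours left in the day for post-process deduplication to catch up."""
    return 24 - window_hours


def staging_tb(nightly_tb, inline):
    """Extra landing space needed before duplicates are removed.

    Simplified model: inline needs none, post-process holds one full night.
    """
    return 0 if inline else nightly_tb


nightly_tb = 12  # hypothetical nightly backup volume
for window in (8, 12):  # the window grows as data grows
    print(
        f"window {window}h: ingest {required_ingest_rate(nightly_tb, window):.1f} TB/h, "
        f"post-process time left {post_process_hours(window)}h, "
        f"staging post-process {staging_tb(nightly_tb, inline=False)} TB "
        f"vs inline {staging_tb(nightly_tb, inline=True)} TB"
    )
```

The point of the comparison is the middle column: as the backup window stretches, the time left for post-processing shrinks by the same amount, while an inline approach has no catch-up phase to squeeze.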
Despite the nuances, it seems that all block level de-dupe technology does the same thing: removes bit string patterns and substitutes a stub. Is this technically accurate or does your product do things differently?
IBM Diligent emulates a tape library, so the incoming data appears as files to be written sequentially to tape. A file is a string of bytes. Unlike block-level algorithms that divide files up into fixed chunks, IBM Diligent performs diff-compares of incoming data with existing data, and identifies ranges of bytes that duplicate what already is stored on the back-end disk. The file is then a sequence of "extents" representing either unique data or existing data. The file is represented as a sequence of pointers to these extents. An extent can vary from 2KB to 16MB in size.
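To make the "sequence of pointers to extents" idea concrete, here is a toy sketch. It is not IBM Diligent's algorithm: it uses Python's standard difflib.SequenceMatcher as a stand-in for the diff-compare step, keeps the whole store in memory, and uses an 8-byte minimum extent purely so the example is small. The names ingest, rebuild and MIN_EXTENT are hypothetical.

```python
from difflib import SequenceMatcher

MIN_EXTENT = 8  # toy threshold; the post cites real extents of 2KB to 16MB


def ingest(store: bytearray, data: bytes):
    """Store `data`, returning (offset, length) pointers into `store`.

    Byte ranges already present in the store are referenced in place;
    only the unique ranges are appended.
    """
    matcher = SequenceMatcher(None, bytes(store), data, autojunk=False)
    pointers = []
    pos = 0  # next byte of `data` not yet covered by a pointer
    for match in matcher.get_matching_blocks():
        if match.size < MIN_EXTENT:
            continue  # too small to be worth a pointer (also skips the sentinel)
        if match.b > pos:
            # Unique bytes before this match: append them as a new extent.
            pointers.append((len(store), match.b - pos))
            store.extend(data[pos:match.b])
        pointers.append((match.a, match.size))  # existing extent, no new bytes
        pos = match.b + match.size
    if pos < len(data):
        pointers.append((len(store), len(data) - pos))
        store.extend(data[pos:])
    return pointers


def rebuild(store: bytearray, pointers):
    """Reassemble the original bytes by following the extent pointers."""
    return b"".join(bytes(store[off:off + length]) for off, length in pointers)


store = bytearray()
first = bytes(range(256))
second = first[:100] + b"NEW-DATA-HERE" + first[100:]  # same file, small insert
first_ptrs = ingest(store, first)
second_ptrs = ingest(store, second)
print(second_ptrs, len(store))
```

The second file is 269 bytes, but ingesting it adds only the 13 new bytes to the store. The rest is represented by two pointers into data that was already there, and rebuild returns both files byte for byte.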
De-dupe is changing data. To return data to its original state (pre-de-dupe) seems to require access to the original algorithm plus stubs/pointers to bit patterns that have been removed to deflate data. If I am correct in this assumption, please explain how data recovery is accomplished if there is a disaster. Do I need to backup your wares and store them off site, or do I need another copy of your appliance or software at a recovery center?
For IBM Diligent, all of the data needed to reconstitute the data is stored on back-end disks. Assuming that all of your back-end disks are available after the disaster, either the original or mirrored copy, then you only need the IBM Diligent software to make sense of the bytes written to reconstitute the data. If the data was written by backup software, you would also need compatible backup software to recover the original data.
De-dupe changes data. Is there any possibility that this will get me into trouble with the regulators or legal eagles when I respond to a subpoena or discovery request? Does de-dupe conflict with the non-repudiation requirements of certain laws?
I am not a lawyer, and certainly there are aspects of[non-repudiation] that may or may not apply to specific cases.
What I can say is that storage is expected to return back a "bit-perfect" copy of the data that was written. There are laws against changing the format. For example, suppose an original document was in Microsoft Word format, but is converted and saved instead as an Adobe PDF file. In many conversions, it would be difficult to recreate the bit-perfect copy. Certainly, it would be difficult to recreate the bit-perfect MS Word format from a PDF file. Laws in France and Germany specifically require that the original bit-perfect format be kept.
Based on that, IBM Diligent is able to return a bit-perfect copy of what was written, same as if it were written to regular disk or tape storage, because all data is diff-compared byte-for-byte with existing data.
In contrast, other solutions based on hash codes have collisions that result in presenting a completely different set of data on retrieval. If the data you are trying to store happens to have the same hash code calculation as completely different data already stored on a solution, then it might just discard the new data as "duplicate". The chance for collisions might be rare, but could be enough to put doubt in the minds of a jury. For this reason, IBM N series A-SIS, that does perform hash code calculations, will do a full byte-for-byte comparison of data to ensure that data is indeed a duplicate of an existing block stored.
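The difference between "trust the hash" and "hash, then verify" is easy to show in a few lines. The sketch below is a generic illustration, not the A-SIS or Diligent implementation. The class name and the deliberately weak hash function (block length) are assumptions, chosen so that a collision is guaranteed to occur in the demo.

```python
import hashlib


class VerifyingDedupeStore:
    """Block store that treats a hash match as a hint, never as proof."""

    def __init__(self, hash_fn=None):
        self._hash = hash_fn or (lambda block: hashlib.sha256(block).digest())
        self._buckets = {}  # hash value -> ids of blocks with that hash
        self._blocks = []   # block id -> stored bytes

    def put(self, block: bytes) -> int:
        key = self._hash(block)
        for block_id in self._buckets.get(key, []):
            # Byte-for-byte comparison: only a true duplicate is shared.
            if self._blocks[block_id] == block:
                return block_id
        # New data, or a hash collision with different data: keep it.
        self._blocks.append(block)
        new_id = len(self._blocks) - 1
        self._buckets.setdefault(key, []).append(new_id)
        return new_id

    def get(self, block_id: int) -> bytes:
        return self._blocks[block_id]


# Weak hash on purpose: every 4-byte block collides with every other.
store = VerifyingDedupeStore(hash_fn=len)
a = store.put(b"AAAA")
b = store.put(b"BBBB")        # same hash as a, different contents
a_again = store.put(b"AAAA")  # genuine duplicate
print(a, b, a_again, store.get(b))
```

A store that skipped the comparison would have returned the id of "AAAA" for "BBBB" and silently discarded the new data. With the check in place, the colliding block gets its own id and only the genuine duplicate is shared.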
Some say that de-dupe obviates the need for encryption. What do you think?
I disagree. I've been to enough [Black Hat] conferences to know that it would be possible to read the data off the back-end disk, using a variety of forensic tools, and piece together strings of personal information, such as names, social security numbers, or bank account codes.
Currently, IBM provides encryption on real tape (both TS1120 and LTO-4 generation drives), and is working with open industry standards bodies and disk drive module suppliers to bring similar technology to disk-based storage systems. Until then, clients concerned about encryption should consider OS-based or application-based encryption from the backup software. IBM Tivoli Storage Manager (TSM), for example, can encrypt the data before sending it to the IBM Diligent offering, but this might reduce the number of duplicates found if different encryption keys are used.
Some say that de-duped data is inappropriate for tape backup, that data should be re-inflated prior to write to tape. Yet, one vendor is planning to enable an “NDMP-like” tape backup around his de-dupe system at the request of his customers. Is this smart?
Re-constituting the data back to the original format on tape allows the original backup software to interpret the tape data directly to recover individual files. For example, IBM TSM software can write its primary backup copies to an IBM Diligent offering onsite, and have a "copy pool" on physical tape stored at a remote location. The physical tapes can be used for recovery without any IBM Diligent software in the event of a disaster. If the IBM Diligent back-end disk images are lost, corrupted, or destroyed, IBM TSM software can point to the "copy pool" and be fully operational. Individual files or servers could be restored from just a few of these tapes.
An NDMP-like tape backup of a deduplicated back-end disk would require that all the tapes are intact, available, and fully restored to new back-end disk before the deduplication software could do anything. If a single cartridge from this set was unreadable or misplaced, it might impact the access to many TBs of data, or render the entire system unusable.
In the case of 1PB of back-end disk for IBM Diligent, you would have to recover over a thousand tapes back to disk before you could recover any individual data from your backup software. Even with dozens of tape drives in parallel, it could take you several days for the complete process. This represents a longer "Recovery Time Objective" (RTO) than most people are willing to accept.
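A back-of-the-envelope check supports both numbers. The sketch below assumes LTO-4 cartridges at their native 800GB capacity and 120 MB/sec native drive speed, 24 drives running in parallel, and no additional compression on the already-deduplicated data. It also ignores mount, seek and robot time, so real restores would take longer. The drive count and function name are hypothetical.

```python
import math


def restore_estimate(total_bytes, cartridge_bytes, drives, drive_bytes_per_sec):
    """Return (cartridges needed, days to stream everything back to disk)."""
    cartridges = math.ceil(total_bytes / cartridge_bytes)
    seconds = total_bytes / (drives * drive_bytes_per_sec)
    return cartridges, seconds / 86_400


PB = 10**15
cartridges, days = restore_estimate(
    total_bytes=1 * PB,
    cartridge_bytes=800 * 10**9,        # LTO-4 native capacity
    drives=24,                          # "dozens of tape drives" (assumed count)
    drive_bytes_per_sec=120 * 10**6,    # LTO-4 native transfer rate
)
print(f"{cartridges} cartridges, about {days:.1f} days of pure streaming")
```

That works out to 1250 cartridges and roughly four days of continuous streaming, before counting any handling time.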
Some vendors are claiming de-dupe is “green” — do you see it as such?
Certainly, "deduplicated disk" is greener than "non-deduplicated" disk, but I have argued in past posts, supported by Analyst reports, that it is not as green as storing the same data on "non-deduplicated" physical tape.
De-dupe and VTL seem to be joined at the hip in a lot of vendor discussions: Use de-dupe to store a lot of archival data on line in less space for fast retrieval in the event of the accidental loss of files or data sets on primary storage. Are there other applications for de-duplication besides compressing data in a nearline storage repository?
Deduplication can be applied to primary data, as in the case of the IBM System Storage N series A-SIS. As Larry suggests, MS Exchange and SharePoint could be good use cases that represent the possible savings for squeezing out duplicates. On the mainframe, many master-in/master-out tape applications could also benefit from deduplication.
I do not believe that deduplication products will run efficiently with “update in place” applications, that is high levels of random writes for non-appending updates. OLTP and Database workloads would not benefit from deduplication.
Just suggested by a reader: What do you see as the advantages/disadvantages of software based deduplication vs. hardware (chip-based) deduplication? Will this be a differentiating feature in the future… especially now that Hifn is pushing their Compression/DeDupe card to OEMs?
In general, new technologies are introduced on software first, and then as implementations mature, get hardware-based to improve performance. The same was true for RAID, compression, encryption, etc. The Hifn card does "hash code" calculations that do not benefit the current IBM Diligent implementation. Currently, IBM Diligent performs LZH compression through software, but certainly IBM could provide hardware-based compression with an integrated hardware/software offering in the future. Since IBM Diligent's inline process is so efficient, the bottleneck in performance is often the speed of the back-end disk. IBM Diligent can get improved "ingest rate" using FC instead of SATA disk.
Sorry, Jon, that it took so long to get back to you on this, but since IBM had just acquired Diligent when you posted, it took me a while to investigate and research all the answers.
The technology industry is full of trade-offs. Take for example solar cells that convert sunlight to electricity. Every hour, more energy hits the Earth in the form of sunlight than the entire planet consumes in an entire year. The general trade-off is between energy conversion efficiency versus abundance of materials:
Get 9-11 percent efficiency using rare materials like indium (In), gallium (Ga) or cadmium (Cd).
Get only 6.7 percent efficiency using abundant materials like copper (Cu), tin (Sn), zinc (Zn), sulfur (S), and selenium (Se)
A second trade-off is exemplified by EMC's recent GeoProtect announcement. This appears similar to the geographic dispersal method introduced by a company called [CleverSafe]. The trade-off is between the amount of space to store one or more copies of data and the protection of data in the event of disaster. Here's an excerpt from fellow blogger Chuck Hollis (EMC) titled ["Cloud Storage Evolves"]:
"Imagine a average-sized Atmos network of 9 nodes, all in different time zones around the world. And imagine that we were using, say, a 6+3 protection scheme.
The implication is clear: any 3 nodes could be completely lost: failed, destroyed, seized by the government, etc.
-- and the information could be completely recovered from the surviving nodes."
For organizations worried about their information falling into the wrong hands (whether criminal or government sponsored!), any subset of the nodes would yield nothing of value -- not only would the information be presumably encrypted, but only a few slices of a far bigger picture would be lost.
Seized by the government? Falling into the wrong hands? Is EMC positioning ATMOS as "Storage for Terrorists"? I can certainly appreciate the value of being able to protect 6PB of data with only 9PB of storage capacity, instead of keeping two copies of 6PB each. However, the trade-off means that you will be accessing the majority of your data across your intranet, which could impact performance. But, if you are in an illicit or illegal business that could have a third of your facilities "seized by the government", then perhaps you shouldn't house your data centers there in the first place. Having two copies of 6PB each, in two "friendly nations", might make more sense.
(In reality, companies often keep way more than just two copies of data. It is not unheard of for companies to keep three to five copies scattered across two or three locations. Facebook keeps SIX copies of photographs you upload to their website.)
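The capacity side of this trade-off is simple arithmetic. Here is a minimal Python sketch, assuming a dispersal scheme that splits data into equal-size data slices plus protection slices (the 6+3 from the excerpt above). The function names are my own, made up for illustration, and are not anything from EMC or CleverSafe.

```python
def raw_capacity_dispersed(usable_pb, data_slices, protection_slices):
    """Raw capacity needed when data is stored as k data + m protection slices."""
    return usable_pb * (data_slices + protection_slices) / data_slices


def raw_capacity_copies(usable_pb, copies):
    """Raw capacity needed when keeping full copies of the data."""
    return usable_pb * copies


usable_pb = 6
dispersed = raw_capacity_dispersed(usable_pb, data_slices=6, protection_slices=3)
two_copies = raw_capacity_copies(usable_pb, copies=2)

print(f"6+3 dispersal: {dispersed} PB raw for {usable_pb} PB of data")
print(f"Two copies:    {two_copies} PB raw for {usable_pb} PB of data")
```

The 6+3 scheme needs 9PB of raw capacity against 12PB for two full copies. The sketch says nothing about the network traffic needed to read the slices back, which is the other side of the trade-off.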
ChuckH argues that the governments that seize the three nodes won't have a complete copy of the data. However, merely having pieces of data is enough for governments to capture terrorists. Even if the striping is done at the smallest 512-byte block level, those 512 bytes of data might contain names, phone numbers, email addresses, credit cards or social security numbers. Hackers and computer forensics professionals take advantage of this.
You might ask yourself, "Why not just encrypt the data instead?" That brings me to the third trade-off, protection versus application performance. Over the past 30 years, companies had a choice: they could encrypt and decrypt the data as needed, using server CPU cycles, but this would slow down application processing. Every time you wanted to read or update a database record, more cycles would be consumed. This forced companies to be very selective on what data they encrypted, which columns or fields within a database, which email attachments, and other documents or spreadsheets.
An initial attempt to address this was to introduce an outboard appliance between the server and the storage device. For example, the server would write to the appliance with data in the clear, the appliance would encrypt the data, and pass it along to the tape drive. When retrieving data, the appliance would read the encrypted data from tape, decrypt it, and pass the data in the clear back to the server. However, this had the unintended consequence of using 2x to 3x more tape cartridges. Why? Because the encrypted data does not compress well, so tape drives with built-in compression capabilities would not be able to shrink down the data onto fewer tapes.
(I covered the importance of compressing data before encryption in my previous blog post
[Sock Sock Shoe Shoe].)
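To see why the order matters, here is a small Python sketch. It is only an illustration under loud assumptions: `toy_encrypt` is a hypothetical stand-in that XORs the data with a random pad so the output looks like ciphertext. It is not AES and not what any tape drive or appliance actually runs. The compression is the standard `zlib` module.

```python
import os
import zlib


def toy_encrypt(data: bytes) -> bytes:
    """Stand-in for real encryption (NOT AES): XOR with a random one-time pad.

    The only property we need here is that the output looks random.
    """
    pad = os.urandom(len(data))
    return bytes(a ^ b for a, b in zip(data, pad))


# Highly repetitive sample data, like many database records.
record = b"name=John Smith;city=Tucson;state=AZ;" * 2000

compress_then_encrypt = toy_encrypt(zlib.compress(record))
encrypt_then_compress = zlib.compress(toy_encrypt(record))

print("original bytes:        ", len(record))
print("compress, then encrypt:", len(compress_then_encrypt))
print("encrypt, then compress:", len(encrypt_then_compress))
```

Compressing first shrinks the repetitive data to a small fraction of its size. Encrypting first leaves nothing for the compressor to find, so the result is about as large as the original.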
Like the trade-off between energy efficiency and abundant materials, IBM eliminated the trade-off by offering compression and encryption on the tape drive itself. This is standard 256-bit AES encryption implemented on a chip, able to process the data as it arrives at near line speed. So, instead of having to choose between protecting your data or running your applications with acceptable performance, you can now do both: encrypt all of your data without having to be selective. This approach has been extended over to disk drives, so that disk systems like the IBM System Storage DS8000 and DS5000 can support full-disk-encryption [FDE] drives.
There's some good discussion in the comments section over at Robin Harris' StorageMojo blog for his post [Building a 1.8 Exabyte Data Center]. To summarize, a student is working on a research archive and asked Robin Harris for his opinion. The archive will consist of 20-40 million files averaging 90 GB in size each, for a total of 1800 PB or 1.8 EB. By comparison, an IBM DS8300 with five frames tops out at 512TB, so it would take nearly 3600 of these to hold 1.8 EB. While this might seem like a ridiculous amount of data, I think the discussion is valid as our world is certainly headed in that direction.
IBM works with a lot of research firms, and the solution is to put most of this data on tape, with just enough disk for specific analysis. Robin mentions a configuration with Sun Fire 4540 disk systems (aka Thumper). Despite Sun Microsystems' recent [$1.7 Billion quarterly loss], I think even the experts at Sun would recommend a blended disk-and-tape solution for this situation.
Take for example IBM's Scale Out File Services [SoFS] which today handles 2-3 billion files in a single global file system, so 20-40 million would present no problem. SoFS supports a mix of disk and tape, with built-in movement, so that files that were referenced would automatically be moved to disk when needed, and moved back to tape when no longer required, based on policies set by the administrator. Depending on the analysis, you may only need 1 PB or less of disk to perform the work, which can easily be accomplished with a handful of disk systems, such as IBM DS8300 or IBM XIV, for example.
The rest would be on tape. Let's consider using the IBM TS3500 with [S24 High Density] frames. A single TS3500 tape library with fifteen of these HD frames could hold 45PB of data, assuming 3:1 compression on 1TB-size 3592 cartridges. You would need 40 (forty) of these libraries to get to the full 1800 PB required, and these could hold even more as higher capacity cartridges are developed. IBM has customers with over 40 tape libraries today (not all with these HD frames, of course), so this is well within the dimensions and scale that IBM is capable of handling.
(For LTO fans, fifteen S54 frames would hold 32PB of data, assuming 2:1 compression on 800GB-size LTO-4 cartridges, so you would need 57 libraries instead of 40 in the above example.)
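For those who want to check the arithmetic, here is a quick Python sketch using only the figures quoted above. The function name is my own, and I am assuming decimal units (1 PB = 1,000,000 GB) and the low end of the 20-40 million file range.

```python
import math


def libraries_needed(total_pb, capacity_per_library_pb):
    """Round up, since you cannot install a fraction of a tape library."""
    return math.ceil(total_pb / capacity_per_library_pb)


# 20 million files averaging 90 GB each, converted from GB to PB.
archive_pb = 20_000_000 * 90 / 1_000_000

print("Archive size in PB:  ", archive_pb)
print("3592 libraries (45PB):", libraries_needed(archive_pb, 45))
print("LTO-4 libraries (32PB):", libraries_needed(archive_pb, 32))
```

The 45PB and 32PB per-library figures are the compressed capacities stated above, so the counts are only as good as the 3:1 and 2:1 compression assumptions behind them.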
This blended disk-and-tape approach would drastically reduce the floorspace and electricity requirements when compared against all-disk configurations discussed in the post.
People are rediscovering tape in a whole new light. ComputerWorld recently came out with an 11-page Technology Brief titled [The Business Value of Tape Storage], sponsored by Dell. (Note: While Dell is a competitor to IBM for some aspects of their business, they OEM their tape storage systems from IBM, so in that respect, I can refer to them as a technology partner.) Here are some excerpts from the ComputerWorld brief:
For IT managers, the question is not whether to use tape, but where and how to best use tape as part of a comprehensive, tiered storage architecture. In the modern storage architecture, tape plays a role not only in data backup, but also in long-term archiving and compliance.
“Long-term archiving is the primary reason any company should use tape these days,” says Mike Karp, senior analyst at Enterprise Management Associates in Boulder, Colo. Companies are increasingly likely to use disk in conjunction with tape for backup, but for long-term archiving needs, tape remains unbeatable.
After factoring in acquisition costs of equipment and media, as well as electricity and data center floor space, Clipper Group found that the total cost of archiving solutions based on SATA disk, the least expensive disk, was up to 23 times more expensive than archiving solutions involving tape. Calculating energy costs for the competing approaches, the costs for disk jumped to 290 times that of tape.
“Tape is always the winner anywhere cost trumps anything else,” says Karp. No matter how the cost is figured, tape is less expensive.
Beyond IT familiarity with tape, analysts point to other reasons why organizations will likely keep tape in their IT storage infrastructures. Energy savings, for example, is the most recent reason to stick with tape. “The economics of tape are pretty compelling, especially when you figure in the cost of power,” Schulz says.
So, whether you are planning for an Exabyte-scale data center, or merely questioning the logic of a disk-for-everything storage approach, you might want to consider tape. It's "green" for the environment, and less expensive on your budget.
In case you haven't noticed, IBM System Storage makes most of their announcements on Tuesdays. IBM announced a lot today, so here is a quick run-down.
Cisco storage networking products
IBM continues to resell Cisco switches and directors, but now can offer these with a 1-year IBM warranty.
The entry-level Cisco 9124 offers 8 to 24 ports. For IBM BladeCenter, IBM now offers the Cisco 10-port and 20-port modules that slide into the back of the chassis, and are functionally equivalent to the 9124. The original BladeCenter came with a 16-port module with 14 internal ports but only 2 external ports, which severely hampered bandwidth connectivity to external storage. These new modules provide more external ports to relieve that constraint.
The midrange Cisco 9200 switches have two models, both with 16 fixed ports, with the option for a blade that can provide 12, 24 or 48 additional ports. The 9216A has 16 FCP ports, and the 9216i has 14 FCP ports and 2 GbE ports to act as a router, such as to connect to a remote location for business continuity using Metro Mirror or Global Mirror.
The enterprise-class Cisco 9500 directors can support up to 528 ports.
TS3400 Tape Library
The new TS3400 library is a small, entry-level library supporting the enterprise-class TS1120 drive, providing interoperability with the larger tape libraries, with all the support for tape encryption.
In addition to Linux, UNIX, and Windows, the TS1120 can now be connected to System i servers. In the past, the only IBM tape drives available to System i were the LTO models. There are a lot of businesses that need to comply with government regulations that are looking for tape encryption, and now IBM has made it accessible to more clients.
300GB drives at 15K RPM
The DS8000 can now support new drives with 300GB capacity at 15,000 RPM (15K). These can be up to 30 percent faster than the 10,000 RPM drives for typical workloads.
IBM continues its market leadership with this new set of features and offerings!
Today was the "First Ever Live Virtual Virtualization Tech Fair" sponsored by IBM and VMware. This was a 1-day event hosted by Unisfair.
The day included presentations delivered over a conference call, along with exhibition booths.
We had an exhibition booth exclusively for "storage virtualization" featuring our IBM System Storage SAN Volume Controller (disk virtualization) and IBM System Storage TS7520 Virtualization Engine (a virtual tape library, or VTL).
People who were logged in were represented in silhouette form. When someone walked into the booth, our army of "booth reps" were able to chat with them and answer their questions. They could also peruse the various online materials we made available about each product.
Here are some of my observations:
A lot of questions were related to IBM's support for VMware. Although VMware is currently owned by EMC, pending a spin-off IPO, IBM is its biggest reseller, given IBM's vast experience in server virtualization. Ironically, IBM's SAN Volume Controller supports VMware better than EMC's own storage virtualization product, Invista.
People also familiar with Second Life thought this 2-D "silhouette" version eliminated the need to configure and dress up your avatar, as is required to participate in Second Life events. However, being only able to chat, send e-mail and show web pages seemed less immersive than what Second Life can offer.
This event generated over 60 leads. We will pass on the contact information to the appropriate sales team.
Well, this week I am in Maryland, just outside of Washington DC. It's a bit cold here.
Robin Harris over at StorageMojo put out this Open Letter to Seagate, Hitachi GST, EMC, HP, NetApp, IBM and Sun about the results of two academic papers, one from Google, and another from Carnegie Mellon University (CMU). The papers imply that the disk drive module (DDM) manufacturers have perhaps misrepresented their reliability estimates, and the letter asks major vendors to respond. So far, NetApp and EMC have responded.
I will not bother to repeat what others have said already, but make just a few points. Robin, you are free to consider this "my" official response if you would like to post it on your blog, or point to mine, whatever is easier for you. Given that IBM no longer manufactures the DDMs we use inside our disk systems, there may not be any reason for a more formal response.
Coke and Pepsi buy sugar, Nutrasweet and Splenda from the same sources
Somehow, this doesn't surprise anyone. Coke and Pepsi don't own their own sugar cane fields, and even their bottlers are separate companies. Their job is to assemble the components using super-secret recipes to make something that tastes good.
IBM, EMC and NetApp don't make DDMs that are mentioned in either academic study. Different IBM storage systems use one or more of the following DDM suppliers:
Seagate (including Maxtor, which they acquired)
Hitachi Global Storage Technologies, HGST (former IBM division sold off to Hitachi)
In the past, corporations like IBM were very "vertically-integrated", making every component of every system delivered. IBM was the first to bring disk systems to market, and led the major enhancements that exist in nearly all disk drives manufactured today. Today, however, our value-add is to take standard components, and use our super-secret recipe to make something that provides unique value to the marketplace. Not surprisingly, EMC, HP, Sun and NetApp also don't make their own DDMs. Hitachi is perhaps the last major disk systems vendor that also has a DDM manufacturing division.
So, my point is that disk systems are the next layer up. Everyone knows that individual components fail. Unlike CPUs or Memory, disks actually have moving parts, so you would expect them to fail more often compared to just "chips".
If you don't feel the MTBF or AFR estimates posted by these suppliers are valid, go after them, not the disk systems vendors that use their supplies. While IBM does qualify DDM suppliers for each purpose, we are basically purchasing them from the same major vendors as all of our competitors. I suspect you won't get much more than the responses you posted from Seagate and HGST.
American car owners replace their cars every 59 months
According to a frequently cited auto market research firm, the average time before the original owner transfers their vehicle -- purchased or leased -- is currently 59 months. Both studies mention that customers have a different "definition" of failure than manufacturers, and often replace the drives before they are completely kaput. The same is true for cars. Americans give various reasons why they trade in their less-than-five-year-old cars for newer models. Disk technologies advance at a faster pace, so it makes sense to change drives for other business reasons, for speed and capacity improvements, lower power consumption, and so on.
The CMU study indicated that 43 percent of drives were replaced before they were completely dead. So, if General Motors estimated their cars lasted 9 years, and Toyota estimated 11 years, people still replace them sooner, for other reasons.
At IBM, we remind people that "data outlives the media". True for disk, and true for tape. Neither is "permanent storage", but rather a temporary resting point until the data is transferred to the next media. For this reason, IBM is focused on solutions and disk systems that plan for this inevitable migration process. IBM System Storage SAN Volume Controller is able to move active data from one disk system to another; IBM Tivoli Storage Manager is able to move backup copies from one tape to another; and IBM System Storage DR550 is able to move archive copies from disk and tape to newer disk and tape.
If you had only one car, then having that one and only vehicle die could be quite disruptive. However, companies that have fleet cars, like Hertz Car Rentals, don't wait for their cars to completely stop running either; they replace them well before that happens. For a large company with a large fleet of cars, regularly scheduled replacement is just part of doing business.
This brings us to the subject of RAID. No question that RAID 5 provides better reliability than having just a bunch of disks (JBOD). Certainly, three copies of data across separate disks, a variation of RAID 1, will provide even more protection, but for a price.
Robin mentions the "Auto-correlation" effect. Disk failures bunch up, so one recent failure might mean another DDM, somewhere in the environment, will probably fail soon also. For it to make a difference, it would (a) have to be a DDM in the same RAID 5 rank, and (b) have to occur during the time the first drive is being rebuilt to a spare volume.
The human body replaces skin cells every day
So there are individual DDMs, manufactured by the suppliers above; disk systems, manufactured by IBM and others, and then your entire IT infrastructure. Beyond the disk system, you probably have redundant fabrics, clustered servers and multiple data paths, because eventually hardware fails.
People might realize that the human body replaces skin cells every day. Other cells are replaced frequently, within seven days, and others less frequently, taking a year or so to be replaced. I'm over 40 years old, but most of my cells are less than 9 years old. This is possible because information, data in the form of DNA, is moved from old cells to new cells, keeping the infrastructure (my body) alive.
Our clients should approach this in a more holistic view. You will replace disks in less than 3-5 years. While tape cartridges can retain their data for 20 years, most people change their tape drives every 7-9 years, and so tape data needs to be moved from old to new cartridges. Focus on your information, not individual DDMs.
What does this mean for DDM failures? When it happens, the disk system re-routes requests to a spare disk, rebuilding the data from RAID 5 parity, giving storage admins time to replace the failed unit. During the few hours this process takes place, you are either taking a backup, or crossing your fingers. Note: for RAID 5 the time to rebuild is proportional to the number of disks in the rank, so smaller ranks can be rebuilt faster than larger ranks. To make matters worse, the slower RPM speeds and higher capacities of ATA disks mean that the rebuild process could take longer than for smaller capacity, higher speed FC/SCSI disk.
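Rebuilding from parity is easier to picture with a toy example. The Python sketch below assumes the textbook XOR parity scheme for a single stripe. The block contents and helper name are made up for illustration, and a real disk system rotates parity across drives and works on much larger stripes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across three data drives, plus its parity block.
data_blocks = [b"DISK0 data", b"DISK1 data", b"DISK2 data"]
parity = xor_blocks(data_blocks)

# Drive 1 fails: rebuild its block from the surviving drives plus parity.
failed = 1
survivors = [blk for i, blk in enumerate(data_blocks) if i != failed]
rebuilt = xor_blocks(survivors + [parity])

print("rebuilt block:", rebuilt)
```

Notice that the rebuild has to read the corresponding block from every surviving drive in the rank, which is why the size of the rank affects the rebuild time.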
According to the Google study, a large portion of the DDM replacements had no SMART errors to warn that it was going to happen. To protect your infrastructure, you need to make sure you have current backups of all your data. IBM TotalStorage Productivity Center can help identify all the data that is "at risk", those files that have no backup, no copy, and no current backup since the file was most recently changed. A well-run shop keeps their "at risk" files below 3 percent.
So, where does that leave us?
ATA drives are probably as reliable as FC/SCSI disk. Customers should choose which to use based on performance and workload characteristics. FC/SCSI drives are more expensive because they are designed to run at faster speeds, required by some enterprises for some workloads. IBM offers both, and has tools to help estimate which products are the best match to your requirements.
RAID 5 is just one of the many choices of trade-offs between cost and protection of data. For some data, JBOD might be enough. For other data that is more mission critical, you might choose keeping two or three copies. Data protection is more than just using RAID; you need to also consider point-in-time copies, synchronous or asynchronous disk mirroring, continuous data protection (CDP), and backup to tape media. IBM can help show you how.
Disk systems, and IT environments in general, are higher-level concepts to transcend the failures of individual components. DDM components will fail. Cache memory will fail. CPUs will fail. Choose a disk systems vendor that combines technologies in unique and innovative ways that take these possibilities into account, designed for no single point of failure, and no single point of repair.
So, Robin, from IBM's perspective, our hands are clean. Thank you for bringing this to our attention and for giving me the opportunity to highlight IBM's superiority at the systems level.
We've been quite busy here at the Tucson Executive Briefing Center. I am often asked to explain the relationship between IBM's various storage products. While automakers don't have to explain why they sell sports coupes, pickup trucks and minivans, this analogy does not adequately cover IT storage products. So, I have come up with a new analogy that seems to be a better fit: foundations and flavorings.
All over the world, meals are often comprised of a foundation, perhaps rice, potatoes or pasta, covered with some form of flavoring, sauces, pieces of meat or fish, grated cheese and spices. In Puerto Rico, I had dishes where the foundation was mashed bananas called [plantains]. Sandwich shops often let you pick your choice of bread, the foundation, and then your meats and cheeses, the flavorings. At our local steakhouse, [McMahon's], the menu lists a set of steaks, the foundation such as Rib Eye, Filet Mignon, Prime Rib or New York Strip, and various flavorings, such as sauces and rubs to cover the steak. Last night, I had the Delmonico steak with the Cristiani sauce consisting of Portobello mushrooms, garlic and aged Romano cheese.
This serves as a useful analogy for IBM's storage strategy. Allowing the foundations and flavorings to be separately orderable greatly simplifies the selection menu and provides a nearly any-to-any approach to meeting a variety of client needs. Let's take a look at both.
IBM's foundation products are the DS family [DS3000, DS4000, DS5000, DS6000 and DS8000 series], [DS9900 series], and [XIV] for disk, and the TS family [TS1000, TS2000, TS3000] series for tape drives and libraries. In much the same way you might prefer brown rice instead of white rice, or linguine instead of penne pasta, you might find the attributes of one storage foundation more attractive based on its performance, scalability and availability features for your particular application workloads.
Fellow IBM blogger Barry Whyte discusses SVC at great length on his [Storage Virtualization] blog. Flavoring disk foundation storage with SAN Volume Controller can provide you additional features and functions, and help improve the scalability, performance or availability characteristics. For example, if you have DS4000, DS8000 and XIV, you might use SVC to provide a consistent methodology for asynchronous replication, a form of consistent "flavoring" if you will.
N series Gateways
The [N series gateways] offer flavoring to disk foundation, including unified NAS, iSCSI and FCP protocol host attachment, and application aware capabilities. (As for our IBM N series appliances or "filers", these could be foundational storage behind an SVC, but that's perhaps a topic for another post.)
SoFS provides a global namespace with clustered NAS access to files. This is a blended disk-and-tape solution with built-in backup and Information Lifecycle Management [ILM]. Policies can be used to place different files onto different tiers of storage, automate the movement from tier to tier, including migration to tape, and even expiration when the data is no longer needed.
The [IBM System Storage DR550] provides Non-erasable, Non-rewriteable (NENR) flavoring to storage. While the DR550 comes with internal disk storage, it can front end a tape library filled with WORM cartridges. The DR550 has been paired up with small libraries (TS3200 or TS3310) as well as larger libraries like the TS3500.
The IBM Grid Medical Archive Solution [GMAS] provides a variety of capabilities for storing and accessing medical images, using a blended disk-and-tape approach. This allows hospital and clinic networks to provide access for doctors and radiologists from multiple locations.
Many of the flavorings are called "gateways". The IBM TS7650G flavors disk to provide a virtual tape library [VTL] with inline data deduplication capability. Recent performance tests pairing the TS7650G flavoring with XIV foundation storage found this combination to be an excellent match.
Let me know what you think. Does this help you understand IBM's storage strategy and acquisitions? Enter your comments below.
I've blogged about some of these videos already, but since there are probably a few out there buying the brand new Apple iPhone looking for YouTube videos to play on them, these links might provide some example entertainment on your new handheld device.
Next week has the "Fourth of July" Independence Day holiday in the USA smack in the middle of the week, so I expect the blogosphere to quiet down a bit. So whether you are working next week or not, in the USA or elsewhere, take some time to enjoy your friends and family.
Many people have asked me if there was any logic with the IBM naming convention of IBM Systems branded servers. Here's your quick and easy cheat sheet:
System x -- "x" for cross-platform architecture. Technologies from our mainframe and UNIX servers were brought into chips that sit next to the Intel or AMD processors to provide a more reliable x86 server experience. For example, some models have a POWER processor-based Remote Supervisor Adapter (RSA).
System p -- "p" for POWER architecture.
System z -- "z" for Zero-downtime, zero-exposures. Our lawyers prefer "near-zero", but this is about as close as you get to ["six-nines" availability] in our industry, with the highest level of security and encryption, no other vendor comes close, so you get the idea.
But what about the "i" for System i? Officially, it stands for "Integrated" in that it could integrate different applications running on different operating systems onto a [COMMON] platform. Options were available to insert Intel-based processor cards that ran Windows, or attach special cables that allowed separate System x servers running Windows to attach to a System i. Both allowed Windows applications to share the internal LAN and SAN inside the System i machine. Later, IBM allowed [AIX on System i] and [Linux on Power] operating systems to run as well.
From a storage perspective, we often joked that the "i" stood for "island", as most System i machines used internal disk, or attached externally to only a few selected models of disk from IBM and EMC that had special support for i5/OS using a special, non-standard 520-byte disk block size. This meant only our popular IBM System Storage DS6000 and DS8000 series disk systems were available. This block size requirement only applies to disk. For tape, i5/OS supports both IBM TS1120 and LTO tape systems. For the most part, System i machines stood separate from the mainframe, and the rest of the Linux, UNIX and Windows distributed servers on the data center floor.
Often, when I am talking to customers, they ask when product xyz will be supported on System z or System i. I explain that IBM's strategy is not to make all storage devices connect via ESCON/FICON or support non-standard block sizes, but rather to get the servers to use standard 512-byte block size, Fibre Channel and other standard protocols. (The old adage applies: If you can't get Mohamed to move to the mountain, get the mountain to move to Mohamed.)
On the System z mainframe, we are 60 percent there, allowing three of the five operating systems (z/VM, z/VSE and Linux) to access FCP-based disk and tape devices. (Four out of six if you include [OpenSolaris for the mainframe].) But what about System i? As the characters on the popular television show [LOST] would say: It's time to get off the island!
Last week, IBM announced the new [i5/OS V6R1 operating system] with features that will greatly improve the use of external storage on this platform. Check this out:
POWER6-based System i 570 model server
Our latest, most powerful POWER processor brought to the System i platform. The 570 model will be the first in the System i family of servers to make use of new processing technology, using up to 16 (sixteen!) POWER6 processors (running at 4.7 GHz) in each machine. The advantage of the new processors is the increased commercial processing workload (CPW) rating, 31 percent greater than the POWER5+ version and 72 percent greater than the POWER5 version. CPW is the "MIPS" or "TeraFlops" rating for comparing System i servers. Here is the [Announcement Letter].
Fibre Channel Adapter for System i hardware
That's right, these are [Smart IOAs], so an I/O Processor (IOP) is no longer required! You can even boot the Initial Program Load (IPL) directly from SAN-attached tape. This brings System i into the 21st century for Business Continuity options.
Virtual I/O Server (VIOS)
[Virtual I/O Server] has been around for System p machines, but is now available on System i as well. This allows multiple logical partitions (LPARs) to access resources like Ethernet cards and FCP host bus adapters. In the case of storage, the VIOS handles the 520-byte to 512-byte conversion, so that i5/OS systems can now read and write to standard FCP devices like the IBM System Storage DS4800 and DS4700 disk systems.
IBM System Storage DS4000 series
Initially, we have certified DS4700 and DS4800 disk systems to work with i5/OS, but more devices are in plan. This means that you can now share your DS4700 between i5/OS and your other Linux, UNIX and Windows servers, take advantage of a mix of FC and SATA disk capacities, RAID6 protection, and so on.
To call [IBM PowerVM] the "VMware for the POWER architecture" would not quite do it justice. In combination with VIOS, IBM PowerVM is able to run a variety of AIX, Linux and i5/OS guest images. The "Live Partition Mobility" feature allows you to easily move guest images from one system to another, while they are running, just like VMotion for x86 machines.
And while we are on the topic of x86, PowerVM is also able to represent a Linux-x86 emulation base to run x86-compiled applications. While many Linux applications could be recompiled from source code for the POWER architecture "as is", others required perhaps 1-2 percent modification to port them over, and that was too much for some software development houses. Now, we can run most x86-compiled Linux application binaries in their original form on POWER architecture servers.
BladeCenter JS22 Express
The POWER6-based [JS22 Express blade] can run i5/OS, taking advantage of PowerVM and VIOS to access all of the BladeCenter resources. The BladeCenter lets you mix and match POWER and x86-based blades in the same chassis, providing the ultimate in flexibility.
This week I am in Costa Rica to celebrate [Earth Day] and promote IBM's [Smarter Planet strategy] to help solve the world's energy and environmental problems. This is the third in the series. The first two posts were:
Here in Costa Rica, they separate their recyclables, and encourage even the hotel guests from other countries to do the same. See my photo on the left for an example.
This is more than most in the United States will do. We're lucky to get North Americans to just separate all recyclables in one bin separate from all trash in a second bin.
Leaving Arenal, I went to Escazu, a suburb of San Jose, the capital of this country. I met with Patrick, one of the owners of [Exclusive Excursions Travel] that helped me organize the eco-tourism portion of this trip to Costa Rica.
Most people are familiar with the [star rating system] that ranks most hotels from one star (budget class/economy) to five stars (deluxe/luxury). The nicest hotel I've been to was the [Burj Al Arab] in Dubai, which claims a seven star rating. For eco-tourism, there is a similar "Green Leaf" rating system. According to Patrick, the Instituto Costarricense de Turismo [ICT] (tourism board of Costa Rica) rates hotels from one leaf (adopting some measures, like separating recyclables shown above) to five leaves (entirely carbon neutral). This Green Leaf system seems more important to European and Canadian tourists, but those from the United States may not even be aware of it.
The food at these hotels varies. The typical dish here for breakfast, lunch and dinner is the Casado, consisting of mostly rice and beans. I have found that Costa Rica has come up with as many creative ways to combine rice and beans in various proportions as Starbucks® serves various combinations of coffee and milk. The locals might be accustomed to a steady diet of rice and beans for every meal of every day, but those of us from North America aren't! Not counting tourist flatulence, Costa Rica has [pledged to be carbon neutral by 2021], the country's 200th birthday.
Sadly, most folks in the United States don't categorize their hotels with a Green Leaf rating system, nor do they even bother to categorize their recyclables. I spent 18 months in the field doing Information Lifecycle Management (ILM) assessments for clients, and most didn't categorize their data either. So, the next time you have some combination of coffee and milk, whether it's a Latte, Misto, Espresso, or Macchiato, remember that the coffee came from a country trying to be more environmentally responsible, grown by a farmer who eats a simple diet of rice and beans, and has no problem separating different categories of recyclables. Perhaps you will remember to separate your data, and store it on an information infrastructure based on an environmentally-responsible combination of SSD, FC, SATA and tape, to reduce your costs and minimize your carbon footprint.
On the news today, they mentioned it was "Happy Pi Day". Today is the 14th day of the 3rd month, and "pi" is about 3.14159, the ratio of the circumference of a circle to its diameter. So, in Tucson it is celebrated on 3/14, at 1:59pm MST.
The ratio has a lot to do with storage.
Tape is wrapped around a hub. Tape is thin, but not infinitely so, and wrapping hundreds of meters of tape changes the diameter of the spool. The rotational velocity therefore has to change to keep the linear meters-per-second of the tape media consistent as the diameter shrinks when you wind down from a full spool toward the hub. IBM has variable speed motors and other clever technologies to handle this adjustment.
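This is where pi comes in: to pass tape at a fixed linear speed v, a reel of radius r has to turn at v / (2 x pi x r) revolutions per second. Here is a minimal Python sketch of that relationship. The tape speed and the two radii are made-up values for illustration, not the specifications of any IBM drive.

```python
import math

def reel_rpm(linear_speed_m_per_s, radius_m):
    """Revolutions per minute needed to pass tape at the given linear speed."""
    circumference_m = 2 * math.pi * radius_m
    return linear_speed_m_per_s / circumference_m * 60

# Hypothetical values, for illustration only.
TAPE_SPEED = 5.0      # meters per second
FULL_RADIUS = 0.045   # meters, reel fully wound
HUB_RADIUS = 0.02     # meters, reel nearly empty

for radius in (FULL_RADIUS, HUB_RADIUS):
    rpm = reel_rpm(TAPE_SPEED, radius)
    print(f"radius {radius * 1000:.0f} mm needs about {rpm:.0f} RPM")
```

The smaller the radius, the faster the reel has to spin, which is why a constant-speed motor would not do the job.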
Disks spin at consistent speeds, but tracks on the outside edge travel faster across the head than the inside tracks. Currently, the top speed for disk is 15,000 revolutions per minute (RPM). As faster rotational speeds are investigated, researchers find they need to make the diameters smaller to compensate.
The diameters of disks were based on "U", the unit height of standard 19" racks. A "U" is 1.75 inches, and standard floppy diskettes were 5.25 inch (3U) and 3.5 inch (2U). For those who have a difficult time remembering how many inches a "U" is, it is the height of a standard two-by-four (2x4) piece of lumber.
The value of "pi" has been calculated to over a billion significant digits. Here is a cute applet to use if you ever need the value to any level of accuracy.
HealthAlliance Hospital has implemented an IBM System Storage Grid Medical Archive Solution (GMAS) to make patient records available to clinicians anytime, anywhere. IBM has a [Case Study] on this implementation. Here is an excerpt from the IBM [Press Release]:
HealthAlliance Hospital, a member of UMass Memorial Health Care, serves the communities of north-central Massachusetts and southern New Hampshire with acute care facilities, a cancer center, outpatient physical therapy facilities and a remote home health agency. As an investment in continued high-quality patient care, the hospital has implemented a picture archiving and communication system (PACS) from Siemens Medical Solutions so that it can move toward digital health records while eliminating traditional paper and film.
HealthAlliance is now able to make all of their data, including PACS images, available instantly, using the IBM GMAS, a cross-IBM offering comprised of storage, software, servers and services. The GMAS solution provides hospitals, clinics, research institutions and pharmaceutical companies with an automated and resilient enterprise storage archive for delivering medical images, patient records and other critical healthcare reference information on demand.
"Fast, easy access to diagnostic images is a priority," said Rick Mohnk, Vice President and Chief Information Officer of HealthAlliance. "Being paperless not only helps our staff improve their productivity and the quality of patient care, but also lowers our costs and improves our competitiveness. The IBM GMAS has helped us stay competitive and offer the leading edge technology that attracts top physicians to our staff and keeps patients feeling comfortable and well cared for."
Normally when you read or hear the term "grid", you might think of supercomputers, but in this case we are talking about information that is accessible from different interconnected locations. I've mentioned GMAS before in my posts [Blocks, Files and Content Addressable Storage and What Happened to CAS?] but I thought I would provide more detail on the elements of the solution.
Medical imaging devices are called "modalities", which is just fancy hospital talk for "method of treatment". These have Ethernet connections designed to write to any storage with a CIFS or NFS interface. For example, press the button on the "X-ray" machine, and the digitized version of the X-ray is stored as a file on whatever NAS storage is on the other end.
[Picture Archiving and Communication System] refers to the application and the computer equipment that manage these medical images, often stored in a DICOM format and indexed with HL7 metadata headers. There are many PACS vendors: GE Medical Systems, Siemens Medical, Agfa, Fuji, Philips, Kodak, Stentor, Emageon, Brit Systems, McKesson, Amicus, Cerner, Medweb and Teramedica, to name a few. Many PACS providers embedded specific storage as part of their solution, but now are starting to realize that they need to be part of a larger storage infrastructure.
IBM System Storage [Multi-Level Grid Access Manager] is software on IBM System x servers that manages access across the grid of inter-connected hospitals, clinics and imaging facilities. It provides the NFS and CIFS interfaces to the modalities, and places the data into a GPFS file system on DS4000 series disk.
GPFS and DS4000 series disk
IBM [General Parallel File System] has all the Information Lifecycle Management (ILM) capabilities to move data from one disk storage level to another, automate deletion based on expiration date, and provide concurrent access from multiple requesters. The IBM System Storage DS4000 series disk products can support both high-speed FC disk as well as low-cost SATA disk. For large medical images, the SATA disk is often a good fit. The advantage of GPFS is that you can have policies to decide which images are placed on FC disk, and which on SATA, and then later move these files based on how often they are referenced. Images that are accessed the most frequently can be on FC disk, and those that haven't been accessed in a while on SATA disk.
TSM space management
IBM [Tivoli Storage Manager for Space Management] supports moving files out of the GPFS file system and onto tape, based on policies. For example, keep the most recent 18 months on disk, and anything older than that gets moved to tape. This is similar to the migrate/recall technology used in DFSMShsm on the mainframe.
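To make the tiering idea concrete, here is a rough Python sketch of the decision such policies encode. It is not GPFS policy syntax or TSM configuration, just the logic. The 18-month cutoff comes from the example above; the 30-day "frequently accessed" cutoff and the function name are my own assumptions.

```python
from datetime import date

TAPE_AGE_DAYS = 18 * 30   # roughly 18 months, per the example above
HOT_ACCESS_DAYS = 30      # hypothetical cutoff for "frequently accessed"

def choose_tier(created, last_accessed, today):
    """Return 'tape', 'fc', or 'sata' for one image file."""
    if (today - created).days > TAPE_AGE_DAYS:
        return "tape"   # older than about 18 months: migrate to tape
    if (today - last_accessed).days <= HOT_ACCESS_DAYS:
        return "fc"     # recently referenced: keep on fast FC disk
    return "sata"       # still on disk, but on the low-cost tier

today = date(2008, 6, 1)
print(choose_tier(date(2006, 1, 1), date(2006, 2, 1), today))    # old image
print(choose_tier(date(2008, 5, 1), date(2008, 5, 25), today))   # new and busy
print(choose_tier(date(2007, 12, 1), date(2008, 1, 15), today))  # new but idle
```

A real policy engine would evaluate rules like these across millions of files and schedule the data movement. The sketch only shows which tier each file would land on.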
Tape Library automation
Before GMAS, paper and film images had to be retrieved manually from shelves and filing cabinets. The massive amounts of data being stored, and for such long periods of time, make it impractical to store all of it on disk. With tape automation, any medical image more than 18 months old can be retrieved in minutes. Patients with an appointment can have all of their medical images retrieved in bulk the night before. Emergency room patients can have previous images retrieved while admission clerks check for insurance coverage and perform triage.
Images archived on the IBM GMAS are accessible in numerous ways. For example, all clinicians can access GMAS through the hospital record system, which provides complete paperless and filmless access to the patient record including medical images, lab results, radiology reports, and pharmacy records. Medical workers at any location can also access the grid using their Web browsers. This allows each employee to use the display systems they are already familiar with.
Unlike disk-only based NAS systems, IBM's blended disk-and-tape approach makes this a much more cost-effective solution. For more details on IBM GMAS, read this 6-page [Frost & Sullivan whitepaper].
Continuing this week's theme of doing important things without leaving town, I present our results for an exciting project I started earlier this year.
For seven weeks, my coworker Mark Haye and I voluntarily led a class of students here in Tucson, Arizona in an after-school pilot project to teach the ["C" programming language] using [LEGO® Mindstorms® NXT robots]. The ten students, boys and girls ages 9 to 14 years old, were already part of the FIRST [For Inspiration and Recognition of Science and Technology] program, and participated in FIRST Lego League [FLL] robot competitions. The students were already familiar with building robots, and with programming them using a simple graphical system of connecting blocks that perform actions. However, to compete in the next level of robot competitions, FIRST Tech Challenge [FTC], we need to leave this simple graphical programming behind, and upgrade to more precise "C" programming.
Mark is a software engineer for IBM Tivoli Storage Manager and has participated in FLL competitions over the past nine years. This week, he celebrates his 25th anniversary at IBM, and I celebrate my 23rd. The teacher, Ms. Ackerman, and the students referred to us as "Coach Mark" and "Coach Tony".
This was the first time I had worked with LEGO NXT robots. For those not familiar with these robots, you can purchase a kit at your local toy store. In addition to regular LEGO bricks, beams, and plates, there are motors, wheels, and sensors. A programmable NXT brick has three outputs (marked A, B, and C) to control three motors, and four inputs (marked 1, 2, 3, 4) to receive values from sensors. Programs are written and compiled on laptops and then downloaded to the NXT programmable brick through a USB cable, or wirelessly via Bluetooth.
In the picture shown, an image of the Mars planetary surface is divided into a grid with thick black lines. A light sensor between the front two wheels of the robot is over the black line.
We used the [RobotC programming firmware] and integrated development environment (IDE) from [Carnegie Mellon University]. The idea of this pilot was to see how well the students could learn "C". With only a few hours after class on each Wednesday, could we teach young students "C" programming in just seven weeks?
My contribution? I have taught both high school and college classes, and spent over 15 years programming for IBM, so Mark asked me to help. We started with a basic lesson plan:
A brief history of the "C" language
Understanding statements and syntax
Setting motor speed and direction
Compiling and downloading your first program
Understanding the "while" loop
Retrieving input sensor values
Understanding the "if-then-else" statement
Defining variables with different data types
Manipulating string variables
Writing a program for the robot to track along a black line on a white background.
Understanding local versus global scope variables
Writing a program for a robot to count black lines as it crosses them.
Performing left turns and right turns, and crossing a specific number of lines on a grid pattern to move the robot to a specific location.
Weeks 6 and 7
Mission Impossible: come up with a challenge to make the robot do something that would be difficult to accomplish using the previous NXT visual programming language.
At the completion of these seven weeks, I sat down to interview "Coach Mark" on his thoughts on this pilot project.
This is a practical programming skill. The "C" language is used throughout the world to program everything from embedded systems to operating systems, and even storage software. This would allow the robots to handle more precise movements, more accurate turns, and more complicated missions.
Can kids learn "C" in only seven weeks?
Part of the pilot project was to see how well the students could understand the material. They were already familiar with building the robots, and understood the basics of programming sensors and motors, so we were hoping this was a good foundation to work from. Some kids managed very well, others struggled.
Did everything go according to plan?
The first two weeks went well, turning on motors and having robots move forward and backward were easy enough. We seemed to lose a few students on week 3, and things got worse from there. However, several of the students truly surprised us and managed to implement very complicated missions. We were quite pleased with the results.
What kind of problems did the kids encounter?
The touch sensor required loops that wait for it to be pressed. Motors did not necessarily turn as expected until more advanced methods were used. Making 90 degree left and right turns accurately was more difficult than expected.
Any funny surprises?
Yes, we had a Challenge Map representing the Mars planetary surface from a previous FLL competition that was dark red and divided into squares with thick black lines. An active light sensor returns a value of "0" (complete darkness) to "100" (bright white). However, the Mars surface had craters that were dark enough to be misinterpreted as a black line, causing some unusual results. This required some enhanced programming techniques to resolve.
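For readers curious what such a technique might look like, here is one possibility, simulated in Python rather than RobotC so that it runs without a robot. It is not necessarily what the students wrote. It assumes a crater produces a shorter run of dark readings than a real grid line does, and both the threshold and the minimum run length are made-up values.

```python
DARK_THRESHOLD = 30    # hypothetical: readings below this count as "dark"
MIN_DARK_SAMPLES = 3   # hypothetical: a real line stays dark at least this long

def count_lines(readings):
    """Count runs of at least MIN_DARK_SAMPLES consecutive dark readings."""
    lines = 0
    dark_run = 0
    for value in readings:
        if value < DARK_THRESHOLD:
            dark_run += 1
            if dark_run == MIN_DARK_SAMPLES:
                lines += 1   # count each qualifying run exactly once
        else:
            dark_run = 0     # back on the red surface, start over
    return lines

# Simulated sensor values (0 = dark, 100 = bright white): red surface near 45,
# a one-sample crater dip to 25, then two genuine black lines.
samples = [45, 46, 25, 44, 10, 8, 9, 12, 45, 47, 11, 9, 10, 46]
print(count_lines(samples))   # prints 2
```

The brief dip from the crater never reaches three consecutive dark readings, so it is ignored, while each black line is counted once no matter how long the sensor stays over it.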
Did robots help or hurt the teaching process?
I think they helped. Rather than writing programs that just display "Hello World!" on a computer screen, the students can actually see robots move, and either do what they expect, or not!
And when the robots didn't do what they were expected to?
The students got into "debug" mode. They were already used to doing this from previous FLL competitions, but with RobotC, you can leave the USB cable connected (or use wireless Bluetooth) and actually gather debugging information while the robot is running, to see the value of sensors and other variables and help determine why things are not working properly.
Any applicability to the real world of storage?
We have robots in the IBM System Storage TS3500 tape library. These robots scan bar code labels, pull tapes out of shelves and mount them into drives. The programming skills are the same needed for storage software, such as IBM Tivoli Storage Manager or IBM Tivoli Storage Productivity Center.
The world is becoming smarter, instrumented with sensors, interconnected over a common network, and intelligent enough to react and respond correctly. The lessons of reading sensor values and moving motors can be considered the first step in solutions that help to make a smarter planet.
Use more efficient disk media, such as high-capacity SATA disk drives
Both are great recommendations, but why limit yourself to what EMC offers? Your x86-based machines are only a subset of your servers,and disk is only a subset of your storage. IBM takes a more holistic approach, looking at the entire data center.
VMware is a great product, and IBM is its top reseller. But in addition to VMware, there are other solutions for the x86-based servers, like Xen and Microsoft Virtual Server. IBM's System p, System i, and System z product lines all support logical partitioning.
To compare the energy effectiveness of server virtualization, consider a metric that can apply across platforms. For example, for an e-mail server, consider watts per mailbox. If you have, say, 15,000 users, you can calculate how many watts you are consuming to manage their mailboxes on your current environment, and compare that with running them on VMware, or logical partitions on other servers. Some people find it surprising that it is often more cost-effective, and power-efficient, to run workloads on mainframe logical partitions (LPARs) than a stack of x86 servers running VMware.
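The metric itself is trivial to compute. Here is a small Python sketch using the 15,000-user example. The platform labels and wattage figures are placeholders I made up to show the calculation. They are not measurements of any product, so substitute your own metered numbers.

```python
def watts_per_mailbox(total_watts, mailboxes):
    """Power consumed per mailbox for one candidate platform."""
    return total_watts / mailboxes

MAILBOXES = 15_000

# Placeholder wattages: replace these with measured values for your environment.
candidates = {
    "platform A": 12_000.0,
    "platform B": 6_000.0,
}

for name, watts in candidates.items():
    print(f"{name}: {watts_per_mailbox(watts, MAILBOXES):.2f} watts per mailbox")
```

Because the unit is tied to the workload rather than to the hardware, the same comparison works whether the candidates are standalone servers, VMware hosts or LPARs.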
More efficient Media
SATA and FATA disks support higher capacities, and run at slower RPM speeds, thus using fewer watts per terabyte. A terabyte stored on 73GB high-speed 15K RPM drives consumes more watts than the same terabyte stored using 500GB SATA. Chuck correctly identifies that tape is more power-efficient than disk, but then argues that paper is more power-efficient than tape. But paper is not necessarily more efficient than tape.
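Here is a back-of-the-envelope version of that comparison in Python. The drive capacities come from the paragraph above, but the per-drive wattages are assumptions for illustration, and the sketch ignores RAID overhead, spares, enclosures and cooling.

```python
import math

def watts_per_terabyte(drive_capacity_gb, watts_per_drive, terabytes=1):
    """Watts needed to hold the given capacity on whole drives of one type."""
    drives = math.ceil(terabytes * 1000 / drive_capacity_gb)
    return drives * watts_per_drive / terabytes

# Assumed per-drive power draw; check your vendor's specifications.
FC_15K_WATTS = 15.0
SATA_WATTS = 13.0

print(watts_per_terabyte(73, FC_15K_WATTS))   # 14 drives needed
print(watts_per_terabyte(500, SATA_WATTS))    # 2 drives needed
```

Even if the two drive types drew identical power, the drive count alone (14 versus 2 for one terabyte) would dominate the result.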
ESG analyst Steve Duplessie divides up data between Dynamic vs. Persistent. The best place to put dynamic data is on disk, and here is where evaluation of FC/SAS versus SATA/FATA comes into play. Persistent data, on the other hand, can be stored on paper, microfiche, optical or tape media. All of these shelf-resident media consume no electricity, nor generate any heat that would require additional cooling.
A study by scientists at the Lawrence Berkeley National Laboratory titled High-Tech Means High-Efficiency: The Business Case for Energy Management in High-Tech Industries indicates that data centers consume 15 to 100 times more energy per square foot than traditional office space. Storing persistent data in traditional office space can save a huge amount of energy. Steve Duplessie feels the ratio of dynamic to persistent data is 1:10 today, but is likely to grow to 1:100 in the near future, making energy-efficient storage of persistent data ever more important to our environment.
Data centers consume nearly 5000 Megawatts in the USA alone, 14000 Megawatts worldwide. To put that in perspective, the country of Hungary I was in last week can generate up to 8000 Megawatts for the entire country (and they were using 7400 Megawatts last week as a result of their current heat wave, causing them grave concern).
Back in the 1990's, one of the insurance companies IBM worked with kept data on paper in manila folders, and armies of young adults on roller skates were dispatched throughout the large warehouses of shelves to get the appropriate folder in response to customer service inquiries. Digitizing this paper into electronic format greatly reduced the need for this amount of warehouse space, as well as improved the time to retrieve the data.
A typical file storage box (12 inch x 12 inch x 18 inch) containing typed pages single-spaced, double-sided, 12 point font could hold perhaps 100MB. The same box could hold a hundred or more LTO or 3592 tape cartridges, each storing hundreds of GB of information. That's a million-to-one improvement in space efficiency, and on a watts-per-TB basis it translates to a substantial improvement under standard office air conditioning and lighting conditions.
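The million-to-one figure is easy to check. Here is the arithmetic in Python, assuming exactly 100 cartridges per box and 800GB per cartridge (the uncompressed LTO-4 capacity). Both are round numbers picked for illustration.

```python
PAPER_MB_PER_BOX = 100      # from the estimate above
CARTRIDGES_PER_BOX = 100    # assumption: "a hundred or more"
CARTRIDGE_GB = 800          # assumption: LTO-4, uncompressed

tape_mb_per_box = CARTRIDGES_PER_BOX * CARTRIDGE_GB * 1000
ratio = tape_mb_per_box // PAPER_MB_PER_BOX

print(f"tape holds about {ratio:,} times more per box than paper")
```

With these inputs the ratio works out to 800,000 to one, and more cartridges per box or compression pushes it past a million.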
To learn more about IBM's Project Big Green, watch this introductory video which used Second Life for the animation.
EMC Corporation (NYSE:EMC) today announced it has been positioned as a leader in the Forrester Wave™: Enterprise Open Systems Virtual Tape Library (VTL), Q1 2008 by Forrester Research, Inc. (January 31, 2008), an independent market and technology research firm. EMC achieved a position as a leader in the Forrester Wave report on virtual tape libraries based on the largest installed base of the EMC® Disk Library family of systems, its broad ecosystem interoperability. Virtual tape libraries emulate tape drives and work in conjunction with existing backup software applications, enabling fast backup and restoration of data by using high-capacity, low-cost disk drives.
EMC was the first major vendor in the open systems virtual tape library market as it introduced the EMC Disk Library in April 2004 and today is a leading provider of open systems virtual tape solutions, with systems that are designed for businesses and organizations of all sizes.
While the press release implies that "EDL equals VTL", Chuck tries to explain they are in fact very different. Here is an excerpt from his blog post:
Virtual Tape Libraries vs. Disk Libraries
As many of you know, VTLs have been around for a while. They use disk as a cache -- they buffer the incoming backup streams, do some housekeeping and stacking, then turn around and write tape efficiently. When you go to restore, you're usually coming back off of tape, unless the backup image in question is sitting in the disk cache.
Now, there is nothing wrong with the VTL approach, but it was conceived in a time when disks were horribly expensive. It was also pretty clear to many of us that disks were going to be a whole lot cheaper in the near future, and this fundamental assumption wouldn't be valid for much longer.
I kept thinking in terms of disk as a direct target for a backup application. No modifications to the backup application. Native speed of sequential disks for both backup and restore. Tape positioned as a backup to the backup. Use the strengths of the underlying array (e.g. CLARiiON) for performance, availability, management, etc.
We ended up calling the concept a "disk library" to differentiate from the VTLs that had come before it. It was a different value proposition and offering, based on the emergence of lower-cost disk media.
... It's nice to see we're at 1,100+ customers, and still going strong.
For those new to the blogosphere, there is a difference between "Press Releases" as formal corporate communications versus "Blog Posts" which are informal opinions of the individual blogger, which may or may not match exactly the views of their respective employer. As we've learned many times before, one should not treat terms like "first" or "leader" in corporate press releases literally! Let's explore each.
Was EDL the first "open systems" Virtual Tape Library?
This is implied by the Forrester report. Chuck mentions the "VTLs that had come before it" in his blog, and many people are aware that IBM and StorageTek had introduced mainframe-attached VTLs in the 1990s. But what about VTL for "open systems"?
(Hold aside for the moment that IBM System z mainframe is an open system itself, with z/OS certified as a bona fide UNIX operating system by [the Open Group] standards body. Most analysts and research firms usually refer only to the non-mainframe versions of UNIX and Windows. Alternative definitions for "open systems" can be found in [Web definitions or Wikipedia]. I will assume Forrester meant non-mainframe servers.)
IBM announced AIX non-mainframe attachment via SCSI connectivity to the IBM 3494 Virtual Tape Server (VTS) on Feb 16, 1999, with general availability on May 28, 1999. That's nearly FIVE YEARS before the April 2004 introduction of EDL. IBM VTS support for Sun Solaris and Microsoft Windows came shortly thereafter in November 2000, and support for HP-UX a bit later in June 2001. One of my 17 patents is for the software inside the IBM 3494 VTS, so like Chuck, I can take some pride in the success of this product.
(I don't remember if StorageTek, which was subsequently acquired by Sun, had ever supported non-mainframe operating systems with their Virtual Storage Manager[VSM] offering, but if they did, I am sure it was also before EMC.)
Last week, another EMC blogger, BarryB (aka [the Storage Anarchist]), took me to task in comments on my post [IBM now supports 1TB SATA drives]. He felt that IBM should not claim support, given that the software inside the IBM System Storage N series is developed by NetApp. He compared this to the situation of HP and Sun re-badging the HDS USP-V disk system. If someone else wrote the software, BarryB opines, IBM should not claim credit for it. I tried to explain how IBM provides added value and has full-time employees dedicated to N series development and support, but doubt I have changed his mind.
Why do I bring that up? Because the EMC Disk Library runs OEM software from FalconStor. Basically EMC is assembling a hardware/software solution with components provided from OEM suppliers. Hmmm? Sound familiar? Who is calling the kettle black?
If there is a clear winner here, it is FalconStor itself. Perhaps one of the worst kept industry secrets is that FalconStor software is also used in VTL offerings from Sun, Copan, and IBM, the latter embodied as the [IBM TS7520 Virtualization Engine] offering. If you like the concept of an EDL, but prefer instead one-stop shopping from an "information infrastructure" vendor, IBM can offer the TS7520 along with servers, software and services for a complete end-to-end solution.
Can EMC claim to be "a leader" in Virtual Tape Libraries?
During the measured quarter, IBM shipped its 10 millionth LTO-4 tape drive cartridge to Getty Images, the world's leading creator and distributor of still imagery, footage and multi-media products, as well as a recognized provider of other forms of premium digital content, including music. Getty Images is using the LTO-4 drives as part of a tiered infrastructure of IBM disk and tape solutions that help support the backup needs of their digital imagery;
IBM shipped more than 1,500 Petabytes of tape storage in Q3'07 alone;
During Q3'07, IBM shipped the 10,000th IBM System Storage TS3500 Tape Library. The TS3500 is a highly scalable tape library with support from 1 to 192 tape drives and up to 6,400 cartridge slots for open system, mainframe and virtual tape system attachment.
Let's take a look at the numbers. IBM has sold over 5,400 virtual tape libraries. Sun/STK has sold over 4,000 virtual tape libraries. Both are drastically more than the 1,100 mentioned in Chuck's post. Does IDC recognize EMC in third place? No, EMC chooses instead to declare EDL as disk arrays (probably to prop up their IDC "Disk Tracker" numbers), so they don't even earn an honorable mention under the virtual tape library category. This of course includes the number of mainframe-attached models from IBM and Sun/STK. So, if EMC did call these tape systems instead, they might show up in third place, and as such EMC could claim to be "a leader" in much the same way an athlete can claim to be an "Olympic medalist" winning the bronze for third place. (If you limit the count to just the FalconStor-based models from IBM, EMC, Sun and Copan, then EMC moves up to first or second, but then press release titles like "EMC a Leader in FalconStor-based non-mainframe Virtual Tape Libraries" can get too confusing.)
Chuck, if you are reading this, I feel you have every right to celebrate your involvement with the EDL. Despite having common software and hardware components, both IBM and EMC can rightfully declare their own unique value-add through their respective VTL offerings. Like the IBM N series, the EMC Disk Library is not diminished by the fact the software was written by someone else. BarryB might disagree.
Normally, IBM only makes announcements on Tuesdays, but today, Friday, IBM announces that it acquired Diligent Technologies. What? I got a lot of questions about this, so I thought I would start with this...
When I posted in January that [IBM Acquires XIV], fellow EMC blogger Mark Twomey of StorageZilla fame sent me a comment:
"Ah now Tony I wasn't poking fun. Indeed I find it fascinating that Moshe who's been sitting out on the fringes for years having been banished for being an obstructionist to EMC entering the mid-market is now back.
Which reminds me what happens with Diligent? There his as well aren't they or has he packed his stake in that in?"
As you might have guessed, I am privy to a lot of stuff going on behind the scenes at IBM that I can't talk about in this blog, and all these rumors in the blogosphere about IBM's acquisition of Diligent were a topic I couldn't officially recognize, defend or deny, until official IBM announcements were made.
In his latest post, Mark wonders about [the last Tape and Mainframe sales person on earth]. He recounts my interaction with fellow HDS blogger Hu Yoshida about the energy benefits of Virtual Tape Libraries. Knowing that we were going to announce IBM's acquisition of Diligent soon, I thought this would be a worthy exchange, driving up the sales of Diligent boxes (whether you buy them from IBM or HDS). Diligent already had reselling arrangements with HDS, and IBM plans to continue those arrangements going forward with HDS. As I have explained before in my post [Supermarkets and Specialty Shops], IBM and HDS cater to different customers, so a customer who wants the best technology from a specialty shop can buy IBM Diligent products from HDS, while one who wants one-stop shopping can buy IBM Diligent directly from IBM or its other IBM Business Partners.
(Perhaps a more tricky situation is that Diligent also had an arrangement with Sun Microsystems, which competes directly against IBM as another IT supermarket vendor, but I have not heard how IBM has decided to handle this going forward.)
For more on this intricate mess of interconnected companies, alliances and partnerships, read Dave Raffo's article [Data dedupe dance card filling up] over at Storage Soup.
So, let's tackle the first question:
Q1. What will happen to IBM's real tape library business?
Come on! IBM is Number one in tape, we've had virtual tape libraries since 1997 (the first in the industry) and continue to do well in both virtual and real tape libraries. Both provide value to the customer, and both have their place as part of the overall "information infrastructure". This acquisition provides yet another choice for clients on our "supermarket" shelf.
(For those following the ["which is greener"] discussion, the robot of the IBM TS3500 real tape library consumes 185W per frame (when moving) and each tape drive consumes 50W (when actively working on a tape). Compared to 13W per SATA disk drive, each 6-drive frame of a TS3500 consumes as much electricity as 37 SATA disk drives. If you are not running backups 24x7, the total kWh per day for your tape library is actually considerably less, but as several people have pointed out, there are customers that do run backups 80-90 percent of the time. LTO-4 tapes can hold 800GB uncompressed, and SATA disks are now available in 1TB (1000 GB) size, so you can have fun with your own comparisons.)
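If you want to have fun with your own comparisons, here is the arithmetic in Python. The wattages are the ones quoted above. The duty cycle is something you supply, and the sketch makes the simplifying assumption that the robot and tape drives draw nothing when idle, which is not exactly true of real hardware.

```python
ROBOT_WATTS_PER_FRAME = 185   # when moving
TAPE_DRIVE_WATTS = 50         # when actively working on a tape
DRIVES_PER_FRAME = 6
SATA_DRIVE_WATTS = 13

frame_watts = ROBOT_WATTS_PER_FRAME + DRIVES_PER_FRAME * TAPE_DRIVE_WATTS
equivalent_sata_drives = frame_watts / SATA_DRIVE_WATTS

def kwh_per_day(watts, duty_cycle):
    """Energy per day, assuming zero draw while idle (a simplification)."""
    return watts * 24 * duty_cycle / 1000

print(f"one busy frame: {frame_watts} W, about {equivalent_sata_drives:.1f} SATA drives")
for duty in (1.0, 0.85, 0.25):
    print(f"busy {duty:.0%} of the day: {kwh_per_day(frame_watts, duty):.2f} kWh")
```

A fully busy frame draws 485W, which is where the figure of 37 SATA drives comes from. Disk drives spin whether or not anyone is reading them, so their duty cycle is effectively 100 percent.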
Meanwhile, Scott Waterhouse, one of the few people at EMC who understand tape workloads like backup and archive, takes me to task in his Backup Blog with his post [I want a Red Ferrari]. For those who are surprised that anyone at EMC might understand backup workloads, EMC did acquire a company called Legato, and perhaps Scott came from that acquisition. I've never met Scott in person, but based solely on his writings, he seems to know his stuff and makes strong arguments for using IBM Tivoli Storage Manager (TSM) with deduplication and virtual tape libraries.
While TSM does a good job of "deduplicating" at the client first, backing up only changed data, Scott feels database and email repositories must be backed up entirely each time, which is what happens in many other backup software products. Some clients might have 80 percent database/email and only 20 percent files, while others might have less than 20 percent database/email and 80 percent files, so this might influence whether deduplication will have small or big benefit. If TSM has to back up the entire database, even though little has changed since the last backup, that is where deduplication on a virtual tape library can come in handy. For IBM DB2 and Oracle databases, the IBM TSM application-aware Tivoli Data Protection module interface backs up only changed data, not the entire file. Thanks to IBM's FilesX acquisition (also coincidentally from Israel), IBM can now extend this support to SQL Server databases as well. However, to be fair, Scott is partly correct: TSM does back up some database and email repositories in their entirety, which is why it is a good idea to have BOTH an IBM virtual tape library with deduplication and Tivoli Storage Manager to handle all cases. This brings us to the next question:
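To see why deduplication helps when a mostly unchanged database is backed up in full every night, here is a toy Python sketch of generic hash-based deduplication with fixed-size chunks. It is only an illustration of the general idea, not how TSM, Diligent or any other product mentioned here actually works, and the class name and chunk size are made up.

```python
import hashlib

CHUNK_SIZE = 4096   # hypothetical fixed chunk size

class DedupStore:
    """Toy store that keeps each unique chunk only once."""

    def __init__(self):
        self.chunks = {}   # digest -> chunk bytes

    def backup(self, data):
        """Store data; return (recipe of digests, number of new chunks)."""
        recipe, new_chunks = [], 0
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                new_chunks += 1
            recipe.append(digest)
        return recipe, new_chunks

    def restore(self, recipe):
        """Rebuild the original bytes from a recipe."""
        return b"".join(self.chunks[d] for d in recipe)

# A ten-chunk "database", then the same database with one chunk changed.
monday = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
tuesday = monday[:3 * CHUNK_SIZE] + bytes([99]) * CHUNK_SIZE + monday[4 * CHUNK_SIZE:]

store = DedupStore()
_, new_monday = store.backup(monday)
recipe_tuesday, new_tuesday = store.backup(tuesday)
print(new_monday, new_tuesday)   # prints 10 1
```

The second full backup adds only the one chunk that changed, yet either night can still be restored in its entirety.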
Q2. What will happen to IBM's patented "progressive backup" technology?
IBM will continue to use TSM's progressive backup technology. TSM already works great with Diligent virtual tape libraries. One example is LAN-free backup. In this configuration, the TSM client writes its backups directly to a virtual or real tape library, over the SAN, and then sends the list of files backed up to the TSM server over the LAN to record in its database. This can greatly reduce IP traffic on your LAN during peak backup periods. For more about this, see the IBM Redbook titled ["Get More Out of Your SAN with IBM Tivoli Storage Manager"].
Jon Toigo from DrunkenData asks [Did IBM Do Due Diligence Before Making Diligent Acquisition a Done Deal?] which is probably always a valid question. Unlike XIV, I wasn't part of the Diligent acquisition team, so I can't provide a first-hand account of the process. I am told that the IBM team did all the right things to make sure everything is going to turn out right. Sadly, many companies that make acquisitions in the IT industry fail to make them work. Fortunately, IBM is one of the few companies that has a great success record, with over 60 acquisitions in the past six years. In the Xconomy forum, Wade Roush writes [IBM and the Art of Acquisitions] and gives some insight into why IBM is different. Jon did not understand why Cindy Grossman, IBM VP of tape and archive solutions, ran the analyst conference call for this announcement, which brings me to the next question:
Q3. What is Diligent virtual tape library going to be categorized as, a disk system or a tape system?
IBM organizes its storage systems based on the host application workloads. Products to address disk workloads (SVC, DS8000 series, DS6000 series, DS4000 series, DS3000 series, N series, XIV Nextra) are in our disk systems group. Storage that appears to host applications like a tape system to address workloads like backup and archive (tape drives, libraries and tape virtualization) are in our tape and archive group. IBM Diligent has two products, one for big workloads and one for medium workloads. Both look like tape systems, so our tape and archive team, who understand tape workloads like backup and archive the best, are obviously the best choice to support IBM Diligent in the mix.
IBM will offer both N series and Diligent deduplication capabilities. For disk workloads, IBM N series offers a post-process deduplication feature at no additional charge. For tape workloads, IBM will now offer an in-line deduplication feature with Diligent Technologies. Different workloads, different offerings.
As with any acquisition, there will be some changes. The 100 folks from Diligent will get to learn the IBM way of doing things. This brings me to our fifth and final question:
Q5. What is the correct spelling: deduplication or de-duplication?
It appears that Diligent has a corporate-wide standard to hyphenate this term (de-duplication), but the "word police" at IBM that control and standardize all "proper spellings, trademarks, and capitalization" sent me corporate instructions a few days ago that IBM does not hyphenate this term (deduplication). So, going forward, it will be "deduplication", or "dedupe" for short. I suspect one of the first tasks that our new IBMers from Diligent will be doing is removing all those hyphens from the [Diligent Technologies website]!
That's all for now, I'm off to Chicago, Illinois tomorrow!
Over on his Backup Blog, fellow blogger Scott Waterhouse from EMC has a post titled [Backup Sucks: Reason #38]. Here is an excerpt:
Unfortunately, we have not been able to successfully leverage economies of scale in the world of backup and recovery. If it costs you $5 to backup a given amount of data, it probably costs you $50 to back up 10 times that amount of data, and $500 to back up 100 times that amount of data.
If anybody can figure out how to get costs down to $40 for 10 times the amount of data, and $300 for 100 times the amount of data, they will have an irrefutable advantage over anybody that has not been able to leverage economies of scale.
I suspect that where Scott mentions "we" in the above excerpt, he is referring to EMC in general, with products like Legato. Fortunately, IBM has scalable backup solutions, using either a hardware approach, or one purely with software.
The hardware approach involves using deduplication hardware technology as the storage pool for IBM Tivoli Storage Manager (TSM). Using this approach, IBM Tivoli Storage Manager would receive data from dozens, hundreds or even thousands of client nodes, and the backup copies would be sent to an IBM TS7650 ProtecTIER data deduplication appliance, IBM TS7650G gateway, or IBM N series with A-SIS. In most cases, companies have standardized on the operating systems and applications used on these nodes, and multiple copies of data reside across employee laptops. As a result, as you have more nodes backing up, you are able to achieve benefits of scale.
Perhaps your budget isn't big enough to handle new hardware purchases at this time, in this economy. Have no fear, IBM also offers deduplication built right into the IBM Tivoli Storage Manager v6 software itself. You can use a sequential access disk storage pool for this. TSM scans and identifies duplicate chunks of data in the backup copies, as well as in archive and HSM data, and reclaims the space when duplicates are found.
If your company is using a backup software product that doesn't scale well, perhaps now is a good time to switch over to IBM Tivoli Storage Manager. TSM is perhaps the most scalable backup software product in the marketplace, giving IBM an "irrefutable advantage" over the competition.
Well, I have left Japan, and while everyone else is enjoying the Super Bowl, I am now in Australia, at another conference. Today I had the pleasure to hear filmmakers talk about their successes, and how IBM helps the movie industry.
At one extreme was Khoa Do, independent filmmaker. After acting in movies alongside Michael Caine and Billy Zane, he decided to become his own director. He started a project to help seven disadvantaged youths from a poor drug-ridden section of Sydney, by having them act in his first full-length film. Armed with only an IBM laptop and a small budget, he made the film called "The Finished People" that had critical acclaim.
The film was a success, and many of the disadvantaged youths have gone on to act in other movies. In 2005, Khoa Do was named "Young Australian of the Year".
Thanks to IBM technology, filmmaking is now accessible to a wider number of aspiring wanna-be directors. It is no longer necessary to be part of a large film studio with a multi-million dollar budget to tell your story.
At the other extreme was Xavier Desdoigts, director of technical operations at Animal Logic, the Computer Graphics (CG) arthouse that produced special effects for movies like "The Matrix", "House of Flying Daggers" and "World Trade Center". They started with producing digital effects for TV commercials, like this one for Carlton Draught Beer.
With the support of a large film studio and multi-million dollar budget, Animal Logic now boasts the 86th most powerful "Supercomputer" based on IBM BladeCenter technology, with over 4000 servers connected into a cluster, for making the movie "Happy Feet". The movie took four years to make, with over 500 people, of 27 different nationalities. It was the first CG movie made in Australia, and has been well-received by audiences worldwide.
Mr. Desdoigts gave out some interesting facts and figures about the movie:
While visually stunning on the big screen, each frame is only 1.4 Megapixel, about the same resolution as most camera phones.
In one scene, there are 427,086 penguins all appearing on frame.
Mumble, the lovable lead character, is made up of over 6 million feathers.
As many as 17 dancers were "motion-captured" to choreograph the tap-dancing and character interaction segments.
Only one system admin was needed to manage this entire server farm. (IBM Systems Director technology makes this possible)
The movie consumed 103 TB of disk space, backed up to 595 LTO tape cartridges.
An estimated 17 million CPU-hours were needed for all the processing and rendering.
Rather than talking about technology for technology's sake, these filmmakers showed how technology could be put to use, in a practical sense, to provide the world something of value.
"IBM announced that Northwest Radiology Network has gone live with a new virtualized enterprise of IBM servers and storage to support its growing medical imaging needs, giving its four locations an enterprise-class infrastructure which enables its doctors to recover medical image reports faster for analysis and enables remote 24x7 access to its medical image report system.
Founded in 1967, Northwest Radiology (NWR) is ranked as one of the largest physician groups in the Indianapolis, Indiana area. With 180 employees who offer the Central Indiana community comprehensive inpatient and outpatient imaging services such as mammography, ultrasonography, CT scans, PET-CT scans, bone density scans and MRIs – the Network had a dramatic need to develop a centralized infrastructure where large amounts of data could be stored and shared. A new data center would benefit the company’s clientele; which includes area hospitals and doctor’s offices serving thousands of patients each year.
Storing more than ten thousand medical imaging reports and radiographic images each month for doctors to analyze, the Network realized it had single points of failure and at one point a critical report server failed. Northwest Radiology turned to IBM and IBM Business Partner Software Information Systems (SIS) for a more efficient solution to prevent any possible downtime in the future.
SIS recommended and installed a virtualized infrastructure with IBM servers and storage as the heart of Northwest Radiology’s Indianapolis data center. By April 2007, Northwest Radiology replaced eight servers and direct attached storage with just two IBM System x3650 servers connected to an IBM System Storage DS3400. Today, the new servers run 15 virtual servers to ensure the availability of their services 24x7. When the business needs it, a new server can be provisioned in just minutes. With a Fibre Channel on the SAN Disk, the DS3400 not only increased performance but also met NWR’s requirement to not have one single point of failure. With three TB of storage capacity, they can meet the demands of increased business well into the future. The systems are also now easily managed from a remote site."
“Uptime is paramount in our business. We selected IBM based on the reliability and flexibility of IBM System x servers and the IBM System Storage DS3400,” said Marty Buening, IT Director, Northwest Radiology Network. “The virtualized infrastructure and the SAN storage array that SIS and IBM brought to the table is improving our service and giving our doctors and staff piece of mind knowing each patient’s medical imaging reports are always available.”
Second, we have [Iowa Health System], a large enterprise with over 19,000 employees, managing four million patients and hundreds of TBs of data.
Here is a 4-minute video on IBM TV from the good folks at Iowa Health System discussing the IBM Grid Medical Archive Solution (GMAS) as part of their information infrastructure for their Picture Archiving and Communication Systems (PACS) application.
In both cases, IBM technology was able to provide remote access to medical information, making images and patient records available to more doctors, specialists and radiologists. Last January, in my post [Five in Five], IBM had predicted that remote access to healthcare would have an impact over the next five years.
Whether you are a small company or a large one, IBM probably has the right solution for you.
Ten years ago, I travelled to New York City with my colleague, Randy Fleenor, to present the latest in IBM tape technology for the 50th Anniversary. On Thursday evening that week, the latest movie in the Star Wars saga, Episode II: Attack of the Clones, had just been released, and it was being shown using the new Digital Light Projection (DLP) technology just around the corner at the Ziegfeld theater! This movie was the first live-action film to be filmed entirely digitally. George Lucas saw that digital video was the future, and started the process moving forward with this film.
I convinced Randy to join me, and we arrived at 11:10pm. The movie was scheduled to start at 11pm, so we figured we had only missed a few previews. We walked into a completely empty lobby. I asked for two tickets for the 11pm show at the ticket counter, and was told it was all sold out, that there was a huge line around the building for all the people waiting to see the 1:00am show, and that we might get in to see the 3:00am show.
Randy and I had meetings on Friday morning, so we were not going to wait in line all night to see a 3am show! Just then, a young man comes out of the theater. He said his girlfriend can't make it, and wanted a refund for his two tickets. I pulled out a twenty-dollar bill, offered to buy them directly at face value, and the theater employees approved the transaction. The seats were front row of the balcony section. By then we had missed all the previews and a short bit of the movie, but that was alright with us.
(FTC Disclosure: I am both an employee and stockholder in IBM. The U.S. Federal Trade Commission may consider this a paid, celebrity endorsement of LTO-5 tapes and the LTFS technology. References to other companies are for illustrative purposes and do not represent an endorsement of their products or services.)
Digital recording is ideal for all types of video, including movies, television, and commercial advertisements.
The latest excitement is over IBM's Linear Tape File System™ (LTFS), which IBM donated to the IT industry as open source so that everyone in the world can benefit. This allows tape cartridges to be treated like USB memory sticks, the ultimate in portability of data. It is supported for Windows, Mac OS, and Linux, and already well embraced by the Media-and-Entertainment (M&E) industry.
"The move to IBM technology has helped the network shrink its archive from 1,507 to just 388 square feet, representing dramatic systems and energy-cost savings."
"AlphaTV has been broadcasting since 1996, creating and storing all forms of video entertainment, from soap operas and documentaries, to movies and sporting events, and creating a vast video archive along the way. Initially, AlphaTV archived its programming on Sony Beta SP format video cassettes that stored up to 90 minutes of content. Not long after, in need of storage that offered greater density, it turned to DVCPRO format videos that stored up to 120 minutes. But even that format was not allowing the network to keep pace with its ballooning archive, a storage infrastructure that by 2011 spanned more than 1,507 square feet."
"'A Greek TV series stored on 100 DVCPRO tapes took up four shelves in our library, whereas one LTO-5 cartridge now takes up the space of a deck of playing cards,' Constantinos Colombus, chief technology officer at AlphaTV, said in a statement."
"IBM LTFS, an intuitive and graphical file system that provides direct access to data on LTO 5 drives, has enabled AlphaTV to manage, move and share video files much like they can with disk-management systems, by simply dragging and dropping. As a result, file management is easier to do and far more efficient, said Colombus."
To prepare for this anniversary, I spoke with Brad Johns, of [Brad Johns Consulting]. Brad was head of IBM tape marketing for a while, and ran tape customer councils to gather feedback from our largest customers. Brad was my mentor in marketing at IBM from 2003-2007 and has since retired from IBM to start his own consulting practice.
The comparison was made between Crossroad Systems' Strongbox® with Enterprise tape library, LTO-5 tapes using LTFS, versus a unified disk storage system offering NAS protocols on high-capacity 3TB drives. The findings: the tape-based archive had nearly 80 percent lower TCO than the disk-based solution!
You don't have to be in the middle of the Greek economy to realize that is a good value!
Earlier this year, IBM launched its [New Enterprise Data Center vision]. The average data center was built 10-15 years ago, at a time when the World Wide Web was still in its infancy, some companies were deploying their first storage area network (SAN) and email system, and if you asked anyone what "Google" was, they might tell you it was ["a one followed by a hundred zeros"]!
Full disclosure: Google, the company, just celebrated its [10th anniversary] yesterday, and IBM has partnered with Google on a variety of exciting projects. I am employed by IBM, and own stock in both companies.
In just the last five years, we saw a rapid growth in information, fueled by Web 2.0 social media, email, mobile hand-held devices, and the convergence of digital technologies that blurs the lines between communications, entertainment and business information. This explosion in information is not just "more of the same", but rather a dramatic shift from predominantly databases for online transaction processing to mostly unstructured content. IT departments are no longer just the "back office" recording financial transactions for accountants, but now also take on a more active "front office" role. For a growing number of industries, information technology plays a pivotal role in generating revenue, making smarter business decisions, and providing better customer service.
IBM felt a new IT model was needed to address this changing landscape, so IBM's New Enterprise Data Center vision has these five key strategic initiatives:
Highly virtualized resources
Business-driven Service Management
Green, Efficient, Optimized facilities
In February, IBM announced new products and features to support the first two initiatives, including the highly virtualized capability of the IBM z10 EC mainframe, and related business resiliency features of the [IBM System Storage DS8000 Turbo] disk system.
In May, IBM launched its Service Management strategic initiative at the Pulse 2008 conference. I was there in Orlando, Florida at the Swan and Dolphin resort to present to clients. You can read my three posts:[Day 1; Day 2 Main Tent; Day 2 Breakout sessions].
In June, IBM launched its fourth strategic initiative "Green, Efficient and Optimized Facilities" with [Project BigGreen 2.0], which included the Space-Efficient Volume (SEV) and Space-Efficient FlashCopy (SEFC) capabilities of the IBM System Storage SAN Volume Controller (SVC) 4.3 release. Fellow blogger and IBM master inventor Barry Whyte (BarryW) has three posts on his blog about this: [SVC 4.3.0 Overview; SEV and SEFC detail; Virtual Disk Mirroring and More]
Some have speculated that the IBM System Storage team seemed to be on vacation the past two months, with few press releases and little or no fanfare about our July and August announcements, and not responding directly to critics and FUD in the blogosphere. It was because we were holding them all for today's launch, taking our cue from a famous perfume commercial:
"If you want to capture someone's attention -- whisper."
My team and I were actually quite busy at the [IBM Tucson Executive Briefing Center]. In between doing our regular job talking to excited prospects and clients, we trained sales reps and IBM Business Partners, wrote certification exams, and updated marketing collateral. Fortunately, competitors stopped promoting their own products to discuss and demonstrate why they are so scared of what IBM is planning. The fear was well justified. Even a few journalists helped raise the word-of-mouth buzz and excitement level. A big kiss to Beth Pariseau for her article in [SearchStorage.com]!
(Last week we broke radio silence to promote our technology demonstration of 1 million IOPS using Solid State Disk, just to get the huge IBM marketing machine oiled up and ready for today.)
Today, IBM General Manager Andy Monshaw launched the fifth strategic initiative, [IBM Information Infrastructure], at the [IBM Storage and Storage Networking Symposium] in Montpellier, France. Montpellier is one of the six locations of our New Enterprise Data Center Leadership Centers launched today. The other five are Poughkeepsie, Gaithersburg, Dallas, Mainz and Boeblingen, with more planned for 2009.
Although IBM has been using the term "information infrastructure" for more than 30 years, it might be helpful to define it for you readers:
“An information infrastructure comprises the storage, networks, software, and servers integrated and optimized to securely deliver information to the business.”
In other words, it's all the "stuff" that delivers information from the magnetic surface recording of the disk or tape media to the eyes and ears of the end user. Everybody has an information infrastructure already, some are just more effective than others. For those of you not happy with yours, IBM has the products, services and expertise to help with your data center transformation.
IBM wants to help its clients deliver the right information to the right people at the right time, to get the most benefits of information, while controlling costs and mitigating risks. There might be more than a dozen ways to address the challenges involved, but IBM's Information Infrastructure strategic initiative focuses on four key solution areas:
Last, but not least, I would like to welcome to the blogosphere IBM's newest blogger, Moshe Yanai, formerly the father of the EMC Symmetrix and now leading the IBM XIV team. Already from his first post on his new [ThinkStorage blog], I can tell he is not going to pull any punches either.
IDC, an independent industry analyst firm, put out their 4Q07 "Worldwide Disk Storage Systems Quarterly Tracker" report. Here is an excerpt from their [press release]:
"Worldwide external disk storage systems factory revenues posted 9.8 percent year-over-year growth in the fourth quarter of 2007 (4Q07) and totaling $5.3 billion (USD), according to the IDC Worldwide Disk Storage Systems Quarterly Tracker. For the quarter, the total disk storage systems market grew to $7.5 billion (USD), up 7.6 percent from the prior year's fourth quarter. Total disk storage systems capacity shipped reach 1,645 petabytes, growing 56.3 percent."
For those wondering how an industry could grow 56.3 percent in capacity, but only 7.6 percent in revenue, it is because the average dollar-per-GB dropped in 2007 from $6.63 down to $4.56 (USD), representing a 31 percent decline. In the past, disk prices dropped 40 to 60 percent each year, so single digit growth was the best major vendors could hope for. Lately, however, this has slowed down to a 25 to 35 percent decline, but the client demand for capacity continues at the 60 percent pace, which means that vendors could achieve double digit revenue growth soon.
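The relationship between capacity growth, price decline and revenue growth can be sketched in a few lines. This is a rough model under my own simplifying assumption that revenue is shipped capacity times average price per GB, so revenue growth is roughly (1 + capacity growth) x (1 - price decline) - 1. The function names are illustrative; the input figures are the ones quoted above.

```python
def price_decline(old_price_per_gb, new_price_per_gb):
    """Fractional drop in average price per GB."""
    return (old_price_per_gb - new_price_per_gb) / old_price_per_gb


def revenue_growth(capacity_growth, decline):
    """Approximate revenue growth, assuming revenue = capacity x price per GB."""
    return (1 + capacity_growth) * (1 - decline) - 1


if __name__ == "__main__":
    # 2007 figures quoted above: $6.63 -> $4.56 per GB, capacity up 56.3 percent.
    decline_2007 = price_decline(6.63, 4.56)
    print(f"Price decline: {decline_2007:.1%}")
    print(f"Implied revenue growth: {revenue_growth(0.563, decline_2007):.1%}")

    # What-if: 60 percent capacity growth at different rates of price decline.
    for decline in (0.25, 0.30, 0.35):
        growth = revenue_growth(0.60, decline)
        print(f"{decline:.0%} price decline -> {growth:.1%} revenue growth")
```

With the 2007 inputs, the implied revenue growth lands close to the 7.6 percent IDC reported, which suggests the simple model is a reasonable first approximation.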
Once again, IBM was ranked number 1 in total disk storage. No surprise there. Here are the details:
"Total Disk Storage Systems Market
In the total worldwide disk storage systems market, IBM lead the market with 22.9 percent followed by HP with 18.1 percent revenue share. EMC maintained the third position with 16.0 percent revenue share.
For the full year, the total disk storage systems market posted 6.6 percent growth to $26.3 billion (USD). In the total worldwide disk storage systems market, IBM and HP lead the market in statistical tie with 20.1 percent and 19.4 percent revenue share, respectively. EMC maintained the third position with 15.2 percent revenue share."
But why focus just on disk? IDC also released their "Worldwide Combined Disk and Tape Storage 3Q07 Market Share Update", and IBM was number one for that as well, taking in 21.9 percent share. Here's a quote from IBM VP Barry Rudolph in [CNN Money]:
"IBM's continued leadership in the storage hardware market reaffirms our strategy to provide the most comprehensive tiered portfolio of storage offerings, ranging from software and services to disk and tape storage solutions," said Barry Rudolph, Vice President, Storage Stack Solutions, IBM. "IBM is the clear choice for providing information infrastructure solutions that offer the most cost-efficient, streamlined approach to help our customers increase overall productivity and maximize performance."
It is looking like 2008 is going to be a good year for IBM!
IDC announced that IBM was #1 in storage hardware (disk and tape combined) for 2006. Here are some excerpts from the IBM press release:
The newly released May 2007 report by leading industry analyst firm IDC, "Worldwide Combined Disk and Tape Storage 2006 Market Share Update," shows IBM in the #1 overall position for all disk and tape storage hardware for the full year 2006.
In a total disk and tape storage hardware segment that increased to $28.2 billion in 2006, IBM captured 22.2 percent of the combined revenue for full year 2006, besting HP's 20.9 percent and EMC's 13.2 percent.
Five years ago, IBM was only #3 in this area, but is this new standing from IBM doing things better, or HP and EMC doing things poorly? Probably a little of both, but since it's not polite to point out the flaws of others in a blog, I will focus on what IBM is doing right, and I think our leadership in tape accounts for a good measure of this.
The resurgence of tape comes from a variety of factors:
The focus on being "green", to reduce energy use and power and cooling costs. Tape is the cheapest storage in this regard, as the tape cartridges only consume power when read or written.
Government regulations where more data must be stored for longer periods of time, such as the Federal Rules of Civil Procedure (FRCP), Sarbanes-Oxley, SEC regulations, and so on.
The widening gap in dollars per MB. Advancements in tape are outpacing disk. Disk is slowing down to about 25% improvement year on year, but tape continues its 30-40% improvement curve. A solution like Information Lifecycle Management (ILM) that moves older less valuable data from disk to tape can result in excellent cost savings.
Exciting "combined storage" solutions like the IBM System Storage DR550 and the IBM Grid Medical Archive Solution (GMAS) that combine disk and tape with internal hierarchy storage management of data, based on policies.
BladeCenter servers come in many flavors, including blades with Intel, AMD and POWER chipsets, and can be configured in Grid and SuperComputer configurations. Up to 14 blade servers can fit into a single 7U-high chassis, making this twice as dense as standard 1U-high rack-mounted servers.
System x, the new "IBM Systems" name for our popular xSeries product line, supports Intel and AMD chipsets. These come in both rack-mounted and tower configurations. These also are ideal for clustered and SuperComputer configurations.
Well it's Tuesday, and ["election day"] here in the USA, and again IBM has more announcements.
IBM announced [IBM Tivoli Key Lifecycle Manager v1.0] (TKLM) to manage encryption keys. This provides a graphical interface to manage encryption keys, including retention criteria when sharing keys with other companies.
TKLM is supported on AIX, Solaris, Windows, Red Hat and SUSE Linux. IBM plans to offer TKLM for z/OS in 2009. TKLM can be used with the Firefox or Internet Explorer web browsers. This will include the Encryption Key Manager (EKM) that IBM offered initially to support encryption keys for the TS1120, TS1130, and LTO-4 drives.
While this is needed today for tape, IBM positions this software to also manage the encryption keys for "Full Drive Encryption" (FDE) disk drive modules (DDM) in IBM disk systems in 2009.
Continuing this week's theme on products that were part of last week's IBM Information Infrastructure launch, today I'll cover the TS2900.
IBM System Storage TS2900 Tape Autoloader
This little baby is SWEET! At 1U high, it holds a single drive and up to 9 cartridges, up to a total of 14.4 TB at 2:1 compression. The drive can be a Half-Height (HH) LTO-3 or LTO-4 drive. (It is called an autoloader because there is only a single drive. Automation units with multiple drives are called libraries.)
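The 14.4 TB figure is easy to verify. Here is a small sketch, assuming decimal terabytes (1 TB = 1000 GB) and the standard native LTO cartridge capacities of 400 GB for LTO-3 and 800 GB for LTO-4. The function name and lookup table are illustrative only.

```python
# Native (uncompressed) cartridge capacity in GB, per the LTO generations.
LTO_NATIVE_GB = {"LTO-3": 400, "LTO-4": 800}


def autoloader_capacity_tb(generation, cartridges=9, compression=2.0):
    """Total capacity in decimal TB for a fully loaded autoloader.

    The compression ratio is an assumption about the data; 2:1 is the
    figure used in the text above, and real-world ratios vary.
    """
    native_gb = LTO_NATIVE_GB[generation]
    return cartridges * native_gb * compression / 1000


if __name__ == "__main__":
    for gen in ("LTO-3", "LTO-4"):
        native = autoloader_capacity_tb(gen, compression=1.0)
        compressed = autoloader_capacity_tb(gen)
        print(f"{gen}: {native:.1f} TB native, {compressed:.1f} TB at 2:1")
```

Nine LTO-4 cartridges at 800 GB each give 7.2 TB native, which doubles to the 14.4 TB quoted at 2:1 compression.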
This can be rack-mounted, or sit on your desktop. There is an I/O station for inserting or removing individual cartridges, as well as a removable tape magazine to populate or remove the tapes in a more efficient manner.
Both LTO-3 and LTO-4 support a mix of regular and "Write Once, Read Many" (WORM) media to help comply with regulations demanding "Non-erasable, Non-rewriteable" storage. The LTO-4 can also support on-drive encryption, managed by the IBM Encryption Key Manager (EKM).
To learn more, see the IBM System Storage [TS2900 page].
Well, it's Tuesday, and that means IBM announcements! Today is bigger, as there are a lot of Dynamic Infrastructure announcements throughout the company with a common theme, cloud computing and smart business systems that support the new way of doing things. Today, IBM announced its new "IBM Smart Archive" strategy that integrates software, storage, servers and services into solutions that help meet the challenges of today and tomorrow. IBM has been spending the past few years working across its various divisions and acquisitions to ensure that our clients have complete end-to-end solutions.
IBM is introducing new "Smart Business Systems" that can be used on-premises for private-cloud configurations, as well as by cloud-computing companies to offer IT as a service.
IBM [Information Archive] is the first to be unveiled, a disk-only or blended disk-and-tape Information Infrastructure solution that offers a "unified storage" approach with amazing flexibility for dealing with various archive requirements:
For those with applications using the IBM Tivoli Storage Manager (TSM) or IBM System Storage Archive Manager (SSAM) API of the IBM System Storage DR550 data retention solution, the Information Archive will provide a direct migration, supporting this API for existing applications.
For those with IBM N series using SnapLock or the File System Gateway of the DR550, the Information Archive will support various NAS protocols, deployed in stages, including NFS, CIFS, HTTP and FTP access, with Non-Erasable, Non-Rewriteable (NENR) enforcement that are compatible with current IBM N series SnapLock usage.
For those using NAS devices with PACS applications to store X-rays and other medical images, the Information Archive will provide similar NAS protocol interfaces. Information Archive will support both read-only data such as X-rays, as well as read/write data such as Electronic Medical Records.
Information Archive is not just for compliance data that was previously sent to WORM optical media. Instead, it can handle all kinds of data, rewriteable data, read-only data, and data that needs to be locked down for tamper protection. It can handle structured databases, emails, videos and unstructured files, as well as objects stored through the SSAM API.
The Information Archive has all the server, storage and software integrated together into a single machine type/model number. It is based on IBM's General Parallel File System (GPFS) to provide incredible scalability, the same clustered file system used by many of the top 500 supercomputers. Initially, Information Archive will support up to 304TB raw capacity of disk and Petabytes of tape. You can read the [Spec Sheet] for other technical details.
For those who prefer a more "customized" approach, similar to IBM Scale-Out File Services (SoFS), IBM has [Smart Business Storage Cloud]. IBM Global Services can customize a solution that is best for you, using many of the same technologies. In fact, IBM Global Services announced a variety of new cloud-computing services to help enterprises determine the best approach.
In a related announcement, IBM announced [LotusLive iNotes], which you can think of as a "business-ready" version of Google's GoogleApps, Gmail and GoogleCalendar. IBM is focused on security and reliability but leaves out the advertising and data mining that people have been forced to tolerate from consumer-oriented Web 2.0-based solutions. IBM's clients that are already familiar with on-premises version of Lotus Notes will have no trouble using LotusLive iNotes.
There was actually a lot more announced today, which I will try to get to in later posts.
In case you missed it, IBM unveiled a new digital video surveillance service yesterday. This "marks an important shift in the industry's approach to security, applying advanced analytics to video data and signaling the ability to converge physical and information technology (IT) security."
The IBM Smart Surveillance Solution is designed to provide the unique capability to carry out efficient data analysis of video sequences either in real time or from recordings. These recordings can be on disk or tape storage.
The problem with today's existing "analog" surveillance is that the analog cameras record onto traditional VHS tapes, and these are rotated through, re-written after a few hours or days. To review tapes often involves human intervention, and must be done before the VHS tapes are re-used. Many shoplifters, thieves, and other law-breakers take a chance that their actions will not be caught on tape, or that they will be long gone by the time the video is analyzed.
The IBM Smart Surveillance Solution can provide a number of advantages over traditional video solutions, including:
Real-time alerts that can help anticipate incidents by identifying suspicious behaviors.
Forensic capabilities are enhanced by utilizing unique indexing and attribute-based search of video events to classify objects into categories such as people and cars.
Situational awareness of the location, identity and activity of objects in a monitored space including license plate recognition and face capture.
With real-time analytics capabilities, the new DVS service can open up a wide array of new applications that go far beyond the traditional security aspects of surveillance systems. Early adopter industries in this rapidly evolving market include retail, public sector and financial services. The retail industry estimates nearly $50 billion is lost annually to fraud, theft and administrative errors.
Once in digital format, video surveillance can be sent further, processed quicker, and stored for longer periods of time, than traditional media makes practical today.
In last week's System Storage Portfolio Top Gun class in Dallas, some of the students were not familiar with Really Simple Syndication (RSS). For the uninitiated, this can be intimidating. I thought a quick overview of what I've done might help:
Chose a "feed reader". I chose Bloglines but there are many others.
Use Technorati to search other blogs for keywords or phrases I am looking for.
When I find a blog that I would like to continue tracking, I "add" it to my subscription list on Bloglines. Just hit "add" and copy the URL of the blog you want to track. Bloglines will figure out the RSS keywords required. I track eight blogs at the moment, but some people with lots of time on their hands track 20 or more. It is easy to unsubscribe, so don't be afraid to try some out for a few days.
Since I was actually going to run a blog of my own, I read a few books on the topic. One I recommend is "Naked Conversations" by Robert Scoble and Shel Israel, both experienced bloggers.
Finally, I am not big on spell checking, but most places have the option to preview your post or comment before it actually gets posted, which is not a bad idea if you use any HTML tags.
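For the curious, here is a rough idea of what a feed reader does behind the scenes once you give it a blog's feed. This is only a minimal sketch in Python using the standard library; the sample feed, its URLs, and the function name list_posts are made up for illustration, and a real reader like Bloglines handles far more (fetching over HTTP, Atom feeds, malformed XML, and so on).

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 document standing in for a real blog feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Storage Blog</title>
    <link>http://example.com/blog</link>
    <item>
      <title>First post</title>
      <link>http://example.com/blog/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/blog/2</link>
    </item>
  </channel>
</rss>"""


def list_posts(feed_xml):
    """Return (feed title, [(post title, post link), ...]) from RSS 2.0 text."""
    root = ET.fromstring(feed_xml)       # <rss> element
    channel = root.find("channel")       # RSS 2.0 has one <channel>
    feed_title = channel.findtext("title")
    posts = [(item.findtext("title"), item.findtext("link"))
             for item in channel.findall("item")]
    return feed_title, posts


feed_title, posts = list_posts(SAMPLE_FEED)
print(feed_title)
for post_title, post_link in posts:
    print(" -", post_title, post_link)
```

The point is simply that an RSS feed is a small XML file listing recent posts, which is why a reader can check dozens of blogs for you without you visiting each one.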
For a quick taste of blogging, consider using Data Storage Blogger Feed Reader. This has a lot of blogs on the topic of storage, already added and categorized for your convenience, ready for your perusal.
I am sure there are many other ways to enjoy the Blogosphere, but this works for me.
If you are ever down in Sao Paulo, Brazil, may I suggest not drinking "American amounts" of their "Brazilian Coffee". The coffee here is "robust", to say the least.
Yesterday, my blog focused on IBM iSCSI offerings that were announced in August. Also announced earlier this month, the Integrated Removable Media Manager (IRMM) on System z has been years in the making. IRMM is a new robust systems management product for Linux® on IBM System z™ that manages open system media in heterogeneous distributed environments and virtualizes physical tape libraries. IRMM combines the capacity of multiple heterogeneous libraries into a single reservoir of tape storage that can be managed from a central point. By providing an integrated solution with the opportunity for both mainframe z/OS DFSMSrmm and distributed Tivoli® Storage Manager™ environments to be managed by IRMM, System z can now be a hub for the management of removable media.
The people who thought the "Mainframe is obsolete", and those that thought "Tape is dead", are both proven wrong again with this announcement. People are looking to deploy robust tape automation for backup and archive, and this convergence with mainframe makes perfect sense by providing business value that extends to other distributed systems.
With mixed emotions, Jon Peake announced he will retire from IBM next week. Jon is known as the father of IBM Virtual Tape Server (VTS), the industry's first virtual tape system, announced in 1996 and generally available in 1997. One of my 19 patents was for the VTS pre-migration capability, and as lead architect for DFSMS, I worked closely with Jon and his tape systems team to ensure its success.
From left to right:
Chris Telford, IBM Development manager for Enterprise Tape Integration
Jon Peake, IBM Distinguished Engineer and Master Inventor
Annette Estelle, Jon's global admin assistant
At his retirement celebration, Jon was awarded the coveted "Project Bulldog" jacket, which has an interesting history.
In response to IBM's 1996 VTS announcement, the top StorageTek (STK) tape sales teams and most of the dedicated tape technicians were invited to a global assembly at a fancy resort in Winter Park, CO (about 90 miles west of STK's Louisville headquarters) in early 1997. The gathering was named Project Bulldog, after Ron Korngiebel, STK's director of competitive marketing, who I am told had voice and facial resemblance to justify the project moniker. Ron had recruited Fred Moore, Steve Blenderman, and other prized engineers as speakers. I have seen both Fred and Steve speak at various conferences such as SHARE and GUIDE, and agree they are high quality speakers.
The goal was to have STK's brightest in Louisville go down in the trenches, work the field guys into a frenzy, defend STK Tape at any cost, and send IBM packing. At the end of the two day fest, many participants received the coveted Project Bulldog jacket.
Former STKers who now work at IBM can remember this meeting involved:
Bashing of the [IBM Seascape] architecture approach. The use of commodity servers and components to build storage systems continues today in the IBM System Storage DS8000, SAN Volume Controller, XIV, and TS7650 Deduplication solutions.
Explanations of how and why IBM's VTS would never work, and how only STK virtual tape would make it in the market. Today, IBM is the leader in storage virtualization, both for disk and tape.
Mock interview videos with claims that IBM could never figure out how to attach IBM drives to the STK Silo. I was a big proponent of this, having visited customers who specifically asked for IBM to sell its better, faster IBM drives into their existing STK silos. At first, upper management was hesitant to do this, but the IBM engineers worked out what changes were needed, and today many STK tape automation libraries run with IBM tape drives.
While some analysts frowned on Sun's [2005 acquisition of StorageTek], IBM was delighted, given Sun's previous track record in storage company acquisitions. I joke that we are still picking up confetti in the hallways of IBM's Tucson lab. I was in New York City when I heard Sun's announcement, and it didn't take long for STK employees to start offering me their resumes. Since then, many STK engineers, technicians and sales team members have left Sun, many coming over to IBM. Back then, there were many intelligent and talented people working for StorageTek, and IBM is glad to have hired them.
With the resurgence of interest in tape systems, from dealing with new legislation for long term retention of electronic data to a focus on energy efficiency, Jon leaves much like a champion retiring at the top of his game.
Jon, I am going to miss you! Enjoy your retirement!
Well, it's Tuesday, and you know what that means? IBM announcements!
Today we had several for the IBM System Storage product line. Here are some of them:
DS8000 gets thinner, leaner and faster
The 4.3 level of microcode for the IBM System Storage DS8000 series disk systems [announced enhancements] for both fixed block architecture (FBA) LUNs and count key data (CKD) volumes.
For FBA LUNs that attach to Linux, UNIX and Windows distributed systems, IBM announced DS8000 Thin Provisioning native support. Of course, many people already had this by putting IBM System Storage SAN Volume Controller (SVC) in front, but now DS8000 clients out there without SVC can also achieve the benefits of thin provisioning. This support also makes quick initialization a whopping 2.6 times faster.
For CKD volumes attached to z/OS on System z mainframes, IBM announced zHPF multitrack support for z/OS 1.9 and above. zHPF provides High Performance FICON, and can now handle multitrack I/O transfers for even better performance for zFS, HFS, PDSE, and extended striped data sets.
XIV gets better connected
A lot of XIV [announced enhancements] and preview announcements centered around better connectivity. Here's a rundown:
Better host attachment connectivity by beefing up the interface modules that hold the FCP and iSCSI interface cards. XIV disk arrays have 3 to 6 of these in different configurations, and since they manage both their own disks, as well as receive host I/O requests for other disks, are basically doing double-duty. These interface modules can now be ordered as [Dual-CPU] modules.
Better infrastructure management by connecting XIV with the industry standard SMI-S interface to IBM Tivoli Storage Productivity Center. Now, XIV can be part of the single pane of glass console that manages all of your other disk arrays, tape libraries and SAN fabrics.
Better copy services for backups by connecting XIV with IBM Tivoli Storage Manager Advanced Copy Services. TSM for Advanced Copy Services is application aware and can coordinate XIV Snapshots similar to its current support for SVC and DS8000 FlashCopy capabilities.
Better connectivity to security systems by supporting LDAP credentials. Before, you had individual userid and passwords for each XIV, and these were probably different than all the other userid/password combinations you have for every other box on your data center floor. IBM is working on getting all products to support the Lightweight Directory Access Protocol, or [LDAP], so that we can reach the nirvana of "single sign-on", one userid/password per administrator for all IT devices in the company.
Better support with flexible warranty periods and non-disruptive code load options.
Better remote copy support by connecting to sites far, far away. IBM previewed that it will provide asynchronous disk mirroring from one XIV to another XIV natively. Before this, XIV's synchronous mirroring was limited to 300km distances. Many of our clients do long distance global mirroring of their XIV today behind an SVC, but again, for those out there that don't yet have an SVC, this can be a reasonable alternative.
TS7650 ProtecTIER data deduplication appliance now offers "no dedupe" option
In what some might consider a surprising move, IBM announced a "no dedupe" licensing option on its premier deduplication solution, which somewhat reminds me of IBM's NOCOPY option on DS8000 FlashCopy. At first I thought "Are you kidding me?!?!" However, this new license option allows the TS7650 appliance to compete on an even playing field with other virtual tape libraries (VTL) that do not offer deduplication capability. It also allows TS7650 to be used for data that doesn't dedupe very well, such as seismic recordings, satellite images, or what have you. There are also clients who do not yet feel comfortable deduplicating their financial records for compliance reasons. This option now allows IBM to withdraw from marketing the TS7530 non-dedupe library. Having one technology that does both dedupe and no-dedupe is better than offering two separate libraries based on different technologies.
The ProtecTIER series also announced [IP remote distance replication]. This can be used to replicate virtual tape cartridges in one ProtecTIER over to another ProtecTIER at a remote location. You can decide to replicate all or just a subset of your virtual tapes, and this feature can be used to migrate, merge or split ProtecTIER configurations as your needs grow. Before this support, our TS7650G clients replicated the disk repository using native disk array replication technology, such as Global Mirror on the DS8000, but that meant that all data was replicated over to the secondary site. Now, with this new IP replication feature, you can be selective, and replicate only those virtual tapes that are mission critical.
The appliance now supports up to 36TB of disk capacity, and the new "IBM i" operating system on System i servers, formerly known as i5/OS.
GPFS does Windows
IBM's General Parallel File System (GPFS) has the lion's share of file systems used in the [Top 500 Supercomputers]. For a while, it was limited to just Linux and AIX operating system support, but version 3.3 [extends this to Windows 2008 on 64-bit architectures]. GPFS is the file system used in IBM's Scale-Out File Services, the underlying technology of IBM's Cloud Computing and Storage offerings.
Well, it's Tuesday again, and that means more announcements from IBM!
In conjunction with IBM's new [System z10 Business Class (BC)] mainframe designed for Small and Medium-sized Businesses (SMB), IBM also announced related storage product enhancements.
Yes, it's alive! Contrary to the FUD you might have read from our competitors, IBM continues to sell thousands and thousands of IBM System Storage DS6800 disk systems, and now enhances them with the option for 450GB 15K RPM drives. What is nice about these 450GB drives is that they are as fast or faster* than 300GB drives, so the typical trade-off between performance and capacity does not apply.
(* I compared Seagate 15.6K (450GB) with 15.5K (300GB) models on four measures: average seek time (read), average seek time (write), full seek time (read), and full seek time (write). This may or may not result in application performance improvements, depending on workload pattern. Your mileage may vary.)
Our clients report back that these are incredibly stable systems that they don't have to worry about. This enhancement applies to both the [511/EX1 models] and [522/EX2 models].
Understanding that clients want complete solutions from single vendors, IBM offers synergy between System z and the IBM System Storage DS8000 disk systems. The latest R4.1 microcode upgrade offers two key features on the various models [2107,
zHPF - High Performance FICON for System z. IBM was able to increase the throughput on 4 Gbps links. For OLTP workloads randomly accessing 4KB blocks, IBM internal tests showed zHPF doubled performance from 13,000 IOPS to 26,000 IOPS per channel. For sequential workloads, such as batch processing, zHPF increased performance 50 percent, from 350 MB/sec to 525 MB/sec.
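To put those zHPF numbers side by side, here is a quick back-of-the-envelope check in Python. It is only a sketch: the function names are mine, and I am assuming a "4KB block" means 4,096 bytes and that 1 MB means one million bytes.

```python
def percent_gain(before, after):
    """Percentage increase going from `before` to `after`."""
    return (after - before) / before * 100.0


def oltp_mb_per_sec(iops, block_bytes=4096):
    """Data rate implied by an I/O rate (assumes 4,096-byte blocks, 1 MB = 10**6 bytes)."""
    return iops * block_bytes / 1_000_000


oltp_gain = percent_gain(13_000, 26_000)   # IOPS per channel, random 4KB
seq_gain = percent_gain(350, 525)          # MB/sec, sequential

print(f"OLTP gain:       {oltp_gain:.0f}%")
print(f"Sequential gain: {seq_gain:.0f}%")
print(f"26,000 IOPS at 4KB is about {oltp_mb_per_sec(26_000):.1f} MB/sec")
```

Under those assumptions, even the doubled OLTP rate moves far fewer megabytes per second than the sequential case, which is why the two workloads are quoted in different units.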
In February, IBM previewed [Incremental Resync] for z/OS Metro Global Mirror. However, some concepts are better explained with pictures.
One way to set up a 3-site disaster recovery protection is to have your production synchronously mirrored to a second site nearby, and at the same time asynchronously mirrored to a remote location. On the System z, you can have site "A" using synchronous IBM System Storage Metro Mirror over to nearby site "B", and also have site "A" sending data over to site "C" asynchronously using z/OS Global Mirror. This is called "z/OS Metro Global Mirror".
In the past, if the disk system in site A failed, you would switch over to site B, which would have to resend all the data again to site C to be resynchronized. This is because site B was not tracking what the System Data Mover (SDM) reader had or had not yet processed.
With DS8000 R4.1, the "Incremental Resync" function, used along with IBM HyperSwap, requires site B to send and resync only the data that was in flight when the outage occurred. When you compare the difference in sending this limited amount of in-flight data with the traditional complete volume of data, you can see how "Incremental Resync" can resynchronize the data 95% faster, and also greatly decrease your bandwidth requirements. This reduces the risk in case a subsequent outage occurs.
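Here is a small worked example of why sending only the in-flight data saves so much time. All of the numbers below (10 TB of mirrored data, 5% of it in flight, a 100 MB/sec link between sites B and C) are hypothetical values I picked for illustration, not IBM measurements. With a fixed link speed, resync time simply scales with the amount of data sent.

```python
def resync_seconds(data_mb, link_mb_per_sec):
    """Time to push `data_mb` megabytes over a link of the given speed."""
    return data_mb / link_mb_per_sec


# Hypothetical environment, for illustration only.
FULL_VOLUME_MB = 10_000_000    # 10 TB of mirrored data
IN_FLIGHT_MB = 500_000         # 5% of it not yet processed at site C
LINK_MB_PER_SEC = 100          # B-to-C link throughput

full = resync_seconds(FULL_VOLUME_MB, LINK_MB_PER_SEC)
incremental = resync_seconds(IN_FLIGHT_MB, LINK_MB_PER_SEC)
reduction = (full - incremental) / full * 100

print(f"Full resync:        {full / 3600:.1f} hours")
print(f"Incremental resync: {incremental / 3600:.1f} hours")
print(f"Time saved:         {reduction:.0f}%")
```

In this sketch, a 95% time saving falls out when the in-flight data is about 5% of the total. Your own ratio will depend on write activity and how far behind site C was at the time of the outage.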
Introduced originally in 1997 as the IBM Virtual Tape Server (VTS), the [IBM System Storage TS7700] series supports Grid capability to replicate tape image data across locations. Here's a quick recap of today's announcement:
Existing TS7740 can be upgraded up to 9TB of disk cache. New models can have up to 13TB of disk cache.
A new "tape-less" TS7720 that has up to 70TB of disk cache.
Integrated Library Management support. I discussed [Integrated Removable Media Manager (IRMM)] before, and this is basically IRMM inside. For those with TS3500 tape libraries, this support eliminates the need for a separate IBM 3953 L05 Library Manager.
TS1130 back-end tape drive support. These are the fastest 1TB drives in the industry, with support of built-in encryption, and now can be used as the physical tape back-end for the virtual tape TS7740 repository.
While our competitors might be boarding up their windows in preparation for the economic downturn in the USA economy, IBM continues to generate solid results. San Jose Mercury News has an article that discusses this titled [IBM's 3Q profit strong on global sales]. There has never been a better time to buy from, or invest in, IBM!
For those of us in the northern hemisphere, yesterday was this year's Winter Solstice, representing the shortest amount of daylight between sunrise and sunset. So today, I thought I would blog on my thoughts on managing scarcity.
Earlier in my career, I had the pleasure to serve as "administrative assistant" to Nora Denzel for the week at a storage conference. My job was to make her look good at the conference, which if you know Nora, doesn't take much. Later, she left IBM to work at HP, and I got to hear her speak at a conference, and the one thing that I remember most was her statement that the whole point of "management" was to manage scarcity, as in not enough money in the budget, not enough people to implement change, or not enough resources to accomplish a task. (Nora, I have no idea where you are today, so if you are reading this, send me a note).
Of course, the flip-side to this is that resources that are in abundance are generally taken for granted. Priorities are focused on what is most scarce. Let's examine some of the resources involved in an IT storage environment:
Capacity - while everyone complains that they are "running out of space", the truth is that most external disk attached to Linux, UNIX, or Windows systems contain only 20-40% data. Many years ago, I visited an insurance company to talk about a new product called IBM Tivoli Storage Manager. This company had 7TB of disk on their mainframe, and another 7TB of disk scattered on various UNIX and Windows machines. In the room were TWO storage admins for the mainframe, and 45 storage admins for the distributed systems. My first question was "why so many people for the mainframe, certainly one of you could manage all of it yourself, perhaps on Wednesday afternoons?" Their response was that they acted as each other's backup, in case one goes on vacation for two weeks. My follow-up question to the rest of the audience was: "When was the last time you took two weeks vacation?" Mainframes fill their disk and tape storage comfortably at over 80-90% full of data, primarily because they have a more mature, robust set of management software, like DFSMS.
Labor - by this I mean skilled labor able to manage storage for a corporation. Some companies I have visited keep their new-hires off production systems for the first two years, working only on test or development systems until then. Of course, labor is more expensive in some countries than others. Last year, I was doing a whiteboard session on-site for a client in China, and the last dry-erase pen ran out of ink. I asked for another pen, and they instead sent someone to go re-fill it. I asked wouldn't it be cheaper just to buy another pen, and they said "No, labor is cheap, but ink is expensive." Despite this, China does complain that there is a shortage of a skilled IT labor force, so if you are looking for a job, start learning Mandarin.
Power and Cooling - Most data centers are located on raised floors, with large trunks of electrical power and huge air conditioning systems to deal with all the heat generated from each machine. I have visited the data centers of clients that are forced now to make decisions on storage based on power and cooling consumption, because the costs to upgrade their aging buildings are too high. Leading the charge is IBM, with technology advancements in chips, cards, and complete systems that use less power, and generate less heat. While energy is still fairly cheap in the grand scheme of things, fears of Global Warming and declining oil supplies have put the costs of power and cooling in the news lately. In 1956, Hubbert predicted the US would reach peak oil supplies by 1965-1970 (it happened in 1971), and this year Simmons estimated that world-wide oil production already began its decline in 2005. Smart companies like Google have moved their server farms to places like Oregon in the Pacific Northwest for cheaper hydroelectric power.
Bandwidth - Last year IBM introduced 4Gbps Fibre Channel and FICON SAN networking gear, along with the servers and storage needed to complete the solution. 4Gbps equates to about 400 MB/sec in data throughput. By comparison, iSCSI is typically run on 1Gbps Ethernet, but has so much overhead that you only get about 80 MB/sec. Next year, we may see both 8 Gbps SAN, and 10 GbE iSCSI, to provide 800 MB/sec throughputs. My experience is that the SAN is not the bottleneck; instead, people run out of bandwidth at the server or storage end first. They may not have a million dollars to buy the fastest IBM System p5 servers, or may not have enough host adapters at the storage system end.
Floorspace - I end with floorspace because it reminds me that many "shortages" are temporary or artificially created. Floorspace is only in short supply because you don't want to knock down a wall, or build a new building, to handle your additional storage requirements. In 1997, Tihamer Toth-Fejel wrote an article for the National Space Society newsletter that estimated that "Everybody on Earth could live comfortably in the USA on only 15% of our land area, with a population density between that of Chicago and San Francisco. Using agricultural yields attained widely now, the rest of the U.S. would be sufficient to grow enough food for everyone. The rest of the planet, 93.7% of it, would be completely empty." Of course, back in 1997 the world population was only 5.9 billion, and this year it is over 6.5 billion.
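For those who like to see the arithmetic, the Capacity and Bandwidth figures in the list above can be worked through in a few lines of Python. This is just a sketch using the numbers quoted in this post; the function names are mine, and the "raw" link rate assumes 8 bits per byte with no protocol or encoding overhead at all.

```python
def data_stored_tb(raw_tb, utilization):
    """Terabytes of actual data on `raw_tb` of disk at a given fill ratio."""
    return raw_tb * utilization


def tb_per_admin(raw_tb, admins):
    """Raw terabytes each storage administrator looks after."""
    return raw_tb / admins


def implied_efficiency(observed_mb_per_sec, link_gbps):
    """Observed throughput as a fraction of the raw link rate.

    Raw rate assumes 1 Gbps = 1000 Mbit/sec and 8 bits per byte.
    """
    raw_mb_per_sec = link_gbps * 1000 / 8
    return observed_mb_per_sec / raw_mb_per_sec


# Capacity: the insurance company with 7TB on each side.
print("Distributed data: %.1f to %.1f TB" %
      (data_stored_tb(7, 0.20), data_stored_tb(7, 0.40)))
print("Mainframe data:   %.1f to %.1f TB" %
      (data_stored_tb(7, 0.80), data_stored_tb(7, 0.90)))
print("Mainframe:   %.2f TB per admin" % tb_per_admin(7, 2))
print("Distributed: %.2f TB per admin" % tb_per_admin(7, 45))

# Bandwidth: throughput quoted above versus the raw link rate.
for name, mb, gbps in [("4 Gbps FC", 400, 4), ("1 GbE iSCSI", 80, 1),
                       ("8 Gbps FC", 800, 8), ("10 GbE iSCSI", 800, 10)]:
    print("%-13s %4d MB/sec = %.0f%% of raw" %
          (name, mb, implied_efficiency(mb, gbps) * 100))
```

The Fibre Channel figures work out to 80% of the raw rate and the iSCSI figures to 64%, for both current and next-generation speeds, so the rules of thumb I quoted are at least consistent with each other.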
This last point brings me back to the concept of food, and I am not talking about doughnuts in the conference room, or pizza while making year-end storage upgrades. I'm talking about the food you work so hard to provide for yourself and your family. The folks at Oxfam came up with a simple analogy. If 20 people sit down at your table, representing the world's population:
3 would be served a gourmet, multi-course meal, while sitting at a decorated table on a cushioned chair.
5 would eat rice and beans with a fork and sit on a simple cushion
12 would wait in line to receive a small portion of rice that they would eat with their hands while sitting on the floor.
So for those of you planning a special meal next Monday, be thankful you are one of the lucky three, and hopeful that IBM will continue to lead the IT industry to help out the other seventeen.
Well, it's Tuesday again, and that means more IBM announcements!
Storage Area Network (SAN)
IBM and Cisco announced [three new blades] for the Cisco MDS 9500 series directors: 24-port 8 Gbps, 48-port 8 Gbps, and 4/44 blended. The 4/44 blended has 4 of the faster 8 Gbps ports, and 44 of the 4 Gbps ports, so that you can auto-negotiate down to 1 Gbps for your older gear, and still take advantage of the faster 8 Gbps speeds during the transition.
On the Brocade side, IBM announced the new IBM System Storage Data Center Fabric Manager [DCFM] V10 software. This replaces the products formerly known as Brocade Fabric Manager and McData Enterprise Fabric Connection Manager (EFCM). This software can support up to 24 distinct fabrics, up to 9000 ports, including a mix of FCP, FICON, FCIP and iSCSI protocols.
(On a related note, I heard that Microsoft is planning to rename "Windows Vista" to "Windows 7" next year! Like we say here in Tucson, if it ends in "-ista" it is going to fail in the marketplace! Perhaps EMC should rename their storage virtualization product to "In-7"?).
IBM System Storage DR550
IBM announced today that it now supports [RAID 6 on the DR550] compliance and retention storage system.
There are a few RAID-5 based EMC Centera customers out there who have not yet switched over to the IBM DR550, and now this might be just the little nudge they need. For long-term retention of regulatory compliance data, RAID-5 doesn't cut it; you need an advanced RAID scheme, such as RAID-6, RAID-DP or RAID-X.
The DR550 provides non-erasable, non-rewriteable (NENR) storage support to keep retention-managed data on disk and tape media. It supports 1TB SATA disk drives and 1TB tape cartridges to provide high capacity at low cost and "green" low energy consumption.
IBM System Storage N series
Several of our disk systems got improved and enhanced. Let's start with the IBM System Storage N series [hardware and software] enhancements. IBM now offers high-speed 450GB 15K RPM drives. These are Fibre Channel (FC) drives for the EXN4000 expansion drawers, and Serial Attached SCSI (SAS) drives for the entry-level N3300 and N3600 models.
The "gateway" models now support a variety of functions that were formerly only available on the appliance models. This includes Advanced Single Instance Storage (A-SIS), Disk Sanitization, and FlexScale.
A-SIS is IBM's "other" deduplication function, and I talked about this in my post [A-SIS Storage Savings Estimator Tool]. Disk Sanitization will physically write ones and zeros over existing data to eliminate it, what IBM sometimes calls "Data Shredding".
The last feature, FlexScale, might be new for many. It is software to enable use of the "Performance Accelerator Module" (PAM). The PAM is a PCI-Express card with 16GB of on-board RAM that acts as a secondary cache behind main memory of the N series controller. Depending on the model, you can have one to five of these cards fit into the controller itself, boosting random read performance, metadata access, and write block destage.
IBM System Storage DS5000
IBM's latest entry into the DS family has been hugely successful. In addition to Linux, Windows and AIX, the DS5000 now supports [Novell Netware and Sun Solaris] operating systems.
For infrastructure management, the Remote Support Manager [RSM] that supports DS3000 and DS4000 has been extended to support DS5000 as well. This software can monitor up to 50 disk systems, will e-mail alerts to IBM when something goes wrong, and allow IBM to dial in via modem to get more diagnostic information to improve service to the client. Also, the IBM System Storage Productivity Center [SSPC], which now supports the DS8000 and SAN Volume Controller (SVC), has been extended to also support the DS5000.
IBM XIV Storage System
In addition to 1-year and 3-year maintenance agreements, IBM now offers [2-year, 4-year and 5-year] software maintenance agreements.
RFID labels for IBM tape media
IBM 3589 (20-pack of LTO cartridges) and IBM 3599 (20-pack of 3592 cartridges for TS1100 series) now offer [RFID labels]. These labels match the volume serial (VOLSER) with a 216-bit unique identifier and 256 bits of user-defined content. This can help with tape inventory, and to prevent people from walking out of the building with a tape cartridge stuffed in their jacket.
32GB memory stick
While not technically part of the IBM System Storage matrix of offerings, Lenovo announced their new [Essential Memory Key] which holds 32GB of memory and works with both USB 1.1 and USB 2.0 protocols.
I wish I could say this is it for the IBM announcements for October, given that this is the last Tuesday of the month, but there are three days left, so there might be just a few more!
This post will focus on Information Compliance, the fourth and final part of the four-part series this week. I have received a few queries on my choice of sequence for this series: Availability, Security, Retention and Compliance.
Why not have them in alphabetical order? IBM avoids alphabetizing in one language, because then it may not be alphabetized when translated to other languages.
Why not have them in a sequence that spells out an easy to remember mnemonic, like "CARS"? Again, when translated to other languages, those mnemonics no longer work.
Instead, I worked with our marketing team for a more appropriate sequence, based on psychology and the cognitive bias of [primacy and recency effects].
Here's another short 2-minute video, on Information Compliance
Full disclosure: I am not a lawyer. The following will delve into areas related to government and industry regulations. Consult your risk officer or legal counsel to make sure any IT solution is appropriate for your country, your industry, or your specific situation.
IBM estimates there are over 20,000 regulations worldwide related to information storage and transmission.
For information availability, some industry regulations mandate a secondary copy a minimum distance away to protect against regional disasters like hurricanes or tsunamis. IBM offers Metro Mirror (up to 300km) and Global Mirror (unlimited distance) disk mirroring to support these requirements.
For information security, some regulations relate to privacy and prevention of unauthorized access. Two prominent ones in the United States are:
Health Insurance Portability and Accountability Act (HIPAA) of 1996
HIPAA regulates health care providers, health plans, and health care clearinghouses in how they handle the privacy of patients' medical records. These regulations apply whether the information is on film, paper, or stored electronically. Obviously, electronic medical records are easier to keep private. Here is an excerpt from an article from [WebMD]:
"There are very good ways to protect data electronically. Although it sounds scary, it makes data more protected than current paper records. For example, think about someone looking at your medical chart in the hospital. It has a record of all that is happening -- lab results, doctor consultations, nursing notes, orders, prescriptions, etc. Anybody who opens it for whatever reason can see all of this information. But if the chart is an electronic record, it's easy to limit access to any of that. So a physical therapist writing physical therapy notes can only see information related to physical therapy. There is an opportunity with electronic records to limit information to those who really need to see it. It could in many ways allow more privacy than current paper records."
Gramm-Leach-Bliley Act (GLBA) of 1999
GLBA regulates the handling of sensitive customer information by banks, securities firms, insurance companies, and other financial service providers. Financial companies use tape encryption to comply with GLBA when sending tapes from one firm to another. IBM was the first to deliver tape drive encryption with the TS1120, and then later with LTO-4 and TS1130 tape drives.
For information retention, there are a lot of regulations that deal with how information is stored, in some casesimmutable to protect against unethical tampering, and when it can be discarded. Two prominent regulations inthe United States are:
U.S. Securities and Exchange Commission (SEC) 17a-4 of 1997
In the past, the IT industry used the acronym "WORM" which stands for the "Write Once, Read Many" nature of certain media, like CDs, DVDs, optical and tape cartridges. Unfortunately, WORM does not apply to disk-based solutions, so IBM adopted the language from SEC 17a-4 that calls for storage that is "Non-Erasable, Non-Rewriteable" or NENR. This new umbrella term applies to disk-based solutions, as well as tape and optical WORM media.
SEC 17a-4 indicates that broker/dealers and exchange members must preserve all electronic communications relating to the business of their firm for a specific period of time. During this time, the information must not be erased or re-written.
Sarbanes-Oxley (SOX) Act of 2002
SOX was born in the wake of [Enron and other corporate scandals]. It protects the way that financial information is stored, maintained and presented to investors, as well as disciplines those who break its rules. It applies only to public companies, i.e. those that offer their securities (stock shares, bonds, liabilities) to be sold to the public through a listing on a U.S. exchange, such as NASDAQ or NYSE.
SOX focuses on preventing CEOs and other executives from tampering with the financial records. To meet compliance, companies are turning to the [IBM System Storage DR550] which provides Non-erasable, Non-rewriteable (NENR) storage for financial records. Unlike competitive products like EMC Centera that function mostly as space-heaters on the data center floor once they fill up, the DR550 can be configured as a blended disk-and-tape storage system, so that the most recent, and most likely to be accessed data, remains on disk, but the older, least likely to be accessed data, is moved automatically to less expensive, more environment-friendly "green" tape media.
Did SOX hurt the United States' competitiveness? Critics feared that these new regulations would discourage new companies from going public. Ernst & Young found these fears did not come true, and published a study [U.S. Record IPO Activity from 2006 Continues in 2007]. In fact, the improved confidence that SOX has given investors has given rise to similar legislation in other parts of the world: Euro-Sox for the European Union Investor Protection Act, and J-SOX Financial Instruments and Exchange Law for Japan.
For those who only read the first and last paragraphs of each post, here is my recap: Information Compliance is ensuring that information is protected against regional disasters, unauthorized access, and unethical tampering, as required to meet industry and government regulations. Such regulations often apply if the information is stored on traditional paper or film media, but can often be handled more cost-effectively when stored electronically. Appropriate IT governance can help maintain investor confidence.
In Monday's post, [IBM Information Infrastructure launches today], I explained how this strategic initiative fit into IBM's New Enterprise Data Center vision. The launch was presented at the IBM Storage and Storage Networking Symposium to over 400 attendees in Montpellier, France, with corresponding standing-room-only crowds in New York and Tokyo.
This post will focus on Information Retention, the third of the four-part series this week.
Here's another short 2-minute video, on Information Retention
Let's start with some interesting statistics. Fellow blogger Robin Harris on his StorageMojo blog has an interesting post: [Our changing file workloads], which discusses the findings of a study titled "Measurement and Analysis of Large-Scale Network File System Workloads" [14-page PDF]. This paper was a collaboration between researchers from University of California Santa Cruz and our friends at NetApp. Here's an excerpt from the study:
Compared to Previous Studies:
Both of our workloads are more write-oriented. Read to write byte ratios have significantly decreased.
Read-write access patterns have increased 30-fold relative to read-only and write-only access patterns.
Most bytes are transferred in longer sequential runs. These runs are an order of magnitude larger.
Most bytes transferred are from larger files. File sizes are up to an order of magnitude larger.
Files live an order of magnitude longer. Fewer than 50 percent are deleted within a day of creation.
Files are rarely re-opened. Over 66 percent are re-opened once and 95% fewer than five times.
Files re-opens are temporally related. Over 60 percent of re-opens occur within a minute of the first.
A small fraction of clients account for a large fraction of file activity. Fewer than 1 percent of clients account for 50 percent of file requests.
Files are infrequently shared by more than one client. Over 76 percent of files are never opened by more than one client.
File sharing is rarely concurrent and sharing is usually read-only. Only 5 percent of files opened by multiple clients are concurrent and 90 percent of sharing is read-only.
Most file types do not have a common access pattern.
Why are files being kept ten times longer than before? Because the information still has value:
Provide historical context
Gain insight to specific situations, market segment demographics, or trends in the greater marketplace
Help innovate new ideas for products and services
Make better, smarter decisions
National Public Radio (NPR) had an interesting piece the other day. By analyzing old photos, a researcher for Cold War Analysis was able to identify an interesting [pattern for Russian presidents]. (Be sure to listen to the 3-minute audio to hear a hilarious song about the results!)
Which brings me to my own collection of "old photos". I bought my first digital camera in the year 2000, and have taken over 15,000 pictures since then. Before that, I used a 35mm film camera, getting the negatives developed and prints made. Some of these date back to my years in High School and College. I have a mix of sizes, from 3x5, 4x6 and 5x7 inches, and sometimes I got double prints. Only a small portion are organized into scrapbooks. The rest are in envelopes, prints and negatives, in boxes taking up half of my linen closet in my house. Following the success of the [Library of Congress using flickr], I decided the best way to organize these was to have them digitized first. There are several ways to do this.
This method is just too time consuming. Lift the lid, place one or a few prints face down on the glass, close the lid, press the button, and then repeat. I estimate 70 percent of my photos are in [landscape orientation], and 30 percent in [portrait mode]. I can either spend extra time to orient each photo correctly on the glass, or rotate the digital image later.
I was pleased to learn that my Fujitsu ScanSnap S510 sheet-feed scanner can take in a short stack (a dozen or so) of photos, and generate JPEG format files for each. I can select 150, 300 or 600dpi, and five levels of JPEG compression. All the photos feed in portrait mode, which I can then rotate later on the computer once digitized. A command line tool called [ImageMagick] can help automate the rotations. While I highly recommend the ScanSnap scanner, this is still a time-consuming process for thousands of photos.
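For those curious how the rotation step might be scripted, here is a minimal sketch in Python. It only builds the ImageMagick command lines and does not run them. The function names and the file names are my own illustrations, and I am assuming the classic `mogrify` command is on the path (newer ImageMagick 7 installs may need `magick mogrify` instead). Note that `mogrify` overwrites files in place, so work on copies.

```python
def build_rotate_command(image_path, degrees=90):
    """Return the ImageMagick argument list that rotates one image in place."""
    return ["mogrify", "-rotate", str(degrees), str(image_path)]


def plan_rotations(image_paths, degrees=90):
    """Build one command per scanned file that needs to be turned."""
    return [build_rotate_command(path, degrees) for path in image_paths]


# Hypothetical file names for scans that came out sideways.
sideways = ["scan_0001.jpg", "scan_0002.jpg"]
commands = plan_rotations(sideways, degrees=90)

for cmd in commands:
    # To actually rotate, pass each list to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Keeping the command building separate from the execution makes it easy to review the list before touching thousands of files.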
"The best way to save your valuable photos may be by eliminating the paper altogether. Consider making digital images of all your photos."
Here's how it works: You ship your prints (or slides, or negatives) to their facility in Irvine, California. They have a huge machine that scans them all at 300dpi, no compression, and they send back your photos and a DVD containing digitized versions in JPEG format, all for only 50 US dollars plus shipping and handling, per thousand photos. I don't think I could even hire someone locally to run my scanner for that!
The deal got better when I contacted them. For people like me with accounts on Facebook, flickr, MySpace or Blogger, they will [scan your first 1000 photos for free] (plus shipping and handling). I selected a thousand 4x6" photos from my vast collection, organized them into eight stacks with rubber bands, and sent them off in a shoe box. The photos get scanned in landscape mode, so I had spent about four hours in preparing what I sent them, making sure they were all face up, with the top of the picture oriented either to the top or left edge. For the envelopes that had double prints, I "deduplicated" them so that only one set got scanned.
The box weighed seven pounds, and cost about 10 US dollars to send from Tucson to Irvine via UPS on Tuesday. They came back the following Monday, all my photos plus the DVD, for 20 US dollars shipping and handling. Each digital image is about 1.5MB in size, roughly 1800x1200 pixels, so they easily fit on a single DVD. The quality is the same as if I scanned them at 300dpi on my own scanner, and comparable to a 2-megapixel camera on most cell phones. Certainly not the high-res photos I take with my Canon PowerShot, but suitable enough for email or Web sites. So, for about 30 US dollars, I got my first batch of 1000 photos scanned.
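The numbers above are easy to check. This is a rough sketch of the arithmetic in Python, using only the figures from this post. The 4.7 GB figure for a single-layer DVD is the usual nominal capacity and is my own addition, as are the function and variable names.

```python
def scan_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a print scanned at the given dots per inch."""
    return round(width_in * dpi), round(height_in * dpi)


width_px, height_px = scan_pixels(6, 4, 300)     # a 4x6 print, landscape
megapixels = width_px * height_px / 1_000_000    # about 2.16

photos = 1000
batch_gb = photos * 1.5 / 1000                   # 1.5 MB each, in GB
dvd_gb = 4.7                                     # nominal single-layer DVD (assumption)

shipping_out = 10                                # US dollars, Tucson to Irvine
shipping_back = 20                               # US dollars, shipping and handling
total_cost = shipping_out + shipping_back

print(width_px, height_px, megapixels)
print(batch_gb, batch_gb < dvd_gb)
print(total_cost, total_cost / photos)
```

So the batch works out to about 3 cents per photo, and the whole thousand uses roughly a third of one DVD.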
ScanMyPhotos.com offers a variety of extra priced options, like rotating each file to the correct landscape or portrait orientation, color correction, exact sequence order, hosting them on their Web site online for 30 days to share with friends and family, and extra copies of the DVD. All of these represent a trade-off between having them do it for me for an additional fee, or me spending time doing it myself--either before in the preparation, or afterwards managing the digital files--so I can appreciate that.
Perhaps the weirdest option was to have your original box returned for an extra $9.95? If you don't have a huge collection of empty shoe boxes in your garage, you can buy a similarly sized cardboard box for only $3.49 at the local office supply store, so I don't understand this one. The box they return all your photos in can easily be used for the next batch.
I opted not to get any of these extras. The one option I think they should add would be to have them just discard the prints, and send back only the DVD itself. Or better yet, discard the prints, and email me an ISO file of the DVD that I can burn myself on my own computer. Why pay extra shipping to send back to me the entire box of prints, just so that I can dump the prints in the trash myself? I will keep the negatives, in case I ever need to re-print with high resolution.
Overall, I am thoroughly delighted with the service, and will now pursue sending the rest of my photos in for processing, and reclaim my linen closet for more important things. Now that I know that a thousand 4x6 prints weigh 7 pounds, I can estimate how many photos I have left to do, and decide which discount bulk option to choose.
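The estimate by weight can be sketched in a couple of lines of Python. This assumes every print weighs the same as a 4x6, which is not quite true for my 3x5 and 5x7 prints, and the 21-pound box below is a made-up example rather than something I have weighed.

```python
PRINTS_PER_POUND = 1000 / 7   # from the batch above: 1000 prints weighed 7 pounds


def estimate_prints(weight_pounds):
    """Rough count of 4x6 prints in a box of the given weight."""
    return round(weight_pounds * PRINTS_PER_POUND)


# Hypothetical example: a remaining box that weighs 21 pounds.
print(estimate_prints(21))
```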
With my photos digitized, I will be able to do all the things that IBM talks about with Information Retention:
Place them on an appropriate storage tier. I can keep them on disk, tape or optical media.
Easily move them from one storage tier to another. Copying digital files in bulk is straightforward, and as new technologies develop, I can refresh the bits onto new media, to avoid the "obsolescence of CDs and DVDs" as discussed in this article in [PC World].
Share them with friends and family, either through email, on my Tivo (yes, my Tivo is networked to my Mac and PC and has the option to do this!), or upload them to a photo-oriented service like [Kodak Gallery or flickr].
Keep multiple copies in separate locations. I could easily burn another copy of the DVD myself and store it in my safe deposit box or my desk at work. With all of the regional disasters like hurricanes, an alternative might be to back up all your files, including your digitized photos, with an online backup service like [IBM Information Protection Services] from last year's acquisition of Arsenal Digital.
If the prospect of preserving my high school and college memories for the next few decades seems extreme, consider that the [Long Now Foundation] is focused on retaining information for centuries. They are even suggesting that we start representing years with five digits, e.g., 02008, to handle the deca-millennium bug which will come into effect 8,000 years from now. IBM researchers are also working on [long-term preservation technologies and open standards] to help in this area.
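The five-digit year is just zero-padding, as this small Python sketch shows. The function name is my own and is not anything published by the Long Now Foundation.

```python
def five_digit_year(year):
    """Format a year with a leading zero so it sorts correctly past 9999."""
    return f"{year:05d}"


print(five_digit_year(2008))

# The rollover to five digits happens at year 10000.
years_until_rollover = 10000 - 2008
print(years_until_rollover)
```

The difference comes to 7,992 years, which I rounded to 8,000 above.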
For those who only read the first and last paragraphs of each post, here is my recap: Information Retention is about managing [information throughout its lifecycle], using policy-based automation to help with the placement, movement and expiration. An "active archive" of information serves to help gain insight, innovate, and make better decisions. Disk, tape, and blended disk-and-tape solutions can all play a part in a tiered information infrastructure for long-term retention of information.
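To make "policy-based automation" a bit more concrete, here is a toy sketch in Python of a placement policy based on age. It is not how any IBM product implements its policies. The function name, the 90-day disk threshold and the seven-year retention period are all assumptions chosen for illustration.

```python
def place(age_days, retention_days, disk_days=90):
    """Return where a record belongs: 'disk', 'tape', or 'expire'."""
    if age_days >= retention_days:
        return "expire"          # retention period is over, eligible for deletion
    if age_days < disk_days:
        return "disk"            # recent, most likely to be accessed
    return "tape"                # older, least likely to be accessed


SEVEN_YEARS = 7 * 365            # hypothetical retention period

for age in (10, 400, 3000):
    print(age, place(age, SEVEN_YEARS))
```

A real policy would look at more than age, such as access frequency or the regulation that applies, but the placement, movement and expiration decisions follow the same pattern.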
In Monday's post, [IBM Information Infrastructure launches today], I explained how this strategic initiative fit into IBM's New Enterprise Data Center vision. For you podcast fans, IBM Vice Presidents Bob Cancilla (Disk Systems), Craig Smelser (Storage and Security Software), and Mike Riegel (Information Protection Services), highlight some of the new products and offerings in this 12-minute recording:
This post will focus on Information Security, the second of the four-part series this week.
Here's another short 2-minute video, on Information Security
Security protects information against both internal and external threats.
For internal threats, most focus on whether person A has a "need-to-know" about information B. Most of the time, this is fairly straightforward. However, sometimes production data is copied to support test and development efforts. Here is the typical scenario: the storage admin copies production data that contains sensitive or personal information to a new copy and authorizes software engineers or testers full read/write access to this data. In some cases, the engineers or testers may be employees, other times they might be hired contractors from an outside firm. In any case, they may not be authorized to read this sensitive information. To solve this IBM announced the [IBM Optim Data Privacy Solution] for a variety of environments, including Siebel and SAP enterprise resource planning (ERP) applications.
I found this solution quite clever. The challenge is that production data is interrelated and typically lives inside [relational databases]. For example, one record in one database might have a name and serial number, and then that serial number is used to reference a corresponding record in another database. The IBM Optim Data Privacy Solution applies a range of "masks" to transform complex data elements such as credit card numbers, email addresses and national identifiers, while retaining their contextual meaning. The masked results are fictitious, but consistent and realistic, creating a “safe sandbox” for application testing. This method can mask data from multiple interrelated applications to create a “production-like” test environment that accurately reflects end-to-end business processes. The testers get data they can use to validate their changes, and the storage admins can rest assured they have not exposed anyone's sensitive information.
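To illustrate why "consistent" matters, here is a minimal sketch in Python of one way to mask digits deterministically. This is not the Optim algorithm, which I am not describing here. It swaps in a keyed hash (HMAC-SHA256 from the Python standard library), and the key, function name and sample records are all made up for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-production"   # hypothetical masking key


def mask_digits(value, key=SECRET_KEY):
    """Replace each digit with a keyed pseudorandom digit.

    Length and punctuation are kept, and the same input always yields
    the same output, so references between tables still line up.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    used = 0
    for ch in value:
        if ch.isdigit():
            masked.append(str(digest[used % len(digest)] % 10))
            used += 1
        else:
            masked.append(ch)
    return "".join(masked)


# Two hypothetical tables that reference each other by serial number.
customers = [{"name": "A. Customer", "serial": "123-45-6789"}]
orders = [{"serial": "123-45-6789", "card": "4111-1111-1111-1111"}]

masked_customers = [dict(c, serial=mask_digits(c["serial"])) for c in customers]
masked_orders = [
    dict(o, serial=mask_digits(o["serial"]), card=mask_digits(o["card"]))
    for o in orders
]

# The masked serial numbers still match across the two tables.
print(masked_customers[0]["serial"] == masked_orders[0]["serial"])
```

Because the mask is a function of the value and the key alone, each table can be masked separately and the test data still joins correctly.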
Beyond just who has the "need-to-know", we might also be concerned with who is "qualified-to-act". Most systems today have both authentication and authorization support. Authentication determines that you are who you say you are, through the knowledge of unique userid/password combinations, or other credentials. Fingerprint, retinal scans or other biometrics look great in spy movies, but they are not yet widely used. Instead, storage admins have to worry about dozens of different passwords on different systems. One of the many preview announcements made by Andy Monshaw on Monday's launch was that IBM is going to integrate the features of [Tivoli Access Manager for Enterprise Single Sign-On] into IBM's Productivity Center software, which will be renamed "IBM Tivoli Storage Productivity Center". You enter one userid/password, and you will not have to enter the individual userid/password of all the managed storage devices.
Once a storage admin is authenticated, they may or may not be authorized to read or act on certain information. Productivity Center offers role-based authorization, so that people can be identified by their roles (tape operator, storage administrator, DBA) and that would then determine what they are authorized to see, read, or act upon.
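The general idea of role-based authorization can be sketched in a few lines of Python. The role names come from the paragraph above, but the action names and the mapping are hypothetical and do not reflect how Productivity Center actually defines its roles.

```python
# Hypothetical mapping of roles to the actions they may perform.
ROLE_PERMISSIONS = {
    "tape operator": {"view_tape", "mount_tape"},
    "storage administrator": {"view_tape", "mount_tape", "view_disk", "provision_disk"},
    "DBA": {"view_disk"},
}


def is_authorized(role, action):
    """True if the role is allowed to perform the action; unknown roles get nothing."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_authorized("tape operator", "mount_tape"))
print(is_authorized("DBA", "provision_disk"))
```

The point is that permissions attach to the role rather than to the person, so moving someone to a new job means changing one role assignment instead of dozens of individual settings.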
For external threats, you need to protect data both in-flight and at-rest. In-flight deals with data that travels over a wire, or wirelessly through the air, from source to destination. When companies have multiple buildings, the transmissions can be encrypted at the source, and decrypted on arrival. The bigger threat is data at-rest. Hackers and cyber-thieves are looking to download specific content, like personally identifiable information, financial information, and other sensitive data.
IBM was the first to deliver an encrypting tape drive, the TS1120. The encryption process is handled right at the drive itself, eliminating the burden of encryption from the host processing cycles, and eliminating the need for specialized hardware sitting between server and storage system. Since then, we have delivered encryption on the LTO-4 and TS1130 drives as well.
When disk drives break or are decommissioned, the data on them may still be accessible. Customers have a tough decision to make when a disk drive module (DDM) stops working:
Send it back to the vendor or manufacturer to have it replaced, repaired or investigated, exposing potentially sensitive information.
Keep the broken drive, forfeit any refund or free replacement, and then physically destroy the drive. There are dozens of videos on [YouTube.com] on different ways to do this!
The launch previewed the [IBM partnership with LSI and Seagate] to deliver encryption technology for disk drives, known as "Full Drive Encryption" or FDE. Having all data encrypted on all drives, without impacting performance, eliminates having to decide which data gets encrypted and which doesn't. With data safely encrypted, companies can now send in their broken drives for problem determination and replacement. The more consistently you can apply a solution across everything, without human intervention and decision making, the less impact it will have. This was the driving motivation in both disk and tape drive encryption.
(Early in my IBM career, some lawyers decided we needed to add a standard 'paragraph' to our copyright text in the upper comment section of our software modules, and so we had a team meeting on this. The lawyer who presented told us that perhaps only 20 to 35 percent of the modules needed to be updated with this paragraph, and taught us what to look for to decide whether or not the module needed to be changed. My team argued how tedious this was going to be, that it would take time to open up each module, evaluate it, and make the decision. With thousands of modules involved the process could take weeks. The fact that this was going to take us weeks did not seem to concern our lawyer one bit, it was just the cost of doing business. Finally, I asked if it would be legal to just add the standard paragraph to ALL the modules without any analysis whatsoever. The lawyer was stunned. There was no harm adding this paragraph to all the modules, he said, but that would be 3-5x more work and why would I even suggest that. Our team laughed, recognizing immediately that it was the fastest way to get it done. One quick program updated all modules that afternoon.)
To manage these keys, IBM previewed the Tivoli Key Lifecycle Manager (TKLM). This software helps automate the management of encryption keys throughout their lifecycle to help ensure that encrypted data on storage devices cannot be compromised if lost or stolen. It will apply to both disk and tape encryption, so that one system will manage all of the encryption keys in your data center.
For those who only read the first and last paragraphs of each post, here is my recap: Information Security is intended as an end-to-end capability to protect against both internal and external threats, restricting access only to those who have a "need-to-know" or are "qualified-to-act". Security approaches like "single sign-on" and encryption that applies to all tapes and all disks in the data center greatly simplify the deployment.
It's official! My "blook" Inside System Storage - Volume I is now available.
This blog-based book, or “blook”, comprises the first twelve months of posts from this Inside System Storage blog, 165 posts in all, from September 1, 2006 to August 31, 2007. Foreword by Jennifer Jones. 404 pages.
IT storage and storage networking concepts
IBM strategy, hardware, software and services
Disk systems, Tape systems, and storage networking
Storage and infrastructure management software
Second Life, Facebook, and other Web 2.0 platforms
IBM’s many alliances, partners and competitors
How IT storage impacts society and industry
You can choose between hardcover (with dust jacket) or paperback versions:
This is not the first time I've been published. I have authored articles for storage industry magazines, written large sections of IBM publications and manuals, submitted presentations and whitepapers to conference proceedings, and even had a short story published with illustrations by the famous cartoon writer [Ted Rall].
But I can say this is my first blook, and as far as I can tell, the first blook from IBM's many bloggers on DeveloperWorks, and the first blook about the IT storage industry. I got the idea when I saw [Lulu Publishing] run a "blook" contest. The Lulu Blooker Prize is the world's first literary prize devoted to "blooks"--books based on blogs or other websites, including webcomics. The [Lulu Blooker Blog] lists past year winners. Lulu is one of the new innovative "print-on-demand" publishers. Rather than printing hundreds or thousands of books in advance, as other publishers require, Lulu doesn't print them until you order them.
I considered cute titles like A Year of Living Dangerously, or An Engineer in Marketing La-La land, or Around the World in 165 Posts, but settled on a title that matched closely the name of the blog.
In addition to my blog posts, I provide additional insights and behind-the-scenes commentary. If you go to the Lulu website above, you can preview an entire chapter before purchase. I have added a hefty 56-page Glossary of Acronyms and Terms (GOAT) with over 900 storage-related terms defined, which also doubles as an index back to the post (or posts) that use or further explain each term.
So who might be interested in this blook?
Business Partners and Sales Reps looking to give a nice gift to their best clients and colleagues
Managers looking to reward early-tenure employees and retain the best talent
IT specialists and technicians wanting a marketing perspective of the storage industry
Mentors interested in providing motivation and encouragement to their proteges
Educators looking to provide books for their classroom or library collection
Authors looking to write a blook themselves, to see how to format and structure a finished product
Marketing personnel that want to better understand Web 2.0, Second Life and social networking
Analysts and journalists looking to understand how storage impacts the IT industry, and society overall
College graduates and others interested in a career as a storage administrator
And yes, according to Lulu, if you order soon, you can have it by December 25.
IBM had some big announcements today. The theme for today's announcement was "Protected Information", as there are many reasons to protect your most strategic asset, your information. Let's do a quick run-down of a few of them.
IBM LTO generation 4
LTO 4 provides encryption at the drive level, and supports WORM cartridges similar to LTO 3. It continues the LTO consortium's strategy for higher capacity and faster performance. If you have LTO 1 or LTO 2, now is a good time to consider upgrading your tape technology. The combination of encryption and WORM protects your information against unauthorized access, and unethical tampering of the data. The support ranges from our largest automated tape library (TS3500) to our smallest drives.
TS7520 Virtualization Engine
The TS7520 replaces the TS7510, providing enhanced Virtual Tape Library (VTL) capability. When you hear "storage virtualization" you often think disk, but IBM invented "tape storage virtualization" and this product continues that leadership.
Support for Half-high LTO 3 drives
The TS3100 and TS3200 now support half-high LTO 3 drives, which means you can have twice the number of drives in each unit. LTO 4 drives can read and write to LTO 3 media, so this provides additional investment protection.
IBM System Storage DR550 File System Gateway
This new offering provides much-needed CIFS and NFS access to the DR550, the world's most flexible compliance-and-retention storage available. Already there is a large body of ISVs that support the DR550 today, and with this new gateway, the list is even longer. The DR550 provides encryption for both disk and tape data, as well as policy-based non-erasable, non-rewriteable enforcement, designed for compliance with government regulations like Sarbanes-Oxley Act, HIPAA, and many others.
IBM System Storage SAN32B-3 switch
This is the first major deliverable from Brocade since their acquisition of McDATA. A powerful switch packs 4 Gbps support in a small 1U form factor. Start with 16 ports, then add in increments of 8 ports to a maximum of 32 ports.
I've provided all the links, so that you can delve deeply into all the data sheets.
Wrapping up my week on the Feb 12 announcements, I will finish off talking about the new Half-High (HH) LTO4 drives available for our TS3100 and TS3200 tape libraries.
Small and medium sized business (SMB) clients are looking for small, affordable tape systems. Tape is inherently green, using orders of magnitude less energy than disk, and is very scalable by simply purchasing more tape cartridges.
When IBM first announced them, the TS3100 supported one drive with 24 cartridges, and the TS3200 (see picture at left) supported two drives and 48 cartridges. Unlike disk, where vendors quote RAW capacity and then lower it to indicate usable capacity in RAID configurations, tape is just the opposite. LTO4 cartridges have 800 GB raw capacity, but with an average of 2:1 compression, can hold a usable 1.6 TB of data. LTO4 also supports WORM cartridges for non-erasable, non-rewriteable (NENR) types of data, and encryption capability.
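The capacity arithmetic can be worked through in a short Python sketch. The cartridge counts and the 800 GB figure come from the paragraph above. The function name is my own, and the 2:1 ratio is only an average, so real results depend on how compressible your data is.

```python
def usable_tb(cartridges, raw_gb=800, compression_ratio=2.0):
    """Usable capacity in TB for a library full of LTO4 cartridges."""
    return cartridges * raw_gb * compression_ratio / 1000


per_cartridge = usable_tb(1)    # one LTO4 cartridge at 2:1
ts3100 = usable_tb(24)          # 24 cartridge slots
ts3200 = usable_tb(48)          # 48 cartridge slots

print(per_cartridge, ts3100, ts3200)
```

At the assumed 2:1 ratio, that is roughly 38.4 TB for a full TS3100 and 76.8 TB for a full TS3200.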
As a follow-on to our HH LTO3 drives, IBM is the first major storage vendor to offer the new HH LTO4 drives in entry-level automation, which directly attach via 3Gbps SAS connections to your host servers. The HH models allow you to have two drives in the TS3100, and four drives in the TS3200.
You can mix and match, LTO3 and LTO4. Why would anyone do that? Well, the Linear Tape Open [LTO] consortium --made up of technology provider companies IBM, HP and Quantum--decided to support N-2 generation read, and N-1 generation read/write. So, an LTO3 can read LTO1 cartridges, and read/write LTO2 and LTO3 cartridges. The LTO4 can read LTO2 cartridges, and read/write LTO3 and LTO4 cartridges. For SMB customers that still have some LTO1 cartridges they might want to read some day, mixing LTO3 and LTO4 is a viable combination.
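The compatibility rule as stated above (read back two generations, read/write back one) can be written as a small Python function. The function name and return strings are my own, and the sketch covers only the rule described in this post for LTO1 through LTO4.

```python
def lto_access(drive_gen, cartridge_gen):
    """What a drive of one LTO generation can do with a cartridge of another."""
    gap = drive_gen - cartridge_gen
    if cartridge_gen < 1 or gap < 0:
        return "none"            # a drive cannot use newer cartridges
    if gap <= 1:
        return "read/write"      # same generation or one generation back (N-1)
    if gap == 2:
        return "read-only"       # two generations back (N-2)
    return "none"


for drive in (3, 4):
    for cartridge in (1, 2, 3, 4):
        print(f"LTO{drive} drive, LTO{cartridge} cartridge:", lto_access(drive, cartridge))
```

The table it prints shows why keeping one LTO3 drive next to an LTO4 drive covers LTO1 through LTO4 cartridges between them.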
Of course, IBM still offers full-high (FH) versions of LTO3 and LTO4, which offer a bit faster acceleration, back-hitch and rewind times than their HH counterparts, and also offer additional attachment choices of LVD Ultra160 SCSI and 4 Gbps Fibre Channel as well.
So, for SMB customers that are simply using their tape for backup and archive, and probably not driving maximum rated speeds, having twice as many slower drives might be just the right fit.
We had a great event today! This was a first-of-a-kind product launch, using Second Life as the medium. We invited IBM Business Partners, industry analysts and reporters from the Press to have their "avatars" in-world to watch us launch new tape systems, archive and retention systems, and disk systems announced this month.
Andy Monshaw, IBM System Storage General Manager, welcomed everyone to the event, and introduced our three speakers. He mentioned that this was a great innovative way to meet, collaborate and forge relationships without the carbon pollution associated with travel required by a more traditional face-to-face meeting. We had attendees from the USA, UK, Germany, Sweden, Italy, Colombia, and Brazil.
All the attendees were given a "goody bag" that contained IBM BP-logo clothing, animations and gestures to be used during the meeting.
Eric Buckley, one of our marketing managers for tape systems, introduced our complete line of LTO 4 tape systems, as well as the TS7520 Virtualization Engine, a virtual tape library for Windows, UNIX and Linux servers. Eric had a virtual 3-D version of an LTO cartridge that is photo-realistic and dimensionally correct.
Funda Eceral, our solutions manager for archive and retention offerings, presented the new version of the IBM System Storage DR550, the DR550 file system gateway, and the IBM System Storage Multilevel Grid Archive Manager. At first we thought we would "pass the microphone" from speaker to speaker, but it turned out to be easier just to give all three speakers their own microphone.
Last, but not least, was David Tareen, marketing manager for disk systems, covering the entry-level DS3000 Express disk system bundles designed for our SMB clients. David used a black-and-brown pointer stick to point out specific things on the charts.
After the presentations, Kristie Bell, VP of Marketing for IBM System Storage, hosted a Question & Answer (Q&A) panel. Avatars raised their left hand to indicate they had a question.
We thought it would be a good idea to have a few minutes at the end to socialize over a cup of coffee. This involved making a "coffee machine" that dispensed coffee, and the appropriate animations and gestures so that everyone could sip the coffee, and hold the coffee at waist level when they were talking.
The event was held upstairs in one of the conference rooms of the IBM Briefing Center, located on "IBM 8" island.Many people went to the ground floor to look at the many IBM System Storage products on display. Unlike a picture on a web-page, Second Life gives you a 3-D view that you can walk around each product, and get a feel for the size and shape of the hardware.
We had four photographers and camera-persons on hand to capture still shots, video, audio, and chat text, and are working now to combine them for marketing collateral. I want to thank the builders, script programmers, animators, clothing designers, speakers, editors, and channel enablement team for making this event such a great success!
Continuing my summary of Pulse 2008, the premier service management conference focusing on IBM Tivoli solutions, I attended and presented breakout sessions on Monday afternoon.
Tivoli Storage "State-of-the-Subgroup" update
Kelly Beavers, IBM director of Tivoli Storage, presented the first breakout for all of the Tivoli Storage subgroup.Tivoli has several subgroups, but Tivoli Storage leads with revenuesand profits over all the others.Tivoli storage has top performing business partner channel of anysubgroup in IBM's Software Group division.IBM is world's #1 provider of storage vendor (hardware, softwareand services), so this came to no surprise to most of the audience.
Looking at just the Storage Software segment, it is estimatedthat customers will spend $3.5 billion US dollars more in the year 2011 than they did last year in 2007. IBM is #2 or #3 in eachof the four major categories: Data Protection, Replication, Infrastructure management, and Resource management. In eachcategory, IBM is growing market share, often taking away share fromthe established leaders.
There was a lot of excitement over the FilesX acquisition. I am still trying to learn more about this, but what I have gathered so far is that it can:
Adjust the level of backup protection, like turning a "knob", from traditional discrete scheduled backups, to more frequent snapshots, to continuous data protection (CDP). In the past, you often needed separate products or features to do these three.
Perform "instantaneous restore" by performing a virtual mount of the backup copy. This gives the appearance that the restore is complete.
This year marks the 15th anniversary of IBM Tivoli Storage Manager (TSM), with over 20,000 customers. Also, this year marks the 6th year for IBM SAN Volume Controller, having sold over 12,000 SVC engines to over 4,000 customers.
Data Protection Strategies
Greg Tevis, IBM software architect for Tivoli Technical Strategy, and I presented this overview of data protection. We covered three key areas:
Protecting against unethical tampering with Non-erasable, Non-rewriteable (NENR) storage solutions
Protecting against unauthorized access with encryption on disk and tape
Protecting against unexpected loss or corruption with the seven "Business Continuity" tiers
There was so much interest in the first two topics that we only had about 9 minutes left to cover the third! Fortunately, Business Continuity will be covered in more detail throughout the week.
Henk de Ruiter from ABN Amro bank presented his success story implementing Information Lifecycle Management (ILM) across his various data centers using IBM systems, software and services.
Making your Disk Systems more Efficient and Flexible
I did not come up with the titles of these presentations. The team that did specifically chose to focus on the "business value" rather than the "products and services" being presented. In this session, Dave Merbach, IBM software architect, and I presented how SAN Volume Controller (SVC), TotalStorage Productivity Center, System Storage Productivity Center, Tivoli Provisioning Manager and Tivoli Storage Process Manager work to make your disk storage more efficient and flexible.
I attended the main tent sessions on Day 2 (Monday). The focus was on Visibility, Control and Automation.
Steve is IBM senior VP and Group Executive of the IBM Software Group, and presented some insightful statistics from the IBM Global Technology Outlook study, some recent IBM wins, and other nuggets of IT trivia:
In 2001, there were about 60 million transistors per human being. By 2010, this is estimated to increase to one billion per human
In 2005, there were about 1.3 billion RFID tags; by 2010 this is estimated to grow to over 30 billion
IBM helped the City of Stockholm, Sweden, reduce traffic congestion 20-25% using computer technology
Only about 25% of data is original; the remaining 75% is replicated
In 2007, there were approximately 281 Exabytes (EB), expected to increase to 1800 EB by the year 2011
70 percent of unstructured data is user-created content, but 85 percent of this will be managed by enterprises
Only 20% of data is subject to compliance rules and standards, and about 30% subject to security applications
Human error is the primary reason for breaches, with 34% of organizations experiencing a major breach in 2006
10% of IT budget is energy costs (power and cooling), and this could rise to 50% in the next decade
30 to 60 percent of energy is wasted. During the next 5 years, people will spend as much on energy as they will on new hardware purchases.
Al Zollar is the General Manager of IBM Tivoli. He discussed the 20 or so recent software acquisitions, including Encentuate and FilesX earlier this year.
"The time has come to fully industrialize operations" -- Al Zollar
What did Al mean by "industrialize"? This is the closed-loop approach of continuous improvement, including design, delivery and management.
Al used several examples from other industries:
Henry Ford used standardized parts and process automation. Assembly of an automobile went from 12 hours by master craftsmen, to delivering a new Model T every 23 seconds off an assembly line.
Power generation was developed by Thomas Edison. A satellite picture showed the extent of the [Blackout of 2003 in Northeast US and Canada]. The time for "smart grid" has arrived, making sensors and meters more intelligent. This allows non-essential IP-enabled appliances in our home or office to be turned off to reduce energy consumption.
[McCarran International Airport] integrated the management of 13,000 assets with IBM Tivoli Maximo Enterprise Asset Management (EAM) software, and was able to increase revenues through more accurate charge-back. Unlike traditional Enterprise Resource Planning (ERP) applications, EAM offers deep management of four areas: production equipment, facilities, transportation, and IT.
When compared to these other industries, management of IT is in its infancy. The expansion of [Web 2.0] and Service-Oriented Architecture [SOA] is driving this need. What people need is a "new enterprise data center" that IBM Tivoli software can help you manage across operational boundaries. IBM can integrate through open standards with management software from Cisco, Sun, Oracle, Microsoft, CA, HP, BMC Software, Alcatel Lucent, and SAP. Together with our ecosystems of technology partners, IBM is meeting these challenges.
IBM clients have achieved return on investment from getting better control of their environment. This week there are client experience presentations from Sandia National Labs, Spirit AeroSystems, Bank of America, and BT Converged communication services.
Chris O'Connor used some of his staff as "actors" to show an incredible live demo of various Tivoli and Maximo products for the mythical launch of "Project Vitalize", the new online web store for a new "Aero Z bike" from the mythical VCA Bike and Motorcycle company.
Shoel Perelman played the role of "CIO". The CIO locked down all spending, and asked the IT staff to make the shift from bricks-and-mortar to web sales of this new product in 15 months. While the company and situation were mythical, all the products that were part of the live demo are readily available. The CIO had three goals:
What do we have? Where is it? What's connected to what? Traditionally, these would be answered from lists in spreadsheets. The CIO had a goal to deploy IBM Tivoli Application Dependency Discovery Manager (TADDM), which discovered all hardware and software and presented, in an easy-to-understand view, how each piece serves the business applications.
Each of the teams has processes, and needed them to be consistent, repeatable, and tightly linked together. Time is often wasted on the phone coordinating IT changes. For this, the CIO had a goal to deploy Tivoli Change and Configuration Management Database (CCMDB) for "strict change control". The process dashboard is accessible to all teams, to see how all projects are progressing. There is also a Compliance dashboard, which identifies all changes by role, clearly spelling out who can do what.
There is a lot of computerized machinery, manufacturing assets and robotics. The CIO set a goal to "do more with existing people", and needed to automate key processes. A sales rep wanted to add a new distributor to a key web portal. This was all done through their "service catalog". When they needed to deploy a new application, they were able to find servers with available capacity and adjust using automatic provisioning. Thanks to IBM, the IT staff no longer get paged at 3am, and fewer days are spent in the "war room". They now have confidence that the launch will be successful.
Ritika Gunnar played the role of "Operations manager". She highlighted five areas:
"Service viewer" dashboard with green/yellow/red indicators for all of their edge, application and database servers. This allows her to get data 4-5 times faster and more accurately.
Tivoli Enterprise Portal eliminates bouncing around various products.
Tivoli Common Reporting for CPU utilization of all systems helps find excess capacity using IBM Tivoli Monitoring
On average, 85 percent of problems are caused by IT changes to the environment. IBM can help find dependencies, so that changes in one area do not impact other areas unexpectedly
Process Automation will show changes that have been completed, are in progress, or are overdue. She can see all steps in a task or change request. A "workflow" automates all the key steps that need to be taken.
Laura Knapp played the role of "Facilities manager". She wanted to see all processes that apply to her work using a role-based process dashboard. The advantage of using IBM is that it changes work habits, reduces overtime by 42 percent, and improves morale. The IT staff now works as a team, collaborates more, and gets jobs done faster with fewer mistakes. Employees are online, accessing, monitoring and managing data quicker, in days not weeks.
IBM Tivoli Enterprise Portal (TEP) served as a common vehicle. She was able to pull up the floor plan online, displaying all of the managed assets and mapped features. With the temperature overlay from Maximo Spatial, she was able to review hot spots on the data center floor. Heat can cause servers to fail or shut down.
Power utilization chart at peak loads. They can now anticipate, predict and watch power consumption, and were able to justify replacement with newer, more energy-efficient equipment.
The CIO got back on stage, and explained the great success of the launch. They use Webstore usage tracking, security tools tracking all new registrations, and tracking of server and storage load. It now only takes hours, not weeks, to add new business partners and distributors. Tivoli Service Quality Assurance tools track all orders placed, processed, and shipped. Faster responsiveness is a competitive advantage. Their IT department is no longer seen as a stodgy group, but as a world-class organization.
The live demo showed how IBM can help clients with rapid decision making, speed and accuracy of change processes, and automation to take actions quickly. The result is a strong return on investment (ROI).
Liz Smith, IBM General Manager of Infrastructure Services, presented the results of an IBM survey of CEOs and CIOs asking questions like: What is the next big impact? Where are you investing? What will the new datacenter look like?
The five key traits they found for companies of the future:
They were hungry for change
Innovative beyond customer imagination
Disruptive by nature
Genuine, not just generous
The IT infrastructure must be secure, reliable, and flexible. Taking care of the environment is a corporate responsibility, not just a way to reduce costs.
The five entry points for IBM Service Management: Integrate, Industrialize, Discover, Monitor and Protect. IBM Service Management and compliance are critical for the Globally Integrated Enterprise, with repeatable, scalable and consistent processes that enable change to an automated workflow. This reduces errors, risks and costs, and improves productivity. IBM has the talent, assets and experience to help any client get there.
Lance lives in Austin, TX, where IBM Tivoli is headquartered, so this made him a good choice as a keynote speaker. He is best known for winning seven "Tour de France" bicycle races in a row, but instead he gave an inspirational talk about how he survived cancer.
In 1996, Lance was diagnosed with cancer. Surprisingly, he said it was the greatest thing that happened to him, and gave him a new perspective on his life, family and the sport of bicycling. Back then, there wasn't a WebMD, Google or other Web 2.0 social networking sites for Lance to better understand what he was going through, learn more about treatment options, or find others going through the same ordeal.
After his treatment, he was considered "damaged goods" by many of the leading European bicycle teams. So, he joined the US Postal Service team, not known for their wins, but often invited to sell TV rights to American audiences. Collaborating with his coaches and other members of his team, he revolutionized the bicycling sport, analyzed everything about the race, and built up morale. He won the first "yellow jersey" in 1999, and did so each year for a total of seven wins.
Lance formed the [Livestrong foundation] to help other cancer survivors. Nike came to him and proposed donating 5 million "rubber bracelets" colored yellow to match his seven yellow jerseys, with the name "Livestrong" embossed on them, that his foundation could then sell for one dollar apiece to raise funds. What some thought was a silly idea at first has started a movement. At the 2004 Olympics, many athletes from all nations and religious backgrounds wore these yellow bracelets to show solidarity with this cause. To date, the foundation has sold over 72 million yellow bracelets, and these have served to provide a symbol, a brand, a color identity, to his cause.
He explained that doctors have a standard speech to cancer survivors. As a patient, you can go out this doorway and never tell anyone, keeping the situation private. Or you can go out this other doorway, and tell everybody your story. Lance chose the latter, and he felt it was the best decision he ever made. He wrote a book titled [It's Not About the Bike: My Journey Back to Life].
His call to action for the audience: find out what you can do to make a difference. A million non-governmental organizations [NGO] have started in the past 10 years. Don't just give cash, also give your time and passion.
Registration for next week's September 20 "Meet the Storage Experts" event in Second Life will close this week. All IBMers, clients and IBM Business Partners are welcome to attend. We will focus this time on DS3000 and N series disk systems, tape systems, and IBM storage networking gear.
If you miss this one, we plan to have another one in November!
Last week, I opined that Monday's IDC announcement "IBM #1 in combined disk and tape storage hardware sales for 2006" was in part because of a resurgence of interest in tape, with four specific examples. There was a lot of reaction and reflection from both sides.
On the one side...
EMC blogger Mark Twomey at Storagezilla admits that perhaps Tape Isn't Dead after all, and may be the best place to put long-term archive data, though not backup. EMC's "creative marketing types" put out this Fun With Tape video that I found amusing. (It asks for a first name, last name, and e-mail address, which are then embedded into the resulting video itself, and perhaps forwarded to your nearest EMC sales rep, so answer according to your wishes for privacy).
The "mummy wrapped in tape media" seems to be a common theme, and shows up again in LiveVault's video with John Cleese, which makes the same argument as the EMC video above, namely: switch your backups from tape to disk because we are a disk-only vendor.
... and on the other side
JWT over at DrunkenData asks Which is greener, disk or tape? Tape is, of course, by a long shot, and an essential part of IBM's Big Green initiative, a project to invest US$1 billion per year for data centers to be more efficient for power and cooling.
Sun/StorageTek blogger Randy Chalfant questions the Death of Tape, and argues that disk-only solutions suffer from atrophy. The results he posts from a survey of 200 customers are similar to those we've seen with customers using IBM TotalStorage Productivity Center, our software to help evaluate data usage, and identify misuse, in your data center.
To my readers in the USA, United Kingdom, Ireland, South Africa, China and Japan, and a few other countries, Happy Father's Day!
Back in the late 1980's and early 1990's, I was one of the architects for DFSMS on z/OS, and customers always asked, "What is the clip level?", in other words, how big does a customer have to be to take advantage of DFSMS. We worked it out that if you had more than 100GB of disk data, DFSMS is worthwhile. DFSMS is now just standard by default, as everyone now easily has more than 100GB of data.
Later, in the late 1990's, I worked on Linux for System z. Again, customers asked how many Linux guest images would justify deploying applications on a mainframe. We worked it out to about 10 images. 10 Linux logical partitions, or Linux guests under z/VM was enough to cost justify the entire investment.
So what is the "clip level" for SANs? How many servers does an SMB need to have to justify deploying a SAN? IBM announced the new BladeCenter S designed specifically for mid-sized companies, 100 to 1000 employees, typically running 25 to 45 servers. However, I suspect companies as small as 7-10 servers would probably benefit from deploying an FC or IP SAN.
What do you think? Send me a comment on how many servers should be the clip level.
Based on this success, and perhaps because I am also fluent in Spanish, I was asked to help with Proyecto Ceibal, the team for OLPC Uruguay. Normally the XS school server resides at the school location itself, so that even if the internet connection is disrupted or limited, the school kids can continue to access each other and the web cache content until the internet connection is resumed. However, with a diverse development team with people in the United States, Uruguay, and India, we first looked to Linux hosting providers that would agree to provide free or low-cost monthly access. We spent (make that "wasted") the month of May investigating. Most that I talked to were not interested in having a customized Linux kernel on non-standard hardware on their shop floor, and wanted instead to offer their own standard Linux build on existing standard servers, managed by their own system administrators, or were not interested in providing it for free. Since the XS-163 kernel is customized for the x86 architecture, it is one of those exceptions where we could not host it on an IBM POWER or mainframe as a virtual guest.
This got picked up as an [idea] for Google's [Summer of Code] and we are mentoring Tarun, a 19-year-old student, to act as lead software developer. However, summer was fast approaching, and we wanted this ready for the next semester. In June, our project leader, Greg, came up with a new plan: build a machine and have it connected at an internet service provider that would cover the cost of bandwidth, and be willing to accept this with remote administration. We found a volunteer organization to cover this -- Thank you Glen and Vicki!
We found a location, so the request to me sounded simple enough: put together a PC from commodity parts that meet the requirements of the customized Linux kernel, the latest release being called [XS-163]. The server would have two disk drives, three Ethernet ports, and 2GB of memory; and be installed with the customized XS-163 software, SSHD for remote administration, Apache web server, PostgreSQL database and PHP programming language. Of course, the team wanted this for as little cost as possible, and for me to document the process, so that it could be repeated elsewhere. Some stretch goals included having a dual-boot with Debian 4.0 Etch Linux for development/test purposes, an alternative database such as MySQL for testing, a backup procedure, and a Recover-DVD in case something goes wrong.
Some interesting things happened:
The XS-163 is shipped as an ISO file representing a LiveCD bootable Linux that will wipe your system clean and lay down the exact customized software for a one-drive, three-Ethernet-port server. Since it is based on Red Hat's Fedora 7 Linux base, I found it helpful to install that instead, and experiment with moving sections of code over. This is similar to geneticists extracting the DNA from the cell of a pit bull and putting it into the cell of a poodle. I would not recommend this for anyone not familiar with Linux.
I also experimented with modifying the pre-built XS-163 CD image by cracking open the squashfs, hacking the contents, and then putting it back together and burning a new CD. This provided some interesting insight, but in the end I was able to do it all from the standard XS-163 image.
Once I figured out the appropriate "scaffolding" required, I managed to proceed quickly, with running versions of XS-163, plain vanilla Fedora 7, and Debian 4, in a multi-boot configuration.
The BIOS "raid" capability was really more like BIOS-assisted RAID for Windows operating system drivers. This "fake raid" wasn't supported by Linux, so I used Linux's built-in "software raid" instead, which allowed some partitions to be raid-mirrored, and other partitions to be un-mirrored. Why not mirror everything? With two 160GB SATA drives, you have three choices:
No RAID, for a total space of 320GB
RAID everything, for a total space of 160GB
Tiered information infrastructure, use RAID for some partitions, but not all.
The last approach made sense, as a lot of the data is cached web page images, and is easily retrievable from the internet. This also allowed me to have some "scratch space" for downloading large files and so on. For example, 90GB mirrored that contains the OS images, settings and critical applications, plus 70GB on each drive for scratch and web cache, results in a total of 230GB of disk space, which is a 43 percent improvement over an all-RAID solution.
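The capacity arithmetic behind these three choices can be checked with a few lines of Python. This is only a sketch: the helper name `usable_gb` is made up for illustration, and it assumes a simple two-drive mirror with no allowance for partition or filesystem overhead.

```python
def usable_gb(drive_gb: int, mirrored_gb: int, drives: int = 2) -> int:
    """Usable space when `mirrored_gb` of every drive is RAID-1 mirrored
    and the remainder of each drive is left as independent scratch space."""
    if not 0 <= mirrored_gb <= drive_gb:
        raise ValueError("mirrored_gb must be between 0 and drive_gb")
    scratch_per_drive = drive_gb - mirrored_gb
    # A mirror holds one copy's worth of data; scratch space adds up across drives.
    return mirrored_gb + drives * scratch_per_drive


no_raid = usable_gb(160, 0)      # nothing mirrored
all_raid = usable_gb(160, 160)   # everything mirrored
tiered = usable_gb(160, 90)      # 90GB mirrored, 70GB scratch per drive
gain_pct = 100 * (tiered - all_raid) / all_raid

print(no_raid, all_raid, tiered, int(gain_pct))  # 320 160 230 43
```

Changing the second argument shows how the trade-off moves: every gigabyte taken out of the mirror adds one gigabyte of usable (but unprotected) space.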
While [Linux LVM2] provides software-based "storage virtualization" similar to the hardware-based IBM System Storage SAN Volume Controller (SVC), it was a bad idea putting different "root" directories of my many OS images on there. Linux, as with most operating systems, expects things to be in the same place where it last shut down, but in a multi-boot environment, you might boot the first OS, move things around, and then when you try to boot the second OS, it doesn't work anymore, or corrupts what it does find, or hangs with a "kernel panic". In the end, I decided to use RAID non-LVM partitions for the root directories, and only use LVM2 for data that is not needed at boot time.
While they are both Linux, Debian and Fedora were different enough to cause me headaches. Settings were different, parameters were different, file directories were different. Not quite as religious as MacOS-versus-Windows, but you get the picture.
During this time, the facility was out getting a domain name, IP address, subnet mask and so on, so I tested with my internal 192.168.x.y and figured I would change this to whatever it should be the day I shipped the unit. (I'll find out next week if that was the right approach!)
Afraid that something might go wrong while I am in Tokyo, Japan next week (July 7-11), or Mumbai, India the following week (July 14-18), I added a Secure Shell [SSH] daemon that runs automatically at boot time. This involves putting each remote admin's public key on the server, while each admin keeps their own private key on their own client machine. I know all about public/private key pairs, as IBM is a leader in encryption technology, and was the first to deliver built-in encryption with the IBM System Storage TS1120 tape drive.
To have users have access to all their files from any OS image required that I either (a) have identical copies everywhere, or (b) have a shared partition. The latter turned out to be the best choice, with an LVM2 logical volume for the "/home" directory that is shared among all of the OS images. As we develop the application, we might find other directories that make sense to share as well.
For developing across platforms, I wanted the Ethernet devices (eth0, eth1, and so on) to match the actual ports they are supposed to be connected to in a static IP configuration. Most people use DHCP so it doesn't matter, but the XS software requires this, so it did. For example, "eth0" as the 1 Gbps port to the WAN, and "eth1/eth2" as the two 10/100 Mbps PCI NIC cards to other servers. Naming the internet interfaces to specific hardware ports was different on Fedora and Debian, but I got it working.
While it was a stretch goal to develop a backup method, one that could perform Bare Machine Recovery from media burned by the DVD drive, it turned out I needed to do this anyway just to prevent me from losing my work in case things went wrong. I used an external USB drive to develop the process, and got everything to fit onto a single 4GB DVD. Using IBM Tivoli Storage Manager (TSM) for this seemed overkill, and [Mondo Rescue] didn't handle LVM2+RAID as well as I wanted, so I chose [partimage] instead, which backs up each primary partition, mirrored partition, or LVM2 logical volume, keeping all the time stamps, ownerships, and symbolic links intact. It has the ability to chop up the output into fixed-size pieces, which is helpful if you are going to burn them on 700MB CDs or 4.7GB DVDs. In my case, my FAT32-formatted external USB disk drive can't handle files bigger than 2GB, so this feature was helpful for that as well. I standardized on 660 MiB [about 692MB] per piece, since that met all criteria.
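As a quick sanity check on that piece size, here is a small Python sketch. It assumes the piece size is 660 MiB (the "692" figure matches 660 × 1,048,576 bytes), and the byte values used for the CD, DVD and USB limits are my own rough assumptions rather than measured figures; the helper names are made up for illustration.

```python
import math

MIB = 1024 ** 2   # binary mebibyte
GB = 1000 ** 3    # decimal gigabyte

PIECE_BYTES = 660 * MIB  # 692,060,160 bytes, about 692 MB

# Limits mentioned in the post; the exact byte values are assumptions.
CD_BYTES = 700 * MIB              # a "700MB" CD holds roughly 700 MiB
DVD_BYTES = 4_700_000_000         # a "4.7GB" DVD is quoted in decimal GB
USB_FILE_LIMIT_BYTES = 2 * GB     # per-file limit seen on the FAT32 USB drive


def pieces_needed(backup_bytes: int, piece_bytes: int = PIECE_BYTES) -> int:
    """Number of fixed-size pieces required to hold the whole backup."""
    return math.ceil(backup_bytes / piece_bytes)


def pieces_per_disc(disc_bytes: int, piece_bytes: int = PIECE_BYTES) -> int:
    """How many whole pieces fit on one disc."""
    return disc_bytes // piece_bytes


print(PIECE_BYTES)                          # 692060160
print(PIECE_BYTES <= CD_BYTES)              # True: one piece fits on a CD
print(PIECE_BYTES <= USB_FILE_LIMIT_BYTES)  # True: under the USB file limit
print(pieces_per_disc(DVD_BYTES))           # 6
print(pieces_needed(4 * GB))                # 6
```

Under these assumptions, a backup of about 4GB splits into six pieces, which is also the number that fits on a single 4.7GB DVD.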
The folks at [SysRescCD] saved the day. The standard "SysRescueCD" assigned eth0, eth1, and eth2 differently than the three base OS images, but the nice folks in France that write SysRescCD created a customized [kernel parameter that allowed the assignments to be fixed per MAC address] in support of this project. With this in place, I was able to make a live Boot-CD that brings up SSH, with all the users, passwords, and Ethernet devices to match the hardware. I installed this LiveCD as the "Rescue Image" on the hard disk itself, and also made a Recovery-DVD that boots up just like the Boot-CD, but contains the 4GB of backup files.
For testing, I used Linux's built-in Kernel-based Virtual Machine [KVM], which works like VMware, but is open source and included in the 2.6.20 kernels that I am using. IBM is the leading reseller of VMware and has been doing server virtualization for the past 40 years, so I am comfortable with the technology. The XS-163 platform with Apache and PostgreSQL servers serves as a platform for [Moodle], an open source class management system, and the combination is memory-intensive enough that I did not want to incur the overhead of running production in this manner, but it was great for testing!
With all this in place, it is designed to not need a Linux system admin or XS-163/Moodle expert at the facility. Instead, all we need is someone to insert the Boot-CD or Recover-DVD and reboot the system if needed.
Just before packing up the unit for shipment, I changed the IP addresses to the values they need at the destination facility, updated the [GRUB boot loader] default, and made a final backup which burned the Recover-DVD. Hopefully, it works by just turning on the unit, [headless], without any keyboard, monitor or configuration required. Fingers crossed!
So, thanks to the rest of my team: Greg, Glen, Vicki, Tarun, Marcel, Pablo and Said. I am very excited to be part of this, and look forward to seeing this become something remarkable!
Well, we had another successful event in Second Life today.
Unlike our April 26 launch of our System Storage products, which was for IBM Business Partners only, this time we decided to use a "Meet the Storage Experts" Q&A panel format, and open up registration to everyone. The subject matter experts sat at the front of the room on four stools. We had six rows of chairs arranged semi-circularly.
Shown above, from left to right, are the avatars of our four experts:
IBM System Storage N series, focusing on recent N3000 disk system announcements
Harold Pike (holding the microphone while speaking)
IBM System Storage DS3000 and DS4000 series, focusing on recent DS3000 disk system announcements
IBM System Storage TS series, focusing on recent TS2230, TS3400 and TS7700 tape system announcements
IBM storage networking, focusing on recent IBM SAN256B director blade announcements
While Eric was a veteran Second Lifer, having presented at our April event, the other three were trained on how to raise their hand, speak into the microphone, sit on the stool, and so on. I want to thank all of our experts for putting in this effort!
The event was produced by Katrina H Smith. She did a great job, and made sure we were on top of all the issues and tasks required to get the job done. Running a Second Life event is every bit as hard as running a real face-to-face event. We had several meetings to discuss venue details, placement of chairs, placement of product demos, audio/video recording, wall decorations, tee-shirt and coffee mug design, logistics, and so on.
I acted as moderator/emcee for the event. That is my back in the picture above. The process was simple, modeled after the "Birds of a Feather" sessions at events like SHARE and the IBM Storage and Storage Networking Symposium. We threw out a list of topics the experts would cover, and people in the audience would "raise their left hand". I, as the moderator, would then walk over to each person, and hold out the microphone for them to ask the question. I would then repeat the question and ask the appropriate expert to provide an answer. We defined gestures on how to "raise hand" and "put hand down" that we gave to each registered participant.
We had four dedicated "camera-avatars" in world to capture both video and screenshots.Our video editors are now working to edit "highlight videos" that we can use at future events, for training materials, and for our internal "BlueTube" online video system.
The room was filled with examples of each of our products, made into 3D objects that were dimensionally correct, and "textured" with photographs of the actual products. If you clicked on an object, you got a "notecard" that provided more information. Special thanks to Scott Bissmeyer for making all of these objects for us.
We made posters of each expert and placed them in all four corners of the room. On the bottom of each coffee mug was a picture of one of the experts, and if you walked under each of the posters, you were "dispensed" a coffee mug matching the expert shown in the poster. Participants could "Collect all Four!" When you bring the coffee mug up to take a sip, the picture on the bottom of the mug is exposed for all to see. And as a final give-away to the audience, we made a variety of event tee-shirts and polo-shirts.
At the end of the session, we asked everyone to click on the "Survey" kiosk near the exit door. We asked six simple questions using SurveyMonkey.com that took only a few minutes to answer. We found asking questions immediately at the end of the event was the best way to capture this feedback.
From a "Green" perspective, we had people registered from the following countries: US, India, Mexico, Australia, United Kingdom, Brazil, Germany, Argentina, Chile, China, Canada, and Venezuela. Second Life allows all these people who probably could not travel, or could not afford the time and expense to travel, to participate in a simulated face-to-face meeting without the energy consumption of traditional travel methods.
More importantly, we got several leads for business. People often ask "Yes, but is there any business associated with this?" This time, there was. Based on the answers to the questions, several avatars asked for a real sales call to follow up on the products and offerings that were discussed.
With such a great success, we have already scheduled our next Second Life event, November 8. Mark your calendars! I'll post more details on the registration process of the November event when available.
Federal Rules of Civil Procedure (FRCP) will increase adoption of unstructured data classification, email archive systems and CAS.
CAS continues to flounder, but the rest I can agree with. Regulations are being adopted worldwide. Japan has its own Sarbanes-Oxley (SOX) style legislation going into effect in 2008. IBM TotalStorage Productivity Center for Data is a great tool to help classify unstructured file systems. IBM CommonStore for email supports both Microsoft Exchange and Lotus Domino, and can be connected to IBM System Storage DR550 for compliance storage.
Unified storage systems (combined file and block storage target systems) will become increasingly attractive in 2007, because of their ease of use and simplicity.
I agree with this one also. Our sales of IBM N series in 2006 were great, and we are looking to continue this strong growth in 2007. The IBM N series brings together FCP, iSCSI and NAS protocols into one disk system. With the SnapLock(tm) feature, N series can store both re-writable data and non-erasable, non-rewriteable data on the same box. Combine the N series gateway on the front-end with SAN Volume Controller on the back-end, and you have an even more powerful combination.
Distributed ROBO backup to disk will emerge as the fastest growing data protection solution in 2007.
IDC had a similar prediction for 2006. ROBO refers to "Remote Office/Branch Office", and so ROBO backup deals with how to back up data that is out in the various remote locations. Do you back it up locally, or send it to a central location? Fortunately, IBM Tivoli Storage Manager (TSM) supports both ways, and IBM has introduced small disk and tape drives and auto-loaders that can be used in smaller environments like this. I don't know whether "backup to disk" will be the fastest growing, but I certainly agree that a variety of ROBO-related issues will be of interest this year.
2007 will be remembered as the year iSCSI SAN took off because of the much reduced pricing for 10 Gbit iSCSI and the continued deployment of 10 Gbit iSCSI targets.
While I agree that iSCSI is important, I can't say 2007 will be remembered for anything. We have terrible memories for these things. Ask someone what year Personal Computers (PC) took off, and they will tell you about Apple's famous 1984 commercial. Ask someone when the Internet took off, or when cell phones took off, and I suspect most will provide widely different answers, most likely based on their own experience.
For the longest time, I resisted getting a cell phone. I had a roll of quarters in my car, and when I needed to make a call, I stopped at the nearby pay-phone, and made the call. In 1998, pay phones disappeared. You can't find them anymore. That was the year cell phones took off, at least for me.
Back to iSCSI, now that you can intermix iSCSI and SAN on the same infrastructure, either through intelligent multi-protocol switches available from your local IBM rep, or through an N series gateway, you can bring iSCSI technology in slowly and gradually. Low-cost copper wiring for 10 Gbps Ethernet makes all this very practical.
Another up-and-coming technology is AoE, or ATA-over-Ethernet. Same idea as iSCSI, but taken down to the ATA level.
CDP will emerge as an important feature on comprehensive data protection products instead of a separate managed product.
Here, CDP stands for Continuous Data Protection. Normal backups work like a point-and-shoot camera, taking a picture of the data once every midnight, for example. CDP records all the little changes like a video camera, with the option to rewind or fast-forward to a specific point in the day. IBM Tivoli CDP for Files, for example, is an excellent complement to IBM Tivoli Storage Manager.
The technology is not really new, as it has been implemented as "logs" or "journals" on databases like DB2 and Oracle, as well as business applications like SAP R/3.
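The camera analogy can be sketched in a few lines of Python. This is only an illustration of the journal idea, not how IBM Tivoli CDP for Files or any database log is actually implemented; the `Journal` class, its method names, and the integer timestamps are all made up for the example.

```python
from bisect import bisect_right


class Journal:
    """Toy change journal (hypothetical): records every write with a timestamp."""

    def __init__(self):
        self._times = []    # timestamps, kept in ascending order
        self._changes = []  # (key, value) pairs, parallel to _times

    def record(self, timestamp, key, value):
        # Assumes writes arrive in time order, as they would in a live log.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("writes must be recorded in time order")
        self._times.append(timestamp)
        self._changes.append((key, value))

    def state_at(self, timestamp):
        """Rewind or fast-forward: replay the journal up to `timestamp`."""
        state = {}
        upto = bisect_right(self._times, timestamp)
        for key, value in self._changes[:upto]:
            state[key] = value
        return state


def nightly_snapshot(journal, midnight):
    # The point-and-shoot backup: a single picture of the data at midnight.
    return journal.state_at(midnight)


journal = Journal()
journal.record(1, "report.doc", "draft 1")
journal.record(5, "report.doc", "draft 2")
journal.record(9, "budget.xls", "v1")
print(journal.state_at(4))  # the data as it looked before the second draft
```

A nightly backup keeps only one of these states per day, while the journal can reconstruct any of them, at the cost of storing every change.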
The prediction here, however, relates to packaging. Will vendors "package" CDP into existing backup products, possibly as a separately priced feature, or will they leave it as a separate product that perhaps, as in IBM's case, is already well integrated?
The VTL market growth will continue at a much reduced rate as backup products provide equivalent features directly to disk. Deduplication will extend the VTL market temporarily in 2007.
VTL here refers to Virtual Tape Library, such as IBM TS7700 or TS7510 Virtualization Engine. IBM introduced the first one in 1997, the IBM 3494 Virtual Tape Server, and we have remained number one in marketshare for virtual tape ever since. I find it amusing that people are now just looking at VTL technology to help with their Disk-to-Disk-to-Tape (D2D2T) efforts, when IBM Tivoli Storage Manager has already had the capability to backup to disk, then move to tape, since 1993.
As for deduplication, if you need the end-target box to deduplicate your backups, then perhaps you should investigate why you are doing this in the first place. People take full-volume backups, and keep too many copies of them, when more sophisticated backup software like Tivoli Storage Manager can implement backup policies to avoid this with a progressive backup scheme. Or maybe you need to investigate why you store multiple copies of the same data on disk. Perhaps NAS or a clustered file system like IBM General Parallel File System (GPFS) could provide you a single copy accessible to many servers instead.
The reason you don't see deduplication on the mainframe is that DFSMS for z/OS already allows multiple servers to share a single instance of data, and has been doing so since the early 1980s. I often joke with clients at the Tucson Executive Briefing Center that you can run a business with a million data sets on the mainframe, and that there are probably a million files on just the laptops in the room, but few would attempt to run their business that way.
Optical storage that looks, feels and acts like NAS and puts archive data online, will make dramatic inroads in 2007.
Marc says he's going out on a limb here, and it's good to make at least one risky prediction. IBM used to have an optical library that emulated disk, called the IBM 3995. Lack of interest and advancement in technology encouraged IBM to withdraw it. A small backlash ensued, so IBM now offers the IBM 3996 for the System p and System i clients that really, really want optical.
As for optical making data available "online", it takes about 20 seconds to load an optical cartridge, so I would consider this more "nearline" than online. Tape is still in the 40-60 second range to load and position to data, so optical is still at an advantage.
Optical eliminates the "hassles of tape"? Tape data is good for 20 years, and optical for 100 years, but nobody keeps drives around that long anyways. In general, our clients change drives every 6-8 years, and migrate the data from old to new. This is only a hassle if you didn't plan for this inevitable movement. IBM Tivoli Storage Manager, IBM System Storage Archive Manager, and the IBM System Storage DR550 all make this migration very simple and easy, and can do it with either optical or tape.
The Blu-ray vs. HD DVD debate will continue through 2007 in the consumer world. I don't see this being a major player in more conservative data centers where a big investment in the wrong choice could be costly, even if the price-per-TB is temporarily in-line with current tape technologies. IBM and others are investing a lot of Research and Development funding to continue the downward price curve for tape, and I'm not sure that optical can keep up that pace.
Well, that's my take. It is a sunny day here in China, and I have more meetings to attend.
Two weeks ago, I mentioned in my post [Pulse 2008 - Day 2 Breakout sessions] that Henk de Ruiter from ABN Amro bank presented his success story implementing Information Lifecycle Management (ILM) across his various data centers. I am no stranger to ABN Amro, having helped "ABN" and "Amro" banks merge their mainframe data in 1991. Henk has agreed to let me share with my readers more of this success story here on my blog:
Back in December 2005, Henk and his colleagues had come to visit the IBM Tucson Executive Briefing Center (EBC) to hear about IBM products and services. At the time, I was part of our "STG Lab Services" team that performed ILM assessments at client locations. I explained to ABN Amro that the ILM methodology does not require an all-IBM solution, and that ILM could even provide benefits with their current mix of storage, software and service providers. The ABN Amro team liked what I had to say, and my team was commissioned to perform ILM assessments at three of their data centers:
Sao Paulo (Brazil)
Chicago, IL (USA)
Each data center had its own management, its own decision making, and its own set of issues, so we structured each ILM assessment independently. When we presented our results, we showed what each data center could do better with their existing mixed bag of storage, software and service providers, and also showed how much better their life would be with IBM storage, software and services. They agreed to give IBM a chance to prove it, and so a new "Global Storage Study" was launched to take the recommendations from our three ILM studies, and flesh out the details to make a globally-integrated enterprise work for them. Once completed, it was renamed the "Global Storage Solution" (GSS).
Henk summarized the above with "I am glad to see Tony Pearson in the audience, who was instrumental to making this all happen." As with many client testimonials, he presented a few charts on who ABN Amro is today, the 12th largest bank worldwide, 8th largest in Europe. They operate in 53 countries and manage over a trillion euros in assets.
They have over 20 data centers, with about 7 PB of disk, and over 20 PB of tape, both growing at 50 to 70 percent CAGR. About 2/3 of their operations are now outsourced to IBM Global Services; the remaining 1/3 is non-IBM equipment managed by a different service provider.
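To get a feel for what 50 to 70 percent CAGR means, here is a small back-of-the-envelope projection. The three-year horizon is my own assumption for illustration, and 20 PB is used as a lower bound for tape since the figure quoted was "over 20 PB".

```python
def project_capacity(start_pb, cagr, years):
    """Capacity after `years` of compound growth at rate `cagr` (0.5 = 50%)."""
    return start_pb * (1 + cagr) ** years


# Assumed horizon of three years; starting points are the figures quoted above.
for rate in (0.50, 0.70):
    disk = project_capacity(7, rate, 3)
    tape = project_capacity(20, rate, 3)
    print(f"{rate:.0%} CAGR, 3 years: disk {disk:.1f} PB, tape {tape:.1f} PB")
```

Even at the low end of the range, 7 PB of disk more than triples in three years (7 x 1.5 x 1.5 x 1.5 is about 23.6 PB), which is why identifying data for clean up and lower tiers matters so much.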
ABN Amro deployed IBM TotalStorage Productivity Center, various IBM System Storage DS family disk systems, SAN Volume Controller (SVC), Tivoli Storage Manager (TSM), Tivoli Provisioning Manager (TPM), and several other products. Armed with these products, they performed the following:
Clean Up. IBM uses the term "rationalization" to relate to the assignment of business value, to avoid confusion with the term "classification" which many in IT relate to identifying ownership, read and write authorization levels. Often, in the initial phases of an ILM deployment, a portion of the data is determined to be eligible for clean up, either moved to a lower-cost tier or deleted immediately. ABN Amro and IBM set a goal to identify at least 20 percent of their data for clean up.
New tiers. Rather than traditional "storage tiers" which are often just Tier 1 for Fibre Channel disk and Tier 2 for SATA disk, ABN Amro and IBM came up with seven "information infrastructure tiers" that incorporate service levels, availability and protection status. They are:
High-performance, Highly-available disk with Remote replication.
High-performance, Highly-available disk (no remote replication)
Mid-performance, high-capacity disk with Remote replication
Mid-performance, high-capacity disk (no remote replication)
Non-erasable, Non-rewriteable (NENR) storage employing a blended disk and tape solution.
Enterprise Virtual Tape Library with remote replication and back-end physical tape
Mid-performance physical tape
These tiers are applied equally across their mainframe and distributed platforms. All of the tiers are priced per "primary GB", so any additional capacity required for replication or point-in-time copies, either local or remote, is folded into the base price. ABN Amro felt a mission-critical application on Windows or UNIX deserves the same Tier 1 service level as a mission-critical mainframe application. Exactly!
Deployed storage virtualization for disk and tape. This involved the SAN Volume Controller and IBM TS7000 series library.
Implemented workflow automation. The key product here is IBM Tivoli Provisioning Manager
Started an investigation for HSM on distributed. This would be policy-based space management to migrate less frequently accessed data to the TSM pool for Windows or UNIX data.
While the deployment is not yet complete, ABN Amro feels they have already recognized business value:
Reduced cost by identifying data that should be stored on lower tiers
Simplified management, consolidated across all operating systems (mainframe, UNIX, Windows)
Increased utilization of existing storage resources
Reduced manual effort through policy-based automation, which can lead to fewer human errors and faster adaptability to new business opportunities
Standardized backup and other operational procedures
Henk and the rest of ABN Amro are quite pleased with the progress so far, although recent developments in terms of the takeover of ABN AMRO by a consortium of banks mean that the model is only implemented so far in Europe. Further rollout depends on the storage strategy of the new owners. Nonetheless, I am glad that I was able to work with Henk, Jason, Barbara, Steve, Tom, Dennis, Craig and others to be part of this from the beginning and be able to see it roll out successfully over the years.
I survived my first day at SNW Spring 2007. This is my first time at SNW, but it is very much like many of the other conferences I have been to. It officially started Monday morning with pre-conference tutorials and primer break-out sessions that covered storage fundamentals, but I didn't arrive until late Monday night due to high wind conditions at the Phoenix airport that delayed my travel.
Tuesday started out with main tent sessions. Ron Milton, VP of ComputerWorld, which puts on this conference, and Vincent Franceschini, Chairman of the Board for SNIA, kicked off the event. It didn't take them long to get into the alphabet soup: ILM, ITIL, SMI-S, XAM, IMA, MMA, DDF, MF, DMF, IPSF, SSIF, and SRM. Several hundred people had "voting devices" so that they could participate in "informal" surveys.
Q1. What was the greatest need?
37% Storage Resource Management (SRM) tools
19% Storage Virtualization
19% Information Lifecycle Management (ILM)
14% Integration with other management tools
11% Compliance storage for regulations
Q2. What are people doing to address storage infrastructure complexity?
33% Deploying new SRM and SAN management tools
26% Adopting "Storage as a Service" methodology
22% Deploying new storage virtualization technologies
8% Hiring more staff
9% (complexity was not an issue)
The first keynote speaker was Cora Carmody, CIO of SAIC. In the late 1980s and early 1990s, I did a lot of work with SAIC here in San Diego, and so IBM sent me to San Diego quite frequently for face-to-face meetings with them. Her talk was cryptically titled "Jumbo Shrimp, Information Management, and the Mark of the Beast." Coming up with good titles is important. Some of her key points:
"Information management" was as much an oxymoron as "jumbo shrimp" or "military intelligence".(SAIC is a general contractor for the US Military, so this was especially funny).
Computer data needs both "ownership" and "stewardship".
Gartner analysts report that 50% of digital information for a business resides in personal files on individual PCs.
The Pan-STARRS project is ingesting 10TB per week of astronomical data.
TeraTEXT(R) project is a non-relational database that supports a large mix of structured and unstructured content.
The next "Y2K" crisis for the USA is changing from 3-digit to 4-digit area codes for our telephone numbers.
Battery size and life have not advanced as fast as we need
There has been little progress in "User Interface" ease of use
Formats and standards are picked for the most part by the winning vendors, and it is the silence of the marketplace that lets them get away with this.
We are overly reliant on an inherently insecure medium.
The "mark of the beast" refers to exciting new technologies based on "presence awareness". For example, some hotels now are able to check you into the hotel as you drive up in your car, based on your car's license plate. Some 24-hour gyms use your fingerprint as your entry credentials, eliminating the need to staff people at the front desk.
IBM's own Barry Rudolph presented "Storage in an Age of Inconvenient Truths", dressed up like Oscar-winner and former USA Vice President Al Gore. Barry's focus was on the growing concern over environmental power and cooling issues in the data center. According to IDC, the cost of powering and cooling an individual server, over its lifetime, now exceeds its acquisition cost. Storage devices are not as bad as servers in this regard. Data centers now consume 1.2% of the world's energy.
Over lunch, I heard Tony Asaro from ESG present "The Need for Highly Virtualized Storage Systems within a Virtualized Data Center." His concern is that there is still a "heavy touch" required to manage storage. Without virtualization, your data center is less than the sum of its parts. Although IBM has been doing storage virtualization since 1974, Tony mentioned that most storage vendors were "late to the party". He argues that "internal virtualization" inside storage arrays is not enough; you need "external virtualization" (like the IBM System Storage SAN Volume Controller) to virtualize your entire infrastructure. What storage administrators would like is for storage to have consumer levels of "ease of use", and today's non-virtualized storage environments are nowhere near that.
"The great advantage [the telephone] possesses over every other form of electrical apparatus consists in the fact that it requires no skill to operate the instrument." - Alexander Graham Bell, 1878
I attended a few break-out sessions in the afternoon.
Ralph presented "Crisis of Capacity" which covered the drastic actions he had to take to handle power and cooling in their expanding data center during their summer months, where temperatures peak up to 105 degrees. This included creating "hot" and "cold" aisles on his raised floor by re-organizing the perforated floor tiles, and doing a better job standardizing how cables are connected to the back of racks and up through the ceiling to maximize airflow. An amp-meter on each power strip was used to measure the power used at each rack, which allowed them to better prioritize their efforts. Their Air Conditioning unit was only 12 inches from the concrete floor, and raising it to 18 inches greatly reduced noise and vibration. Adding a second AC unit made a world of difference. Finally, they eliminated KVMs, because people who use KVMs break other parts of the data center. His rule of thumb: the cooling requirements will be 50% of the rated power requirements for equipment.
Terry Yoshi, Intel internal IT department, as a member of the SNIA's end user council
Terry presented "Taming the SAN Complexity". The problem with "complexity" as a concept is that it is very subjective, difficult to quantify, and therefore difficult to manage. He presented complexity in four areas: organizational structure of the company as a whole; skill sets required of the IT staff; business process and procedures; and technology. Dealing with complexity is a battle between Old School (because we've always done it this way) and New School (because it is new and different technology). Storage Area Networks are inherently a "shared resource", and the increased complexity is a direct result of the low reliability of the components and devices it is composed of. People should focus on the "Total Cost of Ownership" (TCO) for a SAN, and not just the initial acquisition price of SAN gear. He was not a fan of the "dual/multiple" vendor strategy that many companies employ to reduce costs. His suggestion that things should be tried out first on your "test SAN" caused some chuckles, as few have such a thing. Finally, he suggested not only documenting "Best Practices" and "Best Known Methods" but also things that have been found not to work, his do-not-try-this-at-home list.
Tony Antony, Cisco marketing manager for Optical products
This was an overview of the technologies available for long distance connections for disaster recovery, business continuity, and resilience. He covered three levels.
IP - Fibre Channel over IP (FCIP) offers the greatest "global" distance but forces people into asynchronous mirroring.
SONET/SDH - SONET is what we call it in the USA, and SDH is what it is called in other countries. This provides state-to-state or "out-of-region" distances, which is ideal to meet certain government regulations for homeland defense. He suggests this is offered when dark fiber or DWDM is not available.
DWDM/CWDM - this is using a prism to run multiple colors of light through a single fiber optic cable. CWDM is cheaper, but only handles 8 signals per cable. DWDM can handle 32 to 160 signals per cable, but is more expensive.
His rule of thumb: one buffer credit for every kilometer at 2Gbps speed (for every 2km at 1Gbps).
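That rule of thumb is easy to turn into a quick estimator. The sketch below simply scales the speaker's two data points linearly; applying it to other link speeds is my own extrapolation, and the function name is made up for this example. Treat the result as a starting estimate, not a vendor sizing guideline.

```python
import math


def buffer_credits(distance_km, speed_gbps):
    """Estimate buffer credits from the rule of thumb above:
    one credit per km at 2 Gbps, one credit per 2 km at 1 Gbps.
    Scaling linearly to other speeds is an assumption."""
    if distance_km < 0 or speed_gbps <= 0:
        raise ValueError("distance must be >= 0 and speed must be > 0")
    # Round up, since a fraction of a credit cannot be allocated.
    return math.ceil(distance_km * speed_gbps / 2)


for km, gbps in [(100, 2), (100, 1), (5, 1)]:
    print(f"{km} km at {gbps} Gbps: about {buffer_credits(km, gbps)} credits")
```

The intuition is that a faster link has more frames "in flight" over the same distance, so it needs proportionally more credits to keep the pipe full.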
The day ended at the "Expo". I hung out at the IBM booth to help answer questions and network with others.
A client complained that their tape drives were not compressing data as well as they used to. Investigating further reminded me of a scene from the 1970s television show "All in the Family", summarized well in American Scientist:
... in one episode of All in the Family, Archie Bunker's son-in-law, Mike, watches Archie put on his shoes and socks. Mike goes into a conniption when Archie puts the sock and shoe completely on one foot first, tying a bow to complete the action, while the other foot remains bare. To Mike, if I remember correctly, the right way to put on shoes and socks is first to put a sock on each foot and only then put the shoes on over them, and only in the same order as the socks. In an ironic development in his character, the politically liberal Mike shows himself to be intolerant of differences in how people do common little things, unaccepting of the fact that there is more than one way to skin a cat or put on one's shoes.
Both agreed that socks go first, then shoes, but the actual deployment was different.
In the case of this customer, a recent change was the use of "encryption" before the data reached the tape drive. In regards to compression and encryption, you should always compress first, then encrypt. Compression algorithms rely on frequency of data, for example the letter "E" appears more often in the English language than the letter "Z". However, once you encrypt data, those data patterns are randomized, and any attempt to compress the data afterwards is wasted effort.
With IBM tape encryption on either the TS1120 or LTO4 tape drives, we compress, then encrypt, the data when it arrives at the tape drive, so that the compression has some chance of getting up to 3:1 reduction. This compress-then-encrypt process can be done at the host as well, either by the application software or by a feature of the operating system.
So, just as in the case of Archie Bunker and his son-in-law, there are many ways to deploy compression and encryption. Just make sure you do them in the right order to get the most benefit.
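You can see the effect of the ordering for yourself with a few lines of Python. This is a sketch only: `toy_encrypt` is a made-up stand-in that XORs the data with a SHA-256 based keystream so the example needs nothing beyond the standard library. It is not secure, and it is not what the TS1120 or LTO4 drives do internally. The sample text and key are likewise invented.

```python
import hashlib
import zlib


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHA-256 counter-mode keystream.
    Illustrative stand-in for a real cipher; NOT secure.
    Applying it twice with the same key restores the original."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        keystream = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, keystream))
    return bytes(out)


# Highly repetitive sample data, the kind that compresses well.
text = b"the quick brown fox jumps over the lazy dog " * 500
key = b"hypothetical-key"

compress_then_encrypt = toy_encrypt(zlib.compress(text), key)
encrypt_then_compress = zlib.compress(toy_encrypt(text, key))

print("original bytes:        ", len(text))
print("compress, then encrypt:", len(compress_then_encrypt))
print("encrypt, then compress:", len(encrypt_then_compress))
```

Compressing first shrinks the repetitive text dramatically, and encrypting afterwards leaves the size unchanged. Encrypting first scrambles the patterns, so the compressor has nothing to work with and the output ends up about as large as the input.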
Continuing this week's theme on customer references of IBM's solutions, today I will discuss the success at Kantana Animation Studios.
Here is a 3-minute video from the good folks at Kantana Animation Studios, part of the [Kantana Group]. They produced the animated movie [Khan Kluay] using IBM Scale-out File Services (SoFS), a product IBM announced in November 2007.
As a film-maker myself (see this sample [Highlights clip]) and active member of the Tucson Film Society, I am pleased to see IBM so greatly involved in the film industry. I've had the pleasure to visit some of these animation studios myself and meet with other film-makers at various conferences.
For more details on Kantana's implementation, see the [Case Study]