Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Tony Pearson is a Master Inventor and Senior Software Engineer for the IBM Storage product line at the IBM Executive Briefing Center in Tucson, Arizona, and a featured contributor to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is the author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services. You can also follow him on Twitter @az990tony.
(Short URL for this blog: ibm.co/Pearson)
While most of the post is accurate and well-stated, two opinions in particular caught my eye. I'll be nice and call them opinions, since these are blogs, and always subject to interpretation. I'll put quotes around them so that people will correctly relate these to Hu, and not me.
"Storage virtualization can only be done in a storage controller. Currently Hitachi is the only vendor to provide this." -- Hu Yoshida
Hu, I enjoy all of your blog entries, but you should know better. HDS is a fairly recent newcomer to the storage virtualization arena, and since IBM has been doing this for decades, I will bring you and the rest of the readers up to speed. I am not starting a blog-fight; I just want to provide some additional information for clients to consider when making choices in the marketplace.
First, let's clarify the terminology. I will use 'storage' in the broad sense, including anything that can hold 1's and 0's, including memory, spinning disk media, and plastic tape media. These all have different mechanisms and access methods, based on their physical geometry and characteristics. The concept of 'virtualization' is any technology that makes one set of resources look like another set of resources with more preferable characteristics, and this applies to storage as well as servers and networks. Finally, 'storage controller' is any device with the intelligence to talk to a server and handle its read and write requests.
Second, let's take a look at all the different flavors of storage virtualization that IBM has developed over the past 30 years.
IBM introduces the S/370 with the OS/VS1 operating system. "VS" here refers to virtual storage, and in this case internal server memory was swapped out to physical disk. Using a table mapping, disk was made to look like an extension of main memory.
IBM introduces the IBM 3850 Mass Storage System (MSS). Until this time, programs that ran on mainframes had to be acutely aware of the device types being written, as each device type had different block, track and cylinder sizes, so a program written for one device type would have to be modified to work with a different device type. The MSS was able to take four 3350 disks, and a lot of tapes, and make them look like older 3330 disks, since most programs were still written for the 3330 format. The MSS was a way to deliver new 3350 disk to a 3330-oriented ecosystem, and greatly reduce the cost by handling tape on the back end. The table mapping was one virtual 3330 disk (100 MB) to two physical tapes (50 MB each). Back then, all of the mainframe disk systems had separate controllers. The 3850 used a 3831 controller that talked to the servers.
IBM invents Redundant Array of Independent Disk (RAID) technology. The table mapping is one or more virtual "Logical Units" (or "LUNs") to two or more physical disks. Data is striped, mirrored, or protected with parity across the physical drives, making the LUNs look and feel like disks, but with faster performance and higher reliability than the physical drives they were mapped to. RAID could be implemented in the server as software, on top of or embedded into the operating system, in the host bus adapter, or on the controller itself. The vendor that provided the RAID software or HBA did not have to be the same as the vendor that provided the disk, so in a sense, this avoided "vendor lock-in". Today, RAID is almost always done in the external storage controller.
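To make the parity part of that mapping concrete, here is a minimal Python sketch of single-parity protection, the idea behind RAID 5. The function names and the tiny 4-byte blocks are made up for illustration and are not taken from any IBM product; real controllers also rotate the parity block across drives, which this sketch leaves out.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return the data blocks plus one parity block (hypothetical layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from the surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

# Three data "drives" plus one parity "drive"; contents are arbitrary.
stripe = make_stripe([b"IBM1", b"RAID", b"disk"])
recovered = rebuild(stripe, 1)  # pretend the second drive died
print(recovered)
```

Because XOR is its own inverse, any single lost block, data or parity, can be recomputed from the others. Lose two blocks in the same stripe and this scheme cannot help.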
IBM introduces the Personal Computer. One of the features of DOS is the ability to make a "RAM drive". This is technology that runs in the operating system to make internal memory look and feel like an external drive letter. Applications that already knew how to read and write to drive letters could work unmodified with these new RAM drives. This had the advantage that the files would be erased when the system was turned off, so it was perfect for temporary files. Of course, other operating systems today have this feature: UNIX has a /tmp directory in memory, and z/OS uses VIO storage pools.
This is important, as memory would be made to look like disk externally, as "cache", in the 1990s.
IBM AIX v3 introduces Logical Volume Manager (LVM). LVM maps the LUNs from external RAID controllers into virtual disks inside the UNIX server. The mapping can combine the capacity of multiple physical LUNs into a large internal volume. This was all done by software within the server, completely independent of the storage vendor, so again no lock-in.
IBM introduces the Virtual Tape Server (VTS). This was a disk array that emulated a tape library. A mapping of virtual tapes to physical tapes was done to allow full utilization of larger and larger tape cartridges. While many people today mistakenly equate "storage virtualization" with "disk virtualization", in reality it can be implemented on other forms of storage. The disk array was referred to as the "Tape Volume Cache". By using disk, the VTS could mount an empty "scratch" tape instantaneously, since no physical tape had to be mounted for this purpose.
Contradicting its "tape is dead" mantra, EMC later developed its CLARiiON disk library that emulates a virtual tape library (VTL).
IBM introduces the SAN Volume Controller. It involves mapping virtual disks to managed disks that could be from different frames from different vendors. Like other controllers, the SVC has multiple processors and cache memory, with the intelligence to talk to servers, and is similar in functionality to the controller components you might find inside monolithic "controller+disk" configurations like the IBM DS8300, EMC Symmetrix, or HDS TagmaStore USP. SVC can map the virtual disk to physical disk one-for-one in "image mode", as HDS does, or can also map virtual disks across physical managed disks, using a similar mapping table, to provide advantages like performance improvement through striping. You can take any virtual disk out of the SVC system simply by migrating it back to "image mode" and disconnecting the LUN from management. Again, no vendor lock-in.
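The difference between a one-for-one mapping and a striped mapping can be sketched in a few lines of Python. This is only an illustration: the 16 MB extent size, the function names and the managed disk names are assumptions of mine, and a real controller keeps a mapping table rather than computing addresses on the fly.

```python
EXTENT_MB = 16  # assumed extent size, for illustration only

def image_mode(virtual_mb, mdisk):
    """One-for-one: the virtual disk address is the managed disk address."""
    return (mdisk, virtual_mb)

def striped_mode(virtual_mb, mdisks):
    """Spread consecutive extents round-robin across the managed disks."""
    extent, offset = divmod(virtual_mb, EXTENT_MB)
    mdisk = mdisks[extent % len(mdisks)]   # which managed disk holds it
    mdisk_extent = extent // len(mdisks)   # which extent on that disk
    return (mdisk, mdisk_extent * EXTENT_MB + offset)

# Hypothetical managed disks from three different vendors' frames.
pool = ["ibm0", "emc0", "hds0"]
print(image_mode(40, "hds0"))
print(striped_mode(40, pool))
print(striped_mode(50, pool))
```

In image mode the address passes straight through, which is why a virtual disk can be pulled back out of management untouched. In striped mode, neighboring extents land on different managed disks, so sequential I/O is spread across all of them.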
The HDS USP and NSC can run as regular disk systems without virtualization, or the virtualization can be enabled to allow external disks from other vendors. HDS usually counts all USP and NSC sold, but never mentions what percentage of these have external disks attached in virtualization mode. Either they don't track this, or they are too embarrassed to publish the number. (My guess: single digit percentage).
Few people remember that IBM also introduced virtualization in both controller+disk and SAN switch form factors. The controller+disk version was called "SAN Integration Server", but people didn't like the "vendor lock-in" having to buy the internal disk from IBM. They preferred having it all external disk, with plenty of vendor choices. This is perhaps why Hitachi now offers a disk-less version of the NSC 55, in an attempt to be more like IBM's SVC.
IBM had also introduced the IBM SVC for Cisco 9000 blade. Our clients didn't want to upgrade their SAN switch networking gear just to get the benefits of disk virtualization. Perhaps this is the same reason EMC has done so poorly with its "Invista" offering.
So, bottom line, storage virtualization can be, and has been, delivered in the operating system software, in the server's host bus adapter, inside SAN switches, and in storage controllers. It can be delivered anywhere in the path between application and physical media. Today, the two major vendors that provide disk virtualization "in the storage controller" are IBM and HDS, and the three major vendors that provide tape virtualization "in the storage controller" are IBM, Sun/STK, and EMC. All of these involve a mapping of logical to physical resources. Hitachi uses a one-for-one mapping, whereas IBM also offers more sophisticated mappings.
You may not be the right person to ask but I am asking everyone so "How do you see hybrid disk drives?"
(For the record, I am not immediately related to Robert. At one point, "Pearson" was the 12th most common surname in the USA, but now doesn't even make the Top 100.)
Robert, I would like to encourage you and everyone else to ask questions. Don't worry if I am the wrong person to ask, as I probably know the right person within IBM. Some people have called me the "Kevin Bacon" of Storage, as I am often less than six degrees away from the right person, having worked in IBM Storage for over 20 years.
For those not familiar with hybrid drives, there is a good write-up in Wikipedia.
Unfortunately, most of the people I would consult on this question, such as those from Market Intelligence or Research, are on vacation for the holidays, so, Robert, I will have to rely on my trusted 78-card Tarot deck and answer you with a five-card throw.
Your first card, Robert, is the Hermit. This card represents "introspection". The best I/O is no I/O, which means that if applications can keep the information they need inside server memory, you can avoid the bus bandwidth limitations of going to external storage devices. Where external storage makes sense is when data is shared between servers, or when the single server is limited to a set amount of internal memory. So, consider maxing out the memory in your server first (IBM would be glad to sell you more internal memory!!!), then consider outside solid-state or hybrid devices. 32-bit Windows, for example, has an architectural limit of 4GB.
Your second card, Robert, is the Four of Cups, representing "apathy". On the card, you see three cups together, with the fourth cup being delivered from a cloud. This reminds me that we have three storage tiers already (memory, disk, tape), and introducing a fourth tier into the mix may not garner much excitement. For the mainframe, IBM introduced a Solid-State Device, called the Coupling Facility, which can be accessed from multiple System z servers. It is used heavily by DFSMS and DB2 to hold shared information. However, given some customers' apathy towards Information Lifecycle Management, which includes "tiered storage", introducing yet another tier that forces people to decide what data goes where may be another challenge.
Your third card, Robert, is the Chariot, which represents "Speed, Determination, and Will". In some cases, solid state disks are faster for reading, but can be slower for writing. In the case of a hybrid drive, where the memory acts as a front-end cache, read-hits would be faster, but read-misses might be slower. While the idea of stopping the drives during inactivity will reduce power consumption, spinning up and slowing down the disk may incur additional performance penalties. At the time of this post, the fastest disk system remains the IBM SAN Volume Controller, based on SPC-1 and SPC-2 benchmarks in excess of those published for other devices.
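For readers who prefer arithmetic to tarot, the read-hit versus read-miss trade-off can be sketched as a weighted average. The Python below is a back-of-the-envelope model of my own, not a vendor formula, and every latency in it is a made-up illustrative number.

```python
def effective_read_ms(hit_ratio, cache_ms, disk_ms,
                      spin_up_ms=0.0, spun_down_fraction=0.0):
    """Average read latency for a disk fronted by a flash cache.

    hit_ratio          -- fraction of reads served from the cache
    spun_down_fraction -- fraction of misses that find the platters stopped
    """
    miss_ms = disk_ms + spun_down_fraction * spin_up_ms
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * miss_ms

# Illustrative numbers only: 0.1 ms flash, 10 ms disk, 4 second spin-up.
always_spinning = effective_read_ms(0.8, 0.1, 10.0)
often_parked = effective_read_ms(0.8, 0.1, 10.0,
                                 spin_up_ms=4000.0, spun_down_fraction=0.5)
print(always_spinning, often_parked)
```

With these assumed numbers, a drive that is parked for half of its misses ends up far slower on average than one that never spins down, even with an 80 percent hit ratio. That is the performance penalty the Chariot was warning about.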
Your fourth card, Robert, is the Eight of Pentacles, which represents "Diligence, Hard work". The pentacles are coins with five-sided stars on them, and this often represents money. Our research team has projected that spinning disk will continue to be a viable and profitable storage media for at least another eight years.
Your fifth and last card, Robert, is the World, which normally represents "Accomplishment", but since it is turned upside down, the meaning is reversed to "Limitation". Some hybrid disks, and some types of solid state memory in general, do have limitations in the number of write cycles they can handle. Those unhappy with the frequency and slowness of rebuilds on SATA disk may find similar problems with hybrid drives. For that reason, businesses may not trust using hybrid drives for their busiest, mission-critical applications, but certainly might use them for archive data with lower write-cycle requirements.
The tarot cards are never wrong, but certainly interpretations of the cards can be.
Jon Toigo has a funny cartoon on his post, [As I Listen to EMC Brag on “New” Functionality…]. Basically, it pokes fun at how many of us bloggers argue over which vendor was first to introduce some technology or another. We all do this, myself included.
Recently, Claus Mikkelsen, currently with HDS, [brought up accurately some past history from the 1990s], which is before many storage bloggers hired on with their current employers. Claus and I worked together for IBM back then, so I recognized many of the events he mentions that I can't talk about either. In many cases, IBM or HDS delivered new features before EMC.
I've been reading with some amusement as fellow blogger Barry Burke asked Claus a series of questions about Hitachi's latest High Availability Manager (HAM) feature. Claus was too busy with his "day job" and chose to shut Barry down. Sadly, HDS set themselves up for ridicule this round, first by over-hyping a function before its announcement, and then announcing a feature that IBM and EMC have offered for a while. The problem and confusion for many is that each vendor uses different terminology. Hitachi's HAM is similar to IBM's HyperSwap and EMC's AutoSwap. The implementations are different, of course, which is why vendors are often asked to compare and contrast one implementation to another.
In his latest response, [how to mind the future of a mission-critical world], Barry reports that several HDS bloggers now censor his comments. That's too bad. I don't censor comments, within reason, including Barry's inane questions about IBM's products, and am glad that he does not censor my inane questions to him about EMC products in return. The entire blogosphere benefits from these exchanges, even if they are a bit heated sometimes.
We all have day jobs, and often are just too busy, or too lazy, to read dozens or hundreds of pages of materials, if we can even find them in the first place. Not everyone has the luxury of a "competitive marketing" team to help do the research for you, so if we can get an accurate answer or clarification about a product that is generally available directly from a vendor's subject matter expert, I am all for that.
It's official! My "blook" Inside System Storage - Volume I is now available.
This blog-based book, or “blook”, comprises the first twelve months of posts from this Inside System Storage blog, 165 posts in all, from September 1, 2006 to August 31, 2007. Foreword by Jennifer Jones. 404 pages.
IT storage and storage networking concepts
IBM strategy, hardware, software and services
Disk systems, Tape systems, and storage networking
Storage and infrastructure management software
Second Life, Facebook, and other Web 2.0 platforms
IBM’s many alliances, partners and competitors
How IT storage impacts society and industry
You can choose between hardcover (with dust jacket) or paperback versions:
This is not the first time I've been published. I have authored articles for storage industry magazines, written large sections of IBM publications and manuals, submitted presentations and whitepapers to conference proceedings, and even had a short story published with illustrations by the famous cartoon writer [Ted Rall].
But I can say this is my first blook, and as far as I can tell, the first blook from IBM's many bloggers on DeveloperWorks, and the first blook about the IT storage industry. I got the idea when I saw [Lulu Publishing] run a "blook" contest. The Lulu Blooker Prize is the world's first literary prize devoted to "blooks"--books based on blogs or other websites, including webcomics. The [Lulu Blooker Blog] lists past year winners. Lulu is one of the new innovative "print-on-demand" publishers. Rather than printing hundreds or thousands of books in advance, as other publishers require, Lulu doesn't print them until you order them.
I considered cute titles like A Year of Living Dangerously, or An Engineer in Marketing La-La land, or Around the World in 165 Posts, but settled on a title that matched closely the name of the blog.
In addition to my blog posts, I provide additional insights and behind-the-scenes commentary. If you go to the Lulu website above, you can preview an entire chapter before purchase. I have added a hefty 56-page Glossary of Acronyms and Terms (GOAT) with over 900 storage-related terms defined, which also doubles as an index back to the post (or posts) that use or further explain each term.
So who might be interested in this blook?
Business Partners and Sales Reps looking to give a nice gift to their best clients and colleagues
Managers looking to reward early-tenure employees and retain the best talent
IT specialists and technicians wanting a marketing perspective of the storage industry
Mentors interested in providing motivation and encouragement to their proteges
Educators looking to provide books for their classroom or library collection
Authors looking to write a blook themselves, to see how to format and structure a finished product
Marketing personnel that want to better understand Web 2.0, Second Life and social networking
Analysts and journalists looking to understand how storage impacts the IT industry, and society overall
College graduates and others interested in a career as a storage administrator
And yes, according to Lulu, if you order soon, you can have it by December 25.
In North America, today marks the start of the "Give 1 Get 1" program.
Children using the XO laptop
I first learned of this when I was reading about Timothy Ferriss' [LitLiberation project] on his [Four Hour Work Week] blog, and was surfing around for related ideas, and chanced upon this. I registered for a reminder, and it came today (the reminder, not the laptop itself).
Here's how the program works. You give $399 US dollars to the "One Laptop per Child" (OLPC) [laptop.org] organization for two laptops: one goes to a deserving child in a developing country, the second goes to you, for your own child, or to donate to a local charity that helps children. This counts as a $199 purchase plus a $200 tax-deductible donation. For Americans, this is a [US 501(c)(3)] donation, and for Canadians and Mexicans, take advantage of the low value of the US dollar!
If your employer matches donations, like IBM does, get them to match the $200 donation for a third laptop, which goes to another child in a developing country. As for shipping, you pay only for the shipping of the one to you; each receiving country covers their own shipping. In my case, the shipping was another $24 US dollars for Arizona. No guarantees that it will arrive in time for the holidays this December, but it might.
To sweeten the deal, T-mobile throws in a year's worth of "Wi-Fi Hot Spot" that you can use for yourself, either with the XO laptop itself, or your regular laptop, iPhone, or other Wi-Fi enabled handheld device.
National Public Radio did a story last week on this: [The $100 Laptop Heads for Uganda] where they interview actor [Masi Oka], best known from the TV show ["Heroes"], who has agreed to be their spokesman. At the risk of sounding like their other spokesman, I thought I would cover the technology itself, inside the XO, and how this laptop represents IBM's concept of "Innovation that matters"!
The project was started by [Nicholas Negroponte] from [MIT University] as the "$100 laptop project". Once the final design was worked out, it turned out to cost $188 US dollars to make, so they rounded it up to $200. This is still an impressive price, and requires that hundreds of thousands of them be manufactured to justify ramping up the assembly line.
Two of IBM's technology partners are behind this project. First is Advanced Micro Devices (AMD), which provides the 433MHz x86 processor, which is 75 percent slower than the one in my Thinkpad T60. Second is Red Hat, as this runs a lean Fedora 6 version of Linux. Obviously, you couldn't have Microsoft Windows or Apple OS X, as both require significantly more resources.
The laptop is "child size", and would be considered in the [subnotebook] category. At 10" x 9" x 1.25", it is about the size of a class textbook, and can be carried easily in a child's backpack, or carried by itself with the integrated handle. When closed, it is sealed enough to be protected when carried in rain or dust storms. It weighs about 3.5 pounds, less than the 5.2 pounds of my Thinkpad T60.
The XO is "green", not just in color, but also in energy consumption. This laptop can be powered by AC, or human power hand-crank, with work in place to get options for car-battery or solar power charging. Compared to the 20W normally consumed by traditional laptops, the XO consumes 90 percent less, running at 2W or less. To accomplish this, there is no spinning disk inside. Instead, a 1GB FLASH drive holds 700MB of Linux, and gives you 300MB to hold your files. There is a slot for an MMC/SD flash card, and three USB 2.0 ports to connect to USB keys, printers or other remote I/O peripherals.
The XO flips around into three positions:
Standard laptop position has screen and keyboard. The water-tight keyboard comes in ten languages: International/English, Thai, Arabic, Spanish, Portuguese, West African, Urdu, Mongolian, Cyrillic, and Amharic. (I learned some Amharic, having lived five years with Ethiopians.) There does not appear to be a VGA port, so don't be thinking this could be used as an alternative to project Powerpoint presentations onto a big screen.
Built-in 640x480 webcam, microphone and speakers allow the XO to be used as a communication device. Voice-over-IP (VOIP) client software, similar to Skype or [IBM Lotus Sametime], is pre-installed for this purpose.
The basic built-in communications are 802.11g (54Mbps), which you can use to surf the web using the Wi-Fi at your local Starbucks; and 802.11s, which forms a "mesh network" with other XO laptops, and can surf the web by finding the one laptop nearby that is connected to the internet to share bandwidth. This eliminates the need to build a separate Wi-Fi hub at the school. There are USB-to-Ethernet and USB-to-Cellular converters, so that might be an alternative option.
Flipped vertically, the device can be read like a book. The screen can be changed between full-color and black-white, 200 dpi, with decent 1200x900 pixel resolution. The full-color is back-lit, and can be read in low-lighting. The black-white is not back-lit, consumes much less power, and can be read in bright sunlight. In that regard, it is comparable to other [e-book devices], like a Cybook or Sony Reader.
Software includes a web-browser, document reader, word processor and RSS feed reader to read blogs. The OLPC identifies all of the software, libraries and interfaces they use, so that anyone that wants to develop children's software for this platform can do so.
With the keyboard flipped back, the 6" x 4.5" screen has directional controls and X/Y/A/B buttons to run games. This would make it comparable to a Nintendo DS or Playstation Portable (PSP). Again, the choice between back-lit color and sunlight black-white screen modes applies. Some games are pre-installed.
So for $399, you could buy a Wi-Fi enabled [16GB iPod Touch] for yourself, which does much the same thing, or you can make a difference in the world. I made my donation this morning, and suggest you--my dear readers in the US, Canada and Mexico--consider doing the same. Go to [www.laptopgiving.org] for details.
Well, this week I am in Maryland, just outside of Washington DC. It's a bit cold here.
Robin Harris over at StorageMojo put out this Open Letter to Seagate, Hitachi GST, EMC, HP, NetApp, IBM and Sun about the results of two academic papers, one from Google, and another from Carnegie Mellon University (CMU). The papers imply that the disk drive module (DDM) manufacturers have perhaps misrepresented their reliability estimates, and Robin asks major vendors to respond. So far, NetApp and EMC have responded.
I will not bother to re-iterate or repeat what others have said already, but make just a few points. Robin, you are free to consider this "my" official response if you would like to post it on your blog, or point to mine, whatever is easier for you. Given that IBM no longer manufactures the DDMs we use inside our disk systems, there may not be any reason for a more formal response.
Coke and Pepsi buy sugar, Nutrasweet and Splenda from the same sources
Somehow, this doesn't surprise anyone. Coke and Pepsi don't own their own sugar cane fields, and even their bottlers are separate companies. Their job is to assemble the components using super-secret recipes to make something that tastes good.
IBM, EMC and NetApp don't make DDMs that are mentioned in either academic study. Different IBM storage systems use one or more of the following DDM suppliers:
Seagate (including Maxtor, which they acquired)
Hitachi Global Storage Technologies, HGST (former IBM division sold off to Hitachi)
In the past, corporations like IBM were very "vertically-integrated", making every component of every system delivered. IBM was the first to bring disk systems to market, and led the major enhancements that exist in nearly all disk drives manufactured today. Today, however, our value-add is to take standard components, and use our super-secret recipe to make something that provides unique value to the marketplace. Not surprisingly, EMC, HP, Sun and NetApp also don't make their own DDMs. Hitachi is perhaps the last major disk systems vendor that also has a DDM manufacturing division.
So, my point is that disk systems are the next layer up. Everyone knows that individual components fail. Unlike CPUs or Memory, disks actually have moving parts, so you would expect them to fail more often compared to just "chips".
If you don't feel the MTBF or AFR estimates posted by these suppliers are valid, go after them, not the disk systems vendors that use their supplies. While IBM does qualify DDM suppliers for each purpose, we are basically purchasing them from the same major vendors as all of our competitors. I suspect you won't get much more than the responses you posted from Seagate and HGST.
American car owners replace their cars every 59 months
According to a frequently cited auto market research firm, the average time before the original owner transfers their vehicle -- purchased or leased -- is currently 59 months. Both studies mention that customers have a different "definition" of failure than manufacturers, and often replace the drives before they are completely kaput. The same is true for cars. Americans give various reasons why they trade in their less-than-five-year cars for newer models. Disk technologies advance at a faster pace, so it makes sense to change drives for other business reasons, for speed and capacity improvements, lower power consumption, and so on.
The CMU study indicated that 43 percent of drives were replaced before they were completely dead. So, if General Motors estimated their cars lasted 9 years, and Toyota estimated 11 years, people still replace them sooner, for other reasons.
At IBM, we remind people that "data outlives the media". True for disk, and true for tape. Neither is "permanent storage", but rather a temporary resting point until the data is transferred to the next media. For this reason, IBM is focused on solutions and disk systems that plan for this inevitable migration process. IBM System Storage SAN Volume Controller is able to move active data from one disk system to another; IBM Tivoli Storage Manager is able to move backup copies from one tape to another; and IBM System Storage DR550 is able to move archive copies from disk and tape to newer disk and tape.
If you had only one car, then having that one and only vehicle die could be quite disrupting. However, companies that have fleet cars, like Hertz Car Rentals, don't wait for their cars to completely stop running either, they replace them well before that happens. For a large company with a large fleet of cars, regularly scheduled replacement is just part of doing business.
This brings us to the subject of RAID. No question that RAID 5 provides better reliability than having just a bunch of disks (JBOD). Certainly, three copies of data across separate disks, a variation of RAID 1, will provide even more protection, but for a price.
Robin mentions the "Auto-correlation" effect. Disk failures bunch up, so one recent failure might mean another DDM, somewhere in the environment, will probably fail soon also. For it to make a difference, it would (a) have to be a DDM in the same RAID 5 rank, and (b) have to occur during the time the first drive is being rebuilt to a spare volume.
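To get a feel for how narrow that window is, here is a rough Python sketch of conditions (a) and (b). It assumes drives fail independently at a constant rate derived from an annualized failure rate (AFR), which is exactly the assumption the auto-correlation finding calls into question, so treat the result as an optimistic floor. The 3 percent AFR, 8-drive rank and 6-hour rebuild are illustrative values of mine, not figures from either study.

```python
import math

HOURS_PER_YEAR = 8760

def second_failure_probability(afr, rank_size, rebuild_hours):
    """Chance that a surviving drive in the same rank fails mid-rebuild.

    Assumes independent drives with a constant failure rate derived
    from the annualized failure rate (afr, as a fraction per year).
    """
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    survivors = rank_size - 1  # the failed drive is already out
    return 1.0 - math.exp(-rate_per_hour * survivors * rebuild_hours)

# Illustrative inputs only: 3% AFR, 8-drive rank, 6-hour rebuild.
p = second_failure_probability(0.03, 8, 6)
print(f"{p:.6f}")
```

Under these assumptions the exposure per rebuild is small, and it grows with both the rank size and the rebuild time. If failures really do bunch up, the true number is higher.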
The human body replaces skin cells every day
So there are individual DDMs, manufactured by the suppliers above; disk systems, manufactured by IBM and others, and then your entire IT infrastructure. Beyond the disk system, you probably have redundant fabrics, clustered servers and multiple data paths, because eventually hardware fails.
People might realize that the human body replaces skin cells every day. Other cells are replaced frequently, within seven days, and others less frequently, taking a year or so to be replaced. I'm over 40 years old, but most of my cells are less than 9 years old. This is possible because information, data in the form of DNA, is moved from old cells to new cells, keeping the infrastructure (my body) alive.
Our clients should approach this in a more holistic view. You will replace disks in less than 3-5 years. While tape cartridges can retain their data for 20 years, most people change their tape drives every 7-9 years, and so tape data needs to be moved from old to new cartridges. Focus on your information, not individual DDMs.
What does this mean for DDM failures? When it happens, the disk system re-routes requests to a spare disk, rebuilding the data from RAID 5 parity, giving storage admins time to replace the failed unit. During the few hours this process takes place, you are either taking a backup, or crossing your fingers. Note: for RAID 5 the time to rebuild is proportional to the number of disks in the rank, so smaller ranks can be rebuilt faster than larger ranks. To make matters worse, the slower RPM speeds and higher capacities of ATA disks mean that the rebuild process could take longer than for smaller capacity, higher speed FC/SCSI disk.
According to the Google study, a large portion of the DDM replacements had no SMART errors to warn that it was going to happen. To protect your infrastructure, you need to make sure you have current backups of all your data. IBM TotalStorage Productivity Center can help identify all the data that is "at risk", those files that have no backup, no copy, and no current backup since the file was most recently changed. A well-run shop keeps their "at risk" files below 3 percent.
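The "at risk" idea is simple enough to sketch in Python. This is not how IBM TotalStorage Productivity Center is implemented. The `FileRecord` type and both function names are hypothetical, and the only rule encoded is the one stated above: a file is at risk if it has never been backed up, or if its last backup is older than its last change.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FileRecord:
    path: str
    modified: float               # last-modified time, seconds since epoch
    last_backup: Optional[float]  # None means never backed up

def is_at_risk(rec):
    """True if there is no backup newer than the last change."""
    return rec.last_backup is None or rec.last_backup < rec.modified

def at_risk_percent(records):
    """Percentage of files with no current backup."""
    if not records:
        return 0.0
    return 100.0 * sum(is_at_risk(r) for r in records) / len(records)

# Made-up inventory: timestamps are arbitrary numbers.
inventory = [
    FileRecord("/db/orders.dat", modified=100.0, last_backup=200.0),
    FileRecord("/db/ledger.dat", modified=300.0, last_backup=200.0),
    FileRecord("/home/new.doc", modified=100.0, last_backup=None),
    FileRecord("/etc/hosts", modified=100.0, last_backup=100.0),
]
print(at_risk_percent(inventory))
```

In this toy inventory, the ledger changed after its last backup and the new document was never backed up, so half the files are at risk, far above the 3 percent a well-run shop would tolerate.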
So, where does that leave us?
ATA drives are probably as reliable as FC/SCSI disk. Customers should choose which to use based on performance and workload characteristics. FC/SCSI drives are more expensive because they are designed to run at faster speeds, required by some enterprises for some workloads. IBM offers both, and has tools to help estimate which products are the best match to your requirements.
RAID 5 is just one of the many trade-offs between cost and protection of data. For some data, JBOD might be enough. For other data that is more mission critical, you might choose to keep two or three copies. Data protection is more than just using RAID; you also need to consider point-in-time copies, synchronous or asynchronous disk mirroring, continuous data protection (CDP), and backup to tape media. IBM can help show you how.
Disk systems, and IT environments in general, are higher-level concepts to transcend the failures of individual components. DDM components will fail. Cache memory will fail. CPUs will fail. Choose a disk systems vendor that combines technologies in unique and innovative ways that take these possibilities into account, designed for no single point of failure, and no single point of repair.
So, Robin, from IBM's perspective, our hands are clean. Thank you for bringing this to our attention and for giving me the opportunity to highlight IBM's superiority at the systems level.
(I cannot take credit for coining the new term "bleg". I saw this term first used over on the [Freakonomics Blog]. If you have not yet read the book "Freakonomics", I highly recommend it! The authors' blog is excellent as well.)
For this comparison, it is important to figure out how much workload a mainframe can support, how much an x86 can support, and then divide one by the other. Sounds simple enough, right? And what workload should you choose? IBM chose a business-oriented "data-intensive" workload using Oracle database. (If you wanted instead a scientific "compute-intensive" workload, consider an [IBM supercomputer] instead, the most recent of which clocked in over 1 quadrillion floating point operations per second, or PetaFLOP.) IBM compares the following two systems:
Sun Fire X2100 M2, model 1220 server (2-way)
IBM did not pick a wimpy machine to compare against. The model 1220 is the fastest in the series, with a 2.8GHz x86-64 dual-core AMD Opteron processor, capable of running various levels of Solaris, Linux or Windows. In our case, we will use Oracle workloads running on Red Hat Enterprise Linux. All of the technical specifications are available at the [Sun Microsystems Sun Fire X1200] Web site. I am sure that there are comparable models from HP, Dell or even IBM that could have been used for this comparison.
IBM z10 Enterprise Class mainframe model E64 (64-way)
This machine can run a variety of operating systems also, including Red Hat Enterprise Linux (RHEL). The E64 has four "multiple processor modules" called "processor books" for a total of 77 processing units: 64 central processors, 11 system assist processors (SAP) and 2 spares. That's right, spare processors: in case any others go bad, IBM has got your back. You can designate a central processor in a variety of flavors. For running z/VM and Linux operating systems, the central processors can be put into "Integrated Facility for Linux" (IFL) mode. On IT Jungle, Timothy Patrick Morgan explains the z10 EC in his article [IBM Launches 64-Way z10 Enterprise Class Mainframe Behemoth]. For more information on the z10 EC, see the 110-page [Technical Introduction], or read the specifications on the [IBM z10 EC] Web site.
In a shop full of x86 servers, there are production servers, test and development servers, quality assurance servers, standby idle servers for high availability, and so on. On average, these are only 10 percent utilized. For example, consider the following mix of servers:
125 Production machines running 70 percent busy
125 Backup machines running idle ready for active failover in case a production machine fails
1250 machines for test, development and quality assurance, running at 5 percent average utilization
While [some might question, dispute or challenge this ten percent] estimate, it matches the logic used to justify VMware, XEN, Virtual Iron or other virtualization technologies. Running 10 to 20 "virtual servers" on a single physical x86 machine assumes a similar 5-10 percent utilization rate.
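For anyone who wants to check the ten percent figure, here is a minimal sketch of the weighted average for the server mix above. The only assumption I have added is that the idle standby machines count as 0 percent busy; the dictionary keys are just labels I made up.

```python
# Server mix from the example above: (number of machines, average CPU utilization in percent).
# Assumption: the idle standby machines are counted as 0 percent busy.
server_mix = {
    "production": (125, 70.0),
    "standby": (125, 0.0),
    "test_dev_qa": (1250, 5.0),
}


def fleet_utilization(mix):
    """Return (total machines, machine-weighted average utilization in percent)."""
    total = sum(count for count, _ in mix.values())
    busy = sum(count * util for count, util in mix.values())
    return total, busy / total


total_servers, average_util = fleet_utilization(server_mix)
print(f"{total_servers} servers, {average_util:.1f} percent average utilization")
```

The large pool of lightly used test and development machines is what drags the average down, even though the production machines run fairly busy.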
Note: The following paragraphs have been revised per comments received.
Now the math. Jon, I want to make it clear I was not involved in writing the press release, nor did I assist with these math calculations. Please, don't shoot the messenger! Remember this cartoon where two scientists in white lab coats are writing math calculations on a chalkboard, and in the middle there is "and then a miracle happens..." to continue the rest of the calculations?
In this case, the miracle is the number that compares one server hardware platform to another. I am not going to bore people with details like the number of concurrent processor threads or the differences between L1 and L3 cache. IBM used sophisticated tools and third party involvement that I am not allowed to talk about, and I have discussed this post with lawyers representing four (now five) different organizations already, so for the purposes of illustration and explanation only, I have reverse-engineered a new z10-to-Opteron conversion factor as 6.866 z10 EC MIPS per GHz of dual-core AMD Opteron for I/O-intensive workloads running only 10 percent average CPU utilization. Business applications that perform a lot of I/O don't use their CPU as much as other workloads. For compute-intensive or memory-intensive workloads, the conversion factor may be quite different, like 200 MIPS per GHz, as Jeff Savit from Sun Microsystems points out in the comments below.
Keep in mind that each processor is different, and we now have Intel, AMD, SPARC, PA-RISC and POWER (and others); 32-bit versus 64-bit; dual-core and quad-core; and different co-processor chip sets to worry about. AMD Opteron processors come in different speeds, but we are comparing against the 2.8GHz, so 1500 times 6.866 times 2.8 is 28,837 MIPS. Since these would be running as Linux guests under z/VM, we add an additional 7 percent overhead, or 2,019 MIPS. We then subtract 15 percent, or 4,326 MIPS, for "smoothing", which is what happens when you consolidate workloads that have different peaks and valleys. The end result is that we need a machine to do 26,530 MIPS. Thanks to advances in "Hypervisor" technological synergy between the z/VM operating system and the underlying z10 EC hardware, the mainframe can easily run 90 percent utilized when aggregating multiple workloads, so a 29,477 MIPS machine running at 90 percent utilization can handle these 26,530 MIPS.
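The arithmetic in this paragraph can be replayed in a few lines. This is only a sketch of my reverse-engineered illustration, not IBM's sizing tool. One assumption is mine: both the 7 percent overhead and the 15 percent smoothing are taken as percentages of the base figure, which is the reading that reproduces the MIPS numbers quoted. The constant names are made up for the example.

```python
# Inputs quoted in the post.
GUESTS = 1500                 # x86 servers to consolidate
MIPS_PER_GHZ = 6.866          # illustrative z10-to-Opteron conversion factor
CLOCK_GHZ = 2.8               # Opteron clock speed
ZVM_OVERHEAD = 0.07           # extra cost of running as guests under z/VM
SMOOTHING = 0.15              # credit for peaks and valleys that do not coincide
TARGET_UTILIZATION = 0.90     # how busy the mainframe is allowed to run

# Assumption: overhead and smoothing are both computed on the base figure.
base_mips = GUESTS * MIPS_PER_GHZ * CLOCK_GHZ
overhead_mips = base_mips * ZVM_OVERHEAD
smoothing_mips = base_mips * SMOOTHING
required_mips = base_mips + overhead_mips - smoothing_mips
machine_mips = required_mips / TARGET_UTILIZATION

for label, value in [
    ("base", base_mips),
    ("z/VM overhead", overhead_mips),
    ("smoothing", smoothing_mips),
    ("required", required_mips),
    ("machine size at 90 percent", machine_mips),
]:
    print(f"{label:>28}: {value:10,.0f} MIPS")
```

The base product works out to roughly 28,837 MIPS, and the final machine size lands within a MIPS or so of the figure quoted, the difference being rounding at intermediate steps.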
N-way machines, from a little 2-way Sun Fire X2100 to the mighty 64-way z10 EC mainframe, are called "Symmetric Multiprocessors". All of the processors or cores are in play, but sometimes they have to take turns, waiting for exclusive access to a shared resource, such as cache or the bus. When your car is stopped at a red light, you are waiting for your turn to use the shared "intersection". As a result, you don't get linear improvement, but rather diminishing returns. This is known generically as the "SMP effect", and IBM documents this in its [Large System Performance Reference]. While a 1-way z10 EC can handle 920 MIPS, the 64-way can only handle 30,657 MIPS. The 29,477 MIPS needed for the Sun X2100 workload can be handled by a 61-way, giving you three extra processors to handle unexpected peaks in workload.
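To put a number on the SMP effect, here is a small sketch that compares the two MIPS ratings quoted above against perfectly linear scaling. It uses only those two data points; the ratings for machines in between come from the LSPR tables, which I am not reproducing here.

```python
ONE_WAY_MIPS = 920            # 1-way z10 EC rating quoted above
SIXTY_FOUR_WAY_MIPS = 30_657  # 64-way z10 EC rating quoted above
WAYS = 64

# What 64 engines would deliver if scaling were perfectly linear.
linear_mips = ONE_WAY_MIPS * WAYS

# Fraction of the linear ideal actually delivered, and the average per engine.
smp_efficiency = SIXTY_FOUR_WAY_MIPS / linear_mips
mips_per_engine = SIXTY_FOUR_WAY_MIPS / WAYS

print(f"linear ideal:    {linear_mips:,} MIPS")
print(f"SMP efficiency:  {smp_efficiency:.1%}")
print(f"average engine:  {mips_per_engine:.0f} MIPS")
```

In other words, each engine in the fully populated machine contributes on average a little over half of what a lone engine does, which is the diminishing return described above.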
But are 1500 Linux guest images architecturally possible? A long time ago, David Boyes of [Sine Nomine Associates] ran 41,400 Linux guest images on a single mainframe using his [Test Plan Charlie], and IBM internally was able to get 98,000 images, and in both cases these were on machines less powerful than the z10 EC. Neither of these tests ran I/O-intensive workloads, but extreme limits are always worth testing. The 1500-to-1 reduction in IBM's press release is edge-of-the-envelope as well, so in production environments, several hundred guest images are probably more realistic, and still offer significant TCO savings.
The z10 EC can handle up to 60 LPARs, and each LPAR can run z/VM, which acts much like VMware in allowing multiple Linux guests per z/VM instance. For 1500 Linux guests, you could have 25 guests each on 60 z/VM LPARs, or 250 guests on each of six z/VM LPARs, or 750 guests on two LPARs. With z/VM 5.3, each LPAR can support up to 256GB of memory and 32 processors, so you need at least two LPARs to use all 64 engines. Also, there are good reasons to have different guests under different z/VM LPARs, such as separating development/test from production workloads. If you had to re-IPL a specific z/VM LPAR, it could be done without impacting the workloads on other LPARs.
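The layouts above are simple division, sketched here along with the minimum number of LPARs needed to reach all 64 engines. The limits are the ones quoted in this post (60 LPARs per machine, 32 processors per z/VM 5.3 LPAR); the function name is hypothetical.

```python
import math

MAX_LPARS = 60          # LPAR limit quoted for the z10 EC
ENGINES = 64            # central processors on the E64
ENGINES_PER_LPAR = 32   # processor limit per LPAR quoted for z/VM 5.3


def guests_per_lpar(total_guests, lpars):
    """Spread guests evenly over LPARs; reject layouts that do not divide evenly."""
    if not 1 <= lpars <= MAX_LPARS:
        raise ValueError(f"LPAR count must be between 1 and {MAX_LPARS}")
    if total_guests % lpars:
        raise ValueError("guests do not divide evenly across LPARs")
    return total_guests // lpars


# The three layouts mentioned in the text.
layouts = {lpars: guests_per_lpar(1500, lpars) for lpars in (60, 6, 2)}

# Fewest LPARs that can put every engine to work.
min_lpars_for_all_engines = math.ceil(ENGINES / ENGINES_PER_LPAR)

print(layouts)
print(min_lpars_for_all_engines)
```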
To access storage, IBM offers N-port ID Virtualization (NPIV). Without NPIV, two Linux guest images could not access the same LUN through the same FCP port because this would confuse the Host Bus Adapter (HBA), which IBM calls "FICON Express" cards. For example, Linux guest 1 asks to read LUN 587 block 32 and this is sent out a specific port, to a switch, to a disk system. Meanwhile, Linux guest 2 asks to read LUN 587 block 49. The data comes back to the z10 EC, which gives it to the correct z/VM LPAR, but then what? How does z/VM know which of the many Linux guests to give the data to? Both touched the same LUN, so it is unclear which made the request. To solve this, NPIV assigns a virtual "World Wide Port Name" (WWPN), up to 256 of them per physical port, so you can have up to 256 Linux guests sharing the same physical HBA port to access the same LUN. If you had 250 guests on each of six z/VM LPARs, and each LPAR had its own set of HBA ports, then all 1500 guests could access the same LUN.
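As a toy illustration of why the virtual WWPN resolves the ambiguity, the sketch below models one physical port handing out per-guest identifiers and using them to route responses. This is not how the FCP stack is actually implemented; the class, the guest names and the WWPN string format are all invented for the example, and only the 256-per-port limit comes from the text.

```python
MAX_VIRTUAL_WWPNS_PER_PORT = 256  # limit quoted in the text


class PhysicalPort:
    """Toy model of one physical HBA port shared by many guests."""

    def __init__(self, name):
        self.name = name
        self.wwpn_to_guest = {}

    def register_guest(self, guest):
        """Hand the guest its own virtual WWPN (the format here is made up)."""
        if len(self.wwpn_to_guest) >= MAX_VIRTUAL_WWPNS_PER_PORT:
            raise RuntimeError("no virtual WWPNs left on this port")
        wwpn = f"{self.name}-vwwpn-{len(self.wwpn_to_guest):03d}"
        self.wwpn_to_guest[wwpn] = guest
        return wwpn

    def route_response(self, wwpn):
        """A response tagged with a virtual WWPN identifies exactly one guest."""
        return self.wwpn_to_guest[wwpn]


port = PhysicalPort("port0")
wwpn_1 = port.register_guest("linux-guest-1")
wwpn_2 = port.register_guest("linux-guest-2")

# Both guests read the same LUN; the request is tagged with the guest's WWPN.
requests = [(wwpn_1, 587, 32), (wwpn_2, 587, 49)]
for wwpn, lun, block in requests:
    print(f"LUN {lun} block {block} goes back to {port.route_response(wwpn)}")
```

Without the per-guest identifier, the two requests would differ only by block number, which says nothing about who asked.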
Yes, the z10 EC machines support Sysplex. The concept is confusing, but "Sysplex" in IBM terminology just means that you can have LPARs either on the same machine or on separate mainframes, all sharing the same time source, whether this be a "Sysplex Timer" or by using the "Server Time Protocol" (STP). The z10 EC can have STP over 6 Gbps Infiniband over distance. If you wanted to have all 1500 Linux guests time stamp data identically, all six z/VM LPARs need access to the shared time source. This can help in a re-do or roll-back situation for Oracle databases to complete or back-out "Units of Work" transactions. This time stamp is also used to form consistency groups in "z/OS Global Mirror", formerly called "XRC" for Extended Remote Distance Copy. Currently, the "timestamp" on I/O applies only to z/OS and Linux and not other operating systems. (The time stamp is done through the CDK driver on Linux, and contributed back to the open source community so that it is available from both Novell SUSE and Red Hat distributions.) To have XRC have consistency between z/OS and Linux, the Linux guests would need to access native CKD volumes, rather than VM Minidisks or FCP-oriented LUNs.
Note: this is different than "Parallel Sysplex" which refers to having up to 32 z/OS images sharing a common "Coupling Facility" which acts as shared memory for applications. z/VM and Linux do not participate in "Parallel Sysplex".
As for the price, mainframes list for as little as "six figures" to as much as several million dollars, but I have no idea how much this particular model would cost. And, of course, this is just the hardware cost. I could not find the math for the $667 per server replacement you mentioned, so I don't have details on that. You would need to purchase z/VM licenses, and possibly support contracts for Linux on System z to be fully comparable to all of the software license and support costs of the VMware, Solaris, Linux and/or Windows licenses you run on the x86 machines.
This is where a lot of the savings come from, as a lot of software is licensed "per processor" or "per core", and so software on 64 mainframe processors can be substantially less expensive than 1500 processors or 3000 cores. IBM does "eat its own cooking" in this case. IBM is consolidating 3900 one-application-each rack-mounted servers onto 30 mainframes, for a ratio of 130-to-1 and getting amazingly reduced TCO. The savings are in the following areas:
Hardware infrastructure. It's not just servers, but racks, PDUs, etc. It turns out to be less expensive to incrementally add more CPU and storage to an existing mainframe than to add or replace older rack-em-and-stack-em servers with newer models of the same.
Cables. Virtual servers can talk to each other in the same machine virtually, such as HiperSockets, eliminating many cables. NPIV allows many guests to share expensive cables to external devices.
Networking ports. Both LAN and SAN networking gear can be greatly reduced because fewer ports are needed.
Administration. We have universities that can offer a guest image for every student without having a major impact to the sys-admins, as the students can do much of their administration remotely, without having physical access to the machinery. Companies that use mainframes to host hundreds of virtual guests find reductions too!
Connectivity. Consolidating distributed servers in many locations to a mainframe in one location allows you to reduce connections to the outside world. Instead of sixteen OC3 lines for sixteen different data centers, you could have one big OC48 line instead to a single data center.
Software licenses. Licenses based on servers, cores or CPUs are reduced when you consolidate to the mainframe.
Floorspace. Generally, floorspace is not in short supply in the USA, but in other areas it can be an issue.
Power and Cooling. IBM has experienced significant reduction in power consumption and cooling requirements in its own consolidation efforts.
All of the components of DFSMS (including DFP, DFHSM, DFDSS and DFRMM) were merged into a single product, "DFSMS for z/OS", which is now an included element in the base z/OS operating system. As a result of these, customers typically have 80 to 90 percent utilization on their mainframe disk. For the 1500 Linux guests, however, most of the DFSMS features of z/OS do not apply. These functions were not "ported over" to z/VM or to Linux on any platform.
Instead, the DFSMS concepts have been re-implemented into a new product called "Scale-Out File Services" (SOFS) which would provide NAS interfaces to a blended disk-and-tape environment. The SOFS disk can be kept at 90 percent utilization because policies can place data, move data and even expire files, just like DFSMS does for z/OS data sets. SOFS supports standard NAS protocols such as CIFS, NFS, FTP and HTTP, and these could be accessed from the 1500 Linux guests over an Ethernet Network Interface Card (NIC), which IBM calls "OSA Express" cards.
Lastly, IBM z10 EC is not emulating x86 or x86-64 interfaces for any of these workloads. No doubt IBM and AMD could collaborate together to come up with an AMD Opteron emulator for the S/390 chipset, and load Windows 2003 right on top of it, but that would just result in all kinds of emulation overhead. Instead, Linux on System z guests can run comparable workloads. There are many Linux applications that are functionally equivalent or the same as their Windows counterparts. If you run Oracle on Windows, you could run Oracle on Linux. If you run MS Exchange on Windows, you could run Bynari on Linux and let all of your Outlook Express users not even know their Exchange server had been moved! Linux guest images can be application servers, web servers, database servers, network infrastructure servers, file servers, firewall, DNS, and so on. For nearly any business workload you can assign to an x86 server in a datacenter, there is likely an option for Linux on System z.
Hope this answers all of your questions, Jon. These were estimates based on basic assumptions. This is not to imply that IBM z10 EC and VMware are the only technologies that help in this area; you can certainly find virtualization on other systems and through other software. I have asked IBM to make public the "TCO framework" that sheds more light on this. As they say, "Your mileage may vary."
For more on this series, check out the following posts:
If in your travels, Jon, you run into someone interested to see how IBM could help consolidate rack-mounted servers over to a z10 EC mainframe, have them ask IBM for a "Scorpion study". That is the name of the assessment that evaluates a specific client situation, and can then recommend a more accurate estimate configuration.
Some people find it surprising that it is often more cost-effective, and power-efficient, to run workloads on mainframe logical partitions (LPARs) than a stack of x86 servers running VMware.
Perhaps they won't be surprised any more. Here is an article in eWeek that explains how IBM is reducing energy costs 80% by consolidating 3,900 rack-optimized servers to 33 IBM System z mainframe servers, running Linux, in its own data centers. Since 1997, IBM has consolidated its 155 strategic worldwide data center locations down to just seven.
I am very pleased that IBM has invested heavily into Linux, with support across servers, storage, software and services. Linux is allowing IBM to deliver clever, innovative solutions that may not be possible with other operating systems. If you are in storage, you should consider becoming more knowledgeable in Linux.
The older systems won't just end up in a landfill somewhere. Instead, the details are spelled out in the IBM Press Release:
As part of the effort to protect the environment, IBM Global Asset Recovery Services, the refurbishment and recycling unit of IBM, will process and properly dispose of the 3,900 reclaimed systems. Newer units will be refurbished and resold through IBM's sales force and partner network, while older systems will be harvested for parts or sold for scrap. Prior to disposition, the machines will be scrubbed of all sensitive data. Any unusable e-waste will be properly disposed following environmentally compliant processes perfected over 20 years of leading environmental skill and experience in the area of IT asset disposition.
Whereas other vendors might think that some operational improvements will be enough, such as switching to higher-capacity SATA drives, or virtualizing x86 servers, IBM recognizes that sometimes more fundamental changes are required to effect real changes and real results.
People are confused over various orders of magnitude. News of the economic meltdown often blurs the distinction between millions (10^6), billions (10^9), and trillions (10^12). To show how different these three numbers are, consider the following:
A million seconds ago - you might have received your last paycheck (12 days)
A billion seconds ago - you were born or just hired on your current job (31 years)
A trillion seconds ago - cavemen were walking around in Asia (31,000 years)
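If you want to check these yourself, the conversion is a one-liner each. This is a minimal sketch; the only assumption is an average year of 365.25 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years


def seconds_to_days(seconds):
    """Convert a span in seconds to days."""
    return seconds / SECONDS_PER_DAY


def seconds_to_years(seconds):
    """Convert a span in seconds to years."""
    return seconds / (SECONDS_PER_DAY * DAYS_PER_YEAR)


million_in_days = seconds_to_days(10**6)
billion_in_years = seconds_to_years(10**9)
trillion_in_years = seconds_to_years(10**12)

print(f"million seconds:  {million_in_days:.1f} days")
print(f"billion seconds:  {billion_in_years:.1f} years")
print(f"trillion seconds: {trillion_in_years:,.0f} years")
```

Each step up is a factor of a thousand, which is exactly the gap between a couple of weeks, a career, and the Stone Age.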
That these numbers confuse the average person is no surprise, but that they confuse marketing people in the storage industry is hilarious. I am often correcting people who misunderstand MB (million bytes), GB (billion bytes) and TB (trillion bytes) of information. Take this graph as an example from a recent presentation.
At first, it looks reasonable: back in 2004, black-and-white 2D X-Ray images were only 1MB in size when digitized, but by 2010 there will be fancy 4D images that now take 1TB, representing a 1000x increase. What? When I pointed out this discrepancy, the person who put this chart together didn't know what to fix. Were 4D images only 1GB in size, or was it really a 1,000,000x increase?
If a 2D image was 1000 by 1000 pixels, and each pixel was a byte of information, then a 3D image might either be 1000 by 1000 by 1000 [voxels], or 1000 by 1000 at 1000 frames per second (fps). The former is 3D volumetric space, and the latter is called 2D+time in the medical field; the rest of us just say "video". 4D images are 3D+time, volumetric scans over time, so conceivably these could be quite large in size.
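Here is a quick sketch of how those sizes stack up, using the one-byte-per-pixel figure from the paragraph above. The 1000 time steps for the 4D case is my own round-number assumption to match the 3D examples, not a property of any particular scanner.

```python
SIDE = 1000          # pixels (or voxels) along each spatial axis
BYTES_PER_SAMPLE = 1
TIME_STEPS = 1000    # assumption: a 4D scan captures 1000 volumes over time

bytes_2d = SIDE * SIDE * BYTES_PER_SAMPLE   # flat image
bytes_3d = bytes_2d * SIDE                  # volume, or equally 1000 video frames
bytes_4d = bytes_3d * TIME_STEPS            # volumes over time

growth_2d_to_4d = bytes_4d // bytes_2d

print(f"2D: {bytes_2d:,} bytes (1MB)")
print(f"3D: {bytes_3d:,} bytes (1GB)")
print(f"4D: {bytes_4d:,} bytes (1TB)")
print(f"2D to 4D growth: {growth_2d_to_4d:,}x")
```

Under these assumptions the 1MB and 1TB figures on the chart could both be right, in which case it is the 1000x label that was off by three zeros.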
The key point is that advances in medical equipment result in capturing more data, which canhelp provide better healthcare. This would be the place I normally plug an IBM product, like the Grid Medical Archive Solution [GMAS], a blended disk and tape storage solution designed specifically for this purpose.
So, as government agencies look to spend billions of dollars to provide millions of people with proper healthcare, choosing to spend some of this money on a smarter infrastructure can result in creating thousands of jobs and save everyone a lot of money, but more importantly, save lives.
Short 2-minute [video] argues the case for Smarter Healthcare
For more on this, check out Adam Christensen's blog post on [Smarter Planet], which points to a podcast by Dr. Russ Robertson, chairman of the Counsel of Medical Education at Northwestern University’s Feinberg School of Medicine, and Dan Pelino, general manager of IBM's Healthcare and Life Sciences Industry.
Wrapping up this week's theme on ways to make the planet smarter, and less confusing, I present IBM's third annual [five in five]. These are five IBM innovations to watch over the next five years, all of which have implications on information storage. Here is a quick [3-minute video] that provides the highlights:
This week is Thanksgiving holiday in the USA, so I thought a good theme would be things I am thankful for.
I'll start by saying that I am thankful EMC finally announced Atmos last week. This was the "Maui" part of the Hulk/Maui rumors we heard over a year ago. To quickly recap, Atmos is EMC's latest storage offering for global-scale storage intended for Web 2.0 and Digital Archive workloads. Atmos can be sold as just software, or combined with Infiniflex, EMC's bulk, high-density commodity disk storage systems. Atmos supports traditional NFS/CIFS file-level access, as well as SOAP/REST object protocols.
I'm thankful for various reasons. Here's a quick list:
It's hard to compete against "vaporware"
Back in the 1990s, IBM was trying to sell its actual disk systems against StorageTek's rumored "Iceberg" project. It took StorageTek some four years to get this project out, but in the meantime, we were comparing actual versus possibility. The main feature is what we now call "Thin Provisioning". Ironically, StorageTek's offering was not commercially successful until IBM agreed to resell this as the IBM RAMAC Virtual Array (RVA).
Until last week, nobody knew the full extent of what EMC was going to deliver on the many Hulk/Maui theories. Several hinted as to what it could have been, and I am glad to see that Atmos falls short of those rumored possibilities. This is not to say that Atmos can't reach its potential, and certainly some of the design is clever, such as offering native SOAP/REST access.
Instead, IBM now can compare Atmos/Infiniflex directly to the features and capabilities of IBM's Scale Out File Services [SoFS], which offers a global-scale multi-site namespace with policy-based data movement, IBM System Storage Multilevel Grid Access Manager [GAM] that manages geographically distributed information, and IBM [XIV Storage System] that offers high-density bulk storage.
Web 2.0 and Digital Archive workloads justify new storage architectures
When I presented SoFS and XIV earlier this year, I mentioned they were designed for the fast-growing Web 2.0 and Digital Archive workloads that were unique enough to justify their own storage architectures. One criticism was that SoFS appeared to duplicate what could be achieved with dozens of IBM N series NAS boxes connected with Virtual File Manager (VFM). Why invent a new offering with a new architecture?
With the Atmos announcement, EMC now agrees with IBM that the Web 2.0 and Digital Archive workloads represent a unique enough "use case" to justify a new approach.
New offerings for new workloads will not impact existing offerings for existing workloads
I find it amusing that EMC is quickly defending that Atmos will not eat into its DMX business, which is exactly the FUD they threw out about IBM XIV versus DS8000 earlier this year. In reality, neither the DS8000 nor the DMX were used much for Web 2.0 and Digital Archive workloads in the past. Companies like Google, Amazon and others had to either build their own from piece parts, or use low-cost midrange disk systems.
Rather, the DS8000 and DMX can now focus on the workloads they were designed for, such as database applications on mainframe servers.
Cloud-Oriented Storage (COS)
Just when you thought we had enough terminology already, EMC introduces yet another three-letter acronym [TLA]. Kudos to EMC for coining phrases to help move new concepts forward.
Now, when an RFP asks for Cloud-oriented storage, I am thankful this phrase will help serve as a trigger for IBM to lead with SoFS and XIV storage offerings.
Digital archives are different than Compliance Archives
EMC was also quick to point out that object-storage Atmos was different from their object-storage EMC Centera, the former being for "digital archives" and the latter for "compliance archives". Different workloads, different use cases, different offerings.
Ever since IBM introduced its [IBM System Storage DR550] several years ago, EMC Centera has been playing catch-up to match IBM's many features and capabilities. I am thankful the Centera team was probably too busy to incorporate Atmos capabilities, so it was easier to make Atmos a separate offering altogether. This allows the IBM DR550 to continue to compete against Centera's existing feature set.
Micro-RAID arrays, logical file and object-level replication
I am thankful that one of the Atmos policy-based features is replicating individual objects, rather than LUN-based replication and protection. SoFS supports this for logical files regardless of their LUN placement, GAM supports replication of files and medical images across geographical sites in the grid, and the XIV supports this for 1MB chunks regardless of their hard disk drive placement. The 1MB chunk size was based on the average object size from established Web 2.0 and Digital Archive workloads.
I tried to explain the RAID-X capability of the XIV back in January, under much criticism that replication should only be done at the LUN level. I am thankful that Marc Farley on StorageRap coined the phrase [Micro-RAID array] to help move this new concept further. Now, file-level, object-level and chunk-level replication can be considered mainstream.
Much larger minimum capacity increments
The original XIV in January was 51TB capacity per rack, and this went up to 79TB per rack for the most recent IBM XIV Release 2 model. Several complained that nobody would purchase disk systems at such increments. Certainly, small and medium size businesses may not consider XIV for that reason.
I am thankful Atmos offers 120TB, 240TB and 360TB sizes. The companies that purchase disk for Web 2.0 and Digital Archive workloads do purchase disk capacity in these large sizes. Service providers add capacity to the "Cloud" to support many of their end-clients, and so purchasing disk capacity to rent back out represents a revenue-generating opportunity.
Renewed attention on SOAP and REST protocols
IBM and Microsoft have been pushing SOA and Web Services for quite some time now. REST, which stands for [Representational State Transfer], allows static and dynamic HTML message passing over standard HTTP. SOAP, which was originally [Simple Object Access Protocol], and then later renamed to "Service Oriented Architecture Protocol", takes this one step further, allowing different applications to send "envelopes" containing messages and data between applications using HTTP, RPC, SMTP and a variety of other underlying protocols. Typically, these messages are simple text surrounded by XML tags, easily stored as files, or rows in databases, and served up by SOAP nodes as needed.
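To make the "envelope" idea concrete, here is a minimal sketch that builds a SOAP 1.1-style envelope as plain XML text using Python's standard library. The payload element name and object identifier are hypothetical, and this does not represent the Atmos API or any other product's interface.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(payload_tag, payload_text):
    """Wrap a one-element text payload in an Envelope/Body pair and return XML text."""
    ET.register_namespace("soap", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    payload = ET.SubElement(body, payload_tag)
    payload.text = payload_text
    return ET.tostring(envelope, encoding="unicode")


# "GetObject" and "photo-1234" are made-up names for illustration only.
envelope_xml = build_envelope("GetObject", "photo-1234")
print(envelope_xml)
```

Because the result is just tagged text, it can be sent over HTTP, dropped in a file, or stored as a database row, which is the point made above.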
It's hard to show leadership until there are followers
IBM's leadership sometimes goes unnoticed until followers create "me, too!" offerings or establish similar business strategies. IBM's leadership in Cloud and Grid computing is no exception. Atmos is the latest me-too product offering in this space, trying pretty much to address the same challenges that SoFS and XIV were designed for.
So, perhaps EMC is thankful that IBM has already paved the way, breaking through the ice on their behalf. I am thankful that perhaps I won't have to deal with as much FUD about SoFS, GAM and XIV anymore.
At IBM, our standard is to have a limit of 200MB per user mailbox. A few of us get exceptions and have up to 500MB limit because of the work we do. By comparison, my personal Gmail account is now up to 6500MB. When this limit is exceeded, you are unable to send out any mail until it is brought down below the limit, and a request to be "re-enabled for send" is approved, a situation we call "mail jail".
The biggest culprits are attachments. Only 10 percent of emails have attachments, but those that do take up 90 percent of the total space! People attach a 15MB presentation or document, and copy the world on a distribution list. Everyone saves their notes with these attachments, and soon, the limits are blown. Not surprisingly, deduplication has been cited as a "killer app" to address email storage, exactly for this reason. If all the users have their mailboxes stored on the same deduplication storage device, it might find these duplicate blocks, and manage to reduce the space consumed.
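To illustrate what a deduplicating device could save here, the sketch below hashes fixed-size blocks and stores each unique block once. It is a generic fixed-block approach for illustration, not the algorithm of any particular product; the 4096-byte block size, the mailbox names and the 15,000-byte stand-in for a 15MB presentation are all assumptions.

```python
import hashlib


def logical_and_physical_bytes(mailboxes, block_size=4096):
    """Return (bytes as users see them, bytes after fixed-block deduplication)."""
    seen_digests = set()
    logical = 0
    physical = 0
    for attachments in mailboxes.values():
        for data in attachments:
            for offset in range(0, len(data), block_size):
                block = data[offset:offset + block_size]
                logical += len(block)
                digest = hashlib.sha256(block).digest()
                if digest not in seen_digests:
                    # First time this block content is seen: it must be stored.
                    seen_digests.add(digest)
                    physical += len(block)
    return logical, physical


# Stand-in for a 15MB presentation, scaled down to 15,000 bytes.
presentation = bytes(i % 251 for i in range(15_000))

# Ten recipients each save their own copy of the same attachment.
mailboxes = {f"user{n}": [presentation] for n in range(10)}

logical, physical = logical_and_physical_bytes(mailboxes)
print(f"logical {logical:,} bytes, physical {physical:,} bytes")
```

Ten saved copies cost the space of one, but only if every mailbox lands on the same device, which is why avoiding the attachment in the first place is the better practice.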
A better practice would be to avoid this in the first place. Here are the techniques I use instead:
Point to the document in a database
We are heavy users of Lotus Notes databases. These can be encrypted and controlled with Access Control Lists (ACL) that determine who can create or read documents in each database. Annually, all the database ACLs are validated so that people can confirm that they continue to have a need-to-know for the documents in each database. Sending a confidential document as a "document link" to a database entry takes only a few bytes, and all the recipients that are already on the ACL have access to that document.
Point to the document on a web page
If the document is available on an internal or external website, just send the URL instead of attaching the file. Again, this takes only a few bytes. We have websites accessible only to all internal employees, websites that can be accessed only by a subset of employees with special permissions and credentials based on their job role, and websites that are accessible to our IBM Business Partners.
In my case, if I happen to have a blog posting that answers a question or helps illustrate an idea, I will send the "permalink" URL of that blog post in my email.
Point to the document on shared NAS file system
Internally, IBM uses a "Global Storage Architecture" (GSA) based on IBM's Scale-Out File Services [SoFS] with everyone getting initially 10GB of disk space to store files, with the option to request more if needed. The system has policy-based support for placing and migrating older data to tape to reduce actual disk usage, and combines a clustered file system with a global name space.
My SoFS space is now up to 25GB, and I store a lot of presentations and whitepapers that are useful to others. A URL with "ftp://" or "http://" is all you need to point to a file in this manner, and greatly reduces the need for attachments. I can map my space as "Drive X:" on my Windows system, or as an NFS mount point on my Linux system, which allows me to easily drag files back and forth.
Departments that don't need to offer "worldwide access" use NAS boxes instead, such as the IBM System Storage N series.
Pointing to files in a shared space, rather than as attachments in email, may take some getting used to. I've had a few recipients send me requests such as "can you send that as an attachment (not a URL)" because they plan to read it on the airplane or train, where they won't have online connectivity.
"Have you invested in the latest and greatest in collaboration technology but still feel people are still not collaborating? How many Microsoft Sharepoint servers and IBM Quickplaces remain relatively untouched or only used by the organization's technorati? I think it's a big problem because this narrow view of collaboration starts to get the concept a bad name: "yeah, we did collaboration but no one used it." And then there the issue of the vast amount of money wasted and opportunities lost. We can't afford to loose faith in collaboration because the external environment is moving in a direction that mandates we collaborate. The problems we face now and into the future will only increase in complexity and it will require teams of people within and across organizations to solve them."
Well, sending pointers instead of attachments works for me, and has kept me out of "mail jail" for quite some time now.
Federal Rules for Civil Procedures (FRCP) will increase adoption of unstructured data classification, email archive systems and CAS.
CAS continues to flounder, but the rest I can agree with. Regulations are being adopted worldwide. Japan has its own Sarbanes-Oxley (SOX) style legislation going into effect in 2008. IBM TotalStorage Productivity Center for Data is a great tool to help classify unstructured file systems. IBM CommonStore for email supports both Microsoft Exchange and Lotus Domino, and can be connected to IBM System Storage DR550 for compliance storage.
Unified storage systems (combined file and block storage target systems) will become increasingly attractive in 2007, because of their ease of use and simplicity.
I agree with this one also. Our sales of IBM N series in 2006 were great, and we are looking to continue that strong growth in 2007. The IBM N series brings together FCP, iSCSI and NAS protocols into one disk system. With the SnapLock(tm) feature, N series can store both re-writable data and non-erasable, non-rewriteable data on the same box. Combine the N series gateway on the front-end with SAN Volume Controller on the back-end, and you have an even more powerful combination.
Distributed ROBO backup to disk will emerge as the fastest growing data protection solution in 2007.
IDC had a similar prediction for 2006. ROBO refers to "Remote Office/Branch Office", and so ROBO backup deals with how to back up data that is out in the various remote locations. Do you back it up locally, or send it to a central location? Fortunately, IBM Tivoli Storage Manager (TSM) supports both ways, and IBM has introduced small disk and tape drives and auto-loaders that can be used in smaller environments like this. I don't know whether "backup to disk" will be the fastest growing, but I certainly agree that a variety of ROBO-related issues will be of interest this year.
2007 will be remembered as the year iSCSI SAN took off because of the much reduced pricing for 10 Gbit iSCSI and the continued deployment of 10 Gbit iSCSI targets.
While I agree that iSCSI is important, I can't say 2007 will be remembered for anything. We have terrible memory for these things. Ask someone what year Personal Computers (PCs) took off, and they will tell you about Apple's famous 1984 commercial. Ask someone when the Internet took off, cell phones took off, etc., and I suspect most will provide widely different answers, most likely based on their own experience.
For the longest time, I resisted getting a cell phone. I had a roll of quarters in my car, and when I needed to make a call, I stopped at the nearby pay-phone, and made the call. In 1998, pay phones disappeared. You can't find them anymore. That was the year cell phones took off, at least for me.
Back to iSCSI, now that you can intermix iSCSI and SAN on the same infrastructure, either through intelligent multi-protocol switches available from your local IBM rep, or through an N series gateway, you can bring iSCSI technology in slowly and gradually. Low-cost copper wiring for 10 Gbps Ethernet makes all this very practical.
Another up-and-coming technology is AoE, or ATA-over-Ethernet. Same idea as iSCSI, but taken down to the ATA level.
CDP will emerge as an important feature on comprehensive data protection products instead of a separate managed product.
Here, CDP stands for Continuous Data Protection. Normal backups work like a point-and-shoot camera, taking a picture of the data once every midnight, for example. CDP can record all the little changes like a video camera, with the option to rewind or fast-forward to a specific point in the day. IBM Tivoli CDP for Files, for example, is an excellent complement to IBM Tivoli Storage Manager.
The technology is not really new, as it has been implemented as "logs" or "journals" on databases like DB2 and Oracle, as well as business applications like SAP R/3.
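To make the camera analogy concrete, here is a minimal Python sketch of the journal idea: every change is appended with a timestamp, and any point in time can be rebuilt by replaying the journal up to that moment. The `ChangeJournal` class and its method names are hypothetical, invented for illustration; this is not how IBM Tivoli CDP for Files or any database log is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeJournal:
    """Hypothetical append-only journal of (timestamp, name, content) changes."""
    entries: list = field(default_factory=list)

    def record(self, timestamp, name, content):
        # Append every change; nothing is overwritten. Timestamps are
        # assumed to arrive in non-decreasing order.
        self.entries.append((timestamp, name, content))

    def state_at(self, timestamp):
        # "Rewind" by replaying changes up to and including the timestamp.
        state = {}
        for ts, name, content in self.entries:
            if ts > timestamp:
                break
            state[name] = content
        return state


journal = ChangeJournal()
journal.record(9, "report.doc", "draft 1")
journal.record(11, "report.doc", "draft 2")
journal.record(15, "report.doc", "draft 3 (oops)")

# A midnight-only backup would offer just one of these versions;
# the journal can return the file as it stood at any hour.
morning = journal.state_at(12)
```

The trade-off is space: the journal keeps every change, so a real product needs some policy for how far back the "video" goes.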
The prediction here, however, relates to packaging. Will vendors "package" CDP into existing backup products, possibly as a separately priced feature, or will they leave it as a separate product that perhaps, as in IBM's case, is already well integrated?
The VTL market growth will continue at a much reduced rate as backup products provide equivalent features directly to disk. Deduplication will extend the VTL market temporarily in 2007.
VTL here refers to Virtual Tape Library, such as IBM TS7700 or TS7510 Virtualization Engine. IBM introduced the first one in 1997, the IBM 3494 Virtual Tape Server, and we have remained number one in marketshare for virtual tape ever since. I find it amusing that people are only now looking at VTL technology to help with their Disk-to-Disk-to-Tape (D2D2T) efforts, when IBM Tivoli Storage Manager has had the capability to back up to disk, then move to tape, since 1993.
As for deduplication, if you need the end-target box to deduplicate your backups, then perhaps you should investigate why you are doing this in the first place. People take full-volume backups, and keep too many copies of them, when more sophisticated backup software like Tivoli Storage Manager can implement backup policies to avoid this with a progressive backup scheme. Or maybe you need to investigate why you store multiple copies of the same data on disk; perhaps NAS or a clustered file system like IBM General Parallel File System (GPFS) could provide you a single copy accessible to many servers instead.
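For readers unfamiliar with what a deduplicating target does, the idea can be sketched in a few lines of Python: store each unique piece of content once, keyed by its hash, and let every backup copy simply reference it. The `SingleInstanceStore` name and the whole-file granularity are assumptions made for illustration; real products typically work on smaller chunks and differ in many details.

```python
import hashlib


class SingleInstanceStore:
    """Hypothetical store that keeps one copy of each unique content blob."""

    def __init__(self):
        self.blobs = {}    # SHA-256 digest -> content bytes
        self.catalog = {}  # (backup name, file name) -> digest

    def put(self, backup, filename, content):
        digest = hashlib.sha256(content).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self.blobs.setdefault(digest, content)
        self.catalog[(backup, filename)] = digest

    def get(self, backup, filename):
        return self.blobs[self.catalog[(backup, filename)]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


store = SingleInstanceStore()
payload = b"quarterly results" * 1000
# Three full backups of an unchanged file...
for night in ("mon", "tue", "wed"):
    store.put(night, "results.xls", payload)
# ...consume the space of one copy.
print(store.stored_bytes(), "bytes stored for", len(store.catalog), "backup entries")
```

A progressive backup scheme avoids the problem earlier, by not sending the unchanged file a second time at all.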
The reason you don't see deduplication on the mainframe is that DFSMS for z/OS already allows multiple servers to share a single instance of data, and has been doing so since the early 1980s. I often joke with clients at the Tucson Executive Briefing Center that you can run a business with a million data sets on the mainframe, and that there were probably a million files on just the laptops in the room, but few would attempt to run their business that way.
Optical storage that looks, feels and acts like NAS and puts archive data online, will make dramatic inroads in 2007.
Marc says he's going out on a limb here, and it's good to make at least one risky prediction. IBM used to have an optical library that emulated disk, called the IBM 3995. Lack of interest and advancement in technology encouraged IBM to withdraw it. A small backlash ensued, so IBM now offers the IBM 3996 for the System p and System i clients that really, really want optical.
As for optical making data available "online", it takes about 20 seconds to load an optical cartridge, so I would consider this more "nearline" than online. Tape is still in the 40-60 second range to load and position to data, so optical is still at an advantage.
Optical eliminates the "hassles of tape"? Tape data is good for 20 years, and optical for 100 years, but nobody keeps drives around that long anyways. In general, our clients change drives every 6-8 years, and migrate the data from old to new. This is only a hassle if you didn't plan for this inevitable movement. IBM Tivoli Storage Manager, IBM System Storage Archive Manager, and the IBM System Storage DR550 all make this migration very simple and easy, and can do it with either optical or tape.
The Blu-ray vs. HD DVD debate will continue through 2007 in the consumer world. I don't see this being a major player in more conservative data centers where a big investment in the wrong choice could be costly, even if the price-per-TB is temporarily in line with current tape technologies. IBM and others are investing a lot of Research and Development funding to continue the downward price curve for tape, and I'm not sure that optical can keep up that pace.
Well, that's my take. It is a sunny day here in China, and I have more meetings to attend.
For those of us in the northern hemisphere, yesterday was this year's Winter Solstice, representing the shortest amount of daylight between sunrise and sunset. So today, I thought I would blog on my thoughts of managing scarcity.
Earlier in my career, I had the pleasure to serve as "administrative assistant" to Nora Denzel for the week at a storage conference. My job was to make her look good at the conference, which if you know Nora, doesn't take much. Later, she left IBM to work at HP, and I got to hear her speak at a conference, and the one thing that I remember most was her statement that the whole point of "management" was to manage scarcity, as in not enough money in the budget, not enough people to implement change, or not enough resources to accomplish a task. (Nora, I have no idea where you are today, so if you are reading this, send me a note).
Of course, the flip-side to this is that resources that are in abundance are generally taken for granted. Priorities are focused on what is most scarce. Let's examine some of the resources involved in an IT storage environment:
Capacity - while everyone complains that they are "running out of space", the truth is that most external disk attached to Linux, UNIX, or Windows systems contains only 20-40% data. Many years ago, I visited an insurance company to talk about a new product called IBM Tivoli Storage Manager. This company had 7TB of disk on their mainframe, and another 7TB of disk scattered on various UNIX and Windows machines. In the room were TWO storage admins for the mainframe, and 45 storage admins for the distributed systems. My first question was "why so many people for the mainframe, certainly one of you could manage all of it yourself, perhaps on Wednesday afternoons?" Their response was that they acted as each other's backup, in case one goes on vacation for two weeks. My follow-up question to the rest of the audience was: "When was the last time you took two weeks vacation?" Mainframes fill their disk and tape storage comfortably to 80-90% full of data, primarily because they have a more mature, robust set of management software, like DFSMS.
Labor - by this I mean skilled labor able to manage storage for a corporation. Some companies I have visited keep their new-hires off production systems for the first two years, working only on test or development systems until then. Of course, labor is more expensive in some countries than others. Last year, I was doing a whiteboard session on-site for a client in China, and the last dry-erase pen ran out of ink. I asked for another pen, and they instead sent someone to go re-fill it. I asked wouldn't it be cheaper just to buy another pen, and they said "No, labor is cheap, but ink is expensive." Despite this, China does complain that there is a shortage of skilled IT labor, so if you are looking for a job, start learning Mandarin.
Power and Cooling - Most data centers are located on raised floors, with large trunks of electrical power and huge air conditioning systems to deal with all the heat generated from each machine. I have visited the data centers of clients that are now forced to make decisions on storage based on power and cooling consumption, because the costs to upgrade their aging buildings are too high. Leading the charge is IBM, with technology advancements in chips, cards, and complete systems that use less power, and generate less heat. While energy is still fairly cheap in the grand scheme of things, with fears of Global Warming and declining oil supplies, the costs of power and cooling have gotten some news lately. In 1956, Hubbert predicted the US would reach peak oil supplies by 1965-1970 (it happened in 1971), and this year Simmons estimated that world-wide oil production already began its decline in 2005. Smart companies like Google have moved their server farms to places like Oregon in the Pacific Northwest for cheaper hydroelectric power.
Bandwidth - Last year IBM introduced 4Gbps Fibre Channel and FICON SAN networking gear, along with the servers and storage needed to complete the solution. 4Gbps equates to about 400 MB/sec in data throughput. By comparison, iSCSI is typically run on 1Gbps Ethernet, but has so much overhead that you only get about 80 MB/sec. Next year, we may see both 8 Gbps SAN, and 10 GbE iSCSI, to provide 800 MB/sec throughputs. My experience is that the SAN is not the bottleneck; instead, people run out of bandwidth at the server or storage end first. They may not have a million dollars to buy the fastest IBM System p5 servers, or may not have enough host adapters at the storage system end.
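The arithmetic behind these figures can be sketched as follows. The 80 percent factor for Fibre Channel reflects 8b/10b encoding (10 bits on the wire per data byte), a commonly cited rule of thumb, and the 64 percent efficiency for iSCSI is simply back-calculated from the 80 MB/sec figure above; both are assumptions, not measurements.

```python
def throughput_mb_per_sec(gbps, efficiency):
    """Convert a nominal link speed in Gbps to usable MB/sec.

    `efficiency` is the assumed fraction of raw bits that carry payload.
    """
    raw_mb_per_sec = gbps * 1000 / 8  # 8 bits per byte, decimal megabytes
    return raw_mb_per_sec * efficiency


# Assumption: 8b/10b encoding leaves 80% of the bits for data.
fc_4g = throughput_mb_per_sec(4, 0.80)

# Assumption: 64% efficiency, chosen to match the ~80 MB/sec quoted above.
iscsi_1g = throughput_mb_per_sec(1, 0.64)

print(f"4 Gbps Fibre Channel: about {fc_4g:.0f} MB/sec")
print(f"1 Gbps iSCSI: about {iscsi_1g:.0f} MB/sec")
```

Plug in your own efficiency figure if you have measured one; the point is only that the nominal Gbps number overstates what the application actually sees.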
Floorspace - I end with floorspace because it reminds me that many "shortages" are temporary or artificially created. Floorspace is only in short supply because you don't want to knock down a wall, or build a new building, to handle your additional storage requirements. In 1997, Tihamer Toth-Fejel wrote an article for the National Space Society newsletter that estimated that "...Everybody on Earth could live comfortably in the USA on only 15% of our land area, with a population density between that of Chicago and San Francisco. Using agricultural yields attained widely now, the rest of the U.S. would be sufficient to grow enough food for everyone. The rest of the planet, 93.7% of it, would be completely empty." Of course, back in 1997 the world population was only 5.9 billion, and this year it is over 6.5 billion.
This last point brings me back to the concept of food, and I am not talking about doughnuts in the conference room, or pizza while making year-end storage upgrades. I'm talking about the food you work so hard to provide for yourself and your family. The folks at Oxfam came up with a simple analogy. If 20 people sit down at your table, representing the world’s population:
3 would be served a gourmet, multi-course meal, while sitting at a decorated table in a cushioned chair.
5 would eat rice and beans with a fork and sit on a simple cushion
12 would wait in line to receive a small portion of rice that they would eat with their hands while sitting on the floor.
So for those of you planning a special meal next Monday, be thankful you are one of the lucky three, and hopeful that IBM will continue to lead the IT industry to help out the other seventeen.
It has always been the case in fast-paced technology areas that you can't tell the players without a program card, and this is especially true for storage.
When analyzing each acquisition move, you need to think of what is driving it. What are the motives? Having been in the storage business 20 years now, and seen my share of acquisitions, both from within IBM, as well as competition, I have come up with the following list of motives.
Although slavery was abolished in the US back in the 1800s, and centuries earlier everywhere else, many acquisitions seem to be focused on acquiring the people themselves, rather than the products or client list. I have seen statistics such as "We retained 98% of the people!" In reality, these retentions usually involve costly incentives, sign-on bonuses, stock options, and the like. Despite this, people leave after a few years, often because of personality or "corporate culture" clash. For example, many former STK employees seem to be leaving after their company was acquired by Sun Microsystems.
If you can't beat them, join them. Acquisitions can often be used by one company to raise its ranking in marketshare, eliminating smaller competitors. And now that you have acquired their client list, perhaps you can sell them more of your original set of products!
Symantec had acquired Veritas, which in turn had acquired a variety of other smaller players, and the end result is that they are now the #1 backup software provider, even though none of their products holds a candle to IBM's Tivoli Storage Manager. Meanwhile, EMC acquired Avamar to try to get more into the backup/recovery game, but most analysts still find EMC down in the #4 or #5 place in this category.
Next month, Brocade's acquisition of McData should take effect, furthering its marketshare in SAN switch equipment.
Prior to my current role as "brand market strategist" for System Storage, I was a "portfolio manager" where we tried to make sure that our storage product line investments were balanced. This was a tough job, as we had to balance development investments across different technologies, including patent portfolios. Despite IBM's huge research budget, I am not surprised that some clever inventions of new technologies come from smaller companies, which then get acquired once their results appear viable.
The last motive is value shift. This is where companies try to re-invent themselves, or find that they are stuck in a commodity market rut, and wish to expand into more profitable areas.
LSI Logic's acquisition of StoreAge is a good example of this. Most of the major storage vendors have already shifted to software and services to provide customer value, as predicted in the 1990s by Clayton Christensen in his book "The Innovator's Dilemma". The rest are still struggling to develop the right strategy, but leaning in this general direction.
In last week's System Storage Portfolio Top Gun class in Dallas, some of the students were not familiar with Really Simple Syndication (RSS). For the uninitiated, this can be intimidating. I thought a quick overview of what I've done might help:
Choose a "feed reader". I chose Bloglines but there are many others.
Use Technorati to search other blogs for keywords or phrases I am looking for.
When I find a blog that I would like to continue tracking, I "add" it to my subscription list on Bloglines. Just hit "add" and copy the URL of the blog you want to track. Bloglines will figure out the RSS keywords required. I track eight blogs at the moment, but some people with lots of time on their hands track 20 or more. It is easy to unsubscribe, so don't be afraid to try some out for a few days.
Since I was actually going to run a blog of my own, I read a few books on the topic. One I recommend is "Naked Conversations" by Robert Scoble and Shel Israel, both experienced bloggers.
Finally, I am not big on spell checking, but most places have the option to preview your post or comment before it actually gets posted, which is not a bad idea if you use any HTML tags.
For a quick taste of blogging, consider using Data Storage Blogger Feed Reader. This has a lot of blogs on the topic of storage, already added and categorized for your convenience, ready for your perusal.
I am sure there are many other ways to enjoy the Blogosphere, but this works for me.
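For the curious, here is roughly what a feed reader does behind the scenes, sketched in Python with only the standard library. The sample feed below is invented for illustration (the URLs are placeholders), and a real reader such as Bloglines handles many more formats and edge cases than this.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 document; a real reader would download this from the
# blog's feed URL instead.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Storage Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>"""


def list_posts(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


for title, link in list_posts(SAMPLE_FEED):
    print(f"{title}: {link}")
```

Subscribing to a blog amounts to remembering its feed URL and repeating this parse every so often to see which items are new.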
I get a lot of suggestions for what to put on my blog. I realize that tweets are limited to 140 characters, so pointing to a video URL without much explanation or warning can be dangerous. An email can at least add appropriate warnings, such as NSFW (Not Safe For Work) or "sorry if this offends you". The only warning I got for a video posted to YouTube by "StorageNetworkDud" was this short email:
"Sorry about the language they have used in some translations, but not sure who put this. It was on twitter."
Fortunately, I have my browser set up not to automatically play YouTube videos. The title helped warn me of the content, which turned out to be a [fan-subbed] scene from a World War II movie with the brown-shirted tyrannical leader of an evil empire talking to his top generals. He dismisses all but three with "Hollis, Burke, and Twomey stay in here" followed by a lengthy recap of EMC's recent troubles in the marketplace. At least in the video, the fuhrer correctly follows Tim Sanders' advice: "if you have to tell someone bad news, say it in person."
While I understand that many people don't like EMC, the #3 storage vendor in the world, this type of "geek humor" hits a new low. The video was posted over a month ago, but in light of the recent [shooting in Washington DC], I felt it was just not appropriate to post it here.
Readers, I appreciate all the suggestions, but give me some better warning next time!
Continuing this week's theme on Cloud Computing, Dynamic Infrastructure and Data Center Networking, IBM unveiled details of an advanced computing system that will be able to compete with humans on Jeopardy!, America’s favorite quiz television show. Additionally, officials from Jeopardy! announced plans to produce a human vs. machine competition on the renowned show.
For nearly two years, IBM scientists have been working on a highly advanced Question Answering (QA) system, codenamed "Watson" after IBM's first president, [Thomas J. Watson]. The scientists believe that the computing system will be able to understand complex questions and answer with enough precision and speed to compete on Jeopardy! Produced by Sony Pictures Television, the trivia questions on Jeopardy! cover a broad range of topics, such as history, literature, politics, film, pop culture, and science. It poses a grand challenge for a computing system due to the variety of subject matter, the speed at which contestants must provide accurate responses, and because the clues given to contestants involve analyzing subtle meaning, irony, riddles, and other complexities at which humans excel and computers traditionally do not. Watson will incorporate massively parallel analytical capabilities and, just like human competitors, Watson will not be connected to the Internet or have any other outside assistance.
If this all sounds familiar, you might remember some of the events that have led up to this:
In 1984, the movie ["The Terminator"] introduced the concept of [Skynet], a fictional computer system developed by the military that becomes self-aware from its advanced artificial intelligence.
In 1997, an IBM computer called Deep Blue defeated World Chess Champion [Garry Kasparov] in a famous battle of human versus machine. To compete at chess, IBM built an extremely fast computer that could calculate 200 million chess moves per second based on a fixed problem. IBM’s Watson system, on the other hand, is seeking to solve an open-ended problem that requires an entirely new approach – mainly through dynamic, intelligent software – to even come close to competing with the human mind. Despite their massive computational capabilities, today’s computers cannot consistently analyze and comprehend sentences, much less understand cryptic clues and find answers in the same way the human brain can.
In 2005, Ray Kurzweil wrote [The Singularity is Near] referring to the wonders that artificial intelligence will bring to humanity.
The research underlying Watson is expected to elevate computer intelligence and human-to-computer communication to unprecedented levels. IBM intends to apply the unique technological capabilities being developed for Watson to help clients across a wide variety of industries answer business questions quickly and accurately.
In the post [Flowing Workflow], the folks over at Eightbar point to the latest 3D work being done with IBM Lotus Sametime.
IBM Sametime is IBM's instant messaging facility, which has been extended to include Voice over IP (VOIP) capability similar to Skype, and now is being developed as a launch point for 3D impromptu meetings "in-world", similar to [Second Life].
With many companies facing hard times and considering travel restrictions for face-to-face internal meetings, an information infrastructure that adopts this technology might be a reasonable alternative.
In Monday's post, [IBM Information Infrastructure launches today], I explained how this strategic initiative fit into IBM's New Enterprise Data Center vision. The launch was presented at the IBM Storage and Storage Networking Symposium to over 400 attendees in Montpellier, France, with corresponding standing-room-only crowds in New York and Tokyo.
This post will focus on Information Retention, the third of the four-part series this week.
Here's another short 2-minute video, on Information Retention
Let's start with some interesting statistics. Fellow blogger Robin Harris on his StorageMojo blog has an interesting post: [Our changing file workloads], which discusses the findings of a study titled "Measurement and Analysis of Large-Scale Network File System Workloads" [14-page PDF]. This paper was a collaboration between researchers from University of California Santa Cruz and our friends at NetApp. Here's an excerpt from the study:
Compared to Previous Studies:
Both of our workloads are more write-oriented. Read to write byte ratios have significantly decreased.
Read-write access patterns have increased 30-fold relative to read-only and write-only access patterns.
Most bytes are transferred in longer sequential runs. These runs are an order of magnitude larger.
Most bytes transferred are from larger files. File sizes are up to an order of magnitude larger.
Files live an order of magnitude longer. Fewer than 50 percent are deleted within a day of creation.
Files are rarely re-opened. Over 66 percent are re-opened once and 95% fewer than five times.
File re-opens are temporally related. Over 60 percent of re-opens occur within a minute of the first.
A small fraction of clients account for a large fraction of file activity. Fewer than 1 percent of clients account for 50 percent of file requests.
Files are infrequently shared by more than one client. Over 76 percent of files are never opened by more than one client.
File sharing is rarely concurrent and sharing is usually read-only. Only 5 percent of files opened by multiple clients are concurrent and 90 percent of sharing is read-only.
Most file types do not have a common access pattern.
Why are files being kept ten times longer than before? Because the information still has value:
Provide historical context
Gain insight to specific situations, market segment demographics, or trends in the greater marketplace
Help innovate new ideas for products and services
Make better, smarter decisions
National Public Radio (NPR) had an interesting piece the other day. By analyzing old photos, a researcher for Cold War Analysis was able to identify an interesting [pattern for Russian presidents]. (Be sure to listen to the 3-minute audio to hear a hilarious song about the results!)
Which brings me to my own collection of "old photos". I bought my first digital camera in the year 2000, and have taken over 15,000 pictures since then. Before that, I used a 35mm film camera, getting the negatives developed and prints made. Some of these date back to my years in High School and College. I have a mix of sizes, from 3x5, 4x6 and 5x7 inches, and sometimes I got double prints. Only a small portion are organized into scrapbooks. The rest are in envelopes, prints and negatives, in boxes taking up half of my linen closet in my house. Following the success of the [Library of Congress using flickr], I decided the best way to organize these was to have them digitized first. There are several ways to do this.
This method is just too time consuming. Lift the lid, place one or a few prints face down on the glass, close the lid, press the button, and then repeat. I estimate 70 percent of my photos are in [landscape orientation], and 30 percent in [portrait mode]. I can either spend extra time to orient each photo correctly on the glass, or rotate the digital image later.
I was pleased to learn that my Fujitsu ScanSnap S510 sheet-feed scanner can take in a short stack (dozen or so) of photos, and generate JPEG format files for each. I can select 150, 300 or 600dpi, and five levels of JPEG compression. All the photos feed in portrait mode, which I can then rotate later on the computer once digitized. A command line tool called [ImageMagick] can help automate the rotations. While I highly recommend the ScanSnap scanner, this is still a time-consuming process for thousands of photos.
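As a rough sketch of that automation, the Python below builds one ImageMagick `mogrify -rotate` command per file for the photos that need turning. The file names are hypothetical, and deciding which photos are landscape is still up to you; the script only prints the commands unless you ask it to run them, which assumes ImageMagick is installed and its `mogrify` program is on your PATH.

```python
import subprocess


def rotation_commands(filenames, degrees=90):
    """Build ImageMagick commands that rotate each file in place."""
    return [["mogrify", "-rotate", str(degrees), name] for name in filenames]


def rotate_all(filenames, degrees=90, dry_run=True):
    for command in rotation_commands(filenames, degrees):
        if dry_run:
            # Show what would be run without touching any files.
            print(" ".join(command))
        else:
            # Assumes the `mogrify` executable from ImageMagick is available.
            subprocess.run(command, check=True)


# Hypothetical names of scans that came out sideways.
landscape_scans = ["scan_0001.jpg", "scan_0002.jpg"]
rotate_all(landscape_scans, degrees=90, dry_run=True)
```

Since `mogrify` overwrites the file it is given, it would be prudent to try this on copies first.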
"The best way to save your valuable photos may be by eliminating the paper altogether. Consider making digital images of all your photos."
Here's how it works: You ship your prints (or slides, or negatives) to their facility in Irvine, California. They have a huge machine that scans them all at 300dpi, no compression, and they send back your photos and a DVD containing digitized versions in JPEG format, all for only 50 US dollars plus shipping and handling, per thousand photos. I don't think I could even hire someone locally to run my scanner for that!
The deal got better when I contacted them. For people like me with accounts on Facebook, flickr, MySpace or Blogger, they will [scan your first 1000 photos for free] (plus shipping and handling). I selected a thousand 4x6" photos from my vast collection, organized them into eight stacks with rubber bands, and sent them off in a shoe box. The photos get scanned in landscape mode, so I spent about four hours preparing what I sent them, making sure they were all face up, with the top of the picture oriented either to the top or left edge. For the envelopes that had double prints, I "deduplicated" them so that only one set got scanned.
The box weighed seven pounds, and cost about 10 US dollars to send from Tucson to Irvine via UPS on Tuesday. They came back the following Monday, all my photos plus the DVD, for 20 US dollars shipping and handling. Each digital image is about 1.5MB in size, roughly 1800x1200 pixels, so they all easily fit on a single DVD. The quality is the same as if I scanned them at 300dpi on my own scanner, and comparable to a 2-megapixel camera on most cell phones. Certainly not the high-res photos I take with my Canon PowerShot, but suitable enough for email or Web sites. So, for about 30 US dollars, I got my first batch of 1000 photos scanned.
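The numbers above are easy to check. This small Python sketch works out the pixel dimensions of a 4x6 print at 300dpi and whether a thousand 1.5MB files fit on one disc; the 4,700 MB figure is the nominal capacity of a single-layer DVD, an assumption since the post does not say which kind of DVD was used.

```python
DPI = 300
PRINT_INCHES = (6, 4)           # width, height of a 4x6 print in landscape
PHOTO_COUNT = 1000
MB_PER_PHOTO = 1.5              # approximate size quoted above
DVD_CAPACITY_MB = 4700          # assumption: nominal single-layer DVD

width_px = PRINT_INCHES[0] * DPI
height_px = PRINT_INCHES[1] * DPI
megapixels = width_px * height_px / 1_000_000
total_mb = PHOTO_COUNT * MB_PER_PHOTO

print(f"{width_px}x{height_px} pixels, about {megapixels:.1f} megapixels")
print(f"{total_mb:.0f} MB total; fits on one DVD: {total_mb <= DVD_CAPACITY_MB}")
```

At 600dpi the pixel count would quadruple, so the same thousand photos might no longer fit on one disc, depending on compression.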
ScanMyPhotos.com offers a variety of extra priced options, like rotating each file to the correct landscape or portrait orientation, color correction, exact sequence order, hosting them on their Web site online for 30 days to share with friends and family, and extra copies of the DVD. All of these represent a trade-off between having them do it for me for an additional fee, or me spending time doing it myself--either before in the preparation, or afterwards managing the digital files--so I can appreciate that.
Perhaps the weirdest option was to have your original box returned for an extra $9.95? If you don't have a huge collection of empty shoe boxes in your garage, you can buy a similarly sized cardboard box for only $3.49 at the local office supply store, so I don't understand this one. The box they return all your photos in can easily be used for the next batch.
I opted not to get any of these extras. The one option I think they should add would be to have them just discard the prints, and send back only the DVD itself. Or better yet, discard the prints, and email me an ISO file of the DVD that I can burn myself on my own computer. Why pay extra shipping to send back to me the entire box of prints, just so that I can dump the prints in the trash myself? I will keep the negatives, in case I ever need to re-print with high resolution.
Overall, I am thoroughly delighted with the service, and will now pursue sending the rest of my photos in for processing, and reclaim my linen closet for more important things. Now that I know that a thousand 4x6 prints weigh 7 pounds, I can estimate how many photos I have left to do, and decide which discount bulk option to choose.
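That estimate is simple proportion, sketched below in Python. The 7 pounds per thousand prints comes from the shipment described above; the 35-pound figure for the remaining boxes is purely a made-up example, as is the assumption that all remaining prints are 4x6.

```python
POUNDS_PER_THOUSAND_PRINTS = 7  # measured for the first batch of 4x6 prints


def estimate_print_count(total_pounds):
    """Estimate how many 4x6 prints a given weight represents."""
    return round(total_pounds / POUNDS_PER_THOUSAND_PRINTS * 1000)


# Hypothetical example: the remaining envelopes weigh 35 pounds in total.
remaining = estimate_print_count(35)
print(f"Roughly {remaining} prints left to scan")
```

Weigh the prints without the negatives and envelopes, or the estimate will run high.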
With my photos digitized, I will be able to do all the things that IBM talks about with Information Retention:
Place them on an appropriate storage tier. I can keep them on disk, tape or optical media.
Easily move them from one storage tier to another. Copying digital files in bulk is straightforward, and as new technologies develop, I can refresh the bits onto new media, to avoid the "obsolescence of CDs and DVDs" as discussed in this article in [PC World].
Share them with friends and family, either through email, on my Tivo (yes, my Tivo is networked to my Mac and PC and has the option to do this!), or upload them to a photo-oriented service like [Kodak Gallery or flickr].
Keep multiple copies in separate locations. I could easily burn another copy of the DVD myself and store it in my safe deposit box or my desk at work. With all of the regional disasters like hurricanes, an alternative might be to back up all your files, including your digitized photos, with an online backup service like [IBM Information Protection Services] from last year's acquisition of Arsenal Digital.
If the prospect of preserving my high school and college memories for the next few decades seems extreme, consider that the [Long Now Foundation] is focused on retaining information for centuries. They are even suggesting that we start representing years with five digits, e.g., 02008, to handle the deca-millennium bug which will come into effect 8,000 years from now. IBM researchers are also working on [long-term preservation technologies and open standards] to help in this area.
For those who only read the first and last paragraphs of each post, here is my recap: Information Retention is about managing [information throughout its lifecycle], using policy-based automation to help with the placement, movement and expiration. An "active archive" of information serves to help gain insight, innovate, and make better decisions. Disk, tape, and blended disk-and-tape solutions can all play a part in a tiered information infrastructure for long-term retention of information.
Based on this success, and perhaps because I am also fluent in Spanish, I was asked to help with Proyecto Ceibal, the team for OLPC Uruguay. Normally the XS school server resides at the school location itself, so that even if the internet connection is disrupted or limited, the school kids can continue to access each other and the web cache content until internet connection is resumed. However, with a diverse development team with people in the United States, Uruguay, and India, we first looked to Linux hosting providers that would agree to provide free or low-cost monthly access. We spent (make that "wasted") the month of May investigating. Most that I talked to were not interested in having a customized Linux kernel on non-standard hardware on their shop floor, and wanted instead to offer their own standard Linux build on existing standard servers, managed by their own system administrators, or were not interested in providing it for free. Since the XS-163 kernel is customized for the x86 architecture, it is one of those exceptions where we could not host it on an IBM POWER or mainframe as a virtual guest.
This got picked up as an [idea] for Google's [Summer of Code] and we are mentoring Tarun, a 19-year-old student, to act as lead software developer. However, summer was fast approaching, and we wanted this ready for the next semester. In June, our project leader, Greg, came up with a new plan. Build a machine and have it connected at an internet service provider that would cover the cost of bandwidth, and be willing to accept this with remote administration. We found a volunteer organization to cover this -- Thank you Glen and Vicki!
We found a location, so the request to me sounded simple enough: put together a PC from commodity parts that meet the requirements of the customized Linux kernel, the latest release being called [XS-163]. The server would have two disk drives, three Ethernet ports, and 2GB of memory; and be installed with the customized XS-163 software, SSHD for remote administration, Apache web server, PostgreSQL database and PHP programming language. Of course, the team wanted this for as little cost as possible, and for me to document the process, so that it could be repeated elsewhere. Some stretch goals included having a dual-boot with Debian 4.0 Etch Linux for development/test purposes, an alternative database such as MySQL for testing, a backup procedure, and a Recover-DVD in case something goes wrong.
Some interesting things happened:
The XS-163 is shipped as an ISO file representing a LiveCD bootable Linux that will wipe your system clean and lay down the exact customized software for a one-drive, three-Ethernet-port server. Since it is based on Red Hat's Fedora 7 Linux base, I found it helpful to install that instead, and experiment moving sections of code over. This is similar to geneticists extracting the DNA from the cell of a pit bull and putting it into the cell for a poodle. I would not recommend this for anyone not familiar with Linux.
I also experimented with modifying the pre-built XS-163 CD image by cracking open the squashfs, hacking the contents, and then putting it back together and burning a new CD. This provided some interesting insight, but in the end I was able to do it all from the standard XS-163 image.
Once I figured out the appropriate "scaffolding" required, I managed to proceed quickly, with running versions of XS-163, plain vanilla Fedora 7, and Debian 4, in a multi-boot configuration.
The BIOS "raid" capability was really more like BIOS-assisted RAID for Windows operating system drivers. This "fake raid" wasn't supported by Linux, so I used Linux's built-in "software raid" instead, which allowed some partitions to be raid-mirrored, and other partitions to be un-mirrored. Why not mirror everything? With two 160GB SATA drives, you have three choices:
No RAID, for a total space of 320GB
RAID everything, for a total space of 160GB
Tiered information infrastructure, use RAID for some partitions, but not all.
The last approach made sense, as a lot of the data is cached web page images, and is easily retrievable from the internet. This also allowed me to have some "scratch space" for downloading large files and so on. For example, 90GB mirrored that contained the OS images, settings and critical applications, and 70GB on each drive for scratch and web cache, results in a total of 230GB of disk space, which is a 43 percent improvement over an all-RAID solution.
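The capacity figures for the three choices can be checked with a few lines of arithmetic. This is a minimal sketch, assuming two identical 160GB drives and RAID-1 style mirroring; the function name `usable_gb` is made up for illustration.

```python
# Usable capacity for two identical drives under the three layouts above.
DRIVE_GB = 160
DRIVES = 2

def usable_gb(mirrored_gb: int) -> int:
    """Usable space when `mirrored_gb` of each drive is mirrored (RAID-1)
    and the remainder of each drive is left un-mirrored."""
    if not 0 <= mirrored_gb <= DRIVE_GB:
        raise ValueError("mirrored_gb must be between 0 and DRIVE_GB")
    unmirrored_per_drive = DRIVE_GB - mirrored_gb
    # A mirrored region counts once; un-mirrored regions count on every drive.
    return mirrored_gb + unmirrored_per_drive * DRIVES

no_raid = usable_gb(0)        # 320GB, nothing mirrored
all_raid = usable_gb(160)     # 160GB, everything mirrored
tiered = usable_gb(90)        # 90 + 70 * 2 = 230GB
gain_percent = (tiered - all_raid) / all_raid * 100

print(no_raid, all_raid, tiered, int(gain_percent))
```

The exact gain is 43.75 percent, which the post truncates to 43.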
While [Linux LVM2] provides software-based "storage virtualization" similar to the hardware-based IBM System Storage SAN Volume Controller (SVC), it was a bad idea putting different "root" directories of my many OS images on there. With Linux, as with most operating systems, it expects things to be in the same place where it last shut down, but in a multi-boot environment, you might boot the first OS, move things around, and then when you try to boot the second OS, it doesn't work anymore, or corrupts what it does find, or hangs with a "kernel panic". In the end, I decided to use RAID non-LVM partitions for the root directories, and only use LVM2 for data that is not needed at boot time.
While they are both Linux, Debian and Fedora were different enough to cause me headaches. Settings were different, parameters were different, file directories were different. Not quite as religious as MacOS-versus-Windows, but you get the picture.
During this time, the facility was out getting a domain name, IP address, subnet mask and so on, so I tested with my internal 192.168.x.y and figured I would change this to whatever it should be the day I shipped the unit. (I'll find out next week if that was the right approach!)
Afraid that something might go wrong while I am in Tokyo, Japan next week (July 7-11), or Mumbai, India the following week (July 14-18), I added a Secure Shell [SSH] daemon that runs automatically at boot time. This involves putting the public key on the server, and each remote admin has their own private key on their own client machine. I know all about public/private key pairs, as IBM is a leader in encryption technology, and was the first to deliver built-in encryption with the IBM System Storage TS1120 tape drive.
Giving users access to all their files from any OS image required that I either (a) have identical copies everywhere, or (b) have a shared partition. The latter turned out to be the best choice, with an LVM2 logical volume for the "/home" directory that is shared among all of the OS images. As we develop the application, we might find other directories that make sense to share as well.
For developing across platforms, I wanted the Ethernet devices (eth0, eth1, and so on) to match the actual ports they are supposed to be connected to in a static IP configuration. Most people use DHCP so it doesn't matter, but the XS software requires this, so it did. For example, "eth0" as the 1 Gbps port to the WAN, and "eth1/eth2" as the two 10/100 Mbps PCI NIC cards to other servers. Binding the Ethernet interface names to specific hardware ports was different on Fedora and Debian, but I got it working.
While it was a stretch goal to develop a backup method, one that could perform Bare Machine Recovery from burned DVD media, it turned out I needed to do this anyway just to keep from losing my work in case things went wrong. I used an external USB drive to develop the process, and got everything to fit onto a single 4GB DVD. Using IBM Tivoli Storage Manager (TSM) for this seemed overkill, and [Mondo Rescue] didn't handle LVM2+RAID as well as I wanted, so I chose [partimage] instead, which backs up each primary partition, mirrored partition, or LVM2 logical volume, keeping all the time stamps, ownerships, and symbolic links intact. It has the ability to chop up the output into fixed-size pieces, which is helpful if you are going to burn them on 700MB CDs or 4.7GB DVDs. In my case, my FAT32-formatted external USB disk drive can't handle files bigger than 2GB, so this feature was helpful for that as well. I standardized on 660 MiB [about 692MB] per piece, since that met all criteria.
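The piece size is easy to sanity-check. The sketch below converts 660 MiB to decimal megabytes and compares it with the three limits mentioned above. It assumes conservative decimal capacities for the CD and DVD (700 and 4,700 million bytes) and a 2 GiB per-file limit for the USB drive as I experienced it; these byte values are my assumptions, not figures from the partimage documentation.

```python
# Check that a 660 MiB piece fits every medium mentioned in this post.
MIB = 1024 ** 2   # binary mebibyte, in bytes
MB = 1000 ** 2    # decimal megabyte, in bytes

piece_bytes = 660 * MIB

# Assumed limits in bytes (conservative interpretations, see lead-in).
limits = {
    "700MB CD": 700 * MB,
    "4.7GB DVD": 4700 * MB,
    "2GB file limit": 2 * 1024 ** 3,
}

piece_mb = piece_bytes / MB
fits_all = all(piece_bytes <= limit for limit in limits.values())
pieces_per_dvd = limits["4.7GB DVD"] // piece_bytes

print(round(piece_mb), fits_all, pieces_per_dvd)
```

Under these assumptions a single piece fits on a CD with a little room to spare, and several whole pieces fit on one DVD.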
The folks at [SysRescCD] saved the day. The standard "SysRescueCD" assigned eth0, eth1, and eth2 differently than the three base OS images, but the nice folks in France that write SysRescCD created a customized [kernel parameter that allowed the assignments to be fixed per MAC address] in support of this project. With this in place, I was able to make a live Boot-CD that brings up SSH, with all the users, passwords, and Ethernet devices to match the hardware. I installed this LiveCD as the "Rescue Image" on the hard disk itself, and also made a Recover-DVD that boots up just like the Boot-CD, but contains the 4GB of backup files.
For testing, I used Linux's built-in Kernel-based Virtual Machine [KVM] which works like VMware, but is open source and included in the 2.6.20 kernels that I am using. IBM is the leading reseller of VMware and has been doing server virtualization for the past 40 years, so I am comfortable with the technology. The XS-163 platform, with Apache and PostgreSQL servers, serves as a platform for [Moodle], an open source class management system, and the combination is memory-intensive enough that I did not want to incur the overhead of running production in this manner, but it was great for testing!
With all this in place, it is designed to not need a Linux system admin or XS-163/Moodle expert at the facility. Instead, all we need is someone to insert the Boot-CD or Recover-DVD and reboot the system if needed.
Just before packing up the unit for shipment, I changed the IP addresses to the values they need at the destination facility, updated the [GRUB boot loader] default, and made a final backup, which I burned to the Recover-DVD. Hopefully, it works by just turning on the unit, [headless], without any keyboard, monitor or configuration required. Fingers crossed!
So, thanks to the rest of my team: Greg, Glen, Vicki, Tarun, Marcel, Pablo and Said. I am very excited to be part of this, and look forward to seeing this become something remarkable!
There is a difference between improving "energy efficiency" versus reducing "power consumption".
Let's consider the average 100 watt light bulb, of which 5 watts generate the desired feature (light), and 95 watts are generated as undesired waste (heat). In this case, it would be 5 percent efficient. If you delivered a new light bulb that generated 3 watts of light for only 30 watts of energy, then you would have an offering that was more energy efficient (10 percent instead of 5 percent) and used 70 percent less power (30 watts instead of 100 watts). This new "dim bulb" would not be as bright as the original, but has other desirable energy qualities.
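The two measures are computed differently, which is why they can move independently. A minimal sketch using the wattages from the paragraph above; the function names are made up for illustration.

```python
# Energy efficiency versus power consumption for the two bulbs above.

def efficiency_percent(useful_watts: float, total_watts: float) -> float:
    """Share of consumed power that becomes the desired output."""
    return useful_watts * 100 / total_watts

def power_reduction_percent(old_watts: float, new_watts: float) -> float:
    """How much less power the new device draws than the old one."""
    return (old_watts - new_watts) * 100 / old_watts

old_bulb = efficiency_percent(5, 100)      # 5 watts of light from 100 watts
dim_bulb = efficiency_percent(3, 30)       # 3 watts of light from 30 watts
savings = power_reduction_percent(100, 30)

print(old_bulb, dim_bulb, savings)
```

Note that the "dim bulb" wins on both measures while delivering less of the desired output, so neither number alone tells you whether the device still does its job.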
Nearly all of the output of data center equipment results in heat. In The Raised Floor blog [It's Too Darn Hot!], Will Runyon explains how IBM researcher Bruno Michel in Zurich has developed new ways to cool chips with water shot through thousands of nozzles, much like capillaries in the human body. This is just one of many developments that are part of IBM's [Project Big Green].
But what if the desired feature is heat, and the undesired feature is light? In the case of Hasbro's toy [Easy-Bake Oven], a 100W incandescent light bulb is used to bake small cakes. This generates 95W of desired heat, and wastes only 5 percent as light (unused inside the oven). That makes this little toy 95 percent energy efficient, but it consumes as much energy as any other 100W light bulb lamp or fixture in your house. With manufacturing switching from incandescent to compact fluorescent bulbs, this toy oven may not be around much longer.
While we all joke that it is just a matter of time before our employers make us ride stationary bicycles attached to generators to power our monstrous data centers, 23-year-old student Daniel Sheridan designed a see-saw for kids in Africa to play on that generates electricity for nearby schools. [Dan won the "most innovative product" at the Enterprise Festival].
Another approach is to improve efficiency by converting previously undesirable outcomes to desirable. Brian Bergstein has a piece in Forbes titled ["Heat From Data Center to Warm a Pool"]. Here's an excerpt:
"In a few cases, the heat produced by the computers is used to warm nearby offices. In what appears to be a first, the town pool in Uitikon, Switzerland, outside Zurich, will be the beneficiary of the waste heat from a data center recently built by IBM Corp. (nyse: IBM) for GIB-Services AG.
As in all data centers, air conditioners will blast the computers with chilly air - to keep the machines from exceeding their optimum temperature of around 70 degrees - and pump hot air out.
Usually, the hot air is vented outdoors and wasted. In the Uitikon center, it will flow through heat exchangers to warm water that will be pumped into the nearby pool. The town covered the cost of some of the connecting equipment but will get to use the heat for free."
I see a business opportunity here. Next to every data center lamenting about their power and cooling, build a state-of-the-art fitness center for the employees and nearby townspeople. Exercise on a stationary bicycle generating electricity, while your kids play on the see-saw generating electricity, and then afterwards the whole family can take a dip in the heated swimming pool. And if the company subscribes to the notion of a Results-Oriented Work Environment [ROWE], it could encourage its employees to take "fitness" breaks throughout the day, rather than having everyone there in the early morning or late evening hours, leveling out the energy generated.
While many are just becoming familiar with the end-user interfaces of Web 2.0, from blogs and wikis to FaceBook and FlickR, fewer may be familiar with the "information infrastructure" of servers and storage behind the scenes.
Last year, I bought an XO laptop under the One Laptop Per Child [OLPC] foundation's Give-1-Get-1 program and posted my impressions on this blog. One in particular, my post [Printing on XO laptop with CUPS and LPR] showed how to print from the XO laptop over to a network-attached printer. This caught the attention of the OLPC development team, who asked me to help them with another project as a volunteer. Before accepting, I had to learn what skills they were really looking for, especially since I do not consider myself an expert in either printing or networking.
(Unlike a regular 9-to-5 job where most people just try to look busy for eight hours a day, doing volunteer work means being ready to ["roll up your sleeves"] and actually accomplish something. This applies to any kind of volunteer work, from hammering nails for [Habitat for Humanity] to sorting cans at the [Community Food Bank]. Best Buy uses the phrase "Results Oriented Work Environment" [ROWE] to describe their latest program, modeled in part after the mobile workforce policies of Web2.0-enlightened companies IBM and Sun, but that is perhaps a topic for another blog post!)
Apparently, to support a school full of students with XO laptops, it would be nice to have a few servers that provide support to manage the class lesson plans, make reading materials and other content available, and keep track of results. What they need is an "information infrastructure"! They decided on two specific servers:
School Server -- this would run a popular class management system called [Moodle]
Library Server -- a server for a digital library collection, based on Fedora Commons [16-minute video]
In keeping with OLPC philosophy to use free and open source software [FOSS], both servers are based on the [LAMP] platform. LAMP is an acronym for the combined software bundle of Linux, Apache, MySQL and a Programming language like PHP. The "XS" team working on the school server wanted me to build a LAMP server and install Moodle to help test the configuration, determine what other software is required, and perhaps develop a backup/recovery scenario. Basically, they needed someone with Linux skills to put some hardware and software together.
(I am no stranger to Linux. Back in the 1990s, I was part of the Linux for S/390 team, led the effort to create the infamous "compatible disk layout" (CDL) that allows z/OS to access ESCON and FICON-attached Linux volumes, took my LPI certification exam, and led a team to validate FCP drivers for our disk and tape storage systems. For an IBMer to volunteer for an Open Source community project, you have to take an "open source" class and get management approval to review for any possible "conflicts of interest". I got this all taken care of, and agreed to help the XS team.)
Building a test environment is similar to baking a cake. You have a recipe, utensils, and ingredients. Here's a bit of description of each of the ingredients:
Like Windows, the Linux operating system comes in different flavors to run on handhelds, desktops and servers. For servers, IBM tends to focus on Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES). However, the XS team decided instead to use [Fedora 7], a community-supported version from Red Hat. Earlier versions of Fedora were known as "Fedora Core", but apparently with version 7, the word "Core" has been dropped. Fedora 7 can be used in either desktop or server mode.
[Apache] is web server software, and half of all web servers on the internet use it. It competes head-on against Microsoft's Internet Information Services (IIS) server provided in Windows 2003. The Apache name is partly from the fact that its origins were "a patchy" variant of the NCSA HTTPd 1.3 codebase. The popular [IBM HTTP Server] is powered by Apache, with added support for the rest of the IBM WebSphere software portfolio. The XS team chose Apache v2 as the web server platform.
[MySQL] is relational database management system (RDBMS) software, similar to commercial products like IBM DB2 Universal Database, Oracle DB, or Microsoft SQL Server. The SQL stands for Structured Query Language, developed by IBM in the early 1970s as a standard language to update and query database tables. MySQL comes in two flavors, MySQL Enterprise for commercial use, and MySQL Community, which is community-supported. There are over 10 million instances of MySQL running websites on the internet, which helps explain why Sun Microsystems agreed to acquire the MySQL AB company last month. The XS team decided on MySQL 5.0 as the database platform.
To make HTML pages dynamic, including the possibility to add or query database contents, requires programming. A variety of web scripting languages were developed, all starting with the letter "P" to claim to be the programming part of the LAMP platform, including [PHP], Perl, and Python. Later, new programming language frameworks have been developed that do not start with the letter "P", like [Ruby on Rails]. PHP is short for PHP: Hypertext Preprocessor which explains that it pre-processes HTML during web serving, looking for special tags indicating PHP code, allowing programming logic to insert HTML content, such as information extracted from a database. While Python is the language that runs the Sugar interface on the XO laptops, the XS team decided on PHP v5 as the programming language for the server.
As for utensils, you only need a few utilities:
A simple text editor: I go old-school and use the classic "vi" (to learn this editor, see the ["Cheat Sheet" method] on IBM developerWorks)
Secure Shell (SSH): this allows you to access one server from another
browser access to the internet: when you encounter problems, get error messages, or whatever, it pays to know how to search for things with Google
As for a recipe, the Moodle website spells out some unique details and parameters. For the base LAMP platform, I chose to follow the book [Fedora 7 Unleashed] that has specific chapters on setting up SSH, Apache, MySQL, PHP, Squid and so on. The resulting configuration looks like this:
Here was the sequence of events:
I took an old PC that I wasn't using anymore, backed up the Windows system, and installed Linux on top. The book above had a Fedora 7 DVD on the back jacket, but I used the [OLPC LiveCD] that had some values pre-configured.
Set the IP address static. I set mine to 192.168.0.77 which nobody sees except my other systems.
My school server is "headless" which means it does not have its own keyboard, video or mouse. It also runs only at Linux run level 3, command line interface only, no graphics. I was able to [share using a KVM switch], but this meant having to remember something on one screen while I was switching over to the other. My Windows XP system has my browser connection to the internet to follow instructions or read error messages, so I need that up all the time. To get around this, on my Windows XP system, I generated SSH public and private keys, copied the public key over to my new Linux system, and used [OpenSSH for Windows] to connect over. Now, on one screen, I have my Windows XP Firefox browser, and a separate command line window that is accessing my Linux school server.
With SSH up and running, I can now use "vi" to edit files, and issue commands to install or activate the remaining software. First up, Apache. I got this working, and from Windows XP, verified that going to "http://192.168.0.77" showed the Apache test screen.
I installed PHP, and tested it with a simple short index.php file.
I installed MySQL, set up the base "installation databases", and created a test database. Here is where you might want to set a password for the MySQL root user, but I chose to do that later.
I installed Moodle. It was smart enough to check that Apache, PHP, and MySQL were operational, and apparently I missed a few special "PHP" modules that had to be linked in. I was able to find them, download them, and get them installed.
I brought up Moodle, created a "class category" of SCIENCE and a new class "Chemistry 101", and it all worked.
I also activated Squid, which is a web proxy cache server that stores web pages for faster access.
Another idea was to activate Samba, to provide CIFS file and print sharing, but I decided to put this off.
I got all of this done last Saturday, start to finish. Now the fun begins. We are going to run through some tests, document the procedures, and try to get a system up and running in a remote school in Nepal. For now, I have only one XO laptop to simulate what the student sees, and one laptop that can represent either a teacher's Windows-based laptop, or run QEMU and emulate a second XO laptop. For tuning, I might go through the procedures mentioned on IBM developerWorks "Tuning LAMP" [Part 1, Part 2, Part 3].
For those in the server or storage industry that need to understand Web 2.0 information infrastructure better, building a LAMP server like this can be quite helpful.
Wrapping up my week's theme on IBM's acquisition of XIV, we have gotten hundreds of positive articles and reviews in the press, but the news has caused quite a stir with the [Not-Invented-Here] folks at EMC. We've heard already from EMC bloggers [Chuck Hollis] and [Mark Twomey]. The latest is fellow EMC blogger BarryB's missive [Obligatory "IBM buys XIV" Post], which piles on the "Fear, Uncertainty and Doubt" [FUD], including this excerpt here:
In a block storage device, only the host file system or database engine "knows" what's actually stored in there. So in the Nextra case that Tony has described, if even only 7,500-15,000 of the 750,000 total 1MB blobs stored on a single 750GB drive (that's "only" 1 to 2%) suddenly become inaccessible because the drive that held the backup copy also failed, the impact on a file system could be devastating. That 1MB might be in the middle of a 13MB photograph (rendering the entire photo unusable). Or it might contain dozens of little files, now vanished without a trace. Or worst yet, it could actually contain the file system metadata, which describes the names and locations of all the rest of the files in the file system. Each 1MB lost to a double drive failure could mean the loss of an enormous percentage of the files in a file system.
And in fact, with Nextra, the impact will be across not just one, but more likely several dozens or even hundreds of file systems.
Worse still, the Nextra can't do anything to help recover the lost files.
Nothing could be further from the truth. First, if any disk drive module failed, the system would know exactly which one it was, what blobs (binary large objects) were on it, and where the replicated copies of those blobs are located. In the event of a rare double-drive failure, the system would know exactly which unfortunate blobs were lost, and could identify them by host LUN and block address numbers, so that appropriate repair actions could be taken from remote mirrored copies or tape file backups.
Second, nobody is suggesting we are going to use a delicate FAT32-like circa-1980 file system that breaks with the loss of a single block and requires tools like "fsck" to piece back together. Today's modern file systems--including Windows NTFS, Linux ext3, and AIX JFS2--are journaled and have sophisticated algorithms to handle the loss of individual structure inode blocks. IBM has its own General Parallel File System [GPFS] and corresponding Scale out File Services [SOFS], and thus brings a lot of expertise to the table. Advanced distributed clustered file systems, like [Google File System] and Yahoo's [Hadoop project] take this one step further, recognizing that individual node and drive failures at the Petabyte-scale are inevitable.
In other words, the XIV Nextra architecture is designed to eliminate or reduce recovery actions after disk failures, not make them worse. Back in 2003, when IBM introduced the new and innovative SAN Volume Controller (SVC), EMC claimed this in-band architecture would slow down applications and "brain-damage" their EMC Symmetrix hardware. Reality has proved the opposite: SVC can improve application performance and help reduce wear-and-tear on the managed devices. Since then, EMC acquired Kashya to offer its own in-band architecture in a product called EMC RecoverPoint, that offers some of the features that SVC offers.
If you thought fear mongering like this was unique to the IT industry, consider that 105 years ago, [Edison electrocuted an elephant]. To understand this horrific event, you have to understand what was going on at the time. Thomas Edison, inventor of the light bulb, wanted to power the entire city of New York with Direct Current (DC). Nikola Tesla proposed a different, but more appropriate architecture, called Alternating Current (AC), that had lower losses over distances required for a city as large and spread out as New York. But Thomas Edison was heavily invested in DC technology, and would lose out on royalties if AC was adopted. In an effort to show that AC was too dangerous to have in homes and businesses, Thomas Edison held a press conference in front of 1500 witnesses, electrocuting an elephant named Topsy with 6600 volts, and filmed the event so that it could be shown later to other audiences (Edison invented the movie camera also).
Today's nationwide electric grid would not exist without Alternating Current. We enjoy both AC for what it is best used for, and DC for what it is best used for. Both are dangerous at high voltage levels if not handled properly. The same is the case for storage architectures. Traditional high-performance disk arrays, like the IBM System Storage DS8000, will continue to be used for large mainframe applications, online transaction processing and databases. New architectures, like IBM XIV Nextra, will be used for new Web 2.0 applications, where scalability, self-tuning, self-repair, and management simplicity are the key requirements.
(Update: Dear readers, this was meant as a metaphor only, relating the concerns expressed above, that the use of new innovative technology may result in the loss or corruption of "several dozen or even hundreds of file systems" and is thus too dangerous to use, to the claim that AC electricity was too dangerous to use in homes. To clarify, EMC did not re-enact Thomas Edison's event, no animals were hurt by EMC, and I was not trying to make political commentary about the current controversy of electrocution as a method of capital punishment. The opinions of individual bloggers do not necessarily reflect the official positions of EMC, and I am not implying that anyone at EMC enjoys torturing animals of any size, or their positions on capital punishment in general. This is not an attack on any of the above-mentioned EMC bloggers, but rather to point out faulty logic. Children should not put foil gum wrappers in electrical sockets. BarryB and I have apologized to each other over these posts for any feelings hurt, and discussion should focus instead on the technologies and architectures.)
While EMC might try to tell people today that nobody needs unique storage architectures for Web 2.0 applications, digital media and archive data, because their existing products support SATA disk and can be used instead for these workloads, they are probably working hard behind the scenes on their own "me, too" version. And with a bit of irony, Edison's film of the elephant is available on YouTube, one of the many Web 2.0 websites we are talking about. (Out of a sense of decency, I decided not to link to it here, so don't ask.)
Continuing this week's theme of "Innovation that Matters", today I'll discuss cell phones, and their role in "cloud computing". Some people call these "cellular phones", "mobile phones" or "hand phones". I have posted about these topics before. Last January, I discussed the [Convergence] represented by Apple's iPhone, and in August, I talked about [Accessing Data in the Clouds], but some recent announcements bring this back up as a fresh topic.
This is a major game-changer, forcing companies to rethink many of their strategies. For example, John Windsor, on The YouBlog asks the CBS Interactive division [What Business Are You In?] The answer is that CBS is shifting from a content focus, to an audience focus, looking to provide CBS television content to an audience of cell phone users. ThinkBeta [Me, My Cell Phone and I] presents some interesting statistics. Google CEO Eric Schmidt estimates there are over 2.5 billion cell phones in use today, with 288 million units shipped alone in 3Q07.
That's quite a trend. As a leader in IT innovation, IBM tries to stay one step ahead of the industry, selling off mature technologies, like typewriters, printers, and most recently laptops and desktop PCs, to other manufacturers so that it can focus on newer technologies and market trends. For example, while many people might be aware that IBM designs and fabricates processor chips for all of the major game consoles (Microsoft's Xbox 360, Nintendo's Wii, and Sony's PlayStation 3), they might not know that IBM also makes chips for many cell phone manufacturers. The IBM [POWER Architecture] blog writes about the IBM CMOS 7RF SOI semiconductor:
IBM has managed to integrate seven Radio Frequency (RF) front-end functions onto this single CMOS chip using silicon-on-insulator (SOI) technology. And this means? For cell phones, according to IBM foundry product director Ken Torino, "Our solution minimizes insertion loss and maximizes isolation which will prevent dropped calls even on the most inexpensive handsets." Currently, cell phone RF front-end functions are handled by five to seven chips and at least two of those are using expensive gallium arsenide (GaA) technologies. The CMOS 7RF SOI should not only reduce costs by eliminating the need for so many chips, but also trim the fat from materials expenditures since GaA tech is somewhat expensive. IBM predicts that manufacturers will first use the chip to reduce on-phone processors to two or three before making the leap to a single chip.
With all this demand, the world will need engineers to develop software applications that work in this new environment. This plays into IBM's strength in the area of grid and supercomputing. IBM and Google announced they have jointly established an Internet-scale computing initiative to promote new software development methods that can help students and researchers address the challenges of Internet-scale applications. From the [IBM Internet-scale computing] webpage:
Internet use and content has grown dramatically, fueled by global reach, mobile device access, and user-generated Web content, including large audio and video files. More of the world population is looking to the mobile Web to fulfill basic economic needs. To meet this challenge, Web developers need to adopt new methods to address significant applications such as search, social networking, collaborative innovation, virtual worlds and mobile commerce.
The University of Washington is the first to join the initiative. A small number of universities will also pilot the program, including Carnegie-Mellon University, Massachusetts Institute of Technology, Stanford University, the University of California at Berkeley, and the University of Maryland. In the future, the program will be expanded to include additional researchers, educators and scientists.
The heart of the project is a large cluster of several hundred computers (a combination of Google and IBM systems) that is planned to grow to more than 1,600 processors. Students will access the cluster through the Internet to test their parallel programming projects. The cluster is powered with open source software, including:
"EqualLogic didn’t get 2,000 customers because people were dying to use iSCSI. It got them because it built systems that scale dynamically and because a system the size of Montana can be managed by someone as clueless as my ex-wife."
As with any acquisition, people might be asking if this is a "match made in heaven" that makes strong business sense, or another HP-Compaq debacle. Back in September, I posted [Supermarkets and Specialty Shops] to explain how the storage marketplace has two market segments. Internally, IBM distinguishes between "clients" and "customers". Clients are those that buy services and complete solutions from a one-stop systems vendor, such as IBM, HP, Sun, or Dell, or a systems integrator like IBM, CSC or EDS. Customers are those that buy products and components, from the systems vendors I just mentioned, as well as from individual specialty shops, like EMC, HDS, or NetApp.
To reach the growing "supermarket" segment, specialty shops are dependent on systems vendors to OEM or resell their kit: EMC disk through Dell, HDS disk through Sun and HP, NetApp through IBM. Until now, EqualLogic had to make their living as a "specialty" shop, but iSCSI appeals more to SMB than large enterprises, and SMB tend to be in the "supermarket" segment, so they partnered with Sun. Here is the timeline of this likely awkward and strained relationship:
I am not surprised that I haven't seen anything in the blogosphere yet from HP, Dell or Sun. I suspect this news means that Sun won't be reselling Dell's EqualLogic boxes anymore, and perhaps there is nothing more for Sun bloggers Randy Chalfant or Nigel Dessau to add to that. HP and Dell are practically non-existent in the storage blogosphere, so I didn't expect much from them either.
I did, however, expect EMC to put in their spin, given that Dell resells EMC disk, and accounts for perhaps 15% of their revenues. Now that Dell has multiple offerings, they will be instructing their channel reps when to lead with EqualLogic versus when to sell EMC, for now, until 2011, at which point Dell may simplify their storage sales model to just EqualLogic. I don't know if Dell would do that in 2011. Depending on how quickly the decline happens, EMC may have to increase the prices of their gear, or cut into their development budgets, to make up for this loss.
I started this post because of a comment from EMC blogger Chuck Hollis, who speculates how this will impact [Dell, EqualLogic and EMC]. In that post, he expresses his opinion (which I will put into a different color):
"Speculation is pretty evenly split. Neither HP nor IBM have a good, entry-level iSCSI product."
If he had left out the word "good", then that would just be a false statement, but adding the word "good" reduces this to merely an opinion of IBM products that I disagree with. (I have no experience with whatever HP sells in this category, nor have I talked to any customers about their experiences, so I will neither agree nor disagree with Chuck's opinion of the HP half of his statement.) As for the term "Entry-level", this is fairly well defined by analysts as a storage system under $50,000 US Dollars. Actually, IBM has three good offerings.
Our basic, lowest-price model is the IBM System Storage DS3300, which does iSCSI only, like the EqualLogic offerings. This supports both SAS and SATA disks, and can attach to our System x and System p server product lines.
Our smallest model of our fancier IBM System Storage N series not only supports iSCSI, but also CIFS, NFS, HTTP, FTP, and FCP protocols, what we call "Unified Storage". The iSCSI feature is included at no additional charge, and small customers can start with this, then scale up to larger N3600, N5000 or N7000 models, and add more protocols and software features, as their business grows.
Our next larger model, but still entry-level, is the N3600. Since the N series supports a unified multi-protocol platform, with features like SnapLock for regulatory compliance and SnapMirror for remote disk mirroring, it easily replaces any mix of EMC "C-boxes": Centera, Celerra, and CLARiiON.
Both the DS3300 and the N series support the various Business Applications I have discussed this week: Microsoft Exchange, Lotus Domino, SAP, Oracle, Siebel, JD Edwards and PeopleSoft. N series offers SnapManager for various applications to make the business value even better.
Chuck speculates that Dell did this to compete better against rival HP, but that doesn't make sense, since he feels HP didn't have much to offer in this space. Perhaps Dell did this to compete better against IBM, the number one vendor in storage hardware, according to IDC. Looking at what IBM and NetApp have to offer, Dell may have realized that they didn't have competitive disk systems from their reselling relationship with EMC, looked elsewhere and found EqualLogic. Meanwhile, EqualLogic probably felt that Sun was going out of business, or not yet fully supportive of IP SAN environments, and decided to ["switch horses midstream"].
Continuing this week's theme on Enterprise Applications, I thought that since I mentioned Lotus Notes in my discussion of SAP yesterday, I would cover Microsoft Exchange today.
IBM and Microsoft are the ultimate example of "Coopetition". Both companies develop popular operating systems. Microsoft's "Xbox 360" gaming console uses IBM processors. Microsoft Exchange and IBM Lotus Domino are the Coke-and-Pepsi dominant players in the email marketplace, with Microsoft slightly in the lead, as seen on this graph [Lotus Notes/Domino marketshare growing] from fellow IBM Lotus blogger Alan Lepofsky. And now, Microsoft is getting serious about participating in the storage software business, with its strong support for iSCSI and its SharePoint product. For this post, I will focus just on email.
For those not familiar with both Microsoft and IBM products, I offer the simple cheat-sheet below:
Microsoft Outlook (client) :: IBM Lotus Notes (client)
Microsoft Exchange (server) :: IBM Lotus Domino (server)
Email has become the primary collaboration tool for most businesses, raising it to the level of "mission-critical". Microsoft has introduced its new Exchange 2007 to replace the existing Exchange 2003. Here are the key differences:
Exchange 2003:
Windows 2000 or 2003
Runs on 32-bit x86
Two (2) server roles
5 storage groups
NAS or NTFS-formatted block disk

Exchange 2007:
Requires 64-bit EM64T or AMD64, but Itanium IA64 not supported
Five (5) server roles
Edge Server Role for combating SPAM
Unified Messaging services to combine voicemail, email, fax
50 storage groups per server on Enterprise edition
50 databases per server on Enterprise edition (max 5 per storage group)
NTFS-formatted block disk recommended
Obviously, Exchange only runs on the Windows operating system. The change from 32-bit to 64-bit means that many Exchange 2003 customers have not yet migrated over, and perhaps now is a good time to point out alternative email servers on more reliable operating system platforms. For example, in addition to Windows 2003, Lotus Domino runs on IBM AIX, Linux on x86, Linux on System z, Sun Solaris, i5/OS on System i, and z/OS.
Another Linux alternative to Microsoft Exchange is Bynari InsightServer, which allows you to use your existing Windows-based Microsoft Outlook clients, swapping out only the server. This approach can be used when consolidating Windows servers to Linux virtual images on a System z mainframe. Linux desktops can run [Ximian Evolution] to attach to either the Bynari server or a Windows-based Microsoft Exchange server. Linux Journal offers a few articles on this: [Understanding and Replacing Microsoft Exchange] and [Exchange Functionality for Linux].
As with [Exchange 2003 editions], the new Exchange 2007 comes in both ["Standard" and "Enterprise" editions]. With all the new roles supported, you now can limit your "Mailbox Storage Server" role as Enterprise, and have the other roles, like Edge and Hub, as simply "Standard" instead. Enterprise is about 5x more expensive than Standard, so that can make a difference. With Exchange 2003, the big difference was that "Standard" supported only 16GB, versus 16TB with "Enterprise", making "Standard" impractical for all but the smallest company. In the new Exchange 2007, both Standard and Enterprise support 16TB.
Exchange 2007 is also less IOPS-intensive. Thanks to 64-bit addressing, it generates about 75 percent fewer IOPS than Exchange 2003 for comparable configurations. This is good because according to a 2006 Radicati Group survey, the average corporate employee gets 84 emails per day, averaging 10MB daily ingestion, and this is expected to grow to 15.8MB daily ingestion by 2008. The number of mailboxes worldwide is growing at a rate of 16 percent per year.
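To get a feel for what those percentages mean for sizing, here is a rough back-of-the-envelope sketch in Python. The function names, the 1,000-mailbox shop and the 1,000 IOPS baseline are made up for illustration, and I am treating the per-employee ingestion figure as a per-mailbox figure. Only the 75 percent, 16 percent, 10MB and 15.8MB numbers come from the paragraph above.

```python
def iops_after_reduction(baseline_iops, reduction=0.75):
    """IOPS left after a fractional reduction (0.75 means 75 percent fewer)."""
    return baseline_iops * (1.0 - reduction)


def mailboxes_after(years, start, annual_growth=0.16):
    """Mailbox count after compounding annual growth for a number of years."""
    return start * (1.0 + annual_growth) ** years


def daily_ingest_gb(mailboxes, mb_per_mailbox):
    """Total daily ingestion in GB, assuming the same MB per day per mailbox."""
    return mailboxes * mb_per_mailbox / 1024.0


if __name__ == "__main__":
    # Hypothetical shop: 1,000 mailboxes needing 1,000 IOPS on Exchange 2003.
    print("IOPS on Exchange 2007:", iops_after_reduction(1000))
    print("Mailboxes in two years:", round(mailboxes_after(2, 1000)))
    print("Daily ingest at 10MB each (GB):", daily_ingest_gb(1000, 10.0))
    print("Daily ingest at 15.8MB each (GB):", daily_ingest_gb(1000, 15.8))
```

The point is simply that fewer IOPS per mailbox buys headroom, while mailbox count and message size keep compounding in the other direction.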
IBM System Storage is a Microsoft Gold certified partner, and participates in Microsoft's Exchange Solution Reviewed Program [ESRP]. Both IBM DS8000 and DS4000 series are certified under this program, using a testbed called Jetstress. Those considering IBM System Storage N series can use Exchange 2007 with NTFS-formatted LUNs via FCP or iSCSI attachment.
Backup and Business Continuity
Back in 2003, the Meta Group found that 80 percent of organizations surveyed felt access to email was more important than telephone service, and that 74 percent believed being without email would present a greater hardship than losing telephone service. These percentages are probably higher today, with websites like ["Crackberry.com"] to cater to those addicted to their RIM Blackberry hand-held devices.
IBM Tivoli Storage Manager can provide backup and recovery support for Microsoft Exchange. TSM for Mail supports both Microsoft Exchange and Lotus Domino. TSM for Copy Services can use Microsoft Volume Shadow Copy Services (VSS) interfaces. I blogged about this before, back in June [Exchange 2003 VSS Snapshot Backup Whitepaper], and now TSM has support for Exchange 2007 as well.
Interestingly, Exchange 2007 has some built-in "Business Continuity" features. Of the ones below, Standard edition has LCR only, Enterprise edition gives you the full set.
Local Continuous Replication (LCR): In this approach, a single server ships update logs from the active storage group on one disk system over to a passive copy on a secondary disk system, presumably within 10km FCP distance. These logs can then be forward-applied to the passive copy. This is sometimes called "database shadowing".
Cluster Continuous Replication (CCR): This is based on two servers in an active/passive MSCS cluster. The first server is attached to the primary disk system, and ships logs to the passive copy attached to the second server.
Standby Continuous Replication (SCR): For the MSCS cluster-averse customer, SCR is based on two independent servers that are in two locations. In the event of failure on the first, scripts can be run to switch over to the second server. Each server has its own disk system.
Single Copy Clusters (SCC): This is for customers who have existing systems, but is not recommended for new customers. It is an MSCS cluster where both active and passive servers are connected to the same single disk system. The disk array can be a single point of failure (SPOF) in this environment. You could mitigate risks by using IBM's disk mirroring in this situation, but then you are left coordinating those copies with new servers at the remote location.
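The common thread in LCR, CCR and SCR is log shipping: the active copy records every update in a log, and the passive copy catches up by forward-applying whatever log records it has not yet seen. Here is a toy sketch of that idea in Python. It is not how Exchange implements it internally; the class and function names are my own invention for illustration, and a real log would live on disk rather than in a list.

```python
class StorageGroupCopy:
    """A copy of the data that can have log records applied to it."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # how many log records have been applied so far

    def apply(self, record):
        key, value = record
        self.data[key] = value
        self.applied += 1


class ActiveStorageGroup(StorageGroupCopy):
    """The active copy: every write is logged first, then applied locally."""

    def __init__(self):
        super().__init__()
        self.log = []

    def write(self, key, value):
        record = (key, value)
        self.log.append(record)
        self.apply(record)


def ship_logs(active, passive):
    """Forward-apply the records the passive copy has not seen yet.

    Returns the number of records shipped.
    """
    pending = active.log[passive.applied:]
    for record in pending:
        passive.apply(record)
    return len(pending)


if __name__ == "__main__":
    active = ActiveStorageGroup()
    passive = StorageGroupCopy()
    active.write("inbox/1", "hello")
    active.write("inbox/2", "world")
    print("shipped:", ship_logs(active, passive))
    print("copies match:", passive.data == active.data)
```

Between shipments the passive copy lags behind the active one, which is why the amount of un-shipped log determines how much data you could lose in a failover.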
It is estimated that as much as 75 percent of a company's intellectual property (IP) can be found somewhere in their email repository. Email is often requested in lawsuits and regulatory investigations. According to the Workplace email, IM & blogging 2006 survey by AMA and the ePolicy Institute, 24 percent of organizations have been subpoenaed by courts and regulators, and another 15 percent have gone to court in lawsuits triggered by employee emails.
New regulations now mandate that emails are archived, protected against tampering and unauthorized access, and kept for a specific amount of time, or until certain conditions are met. According to a 2004 CSI and FBI Computer Crime and Security survey, 78 percent of organizations were hit by viruses (the rest must have been running Linux, AIX, i5/OS or z/OS!) and 37 percent reported unauthorized access to confidential information.
According to Gartner, over 60 million people will be doing some form of telecommuting, so Microsoft has been working on extending the reach of email beyond the Outlook client. There is now "Outlook Web Access" that provides browser-based access, "Outlook Mobile" to provide text access from cellular phones, and even "Outlook Voice Access" which allows you to listen to your emails from any phone. These are all part of the new Unified Messaging Services feature.
Chris Evans over at Storage Architect posts about Hardware Replacement Lifecycle Update, on how storage virtualization can help with storage hardware replacement. He makes two points that I would like to comment on.
... indeed products such as USP, SVC and Invista can help in this regard. However at some stage even the virtualisation tools need replacing and the problem remains, although in a different place.
Knowing that replacement of technologies at all levels is inevitable, IBM System Storage SAN Volume Controller is actually designed to allow non-disruptive cluster upgrades, which we announced in May 2006.
The process is quite elegant. The SVC consists of one or more node-pairs, and can be upgraded while the system is up and running by replacing nodes one at a time in a sequence of suspend and resume. All of the mapping tables are loaded onto the new nodes from the rest of the still active nodes.
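For those who think better in code, the suspend-replace-resume sequence can be sketched roughly as follows. This is a simplified Python illustration of the general rolling-upgrade idea, not SVC's actual code; the Node class, the model labels and the mapping-table contents are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    model: str
    online: bool = True
    mapping: dict = field(default_factory=dict)  # virtual-to-physical map


def rolling_upgrade(cluster, new_model):
    """Replace the nodes in `cluster` (a list) one at a time.

    Returns the lowest number of nodes that stayed online during the
    upgrade. Raises RuntimeError if a step would leave no node serving I/O.
    """
    lowest_online = len(cluster)
    for i, old in enumerate(cluster):
        old.online = False  # suspend the node being replaced
        peers = [n for n in cluster if n.online]
        if not peers:
            raise RuntimeError("no active node left to serve I/O")
        lowest_online = min(lowest_online, len(peers))
        # The new node loads its mapping table from a still-active peer,
        # then resumes in the old node's place.
        cluster[i] = Node(old.name, new_model, True, dict(peers[0].mapping))
    return lowest_online


if __name__ == "__main__":
    table = {"vdisk0": "mdisk3"}
    pair = [Node("node1", "old", mapping=dict(table)),
            Node("node2", "old", mapping=dict(table))]
    print("lowest nodes online:", rolling_upgrade(pair, "new"))
    print([(n.name, n.model) for n in pair])
```

With a node-pair, one node always remains online to serve I/O, which is what makes the upgrade non-disruptive.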
I was hoping as part of the USP-V announcement HDS would indicate how they intend to help customers migrate from an existing USP which is virtualising storage, but alas it didn't happen.
Unlike the SVC, one cannot just upgrade the USP in place and make it into a USP-V. While it might be possible to unplug external disk from the old USP, and re-plug into the new USP-V, what do you do about the internal disk data? I doubt you can just move drawers and trays of disk from the old to the new. The data has to be moved some other way.
Some have asked why not just put an SVC in front of both the old USP and the new USP-V and transfer the data that way. While SVC does support virtualizing the old USP device, IBM is still testing the new USP-V as a managed device, and so this solution is not yet available, and would only apply to the LUNs in the USP-V, not the volumes specifically formatted for System i or System z.
An alternative is to take advantage of IBM's Data Mobility Services, the result of our recent acquisition of SofTek. IBM can help you move both mainframe and distributed systems data from any device, to any device.
In a typical four year lifecycle of storage arrays, it might take six months or so to fill up the box, and might take as much as a year at the end to move the data out to other equipment. SVC can greatly reduce both of these, so that you can take immediate advantage of new equipment as soon as possible, and keep using it for close to the full four years, migrating weeks or days before your lease expires.
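The arithmetic here is worth spelling out. In the Python sketch below, the 48-month lifecycle, six-month fill and twelve-month move-out come from the paragraph above. The shortened figures in the second case are purely my assumptions to illustrate "weeks or days", not measured results.

```python
def productive_months(lifecycle_months, fill_months, drain_months):
    """Months the array is fully in use, excluding fill-up and move-out time."""
    return lifecycle_months - fill_months - drain_months


if __name__ == "__main__":
    lifecycle = 48  # four year lifecycle
    typical = productive_months(lifecycle, 6, 12)
    # Assumed for illustration: one month to fill, half a month to move out.
    shortened = productive_months(lifecycle, 1, 0.5)
    print("typical:", typical, "months, or",
          round(100 * typical / lifecycle, 1), "percent of the lifecycle")
    print("shortened:", shortened, "months, or",
          round(100 * shortened / lifecycle, 1), "percent of the lifecycle")
```

In the typical case, that is 30 of 48 months, so more than a third of the lifecycle goes to filling up and moving out.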
Yesterday, IBM announced a variety of new storage offerings. Our theme this time around was "Policies and Performance". Here's a quick recap.
IBM offers new appliance and gateway models of its popular "unified storage" IBM System Storage N series disk systems. The N5300 appliance has two models: A10 for the single-controller, and A20 for the dual-controller model. The N5600 gateway also has two models: G10 for the single-controller, and G20 for the dual-controller model. A new EXN4000 disk expansion drawer is 3U high, and can hold up to 14 disks. It can support 1Gbps, 2Gbps and 4Gbps speeds. In addition to all this new "performance", we offer a new "policy" called the Advanced Single Instance Storage feature for the N5000 and N7000 series, which provides de-duplication at the block level. This can be particularly useful if you are using your N series for e-mail, document publishing, databases, backups or archives.
SAN Volume Controller
A technology refresh with the new 8G4 model. Like its predecessor, the 8F4, this new model has 8GB of cache per node, and is fitted with 4Gbps SAN attachment ports. The difference is that the 8G4 is based on our successful IBM System x3550 server. This baby screams, so I look forward to seeing the updated SPC-1 and SPC-2 performance benchmark ratings. The new SVC 4.2 software provides additional authentication policies for more granular administration support, and multi-destination FlashCopy (one source copied to up to 16 destination copies at the same time).
The DS8000 series now supports having third and fourth expansion frames. This was actually already available via RPQ, but now it can be directly ordered. This means that you can now hold up to half a Petabyte in a single disk system.
IBM TotalStorage Productivity Center v3.3 offers policy and performance-based guidance in configuring disk system volumes, specification of paths between hosts and disk systems during storage provisioning, policy-based specification of zone membership, configuration analysis capabilities, configuration change management, extended tape management, and both content-sensitive and scalable enterprise-wide reports. There is also a version specifically designed to manage disk replication on System z platforms.
Deep Computing Storage
The IBM System Storage DCS9550 Storage System comes in a 4U controller and 3U disk expansion drawers. It is designed for High Performance Computing (HPC) such as genome medical research, government research and rich media applications.
Our clients tell us they need performance to meet their dynamic business demands, and policies to help them manage the ever growing size of their storage infrastructure. We listened!
Yesterday morning, the entire country of Colombia suffered its worst black-out (power outage) in 22 years. 98% of the country was out for 4 1/2 hours. This is just 5 months after an outage that hit 25% of the country, on December 7, 2006. Ironically, this one happened the week I am here explaining the need for Business Continuity plans to IBM Business Partners from Argentina, Peru, Venezuela, Ecuador and Colombia. As is often the case, people need a real example to recognize that planning is important.
It reminded me of the Northeast Black-out of 2003 that impacted USA and Canada. I was speaking to a crowd of 800 people at the SHARE conference in Washington D.C. when it happened, and hundreds of pagers and cell-phones went off all at the same time. Although we were outside the affected area and had plenty of lighting, we ended up canceling the rest of my talk, and many people left immediately to help execute their business continuity plans. Of course, terrorism was immediately assumed, but a final report showed that it was initiated in Ohio due to overgrown trees, and then propagated due to a software bug to hundreds of other plants.
According to this morning's Bogota newspaper, "El Tiempo", nobody knows the root cause of yesterday's outage. Immediately, the country's leftist rebels were blamed, but now the leading theory is that it was initiated by operator error (a technician touching something he shouldn't have), and then propagated by a faulty distribution system.
Another example of the need for a robust and resilient infrastructure, and appropriate business continuity plans.
Today, Apple and EMI announced that EMI’s entire music and video catalog will be available in May without any digital rights management (DRM) protection. Not only will the music be higher quality, but it can be played on any player, presumably using MP3 format instead of Apple's proprietary AAC format. Being locked into any single vendor solution is undesirable. Similar issues abound for Microsoft Office 2007 file formats.
On my iPod, I ripped all my CDs into MP3 format, not AAC. I love my iPod, but if I ever decided to choose a different MP3 player, I did not want to go through the time-consuming process of re-ripping them.
A blog post by Seth Godin argues that this Apple-EMI announcement means that DRM is dead.
Back when music labels added value by producing and distributing music in physical form, it made sense for them to take a cut. Mass-producing CDs and distributing them out to music stores across the country costs lots of money. However, for online music, music labels don't have these same overhead costs, but continue the process of paying the artists only a few pennies per dollar. Some artists have filed lawsuits to get their fair share.
This process applies to any published work. You can purchase Kevin Kelly's book in various formats, at different prices, from different distributors. For example:
In PDF for $2, directly from the author via PayPal
black-and-white hardcover, for $20, from Amazon
color softcopy, for $30, from Lulu
Each nets the author $1.50 in royalties per copy. You can decide how much in production and distribution costs you want to pay.
I wasn't at the event, but thought it would be good to explain some basic concepts of Information Lifecycle Management (ILM), using the files on my iPod as an example. (Disclosure: IBM makes the technology inside many of Apple's computers, and so IBMers get to buy Apple products at employee prices. I own a Mac Mini based on IBM's POWER4 processor, and an iPod Photo 60GB model).
I have 20,000 MP3 music files, representing 106GB of data. This fits nicely on my 250GB external disk system attached to my Mac Mini, but won't all fit on my little 60GB iPod. I needed a way to decide what music I keep on both my iPod and Mac Mini, and which I keep only on my Mac Mini. When I am traveling, I am able to listen only to the music in the first group, but when I am at home, I am able to listen to all my music in both groups. (Another disclosure: I use my Tivo connected to my LAN to play all my MP3 music through my home stereo system. I had my entire house wired with Cat5 to make this possible.)
Apple's iTunes software lets me decide which MP3 files are copied to my iPod using "playlists". A playlist is a list of songs. Fixed playlists are created manually, each song copied to its list in a specific order. Smart playlists are created automatically, via policy. I give it the criteria, and it finds the songs for me. If I import a new music CD, none of the songs will be added to any fixed playlists, but could be added to my smart playlists if I set the policies correctly. Apple iTunes supports both "include" and "exclude" methodologies.
I use primarily smart playlists, based on genre and rating. I have tried to keep the number of genres down to a small manageable list:
Rhythm & Blues
Of course, what I have for genre may not match what's in the Gracenote database, so I sometimes have to make updates to match my convention. I've picked these based on my different "applications" for my music. For example, I listen to Ambient music to help me fall asleep on airplanes, but Rock when I exercise at the gym.
Next, I use the ratings from one to five stars. The advantage to the rating is that I can change them on-the-fly directly on my iPod. All other "metadata" has to be entered only from the keyboard of my Mac Mini.
One star: Files for Mac Mini only, not copied to my iPod
Two stars: Non-mix, copied to my iPod, but typically spoken words, such as language lessons
Three stars: Mix, music to include in my music mixes
Four stars: Keep on my iPod, but re-evaluate
So, I have five smart playlists, "One Star", "Two Stars", etc. for each rating, and have decided to keep only the 2, 3, 4 and 5 star songs on my iPod, by simply putting check marks on those playlists to copy them over. I have about 50 songs with 5 stars, and 8000 with 3 stars, and the rest in the other categories, leaving me a few GB to spare.
I also have playlists for each genre, "Rock mix", "Pop Mix", "Ambient Mix", etc. where I have selected those that match the genre, AND have 3, 4 or 5 stars. In this manner, I can listen to a mix. If I find a song mis-classified for that genre, I change it to four stars, which serves as my reminder to re-evaluate when I am back at home on my Mac Mini. If I don't want a song in my mix, I just lower it to 2 stars. If I want it off my iPod altogether, I lower it to one star.
This method is simple enough, and allows me to enjoy my music right away, and more effectively, without having to finish my classification process first.
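For the programmers in the audience, the policy behind these smart playlists boils down to a couple of simple filters. Here is a minimal Python sketch of my rules. This is not how iTunes implements smart playlists; the Song class, the function names and the sample titles are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Song:
    title: str
    genre: str
    stars: int      # rating from 1 to 5
    size_mb: float


def smart_playlist(library, min_stars=1, max_stars=5, genre=None):
    """Select songs by rating range and, optionally, by genre."""
    return [s for s in library
            if min_stars <= s.stars <= max_stars
            and (genre is None or s.genre == genre)]


def on_ipod(library):
    """Policy: only songs with 2 or more stars are copied to the iPod."""
    return smart_playlist(library, min_stars=2)


def genre_mix(library, genre):
    """Policy: a genre mix holds songs of that genre with 3, 4 or 5 stars."""
    return smart_playlist(library, min_stars=3, genre=genre)


if __name__ == "__main__":
    library = [
        Song("Gym Anthem", "Rock", 5, 4.0),
        Song("Language Lesson 1", "Rock", 2, 5.0),
        Song("Sleepy Drone", "Ambient", 3, 6.0),
        Song("Never Again", "Rock", 1, 3.0),
    ]
    print("On iPod:", [s.title for s in on_ipod(library)])
    print("Rock mix:", [s.title for s in genre_mix(library, "Rock")])
    print("MB used on iPod:", sum(s.size_mb for s in on_ipod(library)))
```

Changing a single rating moves a song between tiers the next time the policy runs, with no manual list editing, and that is the whole appeal of policy-based classification.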
Next week, I'm traveling to Africa (purely vacation, not related to my job, my senator, or my involvement in any charitable organizations). My Canon camera has only a 1GB IBM Microdrive, but I am able to offload my pictures to my iPod, connected via USB cable, and review the pictures on the little 2-inch screen. By simply "unchecking" my 2-star and 3-star playlists, and checking only those mixes I plan to take with me, I was able to clear 17GB of space, plenty of room for all my photos of elephants and giraffes, but still plenty of music to listen to. Thanks to my simple methodology, I was able to do this with minimal effort, and will have no problem putting all my music back when I return.
When evaluating an ILM process, many people are overwhelmed by their fear of the classification process, when in reality it doesn't have to be so complicated.
Is there an "iTunes" for the storage in your datacenter? Yes! It's called IBM TotalStorage Productivity Center. It can help you list and classify all the files in your IT environment, including files in your internal disks inside the servers, your NAS and SAN external disk systems, across both IBM and non-IBM hardware. It's a good thing to consider as part of your overall ILM strategy.
As I mentioned in my post [Moving Over to MyDeveloperWorks], those of us bloggers on IBM's DeveloperWorks are moving over to a new system called "MyDeveloperWorks" which has a host of new features.
Fortunately for me, I missed the note asking for volunteers to be among the first bloggers on the block to move over. I was traveling and decided not to deal with it until I got back. However, fellow IBM Master Inventor, Barry Whyte, was not so lucky. It is safe to say he was stupid enough to volunteer, and is probably regretting the decision every day since. In case you lost his RSS feed, or can't find him anymore on Google or whatever search engine, here is his [new blog].
As for my blog, I have asked to postpone the move until all the problems that Barry has encountered are resolved. That might be a while, but if you lose access to mine sometime in the near future, hopefully at least you have been warned as to what might have happened.
Continuing this week's theme of doing important things without leaving town, I present our results for an exciting project I started earlier this year.
For seven weeks, my coworker Mark Haye and I voluntarily led a class of students here in Tucson, Arizona in an after-school pilot project to teach the ["C" programming language] using [LEGO® Mindstorms® NXT robots]. The ten students, boys and girls ages 9 to 14 years old, were already part of the FIRST [For Inspiration and Recognition of Science and Technology] program, and participated in FIRST Lego League [FLL] robot competitions. The students were already familiar with building robots, and programming them with a simple graphical system of connecting blocks that perform actions. However, to compete in the next level of robot competitions, FIRST Tech Challenge [FTC], we need to leave this simple graphical programming behind, and upgrade to more precise "C" programming.
Mark is a software engineer for IBM Tivoli Storage Manager and has participated in FLL competitions over the past nine years. This week, he celebrates his 25th anniversary at IBM, and I celebrate my 23rd. The teacher, Ms. Ackerman, and the students referred to us as "Coach Mark" and "Coach Tony".
This was the first time I had worked with LEGO NXT robots. For those not familiar with these robots, you can purchase a kit at your local toy store. In addition to regular LEGO bricks, beams, and plates, there are motors, wheels, and sensors. A programmable NXT brick has three outputs (marked A, B, and C) to control three motors, and four inputs (marked 1, 2, 3, 4) to receive values from sensors. Programs are written and compiled on laptops and then downloaded to the NXT programmable brick through a USB cable, or wirelessly via Bluetooth.
In the picture shown, an image of the Mars planetary surface is divided into a grid with thick black lines. A light sensor between the front two wheels of the robot is over the black line.
We used the [RobotC programming firmware] and integrated development environment (IDE) from [Carnegie Mellon University]. The idea of this pilot was to see how well the students could learn "C". With only a few hours after class on each Wednesday, could we teach young students "C" programming in just seven weeks?
My contribution? I have taught both high school and college classes, and spent over 15 years programming for IBM, so Mark asked me to help. We started with a basic lesson plan:
A brief history of the "C" language
Understanding statements and syntax
Setting motor speed and direction
Compiling and downloading your first program
Understanding the "while" loop
Retrieving input sensor values
Understanding the "if-then-else" statement
Defining variables with different data types
Manipulating string variables
Writing a program for the robot to track along a black line on a white background.
Understanding local versus global scope variables
Writing a program for a robot to count black lines as it crosses them.
Performing left turns and right turns, and crossing a specific number of lines on a grid pattern to move the robot to a specific location.
Weeks 6 and 7
Mission Impossible: come up with a challenge to make the robot do something that would be difficult to accomplish using the previous NXT visual programming language.
At the completion of these seven weeks, I sat down to interview "Coach Mark" on his thoughts on this pilot project.
This is a practical programming skill. The "C" language is used throughout the world to program everything from embedded systems to operating systems, and even storage software. This would allow the robots to handle more precise movements, more accurate turns, and more complicated missions.
Can kids learn "C" in only seven weeks?
Part of the pilot project was to see how well the students could understand the material. They were already familiar with building the robots, and understood the basics of programming sensors and motors, so we were hoping this was a good foundation to work from. Some kids managed very well, others struggled.
Did everything go according to plan?
The first two weeks went well; turning on motors and having robots move forward and backward was easy enough. We seemed to lose a few students in week 3, and things got worse from there. However, several of the students truly surprised us and managed to implement very complicated missions. We were quite pleased with the results.
What kind of problems did the kids encounter?
Touch sensors required loops that wait for the sensor to be pressed. Motors did not necessarily turn as expected until more advanced methods were used. Making 90 degree left and right turns accurately was more difficult than expected.
Any funny surprises?
Yes, we had a Challenge Map representing the Mars planetary surface from a previous FLL competition that was dark red and divided into squares with thick black lines. An active light sensor returns a value of "0" (complete darkness) to "100" (bright white). However, the Mars surface had craters that were dark enough to be misinterpreted as a black line, causing some unusual results. This required some enhanced programming techniques to resolve.
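To illustrate the kind of technique that can help with the crater problem, here is a minimal Python sketch (not the RobotC code the students wrote). It counts black lines from a stream of light sensor readings on the 0 to 100 scale described above. The threshold values, the sample readings, and the assumption that a crater produces a shorter dark run than a real line are all mine, chosen only for illustration.

```python
def count_lines(readings, dark=30, light=45, min_dark_samples=3):
    """Count black lines in a sequence of light sensor readings (0-100).

    Assumptions (hypothetical, for illustration only):
      - readings at or below `dark` might be a line,
      - readings at or above `light` are open surface,
      - a real line stays dark for at least `min_dark_samples` readings,
        while a crater produces a shorter dark run.
    """
    count = 0
    dark_run = 0      # consecutive dark samples seen so far
    on_line = False   # True once the current dark run has been counted
    for value in readings:
        if value <= dark:
            dark_run += 1
            if dark_run >= min_dark_samples and not on_line:
                count += 1
                on_line = True
        elif value >= light:
            dark_run = 0
            on_line = False
        # Values between the two thresholds leave the state unchanged,
        # so a noisy reading at the edge of a line is not counted twice.
    return count


# Simulated drive: surface, a line, a small crater, then another line.
sample = [50, 52, 20, 18, 19, 21, 50, 51, 28, 49, 50, 22, 20, 19, 50]
lines_crossed = count_lines(sample)
```

Using two thresholds plus a minimum run length is one common way to reject brief dark patches; the right numbers would have to be measured on the actual map.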
Did robots help or hurt the teaching process?
I think they helped. Rather than writing programs that just display "Hello World!" on a computer screen, the students can actually see robots move, and either do what they expect, or not!
And when the robots didn't do what they were expected to?
The students got into "debug" mode. They were already used to doing this from previous FLL competitions, but with RobotC, you can leave the USB cable connected (or use wireless Bluetooth) and actually gather debugging information while the robot is running, to see the value of sensors and other variables and help determine why things are not working properly.
Any applicability to the real world of storage?
We have robots in the IBM System Storage TS3500 tape library. These robots scan bar code labels, pull tapes out of shelves and mount them into drives. The programming skills are the same needed for storage software, such as IBM Tivoli Storage Manager or IBM Tivoli Storage Productivity Center.
The world is becoming smarter, instrumented with sensors, interconnected over a common network, and intelligent enough to react and respond correctly. The lessons of reading sensor values and moving motors can be considered the first step in solutions that help to make a smarter planet.
Today we watched Barack Obama get inaugurated as the 44th President of the United States, and he reminded all Americans that the power and strength of this country comes through its diversity. To some extent, this is also what gives IBM its power and strength as well. While not quite the orator of President Obama, IBM's own CFO, Mark Loughridge, gave a rousing speech about IBM's 4Q08 and year-end financial results.
In 2008, IBM was not just successful because it had a wide diversity of servers and storage hardware products, but also a diversity of software, and a diversity of service offerings. And lastly, IBM sells to a diversity of clients in different industries, throughout a diversity of markets. While the current economic meltdown might have affected businesses focused on the US and other major markets, IBM did particularly well last year in growth markets, including the so-called BRIC countries (Brazil, Russia, India and China).
IBM's approach to invest in R&D and its nearly 400,000 employees for long-term success continues to pay off. Where "Cash is King", IBM can also afford all those acquisitions and strategic initiatives, positioning the company for a brighter future.
Where there are challenges, IBM finds opportunity.
It's Thursday at the [Data Center Conference] here in Las Vegas. Trying to keep up with all the sessions and activities has been quite challenging. As is often the case, there are more sessions that I want to attend than I physically am able to, so I have to pick and choose.
Making the Green Data Center a Reality
The sixth and final keynote was an expert panel session, with Mark Bramfitt from Pacific Gas and Electric [PG&E], and Mark Thiele from VMware.
Mark explained PG&E's incentive program to help data centers be more energy efficient. They have spent $7 million US dollars so far on this, and he has requested another $50 million US dollars over the next three years. One idea was to put "shells" around each pod of 28 or so cabinets to funnel the hot air up to the ceiling, rather than having the hot air warm up the rest of the cold air supply.
The fundamental disconnect for a "green" data center is that the Facilities team pays for the electricity, but it is the IT department that makes decisions that impact its use. The PG&E rebates reward IT departments for making better decisions. The best metric available is "Power Usage Effectiveness" or [PUE], which is calculated by dividing the total energy consumed in the data center by the energy consumed by the IT equipment itself. Typical PUE runs around 3.0, which means for every Watt used for servers, storage or network switches, another 2 Watts are used for power, cooling, and facilities. Companies are trying to reduce their PUE down to 1.6 or so. The lower the better, and 1.0 is the ideal. The problem is that changing the data center infrastructure is as difficult as replacing the phone system or your primary ERP application.
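The PUE arithmetic is simple enough to sketch in a few lines of Python. The function names and the sample wattages below are my own, picked only to reproduce the 3.0 and 1.6 figures mentioned above.

```python
def pue(total_facility_watts, it_equipment_watts):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_watts <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_watts / it_equipment_watts


def overhead_per_it_watt(pue_value):
    """Watts spent on power, cooling, and facilities for each Watt of IT load."""
    return pue_value - 1.0


# Hypothetical data center: 1,000 W of IT load inside a 3,000 W facility.
typical = pue(3000, 1000)                    # PUE of 3.0
typical_overhead = overhead_per_it_watt(typical)  # 2 W of overhead per IT Watt
target_overhead = overhead_per_it_watt(1.6)       # about 0.6 W per IT Watt
```

Seen this way, moving from a PUE of 3.0 to 1.6 cuts the overhead per Watt of IT load from 2 Watts to roughly 0.6 Watts.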
While California has [Title 24], stating energy efficiency standards for both residential and commercial buildings, it does not apply to data centers. PG&E is working to add data center standards into this legislation.
The two speakers also covered Data Center [bogeymans], unsubstantiated myths that prevent IT departments from doing the right thing. Here are a few examples:
Power cycles - some people believe that x86 servers can typically only handle up to 3000 shutdowns, and so equipment is often left running 24 hours a day to minimize these. Most equipment is kept less than 5 years (1826 days), so turning off non-essential equipment at night, and powering it back on the next morning, is well below this 3000 limit and can greatly reduce kWh.
Dust - many are so concerned about dust that they run extra air filters, which impact the efficiency of cooling-system air flow. New IT equipment tolerates dust much better than older equipment.
Humidity - Mark had a great story on this one. He said their "de-humidifier" broke, and they never got around to fixing it, and they went years without it, realizing they didn't need to de-humidify.
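The power-cycle bogeyman above comes down to simple arithmetic, sketched here in Python. The one-cycle-per-day schedule is taken from the nightly shutdown example; the function name is hypothetical.

```python
def lifetime_power_cycles(days_in_service, cycles_per_day=1):
    """Total shutdown/power-on cycles over the equipment's service life."""
    return days_in_service * cycles_per_day


BELIEVED_LIMIT = 3000     # the commonly believed shutdown limit
FIVE_YEARS_IN_DAYS = 1826  # the 5-year service life quoted above

nightly_cycles = lifetime_power_cycles(FIVE_YEARS_IN_DAYS)
remaining_margin = BELIEVED_LIMIT - nightly_cycles
```

Even powering down every single night for five years uses 1826 cycles, leaving a margin of 1174 before the believed limit is reached.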
The session wrapped up with some "low hanging fruit", items that can provide immediate benefit with little effort:
Cold-aisle containment--Why are so few data centers doing this?
Colocation providers need to meter individual clients' energy usage -- IBM offers the instrumentation and software to make this possible
Air flow management--Simply organizing cables under the floor tiles could help this.
Virtualization and Consolidation.
High-efficiency power supplies
Managing IT from a Business Service Perspective
The "other" future of the data center is to manage it as a set of integrated IT services, rather than a collection of servers, storage and switches. IT Infrastructure Library (ITIL) is widely accepted as a set of best practices to accomplish this "service management" approach. The presenter from ASG Software Solutions presented their Configuration Management Data Base (CMDB) and application dependency dashboard. They have some customers with as many as 200,000 configuration items (CIs) in their CMDB.
The solution looked similar to the IBM Tivoli software stack presented earlier this year at the [Pulse conference]. Both ASG and IBM "eat their own dog food", or perhaps more accurately "drink their own champagne", using these software products to run their own internal IT operations.
For many, the future of a "green" data center managed as a set of integrated services is years away, but the technologies and products are available today, and there is no reason to postpone these projects any longer than necessary. For more about IBM's approach to the green data center, see [Energy Efficiency Solutions]. You can also take IBM's [IT Service Management self-assessment] to help determine which IBM tools you need for your situation.
Last month, HP and Oracle jointly announced their new "Exadata Storage Server". This solution involves HP server and storage paired up with Oracle software, designed for Data Warehouse and Business Intelligence workloads (DW/BI).
I immediately recognized the Exadata Storage Server as a "me too" product, copying the idea from IBM's [InfoSphere Balanced Warehouse], which combines IBM servers, IBM storage and IBM's DB2 database software to accomplish this, but from a single vendor, rather than a collaboration of two vendors. The Balanced Warehouse has been around for a while. I even blogged about this last year, in my post [IBM Combo trounces HP and Sun], when IBM announced its latest E7100 model. IBM offers three different sizes: C-class for smaller SMB workloads, D-class for moderate size workloads, and E-class for large enterprise workloads.
One would think that since IBM and Oracle are the top two database software vendors, and IBM and HP are the top two storage hardware vendors, that IBM would be upset or nervous about this announcement. We're not. I would gladly recommend comparing IBM offerings with anything HP and Oracle have to offer. And with IBM's acquisition of Cognos, IBM has made a bold statement that it is serious about competing in the DW/BI market space.
But apparently, it struck a nerve over at EMC.
Fellow blogger Chuck Hollis from EMC went on the attack, and Oracle blogger Kevin Closson went on the defensive. For those readers who do not follow either, here is the latest chain of events:
When it comes to blog fights like these, there are no clear winners or losers, but hopefully, if done respectfully, they can benefit everyone involved, giving readers insight into the products as well as the company cultures that produce them. Let's see how each side fared:
Chuck implies that HP doesn't understand databases and Oracle doesn't understand server and storage hardware, so cobbling together a solution based on this two-vendor collaboration doesn't make sense to him. The few I know who work at HP and Oracle are smart people, so I suspect this is more a claim against each company's "core strengths". Few would associate HP with database knowledge, or Oracle with hardware expertise, so I give Chuck a point on this one.
Of course, Chuck doesn't have deep, inside knowledge of this new offering, nor do I for that matter, and Kevin is patient enough to correct all of Chuck's mistaken assumptions and assertions. Kevin understands that EMC's "core strengths" aren't in servers or databases, so he explains things in simple enough terms that EMC employees can understand, so I give Kevin a point on this one.
If two is bad, then three is worse! How much bubble gum and baling wire do you need in your data center? The better option is to go to the one company that offers it all and brings it together into a single solution: IBM InfoSphere Balanced Warehouse.
Continuing this week's theme on dealing with the global economic meltdown, recession and financial crisis, I found a great video that recaps IBM CEO Sam Palmisano's recommendations for being more competitive in this environment.
In a recent speech to business leaders, Sam outlined what he sees as the four most important steps to thriving in the global economy. The highlights can be seen here in this [2-minute video] on IBM's "Forward View" eMagazine.
As financial firms focus on costs, the IT departments will have an opportunity to consolidate their servers, networks and storage equipment. Consolidating disk and tape resources, implementing storage virtualization, and reducing energy costs might get a boost from this crisis. Consolidating disparate storage resources to a big SoFS, XIV, DS8000 disk system, or TS3500 tape library might greatly help reduce costs.
Having mixed vendor environments that result from such mergers and acquisitions can be complicated to manage. Thankfully, IBM TotalStorage Productivity Center manages both IBM and non-IBM equipment, based on open industry standards like SMI-S and WBEM. Merged companies might let go IT people with limited vendor-specific knowledge, but keep the ones familiar with cross-vendor infrastructure management skills and ITIL certification.
Comparing different vendor equipment
It seems that often times when there is a merger or acquisition, the two companies were using different storage gear from different vendors. IBM has made some incredible improvements over the past three years, in both performance enhancements and energy efficiency, but many companies with non-IBM equipment may not be aware of them. If there was ever a time to perform a side-by-side comparison between IBM and non-IBM equipment, here is your chance.
For more on the impact of the financial meltdown on IT, see this InfoWorld [Special Report].
In Monday's post, [IBM Information Infrastructure launches today], I explained how this strategic initiative fit into IBM's New Enterprise Data Center vision. For you podcast fans, IBM Vice Presidents Bob Cancilla (Disk Systems), Craig Smelser (Storage and Security Software), and Mike Riegel (Information Protection Services), highlight some of the new products and offerings in this 12-minute recording:
This post will focus on Information Security, the second of the four-part series this week.
Here's another short 2-minute video, on Information Security
Security protects information against both internal and external threats.
For internal threats, most focus on whether person A has a "need-to-know" about information B. Most of the time, this is fairly straightforward. However, sometimes production data is copied to support test and development efforts. Here is the typical scenario: the storage admin copies production data that contains sensitive or personal information to a new copy and authorizes software engineers or testers full read/write access to this data. In some cases, the engineers or testers may be employees, other times they might be hired contractors from an outside firm. In any case, they may not be authorized to read this sensitive information. To solve this, IBM announced the [IBM Optim Data Privacy Solution] for a variety of environments, including Siebel and SAP enterprise resource planning (ERP) applications.
I found this solution quite clever. The challenge is that production data is interrelated and typically lives inside [relational databases]. For example, one record in one database might have a name and serial number, and then that serial number is used to reference a corresponding record in another database. The IBM Optim Data Privacy Solution applies a range of "masks" to transform complex data elements such as credit card numbers, email addresses and national identifiers, while retaining their contextual meaning. The masked results are fictitious, but consistent and realistic, creating a “safe sandbox” for application testing. This method can mask data from multiple interrelated applications to create a “production-like” test environment that accurately reflects end-to-end business processes. The testers get data they can use to validate their changes, and the storage admins can rest assured they have not exposed anyone's sensitive information.
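To make the idea of a consistent mask concrete, here is a minimal Python sketch. This is not how IBM Optim works internally; it is just one simple way to get masked values that are fictitious yet identical wherever the same original value appears, so that records in two tables still join. The key, the table contents, and the function name are all hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-only-key"  # hypothetical; a real key would be kept secret


def mask_digits(value, key=SECRET_KEY):
    """Replace each digit deterministically, keeping length and punctuation.

    The same input always yields the same output for a given key, so a
    masked serial number still matches across related tables.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    i = 0
    for ch in value:
        if ch.isdigit():
            out.append(str(digest[i % len(digest)] % 10))
            i += 1
        else:
            out.append(ch)  # keep separators so the format stays realistic
    return "".join(out)


# Two related "tables" that share a serial number (made-up data).
customers = [{"name": "Alice Example", "serial": "123-45-6789"}]
orders = [{"serial": "123-45-6789", "item": "tape cartridge"}]

masked_customers = [
    dict(c, name="Customer 1", serial=mask_digits(c["serial"])) for c in customers
]
masked_orders = [dict(o, serial=mask_digits(o["serial"])) for o in orders]
```

Because the mask is keyed and deterministic, the masked customer record and the masked order record still reference each other, which is the property that keeps the test environment "production-like".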
Beyond just who has the "need-to-know", we might also be concerned with who is "qualified-to-act". Most systems today have both authentication and authorization support. Authentication determines that you are who you say you are, through the knowledge of unique userid/password combinations, or other credentials. Fingerprint, eye retinal scans or other biometrics look great in spy movies, but they are not yet widely used. Instead, storage admins have to worry about dozens of different passwords on different systems. One of the many preview announcements made by Andy Monshaw on Monday's launch was that IBM is going to integrate the features of [Tivoli Access Manager for Enterprise Single Sign-On] into IBM's Productivity Center software, which will be renamed "IBM Tivoli Storage Productivity Center". You enter one userid/password, and you will not have to enter the individual userid/password of all the managed storage devices.
Once a storage admin is authenticated, they may or may not be authorized to read or act on certain information. Productivity Center offers role-based authorization, so that people can be identified by their roles (tape operator, storage administrator, DBA) and that would then determine what they are authorized to see, read, or act upon.
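Role-based authorization can be pictured as a lookup from role to permitted actions. The Python sketch below uses the three roles named above; the action names and the mapping itself are invented for illustration and do not reflect Productivity Center's actual configuration.

```python
# Hypothetical mapping of roles to the actions they may perform.
ROLE_PERMISSIONS = {
    "tape operator": {"view_tape", "mount_tape"},
    "storage administrator": {
        "view_tape", "mount_tape", "view_disk", "provision_disk",
    },
    "DBA": {"view_disk"},
}


def is_authorized(role, action):
    """Return True if the given role is allowed to perform the action.

    Unknown roles get an empty permission set, so they are denied by default.
    """
    return action in ROLE_PERMISSIONS.get(role, set())


operator_can_mount = is_authorized("tape operator", "mount_tape")
dba_can_provision = is_authorized("DBA", "provision_disk")
```

The point of the design is that permissions attach to the role rather than to the individual, so adding a new tape operator means assigning a role instead of editing a list of devices.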
For external threats, you need to protect data both in-flight and at-rest. In-flight deals with data that travels over a wire, or wirelessly through the air, from source to destination. When companies have multiple buildings, the transmissions can be encrypted at the source, and decrypted on arrival. The bigger threat is data at-rest. Hackers and cyber-thieves are looking to download specific content, like personally identifiable information, financial information, and other sensitive data.
IBM was the first to deliver an encrypting tape drive, the TS1120. The encryption process is handled right at the drive itself, eliminating the burden of encryption from the host processing cycles, and eliminating the need for specialized hardware sitting between server and storage system. Since then, we have delivered encryption on the LTO-4 and TS1130 drives as well.
When disk drives break or are decommissioned, the data on them may still be accessible. Customers have a tough decision to make when a disk drive module (DDM) stops working:
Send it back to the vendor or manufacturer to have it replaced, repaired or investigated, exposing potentially sensitive information.
Keep the broken drive, forfeit any refund or free replacement, and then physically destroy the drive. There are dozens of videos on [YouTube.com] on different ways to do this!
The launch previewed the [IBM partnership with LSI and Seagate] to deliver encryption technology for disk drives, known as "Full Drive Encryption" or FDE. Having all data encrypted on all drives, without impacting performance, eliminates having to decide which data gets encrypted and which doesn't. With data safely encrypted, companies can now send in their broken drives for problem determination and replacement. The more consistently you can apply a solution across everything, without human intervention and decision making, the less impact it will have. This was the driving motivation in both disk and tape drive encryption.
(Early in my IBM career, some lawyers decided we needed to add a standard 'paragraph' to our copyright text in the upper comment section of our software modules, and so we had a team meeting on this. The lawyer explained to us that perhaps only 20 to 35 percent of the modules needed to be updated with this paragraph, and taught us what to look for to decide whether or not the module needed to be changed. My team argued how tedious this was going to be, that it would take time to open up each module, evaluate it, and make the decision. With thousands of modules involved, the process could take weeks. The fact that this was going to take us weeks did not seem to concern our lawyer one bit; it was just the cost of doing business. Finally, I asked if it would be legal to just add the standard paragraph to ALL the modules without any analysis whatsoever. The lawyer was stunned. There was no harm adding this paragraph to all the modules, he said, but that would be 3-5x more work, and why would I even suggest that? Our team laughed, recognizing immediately that it was the fastest way to get it done. One quick program updated all modules that afternoon.)
To manage these keys, IBM previewed the Tivoli Key Lifecycle Manager (TKLM). This software helps automate the management of encryption keys throughout their lifecycle to help ensure that encrypted data on storage devices cannot be compromised if lost or stolen. It will apply to both disk and tape encryption, so that one system will manage all of the encryption keys in your data center.
For those who only read the first and last paragraphs of each post, here is my recap: Information Security is intended as an end-to-end capability to protect against both internal and external threats, restricting access only to those who have a "need-to-know" or are "qualified-to-act". Security approaches like "single sign-on" and encryption that applies to all tapes and all disks in the data center greatly simplify the deployment.
Continuing this week's theme on the z10 EC mainframe being able to perform the workload of hundreds or thousands of small 2-way x86 servers, I offer a simple analogy.
One car, one driver
If you wonder why so many companies subscribe to the notion that you should only run a single application per server, blame Sun, who I think helped promote this idea. Not to be out-done, Microsoft, HP and Dell think that it is a great idea too. Imagine the convenience for operators to be able to switch off a single machine and impact only a single application. Imagine how much this simplifies new application development, knowing that you are the only workload on a set of dedicated resources.
This is analogous to a single car, single driver, where the car helps get the person from "point A" to "point B" and the single driver represents the driver and sole passenger of the vehicle. If this were a single driver on an energy-efficient motorcycle or scooter, that would be reasonable, but people often drive much bigger vehicles alone, what Jeff Savit would call "over-provisioning". Chips have increased in processing power much faster than individual applications have increased their requirements, so as a result, you have over-provisioning.
Carpooling - one bus, one driver, and many other passengers riding along
This is how z/OS operates. Yes, you could have up to 60 LPARs that you could individually turn on and off, but where z/OS gets most of its advantages is that you can run many applications in a single OS instance, through the use of "Address Spaces" which act as application containers. Of course, it is more difficult to write for this environment, because you have to be a good "z/OS citizen", share resources nicely, and be WLM-compliant to allow your application to be swapped out for others.
While you get efficiencies with this approach, when you bring the OS down, all the apps on that OS image have to stop with it. For those who have "Parallel Sysplex" that is not an issue. For example, let's say you have three mainframes, each running several LPARs of z/OS, and your various z/OS images all are able to process incoming transactions for a common shared DB2 database. Thanks to DB2 sharing technology, you could take down an individual LPAR or z/OS image, and not disrupt transaction processing, because the IP spreader just sends them to the remaining LPARs. A "Coupling Facility" allows for smooth operations if any of the OS images are lost from an unexpected disaster or disruption.
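As a toy illustration of why taking one image down need not disrupt processing, here is a small Python sketch of a spreader that hands incoming transactions to whichever images are still active. The round-robin policy, the class name, and the image names are my own simplifications; real sysplex workload routing is far more sophisticated.

```python
class Spreader:
    """Toy transaction spreader: round-robin over the images still active."""

    def __init__(self, images):
        self.active = list(images)
        self._next = 0  # running counter used to pick the next image

    def take_down(self, image):
        """Remove an image from service, e.g. for planned maintenance."""
        self.active.remove(image)

    def route(self, transaction):
        """Return the image that should process this transaction."""
        if not self.active:
            raise RuntimeError("no active images to process transactions")
        image = self.active[self._next % len(self.active)]
        self._next += 1
        return image


spreader = Spreader(["LPAR1", "LPAR2", "LPAR3"])
before = [spreader.route(txn) for txn in range(3)]

spreader.take_down("LPAR2")  # planned outage of one image
after = [spreader.route(txn) for txn in range(4)]
```

After the outage, every transaction still finds a home on LPAR1 or LPAR3, which is the essence of the point above: the shared database keeps serving work as long as at least one image remains.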
Needless to say, IBM does not give each z/OS developer his or her own mainframe. Instead, we get to run z/OS guest images under z/VM. It was even possible to emulate the next generation S/390 chipset to allow us to test software on hardware that hasn't been created yet. With HiperSockets, we can have virtual TCP/IP LAN connections between images, have virtual coupling facilities, have virtual disk and virtual tape, and so on. It made development and test that much more efficient, which is why z/OS is recognized as one of the most rock-solid bullet-proof operating systems in existence.
The negatives of carpooling or taking the bus apply here as well. I have been on buses that have stopped working, and 50 people are stranded. And you don't need more than two people to make the logistics of most carpools complicated. This feeds the fear that makes people want separate manageable units, one-car-one-driver, rather than putting all of their eggs into one basket, having to schedule outages together, and so on.
(Disclaimer: From 1986 to 2001 I helped the development of z/OS and Linux on System z. Most of my 17 patents are from that time of my career!)
Bicycle races and Marathons
The third computing model is the Supercomputer. Here we take a lot of one-way and two-way machines, and lash them together to form an incredible machine able to perform mathematical computations faster than any mainframe. The supercomputer that IBM built for Los Alamos National Laboratory just clocked in at 1,000,000,000,000,000 floating point operations per second. This is not a single operating system, but rather each machine runs its own OS, is given its primary objective, and tries to get it done. NetworkWorld has a nice article on this titled: [IBM, Los Alamos smash petaflop barrier, triple supercomputer speed record]. If every person in the world was armed with a handheld calculator and performed one calculation per second, it would take us 46 years collectively to do everything this supercomputer can do in one day.
I originally thought of bicycle races as an analogy for this, but having listened to Lance Armstrong at the [IBM Pulse 2008] conference, I learned that biking is a team sport, and I wanted something that had the "every-man-for-himself" approach to computing. So, I changed this to marathons.
The marathon was named after a fabled Greek soldier who was sent as a messenger from the [Battle of Marathon to the City of Athens], a distance that is now standardized to 26 miles and 385 yards, or 42.195 kilometers for my readers outside the United States.
If you were given the task to get thousands of people from "point A" to "point B" 26 plus miles away, would you choose thousands of cars, each with a lone driver? Conferences with a lot of people in a few hotels use shuttle buses instead. A few drivers, a few buses, and you can get thousands of people from a few places to a few places. But the workloads that are sent to supercomputers have a single end point, so a dispatcher node gives a message to each "Greek soldier" compute node, and has them run it on their own. Some make it, some don't, but for a supercomputer that is OK. When the message is delivered, the calculation for that little piece is done, and the dispatcher node gives the compute node another message to process. All of the computations are assembled to come up with the final result. Applications must be coded very specially to be able to handle this approach, but for the ones that are, amazing things happen.
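The dispatcher-and-runners pattern can be sketched on a single machine in Python. Here a thread pool stands in for the compute nodes and a sum of squares stands in for the real workload; both substitutions, along with the function names and piece size, are assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_piece(piece):
    """Work done independently by one 'compute node' on its piece."""
    return sum(n * n for n in piece)


def dispatch(numbers, piece_size=1000, nodes=4):
    """Split the job into pieces, hand them out, and assemble the results."""
    pieces = [
        numbers[i:i + piece_size] for i in range(0, len(numbers), piece_size)
    ]
    # Each worker takes a piece, finishes it, then picks up the next one.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial_results = list(pool.map(compute_piece, pieces))
    return sum(partial_results)


total = dispatch(list(range(10000)))
```

The workload has to be written so that each piece can be computed without talking to the others, which is the "coded very specially" requirement mentioned above. This sketch also omits what a real system needs when a runner does not make it: re-sending that piece to another node.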
So, how does "server virtualization" come into play?
IBM has had Logical Partitions for quite some time. A logical partition, or LPAR, can run its own OS image, and can be turned on and off without impacting other LPARs. LPARs can have dedicated resources, or shared resources with other LPARs. The IBM z10 EC can have up to 60 LPARs. System p and System i, now merged into the new "POWER Systems" product line, also support LPARs in this manner. Depending on the size of your LPAR, this could be for a single OS and application, or a single OS with lots of applications.
Address Spaces/Application Containers
This is the bus approach. You have a single OS, and that is shared by a set of application containers. z/OS does this with address spaces, all running under a single z/OS image, and for x86 there are products like [Parallels Virtuozzo Containers] that can run hundreds of Windows instances under a single Windows OS image, or a hundred Linux images under a single Linux OS image. However, you cannot mix and match Windows with Linux, just as all the address spaces on z/OS have to be coded for the same z/OS level on the LPAR they run in.
The term "guests" was chosen to model this after the way hotels are organized. Each guest has a room with its own lockable entrance and privacy, but shared lobby, and in some countries, shared bathrooms on every hall. This approach is used by z/VM, VMware and others. The z/VM operating system can handle any S/390-chip operating system guest, so you could have a mix of z/OS, TPF, z/VSE, Linux and OpenSolaris, and even other z/VM levels running as guests. Many z/VM developers run in this "second level" mode to develop new versions of the z/VM operating system!
As part of the One Laptop Per Child [OLPC] development team (yes, I am a member of their open source community, and now have developer keys to provide contributions), I have been experimenting with Linux KVM. This was [folded into the base Linux 2.6.20 kernel] and is available to run Linux and Windows guest images. There is a nice write-up on [Wikipedia].
The key advantage of this approach is that you are back to the simplistic one-car-one-driver mode of thinking. Each guest can be turned on and off without impacting other applications. Each guest has its own OS image, so you can mix different OS on the same server hardware. You can have your own customized kernel modules, levels of Java, etc. Externally, it looks like you are running dozens of applications on a single server, but internally, each application thinks it is the only one running on its own OS. This gives you a simpler coding model on which to base your test and development.
Jeff is correct that running less than 10 percent utilization average across your servers is a crying shame, and that it could be managed in a manner that raises the utilization of the servers so that fewer are needed. Just as people could carpool, or could take the bus to work, it just doesn't happen, and data centers are full of single-application servers.
VMware has an architectural limit of 128 guests per machine, and IBM is able to reach this with its beefiest System x3850 M2 servers, but most of the x86 machines from HP, Dell and Sun are less powerful, and only run a dozen or so guests. In all cases, fewer servers means it is simpler to manage, so more applications per server is always the goal in mind.
VMware can soak up 30 to 40 percent of the cycles, meaning the most you can get from a VMware-based solution is 60 to 70 percent CPU utilization (which is still much better than the typical 5 to 10 percent average utilization we see today!) z/VM has been finely tuned to incur as little as 7 percent overhead, so IBM can achieve up to 93 percent utilization.
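The utilization ceilings quoted above follow directly from the overhead figures, as this short Python sketch shows. The function name is hypothetical, and the model is deliberately simple: it treats hypervisor overhead as a fixed fraction of CPU cycles.

```python
def max_utilization(overhead_fraction):
    """Best-case share of CPU cycles left for guests after hypervisor overhead."""
    if not 0.0 <= overhead_fraction <= 1.0:
        raise ValueError("overhead must be a fraction between 0 and 1")
    return 1.0 - overhead_fraction


# Overhead figures as quoted in the paragraph above.
ceiling_low_overhead = max_utilization(0.07)   # 7 percent overhead
ceiling_30_percent = max_utilization(0.30)     # 30 percent overhead
ceiling_40_percent = max_utilization(0.40)     # 40 percent overhead
```

Under this simple model, 7 percent overhead leaves 93 percent for useful work, while 30 to 40 percent overhead leaves 60 to 70 percent.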
Jeff argues that since many of the z/OS technologies that allow customers to get over 90 percent utilization don't apply to Linux guests under z/VM, then all of the numbers are wrong. My point is that there are two ways to achieve 90 percent utilization on the mainframe, one is through z/OS running many applications on a single LPAR (the application container approach), and the other through z/VM supporting many Linux OS images, each with one (or a few) applications (the virtual guest approach).
I am still gathering more research on this topic, so I will try to have it ready later this week.
Yesterday's post [Software Programmers as Bees] was not meant as "career advice", but I certainly got some interesting email as if it were. Orson Scott Card was poking fun at the culture clash between software programmers and management/marketers, and I gave my perspective, having worked both types of jobs.
This is June. Many students are graduating from high school or college and looking for jobs. Some of these might be jobs just for the summer to make some spending money, and others might be jobs like internships to explore different career paths. I found both programming and marketing to be rewarding and interesting work, but each person is different.
There are a variety of ways to find out what your personality traits are, and then focus on those jobs or career paths that best fit those strengths. Here is an online [Typology Test] based on the work of Carl Jung and Isabel Briggs Myers. The result is a four-letter score that represents one of 16 possible personalities. For example, mine is "ENTP", which stands for "Extroverted, Intuitive, Thinking, Perceiving". You can find out which famous people match your personality type. For ENTP, I am lumped together with fellow master inventor Thomas Edison, fellow author Lewis Carroll (Alice in Wonderland), cooking great Julia Child, comedians George Carlin and Rodney Dangerfield (I get no respect!), movie director Alfred Hitchcock, and actor Tom Hanks.
USA Today had an article ["CEOs value lessons from teen jobs"] which offers some career advice from successful business people. Of course, what worked for them may not work for you, given different personality types. Here is an excerpt of the advice I thought the most useful:
"If you are committed, you will be successful." (unfortunately, the reverse is also true: if you are successful, you will be asked to move to a different job)
"Tackle offbeat jobs. Challenge conventional wisdom within reason. Come into contact with people from all walks of life."
"Show an interest, demonstrate you want to be on the job."
"Never limit yourself. Look beyond to what needs to be done, or should be done. Then do it. Stretch. Go beyond what others expect."
"Find a job that forces you to work effectively with people. No matter what you end up doing, dealing with others will be critical."
"Bring your best to the table every day. Learn professional responsibility and how to handle difficult situations."
"Listen carefully to what customers want."
Before IBM, I ran my own business. If you are thinking, "Maybe I will start my own business instead?" you might want to see this advice from Venture Capitalist [Guy Kawasaki on Innovation]. While running your own business has advantages, like avoiding the issues of "working for the man", it has some disadvantages as well. It is certainly not as easy as some people make it seem.
Of course, things are a lot different nowadays than they were when these CEOs were teenagers. And the pace of change does not seem to be slowing down either. Here is a presentation on [SlideShare.net] that helps bring into focus the realities of globalization:
Continuing this week's theme on "best of breed", some questions arise: How is this calculated or determined? How is one storage solution "better" than another? Which attributes weigh more heavily in the decision?
Some attributes are directly measurable, like storage performance. For this, gather up a list of all the storage products you are interested in, go to the [Storage Performance Council website], determine whether SPC-1 or SPC-2 more closely matches your application workload, and then choose the best product from the benchmarks, discarding any vendors that don't bother to have benchmarks posted. The new SPC-2 benchmark was created, in part, to address new workloads for the Media and Entertainment industry. (For a comparison of the two, see my post [SPC benchmarks for Disk System Performance])
However, other attributes, like "easy to manage", are not as straightforward to measure. One client compared the complexity of different solutions by counting the number of cables involved to connect the various parts of each solution. Only external cables were considered, so all of the cables inside an IBM System Storage DS8000 would not be counted. By this measure, a single IBM System z10 EC mainframe connected to a single IBM DS8000 disk system over a few FICON cables would be "less complicated" than a thousand x86 servers connected via FCP SAN switches to dozens of disk systems.
But counting cables only handles the hardware part of the interconnections. You also have to consider the interconnections between the software, between users, and between IT administrators. It is not always obvious where those connections are, or how to take them into account.
This month, IBM introduced the first "Management Complexity Factor" (MCF) for the Media and Entertainment industry. IBM MCF is a result of IBM's acquisition of NovusCG, and is an essential part of the "Storage Optimization Services" being offered by IBM. Here is an excerpt from the [IBM Press Release]:
"Media companies are facing a double-edged sword with the exponential rise in digital media storage needs, coupled with concerns about optimizing storage to be more efficient," said Steve Canepa, vice president of Media and Entertainment, IBM. "By quickly and cost-effectively analyzing the interconnected IT and storage environments that increasingly comprise media operations, MCF for Media helps our clients identify opportunities for improvement and align their IT and business strategies."
Since 1995, IBM has invested more than $18 billion on public acquisitions, making it the most acquisitive company in the technology industry, based on volume of transactions.
IBM has a strong global focus on the media and entertainment industry across all of its services and products, serving all the major industry segments -- entertainment, publishing, information providers, media networks and advertising.
It's been a while since I've talked about [Second Life].
The latest post on eightbar, [Spimes, Motes and Data centers], discusses IBM's use of virtual world technology to analyze data centers in three dimensions. New World Notes asks [What's The Point Of 3D Data Centers?] One would think that a simple monitoring tool based on a two-dimensional floor plan would be enough to evaluate a data center.
Enter Michael Osias, IBM (a.k.a. Illuminous Beltran in Second Life). Some of the leading news sites have begun to notice some 3D data centers that he has helped pioneer. UgoTrade writes up an article about Michael and the media attention in [The Wizard of IBM's 3D Data Centers].
Of course, in presenting these "Real Life/Second Life" (RL/SL) interactive technologies, IBM is sometimes the target of ridicule. Why? Because IBM is 10 years ahead of everyone else. So, are there aspects of a data center where 3D interfaces make sense? I think there are.
IBM TotalStorage Productivity Center has an awesome "topology viewer" that shows what servers are connected to which switches, and to which disk systems and tape libraries. This is all done in a 2D diagram, generated dynamically with data discovered through open standard interfaces, similar to what you might draw manually with tools like Visio. Imagine, however, how much more powerful it would be as a 3D viewer, with virtual equipment mapped to the physical location of each piece of hardware, including its position in the rack and the rack's location on the data center floor.
Designing computer room air conditioning (CRAC) systems is actually a three-dimensional problem. Cold air is fed underneath the raised floor, comes up through strategically placed "vent" tiles, and is taken in at the front of each rack. Hot air comes out the back of each rack, and hopefully finds a ceiling duct intake to get cooled again. The temperature six inches off the floor is different than the temperature six feet off the floor, and 3D monitoring tools could be helpful in identifying "hot spots" that need attention. In this case "spimes" represent sensors in the 3D virtual world, able to report back information to help diagnose problems or monitor events.
After many people left the mainframe in favor of running a single application per distributed server, the pendulum has finally swung back. Companies are discovering the many benefits of changing this behavior. "Re-centralization" is the task at hand. Thanks to virtualization of servers, networks and storage, sharing common resources can once again claim the benefits of economies of scale. In many cases, servers work together in collective units for specific applications that might benefit from being consolidated onto the same equipment.
IBM's "New Enterprise Data Center" vision recognizes that people will need to focus on the management aspects of their IT infrastructure, and 3D virtual world technologies might be an effective way to get the job done.
This week I'm in beautiful Guadalajara, Mexico teaching at our [System Storage Portfolio Top Gun class]. We have all of our various routes-to-market represented here, including our direct sales force, our technical teams, our online IBM.COM website sales, as well as IBM Business Partners. Everyone is excited over last week's IBM announcement of [4Q07 and full year 2007 results], which includes double-digit growth in our IBM System Storage business, led by sales of our DS8000, SAN Volume Controller and Tape systems. Obviously, as an IBM employee and stockholder, I am biased, so instead I thought I would provide some excerpts from other bloggers and journalists.
But what was striking in the company’s conference call on Thursday afternoon was the unhedged optimism in its outlook for 2008, given the strong whiff of recession fear elsewhere.
The questions from Wall Street analysts in the conference call had a common theme. Why are you so comfortable about the 2008 outlook? Now, that might just be professional churlishness, since so many of them have been so wrong recently about I.B.M. Wall Street had understandably thought, for example, that I.B.M.’s sales to financial services companies — the technology giant’s largest single customer category — would suffer in the fourth quarter, given the way banks have been battered by the mortgage credit crunch.
But Mr. Loughridge said that revenue from financial services customers rose 11 percent in the fourth quarter, to $8 billion. The United States, he noted, accounts for only 25 percent of I.B.M.’s financial services business.
The other thing that seems apparent is how much I.B.M.’s long-term strategy of moving up to higher-profit businesses and increasingly relying on services and software is working. Its huge services business grew 17 percent to $14.9 billion in the quarter. After the currency benefit, the gain was 10 percent, but still impressive. Software sales rose 12 percent to $6.3 billion.
Looking at IBM's business segments, it can be seen that they offer far more coverage of the technology space that those of the typical tech company:
IBM is just so big and diversified that there is little comparison between it and most other tech companies. IBM is a member of an elite group of companies like Cisco Systems (CSCO), Microsoft (MSFT), Oracle (ORCL) or Hewlett-Packard (HPQ).
IBM's wide international coverage and deep technological capabilities dwarf those of most tech companies. Not only do they have sales organizations worldwide but they have developers, consultants, R&D workers and supply chain workers in each geographic region. Their product mix runs from custom software to packaged enterprise software, hardware (mainframes and servers), semiconductors, databases, middleware technology, etc., etc. There are few tech companies that even attempt to support that many kinds and variations of products.
As color on the fourth quarter earnings announcement, there are a couple of observations that I would like to make. The first one speaks to IBM's international prowess. The company indicated that growth in the Americas was only 5%. International sales were a primary driver of IBM's good results. As an insight on the difference between IBM and most other tech companies, it is clear that nowadays, a tech company that isn't adept at selling internationally is going to be in trouble.
Terrific performance in a terrific year - no doubt a result of its strong global model. IBM operates in 170 countries, with about 65% of its employees outside US and about 30% in Asia Pacific. For fiscal 2007, revenues from Americas grew 4% to $41.1 billion (42% of total revenue), [EMEA] grew 14% to $34.7 billion (35%of total revenue), and Asia-Pacific grew by 11% to $19.5 billion (19.7% of total revenue). IBM sees growth prospects not just in [BRIC] but also countries like Malaysia, Poland, South Africa, Peru, and Singapore.
Thus far 2008–all two weeks of it–hasn’t been a pretty for the tech industry. Worries about the economy prevail. And even companies that had relatively good things to say like Intel get clobbered. It’s ugly out there–unless you’re IBM.
I am sure there will be more write-ups and analyses on this over the coming weeks, and others will probably wait until more tech companies announce their results for comparison.
Well, tomorrow is the Winter solstice, at least for those of us in the Northern hemisphere of the planet. As often happens, I have more vacation days left than I can physically take before they evaporate at the end of the year, so next week I will be off, going to see movies like the new ["Golden Compass"] or perhaps reading the latest book from [Richard Dawkins].
Next week, I suspect some of the kids on my block will be playing with radio-controlled cars or planes. If you are not familiar with these, here's a [video on BoingBoing] that shows Carl Rankin's flying machines that he made out of household materials.
Which brings me to the thought of scalability. For the most part, the physics involved with cars, planes, trains or sailboats apply at the toy-size level as well as the real-world level. One human operator can drive/manage/sail one vehicle. While I have seen a chess master play seven opponents on seven chess boards concurrently, it would be difficult for a single person to fly seven radio-controlled airplanes at the same time.
How can this concept be extended to IT administrators in the data center? They have to deal with hundreds of applications running on thousands of distributed servers. In a whitepaper titled [Single System Image (SSI)], the three authors write:
A single system image (SSI) is the property of a system that hides the heterogeneous and distributed nature of the available resources and presents them to users and applications as a single unified computing resource.
IBM has some offerings that can help towards this goal.
Even in the case where your vehicle is being pulled by eight horses (or eight reindeer?), a single operator can manage it, holding the reins in both hands. In the same manner, IBM has invested heavily in research into supercomputers, where hundreds of individual servers all work together towards a common task. The operator submits a math problem, for example, and the "single system image" takes care of the rest, dividing the work up into smaller chunks that are executed on each machine.
When done with IBM mainframes, it is called a Parallel Sysplex. The world's largest business workloads are processed by mainframes, and connecting several together to work in concert makes this possible. In this case, the tasks are typically just single transactions, so there is no need to divide them up further, just balance the workload across the various machines, with shared access to a common database and storage infrastructure so they can all do the work equally.
Last August, in my post [Fundamental Changes for Green Data Centers], I mentioned that IBM consolidated 3900 Intel-based servers onto 33 mainframes. This not only saves lots of electricity, but makes it much easier for the IT administrators to manage the environment.
Parallel Sysplex configurations often require thousands of disk volumes, which would be quite a headache to deal with individually. With DFSMS, IBM was able to create "storage groups" where a few groups held the data. You might have reasons to separate some data from others and put them in separate groups. An IT administrator could handle a handful of storage groups much more easily than thousands of disk volumes. As businesses grow, there would be more data in each storage group, but the number of storage groups remains flat, so an IT administrator could manage the growth easily.
IBM System Storage SAN Volume Controller (SVC) is able to accomplish this for other distributed systems. All of the physical disk space assigned to an SVC cluster is placed into a handful of "managed disk groups". As the system grows in capacity, more space is added to each managed disk group, but a few IT administrators can continue to manage this easily.
The new IBM System Storage Virtual File Manager (VFM) is able to aggregate file systems into one global name space, again simplifying heterogeneous resources into a single system image. End users have a single drive letter or mount point to deal with, rather than many to connect to all the disparate systems.
Lastly we get to the actual management aspect of it all. Wouldn't it be nice if your entire data center could be managed by a hand-held device with two joysticks and a couple of buttons? We're not quite there yet, but last October we announced the [IBM System Storage Productivity Center (SSPC)]. This is a master console that has a variety of software pre-installed to manage your IBM and non-IBM storage hardware, including SAN fabric gear, disk arrays and even tape libraries. It lets the storage admin see the entire data center as a single system image, displaying the topology in a graphical view that can be drilled down using semantic zooming to look at or manage a particular device or component.
Customers are growing their storage capacity on average 60 percent per year. They could do this by having more and more things to deal with, and gripe about the complexity, or they can try to grow their single system image bigger, with interfaces and technologies that allow the existing IT staff to manage.
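To put the 60 percent figure in perspective, here is a small Python sketch of plain compound growth. The function names and the 100 TB starting point are illustrative assumptions of mine, not customer data, and real growth is rarely this smooth.

```python
import math


def capacity_after(start_tb: float, annual_growth: float, years: int) -> float:
    """Capacity after compounding a fixed annual growth rate (0.60 means 60 percent)."""
    if annual_growth <= -1:
        raise ValueError("annual_growth must be greater than -1")
    return start_tb * (1 + annual_growth) ** years


def years_to_multiply(factor: float, annual_growth: float) -> float:
    """Years needed for capacity to grow by the given factor at a fixed annual rate."""
    if factor <= 0 or annual_growth <= 0:
        raise ValueError("factor and annual_growth must be positive")
    return math.log(factor) / math.log(1 + annual_growth)


if __name__ == "__main__":
    start = 100.0  # hypothetical starting capacity in TB
    for year in range(6):
        print(year, round(capacity_after(start, 0.60, year), 1), "TB")
    print("years to double:", round(years_to_multiply(2, 0.60), 2))
```

At a steady 60 percent per year, capacity doubles in about a year and a half and grows roughly tenfold in five years, which is why keeping the number of things to manage flat matters so much.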
While Bill Gates is personally benefiting from code he wrote 30 years ago, most software engineers don't get royalties for their creative efforts. Robin Harris on StorageMojo has a great piece on [Why are the writers striking?] The writers in this case are those who write scripts for television programs. They get 4 cents for every $19.99 DVD sold today, and want this bumped up to 8 cents. More importantly, they want the same deal for content shown over the internet. Currently, they get nothing when content they wrote is shown on the internet, and they would like that fixed also.
Paying royalties to creative writers encourages them to write good stuff. The best stuff will result in more royalties, and we want to encourage this. What about software engineers? Don't we want them to write the best stuff also? Shouldn't they get royalties too, not just a flat salary and continued employment?
For those in the US, last Friday, the day after Thanksgiving, marked the official start of the Holiday shopping season. This has been called [Black Friday] as some stores open as early as 4am, when it is still dark outside, to offer special discount prices. Some shoppers camp out in sleeping bags and lawn chairs in front of stores overnight to be the first to get in.
Not surprisingly, some folks don't care for this approach to shopping, and prefer instead shopping online. Since 2005, the Monday after Thanksgiving (yesterday) has been called [Cyber Monday]. USA Today newspaper reports [Cyber Monday really clicks with customers]. Many of the major online shopping websites indicated a 37 percent increase in sales yesterday over last year's Cyber Monday.
On Deadline dispels the hype on both counts: [Cyber Monday: Don't Believe the Hype?], indicating that Black Friday is not the peak shopping day for bricks-and-mortar shops, and that Cyber Monday is not the busiest online shopping day of the year, either.
A flood of new video and other Web content could overwhelm the Internet by 2010 unless backbone providers invest up to US$137 billion in new capacity, more than double what service providers plan to invest, according to the study, by Nemertes Research Group, an independent analysis firm. In North America alone, backbone investments of $42 billion to $55 billion will be needed in the next three to five years to keep up with demand, Nemertes said.
Internet users will create 161 exabytes of new data this year, and this exaflood is a positive development for Internet users and businesses, IIA says.
If the "161 Exabytes" figure sounds familiar, it is probably from the IDC Whitepaper [The Expanding Digital Universe] that estimated the 161 Exabytes created, captured or replicated in 2006 will increase six-fold to 988 Exabytes by the year 2010. This is not just video captured for YouTube by internet users, but also corporate data captured by employees, and all of the many replicated copies. The IDC whitepaper was based on the University of California, Berkeley's earlier, often-cited 2003 [How Much Info?] study, which not only looked at magnetic storage (disk and tape), but also optical, film, print, and transmissions over the air like TV and Radio.
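The six-fold IDC projection can be turned into an implied yearly growth rate with a little arithmetic. The Python sketch below uses only the two IDC figures quoted above; the function name is my own, and it assumes smooth compounding over the four years from 2006 to 2010.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that takes `start` to `end` in `years` years."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must all be positive")
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    # 161 exabytes in 2006 growing to 988 exabytes in 2010, per the IDC estimate.
    rate = implied_annual_growth(161, 988, 2010 - 2006)
    print("growth factor:", round(988 / 161, 2))
    print("implied annual growth:", round(rate * 100, 1), "percent")
```

Under that assumption, the projection works out to roughly 57 percent growth per year.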
A key difference was that while UC Berkeley focused on newly created information, the IDC study focused on digitized versions of this information, and included the added impact of replication. It is not unusual for a large corporate database to be replicated many times over. This is done for business continuity, disaster recovery, decision support systems, data mining, application testing, and IT administrator training. Companies often also make two or three copies of backups or archives on tape or optical media, to store them in separate locations.
Likewise, it should be no surprise that internet companies maintain multiple copies of data to improve performance. How fast a search engine can deliver a list of matches can be a competitive advantage. Content providers may offer the same information translated into several languages. Many people replicate their personal and corporate email onto their local hard drives, to improve access performance, as well as to work offline.
The big question is whether we can assume that an increased amount of information created, captured and replicated will have a direct linear relation to the growth of what is transmitted over the internet. Three fourths of U.S. internet users watched an average of 158 minutes of online video in May 2007. Is this also expected to grow six-fold by 2010? That would be nearly sixteen hours a month at current video densities, or more likely the same 158 minutes of much higher quality video.
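As a quick check on that estimate, here is the conversion as a tiny Python sketch. The function name is hypothetical, and the six-fold factor is simply borrowed from the IDC storage projection, which may or may not carry over to viewing time.

```python
def scaled_viewing_hours(minutes_per_month: float, growth_factor: float) -> float:
    """Monthly viewing time in hours if minutes watched grew by the given factor."""
    if minutes_per_month < 0 or growth_factor < 0:
        raise ValueError("inputs must be non-negative")
    return minutes_per_month * growth_factor / 60


if __name__ == "__main__":
    # 158 minutes per month in May 2007, scaled six-fold.
    print(scaled_viewing_hours(158, 6), "hours per month")
```

Six times 158 minutes is 948 minutes, or 15.8 hours a month.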
On the other hand, much of what is transmitted is never stored, or stored for only very short periods of time. Some of these transmissions are live broadcasts: you are either there to watch and listen to them when they happen, or you are not. Online video games are a good example. The internet can be used to allow multiple players to participate in real time, but much of this is never stored long-term. An interesting feature of the Xbox 360 is to allow you to replay "highlight" videos of the game just played, but I do not know if these can be stored away or transferred to longer term storage.
Of course, there will always be people who will save whatever they can get their hands on. Wired Magazine has an article [Downloading Is a Packrat's Dream], explaining that many [traditional packrats] are now also "digital packrats", and this might account for some of this growth. If you think you might be a digital packrat, Zen Habits offers a [3-step Cure].
In any case, the trends for both increased storage demand, and increased transmission bandwidth requirements, are definitely being felt. Hopefully, the infrastructure required will be there when needed.
Continuing this week's theme on Enterprise Applications, I will talk about [SAP] today.
The history of SAP is fascinating. Back in 1972, five IBMers noticed that IBM wasn't leveraging its internal accounting/inventory software package. They asked if they could buy the rights to it, leave IBM to form their own company to fix it up, and sell it as their own. Since IBM had decided not to be in the enterprise applications business any longer, they approved. These guys renamed the software to "Realtime Data Processing/1" or just R/1 for short, and formed Systemanalyse und Programmentwicklung AG. In 2005, they renamed this to Systeme, Anwendungen, Produkte in der Datenverarbeitung AG, which is German for "Systems, Applications, and Products in Data Processing, Inc.", with SAP AG as the preferred abbreviation (the AG here is just the German version of "Inc.").
R/1 became R/2, then R/3, and today is called SAP ERP for the SAP Business Suite, although many still call it R/3. Other popular Business Suite components include Customer Relationship Management (CRM), Product Lifecycle Management (PLM), Supply Chain Management (SCM), and Supplier Relationship Management (SRM). The architecture has evolved in this time frame, separating out the application components from a base platform product line called NetWeaver, similar to IBM WebSphere Application Server (WAS). Other ISVs or in-house developers can build their applications directly onto the NetWeaver base platform, creating a form of eco-system of software applications.
Today, SAP is the fourth largest software company (behind Microsoft, IBM and Oracle), employing over 42,000 people worldwide, and is considered the leading global vendor of Enterprise Application software, generating over $14 billion in revenue each year.
SAP runs on all of IBM's major operating systems and server platforms, so it makes sense for IBM to continue its strong ties to SAP. Together, we formed the IBM SAP International Competency Center [ISICC], in Walldorf, Germany, where SAP has its headquarters. I have been to Germany and visited with the folks from the ISICC. Of my 17 U.S. patents, several were for a feature called z/OS DFSMShsm "Fast Replication" that was requested by SAP at one of these meetings. This feature takes advantage of IBM System Storage DS8000 FlashCopy to make instantaneous backups of an SAP environment built on DB2 for z/OS databases. For more details read the [IBM Redbook: Fast Replication].
The #1 UNIX platform for SAP is IBM's AIX operating system that runs on System p servers. Some of our customers create a [Composite Application] by having the SAP front-end application server run on AIX, and use z/OS to host the SAP DB2 databases. This allows you to combine DFSMShsm Fast Replication on System z with the number-crunching power of the System p server.
What's most exciting to me about SAP is that for every dollar spent on IT hardware to support an SAP application, 60% is for storage, and 40% for servers. Therefore, buying both from IBM is simpler and easier than shopping for these separately.
IBM Tivoli Storage Manager (TSM) was the first product to certify to SAP's BC-BRS interface for copy/mirror/backup/restore. IBM provides additional support with TSM for SAP, TSM for Databases, and TSM for Advanced Copy Services (ACS). TSM for ACS supports the use of FlashCopy on SVC, DS8000, DS6000, and ESS, as well as SnapShot on the IBM System Storage N series.
IBM's recent push into the Archive and Compliance space offers [IBM CommonStore for SAP], which acts as an "archive file manager" between your SAP application and your archive repository, such as the IBM System Storage DR550, DR550 Express, N series, and tape.
What's Next: SMB and SaaS
Since SAP has saturated the market for medium and large size businesses, IBM is now focused on helping the SMB customer base. The majority of these are expected to deploy SAP on x86 platforms running Linux or Windows. For smaller companies, SAP has their "Business All-in-One" for companies with 100-500 users, and "Business One" for companies with fewer than 100 users. Note: not every employee may need to use SAP, so larger companies may have only a subset of their employees actually using the SAP system and find these smaller offerings a good fit.
Nicholas Carr on his Rough Type blog writes: [has SAP unleashed a cannibal?], referring to SAP's new "Business ByDesign" Software as a Service (SaaS) offering to compete against the Salesforce.com business model. Rather than installing and maintaining the SAP software yourself, you instead pay SAP on a per-user/per-month basis to use their systems remotely. The reference to cannibalism comes from the IT slang "eat your own children", the notion that IT companies may introduce a new offering that eats away at future sales of the existing product set.
For more information on IBM's support of SAP enterprise applications, check out this [IBM and SAP website].
Well, it's Tuesday, which means it's time to look at recent announcements. While I was on vacation last week, IBM made a lot of storage announcements on October 23. Josh Krischer gives his summary on WikiBon [October 2007 Review]. Austin Modine of The Register went so far as to say that [IBM goes crazy with storage system updates].
IBM System Storage DS8000 series
This is "Release 3" software/microcode upgrades on our existing "Turbo" hardware.
IBM FlashCopy SE -- Here "SE" stands for Space Efficient. Rather than allocating a full 100% of the space for the FlashCopy destination, you can set aside just a fraction, and this will hold all the changed blocks, similar to what IBM already offers on the DS4000 series.
Dynamic Volume Expansion -- In the past, if you needed more space for a LUN, you had to carve out a new one elsewhere, and then copy the data over from the old to the new, leaving the old LUN around to be re-used or left stranded. With this enhancement, you can just expand the LUN in place, making it bigger as needed, similar to what IBM already offers on the DS4000 series and SAN Volume Controller. This applies to CKD volumes for the System z mainframe users out there as well.
Storage Pool Striping -- striping volumes across RAID ranks to eliminate or reduce hot-spots, and provide better load balancing. Many used SAN Volume Controller in front of the DS8000 to do this, but now you can do it natively in the DS8000 itself.
z/OS Global Mirror Multiple Reader -- for System z customers, "z/OS Global Mirror" is the new name for XRC. This enhancement improves the throughput of sending updates to the remote disaster recovery location.
DS Storage Manager enhancements -- the element manager software has been enhanced, and is pre-installed on the new IBM System Storage Productivity Center, which I will talk about below.
Intermix of DS8000 machine types -- this is especially useful to allow new frames to have co-terminating warranties with the base units. In other words, as you expand your system, you can ensure that the entire chunk of iron runs out of warranty all at the same time, to simplify your decision making process to upgrade or contract for extended service.
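To illustrate the general idea behind Storage Pool Striping mentioned in the list above, here is a minimal Python sketch that places the extents of a volume across ranks in round-robin order. This is only an illustration of striping in general, not the DS8000's actual allocation algorithm; the function name and the rank labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def stripe_extents(num_extents: int, ranks: Sequence[str], start: int = 0) -> List[str]:
    """Assign each extent of a volume to a rank, rotating through the ranks in order."""
    if num_extents < 0:
        raise ValueError("num_extents must be non-negative")
    if not ranks:
        raise ValueError("at least one rank is required")
    return [ranks[(start + i) % len(ranks)] for i in range(num_extents)]


if __name__ == "__main__":
    ranks = ["R0", "R1", "R2", "R3"]  # hypothetical rank names
    layout = stripe_extents(10, ranks)
    print(layout)
    # Extents per rank differ by at most one, so no single rank becomes a hot spot.
    print(Counter(layout))
```

Because consecutive extents land on different ranks, I/O against a busy volume is spread over all the ranks in the pool instead of piling onto one.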
One of the biggest complaints about IBM TotalStorage Productivity Center is that it is software that needs to be installed on its own server, and that this installation process can take a day or two. Why wait? Now you can have a hardware console that has the DS8000 Storage Manager software, SVC Admin Console software, and IBM TotalStorage Productivity Center "Basic Edition" pre-installed. Here are the key features.
Pre-installed and tested console
DS8000 R3 GUI integration
Cohabitation of SVC 4.2.1 GUI and CIMOM
Automated device discovery
Asset and capacity reporting, including tape library support
Our "Release 9" applies across the board, from N3000 to N5000 to N7000 series models, including new host bus adapters, and the new Data OnTAP 7.2.4 release level.
The Virtual File Manager (VFM) was announced as one of our latest [Storage Virtualization Solutions]. VFM provides a global namespace that aggregates the file systems from Linux, UNIX, and Windows file servers, as well as N series storage, into a consolidated environment.
IBM's virtual tape library (VTL) for the distributed systems platform has been enhanced to provide:
Up to 12TB of disk cache, using 750GB SATA disk.
F05 Tape Frames installed as TS7520 base units through a 32 port fibre channel switch
Support for LTO generation 4 tape drives, both as virtual tape drives and as physical tape drives within IBM automated tape libraries attached to the TS7520. This allows you to use Encryption capabilities of LTO4.
DS3000 series now supports SATA disk, and can be attached to AIX and Linux on System p servers. This applies to the DS3200, DS3300 and DS3400 models. See the [DS3000 Announcement Letter] for more details.
I have arrived safely in Las Vegas for the IBM System Storage and Storage Networking Symposium. This event is held once every year. The gold sponsors were: Brocade, Cisco, Finisar, Servergraph, and VMware. Our silver sponsor was Qlogic.
I presented IBM's System Storage strategy and an overview of our product line. For those who missed it, our strategy is focused on helping customers in four key areas:
Optimize IT - to simplify and automate your IT operations and optimize performance and functionality, through server/storage synergies, storage virtualization, and integrated storage infrastructure management.
Leverage Information - to enable a single view of trusted business information through data sharing, and to get the most value from information through Information Lifecycle Management (ILM).
Mitigate Risk - to comply with security and regulatory requirements, and keep your business running with a complete set of business continuity solutions. IBM offers a range of non-erasable, non-rewriteable storage, encryption on disk and tape, and support for IT Infrastructure Library (ITIL) service management disciplines.
Enable Business Flexibility - to provide scalable solutions and protect your IT investment through the use of open industry standards like Storage Networking Industry Association (SNIA) Storage Management Initiative Specification (SMI-S). IBM offers scalability in three dimensions: Scale-up, Scale-out, and Scale-within.
IBM has a broad storage portfolio, in seven offering categories:
Disk Systems, including our SAN Volume Controller, DS family, and N series.
Tape Systems, including tape drives, libraries and virtualization.
Storage Networking, a complete set of switches, directors and routers
Infrastructure Management, featuring the IBM TotalStorage Productivity Center software
Business Continuity, advanced copy services and the software to manage them
Lifecycle and Retention, our non-erasable, non-rewriteable storage including DR550, N series with SnapLock, and WORM tape support, Grid Archive Manager and our Grid Medical Archive Solution (GMAS)
Storage Services, everything from consulting, design and deployment to outsourcing and hosting.
I could talk all day on this, but given that the room was packed, every seat taken and the rest of the audience standing along the walls, I had to keep it down to one hour.
SAN Volume Controller Overview
I presented an overview of the IBM System Storage SAN Volume Controller (SVC), IBM's flagship disk virtualization product. Rather than giving a long laundry list of features and benefits, I focused on the five that matter most:
Reduces the cost and complexity of managing storage, especially for mixed storage environments
Simplifies Business Continuity through non-disruptive data migration and advanced copy services
Improves storage utilization, getting more value from the storage hardware you already have
Enhances personnel productivity, empowering storage administrators to get their job done
Delivers high availability and performance
SAN Volume Controller - Customer Success Stories
A good part of this conference is presented by non-IBMers, including Business Partners and clients sharing their experiences. In this session, we had two speakers share their experiences with SVC.
David Snyder keeps over 80 web sites online and available. His digital media technologies team uses SVC to make their storage administration easier, and ensure high availability for web site content creation and publishing.
Mark Prybylski manages storage at his company, a financial bank. His storage management team uses SVC Global Mirror, which provides asynchronous disk mirroring between different types of disk, as part of their Business Continuity/Disaster Recovery plan.
The last session I attended was "Storage .. to Optimize your ECM deployments" by Jerry Bower, now working for IBM as part of our recent acquisition of the Filenet company. ECM stands for Enterprise Content Management, and IBM is the market leader in this space. Jerry gave a great overview of IBM Content Manager software suite, our newly acquired Filenet portfolio, and the storage supported.
After the sessions, there was a reception at the Solution Center with dozens of exhibitor booths. For example, Optica Technologies had their PRIZM products, which are able to connect FICON servers to ESCON storage devices.
I am back at "the Office" for a single day today. This happens often enough I need a name for it. Air Force pilots who practice landings and take-offs call them "Touch and Go", but I think I need something better. If you can think of a better phrase, let me know.
This week, I was in Hartford, CT, Somers, NY and our Corporate Headquarters in Armonk, in a variety of meetings, some with editors of magazines, others with IBMers I have only spoken to over the phone and finally got a chance to meet face to face.
I got back to Tucson last night, had meetings this morning in Second Life, then presented "Information Lifecycle Management" in Spanish to a group of customers from Mexico, Chile, and Brazil. We have a great Tucson Executive Briefing Center, and plenty of foreign-language speakers to draw from our local employees here at the lab site.
Sunday, I leave for Las Vegas for our upcoming IBM Storage and Storage Networking Symposium. We will cover the latest in our disk, tape, storage networking and related software. Do you have your tickets? If you plan to attend, and want to meet up with me, let me know.
Stephen over at RupturedMonkey discusses the challenges of recruiting storage administrators:
There has been a Storage Admin job advertised for many months but no one wants it. Why? It's offering VERY good money but the word has got around the company has poor management practices and most people don't last for more than 6 months. So, with the shortage of good SAN people, good money and conditions, what can that company do to recruit someone? ...
This leads me to the thought that has anyone ever thought about the standards that storage administrators should follow? Can an employer look up a web site to find questions to ask prospective employees? More often than not, they are recruiting because the previous one left so how can companies know what they are getting.
There is actually a great standard called Information Technology Infrastructure Library (ITIL) that applies not just to storage administrators, but other IT personnel such as network administrators and server administrators. Here's a quick web-site about ITIL History:
ITIL History can be traced back to the late 1980’s when the British government determined that the level of IT service quality provided to them was not sufficient enough. The Central Computer and Telecommunications Agency (CCTA), now called the Office of Government Commerce (OGC), was tasked with developing a framework for efficient and financially responsible use of IT resources within the British government and the private sector.
The goal was to develop an approach that would be vendor-independent and applicable to organizations with differing technical and business needs. This resulted in the creation of the ITIL.
This standard spread from the UK to other governments in Europe, and is now being adopted worldwide by government agencies, non-profit organizations and commercial enterprises. IBM, of course, has been involved along the way, encouraging this set of best practices to take hold.
ITIL provides a common vocabulary that puts everyone in the IT industry on the same page, with the ultimate goal of helping companies run their IT organizations more efficiently.
ITIL provides recommendations, or best practices, for managing the way IT provides services to the rest of the organization, in the same way you would the rest of your business, with a defined set of processes.
While ITIL does a great job of describing what needs to be done, it doesn’t describe how to get it done. It doesn’t tell you how to take those best practices and implement them with real-life tools and technology. It’s not prescriptive.
The general process is now referred to as "IT Service Management", and the seven ITIL books are managed by the IT Service Management forum (ITSMf).
ITIL is vendor-independent. You can learn ITIL disciplines at one IT shop, and carry those skills with you when you go to another IT shop that has completely different gear. A common vocabulary would allow employers to post jobs in a consistent manner, and ask questions to those interviewing for the job. You can be ITIL-trained, and even ITIL-certified. IBM offers this training.
Of course, specific skills on how to use specific software to configure storage devices, request change control approvals, or define SAN zones, are useful, but often can be picked up on the job, reading the vendor manuals on the specifics. Of course, you can use IBM TotalStorage Productivity Center, which would allow someone to manage a variety of disk, tape and SAN fabric gear from one interface, greatly reducing the learning curve.
Use more efficient disk media, such as high-capacity SATA disk drives
Both are great recommendations, but why limit yourself to what EMC offers? Your x86-based machines are only a subset of your servers, and disk is only a subset of your storage. IBM takes a more holistic approach, looking at the entire data center.
VMware is a great product, and IBM is its top reseller. But in addition to VMware, there are other solutions for the x86-based servers, like Xen and Microsoft Virtual Server. IBM's System p, System i, and System z product lines all support logical partitioning.
To compare the energy effectiveness of server virtualization, consider a metric that can apply across platforms. For example, for an e-mail server, consider watts per mailbox. If you have, say, 15,000 users, you can calculate how many watts you are consuming to manage their mailboxes on your current environment, and compare that with running them on VMware, or logical partitions on other servers. Some people find it surprising that it is often more cost-effective, and power-efficient, to run workloads on mainframe logical partitions (LPARs) than a stack of x86 servers running VMware.
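The watts-per-mailbox comparison can be sketched in a few lines of Python. This is only an illustration of the metric: the server counts and per-server wattages below are hypothetical placeholders, not measurements of any real configuration, and only the 15,000-user figure comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    """One candidate way of hosting the e-mail workload."""
    name: str
    servers: int             # number of physical machines (hypothetical)
    watts_per_server: float  # average draw per machine (hypothetical)


def watts_per_mailbox(platform: Platform, mailboxes: int) -> float:
    """Total power draw divided by the number of mailboxes served."""
    if mailboxes <= 0:
        raise ValueError("mailboxes must be positive")
    return platform.servers * platform.watts_per_server / mailboxes


if __name__ == "__main__":
    MAILBOXES = 15_000  # the user count used in the text
    candidates = [
        # All figures below are made up purely to show the calculation.
        Platform("standalone x86 servers", servers=30, watts_per_server=400.0),
        Platform("x86 servers with VMware", servers=8, watts_per_server=550.0),
    ]
    for p in candidates:
        print(f"{p.name}: {watts_per_mailbox(p, MAILBOXES):.3f} W per mailbox")
```

Plug in measured wattage for your own environment, including any mainframe LPAR option you want to compare, and the same function gives a like-for-like number across platforms.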
More efficient Media
SATA and FATA disks support higher capacities, and run at slower RPM speeds, thus using fewer watts per terabyte. A terabyte stored on 73GB high-speed 15K RPM drives consumes more watts than the same terabyte stored using 500GB SATA. Chuck correctly identifies that tape is more power-efficient than disk, but then argues that paper is more power-efficient than tape. But paper is not necessarily more efficient than tape.
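To see why drive capacity dominates the watts-per-terabyte figure, here is a rough Python sketch. The drive capacities (73GB and 500GB) come from the text; the per-drive wattages are assumptions chosen only for illustration, and the calculation ignores RAID overhead, spares, and controller power.

```python
import math


def drives_needed(capacity_gb: float, drive_gb: float) -> int:
    """Whole drives required to hold capacity_gb (no RAID overhead)."""
    return math.ceil(capacity_gb / drive_gb)


def watts_per_terabyte(drive_gb: float, watts_per_drive: float,
                       capacity_gb: float = 1000.0) -> float:
    """Power drawn by the drives holding capacity_gb, per terabyte stored."""
    total_watts = drives_needed(capacity_gb, drive_gb) * watts_per_drive
    return total_watts / (capacity_gb / 1000.0)


if __name__ == "__main__":
    # Wattages are hypothetical; substitute the figures from your drive specs.
    fast = watts_per_terabyte(drive_gb=73, watts_per_drive=15.0)
    sata = watts_per_terabyte(drive_gb=500, watts_per_drive=10.0)
    print(f"73GB 15K RPM drives: {drives_needed(1000, 73)} drives, {fast:.0f} W/TB")
    print(f"500GB SATA drives:   {drives_needed(1000, 500)} drives, {sata:.0f} W/TB")
```

Even if the two drive types drew identical power, the 73GB drives would still need seven times as many spindles to hold the same terabyte.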
ESG analyst Steve Duplessie divides up data between Dynamic vs. Persistent. The best place to put dynamic data is on disk, and here is where evaluation of FC/SAS versus SATA/FATA comes into play. Persistent data, on the other hand, can be stored on paper, microfiche, optical or tape media. All of these shelf-resident media consume no electricity, nor generate any heat that would require additional cooling.
A study by scientists at the Lawrence Berkeley National Laboratory titled High-Tech Means High-Efficiency: The Business Case for Energy Management in High-Tech Industries indicates that data centers consume 15 to 100 times more energy per square foot than traditional office space. Storing persistent data in traditional office space can save a huge amount of energy. Steve Duplessie feels the ratio of dynamic to persistent data is 1:10 today, but is likely to grow to 1:100 in the near future, making energy-efficient storage of persistent data ever more important to our environment.
Data centers consume nearly 5000 Megawatts in the USA alone, 14000 Megawatts worldwide. To put that in perspective, Hungary, where I was last week, can generate up to 8000 Megawatts for the entire country (and they were using 7400 Megawatts last week as a result of their current heat wave, causing them grave concern).
Back in the 1990's, one of the insurance companies IBM worked with kept data on paper in manila folders, and armies of young adults on roller skates were dispatched throughout the large warehouses of shelves to get the appropriate folder in response to customer service inquiries. Digitizing this paper into electronic format greatly reduced the need for this amount of warehouse space, as well as improved the time to retrieve the data.
A typical file storage box (12 inch x 12 inch x 18 inch) containing typed pages single-spaced, double-sided, 12 point font could hold perhaps 100MB. The same box could hold a hundred or more LTO or 3592 tape cartridges, each storing hundreds of GB of information. That's a million-to-one improvement of space-efficiency, and from a watts-per-TB basis, translates to substantial improvement in standard office air conditioning and lighting conditions.
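The million-to-one figure is easy to check with back-of-the-envelope arithmetic. The sketch below uses the 100MB-per-box and hundred-cartridges-per-box numbers from the text; the per-cartridge capacities are assumptions (800GB is roughly the native capacity of an LTO generation 4 cartridge, and 1000GB is a round number for comparison).

```python
def space_efficiency_ratio(cartridge_gb: float,
                           cartridges_per_box: int = 100,
                           paper_mb_per_box: float = 100.0) -> float:
    """How many times more data a box of tape holds than a box of paper."""
    tape_mb_per_box = cartridges_per_box * cartridge_gb * 1000.0  # GB -> MB
    return tape_mb_per_box / paper_mb_per_box


if __name__ == "__main__":
    for gb in (800.0, 1000.0):  # assumed cartridge capacities
        ratio = space_efficiency_ratio(gb)
        print(f"{gb:.0f} GB cartridges: about {ratio:,.0f} to 1")
```

With these assumptions the ratio lands between several hundred thousand and a million to one, so "a million-to-one" is the right order of magnitude.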
To learn more about IBM's Project Big Green, watch this introductory video which used Second Life for the animation.
Back in the late 1980's and early 1990's, I was one of the architects for DFSMS on z/OS, and customers always asked, "What is the clip level?", in other words, how big does a customer have to be to take advantage of DFSMS. We worked it out that if you had more than 100GB of disk data, DFSMS is worthwhile. DFSMS is now just standard by default, as everyone now easily has more than 100GB of data.
Later, in the late 1990's, I worked on Linux for System z. Again, customers asked how many Linux guest images would justify deploying applications on a mainframe. We worked it out to about 10 images. 10 Linux logical partitions, or Linux guests under z/VM was enough to cost justify the entire investment.
So what is the "clip level" for SANs? How many servers does an SMB need to have to justify deploying a SAN? IBM announced the new BladeCenter S designed specifically for mid-sized companies, 100 to 1000 employees, typically running 25 to 45 servers. However, I suspect companies as small as 7-10 servers would probably benefit from deploying an FC or IP SAN.
What do you think? Send me a comment on how many servers should be the clip level.
The results are finally in. IBMer Wolfgang Singer was awarded "Top Speaker" award for his NAS and iSCSI tutorial at last year's Orlando 2006 conference. Here he is receiving the award from SNIA Executive Director Leo Leger.
Of course, NAS and iSCSI technologies have been around for a while, but they are still new for many customers, which is why tutorials like this are so important.
The movie industry is slowly making the conversion to digital.
For about 25 years, movies were silent, actors acted, text was shown on the screen, and an organ or piano player added the musical score. My mother was a concert pianist, so I grew up listening to all kinds of piano music. Last weekend, while I was in Chicago for St. Patrick's Day, we watched and listened to the dueling pianos at a bar called "Howl at the Moon". Those not familiar with this art form can watch this 1-minute video of Star Wars re-imagined as a Silent Movie.
About 80 years ago, "talkies" appeared. The sound was converted to a series of colors that were recorded as a separate strip on the film media itself, hence the name "soundtrack". When the movie ran, the colors would then be converted back to voice and music. While the live piano players were out of jobs, the move to sound created a whole new industry for foley artists, orchestras and composers. InformationWeek's Mitch Wagner explains in Something Will Be Lost that great artists like Charlie Chaplin and Mary Pickford never completely made the transition to talkies.
Now the movie industry is changing again, this time from film to digital format. Thanks to digital, we can now see videos on the internet, such as this set of Impressive Palindromes parody of a Bob Dylan song.
While movies are digital when you rent them from the DVD store, download them on iTunes, or play them on YouTube, they are still mostly in analog format on 35mm or 70mm film stock when you see them on the big screen.
My first "digital projection" experience was the movie "Ice Age" shown in Denver, Colorado. The theatre owner came out to show us what film stock looks like, and then how small the DVD was that held the digital version. The theatre also showed previews of other movies first on film, then in digital, so that we could see the difference in quality. My second experience was "Star Wars: Attack of the Clones (episode II)", which I saw opening night at the Ziegfeld theatre in New York City. This was a huge theatre, and we had front row seats in the upper balcony.
Of course, the transition of film stock to digital projection is just one of the many trends resulting in the fast growth of computer IT storage. Documents transitioned from paper, to being scanned into digital format, to being created digitally using word processing software. Likewise, photographs went from film, to being scanned, to being captured with digital cameras.
As with talkies, history repeats itself; the transition to digital projection is not going smoothly. NPR's Laura Sydell reports that Digital Projection in Theaters Slowed by Dispute. The dispute is between movie production companies and theatre owners. Currently, it is quite expensive to send out film stock to all the theatres, so the transition to digital will save the movie production companies lots of money. On the other hand, installing digital projection equipment will be costly for theatre owners. How the two groups will share the burdensome costs to convert this infrastructure is still under negotiation.
As a fan of going to the movies, I hope they resolve this dispute soon.
Wrapping up my week in China, I read an article by Li Xing in the local "China Daily" about energy efficiency in buildings. She argues that it is not enough for a building to be energy-efficient on its own, but you have to consider the impact of the other buildings around. Does it reflect the sun so harshly into neighboring windows that people are forced to put up blinds and use artificial light? Does it block the sun, so that rooms that previously could be used with natural sunlight must now be artificially lit?
A similar effect happens with power and cooling in the data center. Servers and storage systems generate heat, and that heat affects all the other equipment in the data center. IBM has the most power-efficient and heat-efficient servers and storage, but that is not enough. You have to consider the heat generated by all systems that might raise overall temperature.
Research has indicated that water can remove far more heat per unit volume than air. For example, to dissipate 1,000 watts with a 10 degree temperature difference, only 24 gallons of water per hour are needed, while removing the same heat with air would require nearly 11,475 cubic feet. IBM's Rear Door Heat eXchanger helps keep growing datacenters at safe temperatures, without adding AC units. The unobtrusive solution brings more cooling capacity to areas where heat is the greatest -- around racks of servers with more powerful and multiple processors.
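The water-versus-air comparison follows from the basic heat equation Q = m * c * dT. The Python sketch below reproduces the order of magnitude of the figures quoted above. It assumes the 10 degree difference is in Celsius, and uses textbook values for specific heat and density near room temperature; the exact results shift a little with those assumptions, so treat this as a sanity check rather than an engineering calculation.

```python
# Textbook physical constants near room temperature (approximate).
WATER_SPECIFIC_HEAT = 4186.0   # J/(kg*K)
WATER_DENSITY = 1000.0         # kg/m^3
AIR_SPECIFIC_HEAT = 1005.0     # J/(kg*K)
AIR_DENSITY = 1.2              # kg/m^3

LITERS_PER_US_GALLON = 3.78541
CUBIC_FEET_PER_CUBIC_METER = 35.3147


def volume_per_hour_m3(watts: float, delta_t_kelvin: float,
                       specific_heat: float, density: float) -> float:
    """Coolant volume per hour needed to carry away `watts` of heat.

    Uses Q = m * c * dT, so m = Q / (c * dT), then converts mass to volume.
    """
    energy_joules = watts * 3600.0  # one hour of heat at this power
    mass_kg = energy_joules / (specific_heat * delta_t_kelvin)
    return mass_kg / density


if __name__ == "__main__":
    water_m3 = volume_per_hour_m3(1000.0, 10.0, WATER_SPECIFIC_HEAT, WATER_DENSITY)
    air_m3 = volume_per_hour_m3(1000.0, 10.0, AIR_SPECIFIC_HEAT, AIR_DENSITY)
    print(f"water: {water_m3 * 1000.0 / LITERS_PER_US_GALLON:.1f} gallons per hour")
    print(f"air:   {air_m3 * CUBIC_FEET_PER_CUBIC_METER:,.0f} cubic feet per hour")
```

Under these assumptions the results come out in the low twenties of gallons for water and a little over ten thousand cubic feet for air, close to the quoted figures.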
The CoolBlue portfolio of IBM innovations includes comprehensive hardware and systems-management tools for computing environments, enabling clients to better optimize the power consumption, management and cooling of infrastructure at the system, rack and datacenter levels. The CoolBlue portfolio includes IBM PowerConfigurator, PowerExecutive, and Rear Door Heat eXchanger.
The eXchanger works on standard 42U racks, and can help clients deal with the rapid growth of rack-mounted servers and storage on their raised floor. How cool is that!
I have created blog categories, based on our System Storage offering matrix, which you can track individually:
Disk systems, including the IBM System Storage DS Family of products, SAN Volume Controller, N series, as well as features unique to these products, such as FlashCopy, MetroMirror, or SnapLock.
Tape systems, including the IBM System Storage TS Family of products, tape-related products in the Virtualization Engine portfolio, drives, libraries and even tape media.
Storage Networking offerings, from Brocade, McData, Cisco and others, such as switches, routers and directors.
Infrastructure management, including IBM TotalStorage Productivity Center software, IBM Tivoli Provisioning Manager, IBM Tivoli Intelligent Orchestrator, and IBM Tivoli Storage Process Manager.
Business Continuity, including IBM Tivoli Storage Manager, Tivoli CDP for Files, Productivity Center for Replication software component, Continuous Availability for Windows (CAW), Continuous Availability for AIX (CAA).
Lifecycle and Retention offerings, including our IBM System Storage DR550, DR550 Express, GPFS, Tivoli Storage Manager Space Management for UNIX, Tivoli Storage Manager HSM for Windows, and DFSMS.
Storage services, including consulting, assessments, design, deployment, management and outsourcing.
Before we started, we asked the first survey question: "How is storage planning conducted in your shop?" Of the various responses, nearly four out of ten responded "Part of an overall IT infrastructure strategy".
Jon Toigo went first, and spent 20 minutes or so laying out the problem as he sees it. Jon travels all over visiting customers struggling with their storage infrastructures, so he gets to hear a lot of this first hand.
I then spent 20 minutes or so presenting IBM's vision, strategy and offerings to help solve these problems. I could speak for hours on this topic, but we kept it short for this one-hour webcast. To learn more, request a visit to the Tucson Executive Briefing Center.
At the end of my talk, we put out the second survey, asking the audience "What is your number one priority with respect to storage operations today?" Over one fourth of the attendees were focused on reducing storage infrastructure cost of ownership by any means possible.
I am glad we saved the last 15 minutes for Q&A, as there were a lot of questions.
The replay is now available. If you attended the event and want to hear it again, or want to share it with your colleagues, or you missed it and want to hear it, then [Register for the Replay].
To make true advances in any industry or field requires forward thinking—as well as industry insight and experience. It can't be done just by packaging a bag of piece parts and putting a new label on it. But forward thinkers are putting smarter, more powerful technology to uses that were once unimaginable -- either in scale or in progress.
The graphics developed for the IBM Smarter Planet vision are interesting. This one for Infrastructure includes images relating to public utilities, like gas, water and electricity, clouds representing cloud computing, green forests representing the need for energy efficiency and reducing carbon footprint to fight global warming, roads, representing the intricate transportation and traffic systems, highways and city streets that connect us all together, and a printed circuit board, representing the Information Technology that makes all of this possible.
Ironically, I didn't even know I made the final cut until I got three, yes three, separate requests for interviews about it. I already reached the "million hits" milestone. Other people track these things for me, so it will be interesting how much additional traffic my latest [15 minutes of fame] will generate.
Infrastructure is just one of the 25 different areas that IBM's vision for a Smarter Planet is trying to address, including the need for smarter buildings, smarter cities, smarter transportation systems, smarter energy grids, smarter healthcare and public safety, and smarter governments.
This week I am at the Data Center Conference 2009 in Las Vegas. There are some 1700 people registered this year for this conference, representing a variety of industries like Public sector, Services, Finance, Healthcare and Manufacturing. A survey of the attendees found:
55 percent are at this conference for the first time.
18 percent once before, like me
15 percent two or three times before
12 percent four or more times before
Plans for 2010 IT budgets were split evenly, one third planning to spend more, one third planning to spend about the same, and the final third looking to cut their IT budgets even further than in 2009. The biggest challenges were Power/Cooling/Floorspace issues, aligning IT with Business goals, and modernizing applications. The top three areas of IT spend will be for Data Center facilities, modernizing infrastructure, and storage.
There are six keynote sessions scheduled, and 66 breakout sessions for the week. A "Hot Topic" was added on "Why the marketplace prefers one-stop shopping" which plays to the strengths of IT supermarkets like IBM, encourages HP to acquire EDS and 3Com, and forces specialty shops like Cisco and EMC to form alliances.
Day 2 began with a series of keynote sessions. Normally when I see "IO" or "I/O", I immediately think of input/output, but here "I&O" refers to Infrastructure and Operations.
Business Sensitivity Analysis leads to better I&O Solutions
The analyst gave examples from Alan Greenspan's biography to emphasize his point that what this financial meltdown has caused is a decline in trust. Nobody trusts anyone else. This is true between people, companies, and entire countries. While the GDP declined 2 percent in 2009 worldwide, it is expected to grow 2 percent in 2010, with some emerging markets expected to grow faster, such as India (7 percent) and China (10 percent). Industries like Healthcare, Utilities and Public sector are expected to lead the IT spend by 2011.
While IT spend is expected to grow only 1 to 5 percent in 2010, there is a significant shift from Capital Expenditures (CapEx) to Operational Expenses (OpEx). In 2004, OpEx represented only 64 percent of the IT budget, but today it represents 76 percent and growing. Many companies are keeping their aging IT hardware longer in service, beyond traditional depreciation schedules. The analyst estimated over 1 million servers were kept longer than planned in 2009, and another 2 million will be kept longer in 2010.
An example of hardware kept too long was the November 17 delay of some 2000 flights in the United States, caused by a failed router card in Utah that was part of the air traffic control system. Modernizing this system is estimated to cost $40 billion US dollars.
Top 10 priorities for the CIO were Virtualization, Cloud Computing, Business Intelligence (BI), Networking, Web 2.0, ERP applications, Security, Data Management, Mobile, and Collaboration. There is a growth in context-aware computing, connecting operational technologies with sensors and monitors to feed back into IT, with an opportunity for pattern-based strategy. Borrowing a concept from the military, "OpTempo" allows a CIO to speed up or slow down various projects as needed. By seeking out patterns, developing models to understand those patterns, and then adapting the business to fit those patterns, a strategy can be developed to address new opportunities.
Infrastructure and Operations: Charting the course for the coming decade
This analyst felt that strategies should not just be focused on looking forward, but also look left and right, what IBM calls "adjacent spaces". He covered a variety of hot topics:
65 percent of the energy used running x86 servers is doing nothing. The average x86 server runs at only 7 to 12 percent CPU utilization.
Virtualization of servers, networks and storage is transforming IT to become one big logical system image, which plays well with Green IT initiatives. He joked that this is what IBM offered 20 years ago with Mainframe "Single System Image" sysplexes, and that we have come around full circle.
One area of virtualization is desktop images (VDI). This goes back to the benefits of green-screen 3270 terminals of the mainframe era, eliminating the headaches of managing thousands of PCs, and instead having thin clients rely heavily on centralized services.
The deluge in data continues, as more convenient access drives demand for more data. The analyst estimates storage capacity will increase 650 percent over the next five years, with over 80 percent of this unstructured data. Automated storage tiering, a la Hierarchical Storage Manager (HSM) from the mainframe era, is once again popular, along with new technologies like thin provisioning and data deduplication.
IT is also being asked to do complex resource tracking, such as power consumption. In the past IT and Facilities were separate budgets, but that is beginning to change.
The fastest growing social network was Twitter, with 1382 percent growth in 2009, and 69 percent of the new users that joined this year were 39 to 51 years old. By comparison, Facebook only grew by 249 percent. Social media is a big factor both inside and outside a company, and management should be aware of what Tweets, Blogs, and others in the collective are saying about you and your company.
The average 18 to 25 year old sends out 4000 text messages per month. In 24 hours, more text messages are sent out than there are people on the planet (6.7 billion). Unified Communications is also getting attention. This is the idea that all forms of communication, from email to texts to voice over IP (VoIP), can be managed centrally.
Smart phones and other mobile devices are changing the way people view laptops. Many business tasks can be handled by these smaller devices.
It costs more in energy to run an x86 server for three years than it costs to buy it. The idea of blade servers and componentization can help address that.
Mashups and Portals are an unrecognized opportunity. An example of a Mashup is mapping a list of real estate listings to Google Maps so that you can see all the listings arranged geographically.
Lastly, Cloud Computing will change the way people deliver IT services. Amusingly, the conference was playing "Both Sides Now" by Joni Mitchell, which has the [lyrics about clouds]
Unlike other conferences that clump all the keynotes at the beginning, this one spreads the "Keynote" sessions out across several days, so I will cover the rest over separate posts.
Eventually, there comes a time to drop support for older, outdated programs that don't meet the latest standards. I had several readers complain that they could not read my last post on Internet Explorer 6. The post reads fine on more modern browsers like Firefox 3 and even Google's Chrome browser, but not IE6.
Google confirms that warnings are appearing:
[Official: YouTube to stop IE6 support].
My choice is to either stop embedding YouTube videos, some of which are created by my own marketing team specifically on my behalf, or drop support for IE6. I choose the latter. If you are still using IE6, please consider switching to Firefox 3 or Google Chrome instead.
Over on his Backup Blog, fellow blogger Scott Waterhouse from EMC has a post titled [Backup Sucks: Reason #38]. Here is an excerpt:
Unfortunately, we have not been able to successfully leverage economies of scale in the world of backup and recovery. If it costs you $5 to backup a given amount of data, it probably costs you $50 to back up 10 times that amount of data, and $500 to back up 100 times that amount of data.
If anybody can figure out how to get costs down to $40 for 10 times the amount of data, and $300 for 100 times the amount of data, they will have an irrefutable advantage over anybody that has not been able to leverage economies of scale.
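To make the arithmetic in the excerpt concrete, here is a small Python sketch that compares the linear pricing Scott describes with the discounted targets he mentions. The dollar figures come straight from the excerpt; the function and variable names are illustrative only.

```python
# Cost figures from the excerpt: data multiple -> cost in dollars.
linear_costs = {1: 5, 10: 50, 100: 500}   # no economies of scale
target_costs = {1: 5, 10: 40, 100: 300}   # the hoped-for scaled pricing

def cost_per_unit(costs):
    """Return the cost per unit of data at each scale."""
    return {scale: cost / scale for scale, cost in costs.items()}

def savings_percent(baseline, improved):
    """Percentage saved at each scale relative to the baseline."""
    return {s: 100 * (baseline[s] - improved[s]) / baseline[s] for s in baseline}

linear_unit = cost_per_unit(linear_costs)
target_unit = cost_per_unit(target_costs)
savings = savings_percent(linear_costs, target_costs)

for scale in sorted(linear_costs):
    print(f"{scale:>3}x data: linear ${linear_unit[scale]:.2f}/unit, "
          f"target ${target_unit[scale]:.2f}/unit, saving {savings[scale]:.0f}%")
```

With linear pricing the cost per unit never moves, which is exactly the complaint: there is no reward for backing up more data.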
I suspect that where Scott mentions "we" in the above excerpt, he is referring to EMC in general, with products like Legato. Fortunately, IBM has scalable backup solutions, using either a hardware approach or one based purely on software.
The hardware approach involves using deduplication hardware technology as the storage pool for IBM Tivoli Storage Manager (TSM). Using this approach, IBM Tivoli Storage Manager would receive data from dozens, hundreds or even thousands of client nodes, and the backup copies would be sent to an IBM TS7650 ProtecTIER data deduplication appliance, IBM TS7650G gateway, or IBM N series with A-SIS. In most cases, companies have standardized on the operating systems and applications used on these nodes, and multiple copies of data reside across employee laptops. As a result, as you have more nodes backing up, you are able to achieve benefits of scale.
Perhaps your budget isn't big enough to handle new hardware purchases at this time, in this economy. Have no fear, IBM also offers deduplication built right into the IBM Tivoli Storage Manager v6 software itself. You can use a sequential-access disk storage pool for this. TSM scans and identifies duplicate chunks of data in the backup copies, as well as in archive and HSM data, and reclaims the space when duplicates are found.
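For readers who have not seen chunk-level deduplication before, here is a minimal Python sketch of the general idea. It is not how TSM or ProtecTIER is implemented: the fixed 4-byte chunk size, the SHA-256 index and all the names are assumptions chosen to keep the example tiny. It does include a byte-for-byte comparison before a chunk is treated as a duplicate, so a hash collision could never silently lose data.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed chunk size, chosen only to keep the demo readable

def dedupe(data: bytes, store: dict) -> list:
    """Split data into fixed-size chunks, keep each unique chunk once,
    and return the list of digests needed to rebuild the data."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        existing = store.get(digest)
        if existing is None:
            store[digest] = chunk      # first time this chunk is seen
        elif existing != chunk:
            # Byte-for-byte check guards against a hash collision.
            raise ValueError("hash collision detected")
        recipe.append(digest)
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its recipe of chunk digests."""
    return b"".join(store[digest] for digest in recipe)

store = {}
backup_a = b"AAAABBBBCCCCAAAA"   # pretend backup from one client node
backup_b = b"AAAABBBBDDDD"       # pretend backup from another node
recipe_a = dedupe(backup_a, store)
recipe_b = dedupe(backup_b, store)

logical = len(backup_a) + len(backup_b)          # bytes the clients sent
physical = sum(len(c) for c in store.values())   # bytes actually kept
print(f"logical {logical} bytes, physical {physical} bytes")
```

The more nodes that back up similar data, the more chunks are already in the store, which is where the benefit of scale comes from.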
If your company is using a backup software product that doesn't scale well, perhaps now is a good time to switch over to IBM Tivoli Storage Manager. TSM is perhaps the most scalable backup software product in the marketplace, giving IBM an "irrefutable advantage" over the competition.
Continuing my week in Chicago, for the IBM Storage Symposium 2008, I attended several sessions intended to answer the questions of the audience.
In an effort to be cute, the System x team have a "Meet the xPerts" session at their System x and BladeCenter Technical Conference, so the storage side decided to do the same. Traditionally, these have been called "Birds of a Feather", "Q&A Panel", or "Free-for-All". They allow anyone to throw out a question, and have the experts in the room, whether from IBM, a Business Partner or another client, answer the question from their experience.
Meet the Experts - Storage for z/OS environments
Here were some of the questions answered:
I've seen terms like "z/OS", "zSeries" and "System z" used interchangeably, can you help clarify what this particular session is about?
IBM's current mainframe servers are all named "System z", such as our System z9 or System z10. These replace the older zSeries models of hardware. z/OS is one of the six operating systems that run on this hardware platform. The other five are z/VM, z/VSE, z/TPF, Linux and OpenSolaris. The focus of this session will be storage attached and used for z/OS specifically, including discussions of Omegamon and DFSMS software products.
What can we do to reduce our MIPS-based software licensing costs from our third party vendors?
Consider using the IBM System z Integrated Information Processor (zIIP).
What about 8 Gbps FICON?
IBM has already announced [FICON Express8] host bus adapter (HBA) cards that will auto-negotiate to 4Gbps and 2Gbps speeds. If you don't need full 8Gbps speed now, you can still get the Express8 cards, but fit 4/2/1 Gbps SFP ports instead. Currently, LongWave (LW) is only supported to 4km at 8Gbps speed.
I want to use Global Mirror for my DS8100 to my remote DS8100, but also make test copies of my production data to an older ESS 800 I have locally. Any suggestions? Yes, consider using FlashCopy to simplify this process.
I have Global Mirror (GM) running now successfully with DSCLI, and now want to deploy IBM Tivoli Storage Productivity Center for Replication. Is that possible? Yes, Productivity Center for Replication will detect existing GM relationships, and start managing them.
I have already deployed HyperPAV and zHPF, is there any value in getting Solid-State Drives as well?
HyperPAV and zHPF impact CONN time, but SSD impacts DISC time, so they are mutually complementary.
How should I size my FlashCopy SE pool? SE refers to "Space Efficient", which stores only the changes between the source and destination copies of each LUN or CKD volume involved. The general recommendation is to start with 20 percent and adjust accordingly.
How many RAID ranks should I configure per DS8000 extent pool? IBM recommends 4 to 8 ranks per pool.
Meet the Experts: Storage for Linux, UNIX and Windows distributed systems
This session was focused on storage systems attached to distributed servers, as well as products from Tivoli used to manage them. Here were some of the questions answered:
When we migrated from Tivoli Storage Manager v5 to v6, we lost our favorite "Operational Reporting" tool. How can we get TOR back? You now get the new Tivoli Common Reporting tool.
How can we identify appropriate port distribution for multiple SVC node pairs for load balancing?
IBM Tivoli Storage Productivity Center v4.1 has hot-spot analysis with recommendations for Vdisk migrations.
We tried TotalStorage Productivity Center way back when, but the frequent upgrades were killing us. How has it been lately? It has been much more stable since v3.3, and completely renamed to Tivoli Storage Productivity Center to avoid association with versions 1 and 2 of the predecessor product. The new "lightweight agents" feature of v4.1 resolves many of the problems you were experiencing.
We have over 1600 SVC virtual disks, how do we handle this in IBM Tivoli Storage Productivity Center? Use the Filter capability in combination with clever naming conventions for your virtual disks.
How can we be clever when we are limited to only 15 characters? Ok. We understand.
We are currently using an SSPC with Windows 2003 and 2GB memory, but we are only using the Productivity Center for Replication feature of it. Can we move the DB2 database over to a Windows 2008 server with 4GB of memory?
Consider using the IBM Tivoli Storage Productivity Center for Replication software instead of SSPC for special circumstances like this.
We love the XIV GUI, how soon will all other IBM storage products have it also? As with every acquisition, IBM evaluates if there are technologies from new products that can be carried back to existing products.
We are currently using 12 ports on our existing XIV, and love it so much we plan to buy a second frame, but are concerned about consuming another 12 ports on our SAN switch. Any suggestions? Yes, use only six ports per frame. Just because you have more ports, doesn't mean you are required to use them.
We have heard there are concerns from the legal community about using deduplication technology, any ideas how to address that?
Nobody here in the room is a lawyer, and you should consult legal counsel for any particular situation.
None of the IBM offerings intended for non-erasable, non-rewriteable (NENR) data retention records (DR550, WORM tape, N series SnapLock) support dedupe today, and none of IBM's deduplication offerings (TS7650, N series A-SIS, TSM) make any fit-for-purpose claims for regulatory compliance storage. However, be assured that all of IBM's dedupe technology involves byte-for-byte comparisons so that you never lose any data due to false hash collisions. For all IBM compliance storage, what you write will be read back in the correct sequence of ones and zeros.
Continuing my week in Chicago, I decided to attend some of the presentations from the System x side. This is the advantage of running both conferences in the same hotel: attendees can choose how many sessions of each they want to participate in.
Wayne Wigley, IBM Advanced Technical Support (ATS), gave a series of presentations on different server virtualization offerings available for System x and BladeCenter servers. I am very familiar with virtualization implemented on System z mainframes, as well as IBM's POWER systems, and have working knowledge of Linux KVM and Xen, so I was well prepared to hear the latest about Microsoft's Hyper-V and VMware's vSphere version 4.
Microsoft Hyper-V 2008
Hyper-V can run as part of Windows 2008, or standalone on its own. Different levels of Windows 2008 include licenses for different numbers of Windows virtual machines (VMs). Windows Server 2008 Standard includes 1 Windows VM, Enterprise includes 4 Windows VMs, and the Datacenter edition includes an unlimited number of Windows VMs. If you want to run more Windows VMs than come included, you need to pay extra for each additional one. For example, to run 10 Windows VMs on a 2-socket server would cost about $9000 US dollars on Standard but only $6000 US dollars on Datacenter edition (list prices from Microsoft Web site).
Unlike VMware, which takes a monolithic approach as hypervisor, Hyper-V is more like Xen with a microkernelized approach. This means you need a "parent" guest OS image, and the rest of the Guest OS images are then considered "child" images. These child images can be various levels of Windows, from Windows XP Pro to Windows Server 2008, Xen-enabled Linux, or even a non-hypervisor-aware OS. The "parent" guest OS image provides networking and storage I/O services to these "child" images. For the hypervisor-aware versions of Windows and Linux, Hyper-V allows optimized access to the hypervisor, "synthetic devices", and hypercalls. Synthetic devices present themselves as network devices, but only serve to pass data along the VMBus to other networking resources. This process does not require software emulation, and therefore offers higher performance for virtual machines and lower host system overhead. For non-hypervisor-aware OS images, Hyper-V provides device emulation through the "parent" image, which is slower.
Microsoft System Center Virtual Machine Manager (SCVMM) can manage both Hyper-V and VMware VI3 images. Wayne showed various screen shots of the GUI available to manage Hyper-V images. In standalone mode, you lose the nice GUI and management console.
Hyper-V supports external, internal and private virtual LANs (VLAN). External means that VMs can communicate with the outside world over standard ethernet connections. Internal means that VMs can communicate with "parent" and "child" guest images on the same server only. Private means that only "child" guests can communicate with other "child" images.
Hyper-V supports disk attached via IDE, SATA, SCSI, SAS, FC, iSCSI, NFS and CIFS. One mode is "Virtual Hard Disk" (VHD), similar to VMware VMDK files. The other is "pass through" mode, which uses actual disk LUNs accessed natively. VHDs can be dynamic (thin provisioned), fixed (fully allocated), or differencing. The concept of differencing is interesting, as you start with a base read-only VHD volume image, and have a separate "delta" file that contains changes from the base image.
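The differencing idea is easy to model. The following Python sketch is a toy illustration only, not the VHD file format: the class name, the block-list representation and the sample contents are all made up for the example. Reads check the delta first and fall back to the base image, while writes only ever touch the delta.

```python
class DifferencingDisk:
    """Toy model: a read-only base image plus a delta of changed blocks."""

    def __init__(self, base_blocks):
        self._base = tuple(base_blocks)   # a tuple, so the base is never modified
        self._delta = {}                  # block number -> changed contents

    def read(self, block_no):
        # Prefer the delta; fall back to the base image.
        return self._delta.get(block_no, self._base[block_no])

    def write(self, block_no, data):
        if not 0 <= block_no < len(self._base):
            raise IndexError("block outside the disk")
        self._delta[block_no] = data      # the base image is left untouched

    def changed_blocks(self):
        return sorted(self._delta)

base = ["boot", "os", "apps", "empty"]    # one shared base image
vm1 = DifferencingDisk(base)
vm2 = DifferencingDisk(base)
vm1.write(3, "vm1-data")
vm2.write(2, "patched-apps")

print(vm1.read(3), vm2.read(3))   # each VM sees only its own changes
print(vm1.changed_blocks(), vm2.changed_blocks())
```

Several virtual machines can share one base image this way, each paying only for the blocks it has changed.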
Some of the key features of Hyper-V 2008 are:
Being able to run concurrently 32-bit and 64-bit versions of Linux and Windows guest images
Support for 64 GB of memory and 4-way symmetric multiprocessing (SMP) per VM
Clustering for High Availability and Quick Migration of VM images
Live backup with integration with Microsoft's Volume Shadow Copy Services (VSS)
Virtual LAN (VLAN) support, and Virtual and Pass-through physical disk support
A clever VMBus, virtual service parent/client approach to sharing hardware
Optimized performance options for hypervisor-aware versions of Windows and Linux, and emulated support for non-hypervisor-aware OS images.
VMware vSphere v4.0
This was titled as an "Overview" session, but really was an "Update" session on the newest features of this release. The big change appears to be that VMware added "v" in front of everything.
Under vCompute, there are some new features in VMware's Distributed Resource Scheduler (DRS), which include recommended VM migrations. Distributed Power Management (DPM) will move VMs during periods of low usage to consolidate onto fewer physical servers so as to reduce energy consumption.
Under vStorage, vSphere introduces an enhanced Pluggable Storage Architecture (PSA), with support for Storage Array Type Plugins (SATP) and Path Selection Plugins (PSP). This vStorage API allows for third party plugins for improved fault-tolerance and complex I/O load balancing algorithms. This release also has improved support for iSCSI, including Challenge-Handshake Authentication Protocol (CHAP) support. Similar to Hyper-V's dynamic VHD, VMware supports "thin provisioning" for their virtual disk VMDK files. A feature of "Storage VMotion" allows conversion between "thick" and "thin" provisioning formats.
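To show what "thick" versus "thin" means in practice, here is a small Python sketch. It is a conceptual model only and has nothing to do with the actual VMDK format; the class, the 1 MB block size and the 100-block disk are assumptions for illustration. A thick disk reserves all of its blocks up front, while a thin disk allocates a block only when it is first written.

```python
BLOCK_MB = 1  # size of one block in MB, arbitrary for this demo

class VirtualDisk:
    """Toy virtual disk that tracks which blocks have backing storage."""

    def __init__(self, size_blocks, thin):
        self.size_blocks = size_blocks
        self.thin = thin
        # Thick: all blocks are allocated up front. Thin: none are yet.
        self.allocated = set() if thin else set(range(size_blocks))

    def write(self, block_no):
        if not 0 <= block_no < self.size_blocks:
            raise IndexError("write beyond the provisioned size")
        self.allocated.add(block_no)   # a thin disk allocates on first write

    def provisioned_mb(self):
        return self.size_blocks * BLOCK_MB

    def consumed_mb(self):
        return len(self.allocated) * BLOCK_MB

thick = VirtualDisk(size_blocks=100, thin=False)
thin = VirtualDisk(size_blocks=100, thin=True)
for block in range(10):        # the guest writes only 10 of its 100 blocks
    thick.write(block)
    thin.write(block)

print("thick:", thick.provisioned_mb(), "MB provisioned,", thick.consumed_mb(), "MB consumed")
print("thin: ", thin.provisioned_mb(), "MB provisioned,", thin.consumed_mb(), "MB consumed")
```

Both guests believe they have a 100 MB disk, but only the thick one ties up 100 MB of real storage from day one.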
The vStorage API for Data Protection provides all the features of VMware Consolidated Backup (VCB). The API provides full, incremental and differential file-level backups for Windows and Linux guests, including support for snapshots and Volume Shadow Copy Services (VSS) quiescing.
VMware introduces direct I/O pass-through for both NIC and HBA devices. While this allows direct access to SAN-attached LUNs similar to Hyper-V, you lose a lot of features like VMotion, High Availability and Fault Tolerance. Wayne felt that these restrictions are temporary, and that hopefully VMware will resolve this over the next 12 months.
Under vNetwork, VMware has virtual LAN switches called vSwitches. This includes support for IPv6 and VLAN offloading.
The vSphere server can now run with up to 1TB of RAM and 64 logical CPUs to support up to 320 VM guest images. Each VM can have up to 255GB RAM and up to 8-way SMP. vSphere ESX 4 introduces a new virtual hardware platform called VM Hardware v7. While vSphere 4.0 can run VMs from ESX 2 and ESX 3, the problem is if you have new VMs based on this newer VM Hardware v7, you cannot run them on older ESX versions.
vSphere comes in four sizes: Standard, Advanced, Enterprise, and Enterprise Plus, ranging in list price from $795 US dollars to $3495 US dollars.
While IBM is the #1 reseller of VMware, we also are proud to support Hyper-V, Xen, KVM and other similar products. Analysts expect most companies will have two or more server virtualization solutions in their data center, and it is good to see that IBM supports them all.
Continuing my week in Chicago for the IBM Storage and Storage Networking Symposium and System x and BladeCenter Technical Conference, I presented a variety of topics.
Hybrid Storage for a Green Data Center
The cost of power and cooling has risen to be a #1 concern among data centers. I presented the following hybrid storage solutions that combine disk with tape. These provide the best of both worlds, the high performance access time of disk with the lower costs and reduced energy consumption of tape.
IBM [System Storage DR550] - IBM's Non-erasable, Non-rewriteable (NENR) storage for archive and compliance data retention
IBM Grid Medical Archive Solution [GMAS] - IBM's multi-site grid storage for PACS applications and electronic medical records [EMR]
IBM Scale-out File Services [SoFS] - IBM's scalable NAS solution that combines a global name space with a clustered GPFS file system, serving as the ideal basis for IBM's own [Cloud Computing and Storage] offerings
Not only do these help reduce energy costs, they provide an overall lower total cost of ownership (TCO) than traditional WORM optical or disk-only storage configurations.
The Convergence of Networks - Understanding SAN, NAS and iSCSI in the Data Center Network
This turned out to be my most popular session. Many companies are at a crossroads in choosing data and storage networking solutions in light of recent announcements from IBM and others. In the span of 75 minutes, I covered:
Block storage concepts, storage virtualization and RAID levels
File system concepts, how file systems map files to block storage
Network Attached Storage, the history of the NFS and CIFS protocols, Pros and Cons of using NAS
Storage Area Networks, the history of SAN protocols including ESCON, FICON and FCP, Pros and Cons of using SAN
IP SAN technologies, iSCSI and Fibre Channel over Ethernet (FCoE), Pros and Cons of using this approach
Network Convergence with Infiniband and Fibre Channel over Convergence Enhanced Ethernet (FCoCEE), why Infiniband was not adopted historically in the marketplace as a storage protocol, and the features and enhancements of Convergence Enhanced Ethernet (CEE) needed to merge NAS, SAN and iSCSI traffic onto a single converged data center network [DCN]
Yes, it was a lot of information to cover, but I managed to get it done on time.
IBM Tivoli Storage Productivity Center version 4.1 Overview and Update
In conferences like these, there are two types of product-level presentations. An "Overview" explains how products work today to those who are not familiar with them. An "Update" explains what's new in this version of the product for those who are already familiar with previous releases. I decided to combine these into one session for IBM's new version of [Tivoli Storage Productivity Center]. I was one of the original lead architects of this product many years ago, and was able to share many personal experiences about its evolution in development and in the field at client facilities. Analysts have repeatedly rated IBM Productivity Center as one of the top Storage Resource Management (SRM) tools available in the marketplace.
Information Lifecycle Management (ILM) Overview
Can you believe I have been doing ILM since 1986? I was the lead architect for DFSMS which provides ILM support for z/OS mainframes. In 2003-2005, I spent 18 months in the field performing ILM assessments for clients, and now there are dozens of IBM practitioners in Global Technology Services and STG Lab Services that do this full time. This is a topic I cover frequently at the IBM Executive Briefing Center [EBC], because it addresses several top business challenges:
Reducing costs and simplifying management
Improving efficiency of personnel and application workloads
Managing risks and regulatory compliance
IBM has a solution based on five "entry points". The advantage of this approach is that it allows our consultants to craft the right solution to meet the specific requirements of each client situation. These entry points are:
Tiered Information Infrastructure - we don't limit ourselves to just "Tiered Storage" as storage is only part of a complete [information infrastructure] of servers, networks and storage
Storage Optimization and Virtualization - including virtual disk, virtual tape and virtual file solutions
Process Enhancement and Automation - an important part of ILM are the policies and procedures, such as IT Infrastructure Library [ITIL] best practices
Archive and Retention - space management and data retention solutions for email, database and file systems
I did not get as many attendees as I had hoped for this last one, as I was competing head-to-head in the same time slot as Lee La Frese covering IBM's DS8000 performance with Solid State Disk (SSD) drives, John Sing covering Cloud Computing and Storage with SoFS, and Eric Kern covering IBM Cloudburst.
I am glad that I was able to make all of my presentations at the beginning of the week, so that I can then sit back and enjoy the rest of the sessions as a pure attendee.
This week I am in Chicago for the IBM Storage and Storage Networking Symposium, which coincides with the System x and BladeCenter Technical Conference. This allows the 800 attendees to attend both storage and server presentations at their convenience. There were hundreds of sessions, over 20 time slots, so for each time slot, you have 15 or so topics to choose from. Mike Kuhn kicked off the series of keynote sessions. Here's my quick recap of each one:
Curtis Tearte, General Manager, IBM System Storage
Curtis replaced Andy Monshaw as General Manager for IBM System Storage. His presentation focused on how storage fits into IBM's Dynamic Infrastructure strategy. Some interesting points:
a billion camera-enabled cell phones were sold in 2007, compared to 450 million in 2006.
IBM expects that there will be 2 billion internet users by 2011, as well as trillions of "things".
In the US, there were 2.2 million medical pharmacy dispensing errors resulting from handwritten prescriptions.
Time wasted looking for parking spaces in Los Angeles consumed 47,000 gallons of gasoline, and generated 730 tons of carbon dioxide.
In the US, 4.2 billion hours are lost, and 2.9 billion gallons of gas consumed, due to traffic congestion.
Over the past decade, servers went from 8 watts to 100 watts per $1000 US dollars.
Data growth appears immune to the economic recession. The digital footprint per person is expected to grow from 1TB today to over 15TB by 2020.
10 hours of YouTube videos are uploaded every minute.
Bank of China manages 380 million bank accounts, processing over 10,000 transactions per second.
At the end of the session, Curtis transitioned from demonstrating his knowledge and passion of storage to his knowledge and passion in his favorite sport: baseball. Chicago is home to both the Cubs and the White Sox.
Roland Hagan, Vice President Business Line Executive, System x
IBM sets the infrastructure agenda for the entire industry. The Dynamic Infrastructure initiative is not just IT, but a complete end-to-end view across all of the infrastructures in play, including transportation, manufacturing, services and facilities. Companies spent over $60 billion US dollars on servers last year. Of this, 53 percent was for x86-based servers, 9 percent for Itanium-based, 26 percent for RISC-based (POWER6, SPARC, etc.), and 11 percent for mainframes. The economic downturn has impacted revenues, but the percentages continue about the same.
The dominant deployment model remains one application per server. As a result, power, cooling and management costs have grown tremendously. There are system admins opposed to consolidating server images with VMware, Hyper-V, Xen or other server virtualization technologies. Roland referred to these admins as "server huggers". To help clients adopt cloud computing technologies, IBM introduced [Cloudburst] appliances. IBM plans to offer specialized versions for developers, for service providers, and for enterprises.
IBM's Enterprise-X Architecture is what differentiates IBM's x86-based servers from all the competitors, surrounding Intel and AMD processors with technology that provides distinct advantages. For example, to support server virtualization, IBM's eX4 provides support for more memory, which often is more critical than CPU resources when deploying a large number of guest OS images. IBM System x servers have an integrated management module (IMM) and were the first to change over from BIOS to the new Unified Extensible Firmware Interface [UEFI] standard.
IBM servers offer double the performance, consume half the power, and cost a third less to manage, than comparably priced servers from competitors. Of the top 20 most energy-efficient server deployments, 19 are from IBM. Roland cited customer reference SciNet, a 4000-server supercomputer with 30,000 cores based on IBM [iDataPlex] servers. At 350 TeraFLOPs it is ranked #16 fastest supercomputer in the world, and #1 in Canada. With a power usage effectiveness (PUE) less than 1.2, it also is very energy efficient. This means that for every 12 watts of electricity going in to the data center, 10 watts are used for servers, storage and networking gear, and only 2 watts are used for power and cooling. Traditional data centers have PUE around 2.5, consuming 25 watts total for every 10 watts used by servers, storage and networking gear.
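The PUE arithmetic above is simple enough to check in a few lines of Python. This is just a sketch of the ratio using the wattages quoted in the paragraph; the function names are my own.

```python
def pue(total_facility_watts, it_equipment_watts):
    """PUE = total facility power / power delivered to IT equipment."""
    return total_facility_watts / it_equipment_watts

def overhead_watts(total_facility_watts, it_equipment_watts):
    """Power that never reaches servers, storage or networking gear."""
    return total_facility_watts - it_equipment_watts

efficient = pue(12, 10)      # the 12-watts-in, 10-watts-used example
traditional = pue(25, 10)    # the traditional data center example

# For the same 10 watts of IT load, how much less total power is drawn?
reduction = (25 - 12) / 25

print(f"Efficient PUE:   {efficient:.1f}, overhead {overhead_watts(12, 10)} W")
print(f"Traditional PUE: {traditional:.1f}, overhead {overhead_watts(25, 10)} W")
print(f"Total power reduction for the same IT load: {reduction:.0%}")
```

In other words, the traditional data center burns 15 watts of overhead for every 10 watts of useful work, against only 2 watts in the efficient case.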
Clod Barrera, Distinguished Engineer, Chief Technical Strategist for IBM System Storage
Clod presented trends and directions for disk and tape technology, disk and tape systems, and the direction towards cloud computing.
Ideally, every airline would use the most experienced seasoned professional airline pilots money could buy, but some airlines, in an effort to compete on ticket price, may elect instead to have less experienced pilots. Here's a great excerpt:
Airline history lesson 101: It used to be, up until the mid 1980’s, that a young pilot would be hired on at a major carrier, become a flight engineer (FE), and then spend a few years managing the systems of the older-generation airplanes. But he or she was learning all the while. These new “pilots” sat in the FE seat and did their job, all the while observing the “pilots” doing the flying, day in and day out.
The FE’s learned from the seasoned pilots about the real world of flying into the Chicago O’Hares and New York LaGuardias. They learned decision making, delegation, and the reality of “captain’s final authority” as confirmed in the law. When they got the chance to upgrade, they became a copilot. The copilot’s duty was to assist the captain in flying; but even during their time as the new copilot, they had the luxury of the FE looking over their shoulders — i.e., more learning. This three-man-crew concept, now a fond memory in the domestic markets but used predominately in international flying, was considered one more layer of protection. But it’s gone.
To become the public speaker I am today, IBM put me through a variety of speaking classes. I taught high school and college classes to practice in front of groups. But most importantly, I traveled with seasoned colleagues and watched them in action from the front row. I learned how to handle tough questions, how to react to hecklers causing trouble, and how to deal with the unexpected before, during and after each presentation. In addition to speaking skills, I ended up having to learn travel skills, foreign language skills, and a variety of cultural social skills. All part of the job in my line of work.
Likewise, being a storage administrator is an important job, and for some data centers, not something to give lightly to a fresh college graduate. Unless they have had formal IT Infrastructure Library [ITIL] certification coursework, I doubt they would understand the processes and disciplines demanded by the typical data center. I have been to accounts where new hires are not allowed to touch production systems for the first two years. Instead they watch the seasoned professionals do their jobs, and are given only access to "sand box" systems that are used for application testing or Quality Assurance (QA). Sadly, I have also been to other accounts where people with no storage experience whatsoever were tossed into the admin pool and let loose with superuser passwords, all in an effort to save money during times of exponential data growth rates, only to pay the price later with outages or lost data.
The parallels between the airline industry and the IT industry are eerie.
This week, I have been presenting how to do important things without travel. Of course, there are times where you need some boots on the ground, while your support team remains remote.
Last month, fellow co-worker Liz Goodman reached out to me. She was part of a ten-person team that went to Tanzania as part of IBM's [Corporate Service Corps]. Other teams went to Brazil, China, Ghana, Romania, South Africa, The Philippines, Turkey and Vietnam. (I've been to half these other countries, but the closest I have ever been to Tanzania was a safari I took in Kenya that included the Masai Mara national park which runs along the border with Tanzania's Serengeti national park).
Liz was one of the lucky [200 candidates chosen among over 5000 applications] IBM reviews each year for this program. IBM does business in over 170 countries, so learning to work in or with emerging growth markets requires a bit of "cultural intelligence". Liz and three others worked with the University of Dodoma [UDOM] to lead some students in adopting a [Moodle] infrastructure based on the Linux, Apache, PHP and MySQL [LAMP] platform. She noticed that I had experience with both Moodle and LAMP from [my work with OLPC], and reached out to me for help. I was able to provide some insight, things to watch out for, and how to tackle not just the technical challenges, but a few that many don't consider:
Educational content. Digitizing materials already available in hardcopy, or obtaining digital rights to existing content.
Business Process. Getting the teachers and students to adopt new processes and procedures enabled by these new capabilities.
Project Management. Fortunately, Liz is already [PMP-certified], and knows well the importance of managing even a small 4-person, 4-week project like this.
How well did her team do? Liz blogged before, during and after her trip. Read all about it on her blog [Liz Goes To Tanzania]!
Jim Stallings, IBM General Manager for Global Markets, will explain why a smarter planet needs a dynamic infrastructure. I used to work for Jim, when he was in charge of the IBM Linux initiative and I was on the Linux for S/390 mainframe team.
Erich Clementi, IBM Vice President, Strategy & General Manager Enterprise Initiatives, will explain how to best leverage opportunities with cloud computing.
Steve Forbes, Chairman and CEO of Forbes Inc. and Editor-in-Chief of Forbes Magazine, will present Global Outlooks and the Challenge of Change.
Rich Lechner, IBM Vice President, Energy & Environment, will explain the importance of Building an Energy-Efficient Dynamic Infrastructure. I also worked for Rich, back when he was the VP of Marketing for IBM System Storage, and I was back then the "Technical Evangelist". See my post [The Art of Evangelism] to better understand why I don't carry that title anymore.
In addition to these presentations, you will be able to "walk" around to different booths and have on-line chats with subject matter experts and download resources. Don't worry, this is not based on [Second Life], but rather uses the much simpler "On24" visual interface. Of course, you can follow on [Twitter] or join the fan club at [Facebook].
This is a worldwide event, with translated resource materials and on-line subject matter experts in six different languages (English, French, Italian, German, Mandarin and Japanese). Those in North, Central and South Americas can participate June 23, and those in Europe, Asia and the rest of the world on June 24. [Register Today] and mark your calendars!
Spend twenty hours a week running a project for a non-profit.
Teach yourself Java, HTML, Flash, PHP and SQL. Not a little, but mastery. [Clarification: I know you can't become a master programmer of all these in a year. I used the word mastery to distinguish it from 'familiarity' which is what you get from one of those Dummies type books. I would hope you could write code that solves problems, works and is reasonably clear, not that you can program well enough to work for Joel Spolsky. Sorry if I ruffled feathers.]
Volunteer to coach or assistant coach a kids sports team.
Start, run and grow an online community.
Give a speech a week to local organizations.
Write a regular newsletter or blog about an industry you care about.
Learn a foreign language fluently.
Write three detailed business plans for projects in the industry you care about.
Self-publish a book.
Run a marathon.
In 2007, 51 percent of graduating college students could find jobs in their field, and this year it has dropped to only 20 percent. If you find yourself with some time on your hands, either recently graduated or recently unemployed, consider volunteerism. Last year, I chose to donate my time and money to an innovative project called "One Laptop per Child" [OLPC]. It was one of my [New Years Resolutions] for 2008. I was actually "recruited" by folks from the OLPC after they read my [series of blog posts] on things that can be done with their now famous green-and-white XO laptop.
The first half of the year, I spent helping "Open Learning Exchange Nepal" [OLE Nepal], a non-government organization (NGO) to help education in that country. XO laptops were provided to second and sixth graders at several schools, and my assignment was to help with the school "XS" server. This would be the server that all the laptops connect to. My blog posts on this included:
Rather than [Move to Nepal], I was able to help by building an identical XS server in Tucson and providing support remotely. This included getting the "Mesh Antennas" to be properly recognized, setting up an internet filter using [DansGuardian] software, and working out backup procedures.
For the second half of the year, I was asked to mentor a college student in Hyderabad, India as part of the ["Google Summer of Code"] to develop an [Educational Blogger System] on the XS server. We called it "EduBlog" and based it on the popular [Moodle] educational software platform. This was going to be tested with kids from Uruguay, but sending a server down to this country proved politically challenging, so instead, I [built a server and shipped it] to a co-location facility in Pennsylvania that agreed to donate the cost and expenses needed to run the server there with full internet connection. I acted as "system admin" for the box, connecting remotely via SSH, while Tarun, the college student I was mentoring, developed the EduBlog software. Twice the system was hacked, but I was able to restore the system remotely thanks to a multi-boot configuration that allowed me to reboot to a read-only operating system image and restore the operating system and data.
The students and teachers in Uruguay were helped locally by [Proyecto Ceibal]. We were able to translate the system into Spanish, and the project was a big success, enough to convince local government to provide XO laptops to their students to further the benefits.
This week I am in Minneapolis, MN, so was hoping that the complicated process of moving this blog over to "MyDeveloperWorks" would happen while I was gone, but alas, that does not appear to be the case.
Meanwhile, my partner in crime, Barry Whyte, has moved his blog [Storage Virtualization] successfully over to the new server.
Perhaps next week. If all goes well, the URL links should redirect correctly, but those of you using feed readers might need to re-subscribe to get the right RSS feeds.
Continuing my blog coverage of the [Forrester IT Forum 2009 conference], I will group a bunch of topics related to Cloud Computing into one post. Cloud Computing was a big topic here at the IT Forum, and probably was also in the other two conferences IBM participated in this week in Las Vegas:
The CIOs and IT professionals at this Forrester IT Forum seemed to be IT decision makers with a broader view. There was a lot of interest in Cloud Computing. What is Cloud Computing? Basically, it is renting IT capability on an as-needed basis from a computing service provider. The different levels of cloud computing depend on what the computing service provider actually provides. How do these compare with traditional co-location facilities or your own in-house on-premises computing? Here's my handy-dandy quick-reference guide:
Cloud Software-as-a-Service [SaaS], Examples: SalesForce and Google Apps.
Cloud Infrastructure-as-a-Service [IaaS], such as Amazon EC2, RackSpace.
Traditional Co-Location facility, you park your equipment on rented floorspace, power, cooling and bandwidth.
Traditional On-Premises, what most people do today, build or buy your own data center, buy the hardware, write or buy the software, then install and manage it.
A main tent session had a moderated Q&A panel of three Forrester Analysts titled "Saving, Making and Risking Cash with Cloud Computing." Here are some key points from this panel:
Is Cloud Computing just another tool in the IT toolbox, or does it represent a revolution? The panel gave arguments for both. As a set of technology, protocols and standards, it is an evolutionary progression of other standards already in place, and an extension of methods used in co-location and time-share facilities. However, from a business model perspective, Cloud Computing represents a revolutionary trend, eliminating in some cases huge up-front capital expenses and/or long-term outsourcing contracts. PaaS and IaaS offerings can be rented by the hour, for example.
An example of using Cloud Computing for a one-time batch job: The New York Times decided to build an archive of 11 million articles, but this meant having to convert them all from TIFF to PDF format. The IT person they put in charge of this rented 100 machines on [Amazon Elastic Compute Cloud (EC2)] for 24 hours and was able to convert all 4TB of data for only $240 US dollars.
Cloud Computing can make it easier for companies to share information with clients, suppliers and business partners, eliminating the need to punch holes through firewalls to provide access.
Since it is relatively cheap for companies to try out different cloud computing offerings with little or no capital investment, the spaghetti model applies--"throw it on the wall, and see what sticks!"
What application areas should you consider running in the cloud? Employee self-service portals-Yes, ERP-Mixed, One-time batch jobs-Mixed, Email-Yes, Access Control-No, Web 2.0-Mixed, Testing/QA-Mixed, Back Office Transactions-No, Disaster Recovery-Mixed.
Different IT roles will see varying benefits and risks with cloud computing. However, by 2011, every new IT project must answer the question "Why not run in the cloud?"
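The New York Times example above implies a per-machine-hour price that is easy to back out from the figures quoted. Here is a minimal sketch in Python; the function names are my own, and the only inputs are the numbers from the panel (100 machines, 24 hours, 240 US dollars, 11 million articles).

```python
def implied_hourly_rate(total_cost_usd, machines, hours):
    """Cost per machine-hour implied by a flat rental bill."""
    machine_hours = machines * hours
    return total_cost_usd / machine_hours


def per_machine_share(total_items, machines):
    """Items each machine handles if the work is split evenly (an assumption)."""
    return total_items / machines


# Figures quoted by the panel for the TIFF-to-PDF conversion job.
rate = implied_hourly_rate(total_cost_usd=240, machines=100, hours=24)
articles_each = per_machine_share(total_items=11_000_000, machines=100)

print(f"{100 * 24} machine-hours at about ${rate:.2f} each")
print(f"{articles_each:,.0f} articles per machine")
```

The point is less the exact rate than the shape of the bill: a one-time job pays only for the machine-hours it consumes, with no hardware left idle afterwards.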
There were a variety of track sessions that explored different aspects of cloud computing:
Software-as-a-Service: When and Why
This session had three Forrester analysts in a Q&A panel format. SaaS can provide much-needed relief from application support, maintenance and upgrade chores. The choice and depth of offerings from SaaS providers are improving. However, comparing TCO between SaaS and on-premises deployments can yield different results for different use cases. For example, a typical SaaS rate of $100 US dollars per user per month, with discounts, could be $1000 per user per year, or $10,000 per user over a 10-year period. Compare that to the total 10-year costs of an on-premises deployment, and you have a good ball-park comparison. SaaS can provide faster time-to-value, and you can easily try-before-you-buy several alternative offerings before making a decision.
The downside to SaaS is that you need to understand their data center, where it is located, and how it is protected for backup and disaster recovery. Some SaaS providers have only a single data center, so it might be disruptive if it experiences a regional disaster.
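The ball-park TCO comparison from this session can be sketched in a few lines of Python. The SaaS figure of $1000 per user per year comes from the session; the on-premises up-front and running costs and the user count below are purely hypothetical placeholders, so substitute your own numbers.

```python
def saas_cost(per_user_per_year, users, years):
    """Total subscription cost over the period."""
    return per_user_per_year * users * years


def on_prem_cost(upfront, annual_running, years):
    """Up-front purchase plus yearly running costs (labor, power, maintenance)."""
    return upfront + annual_running * years


YEARS = 10
USERS = 200                      # hypothetical user count

saas = saas_cost(per_user_per_year=1000, users=USERS, years=YEARS)
# Hypothetical on-premises figures, for illustration only.
on_prem = on_prem_cost(upfront=500_000, annual_running=120_000, years=YEARS)

print(f"SaaS over {YEARS} years:        ${saas:,}")
print(f"On-premises over {YEARS} years: ${on_prem:,}")
print("Cheaper option:", "SaaS" if saas < on_prem else "on-premises")
```

Changing the user count or the running costs flips the answer quickly, which is exactly why the analysts said the comparison yields different results for different use cases.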
Cloud IT Services: The Next Big Thing or Just Marketing Vapor?
Economic pressures are forcing companies to explore alternatives, and Cloud IT services are providing additional options over traditional outsourcing. Only 70-80 percent of companies are satisfied with traditional outsourcing, so there is opportunity for Cloud IT services to address those not satisfied. Scalable, consumption-based billing with Web-based accessibility and flexibility is an attractive proposition. Ten years ago, you could not buy an hour on a mainframe with your credit card, now you can.
Cloud technologies are mature, and there is interest in using these services. About 10 percent of companies are piloting SaaS offerings, 16 percent piloting PaaS offerings, and 13 percent investing in deploying "private clouds" within their data center. This week Aneesh Chopra, who is Barack Obama's pick as the first CTO for the US Federal Government, [stated to congressional leaders]: “The federal government should be exploring greater use of cloud computing where appropriate.”
IBM is betting heavily on their Cloud Computing strategy, has already gone through the reorganizations needed to be positioned well, and claims to have thousands of clients already. HP has some cloud offerings focused on their enterprise customers. Dell is investing and reorganizing for cloud as well.
Network Strategic Planning for Challenging Times
While not limited to Cloud Computing, companies are seeing WAN traffic doubling every 18 months, but without the corresponding increases in budget to cover it. The Forrester analyst covered WAN optimization management services and hybrid Ethernet-MPLS offerings to help people transition from MPLS VPNs to Carrier-grade Ethernet.
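To put "doubling every 18 months" in perspective, here is a small Python sketch of the compound growth it implies. The 18-month doubling period is the analyst's figure; the function names and the assumption of smooth exponential growth are mine.

```python
def traffic_multiple(months, doubling_period_months=18):
    """Traffic relative to today after the given number of months."""
    return 2 ** (months / doubling_period_months)


def annual_growth_rate(doubling_period_months=18):
    """Equivalent year-over-year growth as a fraction."""
    return traffic_multiple(12, doubling_period_months) - 1


print(f"Equivalent annual growth: {annual_growth_rate():.0%}")
for years in range(1, 6):
    print(f"After {years} year(s): {traffic_multiple(12 * years):.1f}x today's traffic")
```

With a flat budget, the cost per unit of traffic has to fall at the same pace just to stand still, which is why the session focused on optimization and cheaper transport.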
Who should you hire for WAN optimization? Do you trust your own Telco that provides your bandwidth to help you figure out ways to use less of it? Alternatives include System Integrators and Service providers like IBM and EDS. Or, you could try to do it yourself, but this requires capital investment in gear and performance monitoring software.
New workloads like Voice over IP (VoIP) and digital surveillance can help cost-justify upgrading your MPLS VPNs to Carrier-grade Ethernet. Converging this with iSCSI and/or Fibre Channel over Ethernet (FCoE) traffic can help reduce costs as well. Both MPLS and Ethernet will co-exist for a while, and hybrid offerings from Telcos will help ease the transition. In the meantime, switching some workloads to Cloud Computing can provide immediate relief to in-house networks now. Converging voice, video, LAN, WAN and SAN traffic may require the IT departments to reorganize how the IT role of "network administrator" is handled.
Navigating the Myriad New Sourcing Models
The landscape of outsourcing has changed with the introduction of new Cloud Computing offerings. However, adapting these new offerings to internal preferences may prove challenging. The Forrester analyst suggested that attendees be ready to influence their companies to adopt Cloud Computing as a new sourcing option.
Traditional outsourcing just manages your existing hardware and software, often referred to as "Your mess for less!" However, outsourcing contract law is mature and many outsource providers are large, well-established providers. In contrast, some SaaS providers are small, and the few that are large may be fairly new to the outsourcing business. Here are some things to consider:
Where will the data physically be located? There are government regulations, such as the US Patriot Act, that can influence this decision. Many Canadian and European customers are avoiding providers where data is stored in the United States for this reason.
What is the service delivery chain? Some cloud providers in turn use other cloud providers. For example, a SaaS provider might develop the software and then rent the platform it runs on from a PaaS, which in turn might be using offshore or co-location facilities to actually house their equipment. Knowing the service delivery chain may prove important in contract negotiations. Clarify "cloud" terminology and avoid mixed metaphors.
What is their contingency plan? What is your contingency plan if the system is slow or inaccessible? What is their plan to protect against data loss during disasters? What if they go out of business? Source Code Escrow has proven impractical in many cases. SLAs should provide for performance, availability and other key metrics. However, service level penalties are not a cure-all for major disruptions, loss of revenues or reputation.
How will they handle security, compliance and audits? Heavy regulatory requirements may favor using dedicated resources.
Who has "custodianship" of the data? Will you get the data back if you discontinue the contract? If so, what format will it be in, and will it make any sense if you are not running the same application as the cloud provider?
Will they provide transition assistance? Moving from on-premises to cloud may involve some effort, including re-training of end users.
Are the resources shared or dedicated? For shared resource environments, is the capacity "fenced off" in any way to prevent having other clients impact your performance or availability?
I am glad to see so much interest in Cloud Computing. To learn more, here is IBM's [Cloud Computing] landing page.
Forrester analysts kicked off the keynote sessions for Day 1 of the Forrester IT Forum 2009 event. The theme for this conference is "Redefining IT's value to the Enterprise." Rather than focusing on blue-sky futures that are decades away, Forrester wants to present instead a blend of pragmatic information that is actionable now in the next 90 days along with some forward-looking trends.
If you ask CEOs how well their IT operations are doing, 75 percent will say they are doing great. However, if you dig down, and ask how their companies are leveraging IT to help generate revenues, reduce costs, improve employee morale, drive profits, improve customer service, or manage risks, then the percentage drops down to 30 to 35 percent.
What are the root causes of this "perception gap" in value between business and IT? Several ideas come to mind:
Some CEOs still consider IT departments as "cost centers". Rather than exploiting technology to help drive the rest of the business, they are seen as a necessary evil, an extension of the accounting department, for example.
Some CEOs consider IT's role as basically "keeping the lights on". They only notice IT when the lights go out, or other business outages caused by disruptions in IT.
IT departments measure themselves in technology terms, not business terms. CEOs and the rest of the senior management team may not be "tech savvy", and the CIO and IT directors may not be "business savvy", resulting in failure to communicate IT's role and value to the rest of the business.
This conference is focused on CIOs and IT professionals, and how they can bridge the tech/business gap. The first two executive keynote presentations emphasized this point.
Bob Moffat, Senior VP and Group Executive, IBM
Bob Moffat (my fifth-line manager, or if you prefer, my boss's boss's boss's boss's boss) is the Senior VP and Group Executive of IBM's Systems and Technology Group that manufactures storage and other hardware. He presented how IBM is helping our clients deploy smarter solutions. Globalization has changed world business markets, has changed the reach of information technology, and has changed our clients' needs. To support that, IBM is focused on making the world a smarter planet, instrumented with appropriate sensors, interconnected over converging networks, and intelligent to provide visibility, control and automation.
It's time to rethink IT in light of these new developments, to think about IT in client terms, with business metrics. Bob gave several internal and customer examples. Here's one from the City of Stockholm:
Covering nine square miles of Stockholm, Sweden, IBM led [the largest project of its kind] for traffic congestion in Europe. To reduce congestion caused by 300,000 vehicles, the City of Stockholm enacted a "congestion fee" with real-time recognition of license plates and a Web infrastructure to collect payments. The analytics, metrics and incentives have paid off. Since August 2007, traffic has been reduced 18 percent, with a reduction of travel time on inner streets and a 9 percent increase in "green" vehicles.
In addition to smarter traffic, IBM has initiatives for smarter water, smarter energy, smarter healthcare, smarter supply chain, and smarter food supply.
Dave Barnes, Senior VP and CIO, United Parcel Service (UPS)
Dave Barnes must act as the "trusted advisor" to the rest of the senior management team. UPS delivers packages worldwide. They put sensors on all of the vehicles, not just to know how fast they were driving, but also how often they drove in reverse gear, and sensors on the engines to determine maintenance schedules. Analytics found that driving in reverse was the most dangerous, and by providing this information to the drivers themselves, the drivers were able to come up with their own innovative ways to minimize accidents. This is one role of IT, to provide employees the information they need to enable them to be better at their own jobs.
Dave also mentioned the importance of collaborating across business units. Their "Information Technology Steering Committee (ITSC)" has 15 members, of which only three are from the IT department. This helped deploy social media initiatives within UPS. For example, Twitter has been adopted so that senior management can get unfiltered customer feedback. This is perhaps another key role of IT, to flatten an organization from cultural hierarchies that prevent top brass up in the ivory tower from hearing what is going wrong down on the street. Too often, a customer or client complains to the nearest employee, and this may or may not get passed up accurately along the chain of command. Twitter allowed executives to see what was going on for themselves.
Dave also covered the "Best Neighbor" approach. If you were going to build a deck in your back yard, you might ask your neighbors that have already done this, and learn from their experience. Sadly, this does not happen enough in IT. To address this, UPS has a "Tech Governance Group" that focused on business process across the organization. For example, they improved "package flow", reducing 100 million miles in the past few years.
Lastly, he mentioned that many technologists are "loners". They have a few like that, but try to hire techies who look to team across business units instead. Likewise, they try to hire business people who are somewhat tech savvy. For example, they have encouraged business employees to write their own reports, rather than requesting new reports to be developed by the IT department. The end result: the business people get exactly the reports they want, faster than waiting for IT to do it. Another role for IT is to provide end-users the tools to make their own reports.
(Dave didn't mention what tools these were, but it sounded like the Business Intelligence and Reporting Tools [BIRT] that IBM uses.)
These two sessions were a great one-two punch to the audience of 600 CIOs and IT professionals. First, IBM set the groundwork for what needs to be done. Then, UPS showed how they did exactly that, adopting a dynamic infrastructure and getting great results. This is going to be an interesting week!
Recently, IBM and the University of Texas Medical Branch (UTMB) [launched an effort] using IBM's World Community Grid "virtual supercomputer" to allow laboratory tests on drug candidates for drug-resistant influenza strains and new strains, such as H1N1 (aka "swine flu"), in less than a month.
Researchers at the University of Texas Medical Branch will use [World Community Grid] to identify the chemical compounds most likely to stop the spread of the influenza viruses and begin testing these under laboratory conditions. The computational work adds up to thousands of years of computer time which will be compressed into just months using World Community Grid. As many as 10 percent of the drug candidates identified by calculations on World Community Grid are likely to show antiviral activity in the laboratory and move to further testing.
According to the researchers, without access to World Community Grid's virtual super computing power, the search for drug candidates would take a prohibitive amount of time and laboratory testing.
This reminded me of an 18-minute video of Larry Brilliant at the 2006 Technology, Entertainment and Design [TED] conference. Back in 2006, Larry predicted a pandemic in the next three years, and here it is 2009 and we have the H1N1 virus.
His argument was to have "early detection" and "early response" to contain worldwide diseases like this.
A few months after Larry's "call to action" in 2006, IBM and over twenty major worldwide public health institutions, including the World Health Organization [WHO] and the Centers for Disease Control and Prevention [CDC], [announced the Global Pandemic Initiative], a collaborative effort to help stem the spread of infectious diseases.
One might think that with our proximity to Mexico that the first cases would have been the border states, such as Arizona, but instead there were cases as far away as New York and Florida. The NYT explains in an article [Predicting Flu With the Aid of (George) Washington] that two rival universities, Northwestern University and Indiana University, both predicted that there would be about 2500 cases in the United States, based on air traffic control flight patterns, and the tracking data from a Web site called ["Where's George"] which tracks the movement of US dollar bills stamped with the Web site URL.
The estimates were fairly close. According to the Centers for Disease Control and Prevention [H1N1 Flu virus tracking page], there are currently 3009 cases of H1N1 in 45 states, as of this writing.
This is just another example of how an information infrastructure, used properly to provide insight, make predictions, and analyze potential cures, can help the world be a smarter planet. Fortunately, IBM is leading the way.
Wrapping up this week's theme on Cloud Computing, I finish with an IBM announcement for two new products to help clients build private cloud environments from their existing Service Oriented Architecture (SOA) deployments.
IBM WebSphere CloudBurst Appliance -- a new hardware appliance that provides access to software virtual images and patterns that can be used as is or easily customized, and then securely deployed, managed and maintained in a private cloud.
IBM WebSphere Application Server Hypervisor Edition -- a version of IBM WebSphere Application Server software optimized to run in virtualized hardware server environments such as VMware, and preloaded in WebSphere CloudBurst.
With more than 7,000 customer implementations worldwide, IBM is the SOA market leader. Of course, both of these products above can be used with IBM System Storage solutions, including Cloud-Optimized Storage offerings like Grid Medical Archive Solution (GMAS), Grid Access Manager software, Scale-Out File Services (SoFS), and the IBM XIV disk system.
IBM is part of the "Cloud Computing 5" major vendors pushing the envelope (the other four are Google, Microsoft, Amazon and Yahoo). In fact, IBM has a number of initiatives that allow customers to leverage IBM software in a cloud. IBM is working in collaboration with Amazon Web Services (AWS), a subsidiary of Amazon.com, Inc. to make IBM software available in the Amazon Elastic Compute Cloud (Amazon EC2). WebSphere sMash, Informix Dynamic Server, DB2, and WebSphere Portal with Lotus Web Content Management Standard Edition are available today through a "pay as you go" model for both development and production instances. In addition to those products, IBM is also announcing the availability of IBM Mashup Center and Lotus Forms Turbo for development and test use in Amazon EC2, and intends to add WebSphere Application Server and WebSphere eXtreme Scale to these offerings.
For more about IBM's leadership in Cloud Computing, see the IBM [Press Release].
This week's theme is alternative sourcing through Cloud Computing.
I thought I would start off the week interviewing an owner at a Small or Medium-sized Business [SMB] that recently adopted this approach.
Meet Fred, one of the new co-owners of my singles activities club, Tucson Fun and Adventures, known affectionately as [TFA]. TFA recently adopted a new "Software-as-a-Service" [SaaS] for the company's Web site.
While the experience is still fresh in his mind, I thought this would be a good opportunity to illustrate some of the concepts of alternative sourcing through Cloud Computing by using a local example.
Give me some background on the company. How long has it been around? How many employees?
TFA has been in business since 1997, and has six employees, including an office manager, event planners and event coordinators.
How critical is "Web presence" to the business?
It's very important in several ways. First, the TFA staff plans 25 events per month, and our hundreds of members register for these events mostly through the Web site. Second, we have it connected to our bank accounts, so that it can process credit cards to collect the funds for renewals and event registrations. Third, it serves as a way to communicate upcoming events to our members, especially trips, so they can save the date on their own calendars. And fourth, the Web site serves as a "landing page" for all of our radio spots, newspaper ads, and other marketing efforts.
TFA had a Web site before, and now you have helped launch this new Web site. What motivated this change?
Our members were complaining about our 1999-era Web site. The pages were written in HTML, ASP (Active Server Pages) and SQL (Structured Query Language) connected to a Microsoft SQL Server 2005 database. It was mostly text-based, with the only animation being text scrolling horizontally across the screen. The Web hosting provider offered reliable access, but was located in New York state on East Coast time. If a member signed up for an event after 9pm or 10pm here in Tucson, it was marked as the next date, which could change the price of the event, or indicate the deadline was missed. If there were any changes to the pages or logic needed, or new columns required in the database, it got expensive. The TFA employees don't know how to program in ASP or SQL, so we had to hire outside professionals each time.
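(The date problem Fred describes is a classic time zone rollover. Here is a minimal Python sketch of it, using fixed UTC offsets as a simplifying assumption: Tucson stays on UTC-7 all year, while New York is UTC-5 in winter and UTC-4 in summer. The function name and sample dates are my own illustration, not anything from the old TFA site.)

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assumed for illustration; a real system would use a tz database.
TUCSON = timezone(timedelta(hours=-7), "MST")             # no daylight saving
NEW_YORK_STANDARD = timezone(timedelta(hours=-5), "EST")
NEW_YORK_DAYLIGHT = timezone(timedelta(hours=-4), "EDT")


def server_date(local_signup, server_tz):
    """Calendar date a server in server_tz would stamp on the sign-up."""
    return local_signup.astimezone(server_tz).date()


winter_signup = datetime(2009, 1, 15, 22, 30, tzinfo=TUCSON)  # 10:30pm in Tucson
summer_signup = datetime(2009, 7, 15, 21, 30, tzinfo=TUCSON)  # 9:30pm in Tucson

print(server_date(winter_signup, NEW_YORK_STANDARD))  # 2009-01-16
print(server_date(summer_signup, NEW_YORK_DAYLIGHT))  # 2009-07-16
```

(The two-hour gap in winter and three-hour gap in summer is why the cutoff Fred mentions moves between 10pm and 9pm.)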
Does this new Software-as-a-Service (SaaS) Web site address these problems you were trying to solve?
Yes. The new Web site is hosted by [Memberize] which provides a hosted membership management application. The TFA staff can now manage its membership, plan events, and communicate them with graphics, videos, and links to maps. They don't need to know ASP or SQL programming, because a built-in [WYSIWYG] editor is simple enough for anyone with standard word-processing skills. The database allowed the option to add customized fields for each member we have in our club.
Was it difficult to switch over?
Not at all. Memberize gave us a 60-day free trial, and we needed all that time to transfer over our membership records, customize the style of the overall template for all pages, and then copy over the content from our old Web site. We had to transfer our e-commerce service over, and contact GoDaddy to transfer the domain. The employee training required was fairly minimal. Cost-wise, it was only a few hundred dollars for a one-time setup fee, and then we pay a monthly fee on a tiered pricing structure based on the count of our active members.
How has the reaction been from your membership?
I've gotten a lot of positive feedback. The learning curve was minimal. Our members found the new Web site intuitive and interactive. For example, the calendar of events can be shown in a single month-at-a-glance format, with green dots showing the events you are signed up for.
And from your perspective, Fred, is the new Web site easy to administer?
Yes, I can now easily generate standard reports, and create my own ad-hoc reports as needed. This wasn't possible with the old system unless I hired an ASP programmer.
Hopefully, this provides some insight on how even the smallest SMB enterprises can adopt a Dynamic Infrastructure through alternative sourcing. Cloud Computing takes many forms, including Software-as-a-Service managed offerings.
Wrapping up this week's theme on IBM's Dynamic Infrastructure® strategic initiative, we have a few more goodies in the goody bag.
First item: Dave Bricker shows off the XIV cloud-optimized storage at Pulse 2009
Second item: Rodney Dukes discusses the latest features of the DS8000 disk system at Pulse 2009
Third item: IBM launches the [Dynamic Infrastructure Journal]. You can read the February 2009 edition online, and if you find it useful and interesting, subscribe to learn from IBM's transformation experts how to reduce cost, manage risk and improve service.
Whether or not you attended the IBM Pulse 2009 conference, you might enjoy looking at the rest of the series of videos on [YouTube] and photographs on [Flickr].
It seems like [only yesterday] I was talking about IBM's strategic initiatives for the New Enterprise Data Center, including the launch of asset and service management at [Pulse 2008] in Orlando, Florida.
This week, my colleagues are at [Pulse 2009] in Las Vegas, Nevada. (I'm not there this time, so stop asking all my colleagues where I am!) Obviously, a lot has changed in the last 12 months: the world's financial economy has collapsed, our delicate environment continues to unravel, and a new US President was elected to fix all that was broken by the former occupant. As a result, IBM's strategy has evolved beyond just data centers for large enterprises.
I can't think of a better time to emphasize the need for a more dynamic infrastructure. And this is not just focused on IT operations, but smarter business infrastructure as well, as the two now are very much intertwined. Everything from smarter healthcare, smarter telecom, smarter retail, smarter distribution, smarter transportation, and smarter financial services. IBM's [Dynamic Infrastructure®] is one of four strategic initiatives to help build a smarter planet.
Let's take a quick look at the key benefits:
Do you remember back to the days that the IT department was like the accounting department in the back office, merely recording what happened in a series of transactions? Not anymore! Today, IT is front and center of most businesses, helping to generate revenue, drive innovation, and provide better customer service. We are finding a convergence between the physical world of running business with the digital world of IT. Intelligence is everywhere, embedded in systems and operations throughout, not just in a data center.
Imagine only 10-15 years ago the primary concern for IT operations was the cost of hardware. Now, thanks to [Moore's law], hardware is cheaper, but other IT budget costs like labor, management software, power and cooling costs are growing faster and becoming more predominant factors. IBM recognizes that you must consider the total cost of ownership, not just the acquisition cost of new hardware. But again, this isn't just reducing the costs of IT, but making more effective use of IT resources to reduce costs everywhere else, in scheduling transportation, in managing manufacturing assets, and so on.
While the world feels much safer now that Barack Obama has taken over, there are still risks and threats out there, and businesses large and small have to manage them. Economic swings like we have experienced lately help weed out those companies that had fixed costs and static infrastructures, in favor of those with more variable costs and dynamic infrastructures. When the marketplace slows down, can your business "dial down" its operations to match? And when the recession is over and business is booming again, can your business "ramp up" fast enough to take on new opportunity? With IBM's Cloud Computing, companies can minimize their fixed investments and use a variable amount of computing as business needs change dynamically.
To learn more about Dynamic Infrastructure, read the IBM [Press Release].
When I was a kid, I used to love old spy movies where they would hide a small microchip or microfiche behind the stamp on a letter or postcard. "Yeah right," I would think to myself, "how much information could that little thing possibly hold." In their post [Bringing the "New Intelligence" Down to Earth: Intro to Semantic Web, Internet-of-Thing], my fellow IBM bloggers Jack Mason and Adam Christensen pointed me to a crazy new product called "Mir:ror" that connects to your PC or laptop.
At first, I thought it was another product spoof, like Onion News Network's video of the [Apple MacBook Wheel] that eliminates the need for a keyboard. But no, this product is real, from a company called [Violet]. They offer the mir:ror, the internet-connected rabbits, and the tiny postage stamps called "ztamps" with embedded RFID chips that allow everything to be interconnected. I can see a lot of interesting uses for the ztamps. Squishing CD-ROMs or memory sticks inside presentation folders was always awkward. But these are small, flat and discreet. I don't know how many GBs of storage each ztamp holds, but they look cool, don't they?
Just another example of becoming a smarter planet!
IBM's emphasis on "Information Infrastructure" is to help organizations get the right information, to the right people at the right time. This helps them to have the right insights, make the right decisions, and develop the right innovations needed for the challenges at hand.
As the planet got smaller and flatter, IBM led the way. Now, as the planet needs to get smarter--with more efficient health care, energy distribution, financial institutions, and IT infrastructures--IBM will once again take the lead.
Continuing my coverage of the 27th annual [Data Center Conference], the weather here in Las Vegas has been partly cloudy, which leads me to discuss some of the "Cloud Computing" sessions that I attended on Wednesday.
The x86 Server Virtualization Storm 2008-2012
Along with IBM, Microsoft is recognized as one of the "Big 5" of Cloud Computing. With their recent announcements of Hyper-V and Azure, the speaker presented the pros and cons of these new technologies versus established offerings from VMware. For example, Microsoft's Hyper-V is about three times cheaper than VMware and offers better management tools. That could be enough to justify some pilot projects. By contrast, VMware is more lightweight, only 32MB, versus Microsoft Hyper-V, which takes up to 1.5GB. VMware has a 2-3 year lead over Microsoft, and offers some features that Microsoft does not yet offer.
Electronic surveys of the audience offered some insight. Today, 69 percent were using VMware only, 8 percent had VMware plus other, including Xen-based offerings from Citrix, Virtual Iron and others. However, by 2010, the audience estimated that 39 percent would be VMware+Microsoft and another 23 percent VMware plus Xen, showing a shift away from VMware's current dominance. Today, there are 11 VMware implementations for every Microsoft Hyper-V implementation, and this is expected to drop to 3-to-1 by 2010.
Of the Xen-based offerings, Citrix was the most popular supplier. Others included Novell/PlateSpin, Red Hat, Oracle, Sun and Virtual Iron. Red Hat is also experimenting with kernel-based KVM. However, the analyst estimated that Xen-based virtualization schemes would never get past 8 percent marketshare. The analyst felt that VMware and Microsoft would be the two dominant players with the bulk of the marketshare.
For cloud computing deployments, the speaker suggested separating "static" VMs from "dynamic" ones. Centralize your external storage first, and implement data deduplication for the OS load images. Which x86 workloads are best for server virtualization? The speaker offered this guidance:
The "good" are CPU-bound workloads, small/peaky in nature.
The "bad" are IO-intensive, those that exploit the features of native hardware
The "ugly" refers to workloads based on software with restrictive licenses and those not fully supported on VMs. If you have problems, the software vendor may not help resolve them.
Moving to the Cloud: Transforming the Traditional Data Center
IBM VP Willie Chiu presented the various levels of cloud computing.
Software-as-a-Service (SaaS) provides the software application, operating system and hardware infrastructure, such as SalesForce.com or Google Apps. Either the software meets your needs or it doesn't, but has the advantage that the SaaS provider takes care of all the maintenance.
Platform-as-a-Service (PaaS) provides operating system, perhaps some middleware like database or web application server, and the hardware infrastructure to run it on. The PaaS provider maintains the operating system patches, but you as the client must maintain your own applications. IBM has cloud computing centers deployed in nine different countries across the globe offering PaaS today.
Infrastructure-as-a-Service (IaaS) provides the hardware infrastructure only. The client must maintain and patch the operating system, middleware and software applications. This can be very useful if you have unique requirements.
In one case study, Willie indicated that moving a workload from a traditional data center to the cloud lowered the costs from $3.9 million to $0.6 million, an 84 percent savings!
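As a quick check of the arithmetic in that case study, here is a minimal sketch in Python. The function name is my own; only the two dollar figures come from the presentation.

```python
def percent_savings(before: float, after: float) -> float:
    """Return the cost reduction as a percentage of the original cost."""
    if before <= 0:
        raise ValueError("original cost must be positive")
    return (before - after) / before * 100.0


# Figures from the case study, in millions of US dollars.
savings = percent_savings(3.9, 0.6)
print(f"{savings:.1f} percent savings")
```

The result lands a little above 84 percent, which matches the figure quoted once it is rounded down.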
We've Got a New World in Our View
Robert Rosier, CEO of iTricity, presented their "IaaS" offering. "iTricity" was coined from the concept of "IT as electricity". iTricity is the largest Cloud Computing company in continental Europe, hosting 2500 servers with 500TB of disk storage across three locations in the Netherlands and Germany.
Attendees I talked to who had been at this conference before commented that this year's focus on virtualization and cloud computing is noticeably greater than in previous years. For more on this, read this 12-page whitepaper: [IBM Perspective on Cloud Computing]
Continuing this week's coverage of the 27th annual [Data Center Conference], I attended some break-out sessions on the "storage" track.
Effectively Deploying Disruptive Storage Architectures and Technologies
Two analysts co-presented this session. In this case, the speakers are using the term "disruptive" in the [positive sense] of the word, as originally used by Clayton Christensen in his book [The Innovator's Dilemma], and not in the negative sense of IT system outages. By a show of hands, they asked if anyone had more storage than they needed. No hands went up.
The session focused on the benefits versus risks of new storage architectures, and which vendors they felt would succeed in this new marketplace around the years 2012-2013.
By electronic survey, here is the number of storage vendors deployed by members of the audience:
14 percent - one vendor
33 percent - two vendors, often called a "dual vendor" strategy
24 percent - three vendors
29 percent - four or more storage vendors
For those who have deployed a storage area network (SAN), 84 percent also have NAS, 61 percent also have some form of archive storage such as IBM System Storage DR550, and 18 percent also have a virtual tape library (VTL).
The speaker credited IBM's leadership in the now popular "storage server" movement to the IBM Versatile Storage Server [VSS] from the 1990s, the predecessor to IBM's popular Enterprise Storage Server (ESS). A "storage server" is merely a disk or tape system built using off-the-shelf server technology, rather than customized [ASIC] chips, lowering the barriers of entry to a slew of small start-up firms entering the IT storage market, and leading to new innovation.
How can a system designed for no single point of failure (SPOF) actually then fail? The speaker conveniently ignored the two most obvious answers (multiple failures, microcode error) and focused instead on mis-configuration. She felt part of the blame falls on IT staff not having adequate skills to deal with the complexities of today's storage devices, and the other part of the blame falls on storage vendors for making such complicated devices in the first place.
Scale-out architectures, such as IBM XIV and EMC Atmos, represent a departure from traditional "scale-up" monolithic equipment. Whereas scale-up machines are traditionally limited in scalability by their packaging, scale-out systems are limited only by the software architecture and back-end interconnect.
To go with cloud computing, the analyst categorized storage into four groups: Outsourced, Hosted, Cloud, and Sky Drive. The difference depended on where servers, storage and support personnel were located.
How long are you willing to wait for your preferred storage vendor to provide a new feature before switching to another vendor? A shocking 51 percent said at most 12 months! 34 percent would be willing to wait up to 24 months, and only 7 percent were unwilling to change vendors. The results indicate more confidence in being able to change vendors, rather than pressures from upper management to meet budget or functional requirements.
Beyond the seven major storage vendors, there are now dozens of smaller emerging or privately-held start-ups now offering new storage devices. How willing were the members of the audience to do business with these? 21 percent already have devices installed from them, 16 percent plan to in the next 12-24 months, and 63 percent have no plans at all.
The key value propositions of the new storage architectures were ease-of-use and lower total cost of ownership. The speaker recommended developing a strategy or "road map" for deploying new storage architectures, with focus on quantifying the benefits and savings. Ask the new vendor for references, local support, and an acceptance test or "proof-of-concept" to try out the new system. Also, consider how this new storage architecture may affect existing Disaster Recovery or other IT processes.
Tame the Information Explosion with IBM Information Infrastructure
Susan Blocher, IBM VP of marketing for System Storage, presented this vendor-sponsored session, covering the IBM Information Infrastructure part of IBM's New Enterprise Data Center vision. This was followed by Brad Heaton, Senior Systems Admin from ProQuest, who gave his "User Experience" of the IBM TS7650G ProtecTIER virtual tape library and its state-of-the-art inline data deduplication capability.
Best Practices for Managing Data Growth and Reducing Storage Costs
The analyst explained why everyone should be looking at deploying a formal "data archiving" scheme. Not just for "mandatory preservation" resulting from government or industry regulations, but also the benefits of "optional preservation" to help corporations and individual employees be more productive and effective.
Before, there were only two tiers of storage: expensive disk and inexpensive tape. Now, with the advent of slower less-expensive SATA disks, including storage systems that emulate virtual tape libraries, and others that offer Non-Erasable, Non-Rewriteable (NENR) protection, IT administrators have a middle ground to keep their archive data.
New software innovation supports better data management. The speaker recalled when "storage management" was equated to "backup" only, and now includes all aspects of management, including HSM migration, compliance archive, and long term data preservation. I had a smile on my face--IBM has used "storage management" to refer to these other aspects of storage since the 1980s!
The analyst felt the best tool to control growth is to "delete" the data no longer needed, but felt that nobody uses the Storage Resource Management (SRM) tools needed to make this viable. Until then, people will choose instead to archive emails and user files to less expensive media. The speaker also recommended looking into highly-scalable NAS offerings--such as IBM's Scale-Out File Services (SoFS), Exanet, Permabit, IBRIX, Isilon, and others--when fast access to files is worth the premium price over tape media. The speaker also made the distinction between "stub-based" archiving--such as IBM TSM Space Manager, Sun's SAM-FS, and EMC DiskXtender--and "stub-less" archiving accomplished through file virtualization that employs a global namespace--such as IBM Virtual File Manager (VFM), EMC RAINfinity or F5's ARX.
She made the distinction between archives and backups. If you are keeping backups longer than four weeks, they are not really backups, are they? These are really archives, but not as effective. Recent legal precedent no longer considers long-term backup tapes as valid archive tapes.
To deploy a new archive strategy, create a formal position of "e-archivist", choose the applications that will be archived and focus on requirements first, rather than going out and buying compliance storage devices. Try to get users to pool their project data into one location, to make archiving easier. Try to have the storage admins offer a "menu" of options to Line-of-Business/Legal/Compliance teams that may not be familiar with subtle differences in storage technologies.
While I am familiar with many of these best practices already, I found it useful to see which competitive products line up with those we have already within IBM, and which new storage architectures others find most promising.
I did not register soon enough to get into the MGM Grand itself, so I am staying at a Hilton at the other end of the Las Vegas strip, but am able to hop on the "Monorail" to get to the MGM, just in time for the breakfast and first welcome session.
This conference has a familiar set up: six keynote sessions, 62 break-out sessions, and four town hall meetings. Thanks to electronic survey devices on the seats, speakers were able to gather real-time demographics. A large portion of attendees, including myself, are attending this conference for the first time. Here's my recap of the first three keynote sessions:
The Future of Infrastructure and Operations: The Engine of Cloud Computing
How much of their IT budget do companies spend just to keep current? As much as 70 percent! The speaker noted that the best companies can get this down to 10 to 30 percent, leaving the rest of the IT budget to facilitate transformation. He predicts that companies are transforming their data centers from sprawled servers to virtualization, towards a fully automated, service-oriented, real-time infrastructure.
Whereas the original motivation for IT virtualization was to reduce costs, companies now recognize that it greatly improves agility, the ability to rapidly provision resources for new workloads, and that this will then lead to opportunities for alternative sourcing, such as cloud computing.
The operating system is becoming commoditized, focusing attention instead on a new concept: the "Meta OS". VMware's Virtual Data Center and Microsoft's Azure Fabric Controller are just two examples. Currently, analysts estimate only about 12 percent of x86 workloads are running virtualized, but that this could be over 50 percent by 2012. In this same time frame, by 2012, storage terabytes are expected to increase 6.5-fold, with WAN bandwidth growing 35 percent per year.
Virtualization is not just for business applications. There are opportunities to eliminate the most costly part of any business: the Personal Computer, poster child of the skyrocketing costs of the client/server movement. Remote hosting of applications, streaming of applications, software as a service (SaaS) and virtual machines for the desktop can greatly reduce costs of customized PC images and help desk support.
Cloud computing not only reduces costs per use, but provides a lower barrier of entry and some much needed elasticity. Draw a line anywhere along the application-to-hardware stack, and you can define a cloud computing platform/service. About 65 percent of the attendees surveyed indicated that they were already doing something with Cloud Computing, or were planning to in the next four years.
To help get there, the speaker felt that Value-added Resellers (VAR) and System Integrators (SI) would evolve into "service brokers", providing Small and Medium sized Businesses (SMB) "one throat to choke" in mixed multisourced operations. The term "multisource" caught me a bit off-guard, referring to having some workloads run internally (insourced) while other workloads run out on the Cloud (outsourced). Larger enterprises might have a "Dynamic Sourcing Team", a set of key employees serving as decision makers, employing both business and IT skills to determine the best sourcing for each application workload.
What are the biggest obstacles to getting there? The speaker felt the biggest was the IT staff. People and culture are the most difficult to change. The second is the lack of appropriate metrics. Here are the survey results of the attendees:
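To put those two growth figures on the same footing, here is a rough sketch that converts between a total multiple and a compound annual rate. The four-year horizon (2008 through 2012) is my assumption, since the speaker did not state a baseline year, and the function names are made up for illustration.

```python
def compound_growth(rate_per_year: float, years: int) -> float:
    """Total growth multiple after compounding a yearly rate."""
    return (1.0 + rate_per_year) ** years


def implied_annual_rate(total_multiple: float, years: int) -> float:
    """Yearly rate that compounds to the given total multiple."""
    return total_multiple ** (1.0 / years) - 1.0


YEARS = 4  # assumption: 2008 through 2012

# WAN bandwidth: 35 percent per year, compounded over the horizon.
wan_multiple = compound_growth(0.35, YEARS)

# Storage: a 6.5-fold total increase, expressed as a yearly rate.
storage_rate = implied_annual_rate(6.5, YEARS)

print(f"WAN bandwidth grows about {wan_multiple:.1f}x over {YEARS} years")
print(f"Storage grows about {storage_rate * 100:.0f} percent per year")
```

Under that assumption, storage capacity grows roughly twice as fast as the bandwidth available to move it around, which is one reason latency and data placement came up in the later sessions.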
41 percent had metrics for infrastructure economic attributes
49 percent had metrics for qualities of service (QoS)
12 percent had metrics to measure agility, speed of resource provisioning
The Data Center Scenario: Planning for the Future
This second keynote had two analyst "co-presenters". The focus was on the importance of having a documented Data Center strategy and architecture. Unfortunately, most Data Centers "happen on their own", with a major overhaul every 5 to 10 years. The speakers presented some "best practices" for driving this effort.
The first issue was to identify tiers of criticality, similar to those by the [Uptime Institute]. In their example, the most critical workloads would have perhaps recovery point objectives (RPO) of zero, and recovery time objectives (RTO) of less than 15 minutes. This is achievable using synchronous mirroring with full automation to handle the failover.
The second issue was to recognize that many applications were designed for local area networks (LAN), but many companies have distributed processing over a wide area network (WAN). Latency over these longer distances can kill distributed performance of these applications.
The third issue was that different countries offer different levels of security, privacy and law enforcement. Canada and Ireland, for example, had the lowest risk, countries like India had medium risk, and countries like China and Russia had the highest risk, based on these factors.
The speakers suggested the following best practices:
Get a better understanding of the costs involved in providing IT services
Centralize applications that are not affected by latency, but regionalize those that are affected to remote locations to minimize distance delays.
Work towards a "lights out" data center facility, with operations personnel physically separated from data center facilities.
For the unfortunate few that are trying to stretch out more life from their existing aging data centers, the speakers offered this advice:
Build only what you need
Decommission orphaned servers and storage, which can be 1 to 12 percent of your operations
Target for replacement any hardware over five years old, not just to reduce maintenance costs, but also to get more energy-efficient equipment.
Consider moving test workloads, and as much as half of your web servers, off UPS and onto the native electricity grid. In the event of an outage, this reduces UPS consumption.
Implement power-capping and load-shedding, especially during peak times.
Enacting these changes can significantly improve the bottom line. Archaic data centers, those typically over 10 years old with power usage effectiveness (PUE) over 3.0, can cost over twice as much as a more efficient data center. To learn more about PUE as a metric, see the Green Grid's whitepaper [Data Center power efficiency metrics: PUE and DCiE].
While virtualization can help with these issues, it also introduces new problems, such as VM sprawl and dealing with antiquated licensing schemes of software companies.
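For readers new to the metric, PUE is total facility power divided by the power delivered to IT equipment, and DCiE is the same ratio inverted and expressed as a percentage. A minimal sketch follows; the kilowatt readings are hypothetical numbers I picked to illustrate a PUE of 3.0, not measurements from the session.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power over IT power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def dcie(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Data center infrastructure efficiency, as a percentage."""
    if total_facility_kw <= 0:
        raise ValueError("total facility power must be positive")
    return it_equipment_kw / total_facility_kw * 100.0


# Hypothetical readings: the same 500 kW IT load in two facilities.
IT_LOAD_KW = 500.0
archaic_total_kw = 1500.0    # assumed older facility
efficient_total_kw = 750.0   # assumed newer facility

for label, total in (("archaic", archaic_total_kw),
                     ("efficient", efficient_total_kw)):
    print(f"{label}: PUE {pue(total, IT_LOAD_KW):.1f}, "
          f"DCiE {dcie(total, IT_LOAD_KW):.0f} percent")
```

With these assumed readings the older facility draws twice the power for the same IT load, which is the kind of gap behind the "over twice as much" cost claim.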
The Four Traits of the World's Best-Performing Business Leaders
Best-selling author Jason Jennings presented his findings in researching his various books:
It's Not the Big That Eat the Small... It's the Fast That Eat the Slow : How to Use Speed as a Competitive Tool in Business
Less Is More : How Great Companies Use Productivity As a Competitive Tool in Business
Think Big, Act Small
Hit the Ground Running : A Manual for New Leaders
Jason identified the best companies and interviewed their leaders, including such companies as Koch Industries, Nucor Steel, and IKEA furniture. The leaders he interviewed felt a calling to serve as stewards of their companies, not just to write mission and vision statements, and were willing to let go of projects or people that aren't working out.
Jason indicated that a 2007 Gallup poll on the American workplace found that 70 percent of employees do not feel engaged in their jobs. The focus of these leaders is to hire people with the right attitudes, rather than the right aptitudes, and give those people the knowledge and the right to make business decisions. If done well, employees will think and act as owners, and hold themselves accountable for their economic results. Jason found cases where 25-year-olds were given responsibility to make billion-dollar decisions!
I found his talk inspiring! The audience felt motivated to do their jobs better, and be more engaged in the success of their companies.
These keynote sessions set the mood for the rest of the week. I can tell already that the speakers will toss out a large salad of buzzwords and IT industry acronyms. I saw several people in the audience confused by some of the terminology, and hopefully they will come over to IBM booth 20 at the Solutions Expo for straight talk and explanation.
Well it's Tuesday, and ["election day"] here in the USA, and again IBM has more announcements.
IBM announced [IBM Tivoli Key Lifecycle Manager v1.0] (TKLM) to manage encryption keys. This provides a graphical interface to manage encryption keys, including retention criteria when sharing keys with other companies.
TKLM is supported on AIX, Solaris, Windows, Red Hat and SUSE Linux. IBM plans to offer TKLM for z/OS in 2009. TKLM can be used with the Firefox or Internet Explorer web browser. This will include the Encryption Key Manager (EKM) that IBM offered initially to support encryption keys for the TS1120, TS1130, and LTO-4 drives.
While this is needed today for tape, IBM positions this software to also manage the encryption keys for "Full Drive Encryption" (FDE) disk drive modules (DDM) in IBM disk systems in 2009.
This is page 34 of Sequoia Capital's [56-slide presentation] about the current financial meltdown. In the past, IT spending tracked closely to the rest of the economy, but the latest downturn has not yet been reflected in IT spend.
The rest of the deck is worth going through, with interesting stats presented in a clear manner.
For a while now, IBM has been trying to explain to clients that focusing on just storage hardware acquisition costs is not enough. You need to consider the "Total Cost of Ownership" or TCO of a purchase decision. For active data, a 3-5 year TCO assessment can give you a better comparison of costs between IBM and competitive choices. For long-term archive retention, a 7-10 year TCO assessment may be necessary.
Now, IBM has a cute [2-minute video] that brings an appropriate analogy to help IT and non-IT executives understand.
While some might be familiar with mashups that combine public Web 2.0 sources of information, enterprise mashups go one step further, integrating with the "information infrastructure" of your data center. It's not just enough to deliver the right information to the right person at the right time, it has to be in the right format, in a manner that can be readily understood and acted upon. Enterprise mashups can help.
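To make the TCO idea above concrete, here is a deliberately simplified sketch. The two options and all of their dollar figures are hypothetical, and a real assessment would include many more cost categories (power, floor space, migration, staff) than the single yearly operating figure assumed here.

```python
def total_cost_of_ownership(acquisition: float,
                            annual_operating: float,
                            years: int) -> float:
    """Acquisition cost plus a flat yearly operating cost over the period."""
    return acquisition + annual_operating * years


# Hypothetical options: A is cheaper to buy, B is cheaper to run.
options = {
    "A": {"acquisition": 100_000, "annual_operating": 40_000},
    "B": {"acquisition": 150_000, "annual_operating": 20_000},
}

for years in (1, 3, 5):
    costs = {
        name: total_cost_of_ownership(o["acquisition"],
                                      o["annual_operating"], years)
        for name, o in options.items()
    }
    cheapest = min(costs, key=costs.get)
    print(f"{years}-year TCO: {costs} -> option {cheapest} is cheaper")
```

With these made-up numbers, the option with the lower purchase price wins over one year but loses over three and five, which is the whole argument for looking past acquisition cost.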
Wrapping up my week on successful uses of information, I thought I would discuss the visualization of data. Not just bar charts and pie charts, but how effective visual information can be on multi-variable plots.
IBM's [Many Eyes] recognizes that 70 percent of the sensory input neurons in our brain are focused on visual inputs, and so we might recognize patterns if only data were presented in more interesting and visual representations.
In addition to X/Y axis, variables can be presented by size of circle and color. Here's an example plot of the past US bailouts, with variables representing amount, year, company and industry. This plot does not include the 700 Billion US Dollar bailout currently under discussion.
This is part of IBM's Collaborative User Experience (CUE) research lab. The software is available Web 2.0 style at no charge, just upload your data set, and choose one of 16 different presentation styles.
These plots get even more interesting when you animate them over time. In 2006, Hans Rosling took data he gathered from the United Nations and other publicly funded sources and presented his findings at the TED conference. Here is the 20-minute video of that presentation (click on play at right), titled ["Debunking third-world myths with the best stats you've ever seen"], in which he debunks the myth that all countries fall into two distinct categories: Industrialized and Developing.
Amazingly, the data--as well as the software to analyze it--is available at the [GapMinder.org] Web site.
For more information on how you can deploy an information infrastructure that allows you to search, visualize and leverage the most value from your information, contact your local IBM representative or IBM Business Partner.
This post will focus on Information Compliance, the fourth and final part of the four-part series this week. I have received a few queries on my choice of sequence for this series: Availability, Security, Retention and Compliance.
Why not have them in alphabetical order? IBM avoids alphabetizing in one language, because then it may not be alphabetized when translated to other languages.
Why not have them in a sequence that spells out an easy to remember mnemonic, like "CARS"? Again, when translated to other languages, those mnemonics no longer work.
Instead, I worked with our marketing team for a more appropriate sequence, based on psychology and the cognitive bias of [primacy and recency effects].
Here's another short 2-minute video, on Information Compliance
Full disclosure: I am not a lawyer. The following will delve into areas related to government and industry regulations. Consult your risk officer or legal counsel to make sure any IT solution is appropriate for your country, your industry, or your specific situation.
IBM estimates there are over 20,000 regulations worldwide related to information storage and transmission.
For information availability, some industry regulations mandate a secondary copy a minimum distance away to protect against regional disasters like hurricanes or tsunamis. IBM offers Metro Mirror (up to 300km) and Global Mirror (unlimited distance) disk mirroring to support these requirements.
For information security, some regulations relate to privacy and prevention of unauthorized access. Two prominent ones in the United States are:
Health Insurance Portability and Accountability Act (HIPAA) of 1996
HIPAA regulates health care providers, health plans, and health care clearinghouses in how they handle the privacy of patients' medical records. These regulations apply whether the information is on film, paper, or stored electronically. Obviously, electronic medical records are easier to keep private. Here is an excerpt from an article from [WebMD]:
"There are very good ways to protect data electronically. Although it sounds scary, it makes data more protected than current paper records. For example, think about someone looking at your medical chart in the hospital. It has a record of all that is happening -- lab results, doctor consultations, nursing notes, orders, prescriptions, etc. Anybody who opens it for whatever reason can see all of this information. But if the chart is an electronic record, it's easy to limit access to any of that. So a physical therapist writing physical therapy notes can only see information related to physical therapy. There is an opportunity with electronic records to limit information to those who really need to see it. It could in many ways allow more privacy than current paper records."
Gramm-Leach-Bliley Act (GLBA) of 1999
GLBA regulates the handling of sensitive customer information by banks, securities firms, insurance companies, and other financial service providers. Financial companies use tape encryption to comply with GLBA when sending tapes from one firm to another. IBM was the first to deliver tape drive encryption with the TS1120, and then later with LTO-4 and TS1130 tape drives.
For information retention, there are a lot of regulations that deal with how information is stored, in some cases immutable to protect against unethical tampering, and when it can be discarded. Two prominent regulations in the United States are:
U.S. Securities and Exchange Commission (SEC) 17a-4 of 1997
In the past, the IT industry used the acronym "WORM" which stands for the "Write Once, Read Many" nature of certain media, like CDs, DVDs, optical and tape cartridges. Unfortunately, WORM does not apply to disk-based solutions, so IBM adopted the language from SEC 17a-4 that calls for storage that is "Non-Erasable, Non-Rewriteable" or NENR. This new umbrella term applies to disk-based solutions, as well as tape and optical WORM media.
SEC 17a-4 indicates that broker/dealers and exchange members must preserve all electronic communications relating to the business of their firm for a specific period of time. During this time, the information must not be erased or re-written.
Sarbanes-Oxley (SOX) Act of 2002
SOX was born in the wake of [Enron and other corporate scandals]. It protects the way that financial information is stored, maintained and presented to investors, as well as disciplines those who break its rules. It applies only to public companies, i.e. those that offer their securities (stock shares, bonds, liabilities) to be sold to the public through a listing on a U.S. exchange, such as NASDAQ or NYSE.
SOX focuses on preventing CEOs and other executives from tampering with the financial records. To meet compliance, companies are turning to the [IBM System Storage DR550] which provides Non-erasable, Non-rewriteable (NENR) storage for financial records. Unlike competitive products like EMC Centera that function mostly as space-heaters on the data center floor once they fill up, the DR550 can be configured as a blended disk-and-tape storage system, so that the most recent, and most likely to be accessed data, remains on disk, but the older, least likely to be accessed data, is moved automatically to less expensive, more environment-friendly "green" tape media.
Did SOX hurt the United States' competitiveness? Critics feared that these new regulations would discourage new companies from going public. Ernst & Young found these fears did not come true, and published a study [U.S. Record IPO Activity from 2006 Continues in 2007]. In fact, the improved confidence that SOX has given investors has given rise to similar legislation in other parts of the world: Euro-Sox for the European Union Investor Protection Act, and J-SOX Financial Instruments and Exchange Law for Japan.
For those who only read the first and last paragraphs of each post, here is my recap: Information Compliance is ensuring that information is protected against regional disasters, unauthorized access, and unethical tampering, as required to meet industry and government regulations. Such regulations often apply if the information is stored on traditional paper or film media, but can often be handled more cost-effectively when stored electronically. Appropriate IT governance can help maintain investor confidence.
In yesterday's post, [IBM Information Infrastructure launches today], I explained how this strategic initiative fit into IBM's New Enterprise Data Center vision. For those who prefer audio podcasts, here is Marissa Benekos interviewing Andy Monshaw, IBM General Manager of IBM System Storage.
This post will focus on Information Availability, the first of the four-part series this week.
Here's another short 2-minute video, on Information Availability
I am not in the marketing department anymore, so have no idea how much IBM spent to get these videos made, but would hate for the money to go to waste. I suspect the only way they will get viewed is if I include them in my blog. I hope you like them.
As with many IT terms, "availability" might conjure up different meanings for different people.
Some can focus on the pure mechanics of delivering information. An information infrastructure involves all of the software, servers, networks and storage to bring information to the application or end user, so all of the links in the chain must be highly available: software should not crash, servers should have "five nines" (99.999%) uptime, networks should be redundant, and storage should handle the I/O request with sufficient performance. For tape libraries, the tape cartridge must be available, robotics are needed to fetch the tape, and a drive must be available to read the cartridge. All of these factors represent the continuous operations and high availability features of business continuity.
In addition to the IT equipment, you need to make sure your facilities that support that equipment, such as power and cooling, are also available. Independent IT analyst Mark Peters from Enterprise Strategy Group (ESG) summarizes his shock about the findings in a recent [survey commissioned by Emerson Network Power] on his post [Backing Up Your Back Up]. Here is an excerpt:
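To show why every link in that chain matters, here is a small sketch that converts an availability percentage into downtime per year and multiplies the links together. It assumes the components fail independently and that all of them are needed at once (a simple series model); the availability figures for software, network and storage are hypothetical, and only the "five nines" server figure comes from the paragraph above.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def chain_availability(components: dict[str, float]) -> float:
    """Availability of a series chain, assuming independent failures."""
    return math.prod(components.values())


# Hypothetical figures, except the "five nines" for servers.
chain = {
    "software": 0.9999,
    "servers": 0.99999,
    "network": 0.9999,
    "storage": 0.9999,
}

end_to_end = chain_availability(chain)
print(f"Five nines alone: "
      f"{downtime_minutes_per_year(0.99999):.1f} minutes per year")
print(f"End to end: {end_to_end * 100:.3f} percent, "
      f"{downtime_minutes_per_year(end_to_end):.0f} minutes per year")
```

Under this model the end-to-end figure is always worse than the weakest link, so a five-nines server behind a four-nines network still delivers less than four nines to the end user.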
"The net take-away is that the majority of SMBs in the US do not have back-up power systems. As regional power supplies get more stretched in many areas, the possibility of power outages increases and obviously many SMBs would be vulnerable. Indeed, while the small business decision makers questioned for the survey ranked such power outages ahead of other threats (fires, government regulation, weather, theft and employee turnover) only 39% had a back-up power system. Yeah, you could say, but anything actually going wrong is unlikely; but apparently not, as 79% of those surveyed had experienced at least one power outage during 2007. Yeah, you might say, but maybe the effects were minor; again, apparently not, since 42% of those who'd had outages had to actually close their businesses during the longest outages. The DoE says power outages cost $80 billion a year and businesses bear 98% of those costs."
Others might be more concerned about outages resulting from planned and unplanned downtime. Storage virtualization can help reduce planned downtime, by allowing data to be migrated from one storage device to another without disrupting the application's ability to read and write data. The latest "Virtual Disk Mirroring" (VDM) feature of the IBM System Storage SAN Volume Controller takes it one step further, providing high availability even for entry-level and midrange disk systems managed by the SVC. For unplanned downtime, IBM offers a complete range of support, from highly available clusters, to two-site and three-site disaster recovery support, to application-aware data protection through IBM Tivoli Storage Manager.
Many outages are caused by human error, and in many cases it is the human factor that prevents quick resolution. Storage admins are unable to isolate the failing component, identify the configuration or provide the appropriate problem determination data to the technical team ready to offer support and assistance. For this, IBM TotalStorage Productivity Center software, and its hardware version, the IBM System Storage Productivity Center, can help reduce outage time and increase information availability. It can also provide automation to predict or provide early warning of impending conditions that could get worse if not taken care of.
But perhaps yet another take on information availability is the ability to find and communicate the right information to the right people at the right time. Recently, Google announced a historic milestone: their search engine now indexes over [One trillion Web pages]! Google and other search engines have changed the level of expectations for finding information. People ask why they can find information on the internet so quickly, yet it takes weeks for companies to respond to a judge for an e-discovery request.
Lastly, the team at IBM's [Eightbar blog] pointed me to Mozilla Labs' Ubiquity project for their popular Firefox browser. This project aims to help people communicate information in a more natural way, rather than through unfriendly URL links in an email. It is still beta, of course, but helps show what "information availability" might be possible in the near future. Here is a 7-minute demonstration:
For those who only read the first and last paragraphs of each post, here is my recap: Information Availability includes Business Continuity and Data Protection to facilitate quick recovery, storage virtualization to maximize performance and minimize planned downtime, infrastructure management and automation to reduce human error, and the ability to find and communicate information to others.
Earlier this year, IBM launched its [New Enterprise Data Center vision]. The average data center was built 10-15 years ago, at a time when the World Wide Web was still in its infancy, some companies were deploying their first storage area network (SAN) and email system, and if you asked anyone what "Google" was, they might tell you it was ["a one followed by a hundred zeros"]!
Full disclosure: Google, the company, just celebrated its [10th anniversary] yesterday, and IBM has partnered with Google on a variety of exciting projects. I am employed by IBM, and own stock in both companies.
In just the last five years, we have seen rapid growth in information, fueled by Web 2.0 social media, email, mobile hand-held devices, and the convergence of digital technologies that blurs the lines between communications, entertainment and business information. This explosion in information is not just "more of the same", but rather a dramatic shift from predominantly databases for online transaction processing to mostly unstructured content. IT departments are no longer just the "back office" recording financial transactions for accountants, but now also take on a more active "front office" role. For a growing number of industries, information technology plays a pivotal role in generating revenue, making smarter business decisions, and providing better customer service.
IBM felt a new IT model was needed to address this changing landscape, so IBM's New Enterprise Data Center vision has these five key strategic initiatives:
Highly virtualized resources
Business-driven Service Management
Green, Efficient, Optimized facilities
In February, IBM announced new products and features to support the first two initiatives, including the highly virtualized capability of the IBM z10 EC mainframe, and related business resiliency features of the [IBM System Storage DS8000 Turbo] disk system.
In May, IBM launched its Service Management strategic initiative at the Pulse 2008 conference. I was there in Orlando, Florida at the Swan and Dolphin resort to present to clients. You can read my three posts: [Day 1; Day 2 Main Tent; Day 2 Breakout sessions].
In June, IBM launched its fourth strategic initiative "Green, Efficient and Optimized Facilities" with [Project Big Green 2.0], which included the Space-Efficient Volume (SEV) and Space-Efficient FlashCopy (SEFC) capabilities of the IBM System Storage SAN Volume Controller (SVC) 4.3 release. Fellow blogger and IBM Master Inventor Barry Whyte (BarryW) has three posts on his blog about this: [SVC 4.3.0 Overview; SEV and SEFC detail; Virtual Disk Mirroring and More]
Some have speculated that the IBM System Storage team seemed to be on vacation the past two months, with few press releases and little or no fanfare about our July and August announcements, and not responding directly to critics and FUD in the blogosphere. It was because we were holding them all for today's launch, taking our cue from a famous perfume commercial:
"If you want to capture someone's attention -- whisper."
My team and I were actually quite busy at the [IBM Tucson Executive Briefing Center]. In between doing our regular job talking to excited prospects and clients, we trained sales reps and IBM Business Partners, wrote certification exams, and updated marketing collateral. Fortunately, competitors stopped promoting their own products to discuss and demonstrate why they are so scared of what IBM is planning. The fear was well justified. Even a few journalists helped raise the word-of-mouth buzz and excitement level. A big kiss to Beth Pariseau for her article in [SearchStorage.com]!
(Last week we broke radio silence to promote our technology demonstration of 1 million IOPS using Solid State Disk, just to get the huge IBM marketing machine oiled up and ready for today.)
Today, IBM General Manager Andy Monshaw launched the fifth strategic initiative, [IBM Information Infrastructure], at the [IBM Storage and Storage Networking Symposium] in Montpellier, France. Montpellier is one of the six locations of our New Enterprise Data Center Leadership Centers launched today. The other five are Poughkeepsie, Gaithersburg, Dallas, Mainz and Boeblingen, with more planned for 2009.
Although IBM has been using the term "information infrastructure" for more than 30 years, it might be helpful to define it for you readers:
“An information infrastructure comprises the storage, networks, software, and servers integrated and optimized to securely deliver information to the business.”
In other words, it's all the "stuff" that delivers information from the magnetic surface recording of the disk or tape media to the eyes and ears of the end user. Everybody has an information infrastructure already; some are just more effective than others. For those of you not happy with yours, IBM has the products, services and expertise to help with your data center transformation.
IBM wants to help its clients deliver the right information to the right people at the right time, to get the most benefit from information, while controlling costs and mitigating risks. There might be more than a dozen ways to address the challenges involved, but IBM's Information Infrastructure strategic initiative focuses on four key solution areas:
Last, but not least, I would like to welcome to the blogosphere IBM's newest blogger, Moshe Yanai, formerly the father of the EMC Symmetrix and now leading the IBM XIV team. Already from his first post on his new [ThinkStorage blog]