Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Continuing my week in Chicago, for the IBM Storage Symposium 2008, I attended several sessions intended to answer the questions of the audience.
In an effort to be cute, the System x team have a "Meet the xPerts" session at their System x and BladeCenter Technical Conference, so the storage side decided to do the same. Traditionally, these have been called "Birds of a Feature", "Q&A Panel", or "Free-for-All". They allow anyone to throw out a question, and have the experts in the room, either
IBM, Business Partner or another client, answer the question from their experience.
Meet the Experts - Storage for z/OS environments
Here were some of the questions answered:
I've seen terms like "z/OS", "zSeries" and "System z" used interchangeably, can you help clarify what this particular session is about?
IBM's current mainframe servers are all named "System z", such as our System z9 or System z10. These replace the older zSeries models of hardware. z/OS is one of the six operating systems that run on this hardware platform. The other five are z/VM, z/VSE, z/TPF, Linux and OpenSolaris. The focus of this session will be storage attached and used for z/OS specifically, including discussions of Omegamon and DFSMS software products.
What can we do to reduce our MIPS-based software licensing costs from our third party vendors?
Consider using IBM System z Integrated Information Processor
What about 8 Gbps FICON?
IBM has already announced
[FICON Express8] host bus adapter (HBA) cards, that will auto-negotiate to 4Gbps and 2Gbps speeds. If you don't need full 8Gbps speed now, you can
still get the Express8 cards, but put 4/2/1 Gbps SFP ports instead. Currently, LongWave (LW) is only supported to 4km at 8Gbps speed.
I want to use Global Mirror for my DS8100 to my remote DS8100, but also make test copies of my production data to
an older ESS 800 I have locally. Any suggestions? Yes, consider using FlashCopy to simplify this process.
I have Global Mirror (GM) running now successfully with DSCLI, and now want to deploy IBM Tivoli Storage Productivity Center for Replication. Is that possible? Yes, Productivity Center for Replication will detect existing GM relationships, and start managing them.
I have already deployed HyperPAV and zHPF, is there any value in getting Solid-State Drives as well?
HyperPAV and zHPF impact CONN time, but SSD impacts DISC time, so they are mutually complementary.
How should I size my FlashCopy SE pool? SE refers to "Space Efficient", which stores only the changes
between the source and destination copies of each LUN or CKD volume involved. General recommendation is to start with 20 percent and adjust accordingly.
How many RAID ranks should I configure per DS8000 extent pool? IBM recommends 4 to 8 ranks per pool.
Meet the Experts: Storage for Linux, UNIX and Windows distributed systems
This session was focused on storage systems attached to distributed servers, as well as products from Tivoli used to manage them. Here were some of the questions answered:
When we migrated from Tivoli Storage Manager v5 to v6, we lost our favorite "Operational Reporting" tool. How can we get TOR back? You now get the new Tivoli Common Reporting tool.
How can we identify appropriate port distribution for multiple SVC node pairs for load balancing?
IBM Tivoli Storage Productivity Center v4.1 has hot-spot analysis with recommendations for Vdisk migrations.
We tried TotalStorage Productivity Center way back when, but the frequent upgrades were killing us. How has it been lately? It has been much more stable since v3.3, and completely renamed to Tivoli Storage Productivity Center to avoid association with versions 1 and 2 of the predecessor product. The new "lightweight agents" feature of v4.1 resolve many of the problems you were experiencing.
We have over 1600 SVC virtual disks, how do we handle this in IBM Tivoli Storage Productivity Center? Use the Filter capability in combination with clever naming conventions for your virtual disks.
How can we be clever when we are limited to only 15 characters? Ok. We understand.
We are currently using an SSPC with Windows 2003 and 2GB memory, but we are only using the Productivity Center for Replication feature of it. Can we move the DB2 database over to a Windows 2008 server with 4GB of memory?
Consider using the IBM Tivoli Storage Productivity Center for Replication software instead of SSPC for special
circumstances like this.
We love the XIV GUI, how soon will all other IBM storage products have it also? As with every acquisition,
IBM evaluates if there are technologies from new products that can be carried back to existing products.
We are currently using 12 ports on our existing XIV, and love it so much we plan to buy a second frame, but are concerned about consuming another 12 ports on our SAN switch. Any suggestions? Yes, use only six ports per frame. Just because you have more ports, doesn't mean you are required to use them.
We have heard there are concerns from the legal community about using deduplication technology, any ideas how to address that?
Nobody here in the room is a lawyer, and you should consult legal counsel for any particular situation.
None of the IBM offerings intended for non-erasable, non-rewriteable (NENR) data retention records (DR550, WORM tape, N series SnapLock) support dedupe today, and none of IBM's deduplication offerings (TS7650,N series A-SIS,TSM) make any claims for fit-for-purpose for compliance regulatory storage. However, be assured that all of IBM's dedupe technology involves byte-for-byte comparisons so that you never lose any data due to false hash collisions. For all IBM compliance storage, what you write will be read back in the correct sequence of ones and zeros.
Many people have asked me if there was any logic with the IBM naming convention of IBM Systems branded servers. Here's your quick and easy cheat sheet:
System x -- "x" for cross-platform architecture. Technologies from our mainframe and UNIX servers were brought into chips that sit next to the Intel or AMD processors to provide a more reliable x86 server experience. For example, some models have a POWER processor-based Remote Supervisor Adapter (RSA).
System p -- "p" for POWER architecture.
System z -- "z" for Zero-downtime, zero-exposures. Our lawyers prefer "near-zero", but this is about as close as you get to ["six-nines" availability] in our industry, with the highest level of security and encryption, no other vendor comes close, so you get the idea.
But what about the "i" for System i? Officially, it stands for "Integrated" in that it could integrate different applications running on different operating systems onto a [COMMON] platform. Options were available to insert Intel-based processor cards that ran Windows, or attach special cables that allowed separate System x servers running Windows to attach to a System i. Both allowed Windows applications to share the internal LAN and SAN inside the System i machine. Later, IBM allowed [AIX on System i] and [Linux on Power] operating systems to run as well.
From a storage perspective, we often joked that the "i" stood for "island", as most System i machines used internal disk, or attached externally to only a fewselected models of disk from IBM and EMC that had special support for i5/OS using a special, non-standard 520-byte disk block size. This meant only our popular IBM System Storage DS6000 and DS8000 series disk systems were available. This block size requirement only applies to disk. For tape, i5/OS supports both IBM TS1120 and LTO tape systems. For the most part,System i machines stood separate from the mainframe, and the rest of the Linux, UNIX and Windows distributed serverson the data center floor.
Often, when I am talking to customers, they ask when will product xyz be supported on System z or System i?I explained that IBM's strategy is not to make all storage devices connect via ESCON/FICON or support non-standard block sizes, but rather to get the servers to use standard 512-byte block size, Fibre Channel and other standard protocols.(The old adage applies: If you can't get Mohamed to move to the mountain, get the mountain to move to Mohamed).
On the System z mainframe, we are 60 percent there, allowing three of the five operating systems (z/VM, z/VSE and Linux) to access FCP-based disk and tape devices. (Four out of six if you include [OpenSolaris for the mainframe])But what about System i? As the characters on the popular television show [LOST] would say: It's time to get off the island!
Last week, IBM announced the new [i5/OS V6R1 operating system] with features that will greatly improve the use of external storage on this platform. Check this out:
POWER6-based System i 570 model server
Our latest, most powerful POWER processor brought to the System i platform. The 570 model will be the first in the System i family of servers to make use of new processing technology, using up to 16 (sixteen!) POWER6 processors (running at 4.7GHZ) in each machine.The advantage of the new processors is the increased commercial processing workload (CPW) rating, 31 percent greater than the POWER5+ version and 72 percent greater than the POWER5 version. CPW is the "MIPS" or "TeraFlops" rating for comparing System i servers.Here is the[Announcement Letter].
Fibre Channel Adapter for System i hardware
That's right, these are [Smart IOAs], so an I/O Processor (IOP) is no longer required! You can even boot the Initial Program Load (IPL) direclty from SAN-attached tape.This brings System i to the 21st century for Business Continuity options.
Virtual I/O Server (VIOS)
[VirtualI/O Server] has been around for System p machines, but now available on System i as well. This allows multiplelogical partitions (LPARs) to access resources like Ethernet cards and FCP host bus adapters. In the case of storage, the VIOS handles the 520-byte to 512-byte conversion, so that i5/OS systems can now read and write to standard FCP devices like the IBM System Storage DS4800 and DS4700 disk systems.
IBM System Storage DS4000 series
Initially, we have certified DS4700 and DS4800 disk systems to work with i5/OS, but more devices are in plan.This means that you can now share your DS4700 between i5/OS and your other Linux, UNIX and Windowsservers, take advantage of a mix of FC and SATA disk capacities, RAID6 protection, and so on.
To call [IBM PowerVM] the "VMware for the POWER architecture" would not do it quite justice. In combination with VIOS, IBM PowerVM is able to run a variety of AIX, Linux and i5/OS guest images.The "Live Partition Mobility" feature allows you to easily move guest images from one system to another, while they are running, just like VMotion for x86 machines.
And while we are on the topic of x86, PowerVM is also able to represent a Linux-x86 emulation base to run x86-compiled applications. While many Linux applications could be re-complied from source code for the POWER architecture "as is", others required perhaps 1-2 percent modification to port them over, and that was too much for some software development houses. Now, we can run most x86-compiled Linux application binaries in their original form on POWER architecture servers.
BladeCenter JS22 Express
The POWER6-based [JS22 Express blade] can run i5/OS, taking advantage of PowerVM and VIOS to access all of the BladeCenterresources. The BladeCenter lets you mix and match POWER and x86-based blades in the same chassis, providing theultimate in flexibility.
This week I am in Chicago for the IBM Storage and Storage Networking Symposium, which coincides with the System x and BladeCenter Technical Conference. This allows the 800 attendees to attend both storage or server presentations at their convenience. There were hundreds of sessions, over 20 time slots, so for each time slot, you have 15 or so topics to choose from.Mike Kuhn kicked off the series of keynote sessions. Here's my quick recap of each one:
Curtis Tearte, General Manager, IBM System Storage
Curtis replaced Andy Monshaw as General Manager for IBM System Storage. His presentation focused on how storage fits into IBM's Dynamic Infrastructure strategy. Some interesting points:
a billion camera-enabled cell phones were sold in 2007, compared to 450 million in 2006.
IBM expects that there will be 2 billion internet users by 2011, as well as trillions of "things".
In the US, there were 2.2 million medical pharmacy dispensing errors resulting for handwritten prescriptions.
Time wasted looking for parking spaces in Los Angeles consumed 47,000 gallons of gasoline, and generated 730 tons of carbon dioxide.
In the US, 4.2 billion hours are lost, and 2.9 billion gallons of gas consumed, due to traffic congestion.
Over the past decade, servers went from 8 watts to 100 watts per $1000 US dollars.
Data growth appears immune to the economic recession. The digital footprint per person is expected to grow from 1TB today to over 15TB by 2020.
10 hours of YouTube videos are uploaded every minute.
Bank of China manages 380 million bank accounts, processing over 10,000 transactions per second.
At the end of the session, Curtis transitioned from demonstrating his knowledge and passion of storage to his knowledge and passion in his favorite sport: baseball. Chicago is home to both the Cubs and the White Sox.
Roland Hagan, Vice President Business Line Executive, System x
IBM sets the infrastructure agenda for the entire industry. The Dynamic Infrastructure initiative is not just IT, but a complete end-to-end view across all of the infrastructures in play, including transportation, manufacturing, services and facilities.Companies spent over $60 billion US dollars on servers last year. Of these, 53 percent for x86-based servers, 9 percent for Itanium-based, 26 percent for RISC-based (POWER6, SPARC, etc.), and 11 percent mainframe. Theeconomic downturn has impacted revenues, but the percentages continue about the same.
The dominant deployment model remains one application per server. As a result, power, cooling and management costs have grown tremendously. There are system admins opposed to consolidating server images with VMware, Hyper-V, Xen or other server virtualizaition technologies. Roland referred to these admins as "server huggers".To help clients adopt cloud computing technologies, IBM introduced [Cloudburst] appliances. IBM plans to offer specialized versions for developers, for service providers, and for enterprises.
IBM's Enterprise-X Architecture is what differentiates IBM's x86-based servers from all the competitors, surrounding Intel and AMD processors with technology that provides distinct advantages. For example, to support server virtualization, IBM's eX4 provides support for more memory, which often is more critical than CPU resources when deploying large number of guest OS images. IBM System x servers have an integrated management module (IMM) and was the first to change over from BIOS to the new Unified Extensible Firmware Interface [UEFI] standard.
IBM servers offer double the performance, consume half the power, and cost a third less to manage, than comparably priced servers from competitors. Of the top 20 more energy efficient server deployments, 19 are from IBM. Roland cited customer reference SciNet, a 4000-server supercomputer with 30,000 cores based on IBM [iDataPlex] servers. At 350 TeraFLOPs it is ranked #16 fastest supercomputer in the world, and #1 in Canada. With apower usage effectiveness (PUE) less than 1.2, it also is very energy efficient. This means that for every 12 watts of electricity going in to the data center, 10 watts are used for servers, storage and networking gear, andonly 2 watts used for power and cooling. Traditional data centers have PUE around 2.5, consuming 25 watts total for every 10 watts used by servers, storage and networking gear.
Clod Barrera, Distinguished Engineer, Chief Technical Strategist for IBM System Storage
Clod presented trends and directions for disk and tape technology, disk and tape systems, and the direction towards cloud computing.
Continuing this week's theme on Cloud Computing, Dynamic Infrastructure and Data Center Networking, IBM unveiled details of an advanced computing system that will be able to compete with humans on Jeopardy!, America’s favorite quiz television show. Additionally, officials from Jeopardy! announced plans to produce a human vs. machine competition on the renowned show.
For nearly two years, IBM scientists have been working on a highly advanced Question Answering (QA) system, codenamed "Watson" after IBM's first president, [Thomas J. Watson]. The scientists believe that the computing system will be able to understand complex questions and answer with enough precision and speed to compete on Jeopardy!Produced by Sony Pictures Television, the trivia questions on Jeopardy! cover a broad range of topics, such as history, literature, politics, film, pop culture, and science. It poses a grand challenge for a computing system due to the variety of subject matter, the speed at which contestants must provide accurate responses, and because the clues given to contestants involve analyzing subtle meaning, irony, riddles, and other complexities at which humans excel and computers traditionally do not. Watson will incorporate massively parallel analytical capabilities and, just like human competitors, Watson will not be connected to the Internet or have any other outside assistance.
If this all sounds familiar, you might remember some of the events that have led up to this:
In 1984, the movie ["The Terminator"] introduced the concept of [Skynet], a fictional computer system developed by the militarythat becomes self-aware from its advanced artificial intelligence.
In 1997, an IBM computer called Deep Blue defeated World Chess Champion [Garry Kasparov] in a famous battle of human versus machine. To compete at chess, IBM built an extremely fast computer that could calculate 200 million chess moves per second based on a fixed problem. IBM’s Watson system, on the other hand, is seeking to solve an open-ended problem that requires an entirely new approach – mainly through dynamic, intelligent software – to even come close to competing with the human mind. Despite their massive computational capabilities, today’s computers cannot consistently analyze and comprehend sentences, much less understand cryptic clues and find answers in the same way the human brain can.
In 2005, Ray Kurzweil wrote [The Singularity is Near] referring to the wonders that artificial intelligence will bring to humanity.
The research underlying Watson is expected to elevate computer intelligence and human-to-computer communication to unprecedented levels. IBM intends to apply the unique technological capabilities being developed for Watson to help clients across a wide variety of industries answer business questions quickly and accurately.
Use more efficient disk media, such as high-capacity SATA disk drives
Both are great recommendations, but why limit yourself to what EMC offers? Your x86-based machines are only a subset of your servers,and disk is only a subset of your storage. IBM takes a more holistic approach, looking at the entire data center.
VMware is a great product, and IBM is its top reseller. But in addition to VMware, there are other solutions for the x86-based servers, like Xen and Microsoft Virtual Server. IBM's System p, System i, and System z product lines all support logical partitioning.
To compare the energy effectiveness of server virtualization, consider a metric that can apply across platforms. For example, for an e-mail server, consider watts per mailbox. If you have, say, 15,000 users, you can calculate how many watts you are consuming to manage their mailboxes on your current environment, and compare that with running them on VMware, or logical partitions on other servers. Some people find it surprising that it is often more cost-effective, and power-efficient, to run workloads on mainframe logical partitions (LPARs) than a stack of x86 servers running VMware.
More efficient Media
SATA and FATA disks support higher capacities, and run at slower RPM speeds, thus using fewer watts per terabyte.A terabyte stored on 73GB high-speed 15K RPM drives consumes more watts than the same terabyte stored using 500GB SATA.Chuck correctly identifies that tape is more power-efficient than disk, but then argues that paper is more power-efficient than tape. But paper is not necessarily more efficient than tape.
ESG analyst Steve Duplessie divides up data betweenDynamic vs. Persistent. The best place to put dynamic data is on disk, and here is where evaluation of FC/SAS versus SATA/FATA comes into play.Persistent data, on the other hand, can be stored on paper, microfiche, optical or tape media. All of these shelf-resident media consume no electricity, nor generate any heat that would require additional cooling.
A study by scientists at the Lawrence Berkeley National Laboratory titled High-Tech Means High-Efficiency: The Business Case for Energy Management in High-Tech Industries indicates thatData centers consume 15 to 100 times more energy per square foot than traditional office space. Storing persistent data in traditional office space can save a huge amount of energy. Steve Duplessie feels the ratio of dynamic to persistent data is 1:10 today, but is likely to grow to 1:100 in the near future, raising the demand for energy-efficient storage of persistent data ever more important to our environment.
Data centers consume nearly 5000 Megawatts in the USA alone, 14000 Megawatts worldwide. To put that in perspective, the country of Hungary I was in last week can generate up to 8000 Megawatts for the entire country (and they were using 7400 Megawatts last week as a result of their current heat wave, causing them grave concern).
Back in the 1990's, one of the insurance companies IBM worked with kept data on paper in manila folders, and armiesof young adults in roller skates were dispatched throughout the large warehouses of shelves to get the appropriate folder in response to customer service inquiries. Digitizing this paper into electronic format greatly reduced the need for this amount of warehouse space, as well as improved the time to retrieve the data.
A typical file storage box (12 inch x 12 inch x 18 inch) containing typed pages single-spaced, double-sided, 12 point font could hold perhaps 100MB. The same box could hold a hundred or more LTO or 3592 tape cartridges, each storing hundreds of GB of information. That's a million-to-one improvement of space-efficiency, and from a watts-per-TB basis, translates to substantial improvement in standard office air conditioning and lighting conditions.
To learn more about IBM's Project Big Green, watch thisintroductory video which used Second Life for the animation.
In North America, today marks the start of the "Give 1 Get 1" program.
Children using the XO laptop
I first learned from this when I was reading about Timothy Ferriss' [LitLiberation project] on his [Four Hour Work Week] blog, and was surfing around for related ideas, and chanced upon this. I registered for a reminder, and it came today(the reminder, not the laptop itself).
Here's how the program works. You give $399 US dollars to the "One Laptop per Child" (OLPC)[laptop.org] organization for two laptops: One goes to a deserving child ina developing country, the second goes to you, for your own child, or to donate to a localcharity that helps children. This counts as a $199 purchase plus a $200 tax-deductible donation.For Americans, this is a [US 501(c)(3)] donation, and for Canadians and Mexicans, take advantage of the low-value of the US dollar!
If your employer matches donations, like IBM does, get them to match the $200donation for a third laptop, which goes to another child in a developing country. As for shipping, you pay only for the shipping of the one to you, each receiving country covers their own shipping. In my case, the shipping was another $24 US dollars for Arizona.No guarantees that it will arrive in time for the holidays this December, but it might.
To sweeten the deal, T-mobile throws in a year's worth of "Wi-Fi Hot Spot"that you can use for yourself, either with the XO laptop itself, or your regular laptop, iPhone, or otherWi-Fi enabled handheld device.
National Public Radio did a story last week on this:[The $100 Laptop Heads for Uganda]where they interview actor [Masi Oka], best known from the TV show ["Heroes"], who has agreed to be their spokesman.At the risk of sounding like their other spokesman, I thought I would cover the technology itself, inside the XO,and how this laptop represents IBM's concept of "Innovation that matters"!
The project was started by [Nicholas Negroponte] from [MIT University] as the "$100 laptop project". Once the final designwas worked out, it turns out it costs $188 US dollars to make, so they rounded it up to $200. This is stillan impressive price, and requires that hundreds of thousands of them be manufactured to justify ramping upthe assembly line.
Two of IBM's technology partners are behind this project. First is Advanced Micro Devices (AMD) that providesthe 433Mhz x86 processor, which is 75 percent slower than Thinkpad T60. Second is Red Hat,as this runs lean Fedora 6 version of Linux. Obviously, you couldn't have Microsoft Windows or Apple OS X, as both require significantly more resources.
The laptop is "child size", and would be considered in the [subnotebook] category. At 10" x 9" x 1.25", it is about the size of class textbook,can be carried easily in a child's backpack, or carried by itself with the integrated handle. When closed, it is sealedenough to be protected when carried in rain or dust storms. It weighs about 3.5 pounds, less than the 5.2 pounds of myThinkpad T60.
The XO is "green", not just in color, but also in energy consumption.This laptop can be powered by AC, or human power hand-crank, with workin place to get options for car-battery or solar power charging. Compared to the 20W normally consumed bytraditional laptops, the XO consumes 90 percent less, running at 2W or less. To accomplish this, there is no spinning disk inside. Instead, a 1GB FLASH drive holds 700MB of Linux, and gives you 300MB to hold your files. There isa slot for an MMC/SD flash card, and three USB 2.0 ports to connect to USB keys, printers or other remote I/O peripherals.
The XO flips around into three positions:
Standard laptop position has screen and keyboard. The water-tight keyboard comes in ten languages:International/English, Thai, Arabic, Spanish, Portuguese, West African, Urdu, Mongolian, Cyrillic, and Amharic.(I learned some Amharic, having lived five years with Ethiopians.)There does not appear be a VGA port, so don't be thinking this could be used as an alternative to project Powerpoint presentations onto a big screen.
Built-in 640x480 webcam, microphone and speakers allow the XO to be used as a communication device. Voice-over-IP (VOIP) client software, similar to Skype or [IBM Lotus Sametime], is pre-installed for this purpose.
The basic built-in communication are 802.1g (54Mbs) that you can use to surf the web usingthe Wi-Fi at your local Starbucks; and 802.1s which forms a "mesh network" with other XO laptops, and can surf theweb finding the one laptop nearby that is connected to the internet to share bandwidth. This eliminates the need to build a separate Wi-Fi hub at the school. There are USB-to-Ethernet and USB-to-Cellular converters, so that might be an alternative option.
Flipped vertically, the device can be read like a book.The screen can be changed between full-color and black-white, 200 dpi, with decent 1200x900 pixel resolution. The full-color is back-lit, and can be read in low-lighting. The black-white is not back-lit, consumes much less power, andcan be read in bright sunlight. In that regards, it is comparable to other [e-book devices], like a Cybook or Sony Reader.
Software includes a web-browser, document reader, word processor and RSS feed reader to read blogs.The OLPC identifies all of the software, libraries and interfaces they use, so that anyone that wants to developchildren software for this platform can do so.
With the keyboard flipped back, the 6" x 4.5" screen has directional controls and X/Y/A/B buttons to run games. This would make it comparable to a Nintendo DS or Playstation Portable (PSP). Again, the choice between back-lit color,or sunlight black-white screen modes apply. Some games are pre-installed.
So for $399, you could buy a Wi-Fi enabled[16GB iPod Touch] for yourself, which does much the same thing, or you can make a difference in the world.I made my donation this morning, and suggest you--my dear readers in the US, Canada and Mexico--consider doing the same.Go to [www.laptopgiving.org] for details.
For those of us in the northern hemisphere, yesterday was this year's Winter Solstice, representingthe shortest amount of daylight between sunrise and sunset. So today, I thought I would blog on my thoughtsof managing scarcity.
Earlier in my career, I had the pleasure to serve as "administrative assistant" to Nora Denzel for the week at a storage conference. My job was to make her look good at the conference, which if you know Nora, doesn't take much. Later, she left IBM to work at HP, and I gotto hear her speak at a conference, and the one thing that I remember most was her statement that thewhole point of "management" was to manage scarcity, as in not enough money in the budget,not enough people to implement change, or not enough resources to accomplish a task.(Nora, I have no idea where you are today, so if you are reading this, send me a note).
Of course, the flip-side to this is that resources that are in abundance are generallytaken for granted. Priorities are focused on what is most scarce. Let's examine some of theresources involved in an IT storage environment:
Capacity - while everyone complains that they are "running out of space", the truth is that most external disk attached to Linux, UNIX, or Windows systems contain only 20-40% data. Many years ago, I visitedan insurance company to talk about a new product called IBM Tivoli Storage Manager. This company had 7TB of disk on their mainframe,and another 7TB of disk scattered on various UNIX and Windows machines. In the room were TWO storage admins for
the mainframe, and 45 storage admins for the distributed systems. My first question was "why so many people forthe mainframe, certainly one of you could manage all of it yourself, perhaps on Wednesday afternoons?" Their response was that they acted as eachother's backup, in case one goes on vacation for two weeks. My follow-up question to the rest of the audience was:"When was the last time you took two weeks vacation?" Mainframes fill their disk and tape storage comfortablyat over 80-90% full of data, primarily because they have a more mature, robust set of management software, likeDFSMS.
Labor - by this I mean skilled labor able to manage storage for a corporation. Some companies I have visitedkeep their new-hires off production systems for the first two years, working only on test or development systemsonly until then. Of course, labor is more expensive in some countries than others. Last year, I was doing a whiteboard session on-site for a client in China, and the last dry-erase pen ran out of ink. I asked for another pen, and they instead sent someone to go re-fill it. I asked wouldn't it be cheaper just to buy another pen, and they said "No, labor is cheap, but ink is expensive." Despite this, China does complain that there is a shortage of askilled IT labor force, so if you are looking for a job, start learning Mandarin.
Power and Cooling - Most data centers are located on raised floors, with large trunks of electrical power and hugeair conditioning systems to deal with all the heat generated from each machine. I have visited the data centers ofclients that are forced now to make decisions on storage based on power and cooling consumption, because the coststo upgrade their aging buildings are too high. Leading the charge is IBM, with technology advancements in chips, cards, and complete systems that use less power, and generate less heat. While energy is still fairly cheap in the grand scheme of things, fears ofGlobal Warmingand declining oil supplies, the costs ofpower and cooling have gotten some news lately. In 1956, Hubbert predicted US would reach peak oil supplies by1965-1970 (it happened in 1971), and this year Simmonsestimated that world-wide oil production began its decline already in 2005. Smart companies like Google have movedtheir server farms to places like Oregon in the Pacific Northwest for cheaper hydroelectric power.
Bandwidth - Last year IBM introduced 4Gbps Fibre Channel and FICON SAN networking gear, along with the servers and storage needed to complete the solution. 4Gbps equates to about 400 MB/sec in data throughput. By comparison, iSCSI is typically run on 1Gbps Ethernet, but has so much overheads that you only get abour 80 MB/sec. Next year, we may see both 8 Gbps SAN, and 10 GbE iSCSI, to provide 800 MB/sec throughputs. My experience is that the SAN is not the bottleneck, instead people run out of bandwidth at the server or storage end first. They may not have a million dollars to buy the fastest IBM System p5 servers, or may not have enough host adapters at the storage system end.
Floorspace - I end with floorspace because it reminds me that many "shortages" are temporary or artificially created. Floorspace is only in short supply because you don't want to knock down a wall, or build a new building, to handle your additional storage requirements.In 1997, Tihamer Toth-Fejel wrote an article for the National Space Society newsletter that estimated that ...Everybody on Earth could live comfortably in the USA on only 15% of our land area, with a population density between that of Chicago and San Francisco. Using agricultural yields attained widely now, the rest of the U.S. would be sufficient to grow enough food for everyone. The rest of the planet, 93.7% of it, would be completely empty.Of course, back in 1997 the world population was only 5.9 billion, and this year it is over 6.5 billion.
This last point brings me back to the concept of food, and I am not talking about doughnuts in the conference room, or pizza while making year-end storage upgrades. I'm talking aboutthe food you work so hard to provide for yourself and your family. The folks at Oxfam came up with a simpleanalogy. If 20 people sit down at your table, representing the world’s population:
3 would be served a gourmet, multi-course meal, while sitting at decorated table and a cushioned chair.
5 would eat rice and beans with a fork and sit on a simple cushion
12 would wait in line to receive a small portion of rice that they would eat with their hands while sitting on the floor.
So for those of you planning a special meal next Monday, be thankful you are one of the lucky three, and hopefulthat IBM will continue to lead the IT industry to help out the other seventeen.
Federal Rules for Civil Procedures (FRCP) will increase adoption of unstructured data classification, email archive systems and CAS.
CAS continues to flounder, but the rest I can agree with. Regulations are being adopted world wide. Japan has its own Sarbanes-Oxley (SOX) style legislation go into effect in 2008.IBM TotalStorage Productivity Center for Data is a great tool to help classify unstructured file systems. IBM CommonStore for email supports both Microsoft Exchange and Lotus Domino, and can be connected to IBM System Storage DR550 for compliance storage.
Unified storage systems (combined file and block storage target systems) will become increasingly attractive in 2007, because of their ease of use and simplicity.
I agree with this one also. Our sales of IBM N series in 2006 was great, and looking to continue its strong growth in 2007. The IBM N series brings together FCP, iSCSI and NAS protocols into one disk system. With the SnapLock(tm) feature, N series can store both re-writable data, as well as non-erasable, non-rewriteable data, on the same box. Combine the N series gateway on the front-end with SAN Volume Controller on the back-end, and you have an even more powerful combination.
Distributed ROBO backup to disk will emerge as the fastest growing data protection solution in 2007.
IDC had a similar prediction for 2006. ROBO refers to "Remote Office/Branch Office", and so ROBO backup deals with how to back up data that is out in the various remote locations. Do you back it up locally? or send it to a central location?Fortunately, IBM Tivoli Storage Manager (TSM) supports both ways, and IBM has introduced small disk and tape drives and auto-loaders that can be used in smaller environments like this. I don't know whether "backup to disk" will be the fastest growing, but I certainly agree that a variety of ROBO-related issues will be of interest this year.
2007 will be remembered as the year iSCSI SAN took off because of the much reduced pricing for 10 Gbit iSCSI and the continued deployment of 10 Gbit iSCSI targets.
While I agree that iSCSI is important, I can't say 2007 will be remembered for anything.We have terrible memory in these things. Ask someone what year did Personal Computers (PC) take off, and they will tell you about Apple's famous 1984 commercial. Ask someone when the Internet took off, cell phones took off, etc, and I suspect most will provide widely different answers, but most likely based on their own experience.
For the longest time, I resisted getting a cell phone. I had a roll of quarters in my car, and when I needed to make a call, I stopped at the nearby pay-phone, and made the call. In 1998, pay phones disappeared. You can't find them anymore. That was the year of the cell phones took off, at least for me.
Back to iSCSI, now that you can intermix iSCSI and SAN on the same infrastructure, either through intelligent multi-protocol switches available from your local IBM rep, or through an N series gateway, you can bring iSCSI technology in slowly and gradually. Low-cost copper wiring for 10 Gbps Ethernet makes all this very practical.
Another up-and-coming technology is AoE, or ATA-over-Ethernet. Same idea as iSCSI, but taken down to the ATA level.
CDP will emerge as an important feature on comprehensive data protection products instead of a separate managed product.
Here, CDP stands for Continuous Data Protection. While normal backups work like a point-and-shoot camera, taking a picture of the data once every midnight for example. CDP can record all the little changes like a video camera, with the option to rewind or fast-forward to a specific point in the day. IBM Tivoli CDP for Files, for example, is an excellent complement to IBM Tivoli Storage Manager.
The technology is not really new, as it has been implemented as "logs" or "journals" on databases like DB2 and Oracle, as well as business applications like SAP R/3.
The prediction here, however, relates to packaging. Will vendors "package" CDP into existing backup products, possibly as a separately priced feature, or will they leave it as a separate product that perhaps, like in IBM's case, already is well integrated.
The VTL market growth will continue at a much reduced rate as backup products provide equivalent features directly to disk. Deduplication will extend the VTL market temporarily in 2007.
VTL here refers to Virtual Tape Library, such as IBM TS7700 or TS7510 Virtualization Engine. IBM introduced the first one in 1997, the IBM 3494 Virtual Tape Server, and we have remained number one in marketshare for virtual tape ever since. I find it amusing that people are now just looking at VTL technology to help with their Disk-to-Disk-to-Tape (D2D2T) efforts, when IBM Tivoli Storage Manager has already had the capability to backup to disk, then move to tape, since 1993.
As for deduplication, if you need the end-target box to deduplicate your backups, then perhaps you should investigatewhy you are doing this in the first place? People take full-volume backups, and keep to many copies of it, when a more sophisticated backup software like Tivoli Storage Manager can implement backup policies to avoid this with a progressive backup scheme. Or maybe you need to investigate why you store multiple copies of the same data on disk, perhaps NAS or a clustered file system like IBM General Parallel File System (GPFS) could provide you a single copy accessible to many servers instead.
The reason you don't see deduplication on the mainframe, is that DFSMS for z/OS already allows multiple servers to share a single instance of data, and has been doing so since the early 1980s. I often joke with clients at the Tucson Executive Briefing Center that you can run a business with a million data sets on the mainframe, but that there wereprobably a million files on just the laptops in the room, but few would attempt to run their business that way.
Optical storage that looks, feels and acts like NAS and puts archive data online, will make dramatic inroads in 2007.
Marc says he's going out on a limb here, and that's good to make at least one risky prediction. IBM used to have anoptical library emulate disk, called the IBM 3995. Lack of interest and advancement in technology encouraged IBM to withdraw it. A small backlash ensued, so IBM now offers the IBM 3996 for the System p and System i clients that really, really want optical.
As for optical making data available "online", it takes about 20 seconds to load an optical cartridge, so I would consider this more "nearline" than online. Tape is still in the 40-60 second range to load and position to data, so optical is still at an advantage.
Optical eliminates the "hassles of tape"? Tape data is good for 20 years, and optical for 100 years, but nobody keeps drives around that long anyways. In general, our clients change drives every 6-8 years, and migrate the data from old to new. This is only a hassle if you didn't plan for this inevitable movement. IBM Tivoli Storage Manager, IBM System Storage Archive Manager, and the IBM System Storage DR550 all make this migration very simple and easy, and can do it with either optical or tape.
The Blue-ray vs. DVD debate will continue through 2007 in the consumer world. I don't see this being a major player in more conservative data centers where a big investment in the wrong choice could be costly, even if the price-per-TB is temporarily in-line with current tape technologies. IBM and others are investing a lot of Research and Development funding to continue the downward price curve for tape, and I'm not sure that optical can keep up that pace.
Well, that's my take. It is a sunny day here in China, and have more meetings to attend.
Continuing my week in Chicago, I decided to attend some of the presentations from the System x side. This is the advantage of running both conferences in the same hotel, attendees can choose how many of each they want to participate in.
Wayne Wigley, IBM Advanced Technical Support (ATS), presented a series of presentations on different server virtualization offerings available for System x and BladeCenter servers. I am very familiar with virtualization implemented on System z mainframes, as well as IBM's POWER systems, and have working knowledge of Linux KVM and Xen, so I was well prepared to handle hearing the latest about Microsoft's Hyper-V and VMware's Vsphere version 4.
Microsoft Hyper-V 2008
Hyper-V can run as part of Windows 2008, are standalone on its own.Different levels of Windows 2008 include licenses for different number of Windows virtual machines (VMs).Windows Server 2008 Standard includes 1 Windows VM, Enterprise includes 4 Windows VMs, and the Datacenter edition includes unlimited number of Windows VMs. If you want to run more Window VMs than come included, you need to pay extra for each additional one. For example, to run 10 Windows VMs on a 2-socket server would cost about $9000 US dollars on Standard but only $6000 US dollars on Datacenter edition (list prices from Microsoft Web site).
Unlike VMware, which takes a monolithic approach as hypervisor, Hyper-V is more like Xen with a microkernelized approach. This means you need a "parent" guest OS image, and the rest of the Guest OS images are then considered "child" images.These child images can be various levels of Windows, from Windows XP Pro to Windows Server 2008, Xen-enabled Linux, or even a non-hypervisor-aware OS.The "parent" guest OS image provides networking and storage I/O services to these "child" images.For the hypervisor-aware versions of Windows and Linux, Hyper-V allows optimized access to the hypervisor, "synthetic devices", and hypercalls. Synthetic devices present themselves as network devices, but only serve to pass data along the VMBus to other networking resources. This process does not require software emulation, and therefore offers higher performance for virtual machines and lower host system overhead.For non-hypervisor-aware OS images, Hyper-V provides device emulation through the "parent" image, which is slower.
Microsoft System Center Virtual Machine Manager (SCVMM) can manage both Hyper-V and VMware VI3 images.Wayne showed various screen shots of the GUI available to manage Hyper-V images.In standalone mode, you lose the nice GUI and management console.
Hyper-V supports external, internal and private virtual LANs (VLAN). External means that VMs can communicate with the outside world over standard ethernet connections. Internal means that VMs can communicate with "parent" and "child" guest images on the same server only. Private means that only "child" guests can communicate with other "child" images.
Hyper-V supports disk attached via IDE, SATA, SCSI, SAS, FC, iSCSI, NFS and CIFS. One mode is "Virtual Hard Disk" (VHD) similar to VMware VMDK files. The other is "pass through" mode, which are actual disk LUNs accessed natively. VHDs can be dynamic (thin provisioned), fixed (fully allocated), or differencing. The concept of differencing is interesting, as you start with a base read-only VHD volume image, and have a separate "delta" file that contains changes from the base image.
Some of the key features of Hyper-V 2008 are:
Being able to run concurrently 32-bit and 64-bit versions of Linux and Windows guest images
Support for 64 GB of memory and 4-way symmetric multiprocessing (SMP) per VM
Clustering for High Availability and Quick Migration of VM images
Live backup with integration with Microsoft's Volume Shadow Copy Services (VSS)
Virtual LAN (VLAN) support, and Virtual and Pass-through physical disk support
A clever VMbus, virtual service parent/client approach to sharing hardware
Optimized performance options for hypervisor-aware versions of Windows and Linux, and emulated supportfor non-hypervisor-aware OS images.
VMware Vsphere v4.0
This was titled as an "Overview" session, but really was an "Update" session on the newest features of this release. The big change appears to be that VMware added "v" in front of everything.
Under vCompute, there are some new features on VMware's Distributed Resource Scheduler (DRS) which includes recommended VM migrations. Dynamic Power Management (DPM) will move VMs during periods of low usageto consolidate onto fewer physical servers so as to reduce energy consumption.
Under vStorage, vSphere introduces an enhanced Plugable Storage Architecture (PSA), with supportfor Storage Array Type Plugins (SATP) and Path Selection Plugins (PSP). This vStorage API allows forthird party plugins for improved fault-tolerance and complex I/O load balancing algorithms. This releasealso has improved support for iSCSI, including Challenge-Handshake Authentication Protocol (CHAP) support.Similar to Hyper-V's dynamic VHD, VMware supports "thin provisioning" for their virtual disk VMDK files.A feature of "Storage Vmotion" allows conversion between "thick" and "thin" provisioning formats.
The vStorage API for Data Protection provide all the features of VMware Consolidated Backup (VCB). The APIprovides full, incremental and differential file-level backups for Windows and Linux guests, including supportfor snapshots and Volume Shadow Copy Services (VSS) quiescing.
VMware introduces direct I/O pass-through for both NIC and HBA devices. While thisallows direct access to SAN-attached LUNs similar to Hyper-V, you lose a lot of features like Vmotion, High Availability and Fault Tolerance. Wayne felt that these restrictions are temporary, that hopefully VMwarewill resolve this over the next 12 months.
Under vNetwork, VMware has virtual LAN switches called vSwitches. This includes support for IPv6and VLAN offloading.
The vSphere server can now run with up to 1TB of RAM and 64 logical CPUs to support up to 320 VM guest images.Each VM can have up to 255GB RAM and up to 8-way SMP.Vsphere ESX 4 introduces a new virtual hardware platform called VM Hardware v7. While Vsphere 4.0 can run VMs from ESX 2 and ESX 3, the problem is if you have new VMs based on this newer VM Hardware v7, you cannot run them on older ESX versions.
Vsphere comes in four sizes: Standard, Advanced, Enterprise, and Enterprise Plus, ranging in list price from $795 US dollars to $3495 US dollars.
While IBM is the #1 reseller of VMware, we also are proud to support Hyper-V, Xen, KVM and other similar products.Analysts expect most companies will have two or more server virtualization solutions in their data center, and it is good to see that IBM supports them all.
Wrapping up this week's theme on ways to make the planet smarter, and less confusing, I present IBM's third annual [five in five]. These are five IBM innovations to watch over the next five years, all of which have implications on information storage. Here is a quick [3-minute video] that provides the highlights:
Yesterday's post [Software Programmers as Bees]was not meant as "career advice", but certainly I got some interesting email as if it was.Orson Scott Card was poking fun at the culture clash between software programmers andmanagement/marketers, and I gave my perspective, having worked both types of jobs.
This is June. Many students are graduating from high school or college and lookingfor jobs. Some of these might be jobs just for the summer to make some spending money,and others mights be jobs like internships to explore different career paths. I found both programming and marketing are rewarding and interesting work, but each person is different.
There are a variety of ways to find out what your personality traits are,and then focus on those jobs or career paths that are best for those strengths. Hereis an online [Typology Test] based onthe work of psychologists Carl Jung and Isabel Myers-Briggs. The result is a four-letterscore that represents 16 possible personalities. For example, mine is "ENTP",which stands for "Extroverted, Intuitive, Thinking, Perceiving". You can find out otherfamous people that match your personality type. For ENTP, I am lumped together withfellow master inventor Thomas Edison, fellow author Lewis Carrol (Alice in Wonderland), Cooking great Julia Child, Comedians George Carlin and Rodney Dangerfield (I get no respect!),movie director Alfred Hitchcock, and actor Tom Hanks.
USA Today had an article ["CEOsvalue lessons from teen jobs"] which offers some career advice from successful business people.Of course, what worked for them may not work for you, all based on different personality types. Hereis an excerpt of the advice I thought the most useful:
"If you are committed, you will be successful." (unfortunately, the reverse is also true: if you are successful,you will be asked to move to a different job)
"Tackle offbeat jobs. Challenge conventional wisdom within reason. Come into contact with people from all walks of life."
"Show an interest, demonstrate you want to be on the job."
"Never limit yourself. Look beyond to what needs to be done, or should be done. Then do it. Stretch. Go beyond what others expect."
"Find a job that forces you to work effectively with people. No matter what you end up doing, dealing with others will be critical."
"Bring your best to the table every day. Learn professional responsibility and how to handle difficult situations."
"Listen carefully to what customers want."
Before IBM, I ran my own business. If you are thinking, "Maybe I will start my own business instead?" you might want to see this advice from Venture Capitalist [Guy Kawasaki on Innovation].While running your own business has advantages, like avoiding issues "working for the man", it has somedisadvantages as well. It is certainly not as easy as some people make it seem to be.
Of course, things are a lot different nowadays than they were when these CEOs were teenagers. And the pace ofchange does not seem to be slowing down any either. Here is a presentation on [SlideShare.net] that helps bring to focus the realities of globalization:
Well it's Tuesday, which means its time to look at recent announcements.While I was on vacation last week, IBM made a lot of storage announcements October 23.Josh Krischer gives his summary on WikiBon [October 2007 Review].Austin Modine of the The Register went so far as to say that [IBM goes crazy with storage system updates].
IBM System Storage DS8000 series
This is "Release 3" software/microcode upgrades on our existing "Turbo" hardware.
IBM FlashCopy SE -- Here "SE" stands for Space Efficient. Rather than allocating a full 100% of the space for the FlashCopy destination, you can set aside just a fraction, and this will hold all the changed blocks, similar to whatIBM already offers on the DS4000 series.
Dynamic Volume Expansion -- In the past, if you needed more space for a LUN, you had to carve out a newer one elsewhere, and then copy the data over from the old to the new, leaving the old LUN around to be re-used or leftstranded. With this enhancement, you can just upgrade the LUN in place, making it bigger as needed, similar to whatIBM already offers on the DS4000 series and SAN Volume Controller. This applies to CKD volumes for the System zmainframe users out there as well.
Storage Pool Striping -- striping volumes across RAID ranks to eliminate or reduce hot-spots, and provide betterload balancing. Many used SAN Volume Controller in front of the DS8000 to do this, but now you can do it natively inthe DS8000 itself.
z/OS Global Mirror Multiple Reader -- for System z customers, "z/OS Global Mirror" is the new name for XRC. Thisenhancement improves the throughput of sending updates to the remote disaster recovery location.
DS Storage Manager enhancements, the element manager software has been enhanced, and is pre-installed on the new IBM System Storage Productivity Center, which I will talk about below.
Intermix of DS8000 machine types -- this is especially useful to allow new frames to have co-terminating warrantieswith the base units. In other words, as you expand your system, you can ensure that the entire chunk of iron runs outof warranty all at the same time, to simplify your decision making process to upgrade or contract for extended service.
One of the biggest complaints about IBM TotalStorage Productivity Center is that it is software that needs to beinstalled on its own server, and that this installation process can take a day or two. Why wait? Now you can havea hardware console that has the DS8000 Storage Manager software, SVC Admin Console software, and IBM TotalStorageProductivity Center "Basic Edition" pre-installed. Here are the key features.
Pre-installed and tested console
DS8000 R3 GUI integration
Cohabitation of SVC 4.2.1 GUI and CIMOM
Automated device discovery
Asset and capacity reporting, including tape library support
Our "Release 9" applies across the board, from N3000 to N5000 to N7000 series models, includingnew host bus adapters, and the new Data OnTAP 7.2.4 release level.
The Virtual File Manager (VFM) was announced as one of our latest [Storage Virtualization Solutions]. VFMprovides a global namespace that aggregates the file systems from Linux, UNIX, and Windows file servers, as well asN series storage, into a consolidated environment.
IBM's virtual tape library (VTL) for the distributed systems platform, has been enhanced to provide:
Up to 12TB of disk cache, using 750GB SATA disk.
F05 Tape Frames installed as TS7520 base units through a 32 port fibre channel switch
Support for LTO generation 4 tape drives, both as virtual tape drives and as physical tape drives within IBM automated tape libraries attached to the TS7520. This allows you to use Encryption capabilities of LTO4.
DS3000 series now supports SATA disk, and can be attached to AIX and Linux on System p servers. This appliesto the DS3200, DS3300 and DS3400 models.See the [DS3000 Announcement Letter] for more details.
Based on this success, and perhaps because I am also fluent in Spanish, I was asked to help with Proyecto Ceibal, the team for OLPC Uruguay. Normally theXS school server resides at the school location itself, so that even if the internet connection is disrupted or limited, the school kids can continue to access each other and the web cache content until internet connection is resumed.However, with a diverse developmentteam with people in United States, Uruguay, and India, we first looked to Linux hosting providers that wouldagree to provide free or low-cost monthly access. We spent (make that "wasted") the month of May investigating.Most that I talked to were not interested in having a customized Linux kernel on non-standard hardware on their shop floor, and wanted instead to offer their own standard Linux build on existing standard servers, managed by theirown system administrators, or were not interested in providing it for free. Since the XS-163 kernel is customizedfor the x86 architecture, it is one of those exceptions where we could not host it on an IBM POWER or mainframe as a virtual guest.
This got picked up as an [idea] for the Google's[Summer of Code] and we are mentoring Tarun, a 19-year-old student to actas lead software developer. However, summer was fast approaching, and we wanted this ready for the next semester. In June, our project leader, Greg, came up with a new plan. Build a machine and have it connected at an internet service provider that would cover the cost of bandwidth, and be willing to accept this with remote administration. We found a volunteer organization to cover this -- Thank you Glen and Vicki!
We found a location, so the request to me sounded simple enough: put together a PC from commodity parts that meet the requirements of the customizedLinux kernel, the latest release being called [XS-163]. The server would have two disk drives, three Ethernet ports, and 2GB of memory; and be installed with the customized XS-163 software, SSHD for remote administration, Apache web server, PostgreSQL database and PHP programming language.Of course, the team wanted this for as little cost as possible, and for me to document the process, so that it could be repeated elsewhere. Some stretch goals included having a dual-boot with Debian 4.0 Etch Linux for development/test purposes, an alternative database such as MySQL for testing, a backup procedure, and a Recover-DVD in case something goes wrong.
Some interesting things happened:
The XS-163 is shipped as an ISO file representing a LiveCD bootable Linux that will wipe your system cleanand lay down the exact customized software for a one-drive, three-Ethernet-port server. Since it is based on Red Hat's Fedora 7 Linux base, I found it helpful to install that instead, and experiment moving sections of code over.This is similar to geneticists extracting the DNA from the cell of a pit bull and putting it into the cell for a poodle. I would not recommend this for anyone not familiar with Linux.
I also experimented with modifying the pre-built XS-163 CD image by cracking open the squashfs, hacking thecontents, and then putting it back together and burning a new CD. This provided some interesting insight, but in the end was able to do it all from the standard XS-163 image.
Once I figured out the appropriate "scaffolding" required, I managed to proceed quickly, with running versionsof XS-163, plain vanilla Fedora 7, and Debian 4, in a multi-boot configuration.
The BIOS "raid" capability was really more like BIOS-assisted RAID for Windows operating system drivers. This"fake raid" wasn't supported by Linux, so I used Linux's built-in "software raid" instead, which allowed somepartitions to be raid-mirrored, and other partitions to be un-mirrored. Why not mirror everything? With two160GB SATA drives, you have three choices:
No RAID, for a total space of 320GB
RAID everything, for a total space of 160GB
Tiered information infrastructure, use RAID for some partitions, but not all.
The last approach made sense, as a lot of of the data is cache web page images, and is easily retrievable fromthe internet. This also allowed to have some "scratch space" for downloading large files and so on. For example,90GB mirrored that contained the OS images, settings and critical applications, and 70GB on each drive for scratchand web cache, results in a total of 230GB of disk space, which is 43 percent improvement over an all-RAID solution.
While [Linux LVM2] provides software-based "storage virtualization" similar to the hardware-based IBM System Storage SAN Volume Controller (SVC), it was a bad idea putting different "root" directories of my many OS images on there. With Linux, as with mostoperating systems, it expects things to be in the same place where it last shutdown, but in a multi-boot environment, you might boot the first OS, move things around, and then when you try to boot second OS, it doesn'twork anymore, or corrupts what it does find, or hangs with a "kernel panic". In the end, I decided to use RAIDnon-LVM partitions for the root directories, and only use LVM2 for data that is not needed at boot time.
While they are both Linux, Debian and Fedora were different enough to cause me headaches. Settings weredifferent, parameters were different, file directories were different. Not quite as religious as MacOS-versus-Windows,but you get the picture.
During this time, the facility was out getting a domain name, IP address, subnet mask and so on, so I testedwith my internal 192.168.x.y and figured I would change this to whatever it should be the day I shipped the unit.(I'll find out next week if that was the right approach!)
Afraid that something might go wrong while I am in Tokyo, Japan next week (July 7-11), or Mumbai, India the following week (July 14-18), I added a Secure Shell [SSH] daemon that runs automaticallyat boot time. This involves putting the public key on the server, and each remote admin has their own private key on their own client machine.I know all about public/private key pairs, as IBM is a leader in encryption technology, and was the first todeliver built-in encryption with the IBM System Storage TS1120 tape drive.
To have users have access to all their files from any OS image required that I either (a) have identical copieseverywhere, or (b) have a shared partition. The latter turned out to be the best choice, with an LVM2 logical volumefor "/home" directory that is shared among all of the OS images. As we develop the application, we might findother directories that make sense to share as well.
For developing across platforms, I wanted the Ethernet devices (eth0, eth1, and so on) match the actual ports they aresupposed to be connected to in a static IP configuration. Most people use DHCP so it doesn't matter, but the XSsoftware requires this, so it did. For example, "eth0" as the 1 Gbps port to the WAN, and "eth1/eth2" as the two 10/100 Mbps PCI NIC cards to other servers.Naming the internet interfaces to specific hardware ports wasdifferent on Fedora and Debian, but I got it working.
While it was a stretch goal to develop a backup method, one that could perform Bare Machine Recovery frommedia burned by the DVD, it turned out I needed to do this anyways just to prevent me from losing my work in case thingswent wrong. I used an external USB drive to develop the process, and got everything to fit onto a single 4GB DVD. Using IBM Tivoli Storage Manager (TSM) for this seemed overkill, and [Mondo Rescue] didn't handle LVM2+RAID as well as I wanted, so I chose [partimage] instead, which backs up each primary partition, mirrored partition, or LVM2 logical volume, keeping all the time stamps, ownerships, and symbolic links in tact. It has the ability to chop up the output into fixed sized pieces, which is helpful if you are goingto burn them on 700MB CDs or 4.7GB DVDs. In my case, my FAT32-formatted external USB disk drive can't handle files bigger than 2GB, so this feature was helpful for that as well. I standardized to 660 GiB [about 692GB] per piece, sincethat met all criteria.
The folks at [SysRescCD] saved the day. The standard "SysRescueCD" assigned eth0, eth1, and eth2 differently than the three base OS images, but the nice folks in France that write SysRescCD created a customized[kernel parameter that allowed the assignments to be fixed per MAC address ] in support of this project. With this in place, I was able to make a live Boot-CD that brings up SSH, with all the users, passwords,and Ethernet devices to match the hardware. Install this LiveCD as the "Rescue Image" on the hard disk itself, and also made a Recovery-DVD that boots up just like the Boot-CD, but contains the 4GB of backup files.
For testing, I used Linux's built-in Kernel-based Virtual Machine [KVM]which works like VMware, but is open source and included into the 2.6.20 kernels that I am using. IBM is the leadingreseller of Vmware and has been doing server virtualization for the past 40 years, so I am comfortable with thetechnology. The XS-163 platform with Apache and PostgreSQL servers as a platform for [Moodle], an open source class management system, and the combination is memory-intensive enough that I did not want to incur the overheads running production this manner, but it wasgreat for testing!
With all this in place, it is designed to not need a Linux system admin or XS-163/Moodle expert at the facility. Instead, all we need is someone to insert the Boot-CD or Recover-DVD and reboot the system if needed.
Just before packing up the unit for shipment, I changed the IP addresses to the values they need at the destination facility, updated the [GRUB boot loader] default, and made a final backup which burned the Recover-DVD. Hopefully, it works by just turning on the unit,[headless], without any keyboard, monitor or configuration required. Fingers crossed!
So, thanks to the rest of my team: Greg, Glen, Vicki, Tarun, Marcel, Pablo and Said. I am very excited to bepart of this, and look forward to seeing this become something remarkable!
People are confused over various orders of magnitude. News of the economic meltdownoften blurs the distinction between millions (10^6), billions (10^9), and trillions (10^12).To show how different these three numbers are, consider the following:
A million seconds ago - you might have received your last paycheck (12 days)
A billion seconds ago - you were born or just hired on your current job (31 years)
A trillion seconds ago - cavemen were walking around in Asia (31,000 years)
That these numbers confuse the average person is no surprise, but that it confuses marketing people in the storage industry is even more hilarious. I am often correcting people who misunderstandMB (million bytes), GB (billion bytes) and TB (trillion bytes) of information.Take this graph as an example from a recent presentation.
At first, it looks reasonable, back in 2004, black-and-white 2D X-Ray images were only 1MBin size when digitized, but by 2010 there will be fancy 4D images that now take 1TB, representinga 1000x increase. What?When I pointed out this discrepancy, the person who put this chart together didn't know what to fix.Were 4D images only 1GB in size, or was it really a 1000000x increase.
If a 2D image was 1000 by 1000 pixels, each pixel was a byte of information, then a 3D imagemight either be 1000 by 1000 by 1000 [voxels], or 1000 by 1000 at 1000 frames per second (fps). Thefirst being 3D volumetric space, and the latter called 2D+time in the medical field, the rest of us just say "video".4D images are 3D+time, volumetric scans over time, so conceivably these could be quite large in size.
The key point is that advances in medical equipment result in capturing more data, which canhelp provide better healthcare. This would be the place I normally plug an IBM product, like the Grid Medical Archive Solution [GMAS], a blended disk and tape storage solution designed specifically for this purpose.
So, as government agencies look to spend billions of dollars to provide millions of peoplewith proper healthcare, choosing to spend some of this money on a smarter infrastructure can result in creating thousands of jobs and save everyone a lot of money, but more importantly, save lives.
Short 2-minute [video] argues the case for Smarter Healthcare
For more on this, check out Adam Christensen's blog post on[Smarter Planet], which points to a podcast byDr. Russ Robertson, chairman of the Counsel of Medical Education at Northwestern University’s Feinberg School of Medicine, and Dan Pelino, general manager of IBM's Healthcare and Life Sciences Industry.
I have arrived safely in Las Vegas for the IBM System Storage and Storage Networking Symposium. This eventis held once every year. The gold sponsors were: Brocade, Cisco, Finisar, Servergraph, and VMware. Our silversponsor was Qlogic.
I presented IBM's System Storage strategy and an overview of our product line. For those who missed it,our strategy is focused on helping customers in four key areas:
Optimize IT - to simplify and automate your IT operations and optimize performance and functionality, through server/storage synergies, storage virtualization, and intergrated storage infrastructure management.
Leverage Information - to enable a single view of trusted business information through data sharing, and to get the most value from information through Information Lifecycle Management (ILM).
Mitigate Risk - to comply with security and regulatory requirements, and keep your business running with a complete set of business continuity solutions. IBM offers a range of non-erasable, non-rewriteable storage, encryption on disk and tape, and support for IT Infrastructure Library (ITIL) service management disciplines.
Enable Business Flexibility - to provide scalable solutions and protect your IT investment through the use of open industry standards like Storage Networking Industry Association (SNIA) Storage Management Initiative Specification (SMI-S). IBM offers scalability in three dimensions: Scale-up, Scale-out, and Scale-within.
IBM has a broad storage portfolio, in seven offering categories:
Disk Systems, including our SAN Volume Controller, DS family, and N series.
Tape Systems, including tape drives, libraries and virtualization.
Storage Networking, a complete set of switches, directors and routes
Infrastructure Management, featuring the IBM TotalStorage Productivity Center software
Business Continuity, advanced copy services and the software to manage them
Lifecycle and Retention, our non-erasable, non-rewriteable storage including DR550, N series with SnapLock, and WORM tape support, Grid Archive Manager and our Grid Medical Archive Solution (GMAS)
Storage Services, everything from consulting, design and deployment to outsourcing and hosting.
I could talk all day on this, but given that the room was packed, every seat taken and the rest of the audience standing along the walls, I had to keep it down to one hour.
SAN Volume Controller Overview
I presented an overview of the IBM System Storage SAN Volume Controller (SVC), IBM's flagship disk virtualizationproduct. Rather than giving a long laundry list of features and benefits,I focused on the five that matter most:
Reduces the cost and complexity of managing storage, especially for mixed storage environments
Simplifies Business Continuity through non-disruptive data migration and advanced copy services
Improves storage utilization, getting more value from the storage hardware you already have
Enhances personnel productivity, empowering storage administrators to get their job done
Delivers high availability and performance
SAN Volume Controller - Customer Success Stories
A good part of this conference are presented by non-IBMers, which include Business Partners and clientssharing their experiences. In this session, we had two speakers share their experiences with SVC.
David Snyder keeps over 80 web sites online and available. His digital media technologiesteam uses SVC to make their storage administration easier, and ensure high availability for web site content creation and publishing.
Mark Prybylski manages storage at his company, a financial bank. His storage management team uses SVC Global Mirror which provides asynchronous disk mirroring between different types of disk, as part oftheir Business Continuity/Disaster Recovery plan.
The last session I attended was "Storage .. to Optimize your ECM depoloyments" by Jerry Bower, now working for IBM as part of our recent acquisition of the Filenet company. ECM stands for Enterprise Content Management, and IBM is the market leader in this space. Jerry gave a great overview of IBM Content Manager software suite, our newly acquired Filenet portfolio, and the storage supported.
After the sessions was a reception at the Solution Center with dozens of exhibitor booths. For example,Optica Technologies had their PRIZM productswhich are able to connect FICON servers to ESCON storage devices.