Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Tony Pearson is a Master Inventor and Senior Software Engineer for the IBM Storage product line at the IBM Executive Briefing Center in Tucson, Arizona, and featured contributor to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services. You can also follow him on Twitter @az990tony.
(Short URL for this blog: ibm.co/Pearson)
Eventually, there comes a time to drop support for older, outdated programs that don't meet the latest standards. Several readers complained that they could not read my last post on Internet Explorer 6. The post reads fine on more modern browsers like Firefox 3 and even Google's Chrome browser, but not IE6.
Google confirms that warnings are appearing:
[Official: YouTube to stop IE6 support].
My choice is to either stop embedding YouTube videos, some of which are created by my own marketing team specifically on my behalf, or drop support for IE6. I choose the latter. If you are still using IE6, please consider switching to Firefox 3 or Google Chrome instead.
(I cannot take credit for coining the new term "bleg". I saw this term first used over on the [Freakonomics Blog]. If you have not yet read the book "Freakonomics", I highly recommend it! The authors' blog is excellent as well.)
For this comparison, it is important to figure out how much workload a mainframe can support, how much an x86 can support, and then divide one by the other. Sounds simple enough, right? And what workload should you choose? IBM chose a business-oriented "data-intensive" workload using Oracle database. (If you wanted instead a scientific "compute-intensive" workload, consider an [IBM supercomputer] instead, the most recent of which clocked in over 1 quadrillion floating point operations per second, or PetaFLOP.) IBM compares the following two systems:
Sun Fire X2100 M2, model 1220 server (2-way)
IBM did not pick a wimpy machine to compare against. The model 1220 is the fastest in the series, with a 2.8GHz x86-64 dual-core AMD Opteron processor, capable of running various levels of Solaris, Linux or Windows. In our case, we will use Oracle workloads running on Red Hat Enterprise Linux. All of the technical specifications are available at the [Sun Microsystems Sun Fire X2100] Web site. I am sure that there are comparable models from HP, Dell or even IBM that could have been used for this comparison.
IBM z10 Enterprise Class mainframe model E64 (64-way)
This machine can run a variety of operating systems also, including Red Hat Enterprise Linux (RHEL). The E64 has four "multiple processor modules" called "processor books" for a total of 77 processing units: 64 central processors, 11 system assist processors (SAP) and 2 spares. That's right, spare processors: in case any others go bad, IBM has got your back. You can designate a central processor in a variety of flavors. For running z/VM and Linux operating systems, the central processors can be put into "Integrated Facility for Linux" (IFL) mode. On IT Jungle, Timothy Prickett Morgan explains the z10 EC in his article [IBM Launches 64-Way z10 Enterprise Class Mainframe Behemoth]. For more information on the z10 EC, see the 110-page [Technical Introduction], or read the specifications on the [IBM z10 EC] Web site.
In a shop full of x86 servers, there are production servers, test and development servers, quality assurance servers, standby idle servers for high availability, and so on. On average, these are only 10 percent utilized. For example, consider the following mix of servers:
125 Production machines running 70 percent busy
125 Backup machines running idle ready for active failover in case a production machine fails
1250 machines for test, development and quality assurance, running at 5 percent average utilization
While [some might question, dispute or challenge this ten percent] estimate, it matches the logic used to justify VMware, XEN, Virtual Iron or other virtualization technologies. Running 10 to 20 "virtual servers" on a single physical x86 machine assumes a similar 5-10 percent utilization rate.
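For anyone who wants to check the ten percent figure against the example mix, here is a minimal sketch of the weighted average. The function name `average_utilization` is hypothetical; the counts and utilization rates are the ones listed in the mix, and I am assuming the idle failover machines count as 0 percent busy.

```python
# Hypothetical helper: machine-weighted average utilization.
def average_utilization(groups):
    """groups is a list of (machine_count, utilization) pairs."""
    total_machines = sum(count for count, _ in groups)
    busy_equivalents = sum(count * util for count, util in groups)
    return busy_equivalents / total_machines

server_mix = [
    (125, 0.70),   # production machines
    (125, 0.00),   # idle failover standbys (assumed 0 percent busy)
    (1250, 0.05),  # test, development and quality assurance
]

print(f"Average utilization: {average_utilization(server_mix):.1%}")
```

The large pool of lightly used test and development machines dominates the average, which is why the overall figure lands so far below the 70 percent seen on production.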
Note: The following paragraphs have been revised per comments received.
Now the math. Jon, I want to make it clear I was not involved in writing the press release, nor did I assist with these math calculations. Please, don't shoot the messenger! Remember this cartoon where two scientists in white lab coats are writing math calculations on a chalkboard, and in the middle there is "and then a miracle happens..." to continue the rest of the calculations?
In this case, the miracle is the number that compares one server hardware platform to another. I am not going to bore people with details like the number of concurrent processor threads or the differences between L1 and L3 cache. IBM used sophisticated tools and third party involvement that I am not allowed to talk about, and I have discussed this post with lawyers representing four (now five) different organizations already, so for the purposes of illustration and explanation only, I have reverse-engineered a new z10-to-Opteron conversion factor as 6.866 z10 EC MIPS per GHz of dual-core AMD Opteron for I/O-intensive workloads running at only 10 percent average CPU utilization. Business applications that perform a lot of I/O don't use their CPU as much as other workloads. For compute-intensive or memory-intensive workloads, the conversion factor may be quite different, like 200 MIPS per GHz, as Jeff Savit from Sun Microsystems points out in the comments below.
Keep in mind that each processor is different, and we now have Intel, AMD, SPARC, PA-RISC and POWER (and others); 32-bit versus 64-bit; dual-core and quad-core; and different co-processor chip sets to worry about. AMD Opteron processors come in different speeds, but we are comparing against the 2.8GHz, so 1500 times 6.866 times 2.8 is 28,837. Since these would be running as Linux guests under z/VM, we add an additional 7 percent overhead, or 2,019 MIPS. We then subtract 15 percent for "smoothing", which is what happens when you consolidate workloads that have different peaks and valleys, or 4,326 MIPS. The end result is that we need a machine to do 26,530 MIPS. Thanks to advances in "Hypervisor" technological synergy between the z/VM operating system and the underlying z10 EC hardware, the mainframe can easily run 90 percent utilized when aggregating multiple workloads, so a 29,477 MIPS machine running at 90 percent utilization can handle these 26,530 MIPS.
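The chain of steps in this sizing estimate can be sketched in a few lines. This is for illustration only: the function name `size_mainframe` is hypothetical, and rounding to whole MIPS at each step (and truncating the final figure) is my assumption about how the numbers were carried forward, not a description of the tools IBM used.

```python
GUESTS = 1500              # x86 servers to consolidate
MIPS_PER_GHZ = 6.866       # reverse-engineered conversion factor
CLOCK_GHZ = 2.8            # Opteron clock speed
ZVM_OVERHEAD = 0.07        # added for running as guests under z/VM
SMOOTHING = 0.15           # subtracted for overlapping peaks and valleys
TARGET_UTILIZATION = 0.90  # how busy the mainframe is allowed to run

def size_mainframe(guests=GUESTS):
    """Return each intermediate MIPS figure of the estimate as a dict."""
    base = round(guests * MIPS_PER_GHZ * CLOCK_GHZ)
    overhead = round(base * ZVM_OVERHEAD)
    smoothing = round(base * SMOOTHING)
    needed = base + overhead - smoothing
    # Assumption: the final capacity figure is truncated, not rounded.
    capacity = int(needed / TARGET_UTILIZATION)
    return {
        "base": base,
        "overhead": overhead,
        "smoothing": smoothing,
        "needed": needed,
        "capacity": capacity,
    }

for step, mips in size_mainframe().items():
    print(f"{step:>10}: {mips:,} MIPS")
```

Changing `MIPS_PER_GHZ` to a value suited to compute-intensive work shows how sensitive the whole estimate is to that single conversion factor.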
N-way machines, from a little 2-way Sun Fire X2100 to the mighty 64-way z10 EC mainframe, are called "Symmetric Multiprocessors". All of the processors or cores are in play, but sometimes they have to take turns, waiting for exclusive access to a shared resource, such as cache or the bus. When your car is stopped at a red light, you are waiting for your turn to use the shared "intersection". As a result, you don't get linear improvement, but rather you get diminishing returns. This is known generically as the "SMP effect", and IBM documents this in its [Large System Performance Reference]. While a 1-way z10 EC can handle 920 MIPS, the 64-way can only handle 30,657 MIPS. The 29,477 MIPS needed for the Sun X2100 workload can be handled by a 61-way, giving you three extra processors to handle unexpected peaks in workload.
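To put a number on the diminishing returns, here is a small sketch that compares the quoted 64-way rating with what perfectly linear scaling would give. The function name `smp_efficiency` is hypothetical, and only the two MIPS ratings quoted above are used; it does not attempt to model the curve for the sizes in between.

```python
ONE_WAY_MIPS = 920            # quoted rating for a 1-way z10 EC
SIXTY_FOUR_WAY_MIPS = 30_657  # quoted rating for the 64-way

def smp_efficiency(n_way, rated_mips, one_way_mips=ONE_WAY_MIPS):
    """Fraction of ideal linear scaling that an n-way machine delivers."""
    linear_mips = n_way * one_way_mips
    return rated_mips / linear_mips

efficiency = smp_efficiency(64, SIXTY_FOUR_WAY_MIPS)
print(f"Linear scaling would give {64 * ONE_WAY_MIPS:,} MIPS")
print(f"The 64-way delivers about {efficiency:.0%} of that")
```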
But are 1500 Linux guest images architecturally possible? A long time ago, David Boyes of [Sine Nomine Associates] ran 41,400 Linux guest images on a single mainframe using his [Test Plan Charlie], and IBM internally was able to get 98,000 images, and in both cases these were on machines less powerful than the z10 EC. Neither of these tests ran I/O-intensive workloads, but extreme limits are always worth testing. The 1500-to-1 reduction in IBM's press release is edge-of-the-envelope as well, so in production environments, several hundred guest images are probably more realistic, and still offer significant TCO savings.
The z10 EC can handle up to 60 LPARs, and each LPAR can run z/VM which acts much like VMware in allowing multiple Linux guests per z/VM instance. For 1500 Linux guests, you could have 25 guests each on 60 z/VM LPARs, or 250 guests on each of six z/VM LPARs, or 750 guests on two LPARs. With z/VM 5.3, each LPAR can support up to 256GB of memory and 32 processors, so you need at least two LPARs to use all 64 engines. Also, there are good reasons to have different guests under different z/VM LPARs, such as separating development/test from production workloads. If you had to re-IPL a specific z/VM LPAR, it could be done without impacting the workloads on other LPARs.
To access storage, IBM offers N-port ID Virtualization (NPIV). Without NPIV, two Linux guest images could not access the same LUN through the same FCP port because this would confuse the Host Bus Adapter (HBA), which IBM calls "FICON Express" cards. For example, Linux guest 1 asks to read LUN 587 block 32 and this is sent out a specific port, to a switch, to a disk system. Meanwhile, Linux guest 2 asks to read LUN 587 block 49. The data comes back to the z10 EC, which gives it to the correct z/VM LPAR, but then what? How does z/VM know which of the many Linux guests to give the data to? Both touched the same LUN, so it is unclear which made the request. To solve this, NPIV assigns a virtual "World Wide Port Name" (WWPN), up to 256 of them per physical port, so you can have up to 256 Linux guests sharing the same physical HBA port to access the same LUN. If you had 250 guests on each of six z/VM LPARs, and each LPAR had its own set of HBA ports, then all 1500 guests could access the same LUN.
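The LPAR splits and the NPIV limit can be checked together with a short sketch. The function names are hypothetical, the limits (60 LPARs, 256 virtual WWPNs per physical port) are the ones quoted above, and I am assuming each guest needs exactly one virtual WWPN on a port to reach the shared LUN.

```python
MAX_LPARS = 60        # LPAR limit quoted for the z10 EC
WWPNS_PER_PORT = 256  # virtual WWPNs per physical port with NPIV
TOTAL_GUESTS = 1500

def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)

def guests_per_lpar(total_guests, lpars):
    """Guests each LPAR must host when they are spread evenly."""
    if not 1 <= lpars <= MAX_LPARS:
        raise ValueError(f"LPAR count must be between 1 and {MAX_LPARS}")
    return ceil_div(total_guests, lpars)

def ports_needed_per_lpar(total_guests, lpars):
    """Physical ports per LPAR so every guest gets its own WWPN (assumed one each)."""
    return ceil_div(guests_per_lpar(total_guests, lpars), WWPNS_PER_PORT)

for lpars in (60, 6, 2):
    guests = guests_per_lpar(TOTAL_GUESTS, lpars)
    ports = ports_needed_per_lpar(TOTAL_GUESTS, lpars)
    print(f"{lpars:>2} LPARs: {guests} guests each, at least {ports} port(s) per LPAR")
```

Under these assumptions the six-LPAR layout is the largest of the three that still fits all of an LPAR's guests behind a single physical port.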
Yes, the z10 EC machines support Sysplex. The concept is confusing, but "Sysplex" in IBM terminology just means that you can have LPARs either on the same machine or on separate mainframes, all sharing the same time source, whether this be a "Sysplex Timer" or by using the "Server Time Protocol" (STP). The z10 EC can have STP over 6 Gbps Infiniband over distance. If you wanted to have all 1500 Linux guests time stamp data identically, all six z/VM LPARs need access to the shared time source. This can help in a re-do or roll-back situation for Oracle databases to complete or back-out "Units of Work" transactions. This time stamp is also used to form consistency groups in "z/OS Global Mirror", formerly called "XRC" for Extended Remote Copy. Currently, the "timestamp" on I/O applies only to z/OS and Linux and not other operating systems. (The time stamp is done through the CKD driver on Linux, and contributed back to the open source community so that it is available from both Novell SUSE and Red Hat distributions.) To have XRC have consistency between z/OS and Linux, the Linux guests would need to access native CKD volumes, rather than VM Minidisks or FCP-oriented LUNs.
Note: this is different than "Parallel Sysplex" which refers to having up to 32 z/OS images sharing a common "Coupling Facility" which acts as shared memory for applications. z/VM and Linux do not participate in "Parallel Sysplex".
As for the price, mainframes list for as little as "six figures" to as much as several million dollars, but I have no idea how much this particular model would cost. And, of course, this is just the hardware cost. I could not find the math for the $667 per server replacement you mentioned, so don't have details on that. You would need to purchase z/VM licenses, and possibly support contracts for Linux on System z to be fully comparable to all of the software license and support costs of the VMware, Solaris, Linux and/or Windows licenses you run on the x86 machines.
This is where a lot of the savings come from, as a lot of software is licensed "per processor" or "per core", and so software on 64 mainframe processors can be substantially less expensive than 1500 processors or 3000 cores. IBM does "eat its own cooking" in this case. IBM is consolidating 3900 one-application-each rack-mounted servers onto 30 mainframes, for a ratio of 130-to-1 and getting amazingly reduced TCO. The savings are in the following areas:
Hardware infrastructure. It's not just servers, but racks, PDUs, etc. It turns out to be less expensive to incrementally add more CPU and storage to an existing mainframe than to add or replace older rack-em-and-stack-em servers with newer models of the same.
Cables. Virtual servers can talk to each other in the same machine virtually, such as HiperSockets, eliminating many cables. NPIV allows many guests to share expensive cables to external devices.
Networking ports. Both LAN and SAN networking gear can be greatly reduced because fewer ports are needed.
Administration. We have Universities that can offer a guest image for every student without having a major impact to the sys-admins, as the students can do much of their administration remotely, without having physical access to the machinery. Companies that use mainframes to host hundreds of virtual guests find reductions too!
Connectivity. Consolidating distributed servers in many locations to a mainframe in one location allows you to reduce connections to the outside world. Instead of sixteen OC3 lines for sixteen different data centers, you could have one big OC48 line instead to a single data center.
Software licenses. Licenses based on servers, cores or CPUs are reduced when you consolidate to the mainframe.
Floorspace. Generally, floorspace is not in short supply in the USA, but in other areas it can be an issue.
Power and Cooling. IBM has experienced significant reduction in power consumption and cooling requirements in its own consolidation efforts.
All of the components of DFSMS (including DFP, DFHSM, DFDSS and DFRMM) were merged into a single product "DFSMS for z/OS", which is now an included element in the base z/OS operating system. As a result of these, customers typically have 80 to 90 percent utilization on their mainframe disk. For the 1500 Linux guests, however, most of the DFSMS features of z/OS do not apply. These functions were not "ported over" to z/VM or to Linux on any platform.
Instead, the DFSMS concepts have been re-implemented into a new product called "Scale-Out File Services" (SOFS) which would provide NAS interfaces to a blended disk-and-tape environment. The SOFS disk can be kept at 90 percent utilization because policies can place data, move data and even expire files, just like DFSMS does for z/OS data sets. SOFS supports standard NAS protocols such as CIFS, NFS, FTP and HTTP, and these could be accessed from the 1500 Linux guests over an Ethernet Network Interface Card (NIC), which IBM calls "OSA Express" cards.
Lastly, IBM z10 EC is not emulating x86 or x86-64 interfaces for any of these workloads. No doubt IBM and AMD could collaborate to come up with an AMD Opteron emulator for the S/390 chipset, and load Windows 2003 right on top of it, but that would just result in all kinds of emulation overhead. Instead, Linux on System z guests can run comparable workloads. There are many Linux applications that are functionally equivalent or the same as their Windows counterparts. If you run Oracle on Windows, you could run Oracle on Linux. If you run MS Exchange on Windows, you could run Bynari on Linux and let all of your Outlook Express users not even know their Exchange server had been moved! Linux guest images can be application servers, web servers, database servers, network infrastructure servers, file servers, firewall, DNS, and so on. For nearly any business workload you can assign to an x86 server in a datacenter, there is likely an option for Linux on System z.
Hope this answers all of your questions, Jon. These were estimates based on basic assumptions. This is not to imply that IBM z10 EC and VMware are the only technologies that help in this area; you can certainly find virtualization on other systems and through other software. I have asked IBM to make public the "TCO framework" that sheds more light on this. As they say, "Your mileage may vary."
For more on this series, check out the following posts:
If in your travels, Jon, you run into someone interested to see how IBM could help consolidate rack-mounted servers over to a z10 EC mainframe, have them ask IBM for a "Scorpion study". That is the name of the assessment that evaluates a specific client situation, and can then recommend a more accurately estimated configuration.
Array-based replication does have drawbacks; all externalised storage becomes dependent on the virtualising array. This makes replacement potentially complex. To date, HDS have not provided tools to seamlessly migrate away from one USP to another (as far as I am aware). In addition, there's the problem of "all your eggs in one basket"; any issue with the array (e.g. physical intervention like fire, loss of power, microcode bug etc) could result in loss of access to all of your data. Consider the upgrade scenario of moving to a higher level of code; if all data was virtualised through one array, you would want to be darn sure that both the upgrade process and the new code are going to work seamlessly...
The final option is to use fabric-based virtualisation and at the moment this means Invista and SVC. SVC is an interesting one as it isn't an array and it isn't a fabric switch, but it does effectively provide switching capabilities. Although I think SVC is a good product, there are inevitably going to be some drawbacks, most notably those similar issues to array-based virtualisation (Barry/Tony, feel free to correct me if SVC has a non-disruptive replacement path).
I would argue that the IBM System Storage SAN Volume Controller (SVC) is more like the HDS USP, and less like the Invista. Both SVC and USP provide a common look and feel to the application server, both provide additional cache to external disk, both are able to provide a consistent set of copy services.
IBM designed the SVC so that upgrades can occur non-disruptively. You can replace the hardware nodes, one node at a time, while the SVC system is up and running, without disruption to reading and writing data on virtual disk. You can upgrade the software, one node at a time, while the SVC system is up and running, without disruption to reading and writing data on virtual disk. You can upgrade the firmware on the managed disk arrays behind the SVC, again, without disruption to reading and writing data on virtual disk.
More importantly, SVC has the ultimate "un-do" feature. It is called "image mode". If for any reason you want to take a virtual disk out of SVC management, you migrate over to an "image mode" LUN, and then disconnect it from SVC. The "image mode" LUN can then be used directly, with all the file system data intact.
I define "virtualization" as technology that makes one set of resources look and feel like a different set of resources with more desirable characteristics. For SVC, the more desirable characteristics include choice of multi-pathing driver, consistent copy services, improved performance, etc. For EMC Invista, the question is "more desirable for whom?" EMC Invista seems more designed to meet EMC's needs, not its customers. EMC profits greatly from its EMC PowerPath multi-pathing driver, and from its SRDF copy services, so it appears to have designed a virtualization offering that:
Continues the use of EMC PowerPath as a multi-pathing driver. SVC supports drivers that are provided at no charge to the customer, as well as those built-in to each operating system like MPIO.
and, continues the use of Array-based copy services like SRDF of the underlying disk. SVC provides consistent copy services regardless of storage vendor being managed.
A post from Dan over at Architectures of Control explains the anti-social nature of public benches. City planners, in an effort to discourage homeless people from sleeping on benches in parks or sidewalks, design benches that are so uncomfortable to use, that nobody uses them. These included benches made of metal that are too hot or too cold during certain months, benches slanted at an angle that dump you on the ground if you lie down, or benches that have dividers so that you must be in an upright seated position to use.
This is not a disparagement of split-path switch-based designs. Rather, EMC's specific implementation appears to be designed for it to continue vendor lock-in for its multi-pathing driver, continue vendor lock-in for its copy services when used with EMC disk, and only provide slightly improved data migration capability for heterogeneous storage environments. Other switch-based solutions, such as those from Incipient or StoreAge, had different goals in mind.
Sadly, my IBM colleague BarryW and I have probably spent more words discussing Invista than all eleven EMC bloggers combined this year. While everyone in the industry is impressed how often EMC can sell "me, too" products with an incredibly large marketing budget, EMC appears not to have set aside funds for the Invista.
If a customer could design the ideal "storage virtualization" solution that would provide them the characteristics they desire the most from storage resources, it would not be anything like an Invista. While there are pros and cons between IBM's SVC and HDS's TagmaStore offerings, the reason both IBM and HDS are the market leaders in storage virtualization is because both companies are trying to provide value to the customer, just in different ways, and with different implementations.
Last week, Paul Weinberg of eChannelLine.com asked "Is this the year of the SAN (again)?" So, I thought this week I would cover my thoughts and opinions on storage networking. We often focus on servers or storage devices, and forget that the network in between is an entire world unto itself.
I believe Mr. Weinberg is basing this on the idea that in 2007, over 50 percent of disk will be attached over SAN, edging out the alternative: Direct Attached Storage (DAS). But perhaps 50 percent is the wrong number to focus on. In 2007, the United Nations estimates that cities will surpass rural areas, with just over 50 percent of the world's population. Does that make this the "Year of the City"? Of course not.
Instead, I prefer to use the methodology that Malcolm Gladwell uses in his book, The Tipping Point. (I have read this book and highly recommend it!) Gladwell indicates that the tipping point happens at the start of the epidemic, not when it is half over. Isn't it better to celebrate the sweet 16 debutante ball when young ladies have completed their years of training and preparation, and are ready to be introduced to the rest of the world, rather than after they are thirty-something, married with children?
Let's explore some of the history. Stuart Kendric has a nice 7-page summary on the History & Plumbing of SANs.
IBM announced the first SAN technology called Enterprise Systems Connection (ESCON) way back in September 1990. This allowed multiple mainframe servers to connect to multiple storage systems over equipment called "ESCON Directors" that directed traffic from point A to point B. Before this, mainframes sent "Channel Command Words" or CCWs, across parallel "bus and tag" copper cables. ESCON was serial over fiber optic wiring. SANs solved two problems: first, they reduced the "rat's nest" of cables between many servers and many storage systems, and second, they extended the distance between server and storage device.
For distributed systems running UNIX or Windows, the CCW-equivalent over parallel cables was called Small Computer System Interface (SCSI). The SCSI command had over 1000 command words, so for its Advanced Technology (AT) personal computers (PC AT), IBM introduced a subset of SCSI commands called ATA (Advanced Technology Attachment). ATA drives supported fewer commands, ran at slower speeds, and were manufactured with a less rigorous process. Today ATA drives are about 55 percent the cost per MB as comparable SCSI drives.
Anyone who has ever opened their PC and found flat ribbon cable with eight or sixteen wires in parallel, can understand that the same issues applied externally. Parallel technologies are limited in distance and speed, as all the bits have to arrive at the end of the wire at approximately the same time. Direct attach schemes in which every server attaches directly to every storage device were also problematic. Imagine 100 servers connected to 100 storage devices, that would be 10,000 wires!
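The cabling arithmetic is worth a quick sketch, because it shows why a switched network scales so much better than direct attachment. Both function names are hypothetical, and the switched case assumes the simplest possible fabric: one link from each server and each storage device to a single switch or director, with no redundant paths or inter-switch links.

```python
def direct_attach_cables(servers, devices):
    """Every server wired directly to every storage device."""
    return servers * devices

def switched_cables(servers, devices):
    """Assumption: one link per endpoint into a single switch, no redundancy."""
    return servers + devices

servers, devices = 100, 100
print(f"Direct attach: {direct_attach_cables(servers, devices):,} cables")
print(f"Switched:      {switched_cables(servers, devices):,} cables")
```

Real fabrics add redundant paths, so the switched count would be somewhat higher in practice, but it still grows with the sum of the endpoints rather than their product.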
So, a new technology standard was developed, called Fibre Channel, ratified in 1994. The spelling of "Fibre" was intentionally made different from "Fiber". "Fibre" is a protocol that can travel over copper or glass wires. "Fiber" represents the glass wiring itself.
Fibre Channel is amazingly versatile. For today's Linux, UNIX and Windows servers, it can carry SCSI commands, and the combination of SCSI over FC is called Fibre Channel Protocol (FCP). For the mainframe servers, it can carry CCW commands. Running CCW over Fibre Channel is called FICON. This convergence allows mainframes and distributed systems to share a common Fibre Channel network, using the same set of switches and directors.
We saw the use of SANs explode in the marketplace over the past 10 years, and then cool down with a series of mergers and acquisitions. Last year, Brocade announced it was acquiring rival McData, so we will be down to two major players, Cisco and Brocade.
So, IMHO, I think we are well past the "Year of the SAN".
Tomorrow, according to the [Mayan calendar], the end of the 5,125 year cycle rolls over, so it only makes sense to party like it's 1999!
Of course, if you were in the IT industry 13 years ago, you may remember similar hoopla around [Year 2000] when the Gregorian calendar rolled over from "99" to "00". Some of us were asked to work right up to the last day of 1999, and be on-call the first week of 2000, just in case! Tomorrow may prove to be more or less a repeat of that.
Fortunately, there were plenty of other reasons to celebrate these past few weeks.
Birthdays in December Party
The IBM Tucson employees and contractors of building 9070 got together for a combination party, celebrating both the end of 2012 and three people with birthdays in December: my former manager Bill, my colleague Kris, and myself. Here is our birthday cake! Afterwards, we all watched the Vacation movie.
(Note: This was sponsored by my third-line manager, David Gelardi, who one way or another, is responsible for all the IBMers in this building. Thank you David! )
This will be the last year for us to do this, as we are planning to move over to join the employees of building 9032 next year!
IBM Club Event
The IBM Club had its final event at [Golf N' Stuff] family fun park. Over 700 IBM employees and their family members came to eat breakfast burritos and play miniature golf and other games. It had rained earlier in the morning, so the go-kart track was wet, and the staff were trying to dry it with leaf blowers. The rest of the park was fully operational, and the weather cleared up nicely. Mo, Rafael and I played golf but the turf was still wet in a few spots. There were also video games, bumper boats, and batting cages.
IBM volunteers dressed up as fictional characters for the kids to take pictures with.
I was proud to be a member of the seven-person IBM Club board for 2012. When I was nominated, I didn't think I stood a chance to be elected, as I was running against five or six other well-qualified candidates, but somehow it happened. I am glad to have been part of the 19-year tradition of the IBM Club history.
(Note: I didn't campaign for this position, but many IBMers in Tucson knew that I had previously owned and managed Tucson Fun & Adventures that organized 15-25 events every month for hundreds of single adults in the Tucson area. This might have helped my chances for election a bit!)
Next year, the IBM Club transitions to the more-efficient "Club Central" model, which is both board-less and cash-less. Instead of a seven-person board organizing events that are fully-funded or partially-subsidized by IBM, events will now be organized by IBM volunteers who post the details on Facebook. All participants simply pay for the events they attend directly to the venue or facility involved.
While the National Aeronautics and Space Administration [NASA] has put out videos and press releases these past 10 days to assure us [there will be a 2013], this shouldn't stop anyone from having a good time! If you did anything special to celebrate the end of the Mayan Calendar, please comment below!
(Actually, the [XIV Model 314] was announced on Nov 10, 2015 last year, but announcements made in November and December are often overlooked between distractions like holidays and year-end processing. Today's announcement was to eliminate the "not available in some countries" restriction. The last time I mentioned on this blog that a product was not available in some countries, I had tons of questions of "why". Hopefully, waiting until a product is available in all countries eliminates that concern.)
What does the XIV model 314 offer? IBM doubled the processors, up to 180 cores, and doubled the DRAM cache, up to 1440 GB. Both of these changes were done to improve the Real-time compression capability.
To reduce test effort cycle time, IBM simplified the configuration options:
Instead of ranging from 6 to 15 modules, the model 314 is limited to 9-15 modules.
The drive sizes are reduced to just 4TB and 6TB capacities.
If you want a Solid-State drive (SSD) for cache boost, only the 800GB option is available.
Through a combination of thin provisioning and compression, you can define up to 2 PB of soft capacity per rack.
The firmware v11.6.1 reduces the minimum volume size for compression from 103GB to 51GB. Firmware perpetually licensed for Spectrum Accelerate can be used with the XIV Model 314.
Well, it's Tuesday again, and we have more IBM announcements.
XIV asynchronous mirror
For those not using XIV behind SAN Volume Controller, [XIV now offers native asynchronous mirroring] support to another XIV far, far away. Unlike other disk systems that are limited to two or three sites, an XIV can mirror to up to 15 other sites. The mirroring can be at the individual volume, or a consistency group of multiple volumes. Each mirror pair can have its own recovery point objective (RPO). For example, a consistency group of mission critical application data might be given an RPO of 30 seconds, but less important data might be given an RPO of 20 minutes. This allows the XIV to prioritize packets it sends across the network.
As with XIV synchronous mirror, this new asynchronous mirror feature can send the data over either its Fibre Channel ports (via FCIP) or its Ethernet ports.
The IBM System Storage SAN384B and SAN768B directors now offer [two new blades!]
A 24-port FCoCEE blade where each port can handle 10Gb convergence enhanced Ethernet (CEE). CEE can be used to transmit Fibre Channel, TCP/IP, iSCSI and other Ethernet protocols. This connects directly to the server's converged network adapter (CNA) cards.
A 24-port mixed blade, with 12 FC ports (1Gbps, 2Gbps, 4Gbps, 8Gbps), 10 Ethernet ports (1GbE) and 2 Ethernet ports (10GbE). This would connect to traditional server NIC, TOE and HBA cards as well as traditional NAS, iSCSI and FC based storage devices.
IBM also announced the IBM System Storage [SAN06B-R Fibre Channel router]. This has 16 FC ports (1Gbps up to 8Gbps) and six Ethernet ports (1GbE), with support for both FC routing as well as FCIP extended distance support.
With the holiday season coming up at the end of the year, now is a great time to ask Santa for a new shiny pair of XIV systems, and some extra networking gear to connect them.
Storage Networking World conference is over, and the buzz from the analysts appears to be focused on Xiotech's low-cost RAID brick (LCRB) called Intelligent Storage Element, or ISE.
(Full disclosure: I work for IBM, not Xiotech, in case there weren't enough IBM references on this blog page to remind you of that. I am writing this piece entirely from publicly available sources of information, and not from any internal working relationships between IBM and Xiotech. Xiotech is a member of the IBM BladeCenter alliance and our two companies collaborate together in that regard.)
Fellow blogger Jon Toigo in his DrunkenData blog posted [I’m Humming “ISE ISE Baby” this Week] and then a follow-up post, [ISE Launches]. I looked up Xiotech's SPC-1 benchmark numbers for the Emprise 5000 with both 73GB and 146GB drives, and at 8,202 IOPS per TB, it does not seem to be as fast as the IBM SAN Volume Controller's 11,354 IOPS per TB. Xiotech offers an impressive 5-year warranty (by comparison, IBM offers up to 4 years, and EMC I think is still only 90 days). Jon also wrote a review in [Enterprise Systems] that goes into more detail about the ISE.
Fellow blogger Robin Harris in his StorageMojo blog posted [SNW update - Xiotech’s ISE and the dilithium solution], feeling that Xiotech should win the "Best Announcement at SNW" prize. He points to the cool video on the [Xiotech website]. In that video, they claim 91,000 IOPS. Given that it took forty (40) 73GB drives (or 4 datapacs) in the previous example to get 8,202 IOPS for 1TB usable, I am guessing the 91,000 IOPS is probably 44 datapacs (440 drives) glommed together, representing 11TB usable. The ISE design appears very similar to the "data modules" used in IBM's XIV Nextra system.
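The guess above is back-of-the-envelope arithmetic, and it can be written out as a short Python sketch. It assumes that IOPS scale linearly with usable capacity, which is a simplification, and the variable names are mine rather than anything Xiotech publishes.

```python
iops_per_tb = 8_202       # SPC-1 result quoted above, per usable TB
drives_per_tb = 40        # forty 73GB drives delivered 1TB usable
drives_per_datapac = 10   # 40 drives were described as 4 datapacs
claimed_iops = 91_000     # figure claimed in the Xiotech video

# Assumption: performance scales linearly with usable capacity.
usable_tb = round(claimed_iops / iops_per_tb)
drives = usable_tb * drives_per_tb
datapacs = drives // drives_per_datapac

print(usable_tb, drives, datapacs)
```

Under that linear-scaling assumption, the numbers come out to 11TB usable, 440 drives and 44 datapacs.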
Fellow blogger Mark Twomey from EMC in his StorageZilla blog posted [Xiotech: Industry second], which correctly points out that Xiotech's 520-byte block (512 bytes plus extra for added integrity) was not the first in the industry. Mark explains that EMC CLARiiON had this since the early 1990's, and implies in the title that this must have been the first in the industry, making Xiotech an industry second. Sorry Mark, both EMC and Xiotech were late to the game. IBM had been using 520-byte blocksize on its disk since 1980 with the System/38. This system morphed to the AS/400, and the blocksize was bumped up to 522 bytes in 1990, and is now called the System i, where the blocksize was bumped up yet again to 528 bytes in 2007.
While IBM was clever to do this, it actually means fewer choices for our System i clients, being only able to choose external disk systems that explicitly support these non-standard blocksize values, such as the IBM System Storage DS8000 and DS6000 series. (Yes, BarryB, IBM still sells the DS6000!) The DS6000 was specifically designed with the System i and smaller System z mainframes in mind, and in that niche does very well. Fortunately, as I mentioned in my February post [Getting off the island - the new i5/OS V6R1], IBM has now used virtualization, in the form of the VIOS logical partition, to allow i5/OS systems to attach to standard 512-byte block devices, greatly expanding the storage choices for our clients.
(Side note: SNW happens twice per year, so the challenge is having something new and fresh to talk about each time. While Andy Monshaw, General Manager of IBM System Storage, highlighted some of the many emerging technologies in his keynote address, IBM shipped on many of them prior to his last appearance in October 2007: thin provisioning in the IBM System Storage N series, deduplication in the IBM System Storage N series Advanced Single Instance Storage (A-SIS) feature, and Solid State Disk (SSD) drives in the IBM BladeCenter HS21-XM models. Of course, not everyone buys IBM gear the first day it is available, and IBM is not the only vendor to offer these technologies. My point is that for many people, these are still not yet deployed in their own data center, and so they are still in the future for them. However, since these IBM deliveries happened more than six months ago, they're old news in the eyes of the SNW attendees. While those who follow IBM closely would know that, others like [Britney Spears] may not.)
Back in the 1990s, when IBM was developing the IBM SAN Volume Controller (SVC), we generically called the managed disk arrays that were being virtualized by the SVC "low-cost RAID bricks" or LCRB. The IBM DS3400 is a good example of this. However, as we learned, SVC is not just for LCRB; it adds value in front of all kinds of disk systems, including the not-so-low-cost EMC DMX and IBM DS8000 disk systems. ISE might make a reasonable back-end managed disk device for IBM SVC to virtualize. This gives you the new cool features of Xiotech's ISE, with IBM SVC's faster performance, more robust functionality and advanced copy services.
Next week, I'll be in South America in meetings with IBM Business Partners and storage sales reps.
Yesterday morning, the entire country of Colombia suffered its worst black-out (power outage) in 22 years. 98% of the country was out for 4 1/2 hours. This is just 5 months after an outage that hit 25% of the country, December 7, 2006. Ironically, this one happened the week I am here explaining the need for Business Continuity plans to IBM Business Partners from Argentina, Peru, Venezuela, Ecuador and Colombia. As is often the case, people need a real example to recognize that planning is important.
It reminded me of the Northeast Black-out of 2003 that impacted USA and Canada. I was speaking to a crowd of 800 people at the SHARE conference in Washington D.C. when it happened, and hundreds of pagers and cell-phones went off all at the same time. Although we were outside the affected area and had plenty of lighting, we ended up canceling the rest of my talk, and many people left immediately to help execute their business continuity plans. Of course, terrorism was immediately assumed, but a final report showed that it was initiated in Ohio due to overgrown trees, and then propagated due to a software bug to hundreds of other plants.
According to this morning's Bogota newspaper, "El Tiempo", nobody knows the root cause of yesterday's outage. Immediately, the country's leftist rebels were blamed, but now the leading theory is that it was initiated by operator error (a technician touching something he shouldn't have), and then propagated by a faulty distribution system.
Another example of the need for a robust and resilient infrastructure, and appropriate business continuity plans.
"All work and no play makes Jack a dull boy!"
Often I get feedback from my readers that I focus too much on storage products in this blog, and have been asked to break out of the work world for a change. Fair enough! The first Sunday of May is designated "World Laughter Day". I am proud to be one of the founding members of the [Tucson Laughter Club] back in 2004, and we held our first World Laughter Day event on May 1, 2005 at Udall Park.
Over the past seven years, we have had four other clubs "spin off" from the main group to form their own club. However, for World Laughter Day, the five sister clubs (or five warring tribes, as some call them) put aside our differences and got together for a day of fun. In keeping with the tradition of having these events outside, we were granted permission to hold our event at the University of Arizona mall.
While the Tucson Laughter Club is recognized as one of the oldest laughter clubs in the United States, there are actually over 6000 clubs worldwide in over 60 countries. Laughter clubs started in India, when Dr. Madan Kataria (a medical doctor) and his wife Madhuri (a yoga instructor) gathered people in a park to try laughter as a form of healing and exercise. Today, [Laughter Yoga] is practiced outside in parks or indoors.
Up until now, all of the World Laughter Day events in Tucson have been organized by the original Tucson Laughter Club, but this time we decided to pass the baton over to Gita Fendelman of [Curve's Laughter YogHA club] to take the lead. Here she is standing next to a large yellow sign with facts and figures about the history of World Laughter Day.
We had about 45 people join us in a large circle, and proceed with a series of breathing, stretching and laughter exercises. Many of the laughter exercises involve moving around to look at each of the other participants eye-to-eye, and with 45 people, this can be quite challenging.
The weather could not have been nicer. Clear blue skies, a slight breeze, and an unusually cool 75 degrees F. Last week we were in the nineties approaching summer, so we were delighted the weather cooled down for this event.
As a certified Laughter Yoga instructor myself, I offered to help lead the events for World Laughter Day. We had plenty of other certified laughter leaders on hand, and so both Gita and Emily (from Laughter Yoga with Emily) served as co-chairs.
I brought my [CRATE battery-powered amp] and microphones so that we could project our voices loud enough to the entire group. There was no electricity anywhere near our location, so battery-powered amps are the way to go for these situations.
After two hours of laughing, we all lie down for some relaxing meditation. Some people use this time to pray for World Peace. In a delightful coincidence, later that day, US President Barack Obama announced that the [world was a better place] having eliminated one of the world's most dangerous terrorists.
I would like to thank Jeff from our local NBC News Channel 4 affiliate [KVOA] who came to interview Gita and video us while we did our laughter exercises.
Every year, March 31 marks "World Backup Day". Sadly, many people forget the importance of backing up their critical information. This is not just true for businesses, non-profit organizations and government agencies, but also for all of your personal information that you keep on computer devices.
My friends over at Cloudwards had developed an awesome infographic related to World Backup Day. Here it is.
(FTC Disclosure: I work for IBM, which has no business relationship with Cloudwards. Cloudwards does not itself provide backup services, but rather reviews services provided by others. This post should not be considered an endorsement of Cloudwards or their reviews.)
Courtesy of: Cloudwards.net
I hope you find this information helpful and informative!
When I was a kid, we didn't have online access to anything. Either your parents were rich and generous and bought you the latest set of encyclopedias, or they were poor or cheap, and you hoofed it to the nearest library.
Now, I rely heavily on Wikipedia, and other wikis, to find information I need. The key here is the ability to find stuff. With the old 27-volume set of encyclopedias, you had to know what word something would be filed under, and how to spell it, so that you could find it. Today's search facilities are much more forgiving. If you guess wrong, you are only a few clicks away from what you were really looking for, in a Kevin Bacon six-degrees-of-separation kind of way.
Wikipedia is now looked at more often than CNN.com or the New York Times website. Why? It is amazingly good at summarizing a situation in succinct terms, even for news "as it happens". The recent episode at Heathrow airport a few weeks ago serves as a good example. I was in Washington DC that week, on my way to Miami and Sao Paulo, Brazil, so it was good to have the news I needed, when I needed it.
Miles per Gallon measures an efficiency ratio (amount of work done with a fixed amount of energy), not a speed ratio (distance traveled in a unit of time).
Given that IOPs and MB/s are the unit of "work" a storage array does, wouldn't the MPG equivalent for storage be more like IOPs per Watt or MB/s per Watt? Or maybe just simply Megabytes Stored per Watt (a typical "green" measurement)?
You appear to be intentionally avoiding the comparison of I/Os per Second and Megabytes per Second to Miles Per Hour?
May I ask why?
This is a fair question, Barry, so I will try to address it here.
It was not a typo; I did mean MPG (miles per gallon) and not MPH (miles per hour). It is always challenging to find an analogy that everyone can relate to when explaining concepts in Information Technology that might be harder to grasp. I chose MPG because it was closely related to IOPS and MB/s in four ways:
MPG applies to all instances of a particular make and model. Before Henry Ford and the assembly line, cars were made one at a time, by a small team of craftsmen, and so there could be variety from one instance to another. Today, vehicles and storage systems are mass-produced in a manner that provides consistent quality. You can test one vehicle, and safely assume that all similar instances of the same make and model will have similar mileage. The same is true for disk systems: test one disk system and you can assume that all others of the same make and model will have similar performance.
MPG has a standardized measurement benchmark that is publicly available. The US Environmental Protection Agency (EPA) is an easy analogy for the Storage Performance Council, providing the results of various offerings to choose from.
MPG has usage-specific benchmarks to reflect real-world conditions. The EPA offers City MPG for the type of driving you do to get to work, and Highway MPG, to reflect the type of driving on a cross-country trip. These serve as a direct analogy to SPC having SPC-1 for Online transaction processing (OLTP) and SPC-2 for large file transfers, database queries and video streaming.
MPG can be used for cost/benefit analysis. For example, one could estimate the amount of business value (miles travelled) for the amount of dollar investment (cost to purchase gallons of gasoline, at an assumed gas price). The EPA does this as part of their analysis. This is similar to the way IOPS and MB/s can be divided by the cost of the storage system being tested on SPC benchmark results. The business value of IOPS or MB/s depends on the application, but could relate to the number of transactions processed per hour, the number of music downloads per hour, or number of customer queries handled per hour, all of which can be assigned a specific dollar amount for analysis.
It seemed that if I was going to explain why standardized benchmarks were relevant, I should find an analogy that has similar features to compare to. I thought about MPH, since it is based on time units like IOPS and MB/s, but decided against it based on an earlier comment you made, Barry, about NASCAR:
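The cost/benefit point in the last item can be sketched in a few lines of Python. Every figure here is hypothetical and chosen only to keep the arithmetic simple; none of them come from an EPA rating or a published SPC result, and the `cost_per_unit` helper is my own name for the calculation.

```python
def cost_per_unit(total_cost_dollars, units_of_work):
    """Dollars spent per unit of work delivered (mile, IOPS, MB/s, ...)."""
    return total_cost_dollars / units_of_work


# Hypothetical car: gasoline at $3.00 per gallon, rated at 30 MPG.
car_dollars_per_mile = cost_per_unit(3.00, 30)

# Hypothetical disk system: $500,000 tested price, 100,000 IOPS benchmark result.
array_dollars_per_iops = cost_per_unit(500_000, 100_000)

print(f"${car_dollars_per_mile:.2f} per mile")
print(f"${array_dollars_per_iops:.2f} per IOPS")
```

The same division works for MB/s, and the result can then be weighed against whatever dollar value the application assigns to each transaction, download or query.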
Let's imagine that a Dodge Charger wins the overwhelming majority of NASCAR races. Would that prove that a stock Charger is the best car for driving to work, or for a cross-country trip?
Your comparison, Barry, to car-racing brings up three reasons why I felt MPH is a bad metric to use for an analogy:
Increasing MPH, and driving anywhere near the maximum rated MPH for a vehicle, can be reckless and dangerous, risking loss of human life and property damage. Even professional race car drivers will agree there are dangers involved. By contrast, processing I/O requests at maximum speed poses no additional risk to the data, nor possible damage to any of the IT equipment involved.
While most vehicles have top speeds in excess of 100 miles per hour, most Federal, State and Local speed limits prevent anyone from taking advantage of those maximums. Race-car drivers in NASCAR may be able to take advantage of the maximum MPH of a vehicle, but the rest of us can't. The government limits speed of vehicles precisely because of the dangers mentioned in the previous bullet. In contrast, processing I/O requests at faster speeds poses no such dangers, so the government imposes no limits.
Neither IOPS nor MB/s match MPH exactly. Earlier this week, I related IOPS to "Questions handled per hour" at the local public library, and MB/s to "Spoken words per minute" in those replies. If I tried to find a metric based on unit type to match the "per second" in IOPS and MB/s, then I would need to find a unit that equated to "I/O requests" or "MB transferred" rather than something related to "distance travelled".
In terms of time-based units, the closest I could come up with for IOPS was acceleration rate of zero-to-sixty MPH in a certain number of seconds. Speeding up to 60MPH, then slamming the brakes, and then back up to 60MPH, start-stop, start-stop, and so on, would reflect what IOPS is doing on a request by request basis, but nobody drives like this (except maybe the taxi cab drivers here in Malaysia!)
Since vehicles are limited to speed limits in normal road conditions, the closest I could come up with for MB/s would be "passenger-miles per hour", such that high-occupancy vehicles like school buses could deliver more passengers than low-occupancy vehicles with only a few passengers.
Neither start-stops nor passenger-miles per hour have standardized benchmarks, so they don't work well for comparison between vehicles. If you or anyone can come up with a metric that will help explain the relevance of standardized benchmarks better than the MPG that I already used, I would be interested in it.
You also mention, Barry, the term "efficiency" but mileage is about "fuel economy". Wikipedia is quick to point out that although the fuel efficiency of petroleum engines has improved markedly in recent decades, this does not necessarily translate into fuel economy of cars. The same can be said about storage: faster internal bandwidth on the backplane between controllers and faster HDDs do not necessarily translate to better external performance of the disk system as a whole. You correctly point this out in your blog about the DMX-4:
Complementing the 4Gb FC and FICON front-end support added to the DMX-3 at the end of 2006, the new 4Gb back-end allows the DMX-4 to support the latest in 4Gb FC disk drives.
You may have noticed that there weren't any specific performance claims attributed to the new 4Gb FC back-end. This wasn't an oversight, it is in fact intentional. The reality is that when it comes to massive-cache storage architectures, there really isn't that much of a difference between 2Gb/s transfer speeds and 4Gb/s.
Oh, and yes, it's true - the DMX-4 is not the first high-end storage array to ship a 4Gb/s FC back-end. The USP-V, announced way back in May, has that honor (but only if it meets the promised first shipments in July 2007). DMX-4 will be in August '07, so I guess that leaves the DS8000 a distant 3rd.
This also explains why the IBM DS8000, with its clever "Adaptive Replacement Cache" algorithm, has such high SPC-1 benchmarks despite the fact that it still uses 2Gbps drives inside. Given that it doesn't matter between 2Gbps and 4Gbps on the back-end, why would it matter which vendor came first, second or third, and why call it a "distant 3rd" for IBM? How soon would IBM need to announce similar back-end support for it to be a "close 3rd" in your mind?
I'll wrap up with your excellent comment that Watts per GB is a typical "green" metric. I strongly support the whole "green initiative" and I used "Watts per GB" last month to explain about how tape is less energy-consumptive than paper. I see on your blog you have used it yourself here:
The DMX-3 requires less Watts/GB in an apples-to-apples comparison of capacity and ports against both the USP and the DS8000, using the same exact disk drives
It is not clear if "requires less" means "slightly less" or "substantially less" in this context, and I have no facts from my own folks within IBM to confirm or deny it. Given that tape is orders of magnitude less energy-consumptive than anything EMC manufactures today, the point is probably moot.
I find it refreshing, nonetheless, to have agreed-upon "energy consumption" metrics to make such apples-to-apples comparisons between products from different storage vendors. This is exactly what customers want to do with performance as well, without necessarily having to run their own benchmarks or work with specific storage vendors. Of course, Watts/GB consumption varies by workload, so to make such comparisons truly apples-to-apples, you would need to run the same workload against both systems. Why not use the SPC-1 or SPC-2 benchmarks to measure the Watts/GB consumption? That way, EMC can publish the DMX performance numbers at the same time as the energy consumption numbers, and then HDS can follow suit for its USP-V.
I'm on my way back to the USA soon, but wanted to post this now so I can relax on the plane.
I am still wiping the coffee off my computer screen, inadvertently sprayed when I took a sip while reading HDS' uber-blogger Hu Yoshida's post on storage virtualization and vendor lock-in.
HDS is a major vendor for disk storage virtualization, and Hu Yoshida has been around for a while, so I felt it was fair to disagree with some of the generalizations he made to set the record straight. He's been more careful ever since.
However, his latest post [The Greening of IT: Oxymoron or Journey to a New Reality] mentions an expert panel at SNW that included Mark O’Gara, Vice President of Infrastructure Management at Highmark. I was not at the SNW conference last week in Orlando, so I will just give the excerpt from Hu's account of what happened:
"Later I had the opportunity to have lunch with Mark O’Gara. Mark is a West Point graduate so he takes a very disciplined approach to addressing the greening of IT. He emphasized the need for measurements and setting targets. When he started out he did an analysis of power consumption based on vendor specifications and came up with a number of 513 KW for his data center infrastructure....
The physical measurements showed that the biggest consumers of power were in order: Business Intelligence Servers, SAN Storage, Robotic tape Library, and Virtual tape servers....
Another surprise may be that tape libraries are such large consumers of power. Since tape is not spinning most of the time they should consume much less power than spinning disk - right? Apparently not if they are sitting in a robotic tape library with a lot of mechanical moving parts and tape drives that have to accelerate and decelerate at tremendous speeds. A Virtual Tape Library with de-duplication factor of 25:1 and large capacity disks may draw significantly less power than a robotic tape library for a given amount of capacity."
Obviously, I know better than to sip coffee whenever reading Hu's blog. I am down here in South America this week, the coffee is very hot and very delicious, so I am glad I didn't waste any on my laptop screen this time, especially reading that last sentence!
In that report, a 5-year comparison found that a repository based on SATA disk was 23 times more expensive overall, and consumed 290 times more energy, than a tape library based on LTO-4 tape technology. The analysts even considered a disk-based Virtual Tape Library (VTL). Focusing just on backups, at a 20:1 deduplication ratio, the VTL solution was still 5 times more expensive than the tape library. If you use the 25:1 ratio that Hu Yoshida mentions in his post above, that would still be 4 times more than a tape library.
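For those who want to check the 25:1 figure, here is the arithmetic as a small Python sketch. It assumes the VTL's cost relative to tape shrinks in direct proportion to the deduplication ratio, which is my own simplification and not necessarily the model the analysts used. The function name is hypothetical.

```python
def vtl_cost_multiple(known_multiple, known_ratio, new_ratio):
    """Rescale a known VTL-versus-tape cost multiple to a different dedup ratio.

    Assumption: cost is inversely proportional to the deduplication ratio.
    """
    return known_multiple * known_ratio / new_ratio


# The report's figure: 5 times the cost of tape at 20:1 deduplication.
multiple_at_25 = vtl_cost_multiple(5, 20, 25)

# Ratio at which the VTL would merely match tape, under the same assumption.
break_even_ratio = 5 * 20

print(multiple_at_25, break_even_ratio)
```

Under that assumption the VTL comes out at 4 times the cost of tape at 25:1, and would need something like 100:1 deduplication just to break even.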
I am not disputing Mark O'Gara's disciplined approach. It is possible that Highmark is using a poorly written backup program, taking full backups every day, to an older non-IBM tape library, in a manner that causes no end of activity to the poor tape robotics inside. But rather than changing over to a VTL, perhaps Mark might be better off investigating the use of IBM Tivoli Storage Manager, using progressive backup techniques, appropriate policies, parameters and settings, to a more energy-efficient IBM tape library. In well tuned backup workloads, the robotics are not very busy. The robot mounts the tape, and then the backup runs for a long time filling up that tape, all the while the robot is idle waiting for another request.
(Update: My apologies to Mark and his colleagues at Highmark. The above paragraph implied that Mark was using bad products or configured them incorrectly, and was inappropriate. Mark, my full apology [here])
If you do decide to go with a Virtual Tape Library, for reasons other than energy consumption, doesn't it make sense to buy it from a vendor that understands tape systems, rather than buying it from one that focuses on disk systems? Tape system vendors like IBM, HP or Sun understand tape workloads as well as related backup and archive software, and can provide better guidance and recommendations based on years of experience. Asking advice about tape systems, including Virtual Tape Libraries, from a disk vendor is like asking for advice on different types of bread from your butcher, or advice about various cuts of meat at the bakery.
The butchers and bakers might give you answers, but it may not be the best advice.
Ken Gibson has written a four-part series about where the storage industry is going, on his Storage Thoughts blog. You can find the four parts here (Part 1, Part 2, Part 3, Part 4).
His analysis of the storage industry is based on the concepts in Clayton Christensen's latest book, "Seeing What's Next", which follows on the heels of his two earlier successes, "The Innovator's Dilemma" and "The Innovator's Solution". I've only read the first book, "The Innovator's Dilemma", but need to check out these other two.
Ken explores the efforts of the incumbent players, and I agree IBM is farthest along, but not only for our "Storage Tank" architecture. For those not aware of Storage Tank, it was the code-name of a project from IBM's Almaden Research Center, productized as IBM System Storage SAN File System (SFS). Earlier this year the advanced policy-based data placement, movement and expiration features of SFS were copied over to IBM's General Parallel File System (GPFS) which has wide adoption among the High-Performance Technical Computing (HPTC) community. As I've said before, switching from one file system to another is hard, so it makes sense for HPTC clients who already use GPFS to make use of these new features by staying with GPFS, rather than trying to get them to move to SFS.
I also like Ken's analysis of "overshot" and "undershot" clients. Overshot clients are those that find what the marketplace delivers already "good enough" for their needs, and are price sensitive against paying for features they don't think they need. Undershot clients are those for whom the current marketplace offerings are not yet good enough, and who are willing to pay a premium to the vendor or supplier that can get them closer to what they are looking for.
Changes are afoot, and it is an exciting time to be involved in the storage industry.
Stephen Colbert, of The Colbert Report, explains the name changes in recent mergers of the Telecommunications industry. A discussion on "changing names" and how that impacts storage seems like a good way to wrap up the week's theme on naming conventions.
Name changes are sometimes painful, but are often done for a purpose, such as to promote a family. In the US, when a man and woman marry, the woman often changes her family name to match her husband's, and the kids all adopt the father's family name. I say "often" because there are times where the woman keeps her name, or adds to it in a hyphenated way. ABC News reported that a Man Fights to Take Wife's Name in Marriage. KipEsquire, a lawyer, writes about it in his blog A stitch in haste.
The IT industry sometimes changes the names of products that people knew as something else. Other times, it re-uses an existing name, when really it is or should be different from the original. Last year, I took on the job of helping transition from our brand "TotalStorage" to the "System Storage" product line under the new "IBM Systems" brand. I help decide what stays the same name or what changes, when it should change, and how to announce that change.
On the disk side, IBM renamed Fibre Array Storage Technology, or FAStT, which was pronounced exactly like "fast", to DS4000 series. This was a big improvement, as people couldn't seem to spell it properly, with variations like "FastT". Nor could people pronounce it properly, saying "fast-tee" instead. The advantage of "DS" is that it is both easy to spell, and easy to pronounce. The DS4000 series continues to be "fast", providing excellent performance for its midrange price category.
IBM's Enterprise Storage Server (ESS) line went from model E10, to F20, to 750 and 800. When IBM came out with its replacement, the IBM TotalStorage DS8000, some people asked why it wasn't named the ESS 900, for example. The DS8000 is quite different internally, new hardware design and implementation, but is highly compatible with the ESS line, and shares much of the same functionality from microcode. Last year, it was replaced by the IBM System Storage DS8000 Turbo. Again, newer hardware, so it was easy to justify the new name change from "TotalStorage" to "System Storage".
Renaming a product risks losing its certifications and awards. For example, IBM spent a lot of time and money getting the OS/390 operating system certified as a "UNIX" platform. When it was renamed to z/OS, IBM had to do it all over again. Learning from this experience, IBM decided not to rename the SAN Volume Controller to a new designation like "DS5750", as it enjoys the "number one" spot on both the SPC-1 and SPC-2 performance benchmarks, and is recognized as the leader in the disk storage virtualization marketplace. Renaming this product would mean losing that collateral.
IBM's "other disk systems" the N series posed another set of challenges. The current DS line already has entry-level (DS3000), midrange (DS4000) and enterprise-class (DS6000 and DS8000) products. The OEM agreement that IBM has with Network Appliance (NetApp) resulted in a new set of entry-level, midrange, and enterprise-class products. But these didn't fit nicely into the DS3000-to-DS8000 continuum. Instead, IBM decided to go with N series, using N3000 for entry-level, N5000 for midrange, and N7000 for enterprise-class. These are different than the numbers used by NetApp for their comparable, but not identical, offerings.
On the tape side, IBM decided to put tape drives in the TS1000 and TS2000 ranges, tape libraries and automation in the TS3000 range, and tape virtualization in the TS7000 range. A lot of tape products already had 3000 numbering that had to change to fit this new scheme. This is why IBM's popular 3592 tape drive was renamed to the TS1120. The replacement to the 3494 Virtual Tape Server was named TS7700 Virtualization Engine.
Obviously, you can't change the names of products that are currently in the field, but what about existing software with minor updates? IBM decided to leave "TotalStorage Productivity Center" under the "TotalStorage" brand until it has a significant version upgrade. Many people say "TPC" as a convenient acronym when referring to this product, but TPC is a registered trademark of the Professional Golfers Association (PGA) to refer to its "Tournament Players Club".
How can anyone confuse "managing storage" with "playing golf"? One activity is full of frustration that takes years or decades to master, involving the need to understand a variety of equipment and techniques to use each properly to accomplish your goals; and the other is an enjoyable activity, immediately productive in front of a single pane of glass managing all of your DAS, SAN and NAS storage, from reporting on your files and databases to managing storage networks and tape libraries.
This week and next I am touring Asia, meeting with IBM Business Partners and sales reps about our July 10 announcements.
Clark Hodge might want to figure out where I am, given the nuclear reactor shutdowns from an earthquake in Japan. His theory is that you can follow my whereabouts just by following the news of major power outages throughout the world.
So I thought this would be a good week to cover the topic of Business Continuity, which includes disaster recovery planning. When making Business Continuity plans, I find it best to work backwards. Think of the scenarios that would require such recovery actions to take place, then figure out what you need to have at hand to perform the recovery, and then work out the tasks and processes to make sure those things are created and available when and where needed.
I will use my IBM Thinkpad T60 as an example of how this works. Last week, I was among several speakers making presentations to an audience in Denver, and this involved carrying my laptop from the back of the room up to the front of the room several times. When I got my new T60 laptop a year ago, the instructions specifically stated NOT to carry the laptop while the disk drive was spinning, to avoid vibrations and gyroscopic effects. They suggested always putting the laptop in standby, hibernate or shutdown mode prior to transportation, but I haven't yet gotten in the habit of doing this. After enough trips back and forth, I had somehow corrupted my C: drive. It wasn't a complete corruption; I could still use Microsoft PowerPoint to show my slides, but other things failed, sometimes with the fatal BSOD and other times less drastically. Perhaps the biggest annoyance was that I lost a few critical DLL files needed for my VPN software to connect to IBM networks, so I was unable to download or access e-mail or files inside IBM's firewall.
Fortunately, I had planned for this scenario, and was able to recover my laptop myself, which is important when you are on the road and your help desk is thousands of miles away. (In theory, I am now thousands of miles closer to our help desk folks in India and China, but perhaps further away from those in Brazil.) Not being able to respond to e-mail for two days was one thing, but no access for two weeks would have been a disaster! The good news: My system was up and running before leaving for the trip I am on now to Asia.
Following my three-step process, here's how this looks:
Step 1: Identify the scenario
In this case, my scenario is that the file system that runs my operating system is corrupted, but my drive does not have hardware problems. Running PC-Doctor confirmed the hardware was operating correctly. This can happen in a variety of ways: errant application software upgrades, malicious viruses, or in my case, picking up your laptop and carrying it across the room while the disk drive is spinning.
Step 2: Figure out what you need at hand
All I needed to do was repair or reload my file system. "Easier said than done!" you are probably thinking. Many people use IBM Tivoli Storage Manager (TSM) to back up their application settings and data. Corporate include/exclude lists avoid backing up the same Windows files from everyone's machines. This is great for those who sit at the same desk, in the same building, and would be given a new machine with Windows pre-installed as the start of their recovery process. If on the other hand you are traveling, and can't access your VPN to reach your TSM server, you have to do something else. This is often called "Bare Metal Restore" or "Bare Machine Recovery", BMR for short in both cases.
I carry with me on business trips bootable rescue compact discs, DVDs of full system backup of my Windows operating system, and my most critical files needed for each specific trip on a separate USB key. So, while I am on the road, I can re-install Windows, recover my applications, and copy over just the files I need to continue on my trip, and then I can do a more thorough recovery back in the office upon return.
Step 3: Determine the tasks and processes
In addition to backing up with IBM TSM, I also use IBM Thinkvantage Rescue and Recovery to make local backups. IBM Rescue and Recovery is provided with IBM Thinkpad systems, and allows me to back up my entire system to an external 320GB USB drive that I can leave behind in Tucson, as well as create bootable recovery CDs and DVDs that I can carry with me while traveling.
The problem most people have with a full system backup is that their data changes so frequently, they would have to take backups too often, or recover "very old" data. Most Windows systems are pre-formatted as one huge C: drive that mixes programs and data together. However, I follow best practice, separating programs from data. My C: drive contains the Windows operating system, along with key applications, and the essential settings needed to make them run. My D: drive contains all my data. This has the advantage that I only have to back up my C: drive, and this fits nicely on two DVDs. Since I don't change my operating system or programs that often, a monthly or quarterly backup is frequent enough.
In my situation in Denver, only my C: drive was corrupted, so all of my data on D: drive was safe and unaffected.
When it comes to Business Continuity, it is important to prioritize what will allow you to continue doing business, and what resources you need to make that happen. The above concepts apply from laptops to mainframes. If you need help creating or updating your Business Continuity plan, give IBM a call.
Continuing my quest to "set the record straight" about [IBM XIV Storage System] and IBM's other products, I find myself amused at some of the FUD out there. Some are almost as absurd as the following analogy:
Humans share over 50 percent of DNA with bananas. [source]
If you peel a banana, and put the slippery skin down on the sidewalk outside your office building, it could pose a risk to your employees.
If you peel a human, the human skin placed on the sidewalk in a similar manner might also pose similar risks.
Mr. Jones, who applied for the opening in your storage administration team, is a human being.
You wouldn't hire a banana to manage your storage, would you? This might be too risky!
The conclusion we are led to believe is that hiring Mr. Jones, a human being, is as risky as putting a banana peel down on the sidewalk. Some bloggers argue that they are merely making a series of factual observations, and letting their readers form their own conclusions. For example, the IBM XIV storage system has ECC-protected mirrored cache writes. Some false claims about this were [properly retracted] using strike out font to show the correction made; other times the same statement appears in another post from the same blogger that [has not yet been retracted] (Update: has now been corrected). Other bloggers borrow the false statement [for their own blog], perhaps not realizing the retractions were made elsewhere. Newspapers are unable to fix a previous edition, so are forced to publish retractions in future papers. With blogs, you can edit the original and post the changed version, annotated accordingly, so mistakes can be corrected quickly.
While it is possible to compare bananas and humans on a variety of metrics--weight, height, and dare I say it, caloric value--it misses the finer differences of what makes them different. Humans might share 98 percent with chimpanzees, but having an opposable thumb allows humans to do things that other animals cannot.
Full Disclosure: I am neither vegetarian nor cannibal, and harbor no ill will toward bananas nor chimpanzees. No bananas or chimpanzees were harmed in the writing of this blog post. Any similarity between the fictitious Mr. Jones in the above analogy and actual persons, living or dead, is purely coincidental.
So let's take a look at some of IBM XIV Storage System's "opposable thumbs".
The IBM XIV system comes pre-formatted and ready to use. You don't have to spend weeks in meetings deciding between different RAID levels and then formatting different RAID ranks to match those decisions. Instead, you can start using the storage on the IBM XIV Storage System right away.
The IBM XIV offers consistent performance, balancing I/O evenly across all disk drive modules, even when performing SnapShot processing, or recovering from component failure. You don't have to try to separate data to prevent one workload from stealing bandwidth from another. You don't have to purchase extra software to determine where the "hot spots" are on the disk. You don't have to buy other software to help re-locate and re-separate the data to re-balance the I/Os. Instead, you just enjoy consistent performance.
The IBM XIV offers thin provisioning, allowing LUNs to grow as needed to accommodate business needs. You don't have to estimate or over-allocate space for planned future projects. You don't have to monitor if a LUN is reaching 80 or 90 percent full. You don't have to carve larger and larger LUNs and schedule time on the weekends to move the data over to these new bigger spaces. Instead, you just write to the disk, monitoring the box as a whole, rather than individual LUNs.
The IBM XIV Storage System's innovative RAID-X design allows drives to be replaced with drives of any larger or smaller capacity. You don't have to find the exact same 73GB 10K RPM drive to match the existing 73GB 10K RPM drive that failed. Some RAID systems allow "larger than original" substitutions, for example a 146GB drive to replace a 73GB drive, but the added capacity is wasted, because of the way most RAID levels work. The problem is that many failures happen 3-5 years out, and disk manufacturers move on to bigger capacities and different form factors, making it sometimes difficult to find an exact replacement or forcing customers to keep their own stock of spare drives. Instead, with the IBM XIV architecture, you sleep well at night, knowing it allows future drive capacities to act as replacements, and getting the full value and usage of that capacity.
In the case of IBM XIV Storage System, it is not clear whether
"Vendors" are those from IBM and IBM Business Partners, including bloggers like me employed by IBM, and "everybody else" includes IBM's immediate competitors, including bloggers employed by them.
-- or --
"Vendors" includes IBM and its competitors including any bloggers, so that "everybody else" refers instead to anyone not selling storage systems, but opinionated enough to not qualify as "objective third-party sources".
-- or --
"Vendors" includes official statements from IBM and its competitors, and "everybody else" refers to bloggers presenting their own personal or professional opinions, that may or may not correspond to their employers.
That said, feel free to comment below on which of these you think the last two points of Steinhardt's rule are trying to capture. Certainly, I can't argue with the top two: a customer's own experience and the experiences of other customers, which I mentioned previously in my post [Deceptively Delicious].
In that light, here is a 5-minute video on IBM TV with a customer testimonial from the good folks at [NaviSite], one of our many customer references for the IBM XIV Storage System.
Last year in Beijing, China, one of my colleagues told me "When it rains here, cabs dry up". Normally, there are enough taxi cabs to handle normal conditions, but when it rains, people who normally walk now want to take a cab instead, and the demand goes up, making it more difficult to find one when you need one.
I'm wrapping up my week here in Chicago, and it snowed yesterday. Cabs were scarce. I walked. Many others walked too, about half with umbrellas to protect themselves against the snowflakes.
Most systems are designed to handle typical average conditions. Taxi cabs in a city, for example, handle typical amounts of traffic.
IT is different. In many cases, IT infrastructures are designed for the peaks, not the averages. Peaks can be where you need performance the most, and failure to design for peaks can be disastrous. As with any business decision, this represents a trade-off. Design for the average, and suffer through the peaks, or design for the peak, and be over-allocated and under-utilized most of the time otherwise.
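To make the trade-off concrete, here is a minimal Python sketch. The demand figures are made up for illustration, and the function name `utilization` is my own, not taken from any product. It compares a system sized for the average against one sized for the peak, reporting how busy each is and how often demand exceeds what it can serve.

```python
def utilization(demand, capacity):
    """Return (average utilization, fraction of intervals that are overloaded)."""
    # Work actually served in each interval is capped by capacity.
    served = [min(d, capacity) for d in demand]
    avg_util = sum(served) / (capacity * len(demand))
    overloaded = sum(1 for d in demand if d > capacity) / len(demand)
    return avg_util, overloaded

# Hypothetical demand per interval (for example, I/O requests per second),
# mostly quiet with one sharp peak.
demand = [100, 100, 100, 100, 100, 100, 100, 500]

average = sum(demand) / len(demand)
peak = max(demand)

sized_for_average = utilization(demand, average)
sized_for_peak = utilization(demand, peak)

print("sized for average:", sized_for_average)
print("sized for peak:   ", sized_for_peak)
```

With these made-up numbers, the peak-sized system is never overloaded but sits mostly idle, while the average-sized system is busier overall and falls short in the one interval that matters most.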
However, I have to assume his real question is ... "what is the quick and easy way for me to build a lightweight database app like Microsoft Access that I can distribute as a standalone executable?"
To which I would say "Lotus has a program called Approach, which is part of Lotus SmartSuite, which some people still use. However, a lot of the focus in IBM now centers around the lightweight Cloudscape database which IBM acquired from Informix, which is now known as the [open source project called Derby]. Many IBM and Lotus products, such as Lotus Expeditor use the JDBC connection to Derby, which allows you to use Windows, Linux, Flash, etc. ... with no vendor lock in".
I am familiar with Cloudscape, and I evaluated it as a potential database for IBM TotalStorage Productivity Center, when I was the lead architect defining the version 1 release. It runs entirely on Java, which is both a plus and minus. Plus in that it runs anywhere Java runs, but a minus in that it is not optimized for high performance or large scalability. Because of this, we decided instead on the full commercial DB2 database for Productivity Center.
Not to be outdone, my colleagues over at DB2 offered a different alternative, [DB2 Express-C], which runs on a variety of Windows, Linux-x86, and Linux on POWER platforms. It is "free" as in beer, not free as in speech, which means you can download and use it today at no charge, and even ship products with it included, but you are not allowed to modify and distribute altered versions of it, as you can with "free as in speech" open source code, as in the case of Derby above (see [Apache License 2.0] for details).
As I see it, DB2 Express-C has two key advantages. First, if you like the free version, you can purchase a "support contract" for those that need extra hand-holding, or are using this as part of a commercial business venture. Second, for those who do prefer vendor lock-in, it is easy to upgrade Express-C to the full IBM DB2 database product, so if you are developing a product intended for use with DB2, you can develop it first with DB2 Express-C, and migrate up to full DB2 commercial version when you are ready.
This is perhaps more information than you expected for such a simple question. Meanwhile, I am still trying to figure out MySQL as part of my [OLPC volunteer project].
My father's favorite question is "What's the worst that could happen?" He is retired now, but worked at the famous [Kitt Peak National Observatory] designing some of the largest telescopes. Designing telescopes followed well-established mechanical engineering best practices, but each design was unique, so there was always a chance that the end result would not deliver the expected results. What's the worst that can happen? For telescopes, a few billion dollars are wasted and a few years are added to the schedule. Scrap it and start over. Nothing unrecoverable for the US government with unlimited resources and patience.
... the rest of the grimness on the front page today will matter a bit, though, if two men pursuing a lawsuit in federal court in Hawaii turn out to be right. They think a giant particle accelerator that will begin smashing protons together outside Geneva this summer might produce a black hole or something else that will spell the end of the Earth — and maybe the universe.
Scientists say that is very unlikely — though they have done some checking just to make sure.
The world’s physicists have spent 14 years and $8 billion (US dollars) building the Large Hadron Collider, in which the colliding protons will recreate energies and conditions last seen a trillionth of a second after the Big Bang. Researchers will sift the debris from these primordial recreations for clues to the nature of mass and new forces and symmetries of nature.
But Walter L. Wagner and Luis Sancho contend that scientists at the European Center for Nuclear Research, or CERN, have played down the chances that the collider could produce, among other horrors, a tiny black hole, which, they say, could eat the Earth. Or it could spit out something called a “strangelet” that would convert our planet to a shrunken dense dead lump of something called “strange matter.” Their suit also says CERN has failed to provide an environmental impact statement as required under the National Environmental Policy Act.
Although it sounds bizarre, the case touches on a serious issue that has bothered scholars and scientists in recent years — namely how to estimate the risk of new groundbreaking experiments and who gets to decide whether or not to go ahead.
What's the worst that can happen? Scientists now agree that it is sometimes difficult to predict, and some effects may be unrecoverable.
Unfortunately, this is not the only example of people attempting things they may not understand well enough. The web comic below has someone complaining they are out of disk space, and the sales rep suggests solving this with a few commands which will result in deleting all her files. Hopefully, most people reading will recognize this is meant as humor, and not actually attempt the code fragments to "see what they do".
This is a webcomic called "Geek and Poke". If you dare to read the punchline, click here: Funny Geeks - Part 5.
Warning: Do not try the code fragments unless you know what to expect!
Sadly, I often encounter clients who have a "keep forever" approach to their production data. When they are seriously out of space, they feel forced to either buy more disk storage, or start "the big Purge": deleting rows from their database tables, emails older than 90 days, or some other drastic measures. With a focus on keeping down IT budgets, I fear that these drastic measures are growing more common. What's the worst that could happen? You might need that data for defending yourself against a lawsuit, or need it to continue to provide service to a loyal client, or just continue normal business operations. I have visited companies where a junior administrator chose the "big Purge" option, without a full understanding of what they were doing, resulting in business disruption until the data could be recovered or re-entered.
IBM offers a better way. Data that may not be needed on disk forever could be moved to lower-cost tape, using up less energy and less floorspace in your data center. Solutions can automatically delete the data systematically based on chronological or event-based retention policies, with the option to keep some data longer in response to a "legal hold" request.
That's certainly better than to risk shrinking your business into a "dense dead lump"!
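For readers who want to see what chronological versus event-based retention with a legal hold might look like, here is a minimal Python sketch. The `Record` class, its field names, and the `may_delete` function are hypothetical, invented for this illustration; this is not how any particular IBM product implements retention.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Record:
    name: str
    created: date
    closed: Optional[date] = None   # date of the triggering event, if it has happened
    legal_hold: bool = False        # set in response to a "legal hold" request

def may_delete(rec, today, keep_days, event_based=False):
    """Decide whether a record is past its retention period."""
    if rec.legal_hold:
        return False                # a legal hold overrides any policy
    # Chronological policies count from creation; event-based ones from the event.
    start = rec.closed if event_based else rec.created
    if start is None:
        return False                # the event has not occurred yet, so keep it
    return today - start > timedelta(days=keep_days)

today = date(2008, 4, 1)
old_email = Record("old-email", created=date(2007, 1, 1))
held_email = Record("held-email", created=date(2007, 1, 1), legal_hold=True)
open_account = Record("open-account", created=date(2000, 1, 1))

print(may_delete(old_email, today, keep_days=90))                       # chronological
print(may_delete(held_email, today, keep_days=90))                      # on legal hold
print(may_delete(open_account, today, keep_days=365, event_based=True)) # event pending
```

The point of the sketch is the order of the checks: the legal hold is tested first, so no age-based rule can delete data that someone has asked to keep.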
I am back from China, and now glad to be back in the old USA. Last week, someone asked me what it would take to add a specific feature to the IBM System Storage DS8300. The what-would-it-take question is well-known among development circles informally as a "sizing" effort, or more formally as a "Development Expense" estimate.
For software engineering projects, the process was simply that an architect would estimate the number of "Lines of Code" (LOC) typically represented in thousands of lines of code (KLOC). This single number would convert to another single number, "person-months", which would then translate to another single number "dollars". Once you had KLOC, the rest followed directly from a formula, average or rule-of-thumb.
More amazing is that this single number could then determine a variety of other numbers, the number of total months for the schedule, the number of developers, testers, publication writers and quality assurance team members needed, and so on. Again, these were developed using a formula, developed and based on past experience of similar projects.
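The chain of single numbers described above can be sketched in a few lines of Python. Every rate here (lines of code per person-month, dollars per person-month, team size) is a placeholder I made up for illustration; real shops derive their own from past projects, and the function name `size_project` is hypothetical.

```python
import math

def size_project(kloc,
                 loc_per_person_month=500,          # assumed productivity rate
                 dollars_per_person_month=15_000,   # assumed loaded cost
                 team_size=5):                      # assumed staffing
    """Turn one KLOC estimate into person-months, dollars and schedule months."""
    person_months = kloc * 1000 / loc_per_person_month
    dollars = person_months * dollars_per_person_month
    # Naive schedule: divide the effort evenly across the team.
    schedule_months = math.ceil(person_months / team_size)
    return person_months, dollars, schedule_months

person_months, dollars, schedule_months = size_project(20)
print(person_months, dollars, schedule_months)
```

Note that the schedule line simply divides effort by headcount, which assumes people and months are interchangeable. That is a simplification for the sketch, and it ignores the communication overhead that grows as a team gets larger.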
Hardware design introduces a different set of challenges. When I was getting my Masters Degree in Electrical Engineering, it took four other grad students and me a full semester just to design a six-layer, 900 transistor silicon chip, which could only perform a single function: multiply two numbers together. At IBM, another book that I was given to read was Soul of a New Machine, documenting six hardware engineers, and six software engineers, working long hours on a tight schedule to produce a new computer for Data General.
So why do I bring this up now? IBM architects William Goddard and John Lynott are being inducted posthumously this year into the prestigious National Inventors Hall of Fame for their disk system innovation.
Under the leadership of Reynold Johnson, the team developed an air-bearing head to “float” above the disk without crashing into the disk. Imagine a fighter airplane flying full speed across the country-side at 50 feet off the ground. If you ever heard the term "my disk crashed", it was originally referring to the read/write head touching the disk surface, causing terrible damage.
A uniformly flat disk surface was created by spinning the coating onto the rapidly rotating disk, leaving many wearing lab coats covered with disk liquid at waist level. Developing disk-to-disk and track-to-track access mechanisms proved more challenging, and nearly halted the project. The team, however, was adamant that this problem could be solved, and customers were increasingly asking for random access technology. The result was the "350 Disk Storage Unit" designed for the "305 RAMAC computer", which I talked about a lot last year as part of our "50 years of disk systems innovation" celebration.
Neither Goddard nor Lynott had computing experience prior to joining IBM. Goddard was a former science teacher who briefly worked in aerospace. Lynott had been a mechanic in the Navy and later a mechanical engineer. They didn't have a nice formula based on past experience, they didn't have the benefit of Fred Brooks' advice, or the rules-of-thumb or averages now used to estimate the size of projects. They had to break new ground.
Next week, May 11-15, I will be in Las Vegas, Nevada for the [IBM Edge 2015 conference], covering IBM System Storage, z Systems and POWER Systems. Is this your first time going?
We have sold out, with over 6,000 registrants, way more than the 4,200 we had last year. If this is your first time to Edge, then I thought I would get some of my colleagues to help set some expectations.
So, I asked several of my colleagues a few questions, to help provide a variety of viewpoints. Here were my questions:
What should new attendees expect at Edge?
What are the trends that IBM is focused on? How is IBM differentiating themselves in the marketplace?
Why does it matter to our attendees? What's in it for them?
What is the call to action for my readers?
What should new attendees expect at Edge?
"IBM Real-time Compression technology is field-proven and allows seamless scaling as demand for data storage capacity expands. The wave of new applications and data flowing from systems of engagement create a data management challenge. With Real-time Compression, which compresses data in-line and can store up to five times more data in the same physical space, IBM can help to address these challenges. Compression performance is optimized for enterprise-level and cloud workloads. Expect some exciting announcements in regards to this technology!"
-- Ori Bauer,
Director, WW Development and IBM Systems Israel Development Center
"They should expect Demonstrations and Exhibits! One featuring the capabilities of the FlashSystem V9000 through its slick and easy to use graphical user interface. The other featuring breakthrough technology from IBM Almaden Research.
A 'Digital Rack' introducing five configurations of the FlashSystem V9000 and 900 models that address five pain points customers face. A FlashSystem V9000 module showcasing its rich Tier-1 storage features and virtualization capabilities. A full rack / full configuration FlashSystem V9000 deployment illustrating over two Petabytes (2 PB) effective capacity."
-- Bill Bostic
Manager, FlashSystem Development and Exploitation
"The attendees can expect to see how the IBM DS8870 leadership allows clients to achieve the most value from the latest generation of IBM Mainframe, the z13!
Together, the DS8870 and z13 deliver end-to-end FICON speeds using 16Gbps Fibre Channel. These faster links have industry leading reliability as they are protected by new IBM-led standards to make 16 GB FC 300x more reliable. Using z Systems High Performance FICON (zHPF) protocols, DB2 utilities and Log writes can be reduced by up to 68 percent at 100 KM distances and 7 percent at machine room distances. This capability complements the GDPS and TPC-R HyperSwap function for continuous availability by mitigating the distance penalty when the secondary disks are in a second site up to 100 KM away.
Along with better performance and reliability the DS8870 extends the autonomic I/O management function of the z/OS work load manager into the SAN fabric with Fabric I/O Priority Management. For write operations the fabric priority is applied to the Metro Mirror traffic to provide a consistent policy across FCP and FICON sharing the same ISLs."
-- Harry Yudenfriend,
IBM Fellow, Storage Development
"There will be a VersaStack demo at the Cisco booth in the solutions center. There will be a client and IBM Business Partner giving testimony to the value of VersaStack."
-- Eric Stouffer
Director, IBM Storwize Business Line Executive
"Attendees can expect technology previews of exciting new features, like a highly flexible active-active solution for Storwize, next generation of cloud management integration technology for VMware, providing policy-based storage controls to cloud infrastructures that can take full advantage of advanced storage system features, and for integrating cloud storage into your traditional storage infrastructure."
-- Tommy Rickard,
Director, IBM Systems Development, United Kingdom
What are the trends that IBM is focusing on?
"Accelerating applications. The IBM FlashSystem 900 is designed for uncompromising performance, macro efficiency and enterprise reliability.
Radically improving data center efficiency, economics, and performance by exploiting the rich set of FlashSystem V9000 features including virtualization, dynamic tiering, real time compression, thin provisioning, snap shots, cloning, replication, high availability, with the performance of FlashSystem technology."
-- Bill Bostic
Manager, FlashSystem Development and Exploitation
"VersaStack Integrated Infrastructure Solution combines Cisco Network MDS & Nexus, Cisco UCS compute, and Storage with IBM Storwize V7000K, with Cisco UCS Director for single point of management, to address one of the most rapidly growing markets. Storwize improves integrated infrastructure in three key ways - 1) by virtualizing storage to transform utilization, including older or other less capable storage; 2) optimizing speed and cost of devices automatically using EasyTier; 3) squeezing the most data in to the least space with no performance penalty using real time compression. No other integrated infrastructure from the likes of Dell, HP or Oracle has the value of VersaStack."
-- Ian Shave,
Global Business Unit Executive for VersaStack & BD&A, IBM Storage
"Increasing performance of ethernet networks, especially in new cloud deployments, that increases the importance of iSCSI as an enterprise, high-performance storage network. IBM provides rich support for iSCSI and is continuing to invest in this area. Increasing dominance of cloud infrastructures, including private, hybrid, and public forms, that is driving the need for delegated policy-based management to take advantage of storage system features for the software defined environment. IBM provides the most mature, capable, and flexible storage infrastructure, with leading integration into OpenStack, VMware and other key cloud infrastructures.
Increasing opportunity for Cloud storage to satisfy traditional storage needs."
-- Tommy Rickard,
Director, IBM Systems Development, United Kingdom
Why does it matter to our attendees? What's in it for them?
"Leading the way in helping the industry and clients exploit flash technology. IBM FlashSystem 900 is the acknowledged technology for application acceleration. That foundation is extended in the FlashSystem V9000 with a rich set of features to enable replacement of disk in the data center. The IBM Data Engine for NoSQL, released last year, uses FlashSystem technology to leverage flash as memory. Now, IBM will be demonstrating the next step to expand to exploitation of flash beyond flash as memory. This technology will dramatically expand flash applicability for NoSQL use cases. It will simplify storage management and accelerate node recovery, while dramatically improving availability and reliability.
Flash technology promises the advantages of better performance, better economics, and better environmentals. More important is the opportunity to radically reshape IT and the ability to deliver new customer value. IBM R&D is focused on delivering leading edge extensible technology as well as blazing the trail to enable the industry and customers to exploit it."
-- Bill Bostic
Manager, FlashSystem Development and Exploitation
"The end-to-end value that only IBM can provide improves resilience, lowers costs and is better able to handle the demands of the emerging Cloud, Analytics, Mobile and Social (CAMS) workloads."
-- Harry Yudenfriend,
IBM Fellow, Storage Development
"Integrated infrastructure speeds time to value with much of the integration and customization laid out in a recipe called a Cisco Validated Design (CVD). This saves time and money for clients and partners. It also provides confidence for these clients in the implementation and use in their environments. Cisco provides a single solution support offering for VersaStack, too, simplifying the technical support for clients and partners."
-- Eric Stouffer
Director, IBM Storwize Business Line Executive
"Cloud architectures are driving the technical agenda for infrastructure. Many clients are making use of cloud technologies, either to satisfy storage needs using public cloud, or to streamline their processes using private cloud. The largest clients, such as some financial clients, are building new data centers modelled on cloud infrastructure, and the growth of cloud investment is defining the technical agenda for these installations."
-- Tommy Rickard,
Director, IBM Systems Development, United Kingdom
What is the call to action for my readers?
"Come to the demonstrations and exhibits at the Solution Center, which runs Monday lunch to Wednesday lunch, including Monday and Tuesday night receptions! This will be your opportunity to see these solutions live, and get to talk to the experts about them."
-- Bill Bostic
Manager, FlashSystem Development and Exploitation
"Check out my session cSY1467: DS8870 Exploitation of the z13 and z/OS I/O Enhancements, which I will present twice on Tuesday."
-- Harry Yudenfriend,
IBM Fellow, Storage Development
"See the VersaStack demo, go to the various sessions on VersaStack, some from IBM and others from Cisco, visit the [IBM on VersaStack] and/or [Cisco on VersaStack] web sites for more information, or watch the YouTube video [New VersaStack Solution by Cisco and IBM]. If you are a seller or IBM Business Partner, find a VersaStack Academy near you for more detailed education in person with experts from IBM and Cisco."
-- Ian Shave,
Global Business Unit Executive for VersaStack & BD&A, IBM Storage
If you didn't get into Edge, we are going to have a LiveStream of some of the keynote sessions on our [Digital Event Center] that you can watch from the comfort of your home or office. Go to the [Registration page] for details.
I arrive this Sunday afternoon. If you see me, stop and say "Hi!"
Twenty years ago, I flew to Atlanta for the semi-annual SHARE conference. I was a lead architect for DFSMS, the storage management software for mainframe servers. When I got to the hotel, I realized that I had forgotten to pack the saline solution for my contact lenses. I went to the hotel gift shop, and picked the first one I found. I put my contacts in the solution and went to bed.
The next morning, I put on my contacts, got dressed, and participated in meetings. One of my colleagues noticed my eyes were quite red, and suggested I switch from contact lenses to glasses. I went back to my hotel room, and saw to my horror that what I thought was saline solution was actually hydrogen peroxide intended for hard lenses. When I removed the lenses, all I could see was white light.
I managed to find my way to the elevator, and feel for the button with the star that indicated the lobby on the ground floor. I asked a hotel staffer to call me an ambulance, but instead, they put me in a cab, and sent me to Emory Hospital. On arrival, all I could do was hand over my wallet to my cabbie, and let him take out what he felt was fair, since I could not see him, the meter, or his license number.
After bumping my knees into dozens of cars in the parking lot, I finally made it to the ER, only to have the receptionist give me a form to fill out and a pen. At this point, I lost it. I gave her my wallet and said that any information she may need should be in there.
Thankfully, a doctor noticed this exchange, and took care of me right away. I had chemically burned off both corneas. He injected some green fluid into both eyeballs, and sent me off in a cab to the pharmacy. At least both eyes were bandaged in gauze, so people were kind enough to lead me to the counter to get my painkillers, Percocet.
The pharmacist provided me the pills, and warned me NOT to operate any heavy machinery under the influence of this medication. Seriously? I can't see, both eyes covered, and he tells me that?
I got back to the hotel, got ready for bed, took the pills and brushed my teeth. I woke up the next morning on the bathroom floor, still clutching the toothbrush, with vertical and horizontal lines across my right cheek made by the one-inch tiles of the bathroom floor. These pills really knocked me out.
That day, I had to present a full hour in front of hundreds of people. I had a colleague flip my transparencies for me, while I spoke to each one, my eyes still covered in gauze. That evening, I was one of the experts on the panel for a "Birds of a Feather", or BOF session, answering a variety of questions. People could see that I was blind, but I could still hear the questions, and I could still answer them as well.
If you are going to Edge 2013 in Las Vegas, please consider attending my BOF session on Security for PureSystems, System x and Storage products, scheduled for Thursday afternoon, June 13. I will be moderating a distinguished panel of experts to answer your questions! I have listed them here alphabetically:
Jack Arnold, US Federal. Jack has worked decades in the storage industry, and will provide insight into security issues related to the government.
Tom Benjamin, Development Manager for Key Lifecycle Management and Java Cryptography. Tom will bring his expertise in both TKLM and ISKLM for managing encryption keys, and how to communicate these between security and storage administrators.
Paul Bradshaw, Chief Storage Architect for Clouds. A research scientist from IBM's Almaden Research Lab, Paul will provide insight in how to deal with security issues related to private, hybrid and public cloud deployments.
Ajay Dholakia, Solution Center of Excellence. Ajay will cover server-side considerations for security deployments, including System x and PureSystems.
Jim Fisher, Advanced Technical Skills. Jim brings expertise related to deploying data-at-rest encryption.
Not sure what kind of questions to ask? Here is a series of Questions and Answers we had at a Storage event in 2011 that might give you a good idea: [2011 Storage Free-for-All].
Continuing this week's theme on Cloud Computing, Dynamic Infrastructure and Data Center Networking, IBM unveiled details of an advanced computing system that will be able to compete with humans on Jeopardy!, America’s favorite quiz television show. Additionally, officials from Jeopardy! announced plans to produce a human vs. machine competition on the renowned show.
For nearly two years, IBM scientists have been working on a highly advanced Question Answering (QA) system, codenamed "Watson" after IBM's first president, [Thomas J. Watson]. The scientists believe that the computing system will be able to understand complex questions and answer with enough precision and speed to compete on Jeopardy! Produced by Sony Pictures Television, the trivia questions on Jeopardy! cover a broad range of topics, such as history, literature, politics, film, pop culture, and science. It poses a grand challenge for a computing system due to the variety of subject matter, the speed at which contestants must provide accurate responses, and because the clues given to contestants involve analyzing subtle meaning, irony, riddles, and other complexities at which humans excel and computers traditionally do not. Watson will incorporate massively parallel analytical capabilities and, just like human competitors, Watson will not be connected to the Internet or have any other outside assistance.
If this all sounds familiar, you might remember some of the events that have led up to this:
In 1984, the movie ["The Terminator"] introduced the concept of [Skynet], a fictional computer system developed by the military that becomes self-aware from its advanced artificial intelligence.
In 1997, an IBM computer called Deep Blue defeated World Chess Champion [Garry Kasparov] in a famous battle of human versus machine. To compete at chess, IBM built an extremely fast computer that could calculate 200 million chess moves per second based on a fixed problem. IBM’s Watson system, on the other hand, is seeking to solve an open-ended problem that requires an entirely new approach – mainly through dynamic, intelligent software – to even come close to competing with the human mind. Despite their massive computational capabilities, today’s computers cannot consistently analyze and comprehend sentences, much less understand cryptic clues and find answers in the same way the human brain can.
In 2005, Ray Kurzweil wrote [The Singularity is Near] referring to the wonders that artificial intelligence will bring to humanity.
The research underlying Watson is expected to elevate computer intelligence and human-to-computer communication to unprecedented levels. IBM intends to apply the unique technological capabilities being developed for Watson to help clients across a wide variety of industries answer business questions quickly and accurately.
According to Gartner data (from 2005!), host-based storage accounts for 34 percent of the overall market for external storage, with the remaining 66 percent going to "fabric-attached" (network) storage, with this share expected to grow from 66 percent to 77 percent by 2007. What is the current reality? SAN vs. NAS, FC vs iSCSI?
IBM subscribes to a lot of data from different analysts, and they all have their own methods for collecting this data, from taking surveys of customers to reviewing the financial results of each vendor. While they might not agree entirely, there are some common threads that lead one to believe they represent "reality". Here are some numbers from an IDC December 2007 report:
Worldwide Disk Storage
While the 32/68 split is similar to the 34/66 split you mentioned before, you can see that external storage is growing faster, so internal host-based storage will drop to 25 percent by 2011, with external storage growing to 75 percent, very close to the 77 predicted. Looking at just the external disk storage, there are basically three kinds: DAS (direct cable attachment), NAS (file level protocols such as NFS, CIFS, HTTP and FTP), and SAN (block-level protocols like FC, iSCSI, ESCON and FICON):
Worldwide External Disk Storage
At these rates, fabric-attached (SAN and NAS) will continue to dominate the storage landscape. Looking more closely now at the block-oriented protocols:
Worldwide External Disk Storage
Fibre Channel (FC)
At these rates, iSCSI will overtake FC by 2011. IBM System Storage N series, DS3300 and XIV Nextra all support iSCSI attachment.
Jon Toigo over at DrunkenData offers some additional data from an ex-STKer: [Fred Moore Outlook on Storage 2008]. I met Fred at a conference. He had left STK back in 1998, and started his own company called Horison. Neither Jon nor Fred cites the sources of these statistics, but the following comment leads me to assume Fred hasn't been paying close attention to the tape market:
With the demise of STK, who will be the leader in the tape industry?
Depending on how old you are, you might remember exactly where you were when a significant event occurred, for example the [Space Shuttle Challenger] explosion. For many IBMers, it was the day our friends at Sun Microsystems announced they were [putting our lead tape competitor out of its misery]. I was in New York that day, but there was still some confetti on the floor in the halls of the IBM Tucson lab when I got home a few days later. IBM has been the number one market share leader in tape for over the past four years.
Our industry is full of acronyms, and sometimes spelling out what words an acronym stands for is not enough to explain it fully.
It reminds me of an old story within IBM. A customer engineer (or "CE" for short) was repairing an air-cooled server, and found that the failing part was a "FAN". Not knowing what this stood for, he looked up the acronym in the official "IBM list of acronyms" and found that it stood for "Forced Air Network". Apparently, so many people did not realize that a FAN was just a "fan" that they needed to add an entry to remind people what this little motorized propeller was for.
This brings me to Tony Asaro's Fun with FAN blog entry which mentions yet another definition for FAN, that of "File Area Network". The concept is not new, but some developments this year help make it more a reality.
IBM's General Parallel File System (GPFS) has been enhanced earlier this year with cool ILM-like functionality borrowed from SAN File System, such as policy-based data placement, movement and automatic expiration. This can include policies to place data on the fastest Fibre Channel drives at first, then move them to slower less costly SATA disks after a few months when fewer access requests are expected.
IBM has paired up N series with SAN Volume Controller (SVC), so that an N series gateway can now provide iSCSI, CIFS and NFS access to virtual disks presented from SVC. The problem with NAS appliances in the past, is that once they fill up, moving files to newer technologies is awkward and difficult. With SVC, file systems can now be moved from one physical disk system to another, all while applications are reading and writing data.
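To make the idea of policy-based placement, movement and expiration a little more concrete, here is a minimal sketch in Python. It is not GPFS policy syntax, and the names FileInfo, apply_policy, the pool labels "fc" and "sata", and the day thresholds are all hypothetical, chosen only to illustrate the kind of rule described above.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical record for one file; not a GPFS data structure."""
    name: str
    days_since_access: int
    pool: str = "fc"  # new data is placed on the fast Fibre Channel pool


def apply_policy(files, migrate_after_days=90, expire_after_days=None):
    """Return the files that survive the policy, with pools updated.

    - Files idle for at least expire_after_days are dropped (expiration).
    - Files idle for at least migrate_after_days move from "fc" to "sata".
    """
    kept = []
    for f in files:
        if expire_after_days is not None and f.days_since_access >= expire_after_days:
            continue  # automatic expiration: the file is not kept
        if f.pool == "fc" and f.days_since_access >= migrate_after_days:
            f.pool = "sata"  # move to the slower, less costly tier
        kept.append(f)
    return kept


if __name__ == "__main__":
    files = [
        FileInfo("report.doc", days_since_access=1),
        FileInfo("old-scan.tif", days_since_access=120),
        FileInfo("ancient.log", days_since_access=4000),
    ]
    for f in apply_policy(files, migrate_after_days=90, expire_after_days=3650):
        print(f.name, f.pool)
```

The point of the sketch is that the administrator states the rule once, and the file system applies it to every file, rather than someone moving files by hand.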
To better understand the importance of this, consider the first "FAN", the mainframe z/OS operating system using DFSMS. The mainframe uses the concept of "data sets". A data set can be a stream of fixed 80-character records, representing the original punched cards, a library of related documents, or a random-access data base. All mainframes in a system complex, or "sysplex" for short, could look up the location of any data set, and access it directly. Data sets could be moved from one disk system to another, migrated off to tape, and brought back to disk, all without re-writing any applications.
To join the rest of the world, new types of data set were created for the z/OS operating system, known as HFS and zFS. These held file systems in the sense we know them today, comparable in hierarchical organization of files on Windows, Linux and UNIX platforms. These could be linked and mounted together in larger hierarchical structures across the sysplex.
The concept of files and file systems is fairly new. Prior to this, applications read and wrote directly in terms of blocks, typically fixed length multiples of 512 bytes. For a while, database management systems offered a choice, direct block access or file level access. The former may have offered slightly better performance, but the latter was easier to administer. Without a file system, specialized tools were often required to diagnose and fix problems on block-oriented "raw logical" volumes.
This launched a "my file system is better than yours" war which continues today. The official standard is POSIX, but every file system tries to give some proprietary advantage by offering unique features. Sun's file system offers support for "sparse" files, which is ideal for certain mathematical processing of tables. Microsoft's NTFS offers built-in compression, designed for the laptop user. IBM's JFS2 and Linux's EXT3 file systems support journaling, which tracks updates to file system structures in a separate journal to minimize data corruption in the event of a power outage, and thus speed up the re-boot process. Anyone who has ever waited for a "Scan Disk" or "fsck" process to finish knows what I'm talking about. Of course, if an application deviates from POSIX standards, and exploits some unique feature of a file system, it then limits its portability and market appeal.
The two competing NAS file systems are also different. Common Internet File System (CIFS) was developed initially by IBM and Microsoft to provide interoperability between DOS, Windows and OS/2. Meanwhile, Network File System (NFS) was the darling of nearly every UNIX and Linux distribution, and even has clients on operating platforms as diverse as MacOS, i5/OS, and z/OS. Today, nearly every platform supports one or both of these standards.
Bottom line, file systems are here to stay. Any slight advantages to use raw logical volumes for databases and applications are losing out to the robust set of file system utilities that can be used across a broad set of platforms and applications.
Last year, I covered Chris Anderson's book [The Long Tail]. This year, Chris Anderson, editor-in-chief of Wired.com, has an upcoming book titled Free, the past and future of a radical price. Chris talked about his book here at the Nokia World 2007 conference, and the [46-minute video] is worth watching. He asks the big question "What if certain resources were free?" This could be electricity, bandwidth, or storage capacity. He explores how this changes the world, and creates opportunities for new business models. However, many people are stuck in a "scarcity" model and treat nearly-free resources as expensive, and find themselves doing traditional things that don't work anymore. Chris mentions [Second Life] as an economy where many resources are free, and looks at how people respond to that. Rather than focusing on making money, new businesses are focused on gaining attention and building their reputation. Here are some example business models:
Cross-subsidy: give away the razors, sell the razor blades; or give away cell phones and sell minutes
Ad-Supported: magazines and newspapers sell for less than production costs
Freemium: 99% use the free version, but a handful pay extra for something more
Digital economics: give away digital music to promote concert tours
Free-sample marketing: give away samples to get word-of-mouth advertising
Gift economy: give people an opportunity and platform to contribute like Wikipedia
Nick Carr writes a post [Dominating the Cloud], indicating that IBM, Google, Microsoft, Yahoo and Amazon are the five computing giants to watch, as they are more efficient at converting electricity into computing than anyone else. Last month, I mentioned the IBM and Google partnership on cloud computing in my post [Innovation that matters: cell phones and cloud computing]. Nick's upcoming book titled [The Big Switch] looks into "Utility Computing", comparing the change of companies generating their own electricity to using an electric grid, to the recent developments of cloud computing and software as a service (SaaS). Amazon's latest "SimpleDB" online database is cited as an example.
Last, but not least, Seth Godin writes in his post [Meatballs and Permeability] about the bits-vs-atoms issue, what Chris Anderson above refers to as the new digital economy. The idea here is that value carried electronically as bits (digital documents, for example) has completely different economics than value carried as atoms (physical objects), and requires new marketing techniques. Methods from traditional marketing will not be effective in this new age. Here is a [review] of Seth's new book Meatball Sundae: Is Your Marketing Out of Sync?
All three of these books seem to be covering the same phenomenon, just from different viewpoints. I look forward to reading them.
While EMC bloggers garnered media attention last year pointing out the faulty mathematics from HDS, an astute reader pointed me to EMC's own [DMX-4 specification sheet], updated for its 1TB SATA disk. I've chosen just the minimum and maximum number of drives RAID-6 data points for non-mainframe platforms:
In the first two rows, the numbers appear as expected. For example, 96 drives would be 12 sets of 6+2 RAID ranks, meaning 72 drives' worth of data, so nearly 36TB for 500GB drives, and nearly 72TB for 1TB drives. With 14+2 RAID-6, then you would have 84 drives' worth of data, so 42TB and 84TB respectively match expectations.
Where EMC appears to miscalculate is with 20x more drives, as the numbers don't match up. For 1920 drives in RAID-6, you would expect 20x more usable capacity than the 96 drive configurations. For 6+2 configurations, one would expect 720TB and 1440TB respectively. For 14+2 configurations, one would expect 840TB and 1680TB, respectively.
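For readers who want to check the arithmetic themselves, here is a small Python sketch of the expected usable capacity. The function name raid6_usable_tb is my own, and the calculation is deliberately simple: it counts only the data drives in each complete rank and ignores formatting overhead, spares and decimal-versus-binary rounding, which is why the spec sheet values are "nearly" these figures.

```python
def raid6_usable_tb(total_drives, data_drives, parity_drives, drive_tb):
    """Usable capacity in TB when drives are grouped into equal RAID-6 ranks.

    Only complete ranks are counted; any leftover drives are ignored.
    """
    rank_size = data_drives + parity_drives
    ranks = total_drives // rank_size
    return ranks * data_drives * drive_tb


if __name__ == "__main__":
    for total in (96, 1920):
        for data, parity in ((6, 2), (14, 2)):
            for drive_tb in (0.5, 1.0):
                usable = raid6_usable_tb(total, data, parity, drive_tb)
                print(f"{total:5d} drives, {data}+{parity}, "
                      f"{drive_tb} TB drives: {usable:g} TB expected")
```

Because 1920 is exactly 20 times 96, every 1920-drive result is exactly 20 times the corresponding 96-drive result, which is the relationship the spec sheet fails to show.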
Perhaps EMC DMX-4 can't address more than 600TB for the entire system? Does EMC purposely limit the benefits of these larger drives? It does raise the question of why someone might go from 500GB to 1TB drives, if the maximum configuration only gives about 40TB more capacity. Fellow IBM blogger Barry Whyte questioned the use of SATA in an expensive DMX-4 system, in his post [One Box Fits All - Or Does It], and now perhaps there are good reasons to question 1TB from a capacity perspective as well.
Many often associate CAS with EMC's Centera offering, but with IBM's comprehensive set of compliance storage offerings, EMC doesn't talk about CAS or Centera much anymore. I covered the confusion around CAS in a previous post. When clients ask for "CAS" what they really are looking for is storage designed for fixed content, unstructured data that doesn't change once written. A lot of data falls under this category, such as scanned documents, audio and video recordings, medical images, and so on. Some laws and regulations further require enforcement that the data is not deleted or tampered with, until some time after an event or expiration date is met.
In the past, clients used write-once read-many (WORM) optical media, but today we have disk and tape offerings instead. Since the term "WORM" is inappropriate for disk-based solutions, IBM has standardized on the term "non-erasable, non-rewriteable" (NENR) to discuss today's solutions and offerings.
Let's recap what IBM has to offer:
IBM System Storage DR550
This comes in both a large version (DR550) and a small version (DR550 Express). Both offerings provide NENR protection of fixed content data with your choice of a disk-only or disk-and-tape configuration. IBM also announced a DR550 file system gateway, extending the number of applications that can take advantage of this offering.
IBM System Storage N series with SnapLock(tm)
IBM has seen great success with the N series disk systems. A specific feature called SnapLock allows some of the data stored to be NENR protected until an expiration date is met. As part of IBM's emphasis on "unified storage", a single N series appliance or gateway can manage both regular (erasable/modifiable) data and NENR data. Combine this with our recently announced Advanced Single Instance Storage (A-SIS) de-duplication feature, and you get a very cost-effective offering!
IBM System Storage Multilevel Grid Access Manager Software
A fourth option for NENR data is WORM tape. IBM supports WORM cartridge media in both the enterprise TS1120 drive as well as LTO3 and LTO4 drives. The advantage is that you don't need unique tape drives for WORM support. IBM drives can read and write both regular and WORM cartridges, and provide a cost-effective alternative to optical media.
As you see, IBM doesn't limit itself to disk-only offerings. Our leadership in tape allows us to innovate tape and disk-and-tape offerings that can provide more cost-effective solutions to store fixed content, retention managed data. The next time you have a conversation with a storage vendor, don't ask for CAS, ask instead for archive and compliance storage. Broaden your mind, and broaden the set of options and choices that might provide a better fit for your requirements.
As I mentioned in my post [Moving Over to MyDeveloperWorks], those of us bloggers on IBM's DeveloperWorks are moving over to a new system called "MyDeveloperWorks" which has a host of new features.
Fortunately for me, I missed the note asking for volunteers to be among the first bloggers on the block to move over. I was traveling and decided not to deal with it until I got back. However, fellow IBM Master Inventor, Barry Whyte, was not so lucky. It is safe to say he was stupid enough to volunteer, and is probably regretting the decision every day since. In case you lost his RSS feed, or can't find him anymore on Google or whatever search engine, here is his [new blog].
As for my blog, I have asked to postpone the move until all the problems that Barry has encountered are resolved. That might be a while, but if you lose access to mine sometime in the near future, hopefully at least you have been warned as to what might have happened.
Yesterday, I started this week's topic discussing the various areas of exploration to help understand our recent press release of the IBM System Storage SAN Volume Controller and its impressive SPC-1 and SPC-2 benchmark results that rank it the fastest disk system in the industry.
Some have suggested that since the SVC has a unique design, it should be placed in its own category, and not compared to other disk systems. To address this, I would like to define what IBM means by "disk system" and how it is comparable to other disk systems.
When I say "disk system", I am going to focus specifically on block-oriented direct-access storage systems, which I will define as:
One or more IT components, connected together, that function as a whole, to serve as a target for read and write requests for specific blocks of data.
Clarification: One could argue, and several do in various comments below, that there are other types of storage systems that contain disks, some that emulate sequential access tape libraries, some that emulate file-systems through CIFS or NFS protocols, and some that support the storage of archive objects and other fixed content. At the risk of looking like I may be including or excluding such to fit my purposes, I wanted to avoid apples-to-oranges comparisons between very different access methods. I will limit this exploration to block-oriented, direct-access devices. We can explore these other types of storage systems in later posts.
People who have been working a long time in the storage industry might be satisfied by this definition, thinking of all the disk systems that would be included by this definition, and recognizing that other types of storage, like tape systems, are appropriately excluded.
Others might be scratching their heads, thinking to themselves "Huh?" So, I will provide some background, history, and additional explanation. Let's break up the definition into different phrases, and handle each separately.
read and write requests
Let's start with "read and write requests", which we often lump together generically as input/output request, or just I/O request. Typically an I/O request is initiated by a host, over a cable or network, to a target. The target responds with acknowledgment, data, or failure indication. A host can be a server, workstation, personal computer, laptop or other IT device that is capable of initiating such requests, and a target is a device or system designed to receive and respond to such requests.
(An analogy might help. A woman calls the local public library. She picks up the phone, and dials the phone number of the one down the street. A man working at the library hears the phone ring, and answers it with "Welcome to the Public Library! How can I help you?" She asks "What is the capital city of Ethiopia?" He replies "Addis Ababa." Satisfied with this response, she hangs up. In this example, the query for information was the I/O request, initiated by the lady, to the public library target.)
Today, there are three popular ways I/O requests are made:
CCW commands over OEMI, ESCON or FICON cables
SCSI commands over SCSI, Fibre Channel or SAS cables
SCSI commands over Ethernet cables, wireless or other IP communication methods
specific blocks of data
In 1956, IBM was the first to deliver a disk system. It was different from tape because it was a "direct access storage device" (the acronym DASD is still used today by some mainframe programmers). Tape was a sequential medium, so it could handle commands like "read the next block" or "write the next block", but it could not directly read a given block without having to read past other blocks to get to it, nor could it write over an existing block without risking overwriting the contents of blocks past it.
The nature of a "block" of data varies. It is represented by a sequence of bytes of specific length. The length is determined in a variety of ways.
CCW commands assume a Count-Key-Data (CKD) format for disk, meaning that tracks are fixed in size, but that a track can consist of one or more blocks, which can be fixed or variable in length. Some blocks can span off the end of one track, and over to another track. Typical block sizes in this case are 8000 to 22000 bytes.
SCSI commands assume a Fixed-Block-Architecture (FBA) format for disk, where all blocks are the same size, almost always a power of two, such as 512 or 4096 bytes. A few operating systems, however, such as i5/OS on IBM System i machines, use a block size that doesn't follow this power-of-two rule.
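The difference between direct and sequential access is easy to see in a few lines of Python. This is only an illustration, assuming 512-byte fixed blocks and using an in-memory io.BytesIO buffer as a stand-in for both devices. The function names are hypothetical, and real SCSI or CCW commands are of course far richer than this.

```python
import io

BLOCK_SIZE = 512  # bytes; one of the common FBA block sizes mentioned above


def read_block_direct(device, lba, block_size=BLOCK_SIZE):
    """Disk-style access: jump straight to the block's byte offset."""
    device.seek(lba * block_size)
    return device.read(block_size)


def read_block_sequential(tape, lba, block_size=BLOCK_SIZE):
    """Tape-style access: rewind, then read past every earlier block."""
    tape.seek(0)
    blocks_passed = 0
    for _ in range(lba):
        tape.read(block_size)  # data we did not want, but had to pass over
        blocks_passed += 1
    return tape.read(block_size), blocks_passed


if __name__ == "__main__":
    # Build an 8-block image where block n is filled with the byte value n.
    image = b"".join(bytes([n]) * BLOCK_SIZE for n in range(8))
    device = io.BytesIO(image)

    data = read_block_direct(device, 5)
    print("direct read of block 5, first byte:", data[0])

    data, passed = read_block_sequential(device, 5)
    print("sequential read of block 5, blocks passed over:", passed)
```

With fixed blocks, the location of any block is simply its block number multiplied by the block size, which is what makes direct access possible.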
one or more IT components
You may find one or more of the following IT components in a disk system:
Motorized platter(s) covered in magnetic coating, with a read/write head that moves over the surface. These are often referred to as Hard Disk Drives (HDD) or Disk Drive Modules (DDM), and are manufactured by companies like Seagate or Hitachi Global Storage Technologies.
A set of HDD can be accessed individually, affectionately known as JBOD for Just-a-bunch-of-disk, or collectively in a RAID configuration.
Memory can act as the high-speed cache in front of slower storage, or as the storage itself. For example, the solid state disk that IBM announced last week is entirely memory storage, using Flash technology.
Lately, there are two popular packaging methods for disk systems:
Monolithic -- all the components you need connected together inside a big refrigerator-sized unit, with options to attach additional frames. The IBM System Storage DS8000, EMC Symmetrix DMX-4 and HDS TagmaStore USP-V all fit this category.
Modular -- components that fit into standard 19-inch racks, often the size of the vegetable drawer inside a refrigerator, that can be connected externally with other components, if necessary, to make a complete disk system. The IBM System Storage DS6000, DS4000, and DS3000 series, as well as our SVC and N series, fall into this category.
Regardless of packaging, the general design is that a "controller" receives a request from its host attachment port, and uses its processors and cache storage to either satisfy the request, or pass the request to the appropriate HDD, and the results are sent back through the host attachment port.
In all of the monolithic systems, as well as some of the modular ones, the controller and HDD storage are contained in the same unit. On other modular systems, the controller is one system, and the HDD storage is in a separate system, and they are cabled together.
serve as a target
The last part is that a disk system must be able to satisfy some or all requests that come to it.
(Using the same analogy used above, when the lady asked her question, the guy at the public library knew the answer from memory, and replied immediately. However, for other questions, he might need to look up the answer in a book, do a search on the internet, or call another library on her behalf.)
Some disk systems are cache-only controllers. For these, either the I/O request is satisfied as a read-hit or write-hit in cache, or it is not, and has to go to the HDD. The IBM DS4800 and N series gateways are examples of this type of controller.
Other systems may have controller and disk, but support additional disk attachment. In this case, either the I/O request is handled by the cache or internal disk, or it has to go out to external HDD to satisfy the request. IBM DS3000 series, DS4100, DS4700, and our N series appliance models, all fall into this category.
So, the SAN Volume Controller is a disk system comprising one to four node-pairs. Each node is a piece of IT equipment that has processors and cache. These node-pairs are connected to a pair of UPS power supplies to protect the cache memory holding writes that have not yet been de-staged. The combination of node-pairs and UPS, acting as a whole, is able to serve as a target to SCSI commands sent over Fibre Channel cables on a Storage Area Network (SAN). To read some blocks of data, it uses its internal cache storage to satisfy the request, and for others, it goes out to external disk systems that contain the data required. All writes are satisfied immediately in cache on the SVC, and later de-staged to external disk when appropriate.
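The read-hit, read-miss and de-stage behavior described above can be sketched in Python. To be clear, this is a toy model under my own assumptions, not how the SVC is implemented: the class name CachingController is hypothetical, a plain dict stands in for the external disk systems, and there is no cache eviction, mirroring between nodes, or UPS protection.

```python
class CachingController:
    """Toy model of a cache-only controller in front of external disk."""

    def __init__(self, backend):
        self.backend = backend  # dict of block number -> data (external disk)
        self.cache = {}         # blocks currently held in controller cache
        self.dirty = set()      # written blocks not yet de-staged

    def read(self, lba):
        """Satisfy the read from cache if possible, else go to external disk."""
        if lba in self.cache:
            return self.cache[lba], "read-hit"
        data = self.backend[lba]
        self.cache[lba] = data  # keep a copy for later reads
        return data, "read-miss"

    def write(self, lba, data):
        """Acknowledge the write as soon as it is in cache."""
        self.cache[lba] = data
        self.dirty.add(lba)
        return "acknowledged"

    def destage(self):
        """Copy all pending writes to external disk; return how many."""
        count = len(self.dirty)
        for lba in sorted(self.dirty):
            self.backend[lba] = self.cache[lba]
        self.dirty.clear()
        return count


if __name__ == "__main__":
    external_disk = {0: b"alpha", 1: b"beta"}
    controller = CachingController(external_disk)
    print(controller.read(0))           # first read goes to external disk
    print(controller.read(0))           # second read is served from cache
    print(controller.write(1, b"new"))  # write completes in cache
    print(external_disk[1])             # external disk still has old data
    print(controller.destage(), "block(s) de-staged")
    print(external_disk[1])             # external disk now has new data
```

The window between the write being acknowledged and the de-stage completing is exactly why the real system needs UPS protection for its cache.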
As of end of 2Q07, having reached our four-year anniversary for this product, IBM has sold over 9000 SVC nodes, which are part of more than 3100 SVC disk systems. These things are flying off the shelves, clocking in a 100% YTY growth over the amount we sold twelve months ago. Congratulations go to the SVC development team for their impressive feat of engineering that is starting to catch the attention of many customers and return astounding results!
So, now that I have explained why the SVC is considered a disk system, tomorrow I'll discuss metrics to measure performance.
I welcome HDS into the "Super High-End" club. Those who follow my blog might remember that I suggested that analysts like IDC that use "Entry Level", "Midrange" and "Enterprise" as categories may need a New Category: Super High End.
I was not surprised to see EMC, who now drops further down in perception, dispute HDS's recent SPC-1 benchmarks. Fellow blogger EMC's BarryB posted on his Storage Anarchist blog [IBM vs. Hitachi], which points out that IBM's SAN Volume Controller (SVC) is still much faster, and less expensive, than USP-V.
So, just in case you haven't seen all the press releases, here is a quick recap on the results: IBM SVC 4.2 is still in first place, then HDS USP-V, then IBM System Storage DS8300. Just for comparison, I include our IBM System Storage DS4800 midrange disk results, so you can appreciate the difference between midrange and high-end. There are other products from other vendors, I just point out a few from IBM and HDS here in this graph.
******************************************************************** 272,505 IOPS - IBM SVC 4.2
************************************************** 200,245 IOPS - HDS USP-V
******************************* 123,033 IOPS - IBM DS8300
*********** 45,014 IOPS - IBM DS4800
HDS tried to come up with a phrase "Enterprise Storage System" for comparison that would leave the SVC 4.2 out. Given that the SVC has five nines (99.999%) availability, has non-disruptive upgrade and firmware update capability, has more than the two processors typical of midrange products, and can connect to mainframes via z/VM, z/VSE and Linux on System z operating systems, there is no reason to pretend SVC isn't Enterprise-class.
The irony is that EMC now looks very lonely being one of the last remaining major storage vendors not to participate in standardized benchmarks that help customers make purchase decisions, as mentioned both by IBM's BarryW in [I guess that only leaves EMC] and by HDS's Claus Mikkelsen in [Olympics of Storage].
Earlier this year, EMC's Chuck Hollis opined [Storage Scorecard] that the EMC DMX and HDS TagmaStore USP were high-end boxes. I would speculate that both of these would fall somewhere between DS4800 and DS8300 on the graph above. If that is the case, it is impressive that HDS was able to re-engineer their USP-V to be 2-3x faster than its predecessor, the USP.
Not all workloads are the same, and your mileage may vary. While I can't speak to HDS, the folks over at EMC have assured me, in writing, in comments on this blog, that there is nothing preventing their customers from publishing their own performance comparisons between EMC and non-EMC equipment. I would encourage every customer to do this, between IBM and HDS, HDS and EMC, and between IBM and EMC, to help shed even more light on this area. In fact, you can even run your own SPC benchmarks to see how your own environment compares to the ones published.
Of course, performance is just one attribute on which to choose a storage vendor, and to choose specific products, models or features. For more information about Storage Performance Council and the SPC-1 and SPC-2 benchmarks, see my week-long series on SPC benchmarks, which are listed in reverse chronological order.
Go to the official Storage Performance Council website to read the details of the SPC-1 results.
Please welcome new IBM blogger Keith Stevenson, whose new blog is called [Infovore]. He gives his take on the big October 20 announcement we had this week,
and will continue to cover topics related to storage of information.
I am proud to announce that fellow IBMer Carlos Pratt has launched a new IBM storage blog [GreenSpeed].
I'd like to expand a bit on how I know Carlos. Back in 1999 I was asked to lead a team at IBM Tucson to install Linux on our local z800 mainframe, and run tests to confirm that all of our IBM disk and tape storage offerings attached successfully. I was, at the time, lead architect for DFSMS on OS/390 and management felt that my knowledge of the S/390 instruction set was all that was needed to pull this off. My team was a collection of people from a variety of other hardware and software teams, and Carlos came over from the Disk Performance test team.
Needless to say, there were some challenges. The port of Red Hat and SUSE Linux over to the mainframe required special device drivers, and in some cases, we actually needed to make changes to the Linux kernel. While it was over 100 degrees outside, we were in the test lab wearing jackets with a refrigerator thermometer hanging on the wall to monitor our ice cold working conditions.
And of course, we had our internal skeptics. At the time, Linux was only a few percentage points of marketshare, and a few unenlightened souls did not see any reason to invest in support for a new operating system until it was more established. People with a "Wait-and-See" attitude don't last long at IBM. Fortunately, smarter heads prevailed, and now that Linux is well established as the operating system of the future, we can all look back and say "I told you so!"
Carlos was a "get things done" kind of guy. Working with frequent patches to the Linux kernel, device drivers under development, and a team fairly new to this new operating system, Carlos was able to provide the driving force to get our tests done.
I would like to welcome IBMer Barry Whyte to the blogosphere!
From his bio:
Barry Whyte is a 'Master Inventor' working in the Systems & Technology Group based in IBM Hursley, UK. Barry primarily works on the IBM SAN Volume Controller virtualization appliance. Barry graduated from The University of Glasgow in 1996 with a B.Sc (Hons) in Computing Science. In his 10 years at IBM he has worked on the successful Serial Storage Architecture (SSA) range of products and the follow-on Fibre Channel products used in the IBM DS8000 series. Barry joined the SVC development team soon after its inception and has held many positions before taking on his current role as SVC performance architect. Outside of work, Barry enjoys playing golf and all things to do with Rotary Engines.
To avoid confusion in future posts, I will refer to Barry Whyte as BarryW, and fellow EMC blogger Barry Burke (aka the Storage Anarchist) as BarryB.
I'm in Chicago this week, but it is actually HOTTER here than in my home town of Tucson, Arizona.
I am Tony Pearson, storage consultant at the IBM Executive Briefing Center, located in Tucson, Arizona. I have degrees in Computer Engineering and Electrical Engineering from the University of Arizona. Over the past 20 years, I have worked in a variety of storage roles, including development projects, product and portfolio management, testing, field support, marketing, and now am doing storage consulting.
There are a lot of things to discuss related to storage, and I am never short of opinions. As such, the standard IBM disclaimer applies: “The postings on this site solely reflect the personal views of the author and do not necessarily represent the views, positions, strategies or opinions of IBM or IBM management.”
I have invited other IBMers to post their opinions, and when they do, their opinions may not necessarily match mine either.
This is an open two-way conversation between IBM, Business Partners, Independent Software Vendors, prospects and existing clients. I encourage everyone to post comments about our products, services, and marketing efforts.
I am Tony Pearson, IBM brand marketing strategy, located in Tucson, Arizona. I have degrees in Computer Engineering and Electrical Engineering from the University of Arizona. Over the past 20 years, I have worked in a variety of storage roles, including development projects, product and portfolio management, testing, field support, and now bring that technical experience to marketing.
There are a lot of things to discuss related to storage, and I am never short of opinions. As such, the standard IBM disclaimer applies: “The postings on this site solely reflect the personal views of the author and do not necessarily represent the views, positions, strategies or opinions of IBM or IBM management.”
I have invited other IBMers to post their opinions, and when they do, their opinions may not necessarily match mine either.
This is an open two-way conversation between IBM, Business Partners, Independent Software Vendors, prospects and existing clients. I encourage everyone to post comments about our products, services, and marketing efforts.
Hey everyone, I'm having a great time in New York.
Here are a few webinars this week you might be interested in, related to tape, and tape encryption:
1) Wednesday If regulatory compliance and protecting your data against security breaches is top of mind for you, I invite you to attend a webinar on a new enterprise encryption solution from IBM featuring the IBM System Storage™ TS1120 tape drive. On September 20, 2006 Jon Oltsik, Senior Analyst for Information Security with the Enterprise Strategy Group, will moderate a discussion on IBM’s encryption strategy and latest data security advances with a panel of our product and industry experts.
Webcast: How to Diagnose and Cure What Ails Your Storage Infrastructure
Wednesday, March 23, 2011 at 11:00 AM PDT / 11:00 AM Arizona MST / 2:00 PM EDT
Storage is the most poorly utilized infrastructure element -- and the most costly part of hardware budgets -- in most IT shops today. And it’s getting worse. Storage management typically involves nightmarish mash-up of tools for capacity management, performance management and data protection management unique to each array deployed in heterogeneous fabrics. Server and desktop virtualization seem to have made management issues worse, and coming on the heels of changing workloads and data proliferation is the requirement to add data management to the set of responsibilities shouldered by fewer and fewer storage professionals. Forecast for Storage in 2012: more pain as long delayed storage infrastructure refresh becomes mandatory.
In this webcast, fellow blogger Jon Toigo, CEO of Toigo Partners International, of [DrunkenData] fame, and I will take turns assessing the challenges and suggesting real-world solutions to the many issues that confound storage efficiency in contemporary IT. Integrating real world case studies and technology insights, our storage experts will deliver a must see webcast that sets down a strategy for fixing storage...before it fixes you.
Don't miss this event, unless you like the stress of knowing that your next disaster may be a data disaster.
If we have learned anything from last decade's Y2K crisis, it is that we should not wait until the last minute to take action. Now is the time to start thinking about weaning ourselves off Windows XP. IBM has 400,000 employees, so this is not a trivial matter.
Already, IBM has taken some bold steps:
Last July, IBM announced that it was switching from Internet Explorer (IE6) to [Mozilla Firefox as its standard browser]. IBM has been contributing to this open source project for years, including support for open standards, and to make it [more accessible to handicapped employees with visual and motor impairments]. I use Firefox already on Windows, Mac and Linux, so there was no learning curve for me. Before this announcement, if some web-based application did not work on Firefox, our Helpdesk told us to switch back to Internet Explorer. Those days are over. Now, if a web-based application doesn't work on Firefox, we either stop using it, or it gets fixed.
IBM also announced the latest [IBM Lotus Symphony 3] software, which replaces Microsoft Office for Powerpoint, Excel and Word applications. Symphony also works across Mac, Windows and Linux. It is based on the OpenOffice open source project, and handles open-standard document formats (ODF). Support for Microsoft Office 2003 will also run out in the year 2014, so moving off proprietary formats to open standards makes sense.
I am not going to wait for IBM to decide how to proceed next, so I am starting my own migrations. In my case, I need to do it twice, on my IBM-provided laptop as well as my personal PC at home.
Last summer, IBM sent me a new laptop; we get a new one every 3-4 years. It was pre-installed with Windows XP, but powerful enough to run a 64-bit operating system in the future. Here is my series of blog posts on that:
I decided to try out Red Hat Enterprise Linux 6.1 with its KVM-based Red Hat Enterprise Virtualization to run Windows XP as a guest OS. I will try to run as much as I can on native Linux, but will have Windows XP guest as a next option, and if that still doesn't work, reboot the system in native Windows XP mode.
So far, I am pleased that I can do nearly everything my job requires natively in Red Hat Linux, including accessing my Lotus Notes for email and databases, edit and present documents with Lotus Symphony, and so on. I have made RHEL 6.1 my default when I boot up. Setting up Windows XP under KVM was relatively simple, involving an 8-line shell script and 54-line XML file. Here is what I have encountered:
We use a wonderful tool called "iSpring Pro" which merges Powerpoint slides with voice recordings for each page into a Shockwave Flash video. I have not yet found a Linux equivalent for this.
To avoid having to duplicate files between systems, I use symbolic links instead. For example, my Lotus Notes local email repository sits on D: drive, but I can access it directly with a link from /home/tpearson/notes/data.
While my native Ubuntu and RHEL Linux can access my C:, D: and E: drives in native NTFS file system format, the irony is that my Windows XP guest OS under KVM cannot. This means moving something from NTFS over to Ext4, just so that I can access it from the Windows XP guest application.
For whatever reason, "Password Safe" did not run on the Windows XP guest. I launch it, but it takes forever to load and never brings up the GUI. Fortunately, there is a Linux version [MyPasswordSafe] that seems to work just fine to keep track of all my passwords.
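The symbolic link approach mentioned in the list above can be sketched in a few lines of Python. This is only an illustration: the directories are created under a temporary folder rather than on a real D: drive, and the file name `mail.nsf` is a hypothetical stand-in for a mail repository.

```python
import os
import tempfile

# Hypothetical stand-ins for a shared data drive and a home directory.
root = tempfile.mkdtemp()
shared = os.path.join(root, "d_drive", "notes", "data")
home = os.path.join(root, "home", "user", "notes")
os.makedirs(shared)
os.makedirs(home)

# One real copy of the file lives on the shared drive.
with open(os.path.join(shared, "mail.nsf"), "w") as f:
    f.write("mail repository")

# The link makes the same directory reachable from the home path,
# so nothing has to be duplicated between the two locations.
link = os.path.join(home, "data")
os.symlink(shared, link)

with open(os.path.join(link, "mail.nsf")) as f:
    contents = f.read()
```

Reading through the link returns the same file that was written on the shared side, which is the whole point: one copy, two paths.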
Personal home PC
My Windows XP system at home gave up the ghost last month, so I bought a new system with Windows 7 Professional, quad-core Intel processor and 6GB of memory. There are [various editions of Windows 7], but I chose Windows 7 Professional to support running Windows XP as a guest image.
Here is how I have configured my personal computer:
I actually found it more time-consuming to implement the "Virtual PC" feature of Windows 7 to get Windows XP mode working than KVM on Red Hat Linux. I am amazed how many of my Windows XP programs DO NOT RUN AT ALL natively on Windows 7. I now have native 64-bit versions of Lotus Notes and Symphony 3, which will do well enough for me for now.
I went ahead and put Red Hat Linux on my home system as well, but since I have Windows XP running as a guest under Windows 7, no need to duplicate KVM setup there. At least if I have problems with Windows 7, I can reboot in RHEL6 Linux at home and use that for Linux-native applications.
Hopefully, this will position me well in case IBM decides to either go with Windows 7 or Linux as the replacement OS for Windows XP.
(Chris doesn't actually name the source making such a claim, or say whether that someone was employed by any of the parties involved at the time the events occurred, or is currently employed by a competitor like EMC bitterly jealous of the success IBM and HDS currently enjoy with their offerings.)
As I already posted before about IBM's long history of storage virtualization, SAN Volume Controller was really part of a sequence of major products in this area, after the successful 3850 MSS and 3494 VTS block virtualization products.
In the late 1990's, our research teams in Almaden, California and Hursley, UK were exploring storage technologies that could take advantage of commodity hardware parts and the industry-leading Linux operating system.
As is often the case, while IBM was working on "the perfect product", small start-ups announced "not-yet-perfect" products into the marketplace. A tactical move like partnering with DataCore was smart, for the following reasons:
Helps identify market segments. Identify which subset of customers would most benefit from disk virtualization. While our 3850 MSS and 3494 VTS were focused on mainframe customers, this new technology was focused on distributed Unix, Windows and Linux servers.
Helps prioritize market requirements. What are the most appealing features? What drives clients to buy disk virtualization for distributed systems platforms?
Helps evaluate packaging options. Should we deliver pure software and expect customers to purchase their own servers? Should we offer this as a "service offering" with installation and deployment services included? Should we offer this as hardware with software pre-installed?
The partnership proved worthwhile, not just to prove to IBM that this was a worthwhile market to enter, but also how "NOT" to package a solution. Specifically, DataCore SANsymphony was software that you had to install on your own Windows-based server. The client was left with the task of ordering a suitable Intel-based server, with the right amount of CPU cycles, RAM and host bus adapter ports, and configuring the Windows operating system and DataCore software.
It didn't go well. Basically, customers were expected to be their own "hardware engineers", having to know way too much about storage hardware and software to design a combination that worked for their workloads. Most clients were disappointed with the amount of effort involved, and the resulting poor performance.
To fix this, IBM delivered the SAN Volume Controller, with an optimized Linux operating system and internally-written software that runs on IBM System x(tm) server hardware optimized for performance.
I can't speak for HDS, but I suspect they came to similar conclusions that resulted in a similar decision to build their product in-house. I welcome Hu Yoshida to correct me if I am wrong on this.
Jon Toigo has a funny cartoon on his post, [As I Listen to EMC Brag on “New” Functionality…]. Basically, it pokes fun that many of us bloggers argue which vendor was first to introduce some technology or another. We all do this, myself included.
Recently, Claus Mikkelsen, currently with HDS, [brought up accurately some past history from the 1990s], which is before many storage bloggers hired on with their current employers. Claus and I worked together for IBM back then, so I recognized many of the events he mentions that I can't talk about either. In many cases, IBM or HDS delivered new features before EMC.
I've been reading with some amusement as fellow blogger Barry Burke asked Claus a series of questions about Hitachi's latest High Availability Manager (HAM) feature. Claus was too busy with his "day job" and chose to shut Barry down. Sadly, HDS set themselves up for ridicule this round, first by over-hyping a function before its announcement, and then announcing a feature that IBM and EMC have offered for a while. The problem and confusion for many is that each vendor uses different terminology. Hitachi's HAM is similar to IBM's HyperSwap and EMC's AutoSwap. The implementations are different, of course, which is why vendors are often asked to compare and contrast one implementation with another.
In his latest response, [how to mind the future of a mission-critical world], Barry reports that several HDS bloggers now censor his comments. That's too bad. I don't censor comments, within reason, including Barry's inane questions about IBM's products, and am glad that he does not censor my inane questions to him about EMC products in return. The entire blogosphere benefits from these exchanges, even if they are a bit heated sometimes.
We all have day jobs, and often are just too busy, or too lazy, to read dozens or hundreds of pages of materials, if we can even find them in the first place. Not everyone has the luxury of a "competitive marketing" team to help do the research for you, so if we can get an accurate answer or clarification about a product that is generally available directly from a vendor's subject matter expert, I am all for that.
Tonight PBS plans to air Season 38, Episode 6 of NOVA, titled [Smartest Machine On Earth]. Here is an excerpt from the station listing:
"What's so special about human intelligence and will scientists ever build a computer that rivals the flexibility and power of a human brain? In "Artificial Intelligence," NOVA takes viewers inside an IBM lab where a crack team has been working for nearly three years to perfect a machine that can answer any question. The scientists hope their machine will be able to beat expert contestants in one of the USA's most challenging TV quiz shows -- Jeopardy, which has entertained viewers for over four decades. "Artificial Intelligence" presents the exclusive inside story of how the IBM team developed the world's smartest computer from scratch. Now they're racing to finish it for a special Jeopardy airdate in February 2011. They've built an exact replica of the studio at its research lab near New York and invited past champions to compete against the machine, a big black box code -- named Watson after IBM's founder, Thomas J. Watson. But will Watson be able to beat out its human competition?"
Like most supercomputers, Watson runs the Linux operating system. The system runs 2,880 cores (90 IBM Power 750 servers, four sockets each, eight cores per socket) to achieve 80 [TeraFlops]. TeraFlops is the unit of measure for supercomputers, representing a trillion floating point operations per second. By comparison, Hans Moravec, principal research scientist at the Robotics Institute of Carnegie Mellon University (CMU), estimates that the [human brain is about 100 TeraFlops]. So, in the three seconds that Watson gets to calculate its response, it would have processed 240 trillion operations.
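The arithmetic behind those figures is easy to check. The short Python sketch below simply multiplies out the numbers quoted above; the variable names are chosen for this illustration, and the comparison to the brain estimate is a plain ratio of the two quoted TeraFlops figures.

```python
# Core count: servers x sockets per server x cores per socket.
servers = 90
sockets_per_server = 4
cores_per_socket = 8
total_cores = servers * sockets_per_server * cores_per_socket

# Work done in the response window: TeraFlops x seconds,
# expressed in trillions of floating point operations.
watson_teraflops = 80
response_seconds = 3
trillion_ops = watson_teraflops * response_seconds

# Watson's rate as a fraction of the quoted human brain estimate.
brain_teraflops = 100
fraction_of_brain = watson_teraflops / brain_teraflops

print(total_cores, trillion_ops, fraction_of_brain)
```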
Several readers of my blog have asked for details on the storage aspects of Watson. Basically, it is a modified version of IBM Scale-Out NAS [SONAS] that IBM offers commercially, but running Linux on POWER instead of Linux-x86. System p expansion drawers of SAS 15K RPM 450GB drives, 12 drives each, are dual-connected to two storage nodes, for a total of 21.6TB of raw disk capacity. The storage nodes use IBM's General Parallel File System (GPFS) to provide clustered NFS access to the rest of the system. Each Power 750 has minimal internal storage mostly to hold the Linux operating system and programs.
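The drive count is not stated above, but it can be derived from the figures that are. The Python sketch below assumes decimal units (1TB = 1000GB) and that the 21.6TB raw total is made up entirely of the 450GB drives; both are assumptions for this illustration.

```python
# Figures quoted above, with 21.6TB expressed in GB (decimal units assumed).
raw_capacity_gb = 21600
drive_gb = 450
drives_per_drawer = 12

# Derived values: how many drives and drawers add up to the raw total.
drives = raw_capacity_gb // drive_gb
drawers = drives // drives_per_drawer

print(drives, drawers)
```

Under those assumptions the raw capacity works out to 48 drives in four expansion drawers.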
When Watson is booted up, the 15TB of total RAM are loaded up, and thereafter the DeepQA processing is all done from memory. According to IBM Research, "The actual size of the data (analyzed and indexed text, knowledge bases, etc.) used for candidate answer generation and evidence evaluation is under 1TB." For performance reasons, various subsets of the data are replicated in RAM on different functional groups of cluster nodes. The entire system is self-contained, Watson is NOT going to the internet searching for answers.
..., nor any drop to drink" From Rime of the Ancient Mariner (1798), by Samuel Taylor Coleridge
Actually, I've been so busy this week that I am just now getting to this week's theme of Smarter Water. Since it was St. Patrick's Day this week, I thought of discussing IBM's project to help Ireland. Working with the Marine Institute Ireland, IBM has created a system to monitor wave conditions, marine life and pollution levels in and around Galway Bay. Here is quick excerpt from IBM [Press Release]:
"This real-time advanced analytics pilot is turning mountains of data into intelligence, paving the way for smarter environmental management and development of the bay.
The vision for SmartBay is a marine research infrastructure of sensors and computational technology interconnected across Galway Bay collecting and distributing information on coastal conditions, pollution levels and marine life. The monitoring services, delivered via the web and other devices, benefits tourism, fishing, aquaculture and the environment.
The pilot, which includes a move from manual to instrumented data gathering, will allow researchers to deploy quicker reactions to the critical challenges of the bay such as pollution, flooding, fishing stock levels, green energy generation and the threats from climate change."
Or... I could have used water as a metaphor for the "tidal wave" of information. For many, we have a lot of raw data, but not suitably digestible information in the form we need it.
Ok, I admit it is a silly photo, Darth Vader standing in the middle of the ocean filtering sea water into a plastic jug, but it helps focus on the problem. Long before we are done fighting over the last few drops of oil, we will be fighting over water.
This Sunday, March 22, is "World Water Day". Over the past 100 years, water consumption has increased six fold, twice the growth of human population. Today, one in five people on this planet lack access to suitable drinking water. I have been to countries where people not only lacked water filters, but in some cases didn't have closeable plastic jugs to carry the water in.
By 2015, the World Health Organization [WHO] estimates that water problems will impact over half the world's population. Here is their [Top 10 Facts File] on water scarcity.
At this point, you might be asking what any of this has to do with IBM.
The smart folks at IBM Research lab, the same location where we do storage research, were able to take some of their knowledge of chemistry, solid state memory, and nanotubes to help the planet with the water situation. Here is a quick [2-minute video]
Continuing my week's theme on how bad things can get following the "Do-it-yourself" plan, I start with James Rogers' piece in Byte and Switch, titled [Washington Gets E-Discovery Wakeup Call]. Here's an excerpt:
"A court filing today reveals there may be gaps in the backup tapes the White House IT shop used to store email. It appears that messages from the crucial early stages of the Iraq War, between March 1 and May 22, 2003, can't be found on tape. So, far from exonerating the White House staffers, the latest turn of events casts an even harsher light on their email policies.
Things are not exactly perfect elsewhere in the federal government, either. A recent [report from the Government Accountability Office (GAO)] identified glaring holes in agencies’ antiquated email preservation techniques. Case in point: printing out emails and storing them in physical files."
You might think that laws requiring email archives are fairly recent. For corporations, they began with laws like Sarbanes-Oxley that the second President Bush signed into law back in 2002. However, it appears that laws for US Presidents to keep their emails were in force since 1993, back when the first President Clinton was in office. (We might as well all get used to saying this in case we have a "second" President Clinton next January!)
"The Federal Record Act requires the head of each federal agency to ensure that documents related to that agency's official business be preserved for federal archives. The Watergate-era Presidential Records Act augmented the FRA framework by specifically requiring the president to preserve documents related to the performance of his official duties. A [1993 court decision] held that these laws applied to electronic records, including e-mails, which means that the president has an obligation to ensure that the e-mails of senior executive branch officials are preserved.
In 1994, the Clinton administration reacted to the previous year's court decision by rolling out an automated e-mail-archiving system to work with the Lotus-Notes-based e-mail software that was in use at the time. The system automatically categorized e-mails based on the requirements of the FRA and PRA, and it included safeguards to ensure that e-mails were not deliberately or unintentionally altered or deleted.
When the Bush administration took office, it decided to replace the Lotus Notes-based e-mail system used under the Clinton Administration with Microsoft Outlook and Exchange. The transition broke compatibility with the old archiving system, and the White House IT shop did not immediately have a new one to put in its place.
Instead, the White House has instituted a comically primitive system called "journaling," in which (to quote from a [recent Congressional report]) "a White House staffer or contractor would collect from a 'journal' e-mail folder in the Microsoft Exchange system copies of e-mails sent and received by White House employees." These would be manually named and saved as ".pst" files on White House servers.
One of the more vocal critics of the White House's e-mail-retention policies is Steven McDevitt, who was a senior official in the White House IT shop from September 2002 until he left in disgust in October 2006. He points out what would be obvious to anyone with IT experience: the system wasn't especially reliable or tamper-proof."
So we have White House staffers manually creating PST files, and other government agencies printing out their emails and storing them in file cabinets. When I first started at IBM in 1986, before Notes or Exchange existed, we used PROFS on VM on the mainframe, and some of my colleagues printed out their emails and filed them in cabinets. I can understand how government employees, who might have grown up using mainframe systems like PROFS, might have just continued the practice when they switched to Personal Computers.
Perhaps the new incoming White House staff hired by George W. Bush were more familiar with Outlook and Exchange, and rather than learning to use IBM Lotus Notes and Domino, found it easier just to switch over. I am not going to debate the pros and cons of "Lotus Notes/Domino" versus "Microsoft Outlook/Exchange" as IBM has automated email archiving systems that work great for both of these, as well as for Novell Groupwise. So, giving the benefit of the doubt, when President Bush took over, he tossed out the previous administration's staff, and brought in his own people, and let them choose the office productivity tools they were most comfortable with. Fair enough, happens every time a new President takes office. No big surprise there.
However, doing this without a clear plan on how to continue to comply with the email archive laws already on the books, and having the situation continue to be bad several years later, is appalling. I can understand why businesses are upset about deploying mandated archiving solutions when their own government doesn't have similar automation in place.
In his last post in this series, he mentions that the amazingly successful IBM SAN Volume Controller was part of a set of projects:
"IBM was looking for "new horizon" projects to fund at the time, and three such projects were proposed and created the "Storage Software Group". Those three projects became know externally as TPC, (TotalStorage Productivity Center), SanFS (SAN File System - oh how this was just 5 years too early) and SVC (SAN Volume Controller). The fact that two out of the three of them still exist today is actually pretty good. All of these products came out of research, and its a sad state of affairs when research teams are measured against the percentage of the projects they work on, versus those that turn into revenue generating streams."
But this raises the question: Was SAN File System just five years too early?
IBM classifies products into three "horizons": Horizon-1 for well-established mature products, Horizon-2 for recently launched products, and Horizon-3 for emerging business opportunities (EBO). Since I had some involvement with these other projects, I thought I would help fill out some of this history from my perspective.
Back in 2000, IBM executive [Linda Sanford] was in charge of IBM storage business and announced that IBM Research was working on the concept of "Storage Tank" which would hold Petabytes of data accessible to mainframes and distributed servers.
In 2001, I was the lead architect of DFSMS for the IBM z/OS operating system for mainframes, and was asked to be lead architect for the new "Horizon 3" project to be called IBM TotalStorage Productivity Center (TPC), which has since been renamed to IBM Tivoli Storage Productivity Center.
In 2002, I was asked to lead a team to port the "SANfs client" for SAN File System from Linux-x86 over to Linux on System z. How easy or difficult to port any code depends on how well it was written with the intent to be ported, and porting the "proof-of-concept" level code proved a bit too challenging for my team of relative new-hires. Once code written by research scientists is sufficiently complete to demonstrate proof of concept, it should be entirely discarded and written from scratch by professional software engineers that follow proper development and documentation procedures. We reminded management of this, and they decided not to make the necessary investment to add Linux on System z as a supported operating system for SAN file system.
In 2003, IBM launched Productivity Center, SAN File System and SAN Volume Controller. These would be lumped together with Horizon-1 product IBM Tivoli Storage Manager and the four products were promoted together as the inappropriately-named [TotalStorage Open Software Family]. We actually had long meetings debating whether SAN Volume Controller was hardware or software. While it is true that most of the features and functions of SAN Volume Controller are driven by its software, it was never packaged as a software-only offering.
The SAN File System was the productized version of the "Storage Tank" research project. While the SAN Volume Controller used industry standard Fibre Channel Protocol (FCP) to allow support of a variety of operating system clients, the SAN File System required an installed "client" that was only available initially on AIX and Linux-x86. In keeping with the "open" concept, an "open source reference client" was made available so that the folks at Hewlett-Packard, Sun Microsystems and Microsoft could port this over to their respective HP-UX, Solaris and Windows operating systems. Not surprisingly, none were willing to voluntarily add yet another file system to their testing efforts.
Barry argues that SANfs was five years ahead of its time. SAN File System tried to bring policy-based management for information, which has been part of DFSMS for z/OS since the 1980s, over to distributed operating systems. The problem is that mainframe people who understand and appreciate the benefits of policy-based management already had it, and non-mainframe people couldn't understand the benefits of something they had managed to survive without.
(Every time I see VMware presented as a new or clever idea, I have to remind people that this x86-based hypervisor basically implements the mainframe concept of server virtualization introduced by IBM in the 1970s. IBM is the leading reseller of VMware, and supports other server virtualization solutions including Linux KVM, Xen, Hyper-V and PowerVM.)
To address the various concerns about SAN File System, the proof-of-concept code from IBM Research was withdrawn from marketing, and fresh code implementing these concepts was integrated into IBM's existing General Parallel File System (GPFS). This software would then be packaged with a server hardware cluster, exporting global file spaces with broad operating system reach. Initially offered as the IBM Scale-out File Services (SoFS) service offering, this was later re-packaged as an appliance, the IBM Scale-Out Network Attached Storage (SONAS) product, and as the IBM Smart Business Storage Cloud (SBSC) cloud storage offering. These now offer clustered NAS storage using the industry standard NFS and CIFS clients that nearly all operating systems already have.
Today, these former Horizon-3 products are now Horizon-2 and Horizon-1. They have evolved. Tivoli Storage Productivity Center, GPFS and SAN Volume Controller are all market leaders in their respective areas.
Those that prefer the one-stop shopping of an IT Supermarket, with companies like IBM, HP and Dell that offer a complete set of servers, storage, switches, software and services, what we call "The Five S's".
Those that prefer shopping for components at individual specialty shops, like butchers, bakers, and candlestick makers, hoping that this singular focus means the products are best-of-breed in the market. Companies like HDS for disk, Quantum for tape, and Symantec for software come to mind.
My how the IT landscape for vendors has evolved in just the past five years! Cisco starts to sell servers, and enters a "mini-mall" alliance with EMC and VMware to offer the Vblock integrated stack of server, storage and switches with VMware as the software hypervisor. For those not familiar with the concept of mini-malls, these are typically rows of specialty shops. A shopper can park their car once, and do all their shopping from the various shops in the mini-mall. It is not quite the "one-stop" shopping of a supermarket, but it tries to address the same need.
("Who do I call when it breaks?" -- The three companies formed a puppet company, the Virtual Computing Environment company, or VCE, to help answer that question!)
One of the many things IBM has learned in its 100+ years of experience is that clients want choices. Cisco figured this out also, and partnered with NetApp to offer the aptly-named FlexPod reference architecture. In effect, Cisco has two boyfriends: when she is with EMC, it is called a Vblock, and when she is with NetApp, it is called a FlexPod. I was lucky enough to find this graphic to help explain the three-way love triangle.
Did this move put a strain on the relationship between Cisco and EMC? Last month, EMC announced VSPEX, a FlexPod-like approach that provides a choice of servers, and some leeway for resellers to make choices to fit client needs better. Why limit yourself to Cisco servers, when IBM and HP servers are better? Is this an admission that Vblock has failed, and that VSPEX is the new way of doing things? No, I suspect it is just EMC's way to strike back at both Cisco and NetApp in what many are calling the "Stack Wars". (See [The Stack Wars have Begun!], [What is the Enterprise Stack?], or [The Fight for the Fully Virtualized Data Center] for more on this.)
(FTC Disclosure: I am both an employee and shareholder of IBM, so the U.S. Federal Trade Commission may consider this post a paid, celebrity endorsement of the IBM PureFlex system. IBM has working relationships with Cisco, NetApp, and Quantum. I was not paid to mention, nor have I any financial interest in, any of the other companies mentioned in this blog post. )
Last month, IBM announced its new PureSystems family, ushering in a [new era in computing]. I invite you all to check out the many "Patterns of Expertise" available at the [IBM PureSystems Centre]. This is like an "app store" for the data center, and what I feel truly differentiates IBM's offerings from the rest.
The trend is obvious. Clients who previously purchased from specialty shops are discovering that building workable systems from piece-parts from separate vendors is expensive and challenging. IBM PureFlex™ systems eliminate a lot of the complexity and effort, but still offer plenty of flexibility, choice of server processor types, choice of server and storage hypervisors, and choice of various operating systems.
Here I am, day 11 of a 17-day business trip, on my last leg of the trip this week, in Kuala Lumpur in Malaysia. I have been flooded with requests to give my take on EMC's latest re-interpretation of storage virtualization, VPLEX.
I'll leave it to my fellow IBM master inventor Barry Whyte to cover the detailed technical side-by-side comparison. Instead, I will focus on the business side of things, using Simon Sinek's Why-How-What sequence. Here is a [TED video] from Garr Reynolds' post [The importance of starting from Why].
Let's start with the problem we are trying to solve.
Problem: migration from old gear to new gear, old technology to new technology, from one vendor to another vendor, is disruptive, time-consuming and painful.
Given that IT storage is typically replaced every 3-5 years, then pretty much every company with an internal IT department has this problem, the exception being those companies that don't last that long, and those that use public cloud solutions. IT storage can be expensive, so companies would like their new purchases to be fully utilized on day 1, and be completely empty on day 1500 when the lease expires. I have spoken to clients who have spent 6-9 months planning for the replacement or removal of a storage array.
A solution to make the data migration non-disruptive would benefit the clients (make it easier for their IT staff to keep their data center modern and current) as well as the vendors (reduce the obstacle of selling and deploying new features and functions). Storage virtualization can be employed to help solve this problem. I define virtualization as "technology that makes one set of resources look and feel like a different set of resources, preferably with more desirable characteristics." By making different storage resources, old and new, look and feel like a single type of resource, migration can be performed without disrupting applications.
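To make that definition concrete, here is a minimal Python sketch of the idea, not of any vendor's actual implementation. The class name `VirtualVolume`, the extent map, and the array names `old_array` and `new_array` are all hypothetical, chosen only for illustration. Hosts always address the virtual volume, so moving an extent from one back-end array to another is just a copy followed by a change to the map.

```python
class VirtualVolume:
    """Toy virtualization layer: hosts see one volume, while each extent
    actually lives on some back-end array (all names are hypothetical)."""

    def __init__(self, backends, extents_per_volume, home):
        self.backends = backends  # array name -> {extent number: data}
        self.extent_map = {e: home for e in range(extents_per_volume)}

    def write(self, extent, data):
        self.backends[self.extent_map[extent]][extent] = data

    def read(self, extent):
        return self.backends[self.extent_map[extent]][extent]

    def migrate(self, extent, target):
        """Copy one extent to the target array, then flip the map entry."""
        source = self.extent_map[extent]
        if source == target:
            return
        self.backends[target][extent] = self.backends[source][extent]
        self.extent_map[extent] = target  # hosts now reach the new array
        del self.backends[source][extent]


backends = {"old_array": {}, "new_array": {}}
vol = VirtualVolume(backends, extents_per_volume=4, home="old_array")
for e in range(4):
    vol.write(e, f"block-{e}".encode())

before = [vol.read(e) for e in range(4)]
for e in range(4):
    vol.migrate(e, "new_array")
    # The host-side view stays readable after every step of the migration.
    assert [vol.read(i) for i in range(4)] == before

after = [vol.read(e) for e in range(4)]
print(after == before, len(backends["old_array"]))
```

A real product also has to handle writes that arrive while an extent is being copied, which this sketch deliberately ignores.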
Before VPLEX, here is a breakdown of each solution:

IBM
Why: Non-disruptive tech refresh, and a unified platform to provide management and functionality across heterogeneous storage.
How: New in-band storage virtualization device
What: SAN Volume Controller

HDS
Why: Non-disruptive tech refresh, and a unified platform to provide management and functionality between internal tier-1 HDS storage, and external tier-2 heterogeneous storage.
How: Add in-band storage virtualization to existing storage array
What: HDS USP-V and USP-VM

EMC
Why: Non-disruptive tech refresh, with unified multi-pathing driver that allows host attachment of heterogeneous storage.
How: New out-of-band storage virtualization device with new "smart" SAN switches
What: EMC Invista
For IBM, the motivation was clear: protect customers' existing investment in older storage arrays and introduce new IBM storage with a solution that allows both to be managed with a single set of interfaces and provide a common set of functionality, improving capacity utilization and availability. IBM SAN Volume Controller eliminated vendor lock-in, providing clients choice in multi-pathing driver, and allowing any-to-any migration and copy services. For example, IBM SVC can be used to help migrate data from an old HDS USP-V to a new HDS USP-V.
With EMC, however, the motivation appeared to be to protect software revenues from their PowerPath multi-pathing driver, TimeFinder and SRDF copy services. Back in 2005, when EMC Invista was first announced, these three software products represented 60 percent of EMC's bottom-line profit. (Ok, I made that last part up, but you get my point! EMC charges a lot for these.)
Back in 2006, fellow blogger Chuck Hollis (EMC) suggested that SVC was just a [bump in the wire] which could not possibly improve performance of existing disk arrays. IBM showed clients that putting cache (SVC) in front of other cache (back-end devices) does indeed improve performance, in the same way that multi-core processors successfully use L1/L2/L3 cache. Now, EMC is claiming their cache-based VPLEX improves performance of back-end disk. My how EMC's story has changed!
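The cache-in-front-of-cache argument can be illustrated with a simple average-latency model. This is only a back-of-the-envelope sketch: the hit rates and millisecond figures below are made-up inputs, not measurements of SVC or any disk array, and the model assumes the back-end hit rate is unchanged by the front tier.

```python
def array_latency_ms(hit_back, t_back_ms, t_disk_ms):
    """Average read latency of an array with its own cache."""
    return hit_back * t_back_ms + (1.0 - hit_back) * t_disk_ms


def with_front_cache_ms(hit_front, t_front_ms, behind_ms):
    """Every read pays the front-tier cost; only misses go on to the array."""
    return t_front_ms + (1.0 - hit_front) * behind_ms


# Hypothetical numbers, chosen only to show the shape of the trade-off.
behind = array_latency_ms(hit_back=0.5, t_back_ms=1.0, t_disk_ms=8.0)
cached = with_front_cache_ms(hit_front=0.4, t_front_ms=0.2, behind_ms=behind)
bump = with_front_cache_ms(hit_front=0.0, t_front_ms=0.2, behind_ms=behind)

print(f"array alone:         {behind:.1f} ms")
print(f"front cache, 40% hit: {cached:.1f} ms")
print(f"front tier, no hits:  {bump:.1f} ms")
```

With these inputs the front tier only helps when its hit rate saves more time than the extra hop costs. A device with no cache hits really is a bump in the wire, while one with a useful hit rate lowers the average.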
So now, EMC announces VPLEX, which sports a blend of SVC-like and Invista-like characteristics. Based on blogs, tweets and publicly available materials I found on EMC's website, I have been able to determine the following comparison table. (Of course, VPLEX is not yet generally available, so what is eventually delivered may differ.)
SVC: Scalable, 1 to 4 node-pairs
Invista: One size fits all, single pair of CPCs
VPLEX: SVC-like, 1 to 4 director-pairs

SVC: Works with any SAN switches or directors
Invista: Required special "smart" switches (vendor lock-in)
VPLEX: SVC-like, works with any SAN switches or directors

SVC: Broad selection: IBM Subsystem Device Driver (SDD) offered at no additional charge, as well as OS-native drivers Windows MPIO, AIX MPIO, Solaris MPxIO, HP-UX PV-Links, VMware MPP, Linux DM-MP, and commercial third-party driver Symantec DMP.
Invista: Limited selection, with focus on priced PowerPath driver
VPLEX: Invista-like, PowerPath and Windows MPIO

SVC: Read cache, and choice of fast-write or write-through cache, offering the ability to improve performance.
Invista: No cache. Split-Path architecture cracked open Fibre Channel packets in flight, delayed every IO by 20 nanoseconds, and redirected modified packets to the appropriate physical device.
VPLEX: SVC-like, read and write-through cache, offering the ability to improve performance.

Space-Efficient Point-in-Time copies
SVC: SVC FlashCopy supports up to 256 space-efficient targets, copies of copies, read-only or writeable, and incremental persistent pairs.
VPLEX: Like Invista, No

Remote distance mirror
SVC: Choice of SVC Metro Mirror (synchronous up to 300km) and Global Mirror (asynchronous), or use the functionality of the back-end storage arrays
Invista: No native support, use functionality of back-end storage arrays, or purchase separate product called EMC RecoverPoint to cover this lack of functionality
VPLEX: Limited synchronous remote-distance mirror within VPLEX (up to 100km only), no native asynchronous support, use functionality of back-end storage arrays

SVC: Provides thin provisioning to devices that don't offer this natively
VPLEX: Like Invista, No

SVC: SVC Split-Cluster allows concurrent read/write access to data from hosts at two different locations several miles apart
Invista: I don't think so
VPLEX: VPLEX-Metro, similar in concept but implemented differently

Non-disruptive tech refresh
SVC: Can upgrade or replace storage arrays, SAN switches, and even the SVC nodes software AND hardware themselves, non-disruptively
Invista: Tech refresh for storage arrays, but not for Invista CPCs
VPLEX: Tech refresh of back-end devices, and upgrade of VPLEX software, non-disruptively. Not clear if VPLEX engines themselves can be upgraded non-disruptively like the SVC.

Heterogeneous Storage Support
SVC: Broad support of over 140 different storage models from all major vendors, including all CLARiiON, Symmetrix and VMAX from EMC, and storage from many smaller startups you may not have heard of
VPLEX: Invista-like. VPLEX claims to support a variety of arrays from a variety of vendors, but as far as I can find, only DS8000 supported from the list of IBM devices. Fellow blogger Barry Burke (EMC) suggests [putting SVC between VPLEX and third party storage devices] to get the heterogeneous coverage most companies demand.

Back-end storage requirement
SVC: Must define quorum disks on any IBM or non-IBM back-end storage array. SVC can run entirely on non-IBM storage arrays
VPLEX: HP SVSP-like, requires at least one EMC storage array to hold metadata

SVC: SVC 2145-CF8 model supports up to four solid-state drives (SSD) per node that can be treated as managed disk to store end-user data
VPLEX: Invista-like. VPLEX has an internal 30GB SSD, but this is used only for operating system and logs, not for end-user data.
In-band virtualization solutions from IBM and HDS dominate the market. Being able to migrate data from old devices to new ones non-disruptively turned out to be only the [tip of the iceberg] of benefits from storage virtualization. In today's highly virtualized server environment, being able to non-disruptively migrate data comes in handy all the time. SVC is one of the best storage solutions for VMware, Hyper-V, XEN and PowerVM environments. EMC watched and learned in the shadows, taking notes of what people like about the SVC, and decided to follow IBM's time-tested leadership to provide a similar offering.
EMC re-invented the wheel, and it is round. On a scale from Invista (zero) to SVC (ten), I give EMC's new VPLEX a six.
This week I am down under, starting my 7-city Storage Optimisation Breakfast roadshow on Tuesday in Sydney, Australia. I can't be in two places at once, and it seems whenever I am in one place, lots of my coworkers are somewhere else at another conference or event. For those at the [VMworld 2010] conference in San Francisco this week, IBM is a Platinum Sponsor and is hosting a variety of presentations and activities. Here are some things to look forward to:
Session ID SP9638 - Getting the MAX from your Virtualization Investment
Monday 1:30pm, Moscone South Room 309
Speaker: Bob Zuber, IBM System x Program Director
Speaker: Clod Barrera, Distinguished Engineer and Chief Technical Strategist
Clod and I just finished Solutions University 2010 in Dallas, and here he is going to VMworld! You already know that virtualization is beneficial. Exploit virtualization to its MAXimum and move beyond virtualization 101 where you have virtualized web, file/print, and DHCP type workloads. Now it is time to take virtualization to the next step and virtualize business infrastructure applications such as ERP, Messaging, CRM, and Database. With IBM solutions you can take the virtualization journey to build a smarter data center through: 1) Consolidation, 2) Management, 3) Automation and 4) Optimization. Attend this session and learn the key considerations for virtualizing mission-critical workloads and the best practices for a virtual data center that delivers a REAL return on your investment.
Session ID TA8065 - Storage Best Practices, Performance Tuning and Troubleshooting
Speaker: Duane Fafard, Senior XIV Storage Architect, IBM
Monday 10:30 AM Moscone South Room 301
Wednesday 03:00 PM Moscone West Room 2005
The industry has solved many of the challenges of virtualization applications by delivering innovative server solutions that automatically migrate load to available resources, but the complete environment requires both the network and the storage to be part of the equation. Designing, managing, and troubleshooting intricate storage environments in today’s age have become more and more complex. This session will discuss storage best practices, performance challenges, and resolving issues in the storage area network using native tools within the environment. With the techniques learned in this session, the storage administrator will be able to use these best practices to design proper storage solutions and pinpoint troubled areas quickly and accurately.
Session ID SS1012 - Expert Panel: How Smarter Systems can Address your Business Challenges
Wednesday, 12-1pm, Room 135
This is IBM's "Super Session". At IBM, we know that all business challenges such as sprawling IT infrastructure, poor performance and rising management costs are solvable on a smarter planet. With Smarter Systems, IBM can help you increase utilization and flexibility, reduce complexity and cost, respond to business changes swiftly and effectively, and enable end-to-end resiliency and security. Alex Yost, Vice President and Business Line Executive for IBM System x and BladeCenter hosts a panel of Virtualization experts:
James Northington, Vice President and Business Line Executive, IBM System x
Donn Bullock, Vice President of Sales, Mainline Information Systems, Inc.
Dylan Larson, Director of Advanced Software and Server Technologies, Intel Data Center Group
Richard McAniff, Chief Development Officer and Member of the Office of the President, VMware
Siddhartha (Sid) Chatterjee, Ph.D, Vice President, Strategy & Partnerships, IBM Systems Software
David Guzman, Chief Information Officer and Senior Vice President, Global Technology Solution, Acxiom
This week, some of my coworkers are out at [VMworld 2009] in San Francisco. IBM is a platinum sponsor, and is the leading reseller of VMware software. Here is the floor plan for our IBM booth there:
Virtual Data Center in a Box & Virtual Networking on
IBM & VMware Joint Collaboration on Power Monitoring
“Always on IT” Business Continuity Solution
IBM System Storage™ XIV®
[IBM XIV Storage System] is a revolutionary, easily managed, open disk system, designed to meet today’s ongoing IT challenges. This system now supports VMware 4.0 and extends the benefits of virtualization to your storage system, enabling easy provisioning and self-tuning after hardware changes. Its unique grid-based architecture represents the next generation of high-end storage and delivers outstanding performance, scalability, reliability and features, along with management simplicity and exceptional TCO.
IBM Storage Solutions with VMware
Featured products include: the new IBM System Storage DS5020, Virtual Disk solutions with IBM System Storage SAN Volume Controller, IBM Tivoli Storage Productivity Center, and IBM System Storage ProtecTIER Data Deduplication solutions.
Server virtualization with VMware vSphere offers significant benefits to an organization, including increased asset utilization, simplified management and faster server provisioning. In addition to these benefits, VMware enables business agility and business continuity with more advanced features such as VMotion, high availability, fault tolerance, and Site Recovery Manager that all require dependable high-performance shared storage. Adding storage solutions --including virtualized storage-- from IBM delivers complementary benefits to your information infrastructure that extend and enhance the benefits of VMware vSphere while increasing overall reliability, availability and performance to help you transform into a dynamic infrastructure. IBM can provide the right storage solution for your environment and requirements. Our solutions help maximize efficiency with lower costs and provide affordable, scalable storage solutions that help you solve your particular needs.
Stop by to learn how our exciting new storage solutions can help optimize VMware, including self-encrypting storage, automated, affordable disaster recovery with VMware SRM, easier and faster provisioning of storage for virtual machines, dramatically improved storage utilization with ProtecTIER deduplication, and how the DS5000 has lower Total Cost of Acquisition (TCA) than typical competitors.
IBM Smart Business Desktop Cloud
IBM System x® iDataPlex™: Get More on the Floor
Virtual Client Solutions from IBM
IBM also is sponsoring some breakout sessions:
Leverage Storage Solutions for a Smarter Infrastructure
Simplify and Optimize with IBM N series
IBM SAN Volume Controller: Virtualized Storage for Virtual Servers
XIV: Storage Reinvented for today's dynamic world
Wish I was there, looks like a lot of good information!
IBM also has a vision for the future, and like Martin Luther King's speech, is starting to enable change. Last February 2006, IBM launched "Information on Demand", a vision that involves bringing together our hardware, software, and services.
The impact has not gone unnoticed. Barron's featured IBM in an article titled "The New IBM".
I suspect bloggers helped get the word out. Here's a graph from Yahoo! Finance showing the IBM stock price over the past six months. This blog started in September, when stock was in the low 80's, and now it is in the high 90's. I can't take all the credit, of course, as there are now over 3000 IBMers blogging, either inside the company, or externally to the rest of the world.
Continuing this week's theme on the z10 EC mainframe being able to perform the workload of hundreds or thousands of small 2-way x86 servers, I offer a simple analogy.
One car, one driver
If you wonder why so many companies subscribe to the notion that you should only run a single application per server, blame Sun, who I think helped promote this idea. Not to be out-done, Microsoft, HP and Dell think that it is a great idea too. Imagine the convenience for operators to be able to switch off a single machine and impact only a single application. Imagine how much this simplifies new application development, knowing that you are the only workload on a set of dedicated resources.
This is analogous to a single car, single driver, where the car helps get the person from "point A" to "point B" and the single driver represents the driver and sole passenger of the vehicle. If this were a single driver on an energy-efficient motorcycle or scooter, that would be reasonable, but people often drive alone in much bigger vehicles, what Jeff Savit would call "over-provisioning". Chips have increased in processing power much faster than individual applications have increased their requirements, so as a result, you have over-provisioning.
Carpooling - one bus, one driver, and many other passengers riding along
This is how z/OS operates. Yes, you could have up to 60 LPARs that you could individually turn on and off, but where z/OS gets most of its advantages is that you can run many applications in a single OS instance, through the use of "Address Spaces" which act as application containers. Of course, it is more difficult to write for this environment, because you have to be a good "z/OS citizen", share resources nicely, and be WLM-compliant to allow your application to be swapped out for others.
While you get efficiencies with this approach, when you bring the OS down, all the apps on that OS image have to stop with it. For those who have "Parallel Sysplex" that is not an issue. For example, let's say you have three mainframes, each running several LPARs of z/OS, and your various z/OS images all are able to process incoming transactions for a common shared DB2 database. Thanks to DB2 sharing technology, you could take down an individual LPAR or z/OS image, and not disrupt transaction processing, because the IP spreader just sends them to the remaining LPARs. A "Coupling Facility" allows for smooth operations if any of the OS images are lost from an unexpected disaster or disruption.
Needless to say, IBM does not give each z/OS developer his or her own mainframe. Instead, we get to run z/OS guest images under z/VM. It was even possible to emulate the next generation S/390 chipset to allow us to test software on hardware that hasn't been created yet. With HiperSockets, we can have virtual TCP/IP LAN connections between images, have virtual coupling facilities, have virtual disk and virtual tape, and so on. It made development and test that much more efficient, which is why z/OS is recognized as one of the most rock-solid bullet-proof operating systems in existence.
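The "IP spreader" behavior can be sketched in a few lines of Python. This is a toy round-robin router with hypothetical names (`Spreader`, `LPAR1` and so on). It is not how the actual Parallel Sysplex components are implemented, only an illustration of why taking one image down need not stop transaction processing.

```python
from itertools import cycle


class Spreader:
    """Toy round-robin router that skips images that are down."""

    def __init__(self, images):
        self.up = {name: True for name in images}
        self.handled = {name: [] for name in images}
        self._ring = cycle(images)

    def take_down(self, name):
        self.up[name] = False

    def route(self, txn):
        # Try each image at most once per transaction.
        for _ in range(len(self.up)):
            image = next(self._ring)
            if self.up[image]:
                self.handled[image].append(txn)
                return image
        raise RuntimeError("no z/OS image available")


spreader = Spreader(["LPAR1", "LPAR2", "LPAR3"])
first = [spreader.route(t) for t in ("t1", "t2", "t3")]

spreader.take_down("LPAR2")  # planned outage of one image
later = [spreader.route(t) for t in ("t4", "t5", "t6", "t7")]

print(first)
print(later)
```

After `LPAR2` is taken down, the remaining two images simply absorb the incoming transactions. What the sketch leaves out is the shared database state that makes every image equally able to process any transaction.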
The negatives of carpooling or taking the bus apply here as well. I have been on buses that have stopped working, and 50 people are stranded. And you don't need more than two people to make the logistics of most carpools complicated. This feeds the fear that makes people want separate manageable units, one-car-one-driver, rather than putting all of their eggs into one basket, having to schedule outages together, and so on.
(Disclaimer: From 1986 to 2001 I helped with the development of z/OS and Linux on System z. Most of my 17 patents are from that time of my career!)
Bicycle races and Marathons
The third computing model is the Supercomputer. Here we take a lot of one-way and two-way machines, and lash them together to form an incredible machine able to perform mathematical computations faster than any mainframe. The supercomputer that IBM built for Los Alamos National Laboratory just clocked in at 1,000,000,000,000,000 floating point operations per second. This is not a single operating system, but rather each machine runs its own OS, is given its primary objective, and tries to get it done. NetworkWorld has a nice article on this titled: [IBM, Los Alamos smash petaflop barrier, triple supercomputer speed record]. If every person in the world was armed with a handheld calculator and performed one calculation per second, it would take us 46 years collectively to do everything this supercomputer can do in one day.
I originally thought of bicycle races as an analogy for this, but having listened to Lance Armstrong at the [IBM Pulse 2008] conference, I learned that biking is a team sport, and I wanted something that had the "every-man-for-himself" approach to computing. So, I changed this to marathons.
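For readers who want to check the calculator comparison themselves, here is a small Python sketch of the arithmetic. The population of roughly six billion is an assumption not stated above, and the answer turns out to be very sensitive to how fast each person is assumed to punch keys, so the rate is left as a parameter.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def years_by_hand(flops, people, calcs_per_person_per_second):
    """Years for `people` with calculators to match one machine-day."""
    ops_in_one_day = flops * SECONDS_PER_DAY
    seconds_needed = ops_in_one_day / (people * calcs_per_person_per_second)
    return seconds_needed / SECONDS_PER_YEAR


PETAFLOP = 10**15      # operations per second, from the figure quoted above
PEOPLE = 6e9           # assumed world population, not from the article

for rate in (1, 10):
    years = years_by_hand(PETAFLOP, PEOPLE, rate)
    print(f"{rate:>2} calc/s per person: about {years:.0f} years")
```

Under these assumptions, a figure of about 46 years corresponds to roughly ten calculations per second per person. At a strict one calculation per second the same arithmetic gives a number about ten times larger, which only strengthens the point being made.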
The marathon was named after a fabled Greek soldier who was sent as a messenger from the [Battle of Marathon to the City of Athens], a distance that is now standardized to 26 miles and 385 yards, or 42.195 kilometers for my readers outside the United States.
If you were given the task to get thousands of people from "point A" to "point B" 26 plus miles away, would you choose thousands of cars, each with a lone driver? Conferences with a lot of people in a few hotels use shuttle buses instead. A few drivers, a few buses, and you can get thousands of people from a few places to a few places. But the workloads that are sent to supercomputers have a single end point, so a dispatcher node gives a message to each "Greek soldier" compute node, and has them run it on their own. Some make it, some don't, but for a supercomputer that is OK. When the message is delivered, the calculation for that little piece is done, and the dispatcher node gives that compute node another message to process. All of the computations are assembled to come up with the final result. Applications must be coded very specially to be able to handle this approach, but for the ones that are, amazing things happen.
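The dispatcher-and-runners pattern can be sketched in Python as follows. This is a toy illustration, not supercomputer code: threads stand in for separate compute nodes, the function names `compute_node` and `dispatch` are hypothetical, and the sketch ignores the "some don't make it" case that a real scheduler has to handle by re-sending work.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node(chunk):
    """One runner: works alone on its piece and reports a partial result."""
    return sum(x * x for x in chunk)


def dispatch(numbers, nodes=4, chunk_size=250):
    """Dispatcher: split the work, hand out pieces, assemble the answers."""
    chunks = [numbers[i:i + chunk_size]
              for i in range(0, len(numbers), chunk_size)]
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        # Each worker picks up the next chunk as soon as it finishes one.
        partials = list(pool.map(compute_node, chunks))
    return sum(partials)


total = dispatch(list(range(1000)))
print(total == sum(x * x for x in range(1000)))
```

The problem here (a sum of squares) splits cleanly into independent pieces, which is exactly the property an application needs in order to benefit from this model.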
So, how does "server virtualization" come into play?
IBM has had Logical Partitions for quite some time. A logical partition, or LPAR, can run its own OS image, and can be turned on and off without impacting other LPARs. LPARs can have dedicated resources, or shared resources with other LPARs. The IBM z10 EC can have up to 60 LPARs. System p and System i, now merged into the new "POWER Systems" product line, also support LPARs in this manner. Depending on the size of your LPAR, this could be for a single OS and application, or a single OS with lots of applications.
Address Spaces/Application Containers
This is the bus approach. You have a single OS, and that is shared by a set of application containers. z/OS does this with address spaces, all running under a single z/OS image, and for x86 there are products like [Parallels Virtuozzo Containers] that can run hundreds of Windows instances under a single Windows OS image, or a hundred Linux images under a single Linux OS image. However, you cannot mix and match Windows with Linux, just as all the address spaces on z/OS have to be coded for the same z/OS level on the LPAR they run in.
The term "guests" was chosen to model this after the way hotels are organized. Each guest has a room with its own lockable entrance and privacy, but shared lobby, and in some countries, shared bathrooms on every hall. This approach is used by z/VM, VMware and others. The z/VM operating system can handle any S/390-chip operating system guest, so you could have a mix of z/OS, TPF, z/VSE, Linux and OpenSolaris, and even other z/VM levels running as guests. Many z/VM developers run in this "second level" mode to develop new versions of the z/VM operating system!
As part of the One Laptop Per Child [OLPC] development team (yes, I am a member of their open source community, and now have developer keys to provide contributions), I have been experimenting with Linux KVM. This was [folded into the base Linux 2.6.20 kernel] and is available to run Linux and Windows guest images. There is a nice write-up on [Wikipedia].
The key advantage of this approach is that you are back to the simplistic one-car-one-driver mode of thinking. Each guest can be turned on and off without impacting other applications. Each guest has its own OS image, so you can mix different OS on the same server hardware. You can have your own customized kernel modules, levels of Java, etc. Externally, it looks like you are running dozens of applications on a single server, but internally, each application thinks it is the only one running on its own OS. This gives you a simpler coding model on which to base your test and development.
Jeff is correct that running less than 10 percent utilization average across your servers is a crying shame, and that it could be managed in a manner that raises the utilization of the servers so that fewer are needed. Just as people could carpool, or could take the bus to work, it just doesn't happen, and data centers are full of single-application servers.
VMware has an architectural limit of 128 guests per machine, and IBM is able to reach this with its beefiest System x3850 M2 servers, but most of the x86 machines from HP, Dell and Sun are less powerful, and only run a dozen or so guests. In all cases, fewer servers means it is simpler to manage, so more applications per server is always the goal in mind.
VMware can soak up 30 to 40 percent of the cycles, meaning the most you can get from a VMware-based solution is 60 to 70 percent CPU utilization (which is still much better than the typical 5 to 10 percent average utilization we see today!) z/VM has been finely tuned to incur as little as 7 percent overhead, so IBM can achieve up to 93 percent utilization.
Jeff argues that since many of the z/OS technologies that allow customers to get over 90 percent utilization don't apply to Linux guests under z/VM, then all of the numbers are wrong. My point is that there are two ways to achieve 90 percent utilization on the mainframe, one is through z/OS running many applications on a single LPAR (the application container approach), and the other through z/VM supporting many Linux OS images, each with one (or a few) applications (the virtual guest approach).
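The utilization ceiling is simply one minus the hypervisor overhead, and it feeds directly into how many hosts a consolidation needs. Here is a rough Python sketch of that arithmetic. The 100-server starting point and the 35 percent midpoint are illustrative assumptions, and the model treats all machines as equal in capacity while ignoring memory, I/O and peak loads.

```python
import math


def max_utilization(hypervisor_overhead):
    """Fraction of CPU left for guests once the hypervisor takes its cut."""
    return 1.0 - hypervisor_overhead


def hosts_needed(servers, avg_utilization, hypervisor_overhead):
    """Hosts required to absorb the real work of lightly used servers."""
    work = servers * avg_utilization  # measured in whole-server units
    return math.ceil(work / max_utilization(hypervisor_overhead))


# Hypothetical shop: 100 equal-sized servers averaging 10 percent busy.
for label, overhead in (("35 percent overhead", 0.35),
                        ("7 percent overhead", 0.07)):
    ceiling = max_utilization(overhead)
    hosts = hosts_needed(100, 0.10, overhead)
    print(f"{label}: ceiling {ceiling:.0%}, hosts needed {hosts}")
```

Under these assumptions the lower-overhead hypervisor needs noticeably fewer hosts for the same work, and either option is a large improvement over 100 mostly idle machines.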
I am still gathering more research on this topic, so I will try to have it ready later this week.
Several of my IBM colleagues will be attending the "Virtual Worlds 2007" conference today and tomorrow. This conference sold out so quickly that they have already scheduled a second one for October. The focus is on 3-D internet technologies like Second Life. Attendance is expected at over 600 people.
IBM is investing heavily in this new concept of v-business. Last year, I was one of only 325 IBMers on Second Life. Now, according to this Better than Life blog entry from Grady Booch, IBM Fellow, the number is over 4000!
Of course, the challenge for IBM, and others, is learning to market in virtual worlds. Already, my team is in-world, and we meet several times a week. Using Second Life is quickly becoming an essential business skill, like participating in conference calls, or responding to instant messages.
What does meeting in-world entail?
Scheduling a time and a place
Finding a time that people can meet is no different than scheduling an audio or video conference call. In general, you don't have to worry about travel, but you do have to be actively somewhere connected to everyone else.
Finding a place involves actually determining the island, region and coordinates to hold the meeting. You need to find a place with enough seating. You don't have to worry about daylight, each person can control how much or little sunlight shows up on their screen. You do have to make sure you pick a spot that nobody else plans to use at that same time. Just like scheduling conference rooms at the site or hotel, we have to schedule rooms in advance.
To avoid this hassle, I have created the "pocket conference room". This is a single object that I can "rez" onto the ground, from my inventory, with 40 chairs, a PowerPoint presentation screen, a podium for a speaker to stand behind, and stools for speakers to sit on if they are next on the agenda. Now, I can hold impromptu meetings in any sandbox, grassy knoll, or the roof top of a building.
As with any other meeting, you need some basic ground rules. I am not talking the usual "no shooting, no gambling, no selling" rules that you see everywhere in Second Life. Instead, rules like an avatar must stand up before speaking. Anyone with a question must first "raise their hand" and get recognized by the chair. These ground rules can be as formal as Robert's Rules of Order or more casual, depending on who is participating.
It costs 10 Linden Dollars (L$) per PAGE to upload a PowerPoint presentation. This has the immediate benefit of having everyone spend more time and effort on their presentation, trying to cut down the number of charts, and focus more on what they are going to say.
Public Speaking Skills
It is amazing. People who are too scared to speak in front of an audience in Real Life have no problem having their avatar stand in front of other avatars in Second Life. This has greatly broadened the pool of speakers to tap into. Are you a woman with a husky masculine voice? Are you a man with a high-pitched feminine voice? Now, you can create an avatar that matches your voice.
This turns out to be the biggest challenge. In Real Life, organizing a face-to-face meeting involves time and effort making sure the venue has everything you need, a platform, a podium, good Audio/Video system, etc. All people have to do is show up, sit in a chair and listen.
In Second Life, however, the aspects of venue are all covered, but getting people to show up is another story. People have to sign up for a Second Life account, create an avatar, wear appropriate virtual clothing, figure out how to teleport near the venue, walk or fly the distance to get to the exact building and room, master the sitting-in-a-chair and hold-coffee-and-sip-occasionally process, and pay attention.
Perhaps the best part of Second Life is that if you are not paying attention, your avatar noticeably falls asleep, into a hunched-over position, what is called "afk" (short for Away From Keyboard). On the other hand, if you do need to step away from your desk, you can put your avatar in "afk" mode immediately, tell everyone why and perhaps when you'll be back, and then re-activate when you return. This is one of the best improvements over regular audio conference calls.
I suspect that places in Second Life to hold meetings will become more and more in demand. At a time when real-estate sales in the US are slowing down, Coldwell Banker's Second Life efforts are ramping up. I am not making this up. Coldwell Banker is one of the nation's largest real estate brokerage firms. They are trying to bring the same "adult supervision" to virtual real-estate transactions, offering to help people buy and rent properties in Second Life.
Registration is now open for our next "Meet the Storage Experts" event in Second Life. All IBMers, clients and IBM Business Partners are welcome to attend. We will focus this time on DS3000 and N series disk systems, tape systems, and IBM storage networking gear.
It takes me 20-30 minutes to complete a crossword or Sudoku puzzle. I am in no hurry, and I find the process relaxing. But what if you were paid to complete a puzzle? In that case, finishing the puzzle sooner, in fewer minutes, means more money in your paycheck per hour worked! However, getting paid would mean that doing these puzzles may no longer be fun or relaxing.
The idea of converting a hobby into a revenue-generating activity is not new. Who wouldn't want to earn money doing something you were planning to do already? The television is full of commercial advertisements for credit cards where you can earn Double Miles or Cash Rewards just for spending money on things you were going to spend on anyways.
But is "earn" the right word? The merchants pay a percentage fee every time a patron uses a credit card, and the bank is just providing a marketing incentive in the form of a portion of those fees back to the consumer, to encourage more usage of their card versus other forms of payment. Sort of like "profit sharing".
(FTC Disclosure: I am a full-time employee and shareholder of the IBM Corporation. This blog post should not be considered an endorsement for anything. My opinions and writings are based on publicly available information and my own experiences doing freelance work prior to my employment at IBM. I have no hands-on experience with Amazon Mechanical Turk, neither as a worker nor requester, have not participated in TopCoder contests, nor have I used the Viggle app. I do not have any financial interest in Amazon, TopCoder, Viggle or any other third-party company mentioned on this blog post, nor has anyone paid me to mention their company names, brands or offerings.)
Here's how it works. You get the app on your phone, and register each television show as you watch it. You can watch the show live, or much later recorded on your Tivo. You watch the shows you were going to watch anyways, and just provide your demographics, all in the name of market research. You get two points per minute of watching, and after 7,500 points, you get a $5 gift card from retailers such as Burger King, Starbucks, Best Buy, Sephora, Fandango, and CVS drugstores. For the typical American, it would take about three weeks to watch that much television!
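For the curious, here is the reward arithmetic as a small Python sketch. The points-per-minute, point threshold and gift card value come from the description above; the three hours of viewing per day is my own round-number assumption for a "typical" viewer, chosen only to show how the "about three weeks" figure can come about.

```python
POINTS_PER_MINUTE = 2      # from the description above
POINTS_PER_CARD = 7_500    # from the description above
CARD_VALUE_USD = 5.00      # from the description above


def minutes_to_card() -> float:
    """Minutes of television needed to earn one gift card."""
    return POINTS_PER_CARD / POINTS_PER_MINUTE


def hours_to_card() -> float:
    return minutes_to_card() / 60


def days_to_card(hours_per_day: float = 3.0) -> float:
    """Days needed at a given viewing rate (3.0 hours/day is an assumption)."""
    if hours_per_day <= 0:
        raise ValueError("hours_per_day must be positive")
    return hours_to_card() / hours_per_day


def dollars_per_hour() -> float:
    """Effective 'wage' for watching television."""
    return CARD_VALUE_USD / hours_to_card()


if __name__ == "__main__":
    print("Hours of viewing per card:", hours_to_card())
    print("Days at 3 hours per day:", round(days_to_card(), 1))
    print("Effective dollars per hour:", round(dollars_per_hour(), 2))
```

At pennies per hour, this only makes sense for shows you were going to watch anyways.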
Of course, this is not the only way to earn money working from home. A reader asked me for my opinions of [Amazon Mechanical Turk]. While the other examples above are done for marketing purposes, Mechanical Turk can be used for a variety of other things. Up to now, the IT industry has regarded the Cloud as the delivery of computing as a service, with the infrastructure, hardware and software existing on internationally networked servers, effectively invisible to the end user. This model is now being applied broadly to people.
Basically, Mechanical Turk acts as a marketplace, where employers post Human Intelligence Tasks (HITs) that workers can do. Most can be completed in minutes and you are paid pennies to do so. Some examples might help illustrate what a HIT looks like:
Call a business and get the email address of the manager in charge.
Review a photograph and describe its style or content in three words or less.
Select among multiple choices to categorize a job listing or company position.
As a Mechanical Turk worker, you only work on the HITs you choose to work on, presumably those that interest you, and that you can do well and quickly. Workers can do this anytime, anywhere, such as at home at 2:00am, when you can't sleep or are taking care of children. You can choose to work as much or as little as you like.
The employers--referred to as Mechanical Turk requesters--put money into their payroll accounts, load up their tasks, and hit publish. This gives them immediate access to a global, on-demand 24-by-7 workforce that can help complete thousands of HITs in minutes. These employers won't have to put an advertisement in the want ads and interview potential candidates, just to let them go later when the project is over.
Just like any other job, Mechanical Turk wages are reported to the IRS, and each person's work is evaluated for quality. In doing these tasks, you build up your "digital reputation" that will either prevent you or allow you to work on certain HITs. You can also take tests to reach Qualification levels to be eligible to work on HITs not available to everyone else.
Software engineers would have a hard time writing an Artificial Intelligence [AI] program to do these simple tasks, so being able to generate a HIT for something in the middle of a computer program might be the easiest way to get past a difficult part of an algorithm. Amusingly, Amazon describes this form of [crowdsourcing] as an artificial form of Artificial Intelligence!
While this approach may work for small, easily defined tasks, what about works that require a high amount of Human Intelligence, like storage software or hardware development?
When I was working for IBM as a software engineer in the 1980s and 1990s, it took us years to get a project done, using the traditional [Waterfall Model]. My job as a software architect was to estimate the thousands of lines of code (KLOC) a project would require, estimate the number of Person-Years (PY) it would take, and recommend the appropriate sized team. Back then, each engineer averaged only about 1,000 lines of software code per year, so KLOC and PY were often used interchangeably. Fellow IBM author Fred Brooks wrote an excellent book on the process called [The Mythical Man-Month].
The Waterfall model has the advantage that people only have to work a portion of the cycle on the project. In between, there was plenty of downtime to attend training, improve your skills, or take vacation. As our director Lynn Yates would often complain, "if they are only writing two lines of code in the morning, and two in the afternoon, why do they need time to rest?"
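The estimating arithmetic from those days fits in a few lines of Python. This is a sketch of the rule of thumb described above, not any official IBM sizing method: the function names are mine, and the 250 working days per year is my own assumption, used only to show where a figure like "two lines in the morning, and two in the afternoon" could come from.

```python
import math

LINES_PER_ENGINEER_YEAR = 1_000   # the average quoted above
WORKING_DAYS_PER_YEAR = 250       # assumption: roughly 50 weeks of 5 days


def person_years(kloc: float) -> float:
    """Person-Years for a project of `kloc` thousand lines of code."""
    return kloc * 1_000 / LINES_PER_ENGINEER_YEAR


def team_size(kloc: float, schedule_years: float) -> int:
    """Naive team size to deliver `kloc` within `schedule_years`."""
    if schedule_years <= 0:
        raise ValueError("schedule_years must be positive")
    return math.ceil(person_years(kloc) / schedule_years)


def lines_per_day() -> float:
    """Average lines of code per engineer per working day."""
    return LINES_PER_ENGINEER_YEAR / WORKING_DAYS_PER_YEAR


if __name__ == "__main__":
    print("50 KLOC is", person_years(50), "person-years")
    print("Team for 50 KLOC in 2 years:", team_size(50, 2))
    print("Lines per engineer per day:", lines_per_day())
```

At 1,000 lines per engineer per year, one KLOC is one PY, which is why the two were used interchangeably. The naive division in `team_size` is exactly the assumption that The Mythical Man-Month warns against: people and months cannot be swapped freely, so treat the result as a starting point rather than a plan.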
The Waterfall model was not perfect, and had its share of critics. One downside was that the clients didn't see anything until General Availability (GA), with a few getting a glimpse a few months earlier during our Early Support Program (ESP). By the time clients could tell us it was not what they wanted or expected, it was too late to change until the next release.
To address this concern, 17 software engineers wrote the now famous [Agile Manifesto]. The authors felt that collaboration, between the developers and with the clients, is critical to success. Business people and developers must work together daily throughout the project. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation. The best architectures, requirements, and designs emerge from self-organizing teams. The result is an iterative approach that allows the client to see working prototypes early in the process, allowing last-minute changes to requirements to influence the final product.
Combining the Mechanical Turk concept with Agile programming methodology gives you what IBM calls an "Outcomes Model" approach. In the IBM research paper [Software Economies] (PDF, 5 pages), the authors argue that there are four fundamental principles needed for an "Outcomes Model" approach:
Autonomy. All of the actions necessary to bring jobs to completion should be driven by market forces; the process is never gated by an entity outside of the market.
Inclusiveness. Everyone who provides information or performs work that leads to improvements should share in the
Transparency. The system should be transparent with respect to both the flow of money in the market and the tasks performed by workers in the market.
Reliability. The system should be immune to manipulation, robust against attack (e.g., via insertion of untrusted code), and prevent "shallow" work which would have to be re-done later.
I was surprised to see that [the TopCoder Community is 390,593 strong], nearly the size of the entire IBM company. TopCoder is focused on computer programming and digital creation using the Outcomes Model approach. Rather than paying everyone for their work, however, the platform is designed around challenges and competitions, and the top players or contributors are rewarded with cash prizes.
As an innovative company, IBM constantly explores a variety of means and approaches to offer value to its clients and customers. These new approaches may have some distinct advantages not just for IBM and its shareholders, but also for its clients and the freelancers hired to work on these projects. The global marketplace is getting flatter, smaller and smarter. It will be interesting to see how this plays out. If the discussion above encourages you to hone your technical skills, perhaps that is motivation enough to get off the couch and stop watching so much television!
I’ve just returned from the IBM Tivoli Pulse conference in Las Vegas – a meeting of over 4000 customers, partners, and IBM employees. ... There was a lot to digest, but three of the major themes caught my attention, and my imagination. ... First, IBM put a huge push behind their Dynamic Infrastructure initiative. Sounds like so many other automation and autonomic initiatives of the past, right? Well, things are getting better, and “dynamic” is becoming more of a realistic possibility, especially with the emergence of cloud computing and cloud services models. ... Second, a lot of time was spent on IBM’s Service Management Industry Solutions. When I first heard of this, my thought was that IBM was creating solutions for the Service Management industry (i.e. food services, janitorial services, hospitality services). But this is much larger than that – much, much larger. IBM is taking their unique ability to pair business (non-IT) expertise with IT consulting, planning, and technology delivery, and constructing (careful – here comes the “f” word) frameworks for several vertical industry segments. ... IBM is perhaps the only organization in the world that can take this on fully and hope to deliver a meaningful result. But beyond that, this represents a huge opportunity for IT professionals to become the transformation agents within their own organizations, contributing at a whole new level. ... Lastly, I was really impressed by IBM’s Smarter Planet initiative. The primary thought here was that the key to a greener planet is to take inefficiencies out of just about every form of business through the intelligent application and deployment of technology. At first I was thinking this was just another marketing initiative, but in the course of this event, listening to the keynotes and talking to a number of IBM execs, it became apparent that this is a substantial cultural shift within IBM itself. 
Just think about that for a moment – when 400,000 employees all change their direction and focus, their sheer mass is going to make a noticeable difference. ... Magic (Johnson) gave an excellent talk, and reminded the audience that you should do two things no matter what your job or role. First, service starts with knowing your customers – not just who they are, but what they do and what is important to them. And second – always over-deliver. Go that extra step. Exceed expectations. The boost in loyalty, goodwill, and improved customer relationships will be well worth the effort. Good thoughts to keep with us….
If you missed Pulse 2009, perhaps because your company has put a clamp down on travel expenses, you are in luck! IBM is hosting the "Dynamic Infrastructure Forum" March 3-4, 2009, on your computer. This is an IBM Virtual event, no travel required! [Register Today!]
Earlier this week, EMC announced its Symmetrix V-Max, following two trends in the industry:
Using Roman numerals. The "V" here is for FIVE, as this is the successor to the DMX-3 and DMX-4. EMC might have gotten the idea from IBM's success with the XIV (which does refer to the number 14, specifically the 14th class of a Talpiot program in Israel that the founders of XIV graduated from).
Adding "-Max", "-Monkey" or "2.0" at the end of things to make them sound more cool and to appeal to a younger, hipper audience. EMC might have gotten this idea from Pepsi-Max (... a taste of storage for the next generation?)
I took a cue from President Obama and waited a few days to collect my thoughts and do my homework before responding. Special thanks to fellow blogger ChuckH for giving me a [handy list of reactions] to pick and choose from. It appears that the EMC marketing machine feels it is acceptable for their own folks to claim that EMC is doing something first, or that others are catching up to EMC, but when other vendors do likewise, then that is just pathetic or incoherent. Here are a few reactions already from fellow bloggers:
This was a major announcement for EMC, addressing many of the problems, flaws and weaknesses of the earlier DMX-3 and DMX-4 deliverables. Here's my read on this:
Now you can have as many FCP ports (128) as an IBM System Storage DS8300, although the maximum number of FICON ports is still short, and no mention of ESCON support. The Ethernet ports appear to be 1Gb, not the new 10GbE you might expect.
Support for System z mainframe
V-Max adds some new support to catch up with the DS8000, like Extended Address Volumes (EAV). EMC is still not quite there yet. IBM DS8000 continues to be the best, most feature-rich storage option if you have System z mainframe servers.
Both the IBM DS8000 and HDS USP-V beat the DMX-4 in performance, and in some cases the DMX-4 even lost to the IBM XIV, so EMC had to do something about it. EMC chooses not to participate in industry-standard performance benchmarks like those from the [Storage Performance Council], which limits them to vague comparisons against older EMC gear. I'll give EMC engineers the benefit of the doubt and say that V-Max is now "comparably as fast as HDS and IBM offerings".
Getting "V" in the name
The "V" appears to be for the Roman numeral five, not to be confused with the external heterogeneous storage virtualization that HDS USP-V and IBM SVC provide. There is no mention of synergy with EMC's failed "Invista" product, and I see no support for attaching other vendors' disk to the back of this thing.
Switch to Intel processor
Apple switched its computers from PowerPC to Intel-based, and now EMC follows in the same path. There are some custom ASICs still in V-Max, so it is not as pure as IBM's offerings.
Modular, XIV-like Scale-out Architecture
Actually, the packaging appears to follow the familiar system bays and storage bays of the DMX-4 and DMX-4 950 models, but architecturally offers XIV-like attachment across a common switch network between "engines", EMC's term for interface modules.
Non-disruptive data migration
IBM's SoFS, DR550 and GMAS have this already, as does anything connected behind an IBM SAN Volume Controller.
A long time ago, IBM used to have midrange disk storage systems called "FAStT" which stood for Fibre Array Storage Technology, so this might have given EMC the idea for their "Fully Automated Storage Tiering" acronym. The concept appears similar to what IBM introduced back in 2007 for the Scale-Out-File Services [SofS] which not only provides policy-based placement, movement and expiration on different disk tiers, but also includes tape tiers for a complete solution. I don't see anything in the V-Max announcement that it will support tape anytime soon.
And whatever happened to EMC's Atmos? Wasn't that supposed to be EMC's new direction in storage?
Zero-data loss Three-site replication
IBM already calls this Metro/Global Mirror for its IBM DS8000 series, but EMC chose to call it SRDF/EDP for Extended Distance Protection.
Ease of Use
The most significant part of the announcement is that EMC is finally focusing on ease-of-use. In addition to reducing the requirement for "Bin File" modifications, this box has a redesigned user interface to focus on usability issues. For past DMX models, EMC customers had to either hire EMC to do tasks for them that were just too difficult otherwise, or buy expensive software like EMC Control Center to manage them. EMC will continue to sell DMX-4 boxes for a while, as they are probably supply-constrained on the V-Max side, but I doubt they will retrofit these new features back to DMX-3 and DMX-4.
When IBM announced its acquisition of XIV over a year ago now, customers were knocking down our doors to get one. This caught two particular groups looking like a [deer in headlights]:
EMC Symmetrix sales force: Some of the smarter ones left EMC to go sell IBM XIV, leaving EMC short-handed and having to announce they [were hiring during their layoffs]. Obviously, a few of the smart ones stayed behind, to convince their management to build something like the V-Max.
IBM DS8000 sales force: If clients are not happy with their existing EMC Symmetrix, why don't they just buy an IBM DS8000 instead? What does XIV have that DS8000 doesn't?
Let me contrast this with the situation Microsoft Windows is currently facing.
I am often asked by friends to help them pick out laptops and personal computers. I use Linux, Windows and MacOS, so have personal experience with all three operating systems.
Linux is cheaper, offers the power-user the most options for supporting older, less-powerful equipment, but I wouldn't have my Mom use it. While distributions like Ubuntu are making great strides, it is just too difficult for some people.
MacOS is nice, I like it, it works out of the box with little or no customization and an intuitive interface. However, some of my friends don't make IBM-level salaries, and have to watch their budget.
In their "I'm a PC" campaign, Microsoft is fighting both fronts. Let's examine two commercials:
In the first commercial, a young eight-year-old puts together a video from pictures of toy animals and some background music. The message: "Windows is easier to use than Linux!" If they really wanted to send this message, they should have shown senior citizens instead.
In the second commercial, a young college student is asked to find a laptop with a 17-inch screen, and a variety of other qualifications, for under $1,000 US. The only model at the Apple store below this price had a 13-inch screen, but she finds a Windows-based system that had a 17-inch screen and met all the other qualifications. The message: "Windows-based hardware from a variety of competitors is less expensive than hardware from Apple!"
Both Microsoft and Apple charge a premium for ease-of-use. In the storage world, things are completely opposite. Vendors don't charge a premium for ease-of-use. In fact, some of the easiest to use are also the least expensive.
If you just have Windows and Linux, you can get an entry-level system like the IBM DS3000 series, which has only a few features and can be set up in six simple steps.
Next, if you have a more interesting mix of operating systems, Linux, Windows and some flavors of UNIX like IBM AIX, HP-UX or Sun Solaris, then you might want the features and functions of pricier midrange offerings. More options means that configuration and deployment is more difficult, however.
Finally, if you are a serious Fortune 500 company, running your mission-critical applications on System z or System i centralized systems in a big data center, then you might be willing to pay top dollar for the most feature-rich offerings of an Enterprise-class machine. Thankfully you have an army of highly-trained staff to handle the highest levels of complexity.
IBM's DS8000, HDS USP-V and EMC's Symmetrix are the key players in the Enterprise-class space. They tried to be ["all things to all people"], er.. perhaps all things to all platforms. All of the features and functions came at a price, not just in dollars, but in complexity and difficulty. You needed highly skilled storage admins using expensive storage management software, or be willing to hire the storage vendor's premium services to get the job done.
IBM recognized this trend early. IBM's SVC, N series and now XIV all offer ease-of-use with enterprise-class features and functions, at lower total cost of ownership than traditional enterprise-class systems. IBM is not the only one, of course, as smaller storage start-ups like 3PAR, Pillar Data Systems, Compellent, and to some extent Dell's EqualLogic all recognized this and developed clever offerings as well.
While IBM's XIV may not have been the first to introduce a modular, scale-out architecture using commodity parts managed by sophisticated ease-of-use interfaces, its success might have been the kick-in-the-butt EMC needed to follow the rest of the industry in this direction.
Chris Evans over at Storage Architect posts about Hardware Replacement Lifecycle Update, on how storage virtualization can help with storage hardware replacement. He makes two points that I would like to comment on.
... indeed products such as USP, SVC and Invista can help in this regard. However at some stage even the virtualisation tools need replacing and the problem remains, although in a different place.
Knowing that replacement of technologies at all levels is inevitable, IBM System Storage SAN Volume Controller is actually designed to allow non-disruptive cluster upgrade, which we announced in May 2006.
The process is quite elegant. The SVC consists of one or more node-pairs, and can be upgraded while the system is up and running by replacing nodes one at a time in a sequence of suspend and resume. All of the mapping tables are loaded onto the new nodes from the rest of the still active nodes.
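To make the one-node-at-a-time idea concrete, here is a toy Python simulation. It is emphatically not SVC code or the actual SVC upgrade procedure: the `Node` class, the `rolling_upgrade` function, the version strings and the mapping entries are all hypothetical names of mine, and as a simplification the replacement node reloads its tables from its pair partner only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    version: str
    online: bool = True
    mapping: dict = field(default_factory=dict)   # toy stand-in for mapping tables


def rolling_upgrade(pairs, new_version):
    """Replace every node, one at a time, while its partner keeps serving I/O.

    Returns the smallest number of nodes that were online at any moment.
    """
    nodes = [node for pair in pairs for node in pair]
    min_online = len(nodes)
    for first, second in pairs:
        for node, partner in ((first, second), (second, first)):
            if not partner.online:
                raise RuntimeError(f"{partner.name} must stay online during the swap")
            node.online = False                      # suspend the old node
            min_online = min(min_online, sum(n.online for n in nodes))
            node.version = new_version               # swap in the replacement
            node.mapping = dict(partner.mapping)     # reload tables from an active node
            node.online = True                       # resume
    return min_online


if __name__ == "__main__":
    tables = {"volume0": "backend-lun-3"}
    cluster = [
        (Node("n1", "old", mapping=dict(tables)), Node("n2", "old", mapping=dict(tables))),
        (Node("n3", "old", mapping=dict(tables)), Node("n4", "old", mapping=dict(tables))),
    ]
    print("Fewest nodes online during upgrade:", rolling_upgrade(cluster, "new"))
```

The invariant worth noticing is that at most one node is ever suspended, so every pair always has a member serving I/O and holding a good copy of the tables.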
I was hoping as part of the USP-V announcement HDS would indicate how they intend to help customers migrate from an existing USP which is virtualising storage, but alas it didn't happen.
Unlike the SVC, one cannot just upgrade the USP in place and make it into a USP-V. While it might be possible to unplug external disk from the old USP, and re-plug into the new USP-V, what do you do about the internal disk data? I doubt you can just move drawers and trays of disk from the old to the new. The data has to be moved some other way.
Some have asked why not just put an SVC in front of both the old USP and the new USP-V and transfer the data that way. While SVC does support virtualizing the old USP device, IBM is still testing the new USP-V as a managed device, and so this solution is not yet available, and would only apply to the LUNs in the USP-V, not the volumes specifically formatted for System i or System z.
An alternative is to take advantage of IBM's Data Mobility Services, the result of our recent acquisition of SofTek. IBM can help you move both mainframe and distributed systems data from any device, to any device.
In a typical four-year lifecycle of a storage array, it might take six months or so to fill up the box, and might take as much as a year at the end to move the data out to other equipment. SVC can greatly reduce both of these, so that you can take immediate advantage of new equipment as soon as possible, and keep using it for close to the full four years, migrating weeks or days before your lease expires.
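The lifecycle numbers above can be put into a quick Python sketch. The 48-month lease, six-month fill and twelve-month move-out come from the paragraph above; the half-month figures for the virtualized case are purely my own illustrative assumption for "weeks or days", not a measured SVC result.

```python
def productive_months(lease_months: float, fill_months: float, evacuate_months: float) -> float:
    """Months the array spends full and in steady use."""
    productive = lease_months - fill_months - evacuate_months
    if productive < 0:
        raise ValueError("fill and evacuation time exceed the lease")
    return productive


def productive_fraction(lease_months: float, fill_months: float, evacuate_months: float) -> float:
    """Share of the lease spent full and in steady use."""
    return productive_months(lease_months, fill_months, evacuate_months) / lease_months


if __name__ == "__main__":
    traditional = productive_fraction(48, 6, 12)
    # Assumption: about two weeks to fill and two weeks to migrate off.
    virtualized = productive_fraction(48, 0.5, 0.5)
    print(f"Traditional: {traditional:.1%} of the lease")
    print(f"Virtualized (assumed): {virtualized:.1%} of the lease")
```

In the traditional case, only 30 of the 48 months are spent with the box full and doing the job it was bought for.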
Last week, a writer for a magazine contacted us at IBM to confirm a quote that writing a Terabyte (TB) on disk saves 50,000 trees. I explained that this was cited from UC Berkeley's famous How Much Information? 2003 study.
To be fair, the USA Today article explains that AT&T also offers "summary billing" as well as "on-line billing", but apparently neither of these are the default choice. I can understand that phone companies send out bills on paper because not everyone who has a phone has internet access, but in the case of its iPhone customers, internet access is in the palm of your hands! Since all iPhone customers have internet access, and AT&T knows which customers are using an iPhone, it would make sense for either on-line billing or summary billing to be the default choice, and let only those that hate trees explicitly request the full billing option.
Sending a box of 300 pages of printed paper is expensive, both for the sender and the recipient. This information could have been shipped less expensively on computer media, a single floppy diskette or CD-ROM for example. For those who prefer getting this level of detail, a searchable digitized version might be more useful to the consumer.
Which brings me to the concept of Information Lifecycle Management (ILM). You can read my recent posts on ILM by clicking the Lifecycle tab on the right panel, or my now infamous post from last year about ILM for my iPod.
His recollection of the history and evolution of ILM fairly matches mine:
The phrase "Information Lifecycle Management" was originally coined by StorageTek in the early 1990s as a way to sell its tape systems into mainframe environments. Automated tape libraries eliminated most if not all of the concerns that disk-only vendors tout as the problem with manual tape. I began my IBM career in a product now called DFSMShsm which specifically moved data from disk to tape when it no longer needed the service level of disk. IBM had been delivering ILM offerings since the 1970s, so while StorageTek can't claim inventing the concept, we give them credit for giving it a catchy phrase.
EMC then started using the phrase four years ago in its marketing to sell its disk systems, including slower less-expensive SATA disk. The ILM concept helped EMC provide context for the many acquisitions of smaller companies that filled gaps in the EMC portfolio. Question: Why did EMC acquire company X? Answer: To be more like IBM and broaden its ILM solution portfolio.
Information Lifecycle Management is comprised of the policies, processes, practices, and tools used to align the business value of information with the most appropriate and cost effective IT infrastructure from the time information is conceived through its final disposition. Information is aligned with business requirements through management policies and service levels associated with applications, metadata, and data.
Whitepapers and other materials you might read from IBM, EMC, Sun/StorageTek, HP and others will all pretty much tell you what ILM is, consistent with this SNIA definition, why it is good for most companies, and how it is not just about buying disk and tape hardware. Software, services, and some discipline are needed to complete the implementation.
While the SNIA definition provides a vendor-independent platform to start the conversation, it can be intimidating to some, and is difficult to memorize word for word. When I am briefing clients, especially high-level executives, they often ask for ILM to be explained in simpler terms. My simplified version is:
Information starts its life captured or entered as an "asset" ...
This asset can sometimes provide competitive advantage, or is just something needed for daily operations. Digital assets vary in business value in much the same way that other physical assets for a company might. Some assets might be declared a "necessary evil" like laptops, but are tracked to the n'th degree to ensure they are not lost, stolen or taken out of the building. Other assets are declared "strategically important" but are readily discarded, or at least allowed to walk out the door each evening.
... then transitions into becoming just an "expense" ...
After 30-60 days, many of the pieces of information are kept around for a variety of reasons. However, if it isn't needed for daily operations, you might save some money moving it to less expensive storage media, through less expensive SAN or LAN network gear, via less expensive host application servers. If you don't need instant access, then perhaps the 30 seconds or so to fetch it from much-less-expensive tape in an automated tape library could be a reasonable business trade-off.
... and ends up as a "liability".
Keeping data around too long can be a problem. In some cases the data is incriminating, and in other cases just having too much data clogs up your datacenter arteries. If not handled properly within privacy guidelines, data potentially exposes sensitive personal or financial information of your employees and clients. Most regulations require certain data to be kept, in a manner protected against unexpected loss, unethical tampering, and unauthorized access, for a specific amount of time, after which it can be destroyed, deleted or shredded.
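To make the three stages concrete, here is a minimal Python sketch of an age-based policy. The 60-day cutoff echoes the "30-60 days" mentioned above; the seven-year retention period, the function names and the suggested actions are illustrative assumptions, not something prescribed by SNIA or by any particular regulation.

```python
from datetime import date

# Assumed thresholds for illustration only.
ACTIVE_DAYS = 60            # upper end of the "30-60 days" of daily use
RETENTION_DAYS = 7 * 365    # hypothetical retention period

# Hypothetical actions an administrator might attach to each stage.
ACTIONS = {
    "asset": "keep on primary disk",
    "expense": "move to less expensive disk or tape",
    "liability": "review for deletion or shredding",
}


def lifecycle_stage(created: date, today: date) -> str:
    """Classify a piece of information by its age in days."""
    age_days = (today - created).days
    if age_days <= ACTIVE_DAYS:
        return "asset"
    if age_days <= RETENTION_DAYS:
        return "expense"
    return "liability"


def suggested_action(created: date, today: date) -> str:
    """Look up the illustrative action for the current stage."""
    return ACTIONS[lifecycle_stage(created, today)]
```

A real policy would key off business value and regulatory requirements rather than age alone, but age is the easiest attribute to automate, which is why it makes a reasonable starting point.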
So ILM is not just a good idea to save a company money, it can keep them out of the court room, as well as help save the environment and not kill so many trees. Now that 100 percent of iPhone customers have internet access, and a good number of non-iPhone customers have internet access at home, work, school or public library, it makes sense for companies to ask people to "opt-in" to getting their statements on paper, rather than forcing them to "opt-out".
Eventually hardware fails, ... ... eventually software works.
For a solid backup product, consider using IBM Tivoli Storage Manager. I use it to protect all my data on my laptop. And when switching recently from my old Thinkpad T30 to my new Thinkpad T60, I used it to transfer my data over as well.
In addition to creating the Dilbert cartoon, Scott Adams has a blog, which sometimes is quite serious, and other times quite funny. The anticipated 30x cost of "Flash Drives" for Enterprise disk systems reminded me of one of Scott's articles from November 2007 titled [Urge to Simplify]. Here's an excerpt:
Now the casinos have people trained, like chickens hoping for pellets, to take money from one machine (the ATM), carry it across a room and deposit in another machine (the slot machine). I believe B.F. Skinner would agree with me that there is room for even more efficiency: The ATM and the slot machine need to be the same machine.
The casinos lose a lot of money waiting for the portly gamblers with respiratory issues to waddle from the ATM to the slot machines. A better solution would be for the losers, euphemistically called “players,” to stand at the ATM and watch their funds be transferred to the hotel, while hoping to somehow “win.” The ATM could be redesigned to blink and make exciting sounds, so it seems less like robbery.
I’m sure this is in the five-year plan. Longer term, people will be trained to set up automatic transfers from their banks to the casinos. People will just fly to Vegas, wander around on the tarmac while the casino drains their bank accounts, then board the plane and fly home. The airlines are already in on this concept, and stopped feeding you sandwiches a while ago.
Perhaps EMC can redesign its DMX-4 to "blink and make exciting sounds" as well. The Flash Drives were designed for the financial services industry, so those disk systems could be directly connected to make transfers between the appropriate bank accounts.
The Harvard Extension School is running a course focused on virtual law with a Second Life component. Rebecca Nesson (’Rebecca Berkman’ in Second Life) is teaching the class. The lectures, which look fascinating, are available to at-large participants on Berkman Island [SLURL: http://slurl.com/secondlife/Berkman/113/70/24].
You can attend the lectures in Second Life on Monday evenings from 8:00-10:00pm EST (5:00-7:00pm SL time). Videos of past lectures are linked on the course’s web site, where you can also find the syllabus, a wiki, and more.
The US version of The Office (which does an excellent job of being almost as funny as the BBC version) is no stranger to life online. It’s fun to spot Kevin, Meredith, Creed, Roy, Pam all on MySpace, and Dwight has a blog. This week they dipped into Second Life. The very same week as CSI:NY; It’s all getting very mainstream.
Of course, the Office’s treatment of SL was as tongue-in-cheek as you’d expect…
Dwight:“Second Life is not a game. It is a Multi User Virtual Environment. It doesn’t have points or scores or winners or losers.”
Jim:“Oh, it has losers.”
Steve Nelson at Clear Ink, the team behind bringing the office into SL for the episode, has [written about the project] and carefully lists the locations and clothing used.
I watched this episode and loved how they were able to blend it in seamlessly, without it looking out of place or coming across as an awkward reference.
Cisco Systems Inc. has been staging virtual meetings between developers and channel partners in Second Life for more than a year, but this invitation was a first for me. So a presentation announcing the winners of a networking technology innovation contest -- inside a Second Life simulation -- seemed like the place to be.
I'm probably an SL noob (for newbie) by most standards, but I've spent enough time there to know most of the ways to move and how to search out islands and events.
In all, I would say the Cisco event sparked my interest in the SL virtual meeting format, but my attention was focused more on making things in SL work smoothly than on the material presented.
I've had some interesting conversations with event-coordinators looking for advice on setting up events in Second Life, so I suspect that is a good sign that this is still growing momentum.
We have some exciting webcasts in the upcoming weeks!
Smarter Enterprises Need Smarter Storage
In this [InformationWeek webcast], my IBM colleague Allen Marin will present a brief overview of IBM Smarter Storage for the enterprise with a focus on new high-end disk and Virtual Tape solutions.
Allen will take you through the recent enhancements [announced earlier this month], highlighting how the new capabilities can address the requirements of your mission-critical applications, as well as your evolving business analytics, and cloud initiatives.
Date: Wednesday, October 24, 2012 Time: 10:00 AM PDT / 10:00AM Arizona / 1:00 PM EDT Duration: 60 Minutes
[Register now!] All registrants will get the independent Clipper Group Report - "When Infrastructure Really Matters - A Focus on High-End Storage" - free!
Smarter Storage for Midsize Businesses
Businesses of all sizes are getting buried in the avalanche of data. Data is coming in at faster rates and in greater volumes. The value of data is increasing. Old processes and technologies aren't working. Midsize businesses have the same issues managing the rapid growth of data as large enterprises, but they don't have the same size budget or staff. They need advanced capabilities at an affordable price that are easy to implement.
Speakers for this webcast include Brian Truskowski, General Manager, IBM System Storage and Networking; Ed Walsh, Vice President of Market and Strategy, IBM System Storage; and Tommy Rickard, IBM Director, UK Storage Development.
Date: Tuesday, November 6, 2012 Time: 8:00 AM PST / 9:00AM Arizona / 11:00 AM EST Duration: 60 Minutes
[Register now!] Learn how new IBM Smarter Storage solutions can help midsize businesses tame the explosion of information and their IT budgets.
I hope you can find time in your busy schedule to participate in one or both of these webcasts.
Are you looking for new storage for 2014? Time to replace that old gear on your IT floor?
The decisions you make about your IT infrastructure affect everything -- from database and business analytics to cloud and virtualization. That's why it's more important than ever to choose wisely.
If you are currently running on storage from HP, HDS, EMC or one of IBM's many other competitors, you might want to take a fresh new look at IBM storage which...
performs faster with greater throughput and lower latency,...
and is easier to use, ...
AND costs less over the next three to five years!
Next week, on January 16, senior IBM executives will share news about breakthrough technologies, featuring Intel® processors, that enhance Smarter Computing servers and storage.
(This webcast will be available worldwide. I, myself, will be in Winnipeg, Canada, freezing my [tuque] off!)
In this webcast, you will learn how to improve decision support and data processing for your mission-critical applications, drive higher performance on analytics and increase agility and flexibility through scalable solutions.
I can't believe we got snow this week on Valentine's Day! It didn't last long on the ground here in Tucson, but there are still some white caps in our mountains. For those of you "trapped" by snow, or too much work, here are two upcoming events you can attend from your desk and computer!
IBM Oracle Virtual University 2012
Please join us for the fourth annual IBM Oracle Virtual University that runs "live" for 24 hours, then continues 'on-demand' replay through the remainder of 2012.
From: Tuesday, February 21, 6:00 am US Eastern Time EST (6:00 pm China Time)
To: Wednesday, February 22, 6:00 am EST
This is a great educational event for IBM and Business Partner sales & technical teams who sell IBM Oracle solutions or have Oracle solutions installed in their account. It is for anyone who is new to or interested in the IBM Oracle Alliance as well as experienced sales & technical people who need all the latest on the IBM/Oracle co-opetition relationship for 2012 and beyond.
This VIRTUAL on-line event will cover key topics around the IBM Oracle Alliance. I am one of the speakers and will cover IBM System Storage offerings as they relate to Oracle software.
This is a chance for sellers to hear an update on what's new, unique and available to sell in 2012. The goal of this session is to help enable you to sell more IBM products and services with Oracle solutions in 2012! Learn where to go for help to better understand these solutions, close more deals and reach your targets.
Even through economic challenges, storage requirements have continued to grow along with the information explosion.
Join us for this informative webcast and hear from Jon Toigo, CEO and Managing Principal of Toigo Partners, as he discusses six cutting-edge storage technologies that are ready for prime time and can help transform your data center.
Date: Tuesday, February 28
Time: 1:00 pm EST, 12:00 pm CST, 10:00 am PST
The featured speaker is fellow blogger Jon Toigo, CEO and Managing Principal, Toigo Partners, an outspoken technology consumer advocate and vendor watchdog whose articles, columns, and blog posts on [DrunkenData.com] are enjoyed by over a million readers per month.
Here are some upcoming events related to IBM Storage!
If you sell IBM and/or Oracle solutions, please join me for IBM Oracle Virtual University 2013!
A few weeks ago, I recorded a session on IBM Storage: Overview, Positioning and How to Sell that will be available on demand starting tomorrow, February 26th, at the IBM Oracle Virtual University 2013.
It's one of 65 new sessions that will help IBM to surround Oracle applications with IBM infrastructure, services and industry solutions. Oracle software, after all, runs best on IBM hardware. Other highlights of Oracle Virtual University include a live executive State of the Alliance session with Q&A, Oracle keynote, updates by Oracle product managers, sessions on PureSystems, Selling IBM into an Oracle environment, Cloud, and much more.
There will be live technical teams on hand throughout launch day to answer your questions in real time, so I hope you can carve out 30 minutes or more on February 26th to take advantage of these available resources.
After helping launch the first Pulse back in 2008, I have sadly not been back since. Last year, I was invited to attend as a last-minute replacement for another speaker, but I was busy [having emergency surgery].
This year's [Pulse 2013] conference looks amazing. It will be held in Las Vegas, Nevada. Guest speakers Peyton Manning, 4-time NFL MVP, and Carrie Underwood, 6-time Grammy award winner, join IBM's Software Group executives and experts, who will explain how IBM Tivoli can help optimize your IT infrastructure.
Sadly, once again, I will not be there at Pulse. This time, I will be on the East Coast visiting clients instead, but my on-premise correspondent, Tom Rauchut, has informed me that he will be there. Hopefully, he will provide me something to write about.
Later in March, I will be in Brussels, Belgium for the Storage Expo. This is held March 20-21, at the Brussels-Expo venue. I will be presenting several topics each day, as well as visiting clients in the area. This event comes on behalf of IBM Belgium in association with IBM Business Partner IRIS-ICT.
If you plan to participate in any of these events, let me know!
Last week, US President Barack Obama declared September 2011 as "National Preparedness Month". Here is an excerpt of the press release:
Whenever our Nation has been challenged, the American people have responded with faith, courage, and strength. This year, natural disasters have tested our response ability across all levels of government. Our thoughts and prayers are with those whose lives have been impacted by recent storms, and we will continue to stand with them in their time of need. This September also marks the 10th anniversary of the tragic events of September 11, 2001, which united our country both in our shared grief and in our determination to prevent future generations from experiencing similar devastation. Our Nation has weathered many hardships, but we have always pulled together as one Nation to help our neighbors prepare for, respond to, and recover from these extraordinary challenges.
In April of this year, a devastating series of tornadoes challenged our resilience and tested our resolve. In the weeks that followed, people from all walks of life throughout the Midwest and the South joined together to help affected towns recover and rebuild. In Joplin, Missouri, pickup trucks became ambulances, doors served as stretchers, and a university transformed itself into a hospital. Local businesses contributed by using trucks to ship donations, or by rushing food to those in need. Disability community leaders worked side-by-side with emergency managers to ensure that survivors with disabilities were fully included in relief and recovery efforts. These stories reveal what we can accomplish through readiness and collaboration, and underscore that in America, no problem is too hard and no challenge is too great.
Preparedness is a shared responsibility, and my Administration is dedicated to implementing a "whole community" approach to disaster response. This requires collaboration at all levels of government, and with America's private and nonprofit sectors. Individuals also play a vital role in securing our country. The National Preparedness Month Coalition gives everyone the chance to join together and share information across the United States. Americans can also support volunteer programs through www.Serve.gov, or find tools to prepare for any emergency by visiting the Federal Emergency Management Agency's Ready Campaign website at [www.Ready.gov] or [www.Listo.gov].
In the last few days, we have been tested once again by Hurricane Irene. While affected communities in many States rebuild, we remember that preparedness is essential. Although we cannot always know when and where a disaster will hit, we can ensure we are ready to respond. Together, we can equip our families and communities to be resilient through times of hardship and to respond to adversity in the same way America always has -- by picking ourselves up and continuing the task of keeping our country strong and safe.
NOW, THEREFORE, I, BARACK OBAMA, President of the United States of America, by virtue of the authority vested in me by the Constitution and the laws of the United States, do hereby proclaim September 2011 as National Preparedness Month. I encourage all Americans to recognize the importance of preparedness and observe this month by working together to enhance our national security, resilience, and readiness.
IBM has several webinars to help you prepare for upcoming disasters.
Today, September 8, at 4pm EDT, IBM is hosting a [CloudChat on Business Resilience] that will focus on resiliency and continuity in the cloud -- a timely topic considering the recent weather events on the East Coast of the U.S. This chat will include Richard Cocchiara, IBM Distinguished Engineer and CTO, IBM Business Continuity and Resiliency Services (@RichCocchiara1) and Patrick Corcoran, Global Business Development, IBM Business Continuity and Resiliency Services (@PatCorcoranIBM).
Don't think you can afford Disaster Recovery planning? Next week, September 13, I will be joined by a few other experts to talk about freeing up much needed funds from your tight IT budget, by being more efficient. The Webinar [Taming Data Growth Made Easy] is part of IBM's "IT Budget Killer" series.
Lastly, on September 21, IBM will have the Webinar [Planning for Disaster Recovery in a Power Environment: Best Practices to Protect Your Data]. This will cover principal lessons learned from disasters like Hurricane Katrina and the World Trade Center, local and regional considerations for Disaster Recovery Planning, planning Recovery Time Objectives (RTOs), and best practices for automation, mirroring and multiple Site Operational Efficiencies. A customer case study from University of Rochester Medical Center (URMC) will help reinforce the concepts, with a discussion on how a major hospital ensures Business Continuity via Contingency Planning using IBM Power Systems. The speakers include Steve Finnes, World Wide Offering Manager for IBM Power Systems, Vic Peltz, Consulting IT Architect for WW Business Continuance Technical Marketing, and Rick Haverty, Director of IT Infrastructure at University of Rochester Medical Center (URMC).
Hopefully, you will find these webinars useful and informative!
An exciting new addition to the IBM storage line, the Storwize V7000 is a very versatile and solid choice as a midrange storage device. This session will cover a technical overview of the controller as well as its positioning within the overall IBM storage line.
xST04 - XIV Implementation, Migration and Optimization
Attend this session to learn how to integrate the IBM XIV Storage System in your IT environment. After this session, you should understand where the IBM XIV Storage system fits, and understand how to take full advantage of the performance capabilities of XIV Storage by using the massive parallelism of its grid architecture. You will learn how to migrate data onto the XIV and hear about real world client experiences.
xST05 - IBM's Storage Strategy in the Smarter Computing Era
Want to understand IBM's storage strategy better? This session will cover the three key themes of IBM's Smarter Computing initiative: Big Data, Optimized Systems, and Cloud. IBM System Storage strategy has been aligned to meet the storage efficiency, data protection and retention required to meet these challenges.
IBM offers encryption in a variety of ways. Data can be encrypted on the server, in the SAN switch, or on the disk or tape drive. This session will explain how encryption works, and explain the pros and cons with each encryption option.
sAC01 - IBM Information Archive for email, Files and eDiscovery
IBM has focused on data protection and retention, and the IBM Information Archive is the ideal product to achieve it. Come to this session to discuss archive solutions, compliance regulations, and support for full-text indexing and eDiscovery to support litigation.
sGE04 - IBM's Storage Strategy in the Smarter Computing Era
Want to understand IBM's storage strategy better? This session will cover the three key themes of IBM's Smarter Computing initiative: Big Data, Optimized Systems, and Cloud. IBM System Storage strategy has been aligned to meet the storage efficiency, data protection and retention required to meet these challenges.
sSM03 - IBM Tivoli Storage Productivity Center – Overview and Update
IBM's latest release of IBM Tivoli Storage Productivity Center is v4.2.2, a storage resource management tool that manages both IBM and non-IBM storage devices, including disk systems, tape libraries, and SAN switches. This session will give an overview of the various components of Tivoli Storage Productivity Center and provide an update on what's new in this product.
sSN06 - SONAS and the Smart Business Storage Cloud (SBSC)
Confused over IBM's Cloud strategy? Trying to figure out how IBM Storage plays in private, hybrid or public cloud offerings? This session will cover both the SONAS integrated appliance and the Smart Business Storage Cloud customized solution, and will review available storage services on the IBM Cloud.
sTA01 - Tape Storage Reinvented: What's New and Exciting in the Tape World?
This very informative session will keep you up to date with the latest tape developments. These include the TS3500 tape library connector Model SC1 (Shuttle). The shuttle enables extreme scalability of over 300,000 tape cartridges in a single library image by interconnecting multiple tape libraries with a unique, high speed transport system. The world's fastest tape drive, the TS1140 3592-E07, will also be presented. The performance and functionality of the new TS1140 as well as the new 4TB tape media will be discussed. Also, the IBM System Storage Linear Tape File System (LTFS), including the Library Edition, will be presented. LTFS allows a disk-like, drag-and-drop interface for tape. This is a not-to-be-missed session for all you tape lovers out there!
In December, I will be going to Gartner's Data Center Conference in Las Vegas, but the agenda has not been finalized, so I will save that for another post.
I hope everyone enjoyed the French Open in Second Life! Here are some upcoming events:
Rational Software Development Conference comes to Second Life
As part of its commitment to the developer community, IBM is broadening the experience for conference visitors and avatars visiting IBM CODESTATION, in the virtual world of Second Life. During RSDC this year, visitors can view the General Sessions, catch Rational product demonstrations, interact with Rational experts, and learn about the first CODESTATION "Coder's Challenge" kicking off in July.
For Rational Software Development Conference (RSDC) information and registration, running June 10-14:here
Virtual Technical Briefing in Second Life: Web 2.0
Join IBM developerWorks in Second Life for a virtual Web 2.0 Briefing on June 21, 2007 at 12:30 pm EDT/ 9:30 am PDT. During this briefing from IBM developerWorks you'll see presentations on Web 2.0 technologies, a flash demo of associated hot technologies and have a chance to have your questions answered by IBM experts.
In the last two years Web 2.0 has created one of the most remarkable growth surges in Web application history. The transition of consumer Web sites from isolated information silos to sources of shared content and functionality, make the Web a true computing platform serving web applications to end-users. Now it's time to take the lessons learned from that success and see how it can bring value to you and your business.
Based on the success of our April 26 event, we decided to have the next event in September. More details to follow, but we plan to have it open to customers, analysts and business partners. If you are interested in participating, now is a good time to get your avatar in Second Life up and running. If you need "System Storage" or "IBM Business Partner" logo clothing for your avatar, send me a note.
Every September, IBM Tucson spends a Wednesday or Saturday helping out local non-profit charities. The event is organized by the local United Way. My first one was packing boxes of food for the [Community Food Bank of Southern Arizona] on September 12, 2001, the day after the [tragic events in New York and Washington DC]. The mindless activity of putting a bottle, bag or can into one box after another helped us cope with the shock and awe that week.
So, it seemed fitting on the 10th anniversary of that event to go back to the Community Food Bank and help pack boxes of food. The facility received nearly $200,000 in donations in response to the [shooting of US Congresswoman Gabrielle Giffords]. Her husband, astronaut Mark Kelly, suggested that donations go in part to the Tucson Community Food Bank, and with the money they were able to expand operations, dedicating a portion as the [Gabrielle Giffords Family Assistance Center] to bring together food handouts with the [Supplemental Nutrition Assistance Program] for food stamps, and the Women, Infants, and Children (WIC) program. One-stop assistance!
This year, nearly 500 Tucson IBMers volunteered to complete 22 projects at 17 nonprofit agencies. We were not alone; we were joined by volunteers from Bank of America, Texas Instruments, Tucson Medical Center, Geico Insurance, University of Arizona, Cox Cable TV, Desert Diamond Casinos, The Westin La Paloma Resort and Spa, the Arizona Lottery, Community Partnership of Southern Arizona (CPSA), Pizza Hut, Arizona Daily Star, 94.9 MixFM Radio, BizTucson, and News 4 Tucson (our local NBC affiliate).
In a bit of competition, our team, Team B, of 14 IBMers, competed against another team, Team A, of 20 people. Despite having fewer people, we were able to pack 746 boxes, representing 20,000 pounds of food, beating out Team A which only packed 18,000 pounds. (I have chosen not to identify anyone on Team A, no need to rub their noses in it. This was all for a good cause.)
Each box contained cereal, canned evaporated milk, canned vegetables and fruits, fruit juice, rice, and dry beans. My job on the assembly line was to put two half-gallon jugs of grape juice in the box and move it down the line.
What lessons can a team of people learn from an activity like this?
When you put a bunch of efficiency experts from IBM on a task, they will self-organize and self-manage for optimum performance, just as we do in our regular day jobs.
No matter what you plan in advance, individual personalities and strengths surface, encouraging minor adjustments to process and procedures to be more efficient.
In an assembly line process, where each person has to wait for the person before them to finish their assigned task, it becomes obvious who is not pulling their fair share of the work. In this manner, everyone holds everyone else accountable for their output.
This was a great day for a good cause. The Community Food Bank qualifies for the Arizona [Working Poor Tax Credit] program. For every dollar the Community Food Bank receives, they can give 10 dollars of food to someone in need.
Special thanks to Greg Kishi for being our team leader for this event, and to Carol Tribble for taking these photographs.
While many are just becoming familiar with the end-user interfaces of Web 2.0, from blogs and wikis to Facebook and Flickr, fewer may be familiar with the "information infrastructure" of servers and storage behind the scenes.
Last year, I bought an XO laptop under the One Laptop Per Child [OLPC] foundation's Give-1-Get-1 program and posted my impressions on this blog. One in particular, my post [Printing on XO laptop with CUPS and LPR], showed how to print from the XO laptop over to a network-attached printer. This caught the attention of the OLPC development team, who asked me to help them with another project as a volunteer. Before accepting, I had to learn what skills they were really looking for, especially since I do not consider myself an expert in either printing or networking.
(Unlike a regular 9-to-5 job where most people just try to look busy for eight hours a day, doing volunteer work means being ready to ["roll up your sleeves"] and actually accomplish something. This applies to any kind of volunteer work, from hammering nails for [Habitat for Humanity] to sorting cans at the [Community Food Bank]. Best Buy uses the phrase "Results-Only Work Environment" [ROWE] to describe their latest program, modeled in part after the mobile workforce policies of Web 2.0-enlightened companies IBM and Sun, but that is perhaps a topic for another blog post!)
Apparently, to support a school full of students with XO laptops, it would be nice to have a few servers that provide support to manage the class lesson plans, make reading materials and other content available, and keep track of results. What they need is an "information infrastructure"! They decided on two specific servers:
School Server -- this would run a popular class management system called [Moodle]
Library Server -- a server for a digital library collection, based on Fedora Commons [16-minute video]
In keeping with OLPC philosophy to use free and open source software [FOSS], both servers are based on the [LAMP] platform. LAMP is an acronym for the combined software bundle of Linux, Apache, MySQL and a programming language like PHP. The "XS" team working on the school server wanted me to build a LAMP server and install Moodle to help test the configuration, determine what other software is required, and perhaps develop a backup/recovery scenario. Basically, they needed someone with Linux skills to put some hardware and software together.
(I am no stranger to Linux. Back in the 1990s, I was part of the Linux for S/390 team, led the effort to create the infamous "compatible disk layout" (CDL) that allows z/OS to access ESCON and FICON-attached Linux volumes, took my LPI certification exam, and led a team to validate FCP drivers for our disk and tape storage systems. For an IBMer to volunteer for an Open Source community project, you have to take an "open source" class and get management approval to review for any possible "conflicts of interest". I got this all taken care of, and agreed to help the XS team.)
Building a test environment is similar to baking a cake. You have a recipe, utensils, and ingredients. Here's a brief description of each of the ingredients:
Like Windows, the Linux operating system comes in different flavors to run on handhelds, desktops and servers. For servers, IBM tends to focus on Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES). However, the XS team decided instead to use [Fedora 7], a community-supported version from Red Hat. Earlier versions of Fedora were known as "Fedora Core", but apparently with version 7, the word "Core" has been dropped. Fedora 7 can be used in either desktop or server mode.
[Apache] is web server software, and half of all web servers on the internet use it. It competes head-on against Microsoft's Internet Information Services (IIS) server provided in Windows 2003. The Apache name is partly from the fact that its origins were "a patchy" variant of the NCSA HTTPd 1.3 codebase. The popular [IBM HTTP Server] is powered by Apache, with added support for the rest of the IBM WebSphere software portfolio. The XS team chose Apache v2 as the web server platform.
[MySQL] is relational database management system (RDBMS) software, similar to commercial products like IBM DB2 Universal Database, Oracle DB, or Microsoft SQL Server. The SQL stands for Structured Query Language, developed by IBM in the early 1970s as a standard language to update and query database tables. MySQL comes in two flavors, MySQL Enterprise for commercial use, and MySQL Community, which is community-supported. There are over 10 million instances of MySQL running websites on the internet, which helps explain why Sun Microsystems agreed to acquire the company MySQL AB last month. The XS team decided on MySQL 5.0 as the database platform.
Making HTML pages dynamic, including the possibility to add or query database contents, requires programming. A variety of web scripting languages were developed, all starting with the letter "P" to claim to be the programming part of the LAMP platform, including [PHP], Perl, and Python. Later, frameworks were developed that do not start with the letter "P", like [Ruby on Rails]. PHP is short for "PHP: Hypertext Preprocessor", a name that reflects how it pre-processes HTML during web serving, looking for special tags indicating PHP code, allowing programming logic to insert HTML content, such as information extracted from a database. While Python is the language that runs the Sugar interface on the XO laptops, the XS team decided on PHP v5 as the programming language for the server.
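For readers who have not seen this kind of pre-processing before, here is a rough sketch of the idea in Python rather than PHP. It scans a page for special tags and replaces each one with a value pulled from a database. Everything here is an assumption made for illustration: the `<?= name ?>` tag syntax is a simplification, the `students` table is made up, and SQLite (bundled with Python) stands in for MySQL so the example runs without a database server.

```python
import html
import re
import sqlite3

# Simplified, hypothetical tag syntax: <?= column_name ?>
TAG = re.compile(r"<\?=\s*(\w+)\s*\?>")


def render(template: str, row: dict) -> str:
    """Replace each <?= name ?> tag with the escaped value of row[name]."""
    return TAG.sub(lambda m: html.escape(str(row[m.group(1)])), template)


def fetch_student(conn: sqlite3.Connection, student_id: int) -> dict:
    """Query one row from the made-up students table."""
    cur = conn.execute(
        "SELECT name, grade FROM students WHERE id = ?", (student_id,)
    )
    name, grade = cur.fetchone()
    return {"name": name, "grade": grade}


# In-memory database standing in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)"
)
conn.execute("INSERT INTO students (id, name, grade) VALUES (1, 'Ada', 5)")

page = render(
    "<p>Hello, <?= name ?>! You are in grade <?= grade ?>.</p>",
    fetch_student(conn, 1),
)
print(page)  # prints: <p>Hello, Ada! You are in grade 5.</p>
```

Real PHP does far more than substitute values, since the tags can contain arbitrary program logic, but the flow is the same: plain HTML passes through untouched, and only the tagged portions are computed at serving time.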
As for utensils, you only need a few utilities:
A simple text editor: I go old-school and use the classic "vi" (to learn this editor, see the ["Cheat Sheet" method] on IBM developerWorks)
Secure Shell (SSH): this allows you to access one server from another
browser access to the internet: when you encounter problems, get error messages, or whatever, it pays to know how to search for things with Google
As for a recipe, the Moodle website spells out some unique details and parameters. For the base LAMP platform, I chose to follow the book [Fedora 7 Unleashed] that has specific chapters on setting up SSH, Apache, MySQL, PHP, Squid and so on. The resulting configuration looks like this:
Here was the sequence of events:
I took an old PC that I wasn't using anymore, backed up the Windows system, and installed Linux on top. The book above had a Fedora 7 DVD on the back jacket, but I used the [OLPC LiveCD] that had some values pre-configured.
Set the IP address static. I set mine to 192.168.0.77 which nobody sees except my other systems.
My school server is "headless", which means it does not have its own keyboard, video or mouse. It also runs only to Linux run level 3, command line interface only, no graphics. I was able to share using a [KVM switch], but this meant having to remember something on one screen while I was switching over to the other. My Windows XP system has my browser connection to the internet to follow instructions or read error messages, so I need that up all the time. To get around this, on my Windows XP system, I generated SSH public and private keys, copied the public key over to my new Linux system, and used [OpenSSH for Windows] to connect over. Now, on one screen, I have my Windows XP Firefox browser, and a separate command line window that is accessing my Linux school server.
With SSH up and running, I can now use "vi" to edit files, and issue commands to install or activate the remaining software. First up, Apache. I got this working, and from Windows XP, verified that going to "http://192.168.0.77" showed the Apache test screen.
I installed PHP, and tested it with a simple short index.php file.
I installed MySQL, set up the base "installation databases", and created a test database. Here is where you might want to set a password for the MySQL root user, but I chose to put that off for now.
I installed Moodle. It was smart enough to check that Apache, PHP, and MySQL were operational, and apparently I missed a few special "PHP" modules that had to be linked in. I was able to find them, download them, and get them installed.
I brought up Moodle, created a "class category" of SCIENCE and a new class "Chemistry 101", and it all worked.
I also activated Squid, which is a web proxy cache server that stores web pages for faster access.
Another idea was to activate Samba, to provide CIFS file and print sharing, but I decided to put this off.
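With this many services running on one headless box, a quick reachability check from another machine can save some guesswork. The following is only a minimal sketch in Python, not the procedure used above: the function names are hypothetical, and the port numbers are the usual defaults for SSH, Apache, MySQL and Squid, which is an assumption about how the server is configured.

```python
import socket

# Usual default ports (an assumption; adjust to match your configuration).
DEFAULT_SERVICES = {"ssh": 22, "apache": 80, "mysql": 3306, "squid": 3128}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_stack(host, services=DEFAULT_SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}
```

Calling check_stack("192.168.0.77") from the Windows XP machine would return a dictionary of service names and True/False values. A False for MySQL does not necessarily mean it is down, since the database may be set up to accept local connections only.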
I got all of this done last Saturday, start to finish. Now the fun begins. We are going to run through some tests, document the procedures, and try to get a system up and running in a remote school in Nepal. For now, I have only one XO laptop to simulate what the student sees, and one laptop that can represent either a teacher's Windows-based laptop, or run QEMU and emulate a second XO laptop. For tuning, I might go through the procedures mentioned on IBM developerWorks "Tuning LAMP" [Part 1, Part 2, Part 3].
For those in the server or storage industry who need to understand Web 2.0 information infrastructure better, building a LAMP server like this can be quite helpful.
Are you going to Edge 2013 in Las Vegas, June 10-14?
In my talks with clients about storage, I find similar hesitation on turning on various storage efficiency features that IBM (and other vendors) have to offer. Let's examine a few of them.
Less than half of businesses have activated "thin provisioning" on storage devices that support this feature. Why? IBM introduced thin provisioning on its RAMAC Virtual Array back in 1997! The technology is well proven in the field. Don't know how to report this for charge-back activity? Charge your end-users for the maximum capacity upper limit. Simple enough!
What about Data Deduplication? IBM has had this feature on its N series since 2007, but it wasn't until IBM came out with the IBM ProtecTIER gateway and appliance models that people started to take notice of this technology. Yes, I agree Hash Collisions can be quite scary on competitive gear, but on IBM ProtecTIER we do not use hash codes, and all data is compared byte-for-byte. For those considering hash-based deduplication, hash collisions in general are quite rare. Jeff Preshing does the math for you in his blog post: [Hash Collision Probabilities]. Of course, if you want to leave no doubt in the minds of a jury of your peers, stick with byte-for-byte comparison methods in the IBM ProtecTIER.
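To get a feel for how rare "quite rare" is, the standard birthday-bound approximation can be computed in a few lines. This is a minimal sketch in Python under stated assumptions: the function name is hypothetical, the hash is treated as perfectly uniform, and the chunk counts and hash widths are illustrative values, not figures from any IBM or competitive product.

```python
import math


def collision_probability(num_chunks, hash_bits):
    """Approximate chance that at least two of num_chunks distinct chunks
    share a hash value, using the birthday bound 1 - exp(-k(k-1)/(2N))."""
    space = 2.0 ** hash_bits  # N: number of possible hash values
    exponent = num_chunks * (num_chunks - 1) / (2.0 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-exponent)


# Illustrative inputs only: a 32-bit hash is far too small for this job,
# while a 160-bit hash over a trillion chunks gives a vanishingly small value.
small_hash = collision_probability(77_163, 32)
large_hash = collision_probability(10**12, 160)
```

The first call comes out at roughly one chance in two, which is why nobody deduplicates on 32-bit hashes. The second is many orders of magnitude below any hardware error rate, yet it is still not zero, which is the distinction a byte-for-byte comparison removes.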
Lastly, I have heard concerns about using real-time compression. Really? Real-time compression has been used in wide-area network (WAN) transmissions ever since IBM developed the Houston Automatic Spooling Priority (HASP) system for NASA back in 1973. IBM has offered real-time compression on tape cartridges since 1986, the year I started with IBM, some 27 years ago. And now, real-time compression is available for file-based and block-based disk systems. All of these solutions are based on the Lempel-Ziv lossless compression algorithms introduced in 1977. One customer I spoke with was unwilling to try compression, because it requires thin provisioning as a prerequisite. How is that for having one fear based on another one!
IBM places a high value on data integrity. For each data footprint reduction method, IBM has designed a solution that returns the exact ones and zeros, in the correct quantity and order, as were originally stored.
For more on this topic, come see me present "Data Footprint Reduction -- Understanding IBM Storage Efficiency Options" at [IBM Edge 2013 conference] in Las Vegas, June 10-14.
Yesterday, I was able to get the "Build 650" up and running under Qemu emulation on my Thinkpad laptop computer. Today, I was able to get my Thinkpad and my XO laptop talking to each other for a "chat".
The built-in "Chat" activity is one of the many kid-friendly activities included on the XO laptop for the One Laptop Per Child [OLPC] project. It is also possible for two or more people to share other activities, like editing a text document, or browsing the internet.
As they say, emulation is only 95% complete, and this is true in this case as well. My Thinkpad does not have a built-in video camera, and for some reason the Qemu emulation does not let me hear any sound, despite specifying the "-soundhw es1370" parameter. And lastly, it doesn't have the "mesh network" built-in Wi-Fi capability, just standard 54Mbps 802.11g through my Linksys router.
So, I set both XO and Thinkpad to use the new "xochat.org" jabber server so that the two could see each other:
$ sugar_control_panel -s jabber xochat.org
I set my XO nickname to be "TonyP" and my Thinkpad to be "Pearson", and chose blue-orange for the first, and orange-blue for the second.
The process of starting a chat is similar to other IM systems like IBM Lotus Sametime. You have a neighborhood view that shows all people online using the same jabber server. In my case there were about 30 or so icons on the screen. From the colors on my XO, I was able to locate my Thinkpad, and invite him to a chat. You can share the chat with everyone on the network, or keep it private between two people. I tried both ways to see the difference.
In a private two-way chat, the first person starts up their Chat activity, and sends an invite to join to another person. The second person sees a flashing chat bubble on the bottom of the screen to the left of all the other action bar icons. The difference is that the chat bubble is blue-orange matching the sender, rather than black-and-white like the rest of the icons.
If the recipient happens to be busy doing something else full-screen, like browsing the web, there doesn't seem to be any interruption. Only when he goes to "home view" will he see the colored chat bubble and decide to join or not.
The chat itself colorizes the text to match the color of the participants' icons. Blue for one, and orange for the other. If two people had identical color schemes I guess it might be hard to tell. The text is white, so it is best to choose darker colors for contrast.
A nice feature is that you can save your chat session with the "keep" button on the upper right part of the screen, and your dialogue discussion will show up as an entry in the "journal".
Using this technique, it is possible for someone who has one "XO" laptop and one regular computer, or two regular computers, to develop and test applications that involve the sharing aspect of educational opportunities. Chats can be between students, student-to-teacher, or even student-to-mentor.
Last week, on January 31, two of my colleagues retired from IBM. At IBM, retirements always happen on the last day of the month. Here are my memories of each, listed alphabetically by last name.
Mark Doumas retires after working 32 years with IBM. Mark was my manager for a few months in 2003. Back then, IBM was working on launching a variety of new products, including the IBM SAN File System (SFS), the IBM SAN Volume Controller (SVC), a new release of Tivoli Storage Manager (TSM), and TotalStorage Productivity Center (TPC), which was later renamed to IBM Tivoli Storage Productivity Center.
Mark was manager of the portfolio management team, and I was asked to manage the tape systems portfolio. I am no stranger to tape, as one of my 19 patents is for the pre-migration feature of the IBM 3494 Virtual Tape Server (VTS). The portfolio included LTO and Enterprise tape drives, tape libraries and virtual tape systems. My job was to help decide how much of IBM's money we should invest in each product area. This was less of a technical role, and more of a business-oriented project management position.
Portfolio management is actually part of a chain of project management roles. At the lowest level are team leads that manage individual features, referred to as line items of a release. Release managers are responsible for all the line items of a particular release. Product managers determine which line items will be shipped in which release, and often have to balance across three or more releases. Architects help determine which products in a portfolio should have certain features. Since I was chief architect for DFSMS and Productivity Center, stepping up to portfolio manager was naturally the next rung on the career ladder.
(Side note: If you were wondering why I was only a few months on the job, it was because I was offered an even better position as Technical Evangelist for SVC. See my 2007 blog post [The Art of Evangelism] for a humorous glimpse of the kind of trouble I got in with that title on my business card!)
While my stint in this role was brief, I am still considered an honorary member of the tape development team. Nearly every week I present an overview of our tape systems portfolio at the Tucson Executive Briefing Center, or on the road at conferences and marketing events.
This year, 2012, marks the 60th anniversary of IBM Tape, but I will save that for a future post!
Jim is an IBM Fellow for IBM Systems and Technology Group. There are only 73 IBM Fellows currently working for IBM, and this is the highest honor IBM can bestow on an employee. He has been working with IBM since 1968 and now retires after 44 years! Jim was tasked with predicting the future of IT, and helping drive strategic direction for IBM. Cost pressures, requirements for growth, accelerating innovation and changing business needs help influence this direction.
Many consider Jim one of the fathers of server virtualization. For those who think VMware invented the concept of running multiple operating systems on a single host machine, guess again! IBM developed the first server hypervisor in 1967, and introduced the industry's first [official VM product on August 2, 1972] for the mainframe.
When I joined IBM in 1986, my first job was to work on what was then called DFHSM software for the MVS operating system. Each software engineer had unlimited access to his or her own VM instance of a mainframe for development and testing. This was way better than what we had in college, having to share time on systems for only a few minutes or hours per day. Today, DFHSM is now called the DFSMShsm component of DFSMS, an element of the z/OS operating system.
At various conferences like [SHARE] and [WAVV] we celebrated VM's 25th anniversary in 1997, and its 30th anniversary in 2002. Today, it is called z/VM and IBM continues to invest in its future. Last October, IBM announced the [z/VM 6.2] release, which provides Live Guest Relocation (LGR) to seamlessly move VM guest images from one mainframe to another, similar to PowerVM's Live Partition Mobility or VMware's VMotion.
Lately, it seems employees at other companies jump from job to job, and from employer to employer, on average every 4.1 years. According to [National Longitudinal Surveys] conducted by the [U.S. Government's Bureau of Labor Statistics], the average baby boomer holds 11 jobs. In contrast, it is quite common to see IBMers work the majority of their career at IBM.
The next time you have a tasty beverage in your hand, raise your glass! To Mark and Jim, you have earned our respect, and you both have certainly earned your retirement!
(Note: I have been informed that this week the U.S. Federal Trade Commission has [announced an update] to its
[16 CFR Part 255: Guides Concerning the Use of Endorsements and Testimonials in Advertising]. As if it were not obvious enough already, I must emphasize that I work for IBM, IBM provides me all the equipment and related documentation that I need for me to blog about IBM solutions, and that I am paid to blog as part of my job description. Both my boss and I agree I am not paid enough, but that is another matter. Beginning December 1, 2009, all positive mentions of IBM products, solutions and services on this blog might be considered a "celebrity endorsement" by the FTC and others under these new guidelines. Negative mentions of IBM products are probably typos.)
At a conference once, a presenter discussing tips and techniques about public speaking told everyone to be
aware that everyone in the audience is "tuned into radio station WIIFM" (What's In It For Me). If a member of the audience cannot figure out why the information being presented is relevant to them individually, they may not pay attention for long. Likewise, when it comes to archiving data for long term retention, I think many people are tuned into KEFM (the Keep Everything Forever methodology). Two classic articles from Drew Robb on the subject are [Can Data Ever Be Deleted?] and [Experts Question 'Keep Everything' Philosophy].
(Note: For those of my readers who do not live in the US, most radio stations start with
the letter "K" if they are on the left half of the country, and "W" if they are on the right half. See
Thomas H. White's [Early Radio History] to learn more.)
Contrary to popular belief, IBM would rather have their clients implement a viable archive strategy than just mindlessly buying more disk and tape for a "Keep Everything Forever" methodology. Keeping all information around forever can be a liability, as data that you store can be used against you in a court of law. It can also make it difficult to find the information that you do need, because the sheer volume of information to sort through makes the process more time consuming.
The problem with most archive storage solutions is that they are inflexible, treating all data the same under a common set of rules. The IBM Information Archive is different. You can have up to three separate "collections".
Each collection can have its own set of policies and rules. You can have a collection that is locked down
for compliance with full Non-Erasable, Non-Rewriteable (NENR) enforcement, and another collection that allows
full read/write/delete capability.
Each collection can consist of either files or objects, unlike other storage devices that force you to convert files into objects, or objects into files, for their own benefit.
IBM Information Archive is scalable enough to support up to a billion of either files or objects per collection.
Each collection can span storage tiers, even across disk and tape resources.
Object collections are accessed using IBM System Storage Archive Manager (SSAM) application programming interface (API). People who use IBM Tivoli Storage Manager (TSM) archive or IBM System Storage DR550 are already familiar with this interface. An object can represent the archived slice of a repository, a set of rows from a database, a collection of emails from an individual mailbox user, etc.
File collections can be used for any type of data you would store on a NAS device. This includes databases, email repositories, static Web pages, seismic data, user documents, spreadsheets, presentations, medical images, photos, videos, and so on.
The IBM Information Archive solution was designed to work with a variety of Enterprise Content Management (ECM) software, and is part of the overall IBM Smart Archive strategy.
I saw this as an opportunity to promote the new IBM Tivoli Storage Manager v6.1 which offers a variety of new scalability features, and continues to provide excellent economies of scale for large deployments, in my post [IBM has scalable backup solutions].
"So does TSM scale? Sure! Just add more servers. But this is not an economy of scale. Nothing gets less expensive as the capacity grows. You get a more or less linear growth of costs that is directly correlated to the growth of primary storage capacity. (Technically, it costs will jump at regular and predictable intervals, by regular and predictable and equal amounts, as you add TSM servers to the infrastructure--but on average it is a direct linear growth. Assuming you are right sized right now, if you were to double your primary storage capacity, you would double the size of the TSM infrastructure, and double your associated costs.)"
I talked about inaccurate vendor FUD in my post [The murals in restaurants], and recently, I saw StorageBod's piece, [FUDdy Waters]. So what would "economies of scale" look like? Using Scott's own words:
Without Economies of Scale
"If it costs you $5 to backup a given amount of data, it probably costs you $50 to back up 10 times that amount of data, and $500 to back up 100 times that amount of data."
With Economies of Scale
"If anybody can figure out how to get costs down to $40 for 10 times the amount of data, and $300 for 100 times the amount of data, they will have an irrefutable advantage over anybody that has not been able to leverage economies of scale."
So, let's do some simple examples. I'll focus on a backup solution just for employee workstations, where each employee has 100GB of personal data to back up on their laptop or PC. We'll look at a one-person company, a ten-person company, and a hundred-person company.
Case 1: The one-person company
Here the sole owner needs a backup solution. Here are all the steps she might perform:
Spend hours of time evaluating different backup products available, and make sure her operating system, file system and applications are supported
Spend hours shopping for external media, which could be an external USB disk drive, optical DVD drive, or tape drive, and confirm it is supported by the selected backup software.
Purchase the backup software, external drive, and if optical or tape, blank media cartridges.
Spend time learning the product, purchasing "Backup for Dummies" or a similar book, and/or taking a training class.
Install and configure the software
Operate the software, or set it up to run automatically, and take the media offsite at the end of the day, and back each morning
Case 2: The ten-person company
I guess if each of the ten employees went off and performed all of the same steps as above, there would be no economies of scale.
Fortunately, co-workers are amazingly efficient in avoiding unnecessary work.
Rather than have all ten people evaluate backup solutions, have one person do it. If everyone runs the same or similar operating system, file systems and applications, this can be done about the same as the one-person case.
Ditto on the storage media. Why should 10 people go off and evaluate their own storage media? One person can do it for all ten people in about the same time as it takes for one person.
Purchasing the software and hardware. Ok, here is where some costs may be linear, depending on your choices. Some software vendors give bulk discounts, so purchasing 10 seats of the same software could be less than 10 times the cost of one license. As for storage hardware, it might be possible to share drives and even media. Perhaps one or two storage systems can be shared by the entire team.
For a lot of backup software, most of the work is in the initial set up, then it runs automatically afterwards. That is the case for TSM. You create a "dsm.opt" file, and it can list all of the include/exclude files and other rules and policies. Once the first person sets this up, they share it with their co-workers.
Hopefully, if storage hardware was consolidated, such that you have fewer drives than people, you can probably have fewer people responsible for operations. For example, let's have the first five employees sharing one drive managed by Joe, and the second five employees sharing a second drive managed by Sally. Only two people need to spend time taking media offsite, bringing it back and so on.
Case 3: The hundred-person company
Again, it is possible that a hundred-person company consists of 10 departments of 10 people each, and they all follow the above approach independently, resulting in no economies of scale. But again, that is not likely.
Here one or a few people can invest time to evaluate backup solutions. Certainly far less than 100 times the effort for a one-person company.
Same with storage media. With 100 employees, you can now invest in a tape library with robotic automation.
Purchase of software and hardware. Again, discounts will probably apply for large deployments. Purchasing 1 tape library for all one hundred people is less than 10 times the cost and effort of 10 departments all making independent purchases.
With a hundred employees, you may have some differences in operating system, file systems and applications. Still, this might mean two to five versions of dsm.opt, and not 10 or 100 independent configurations.
Operations is where the big savings happen. TSM has "progressive incremental backup" so it only backs up changed data. Other backup schemes involve taking periodic full backups which tie up the network and consume a lot of back end resources. In head-to-head comparisons between IBM Tivoli Storage Manager and Symantec's NetBackup, IBM TSM was shown to use significantly less network LAN bandwidth, less disk storage capacity, and fewer tape cartridges than NetBackup.
The savings are even greater with data deduplication. Either using hardware, like the IBM TS7650 ProtecTIER data deduplication solution, or software like the data deduplication capability built-in with IBM TSM v6.1, you can take advantage of the fact that 100 employees might have a lot of common data between them.
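The difference between the two backup schemes is easy to see with some rough arithmetic. This is a minimal sketch in Python, not a model of TSM or any other product: the function names are hypothetical, and the 2 percent daily change rate, the weekly full backup cycle and the 28-day period are all assumptions picked for illustration. Only the 100GB per workstation comes from the example above.

```python
def progressive_incremental_gb(full_gb, daily_change, days):
    """One initial full backup, then only changed data each following day."""
    return full_gb + (days - 1) * full_gb * daily_change


def periodic_full_gb(full_gb, daily_change, days, full_every=7):
    """A full backup every `full_every` days, incrementals in between."""
    total = 0.0
    for day in range(days):
        if day % full_every == 0:
            total += full_gb                 # scheduled full backup
        else:
            total += full_gb * daily_change  # incremental backup
    return total


# One 100GB workstation over 28 days, assuming 2% of the data changes daily.
progressive = progressive_incremental_gb(100, 0.02, 28)
weekly_full = periodic_full_gb(100, 0.02, 28)
```

Under these assumptions, the progressive incremental approach sends about 154GB over the network in four weeks, while the weekly-full approach sends about 448GB. Multiply by a hundred employees and the gap in bandwidth, disk and tape adds up quickly.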
So, I have demonstrated how savings through economies of scale are achieved using IBM Tivoli Storage Manager. Adding one more person in each case is cheaper than the first person. The situation is not linear as Scott suggests. But what about larger deployments? IBM TS3500 Tape Library can hold one PB of data in only 10 square feet of data center floorspace. The IBM TS7650G gateway can manage up to 1 PB of disk, holding as much as 25 PB of backup copies. IT Analysts Tony Palmer, Brian Garrett and Lauren Whitehouse from Enterprise Strategy Group tried IBM TSM v6.1 out for themselves and wrote up a ["Lab Validation"] report. Here is an excerpt:
"Backup/recovery software that embeds data reduction technology can address all three of these factors handily. IBM TSM 6.1 now has native deduplication capabilities built into its Extended Edition (EE) as a no-cost option. After data is written to the primary disk pool, a deduplication operation can be scheduled to eliminate redundancy at the sub-file level. Data deduplication, as its name implies, identifies and eliminates redundant data.
TSM 6.1 also includes features that optimize TSM scalability and manageability to meet increasingly demanding service levels resulting from relentless data growth. The move from a proprietary back-end database to IBM DB2 improves scalability, availability, and performance without adding complexity; the DB2 database is automatically maintained and managed by TSM. IBM upgraded the monitoring and reporting capabilities to near real-time and completely redesigned the dashboard that provides visibility into the system. TSM and TSM EE include these enhanced monitoring and reporting capabilities at no cost."
The majority of Fortune 1000 customers use IBM Tivoli Storage Manager, and it is the backup software that IBM uses itself in its own huge data centers, including the cloud computing facilities. In combination with IBM Tivoli FastBack for remote office/branch office (ROBO) situations, and complemented with point-in-time and disk mirroring hardware capabilities such as IBM FlashCopy, Metro Mirror, and Global Mirror, IBM Tivoli Storage Manager can be an effective, scalable part of a complete Unified Recovery Management solution.
Continuing my drawn out coverage of IBM's big storage launch of February 9, today I'll cover the IBM System Storage TS7680 ProtecTIER data deduplication gateway for System z.
On the host side, TS7680 connects to mainframe systems running z/OS or z/VM over FICON attachment, emulating an automated tape library with 3592-J1A devices. The TS7680 includes two controllers that emulate the 3592 C06 model, with 4 FICON ports each. Each controller emulates up to 128 virtual 3592 tape drives, for a total of 256 virtual drives per TS7680 system. The mainframe sees up to 1 million virtual tape cartridges, up to 100GB raw capacity each, before compression. For z/OS, the automated library has full SMS Tape and Integrated Library Management capability that you would expect.
Inside, the two control units are both connected to a redundant pair cluster of ProtecTIER engines running the HyperFactor deduplication algorithm, which is able to process the deduplication inline, as data is ingested, rather than the post-process approach that other deduplication solutions use. These engines are similar to the TS7650 gateway machines for distributed systems.
On the back end, these ProtecTIER deduplication engines are then connected to external disk, up to 1PB. If you get 25x data deduplication ratio on your data, that would be 25PB of mainframe data stored on only 1PB of physical disk. The disk can be any disk supported by ProtecTIER over FCP protocol, not just the IBM System Storage DS8000, but also the IBM DS4000, DS5000 or IBM XIV storage system, various models of EMC and HDS, and of course the IBM SAN Volume Controller (SVC) with all of its supported disk systems.
Before acquisition, Diligent offered only software. The task of putting this software on an appropriate x86 server with sufficient memory and processor capability was left as an exercise for the storage admin. With the TS7650G, IBM installs the ProtecTIER software on the fastest servers in the industry, the IBM System x3850 M2 and x3950 M2. This eliminates having the storage admins pretend that they have hardware engineering degrees.
Before acquisition, the software worked only on a single system. IBM was able to offer multiple configurations of the TS7650G, including a single-controller model as well as a clustered dual-controller model. The clustered dual-controller model can ingest data at an impressive 900 MB/sec, which is up to nine times faster than some of the competitive deduplication offerings.
Before acquisition, ProtecTIER emulated DLT tape technology. This limited its viability, as the market share for DLT has dropped dramatically, and continues to dwindle. Most of the major backup software support DLT as an option, but going forward this may not be true much longer for new tape applications. IBM was able to extend support by adding LTO emulation on the TS7650G gateway, future-proofing this into the 21st Century.
At last week's launch, covering so many products with so few slides, this announcement was shrunken down to a single line "Store 25 TB of backups onto 1 TB of disk, in 8 hours" and perhaps a few people missed that this was actually covering two key features.
With deduplication, the TS7650G might get up to 25 times reduction on disk. If you back up a 1 TB database that changes only slightly from one day to the next, once a day for 25 days, it might only take 1 TB, or so, of disk to hold all the unique versions, as most of the blocks would be identical, rather than 25 TB on traditional disk or tape storage systems. The TS7650G can manage up to 1 PB of disk, which could represent in theory up to 25 PB of backup data.
With an ingest rate of 900 MB/sec, the TS7650G could ingest 25 TB of backups during a typical 8 hour backup window.
The 25 TB of the first may not necessarily be the 25 TB of the second, but the wording was convenient for marketing purposes, and a comma was used to ensure no misunderstandings. Of course, depending on the type of application, the frequency of daily change, and the backup software employed, your mileage may vary.
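Both halves of that one-line claim can be checked with simple arithmetic. This is a minimal sketch in Python, assuming decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and a hypothetical 1 percent daily change rate for the database; the function names are made up for illustration, and real deduplication ratios depend on the data.

```python
MB = 10**6
TB = 10**12


def ingest_tb(rate_mb_per_sec, hours):
    """Terabytes ingested at a sustained rate over a backup window."""
    return rate_mb_per_sec * MB * hours * 3600 / TB


def dedup_ratio(full_tb, daily_change, versions):
    """Logical-to-physical ratio when each daily backup adds only changed data."""
    logical = full_tb * versions
    physical = full_tb + (versions - 1) * full_tb * daily_change
    return logical / physical


window_tb = ingest_tb(900, 8)        # the 900 MB/sec, 8 hour window above
ratio = dedup_ratio(1, 0.01, 25)     # 1 TB database, assumed 1% daily change
```

At 900 MB/sec, an 8 hour window works out to 25.92 TB ingested, which is where the "in 8 hours" half comes from. With the assumed 1 percent daily change, 25 daily versions of a 1 TB database take about 1.24 TB of disk, a ratio of roughly 20 to 1; the ratio reaches the full 25 to 1 only if nothing changes at all.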
Continuing this week's theme about new products that were mentioned in last week's launch, today I will cover the new [S24 and S54 frames].
Before these new frames, customers had two choices for their tape cartridges: keep them in an automated tape library, or on an external shelf. Most of the critics of tape focus almost entirely on the problems related to the latter. When tapes are placed outside of automation, you need human intervention to find and fetch the tapes, tapes can be misplaced or misfiled, tapes can be dropped, tapes can get liquids spilled on them, and so on. These problems just don't happen when stored in automated tape libraries.
Until now, the number of cartridges was limited by the surface area of the wall accessible to the robotic picker. Whether the robot rotates in a circle picking from dodecagon walls, or back and forth from long rectangular walls, the problem was the same.
But what about tapes that may not need to be readily accessible, but still automated? With the new high density frames, you can now stack tapes several cartridges deep, on spring-loaded deep shelves that push the tape cartridges up to the front one at a time. The high-density frame design might have been inspired by the famous [Pez] candy dispenser, but at 70.9 inches, does not beat the [World's Tallest Pez Dispenser].
(Note: PEZ® is a registered trademark of Pez Candy, Inc.)
In a regular cartridge-only frame, like the D23, you have slots for 200 cartridges on the left, and 200 cartridges on the right, and the robotic picker can pull out and push back cartridges into any of these slot positions. In the new S24, there are still 200 slots on the left, now referred to as "tier 0", but up to 800 cartridges on the right. In each slot there are up to four 3592 cartridges; the position immediately reachable to the picker is referred to as "tier 1", and the ones tucked behind are "tier 2", "tier 3" and "tier 4".
<- - - S24 frame - - - >
We have fun slow-motion videos we show customers on how these work. For example, in the diagram above, let's suppose you want to fetch Tape E in the "tier 4" position. The following sequence happens:
Robotic picker pulls "tier 1" tape cartridge B, and pushes it into another shelf slot. Tapes C, D and E get pushed up to be Tiers 1, 2 and 3 now.
Robotic picker pulls "tier 1" tape cartridge C, and puts it in another shelf slot. Tapes D and E get pushed up to be Tiers 1 and 2 now.
Robotic picker pulls "tier 1" tape cartridge D, and puts it in another shelf slot. Tape E gets pushed up to be Tier 1 now.
Robotic picker pulls "tier 1" tape cartridge E, the tape we wanted, and moves it to the drive.
The other three cartridges (B, C and D) are then pulled out of their temporary slots, and pushed back in their original order.
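The fetch sequence above can be sketched as a small simulation. This is an illustrative model only: the `fetch` function, the list-based slot, and the `parked` holding list are assumptions made for the sketch, not a description of how the library's firmware actually works.

```python
def fetch(slot, wanted):
    """Fetch `wanted` from a spring-loaded deep slot.

    `slot` is a list ordered front to back, so slot[0] is tier 1.
    Returns the list of picker moves; `slot` is updated in place.
    """
    if wanted not in slot:
        raise ValueError(f"{wanted} is not in this slot")
    moves = []
    parked = []  # cartridges moved to temporary slots elsewhere
    # Pull from tier 1 until the wanted cartridge reaches the front.
    while slot[0] != wanted:
        cartridge = slot.pop(0)  # the spring pushes the rest forward
        parked.append(cartridge)
        moves.append(f"park {cartridge}")
    slot.pop(0)
    moves.append(f"mount {wanted}")
    # Push the parked cartridges back so the original order is kept:
    # the last one parked goes in first and ends up deepest.
    for cartridge in reversed(parked):
        slot.insert(0, cartridge)
        moves.append(f"return {cartridge}")
    return moves


slot = ["B", "C", "D", "E"]  # tiers 1 through 4
for move in fetch(slot, "E"):
    print(move)
```

Fetching a "tier 4" cartridge costs three extra park moves and three return moves, which is why this design suits tapes that do not need to be readily accessible.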
In this manner, the most recently referenced tape cartridges will be immediately accessible, and the ones least referenced will eventually migrate to the deeper tiers. The 3592 cartridges can be used with either TS1120 or TS1130 drives. Each cartridge can hold up to 3TB of data (1TB raw, at 3:1 compression), so the entire frame could hold 3PB in just 10 square feet of floor space. Five D23 frames could be consolidated down to two S24 frames. The S24 frame comes in "Capacity on Demand" pricing options. The base model of the S24 has just tiers 0, 1 and 2, for a total capacity of 600 cartridges. You can then later license tiers 3 and 4 when needed.
The S54 is basically similar in operation, but for LTO cartridges. It works with any mix of LTO-1, LTO-2, LTO-3 and LTO-4 cartridges. The left side holds tier 0 as before, but the right side has up to five LTO cartridges deep. For Capacity on Demand pricing, the base model supports 660 cartridges (tiers 0,1,2), with options to upgrade for the additional 660 cartridges. The total 1320 cartridges could hold up to 2.1 PB of data (at 2:1 compression). One S54 frame could replace three traditional D53 frames that held only 440 LTO cartridges each.
If you have both TS1100 series and LTO drives in your TS3500 tape library, then you can have both S24 and S54 frames side by side.
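The capacity figures for both frames can be checked with a few lines of arithmetic. This is a rough sketch: the `frame_capacity_pb` helper is a hypothetical name, decimal units are assumed (1 PB = 1,000,000 GB), and the 800 GB native capacity of an LTO-4 cartridge is an assumption that is not stated in the post.

```python
def frame_capacity_pb(cartridges, native_gb, compression):
    """Compressed capacity of a frame in petabytes (decimal units)."""
    return cartridges * native_gb * compression / 1_000_000


S24_CARTRIDGES = 200 + 200 * 4  # tier 0, plus four tiers behind 200 slots
D23_CARTRIDGES = 200 + 200      # left and right walls
S54_CARTRIDGES = 1320           # figure quoted for the full S54

s24_pb = frame_capacity_pb(S24_CARTRIDGES, 1000, 3)  # 3592: 1TB raw, 3:1
s54_pb = frame_capacity_pb(S54_CARTRIDGES, 800, 2)   # LTO-4 (assumed), 2:1

print(f"S24: {S24_CARTRIDGES} cartridges, {s24_pb:.1f} PB")
print(f"S54: {S54_CARTRIDGES} cartridges, {s54_pb:.1f} PB")
print("Five D23 frames hold", 5 * D23_CARTRIDGES, "cartridges;",
      "two S24 frames hold", 2 * S24_CARTRIDGES)
```

Under these assumptions the numbers line up with the text: 1000 cartridges and 3PB for the S24, about 2.1PB for the S54, and 2000 cartridges in either five D23 frames or two S24 frames.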
It's official! IBM System Storage TS1120 tape drive takes home the gold award, the product of the year, announced by Storage magazine.
I spent 18 hours traveling from Australia to China yesterday, and we were partially delayed due to weather, but felt that it was necessary to discuss the innovative use of encryption on this drive.
While most consider the TS1120 an "Enterprise-class" tape technology for the mainframe, it is also attachable to the smallest distributed systems running Windows, Linux, or various flavors of UNIX. Rather than limit users with an Encryption Key Manager that only ran on z/OS, IBM instead chose to implement it in Java, which can run on anything from z/OS to Linux, UNIX and Windows platforms, giving clients choice and flexibility in their deployment.
The design is quite clever and elegant. In the encryption world, there are two ways to encrypt.
Symmetric encryption is very fast, because it uses a single key for both encryption and decryption, and can be incorporated on a chip. The problem is that anyone with the key can read the sensitive data.
Asymmetric encryption is slower, but more secure, using two separate keys. The public "encryption" key takes clear data and encrypts it. Anyone can be freely given this key, as they cannot use it to decrypt any other data. The private "decryption" key is able to decrypt the data, so that one is kept secret. If two businesses plan to exchange lots of tapes, they can exchange their "encryption" keys with each other.
So, let's say that Green, Inc. wants to send a tape to Blue, Co. Blue has already provided its public "encryption" key to Green, so Green does the following:
Generate a unique data key, which we'll call the "red key"; there is one for each tape. It is a standard AES 256-bit symmetric key that can be processed with less than one percent overhead on the tape drive. All the data is encrypted with this key.
Store the red key on the tape. How does Green give Blue the red key? Green encrypts it with Blue's RSA 2048-bit public "encryption" key. This is stored in three places on the tape cartridge, one in memory, and the other two on the media itself.
Send the tape over to Blue Co.
When it arrives on the dock at Blue Co., they do the following:
Mount the tape and decrypt the "red key" using Blue's super-secret private decryption key.
Pass the "red key" to the tape drive, and have it read, append or re-write the tape.
If the super-secret private key is ever compromised, all you have to do is mount the tape, unlock the red key with the old private key, and re-lock the red key with a new public key. Since the red key doesn't change, the rest of the data can be left intact. The whole process takes less than 5 minutes, compared to Sun Microsystems' method, which could take 1-2 hours per cartridge, having to decrypt and re-encrypt the entire data stream.
This is page 34 of Sequoia Capital's [56-slide presentation] about the current financial meltdown. In the past, IT spending tracked closely to the rest of the economy, but the latest downturn has not yet been reflected in IT spend.
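The Green-to-Blue exchange and the re-keying step can be sketched in a few lines. This is a toy model under loud assumptions: the XOR keystream stands in for AES, textbook RSA over small Mersenne primes stands in for RSA 2048-bit keys, and every function name here is hypothetical. It illustrates the key-wrapping idea only and is not secure or representative of the drive's actual implementation.

```python
import hashlib
import secrets


def xor_stream(key, data):
    """Toy symmetric cipher (stand-in for AES): XOR with a SHA-256 keystream.
    Applying it twice with the same key returns the original data."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def make_keypair(p, q, e=65537):
    """Textbook RSA key pair from two primes (stand-in for RSA 2048)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)  # (public, private)


def wrap(data_key, public):
    n, e = public
    return pow(int.from_bytes(data_key, "big"), e, n)


def unwrap(wrapped, private, length=16):
    n, d = private
    return pow(wrapped, d, n).to_bytes(length, "big")


def write_tape(plaintext, recipient_public):
    red_key = secrets.token_bytes(16)  # unique data key for this tape
    return {"wrapped_key": wrap(red_key, recipient_public),
            "data": xor_stream(red_key, plaintext)}


def read_tape(tape, private):
    return xor_stream(unwrap(tape["wrapped_key"], private), tape["data"])


def rekey(tape, old_private, new_public):
    """Re-lock the red key under a new public key; the data is untouched."""
    red_key = unwrap(tape["wrapped_key"], old_private)
    tape["wrapped_key"] = wrap(red_key, new_public)


# Small Mersenne primes keep the demo fast; real keys are far larger.
blue_public, blue_private = make_keypair(2**127 - 1, 2**89 - 1)
new_public, new_private = make_keypair(2**107 - 1, 2**61 - 1)
```

Note that `rekey` only touches the small wrapped key, never the bulk data, which is why re-keying a cartridge is so much quicker than decrypting and re-encrypting the whole tape.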
The rest of the deck is worth going through, with interesting stats presented in a clear manner.
Here we are again at Top Gun class. In between class topics, we often show short video clips.
This week, we saw IBM Executive Bob Hoey's wisdom on selling mainframe computers. Bob is the VP of Sales for our System z server line, but the lessons might also apply to high-end disk or enterprise tape libraries.
On his "Data Storage - Dullness becomes Mainstream" blog, Chris Evans is amazed at how low they can go! He compares the latest 100GB Toshiba 1.8" drive designed for portable music players, to the size and weight of older technology, like the IBM 3380 Direct Access Storage Device (DASD).
Chris couldn't find the dimensions of the 3380, so I thought I would provide the missing detail. The IBM 3380 History Archives provides a nice summary:
The CJ2 model that Chris mentions was announced September 1, 1987 and shipped in 1988. Earlier models of the 3380 were announced 1980-1986.
Capacity and performance were measured in 7-bit "characters", since we were not yet storing full 8-bit bytes.
By today's standards, having such a large box to hold a few GB might seem amusing, but at the time, this unit had four times the capacity of its predecessor, the IBM 3350 DASD. Compare that with our first disk system, the IBM 350 Disk Storage Unit, introduced in 1956, that stored only 5 million characters (5MB) and was the size of two refrigerators.
The term "DASD", pronounced daz-dee, was used as some earlier devices were based on magnetic drums or strips of magnetic tape. Today, DASD is still a common term for disk systems among mainframe administrators.
The 3380 was also twice as fast as the IBM 3350, at 3 million characters per second (3 MB/sec). The irony was that the mainframe servers could not keep up, so a Speed Matching Buffer feature was invented to slow it down to half-speed, when used with certain models of mainframe.
As for the dimensions, I too had a hard time finding a publicly available resource that listed 3380 dimensions, so I searched internal IBM resources, and finally, asked someone over in the next building just to measure one of the 3380K models we still have in the Tucson test lab floor. The dimensions are ... (drumroll please)
70 inches (1778mm) tall
44 inches (1117mm) wide
32 inches (812mm) deep
The result is that the box could actually hold a much more impressive 52,500 of the new Toshiba drives, twice the original, albeit conservative, estimate. Before anyone "tries this at home", however, keep in mind that around each Toshiba drive, as with any ATA drive, you need to have all the electronics to communicate to the outside world, and provide cooling. Running tens of thousands of these little guys in the space of 60 square feet would probably melt the floor or set off your smoke alarm system.
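For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch using only the measured dimensions and the 52,500 figure above. It assumes the drives are packed with no gaps at all, which is exactly what the cooling and electronics caveat rules out in practice.

```python
# Measured outer dimensions of the 3380K, in inches.
HEIGHT_IN, WIDTH_IN, DEPTH_IN = 70, 44, 32
DRIVES = 52_500  # drive count quoted in the post

volume_in3 = HEIGHT_IN * WIDTH_IN * DEPTH_IN
volume_ft3 = volume_in3 / 12 ** 3        # 1728 cubic inches per cubic foot
in3_per_drive = volume_in3 / DRIVES      # space implied for each drive

print(f"Box volume: {volume_in3:,} cubic inches ({volume_ft3:.1f} cubic feet)")
print(f"Implied volume per drive: {in3_per_drive:.2f} cubic inches")
```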
Today, 13.5% of EMC's sales force is female, the company says, compared with 40% at International Business Machines Corp. and 29% at CA Inc., a big software vendor, those companies say. According to the 2000 U.S. census, about 25% of high-tech employees nationally were women.
IBM recognizes that diversity provides unique advantages in dealing with a global marketplace. Not only are women well represented on our IT sales force, they are also well represented on our board of directors, our Worldwide Management Committee, and our executive team overall, as well as in technical positions such as IBM Fellows, Distinguished Engineers, and members of the IBM Academy of Technology. Working Mother magazine has rated IBM one of the top 10 "Best Companies" for women to work for in each of the 18 years that it has published this list.
In 2006, 51 camps called EXITE (Exploring Interests in Technology and Engineering) were held worldwide in 33 countries. The hope is to get young girls to pursue college degrees in computer science, math and engineering, so that they can then help fill the shortage of technical resources in IT.
So, if you are a woman discouraged at your current place of employment, and are looking for exciting new opportunities in IT, come check out working for IBM!
In his blog, Paul Gillin agrees with Time Magazine's Person of the Year choice of "all of us", those of us who use the World Wide Web to do business or have fun, and to those who contribute to the internet by creating content, such as people who blog or create websites.
So, in continuing my theme this week to recap the best and worst of last year, I list my personal "tech highlights" of 2006.
Programming my Tivo Remotely.
Last September, I realized on a 3-week business trip that I had not programmed my Tivo to record the premieres of each of the new season's television shows. If you miss the first few weeks, it might be difficult to make sense of the rest of the season. Fortunately, I was able to program my Tivo remotely through the internet.
Purchasing TV shows on iTunes
Despite this, I had a repeat episode of "House" record instead of a new episode of
Still unable to make sense of what was going on in the TV show "Heroes", I was able to read the "wiki" which explained all the subtle imagery and background implied.
Using Linux to rescue lost Windows data
My disk drive failed on my laptop, and although I had most of my data backed up with Tivoli Storage Manager prior to my business trip, I had some files that I acquired or updated during the business trip. Thankfully, there are Linux "LiveCD" images that allow you to access your Windows files. You boot these LiveCD images from your CD drive, so there is no installation of Linux on the hard drive itself. If you travel as much as I do, consider bringing along some Linux CDs to get you out of trouble.
Connecting my home entertainment system to my Mac
I now have an 802.11g (54 Mbps) wireless hub which allows my Tivo to connect wirelessly to the internet to get daily updates, but also allows me to play all my music stored on my Mac through my home entertainment system, and I can also listen to thousands of radio stations through "Live365.com". My favorite station is "Depeche Mode Inspired" which plays songs by Depeche Mode, as well as cover versions by a variety of others.
Learning to Blog
Believe it or not, there is a right way and a wrong way to blog, and this year has been a good learning experience. IBM has a fairly healthy blogging policy, but nonetheless, say the wrong thing and I could be in serious trouble. Fortunately, that hasn't happened, and I am glad to see a fairly open exchange of ideas among the set of bloggers that discuss storage issues.
Building a Snowman in Second Life
I have been a member of Second Life now since November, but it wasn't until I entered a competition to build a virtual snowman last week that the potential of this new interface became obvious to me. There is still lots to learn, but at least now I see value in spending more time and effort learning more about it.
Getting an all-in-one printer/scanner to work with both my Mac and IBM PC
I didn't think it could be done, but here it is, my all-in-one Printer/Scanner works correctly, seamlessly, from both my Windows PC and my Mac Mini, and I have it on my home network so my laptop can use it also, wirelessly!
Using Google Language Tools to translate materials to Portuguese
I speak several languages, enough to order food in restaurants and to get around through various modes of transportation, but translation for a technical audience is more challenging. A class we normally conduct in pure English was taken to Sao Paulo, Brazil, and although most students know some amount of English, we thought it would be good to translate the test questions to Brazilian Portuguese. I took the questions and ran them through a number of translation services websites, and had local IBMers review the results. The winner was Google language tools, which required hardly any edits to the generated text. The class was a big success.
Digital Cameras and CD Burners
As I travelled from Brazil to Bolivia last August, I met a young back-packer who was on her way to Peru, but was staying in La Paz for a few days. We had a great time together, and I was able to transfer the digital photos from my Canon PowerShot digital camera into my laptop and burn her a CD to take with her to Peru.
Painting my Dining Room table
After Halloween, I accidentally left my pumpkin jack-o-lantern on my kitchen table as I left for a trip, and when I got back, it had decomposed and left a terrible stain on the wood surface. After sanding the table, I determined that the best course of action was just to paint the surface. I could have just painted it a solid color, or maybe a faux finish with two colors, but instead, chose to copy a famous painting, "Le Cafe" by Alberto Magnelli. I was able to scan this into my computer, resize it, and then project the image onto my table, to then outline the image and paint. I know I would not have been able to do this free-hand.
I am sure there are other triumphs I had throughout the year, but these are the first that come to mind.
This week, I was in Sydney, Australia teaching IBM Storage Portfolio Top Gun class.
Our hotel is near [Circular Quay], and our class is at the IBM Centre at St. Leonards, just six metro stops away. There are also ferry boats from Circular Quay to other parts of the city.
Here are other members of the teach team. Scott McPeek covers the IBM SmartCloud Virtual Storage Center, SAN Volume Controller and Tivoli Storage Productivity Center. Vic Peltz covers high-end disk, disk replication, and competitive issues. Here we are in front of the [Sydney Opera House].
We arrived at 4:15pm to discover they weren't open for dinner until 5:30pm. We managed to find some beverages at the bar next door. Corona beer?!?! I just travelled thousands of miles across the Pacific Ocean to be offered Mexican beer I can get locally in Tucson? I don't think so! Instead, we got some local Tasmanian brew.
Once seated, our table at Doyles was outdoors on the patio, with stunning views of the sunset. The weather was just right, cool and crisp sea air, but not windy.
I tried their Sydney Sangria which combines red wine, fruit juices and ginger beer. This had an interesting kick. If you have never tried Ginger beer, I highly recommend it! For dinner, I had the Flathead fish and chips. All of the fish at Doyles is locally sourced.
We got done with dinner just in time to catch the last ferry boat at 6:55pm! We literally were the last three to get on the boat before they pulled up the gangplank!
On Monday night, after the first day of class, our friends at [Brocade] invited us to a Pizza-and-Beer reception at the [Cabana Bar and Lounge], similar to the Brocade reception at Sale Street Bar last week in Auckland. Here I am with Katie, one of the Brocade employees hosting the event.
While at the reception, we had a terrible rain storm. I am so glad we were not on the street at that time. Some of our colleagues were not so lucky, and arrived soaking wet!
Special thanks to Tim Lees, the Brocade partner manager to IBM in ANZ, for hosting these receptions in both Auckland and Sydney!
On Tuesday, I once again presented the [Storwize family, DS3500 and DCS3700 disk systems]. Based on student feedback from last week's Auckland class, we took out some of the more technical details of each product, and added more information on the business value of each feature.
I returned safely from my trip to Tulsa, Oklahoma.
(A special shout-out to Shannon at [In The Raw] sushi restaurant, and my new friends I met at the rooftop of [the Mayo]!)
Last week I was in Auckland, New Zealand teaching Top Gun class. Top Gun teaches IBM Business Partners and sales reps how to sell our products, services, and solutions. I have been teaching Top Gun classes around the world since 1998.
(Why didn't I post sooner? Because IBM's developerWorks was getting an exciting upgrade to IBM Connections 4.0, and bloggers like me have to wait for the conversion to complete!)
While many of my trips in the USA involve traveling alone, that is not the case for Top Gun classes. Our class manager, Joe Ebidia, brought his wife Karen. Our class administrator is Hyein (Hyein is a Korean name that rhymes with rain). In addition to some local instructors, I am joined by my IBM USA colleagues Scott McPeek (Tivoli Storage) and Vic Peltz (Disk/Replication/Competitive Sales).
The rest of the teach team arrived a day or two early to adjust to jet lag. I, on the other hand, got off the plane Monday at 6am, and had a business meeting that same morning with GTS architects from Wellington.
Clockwise from left: Karen is vegetarian, and had some pasta with tomato sauce. Hyein had a lamb burger. Joe had flounder. I had salmon risotto. Yum!
(To those asking why I have only the bellies of Karen and Joe in the picture, I was focused on taking a picture of the food.)
After setting up the classroom, we took a ferry over to [Devonport], a charming seaside village just minutes across the bay from Auckland. The ferry boats were close to the Central Business District, where our [Stamford Plaza hotel] was, and they run every 30 minutes.
The four of us walked up to the top of Mt. Victoria to see the views of the city. I highly recommend this! Once you get to Devonport, you can walk along the streets to see all the cute shops, or enjoy the parks and natural beauty. I had [done this before], but it is always worth doing again!
The class is four days long. I had six presentations. Here were the first three:
Selling IBM Storwize V7000 and V7000 Unified. Scott McPeek had already covered SAN Volume Controller (SVC), so it was easy to explain the Storwize V7000. For the V7000 Unified, I went into more detail of the file-based protocols and features, paving the way for Vic's "Selling SONAS" later in the week.
Selling IBM Storwize V3700. Having covered the SVC and Storwize V7000, my presentation on the Storwize V3700 focuses more on the positioning of when to sell which product for particular workloads.
Understanding IBM's Big Four Initiatives. This was an interesting request. I was asked to cover Social, Mobile, Analytics and Cloud (what we internally call SMAC) from a storage perspective. Social included Social Media, Social Networking and Social Business. Mobile focused on IBM's Mobile First campaign. Analytics included big data, Hadoop, and our various solutions for performing analytics. Cloud included IBM's Cloud Computing Reference Architecture (CCRA), IBM SmartCloud Enterprise storage, our Backup and Archive clouds, and the new SmartCloud Storage portfolio.
I will save the rest of the week for the next post!
This Thursday, June 16, 2011, marks IBM's Centennial 100 year anniversary. It happens to also be my 25th anniversary with IBM Storage. To avoid conflicts in celebrations, we decided to celebrate my induction into the "Quarter Century Club" (QCC) last Friday instead.
My colleague Harley Puckett was master of ceremonies. Here he is presenting me with a memorial plaque and keychain. Harley mentioned a few facts about 1986, the year I started working for IBM. Ronald Reagan was the US President, gasoline cost only 93 cents per gallon, and the US National Debt was only 2 trillion US dollars!
Here are my colleagues from DFSMShsm. From left to right: Ninh Le, Henry Valenzuela, Shannon Gallaher, and Stan Kissinger. I started in 1986 as a software developer on DFHSM, and slowly worked my way up to be a lead architect of DFSMS.
Here are my colleagues from Tivoli Storage Manager (TSM). From left to right: Matt Anglin, Ken Hannigan and Mark Haye. I first met them when they worked in DFDSS, having moved from San Jose, CA down to Tucson. While I never worked on the TSM code itself, I did co-author some of the patents used in the product and other products like the 3494 Virtual Tape Server that makes use of TSM internally. I also traveled extensively to promote TSM, often with a TSM developer tagging along so they could learn the ropes about how to travel and make presentations.
Here are my colleagues from the disk team. From left to right: Joe Bacco, Carlos Pratt, Gary Albert, and Siebo Friesenborg. I worked on the SMI-S interface for the ESS 800 and DS8000 disk systems needed for the Tivoli Storage Productivity Center. Joe leads the "Disk Magic" tools team. Carlos and I worked on qualifying the various disk products to run with Linux on System z host attachment. Gary Albert is the Business Line Executive (BLE) of Enterprise Disk. Siebo Friesenborg was a disk expert on performance and disaster recovery, but is now enjoying his retirement.
Here are my colleagues from the support team. From left to right: Max Smith, Dave Reed, and Greg McBride. I used to work in Level 2 Support for DFSMS with Max and Dave, carrying a pager and managing the queue on RETAIN. We had enough people so that each Level 2 only had to carry the pager two weeks per year. On Monday afternoons, the person with the pager would give it to the next person on the rotation. On Monday, September 10, 2001, I got the pager, and the following morning, it went off to help all the many clients affected by the September 11 tragedy.
I worked with Greg McBride when he was in DFSMS System Data Mover (SDM), and then again in Tivoli Storage Productivity Center for Replication (TPC-R), and now he is supporting IBM Scale-Out Network Attached Storage (SONAS).
Standing in the light blue striped shirt is Greg Van Hise, my first office-mate and mentor when I first joined IBM. He went on to be part of the elite "DFHSM 2.4.0" prima donna team, then moved on to be an architect for Tivoli Storage Manager (TSM).
I wasn't limited to inviting just coworkers, I was also able to invite friends and family. Here are Monica, Richard, and my mother. Normally, my parents head south for the summer, but they postponed their flights so that they could participate in my QCC celebration.
From left to right: my father, Greg Tevis, and myself. It was pure coincidence that my father would wear a loud darkly patterned shirt like mine. Honestly, we did not plan this in advance. Greg Tevis and I were lead architects for the Tivoli Storage Productivity Center, and Greg is now the Technology Strategist for the Tivoli Storage product line.
Here is Jack Arnold, fellow subject matter expert who works with me here at the Tucson Executive Briefing Center, sampling the food. We had quite the spread, including egg rolls, meatballs, luncheon meats, chicken strips, and fresh vegetables.
More colleagues from the Tucson Executive Briefing Center, from left to right, Joe Hayward, Lee Olguin, and Shelly Jost. Joe was a subject matter expert on Tape when I first joined the EBC in 2007, but he has moved back to the Tape development/test team. Lee is our master "Gunny" sergeant to manage all of our briefing schedules. Shelly is our Client Support Manager, and was the one who organized all the food and preparations for this event!
Lastly, here are Brad Johns, myself, and Harley Puckett. Brad was my mentor for my years in Marketing, and has since retired from IBM and now works on his golf game. I would like to thank all of the Tucson EBC staff for pulling off such a great event, and all my coworkers, friends and family for coming out to celebrate this milestone in my career!
In addition to the plaque and keychain, Harley presented me with a book of congratulatory letters. If you would like to send a letter, it's not too late, contact Mysti Wood (email@example.com).
As I have mentioned before, I started this blog on September 1, 2006 as part of IBM's big ["50 Years of Disk Systems Innovation"] campaign. IBM introduced the first commercial disk system on September 13, 1956 and so the 50th anniversary was in 2006. That means this month, IBM celebrates the "Diamond" anniversary, 60 years of Disk Systems!
For those who missed it, IBM announced last Tuesday encryption capability for the TS1120 drive, our enterprise tape drive that reads and writes 3592 cartridges. Do you need special cartridges for this? No! Use the same ones you have already been using!
You can read more about it www.ibm.com/storage/tape."
Short and sweet, but it got me started, and I ended up writing 21 blog posts that first month. You can read blog posts from all 10 years by looking at the left panel of my blog under "Archive".
While traditional disk and tape storage are still very important and relevant in today's environment, IBM has also expanded into other technologies:
In 2012, IBM [acquired Texas Memory Systems]. In 2014, IBM shipped 62PB, more Flash capacity than any other vendor. In 2015, IBM continued its #1 status, shipping 170PB of Flash, again, more than any other vendor.
IBM has flash everywhere, from the advanced FlashSystem 900, V9000, A9000 and A9000R models, to other all-flash array and hybrid flash-and-disk systems with various sets of features and functions to meet a variety of workload requirements.
The DS8888 all-flash array, and the DS8886 and DS8884 hybrid flash-and-disk systems round out the latest in the DS8000 storage systems family. SAN Volume Controller and Storwize family of products, based on IBM Spectrum Virtualize software, also have all-flash array and hybrid configurations, the most recent being the Gen2+ models of Storwize V7000F and V5030F. The latest solution is the DeepFlash 150 models, designed for analytics and unstructured data.
Between internally-developed IBM Spectrum Scale and IBM Spectrum Archive, and IBM's [acquisition of Cleversafe], IBM is ranked #1 in Object Storage. IBM Cloud Object Storage System, IBM's new name for Cleversafe's flagship product, is available as software-only, pre-built systems, or in the IBM SoftLayer cloud.
Software-Defined Storage (SDS) with IBM Spectrum Storage
Last year, IBM re-branded its various storage software products under the "IBM Spectrum Storage" family. Earlier this year, IBM announced the new [IBM Spectrum Storage Suite license] which makes it even easier to procure, either with a perpetual software license, elastic monthly licensing, or utility license that combines some of each.
IBM is ranked #1 in Software-Defined Storage, with over 40 percent marketshare, offering solutions as Software-only, pre-built systems, and in IBM SoftLayer cloud.
It's Tuesday, and that means more IBM announcements!
I haven't even finished blogging about all the other stuff that got announced last week, and here we are with more announcements. Since IBM's big [Pulse 2010 Conference] is next week, I thought I would cover this week's announcement on Tivoli Storage Manager (TSM) v6.2 release. Here are the highlights:
Client-Side Data Deduplication
This is sometimes referred to as "source-side" deduplication, as storage admins can get confused on which servers are clients in a TSM client-server deployment. The idea is to identify duplicates at the TSM client node, before sending to the TSM server. This is done at the block level, so even files that are similar but not identical, such as slight variations from a master copy, can benefit. The dedupe process is based on a shared index across all clients, and the TSM server, so if you have a file that is similar to a file on a different node, the duplicate blocks that are identical in both would be deduplicated.
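The idea can be sketched as follows. This is a minimal illustration, assuming fixed-size blocks and SHA-256 fingerprints; the announcement does not say how TSM chunks or fingerprints data, and the `DedupServer` class and the `backup` and `restore` functions are hypothetical names, not TSM interfaces.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


class DedupServer:
    """Stand-in for the server: one block index shared by all client nodes."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block contents
        self.files = {}   # (node, filename) -> ordered list of fingerprints


def backup(server, node, filename, data, block_size=BLOCK_SIZE):
    """Back up `data` from a client node; return the bytes actually sent."""
    sent = 0
    recipe = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        # The client checks the shared index before sending anything.
        if fingerprint not in server.blocks:
            server.blocks[fingerprint] = block
            sent += len(block)
        recipe.append(fingerprint)
    server.files[(node, filename)] = recipe
    return sent


def restore(server, node, filename):
    """Rebuild a file from its list of block fingerprints."""
    return b"".join(server.blocks[fp] for fp in server.files[(node, filename)])


server = DedupServer()
print(backup(server, "nodeA", "master.doc", b"AAAABBBBCCCC"))   # all blocks new
print(backup(server, "nodeB", "variant.doc", b"AAAABBBBDDDD"))  # one block new
```

The second backup comes from a different node and a different file, yet only its one changed block travels over the network, because the index is shared across clients.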
This feature is available for both backup and archive data, and can also be useful for archives using the IBM System Storage Archive Manager (SSAM) v6.2 interface.
Simplified management of Server virtualization
TSM 6.2 improves its support of VMware guests by adding auto-discovery. Now, when you spontaneously create a new virtual machine OS guest image, you won't have to tell TSM, it will discover this automatically! TSM's legendary support of VMware Consolidated Backup (VCB) now eliminates the manual process of keeping track of guest images. TSM also added support of the vStorage API for file level backup and recovery.
While IBM is the #1 reseller of VMware, we also support other forms of server virtualization. In this release, IBM adds support for Microsoft Hyper-V, including support using Microsoft's Volume Shadow Copy Services (VSS).
Automated Client Deployment
Do you have clients at all different levels of TSM backup-archive client code deployed all over the place? TSM v6.2 can upgrade these clients up to the latest client level automatically, using push technology, from any client running v5.4 and above. This can be scheduled so that only certain clients are upgraded at a time.
Simultaneous Background Tasks
The TSM server has many background administrative tasks:
Migration of data from one storage pool to another, based on policies, such as moving backups and archives on a disk pool over to a tape pool to make room for new incoming data.
Storage pool backup, typically data on a disk pool is copied to a tape pool to be kept off-site.
Copy active data. In TSM terminology, if you have multiple backup versions, the most recent version is called the active version, and the older versions are called inactive. TSM can copy just the active versions to a separate, smaller disk pool.
In previous releases, these were done one at a time, so it could make for a long service window. With TSM v6.2, these three tasks are now run simultaneously, in parallel, so that they all get done in less time, greatly reducing the server maintenance window, and freeing up tape drives for incoming backup and archive data. Often, the same file on a disk pool is going to be processed by two or more of these scheduled tasks, so it makes sense to read it once and do all the copies and migrations at one time while the data is in buffer memory.
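The "read it once" benefit can be shown with a simplified sketch. The names here (`run_one_at_a_time`, `run_combined`, `copy_into`) are hypothetical, each task is reduced to a plain copy, and the loop is sequential rather than truly parallel, so this only illustrates the saving in reads from the disk pool, not how the TSM server schedules the work.

```python
def run_one_at_a_time(disk_pool, tasks):
    """Each task makes its own full pass over the disk pool."""
    reads = 0
    for task in tasks:
        for name, data in disk_pool.items():
            reads += 1  # the same file is read again for every task
            task(name, data)
    return reads


def run_combined(disk_pool, tasks):
    """Read each file once and hand the buffered data to every task."""
    reads = 0
    for name, data in disk_pool.items():
        reads += 1
        for task in tasks:
            task(name, data)
    return reads


def copy_into(pool):
    """Build a task that copies a file into the given target pool."""
    def task(name, data):
        pool[name] = data
    return task


disk_pool = {"file1": b"alpha", "file2": b"beta", "file3": b"gamma"}
tape_pool, offsite_pool, active_pool = {}, {}, {}
tasks = [copy_into(tape_pool), copy_into(offsite_pool), copy_into(active_pool)]

print("Reads, one task at a time:", run_one_at_a_time(disk_pool, tasks))
print("Reads, combined pass:", run_combined(disk_pool, tasks))
```

With three files and three tasks, the combined pass needs a third of the reads while producing the same contents in every target pool.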
Enhanced Security during Data Transmission
Previous releases of TSM offered secure in-flight transmission of data for Windows and AIX clients. This security uses Secure Socket Layer (SSL) with 256-bit AES encryption. With TSM v6.2, this feature is expanded to support Linux, HP-UX and Solaris.
Improved support for Enterprise Resource Planning (ERP) applications
I remember back when we used to call these TDPs (Tivoli Data Protectors). TSM for ERP allows backup of ERP applications, seamlessly integrating with database-specific tools like IBM DB2, Oracle RMAN, and SAP BR*Tools. This allows one-to-many and many-to-one configurations between SAP servers and TSM servers. In other words, you can have one SAP server back up to several TSM servers, or several SAP servers back up to a single TSM server. This is done by splitting up databases into "sub-database objects", and then processing each object separately. This can be extremely helpful if you have databases over 1TB in size. In the event that backing up an object fails and has to be re-started, it does not impact the backup of the other objects.
Seth Godin has an interesting post titled Times a Million. He recounts how many people determine the fuel savings of higher-mileage cars to be only $300-$900 per year, and that this is not enough to motivate the purchase of a more-efficient vehicle, such as a hybrid or electric car. Of course, if everyone drove more efficient vehicles, the benefits "times a million" would benefit everyone and the world's ecology.
When I discuss storage-related concepts, many executives mistakenly relate them to the one area of information technology they know best: their laptop. Let's take a look at some examples:
Information Lifecycle Management
Information Lifecycle Management (ILM) includes classifying data by business value, and then using this to determine placement, movement or deletion. If you think about the amount of time and effort to review the files on your individual laptop, and to manually select and move or delete data, versus the benefits for the individual laptop owner, you would dismiss the concept. Most administrative tasks are done manually on laptops, because automated software is either unavailable or too expensive to justify for a single owner.
In medium and large size enterprises, automated software to help classify, move and delete data makes a lot of sense. Executives who decide that ILM is not for their data center, based on their experiences with their laptop, are losing out on the "times a million" effect.
Laptops have various controls to minimize the use of battery, and these controls are equally available when plugged in. Many users don't bother turning off the features and functions they don't need when plugged in, because they feel the cost savings would only amount to pennies per day.
Times a million, energy savings do add up, and options to reduce the amount used per server, per TB of data stored, not only save millions of dollars per year, but can also postpone the need to build a new data center, or upgrade the electrical systems in your existing data center.
Backup and Disaster Recovery planning
I am not surprised how many laptops do not have adequate backup and disaster recovery plans. When executives think in terms of the time and effort to back up their data, often crudely copying key files to CD-ROM or USB key, and worrying about the management of those copies, which copies are the latest, and when those copies can be destroyed, they might reject deploying appropriate backup policies for others.
Times a million, the collected data stored on laptops could easily be half of your company's emails and intellectual property. Products like IBM Tivoli Storage Manager can manage a large number of clients with a few administrators, keeping track of how many copies to keep, and how long to keep them.
So, next time you are looking at technology or solutions for your data center, don't suffer from "Laptop Mentality". Focus instead on the data center as a whole.
My how time flies. This week marks my 24th anniversary working here at IBM. This would have escaped me completely, had I not gotten an email reminding me that it was time to get a new laptop. IBM manages these on a four-year depreciation schedule, and I received my current laptop back in June 2006, on my 20th anniversary.
When I first started at IBM, I was a developer on DFHSM for the MVS operating system, now called DFSMShsm on the z/OS operating system. We all had 3270 [dumb terminals], large cathode ray tubes affectionately known as "green screens", and all of our files were stored centrally on the mainframe. When Personal Computers (PCs) were first deployed, we were getting 120 machines, in five batches of 24 systems each, spaced out over the next two years. I was assigned the job of recommending who should get a PC during the first batch, the second batch, and so on. I was concerned that everyone would want to be part of the first batch, so I put out a survey, asking how familiar they were with personal computers, whether they owned one at home, whether they were familiar with DOS or OS/2, and so on.
It was actually my last question that helped make the decision process easy:
How soon do you want a Personal Computer to replace your existing 3270 terminal?
As late as possible
I had five options, and roughly 24 respondents checked each one, making my job extremely easy. Ironically, once the early adopters of the first batch discovered that these PCs could be used for more than just 3270 terminal emulation, many of the others wanted theirs sooner.
Back then, IBM employees resented any form of change. Many took their new PC, configured it to be a full-screen 3270 emulation screen, and continued to work much as they had before. My mentor, Jerry Pence, would print out his emails, and file the printed emails into hanging file folders in his desk credenza. He did not trust saving them on the mainframe, so he was certainly not going to trust storing them on his new PC. One employee used his PC as a door stop, claiming he would continue to use his 3270 terminal until they took it away from him.
Moving forward to 2006, I was one of the first in my building to get a ThinkPad T60. It was so new that many of the accessories were not yet available. It had Windows XP on a single-core 32-bit processor, 1GB RAM, and a huge 80GB disk drive. The built-in Gigabit Ethernet (1GbE) port went unused for a while, as we had a 16 Mbps Token Ring network.
I was the marketing strategist for IBM System Storage back then, and needed all this excess power and capacity to handle all my graphics-intensive applications, like GIMP and Second Life.
Over the past four years, I made a few slight improvements. I partitioned the hard drive to dual-boot between Windows and Linux, and created a separate partition for my data that could be accessed from either OS. I increased the memory to 2GB and replaced the disk with a drive holding 120GB capacity.
A few years ago, IBM surprised us by deciding to support Windows, Linux and Mac OS computers. But actually it made a lot of sense. IBM's world-renowned global services manages the help-desk support of over 500 other companies in addition to the 400,000 employees within IBM, so they already had to know how to handle these other operating systems. Now we can choose whichever we feel makes us more productive. Happy employees are more productive, of course. IBM's vision is that almost everything you need to do would be supported on all three OS platforms:
Access your email, calendar, to-do list and corporate databases via Lotus Notes on either Windows, Linux or Mac OS. Corporate databases store our confidential data centrally, so we don't have to have them on our local systems. We can make local replicas of specific databases for offline access, and these are encrypted on our local hard drive for added protection. Emails can link directly to specific entries in a database, so we don't have huge attachments slowing down email traffic. IBM also offers LotusLive, a public cloud offering for companies to get out of managing their own Lotus Domino email repositories.
Create presentations, documents and spreadsheets on either Windows, Linux or Mac OS. Lotus Symphony is based on open source OpenOffice and is compatible with Microsoft Office. This allows us to open and update directly in Microsoft's PPT, DOC and XLS formats.
Many of the corporate applications have now been converted to be browser-accessible. The Firefox browser is available on Windows, Linux and Mac OS. This is a huge step forward, in my opinion, as we often had to download applications just to do the simplest things like submit our time-sheet or travel expense reimbursement. I manage my blog, Facebook and Twitter all from online web-based applications.
The irony here is that the world is switching back to thin clients, with data stored centrally. The popularity of Web 2.0 helped this along. People are using Google Docs or Microsoft OfficeOnline to eliminate having to store anything locally on their machines. This vision positions IBM employees well for emerging cloud-based offerings.
Sadly, we are not quite completely off Windows. Some of our Lotus Notes databases use Windows-only APIs to access our Siebel databases. I have encountered PowerPoint presentations and Excel spreadsheets that just don't render correctly in Lotus Symphony. And finally, some of our web-based applications work only in Internet Explorer! We use the outdated IE6 corporate-wide, which is enough reason to switch over to Firefox, Chrome or Opera browsers. I have to put special tags on my blog posts to suppress YouTube and other embedded objects that aren't supported on IE6.
So, this leaves me with two options: Get a Mac and run Windows on the side as a guest operating system, or get a ThinkPad to run Windows or Windows/Linux. I've opted for the latter, and put in my order for a ThinkPad 410 with a dual-core 64-bit i5 Intel processor, VT-capable to provide hardware-assistance for virtualization, 4GB of RAM, and a huge 320GB drive. It will come installed with Windows XP as one big C: drive, so it will be up to me to re-partition it into a Windows/Linux dual-boot, and/or to run Windows and Linux as guest OS machines.
(Full disclosure to make the FTC happy: This is not an endorsement for Microsoft or against Apple products. I have an Apple Mac Mini at home, as well as Windows and Linux machines. IBM and Apple have a business relationship, and IBM manufactures technology inside some of Apple's products. I own shares of Apple stock, I have friends and family that work for Microsoft that occasionally send me Microsoft-logo items, and I work for IBM.)
I have until the end of June to receive my new laptop, re-partition, re-install all my programs, reconfigure all my settings, and transfer over my data so that I can send my old ThinkPad T60 back. IBM will probably refurbish it and send it off to a deserving child in Africa.
If you have an old PC or laptop, please consider donating it to a child, school or charity in your area. To help out a deserving child in Africa or elsewhere, consider contributing to the [One Laptop Per Child] organization.
Are you tired of hearing about Cloud Computing without having any hands-on experience? Here's your chance. IBM has recently launched its IBM Development and Test Cloud beta. This gives you a "sandbox" to play in. Here are a few steps to get started:
Generate a "key pair". There are two keys. A "public" key that will reside in the cloud, and a "private" key that you download to your personal computer. Don't lose this key.
Request an IP address. This step is optional, but I went ahead and got a static IP, so I don't have to type in long hostnames like "vm353.developer.ihost.com".
Request storage space. Again, this step is optional, but you can request a 50GB, 100GB or 200GB LUN. I picked a 200GB LUN. Note that each instance comes with some 10 to 30GB of storage already. The advantage to a storage LUN is that it is persistent, and you can mount it to different instances.
Start an "instance". An "instance" is a virtual machine, pre-installed with whatever software you chose from the "asset catalog". These are Linux images running under Red Hat Enterprise Virtualization (RHEV), which is based on Linux's Kernel-based Virtual Machine (KVM). When you start an instance, you get to decide its size (small, medium, or large), whether to use your static IP address, and where to mount your storage LUN. In the examples below, I had each instance with a static IP and mounted the storage LUN at the /media/storage subdirectory. The process takes a few minutes.
So, now that you are ready to go, what instance should you pick from the catalog? Here are three examples to get you started:
IBM WebSphere sMASH Application Builder
Base OS server to run LAMP stack
Next, I decided to try out one of the base OS images. There are a lot of books on Linux, Apache, MySQL and PHP (LAMP), which represents nearly 70 percent of the web sites on the internet. This instance lets you install all the software from scratch. Between Red Hat and Novell SUSE distributions of Linux, Red Hat is focused on being the Hypervisor of choice, and SUSE is focused on being the Guest OS of choice. Most of the images on the "asset catalog" are based on SLES 10 SP2. However, there was a base OS image of Red Hat Enterprise Linux (RHEL) 5.4, so I chose that.
To install software, you either have to find the appropriate RPM package, or download a tarball and compile from source. To try both methods out, I downloaded tarballs of Apache Web Server and PHP, and got the RPM packages for MySQL. If you just want to learn SQL, there are instances on the asset catalog with DB2 and DB2 Express-C already pre-installed. However, if you are already an expert in MySQL, or are following a tutorial or examples based on MySQL from a classroom textbook, or just want a development and test environment that matches what your company uses in production, then by all means install MySQL.
This is where my SSH client comes in handy. I am able to log in to my instance and use "wget" to fetch the appropriate files. An alternative is to use "SCP" (also part of PuTTY) to do a secure copy from your personal computer up to the instance. You will need to do everything via command line interface, including editing files, so I found this [VI cheat sheet] useful. I copied all of the tarballs and RPMs to my storage LUN ( /media/storage ) so as not to have to download them again.
Compiling and configuring them is a different matter. By default, you log in as an end user, "idcuser" (which stands for IBM Developer Cloud user). However, sometimes you need "root" level access. Use "sudo bash" to get into root level mode, and this allows you to put the files where they need to be. If you haven't done a configure/make/make install in a while, here's your chance to relive those "glory days".
In the end, I was able to confirm that Apache, MySQL and PHP were all running correctly. I wrote a simple index.php that invoked phpinfo() to show all the settings were set correctly. I rebooted the instance to ensure that all of the services started at boot time.
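As a complement to the phpinfo() page, a quick post-reboot check can be sketched in Python. This is a hedged example, not the method used above. It assumes Apache listens on its default port 80 and MySQL on its default port 3306, and that both accept TCP connections from wherever the script runs, which will not hold if MySQL is restricted to a local socket. The function names `port_open` and `check_lamp` are hypothetical.

```python
import socket
from typing import Dict


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:            # refused, unreachable, or timed out
        return False


def check_lamp(host: str) -> Dict[str, bool]:
    """Report whether the assumed default Apache and MySQL ports answer."""
    services = {"apache": 80, "mysql": 3306}   # default ports, an assumption
    return {name: port_open(host, port) for name, port in services.items()}
```

Calling `check_lamp` with the instance's static IP address after a reboot returns a dictionary showing which of the two services answered. A port check only shows that something is listening, so the phpinfo() page remains the better test that PHP itself is wired into Apache.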
Rational Application Developer over VDI
For this last example, I started an instance pre-installed with Rational Application Developer (RAD), which is a full Integrated Development Environment (IDE) for Java and J2EE applications. I used the "NX Client" to launch a virtual desktop image (VDI) which in this case was Gnome on SLES 10 SP2. You might want to increase the screen resolution on your personal computer so that the VDI does not take up the entire screen.
From this VDI, you can launch any of the programs, just as if it were your own personal computer. Launch RAD, and you get the familiar environment. I created a short Java program and launched it on the internal WebSphere Application Server test image to confirm it was working correctly.
If you are thinking, "This is too good to be true!" there is a small catch. The instances are only up and running for 7 days. After that, they go away, and you have to start up another one. This includes any files you had on the local disk drive. You have a few options to save your work:
Copy the files you want to save to your storage LUN. This storage LUN appears persistent, and continues to exist after the instance goes away.
Take an "image" of your "instance", a function provided in the IBM Development and Test Cloud. If you start a project Monday morning, work on it all week, then on Friday afternoon, take an "image". This will shut down your instance, and back up all of the files to your own personal "asset catalog" so that the next time you request an instance, you can choose that "image" as the starting point.
Another option is to request an "extension" which gives you another 7 days for that instance. You can request up to five unique instances running at the same time, so if you wanted to develop and test a multi-host application, perhaps one host that acts as the front-end web server, another host that does some kind of processing, and a third host that manages the database, this is all possible. As far as I can tell, you can do all the above from a Windows, Mac or Linux personal computer.
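To keep track of when an instance will go away, the 7-day lifetime and 7-day extension can be turned into a small date calculation. This Python sketch is only an illustration. The function names `expiry` and `days_left` and the `extensions` parameter are hypothetical, the dates are made up, and the post does not say how many extensions can be requested.

```python
from datetime import datetime, timedelta

INSTANCE_LIFETIME = timedelta(days=7)   # instances run for 7 days
EXTENSION = timedelta(days=7)           # each extension adds another 7 days


def expiry(started: datetime, extensions: int = 0) -> datetime:
    """Return the moment an instance started at `started` goes away."""
    return started + INSTANCE_LIFETIME + extensions * EXTENSION


def days_left(started: datetime, now: datetime, extensions: int = 0) -> float:
    """Return the remaining lifetime in days (negative once expired)."""
    return (expiry(started, extensions) - now) / timedelta(days=1)


# Example: started on a Monday morning, checked on the Friday afternoon.
started = datetime(2010, 6, 7, 9, 0)
friday = datetime(2010, 6, 11, 17, 0)
remaining = days_left(started, friday)
```

With these example dates, fewer than three days remain on Friday afternoon, which is a good time to copy files to the storage LUN, take an image, or request an extension.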
Getting hands-on access to Cloud Computing really helps to understand this technology!
This is a reasonable question. Since Invista 2.0 came out months ago in August, and Invista 2.1 is rumored to be out by end of this month, why put out a press release now, rather than just wait a few weeks? The significant part of this announcement was that EMC finally has their first customer reference. To be fair, getting a customer to agree to be a reference is difficult for any vendor. Some non-profits and government agencies have rules against it, and some corporations just don't want to be bothered by journalists, or take phone calls from other prospective customers. I suspect EMC wanted to put the good folks from Purdue University in front of the cameras and microphones before they:
In Moore's terminology, Purdue University would be a "technology enthusiast", interested in exploring the technology of the EMC Invista. Universities by their very nature often see themselves as early adopters, willing to take big risks in hopes of reaping big rewards. The chasm happens later, when there are a lot of early adopters, all willing to be reference accounts. The mainstream market, shown here as pragmatists, conservatives, and skeptics, is unwilling to accept reference claims from early adopters, searching instead for moderate gains from minimal risks. They prefer references from customers that are similar in size and industry. Whether a vendor can get a product to cross this chasm is the focus of the book.
Why "SAN" virtualization?
Technically, Invista is "storage" virtualization, not "SAN" virtualization. Virtualization is any technology that makes one set of resources look and feel like a different set of resources, preferably with more desirable characteristics. You can virtualize servers, SANs, and storage resources.
Virtual SAN (VSAN) technology, supported by the Cisco MDS 9500 Series Multilayer Director Switch, partitions a single physical SAN into multiple VSANs, allowing different business functions and requirements to share a common physical infrastructure.
How does Invista advance Cisco's VSAN functionality? It doesn't, but that doesn't make the title a falsehood, or the press release by association full of lies. If you read the entire press release, EMC correctly states that Invista is "storage" virtualization. Some storage virtualization products, like EMC Invista and IBM System Storage SAN Volume Controller (SVC), require a SAN as a platform on which to perform their magic. Marketing people might use the term "SAN" to refer not just to the network gear that provides the plumbing, but also to the storage devices that are attached to the SAN. In that light, the use of "SAN virtualization" can be understood in the title.
More importantly, it appears that EMC no longer requires that you purchase new SAN equipment from them with Invista. When the Invista first came out, it cost over a quarter-million US dollars to cover the cost of the intelligent switches, but with the price drop to $100K, I imagine this means they assume everyone has an appropriately-supported intelligent switch already deployed.
Why this architecture?
In his post [Storage Virtualization and Invista 2.0], EMC blogger ChuckH does a fair job explaining why EMC went in this direction for Invista, and how it is different than other storage virtualization products.
Most storage virtualization products are cache-based. The world's first disk storage virtualization product, the IBM 3850 Mass Storage System, introduced in 1974, and the first tape virtualization product, the IBM 3494 Virtual Tape Server, introduced in 1997, both used disk cache in front of tape storage. Later virtualization products, like IBM SVC and HDS USP-V, use DRAM memory cache in front of disk storage, but the concept is the same. People are comfortable with cache-based solutions, because the technology is mature and well proven in the marketplace, and excited and delighted that these can offer the following features in a mixed heterogeneous disk environment:
instantaneous point-in-time copy
None of these features are provided by Invista, as there is no cache in the switch. Instead, Invista is a "packet cracker"; it cracks open each FCP packet, inspects and modifies the contents, then passes the FCP packet along to the appropriate storage device. This process slows down each read and write by some amount, perhaps 20 microseconds. The disadvantage of slowing down every read and write is offset by having other benefits, like non-disruptive data migration.
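To put "perhaps 20 microseconds" in perspective, the added delay can be compared with the time an I/O takes without it. The Python sketch below is simple arithmetic, not a measurement. The function name is hypothetical, and the two baseline latencies (5,000 microseconds for a read served from spinning disk, 200 microseconds for a read served from a disk system's cache) are illustrative assumptions, not figures from EMC or IBM.

```python
def added_latency_fraction(base_latency_us: float, added_us: float = 20.0) -> float:
    """Return the added delay as a fraction of the baseline I/O latency."""
    if base_latency_us <= 0:
        raise ValueError("baseline latency must be positive")
    return added_us / base_latency_us


# Assumed baselines, for illustration only.
disk_read_us = 5000.0    # read that has to go to spinning disk
cache_hit_us = 200.0     # read answered from the disk system's cache

disk_overhead = added_latency_fraction(disk_read_us)    # 20 / 5000
cache_overhead = added_latency_fraction(cache_hit_us)   # 20 / 200
```

Under these assumptions the extra 20 microseconds is a fraction of a percent of a disk read but about a tenth of a cache hit, so the impact depends heavily on how fast the underlying storage already is.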
To compensate for Invista's inability to provide these features, EMC offers a second solution called EMC RecoverPoint, which is an in-band cache-based appliance similar in design to SVC, but maps all virtual disks one-to-one to physical disks. It offers remote distance asynchronous mirroring between heterogeneous devices. EMC supports RecoverPoint in front of Invista, but if you are considering buying both to get the combined set of features, you might as well buy an IBM SVC or HDS USP-V instead, in one system, rather than two, which is much less complicated. IBM SVC and HDS USP-V have both "crossed the chasm" having sold thousands of units to every type and size of customer.
Hopefully, this answers the questions you might have about EMC Invista.