Comments (6) Visits (13271)
The Storage Architect writes in his post:
Array-based replication does have drawbacks; all externalised storage becomes dependent on the virtualising array. This makes replacement potentially complex. To date, HDS have not provided tools to seamlessly migrate away from one USP to another (as far as I am aware). In addition, there's the problem of "all your eggs in one basket"; any issue with the array (e.g. physical intervention like fire, loss of power, microcode bug etc) could result in loss of access to all of your data. Consider the upgrade scenario of moving to a higher level of code; if all data was virtualised through one array, you would want to be darn sure that both the upgrade process and the new code are going to work seamlessly...
I would argue that the IBM System Storage SAN Volume Controller (SVC) is more like the HDS USP, and less like the Invista. Both SVC and USP provide a common look and feel to the application server, both provide additional cache to external disk, both are able to provide a consistent set of copy services.
IBM designed the SVC so that upgrades can occur non-disruptively. You can replace the hardware nodes, one node at a time, while the SVC system is up and running, without disruption to reading and writing data on virtual disk. You can upgrade the software, one node at a time, while the SVC system is up and running, without disruption to reading and writing data on virtual disk. You can upgrade the firmware on the managed disk arrays behind the SVC, again, without disruption to reading and writing data on virtual disk.
More importantly, SVC has the ultimate "un-do" feature. It is called "image mode". If for any reason you want to take a virtual disk out of SVC management, you migrate over to an "image mode" LUN, and then disconnect it from SVC. The "image mode" LUN can then be used directly, with all the file system data in tact.
I define "virtualization" as technology that makes one set of resources look and feel like a different set of resources with more desirable characteristics. For SVC, the more desirable characteristics include choice of multi-pathing driver, consistent copy services, improved performance, etc. For EMC Invista, the question is "more desirable for whom?" EMC Invista seems more designed to meet EMC's needs, not its customers. EMC profits greatly from its EMC PowerPath multi-pathing driver, and from its SRDF copy services, so it appears to have designed a virtualization offering that:
A post from Dan over at Architectures of Control explains the anti-social nature of public benches. City planners, in an effort to discourage homeless people from sleeping on benches in parks or sidewalks, design benches that are so uncomfortableto use, that nobody uses them. These included benches made of metal that are too hot or too cold during certainmonths, benches slanted at an angle that dump you on the ground if you lay down, or benches that have dividers sothat you must be in an upright seated position to use.
This is not a disparagement of split-path switch-based designs. Rather, EMC's specific implementation appears to be designed for it to continuevendor lock-in for its multi-pathing driver, continuevendor lock-in for its copy services when used with EMC disk, and only provide slightly improved data migration capability for heterogeneous storage environments. Other switch-based solutions, such as those from Incipient or StoreAge, had different goals in mind.
Sadly, my IBM colleague BarryW and I have probably spent more words discussing Invista than all eleven EMC bloggers combined this year. While everyone in the industry is impressed how often EMC can sell "me, too" products with an incredibly large marketing budget, EMC appears not to have set aside funds for the Invista.
If a customer could design the ideal "storage virtualization" solution that would provide them the characteristics they desire the most from storage resources, it would not be anything like an Invista. While there are pros and cons between IBM's SVC and HDS's TagmaStore offerings, the reason both IBM and HDS are the market leaders in storage virtualization is because both companies are trying to provide value to the customer, just in different ways, and with different implementations.
Last week, Paul Weinberg of eChannelLine.com asks Is this the year of the SAN (again)?So, I thought this week I would cover my thoughts and opinions on storage networking. We oftenfocus on servers or storage devices, and forget that the network in between is an entire worldon itself.
I believe Mr. Weinberg is basing this on the idea that in 2007, over 50 percent of disk will beattached over SAN, edging out the alternative: Direct Attached Storage (DAS). But perhaps 50 percentis the wrong number to focus on. In 2007, The United Nations estimates thatcities will surpass rural areas, with just over 50 percent of theworld's population. Does that make this the "Year of the City"? Of course not.
Instead, I prefer to use the methodology that Malcolm Gladwell uses in his book, The Tipping Point.(I have read this book and highly recommend it!)Gladwell indicates that the tipping point happens at the start of the epidemic, not when it is half over.Isn't it better to celebrate the sweet 16 debutante ball when young ladies have completed their years of training and preparation, and are ready to be introduced to the rest of the world, rather than after they are thirty-something, married with children.
Let's explore some of the history. Stuart Kendric has a nice 7-page summary on theHistory & Plumbing of SANs.
IBM announced the first SAN technology calledEnterprise Systems Connection (ESCON) way back in September 1990. This allowed multiplemainframe servers to connect to multiple storage systems over equipment called "ESCON Directors" that directedtraffic from point A to point B. Before this, mainframes sent "ChannelCommand Words" or CCWs, across parallel "bus and tag" copper cables. ESCON was serial overfiber optic wiring. SANs solved two problems: first, it reduced the "rat's nest" of cables between many serversand many storage systems, and second, it extended the distance between server and storage device.
For distributed systems running UNIX or Windows, the CCW-equivalent over parallel cables was called Small ComputerSystem Interface (SCSI). The SCSI command had over 1000 command words, so for its Advanced Technology (AT) personal computers (PC AT), IBM introduced a subset of SCSI commands called ATA (Advanced Technology Attachment). ATA drives supportedfewer commands, ran at slower speeds, and were manufactured with a less rigorous process. Today ATA drives are about 55 percent the cost per MB as comparable SCSI drives.
Anyone who has ever opened their PC and found flat ribbon cable with eight or sixteen wires in parallel, can understand that the same issues applied externally. Parallel technologies arelimited to distance and speed, as all the bits have to arrive at the end of the wire at approximately thesame time. Direct attach schemes with every server attaches directly to every storage device were also problematic.Imagine 100 servers connected to 100 storage devices, that would be 10,000 wires!
So, a new technology standard was developed, called Fibre Channel, ratified in 1994.The spelling of "Fibre" was intentionally made different than "Fiber" on purpose. "Fibre" is a protocol thatcan travel over copper or glass wires. "Fiber" represents the glass wiring itself.
Fibre Channel is amazingly versatile. For today's Linux, UNIX and Windows servers, it can carry SCSI commands, and the combination of SCSI over FC is called Fibre Channel Protocol (FCP). For the mainframe servers, it can carry CCW commands. Running CCW over Fibre Channel is called FICON. This convergence allows mainframes and distributed systems to share a common Fibre Channel network, using the same set of switches and directors.
We saw the use of SANs explode in the marketplace over the past 10 years, and then cool down with a series of mergers and acquisitions. Last year, Brocade announced it was acquiring rival McData, so we will be down to two major players, Cisco and Brocade.
So, IMHO, I think we are well past the "Year of the SAN".
Well, it's Tuesday again, and we have more IBM announcements.
With the holiday season coming up at the end of the year, now is a great time to ask Santa for a new shiny pair of XIV systems, and some extra networking gear to connect them.
Comments (6) Visits (12738)
Storage Networking World conference is over, and the buzz from the analysts appears to be focused onXiotech's low-cost RAID brick (LCRB) called Intelligent Storage Element, or ISE.
(Full disclosure: I work for IBM, not Xiotech, in case there weren't enough IBM references on this blog page to remindyou of that. I am writing this piece entirely from publicly available sources of information, and notfrom any internal working relationships between IBM and Xiotech. Xiotech is a member of the IBM BladeCenteralliance and our two companies collaborate together in that regard.)
Fellow blogger Jon Toigo in his DrunkenData blog posted [I’m Humming “ISE ISE Baby” this Week] and then a follow-up post[ISE Launches]. I looked up Xiotech's SPC-1benchmark numbers for the Emprise 5000 with both 73GB and 146GB drives, and at 8,202 IOPS per TB, does not seem to be as fast as IBM SAN VolumeControllers 11,354 IOPS per TB. Xiotech offers an impressive 5 year warranty (by comparison, IBM offers up to 4 years, and EMC I think is stillonly 90 days).Jon also wrote a review in [Enterprise Systems]that goes into more detail about the ISE.
Fellow blogger Robin Harris in his StorageMojo blog posted [SNW update - Xiotech’s ISE and the dilithium solution], feeling that Xiotech should win the "Best Announcement at SNW" prize. He points to the cool video on the[Xiotech website]. In that video, they claim 91,000 IOPS.Given that it took forty(40) 73GB drives (or 4 datapacs) in the previous example to get 8,202 IOPS for 1TB usable, I am guessing the 91,000 IOPS is probably 44 datapacs (440 drives) glommed together, representing 11TB usable.The ISE design appears very similar to the "data modules" used in IBM's XIV Nextra system.
Fellow blogger Mark Twomey from EMC in his StorageZilla blog posted[Xiotech: Industry second]correctly points out that Xiotech's 520-byte block (512 bytes plus extra for added integrity) was not the firstin the industry. Mark explains that EMC CLARiiON had this since the early 1990's, and implies in the title that this must have been the first in the industry, making Xiotech an industry second. Sorry Mark, both EMC and Xiotech were late to the game. IBM had been using 520-byte blocksize on its disk since 1980 with the System/38. This system morphed to the AS/400, and the blocksize was bumped up to 522 bytes in 1990, and is now called the System i, where the blocksize was bumped up yet again to 528 bytes in 2007.
While IBM was clever to do this, it actually means fewer choices for our System i clients, being only able to chooseexternal disk systems that explicitly support these non-standard blocksize values, such as the IBM System Storage DS8000and DS6000 series. (Yes, BarryB, IBM still sells the DS6000!) The DS6000 was specifically designed with the System i and smaller System z mainframes in mind, and in that niche does very well. Fortunately, as I mentioned in my February post [Getting off the island - the new i5/OS V6R1], IBM has now used virtualization, in the form of the VIOS logical partition, to allow i5/OS systems to attach to standard 512-byte block devices, greatly expanding the storage choices for our clients.
(Side note: SNW happens twice per year, so the challenge is having something new and fresh to talk about each time. While Andy Monshaw, General Manager of IBM System Storage, highlighted some of the many emerging technologies in his keynote address, IBM shipped on many of them prior to his last appearance in October 2007: thin provisioning in the IBM System Storage N series, deduplication in the IBM System Storage N series Advanced Single Instance Storage (A-SIS) feature, and Solid State Disk (SSD) drives in the IBM BladeCenter HS21-XM models. Of course, not everyone buys IBM gear the first day it is available, and IBM is not the only vendor to offer these technologies. My point is that for many people, these are still not yet deployed in their own data center, and so they are still in the future for them. However, since these IBM deliveries happened more than six months ago, they're old news in the eyes of the SNW attendees. While those who follow IBM closely would know that, others like[Britney Spears] may not.)
Back in the 1990s, when IBM was developing the IBM SAN Volume Controller (SVC), we generically called the managed disk arrays that were being virtualized by the SVC as "low-cost RAID brick" or LCRB. The IBM DS3400 is a good example of this. However, as we learned, SVC is not just for LCRB, it adds value in front of all kinds of disk systems, including the not-so-low-cost EMC DMX and IBM DS8000 disk systems. ISE might make a reasonable back-end managed disk device for IBM SVC to virtualize. This gives you the new cool features of Xiotech's ISE, with IBM SVC's faster performance, more robust functionality and advanced copy services.
Next week, I'll be in South America in meetings with IBM Business Partners and storage sales reps.
technorati tags: SNW, LCRB, Xiotech, ISE, IBM, BladeCenter, Jon Toigo, DrunkenData, Robin Harris, StorageMojo, SPC, SPC-1, SPC-2, Emprise, SAN Volume Controller, SVC, XIV, Nextra, Mark Twomey, StorageZilla, EMC, CLARiiON, System/38, AS/400, System i, i5/OS, V6R1, VIOS, Andy Monshaw, thin provisioning, N series, deduplication, de-dupe, A-SIS, SSD, HS21 XM, BarryB, Britney Spears, DMX, DS3400[Read More]
Steve Rubel has an interesting blog on Wikipedia: Wikipedia Is More Popular Than...
When I was a kid, we didn't have online access to anything. Either yourparents were rich and generous and bought you the latest set of encyclopedias, or they were poor or cheap, and you hoofed it to thenearest library.
Now, I rely heavily on Wikipedia, and other wikis, to find information I need.The key here is the ability to find stuff. With the old 27-volume set ofencyclopedias, you had to know what word something would be filed under, andhow to spell it, so that you could find it. Today's search facilities are much moreforgiving. If you guess wrong, you are only a few clicks away from what youwere really looking for, in a Kevin Bacon six- Wikipedia is now looked at more often than CNN.com or the New York Times website.Why? It is amazingly good at summarizing a situation in succinct terms, even fornews "as it happens". The recent episode at Heathrow airport a few weeks agoserves as a good example. I was in Washington DC that week, on my way to Miami and Sao Paulo,Brazil, so it is good to have the news I needed, when I needed it.[Read More]
Wikipedia is now looked at more often than CNN.com or the New York Times website.Why? It is amazingly good at summarizing a situation in succinct terms, even fornews "as it happens". The recent episode at Heathrow airport a few weeks agoserves as a good example. I was in Washington DC that week, on my way to Miami and Sao Paulo,Brazil, so it is good to have the news I needed, when I needed it.[Read More]
Comments (5) Visits (16780)
Perhaps I wrapped up my exploration of disk system performance one day too early. (While it is Friday here in Malaysia, it is still only Thursday back home)
Barry Burke, EMC blogger (aka The Storage Anarchist) writes:
Aren't you mixing metrics here?
This is a fair question, Barry, so I will try to address it here.
It was not a typo, I did mean MPG (miles per gallon) and not MPH (miles per hour). It is always challenging to find an analogy that everyone can relate to explain concepts in Information Technology that might be harder to grasp. I chose MPG because it was closely related to IOPS and MB/s in four ways:
It seemed that if I was going to explain why standardized benchmarks were relevant, I should find an analogy that has similar features to compare to. I thought about MPH, since it is based on time units like IOPS and MB/s, butdecided against it based on an earlier comment you made, Barry, about NASCAR:
Let's imagine that a Dodge Charger wins the overwhelming majority of NASCAR races. Would that prove that a stock Charger is the best car for driving to work, or for a cross-country trip?
Your comparison, Barry, to car-racing brings up three reasons why I felt MPH is a bad metric to use for an analogy:
You also mention, Barry, the term "efficiency" but mileage is about "fuel economy".Wikipedia is quick to point out that the fuel efficiency of petroleum engines has improved markedly in recent decades, this does not necessarily translate into fuel economy of cars. The same can be said about the performance of internal bandwidth ofthe backplane between controllers and faster HDD does not necessarily translate to external performance of the disk system as a whole. You correctly point this out in your blog about the DMX-4:
Complementing the 4Gb FC and FICON front-end support added to the DMX-3 at the end of 2006, the new 4Gb back-end allows the DMX-4 to support the latest in 4Gb FC disk drives.
This also explains why the IBM DS8000, with its clever "Adaptive Replacement Cache" algorithm, has such highSPC-1 benchmarks despite the fact that it still uses 2Gbps drives inside. Given that it doesn't matter between2Gbps and 4Gbps on the back-end, why would it matter which vendor came first, second or third, and why call it a "distant 3rd" for IBM? How soon would IBM need to announce similar back-end support for it to be a "close 3rd" in your mind?
I'll wrap up with you're excellent comment that Watts per GB is a typical "green" metric. I strongly support the whole"green initiative" and I used "Watts per GB" last month to explain about how tape is less energy-consumptive than paper.I see on your blog you have used it yourself here:
The DMX-3 requires less Watts/GB in an apples-to-apples comparison of capacity and ports against both the USP and the DS8000, using the same exact disk drives
It is not clear if "requires less" means "slightly less" or "substantially less" in this context, and have no facts from my own folks within IBM to confirm or deny it. Given that tape is orders of magnitude less energy-consumptive than anything EMC manufacturers today, the point is probably moot.
I find it refreshing, nonetheless, to have agreed-upon "energy consumption" metrics to make such apples-to-apples comparisons between products from different storage vendors. This is exactly what customers want to do with performance as well, without necessarily having to run their own benchmarks or work with specific storage vendors. Of course, Watts/GB consumption varies by workload, so to make such comparisons truly apples-to-apples, you would need to run the same workload against both systems. Why not use the SPC-1 or SPC-2 benchmarks to measure the Watts/GB consumption? That way, EMC can publish the DMX performance numbers at the same time as the energy consumption numbers, and then HDS can follow suit for its USP-V.
I'm on my way back to the USA soon, but wanted to post this now so I can relax on the plane.
technorati tags: IBM, EMC, Storage Anarchist, MPG, MPH, IOPS, NASCAR, Malaysia, Watts, GB, green, back-end, DMX-3, DMX-4, HDS, USP, USP-V, SPC, SPC-1, SPC-2, standardized, benchmarks, workload, DS8000, disk, storage, tape[Read More]
Stephen Colbert, of The Colbert Report, explains the name changes in recent mergers of the Telecommunications industry. A discussion on "changing names" and how that impacts storage seems like a good way to wrap up the week's theme on naming conventions.
Name changes are sometimes painful, but often times done for a purpose, such as to promote a family. In the US, when a man and woman marries, the woman often changes her family name to match her husband, and the kids all adopt the father's family name. I say "often" because there are times where the woman keeps her name, or adds to it in a hyphenated way. ABC News reported that a Man Fights to Take Wife's Name in Marriage. KipEsquire, a lawyer, writes about it in his blogA stitch in haste.
IT industry changes the names of products that people knew as something else. Other times, they re-use an existing name, when really it is or should be different from the original. Last year, I took on the job of helping transition from our brand "TotalStorage" to the "System Storage" product line under the new "IBM Systems" brand. I help decide what stays the same name or what changes, when it should change, and how to announce that change.
On the disk side, IBM renamed Fibre Array Storage Technology, or FAStT, which was pronounced exactly like "fast", to DS4000 series. This was a big improvement, as people couldn't seem to spell it properly, with variations like "FastT". Nor could people pronounce it properly, saying "fast-tee" instead. The advantage of "DS" is that it is both easy to spell, and easy to pronounce. The DS4000 series continues to be "fast", providing excellent performance for its midrange price category.
IBM's Enterprise Storage Server (ESS) line went from model E10, to F20, to 750 and 800. When IBM came out with its replacement, the IBM TotalStorage DS8000, some people asked why it wasn't named the ESS 900, for example. The DS8000 is quite different internally, new hardware design and implementation, but is highly compatible with the ESS line, and shares much of the same functionality from microcode. Last year, it was replaced by the IBM System Storage DS8000 Turbo. Again, newer hardware, so it was easy to justify the new name change from "TotalStorage" to "System Storage".
Renaming a product risks losing its certifications and awards. For example, IBM spent a lot of time and money getting the OS/390 operating system certified as a "UNIX" platform. When it was renamed to z/OS, IBM had to do it all over again. Learning from this experience, IBM decided not to rename the SAN Volume Controllerto a new designation like "DS5750", as it enjoys the "number one" spot on both the SPC-1 and SPC-2 performance benchmarks, and is recognized as the leader in the disk storage virtualization marketplace. Renaming this product would mean losing that collateral.
IBM's "other disk systems" the N series posed another set of challenges. The current DS line already has entry-level (DS3000), midrange (DS4000) and enterprise-class (DS6000 and DS8000) products. The OEM agreement that IBM has with Network Appliance (NetApp) resulted in a new set of entry-level, midrange, and enterprise-class products. But these didn't fit nicely into the DS3000-to-DS8000 continuum. Instead, IBM decided to go with N series, using N3000 for entry-level, N5000 for midrange, and N7000 for enterprise-class. These are different than the numbers used by NetApp for their comparable, but not identical, offerings.
On the tape side, IBM decided to name the tape drives TS1000 and TS2000 range, tape libraries and automation with a TS3000 range, and tape virtualization to the TS7000 range. A lot of tape products already had 3000 numbering that had to change to fit this new scheme. This is why IBM's popular 3592 tape drive was renamed to the TS1120. The replacement to the 3494 Virtual Tape Server was named TS7700 Virtualization Engine.
Obviously, you can't change the names of products that are currently in the field, but what about existing software with minor updates? IBM decided to leave "TotalStorage Produtivity Center" under the "TotalStorage" brand until it has a significant version upgrade. Many people say "TPC" as a convenient acronym when referring to this product, but TPC is a registered trademark of the Professional Golfers Association (PGA) to refer to its "Tournament Players Club".
How can anyone confuse "managing storage" with "playing golf"? One activity is full of frustration that takes years or decades to master, involving the need to understand a variety of equipment and techniques to use each properly to accomplish your goals; and the other is an enjoyable activity, immediately productive in front of a single pane of glass managing all of your DAS, SAN and NAS storage, from reporting on your files and databases to managing storage networks and tape libraries.
Enjoy the weekend!
technorati tags: Stephen Colbert, Colbert Report, Telecommunications industry, KipEsquire, IBM, FAStT, DS4000, DS3000, DS8000, OS/390, UNIX, z/OS, SAN Volume Controller, N series, TS1120, TS7700, TotalStorage Productivity Center, TPC, PGA, Golf[Read More]
This week and next I am touring Asia, meeting with IBM Business Partners and sales repsabout our July 10 announcements.
Clark Hodge might want to figure out where I am, given the nuclearreactor shutdowns from an earthquake in Japan. His theory is that you can follow my whereabouts just by following the news of major power outages throughout the world.
So I thought this would be a good week to cover the topic of Business Continuity, which includes disaster recovery planning. When making Business Continuity plans, I find it best to work backwards. Think of the scenarios that wouldrequire such recovery actions to take place, then figure out what you need to have at hand to perform the recovery, and then work out the tasks and processes to make sure those things are created and available when and where needed.
I will use my IBM Thinkpad T60 as an example of how this works. Last week, I was among several speakers making presentations to an audience in Denver, and this involved carrying my laptop from the back of the room, up to the front of the room, several times. When I got my new T60 laptop a year ago, it specifically stated NOT to carry the laptop while the disk drive was spinning, to avoid vibrations and gyroscopic effects. It suggested always putting the laptop in standby, hibernate or shutdown mode, prior to transportation, but I haven't gotten yet in the habit of doing this. After enough trips back and forth, I had somehow corrupted my C: drive. It wasn't a complete corruption, I could still use Microsoft PowerPoint to show my slides, but other things failed, sometimes the fatal BSOD and other times less drastically. Perhaps the biggest annoyance was that I lost a few critical DLL files needed for my VPN software to connect to IBM networks, so I was unable to download or access e-mail or files inside IBM's firewall.
Fortunately, I had planned for this scenario, and was able to recover my laptop myself, which is important when you are on the road and your help desk is thousands of miles away. (In theory, I am now thousands of miles closer to our help desk folks in India and China, but perhaps further away from those in Brazil.) Not being able to respond to e-mail for two days was one thing, but no access for two weeks would have been a disaster! The good news: My system was up and running before leaving for the trip I am on now to Asia.
Following my three-step process, here's how this looks:
technorati tags: IBM, July, announcements, earthquake, Japan, nuclear reactor, power, outage, business, continuity, disaster, recovery, plan, plans, planning, IBM, Thinkpad, T60, laptop, Windows, Denver, BSOD, VPN, India, China, Brazil, help desk, Asia, Tivoli, Storage, Manager, TSM, BMR, external, USB, bootable, CD, DVD, separating, programs, data, Clark Hodge[Read More]
Comments (5) Visits (11037)
Continuing my quest to "set the record straight" about [IBM XIV Storage System] and IBM's other products, I find myself amused at some of the FUD out there. Some are almost as absurd as the following analogy:
The conclusion we are led to believe is that hiring Mr. Jones, a human being, is as risky as puttinga banana peel down on the sidewalk. Some bloggers argue that they are merely making a series of factual observations,and letting their readers form their own conclusions. For example, the IBM XIV storage system has ECC-protected mirrored cache writes. Some false claims about this were [properly retracted]using
While it is possible to compare bananas and humans on a variety of metrics--weight, height, and dare I say it,caloric value--it misses the finer differences of what makes them different. Humans might share 98 percent withchimpanzees, but having an opposable thumb allows humans to do things that
Full Disclosure: I am neither vegetarian nor cannibal, and harbor no ill will toward bananas nor chimpanzees.No bananas or chimpanzees were harmed in the writing of this blog post. Any similarity between the fictitiousMr. Jones in the above analogy and actual persons, living or dead, is purely coincidental.
So let's take a look at some of IBM XIV Storage System's "opposable thumbs".
Fellow blogger from EMC Mark Twomey on his StorageZilla blog, posted about [Steinhardt's Rule of Customer Beliefs] with his own Twomey Corollary. Here is an excerpt:
In priority order, customers believe:
In the case of IBM XIV Storage System, it is not clear whether
That said, feel free to comment below on which of these you think the last two points of Steinhardt's rule istrying to capture. Certainly, I can't argue with the top two: a customer's own experience and the experiencesof other customers, which I mentioned previously in my post[Deceptively Delicious].
technorati tags: IBM, XIV, storage, system, banana, ECC, protected, mirrored, cache, writes, RAID, SnapShot, consistent performance, thin provisioning, Mark Twomey, StorageZilla, Steinhardt, NaviSite, customer reference[Read More]
My father's favorite question is "What's the worst that could happen?" He is retired now, but workedat the famous [Kitt Peak National Observatory] designing some of the largesttelescopes. Designing telescopes followed well-established mechanical engineering best practices, but each design was unique,so there was always a chance that the end result would not deliver the expected results. What's the worst that can happen? For telescopes, a few billion dollars are wasted and a few years are added to the schedule. Scrap it and start over. Nothing unrecoverable for the US government with unlimited resources and patience.
Over the weekend, we discussed the lawsuit to stop CERN from potentially destroying the planet. Dennis Overbye writes about this in his New York Times article titled["Asking a Judge to Save the World, and Maybe a Whole Lot More"]. Here's an excerpt:
... the rest of the grimness on the front page today will matter a bit, though, if two men pursuing a lawsuit in federal court in Hawaii turn out to be right. They think a giant particle accelerator that will begin smashing protons together outside Geneva this summer might produce a black hole or something else that will spell the end of the Earth — and maybe the universe.
What's the worst that can happen? Scientists now agree that it is sometimes difficult to predict, and someeffects may be unrecoverable.
Unfortunately, this is not the only example of people attempting things they may not understand well enough. Theweb comic below has someone complaining they are out of disk space, and the sales rep suggests solving this with a few commands which will result in deleting all her files. Hopefully, most people reading will recognize this is meant as humor, and not actually attempt the code fragments to "see what they do".
Sadly, I often encounter clients who have a "keep forever" approach to their production data. When they are seriously out of space, they feel forced to either buy more disk storage, or start "the big Purge": deleting rows from their database tables, emails older than 90 days, or some other drastic measures. With a focus on keeping down IT budgets, I fear that thesedrastic measures are growing more common. What's the worst that could happen? You might need that data for defending yourself against a lawsuit, or need it to continue to provide service to a loyal client, or just continue normal business operations.I have visited companies where a junior administrator chose the "big Purge" option, without a full understanding ofwhat they were doing, resulting in business disruption until the data could be recovered or re-entered.
IBM offers a better way. Data that may not be needed on disk forever could be moved to lower-cost tape, using up less energy and less floorspace in your data center. Solutions can automatically delete the data systematically based on chronological or event-based retention policies, with the option to keep some data longer in response to a "legal hold" request.
That's certainly better than to risk shrinking your business into a "dense dead lump"!Read More]
I am back from China, and now glad to be back in the old USA. Last week, someone asked me what would it take to add a specific feature to the IBM System Storage DS8300. The what-would-it-take question is well-known among development circles informally as a "sizing" effort, or more formally as "Development Expense" estimate.
For software engineering projects, the process was simply that an architect would estimate the number of "Lines of Code" (LOC) typically represented in thousands of lines of code (KLOC). This single number would convert to another single number, "person-months", which would then translate to another single number "dollars". Once you had KLOC, the rest followed directly from a formula, average or rule-of-thumb.
More amazing is that this single number could then determine a variety of other numbers, the number of total months for the schedule, the number of developers, testers, publication writers and quality assurance team members needed, and so on. Again, these were developed using a formula, developed and based on past experience of similar projects.
Earlier in my career, I was the lead architect for DFSMS for the z/OS operating system, and later for IBM TotalStorage Productivity Center, performing these sizing efforts. A famous IBM architect, Frederick P. Brooks, wrote a now-classic book that was requiredreading when I started at IBM, which just was re-released as Mythical Man-Month: Essays in Software Engineering, 20th Anniversary Edition. In addition to sound advice, he alsooffered a formula or two that helps with these estimating tasks.
Hardware design introduces a different set of challenges. When I was getting my Masters Degree in Electrical Engineering, it took myself and four other grad students a full semester just to design a six-layer, 900 transistor silicon chip, which could only perform a single function, multiply two numbers together.At IBM, another book that I was given to read was Soul of a New Machine, documenting six hardware engineers, and six software engineers, working long hours on a tight schedule to produce a new computer for Data General.
So why do I bring this up now? IBM architects William Goddard and John Lynott are being inducted posthumously this year into the prestigious National Inventors Hall of Fame for their disk system innovation.
Under the leadership of Reynold Johnson, the team developed an air-bearing head to “float” above the disk without crashing into the disk. Imagine a fighter airplane flying full speed across the country-side at 50 feet off the ground. If you every heard the term "my disk crashed", it was originally referring to the read/write head touching the disk surface, causing terrible damage.
A uniformly flat disk surface was created by spinning the coating onto the rapidly rotating disk, leaving many wearing lab coats covered with disk liquid at waist level. Developing disk-to-disk and track-to-track access mechanisms proved more challenging, and nearly halted the project. The team, however, was adamant that this problem could be solved, and customers were increasingly asking for random access technology. The result was the "350 Disk Storage Unit" designed for the "305 RAMAC computer", which I have talked about a lot last year as part of our "50 years of disk systems innovation" celebration.
Neither Goddard nor Lynott had computing experience prior to joining IBM. Goddard was a former science teacher who briefly worked in aerospace. Lynott had been a mechanic in the Navy and later a mechanical engineer. They didn't have a nice formula based on past experience, they didn't have the benefit of Fred Brooks' advice, or the rules-of-thumb or averages now used to estimate the size of projects. They had to break new ground.
Now that's innovation!
technorati tags: IBM, DS8300, disk, KLOC, sizing, estimate, DFSMS, z/OS, TotalStorage Productivity Center, Frederick Brooks, William Goddard, John Lynott, Mythical Man-Month, Reynold Johnson, RAMAC, 305, 350,[Read More]
I got the following comment on my earlier post [A Recap of Storage Industry Acquisitions], Reuben wrote:
According to Gartner data (from 2005!), host-based storage accounts for 34 percent of the overall market for external storage, with the remaining 66 percent going to "fabric-attached" (network) storage, expect this share to grow from 66 percent to 77 percent by 2007.What is the current reality? SAN vs. NAS, FC vs iSCSI?
IBM subscribes to a lot of data from different analysts, they all have their methods for collecting this data, from taking surveys of customers to reviewing financial results of each vendor. While theymight not agree entirely, there are some common threads that lead one to believe they represent "reality". Hereare some numbers from an IDC December 2007 report:
Jon Toigo over at DrunkenData offers some additional data from ex-STKer:[Fred Moore Outlook on Storage 2008]. I met Fredat a conference. He had left STK back in 1998, and started his own company called Horison. NeitherJon nor Fred cite the sources of his statistics, but the following comment leads me to assume hehasn't been paying attention closely to the tape market:
With the demise of STK, who will be the leader in the tape industry?
Depending on how old you are, you might remember exactly where you were when a significant eventoccurred, for example the[Space Shuttle Chal
technorati tags: Gartner, IDC, host-based, fabric-attached, NAS, iSCSI, SAN, FC, ESCON, FICON, NFS, CIFS, internal, external, disk, systems, storage, DrunkenData, Fred Moore, STK, Sun, confetti, Challenger[Read More]
Comment (1) Visits (9534)
Our industry is full of acronyms, and sometimes spelling out what words an acronym stands for is not enough to explain it fully.
It reminds me of an old story within IBM. A customer engineer (or "CE" for short) was repairing an air-cooled server, and found the failing part being a "FAN". Not knowing what this stood for, he looked up the acronym in the offical "IBM list of acronyms" and found that it stood for "Forced Air Network". Apparently, so many people did not realize that a FAN was just a "fan" that they needed to add an entry to remind people what this little motorized propeller was for.
This brings me to Tony Asaro's Fun with FAN blog entry which mentions yet another definition for FAN, that of "File Area Network". The concept is not new, but some developments this year help make it more a reality.
To join the rest of the world, new types of data set were created for the z/OS operating system, known as HFS and zFS. These held file systems in the sense we know them today, comparable in hierarchical organization of files on Windows, Linux and UNIX platforms. These could be linked and mounted together in larger hierarchical structures across the sysplex.
The concept of files and file systems is a fairly new concept. Prior to this, applications read and wrote directly in terms of blocks, typically fixed length multiples of 512 bytes. For a while, database management systems offered a choice, direct block access or file level access. The former may have offered slightly better performance, but the latter was easier to administer. Without file system, specialized tools were often required to diagnose and fix problems on block-oriented "raw logical" volumes.
This launched a "my file system is better than yours" war which continues today. The official standard is POSIX, but every file system tries to give some proprietary advantage by offering unique features. Sun's file system offers support for "sparse" files, which is ideal for certain mathematical processing of tables. Microsoft's NTFS offers biult-in compression, designed for the laptop user. IBM's JFS2 and Linux's EXT3 file systems support journaling, which tracks updates to file system structures in a separate journal to minimize data corruption in the event of a power outage, and thus speed up the re-boot process. Anyone who has ever waited for a "Scan Disk" or "fsck" process to finish knows what I'm talking about. Of course, if an application deviates from POSIX standards, and exploits some unique feature of a file system, it then limits its portability and market appeal.
The two competing NAS file systems are also different. Common Internet File System (CIFS) was developed initially by IBM and Microsoft to provide interoperability between DOS, Windows and OS/2. Meanwhile, Network File System (NFS) was the darling of nearly every UNIX and Linux distribution, and even has clients on operating platforms as diverse as MacOS, i5/OS, and z/OS. Today, nearly every platform supports one or both of these standards.
Bottom line, file systems are here to stay. Any slight advantages to use raw logical volumes for databases and applications are losing out to the robust set of file system utilities that can be used across a broad set of platforms and applications.Read More]
Some upcoming books have caught my attention.
Last year, I covered Chris Anderson's book [The Long Tail]. This year, Chris Anderson, editor-in-chief of Wired.com, has an upcoming book titled Free, the past and future of a radical price. Chris talked about his book here at Nokia World 2007 conference, and the [46-minute video] is worth watching.He asks the big question "What if certain resources were free?" This could be electricity, bandwidth, or storage capacity. He explores how this changes the world, and crea Nick Carr writes a post [Dominating the Cloud], indicatingthat IBM, Google, Microsoft, Yahoo and Amazon are the five computing giants to watch, as they are more efficient atconverting electricity into computing than anyone else. Last month, I mentioned IBM and Google partnership on cloud computing in my post[Innovationthat matters: cell phones and cloud computing].Nick's upcoming book titled[The Big Switch] looks into "Utility Comp Last, but not least, Seth Godin writes in his post [Meatballs and Permeability] about the bits-vs-atoms issue, what Chris Anderson above refers to as the new digital economy. The idea here is that value carried electronically as bits (digital documents, for example) have completely different economics than value carried as atoms (physical objects), andrequires new marketing techniques. Methods from traditional marketing will not be effective in this new age.Here is a [review] of Seth's new book Meatball Sundae: Is Your Marketing Out of Sync? All three of these books seem to be covering the same phenomenon, just from different viewpoints. I lookforward to reading them. technorati tags: Long Tail, Chris Anderson, Wired, Nokia World, secondlife, cross-subsidy, digital economy, Nick Carr, Big Switch, utility computing, IBM, Google, Microsoft, Yahoo, Amazon, SimpleDB, Seth Godin, Meatball Sundae, bits, atoms
Nick Carr writes a post [Dominating the Cloud], indicatingthat IBM, Google, Microsoft, Yahoo and Amazon are the five computing giants to watch, as they are more efficient atconverting electricity into computing than anyone else. Last month, I mentioned IBM and Google partnership on cloud computing in my post[Innovationthat matters: cell phones and cloud computing].Nick's upcoming book titled[The Big Switch] looks into "Utility Comp Last, but not least, Seth Godin writes in his post [Meatballs and Permeability] about the bits-vs-atoms issue, what Chris Anderson above refers to as the new digital economy. The idea here is that value carried electronically as bits (digital documents, for example) have completely different economics than value carried as atoms (physical objects), andrequires new marketing techniques. Methods from traditional marketing will not be effective in this new age.Here is a [review] of Seth's new book Meatball Sundae: Is Your Marketing Out of Sync? All three of these books seem to be covering the same phenomenon, just from different viewpoints. I lookforward to reading them. technorati tags: Long Tail, Chris Anderson, Wired, Nokia World, secondlife, cross-subsidy, digital economy, Nick Carr, Big Switch, utility computing, IBM, Google, Microsoft, Yahoo, Amazon, SimpleDB, Seth Godin, Meatball Sundae, bits, atoms
Last, but not least, Seth Godin writes in his post [Meatballs and Permeability] about the bits-vs-atoms issue, what Chris Anderson above refers to as the new digital economy. The idea here is that value carried electronically as bits (digital documents, for example) have completely different economics than value carried as atoms (physical objects), andrequires new marketing techniques. Methods from traditional marketing will not be effective in this new age.Here is a [review] of Seth's new book Meatball Sundae: Is Your Marketing Out of Sync?
All three of these books seem to be covering the same phenomenon, just from different viewpoints. I lookforward to reading them.
technorati tags: Long Tail, Chris Anderson, Wired, Nokia World, secondlife, cross-subsidy, digital economy, Nick Carr, Big Switch, utility computing, IBM, Google, Microsoft, Yahoo, Amazon, SimpleDB, Seth Godin, Meatball Sundae, bits, atoms[Read More]
Comments (7) Visits (14615)
While EMC bloggers garnered media attention last year pointing out the faulty mathematics from HDS, an astute reader pointed me to EMC's own [DMX-4 specification sheet],updated for its 1TB SATA disk.I've chosen just the minimum and maximum number of drives RAID-6 data points for non-mainframe platforms:
Where EMC appears miscalculating is having 20x more drives, as the numbers don't match up. For 1920 drives inRAID-6, you would expect 20x more usable capacity than the 96 drive configurations. For 6+2 configurations, one would expect 720TB and 1440TB respectively. For 14+2 configurations, one wouldexpect 840TB and 1680TB, respectively.
Perhaps EMC DMX-4 can't address more than 600TB for the entire system? Does EMC purposely limit the benefitsof these larger drives? It does question why someone might go from 500GB to 1TB drives, if the maximum configuration only gives about 40TB more capacity.Fellow IBM blogger Barry Whyte questioned the use of SATA in an expensive DMX-4 system, in his post[One Box Fits All - Or Does It], and now perhaps there are good reasons to question 1TB from a capacityperspective as well.Read More]
In a recent post, ESG Analyst Tony Asaro asks What happened to CAS?
Many often associate CAS with EMC's Centera offering, but with IBM's comprehensive set of compliance storageofferings, EMC doesn't talk about CAS or Centera much anymore.I covered the confusion around CAS in a previous post. When clients ask for "CAS" what they really are looking for is storage designed forfixed content, unstructured data that doesn't change once written. A lot of data falls under this category, such as scanned documents, audio and video recordings, medical images, and so on. Some laws and regulations further require enforcement that the data is not deleted or tampered with, until some time after an event or expiration date is met.
In the past, clients used write-once read-many (WORM) optical media, but today we have disk and tape offerings instead. Since the term "WORM" is inappropriate fordisk-based solutions, IBM has standardized to the use of the term "non-erasable, non-rewriteable" (NENR) to discusstoday's solutions and offerings.
Let's recap what IBM has to offer:
As you see, IBM doesn't limit itself to disk-only offerings. Our leadership in tape allows us to innovate tape and disk-and-tape offerings that can provide more cost-effective solutions to store fixed content, retention managed data.The next time you have a conversation with a storage vendor, don't ask for CAS, ask instead for archive and compliance storage. Broaden your mind, and broaden the set of options and choices that might provide a better fit for your requirements.
technorati tags: ESG, analyst, Tony Asaro, EMC, Centera, CAS, IBM, system, storage, DR550, Express, N series, GAM, grid, GMAS, medical, archive, WORM, TS1120, LTO, LTO3, LTO4, NENR, fixed, content, retention[Read More]
Comments (8) Visits (12749)
Yesterday, I started this week's topic discussing the various areas of exploration to helpunderstand our recent press release of the IBM System Storage SAN Volume Controller and itsimpressive SPC-1 and SPC-2 benchmark results that ranks it the fastest disk system in the industry.
Some have suggested that since the SVC has a unique design, it should be placed in its own category,and not compared to other disk systems. To address this, I would like to define what IBM meansby "disk system" and how it is comparable to other disk systems.
When I say "disk system", I am going to focus specifically on block-oriented direct-access storage systems, which I will define as:
One or more IT components, connected together, that function as a whole, to serve as a target forread and write requests for specific blocks of data.
Clarification: One could argue, and several do in various comments below, that there are other typesof storage systems that contain disks, some that emulate sequential access tape libraries, some that emulate file-systems through CIFS or NFS protocols, and some that support thestorage of archive objects and other fixed content. At the risk of looking like I may be including or excluding such to fit my purposes, I wanted to avoid appl
People who have been working a long time in the storage industry might be satisfied by this definition, thinkingof all the disk systems that would be included by this definition, and recognize that other types of storage liketape systems that are appropriately excluded.
Others might be scratching their heads, thinking to themselves "Huh?" So, I will provide some background, history, and additional explanation. Let's break up the definition into different phrases, and handle each separately.
So, the SAN Volume Controller is a disk system comprising of one to four node-pairs. Each node is a piece of IT equipment that have processors and cache. These node-pairs are connected to a pair of UPS power supplies to protect the cache memory holding writes that have not yet been de-staged. The combination of node-pairs and UPS acting as a whole, is able to serve as a target to SCSI commands sent over Fibre Channel cables on a Storage Area Network (SAN). To read some blocks of data, it uses its internal cache storage to satisfy the request, and for others, it goes out to external disk systems that contain the data required. All writes are satisfied immediately in cache on the SVC, and later de-staged to external disk when appropriate.
As of end of 2Q07, having reached our four-year anniversary for this product, IBM has sold over 9000 SVC nodes, which are part of more than 3100 SVC disk systems. These things are flying off the shelves, clocking in a 100% YTY growth over the amount we sold twelve months ago. Congratulations go to the SVC development team for their impressive feat of engineering that is starting to catch the attention of many customers and return astounding results!
So, now that I have explained why the SVC is considered a disk system, tomorrow I'll discuss metrics to measure performance.
Comments (2) Visits (13786)
I welcome HDS into the "Super High-End" club. Those who follow my blog might remember thatI suggested that analysts like IDC that use "Entry Level", "Midrange" and "Enterprise" as categoriesmay need a New Category: Super High End.
I was not surprised to see EMC, who now drops further down in perception, dispute HDS's recent SPC-1 benchmarks.Fellow blogger EMC's BarryB posted on his Storage Anarchist blog [IBM vs. Hitachi] thatpoints out that IBM's SAN Volume Controller (SVC) is still much faster, and less expensive, than USP-V.
So, just in case you haven't seen all the press releases, here is a quick recap on the results:IBM SVC 4.2 is still in first place, then HDS USP-V, then IBM System Storage DS8300. Just for comparison, I includeour IBM System Storage DS4800 midrange disk results, so you can appreciate the difference between midrange and high-end.There are other products from other vendors, I just point out a few from IBM and HDS here in this graph.
HDS tried to come up with a phrase "Enterprise Storage System" for comparison that would leave the SVC 4.2 out.Given that the SVC has five nines (99.999%) availability, has non-disruptive upgrade and firmware update capability, has more than two processors typical of midrange products, and can connect to mainframes via z/VM, z/VSE andLinux on System z operating systems, there is no reason to pretend SVC isn't Enterprise-class.
The irony now is that EMC now looks very lonely being one of the last remaining major storage vendors not to participate in standardized benchmarks that help customers make purchase decisions, as mentioned both by IBM's BarryW: I guess that only leaves EMC, as well as HDS's Claus Mikkelsen: Olympics of Storage.
Earlier this year, EMC's Chuck Hollis opined[Storage Scorecard]that the EMC DMX and HDS TagmaStore USP were high-endboxes, which I would speculate both of these would fall somewhere between DS4800 and DS8300 on the graph above.If that is the case, it is impressive that HDS was able to re-engineer their USP-V to be 2-3x faster thanits predecessor, the USP.
Not all workloads are the same, and your mileage may vary. While I can't speak to HDS, the folks over atEMC have assured me, in writingcomments on this blog, that there is nothing preventing their customers from publishingtheir own performance comparisons between EMC and non-EMC equipment. I would encourage every customer to do this, between IBM and HDS, HDS and EMC, and between IBM and EMC, to help shed even more light on this area.In fact, you can even run your own SPC benchmarks to see how your own environment compares to the ones published.
Of course, performance is just one attribute on which to choose a storage vendor, and to choose specific products,models or features. For more information about Storage Performance Council and the SPC-1 and SPC-2 benchmarks,see my week-long series on SPC benchmarks, which are listed in reverse chronological order.
Go to the official Storage Performance Council website to read the details of the SPC-1 results.
technorati tags: IBM, Super, High-End, Entry-Level, Midrange, IDC, Enterprise, HDS, USP-V, USP, EMC, SPC, SPC-1, SAN Volume Controller, SVC, DS8300, DS4800, mainframe, z/VM, z/VSE, Linux, System z, BarryB, BarryW, Chuck Hollis, SPC-2, Storage Performance Council[Read More]
I would like to welcome IBMer Barry Whyte to the blogosphere!
From his bio:
Barry Whyte is a 'Master Inventor' working in the Systems & Technology Group based in IBM Hursley, UK. Barry primarly works on the IBM SAN Volume Controller virtualization appliance. Barry graduated from The University of Glasgow in 1996 with a B.Sc (Hons) in Computing Science. In his 10 years at IBM he has worked on the successful Serial Storage Architecture (SSA) range of products and the follow-on Fibre Channel products used in the IBM DS8000 series. Barry joined the SVC development team soon after its inception and has held many positions before taking on his current role as SVC performance architect. Outside of work, Barry enjoys playing golf and all things to do with Rotary Engines.
To avoid confusion in future posts, I will refer to Barry Whyte as BarryW, and fellow EMC blogger Barry Burke (aka the Storage Anarchist) as BarryB.
I'm in Chicago this week, but it is actually HOTTER here than in my home town of Tucson, Arizona.
Comments (3) Visits (10111)
A recent blog by Chris Mellor makes the outlandish conspiracy theory that IBM and HDS copied virtualisation technology from small start-up company DataCore.
(Chris doesn't actually name who is his source making such a claim, whether thatsomeone was employed by any of the parties involved at the time the events occurred,or is currently employed by a competitor like EMC bitterly jealous of the success IBM and HDScurrently enjoy with their offerings.)
As I already posted before about IBM'slong history of storage virtualization, SAN Volume Controller was really part of a sequence of major product in this area, after the successful 3850 MSS and 3494 VTS block virtualization products.
In the late 1990's, our research teams in Almaden, California and Hursley, UK were exploring storagetechnologies that could take advantage of commodity hardware parts and the indu As is often the case, while IBM was working on "the perfect product", small start-ups announce "not-yet-perfect" products into the marketplace. Tactical moves like partneringwith DataCore was a smart move, for the following reasons: The partnership proved worthwhile, not just to prove to IBM that this was a worthwhile market to enter, but also how "NOT" to package a solution. Specifically, DataCore SANsymphony was software that you had to install on your own Windows-based server. The client was left with the task of orderinga suitable Intel-based server, with the right amount of CPU cycles, RAM and host bus adapter ports,and configure the Windows operating system and DataCore software. It didn't go well. Basically, customers were expected to be their own "hardware engineers", having to knowway too much about storage hardware and software to design a combination that worked for theirworkloads. Most clients were disappointed with the amount of effort involved, and the resulting poor performance. To fix this, IBM delivered the SAN Volume Controller, with an optimized Linux operating system and inte I can't speak for HDS, but I suspect they came to similar conclusions that resulted in a similar decisionto build their product in-house. I welcome Hu Yoshida to correct me if I am wrong on this.
As is often the case, while IBM was working on "the perfect product", small start-ups announce "not-yet-perfect" products into the marketplace. Tactical moves like partneringwith DataCore was a smart move, for the following reasons:
The partnership proved worthwhile, not just to prove to IBM that this was a worthwhile market to enter, but also how "NOT" to package a solution. Specifically, DataCore SANsymphony was software that you had to install on your own Windows-based server. The client was left with the task of orderinga suitable Intel-based server, with the right amount of CPU cycles, RAM and host bus adapter ports,and configure the Windows operating system and DataCore software.
It didn't go well. Basically, customers were expected to be their own "hardware engineers", having to knowway too much about storage hardware and software to design a combination that worked for theirworkloads. Most clients were disappointed with the amount of effort involved, and the resulting poor performance.
To fix this, IBM delivered the SAN Volume Controller, with an optimized Linux operating system and inte I can't speak for HDS, but I suspect they came to similar conclusions that resulted in a similar decisionto build their product in-house. I welcome Hu Yoshida to correct me if I am wrong on this.
I can't speak for HDS, but I suspect they came to similar conclusions that resulted in a similar decisionto build their product in-house. I welcome Hu Yoshida to correct me if I am wrong on this.
Comments (5) Visits (26917)
Tonight PBS plans to air Season 38, Episode 6 of NOVA, titled [Smartest Machine On Earth]. Here is an excerpt from the station listing:
"What's so special about human intelligence and will scientists ever build a computer that rivals the flexibility and power of a human brain? In "Artificial Intelligence," NOVA takes viewers inside an IBM lab where a crack team has been working for nearly three years to perfect a machine that can answer any question. The scientists hope their machine will be able to beat expert contestants in one of the USA's most challenging TV quiz shows -- Jeopardy, which has entertained viewers for over four decades. "Artificial Intelligence" presents the exclusive inside story of how the IBM team developed the world's smartest computer from scratch. Now they're racing to finish it for a special Jeopardy airdate in February 2011. They've built an exact replica of the studio at its research lab near New York and invited past champions to compete against the machine, a big black box code -- named Watson after IBM's founder, Thomas J. Watson. But will Watson be able to beat out its human competition?"
Craig Rhinehart offers [10 Things You Need to Know About the Technology Behind Watson].
An artist has come up with this clever [unofficial poster].
Like most supercomputers, Watson runs the Linux operating system. The system runs 2,880 cores (90 IBM Power 750 servers, four sockets each, eight cores per socket) to achieve 80 [TeraFlops]. TeraFlops is the unit of measure for supercomputers, representing a trillion floating point operations. By comparison, Hans Morvec, principal research scientist at the Robotics Institute of Carnegie Mellon University (CMU) estimates that the [human brain is about 100 TeraFlops]. So, in the three seconds that Watson gets to calculate its response, it would have processed 240 trillion operations.
Several readers of my blog have asked for details on the storage aspects of Watson. Basically, it is a modified version of IBM Scale-Out NAS [SONAS] that IBM offers commercially, but running Linux on POWER instead of Linux-x86. System p expansion drawers of SAS 15K RPM 450GB drives, 12 drives each, are dual-connected to two storage nodes, for a total of 21.6TB of raw disk capacity. The storage nodes use IBM's General Parallel File System (GPFS) to provide clustered NFS access to the rest of the system. Each Power 750 has minimal internal storage mostly to hold the Linux operating system and programs.
When Watson is booted up, the 15TB of total RAM are loaded up, and thereafter the DeepQA processing is all done from memory. According to IBM Research, "The actual size of the data (analyzed and indexed text, knowledge bases, etc.) used for candidate answer generation and evidence evaluation is under 1TB." For performance reasons, various subsets of the data are replicated in RAM on different functional groups of cluster nodes. The entire system is self-contained, Watson is NOT going to the internet searching for answers.
On ZDnet, Steven J. Vaughan-Nichols welcomes our new [Linux Penguin Jeopardy overlords]. I have to say I share his enthusiasm!
Comment (1) Visits (9390)
Registration is now open for our next "Meet the Storage Experts" event in Second Life. All IBMers, clients and IBM Business Partners are welcome to attend. We will focus this time on DS3000 and N series disk systems, tape systems,and IBM storage networking gear.
Continuing this week's theme of recap for 2006, I thought it would be good to look back at the various videos made available on the internet.
Comment (1) Visits (13964)
Earlier this week, EMC announced its Symmetrix V-Max, following two trends in the industry:
I took a cue from President Obama and waited a few days to collect my thoughts and do my homework before responding.Special thanks to fellow blogger ChuckH in giving me a [handy list of reactions] for me to pick and choose from. It appears that EMC marketing machine feels it is acceptable for their own folks to claim that EMC is doing something first, or that others are catching up to EMC, but when other vendors do likewise, then that is just pathetic or incoherent. Here are a few reactions already from fellow bloggers:
This was a major announcement for EMC, addressing many of the problems, flaws and weaknesses of the earlier DMX-3 and DMX-4 deliverables. Here's my read on this:
When IBM announced its acquisition of XIV over a year ago now, customers were knocking down our doors to get one. This caught two particular groups looking like a [deer in headlights]:
Let me contrast this with the situation Microsoft Windows is currently facing.
In their "I'm a PC" campaign, Microsoft is fighting both fronts. Let's examine two commercials:
IBM's DS8000, HDS USP-V and EMC's Symmetrix are the key players in the Ente Alan Cooper wrote a book that would change that mindset, titled[The Inmates Are Running the Asylum: Why High-Tech Products Drive Us Crazy and How to Restore the Sanity]. This book showed why the situation had gotten out ofhand, and to demand better ease-of-use from all vendors. It was required reading for anyone working in or with IBM's "User Centered Design" teams.I highly recommend this book. IBM recognized this trend early. IBM's SVC, N series and now XIV all offer ease-of-use with While IBM's XIV may not have been the first to introduce a modular, scale-out architectureusing commodity parts managed by sophisticated ease-of-use interfaces, its success might have been the kick-in-the-butt EMC needed to follow the rest of the industry in this direction.
Alan Cooper wrote a book that would change that mindset, titled[The Inmates Are Running the Asylum: Why High-Tech Products Drive Us Crazy and How to Restore the Sanity]. This book showed why the situation had gotten out ofhand, and to demand better ease-of-use from all vendors. It was required reading for anyone working in or with IBM's "User Centered Design" teams.I highly recommend this book.
IBM recognized this trend early. IBM's SVC, N series and now XIV all offer ease-of-use with While IBM's XIV may not have been the first to introduce a modular, scale-out architectureusing commodity parts managed by sophisticated ease-of-use interfaces, its success might have been the kick-in-the-butt EMC needed to follow the rest of the industry in this direction.
While IBM's XIV may not have been the first to introduce a modular, scale-out architectureusing commodity parts managed by sophisticated ease-of-use interfaces, its success might have been the kick-in-the-butt EMC needed to follow the rest of the industry in this direction.Read More]
Comments (2) Visits (10519)
Chris Evans over at Storage Architect posts aboutHardware Replacement Lifecycle Update, on how storage virtualization can helpwith storage hardware replacemement. He makes two points that I would like to comment on.
In a typical four year lifecycle of storage arrays, it might take six months or so to fill up the box, and might takeas much as a year at the end to move the data out to other equipment. SVC can greatly reduce both of these, so that you can take immediate advantage of new equipment as soon as possible, and keep using it for close to the full four years,migrating weeks or days before your lease expires.
Last week, a writer for a magazine contacted us at IBM to confirm a quote that writing a Terabyte (TB) on disk saves 50,000 trees. I explained that this was cited from UC Berkeley's famousHow Much Information? 2003 study.
I thought of this today as I read Jefferson Graham's article "How many trees did your iPhone bill kill?" in the USA Today newspaper. Apparently, new Apple iPhone users were sent AT&T billing statements that detailed their every phone call, text message or internet access. Here's a video on YouTube from Justine Ezarik that shows the absurdity of a 300-page monthly phone bill:
To be fair, the USA Today article explains that AT&T also offers "summary billing" as well as "on-line billing", but apparently neither of these are the default choice. I can understand that phone companies send out bills on paper because not everyone who has a phone has internet access, but in the case of its iPhone customers, internet access is in the palm of your hands! Since all iPhone customers have internet access, and AT&T knows which customers are using an iPhone, it would make sense for either on-line billing or summary billing to be the default choice, and let only those that hate trees explicitly request the full billing option.
Sending a box of 300 pages of printed paper is expensive, both for the sender and the recipient. This informationcould have been shipped less expensively on computer media, a single floppy diskette or CDrom for example. Forthose who prefer getting this level of detail, a searchable digitized version might be more useful to the consumer.
Which brings me to the concept of Information Lifecycle Management (ILM). You can read my recent posts on ILM byclicking the Lifecycle tab on the right panel, or my now infamous post from last year about ILM for my iPod.
His recollection of the history and evolution of ILM fairly matches mine:
While the SNIA definition provides a vendor-independent platform to start the conversation, it can be intimidatingto some, and is difficult to memorize word for word.When I am briefing clients, especially high-level executives, they often ask for ILM to be explained in simpler terms. My simplified version is:
So ILM is not just a good idea to save a company money, it can keep them out of the court room, as well as help save the environment and not kill so many trees. Now that 100 percent of iPhone customers have internet access, and a goodnumber of non-iPhone customers have internet access at home, work, school or public library, it makes sense for companies to ask people to "opt-in" to getting their statements on paper, rather than forcing them to "opt-out".
technorati tags: IBM, Terabyte, TB, 50,000 trees, Jefferson Graham, USAtoday, Apple, iPhone, iPod, AT&T, Justine Ezarik, YouTube, Information, Lifecycle, Management, ILM, SNIA, EMC, Sun, StorageTek, HP, asset, laptops, expense, employees, privacy, exposure, liability, unethical tampering, unexpected loss, unauthorized access, opt-in, opt-out[Read More]
Comments (3) Visits (9984)
In addition to creating the Dilbert cartoon, Scott Adams has a blog, which sometimes is quite serious,and other times quite funny. The anticipated 30x cost of "Flash Drives" for Enterprise disk systems reminded meof one of Scott's articles from November 2007 titled [Urge to Simplify].Here's an excerpt:
Now the casinos have people trained, like chickens hoping for pellets, to take money from one machine (the ATM), carry it across a room and deposit in another machine (the slot machine). I believe B.F. Skinner would agree with me that there is room for even more efficiency: The ATM and the slot machine need to be the same machine.
Perhaps EMC can redesign its DMX-4 to "blink and make exciting sounds" as well. The Flash Drives were designedfor the financial services industry, so those disk systems could be directly connected to make transfers between the appropriate bank accounts.Read More]
An interesting blog in "Channel Advisor" relates to the lack of trust in the storage industry:Education — Trust: Key To Survival In Today's IT Worldand offers some advice to vendors and channel distributions.
I can't stress enough how important is credibility in a highly-competitive marketplace.[Read More]
Here we are again at Top Gun class.In between class topics, we often show short video clips.
This week, we saw IBM Executive Bob Hoey's wisdom on selling mainframe computers. Bob is the VP of Sales for our System z server line, but the lessons might also apply to high-end disk or enterprise tape libraries.Read More]
Comments (2) Visits (9806)
On his "Data Storage - Dullness becomes Mainstream" blog, Chris Evans is
amazed athow low they can go!.He compares the latest 100GB Toshiba 1.8" drive designed for portable music players, to the size andweight of older technology, like the IBM 3380 Direct Access Storage Device (DASD).
Chris couldn't find the dimensions of the 3380, so I thought I would provide the missing detail.The IBM 3380 History Archivesprovides a nice summary:
At least take a backup first.Read More]