It's official. Last week, while I was talking about Fundamental Changes needed to make your IT environment more energy-efficient, IBM announced its "Big Green Linux" initiative at the LinuxWorld conference.
You can read the IBM Press Release, or the commentary on DaniWeb, ZDNet, or InfoWorld.
IT managers looking to reduce their energy consumption are probably closely watching XenSource and VMware. CRN feels that a war is imminent between these two. eWeek recognizes this battle is bad news for Linux virtualization on x86, and that Oracle is losing patience with their inability to work together. Simon Crosby, CTO of XenSource, compares performance between the two platforms, and feels that XenSource should get its rightful piece of the pie.
There is a ray of hope. Charles Babcock in InformationWeek explains that peace has been established on at least one front: XenSource and VMware are working together to improve virtualization in the Linux kernel.
Despite this, or perhaps because of this, over 30 percent of IBM's Linux server revenue is on non-x86 platforms, avoiding the XenSource vs. VMware decision altogether. Both System z (traditional mainframe servers) and System p (traditional UNIX servers) are able to run many Linux images in a fully virtualized manner, without VMware or XenSource.
technorati tags: IBM, Big Green, Linux, initiative, XenSource, VMware, virtualization, Simon Crosby, mainframe, UNIX, servers
CNET staff writer Elinor Mills writes how some things in Web 2.0 have morphed, going from killer app to major Web platform. Among the examples are Salesforce.com, Google, Second Life, and Facebook.
Philip Rosedale, chief executive of Linden Lab, which produced the Second Life virtual reality environment, said Second Life and Facebook are popular because they give people a new environment to interact in that they are comfortable with.
Of course I have blogged for months now on my involvement in Second Life, and how IBM is investing in this platform for business purposes. Recently, IBM made news for publishing its Code of Conduct and set of guidelines on how you run your avatar in virtual worlds, including Second Life. IBM recognizes the business potential of virtual worlds, and has formed the "3D Internet" group exploring the possibilities. Over 5000 IBM employees now use Second Life on a regular basis.
I was surprised to learn that there were over 23,000 IBMers already on Facebook. I used to be on LinkedIn, but found Facebook to have more IBMers and have made the switch. Recently, we were told that these 23,000 IBMers spend 19 minutes, on average, per day visiting Facebook pages. Nobody asked me how much time I spend every day on Facebook, but with over 350,000 employees in the company, I am sure some have ways to track the lives of others.
Both of these count as adding more "FUN" into the workplace, which everyone should strive for. It is also good to know that the skills you develop using Second Life or Facebook can carry over to your next job role or your next employer. The number-one question I get from new colleagues when I mention either of these exciting new ways to communicate and collaborate is: "But how is this related to business?"
Second Life is the obvious one: a new, innovative way to hold meetings with colleagues, Business Partners and clients is going to have business value. Meetings in Second Life help you focus on what is being discussed, versus a plain telephone call where your eyes may wander to other things in your view. Of course nothing beats the effectiveness of face-to-face meetings, but Second Life offers a more energy-efficient alternative to traveling to other cities or countries.
I am still fairly new to Facebook, installing and trying out new apps. I found this article that explains 12 Ways to Use Facebook Professionally. So far it serves me well as a replacement for LinkedIn, and provides my friends and family a quick answer to Where in the world is Tony Pearson?
What else can these and other Web platforms do? I am still in the exploratory stages.
technorati tags: CNET, Web2.0, Google, SecondLife, FaceBook, Philip Rosedale, eightbar, IBM, energy efficiency, travel
Stephen over at RupturedMonkey discusses the challenges of recruiting storage administrators:
There has been a Storage Admin job advertised for many months but no one wants it. Why? It's offering VERY good money but the word has got around the company has poor management practices and most people don't last for more than 6 months. So, with the shortage of good SAN people, good money and conditions, what can that company do to recruit someone? ...
This leads me to the thought that has anyone ever thought about the standards that storage administrators should follow? Can an employer look up a web site to find questions to ask prospective employees? More often than not, they are recruiting because the previous one left so how can companies know what they are getting.
There is actually a great standard called Information Technology Infrastructure Library (ITIL) that applies not just to storage administrators, but other IT personnel such as network administrators and server administrators. Here's a quick web-site about ITIL History:
ITIL History can be traced back to the late 1980’s when the British government determined that the level of IT service quality provided to them was not sufficient enough. The Central Computer and Telecommunications Agency (CCTA), now called the Office of Government Commerce (OGC), was tasked with developing a framework for efficient and financially responsible use of IT resources within the British government and the private sector.
The goal was to develop an approach that would be vendor-independent and applicable to organizations with differing technical and business needs. This resulted in the creation of the ITIL.
This standard spread from the UK to other governments in Europe, and is now being adopted worldwide by government agencies, non-profit organizations and commercial enterprises. IBM, of course, has been involved along the way, encouraging this set of best practices to take hold.
IBMer John Long, in an ITSM Watch article, makes some key points:
- ITIL provides a common vocabulary that puts everyone in the IT industry on the same page, with the ultimate goal of helping companies run their IT organizations more efficiently.
- ITIL provides recommendations, or best practices, for managing the way IT provides services to the rest of the organization, in the same way you would the rest of your business, with a defined set of processes.
- While ITIL does a great job of describing what needs to be done, it doesn’t describe how to get it done. It doesn’t tell you how to take those best practices and implement them with real-life tools and technology. It’s not prescriptive.
The general process is now referred to as "IT Service Management", and the seven ITIL books are managed by the IT Service Management forum (ITSMf).
ITIL is vendor-independent. You can learn ITIL disciplines at one IT shop, and carry those skills with you when you go to another IT shop that has completely different gear. A common vocabulary would allow employers to post jobs in a consistent manner, and ask questions to those interviewing for the job. You can be ITIL-trained, and even ITIL-certified. IBM offers this training.
Of course, specific skills on how to use specific software to configure storage devices, request change control approvals, or define SAN zones, are useful, but often can be picked up on the job, reading the vendor manuals on the specifics. Of course, you can use IBM TotalStorage Productivity Center, which would allow someone to manage a variety of disk, tape and SAN fabric gear from one interface, greatly reducing the learning curve.
To learn more about ITIL, visit IBM Service Management or watch this flash video.
technorati tags: IBM, ITIL, IT, Service, Management, standards, storage, administrators, admins, skills, recruitment, vocabulary, TotalStorage, Productivity+Center, history, SAN, CCTA, OGC, ITSMF, Tivoli, disk, tape
Jon W Toigo over at Drunkendata has had a great set of posts on his skepticism of storage vendors touting their "green storage" solutions. My apologies for my "unnecessary" use of quotation marks.
The ones I liked specifically were:
The last of which refers to this ComputerWorld article "EPA: U.S. needs more power plants to support data centers", which claims "from a technology perspective, the systems most responsible for gobbling up power are the relatively low-cost x86 servers ..." The article is based on the recent EPA report that was just released.
Last month, in my post How many Watts per Terabyte, I mentioned:
Some people find it surprising that it is often more cost-effective, and power-efficient, to run workloads on mainframe logical partitions (LPARs) than a stack of x86 servers running VMware.
Perhaps they won't be surprised any more. Here is an article in eWeek that explains how IBM is reducing energy costs 80% by consolidating 3,900 rack-optimized servers to 33 IBM System z mainframe servers, running Linux, in its own data centers. Since 1997, IBM has consolidated its 155 strategic worldwide data center locations down to just seven.
I am very pleased that IBM has invested heavily in Linux, with support across servers, storage, software and services. Linux is allowing IBM to deliver clever, innovative solutions that may not be possible with other operating systems. If you are in storage, you should consider becoming more knowledgeable in Linux.
The older systems won't just end up in a landfill somewhere. Instead, the details are spelled out in the IBM Press Release:
As part of the effort to protect the environment, IBM Global Asset Recovery Services, the refurbishment and recycling unit of IBM, will process and properly dispose of the 3,900 reclaimed systems. Newer units will be refurbished and resold through IBM's sales force and partner network, while older systems will be harvested for parts or sold for scrap. Prior to disposition, the machines will be scrubbed of all sensitive data. Any unusable e-waste will be properly disposed following environmentally compliant processes perfected over 20 years of leading environmental skill and experience in the area of IT asset disposition.
Whereas other vendors might think that some operational improvements will be enough, such as switching to higher-capacity SATA drives, or virtualizing x86 servers, IBM recognizes that sometimes more fundamental changes are required to effect real changes and real results.
technorati tags: Jon Toigo, Drunkendata, power, consumption, disk, systems, green, storage, Linux, consolidation, virtualization, recovery, services, landfill, SATA, x86, servers, eWeek, mainframe
I would like to welcome IBMer Barry Whyte to the blogosphere!
From his bio:
Barry Whyte is a 'Master Inventor' working in the Systems & Technology Group based in IBM Hursley, UK. Barry primarly works on the IBM SAN Volume Controller virtualization appliance. Barry graduated from The University of Glasgow in 1996 with a B.Sc (Hons) in Computing Science. In his 10 years at IBM he has worked on the successful Serial Storage Architecture (SSA) range of products and the follow-on Fibre Channel products used in the IBM DS8000 series. Barry joined the SVC development team soon after its inception and has held many positions before taking on his current role as SVC performance architect. Outside of work, Barry enjoys playing golf and all things to do with Rotary Engines.
To avoid confusion in future posts, I will refer to Barry Whyte as BarryW, and fellow EMC blogger Barry Burke (aka the Storage Anarchist) as BarryB.
I'm in Chicago this week, but it is actually HOTTER here than in my home town of Tucson, Arizona.
technorati tags: IBM, Barry Whyte, SVC, SAN Volume Controller, disk, systems, Barry Burke, Chicago
There are a lot of exciting conferences and events coming up soon.
- SHARE
SHARE will be in San Diego, August 12-17. SHARE is held twice a year, and I attended for 10 years back when I was lead architect for DFSMS, and then later the focal point for storage support on the Linux for System z platform. I won't be there this time around, but am glad to see that it is still thriving.
- IBM Storage and Storage Networking Symposium
IBM Storage and Storage Networking Symposium will be in Las Vegas, August 19-24. This is a great conference that is focused entirely on the products and solutions I deal with the most. I attended nearly every one since they started this back in the 1990s, and am glad that I will be there this year, making several presentations. If you plan to attend this and want to meet up, drop me a note.
For those in Europe, there will be a similar one in Montpellier, France, October 15-18.
- VMworld
VMworld will be held in San Francisco, September 11-13. IBM is a top reseller of VMware software, and is proud to be a Platinum Sponsor for this event. Look for the panel discussion on "Storage Virtualization" which I am sure will include SAN Volume Controller.
- Meet the Storage Experts
Based on our successful product launch in Second Life back in April, we are now holding meetings every quarter to discuss various IBM System Storage topics. The next one will be September 20 on one of the IBM islands in Second Life. For those without travel budgets to go anywhere, the advantage of our "Second Life" events is that no travel is required; you can attend from the comfort of your work or home office location.
I will post updates on how to register for this event as soon as I know them.
- Virtual Worlds
Virtual Worlds Fall 2007 is on October 10-11, 2007 at the San Jose Convention Center. Sandy Kearney, IBM Global Director of IBM 3D Internet and Virtual Business, will be the keynote speaker. This will include discussion of Second Life.
I am sure there are others, but these are the ones in which I am aware of IBM's involvement. I'll be in Chicago next week, meeting with Sales Reps and Business Partners.
Enjoy the weekend!
technorati tags: IBM, SHARE, Storage, Networking, Symposium, VMworld, Sandy Kearney, DFSMS, Virtual, Worlds, Fall, 2007, Chicago, Secondlife
Stephen2615 over at RupturedMonkey asks Do more SAN related issues happen with blade enclosures? and shares some of his bad experiences related to HP Blades in B class enclosures. Others comment that they had similar experiences with their B class equipment.
The question is if this is unique or specific to these particular models, or if this affects all kinds of blade servers because of their very nature and architecture. Stephen indicates that they also have HP C class enclosures, but since they are still in test mode, cannot comment on them.
I have no experience with any of HP's blade servers, but I have worked closely with our IBM BladeCenter team to help make sure that our storage, and our SAN equipment, work well together with the BladeCenter, and more importantly, that problems can be diagnosed effectively.
When I asked why people feel they need to know the inner workings of storage, the overwhelming response was to help diagnose problems. This could include problems in placing related data on a potentially single point of failure, problems with performance, and problems communicating with 1-800-IBM-SERV.
So, if you have encountered problems diagnosing SAN problems with BladeCenter, or with setting up an IBM SAN with blade servers in general, I would be interested in hearing what IBM can do to make the situation better.
Perhaps I wrapped up my exploration of disk system performance one day too early. (While it is Friday here in Malaysia, it is still only Thursday back home.)
Barry Burke, EMC blogger (aka The Storage Anarchist) writes:
Aren't you mixing metrics here?
Miles per Gallon measures an effeciency ratio (amount of work done with a fixed amount of energy), not a speed ratio (distance traveled in a unit of time).
Given that IOPs and MB/s are the unit of "work" a storage array does, wouldn't the MPG equivalent for storage be more like IOPs per Watt or MB/s per Watt? Or maybe just simply Megabytes Stored per Watt (a typical "green" measurement)?
You appear to be intentionally avoiding the comparison of I/Os per Second and Megabytes per Second to Miles Per Hour?
May I ask why?
This is a fair question, Barry, so I will try to address it here.
It was not a typo, I did mean MPG (miles per gallon) and not MPH (miles per hour). It is always challenging to find an analogy that everyone can relate to when explaining concepts in Information Technology that might be harder to grasp. I chose MPG because it was closely related to IOPS and MB/s in four ways:
- MPG applies to all instances of a particular make and model. Before Henry Ford and the assembly line, cars were made one at a time, by a small team of craftsmen, and so there could be variety from one instance to another. Today, vehicles and storage systems are mass-produced in a manner that provides consistent quality. You can test one vehicle, and safely assume that all similar instances of the same make and model will have similar mileage. The same is true for disk systems: test one disk system and you can assume that all others of the same make and model will have similar performance.
- MPG has a standardized measurement benchmark that is publicly available. The US Environmental Protection Agency (EPA) is an easy analogy for the Storage Performance Council, providing the results of various offerings to choose from.
- MPG has usage-specific benchmarks to reflect real-world conditions. The EPA offers City MPG for the type of driving you do to get to work, and Highway MPG, to reflect the type of driving on a cross-country trip. These serve as a direct analogy to SPC having SPC-1 for Online transaction processing (OLTP) and SPC-2 for large file transfers, database queries and video streaming.
- MPG can be used for cost/benefit analysis. For example, one could estimate the amount of business value (miles travelled) for the amount of dollar investment (cost to purchase gallons of gasoline, at an assumed gas price). The EPA does this as part of their analysis. This is similar to the way IOPS and MB/s can be divided by the cost of the storage system being tested on SPC benchmark results. The business value of IOPS or MB/s depends on the application, but could relate to the number of transactions processed per hour, the number of music downloads per hour, or number of customer queries handled per hour, all of which can be assigned a specific dollar amount for analysis.
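To make the cost/benefit point concrete, here is a minimal Python sketch of the arithmetic. The system names, prices and IOPS figures are made up for illustration; they are not SPC results, and the function names are my own.

```python
# Hypothetical figures for illustration only; these are NOT actual SPC results.
systems = {
    "System A": {"price_usd": 500_000, "iops": 100_000},
    "System B": {"price_usd": 300_000, "iops": 50_000},
}


def dollars_per_iops(price_usd, iops):
    """Price-performance: dollars spent per I/O request per second."""
    return price_usd / iops


def miles_per_dollar(mpg, gas_price_per_gallon):
    """The vehicle analogue: miles travelled per dollar of gasoline."""
    return mpg / gas_price_per_gallon


price_performance = {
    name: dollars_per_iops(s["price_usd"], s["iops"])
    for name, s in systems.items()
}

# List the systems from cheapest to most expensive per unit of work.
for name, value in sorted(price_performance.items(), key=lambda kv: kv[1]):
    print(f"{name}: ${value:.2f} per IOPS")
```

In this made-up case the more expensive system is actually the better deal per unit of work, which is the kind of comparison a standardized benchmark makes possible.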
It seemed that if I was going to explain why standardized benchmarks were relevant, I should find an analogy that has similar features to compare to. I thought about MPH, since it is based on time units like IOPS and MB/s, but decided against it based on an earlier comment you made, Barry, about NASCAR:
Let's imagine that a Dodge Charger wins the overwhelming majority of NASCAR races. Would that prove that a stock Charger is the best car for driving to work, or for a cross-country trip?
Your comparison, Barry, to car-racing brings up three reasons why I felt MPH is a bad metric to use for an analogy:
- Increasing MPH, and driving anywhere near the maximum rated MPH for a vehicle, can be reckless and dangerous, risking loss of human life and property damage. Even professional race car drivers will agree there are dangers involved. By contrast, processing I/O requests at maximum speed poses no additional risk to the data, nor possible damage to any of the IT equipment involved.
- While most vehicles have top speeds in excess of 100 miles per hour, most Federal, State and Local speed limits prevent anyone from taking advantage of those maximums. Race-car drivers in NASCAR may be able to take advantage of the maximum MPH of a vehicle, but the rest of us can't. The government limits the speed of vehicles precisely because of the dangers mentioned in the previous bullet. In contrast, processing I/O requests at faster speeds poses no such dangers, so the government imposes no limits.
- Neither IOPS nor MB/s match MPH exactly. Earlier this week, I related IOPS to "Questions handled per hour" at the local public library, and MB/s to "Spoken words per minute" in those replies. If I tried to find a metric based on unit type to match the "per second" in IOPS and MB/s, then I would need to find a unit that equated to "I/O requests" or "MB transferred" rather than something related to "distance travelled".
In terms of time-based units, the closest I could come up with for IOPS was acceleration rate of zero-to-sixty MPH in a certain number of seconds. Speeding up to 60MPH, then slamming the brakes, and then back up to 60MPH, start-stop, start-stop, and so on, would reflect what IOPS is doing on a request by request basis, but nobody drives like this (except maybe the taxi cab drivers here in Malaysia!)
Since vehicles are limited to speed limits in normal road conditions, the closest I could come up with for MB/s would be "passenger-miles per hour", such that high-occupancy vehicles like school buses could deliver more passengers than low-occupancy vehicles with only a few passengers.
Neither start-stops nor passenger-miles per hour have standardized benchmarks, so they don't work well for comparison between vehicles. If you or anyone can come up with a metric that will help explain the relevance of standardized benchmarks better than the MPG that I already used, I would be interested in it.
You also mention, Barry, the term "efficiency" but mileage is about "fuel economy". Wikipedia is quick to point out that while the fuel efficiency of petroleum engines has improved markedly in recent decades, this does not necessarily translate into fuel economy of cars. The same can be said about disk systems: better internal bandwidth of the backplane between controllers and faster HDD do not necessarily translate to external performance of the disk system as a whole. You correctly point this out in your blog about the DMX-4:
Complementing the 4Gb FC and FICON front-end support added to the DMX-3 at the end of 2006, the new 4Gb back-end allows the DMX-4 to support the latest in 4Gb FC disk drives.
You may have noticed that there weren't any specific performance claims attributed to the new 4Gb FC back-end. This wasn't an oversight, it is in fact intentional. The reality is that when it comes to massive-cache storage architectures, there really isn't that much of a difference between 2Gb/s transfer speeds and 4Gb/s.
Oh, and yes, it's true - the DMX-4 is not the first high-end storage array to ship a 4Gb/s FC back-end. The USP-V, announced way back in May, has that honor (but only if it meets the promised first shipments in July 2007). DMX-4 will be in August '07, so I guess that leaves the DS8000 a distant 3rd.
This also explains why the IBM DS8000, with its clever "Adaptive Replacement Cache" algorithm, has such high SPC-1 benchmarks despite the fact that it still uses 2Gbps drives inside. Given that there is little difference between 2Gbps and 4Gbps on the back-end, why would it matter which vendor came first, second or third, and why call it a "distant 3rd" for IBM? How soon would IBM need to announce similar back-end support for it to be a "close 3rd" in your mind?
I'll wrap up with your excellent comment that Watts per GB is a typical "green" metric. I strongly support the whole "green initiative" and I used "Watts per GB" last month to explain how tape is less energy-consumptive than paper. I see on your blog you have used it yourself here:
The DMX-3 requires less Watts/GB in an apples-to-apples comparison of capacity and ports against both the USP and the DS8000, using the same exact disk drives
It is not clear if "requires less" means "slightly less" or "substantially less" in this context, and I have no facts from my own folks within IBM to confirm or deny it. Given that tape is orders of magnitude less energy-consumptive than anything EMC manufactures today, the point is probably moot.
I find it refreshing, nonetheless, to have agreed-upon "energy consumption" metrics to make such apples-to-apples comparisons between products from different storage vendors. This is exactly what customers want to do with performance as well, without necessarily having to run their own benchmarks or work with specific storage vendors. Of course, Watts/GB consumption varies by workload, so to make such comparisons truly apples-to-apples, you would need to run the same workload against both systems. Why not use the SPC-1 or SPC-2 benchmarks to measure the Watts/GB consumption? That way, EMC can publish the DMX performance numbers at the same time as the energy consumption numbers, and then HDS can follow suit for its USP-V.
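As a rough sketch of what such a combined report could look like, the Python below computes Watts/GB alongside the IOPS per Watt ratio that Barry suggested. All numbers are invented for illustration, and the assumption is that power is measured while the same benchmark workload runs on each system.

```python
def watts_per_gb(power_watts, capacity_gb):
    """Energy consumption metric: lower is better."""
    return power_watts / capacity_gb


def iops_per_watt(iops, power_watts):
    """Work done per unit of power: higher is better."""
    return iops / power_watts


# Hypothetical measurements, assumed to be taken under the same workload.
measurements = {
    "System A": {"power_watts": 8_000, "capacity_gb": 100_000, "iops": 120_000},
    "System B": {"power_watts": 6_000, "capacity_gb": 60_000, "iops": 60_000},
}

report = {
    name: {
        "watts_per_gb": watts_per_gb(m["power_watts"], m["capacity_gb"]),
        "iops_per_watt": iops_per_watt(m["iops"], m["power_watts"]),
    }
    for name, m in measurements.items()
}

for name, row in report.items():
    print(f"{name}: {row['watts_per_gb']:.3f} W/GB, "
          f"{row['iops_per_watt']:.1f} IOPS/W")
```

Publishing both columns from the same run would let customers weigh energy and performance together rather than from separate, incomparable tests.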
I'm on my way back to the USA soon, but wanted to post this now so I can relax on the plane.
technorati tags: IBM, EMC, Storage Anarchist, MPG, MPH, IOPS, NASCAR, Malaysia, Watts, GB, green, back-end, DMX-3, DMX-4, HDS, USP, USP-V, SPC, SPC-1, SPC-2, standardized, benchmarks, workload, DS8000, disk, storage, tape
Wrapping up this week's exploration on disk system performance, today I will cover the Storage Performance Council (SPC) benchmarks, and why I feel they are relevant to help customers make purchase decisions. This all started to address a comment from EMC blogger Chuck Hollis, who expressed his disappointment in IBM as follows:
You've made representations that SPC testing is somehow relevant to customers' environments, but offered nothing more than platitudes in support of that statement.
Apparently, while everyone else in the blogosphere merely states their opinions and moves on, IBM is held to a higher standard. Fair enough, we're used to that. Let's recap what we covered so far this week:
- Monday, I explained how seemingly simple questions like "Which is the tallest building?" or "Which is the fastest disk system?" can be steeped in controversy.
- Tuesday, I explored what constitutes a disk system. While there are special storage systems that include HDD that offer tape-emulation, file-oriented access, or non-erasable non-rewriteable protection, it is difficult to get apples-to-apples comparisons with storage systems that don't offer these special features. I focused on the majority of general-purpose disk systems, those that are block-oriented, direct-access.
- Wednesday, I explored two metrics to measure storage performance, I/O requests per second (IOPS) and Megabytes transferred per second (MB/s).
Today, I will explore ways to apply these metrics to measure and compare storage performance.
Let's take, for example, an IBM System Storage DS8000 disk system. This has a controller that supports various RAID configurations, cache memory, and HDD inside one or more frames. Engineers who are testing individual components of this system might run specific types of I/O requests to test out the performance or validate certain processing:
- 100% read-hit, this means that all the I/O requests are to read data expected to be in the cache.
- 100% read-miss, this means that all the I/O requests are to read data expected NOT to be in the cache, and must go fetch the data from HDD.
- 100% write-hit, this means that all the I/O requests are to write data into cache.
- 100% write-miss, this means that all the I/O requests are to bypass the cache, and are immediately de-staged to HDD. Depending on the RAID configuration, this can result in actually reading or writing several blocks of data on HDD to satisfy this I/O request.
These are known affectionately in the industry as the "four corners" test, because you can show them on a box, with writes on the left, reads on the right, hits on the top, and misses on the bottom. Engineers are proud of these results, but these workloads do not reflect any practical production workload. At best, since all I/O requests are one of these four types, the four corners provide an expectation range from the worst performance (most often write-miss in the lower left corner) to the best performance (most often read-hit in the upper right corner) you might get with a real workload.
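Here is a minimal Python sketch of that "expectation range" idea. The corner results are hypothetical, and the blending model is my own simplification: it assumes each request type takes, on average, 1/IOPS seconds and that the types do not interfere with each other, which a real disk system will not honor exactly.

```python
# Hypothetical four-corners results in IOPS; not measurements of any product.
corners = {
    "read_hit": 200_000,
    "read_miss": 40_000,
    "write_hit": 100_000,
    "write_miss": 20_000,
}


def blended_iops(corners, mix):
    """Estimate IOPS for a workload mix under a simple additive-time model.

    `mix` maps each corner to the fraction of requests of that type.
    Assumption: average time per request is the mix-weighted sum of
    1/IOPS for each corner (a weighted harmonic mean of the corners).
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix fractions must sum to 1")
    seconds_per_io = sum(frac / corners[kind] for kind, frac in mix.items())
    return 1.0 / seconds_per_io


# A made-up production-like mix of the four request types.
mix = {"read_hit": 0.4, "read_miss": 0.3, "write_hit": 0.2, "write_miss": 0.1}

estimate = blended_iops(corners, mix)
worst, best = min(corners.values()), max(corners.values())
print(f"expected range: {worst} to {best} IOPS, estimate {estimate:.0f}")
```

Whatever the mix, an estimate from this model always lands between the worst and best corners, which is all the four corners can really tell you about a real workload.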
To understand what is needed to design a test that is more reflective of real business conditions, let's go back to yesterday's discussion of fuel economy of vehicles, with mileage measured in miles per gallon. The How Stuff Works website offers the following description for the two measurements taken by the EPA:
- City MPG
The "city" program is designed to replicate an urban rush-hour driving experience in which the vehicle is started with the engine cold and is driven in stop-and-go traffic with frequent idling. The car or truck is driven for 11 miles and makes 23 stops over the course of 31 minutes, with an average speed of 20 mph and a top speed of 56 mph.
- Highway MPG
The "highway" program, on the other hand, is created to emulate rural and interstate freeway driving with a warmed-up engine, making no stops (both of which ensure maximum fuel economy). The vehicle is driven for 10 miles over a period of 12.5 minutes with an average speed of 48 mph and a top speed of 60 mph.
Why two different measurements? Not everyone drives in a city in stop-and-go traffic. Having only one measurement may not reflect the reality that you may travel long distances on the highway. Offering both city and highway measurements allows the consumers to decide which metric relates closer to their actual usage.
Should you expect your actual mileage to be the exact same as the standardized test? Of course not. Nobody drives exactly 11 miles in the city every morning with 23 stops along the way, or 10 miles on the highway at the exact speeds listed. The EPA's famous phrase "your mileage may vary" has been quickly adopted into popular culture's lexicon. All kinds of factors, like weather, distance, and driving style can cause people to get better or worse mileage than the standardized tests would estimate.
Want more accurate results that reflect your driving pattern, in specific conditions that you are most likely to drive in? You could rent different vehicles for a week and drive them around yourself, keeping track of where you go, how fast you drive, and how many gallons of gas you purchase, so that you can then repeat the process with another rental, and so on, and then use your own findings to base your comparisons. Perhaps you find that your results are always 20% worse than EPA estimates when you drive in the city, and 10% worse when you drive on the highway. Perhaps you have many mountains and hills where you drive, you drive too fast, you run the Air Conditioner too cold, or whatever.
If you did this with five or more vehicles, and ranked them best to worst from your own findings, and also ranked them best to worst based on the standardized results from the EPA, you likely will find the order to be the same. The vehicle with the best standardized result will likely also have the best result from your own experience with the rental cars. The vehicle with the worst standardized result will likely match the worst result from your rental cars.
(This will be one of my main points, that standardized estimates don't have to be accurate to be useful in making comparisons. The comparisons and decisions you would make with estimates are the same as you would have made with actual results, or customized estimates based on current workloads. Because the rankings are in the same order, they are relevant and useful for making decisions based on those comparisons.)
Most people shopping around for a new vehicle do not have the time or patience to do this with rental cars. They can use the EPA-certified standardized results to make a "ball-park" estimate on how much they will spend on gasoline per year, to consider only cars that can go a certain distance between two cities on a single tank of gas, or merely to rank the vehicles being considered. While mileage may not be the only metric used in making a purchase decision, it can certainly be used to help reduce your consideration set and factor in with other attributes, like number of cup-holders, or leather seats.
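The ranking argument can be sketched in a few lines of Python. The vehicles and mileage figures below are invented, and the "always 20% worse" personal results are the simplifying assumption from the rental-car story above.

```python
# Hypothetical standardized city mileage for five vehicles (not EPA data).
epa_city_mpg = {"Car A": 22, "Car B": 31, "Car C": 18, "Car D": 27, "Car E": 25}


def rank_best_to_worst(mileage):
    """Return vehicle names ordered from highest to lowest MPG."""
    return sorted(mileage, key=mileage.get, reverse=True)


# Assumption: my own driving is consistently 20% worse than the estimate.
my_city_mpg = {car: mpg * 0.80 for car, mpg in epa_city_mpg.items()}

epa_ranking = rank_best_to_worst(epa_city_mpg)
my_ranking = rank_best_to_worst(my_city_mpg)

print("EPA ranking:", epa_ranking)
print("My ranking: ", my_ranking)
print("Same order: ", epa_ranking == my_ranking)
```

None of the personal numbers match the standardized ones, yet the order is identical. If your gap varied a lot from vehicle to vehicle, two closely matched cars could swap places, but the best and worst would still be easy to tell apart.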
In this regard, the Storage Performance Council has developed two benchmarks that attempt to reflect normal business usage, similar to "City" and "Highway" driving measurements.
SPC-1 consists of a single workload designed to demonstrate the performance of a storage subsystem while performing the typical functions of business critical applications. Those applications are characterized by predominately random I/O operations and require both queries as well as update operations. Examples of those types of applications include OLTP, database operations, and mail server implementations.
SPC-2 consists of three distinct workloads designed to demonstrate the performance of a storage subsystem during the execution of business critical applications that require the large-scale, sequential movement of data. Those applications are characterized predominately by large I/Os organized into one or more concurrent sequential patterns. A description of each of the three SPC-2 workloads is listed below as well as examples of applications characterized by each workload.
- Large File Processing: Applications in a wide range of fields, which require simple sequential process of one or more large files such as scientific computing and large-scale financial processing.
- Large Database Queries: Applications that involve scans or joins of large relational tables, such as those performed for data mining or business intelligence.
- Video on Demand: Applications that provide individualized video entertainment to a community of subscribers by drawing from a digital film library.
The SPC-2 benchmark was added when people suggested that not everyone runs OLTP and database transactional update workloads, just as the "Highway" measurement was added to address the fact that not everyone drives in the City.
If you are one of the customers out there willing to spend the time and resources to do your own performance benchmarking, either at your own data center, or with the assistance of a storage provider, I suspect most, if not all, of the major vendors (including IBM, EMC and others), and perhaps even some of the smaller start-ups, would be glad to work with you.
If you want to gather performance data from your actual workloads, and use this to estimate how your performance might be with a new or different storage configuration, IBM has tools to make these estimates, and I suspect (again) that most, if not all, of the other storage vendors have developed similar tools.
For the rest of you who are just looking to decide which storage vendors to invite on your next RFP, and which products you might like to investigate that match the level of performance you need for your next project or application deployment, the SPC benchmarks might help you with this decision. If performance is important to you, factor these benchmark comparisons in with the rest of the attributes you are looking for in a storage vendor and a storage system.
In my opinion, the SPC benchmarks provide some value in this decision-making process for some people. They are proportionally correct: even if your workload gets only a portion of the SPC estimate, storage systems with higher benchmark results will provide you better performance than storage systems with lower benchmark results. That is why I feel they can be relevant in making valid comparisons for purchase decisions.
Hopefully, I have provided enough "food for thought" on this subject to support why IBM participates in the Storage Performance Council, why the performance of the SAN Volume Controller can be compared to the performance of other disk systems, and why we at IBM are proud of the benchmark results in our recent press release.
Enjoy the weekend!
technorati tags: IBM, SPC, EMC, Chuck Hollis, fastest, disk, system, SVC, HDD, storage, four corners, read-hit, read-miss, write-hit, write-miss, City, Highway, MPG, OLTP, SPC-1, SPC-2, benchmarks, file, database, video
Continuing our exploration this week into the performance of disk systems, today I will cover the metrics to measure performance. Why do people have metrics?
- Help provide guidance in decision making prior to purchase
- Help manage your current environment
- Help drive changes
Several bloggers suggested that perhaps an analogy to vehicles would be reasonable, given that cars and trucks are expensive pieces of engineering equipment, and people make purchase decisions between different makes and models.
In the United States, the Environmental Protection Agency (EPA) is the government entity responsible for measuring the fuel economy of vehicles, using the metric Miles Per Gallon (mpg). Specifically, these are U.S. miles (not nautical miles) and U.S. gallons, not imperial gallons. It is important when defining metrics that you are precise about the units involved.
Since nearly all vehicles burn gallons of gasoline and travel miles of distance, this is a great metric to use for comparing all kinds of vehicles, including motorcycles, cars, trucks and airplanes. The EPA has a fuel economy website to help people make these comparisons. Manufacturers are required by law to post their vehicles' fuel-economy ratings, as certified by the EPA, on the window stickers of nearly every new vehicle sold in the U.S. Vehicles with gross-vehicle-weight ratings over 8,500 pounds are the exception.
What about storage performance? What could we use as the "MPG"-like metric that would allow you to compare different makes and models of storage?
The two most commonly used are I/O requests per second (IOPS) and Megabytes transferred per second (MB/s). To understand the difference between them, let's go back to our analogy from yesterday's post.
(A woman calls the local public library. She picks up the phone, and dials the phone number of the one down the street. A man working at the library hears the phone ring, answers it with "Welcome to the Public Library! How can I help you?" She asks "What is the capital city of Ethiopia?" He replies "Addis Ababa" and hangs up. Satisfied with this response, she hangs up. In this example, the query for information was the I/O request, initiated by the lady, to the public library target)
In this example, it might have only taken 1 second to actually provide the answer, but it might have taken 10-30 seconds to pick up the phone, hear the request, respond, and then hang up the phone. If one person is able to do this in 10 seconds, on average, then he can handle 360 questions per hour. If another person takes 30 seconds, then he can handle only 120 questions per hour. Many business applications read or write less than 4KB of information per I/O request, and as such the dominant factor is not the amount of time to transfer the data, but how quickly the disk system can respond to each request. IOPS is very much like counting "Questions handled per hour" at the public library. To be more specific on units, we may state the block size of the request, say 512 bytes or 4096 bytes, to make comparisons consistent.
Now suppose that instead of asking for something with a short answer, you ask the public library to read you an article from a magazine, list all the movies and show times at a local theatre, or recite a work from Shakespeare. In this case, the time it took to pick up the phone and respond is very small compared to the time it takes to deliver the information, and performance could be measured instead in words per minute. Some employees of the library may be faster talkers, having perhaps worked in auction houses in a prior job, and can deliver more words per minute than other employees. MB/s is very much like counting "Spoken words per minute" at the public library. To be more specific on units, we may request a specific amount of information, say the words contained in "Romeo and Juliet", to make comparisons consistent.
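The "questions per hour" arithmetic can be written as a tiny Python sketch. The function name is my own invention for illustration; the 10-second and 30-second figures are the ones from the library example above.

```python
SECONDS_PER_HOUR = 3600


def requests_per_hour(seconds_per_request):
    """How many requests one worker can handle back to back in an hour."""
    return SECONDS_PER_HOUR / seconds_per_request


fast_librarian = requests_per_hour(10)
slow_librarian = requests_per_hour(30)

print("10 seconds per question:", fast_librarian, "questions per hour")
print("30 seconds per question:", slow_librarian, "questions per hour")
```

IOPS is the same calculation with seconds as the time unit instead of hours: one request at a time, divided into the time each request takes.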
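A rough Python sketch shows why the short question and the long recital call for different metrics. All three constants are assumptions chosen for illustration: a 10-second fixed overhead per call, a speaking rate of 150 words per minute, and a round 25,000 words standing in for the length of a play.

```python
# Assumed values for illustration only.
CALL_OVERHEAD_SECONDS = 10.0   # pick up, listen, hang up
WORDS_PER_MINUTE = 150.0       # speaking rate of the librarian


def delivery_seconds(word_count):
    """Time spent actually speaking the answer."""
    return word_count / WORDS_PER_MINUTE * 60.0


def overhead_fraction(word_count):
    """Share of the total call time that is fixed overhead."""
    total = CALL_OVERHEAD_SECONDS + delivery_seconds(word_count)
    return CALL_OVERHEAD_SECONDS / total


short_answer = overhead_fraction(2)        # e.g. "Addis Ababa"
long_recital = overhead_fraction(25000)    # assumed length of a full play

print(f"Short answer: {short_answer:.1%} of the call is overhead")
print(f"Long recital: {long_recital:.1%} of the call is overhead")
```

When overhead dominates, counting requests (IOPS) tells you the most. When delivery dominates, the transfer rate (MB/s) is the number to watch.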
Now that we understand the metrics involved, tomorrow we can discuss how to use these in the measurement process.
technorati tags: IBM, disk, systems, EPA, MPG, mileage, fuel, economy, IOPS, MB/s, Shakespeare, vehicles, cars, trucks, motorcycles, airplanes