I didn't really have a theme this week, still recovering from jet-lag from my travels through Japan, Australia, China.
Gary Diskman has an amusing blog entry about a Funny disaster recovery job posting. It is not clear if he is being completely tongue-in-cheek, or a bit cynical. However, it rings true that you get what you measure, and some managers look for easy metrics, even if there are unintended consequences.
Western medicine works this way. Rather than paying your doctor to keep you healthy, you pay him per visit, to get refills on prescriptions, check-ups on medical conditions, surgeries and so on. While Eastern medicine is focused on keeping people healthy, Western medicine profits more from resolving "situations".
I have seen similar situations on the "health" of the data center. In one case, the admins were measured on how quickly they bring back up their web-servers after a crash. They had this process down to a science, because they were measured on how quickly they resolved the situation. I suggested switching from Windows to Linux, a much more reliable operating system for web-serving, and showed examples of web-servers running Linux that have been up for 1000 days or more. Management changed the metrics to "average up-time in days" and magically the re-boots all but disappeared, thanks to Linux, but also thanks in part to shifting the incentive structure. Perhaps some of those earlier situations were "artificially created"?
Back in the 1980s, I was working on a small software project that was about 5000 lines of code. In those days, testers were measured by the number of "successful" testcases that ran without incident. Testcases that uncovered an error were labeled as "failures" to be re-run after the developers fixed the code. When I declared my code ready for test, the test team ran 110 testcases, all successfully, and they were all rewarded for meeting their schedule. I, on the other hand, did not accept these results, met with them and told them I would give them $100 each if they could find a bug in my code in the next week. Nobody writes 5000 lines of code without some error along the way, not even me. (As one author put it, more people have left earth's gravity to orbit the planet than have written perfect code that did not require subsequent review or testing. It's so true. Good software is difficult to write.)
The test team accepted the challenge, and found 6 problems, more than I expected, but at least I felt more confident of the code quality after fixing these. As I suspected, the unintended consequence of counting "successful" testcases was that testers would write the most simple, basic, leas So, my advice is to determine metrics that have the intended consequences you want, while avoiding any negative unintented consequences that might undermine your eventual success. People will quickly figure out how to maximize the results, and if you can align their goals to company goals, then everybody benefits. Well, I'll be blogging from Mexico next week (yes, it is a business trip!). Enjoy the weekend.
So, my advice is to determine metrics that have the intended consequences you want, while avoiding any negative unintented consequences that might undermine your eventual success. People will quickly figure out how to maximize the results, and if you can align their goals to company goals, then everybody benefits.
Well, I'll be blogging from Mexico next week (yes, it is a business trip!). Enjoy the weekend.
Comments (7) Visits (12050)
I am still wiping the coffee off my computer screen, inadvertently sprayed when I took a sip while reading HDS' uber-blogger Hu Yoshida's post on storage virtualization andvendor lock-in. This blog appears to be the text version of theirfunny video.
While most of the post is accurate and well-stated, two opinions particular caught my eye. I'll be nice and call them opinions, since these are blogs, and always subject to interpretation. I'll put quotes around them so that people will correctly relate these to Hu, and not me.
"Storage virtualization can only be done in a storage controller. Currently Hitachi is the only vendor to provide this."
Hu, I enjoy all of your blog entries, but you should know better. HDS is fairly new-comer to the storage virtualization arena, so since IBM has been doing this for decades, I will bring you and the rest of the readers up to speed. I am not starting a blog-fight, just want to provide some additional information for clients to consider when making choices in the marketplace.
First, let's clarify the terminology. I will use 'storage' in the broad sense, including anything that can hold 1's and 0's, including memory, spinning disk media, and plastic tape media. These all have different mechanisms and access methods, based on their physical geometry and characteristics. The concept of 'virtualization' is any technology that makes one set of resources look like another set of resources with more preferable characteristics, and this applies to storage as well as servers and networks. Finally, 'storage controller' is any device with the intelligence to talk to a server and handle its read and write requests.
Second, let's take a look at all the different flavors of storage virtualization that IBM has developed over the past 30 years.
So, bottom line, storage virtualization can, and has, been delivered in the operating system software, in the server's host bus adapter, inside SAN switches, and in storage controllers. It can be delivered anywhere in the path between application and physical media. Today, the two major vendors that provide disk virtualization "in the storage controller" are IBM and HDS, and the three major vendors that provide tape virtualization "in the storage controller" are IBM, Sun/STK, and EMC. All of these involve a mapping of logical to physical resources. Hitachi uses a one-for-one mapping, whereas IBM additionally offers more sophisticated mappings as well.
In case you haven't noticed, IBM System Storage makes most of their announcements on Tuesdays. IBM announced a lot today, so here is a quick run-down.
IBM continues its market leadership with these new set of features and offerings!
I am back from China, and now glad to be back in the old USA. Last week, someone asked me what would it take to add a specific feature to the IBM System Storage DS8300. The what-would-it-take question is well-known among development circles informally as a "sizing" effort, or more formally as "Development Expense" estimate.
For software engineering projects, the process was simply that an architect would estimate the number of "Lines of Code" (LOC) typically represented in thousands of lines of code (KLOC). This single number would convert to another single number, "person-months", which would then translate to another single number "dollars". Once you had KLOC, the rest followed directly from a formula, average or rule-of-thumb.
More amazing is that this single number could then determine a variety of other numbers, the number of total months for the schedule, the number of developers, testers, publication writers and quality assurance team members needed, and so on. Again, these were developed using a formula, developed and based on past experience of similar projects.
Earlier in my career, I was the lead architect for DFSMS for the z/OS operating system, and later for IBM TotalStorage Productivity Center, performing these sizing efforts. A famous IBM architect, Frederick P. Brooks, wrote a now-classic book that was requiredreading when I started at IBM, which just was re-released as Mythical Man-Month: Essays in Software Engineering, 20th Anniversary Edition. In addition to sound advice, he alsooffered a formula or two that helps with these estimating tasks.
Hardware design introduces a different set of challenges. When I was getting my Masters Degree in Electrical Engineering, it took myself and four other grad students a full semester just to design a six-layer, 900 transistor silicon chip, which could only perform a single function, multiply two numbers together.At IBM, another book that I was given to read was Soul of a New Machine, documenting six hardware engineers, and six software engineers, working long hours on a tight schedule to produce a new computer for Data General.
So why do I bring this up now? IBM architects William Goddard and John Lynott are being inducted posthumously this year into the prestigious National Inventors Hall of Fame for their disk system innovation.
Under the leadership of Reynold Johnson, the team developed an air-bearing head to “float” above the disk without crashing into the disk. Imagine a fighter airplane flying full speed across the country-side at 50 feet off the ground. If you every heard the term "my disk crashed", it was originally referring to the read/write head touching the disk surface, causing terrible damage.
A uniformly flat disk surface was created by spinning the coating onto the rapidly rotating disk, leaving many wearing lab coats covered with disk liquid at waist level. Developing disk-to-disk and track-to-track access mechanisms proved more challenging, and nearly halted the project. The team, however, was adamant that this problem could be solved, and customers were increasingly asking for random access technology. The result was the "350 Disk Storage Unit" designed for the "305 RAMAC computer", which I have talked about a lot last year as part of our "50 years of disk systems innovation" celebration.
Neither Goddard nor Lynott had computing experience prior to joining IBM. Goddard was a former science teacher who briefly worked in aerospace. Lynott had been a mechanic in the Navy and later a mechanical engineer. They didn't have a nice formula based on past experience, they didn't have the benefit of Fred Brooks' advice, or the rules-of-thumb or averages now used to estimate the size of projects. They had to break new ground.
Now that's innovation!
technorati tags: IBM, DS8300, disk, KLOC, sizing, estimate, DFSMS, z/OS, TotalStorage Productivity Center, Frederick Brooks, William Goddard, John Lynott, Mythical Man-Month, Reynold Johnson, RAMAC, 305, 350,[Read More]
Comment (1) Visits (4742)
Wrapping up my week in China, I read an article by Li Xing in the local "China Daily" about energy efficiency in buildings. She argues that it is not enough for a building to be energy-efficient on its own, but you have to consider the impact of the other buildings around. Does it reflect the sun so harshly into neighboring windows that people are forced to put up blinds and use artificial light? Does it block the sun, so that rooms that previously could be used with natural sunlight must now be artificially lit?
A similar effect happens with power and cooling in the data center. Servers and storage systems generate heat, and that heat affects all the other equipment in the data center. IBM has the most power-efficient and heat-efficient servers and storage, but that is not enough. You have to consider the heat generated by all systems that might raise overall temperature.
This is what motivated IBM to deliver the IBM Rear Door Heat eXchanger, a member of IBM's CoolBlue(tm) portfolio.
According to a press release:
Research has indicated that water can remove far more heat per volume unit than air. For example, in order to disperse 1,000 watts, with 10 degree temperature difference, only 24 gallons of water per hour is needed, while the same space would require nearly 11,475 cubic feet of air. IBM's Rear Door Heat eXchanger helps keep growing datacenters at safe temperatures, without adding AC units. The unobtrusive solution brings more cooling capacity to areas where heat is the greatest -- around racks of servers with more powerful and multiple processors.
The eXchanger works on standard 42U racks, and can help clients deal with the rapid growth of rack-mounted servers and storage on their raised floor. How cool is that!