Shakespeare wrote "What's in a name? That which we call a rose by any other word would smell as sweet." This week my theme will be names, naming conventions, and how we access information on storage.
Take for example these two sentences:
The Bears beat New Orleans.
Though they appear very different, football fans who might have watched either or both of the two conference title games yesterday would quickly recognize that they refer to the same two teams and the same end-result.
I'll be traveling to Asia next week. While most people call me "Tony", my legal given name is "Anthony" which is what appears on my passport and other legal documents. Most English-speaking countries handle this fine, but it can be confusing in Japan or China, where "A. Pearson" doesn't match "T. Pearson".
In the US, our given and family names are referred to as our "first name" and our "last name", relating to their positional sequence. In Asia, family names come first, followed by given names. To help avoid confusion, we have started adopting the practice of putting the family name in ALL CAPITAL LETTERS, so I would be "Tony PEARSON" while my colleague may be "WONG Francis".
In Japanese, "Mr. JONES" would be "Jones-san". However, Pearson-san is such a tongue-twister that most just say "Tony-san", which is fine with me. I have been called "Mr. Tony" in a variety of countries, which is perfectly acceptable.
You can call me anything you like, just don't call me late for dinner.
In his blog post on preparation, Seth Godin mentioned an appropriate Swedish saying:
There is no bad weather, just bad clothing.
Appropriate because it snowed here in Tucson, Arizona on Sunday evening, leaving many of us here figuring out how to drive through the stuff on Monday. In my entire lifetime, I have only witnessed snow down in the Tucson valley a handful of times. It got me thinking about coats, and the wonderful schemes for coat check rooms, as an analogy for data access. A lot of people ask me to compare and contrast one technology with another, say block-level virtualization with content-addressable storage, and so on, and I always try to find a good analogy to help explain things.
Let's start with the setting. It is snowing outside and people are wearing coats. When they come inside, they check their coats at a coat check room, a large room with rows and rows of racks with hangers. A coat check attendant takes your coat and puts it on a hanger, and gives you a ticket or other identifier that will allow you to retrieve your coat later. The ticket must have sufficient information to retrieve the coat quickly, rather than searching rows and rows of hangers for it.
A problem arises when you generate "hash codes" for storage. It is possible for two different pieces of data to resolve to the same hash code. When an application tries to write a piece of data, and it resolves to a hash code that already exists, that is called a collision. One response is to compare the incoming data to the data that is already stored and confirm they are identical, but that can be time consuming. The other response is to just assume they are identical and reject the second copy, a process often referred to as "de-duplication".
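To make the two responses concrete, here is a minimal Python sketch of a toy content-addressable store. The names `ContentStore`, `sha256_ticket`, `weak_ticket`, and the `verify` flag are my own illustrations, not any product's API, and `weak_ticket` is a deliberately poor hash chosen only so that a collision is easy to trigger.

```python
import hashlib


def sha256_ticket(data: bytes) -> str:
    """Hash code from a real cryptographic hash (collisions are extremely unlikely)."""
    return hashlib.sha256(data).hexdigest()


def weak_ticket(data: bytes) -> str:
    """Deliberately poor hash (illustration only): just the first byte."""
    return str(data[0]) if data else "empty"


class ContentStore:
    """Toy content-addressable store: the ticket is derived from the data itself."""

    def __init__(self, hash_fn=sha256_ticket, verify=True):
        self.hash_fn = hash_fn
        self.verify = verify   # True: compare bytes whenever a hash code already exists
        self.racks = {}        # hash code -> stored bytes

    def write(self, data: bytes) -> str:
        ticket = self.hash_fn(data)
        existing = self.racks.get(ticket)
        if existing is None:
            self.racks[ticket] = data  # first time this hash code is seen
        elif self.verify and existing != data:
            # Same hash code, different contents: a genuine collision.
            raise ValueError("collision: different data, same hash code")
        # Otherwise the write is treated as a duplicate and only one copy is kept.
        return ticket

    def read(self, ticket: str) -> bytes:
        return self.racks[ticket]
```

With `verify=False` and the weak hash, writing `b"coat"` and then `b"cape"` returns the same ticket both times, and reading it back yields `b"coat"`: the second coat is silently gone. With `verify=True` the second write raises an error instead, at the cost of a byte-by-byte comparison on every hash match.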
What's the chance of getting a collision for data that is really different? Let's take for example the famous Birthday paradox. Suppose the coat check room assigned the hanger based on your birthday (month and day). How many coats before you run the risk of having two people turn in coats with the same birthday? After only 23 people, the likelihood is just over 50%. At 60 people, it goes up to over 99%.
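The birthday figures can be checked with a few lines of Python. This is a sketch under the usual simplifying assumptions: 365 equally likely birthdays, no leap day, and independent guests. The function name `collision_probability` is mine.

```python
def collision_probability(people: int, slots: int = 365) -> float:
    """Chance that at least two of `people` land on the same one of `slots` hangers."""
    p_all_different = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k hangers already taken.
        p_all_different *= (slots - k) / slots
    return 1.0 - p_all_different


for n in (23, 60):
    print(f"{n} people: {collision_probability(n):.1%}")
# 23 people: 50.7%
# 60 people: 99.4%
```

A real hash code has vastly more possible values than 365, so collisions are far rarer, but the same arithmetic applies: the risk grows with the number of pairs of stored objects, not just with the number of objects.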
For this reason, IBM does not offer content-addressable storage. For non-erasable, non-rewriteable storage, the IBM System Storage DR550 requires the application to give each object a name, and that name is then used to store the data, eliminating the possibility that data might accidentally be thrown away.
It's safer that way.
technorati tags: Seth Godin, Swedish, saying, bad, weather, clothing, snow, Tucson, coat, check, room, IBM, block-based, disk, storage, DR550, N series, NAS, healthcare, life sciences, grid, medical, archive, solution, GMAS, cont
Happy New Year!
This year I resolve to be more consistent in my blogging, and my goal is to give you one to five entries per week, every week, based on the advice from Glenn Wolsey, Jennette Banks, and others. On some weeks, I will have a running theme, so rather than writing super-long entries to cover everything I can think of on a topic, I can keep the entries short and readable. This week is a good time to review last year's "New Year's Resolutions" and to make new ones for 2007. I will discuss actions that companies can adopt for their data centers.
A common resolution is to lose weight, as in this Dilbert comic. Last year, I resolved to lose weight in 2006, and am delighted with myself that I lost eight pounds. When people ask for the secret of my success, I whisper in their ear "Eat less, exercise more." In general, people (and companies) know what to do, but just don't do it, which Pfeffer and Sutton document in their book The Knowing-Doing Gap. In my case, it involved lifestyle change: I exercised at a gym three times per week in Tucson, with a personal trainer, and revamped my diet.
Not everyone subscribes to the "eat less exercise more" philosophy. For example, Ric Watson argues in his blog that you can eat fewer calories, but eat more in actual volume, by choosing the right foods. This brings up the issues of "metrics" that most data centers are familiar with. Last year, I read the book "You: On a Diet" which explains that it is better to focus on "waist reduction" as measured in inches around your mid-section at the belly button, than "weight reduction" as measured in pounds. This year, I resolve to get down to 35 inches by the end of 2007.
The problem with measuring "weight" is that you are weighing bones, muscle and fat. A person can gain ten pounds of muscle, lose ten pounds of fat, and the scale would indicate no progress. The same problem occurs in data centers. How many TB of data do you have? Storage admins can easily tell you, but can they tell you how much of this is bone (data needed for operating infrastructure), muscle (data used in daily operations that generates revenue) or fat (obsolete or orphaned data)?
We at IBM often state that "Information Lifecycle Management (ILM)" is more a lifestyle change than a "fad diet". Figuring out what data you should capture in the first place, where to place it, when to move it, and when to get rid of it, is more important than just buying different tiers of storage hardware. So, for those looking to make new data center resolutions, I suggest the following actions:
Continuing this week's theme of New Year's Resolutions for the data center, today we'll talk about one that people don't always think about on a personal level, that is to hone your tools and skills.
A long time ago, I used to be a regular speaker at the SHARE user group conference. One of the most attended sessions was Sam Golob presenting the latest CBT Tape set of tools. Over time, this large collection of "mainframe shareware" was handed out on 3480 tape cartridges, then on CDs, and finally made downloadable off the web. Sam's main point, which I remember to this day, was that everyone who has a job should figure out what tools they use, keep those tools functioning properly, and learn to use them well.
Later, I took some cooking classes at a culinary school. Among other things, we learned:
This last point hits close to home, as many people like me have too many tools that they do not use often enough to know how to use them well. Do I really need my strawberry corer, garlic press, or a tray designed for the storage and delivery of deviled eggs?
The same could be said about software tools. What tools do you use in your job? Do you feel you know how to take full advantage of their power and capabilities? If you develop software, do you know all the features of your debugging tools? If you develop advertising or marketing materials, do you know all the features of your photo or video editing software? If you manage storage in a data center, do you know all the tools for managing your storage area network (SAN), disk systems, tape libraries, and reporting tools to identify all of your files and databases across your entire IT environment? I would not be surprised if you could replace a whole mess of tools with just one, such as the IBM TotalStorage Productivity Center.
Wrapping up this week's theme of New Year's Resolutions for the data center, the New York Times argues we should go easy on the resolutions, so I'll conclude with reducing stress. Lighten up! Relax, and try not to take your job so seriously.
(I know you're probably thinking, "That's easy for you to say, Mr. paid
technorati tags: New Years, resolutions, reducing stress, laughter, Tucson Laughter Club, Laughter Yoga, Sun, StorageTek, Kodak, Work/Life Balance, sleep, blogfights, assertive, music, LifeHacker, Live365, Pink Noise