Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is a Master Inventor and Senior IT Specialist for the IBM System Storage product line at the
IBM Executive Briefing Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2011, Tony celebrated his 25th year anniversary with IBM Storage on the same day as the IBM's Centennial. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services. You can also follow him on Twitter @az990tony.
(Short URL for this blog: ibm.co/Pearson
It's official! My "blook" Inside System Storage - Volume I is now available.
This blog-based book, or “blook”, comprises the first twelve months of posts from this Inside System Storage blog,165 posts in all, from September 1, 2006 to August 31, 2007. Foreword by Jennifer Jones. 404 pages.
IT storage and storage networking concepts
IBM strategy, hardware, software and services
Disk systems, Tape systems, and storage networking
Storage and infrastructure management software
Second Life, Facebook, and other Web 2.0 platforms
IBM’s many alliances, partners and competitors
How IT storage impacts society and industry
You can choose between hardcover (with dust jacket) or paperback versions:
This is not the first time I've been published. I have authored articles for storage industry magazines, written large sections of IBM publications and manuals, submitted presentations and whitepapers to conference proceedings, and even had a short story published with illustrations by the famous cartoon writer[Ted Rall].
But I can say this is my first blook, and as far as I can tell, the first blook from IBM's many bloggers on DeveloperWorks, and the first blook about the IT storage industry.I got the idea when I saw [Lulu Publishing] run a "blook" contest. The Lulu Blooker Prize is the world's first literary prize devoted to "blooks"--books based on blogs or other websites, including webcomics. The [Lulu Blooker Blog] lists past year winners. Lulu is one of the new innovative "print-on-demand" publishers. Rather than printing hundredsor thousands of books in advance, as other publishers require, Lulu doesn't print them until you order them.
I considered cute titles like A Year of Living Dangerously, orAn Engineer in Marketing La-La land, or Around the World in 165 Posts, but settled on a title that matched closely the name of the blog.
In addition to my blog posts, I provide additional insights and behind-the-scenes commentary. If you go to the Luluwebsite above, you can preview an entire chapter in its entirety before purchase. I have added a hefty 56-page Glossary of Acronyms and Terms (GOAT) with over 900 storage-related terms defined, which also doubles as an index back to the post (or posts) that use or further explain each term.
So who might be interested in this blook?
Business Partners and Sales Reps looking to give a nice gift to their best clients and colleagues
Managers looking to reward early-tenure employees and retain the best talent
IT specialists and technicians wanting a marketing perspective of the storage industry
Mentors interested in providing motivation and encouragement to their proteges
Educators looking to provide books for their classroom or library collection
Authors looking to write a blook themselves, to see how to format and structure a finished product
Marketing personnel that want to better understand Web 2.0, Second Life and social networking
Analysts and journalists looking to understand how storage impacts the IT industry, and society overall
College graduates and others interested in a career as a storage administrator
And yes, according to Lulu, if you order soon, you can have it by December 25.
We had a great event today! This was a first-of-a-kind product launch, using Second Life as the medium. We invited IBM Business Partners, industry analysts and reporters from the Press to have their "avatars" in-world to watch us launch new tape systems, archive and retention systems, and disk systems announced this month.
Andy Monshaw, IBM System Storage General Manager, welcomed everyone to the event, and introduced our three speakers.He mentioned that this was a great innovative way to meet, collaborate and forge relationships without the carbon pollution associated with travel required by a more traditional face-to-face meeting. We had attendees from the USA, UK, Germany, Sweden, Italy, Colombia, and Brazil.
All the attendees were given a "goody bag" that contained IBM BP-logo clothing, animations and gestures to be used during the meeting.
Eric Buckley, one of our marketing managers for tape systems, introduced our complete line of LTO 4 tape systems, as wellas the TS7520 Virtualization Engine, a virtual tape library for Windows, UNIX and Linux servers. Eric had a virtual 3-Dversion of an LTO cartridge that is photo-realistic and dimensionally correct.
Funda Eceral, our solutions manager for archive and retention offerings, presented the new version of the IBM System Storage DR550, the DR550 file system gateway, and the IBM System Storage Multilevel Grid Archive Manager. At first we thought we would "pass the microphone" from speaker to speaker, but it turned out to be easier just to give all three speakers their own microphone.
Last, but not least, was David Tareen, marketing manager for disk systems, covering the entry-level DS3000 Express disk system bundles designed for our SMB client. David used a black-and-brown pointer stick to point out specific things on the charts.
After the presentations, Kristie Bell, VP of Marketing for IBM System Storage, hosted a Question & Answer (Q&A) panel.Avatars rose their left hand to indicate they had a question.
We thought it would be a good idea to have a few minutes at the end to socialize over a cup of coffee. This involved making a "coffee machine" that dispensed coffee, and the appropriate animations and gestures so that everyone could sip the coffee, and hold the coffee at waist level when they were talking.
The event was held upstairs in one of the conference rooms of the IBM Briefing Center, located on "IBM 8" island.Many people went to the ground floor to look at the many IBM System Storage products on display. Unlike a picture on a web-page, Second Life gives you a 3-D view that you can walk around each product, and get a feel for the size and shape of the hardware.
We had four photographers and camera-persons on hand to capture still shots, video, audio, and chat text, and are working now to combine them for marketing collateral. I want to thank the builders, script programmers, animators, clothing designers, speakers, editors, and channel enablement team for making this event such a great success!
Use more efficient disk media, such as high-capacity SATA disk drives
Both are great recommendations, but why limit yourself to what EMC offers? Your x86-based machines are only a subset of your servers,and disk is only a subset of your storage. IBM takes a more holistic approach, looking at the entire data center.
VMware is a great product, and IBM is its top reseller. But in addition to VMware, there are other solutions for the x86-based servers, like Xen and Microsoft Virtual Server. IBM's System p, System i, and System z product lines all support logical partitioning.
To compare the energy effectiveness of server virtualization, consider a metric that can apply across platforms. For example, for an e-mail server, consider watts per mailbox. If you have, say, 15,000 users, you can calculate how many watts you are consuming to manage their mailboxes on your current environment, and compare that with running them on VMware, or logical partitions on other servers. Some people find it surprising that it is often more cost-effective, and power-efficient, to run workloads on mainframe logical partitions (LPARs) than a stack of x86 servers running VMware.
More efficient Media
SATA and FATA disks support higher capacities, and run at slower RPM speeds, thus using fewer watts per terabyte.A terabyte stored on 73GB high-speed 15K RPM drives consumes more watts than the same terabyte stored using 500GB SATA.Chuck correctly identifies that tape is more power-efficient than disk, but then argues that paper is more power-efficient than tape. But paper is not necessarily more efficient than tape.
ESG analyst Steve Duplessie divides up data betweenDynamic vs. Persistent. The best place to put dynamic data is on disk, and here is where evaluation of FC/SAS versus SATA/FATA comes into play.Persistent data, on the other hand, can be stored on paper, microfiche, optical or tape media. All of these shelf-resident media consume no electricity, nor generate any heat that would require additional cooling.
A study by scientists at the Lawrence Berkeley National Laboratory titled High-Tech Means High-Efficiency: The Business Case for Energy Management in High-Tech Industries indicates thatData centers consume 15 to 100 times more energy per square foot than traditional office space. Storing persistent data in traditional office space can save a huge amount of energy. Steve Duplessie feels the ratio of dynamic to persistent data is 1:10 today, but is likely to grow to 1:100 in the near future, raising the demand for energy-efficient storage of persistent data ever more important to our environment.
Data centers consume nearly 5000 Megawatts in the USA alone, 14000 Megawatts worldwide. To put that in perspective, the country of Hungary I was in last week can generate up to 8000 Megawatts for the entire country (and they were using 7400 Megawatts last week as a result of their current heat wave, causing them grave concern).
Back in the 1990's, one of the insurance companies IBM worked with kept data on paper in manila folders, and armiesof young adults in roller skates were dispatched throughout the large warehouses of shelves to get the appropriate folder in response to customer service inquiries. Digitizing this paper into electronic format greatly reduced the need for this amount of warehouse space, as well as improved the time to retrieve the data.
A typical file storage box (12 inch x 12 inch x 18 inch) containing typed pages single-spaced, double-sided, 12 point font could hold perhaps 100MB. The same box could hold a hundred or more LTO or 3592 tape cartridges, each storing hundreds of GB of information. That's a million-to-one improvement of space-efficiency, and from a watts-per-TB basis, translates to substantial improvement in standard office air conditioning and lighting conditions.
To learn more about IBM's Project Big Green, watch thisintroductory video which used Second Life for the animation.
Today was the "First Ever Live Virtual Virtualization Tech Fair" sponsored by IBM and VMware. This was a 1-day event hosted by Unisfair.
The day included presentations done at a conference call, along with exhibition booths.
We had an exhibition booth exclusively for "storage virtualization" featuring our IBM System Storage SAN Volume Controller (disk virtualization) and IBM System Storage TS7520 Virtualization Engine (a virtual tape library, or VTL).
People who were logged in were represented in silhouette form. When someone walked into the booth, our army of "booth reps" were able to chat with them and answer their questions. They could also peruse the various online materials we made available about each product.
Here are some of my observations:
A lot of questions were related to IBM's support for VMware. Although VMware is now currently owned by EMC, pending a spin-off IPO, IBM is its biggest reseller, given IBM's vast experience in server virtualization. Ironically, IBM's SAN Volume Controller supports VMware better than EMC's own storage virtualization product, Invista.
People also familiar with Second Life thought this 2-D "silhouette" version eliminated the need to configure and dress up your avatar as is required in participating in Second Life events. However, being only ableto chat, send e-mail and show web pages seemed less immersive than what Second Life can offer.
This event generated over 60 leads. We will pass on the contact information to the appropriate sales team.
Sometimes, it's difficult to explain the products I manage to people outside the IT storage industry. How do you explain FCP vs. FICON, Giant Magnetoresistive (GMR) heads, the SMI-S interface, etc. enough to then explain how your job relates to those technologies. At least my friends and family read this blog, so they can somewhat understand some of the things I am working on. When I visit my folks on Sundays, we sometimes discuss items they read in my blog that week.
In addition to a "take your children to work day", we have discussed within IBM a "take your parents to work day", especially for the young new hires who have a hard time explaining what their new job is to the rest of their family.
The problem is not just your parents, but any of your co-workers old enough to be parents who haven't bothered to keep up with the latest advancements in Web 2.0 technology. Here are some examples:
A project leader working with a technology partner asked if me if there was a difference between a "blog" and a "wiki" and which should his team use. This was not a simple yes/no answer, and involved some explanation, conversation and understanding of what he was trying to accomplish.
For one of my meetings, someone instant-messaged me asking where it was, was it "face-to-face" (F2F) or Conference call (CC). I replied back, "A2A w/CC" (avatar-to-avatar with voice over conference call). When you are meeting other avatars in-world in Second Life, it gets quite distracting having everyone typing away, with their hands and fingers moving furiously, so we use a conference call to complement our 3D interaction.
That's why I was very excited to seeLinden Lab announces voice beta in Second Life. It won't be fully ready until later this year, but adding voice to Second Life will greatly reduce the hurdles we now have trying to coordinate conference calls with in-world activity.
I realize not everyone can keep up with all the new and different technologies, but the social networking aspects of some of these new developments are worth looking into.
Well, we had another successful event in Second Life today.
Unlike our April 26 launch of our System Storage products for IBM Business Partners only, this time we decided this time to make it as a "Meet the Storage Experts" Q&A Panel format, and open up registration to everyone. Thesubject matter experts sat at the front of the room on four stools. We had six rows of chairs arrangedsemi-circularly.
Shown above, from left to right, are the avatars of our four experts:
IBM System Storage N series, focusing on recent N3000 disk system announcements
Harold Pike (holding the microphone while speaking)
IBM System Storage DS3000 and DS4000 series, focusing on recent DS3000 disk system announcements
IBM System Storage TS series, focusing on recent TS2230, TS3400 and TS7700 tape system announcements
IBM storage networking, focusing on recent IBM SAN256B director blade announcements
While Eric was a veteran Second Lifer, having presented at our April event, the other three were trainedon how to raise their hand, speak into the microphone, sit on the stool, and so on. I want to thank allof our experts for putting in this effort!
The event was produced by Katrina H Smith. She did a great job, and made sure we were on top ofall the issues and tasks required to get the job done. Running a Second Life event is every bit ashard as running a real face-to-face event. We had several meetings to discuss venue details, placementof chairs, placement of product demos, audio/video recording, wall decorations, tee-shirt and coffee mug design, logistics, and so on.
I acted as moderator/emcee for the event. That is my back in the picture above. The process wassimple, modeled after the "Birds of a Feather" sessions at events like SHARE and the IBMStorage and Storage Networking Symposium. We threw out a list of topics the experts would cover,and people in the audience would "raise their left hand". I, as the moderator, would then walkover to each person, and hold out the microphone for them to ask the question. I would then repeat the question and ask the appropriate expert to provide an answer. We defined gestures onhow to "raise hand" and "put hand down" that we gave to each registered participant.
We had four dedicated "camera-avatars" in world to capture both video and screenshots.Our video editors are now working to edit "highlight videos" that we can use at future events, for training materials, and for our internal "BlueTube" online video system.
The room was filled with examples of each of our products, made into 3D objects that were dimensionallycorrect, and "textured" with photographs of the actual products. If you click on an object, you get a "notecard" that provided more information. Special thanks to Scott Bissmeyer for making all of theseobjects for us.
We made posters of each expert and placed them in all four corners of the room. On the bottom of each coffee mug was a picture of each of the experts, and if you walked under each of the posters, you were"dispensed" a coffee mug matching the expert shown in the poster.Participants could "Collect all Four!" When you bring the coffee mug up to takea sip, the picture on the bottom of the mug is exposed for all to see.And as a final give-away to the audience, we made a variety of event tee-shirts and polo-shirts.
At the end of the session, we asked everyone to click on the "Survey" kiosk near the exit door. We askedsix simple questions using SurveyMonkey.com that took only a fewminutes to process. We found asking questions immediately at the end of the event was the best way tocapture this feedback.
From a "Green" perspective, we had people registered from the following countries: US, India, Mexico,Australia, United Kingdom, Brazil, Germany, Argentina, Chile, China, Canada, and Venezuela. Second Lifeallows all these people who probably could not travel, or could not afford the time and expense to travel,to participate in a simulated face-to-face meeting without energy consumption of traditional travel methods.
More importantly, we got several leads for business. People often ask "Yes, but is there any businessassociated with this?" This time, there was, based on the answers to the questions, several avatars asked for a real sales call to follow-up on the products and offerings they were discussed.
With such a great success, we have already scheduled our next Second Life event, November 8. Mark your calendars! I'll postmore details on the registration process of the November event when available.
Well, the weather here has turned awful, so I better turn off my computer to avoid lightning-strike damage.
For those looking for something to do to enjoy the "4th of July" US Independence day holiday tomorrow, thereis the [Team America: Sing-a-long at Tucson's Loft Cinemaat 6pm, you can still see the fireworks after the show is over. I did this last year and it was a lot of fun.
Also, you can check out the IBM Wimbledon build on Second Life. Here's the SLURL:[http://slurl.com/secondlife/IBM%207/133/180/23].Several IBMers will be "in world" at this virtual location on 4th of July. For all of my readers looking to check out Second Life, see what IBM can do, or talk to people who are familiar with this technology, here's your chance.
As for me, I'll be spending my "long weekend" in an airplane. Here's my travel schedule.
July 7-11: Tokyo, Japan - business meetings with IBM sales reps
July 13-18: Mumbai, India - business meetings with IBM business partners
If you will be at any of these locations on any of these dates and want to meet up, please let me know.You can click on the "send e-mail to Tony Pearson" button on the right panel of my blog.
(I was hoping that while I was in Asia, I could stop over and visit the schools I helped in Nepal and my friends at the Open Learning Exchange [OLE Nepal] as part of the One Laptop PerChild [OLPC Nepal] program, but I did not get all my ducks lined up for this with the appropriate travel approvals, visas and logistics. My apologies to Bryan, Sulochan and the rest of the team. Perhaps next year!)
Several of my IBM colleagues will be attending the "Virtual Worlds 2007" conference today and tomorrow. This conference sold out so quickly that they have already scheduled a second one for October. The focus is on 3-D internet technologies likeSecond Life. Attendance is expected at over 600 people.
IBM is investing heavily in this new concept of v-business. Last year, I was one of only 325 IBMers on Second Life. Now, according to this Better than Life blog entry from Grady Booch, IBM Fellow, the number is over 4000!
Of course, the challenge for IBM, and others, is learning to market in virtual worlds. Already, my team is in-world, and we meet several times a week. Using Second Life is quickly becoming an essential business skill, like participating in conference calls, or responding to instant messages.
What does meeting in-world entail?
Scheduling a time and a place
Finding a time that people can meet is no different than scheduling a audio or video conference call. In general, you don't have to worry about travel, but you do have to be actively somewhere connected to everyone else.
Finding a place involves actually determining the island, region and coordinates to hold the meeting. You need to find a place with enough seating. You don't have to worry about daylight, each person can control how much or little sunlight shows up on their screen. You do have to make sure you pick a spot that nobody else plans to use at that same time. Just like scheduling conference rooms at the site or hotel, we have to schedule rooms in advance.
To avoid this hassle, I have created the "pocket conference room". This is a single object that I can "rez" onto the ground, from my inventory, with 40 chairs, a PowerPoint presentation screen, a podium for a speaker to stand behind, and stools for speakers to sit on if they are next on the agenda. Now, I can hold impromptu meetings in any sandbox, grassy knoll, or the roof top of a building.
As with any other meeting, you need some basic ground rules. I am not talking the usual "no shooting, no gambling, no selling" rules that you see everywhere in Second Life. Instead, rules like an avatar must stand up before speaking. Anyone with a question must first "raise their hand" and get recognized by the chair. These ground rules can be as formal as Robert's Rules of Order or more casual, depending on who is participating.
It costs 10 Linden Dollars (L$) per PAGE to upload a PowerPoint presentation. This has the immediate benefit of having everyone spend more time and effort on their presentation, trying to cut down the number of charts, and focus more on what they are going to say.
Public Speaking Skills
It is amazing. People who are too scared to speak in front of an audience in Real Life have no problem having their avatar stand in front of other avatars in Second Life. This has greatly broadened the pool of speakers to tap into.Are you a woman with a husky masculine voice? Are you a man with a high-pitched feminine voice? Now, you can create an avatar that matches your voice.
This turns out to be the biggest challenge. In Real Life, organizing a face-to-face meeting involves time and effort making sure the venue has everything you need, a platform, a podium, good Audio/Video system, etc. All people have to do is show up, sit in a chair and listen.
In Second Life, however, the aspects of venue are all covered, but getting people to show up is another story. People have to sign up for Second Life account, create an avatar, wear appropriate virtual clothing, figure out how to teleport near the venue, walk or fly the difference to get to the exact building and room, master the sitting-in-a-chair and hold-coffee-and-sip-occasionally process, and pay attention.
Perhaps the best part of Second Life is that if you are not paying attention, your avatar noticeably falls asleep, into a hunched-over position, what is called "afk" (short for Away From Keyboard). On the other hand, if you do need to step away from your desk, you can put your avatar in "afk" mode immediately, tell everyone why and perhaps when you'll be back, and then re-activate when you return. This is one of the best improvements over regular audio conference calls.
I suspect the need for having places in Second Life to hold meetings will become more and more in demand.At a time when real-estate sales in the US is slowing down, Coldwell Banker's Second Life efforts are ramping up. I am not making this up. Coldwell Banker is one of the nation's largest real estate brokerage firms. They are trying to bring the same "adult supervision" to virtual real-estate transactions, offering to help people buy and rent properties in Second Life.
Continuing this week in Los Angeles, I went to some interesting sessions today at theSystems Technical Conference (STC08).
System Storage Productivity Center (SSPC) - Install and Configuration
Dominic Pruitt, an IBM IT specialist in our Advanced Technical Support team, presented SSPC and howto install and configure it. For those confused between the difference of TotalStorage ProductivityCenter and System Storage Productivity Center, the former is pure software that you install on aWindows or Linux server, and the latter is an IBM server, pre-installed with Windows 2003, TotalStorageProductivity Center software, TPCTOOL command line interface, DB2 Universal Database, the DS8000 Element Manager, SVC GUI and CIMOM, and [PuTTY] rLogin/SSH/Telnet terminal application software.
Of course, the problem with having a server pre-installed with a lot of software is that there is alwayssomeone that wants to customize it further. For those who just want to manage their DS8000 disk systems,for example, it is possible to uninstall the SVC GUI, CIMOM and PuTTY, and re-install them later when youchange your mind. As a general rule, it is not wise to mix CIMOMs on the same machine, as it might causeconflicts with TCP ports or Java level requirements, so if you want a different CIMOM than SVC, uninstallthe SVC CIMOM first. For those who have SVC, the SSPC replaces the SVC Master Console, so you can safelyturn off the SVC CIMOM on your existing SVC Master Consoles.
The base level is TotalStorage Productivity Center "Basic Edition", but you can upgrade the Productivity Centerfor Disk, Data and Fabric components with license keys. You can also run Productivity Center for Replication,but IBM recommends adding processor and memory to do this (IBM offers this as an orderable option).Whether you have the TotalStorage software or SSPC hardware, Productivity Center has a cool role-to-groups mapping feature.You can create user groups, either on the Windows server, the Active Directory, or other LDAP, and then map which roles should be assigned to users in each group.
Since Productivity Center manages a variety of different disk systems, it has made anattempt to standardize some terminology. The term "storage pool" refers to an extentpool on the DS8000, or a managed disk group on the SAN Volume Controller. Since the DS8000 can support both mainframe CKD volumes and LUNs for distributed systems, theterm "volume" refers to a CKD volume or LUN, and "disk" refers to the hard disk drive (HDD).
To help people learn Productivity Center, IBM offers single-day "remote workshops"that use Windows Remote Desktop to allow participants to install, customize and usethe software with no travel required.
IBM Integrated Approach to Archiving
Dan Marshall, IBM global program manager for storage and data services on our Global Technology Services team, presented IBM's corporate-wide integration to support archive across systems, software and services.One attendee asked me why I was there, given that "archive" is one of my areas of subject matter expertise that I present often at the Tucson Executive Briefing Center. I find it useful to watch others present the material, even material that I helped to develop, to see a different slant or spin on each talking point.
Archive is one area that brings all parts of IBM together: systems, software and services.Dan provided a look at archive from the services angle, providing an objective unbiasedview of the different software and systems available to solve specific challenges.
Encryption Key Manager (EKM) Design and Implementation
Jeff Ziehm, IBM tape technical sales specialist, presented IBM's EKM software, how it works in a tape environment, and how to deploy it in various environments. Since IBM is allabout being open and non-proprietary, the EKM software runs on Java on a variety ofIBM and non-IBM operating systems. IBM offers "keytool" command line interface (CLI) for the LTO4 and TS1120 tape systems, and "iKeyMan" graphical user interface (GUI) for theTS1120. Since it runs on Java, IBM Business Partners and technical support personneloften just [download and install EKM]onto their own laptops to learn how to use it.
Virtual Tape Update
We had three presenters at this one. First, Jeff Mulliken, formerly from Diligent and now a full IBM employee, presented the current ProtecTier softwarewith the HyperFactor technology, then Abbe Woodcock, IBM tape systems, compared Diligent with IBM's TS7520 and just-announced TS7530virtual tape libraries, and finally Randy Fleenor, IBM tape sales leader, presented IBM's strategy going forward in tape virtualization.
Let's start with Diligent. The ProtecTier software runs on any x86-64 server withat least four cores and the correct Emulex host bus adapter (HBA) cards. Using Red HatEnterprise Linux (RHEL) as a base, the ProtecTier software performs its deduplication entirely in-lineat an "ingest rate" of 400-450 MB/sec. This is all possible using 4GB memory-resident "dictionary table" that can map up to 1 PB of back end physical storage, which could represent as much as 25PB of "nominal" storage. Theserver is then point-to-point or SAN-attached to Fibre Channel disk systems.
As we learned yesterday from Toby Marek's session, there are four ways to performdeduplication:
full-file comparisons. Store only one copy of identical files.
fixed-chunk comparisons. Files are carved up into fixed-size chunks, and each chunkis compared or hashed to existing chunks to eliminate duplicates.
variable-chunk comparisons. Variable-length chunks are hashed or diffed to eliminate duplicate data.
content-aware comparisons. If you knew data was in Powerpoint format, for example,you could compare text, photos or charts against other existing Powerpoint files toeliminate duplicates.
IBM System Storage N series Advanced Single Instance Storage (A-SIS) uses fixed-chunkmethod, and Diligent uses variable-chunk comparisons. Diligent does this using "dataprofiling". For example, let's say most of my photographs are pictures of people, buildings, landscapes, flowers and IT equipment. When I back these up, the Diligentserver "profiles" each, and determines if any existing data have a similar profilethat might have at least 50 percent similar content. Diligent than reads in the data that is mostly likely similar, does a byte-for-byte ["diff" comparison], and creates variable-lengthchunks that are either identical or unique to sections of the existing data. Theunique data is compressed with LZH and written to disk, and the sequential series of pointer segments representing the ingested file is written in a separate section on disk.
That Diligent can represent profiles for 1PB of data in as little as 4GB memory-residentdictionary is incredible. By comparison, 10TB data would require 10 million entries on a content-aware solution, and 1.25 billion entries for one based on hash-codes.
Abbe Woodcock presented the TS7530 tape system that IBM announced on Tuesday. It has some advantages over the current Diligent offering:
Hardware-based compression (TS7520 and Diligent use software-based compression)
1200 MB/sec (faster ingest rate than Diligent)
1.7PB of SATA disk (more disk capacity than Diligent)
Support for i5/OS (Diligent's emulation of ATL P3000 with DLT7000 tapes not supported on IBM's POWER systems running i5/OS)
Ability to attach a real tape library
NDMP backup to tape
tape "shredding" (virtual equivalent of degaussing a physical tape to erase all previously stored data)
Randy Fleenor wrapped up the session telling us IBM's strategy going forward with all of thevirtual tape systems technologies. Until then, IBM is working on "recipes" or "bundles", puttingDiligent software with specific models of IBM System x servers and IBM System Storage DS4000 disk systemsto avoid the "do-it-yourself" problems of its current software-only packaging.
Understanding Web 2.0 and Digital Archive Workloads
I got to present this in the last time slot of the day, just before everyone headed off to the [Westin Bonaventure hotel] for our big fancy barbecue dinner. Like my previous sessionon IBM Strategy, this session was more oriented toward a sales audience, but both garnereda huge turn-out and were well-received by the technical attendees.
This session was requested because these new applications and workloads are what is driving IBM to acquire small start-ups like XIV, deploy Scale-Out File Services (SOFS), and develop the innovative iDataPlex server rack.
The session was fun because it was a mix of explanation of the characteristics ofWeb 2.0 services; my own experience as a blogger and user of Google Docs, FlickR, Second Life andTivo; and an exploration in how database and digital archives will impact thegrowth in computing and storage requirements.
I'll expand on some of these topics in later blog posts.
I can't believe I have been blogging for a year now!
I have Jennifer Jones from IBM to thank for getting this started. She was my predecessor in the job I have now, and she was moving on to bigger and better things, and during the transition for me to take over, she suggested that we start a blog, podcast, or similar. While there are many blogs and podcasts inside the firewall of IBM, I wanted something to be accessible to all of our IBM sales team, IBM Business Partners, existing and prospective clients, and to enable comments, to enable two-waycommunication. Podcasts are very one-way, so we chose a blog instead.Getting it set up took a while, convincing our own management that this was worthwhile, and dealing with our legal department on the IBM blogging guidelines of what we can and cannot write about, we finally got it going last year, launching September 1, just in time for our 50 years of disk systems innovation campaign.
It has been a wild ride, a great learning experience, and has proven quite fulfilling for job satisfaction. Here are some observations and lessons I have learned along the way.
Roller is the open source blog server that drives Sun Microsystem's blogs.sun.com employee blogging site, IBM DeveloperWorks blogs that this blog exists on, thousands of internal blogs at IBM Blog Central, the JRoller Java community site, and hundreds of others world-wide.Whereas there might be fancier blog systems elsewhere that I could have chosen, hosting my blog with IBM Developerworksseemed like a good choice. I can access from any web-browser capable machine, and enter my blog posts in nativeHTML, that I develop in the tool itself, or offline with a standard basic text editor like Microsoft Notepad that I can then cut-and-paste back in.
One lesson I learned the hard way was that Roller generates the Permalink URL for each blog post based on the first five words of the title. For that reason, it is important to chose an appropriate and unique title, avoiding the use of punctuation, quotation marks, or pharmaceutical "enhancement products" that might get rejected by SPAM filters.Once chosen, you can't change the title afterwards as it won't match the Permalink anymore.My blog post "Aperi is (enhancement product) for SMI-S" caused no end of grief to our Press Release team.
Writing blog posts in native HTML is not as hard as it sounds. I am limited to hosting a maximum of 24MB of files, and they can only be jpg, jpeg, gif, png, mp3, pdf or ppt format.So, wherever possible, I point to other websites for content.For those new to blogging, I recommendThe Barebones Guide to HTML.
Roller also generates for me a spreadsheet of all my page views for the week. Tracking blog traffic closely is as crazyas checking your company's stock price every day. These "web-stat" e-mails get filed directly into my Bacn folder on Lotus Notes.
In my earlyadvice to bloggers, I mentioned my choice of Bloglines as my RSS feed reader. When I subscribe to a new blog, I specify Full entries, not Partial,which allows me to scan it quickly, but filters out many of the non-text content like videos. It also allowed meto see what my own blog posts looked like from within a reader, so that I can write them appropriately.
I find if valuable to read other blogs, including those written by employees of our toughest competitors. Evenif you don't blog yourself, following blogs can be extremely valuable. Be careful what you leave as comments onother blogs, they may come back to haunt you later.
Currently, I track 55 blogs, some about storage,marketing, Web 2.0 issues, Second Life, Linux, or other areas of interest. I prefer blogs that make only 1-5 postsper week, so blogs like LifeHacker and LifeRemix are off my Bloglines list, but are excellent resourceswhen I am searching for something specific. If you think 55 is a lot of blogs, consider Timothy Ferriss' post onHow RobertScoble reads 622 RSS feeds each morning.
I have quite an international readership, so I have to be careful using American idioms and pop cultural references.For example, in my blog post IBM acquires Softek, I mentioned "shotgun weddings" and had various responses asking what exactly did that mean,all from readers outside the USA. I've learned that sometimes you need to link them to an American Slang dictionary,or Wikipedia encyclopedia entry to explain these terms and phrases.
Technoraticurrently tracks over 100 million blogs and over 250 million pieces of tagged social media. Getting my blogtracked had some issues. You have to join, thenpost a "claim"on your own blog. My mistake was having a case-sensitive URL with a mix of upper and lower case letters, but Technorati prefers all lower case. IBM worked with Technorati to get this resolved.
Del.icio.us is a social bookmarking website -- the primary use is to store your bookmarks online, which allows you to access the same bookmarks from any computer and add bookmarks from anywhere, too. On del.icio.us, you can use tags to organize and remember your bookmarks, which is a much more flexible system than folders.
I use Firefox, Safari, Dillo and Internet Explorer web browsers, so it is nice that I have access to allmy bookmarks in the same consistent manner. When I see content on a website that I might like to reference laterin a blog, I tag it with del.icio.us so that I can get to it later.
Fellow GTD-ers will quickly recognize this acronym, but for the rest of you, it refers to David Allen's book "Getting Things Done®".This is a great book! I learned about it reading other people's blogs, and found it incrediblyuseful helping me organize my time.There are various online tools available to help employ this method. I use Lotus Connections Activitiesfor group projects with co-workers at IBM, and BackPack for projects withmy friends outside of work.
The success of YouTube encouraged IBM to launch IBM TV, a portal for IBM's video and multimedia assets and make it easier for IBM employees, customers, partners and prospects to access and view IBM multimedia. The plan is to have eight anchor episodes per year, professionally hosted by TV personality, Joe Washington, and point to related offers and other resources for viewers to learn more.
Blogging also introduced me to Second Life. I asked around if anyone else within IBM was using Second Life, anddiscovered quite a few. I got invited to join our internal Eightbar group, and participated in various events, including an IBM Holidayparty that I discussed in my blog post"Building a Snowman in Second Life".
In April, we had a launch of our newest products in Second Life, and we plan to have two more Second Life events,September 20 and another in November, staged as "Meet the Experts" question and answer panels.
I wrap up with Facebook. Actually, whereas most of my Web 2.0 efforts have been work-related, I have quite a few friends and family who follow my blog. Several were inspired to start their own blogs, such asPassages from Pamand Barry Whyte on Storage Virtualization. Bridging the gap is Facebook, something I can use to keep tabs on my friends, as well as my storage industry-related contacts.
Wow, that's quite a lot in one year. Well, I am done with my meetings down here in Sao Paulo, Brazil. My colleauges and I are returning tonight to enjoy the long Labor Day weekend.
It's been a while since I've talked about [Second Life].
The latest post on eightbar[Spimes, Motes and Data centers]discusses IBM's use of virtual world technology to analyze data centers in three dimensions.New World Note asks[What's The Point Of 3D Data Centers?]One would think that a simple monitoring tool based on a two-dimensional floor plan would be enough to evaluate a data center.
Enter Michael Osias, IBM (a.k.a Illuminous Beltran in Second Life). Some of the leading news sites havebegun to notice some 3D data centers that he has helped pioneer. UgoTrade writes up an article aboutMichael and the media attention in [The Wizard of IBM's 3DData Centers].
Of course, in presenting these "Real Life/Second Life" (RL/SL) interactive technologies, IBM is sometimes the target of ridicule. Why? Because IBM is 10 years ahead of everyone else. So, are there aspects of a data center where 3D interfaces makes sense? I think there is.
IBM TotalStorage Productivity Center has an awesome "topology viewer" that shows what servers are connectedto which switches, to which disk systems and tape libraries. This is all done in a 2D diagram, generated dynamicallywith data discovered through open standard interfaces, similar to what you might draw manually with toolslike Visio. Imagine, however, howmore powerful if it were a 3D viewer, with virtual equipment mapped to the physical location of each pieceof hardware on the data center floor, including the position on the rack and location on the data center floor.
Designing computer room air conditioning (CRAC) systems is actually a three dimensional problem. Cold air isfed underneath the raised floor, comes up through strategically placed "vent" tiles, taken in the front ofeach rack. Hot air comes out the back of each rack, and hopefully finds ceiling duct intake to get cooled again.The temperature six inches off the floor is different than the temperature six feet off the floor, and 3Dmonitor tools could be helpful in identifying "hot spots" that need attention. In this case "spimes" representsensors in the 3D virtual world, able to report back information to help diagnose problems or monitor events.
After many people left the mainframe in favor of running a single application per distributed server, the pendulumhas finally swung back. Companies are discovering the many benefits of changing this behavior. "Re-centralization" is the task at hand. Thanks to virtualization of servers, networks and storage, sharing common resources canonce again claim the benefits of economies of scale. In many cases, servers work together in collective unitsfor specific applications that might benefit better if consolidated together onto the same equipment.
IBM's "New Enterprise Data Center" vision recognizes that people will need to focus on the management aspectsof their IT infrastructure, and 3D virtual world technologies might be an effective way to getthe job done.
Avi Bar-Zeeb of RealityPrime has an interesting post aboutHow Google Earth [really] Works.Normally, people who are very knowledgeable in a topic have a hard time describing concepts in basic terms. Avi was one of the co-founders of Keyhole, the company that built the predecessor for Google Earth, and also worked with Linden Lab for its 3D rendering it its virtual world, so he certainly knows what he is talking about. While he sometimes drops down into techno-talk about patents, the post overall is a good read.
It is perhaps human nature to be curious on how things are put together and how they function, leading to the popularity of web sites like www.howstuffworks.com that cover a wide range of topics.
Many things can be used without understanding their internal inner workings. You can put on a pair of blue jeans without knowing how the cotton was made into denim fabric; lace up your favorite pair of running shoes without understanding the chemical make-up of the plastic that cushions your feet; or drink a glass of beer after your five mile run without knowing how alcohol is processed by your liver.
For technology, however, some people insist they need to know how it works in order for them to get the most use of it. When shopping for a car, for example, a guy might look under the hood, and ask questions about how the engine works, while his wife sits inside the vehicle, counting cup holders and making sure the radio has all the right buttons.
Not all technology suffers from need-to-know-itis. For example, the Apple iPod music player and the Canon PowerShot digital camera, are both just disk systems that read and write data, with knobs and dials on one end, and ports for connectivity on the other. Everyone just asks how to use their controls, and might read the manual to understand how to connect the cables. Few people who use these devices ask how they work before they buy them.
Other disk systems, the kind designed for data centers for the medium and large enterprise, apparently aren't there yet. Storage admins who might happily own both an iPod player and a PowerShot camera, insist they need to know how the technologies inside various storage offerings work. Is this just curiosity talking? Or are there some tasks like configuration, tuning, and support that just can't be done without this knowledge? Does knowing the inner workings somehow make the job more enjoyable, easier, or performed with less stress?
I'm curious what you think, send me a comment on this.
Chris Anderson, of Wired magazine, wrote a great article called The Long Tail.
This article became a book by the same name published earlier this year, and I just discovered it on a recent visit to Second Life. A lot of IBMers are now alsoSecond Lifers, and I suspect it is just a matter of time before we are conductingour customer briefings there, and getting our year-end bonuses paid directly in Linden bucks.(Those of you not familiar with Second Life can watch this 3-minute video fromthe folks at Text100)
Anyways, the Long Tail describes the new economy of entertainment thanks to digitalstorage. Here are some of the key insights.
In the past, entertainment was all about hits: hit songs, hit movies,hit novels, and this was primarily because of the economic realities restricted byphysical space. Chris writes: "An average movie theater will not show a film unless it can attract at least 1,500 people over a two-week run; that's essentially the rent for a screen. An average record store needs to sell at least two copies of a CD per year to make it worth carrying; that's the rent for a half inch of shelf space."
Things have changed. To drive the point home, Robbie Vann-Adibe (CEO of eCast), poses the trick question"What percentage of the top 10,000 titles in any online media store (Netflix, iTunes, Amazon, or any other) will rent or sell at least once a month?" The answer will surprise you. Write down your guess first, then go read here. His digital jukeboxes are able to play from a list of150,000 songs, not the few hundred you'd find at the Tap Room which is rated as having the best jukebox in Tucson.
The phenomenon is not just limited to music. "Take books," Chris writes, "The average Barnes & Noble carries 130,000 titles. Yet more than half of Amazon's book sales come from outside its top 130,000 titles. Consider the implication: If the Amazon statistics are any guide, the market for books that are not even sold in the average bookstore is larger than the market for those that are..."
This has incredible implications for the storage industry. For one, content providers are going to dig deep into their archives to digitize and deliver "long tail" offerings. If they don't have a deep archive, many will start to build one. Second, the need to search through that large volume of content will become more critical. Classifying and indexing with the appropriate tags and metadata will be an important task.
Well, there's little to no chance we'll get snow in Tucson the rest of this year, so I built a snowman out in Second Life. That's my avatar on the right, andI am an eightbar specialist. Eightbar refers to our logo.
This was part of an IBM "Holiday Party" where dozens of IBMers met "in the virtual world" to participate in 3D competitions,I entered the "Build a Snowman" competition, since I am still a beginner at this. This was whatI was able to come up with in 20 minutes that we had to get it done. Why I made mine out of woodwith different colors was so that I could stand out from the crowd. Everyone else used traditionalwhite snowy textures.
Others had a more challenging "Build a Snow Globe" where you have to write scripts to get thelittle snow flakes to move around. This for the advanced builders of our group.
This is still new, emerging technology, but eventually, Second Life and other MMOs could be used to market products,that people can view from all three dimensions, talk to a technical specialist, and get all questions answered.It could be used for education, shopping around, and collaborating with others.
Anyways, I haven't heard the results, but I had fun anyways.
The terms "information" and "data" are often used interchangeably in regular usage, but for the storageindustry, there are significant differences between the two, as different as "fact" from "meaning".
For example, if you are walking down the street, and see a pole with red and white stripes, the data of red and white stripes may not have much meaning, unless you recognize the information is that you are in front of a barber shop.I thought of this when someone pointed me to theStrip Generator Tool website, which can helpyou generate various stripes for use on the tiled background of web pages. (Or if you aredesigning neckties for your Second Life avatar).
Many national flags are based on simple stripes of different colors.For example, look at the national flags of France, Russia, and the Netherlands. These consist of a red, white, and blue stripe, justin different sequence and orientation.Again, the data of these colors, the width of their lines, and the way they are placed on the flag are all data, but the information they convey is significantly more than that.One person might walk right by the flag, not knowing which country it belongs to, while anotherperson might get emotional memories of their homeland.
For those of us in the storage industry, data is just binary 1's and 0's on disk and tape media, and canbe treated like packages at the post office in brown wrapping paper. Just as post office employees don't have to know the contents to ship them to the final destination, servers and storage devices don't need to knowthe informational content of the data that they process and store.
Converting information to data is easy. Let's take an example of taking a digital photo. The photo could be a picture of you and your spouseon your last vacation trip, but you would never know that from just looking at a series of 1's and 0's. For this reason, you create photo albums, you write captions below indicating where and when the photowas taken. This additional "context" is often called "metadata" or just simply "indexing".
Both the information captured (the photo in this case) and its metadata (the caption), can be storedas 1's and 0's on storage media. These bits can be compressed, encrypted, or represented in a variety of formats.
Information is copied from one data file to another. In the traditional sense, one piece of informationcould exist in the primary production copy, as well as multiple archive or backup copies. One piece ofinformation, stored on multiple copies of data. In a sense, this is similar to genetic information storedon each human being (data copy). Richard Dawkins, author of The Selfish Gene, reminds us that genes outlive individual humans. In storage, we remind people that data outlivesthe media it is initally written to, and the information outlives the initial data copy stored.
Converting data back to information is not always as simple.Not all sequences of 1's and 0's are obvious what they represent. To display a digital photo, you need to know the format the photo is in, and have an appropriate application that can display it back to something a human person can recognize. If the bits were compressed, the application needs to handlethat, or you need to de-compress the data before handing it to the application. For encrypted data,you need to have the decryption key. The process of converting a single file of data back to information is called "rendering".
One of the big problems with keeping information for long periods of time, isthat you may not have the equipment, decryption key, or applications needed to render the data back to usable information. You've kept the data, but you can't make any sense of it, as if it went through an episode of Will it Blend?
A good example is how the current version of Microsoft Office application is unable to interpret andrender data documents that were stored in WORD 1.0 format. IBM and others have developed "rendering tools" that can help decipher the bits, and bring back the information. To help address this challenge, the new Microsoft Office 2007 haschosen the OOXML format, but will continue to support some of the older legacy formats. IBM and the rest of the world are focused instead on Open Document Format (ODF) open standard. Those of usstill using older versions of Microsoft Office might need the Office 2007 Compatibility Pack.
Another way to get information from data is "data mining", an important part of "business intelligence". Here you are gleaning information notfrom individual details, but from patterns in the data, averages, statistics, totals, that havebroader meaning than individual transactions or events.
For many applications, DLM is just fine. Let's consider e-mail, for example. For most employees,deleting e-mails larger than 1 MB, after 90 days, regardless of content, is probably a reasonable DLM policy. All data is treated the same, based purely on the size and date markings on the outer brown wrapper.
For more sensitive content, DLM is not enough. The e-mails that are to or from the president of thecompany, or between top executives, or that contain certain pieces of information relevant for lawsuitsor other investigations, may not be treatedthe same as other e-mails. In this case, you need ILM technologies, managing based on the informational content of the data, and not just the size and date last referenced.
Of course, IBM supports both, and can help you decide the right solution for each workload.
This week (actually April 29 to May 2) is IBM'sPartnerWorld 2007 conference.Over the past 10 years, IBM's shift to rely more heavily on business partners has proven to be a smart decision. IBM Business Partners can often focus on a specific region or industry much better, with laser-like focus.
Registration is now open for our next "Meet the Storage Experts" event in Second Life. All IBMers, clients and IBM Business Partners are welcome to attend. We will focus this time on DS3000 and N series disk systems, tape systems,and IBM storage networking gear.
It seems like I just get out of one conference, and into another. This week I am at Pulse 2008, which combines the best of IBM Tivoli and Maximo into one conference.Like many conferences, this one starts on Sunday, and ends on Thursday.
We're at the Swan and Dolphin hotels at [Walt Disney World] in Orlando, Florida. I've been to several conferences in Orlando, but this is my first time at the Swanand Dolphin. (When I walked into the main lobby, I had a bout of "deja vu". IBM LotusSphere was here last year, and they had a complete replica made in SecondLife!)
If you haven't been to Walt Disney World resorts, whether for a conference or vacation,there are two things you need to know:
Nothing is within a short "walking distance", you need to take a bus or boat to get anywhere
Despite this, you will be doing a lot of walking, so wear comfortable shoes!
Pulse encouraged everyone to blog and take pictures posted onto FlickR, here are a few from Sunday:
Lou and Elizabeth from [Syclo], an IBM Business Partner
Mike and Megha from [Birlasoft] show off their accreditation
Greg Tevis explains FilesX, recently acquired by IBM
Alan Lepofsky posts about The Value Of Social Networking which points to this same presentation about Web 2.0 concepts and ideas.He also points to this article in the Wall Street Journal titledPlaying Well With Others about IBM and their leadership in Web 2.0 technologies, such as those from our Lotus group.
Some quotes from the WSJ article I found interesting:
Some 26,000 IBM workers have registered blogs on the company's internal computer network where they opine on technology and their work.
Social networking is especially important for the 42% of IBM employees who regularly work from their homes or client locations rather than IBM facilities.
At most companies, public-relations managers and the human-resources department tightly control all electronic communications except for email and instant messaging. ... Not at IBM.
"Any employee can have a blog, a wiki or a podcast,..."
IBM owns more than 50 "islands" in Second Life and often uses them for lectures and group discussions.
Two years ago, IBM started Wiki Central to manage wikis for IBM groups. It now has more than 20,000 wikis online with more than 100,000 users.
Interesting in learning more about Web 2.0? The last page of the deck above has a good set of links and resources, for example, here are 23 Things to know about Web 2.0 to get you started.