The terms "information" and "data" are often used interchangeably in everyday usage, but in the storage industry there are significant differences between the two, as different as "fact" is from "meaning".
For example, if you are walking down the street and see a pole with red and white stripes, the data of red and white stripes may not have much meaning unless you recognize the information: you are in front of a barber shop. I thought of this when someone pointed me to the Stripe Generator tool website, which can help you generate various stripes for use on the tiled background of web pages (or for designing neckties for your Second Life avatar).
Many national flags are based on simple stripes of different colors. For example, look at the national flags of France, Russia, and the Netherlands. Each consists of a red, a white, and a blue stripe, just in a different sequence and orientation. Again, the colors, the width of the stripes, and the way they are placed on the flag are all data, but the information they convey is significantly more than that. One person might walk right by the flag, not knowing which country it belongs to, while another person might get emotional memories of their homeland.
For those of us in the storage industry, data is just binary 1's and 0's on disk and tape media, and can be treated like packages at the post office in brown wrapping paper. Just as post office employees don't have to know the contents of packages to ship them to their final destination, servers and storage devices don't need to know the informational content of the data that they process and store.
Converting information to data is easy. Take a digital photo as an example. The photo could be a picture of you and your spouse on your last vacation trip, but you would never know that from just looking at a series of 1's and 0's. For this reason, you create photo albums and write captions below each photo indicating where and when it was taken. This additional "context" is often called "metadata" or simply "indexing".
Both the information captured (the photo in this case) and its metadata (the caption) can be stored as 1's and 0's on storage media. These bits can be compressed, encrypted, or represented in a variety of formats.
Information is copied from one data file to another. In the traditional sense, one piece of information could exist in the primary production copy as well as in multiple archive or backup copies: one piece of information, stored on multiple copies of data. In a sense, this is similar to the genetic information stored in each human being (the data copy). Richard Dawkins, author of The Selfish Gene, reminds us that genes outlive individual humans. In storage, we remind people that data outlives the media it is initially written to, and the information outlives the initial data copy stored.
Converting data back to information is not always as simple. It is not always obvious what a given sequence of 1's and 0's represents. To display a digital photo, you need to know the format the photo is in, and have an appropriate application that can display it as something a person can recognize. If the bits were compressed, the application needs to handle that, or you need to decompress the data before handing it to the application. For encrypted data, you need to have the decryption key. The process of converting a single file of data back to information is called "rendering".
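The round trip described above can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the `store` and `render` helpers, the JSON-plus-zlib "format", and the sample caption and pixel values are all assumptions made up for this example.

```python
import json
import zlib


def store(caption: str, pixels: list[int]) -> bytes:
    """Turn information (photo plus caption metadata) into opaque bytes."""
    record = {"caption": caption, "pixels": pixels}
    # Hypothetical on-disk format: UTF-8 JSON, then zlib compression.
    return zlib.compress(json.dumps(record).encode("utf-8"))


def render(data: bytes) -> dict:
    """Turn stored bytes back into information.

    This only works because we know the format: decompress first,
    then decode the text, then parse the JSON.
    """
    return json.loads(zlib.decompress(data).decode("utf-8"))


stored = store("Vacation photo", [255, 0, 255, 0])
photo = render(stored)
print(type(stored).__name__, len(stored))  # just bytes, no visible meaning
print(photo["caption"])                    # meaning recovered by rendering
```

Anyone holding `stored` without knowing the format sees only bytes; the caption and pixels come back only when each step is undone in the right order.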
One of the big problems with keeping information for long periods of time is that you may not have the equipment, decryption key, or applications needed to render the data back to usable information. You've kept the data, but you can't make any sense of it, as if it went through an episode of Will it Blend?
A good example is how the current version of the Microsoft Office application is unable to interpret and render documents that were stored in Word 1.0 format. IBM and others have developed "rendering tools" that can help decipher the bits and bring back the information. To help address this challenge, the new Microsoft Office 2007 has chosen the OOXML format, but will continue to support some of the older legacy formats. IBM and the rest of the world are focused instead on the Open Document Format (ODF) open standard. Those of us still using older versions of Microsoft Office might need the Office 2007 Compatibility Pack.
Another way to get information from data is "data mining", an important part of "business intelligence". Here you are gleaning information not from individual details, but from patterns in the data (averages, statistics, totals) that have broader meaning than individual transactions or events.
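As a toy illustration of gleaning information from totals and averages rather than from individual records, here is a minimal Python sketch. The transaction list, the store names, and the `summarize` helper are invented for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw data: (store, sale amount) for individual transactions.
transactions = [
    ("Tucson", 120.0),
    ("Tempe", 80.0),
    ("Tucson", 60.0),
    ("Tempe", 40.0),
]


def summarize(rows):
    """Group amounts by store and compute a total and an average for each."""
    by_store = defaultdict(list)
    for store, amount in rows:
        by_store[store].append(amount)
    return {
        store: {"total": sum(amounts), "average": mean(amounts)}
        for store, amounts in by_store.items()
    }


summary = summarize(transactions)
for store, stats in summary.items():
    print(store, stats)
```

No single transaction says much, but the per-store totals and averages start to describe how each location is doing.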
For many applications, data lifecycle management (DLM) is just fine. Let's consider e-mail, for example. For most employees, deleting e-mails larger than 1 MB after 90 days, regardless of content, is probably a reasonable DLM policy. All data is treated the same, based purely on the size and date markings on the outer brown wrapper.
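The e-mail rule above can be sketched as a small Python function. This is a minimal sketch under stated assumptions: the `Email` class and `dlm_should_delete` are hypothetical names, "1 MB" is taken to mean one million bytes, and "after 90 days" is taken to mean more than 90 days since the message was received.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MAX_SIZE_BYTES = 1_000_000      # "1 MB", assumed here to mean 10**6 bytes
MAX_AGE = timedelta(days=90)    # assumed to be measured from the received date


@dataclass
class Email:
    size_bytes: int
    received: date
    body: str = ""   # present, but deliberately never read by the DLM rule


def dlm_should_delete(msg: Email, today: date) -> bool:
    """Decide from the 'outer wrapper' only: size and date, never content."""
    too_big = msg.size_bytes > MAX_SIZE_BYTES
    too_old = today - msg.received > MAX_AGE
    return too_big and too_old


today = date(2007, 3, 1)
big_old = Email(2_000_000, date(2006, 11, 1), "quarterly lawsuit update")
big_new = Email(2_000_000, date(2007, 2, 1))
small_old = Email(10_000, date(2006, 11, 1))
print([dlm_should_delete(m, today) for m in (big_old, big_new, small_old)])
```

Note that `big_old` is marked for deletion even though its body mentions a lawsuit; the policy never looks inside the wrapper.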
For more sensitive content, DLM is not enough. The e-mails that are to or from the president of the company, or between top executives, or that contain certain pieces of information relevant to lawsuits or other investigations, should not be treated the same as other e-mails. In this case, you need information lifecycle management (ILM) technologies, which manage based on the informational content of the data, and not just the size and date last referenced.
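A content-aware policy of the kind described above might look like the following Python sketch. Everything specific here is an assumption for illustration: the `Email` class, the `EXECUTIVES` address set, the `LEGAL_KEYWORDS` list, and the choice to fall back to a size-and-age rule for ordinary mail. Real ILM products classify content in far more sophisticated ways.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical classification inputs, made up for this example.
EXECUTIVES = {"president@example.com", "cfo@example.com"}
LEGAL_KEYWORDS = ("lawsuit", "investigation")


@dataclass
class Email:
    size_bytes: int
    received: date
    sender: str
    recipient: str
    body: str


def ilm_should_delete(msg: Email, today: date) -> bool:
    """Look at who and what before falling back to size and age."""
    # Mail to or from top executives is retained.
    if msg.sender in EXECUTIVES or msg.recipient in EXECUTIVES:
        return False
    # Mail whose content looks relevant to legal matters is retained.
    text = msg.body.lower()
    if any(keyword in text for keyword in LEGAL_KEYWORDS):
        return False
    # Everything else falls back to a simple size-and-age rule.
    too_big = msg.size_bytes > 1_000_000
    too_old = today - msg.received > timedelta(days=90)
    return too_big and too_old


today = date(2007, 3, 1)
old = date(2006, 11, 1)
routine = Email(2_000_000, old, "a@example.com", "b@example.com", "lunch menu")
legal = Email(2_000_000, old, "a@example.com", "b@example.com", "Re: the lawsuit")
executive = Email(2_000_000, old, "president@example.com", "b@example.com", "hello")
print([ilm_should_delete(m, today) for m in (routine, legal, executive)])
```

All three messages have the same size and date, so a wrapper-only policy could not tell them apart; only the routine one is eligible for deletion here.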
Of course, IBM supports both, and can help you decide the right solution for each workload.
In case you missed it, IBM unveiled a new digital video surveillance service yesterday. This "marks an important shift in the industry's approach to security, applying advanced analytics to video data and signaling the ability to converge physical and information technology (IT) security."
The IBM Smart Surveillance Solution is designed to provide the unique capability to carry out efficient data analysis of video sequences either in real time or from recordings. These recordings can be on disk or tape storage.
The problem with today's "analog" surveillance is that the analog cameras record onto traditional VHS tapes, which are rotated through and re-written after a few hours or days. Reviewing tapes often requires human intervention, and must be done before the VHS tapes are re-used. Many shoplifters, thieves, and other law-breakers take a chance that their actions will not be caught on tape, or that they will be long gone by the time the video is analyzed.
The IBM Smart Surveillance Solution can provide a number of advantages over traditional video solutions, including:
With real-time analytics capabilities, the new digital video surveillance (DVS) service can open up a wide array of new applications that go far beyond the traditional security aspects of surveillance systems. Early adopter industries in this rapidly evolving market include retail, public sector and financial services. The retail industry estimates nearly $50 billion is lost annually to fraud, theft and administrative errors.
Once in digital format, video surveillance can be sent further, processed quicker, and stored for longer periods of time than traditional media makes practical today.
Beyond fraud and theft, this kind of solution could also help identify bullies who make death threats in high school.
Today was our annual "State of the Site" meeting for the IBM Tucson site. This facility was completed in 1978, and I started my career here in 1986.
Various employees and teams were recognized for their contributions and dedication. For example:
Our site manager, Terri Mitchell, did a recap of all our recent awards and accomplishments. Of the nine Design Innovation awards won by IBM this year at the CeBIT conference, eight were for IBM System Storage products!
A representative from Tucson's Brewster Center presented Terri with an award, thanking IBM for its strong support for the community through various charity initiatives.
The final speaker was a new IBM client, Tony Casella, the IT Director of the town of Marana. Recently, the town of Marana selected IBM products, which made big news. Arizona is the fastest-growing state in the USA, and the town of Marana, just north of Tucson, is one of the fastest-growing communities in Arizona. The town is growing so large that it will soon spill over from Pima into Pinal county, and will be the first town in Arizona authorized to span county boundaries.
Marana is most famous for its Gallery Golf Club on Dove Mountain that is the new home of the World Golf Cham
His decision was based on conversations he had with IT directors of other towns and cities, and this November 2006 article in Network World. He held up his copy of the magazine.
Tony was delighted with IBM's solution-oriented approach, which goes beyond just selling more boxes of hardware. He found IBM easy to do business with, and committed to his success.
It's good to see IBM TotalStorage Productivity Center evolve and expand. I was the lead architect for this product a few years ago, and my, has it come a long way from its early beginnings!
Today, Gartner, Inc. positioned IBM in the Leaders quadrant for Storage Resource Management and SAN Management Software.
The Magic Quadrant is a copyrighted concept from Gartner, representing a two-by-two grid that ranks various offerings from different vendors. Ideally, vendors want their products in the upper right "Leaders" quadrant. Yahoo Finance reports:
According to Gartner, Inc., "Leaders have the highest combined measures of an ability to execute and a completeness of vision. They have the most comprehensive and scalable products. They have a proven track record of financial performance and an established market presence. In terms of vision, they are perceived as thought leaders, having well-articulated plans for ease of use, how to address scalability and product breadth. For vendors to have long-term success, they must plan to address the expanded market requirements for change management and root-cause and performance analysis. Leaders must not only deliver to the current market requirements, which continue to change, but they also need to anticipate and deliver on future requirements. A cornerstone for leaders is the ability to articulate how these requirements will be addressed as part of their vision for resource management. As a group, leaders can be considered a part of most new purchase proposals, and they have high success rates in winning new business."
IBM TotalStorage Productivity Center is a strategic part of IBM Service Management, and a foundational component of the IBM Systems Director family. IBM is making a concerted effort across servers, networks, software and storage to help manage the IT infrastructure in a coordinated way.
I have seen other quadrants used to help explain different market segments, such as the one used in this 40-minute video of Guy Kawasaki’s Art of the Start speech at TiECon 2006.
To the current architects and developers of Productivity Center, well done!
As an alumnus of the University of Arizona, it is always good to see any of the Arizona schools try something new and innovative. This time, it was our arch-rivals at Arizona State University (in Tempe, AZ, near Phoenix).
An article in InformationWeek reports that 40,000 ASU Students Leap to Google Apps; University Pays Zero. The ASU president, Michael Crow, wants to make IT the primary driver in his ambitious "New American University" project. Last October, ASU became the first large institution to deploy Google Apps, a comprehensive suite of productivity applications that includes e-mail, search, calendars, instant messaging, and even word processing and spreadsheets. I've tried them out, and they work: nothing fancy, but certainly good enough for college homework assignments.
Already 40,000 students and faculty have switched their e-mail to Google, while keeping their asu.edu designation (out of a student population of 65,000, which Mr. Crow is trying to raise to 90,000!).
E-mail is a thorn in the side of storage administrators. Because e-mail systems are "semi-structured" repositories, administrators cannot just delete or move files around, as there is context between notes and their attachments that shouldn't be broken. E-mail systems are often the fastest-growing consumer of storage for many organizations.
Switching from maintaining their own mail servers to Google is saving ASU $500,000 (US) alone, not including the administrator labor savings. Again, some corporations might feel their e-mail is too "secret" to be outsourced like this, but college students who spend all their creative talent posting things on MySpace and YouTube, and faculty who spend their careers TRYING to get published, have nothing to hide from the rest of the world. It makes perfect sense.
Best of all, Google isn't charging ASU anything for this service. Google is able to cover the costs from advertising revenue instead. I can think of a lot of companies that might want to advertise to a demographic of "40,000 students who are mostly 18-25 years old and all live in or near Tempe, AZ".