Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Systems Client Experience Center in Tucson, Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
This week, July 26-30, 2010, I am in Washington DC for the annual [2010 System Storage Technical University]. As with last year, we have joined forces with the System x team. Since we are in Washington DC this time, IBM added a "Federal Track" to focus on government challenges and solutions. In effect, attendees can attend three conferences for one low price.
This conference was previously called the "Symposium", but IBM changed the name to "Technical University" to emphasize the technical nature of the conference. No marketing puffery like "Journey to the Private Cloud" here! Instead, this is bona fide technical training, qualifying attendees to count this towards their Continuing Professional Education (CPE).
(Note to my readers: The blogosphere is like a playground. In the center are four-year-olds throwing sand into each other's faces, while mature adults sit on benches watching the action, jumping in only as needed. For example, fellow blogger Chuck Hollis (EMC) got sand in his face for promising to resign if EMC ever offered a tacky storage guarantee, and then [failed to follow through on his promise] when it happened.
Several of my readers asked me to respond to another EMC blogger's latest [fistful of sand].
A few months ago, fellow blogger Barry Burke (EMC) committed to [stick to facts] in posts on his Storage Anarchist blog. That didn't last long! BarryB apparently has fallen in line with EMC's over-promise-then-under-deliver approach. Unfortunately, I will be busy covering the conference and IBM's robust portfolio of offerings, so I won't have time to address BarryB's stinking pile of rumor and hearsay until next week or later. I am sorry to disappoint.)
This conference is designed to help IT professionals make their business and IT infrastructure more dynamic and, in the process, help reduce costs, mitigate risks, and improve service. This technical conference event is geared to IT and Business Managers, Data Center Managers, Project Managers, System Programmers, Server and Storage Administrators, Database Administrators, Business Continuity and Capacity Planners, IBM Business Partners and other IT Professionals. This week will offer over 300 different sessions and hands-on labs, certification exams, and a Solutions Center.
For those who want a quick stroll through memory lane, here are my posts from past events:
In keeping with IBM's leadership in Social Media, the IBM Systems Lab Services and Training team running this event has its own [Facebook Fan Page] and
[blog]. IBM Technical University has a Twitter account [@ibmtechconfs], and hashtag #ibmtechu. You can also follow me on Twitter [@az990tony].
Continuing my coverage of the Data Center 2010 conference, Monday I attended four keynote sessions.
The first keynote speaker started out with an [English proverb]: Turbulent waters make for skillful mariners.
He covered the state of the global economy and how CIOs should address the challenge. We are on the flat end of an "L-shaped" recovery in the United States. GDP growth is expected to be only 4.7 percent in Latin America, 2.3 percent in North America, and 1.5 percent in Europe. Top growth areas include India at 8.0 percent and China at 8.6 percent, with an average of 4.7 percent growth for the entire Asia Pacific region.
On the technical side, the top technologies that CIOs are pursuing for 2011 are Cloud Computing, Virtualization, Mobility, and Business Intelligence/Analytics. He asked the audience if the "Stack Wars" for integrated systems are hurting or helping innovation in these areas.
Move over "conflict diamonds", companies now need to worry about [conflict minerals].
He proposed an alternative approach called Fabric-Based Infrastructure. In this new model, a shared pool of servers is connected to a shared pool of storage over an any-to-any network. In this approach, IT staff spend all of their time just stocking up the vending machine, allowing end-users to get the resources they need.
Crucial Trends You Need to Watch
The second speaker covered ten trends to watch, but these were not limited to just technology trends.
Virtualization is just beginning - even though IBM has had server virtualization since 1967 and storage virtualization since 1974, the speaker felt that adoption of virtualization is still in its infancy. Ten years ago, average CPU utilization for x86 servers was only 5-7 percent. Thanks to server virtualization like VMware and Hyper-V, companies have increased this to 25 percent, but many virtualization projects have stalled.
Big Data is the elephant in the room - storage is expected to grow 800 percent over the next 5 years.
Green IT - Datacenters consume 40 to 100 times more energy than the offices they support. Six months ago, Energy Star had announced [standards for datacenters] and energy efficiency initiatives.
Unified Communications - Voice over IP (VoIP) technologies, collaboration through email and instant messages, and the focus on smartphones and other mobile devices combine many overlapping areas of communication.
Staff retention and retraining - According to US Labor statistics, the average worker will have 10 to 14 different jobs by the time they reach 38 years of age. People need to broaden their scope and not be so vertically focused on specific areas.
Social Networks and Web 2.0 - the keynote speaker feels this is happening, and companies that try to restrict usage at work are fighting an uphill battle. Better to get ready for it and adopt appropriate policies.
Legacy Migrations - companies are stuck on old technology like Microsoft Windows XP, Internet Explorer 6, and older levels of Office applications. Time is running out, but migration to later releases or alternatives like Red Hat Linux with the Firefox browser is not a trivial task.
Compute Density - Moore's Law, which says compute capability will double every 18 months, is still going strong. We are now getting more cores per socket, forcing applications to be rewritten for parallel processing or to rely on virtualization technologies.
Cloud Computing - every session this week will mention Cloud Computing.
Converged Fabrics - some new approaches are taking shape for datacenter design. Fabric-based infrastructure would benefit from converging SAN and LAN fabrics to allow pools of servers to communicate freely to pools of storage.
He sprinkled fun factoids about our world to keep things entertaining.
50 percent of today's 21-year-olds have produced content for the web. 70 percent of four-year-olds have used a computer. The average teenager writes 2,282 text messages on their cell phone per month.
This year, Google averaged 31 billion searches per month, compared to 2.6 billion searches per month in 2007.
More video has been uploaded to YouTube in the last two months than the three major US networks (ABC, NBC, CBS) have aired since 1948.
Wikipedia averages 4300 new articles per day, and now has over 13 million articles.
This year, Facebook reached 500 million users. If it were a country, it would be ranked third. Twitter would be ranked 7th, with 69% of their growth being from people 32-50 years old.
In 1997, a GB of flash memory cost nearly $8000 to manufacture; today it costs only $1.25.
The computer in today's cell phone is a million times cheaper, and a thousand times more powerful, than a single computer installed at MIT back in 1965. In 25 years, the compute capacity of today's cell phones could fit inside a blood cell.
See [interview of Ray Kurzweil] on the Singularity for more details.
The Virtualization Scenario: 2010 to 2015
The third keynote covered virtualization. While server virtualization has helped reduce server costs, as well as power and cooling energy consumption, it has had a negative effect on other areas. Companies that have adopted server virtualization have discovered increased costs for storage, software and test/development efforts.
The result is a gap between expectations and reality. Many virtualization projects have stalled because there is a lack of long-term planning. The analysts recommend deploying virtualization in stages: tackle the first third, the so-called "low hanging fruit", then proceed with the next third, and then wait and evaluate results before completing the last third, the most difficult applications.
Virtualizing storage and desktop clients are completely different projects from server virtualization and should be handled accordingly.
Cloud Computing: Riding the Storm Out
The fourth keynote focused on the pros and cons of Cloud Computing. The speaker started by defining the five key attributes of Cloud: self-service, scalable elasticity, a shared pool of resources, metered and paid per use, and delivered over open standard networking technologies.
In addition to IaaS, PaaS and SaaS classifications, the keynote speaker mentioned a fourth one: Business Process as a Service (BPaaS), such as processing Payroll or printing invoices.
While the debate rages over the benefits of private versus public cloud approaches, the keynote speaker brought up the opportunities for hybrid and community clouds. In fact, he felt there is a business model for a "cloud broker" that acts as the go-between for companies and cloud service providers.
A poll of the audience found the top concerns inhibiting cloud adoption were security, privacy, regulatory compliance and immaturity. Some 66 percent indicated they plan to spend more on private cloud in 2011, and 20 percent plan to spend more on public cloud options. He suggested six focus areas:
Test and Development
Prototyping / Proof-of-Concept efforts
Web Application serving
SaaS like email and business analytics
Select workloads that lend themselves to parallelization
The session wrapped up with some stunning results reported by companies. Server provisioning accomplished in 3-5 minutes instead of 7-12 weeks. Reduced cost of email by 70 percent. Four-hour batch jobs now completed in 20 minutes. 50 percent increase in compute capacity with a flat IT budget. With these kinds of results, the speaker suggested that CIOs should at least start experimenting with cloud technologies and start to profile their workloads and IT services to develop a strategy.
That was just Monday morning. This is going to be an interesting week!
Continuing my coverage of the 30th annual [Data Center Conference], here is a recap of the other Monday morning keynote sessions:
Driving Innovation to Achieve Dramatic Improvements
What is Innovation? It is a process that starts with one or more ideas, that results in change, that creates value. Easier said than done!
Innovation drives business growth. The analyst indicated that the IT infrastructure can be in the way, impeding business growth; neutral, enabling growth; or actively contributing to business growth. Companies often find downtime to be an inhibitor to business growth. The analyst gave these typical numbers.
(Table: unplanned downtime and planned downtime, in hours per year.)
A big inhibitor to change is "cultural inertia", the idea that the way things are prevents what they could be. Change requires both rewards and measures. Employees are often uncomfortable with change. Motivation should come from carrots, not sticks.
(I often joke that the only people who are comfortable with change are babies with soiled diapers and prisoners on death row!)
The impedance to change is further amplified by leadership, because what got leaders into their positions was their history of success, and they often perpetuate what worked for them in the past.
"There is nothing so useless as doing efficiently that which should not be done at all."
--- Peter Drucker
Nothing lasts forever, and companies should not try to avoid the inevitable. Innovators need to see themselves as change agents. The analyst feels that less than 10 percent of IT will adopt innovation to enact dramatic change. The analyst took a poll of the audience asking: Why isn't your IT Infrastructure and Operations more innovative? Over 800 attendees responded. Here were the results:
The analyst suggests treating Innovation like a team sport, with small 2-5 person teams. Search for breakthrough opportunities by setting audacious goals to inspire innovative thinking. What approach are most people doing today? Here are some polling results:
The analyst suggests it is more important to establish a culture of innovation first, and process second. Skunkworks projects are back in favor. IT folks should avoid the worship of so-called "best practices" as a reason to avoid trying something different. To think "outside-the-box" you need to get outside the box, or office, or cubicle, or wherever you work that prevents you from interacting with your internal or external customers. Customers can bring great insights on new approaches to take.
One new approach, born in the Cloud and now coming to the Enterprise, is the concept of [DevOps], which consists of promoting collaboration between the "Application Development" half of IT and the "Operations" half. If you had never heard of DevOps before, you are not alone; most of the attendees at this conference hadn't either. Here are the poll results:
Some companies have instituted a "Fresh Eyes" program, asking new-hires and early-tenure employees questions like: What surprised you the most when you joined the company? Was there anything that didn't make sense to you? Do you have any ideas to improve the way we do things?
"In a time of crisis we all have the potential to morph up to a new level and do things we never thought possible"
– Stuart Wilde
Why wait for a crisis?
Facebook: Efficient Infrastructure at a Massive Scale
Frank Frankovsky, the Director of Hardware and Design and Supply Chain at Facebook, was sitting right next to me in the audience. I didn't know this until it was his turn to speak, and he jumped up and walked to the stage! For those who live under a rock and/or are over 40 years old, Facebook is a social media site that allows people to maintain personal profiles, share photos, news and messages, play games, and create groups to organize events. They now support over 800 million accounts, a healthy percentage of the 1.9 billion people on the internet today.
Started in 2004, Facebook was originally hosted on standard server and storage hardware in colocation facilities. Facebook saved 38 percent in costs by bringing their operations in-house, building their own servers from parts, and using no third-party software. Facebook has the advantage of owning their entire software stack, leveraging open source as much as possible. They even wrote their own PHP compiler, which they call "Hip-Hop", short for high-performance PHP.
Facebook can stand up a new data center in less than 10 months, from breaking ground to serving users. Most of Facebook's data centers sport a PUE less than 1.5, but their newest one in Prineville, Oregon is down to an amazing 1.07 level for a 7.5 Megawatt facility! How did they do it? Here are a few of their tricks:
Use Scale-Out architecture. Having lots of small servers, scattered in various data centers, allows them to survive a server failure, as well as having the luxury to shut down a datacenter when needed for maintenance reasons.
Free Cooling. Instead of air-conditioning, they pump in cold air from the outside, and send the heated exhaust back outdoors. Frank does not believe servers should be treated like humans, so their data centers run uncomfortably hot. The 50-year climate data is used to determine data center locations that have the optimal "free cooling" opportunities.
Eliminate UPS and PDU energy losses. Rather than running 480 VAC power through UPS units that represent a 6 to 12 percent loss, and then PDUs that introduce another 3 percent loss getting down to 208 or 120 VAC, Frank's team builds servers that feed directly off the 480 VAC from the power company. For backup power, they use 48 VDC batteries. One set of batteries can back up six racks of servers.
Target 6 to 8 KW per rack. Low-density racks are easier to keep cool.
Build their own IT equipment. Rather than buying commercially-available servers, Frank's team builds 1.5U servers based on the Intel "Westmere" chipset. 1.5U allows for a larger fan radius than the standard 1U pizza box format. (IBM's iDataPlex uses 2U fans for the same reason!) Facebook has a "Vanity free" design philosophy, so no fancy plastic bezels. In most cases, the covers are left off. Most (65 percent) of their servers are web front-ends. They plan new IT equipment based on Intel's "Sandy Bridge" chipset.
Use SATA drives. They buy the largest SATA drives available, directly from manufacturers, in direct-attach storage (DAS) in their servers. Data is organized in a Hadoop cluster, and they have developed their own internal "Haystack" for photo storage. Despite the floods in Thailand, Facebook has secured all the SATA disk they plan to buy for 2012 from their suppliers.
Use Solid-State drives. Their Database tier uses 100 percent Solid-State drives.
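To put the UPS, PDU and PUE figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the UPS and PDU losses simply compound in series, and it uses the standard definition of PUE as total facility power divided by IT equipment power. Treating the 7.5 Megawatt figure as the total facility draw is my assumption for illustration, not something Frank stated, and the function names are hypothetical.

```python
def chained_loss(*stage_losses):
    """Combined fractional loss when power passes through stages in series.

    Each argument is the fraction lost in one stage (0.06 means 6 percent).
    """
    delivered = 1.0
    for loss in stage_losses:
        delivered *= 1.0 - loss
    return 1.0 - delivered


def it_power_mw(total_facility_mw, pue):
    """IT equipment power implied by PUE = total facility power / IT power."""
    return total_facility_mw / pue


# UPS loss of 6 to 12 percent, followed by a PDU loss of 3 percent.
best_case = chained_loss(0.06, 0.03)
worst_case = chained_loss(0.12, 0.03)
print(f"UPS + PDU chain loses {best_case:.1%} to {worst_case:.1%} of the power")

# Assumption: 7.5 MW is the total facility draw, not the IT load.
for pue in (1.5, 1.07):
    it_mw = it_power_mw(7.5, pue)
    print(f"PUE {pue}: {it_mw:.2f} MW reaches IT gear, {7.5 - it_mw:.2f} MW is overhead")
```

Under these assumptions, the conventional UPS-plus-PDU chain gives up roughly 9 to 15 percent of the incoming power before it ever reaches a server, which is the loss that feeding servers directly from 480 VAC avoids.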
Frank is also a founder for the [Open Compute Project], which takes an "Open source" approach to IT hardware.
Facebook does not bother with hypervisors. Instead, they have adapted their own software to make full use of the CPU natively. This eliminates the "I/O Tax" penalty associated with VMware and other hypervisors.
Of course, not everyone owns their entire software stack, and can build their own servers! It was nice to hear how a company without such limitations can innovate to their advantage.
This week I am in Orlando, Florida for the IBM Edge conference. Tuesday afternoon we had a Birds-of-a-Feather (BOF) session to discuss social media. I was the moderator. We had two independent bloggers on the panel: [Jon Toigo] and [Steve Foskett]. We had several IBM social media experts, including Jack Arnold, Scott Drummond, Mary Hall, Nick Harris, and Rich Swain. Also joining us was Alex Hollingworth, social media expert from Emulex.
At the opening session, Deon Newman suggested we re-tweet him. Isn't that plagiarism? What is your take on this?
The important thing is to give credit where it is due. Avoid screen scraping others and passing it off as your own. When you re-tweet someone, you give them credit for their original tweet. You are basically saying, "I could not have said it better myself!" With blogs, you can do the same by linking to other blog posts.
I am active in social media, but am having trouble getting the older colleagues in the IT department to participate. I want them to write down all the knowledge in their heads.
The best way to get employees to do anything new or different is to show them how it benefits them. For example, if the elders are tired of answering the same questions over and over, have them start an internal wiki, blog or knowledgebase to capture the answers to frequent questions. This will save them time, so they can see value for themselves. I suggest looking at IBM Lotus Connections which provides collaboration tools inside your firewall, accessible only to internal employees of the company.
How do we differentiate facts from opinions in our social media writings?
You can always be explicit, for example IMHO stands for "In my humble opinion". I find that blogs are 99 percent opinion, and 1 percent fact, so it is easier to point out the facts by linking to or citing sources, and let the rest of your writing be considered opinion.
I would like to use LinkedIn to establish business relationships with the storage administrators, decision makers and influencers within the companies I want to sell to. How do I best do that?
Nobody likes cold calls. If you upgrade to a "Pro" account on LinkedIn, you can send 15 to 25 "InMail" emails through their system to introduce yourself. Otherwise, consider finding someone in your network who knows them, and arrange for that person to provide the mutual introduction for you.
How do I find people to follow related to the topics I am interested in, like storage?
There are tools like [Tweetadder] to help you find people to follow. Or, just search on certain hashtags, and add people you find that use them.
I am concerned about privacy. What can I do to protect my privacy?
Decide up front which topics are off-limits in your blog or other social media. For services like Facebook, check your privacy settings every 30 days. Several people have opted to create a special "Facebook Page" that represents their professional brand, so that the rest of Facebook can be used for friends and family.
I want to start a new blog, which service should I use?
Services like Blogger, Blogspot and TypePad are generally easy to set up. WordPress is more advanced, but can be more complicated to set up.
I don't care for writing a blog, how can I set up a video blog, or vlog?
Consider creating a channel on YouTube. Another popular site is Vimeo. A "Pro" account of Vimeo provides added features.
I am new to Twitter, what tools should I look into?
I suggest you look at HootSuite. This lets you post to Twitter, Facebook and LinkedIn. You can write tweets in advance and schedule them to be posted at a certain date and time. Also, if you have a blog, you can have HootSuite send out tweets automatically with the title and link for each blog post.
How much effort should we put in to Social Media?
As much or as little as you want. Don't force yourself to spend more time than you want. Typically, people spend 1-2 hours per day. Cut down how much you spend watching television to make up the difference. Set up "Google Alerts" that can send you emails when certain phrases appear anywhere. There are also social bookmarking tools like Instapaper, Delicious or Diigo that can save bookmarks in the cloud for things that you want to read, but don't have time to read now.
Which social media would be the best to get chicks?
Writing a technical blog with good quality content. Girls want to be with you. Guys want to be you.
How can I use social media to provide feedback about specific products?
IBM now has a [Reviews and Ratings] for its IBM System Storage products. Consider writing a review today!
Thanks to all of the panel for their help with this!
This week I am at the Data Center Conference 2009 in Las Vegas. There are some 1700 people registered this year for this conference, representing a variety of industries like Public sector, Services, Finance, Healthcare and Manufacturing. A survey of the attendees found:
55 percent are at this conference for the first time.
18 percent once before, like me
15 percent two or three times before
12 percent four or more times before
Plans for 2010 IT budgets were split evenly, one third planning to spend more, one third planning to spend about the same, and the final third looking to cut their IT budgets even further than in 2009. The biggest challenges were Power/Cooling/Floorspace issues, aligning IT with Business goals, and modernizing applications. The top three areas of IT spend will be for Data Center facilities, modernizing infrastructure, and storage.
There are six keynote sessions scheduled, and 66 breakout sessions for the week. A "Hot Topic" was added on "Why the marketplace prefers one-stop shopping" which plays to the strengths of IT supermarkets like IBM, encourages HP to acquire EDS and 3Com, and forces specialty shops like Cisco and EMC to form alliances.
Day 2 began with a series of keynote sessions. Normally when I see "IO" or "I/O", I immediately think of input/output, but here "I&O" refers to Infrastructure and Operations.
Business Sensitivity Analysis leads to better I&O Solutions
The analyst gave examples from Alan Greenspan's biography to emphasize his point that what this financial meltdown has caused is a decline in trust. Nobody trusts anyone else. This is true between people, companies, and entire countries. While the GDP declined 2 percent in 2009 worldwide, it is expected to grow 2 percent in 2010, with some emerging markets expected to grow faster, such as India (7 percent) and China (10 percent). Industries like Healthcare, Utilities and Public sector are expected to lead the IT spend by 2011.
While IT spend is expected to grow only 1 to 5 percent in 2010, there is a significant shift from Capital Expenditures (CapEx) to Operational Expenses (OpEx). Five years ago, in 2004, OpEx represented only 64 percent of the IT budget, but today it represents 76 percent and growing. Many companies are keeping their aging IT hardware in service longer, beyond traditional depreciation schedules. The analyst estimated over 1 million servers were kept longer than planned in 2009, and another 2 million will be kept longer in 2010.
An example of hardware kept too long was the November 17 delay of some 2000 flights in the United States, caused by a failed router card in Utah that was part of the air traffic control system. Modernizing this system is estimated to cost 40 billion US dollars.
Top 10 priorities for the CIO were Virtualization, Cloud Computing, Business Intelligence (BI), Networking, Web 2.0, ERP applications, Security, Data Management, Mobile, and Collaboration. There is a growth in context-aware computing, connecting operational technologies with sensors and monitors to feed back into IT, with an opportunity for pattern-based strategy. Borrowing a concept from the military, "OpTempo" allows a CIO to speed up or slow down various projects as needed. By seeking out patterns, developing models to understand those patterns, and then adapting the business to fit those patterns, a strategy can be developed to address new opportunities.
Infrastructure and Operations: Charting the course for the coming decade
This analyst felt that strategies should not just be focused looking forward, but also look left and right, what IBM calls "adjacent spaces". He covered a variety of hot topics:
65 percent of the energy used to run x86 servers is spent doing nothing. The average x86 server runs at only 7 to 12 percent CPU utilization.
Virtualization of servers, networks and storage is transforming IT to become one big logical system image, which plays well with Green IT initiatives. He joked that this is what IBM offered 20 years ago with Mainframe "Single System Image" sysplexes, and that we have come around full circle.
One area of virtualization is desktop images (VDI). This goes back to the benefits of green-screen 3270 terminals of the mainframe era, eliminating the headaches of managing thousands of PCs, and instead having thin clients rely heavily on centralized services.
The deluge in data continues, as more convenient access drives demand for more data. The analyst estimates storage capacity will increase 650 percent over the next five years, with over 80 percent of this being unstructured data. Automated storage tiering, à la Hierarchical Storage Manager (HSM) from the mainframe era, is once again popular, along with new technologies like thin provisioning and data deduplication.
IT is also being asked to do complex resource tracking, such as power consumption. In the past IT and Facilities were separate budgets, but that is beginning to change.
The fastest growing social network was Twitter, with 1382 percent growth in 2009; 69 percent of the new users who joined this year were 39 to 51 years old. By comparison, Facebook only grew by 249 percent. Social media is a big factor both inside and outside a company, and management should be aware of what Tweets, Blogs, and others in the collective are saying about you and your company.
The average 18 to 25 year old sends out 4000 text messages per month. In 24 hours, more text messages are sent out than people on the planet (6.7 billion). Unified Communications is also getting attention. This is the idea that all forms of communication, from email to texts to voice over IP (VoIP), can be managed centrally.
Smart phones and other mobile devices are changing the way people view laptops. Many business tasks can be handled by these smaller devices.
It costs more in energy to run an x86 server for three years than it costs to buy it. The idea of blade servers and componentization can help address that.
Mashups and Portals are an unrecognized opportunity. An example of a Mashup is mapping a list of real estate listings to Google Maps so that you can see all the listings arranged geographically.
Lastly, Cloud Computing will change the way people deliver IT services. Amusingly, the conference was playing "Both Sides Now" by Joni Mitchell, which has the [lyrics about clouds].
Unlike other conferences that clump all the keynotes at the beginning, this one spreads the "Keynote" sessions out across several days, so I will cover the rest over separate posts.
The technology industry is full of trade-offs. Take for example solar cells that convert sunlight to electricity. Every hour, more energy hits the Earth in the form of sunlight than the entire planet consumes in an entire year. The general trade-off is between energy conversion efficiency versus abundance of materials:
Get 9-11 percent efficiency using rare materials like indium (In), gallium (Ga) or cadmium (Cd).
Get only 6.7 percent efficiency using abundant materials like copper (Cu), tin (Sn), zinc (Zn), sulfur (S), and selenium (Se).
A second trade-off is exemplified by EMC's recent GeoProtect announcement. This appears similar to the geographic dispersal method introduced by a company called [CleverSafe]. The trade-off is between the amount of space to store one or more copies of data and the protection of data in the event of disaster. Here's an excerpt from fellow blogger Chuck Hollis (EMC) titled ["Cloud Storage Evolves"]:
"Imagine a average-sized Atmos network of 9 nodes, all in different time zones around the world. And imagine that we were using, say, a 6+3 protection scheme.
The implication is clear: any 3 nodes could be completely lost: failed, destroyed, seized by the government, etc. -- and the information could be completely recovered from the surviving nodes."
For organizations worried about their information falling into the wrong hands (whether criminal or government sponsored!), any subset of the nodes would yield nothing of value -- not only would the information be presumably encrypted, but only a few slices of a far bigger picture would be lost.
Seized by the government? Falling into the wrong hands? Is EMC positioning Atmos as "Storage for Terrorists"? I can certainly appreciate the value of being able to protect 6PB of data with only 9PB of storage capacity, instead of keeping two copies of 6PB each. However, the trade-off is that you will be accessing the majority of your data across your intranet, which could impact performance. But, if you are in an illicit or illegal business that could have a third of your facilities "seized by the government", then perhaps you shouldn't house your data centers there in the first place. Having two copies of 6PB each, in two "friendly nations", might make more sense.
(In reality, companies often keep way more than just two copies of data. It is not unheard of for companies to keep three to five copies scattered across two or three locations. Facebook keeps SIX copies of photographs you upload to their website.)
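The capacity side of this trade-off is simple arithmetic, sketched below in Python. The function names are hypothetical, and the remote-read estimate rests on two assumptions of mine rather than anything EMC published: each of the 9 nodes holds exactly one slice of every object, and a reader sits next to at most one node.

```python
from fractions import Fraction


def dispersed_raw_pb(usable_pb, data_slices, parity_slices):
    """Raw capacity needed to protect usable_pb with a k+m dispersal scheme."""
    return usable_pb * Fraction(data_slices + parity_slices, data_slices)


def replicated_raw_pb(usable_pb, copies):
    """Raw capacity needed when keeping full copies instead."""
    return usable_pb * copies


def min_remote_fraction(data_slices):
    """Smallest share of each read that must come from other sites.

    Assumes one slice per node and at most one local node, so at least
    k - 1 of the k slices needed for a read cross the network.
    """
    return Fraction(data_slices - 1, data_slices)


print("6+3 dispersal:", dispersed_raw_pb(6, 6, 3), "PB raw for 6 PB of data")
print("Two full copies:", replicated_raw_pb(6, 2), "PB raw for 6 PB of data")
print("Share of each read fetched remotely: at least", min_remote_fraction(6))
```

With these assumptions, the 6+3 scheme saves 3PB of raw capacity compared to two full copies and survives the loss of any three nodes, but at least five-sixths of every read has to travel between sites.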
ChuckH argues that the governments that seize the three nodes won't have a complete copy of the data. However, merely having pieces of data is enough for governments to capture terrorists. Even if the striping is done at the smallest 512-byte block level, those 512 bytes of data might contain names, phone numbers, email addresses, credit cards or social security numbers. Hackers and computer forensics professionals take advantage of this.
You might ask yourself, "Why not just encrypt the data instead?" That brings me to the third trade-off, protection versus application performance. Over the past 30 years, companies had a choice: they could encrypt and decrypt the data as needed, using server CPU cycles, but this would slow down application processing. Every time you wanted to read or update a database record, more cycles would be consumed. This forced companies to be very selective about what data they encrypted: which columns or fields within a database, which email attachments, and which other documents or spreadsheets.
An initial attempt to address this was to introduce an outboard appliance between the server and the storage device. For example, the server would write to the appliance with data in the clear, the appliance would encrypt the data, and pass it along to the tape drive. When retrieving data, the appliance would read the encrypted data from tape, decrypt it, and pass the data in the clear back to the server. However, this had the unintended consequences of using 2x to 3x more tape cartridges. Why? Because the encrypted data does not compress well, so tape drives with built-in compression capabilities would not be able to shrink down the data onto fewer tapes.
(I covered the importance of compressing data before encryption in my previous blog post
[Sock Sock Shoe Shoe].)
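Here is a small Python sketch of why the order matters. Python's standard library has no AES, so this uses XOR against a random one-time key purely as a stand-in for a real cipher; the point is only that good ciphertext looks like random noise, which a compressor cannot shrink. The sample record is made up, and because it is far more repetitive than real tape data, the gap shown here is much larger than the 2x to 3x seen in practice.

```python
import os
import zlib


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Stand-in for a real cipher: XOR each byte with a random key byte.

    Applying it twice with the same key returns the original data.
    """
    return bytes(a ^ b for a, b in zip(data, key))


# Made-up, highly repetitive sample data (real data compresses less well).
record = b"customer=Jane Doe;status=active;region=US-West\n" * 2000

# Wrong order: encrypt first, then let the "tape drive" try to compress.
key_a = os.urandom(len(record))
encrypt_then_compress = zlib.compress(xor_cipher(record, key_a))

# Right order: compress first, then encrypt the smaller result.
compressed = zlib.compress(record)
key_b = os.urandom(len(compressed))
compress_then_encrypt = xor_cipher(compressed, key_b)

print("original bytes:       ", len(record))
print("encrypt then compress:", len(encrypt_then_compress))
print("compress then encrypt:", len(compress_then_encrypt))

# The right order is still fully recoverable: decrypt, then decompress.
recovered = zlib.decompress(xor_cipher(compress_then_encrypt, key_b))
print("round trip ok:", recovered == record)
```

Encrypting first leaves the compressor with nothing to work on, so the output is about as large as the input, while compressing first keeps the space savings and still protects the data.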
As with the trade-off between energy efficiency and abundant materials, IBM eliminated this trade-off by offering compression and encryption on the tape drive itself. This is standard 256-bit AES encryption implemented on a chip, able to process the data as it arrives at near line speed. So now, instead of having to choose between protecting your data and running your applications with acceptable performance, you can do both, encrypting all of your data without having to be selective. This approach has been extended to disk drives, so that disk systems like the IBM System Storage DS8000 and DS5000 can support full-disk-encryption [FDE] drives.