Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Systems Client Experience Center in Tucson, Arizona, and featured contributor
to IBM's developerWorks. In 2018, Tony celebrates his 32nd anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
This week, I will be in Las Vegas for the 30th annual [Data Center Conference]. For those on Twitter, follow the conference on hashtag #GartnerDC, and follow me at [@az990tony].
Once again, I will be working the IBM Exhibition Booth of the Solution Showcase, attending keynote and break-out sessions, and meeting with clients and analysts. Today is mostly setting up the booth, getting my registration badge and materials, attending an orientation meeting for first-timers, and finishing off the evening with a networking event to get the party started!
Traffic to and from the hotel was a mess today because of the [Las Vegas Strip at Night Rock-n-Roll Marathon]. The entire Las Vegas Boulevard was blocked off from 2pm to 11pm, causing taxis some headaches getting to and from each hotel. This marathon included a "Stiletto Dash" where women had to run in shoes that had at least three-inch heels! (Only in Las Vegas!)
The conference is organized into 8 tracks:
Navigating the Journey to Cloud-Delivered Services
Achieving and Maintaining IT Operational Excellence
Modernizing Your Storage Strategy to Keep Pace with Burgeoning Demand
Ensuring Your Business Continuity Management Plan Reflects Today's Realities and Tomorrow's Challenges
Virtualization: Moving at Light Speed While Leveraging Your Existing Investments
The Future of Servers and Operating Systems
Data Center Modernization: Staying Agile in Chaotic Times
Pervasive Mobility: What Infrastructure and Operations Needs to Know Now
I am glad to see that storage got its own track this year! If you are attending the conference, here are the sessions that IBM is featuring for Monday:
IBM: Watson and Your Data Center
This is a lunch-time talk. Steve Sams, IBM VP of Sites and Facilities, will explain how to leverage Watson-like analytic approaches to provide flexible, cost-effective data center solutions. Analytics can be used to better align IT to the business needs, optimize server, storage and network utilization and improve data center design.
IBM: University of Rochester Medical Center cracks the code on data growth
Rick Haverty, Director of Infrastructure for University of Rochester Medical Center (URMC), will discuss how his team built a storage strategy that transformed their environment to bring savings right to their bottom line without sacrificing the speed, criticality and performance requirements of their imaging and EMR systems. I will be there to introduce Rick at the beginning, and then moderate the Q&A after the talk.
Solution Showcase Reception
The Solution Showcase opens up Monday night with a reception, serving food and drinks. Look for the IBM Portable Mobile Data Center (PMDC), the big trailer on the show floor. We also have an exhibit booth, across from the PMDC, to ask questions and talk with various IBM experts. You can look for me and the other experts wearing white lab coats!
This week, I will be in Las Vegas for the 30th annual [Data Center Conference]. For those on Twitter, follow the conference on hashtag #GartnerDC, and follow me at [@az990tony]. IBM is a Global Partner and Platinum Sponsor for this event. Here is a recap of some of the Monday morning keynote sessions:
Welcome and Introduction
Monday morning kicked off with a welcome introduction from the conference coordinators. This is the highest attendance for this conference in its 30-year history, with 60 percent of attendees here for the first time, and 18 percent having attended only once before. This is the fourth time I am attending. Half of the attendees represent corporations with 20,000 employees or more, the other half from smaller companies and government agencies. The top five industries represented are financial services, public sector, healthcare, manufacturing, and energy.
This conference uses a clever "interactive polling" system where hand-held devices can be used to select choices, and results from over 800 voters are presented immediately on the big screen.
For IT budgets, 42 percent plan to increase next year, 32 percent flat, and 26 percent lower, which are similar to the numbers last year. Of nine different IT challenges, the top three were managing storage growth, power/cooling issues, and adopting a Cloud strategy.
Top 10 Trends and how they will impact Data Center IT
The analyst presented the top 10 business, technology and societal trends that will impact IT. He added a last-minute eleventh issue that he felt will impact everyone in 2012:
Consumerization and the Tablet. Back in 1997, a GB of flash memory cost 7,992 US dollars, and today that same GB costs only 25 cents. Employees are bringing their own devices to the workplace, and expecting IT support.
Infinite Data Center. You may never have to expand your floorspace again. Improvements in server and storage density can allow you to continually upgrade in place.
Energy Management. Data centers consume 100x more energy than the offices they support. The cost of energy is on par with IT equipment. Energy management is becoming an enterprise-wide discipline. A key performance indicator (KPI) can be "compute per kW" or "compute per square foot".
Context Awareness. There are hundreds of thousands of apps for Android-based smart phones and iPhones. Context awareness allows an app to help business travelers in airports know what restaurants are nearby, their flight status, and alternate flights available, based entirely on their location.
Hybrid Clouds. By 2013, over 60 percent of cloud adoption will be to redeploy existing apps like email. Some 80 percent of cloud initiatives will be private or hybrid configurations. Customers want "good enough" technology, and thus Cloud will be mostly an augmentation strategy.
Fabric Computing. The opposite of fully-integrated stacks is the notion of having compute, memory and storage joined together via an interconnect fabric with software to manage the entire environment.
IT Complexity. Robert Glass's Law states that for every 25 percent increase in functionality, there is a 100 percent increase in complexity. See Roger Sessions' whitepaper [The IT Complexity Crisis: Danger and Opportunity] for more on this.
Patterns and Analytics. Big data and business analytics are a key platform. This is expected to grow at a 60 percent CAGR.
Impact of Virtualization. Virtualizing your environment should be considered a continuous process, not a one-time project. Many companies are running x86 servers at less than 55 percent utilization, which the speaker considers under-utilized. Virtual Desktop Infrastructure (VDI) is a trade-off: it may cost more, but it has other business benefits to consider. The problem is that many IT shops are organized vertically (a server team, storage team, network team) but problems surface horizontally, and there is no "ownership" for the resolution. Some use "tiger teams" to address this. Companies should reward lateral thinking.
Social Media. Of the communications on cell phones by college students, 98.4 percent are text messages, and only 1.6 percent voice phone calls. People search Google for "what was", but they search Twitter for "what is". Most of the growth on Twitter is in the 39-52 year-old demographic. The analyst felt that if your company is blocking or restricting access to Facebook, Twitter, YouTube or other social networking sites, then shame on you. I agree!
Flooding in Thailand. Over two million square feet of HDD production space were flooded, and this will impact HDD prices for 2012. Already, a 2TB drive that was selling for $79 at a local store is now selling for $190.
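To put numbers on Glass's Law from the IT Complexity item above, here is a minimal sketch in Python. It assumes the rule compounds, so that each successive 25 percent increase in functionality doubles complexity. The function name and the compounding interpretation are my own illustration, not something taken from the whitepaper.

```python
import math


def complexity_multiplier(functionality_multiplier: float) -> float:
    """Relative complexity under Glass's Law, ASSUMING the rule compounds.

    A functionality_multiplier of 1.25 means "25 percent more functionality".
    """
    if functionality_multiplier <= 0:
        raise ValueError("functionality multiplier must be positive")
    # How many successive 25 percent steps are needed to reach the target.
    steps = math.log(functionality_multiplier) / math.log(1.25)
    # Each step is a 100 percent increase in complexity, i.e. a doubling.
    return 2.0 ** steps


if __name__ == "__main__":
    for f in (1.25, 1.5625, 2.0):
        print(f"{f:.4f}x functionality -> {complexity_multiplier(f):.2f}x complexity")
```

Under this reading, two 25 percent steps (about 56 percent more functionality) already quadruple complexity, which is why trimming requirements pays off so quickly.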
How To Get Your CFO's Support For Strategy and Funding
In the first of a series of "mastermind interviews", the analyst interviewed their own CFO Chris Lafond. Ultimately, it is about business results. The company has grown 15-20 percent annually, from 250 million US dollars in annual revenue in 2003 to 1.3 billion in 2011, and now has 4,600 employees doing business in 85 countries. The company is focused on three business areas: Research, Consulting, and Events like this one. Chris does not approve 3-5 year projects, and instead requests projects be broken up into year-long phases. ROI can be very misleading, and he asks instead for benefits and contributions to initiatives.
It is important to keep the horse in front of the cart. Accounting departments should not drive business decisions. For example, companies should not move to the public cloud just so that the accounting department can shift from CapEx to OpEx. Try to depreciate as soon as possible. Likewise, green technologies and social responsibility are factors, but not drivers of business decisions. Acquisitions are a natural evolution of the market, so risk mitigation strategies should be in place in case your vendor of choice is acquired by someone you don't like.
For BC/DR planning, the analyst firm has a single data center approach, but Chris indicated that IT is looking to expand this. Their single data center for one part of their business was in Florida, and the other in Massachusetts, and both were affected by hurricanes or earthquakes recently.
The "lightning round" asked Chris his thoughts, either thumbs up, thumbs down, or neutral, on single ideas or concepts. I liked this part of the interview!
Chargeback? Thumbs down. He doesn't feel you should have internal fighting over charge rates. He prefers showback instead.
BYO Device with stipend? Thumbs down, but inevitable. Giving people a chunk of money to buy their own laptop, smart phone or tablet of choice may wreak havoc on the IT department for support and service.
Telepresence? Thumbs down. Cool, but very expensive. I don't think people are prepared to exploit the benefits of this.
Corporate apps on public "app stores"? Thumbs down. Concerns over security and integration are the main issue.
Access to Social Networks? Thumbs up. This is how employees communicate and collaborate. Don't stifle them doing the right things just because you are afraid they might waste 20 minutes on Facebook per day.
Your IT budget? It's up slightly, 1-5 percent, for 2012.
Cloud? Promising, some challenges related to integration and security.
Chris finished up with a story about an application team that indicated that they would need to make 100 customizations to an off-the-shelf general ledger financial application. Chris and the other executives asked to be presented each and every customization, and he was able to eliminate most of them.
Positive comments I heard from the audience were that these keynotes had real "meat" to them, and were not just full of the cliches and platitudes that are common for keynote sessions. I would have to agree.
Continuing my coverage of the 30th annual [Data Center Conference]. Here is a recap of the Monday afternoon sessions:
IBM Watson and your Data Center
Steve Sams, IBM VP of Site and Facilities Services, cleverly used IBM Watson as a way to explain how analytics can be used to help manage your data center. Sadly, most of the people at my table missed the connection between IBM Watson and Analytics. How does answering a single trivia question in under three seconds relate to the ongoing operations of a data center? If you were similarly confused, take a peek at my series of IBM Watson blog posts:
The analyst who presented this topic was probably the fastest-speaking Texan I have met. He covered various aspects of Cloud Computing that people need to consider. Why hasn't Cloud taken off sooner? The analyst feels that Cloud Computing wasn't ready for us, and we weren't ready for Cloud Computing. The fundamentals of Cloud Computing have not changed, but we as a society have. Now that many end users are comfortable consuming public cloud resources, from Facebook to Twitter to Gmail, they are beginning to ask for something similar from their corporate IT.
Legal issues - see this hour-long video, [Cloud Law & Order], which discusses legal issues related to Cloud Computing.
Employee staffing - need to re-tool and re-train IT employees to start thinking of their IT as a service provider internally.
Hybrid Cloud - rather than struggle choosing between private and public cloud methodologies, consider a combination of both.
University of Rochester Medical Center (URMC) Cracks Code on Data Growth
Oftentimes, the hour is split: 30 minutes of the sponsor talking about various products, followed by 30 minutes of the client giving a user experience. Instead, I decided to let the client speak for 45 minutes, and then I moderated the Q&A for the remaining 15 minutes. This revised format seemed to be well-received!
University of Rochester is in New York, about 60 miles east of Buffalo, and 90 miles from Toronto across Lake Ontario. Six years ago, Rick Haverty joined URMC as the Director of Infrastructure services, managing 130 of the 300 IT personnel at the Medical Center. I met Rick back in May, when he presented at the IBM [Storage Innovation Executive Summit] in New York City.
URMC has DS8000, DS5000, XIV, SONAS, Storwize V7000 and is in the process of deploying Storwize V7000 Unified. He presented how he has used these for continuous operations and high availability, while controlling storage growth and costs.
The Q&A was lively, focusing on how his team manages 1PB of disk storage with just four storage administrators, his choice of a "Vendor Neutral Archive" (VNA), and his experiences with integration.
This was a great afternoon, and I was glad to get all my speaking gigs done early in the week. I would like to thank Rick Haverty of URMC for doing a great job presenting this afternoon!
Continuing my coverage of the 30th annual [Data Center Conference]. Here is a recap of the other Monday morning keynote sessions:
Driving Innovation to Achieve Dramatic Improvements
What is Innovation? It is a process that starts with one or more ideas, that results in change, that creates value. Easier said than done!
Innovation drives business growth. The analyst indicated that the IT infrastructure can either be in the way to impede business growth, neutral to enable growth, or contributing to business growth. Companies often find downtime as an inhibitor to business growth. The analyst gave these typical numbers.
Unplanned downtime (hours per year)
Planned Downtime (hours per year)
A big inhibitor to change is "cultural inertia", which states that the way things are prevents what they could be. Change requires both rewards and measures. Employees are often uncomfortable with change. Motivation should be with carrots, not sticks.
(I often joke that the only people who are comfortable with change are babies with soiled diapers and prisoners on death row!)
The impedance to change is further amplified by leadership, because what got leaders into their positions was their history of success, and they often perpetuate what worked for them in the past.
"There is nothing so useless as doing efficiently that which should not be done at all."
--- Peter Drucker
Nothing lasts forever, and companies should not try to avoid the inevitable. Innovators need to see themselves as change agents. The analyst feels that less than 10 percent of IT will adopt innovation to enact dramatic change. The analyst took a poll of the audience asking: Why isn't your IT Infrastructure and Operations more innovative? Over 800 attendees responded. Here were the results:
The analyst suggests treating Innovation like a team sport, with small 2-5 person teams. Search for breakthrough opportunities by setting audacious goals to inspire innovative thinking. What approach are most people doing today? Here are some polling results:
The analyst suggests it is more important to establish a culture of innovation first, and process second. Skunkworks projects are back in favor. IT folks should avoid the worship of so-called "best practices" as a reason to avoid trying something different. To think "outside-the-box" you need to get outside the box, or office, or cubicle, or wherever you work that prevents you from interacting with your internal or external customers. Customers can bring great insights on new approaches to take.
One new approach, born in the Cloud and now coming to the Enterprise, is the concept of [DevOps], which consists of promoting collaboration between the "Application Development" half of IT and the "Operations" half. If you had never heard of DevOps before, you are not alone; most of the attendees at this conference hadn't either. Here are the poll results:
Some companies have instituted a "Fresh Eyes" program, asking new-hires and early-tenure employees questions like: What surprised you the most when you joined the company? Was there anything that didn't make sense to you? Do you have any ideas to improve the way we do things?
"In a time of crisis we all have the potential to morph up to a new level and do things we never thought possible"
– Stuart Wilde
Why wait for a crisis?
Facebook: Efficient Infrastructure at a Massive Scale
Frank Frankovsky, the Director of Hardware Design and Supply Chain at Facebook, was sitting right next to me in the audience. I didn't know this until it was his turn to speak, and he jumped up and walked to the stage! For those who live under a rock and/or are over 40 years old, Facebook is a social media site that allows people to maintain personal profiles, share photos, news and messages, play games, and create groups to organize events. They now support over 800 million accounts, a healthy percentage of the 1.9 billion people on the internet today.
Started in 2004, Facebook was originally hosted on standard server and storage hardware in colocation facilities. Facebook cut costs by 38 percent by bringing their operations in-house, building their own servers from parts, and using no third-party software. Facebook has the advantage of owning their entire software stack, leveraging open source as much as possible. They even wrote their own PHP compiler, which they pronounce "Hip-Hop", short for high-performance-PHP.
Facebook can stand up a new data center in less than 10 months, from breaking ground to serving users. Most of Facebook's data centers sport a PUE less than 1.5, but their newest one in Prineville, Oregon is down to an amazing 1.07 level for a 7.5-megawatt facility! How did they do it? Here are a few of their tricks:
Use Scale-Out architecture. Having lots of small servers, scattered in various data centers, allows them to survive a server failure, as well as having the luxury to shut down a datacenter when needed for maintenance reasons.
Free Cooling. Instead of air-conditioning, they pump in cold air from the outside, and send the heated exhaust back outdoors. Frank does not believe servers should be treated like humans, so their data centers run uncomfortably hot. The 50-year climate data is used to determine data center locations that have the optimal "free cooling" opportunities.
Eliminate UPS and PDU energy losses. Rather than running 480 VAC power through UPS units that represent a 6 to 12 percent loss, and then PDUs that introduce another 3 percent loss getting down to 208 or 120 VAC, Frank's team builds servers that feed directly off the 480 VAC from the power company. For backup power, they use 48 VDC batteries. One set of batteries can back up six racks of servers.
Target 6 to 8 kW per rack. Low-density racks are easier to keep cool.
Build their own IT equipment. Rather than buying commercially-available servers, Frank's team builds 1.5U servers based on the Intel "Westmere" chipset. 1.5U allows for a larger fan radius than the standard 1U pizza box format. (IBM's iDataPlex uses 2U fans for the same reason!) Facebook has a "Vanity free" design philosophy, so no fancy plastic bezels. In most cases, the covers are left off. Most (65 percent) of their servers are web front-ends. They plan new IT equipment based on Intel's "Sandy Bridge" chipset.
Use SATA drives. They buy the largest SATA drives available, directly from manufacturers, in direct-attach storage (DAS) in their servers. Data is organized in a Hadoop cluster, and they have developed their own internal "Haystack" for photo storage. Despite the floods in Thailand, Facebook has secured all the SATA disk they plan to buy for 2012 from their suppliers.
Use Solid-State drives. Their Database tier uses 100 percent Solid-State drives.
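To see why the UPS/PDU trick and the low PUE matter, here is a rough back-of-the-envelope sketch in Python. PUE is total facility power divided by IT equipment power. The function names are my own, and I am assuming the 7.5 megawatts refers to the IT load, which the talk did not spell out.

```python
def delivered_fraction(ups_loss: float, pdu_loss: float) -> float:
    """Fraction of utility power that reaches the servers after
    passing through a UPS and then a PDU, each with a fractional loss."""
    return (1.0 - ups_loss) * (1.0 - pdu_loss)


def overhead_kw(it_load_kw: float, pue: float) -> float:
    """Non-IT power (cooling, power distribution, lighting) implied by a PUE.

    PUE = total facility power / IT power, so overhead = IT * (PUE - 1).
    """
    return it_load_kw * (pue - 1.0)


if __name__ == "__main__":
    # Loss figures quoted in the talk: UPS 6 to 12 percent, PDU 3 percent.
    for ups in (0.06, 0.12):
        lost = 1.0 - delivered_fraction(ups, 0.03)
        print(f"UPS loss {ups:.0%} + PDU loss 3% -> {lost:.1%} lost before the server")

    # ASSUMPTION: 7.5 MW is the IT load, not the total facility draw.
    it_load_kw = 7500.0
    for pue in (1.5, 1.07):
        print(f"PUE {pue}: about {overhead_kw(it_load_kw, pue):.0f} kW of overhead")
```

Chaining the quoted losses means roughly 9 to 15 percent of the power never reaches a conventional server, which is the loss Facebook avoids by feeding servers directly from 480 VAC.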
Frank is also a founder of the [Open Compute Project], which takes an "Open source" approach to IT hardware.
Facebook does not bother with hypervisors. Instead, they have adapted their own software to make full use of the CPU natively. This eliminates the "I/O Tax" penalty associated with VMware and other hypervisors.
Of course, not everyone owns their entire software stack, and can build their own servers! It was nice to hear how a company without such limitations can innovate to their advantage.
Continuing my coverage of the 30th annual [Data Center Conference]. Here is a recap of some of the Tuesday afternoon sessions:
Brocade: Maximizing Your Cloud: How Data Centers Must Evolve
This was a session sponsored by Brocade to promote their concept of the "Ethernet Fabric". The first speaker, John McHugh, was from Brocade, and the second speaker was a client testimonial, Jamie Shepard, EVP for International Computerware, Inc.
John had an interesting take on today's network challenges. He feels that most LANs are organized for "North-South" traffic, referring to uploads and downloads between clients and servers. However, the networks of tomorrow will need to focus on "East-West" traffic, referring to servers talking to other servers.
John was also opposed to integrated stacks that combine servers, storage and networking into a single appliance, as this prevents independent scaling of resources.
The Future of Backup is Not Backup
Primary data is growing at a 40 to 60 percent compound annual growth rate (CAGR), but backup data is growing faster. Why? Because data that was not backed up before is now being backed up, including test data, development data, and mobile application data.
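For a feel of what a 40 to 60 percent CAGR means in practice, here is a small Python sketch using the standard compound-growth formula. The function names and the five-year horizon are my own choices for illustration, not figures from the session.

```python
import math


def growth_factor(cagr: float, years: float) -> float:
    """How many times larger the data set is after `years` at a given CAGR."""
    return (1.0 + cagr) ** years


def doubling_time_years(cagr: float) -> float:
    """Years needed for the data set to double at a given CAGR."""
    if cagr <= 0:
        raise ValueError("CAGR must be positive for data to double")
    return math.log(2.0) / math.log(1.0 + cagr)


if __name__ == "__main__":
    # The 40 and 60 percent rates are the range quoted in the session.
    for cagr in (0.40, 0.60):
        print(
            f"CAGR {cagr:.0%}: doubles in {doubling_time_years(cagr):.2f} years, "
            f"{growth_factor(cagr, 5):.1f}x after 5 years"
        )
```

At these rates primary data doubles roughly every one and a half to two years, and backup data, growing faster still, doubles sooner than that.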
Backup costs are 19x more expensive than production software costs. There is an enormous gap in data protection because companies fail to factor this into their budgets. It is not uncommon for IT departments to use multiple backup tools, for example one tool for VMs, and another tool for servers, and a third product for desktops.
Part of the problem is identifying who "buys" the backup software. The server team might focus on the operating systems supported. The storage team focuses on the disk and tape media supported. The application owners focus on the features and capabilities for backup that minimize impact to their application.
The analyst organized these issues into three "C's" of backup concerns: Cost, Capability and Complexity. Cost is not just the software license fee for the backup software, but the cost of backup media, courier fees, and transmission bandwidth. Capability refers to the features and functions, and IT folks are tired of having to augment their backup solution with additional tools and scripts to compensate for lack of capability. Complexity refers to the challenges of trying to get existing backup software to tackle new sources like Virtual Machines, Mobile apps, and so on.
Has everyone moved to a tape-less backup system? Polling results found that people are shifting back to tape, either in a tape-only environment, or to supplement their disk or disk-based virtual tape library (VTL). Here are the polling results:
The poll also showed the top three backup software vendors were Symantec, IBM and Commvault, which is consistent with market share. However, the analyst feels that by 2014, an estimated 30 percent of companies will change their backup software vendor out of frustration over cost, capability and/or complexity.
There are a lot of new backup software products specific to dealing with Virtual Machines. Some are focused exclusively on VMware. When asked what tool people used to back up their VMs, the polling results showed the following. Note that the 20 percent for Other includes products from major vendors, like IBM Tivoli Storage Manager for Virtual Environments, as the analyst was more interested in the uptake of backup software from startups.
Some companies are considering Cloud Computing for backup. This is one area where having the cloud service provider at a distance is an actual advantage for added protection. A poll asking whether some or most data is backed up to the Cloud, either already today, or plans for the near future within the next 12 or 24 months, showed the following:
In addition to backup service providers, there are now several startups that offer file sharing, and some are adding "versioning" that can serve as an alternative to backup. These include Dropbox, SugarSync, iCloud, SpiderOak and ShareFile.
The final topic was Snapshot and Disk Replication. These tend to be hardware-based, so they may not have options for versioning, scheduling, or application-aware capabilities normally associated with backup software. Space-efficient snapshots, which point unchanged data back to the original source, may not provide full data protection that disparate backup copies would provide. Here were polling results on whether snapshot/replication was used to augment or replace some or most of their backups:
Some of his observations and recommendations:
Maintenance is more expensive than acquisition cost. Don't focus on the tip of the iceberg. Some backup software is more efficient for bandwidth and media which will save tons of money in the long run.
Try to optimize what you have. He calls this the "Starbucks effect". If you just need one coffee, then paying $4.50 for a cup makes sense. But if you need 100 coffees, you might be better off buying the beans.
Design backups to meet service level agreements (SLAs). In the past, backup was treated as one-size-fits-all, but today you can now focus on a workload by workload basis.
Be conservative in adopting new technologies until you have your backup procedures in place to handle data protection.
Backup is for operational recovery, not long-term retention of data. A poll showed two-thirds of the audience kept backup versions for longer than 60 days! Re-evaluate how long you keep backups, and how many versions you keep. If you need long-term retention, use an archive process instead.
Recovery testing is a dying art. Practice recovery procedures so that you can do it safely and correctly when it matters most.
The analyst had a series of awesome pictures of large structures, the pyramids of Giza, the Chrysler building, and so on, and how they would look without their foundations in place. Backup is a foundation and should be treated as such in all IT planning.
IT is evolving, but some basic needs like networking and backup procedures don't change. As companies re-evaluate their IT operations for Big Data, Cloud Computing and other new technologies, it is best to remember that some basic needs must be met as part of those evaluations.
Continuing my coverage of the 30th annual [Data Center Conference]. Here is a recap of the Tuesday morning sessions:
Wells Fargo: Data Center Lessons Learned from the Wachovia Acquisition
This was the next in their "Mastermind Interview" series. The analyst interviewed Scott Dillon, EVP and Head of Technology Infrastructure Services for Wells Fargo bank. Some 13 years ago, Wells Fargo merged with Norwest, and three years ago, Wells Fargo merged again, this time with Wachovia bank. Today, the new merged Wells Fargo manages 1.2 trillion USD in assets, some 12,000 ATMs, and 9,000 branch offices within two miles of 50 percent of the US population.
On the technical side, Scott's team has to deal with 10,000 IT changes per month, spanning 85 discrete businesses that Wells Fargo is involved in. To help drive the consolidation, they formed a culture group called "One Wells Fargo".
Often, Wells Fargo and Wachovia used different applications for the same function. The consolidation team took the A-or-B-but-not-C approach, which means they would either choose the existing application that Wells Fargo was already using (A), or the one that Wachovia was already using (B), but not look for a replacement (C). They also wanted to avoid re-platforming any apps during the merger. This simplified the process of developing target operating models (TOMs).
Before each application cut-over, the consolidation team did dry runs, dress rehearsals and walkthroughs over the phone to ensure smooth success. They wanted a Wachovia account holder to be able to walk into the bank on one day, and then come back the next day as a Wells Fargo account holder, into the same branch office but now with Wells Fargo signage, with minimal disruption.
Wells Fargo also adopted a test-to-learn approach of choosing small test markets to see how well the transition would work before tackling larger, more complicated markets. For example, they started in Colorado, where Wells Fargo has a huge presence, but Wachovia had a small presence.
This was first and foremost a business merger, not just an IT merger. Each decision took 6-18 months to act on, and the IT team spent the last three years working every weekend to make this a reality.
A Satirical Look at Business and Technology
Comedian Bob Hirschfeld presented a light-hearted look at the IT industry. Bob actually attended sessions on Monday at this conference, so his satire was exceptionally hard-hitting. He took jabs at the latest IT job requirements, padding on light poles, IBM Watson, social media's impact on dictators, various industry acronyms, virtualization, the various reasons why printer ink is so expensive, and the evil masterminds behind PowerPoint.
Storing Big Data takes a Village
Two analysts co-presented this session on the 12 dimensions of information management that revolve around the volume, variety and velocity of "Big Data".
In the past, it took a while to gather data, and a while to process the data, so annual, quarterly and monthly reports were common. Today, with high-velocity streams like Twitter, especially during cultural events or natural disasters, data is produced and analyzed quickly. It is important to sort the steady-state from the anomalies.
Myth 1: All data fits nicely into relational databases. The analysts feel the concept of putting everything into one big database is dead. Some data sets are so complicated that traditional database joins would cause smoke to come out of the sides of the servers. Instead, new technologies have emerged, including NoSQL, Cassandra, Hadoop, Columnar databases, and In-memory databases. XML has helped to bring together disparate data formats.
Companies need to adapt to this new reality of Business Analytics. Here is a poll of the audience on how many are in what stage of adaptation:
Myth 2: Everyone will do Big Data with commodity hardware. Businesses want commercial offerings that don't fail every day. (For example, instead of using open-source Hadoop, consider IBM's [InfoSphere BigInsights], a commercial product based on Hadoop and designed for the Enterprise).
Myth 3: Big Data is too big for backup. Certainly, traditional full-plus-incremental approaches fail to scale, but that is not the only option you have. Consider disk replication, snapshots, and integrated disk-and-tape blended solutions that adopt a more progressive backup methodology.
Capacity forecasting can be difficult with Big Data. Scale-out NAS systems, including IBM SONAS and the various me-too competitive offerings, which were originally focused on High Performance Computing (HPC) and the Media & Entertainment (M&E) industries, are now ready for prime-time and appropriate for other use cases.
It's like the game of Clue, but instead of Professor Plum with the candlestick in the library, it was Chuck with the Cluster in the Closet. To avoid shadow IT creating huge Hadoop Clusters in your closets, encourage the use of Cloud Computing for "sandbox" projects. IBM, Amazon and others offer hosted MapReduce engines for this purpose.
What type of storage do you plan to use for Big Data? The top five, weighted from a list during a poll of the audience were: (78) traditional disk arrays, (71) Scale-out NAS, (46) pre-configured appliances, (30) Hadoop clusters, and (23) Cloud Storage.
Big Data is about doing things differently. Do your employees understand analytical techniques? Your company may need to start thinking about policies for capturing Big Data, storing it correctly, and analyzing it for insights and patterns needed to stay competitive.
It was good to mix reality with a bit of humor. Some of these conference attendees take themselves too seriously, and it is good to be reminded that IT is just part of the overall business operation.
Continuing my coverage of the 30th annual [Data Center Conference], here is a recap of Wednesday morning sessions.
A Data Center Perspective on MegaVendors
The morning started with a keynote session. The analyst felt that the eight most strategic or disruptive companies in the past few decades were: IBM, HP, Cisco, SAP, Oracle, Apple and Google. Of these, he focused on the first three, which he termed the "Megavendors", presented in alphabetical order.
Cisco enjoys high margins and a loyal customer base with Ethernet switch gear. Their new strategy to sell UP and ACROSS the stack moves them into lower-margin businesses like servers. Their strong agenda with NetApp is not in sync with their partnership with EMC. They recently had senior management turnover.
HP enjoys a large customer base and is recognized for good design and manufacturing capabilities. Their challenges are mostly organizational, distracted by changes at the top and an untested and ever-changing vision, shifting gears and messages too often. Concerns over the Itanium have not helped them lately.
IBM defies simple description. One can easily recognize Cisco as an "Ethernet Switch" company, HP as a "Printer Company", Oracle as a "Database Company", but you can't say that IBM is an "XYZ" company, as it has re-invented itself successfully over its past 100 years, with a strong focus on client relationships. IBM enjoys high margins, sustainable cost structure, huge resources, a proficient sales team, and is recognized for its innovation with a strong IBM Research division. Their "Smarter Planet" vision has been effective in supporting their individual brands and unlocking new opportunities. IBM's focus on growth markets takes advantage of their global reach.
His final advice was to look for "good enough" solutions that are "built for change" rather than "built to last".
Chris works in the Data Center Management and Optimization Services team. IBM owns and/or manages over 425 data centers, representing over 8 million square feet of floorspace. This includes managing 13 million desktops, and 325,000 x86 and UNIX server images, and 1,235 mainframes. IBM is able to pool resources and segment the complexity for flexible resource balancing.
Chris gave an example of a company that selected a Cloud Compute service provider on the East coast and a Cloud Storage provider on the West coast, both for offering low rates, but was disappointed in the latency between the two.
Chris asked "How did 5 percent utilization on x86 servers ever become acceptable?" When IBM is brought in to manage a data center, it takes a "No Server Left Behind" approach to reduce risk and allow for a strong focus on end-user transition. Each server is evaluated for its current utilization:
Amazingly, many servers are unused. These are recycled properly.
1 to 19 percent: Workload is virtualized and moved to a new server.
20 to 39 percent: Use IBM's Active Energy Manager to monitor the server.
40 to 59 percent: Add more VMs to this virtualized server.
over 60 percent: Manage the workload balance on this server.
This approach allows IBM to achieve a 60 to 70 percent utilization average on x86 machines, with an ROI payback period of 6 to 18 months, and a 2x-3x increase in servers-managed-per-FTE.
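The utilization bands above can be sketched as a small lookup function. This is only an illustration of the triage idea, not IBM's actual tooling: the function name, the action strings and the server names are hypothetical, and the handling of values that fall between the stated bands (for example 19.5 or exactly 60 percent) is an assumption.

```python
def triage_server(utilization_pct: float) -> str:
    """Map a server's average utilization (0-100) to the action for its band."""
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100 percent")
    if utilization_pct == 0:
        return "recycle the unused server"
    if utilization_pct < 20:
        return "virtualize the workload and move it to a new server"
    if utilization_pct < 40:
        return "monitor the server with Active Energy Manager"
    if utilization_pct < 60:
        return "add more VMs to this virtualized server"
    # Assumption: exactly 60 percent is treated like "over 60 percent".
    return "manage the workload balance on this server"


# Hypothetical fleet, for illustration only.
fleet = {"web01": 0, "db02": 12.5, "app03": 35, "app04": 55, "batch05": 82}
for name, pct in fleet.items():
    print(f"{name}: {triage_server(pct)}")
```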
Storage is classified using Information Lifecycle Management (ILM) best practices, using automation with pre-defined data placement and movement policies. This allows only 5 percent of data to be on Tier-1, 15 percent on Tier-2, 15 percent on Tier-3, and 65 percent on Tier-4 storage.
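To make the tier percentages concrete, here is a minimal sketch that splits a total capacity across the four tiers. The function name and the 1,000 TB figure are hypothetical; only the percentages come from the session.

```python
# Share of data per tier, in percent, as quoted in the session.
TIER_PERCENT = {"Tier-1": 5, "Tier-2": 15, "Tier-3": 15, "Tier-4": 65}


def split_capacity(total_tb: float) -> dict[str, float]:
    """Return how many TB would land on each tier for a given total."""
    return {tier: total_tb * pct / 100 for tier, pct in TIER_PERCENT.items()}


# Hypothetical 1,000 TB environment.
print(split_capacity(1000))
# {'Tier-1': 50.0, 'Tier-2': 150.0, 'Tier-3': 150.0, 'Tier-4': 650.0}
```

In other words, under these ratios only 50 TB of a 1,000 TB estate would need the most expensive tier.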
Chris recommends adopting IT Service Management, and shifting away from one-off builds, stand-alone apps, and siloed cost management structures, over to standardization and shared resources.
You may have heard of "Follow-the-sun" but have you heard of "Follow-the-moon"? Global companies often establish "follow-the-sun" for customer service, re-directing phone calls to be handled by people in countries during their respective daytime hours. In the same manner, server and storage virtualization allows workloads to be moved to data centers during night-time hours, following the moon, to take advantage of "free cooling" using outside air instead of computer room air conditioning (CRAC).
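As a rough illustration of "follow-the-moon", the sketch below picks the data centers where it is currently night. The site names, the fixed UTC offsets (which ignore daylight saving time) and the 8pm-to-6am definition of "night" are all assumptions for the example, not details from the session.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sites with fixed UTC offsets in hours (daylight saving ignored).
SITES = {"Tucson": -7, "Dublin": 0, "Singapore": 8}


def night_sites(now_utc: datetime, start_hour: int = 20, end_hour: int = 6) -> list[str]:
    """Return the sites whose local hour falls in the night-time window."""
    result = []
    for name, offset in SITES.items():
        local_hour = (now_utc + timedelta(hours=offset)).hour
        if local_hour >= start_hour or local_hour < end_hour:
            result.append(name)
    return result


# At noon UTC it is 5am in Tucson and 8pm in Singapore.
noon = datetime(2011, 12, 7, 12, 0, tzinfo=timezone.utc)
print(night_sites(noon))  # ['Tucson', 'Singapore']
```

A real scheduler would also weigh outside air temperature, network latency and the cost of moving the workload, none of which is modeled here.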
Since 2007, IBM has been able to double computer processing capability without increasing energy consumption or carbon gas emissions.
It's Wednesday, Day 3, and I can tell already that the attendees are suffering from "information overload".
Continuing my coverage of the 30th annual [Data Center Conference], here is a recap of Wednesday breakout sessions.
Aging Data: The Challenges of Long-Term Data Retention
The analyst defined "aging data" to be any data that is older than 90 days. A quick poll of the audience showed what type of data was the biggest challenge:
In addition to aging data, the analyst used the term "vintage" to refer to aging data that you might actually need in the future, and "digital waste" for data you have no use for. She also defined "orphaned" data as data that has been archived but not actively owned or managed by anyone.
You need policies for retention, deletion, legal hold, and access. Most people forget to include access policies. How are people dealing with data and retention policies? Here were the poll results:
The analyst predicts that half of all applications running today will be retired by 2020. Tools like "IBM InfoSphere Optim" can help with application retirement by preserving both the data and metadata needed to make sense of the information after the application is no longer available. App retirement has a strong ROI.
Another problem is that there is data growth in unstructured data, but nobody is given the responsibility of "archivist" for this data, so it goes un-managed and becomes a "dumping ground". Long-term retention involves hardware, software and process working together. The reason that purpose-built archive hardware (such as IBM's Information Archive or EMC's Centera) fell short was that companies failed to get the appropriate software and process to complete the solution.
Cloud computing will help. The analyst estimates that 40 percent of new email deployments will be done in the cloud, such as IBM LotusLive, Google Apps, and Microsoft Office 365. This offloads the archive requirement to the public cloud provider.
One case study is the University of Minnesota Supercomputing Institute, which has three tiers for their storage: 136 TB of fast storage for scratch space, 600 TB of slower disk for project space, and 640 TB of tape for long-term retention.
What are people using today to hold their long-term retention data? Here were the poll results:
The bottom line is that retention of aging data is a business problem, a technology problem, an economic problem and a 100-year problem.
A Case Study for Deploying a Unified 10G Ethernet Network
Brian Johnson from Intel presented the latest developments on 10Gb Ethernet. Case studies from Yahoo and NASA, both members of the [Open Data Center Alliance], found that upgrading from 1Gb to 10Gb Ethernet was more than just an improvement in speed. Other benefits include:
45 percent reduction in energy costs for Ethernet switching gear
80 percent fewer cables
15 percent lower costs
doubled bandwidth per server
Ruiping Sun, from Yahoo, found that 10Gb FCoE achieved 920 MB/sec, which was 15 percent faster than the 8Gb FCP they were using before.
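A quick back-of-the-envelope check of that comparison: if 920 MB/sec is 15 percent faster than the earlier setup, the implied 8Gb FCP throughput can be recovered by dividing. This assumes "15 percent faster" means a throughput ratio of 1.15; the function name is hypothetical.

```python
def implied_baseline(new_mb_per_sec: float, percent_faster: float) -> float:
    """Return the baseline throughput implied by a 'percent faster' claim."""
    return new_mb_per_sec / (1 + percent_faster / 100)


baseline = implied_baseline(920, 15)
print(f"Implied 8Gb FCP throughput: {baseline:.0f} MB/sec")  # 800 MB/sec
```

That works out to roughly 800 MB/sec, which is in line with what one would expect from an 8Gb Fibre Channel link.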
IBM, Dell and other Intel-based servers support Single Root I/O Virtualization, or SR-IOV for short. NASA found that cloud-based HPC is feasible with SR-IOV. Using IBM General Parallel File System (GPFS) and 10Gb Ethernet, they were able to replace a previous environment based on 20 Gbps DDR InfiniBand.
While some companies are still arguing over whether to implement a private cloud, an archive retention policy, or 10Gb Ethernet, other companies have shown great success moving forward!
This is my final post on my coverage of the 30th annual [Data Center Conference]. IBM was a Platinum sponsor, and there were over 2,600 attendees, of which 27 percent were IT Directors or higher. Two thirds of the companies have 5000 employees or more. Here is a recap of the last few sessions I attended.
Best Practices for Data Center Consolidation
As if the conference co-chairs aren't already super-busy, here they are presenting one of the breakout sessions. In the 1990s, consolidation was done purely to reduce total cost of ownership (TCO). Today, there are a variety of other reasons, including issues with power and cooling, service level agreements, and security.
Of these, 25 percent plan to have more data centers in three years, and 47 percent plan to consolidate to fewer. The benefits of consolidation include economies of scale, staff reduction, reduced hardware and facilities costs, and application retirement. Challenges include dealing with politics, building new facilities to replace the old ones, and bandwidth. Here were some of the primary reasons why data center consolidation projects fail:
Human Resources (HR) issues
Resources not freed up or made available
Lack of Project Management skills
No rationalization at consolidated site
Interactive Polling Results
The last keynote session was Thursday morning. The conference co-chairs presented the highlights of the interactive polling that was done during the week at this conference.
The first topic was social media. There was a lot of Twitter activity with hashtag #GartnerDC that I followed throughout the week. Most of the tweets seem to be from people who were not actually at the conference.
Some 45 percent of the attendees have implemented social media initiatives at their companies. What tooling are they using to accomplish this? There are some provided by the major ITSM vendors, tools specific for corporate social media such as Yammer, collaboration tools like Microsoft SharePoint and IBM's Lotus Connections, and public sites like Facebook and Twitter. Here were the poll results:
The next topic was focused on Mobile devices and Cloud Computing. For example, do companies store data in public cloud, or plan to in the future, for mobile devices?
One third of the attendees allow employees to bring their own tablet to work with full IT support. Only 18 percent allow employees to bring their own PC or laptop. Over 40 percent felt that their IT department was not yet ready to support smartphones.
What are the main drivers to adopt private cloud? Some are deploying private clouds as a way to defend their IT jobs from going to the public cloud. Here were the poll results:
What problems are companies trying to solve with cloud computing? Here were the poll results:
A majority of attendees that use VMware are exploring alternatives, such as Linux KVM with Red Hat Enterprise Virtualization (RHEV), or Microsoft Hyper-V. What storage protocol are attendees using for their server virtualization? Here were the poll results:
The next topic was the process for IT service management. The top three were ITIL, CMMI and DevOps, with the majority using ITIL or ITIL in combination with something else. These are needed for release management, change management, performance management, capacity management and incident management. How collaborative is the relationship between IT operations and application development? Here were the poll results:
How well does IT operations contribute to business innovation? This year 38 percent were satisfied, and 33 percent unsatisfied. This was a big improvement over last year, which found 19 percent satisfied and 64 percent unsatisfied.
Building a Private Storage Cloud: Is It a Science Experiment?
While everyone understands the benefits of private and public cloud computing, there seems to be hesitation about hosted cloud storage. Some people have already adopted some form of cloud storage, and others plan to within 12 months. Here were the poll results:
The top three reasons for considering public cloud storage were to adopt a lower-cost storage tier, to benefit from off-site storage, and staff constraints. The top concerns were security and performance.
The IT department will need to start thinking like a cloud provider, and perhaps adopt a hybrid cloud approach. What IT equipment can be re-used? What will the new IT operations look like in a Cloud environment? What were the primary use cases for cloud storage? Here were the poll results:
In addition to the major cloud providers (IBM, Amazon, etc.) there are a variety of new cloud storage startups to address these business needs.
So that wraps up my coverage of this conference. In addition to attending great keynote and breakout sessions, I was able to have great one-on-one discussions with clients at the Solution Showcase booth, during breaks and at meals. IBM's focus on Big Data, Workload-optimized Systems, and Cloud seems to resonate well with the analysts and attendees. I want to give special thanks to Lynda, Dana, Peggy, Hugo, David, Rick, Cris, Richard, Denise, Chloe, and all my colleagues, friends and family from Arizona for their support!
Continuing my coverage of the 30th annual [Data Center Conference]. Here is a recap of more of the Tuesday afternoon sessions:
IBM CIOs and Storage
Barry Becker, IBM Manager of Global Strategic Outsourcing Enablement for Data Center Services, presented this session on Storage Infrastructure Optimization (SIO).
A bit of context might help. I started my career in DFHSM, which moved data from disk to tape to reduce storage costs. Over the years, I would visit clients, analyze their disk and tape environment, and provide a set of recommendations on how to run their operations better. In 2004, this was formalized into week-long "Information Lifecycle Management (ILM) Assessments", and I spent 18 months in the field training a group of folks on how to perform them. The IBM Global Technology Services team has taken a cross-brand approach, expanding this ILM approach to include evaluations of the application workloads and data types. These SIO studies take 3-4 weeks to complete.
Over the next decade, there will only be 50 percent more IT professionals than we have today, so new approaches will be needed for governance and automation to deal with the explosive growth of information.
SIO deals with both the demand and supply of data growth in five specific areas:
Data reclamation, rationalization and planning
Virtualization and tiering
Backup, business continuity and disaster recovery
Storage process and governance
Archive, Retention and Compliance
The process involves gathering data and interviewing business, financial and technical stakeholders like storage administrators and application owners. The interviews take less than one hour per person.
Over the past two years, the SIO team has uncovered disturbing trends. A big part of the problem is that 70 percent of data stored on disk has not been accessed in the past 90 days, and is unlikely to be accessed at all in the near future, so it would probably be better stored on lower-cost storage tiers.
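For readers who want a rough feel for this number on their own systems, here is a minimal sketch that measures what fraction of bytes under a directory has not been accessed in 90 days. It is not the method used in the SIO studies; the function name is hypothetical, and it relies on file access times, which can be unreliable on file systems mounted with options such as noatime or relatime.

```python
import os
import time


def stale_fraction(root: str, days: int = 90) -> float:
    """Return the fraction of bytes under root not accessed in the last `days` days."""
    cutoff = time.time() - days * 86400
    total_bytes = 0
    stale_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # do not follow symbolic links
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total_bytes += info.st_size
            if info.st_atime < cutoff:
                stale_bytes += info.st_size
    return stale_bytes / total_bytes if total_bytes else 0.0


if __name__ == "__main__":
    # Scan the current directory as an example.
    print(f"{stale_fraction('.'):.0%} of bytes not accessed in 90 days")
```

Data that shows up as stale here would be a candidate for the lower-cost tiers mentioned above.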
Storage Resource Management (SRM) is also a mess, with over 85 percent of clients having serious reporting issues. Even rudimentary "Showback" systems to report back what every individual, group or department was using resulted in significant improvement.
Archive is not universally implemented mostly because retention requirements are often misunderstood. Barry attributed this to lack of collaboration between storage IT personnel, compliance officers, and application owners. A "service catalog" that identifies specific storage and data types can help address many of these concerns.
The results were impressive. Clients that follow SIO recommendations save on average 20 to 25 percent after one year, and 50 percent after three to five years. Implementing storage virtualization averaged 22 percent lower CAPEX costs. Those that implemented a "service catalog" saved on average 1.9 million US dollars. Internally, IBM's own operations have saved 13 million dollars implementing these recommendations over the past three years.
Reshaping Storage for Virtualization and Big Data
The two analysts presenting this topic acknowledged there is no downturn in the demand for storage. To address this, they recommend companies identify storage inefficiencies, develop better forecasting methodologies, implement ILM, and follow vendor management best practices during acquisition and outsourcing.
To deal with new challenges like virtualization and Big Data, companies must decide to keep, replace or supplement their SRM tools, and build a scalable infrastructure.
One suggestion to get upper management to accept new technologies like data deduplication, thin provisioning, and compression is to refer to them as "Green" technologies, as they help reduce energy costs as well. Thin provisioning can help drive up storage utilization to rates as high as you dare; 60 to 70 percent is typically what most people are comfortable with.
A poll of the audience found that the top three initiatives for 2012 are to implement data deduplication, 10Gb Ethernet, and Solid-State Drives (SSD).
The analysts explained that there are two different types of cloud storage. The first kind is storage "for" the cloud, used for cloud compute instances (aka Virtual Machines), such as Amazon EBS for EC2. The second kind is storage "as" the cloud, storage as a data service, such as Amazon S3, Azure Blob and AT&T Synaptic.
The analysts feel that cloud storage deployments will be mostly private clouds, bursting as needed to public cloud storage. This creates the need for a concept called "Cloud Storage Gateways" that manage this hybrid of some local storage and some remote storage. IBM's SONAS Active Cloud Engine provides long-distance caching in this manner. Other smaller startups include cTera, Nasuni, Panzura, Riverbed, StorSimple, and TwinStrata.
A variation of this is "storage gateways" for backup and archive providers, which act as a staging area for data to be subsequently sent on to the remote location.
New projects like virtualization, Cloud computing and Big Data are giving companies a new opportunity to re-evaluate their strategies for storage, process and governance.
Next week, I will be in Las Vegas for the 30th annual [Data Center Conference]. This is my fourth year attending. For a bit of nostalgia, check out my blog posts from the [2008 event] and the [2009 event].
Back in October, Daryl Pereira asked me for an interview about my blog. I get a lot of these requests, but this one was different. Daryl is on the IBM developerWorks team, and he was going to interview me for the "Great Mind Challenge". This is a fun competition for a group of about 100 college students from San Jose State University to get them to learn blogging best practices and techniques.
This was the one post that put me into the #1 position, with over 70,000 hits so far and counting, and that does not include all the people who read my blog through feed readers or the various cross-postings on IBM Storage Community and IBM Virtual Briefing Center.
This blog post was part of a series on IBM Watson, the computer that beat two humans on the "Jeopardy!" television game show. Having worked closely with the IBM Research scientists to understand how IBM Watson worked so that I could blog about it, I thought a good way for readers to appreciate how it was put together was to explain how to assemble a scaled-down version. My inspiration was an article by John Pultorak that explained [how to build your own Apollo Guidance Computer (AGC) in your basement].
The blog post series proved to be a big hit. IBM Watson helps to demonstrate many modern computer techniques, including business analytics of Big Data, Cloud Computing, and parallel programming techniques such as Hadoop. Showing that a "Watson Jr." could be built in your basement helped to emphasize that IBM Watson was made from hardware and software that are generally available today.
I am very proud of this blog post. I worked with Moshe Yanai and the rest of the XIV team to be completely accurate and correct to set the right level of expectations. So many false statements and FUD had been thrown out about what would happen if a double drive failure happened during the short 30-minute window of opportunity, and it turns out that in most cases, no data is lost, and in all other cases, the lost data can be easily identified and restored. In most cases, this will require less recovery than a double drive failure on a traditional RAID-5 disk array.
It was also an opportunity to try out Animoto to create a short and simple video. Normally, when marketing needs a video made, it will cost 25,000 US dollars or more, and take weeks to produce. I was able to get this video done in just a few hours with no out-of-pocket expenses.
After this post, nearly all FUD in the blogosphere about double drive failures disappeared. More importantly, the XIV sales that quarter (2Q2010) were substantially better than the prior quarter. Many XIV sales reps credit this blog post for that huge bump in XIV sales! I guess this could be the Tony Pearson equivalent of the [Colbert Bump].
In 2009 and 2010, I was the third most influential blogger on IBM's developerWorks, and now in 2011, I have risen to the number one position! Internally, we call this "Winning the Devy" (like an Emmy, but for developerWorks bloggers). I would like to thank all my readers for continuing to share in the conversation!