Continuing my week in Washington DC for the annual [2010 System Storage Technical University], I presented a session on Storage for the Green Data Center, and attended a System x session on Greening the Data Center. Since they were related, I thought I would cover both in this post.
- Storage for the Green Data Center
I presented this topic in four general categories:
- Drivers and Metrics - I explained the three key drivers for consuming less energy, and the two key metrics: Power Usage Effectiveness (PUE) and Data Center Infrastructure Efficiency (DCiE).
- Storage Technologies - I compared the four key storage media types: Solid State Drives (SSD), high-speed (15K RPM) FC and SAS hard disk, slower (7200 RPM) SATA disk, and tape. I had comparison slides that showed how IBM disk was more energy efficient than competition, for example DS8700 consumes less energy than EMC Symmetrix when compared with the exact same number and type of physical drives. Likewise, IBM LTO-5 and TS1130 tape drives consume less energy than comparable HP or Oracle/Sun tape drives.
- Integrated Systems - IBM combines multiple storage tiers in a set of integrated systems managed by smart software. For example, the IBM DS8700 offers [Easy Tier] to offer smart data placement and movement across Solid-State drives and spinning disk. I also covered several blended disk-and-tape solutions, such as the Information Archive and SONAS.
- Actions and Next Steps - I wrapped up the talk with actions that data center managers can take to help them be more energy efficient, from deploying the IBM Rear Door Heat Exchanger, or improving the management of their data.
- Greening of the Data Center
Janet Beaver, IBM Senior Manager of Americas Group facilities for Infrastructure and Facilities, presented on IBM's success in becoming more energy efficient. The price of electricity has gone up 10 percent per year, and in some locations, 30 percent. For every 1 Watt used by IT equipment, there are an additional 27 Watts for power, cooling and other uses to keep the IT equipment comfortable. At IBM, data centers represent only 6 percent of total floor space, but 45 percent of all energy consumption. Janet covered two specific data centers, Boulder and Raleigh.
At Boulder, IBM keeps 48 hours reserve of gasoline (to generate electricity in case of outage from the power company) and 48 hours of chilled water. Many power outages are less than 10 minutes, which can easily be handled by the UPS systems. At least 25 percent of the Computer Room Air Conditioners (CRAC) are also on UPS as well, so that there is some cooling during those minutes, within the ASHRAE guidelines of 72-80 degrees Fahrenheit. Since gasoline gets stale, IBM runs the generators once a month, which serves as a monthly test of the system, and clears out the lines to make room for fresh fuel.
The IBM Boulder data center is the largest in the company: 300,000 square feet (the equivalent of five football fields)! Because of its location in Colorado, IBM enjoys "free cooling" using outside air temperature 63 percent of the year, resulting in a PUE of 1.3 rating. Electricity is only 4.5 US cents per kWh. The center also uses 1 Million KwH per year of wind energy.
The Raleigh data center is only 100,000 Square feet, with a PUE 1.4 rating. The Raleigh area enjoys 44 percent "free cooling" and electricity costs at 5.7 US cents per kWh. The Leadership in Energy and Environmental Design [LEED] has been updated to certify data centers. The IBM Boulder data center has achieved LEED Silver certification, and IBM Raleigh data center has LEED Gold certification.
Free cooling, electricity costs, and disaster susceptibility are just three of the 25 criteria IBM uses to locate its data centers. In addition to the 7 data centers it manages for its own operations, and 5 data centers for web hosting, IBM manages over 400 data centers of other clients.
It seems that Green IT initiatives are more important to the storage-oriented attendees than the x86-oriented folks. I suspect that is because many System x servers are deployed in small and medium businesses that do not have data centers, per se.
technorati tags: IBM, Technical University, Green Data Center, PUE, DCiE, Free Cooling, ASHRAE, LEED, SSD, Disk, Tape, SONAS, Archive
In keeping with the spirit to be a more kinder, gentler 2011, I decided last week to refrain from being the rain on someone else's parade that occurs immediately before, during or after a competitor's announcement or annual conference, and let EMC have their few moments in the spotlight last week. This of course allows me more time to learn about the announcements and reflect on marketplace reactions. Here's a quick look at the [EMC Press Release]:
- A new VNXe disk system
Of the 41 new storage technologies and products EMC announced last week, the VNXe is EMC's "me-too" product to compete against other low-end disk systems like the IBM System Storage DS3524 and N3000 series. It looks truly new, developed organically from the ground up, with a new architecture, new OS. It comes in either the 2U-high VNXe3100 or the 3U-high VNXe3300. These employ 3.5-inch SAS drives to provide Ethernet-based NFS, CIFS and iSCSI host attachment. The $10K USD price tag appears to be for the hardware only. As is typical for EMC, they charge software features in bundles or "suites", so the actual TCO will be much higher. I have not seen any announcements whether Dell plans to resell either the VNXe nor the VNX models, now that they have acquired Compellent.
- A new VNX disk system
Despite having a similar name as the VNXe, the VNX appears to be a re-hash of the Celerra/CLARiiON mess that EMC has been selling already, based on the old FLARE and DART operating systems of these older disk systems. This scales from 75 to 1000 SAS drives. While EMC calls the VNX "unified", it currently is only available in block-only and file-only models, with a future promise from EMC that they will offer a combined block-and-file version sometime in the future. EMC claims that the VNX will be faster than the predecessors, so hopefully that means EMC has joined the rest of the planet and will publish SPC-1 and SPC-2 benchmarks to back up that claim. They can compare against the SPC-1 benchmarks that our friends at NetApp ran against EMC CLARiiON.
- New software for the VMAX
A long time ago, EMC announced they would provide non-disruptive automated tiering. Their first delivery "FAST V1" handled entire LUNs at a time. EMC now has finally "FAST VP" which we expected was going to be called "FAST V2", which provides sub-LUN automated tiering between Solid-state and spinning disk drives.. Meanwhile, IBM has been delivering "Easy Tier" on the IBM System Storage DS8000 series, SAN Volume Controller, and Storwize V7000 disk systems.
- Data Domain Archiver
Competing against IBM, HP and Oracle in the tape arena, EMC's latest addition to the Data Domain family is designed for the long-term retention of backups? Archives of backups? Backups are short-lived, protecting against the unexpected loss from hardware failure or data corruption. Keeping backups as "archives" is generally a bad mistake, as it makes it hard to e-Discover the data you need when you need it, and may not have the appropriate hardware tor restore these old backups when you do find them.
I will have to dig deeper into all of these different technologies in separate posts in the future.
The first day had various breakout sessions in the afternoon.
- Understanding Your Options for Storing Archive Data to Meet Compliance Challenges
I presented IBM's Smart Archive strategy and the storage products IBM offers to archive data and meet compliance regulations:
- The differences between backup and archive, including a few of my own personal horror stories helping companies who had foolishly thought that keeping backup copies for years would adequately serve as their archive strategy
- The differences between Write-Once Read-Many (WORM) media, and Non-Erasable, Non-Rewriteable (NENR) storage options.
- How disk-only archive solutions become "space heaters" for your data center.
- An overview of the various storage hardware options from IBM.
- How LTFS can be incorporated into an archive solution, such as [Crossroads Systems' StrongBox® solution].
- An explanation of the different IBM software offerings to help complement the storage hardware choices.
- IBM TotalStorage Productivity Center (TPC): New Features and Functions
Mike Griese, IBM program manager for TPC, presented the latest in TPC 5.1 version announced this week. His session was organized into four key sections:
- Insights - TPC 5.1 integrates COGNOS reporting, which allows custonmization of reports and ad-hoc exploration and analysis. Since the reports are not binary-compiled into the product, IBM can ship new COGNOS reports as templates outside the normal TPC release schedule. Also, TPC 5.1 got smarter on reporting on server virtualization hypervisor environments to avoid double-counting.
- Recommendations - TPC 5.1 can analyze your usage patterns across the entire data center and make recommendations to move data from one storage tier to another. You can then act on these recommendations by moving data from one tier to another, either "up-tier" to faster storage, or "down-tier" to less expensive storage, using a storage hypervisor like IBM SAN Volume Controller. This is complementary to features like Easy Tier which optimize within a single disk system.
- Performance - TPC 5.1 uses a new web-based GUI, based on AJAX, HTML5 and Dojo widgets, inspired by the IBM XIV GUI, and similar to the web-based GUI of SAN Volume Controller, Storwize V7000 and SONAS.
- Optimization - TPC 5.1 allows you to optimize for Cloud by introducing a new RESTful API for storage provisioning and support for SONAS environments. This will allow upward-integration to products like [IBM Service Delivery Manager] and [Tivoli Storage Automation Manager].
Mike also explained the new TPC 5.1 packaging. Instead of having a variety of components like "TPC for Disk", "TPC for Data", and "TPC for Replication", the new packaging simplifies this down to two levels of functionality. The basic level supports block-level devices, including disk performance, replication and SAN fabric management. The advanced level adds support for files and databases, including support for Cloud management such as SONAS environments.
Dan Zehnpfennig, Solution Architect, talked about his experiences installing TPC 5.1 and how this was much improved over previous TPC versions.
- IBM Watson: How it Works and What it Means for Society Beyond Winning Jeopardy!
I presented how IBM Watson works, how it played the Jeopardy! game show last year, and how IBM has helped clients use the technology to solve real-world problems.
- Understanding the IBM Grand Challenge, how it compares to the IBM Deep Blue chess playing computer
- How IBM Watson works, the hardware, the software, and the algorithms involved
- How to build your own "Watson Jr." in your own basement, based on my [popular instructions I published last year].
- Examples of how the technology is being used in Healthcare and Financial Services
If you missed it, I will be repeating this session on IBM Watson on Thursday.
Tonight we have the grand opening reception of the Solution Center and a concert featuring Grace Potter & the Nocturnals!
technorati tags: IBM, Archive, Compliance, WORM, NENR, Mike Griese, , Dan Zehnpfennig, Tivoli Storage, Productivity Center, TPC, Watson, Healthcare, Financial Services, Wellpoint, Seton, CitiGroup
Continuing my coverage of the 30th annual [Data Center Conference]. here is a recap of Wednesday breakout sessions.
- Private Cloud Computing at Bank of America – One Year Later
Prentice Dees, Senior VP for Systems Automation Engineering at Bank of America, did the happy dance celebrating their success implementing a private cloud. Bank of America merged with Merrill Lynch, has 29 million users residing in over 100 countries, and 5900 retail offices in 40 countries. They manage $1 billion US dollars in deposits, and $2.2 trillion in assets.
Rather than IaaS or PaaS, his team focused on Application-as-a-Service (AaaS). Their goal is to transform and move IT out of the way of the business. In his view, if a human has to touch a keyboard, then his team has failed.
He divides the work up into three layers:
- Bones: These are the physical components, such as servers, storage, switches that provide capacity and interconnect.
- Muscle: This is the translation layer, providing actions and reporting.
- Brains: This is the layer for intelligent automation
Provisioning new servers with storage involves three sets of steps. The first set of steps involves requesting approval. The second set of steps deploys the server. The third involves installing the application, loading the data and using it until End-of-Life. The second set of steps took 14 to 60 days before, and has been automated down to one to three hours.
The results is that he has improved server utilization 10x, and storage is over-provisioned 4x, and are now hosting over 11,000 server images, saving $20 million US dollars. Not only is this lower cost per application deployed, but the process allows for lower-skilled personnel. He has over 500TB of virtual storage deployed, using thin provisioning, with only 128TB of physical disk. But they have only scratched the surface. Only 15 to 20 percent are virtualized in this manner, and they want to get to 80 percent within the next three years.
What makes an application not "Cloud-ready"? Prentice is a big fan of Linux and Open Source solutions. Some applications consume the entire server. In other cases, code changes are required. If possible, try to split up large applications into smaller Cloud-ready chunks?
How many people on his team? There are currently 16 to 20 people on the team, but at its peak there were 30 people.
Rather than wasting time on capacity planning, his team focuses on a cost recovery model instead. Seed capital in combination with rock-solid recovery is the way to go. "All models are wrong," the saying goes, "but some are useful!"
A nice side benefit to this new approach is maintenance is greatly improved. Rather than rushing to fix problems, you roll the application over to another host machine, and then take your time fixing the failed hardware.
How does the team deal with requests for dedicated resources? Give them the keys to their own miniature private cloud. Let them provision from their dedicated resources using the same methods you use to provision everyone else. This allows them to get comfortable with the process, and eventually join the rest of the shared pool. Analytics can be used to find "rogue VMs" that don't play well with others.
Their automation is a mix of commercial and open source software, with home-grown scripts. They have one "Orchestration Management Data Base" (OMDB) to manage multiple disparate Configuration Management Data bases (CMDBs). The chargeback is not quite per individual pay-per-use, but more at the departmental level.
- Aging Data: The Challenges of Long-Term Data Retention
The analyst defined "aging data" to be any data that is older than 90 days. A quick poll of the audience showed the what type of data was the biggest challenge:
In addition to aging data, the analyst used the term "vintage" to refer to aging data that you might actually need in the future, and "digital waste" being data you have no use for. She also defined "orphaned" data as data that has been archived but not actively owned or managed by anyone.
You need policies for retention, deletion, legal hold, and access. Most people forget to include access policies. How are people dealing with data and retention policies? Here were the poll results:
The analyst predicts that half of all applications running today will be retired by 2020. Tools like "IBM InfoSphere Optim" can help with application retirement by preserving both the data and metadata needed to make sense of the information after the application is no longer available. App retirement has a strong ROI.
Another problem is that there is data growth in unstructured data, but nobody is given the responsibility of "archivist" for this data, so it goes un-managed and becomes a "dumping ground". Long-term retention involves hardware, software and process working together. The reason that purpose-built archive hardware (such as IBM's Information Archive or EMC's Centera) was that companies failed to get the appropriate software and process to complete the solution.
Cloud computing will help. The analyst estimates that 40 percent of new email deployments will be done in the cloud, such as IBM LotusLive, Google Apps, and Microsoft Online365. This offloads the archive requirement to the public cloud provider.
A case study is University of Minnesota Supercomputing Institute that has three tiers for their storage: 136TB of fast storage for scratch space, 600TB of slower disk for project space, and 640 TB of tape for long-term retention.
What are people using today to hold their long-term retention data? Here were the poll results:
Bottom line is that retention of aging data is a business problem, techology problem, economic problem and 100-year problem.
- A Case Study for Deploying a Unified 10G Ethernet Network
Brian Johnson from Intel presented the latest developments on 10Gb Ethernet. Case studies from Yahoo and NASA, both members of the [Open Data Center Alliance] found that upgrading from 1Gb to 10Gb Ethernet was more than just an improvement in speed. Other benefits include:
- 45 percent reduction in energy costs for Ethernet switching gear
- 80 percent fewer cables
- 15 percent lower costs
- doubled bandwidth per server
Ruiping Sun, from Yahoo, found that 10Gb FCoE achieved 920 MB/sec, which was 15 percent faster than the 8Gb FCP they were using before.
IBM, Dell and other Intel-based servers support Single Root I/O Virtualization, or SR-IOV for short. NASA found that cloud-based HPC is feasible with SR-IOV. Using IBM General Parallel File System (GPFS) and 10Gb Ethernet were able to replace a previous environment based on 20 Gbps DDR Infiniband.
While some companies are still arguing over whether to implement a private cloud, an archive retention policy, or 10Gb Ethernet, other companies have shown great success moving forward.
technorati tags: IBM, BofA, Prentice+Dees, AaaS, Linux, Open Source, OMDB, CMDB, Aging data, Archive, Retention, , InfoSphere, Optim, LotusLive, University Minnesota, , 10GbE, SR-IOV, GPFS, private cloud
There is still time to enroll for [IBM Edge], a conference focused on storage, to be held June 4-8 in Orlando, Florida. There is an early-bird discount until May 6!
I will be there all week! Here are the seven sessions I will be presenting at the Technical Edge side of the event:
- Understanding Your Options for Storing Archive Data to Meet Compliance Challenges
This session will cover the IBM software and hardware solutions that your organization can use to store archive data, including features like immutability, Write-Once-Read-Many (WORM) technology and Non-Erasable, Non-Rewriteable (NENR) enforcement. The discussion will include high-level concepts like chronological and event-based retention, litigation hold and release, as well as an overview of the products and solutions from IBM that you can deploy today.
IBM Watson: How it Works and What it Means for Society Beyond Winning Jeopardy!
In 2011, the IBM Watson computer was able to beat the top-earning human winners on the trivia game-show “Jeopardy!” As I was the author of [How to Build Your Own Watson Junior in Your Basement], I have been asked to explain how the IBM Watson system was put together, how it works, and what examples of text mining and big data analytics means for society as we apply technology to meet tomorrow's challenges.
Using Social Media for IBM System Storage - Birds of a Feather
I will be moderating this Birds of a Feather, or BOF, session that will bring together a Q&A panel of experts on how social media can be leveraged to help you do your job, get the information you need to solve problems, and share your knowledge with others.
Data Footprint Reduction: Understanding IBM Storage Efficiency Options
Data Footprint Reduction is the catch-all term for a variety of technologies designed to help reduce storage costs. In this session, I will cover thin provisioning, space-efficient copies, deduplication and compression technologies, and describe the IBM storage products that provide these capabilities.
IBM's Storage Strategy in the Smarter Computing Era
Confused about IBM's new initiatives for Big Data analytics, Workload Optimized Systems, and Cloud Computing? This session will explain it all, and how IBM's strategy for its various storage products and solutions fit into these overall themes.
IBM SONAS and the IBM Cloud Storage Taxonomy
Confused over the different types of cloud storage? IBM's scale-out Network Attached Storage (SONAS) can be used in a variety of use cases. This session will provide an overview of IBM's SONAS solution, provide an update on the latest features and functions recently announced, and explain how it can be deployed in various private, public and hybrid cloud environments.
IBM Tivoli Storage Productivity Center Overview and Update
IBM has enhanced its premier storage infrastructure management tool: IBM Tivoli Storage Productivity Center. This session will provide both an overview of the product, and explain the latest features and functions recently announced.
I hope to see you all there!
technorati tags: IBM, Edge, Archive, Social Media, BOF, Data Footprint Reduction, Strategy, Smarter Planet, Smarter Computing, SONAS, Cloud, Taxonomy, Tivoli Storage, Productivity Center, TPC, IBM Watson