By Steve Kenniston
History truly does repeat itself. We are talking about the history of
data storage. Every once in a while a new technology comes along that
requires a new way to think about infrastructure. Notice I said
“infrastructure”. I’d like to paint two analogies:
1: RAID – Prior to RAID, users stored their data on disk and, if they
could afford it, backed that data up to have a protected copy. When RAID
came out, users were able to store their data on multiple disks that
appeared as one device. The benefits were increased data reliability and
better performance. This new technology fundamentally changed how disk
was sold, yet the questions stayed the same:
How much capacity do you need?
What type of performance does your application require?
The sales rep’s point of view changed. There were a number of new
considerations that needed to be taken into account. First, the age-old
question: “Will I sell less storage ‘stuff’?” Remember, the person
selling the disk at the time was probably also selling the backup tape
and software to protect that information. If the disks are more reliable,
maybe the customer won’t need as much tape? Second, when the capacity
question came up, the seller also needed to know what type of RAID the
customer wanted, to ensure they sold them enough drives. It was no
longer as simple as taking the capacity requirement and dividing it by
the drive capacity at the time. Depending upon the RAID level, there
was a new set of math to be done. Third was the notion of performance.
More spindles meant more performance, so once the capacity equation was
solved, you also needed to know the I/O requirements to make sure the
right number of drives were sold to satisfy both capacity and
performance.
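The “new set of math” can be sketched in a few lines. The following is a minimal illustration, not a sizing tool: it assumes a single RAID group, the textbook capacity overhead for each level (none for RAID 0, one drive for RAID 5, two for RAID 6, half the drives for RAID 10), and a flat per-drive IOPS figure with no write penalty. The function names and the sample numbers are hypothetical.

```python
import math

# Minimum number of drives in a single group at each RAID level.
MIN_DRIVES = {0: 2, 5: 3, 6: 4, 10: 4}


def drives_for_capacity(usable_tb, drive_tb, raid_level):
    """Drives needed to reach usable_tb of usable space at a RAID level."""
    if raid_level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    data_drives = math.ceil(usable_tb / drive_tb)
    if raid_level == 0:
        total = data_drives        # striping only, no redundancy
    elif raid_level == 5:
        total = data_drives + 1    # one drive's worth of parity
    elif raid_level == 6:
        total = data_drives + 2    # two drives' worth of parity
    else:
        total = data_drives * 2    # RAID 10: every drive is mirrored
    return max(total, MIN_DRIVES[raid_level])


def drives_for_performance(required_iops, iops_per_drive):
    """Spindles needed to meet an I/O requirement (assumes no write penalty)."""
    return math.ceil(required_iops / iops_per_drive)


def drives_to_sell(usable_tb, drive_tb, raid_level, required_iops, iops_per_drive):
    """The larger of the capacity answer and the performance answer."""
    return max(
        drives_for_capacity(usable_tb, drive_tb, raid_level),
        drives_for_performance(required_iops, iops_per_drive),
    )
```

Under these assumptions, 10 TB usable on 1 TB drives at RAID 5 needs 11 drives for capacity, but if the application needs 3,000 IOPS and each drive delivers 150, performance pushes the count to 20.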
You know what? We figured it out and the industry never looked back. RAID is a
de facto standard in all storage subsystems today; I even run RAID in my
home. The business benefits of having RAID far outweighed the costs.
In fact, it is probably one of the first times in storage history that
the question “How can you afford not to have it?” came up.
2: Virtual Machines – When VMware came out, the value proposition was:
do more work with less physical infrastructure. And again, the
business benefits far outweighed the technology hurdle of implementing
the new solution.
Keeping in mind that it is much harder to change process in IT than it is to
change technology, IT decided that this new way of serving up processing
power to applications was well worth all of the process changes that it
would require. One example: backup would need to change when
implementing virtual server technology. The data would grow 4x and the
processing of that information for backup would take longer, in a world
where time was all too valuable. However, the business benefit justified
the change.
Again, the sellers’ questions were consistent:
How many virtual servers do you need? (Capacity)
What type of performance do you need for each virtual server?
The answers to these questions allowed a sales rep to configure the right
number of physical systems to handle the right number of virtual systems
to make the line of business successful. Additionally, some of the same
considerations came up. “Will I sell fewer servers and make less money?”
Now that there was new server technology (more processors, the ability
to handle more memory), systems could be bigger and more expensive.
Sellers also needed to know a bit more about “capacity”: how many
virtual systems could a physical system run successfully? They also
needed to have an understanding of performance. Now sellers were
configuring systems to run the equivalent of 20 to 100 servers on one
Today I would suggest that we are at a crossroads in history. New technology has come along that will have a significant impact
on the storage world. First, research from IBM indicates that
disk drives can no longer keep getting twice as dense for half the
cost, as they did throughout the late ’90s and early 2000s. The
technology doesn’t exist today to make the drives spin faster, stay cool
and not lose data. Until now. Real-time compression is
a game-changing technology that will add significant value to the
storage industry without having to change the way IT thinks about the
deployment of its storage.
Data is growing at such a significant pace today that, given the latest IBM
research about disk capacities, something needs to change. Data centers
are running out of space, and more customers want to keep more data
online for reasons such as competitive edge or compliance. No
matter the reason, they want access to their information. Enter
real-time compression. There is a fundamental difference between
real-time compression and other compression technologies and
implementations. I am not going to get into it here, but it is safe to
say that post-process and in-line compression are very different from
real-time compression, and users can’t get the benefit of improved
primary storage capacity, transparently and with no performance impact, with
anything but real-time compression technology.
Real-time compression, like other game-changing technologies, doesn’t
require any new questions; there is simply a new set of math behind the same two:
How much capacity is required?
What is the performance requirement?
In time, real-time compression will be as ubiquitous as RAID, and just
as users don’t think that much about RAID, users won’t need to think
about compression. Compression will become an expected feature of the
array. It doesn’t matter that it now takes fewer drives to satisfy the
original questions around capacity and performance. With data growing as
fast as it is, and with disks unable to keep up their growth
pace, something needs to change, and that something is real-time
compression. Soon it won’t matter what the physical capacity
of a disk drive is; what will matter is a disk’s virtual capacity, what it
is capable of storing. It is time we all started
thinking this way.
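As a rough sketch of that way of thinking, the arithmetic below treats compression as a percentage saved on every terabyte written. The 50% to 80% range used in the example is the one quoted for real-time compression elsewhere in this collection; the function names and the 2 TB drive size are illustrative assumptions, and real savings depend on the data.

```python
import math


def virtual_capacity_tb(physical_tb, savings_pct):
    """Logical data that fits on physical_tb when savings_pct percent is saved."""
    if not 0 <= savings_pct < 100:
        raise ValueError("savings_pct must be at least 0 and below 100")
    return physical_tb * 100 / (100 - savings_pct)


def drives_needed(logical_tb, drive_tb, savings_pct=0):
    """Drives needed to hold logical_tb of data after compression."""
    if not 0 <= savings_pct < 100:
        raise ValueError("savings_pct must be at least 0 and below 100")
    physical_tb = logical_tb * (100 - savings_pct) / 100
    return math.ceil(physical_tb / drive_tb)
```

Under these assumptions a 2 TB drive behaves like a 4 TB to 10 TB drive, and 100 TB of data needs 10 to 25 drives instead of 50. The capacity question has not changed; only the math behind the answer has.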
“Storage Efficiency” has become a big topic over the past 12 months. There are a number of new technologies that have come out in the last few years that are helping to deal with storage growth. We all know that data is the root of the decisions that drive business today. The more data you have, hopefully, the better decisions you can make to drive your business to success. The question is, “what is the value (and hence the cost) of the infrastructure to create that success?” What we do know is that the ability to put more data in a highly efficient footprint can give your company a competitive edge. There are five technologies that can help an IT organization create an efficient storage infrastructure. These are:
1) Tiering
2) Virtualization
3) Thin Provisioning
4) Data Deduplication
5) Real-time Compression
It is also important to point out that there are some semantics when talking about storage efficiency, specifically between efficiency and optimization technologies. I think it is useful to attempt to define these as they lead us to picking the right solutions for what we are trying to accomplish. For the purpose of this post, efficiency will relate to making existing capacity more useful and optimization will mean making more capacity out of existing capacity.
Using these definitions, technologies such as Tiering, Virtualization and Thin Provisioning are efficiency technologies. These technologies help to utilize the existing capacity that you have.
Tiering is technology that is used on about 10% of your data or less. It is used to move data that requires higher performance to flash storage. Good tiering technology analyzes data access patterns and moves the most active data to the highest performing disk. It doesn’t really change the amount of physical capacity that is required; it just changes what type of capacity is required and allows IT to make sure data is operating as fast and efficiently as possible.
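To make the idea concrete, here is a minimal sketch of heat-based placement. It is not any vendor’s algorithm: it simply ranks extents (fixed-size chunks of a volume) by how often they were accessed and promotes the hottest ones until the flash tier is full. The extent names, the access counts and the `plan_tiering` function are all hypothetical.

```python
def plan_tiering(access_counts, flash_slots):
    """Split extents into (flash, disk) sets by access frequency.

    access_counts maps an extent id to the number of accesses observed.
    flash_slots is how many extents the flash tier can hold.
    """
    if flash_slots < 0:
        raise ValueError("flash_slots must not be negative")
    # Hottest first; ties are broken by extent id so the plan is repeatable.
    ranked = sorted(access_counts, key=lambda ext: (-access_counts[ext], ext))
    flash = set(ranked[:flash_slots])
    disk = set(ranked[flash_slots:])
    return flash, disk


# Ten extents with room for one in flash, i.e. about 10% of the data.
counts = {f"extent-{i}": hits
          for i, hits in enumerate([3, 950, 12, 0, 7, 41, 2, 5, 1, 9])}
flash, disk = plan_tiering(counts, flash_slots=1)
```

Total capacity is unchanged by the plan: every extent still lives somewhere, and only the type of media under it differs.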
Virtualization technology allows IT to make sure disk capacity is used as efficiently as possible. Until recently, storage utilization rates were around 50%. By leveraging virtualization technology, IT can pool storage so that it doesn’t need to purchase capacity needlessly. Virtualization can be used on 50% to 60% of your storage, but it doesn’t change your physical capacity infrastructure requirements; at most it allows users to take advantage of the 20% to 40% of their capacity that they once didn’t access.
Similar to virtualization, thin provisioning can be used on 50% to 60% of your capacity; however, it gives IT about 10% to 40% of its capacity back. Thin provisioning helps IT manage existing capacity and utilization by making capacity available to users much more easily. Again, however, it doesn’t change the amount of physical storage infrastructure required.
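The mechanism behind thin provisioning can be illustrated with a toy model: volumes are promised a size up front, but physical space is consumed only when data is actually written. This is a simplified sketch under that assumption, not a description of any product; the `ThinPool` and `Volume` classes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    size_gb: int          # capacity promised to the user
    written_gb: int = 0   # capacity actually consumed


class ThinPool:
    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.used_gb = 0
        self.provisioned_gb = 0

    @property
    def free_gb(self):
        return self.physical_gb - self.used_gb

    def create_volume(self, size_gb):
        # Promising capacity reserves nothing, so the pool can be oversubscribed.
        self.provisioned_gb += size_gb
        return Volume(size_gb)

    def write(self, volume, gb):
        if volume.written_gb + gb > volume.size_gb:
            raise ValueError("write exceeds the volume's provisioned size")
        if gb > self.free_gb:
            raise RuntimeError("pool is out of physical capacity")
        # Physical space is taken only now, at write time.
        volume.written_gb += gb
        self.used_gb += gb


pool = ThinPool(physical_gb=100)
vol_a = pool.create_volume(80)
vol_b = pool.create_volume(80)
pool.write(vol_a, 30)
pool.write(vol_b, 20)
```

Here 160 GB has been promised against 100 GB of disk, yet only 50 GB is in use. The physical infrastructure is the same size as before; it is just used more fully.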
Optimization technologies help IT to better manage its physical storage footprint. They optimize existing infrastructure by allowing users to put more capacity in the same physical space. The two technologies in use today are data deduplication and real-time compression.
Optimization technologies are a bit tricky. A balance is required between optimization on one side and performance and availability on the other. At the end of the day, IT chooses the storage it buys with two very important characteristics in mind: performance and availability. Optimization technologies must not affect these characteristics. It is for this reason that data deduplication really isn’t ready for “prime time” on primary, active storage: it creates too much of a performance impact on primary, active data. Today, data deduplication could be used on about 10% to 15% of the primary, less active capacity in the data center, and it provides only about 30% to 50% optimization on that capacity. In other words, deduplication technology can reduce the physical infrastructure by as much as 10%, meaning IT may not need to buy as much physical capacity.
Real-time compression, on the other hand, has one of the most dramatic effects on primary storage capacity. It can be used on as much as 85% of the storage footprint and can compress data by 50% to 80%. As a result, real-time compression could let IT purchase as much as 70% less overall storage capacity. It also does not affect the main characteristics for which users buy storage (performance and availability). IT could have as much as 70% less footprint but keep the same amount of data, or more, online. Additionally, IT can now purchase storage opportunistically, without such a dramatic impact on infrastructure, process or budgets. This allows companies to keep more capacity online and available, do more analytics on more data and become more competitive.
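One way to see how these percentages combine is to multiply the share of the footprint a technology can be applied to by the reduction it achieves there. This simple product is an assumption about how the figures above relate, not a formula taken from a vendor, and `footprint_savings_pct` is a hypothetical helper.

```python
def footprint_savings_pct(coverage_pct, reduction_pct):
    """Overall footprint saved when reduction_pct applies to coverage_pct of it."""
    for value in (coverage_pct, reduction_pct):
        if not 0 <= value <= 100:
            raise ValueError("percentages must be between 0 and 100")
    return coverage_pct * reduction_pct / 100


# Upper ends of the ranges quoted in the text.
compression_best = footprint_savings_pct(85, 80)   # 68.0
dedup_best = footprint_savings_pct(15, 50)         # 7.5
```

Under this model, real-time compression at the top of its range saves 68% of the total footprint, in line with the “as much as 70%” above, while deduplication at the top of its range saves 7.5%, the same order of magnitude as the “as much as 10%” quoted for it.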
When deciding which storage efficiency technology will have a more effective impact on your overall environment and budget, start with optimization technologies and start to get the data growth under control. Adding value to the line of business that can drive revenue with more data will make you a hero and your business more successful.
Businesses continue to search for storage solutions that save money
without sacrificing performance. Last year, IBM introduced Scale Out
Network Attached Storage (SONAS), the industry’s first such
network-attached storage (NAS) offering to address this business need.
SONAS is an enterprise-class NAS system that provides extreme
scalability, availability and security—and does so with record-breaking
performance. It’s designed as a single global repository to manage
multiple petabytes of storage and billions of files all under one file
In April, IBM announced significant performance enhancements to
SONAS: improved information lifecycle management (ILM) and hierarchical
storage management (HSM), as well as ease of deployment and antivirus
integration.
Todd Neville, SONAS program leader at IBM, says SONAS is unique in
that it can scale near-linearly to almost any performance level.
With SONAS, he says, “You can build a system that’s as fast as you want
it to be; but it’s not just about absolute size, it’s also about bang
for your buck. We’ve significantly increased the software performance in
our upcoming release 1.2, so customers see a significant performance
increase on their current platform with no additional costs.”
Funda Eceral, SONAS market segment manager at IBM, says SONAS is the
only true scale-out NAS system available in the marketplace. “While you
can nondisruptively add capacity with storage building blocks,” Eceral
says, “you can also still continue to independently scale out your I/O
performance with interface nodes. It brings operational efficiency and
extraordinary utilization rates for each customer.”
Three Key Features
This version of SONAS offers three key features, according to Neville:
Ease of deployment. Using Network Data Management Protocol
(NDMP), a SONAS device can be easily integrated into existing
data-center backup infrastructures. “If you have an enterprise backup
deployment using NDMP, you will be able to take SONAS and quickly
connect with a wide variety of popular backup systems,” Neville says.
Built-in antivirus integration. Scalable NAS storage devices
must have a way for an antivirus function to perform scans on files
intelligently, such as when they’re opened or closed. SONAS includes
built-in functionality that lets a third party like Symantec integrate
into the SONAS device to perform antivirus operations, as simple “full
file-system scans” become cumbersome at enterprise scales.
Physical size. Neville says customers asked IBM to make the
SONAS device more compact, although it supports almost a full petabyte
in a single rack, making it the only offering in IBM’s NAS portfolio
that can do so. It’s now 10 inches shorter than the original device, can
scale up to 14.4 petabytes (with 2 TB drives) and has a single point of
management, which can significantly reduce storage-administration
“Everyone says, ‘We do tiering, HSM and ILM,’ but design
matters—IBM does it differently.” —Todd Neville, SONAS program leader,
In the cover story this month,
Lee Cleveland, Distinguished Engineer, Power Systems direct attach
storage, and Andy Walls, Distinguished Engineer, chief hardware
architect for DS8000 and solid-state drives (SSDs), sat down to talk
about all of the new storage technologies IBM has been releasing lately.
What I didn’t have room for in the article was a nice summary of the
technologies that can help you improve access, manage growth, protect
data, reduce costs or reduce complexity. Whatever your goals, IBM has an
integrated storage option for every organization.
Here are the quick highlights of the latest storage announcements:
IBM Storwize V7000
New advanced software functions
New easy-to-use, Web-based GUI
RAID and enclosure RAS services and diagnostics
Additional host, controller and ISV interoperability
Integration with IBM Systems Director
Enhancements to Tivoli Storage Productivity Center (TPC), FlashCopy Manager (FCM) and Tivoli Storage Manager (TSM) support
Proven IBM software functionalities
Easy Tier (dynamic HDD/SSD management)
RAID 0, 1, 5, 6, 10
Storage virtualization (local and external disks)
Non-disruptive data migration
Global and Metro Mirror
FlashCopy up to 256 copies of each volume
IBM Storwize Rapid Application Storage Solution
Runs on: AIX 7.1-5.3, IBM i 7.1-6.1 (with VIOS), Red Hat and SUSE Linux, z/VSE, Microsoft Windows, Mac OS X
ProtecTIER deduplication offers 25-to-1 reduction and online backup
In June, IBM debuted ProtecTIER deduplication solutions
for AIX and IBM i. ProtecTIER offers solutions to those who can’t
complete backup operations in a given window, have difficulty protecting
rapidly growing amounts of data or find their current backup
With data amounts growing, deduplication is becoming a vital part of
data management, backup and recovery. “One of the reasons ProtecTIER is
so crucial is because of the crazy growth the world is experiencing as
it moves to an all-digital environment,” says Victor Nemechek,
ProtecTIER deduplication offering manager at IBM. “Customers are finding
their data often doubles or more every year and their current backup
systems make it difficult to capture that data, protect it and restore
it when they need to.”
For backups, many companies use tape, which loads data quickly but
presents retrieval problems. These challenges, along with reliability
problems, sent customers to disk, where data was more accessible but also
more expensive. Companies used disk for small portions of their most
critical data, and kept their other data on tape. “Even with disk for
critical data, backup is still an issue because you have a primary disk
that you store your data on and you have to have that much disk to back
up to, basically doubling your disk needs, and that can be very
expensive,” Nemechek says.
“Deduplication can squeeze 25 terabytes of data down to only
1 terabyte of physical disk, so customers can have the speed and
reliability of disk but without that one-to-one cost.” —Victor Nemechek,
ProtecTIER deduplication offering manager, IBM
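The 25-to-1 figure is easier to picture with a toy example. The sketch below uses generic fixed-size chunking with SHA-256 fingerprints, which is a common textbook approach and is not a description of how ProtecTIER works internally; the function name and the 4 KB chunk size are assumptions. It stores each distinct chunk once and reports logical bytes divided by stored bytes.

```python
import hashlib


def dedup_ratio(data, chunk_size=4096):
    """Logical size divided by the size of the unique chunks that must be stored."""
    if not data:
        raise ValueError("data must not be empty")
    stored = {}  # chunk fingerprint -> chunk length, one entry per distinct chunk
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        stored[hashlib.sha256(chunk).digest()] = len(chunk)
    return len(data) / sum(stored.values())


# 25 identical "backups" of one 4 KB block need only one block of disk.
backups = b"A" * 4096 * 25
ratio = dedup_ratio(backups)   # 25.0
```

Real backup streams are never this uniform, so the ratio a customer sees depends on how much of each backup repeats what is already stored.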
IBM's Ed Walsh, Director of Storage Efficiency, sits down with Steve Duplessie, founder of ESG, to talk about how IBM Real-time Compression sets the bar for storage optimization in NAS. At the end of the day, if you can do compression in real time without sacrificing performance or the transparency of the implementation, then why wouldn't you, given the savings you can get over traditional compression?
We all know compression is not new, and it is becoming a standard feature in a number of storage systems. The issue is that each of these technologies has a significant impact on performance, both primary storage performance and the performance of back-end operations such as backups and replication.
IBM's Real-time Compression doesn't have any of these limitations. Listen to Ed to hear more.
IBM System Storage TS7610 ProtecTIER Deduplication Appliance Express
The TS7610 is a powerful new addition to the IBM ProtecTIER
solution set. It brings the reliability and performance of
disk-based data protection to mid-sized businesses that
need to ensure their backups are successfully completed in a timely
manner. The TS7610 adds the benefit of inline data deduplication,
which can squeeze 25 TB or more into a single terabyte of storage.
The TS7610 also reduces costs (such as downtime and time spent
managing and supporting systems) by up to 45% compared with standard
non-deduplicated virtual tape library systems.
Manage storage more effectively with virtualization capabilities from IBM
As the need for data storage continues to spiral upward, traditional physical approaches to storage management become increasingly problematic. Physically expanding the storage environment can be costly, time-consuming and disruptive, especially when it has to be done again and again in response to ever-growing storage demands. Yet manually improving storage utilization to control growth can be challenging. Physical infrastructures can also be inflexible at a time when businesses need to be able to make ever-more rapid changes in order to stay competitive.
The alternative is a virtualized approach in which storage virtualization software presents a “view” of storage resources to servers that is different from the actual physical hardware in use. This logical view can hide undesirable characteristics of storage while presenting storage in a more convenient manner for applications. For example, storage virtualization may present storage capacity as a consolidated whole, hiding the actual physical boxes that contain the storage. In this way storage becomes a logical pool of resources that exists virtually, regardless of where the actual physical storage resources are located in the larger information infrastructure. These software-defined virtual resources are easier and less disruptive to change and manage than hardware-based physical storage devices, since they don’t involve moving equipment or making physical connections. As a result, they can respond more flexibly and dynamically to changing business needs. Similarly, the flexibility afforded by virtual resources makes it easier to match storage to business requirements.