This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is with IBM Cloud Private and he will be delivering and supporting sessions at Think2019, and Storage Technical University on the Value of IBM storage in this high value IBM solution a part of the IBM Cloud strategy. Lloyd maintains a Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
The developerWorks Connections platform will be sunset on December 31, 2019. On January 1, 2020, this blog will no longer be available. More details available on our FAQ.
Clod Barrera is an IBM Distinguished Engineer and Chief Technical Strategist for IBM System Storage. He predicts that by 2015, 10 percent of the servers and storage purchases, as well as 25 percent of the network gear purchases, will be related to Cloud deployments. Cloud Storage is expected to grow at a compound annual growth rate (CAGR) of 32 percent through 2015, compared to only 3.8 percent growth for non-Cloud storage.
Cloud Computing is allowing companies to rethink their IT infrastructure, and reinvent their business. Clod presented an interesting chart on the "Taxonomy" of storage in Cloud environments. On the left he had examples of Storage that was part of a Cloud Compute application. On the right he had storage that was accessed directly through protocols or APIs. Under each he had several examples for transactional data, stream data, backups and archives.
Clod feels the only difference between Private and Public clouds is a matter of ownership. In private clouds, these are owned by the company that uses them via their private Intranet network. Public clouds are owned by Cloud Service providers and are accessed over the public Internet. Clod presented IBM's strategy to deliver Cloud at five levels:
Private Cloud: on-site equipment, behind company firewall, managed by IT staff
Managed Private Cloud: on-site equipment, behind company firewall, managed by IBM or other Cloud Service provider
Hosted Private Cloud: dedicated, off-premises equipment, located and managed by IBM or other Cloud Service Provider, and access through VPN
Shared Cloud Services: shared, off-premises equipment, located at IBM or other Cloud Service Provider, managed by IBM or Cloud Service provider, and access through VPN. The facility is intended for enterprises only, on a contractual basis, and will be auditable for compliance to government regulations, etc.
Public Cloud: shared, off-premises equipment, located and managed by IBM or other Cloud Service provider, targeted to offer cloud compute and storage resources, with standardized platforms of operating systems and middleware, for individuals, small and medium sized businesses.
As with storage in traditional data center deployments, storage in clouds will be tiered, with Tier 0 being the fastest tier, to Tier 4 for "deep and cheap" archive storage. IBM SONAS is an example of Cloud-ready storage that can help make these tiers accessible through standard Ethernet protocols. Cloud Service providers will use metering and Service Level Agreements (SLAs) to offer different rates for different tiers of storage in the cloud.
Clod wrapped up his session explaining IBM's Cloud Computing Reference Architecture (CCRA). This is an all-encompassing diagram that shows how all of IBM's hardware, software and services fit into Cloud deployments.
IBM Information Archive for email, files and eDiscovery
Not too many people have heard of IBM's Smart Archive strategy and the storage products IBM offers to meet compliance regulations. This session covered the following:
The differences between backup and archive, including a few of my own personal horror stories helping companies who had foolishly thought that keeping backup copies for years would adequately serve as their archive strategy
The differences between optical media, Write-Once Read-Many (WORM) media, and Non-Erasable, Non-Rewriteable (NENR) storage options.
Why putting a [space heater] on your data center floor is a bad idea, driving up power and cooling costs for little business value to the enterprise once the unit is full of rarely accessed read-only data.
An overview of the [IBM Information Archive], an integrated stack of servers, storage and software that replaces previous offerings such as the IBM System Storage DR550 and the IBM Grid Medical Archive Solution (GMAS).
The marketing bundle known as the [Information Archive for Email, Files and eDiscovery] that combines the Information Archive storage appliance with Content Collectors for email and file systems, as well as eDiscovery tools, and implementation services for a solution that can support a small or medium size business, up to 1400 employees.
IBM Tivoli Storage Productivity Center v4.2 Overview and Update
Many of the concerns raised when I [presented v4.1 at this conference last year] were addressed this year in v4.2, including full performance statistics for IBM XIV storage system, storage resource agent support for HP-UX and Solaris, and a variety of other issues.
I presented this overview in stages:
"Productivity Center Basic Edition" that comes pre-installed on the IBM System Storage Productivity Center hardware console, that provides discover of devices, basic configuration, and a clever topology viewer of what is connected to what.
"Productivity Center for Disk" and "Productivity Center for Disk Midrange Edition (MRE)" that provides real-time and historical performance monitoring, asset and capacity reporting.
"Productivity Center for Replication" which supports monitoring, failover and failback for FlashCopy, Metro Mirror and Global Mirror on the SVC, Storwize V7000, DS8000, DS6000 and ESS 800.
"Productivity Center for Data" which supports reporting on files, file systems and databases on DAS, SAN and NAS attached storage from a Operating System viewpoint.
"Productivity Center Standard Edition" which includes all of the above except "Replication", and adds performance monitoring of SAN Fabric gear, and some very clever analytics to improver performance and problem determination.
One of the questions that came up was "How big does my company have to be to consider using Productivity Center?" which I answered as follows:
"If you are a small company, and the "IT Person" has responsibilities outside the IT, and managing the few pieces of kit is just part of his job, then consider just using the web-based GUI through a Firefox or similar browser. If you are a medium sized company with dedicated IT personnel, but mostly run by system admins or database admins that manage storage and networks on the side, you might want to consider the "Storage Control" plug-in for IBM Systems Director. But if you are larger shop, and there are employees with the title "Storage Administrator" and/or "SAN Administrator", then Tivoli Storage Productivity Center is for you."
Tivoli Storage Productivity Center has matured into a fine piece of software that truly can help medium and large sized data centers manage their storage and storage networking infrastructure.
I like speaking the first day of these events. Often people come in just to hear the keynote speakers, and stay the afternoon to hear a few break-out sessions before they leave Tuesday or Wednesday for other meetings.
Jim is an IBM Fellow for IBM Systems and Technology Group. There are only 73 IBM Fellows currently working for IBM, and this is the highest honor IBM can bestow on an employee. He has been working with IBM since 1968.
He is tasked with predicting the future of IT, and help drive strategic direction for IBM. Cost pressures, requirements for growth, accelerating innovation and changing business needs help influence this direction.
IBM's approach is to integrate four different "IT building blocks":
Scale-up Systems, like the IBM System Storage DS8000 and TS3500 Tape Library
Resource Pools, such as IBM Storage Pools formed from managed disks by IBM SAN Volume Controller (SVC)
Integrated stacks and appliances, integrated software and hardware stacks, from Storwize V7000 to full rack systems like IBM Smart Analytics Server or CloudBurst.
Mobility of workloads and resources requires unified end-to-end service management. Fortunately, IBM is the #1 leader in IT Service Management solutions.
Jim addressed three myths:
Myth 1: IT Infrastructures will be homogenous.
Jim feels that innovations are happening too rapidly for this to ever happen, and is not a desirable end-goal. Instead, a focus to find the right balance of the IT building blocks might be a better approach.
Myth 2: All of your problems can be solved by replacing everything with product X.
Jim feels that the days of "rip-and-replace" are fading away. As IBM Executive Steve Mills said, "It isn't about the next new thing, but how well new things integrate with established applications and processes."
Myth 3: All IT will move to the Cloud model.
Jim feels a substantial portion of IT will move to the Cloud, but not all of it. There will always be exceptions where the old traditional ways of doing things might be appropriate. Clouds are just one of the many building blocks to choose from.
Jim's focus lately has been finding new ways to take advantage of virtualization concepts. Server, storage and network virtualization are helping address these challenges through four key methods:
Sharing - virtualization that allows a single resource to be used by multiple users. For example, hypervisors allow several guest VM operating systems share common hardware on a single physical server.
Aggregation - virtualization that allows multiple resources to be managed as a single pool. For example, SAN Volume Controller can virtualize the storage of multiple disk arrays and create a single storage pool.
Emulation - virtualization that allows one set of resources to look and feel like a different set of resources. Some hypervisors can emulate different kinds of CPU processors, for example.
Insulation - virtualization that hides the complexity from the end-user application or other higher levels of infrastructure, making it easier to make changes of the underlying managed resources. For example, both SONAS and SAN Volume Controller allow disk capacity to be removed and replaced without disruption to the application.
In today's economy, IT transformation costs must be low enough to yield near-term benefits. The long-term benefits are real, but near-term benefits are needed for projects to get started.
What set's IBM ahead of the pack? Here was Jim's list:
100 Years of Innovation, including being the U.S. Patent leader for the last 18 years in a row
IBM's huge investment in IBM Research, with labs all over the globe
Leadership products in a broad portfolio
Workload-optimized designs with integration from middleware all the way down to underlying hardware
Comprehensive management software for IBM and non-IBM equipment
Clod is an IBM Distinguished Engineer and Chief Technical Strategist for IBM System Storage. His presentation focused on trends and directions in the IT storage industry. Clod started with five workload categories:
To address these unique workload categories, IBM will offer workload-optimized systems. The four drivers on the design for these are performance, efficiency, scalability, and integration. For example, to address performance, companies can adopt Solid-State Drives (SSD). Unfortunately, these are 20 times more expensive dollar-per-GB than spinning disk, and the complexity involved in deciding what data to place on SSD was daunting. IBM solved this with an elegant solution called IBM System Storage Easy Tier, which provides automated data tiering for IBM DS8000, SAN Volume Controller (SVC) and Storwize V7000.
For scalability, IBM has adopted Scale-Out architectures, as seen in the XIV, SVC, and SONAS. SONAS is based on the highly scalable IBM General Parallel File System (GPFS). File systems are like wine, they get better with age. GPFS was introduced 15 years ago, and is more mature than many of the other "scalable file systems" from our competition.
Areal Density advancements on Hard Disk Drives (HDD) are slowing down. During the 1990s, the IT industry enjoyed 60 to 100 percent annual improvement in areal density (bits per square inch). In the 2000s, this dropped to 25 to 40 percent, as engineers are starting to hit various physical limitations.
Storage Efficiency features like compression have been around for a while, but are being deployed in new ways. For example, IBM invented WAN compression needed for Mainframe HASP. WAN compression became industry standard. Then IBM introduced compression on tape, and now compression on tape is an industry standard. ProtecTIER and Information Archive are able to combine compression with data deduplication to store backups and archive copies. Lastly, IBM now offers compression on primary data, through the IBM Real-Time Compression appliance.
For the rest of this decade, IBM predicts that tape will continue to enjoy (at least) 10 times lower dollar-per-GB than the least expensive spinning disk. Disk and Tape share common technologies, so all of the R&D investment for these products apply to both types of storage media.
For integration, IBM is leading the effort to help companies converge their SAN and LAN networks. By 2015, Clod predicts that there will be more FCoE purchased than FCP. IBM is also driving integration between hypervisors and storage virtualization. For example, IBM already supports VMware API for Array Integration (VAAI) in various storage products, including XIV, SVC and Storwize V7000.
Lastly, Clod could not finish a presentation without mentioning Cloud Computing. Cloud storage is expected to grow 32 percent CAGR from year 2010 to 2015. Roughly 10 percent of all servers and storage will be in some type of cloud by 2015.
As is often the case, I am torn between getting short posts out in a timely manner versus spending some more time to improve the length and quality of information, but posted much later. I will spread out the blog posts in consumable amounts throughout the next week or two, to achieve this balance.
Maria Boonie is the IBM Director for IBM Worldwide Training and Technical Conferences. She indicated that there were 1500 attendees this week crossing both the System Storage and System x conferences at this hotel. There are 35 vendors that have sponsored this event, and they will be at the "Solutions Center" being held Monday through Wednesday this week.
She took this opportunity to plug IBM's latest education offerings, including Guaranteed-to-Run implementation classes, and Instructor-Led Online (ILO) technical classes.
Brian Truskowski is IBM General Manager for System Storage and Networking. I used to directly report to him in a previous role, and a few years ago he used to be the IBM CIO that helped with IBM's internal IT transformation.
Brian indicates that the previous approach to growth was to "Just Buy More", but this has some unintended consequences. He argued that companies need to adopt one or more of the following approaches to growth:
Stop storing so much - reduce data footprint using storage efficiency capabilities like data deduplication and compression
Store more with what is already on the floor - improve storage utilization with technologies like storage virtualization and thin provisioning
Move data to the right place - implement automated tiering, such as "Flash & Stash" between Solid-state drives and spinning disk, and/or Information Lifecycle Management (ILM) between disk and tape. Studies at some clients have found over 70 percent of data has not beed touched in the last 90 days
This time of dramatic change is the result of a "perfect storm" of influences, including the rising costs and risks associated with losing data, the increased need to index and search data, the desire for "Business Analytics", and the expectation for 100 percent up-time. This is driving IBM to offer hyper-efficient backup, Continuous Data Availability, and Smart Archive solutions.
The case study of SPRINT is a good example. SPRINT is a Telecommunications provider for cell phone users. They were challenged with 35 percent utilization, 165 storage arrays from six different vendors, and an expected 100 percent increase in their IT maintenance costs. After implementing IBM SAN Volume Controller (SVC) and Tivoli Storage Productivity Center (TPC) to manager 2.9 PB of data, SPRINT increased their utilization to 82 percent, reduced down to 70 storage arrays from only three vendors, and reduced their maintenance costs by 57 percent. Today, SPRINT now manages over 5 PB of data with SVC and TPC, have reduced their power and cooling by 3.5 million KWh, representing $320,000 USD in savings.
Roland Hagan is the IBM Vice President for the System x server platform. He talked about the "IT Conundrum" that represents a vicious cycle of "IT Sprawl", "Untrusted Data" and "Inflexible IT" that seem to feed each other. IBM is trying to change behavior, from thinking and dealing with physical boxes representing servers, storage and network gear, to a more holistic view focused on workloads, shared resource pools, independent scaling, and automated management.
IBM is leading the server marketplace, in part because of clever things IBM is doing, especially in developing the eX5 chipset that surrounds x86 commondity processors, and in part because of actions or decisions the competition have taken:
It doesn't break IBM's heart that Oracle decided to drop software support of their database on Itanium, which focued entirley against HP. Oracle runs on IBM servers better than Oracle/Sun or HP servers today, so it does not impact us, other than IBM has had a lot of people leaving HP to switch over to IBM.
HP has taken on a new CEO and reduced their R&D budget, causing them to be late-to-market on some of their offerings.
Dell continues to focus on the small and medium sized customer, and have not really broken into the "Enterprise".
Newcomer Cisco has some great technology that only seems to be adoptable in "Green Field" situations, as it does not integrate well with existing data center infrastructures.
The combination of ex5 chip-set architecture, Max5 memory expansion capabilities and Virtual Network Interface Cards (NICs), provide for a very VM-aware platform. For those who are not ready to fully adopt an integrated stack like IBM CloudBurst, IBM offers the Tivoli Service Automation software on its own, and a new [IBM BladeCenter Foundation for Cloud] as stepping stones to get there.
There are certainly more attendees here than last year, which reflects either the change in location (Orlando, Florida rather than Washington DC) as well as the economic recovery. I'm looking forward to an excellent week!
It's that time again. Every year, IBM hosts the "System Storage Technical University". I have been going to these since they first started in the 1990s. This time we are at the lovely [Hilton Orlando] in Orlando, Florida.
For those who want to relive past events, here are my blog posts from this event in 2010:
As was the case last year, IBM once again will run this conference alongside the [IBM System x Technical University] the same week, in the same hotel. This allows attendees to cross over to the other side to see a few sessions of the other conference. I took advantage of this last year, and plan to do so again this year as well!
For those on Twitter, you can follow my tweets at [@az990tony] or search on the hash tag #ibmtechu.
The new [IBM System Storage Tape Controller 3592 Model C07] is an upgrade to the previous C06 controller. Like the C06, the new 3592-C07 can have up to four FICON (4Gbps) ports, four FC ports, and connect up to 16 drives. The difference is that the C07 supports 8Gbps speed FC ports, and can support the [new TS1140 tape drives that were announced on May 9]. A cool feature of the C07 is that it has a built-in library manager function for the mainframe. On the previous models, you had to have a separate library manager server.
Crossroads ReadVerify Appliance (3222-RV1)
IBM has entered an agreement to resell [Crossroads ReadVerify Appliance], or "RV1" for short. The RV1 is a 1U-high server with software that gathers information on the utilization, performance and health for a physical tape environment, such as an IBM TS3500 Tape Library. The RV1 also offers a feature called "ArchiveVerify" which validates long-term retention archive tapes, providing an audit trail on the readability of tape media. This can be useful for tape libraries attached behind IBM Information Archive compliance storage solution, or the IBM Scale-Out Network Attached Storage (SONAS).
As an added bonus, Crossroads has great videos! Here's one, titled [Tape Sticks]
Linear Tape File System (LTFS) Library Edition Version 2.1
While the hardware is all refreshed, the overall "scale-out" architecture is unchanged. Kudos to the XIV development team for designing a system that is based entirely on commodity hardware, allowing new hardware generations to be introduced with minimal changes to the vast number of field-proven software features like thin provisioning, space-efficient read-only and writeable snapshots, synchronous and asynchronous mirroring, and Quality of Service (QoS) performance classes.
The new XIV Gen3 features an Infiniband interconnect, faster 8Gbps FC ports, more iSCSI ports, faster motherboard and processors, SAS-NL 2TB drives, 24GB cache memory per XIV module, all in a single frame IBM rack that supports the IBM Rear Door Heat Exchanger. The results are a 2x to 4x boost in performance for various workloads. Here are some example performance comparisons:
Disclaimer: Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. Your mileage may vary.
In a Statement of Direction, IBM also has designed the Gen3 modules to be "SSD-ready" which means that you can insert up to 500GB of Solid-State drive capacity per XIV module, up to 7.5TB in a fully-configured 15 module frame. This SSD would act as an extension of DRAM cache, similar to how Performance Accelerator Modules (PAM) on IBM N series.
IBM will continue to sell XIV Gen2 systems for the next 12-18 months, as some clients like the smaller 1TB disk drives. The new Gen3 only comes with 2TB drives. There are some clients that love the XIV so much, that they also use it for less stringent Tier 2 workloads. If you don't need the blazing speed of the new Gen3, perhaps the lower cost XIV Gen2 might be a great fit!
As if I haven't said this enough times already, the IBM XIV is a Tier-1, high-end, enterprise-class disk storage system, optimized for use with mission critical workloads on Linux, UNIX and Windows operating systems, and is the ideal cost-effective replacement for EMC Symmetrix VMAX, HDS USP-V and VSP, and HP P9000 series disk systems, . Like the XIV Gen2, the XIV Gen3 can be used with IBM System i using VIOS, and with IBM System z mainframes running Linux, z/VM or z/VSE. If you run z/OS or z/TPF with Count-Key-Data (CKD) volumes and FICON attachment, go with the IBM System Storage DS8000 instead, IBM's other high-end disk system.
(FTC Disclosure: I do not work or have any financial investments in ENC Security Systems. ENC Security Systems did not paid me to mention them on this blog. Their mention in this blog is not an endorsement of either their company or any of their products. Information about EncryptStick was based solely on publicly available information and my own personal experiences. My friends at ENC Security Systems provided me a full-version pre-loaded stick for this review.)
The EncryptStick software comes in two flavors, a free/trial version, and the full/paid version. The free trial version has [limits on capacity and time] but provides enough glimpse of the product to decide before you buy the full version. You can download the software yourself and put in on your own USB device, or purchase the pre-loaded stick that comes with the full-version license.
Whichever you choose, the EncryptStick offers three nice protection features:
Encryption for data organized in "storage vaults", which can be either on the stick itself, or on any other machine the stick is connected to. That is a nice feature, because you are not limited to the capacity of the USB stick.
Encrypted password list for all your websites and programs.
A secure browser, that prevents any key-logging or malware that might be on the host Windows machine.
I have tried out all three functions and everything works as advertised. However, there is always room for improvement, so here are my suggestions.
The first problem is that the pre-loaded stick looks like it is worth a million dollars. It is in a shiny bronze color with "EncryptStick" emblazoned on it. This is NOT subtle advertising! This 8GB capacity stick looks like it would be worth stealing solely on being a nice piece of jewelry, and then the added bonus that there might be "valuable secrets" just makes that possibility even more likely.
If you want to keep your information secure, it would help to have "plausible deniability" that there is nothing of value on a stick. Either have some corporate logo on it, of have the stick look like a cute animal, like these pig or chicken USB sticks.
It reminds me how the first Apple iPod's were in bright [Mug-me White]. I use black headphones with my black iPod to avoid this problem.
Of course, you can always install the downloadable version of EncryptStick software onto a less conspicuous stick if you are concerned about theft. The full/paid version of EncryptStick offers an option for "lost key recovery" which would allow you to backup the contents of the stick and be able to retrieve them on a newly purchased stick in the event your first one is lost or stolen.
Imagine how "unlucky" I felt when I notice that I had lost my "rabbits feet" on this cute animal-themed USB stick.
I sense trouble for losing the cap on my EncryptStick as well. This might seem trivial, but is a pet-peeve of mine that USB sticks should plan for this. Not only is there nothing to keep the cap on (it slides on and off quite smoothly), but there is no loop to attach the cap to anything if you wanted to.
Since then, I got smart and try to look for ways to keep the cap connected. Some designs, like this IBM-logoed stick shown above, just rotate around an axle, giving you access when you need it, and protection when it is folded closed.
Alternatively, get a little chain that allows you to attach the cap to the main stick. In the case of the pig and chicken, the memory section had a hole pre-drilled and a chain to put through it. I drilled an extra hole in the cap section of each USB stick, and connected the chain through both pieces.
(Warning: Kids, be sure to ask for assistance from your parents before using any power tools on small plastic objects.)
The EncryptStick can run on either Microsoft Windows or Mac OS. The instructions indicate that you can install both versions of download software onto a single stick, so why not do that for the pre-loaded full version? The stick I have had only the Windows version pre-loaded. I don't know if the Windows and Mac OS versions can unlock the same "storage vaults" on the stick.
Certainly, I have been to many companies where either everyone runs Windows or everyone runs Mac OS. If the primary target audience is to use this stick at work in one of those places, then no changes are required. However, at IBM, we have employees using Windows, Mac OS and Linux. In my case, I have all three! Ideally, I would like a version of EncryptStick that I could take on trips with me that would allow me to use it regardless of the Operating System I encountered.
Since there isn't a Linux-version of EncryptStick software, I decided to modify my stick to support booting Linux. I am finding more and more Linux kiosks when I travel, especially at airports and high-traffic locations, so having a stick that works both in Windows or Linux would be useful. Here are some suggestions if you want to try this at home:
Use fdisk to change the FAT32 partition type from "b" to "c". Apparently, Grub2 requires type "c", but the pre-loaded EncryptStick was set to "b". The Windows version of EncryptStick> seems to work fine in either mode, so this is a harmless change.
Install Grub2 with "grub-install" from a working Linux system.
Once Grub2 is installed, you can boot ISO images of various Linux Rescue CDs, like [PartedMagic] which includes the open-source [TrueCrypt] encryption software that you could use for Linux purposes.
This USB stick could also be used to help repair a damaged or compromised Windows system. Consider installing [Ophcrack] or [Avira].
Certainly, 8GB is big enough to run a full Linux distribution. The latest 32-bit version of [Ubuntu] could run on any 32-bit or 64-bit Intel or AMD x86 machine, and have enough room to store an [encrypted home directory].
Since the stick is formatted FAT32, you should be able to run your original Windows or Mac OS version of EncryptStick with these changes.
Depending on where you are, you may not have the luxury to reboot a system from the USB memory stick. Certainly, this may require changes to the boot sequence in the BIOS and/or hitting the right keys at the right time during the boot sequence. I have been to some "Internet Cafes" that frown on this, or have blocked this altogether, forcing you to boot only from the hard drive.
Well, those are my suggestions. Whether you go on a trip with or without your laptop, it can't hurt to take this EncryptStick along. If you get a virus on your laptop, or have your laptop stolen, then it could be handy to have around. If you don't bring your laptop, you can use this at Internet cafes, hotel business centers, libraries, or other places where public computers are available.
In less than a month, I will be presenting at the annual IBM Storage Technical University, July 18-22, at the Hilton in Orlando, Florida. This is one of my favorite conferences! You can sign up for this at their [Online Registration Page].
I will be covering a variety of topics:
IBM Storage Strategy in the Era of Smarter Computing - After IBM has led the IT industry through the "Centralized Computing" era, and then later the "Distributed Computing" era, we are now entering the third era, that of Smarter Computing. Come learn IBM's strategy for Storage to address today's big challenges, including Big Data, Integrated Workload-optimized systems, and Cloud service delivery models.
IBM Information Archive for Email, Files and eDiscovery - This session will cover the latest announcement for our non-erasable, non-rewriteable compliance storage, the Information Archive (IA), how this can be used to protect your emails and files, and provide indexed search to assist with eDiscovery.
IBM Tivoli Storage Productivity Center Overview and Update - I was one of the original lead architects for Productivity Center. Come learn what this software is all about, and how the latest features and functions can help you manager your IT environment.
IBM SONAS and the Smart Business Storage Cloud - Confused about Cloud Computing and Cloud Storage? I will explain everything you need to know, including how the integrated SONAS appliance operates, IBM's customized solutions for private cloud deployments, and IBM's public cloud offerings.
BOF on Social Media - BOF stands for "Birds of a Feather", and his normally an after-hours discussion on a single theme. This BOF will be a four-expert Q&A panel, including myself, John Sing, Rich Swain and Ian Wright. We will discuss how we got started in Social Media, and how it has boosted our careers and our ability to get work done.
Last Thursday, on IBM's 100-year anniversary, we had a huge turn-out for the celebration here at the IBM Development Lab site in Tucson, AZ. Employees brought in memorabilia that reminded them of the past 100 years.
Everyone got a black tee-shirt with the original IBM logo. There was plenty of music, food and drink, as well as a few speeches by former and current IBM executives.
Now, the fun begins on the next century of IBM. What will be in store for the world in the 21st century? We live in interesting times!
Kevin's perspective focused on the evolution over the past 100 years of "information science", in six chapters: sensing, memory, processing, logic, connecting, and architecture. He covers the technology from IBM Punched Cards and core memory, to the latest optical chips and the DeepQA technology in IBM Watson.
Steve's perspective was on IBM as a corporation, and how IBM and other corporations have evolved over the past century. In the late 19th century and early 20th century, "Internationals" had their headquarters in the United States, and regional sales and distribution offices elsewhere. The mid-20th century gave rise to "Multinationals" that invested more heavily in regional headquarters scattered across the globe. Today, in the 21st century, IBM and its clients are [Globally Integrated Entrprises] that move work to the lowest costs, best skills, and most attractive business climates.
Jeffrey M. O'Brien
Jeffrey M. O'Brien has been a senior editor [Fortune] and [Wired] magazines, and his work has appeared in The Best of Technology Writing, The Best American Science and Nature Writing, and The Best American Science Writing.
Jeffrey's perspective is on the impact technology has on humanity, organized into five steps towards progress: Seeing, Mapping, Understanding, Believing, and Acting. These steps have been around long before IBM, and Jeffrey is able to draw parallels to such efforts as Lewis & Clark mapping out the Louisiana Purchase, advancements in genetically modified foods, and the thousands of IBMers required to land a man on the moon.
This afternoon, everyone at the IBM Tucson site will be getting together to celebrate IBM's Centernnial!
This week, IBM celebrates its Centennial, 100 years since its incorporation on June 16, 1911.
A few months ago, the Tucson Executive Briefing Center ordered its latest IBM System Storage [DS8800] to be on display for demos. This was manufactured in Vác, Hungary (about an hour north of Budapest), and was going to be shipped over to the United States.
However, Sam Palmisano, IBM Chairman and CEO, was in Hannover, Germany for the [CeBIT conference] and wanted this DS8800 to be re-directed to Germany first for this event. He was kind enough to sign it for us. Brian Truskowski, IBM General Manager for Storage, and Rod Adkins, IBM Senior Vice President for IBM Systems Technolgoy Group (and my fifth-line manager), also signed this as well!
I am pleased to say this "signed" DS8000 has arrived to Tucson. This is the latest model in a family of market-leading high-end enterprise-class disk systems designed to attach to all computers, including System z mainframes, POWER systems running AIX and IBM i, as well as servers running HP-UX, Solaris, Linux or Windows.
For more on IBM's other innovations over the past 100 years, check out the [Icons of Progress], which includes these storage innovations:
Well, it's Tuesday, in the United States at least, and you know what that means... IBM Announcements! I am actually down under in Sydney, Australia, and it is Wednesday already as I write this. I feel like a time traveler.
IBM announces their latest disk system, the [IBM System Storage DCS3700], designed for high-performance computing (HPC), business analytics, video broadcasting, and other sequential workloads. The "DCS" stands for Deep Computing Storage. IBM already has the DCS9900 for large enterprise deployments, so this smaller DCS3700 is targeted for midrange deployments.
In a compact 4U package, the DCS3700 packs dual active-active controllers and up to 60 disk drives. The controller drawer can support two additional expansion drawers, of 60 drives each in 4U drawers, for a maximum total of 180 drives in 12U of rack space. Packed with "green" 7200RPM energy-efficient 2TB drives, a system can have up to a 360TB raw capacity. The system supports RAID levels 0, 1, 3, 5, 6, and 10.
The system comes with the latest 6Gbps SAS connections for host attachment, but you can choose 8Gbps Fibre Channel Protocol (FCP) instead, allowing the DCS3700 to be managed by SVC or Storwize V7000.
Dan Galvan, IBM VP of Marketing for Storage, was the next speaker. With 300 billion emails being sent per day, 4.6 billion cell phones in the world, and 26 million MRIs per year, there is going to be a huge demand for file-based storage. In fact, a recent study found that file-based storage will grow at 60 percent per year, compared to 15 percent growth for block-based storage.
Dan positioned IBM's Scale-out Network Attached Storage (SONAS) as the big "C:" drive for a company. SONAS offers a global namespace, a single point of management, with the ability to scale capacity and performance tailored for each environment.
The benefits of SONAS are great. We can consolidate dozens of smaller NAS filers, we can virtualize files across different storage pools, and increase overall efficiency.
Powering advanced genomic research to cure cancer
The next speaker was supposed to be Bill Pappas, Senior Enterprise Network Storage Architect, Research Informatics at [St. Jude Children’s Research Hospital]. Unfortunately, St. Jude is near the flooding of the Mississippi river, and he had to stay put. An IBM team was able to capture his thoughts on video that was shown on the big screen.
Thanks to the Human Genome project, St. Jude is able to cure people. They see 5700 patients per year, and have an impressive 70 percent cure rate. The first genetic scan took 10 years, now the technology allows a genome to be mapped in about a week. Having this genomic information is making vast strides in healthcare. It is the difference of fishing in a river, versus putting a wide net to catch all the fish in the Atlantic ocean all at once.
Recently, St. Jude migrated 250 TB of files from other NAS to an IBM SONAS solution. The SONAS can handle a mixed set of workloads, and allows internal movement of data from fast disk, to slower high-capacity disk, and then to tape. SONAS is one of the few storage systems that supports a blended disk-and-tape approach, which is ideal for the type of data captured by St. Jude.
IBM's own IT transformation
Pat Toole, IBM's CIO, presented the internal transformation of IBM's IT operations. He started in 2002 in the midst of IBM's effort to restructure its process and procedures. They identified four major data sources: employee data, client data, product data, and financial data. They put a focus to understand outcomes and set priorities.
The result? A 3-to-1 payback on CIO investments. This allowed IBM to go from server sprawl to consolidated pooling of resources with the right levels of integration. In 1997, IBM had 15,000 different applications running across 155 separate datacenters. Today, they have reduced this down to 4,500 applications and 7 datacenters. Their goal is to reduce down to 2,225 applications by 2015. Of these, only 250 are mission critical.
Pat's priorities today: server and storage virtualization, IT service management, cloud computing, and data-centered consolidation. IBM runs its corporate business on the following amount of data:
9 PB of block-based storage, SVC and XIV
1 PB of file-based storage, SONAS
15 PB of tape for backup and archive
Pat indicated that this environment is growing 25 percent per year, and that an additional 70-85 PB relates to other parts of the business.
By taking this focused approach, IBM was able to increase storage utilization from 50 to 90 percent, and to cut storage costs by 50 percent. This was done through thin provisioning, storage virtualization and pooling.
Looking forward to the future, Pat sees the following challenges: (a) that 120,000 IBM employees have smart phones and want to connect them to IBM's internal systems; (b) the increase in social media; and (c) the use of business analytics.
After the last session, people gathered in the "Hall of the Universe" for the evening reception, featuring food, drinks and live music. It was a great day. I got to meet several bloggers in person, and their feedback was that this was a very blogger-friendly event. Bloggers were given the same level of access as corporate executives and industry analysts.
During the break, I talked with some of the other bloggers at this event. From left to right: Stephen Foskett [Pack Rat] blog, Devang Panchigar [StorageNerve], and yours truly, Tony Pearson. (Picture courtesy of Stephen Foskett)
Meet the Experts
This next segment was a Q&A panel, with a moderator posing questions to four experts. Originally, I was scheduled to be the moderator, but this was changed to Doug Balog. The experts on the panel were:
Rich Castagna, Editorial Director for Storage Media, TechTarget. TechTarget is the group that runs the [SearchStorage] website.
Stan Zaffos, Gartner VP of Research, who spoke earlier today. I have worked with Stan for years as well, and have attended the last four Gartner Data Center Conferences held every December in Las Vegas.
Steve Duplessie, Founder and Senior Analyst, Enterprise Strategy Group (ESG). Steve's blog is titled [The Bigger Truth].
Jon clarified a statement Doug Balog said earlier in the day attributed to his study. Doug had said that 40 percent of all data should be archived. The study that Jon Toigo had done found that, on average, for the data on disk systems, about 30 percent is useful data, 40 percent is not active and could be eligible for archive, and the remaining 30 percent was crap.
The other experts introduced themselves. Rich felt that "Cloud" was still the biggest buzzword in the IT industry. Stan felt that CIOs should ask their storage administrators "What are you doing to improve my agility and efficiency". Steve felt that it was better to focus on improving process and procedures, rather than trying to deploy the best technology.
How can you best reduce backup costs per TB?
Jon- use tape.
Rich- Clean up your environment.
Stan- Don't rehydrate your deduplicated data, adopt archive approach, and revisit your backup schedules.
Steve- Deduplication covers up stupidity. No band-aids! Companies need to address the cause.
Does Backup as a Public Service for large enterprises makes sense?
Rich- Yes, especially for those with Remote Office/Branch Office (ROBO).
Stan- It depends. You should implement client-side dedupe. Get the Cloud Provider to waive telecom bandwidth charges.
Steve- Consider recovery scenarios, and try to maintain control.
Jon- "Clouds" are bulls@#$ marketing. WAN latency will pile up.
What are the top issues IT leaders should be discussing with the Storage Managers?
Stan- To ensure SLAs meet but not exceed design, to automate, and to evaluate SAN/NAS ratios.
Steve- Server virtualization is putting the spotlight on storage. Failure to implement storage virtualization is becoming the gate that slows down sever virtualization adoption.
Jon- Insist on management features from all storage vendors, try to separate feature/function from the underlying hardware layer. See IBM's [Project Zero].
Rich- Efficiency, Archiving, Thin Provisioning, Compression, Data Protection & Retention, Backup Redesign to protect endpoints like laptops and cell phones.
When does Archive eliminate Backup?
The need for protection never goes away. There are two kinds of data: "originals" and "derivatives", and two kinds of disk: "failed" and "not yet failed".
Given SATA and SAS drives, what is the future of 10K/15K RPM drives?
There is no future for these faster drives, they are going away.
What is the biggest challenge for adopting archive?
It is easy to move data out of production systems, but difficult to make these archives accessible for eDiscovery and Search. There is also concern about changing data formats. Adobe has changed the format of PDF a whopping 33 times.
This was by far the most entertaining section of the day! Hand-held devices allowed the audience to vote which answers they liked best.
Doug Balog, IBM VP and Business Level Executive for Storage, presented Smart Archiving. Citing research by Jon Toigo, Doug indicated that 40 percent of data on disk should be archived. Sadly, a vast majority of companies continue to use their backups as archives. There is a better way to do archives, to address the needs of four use cases:
The IBM Information Archive for email, files and eDiscovery offers full text indexing. A well-deployed archive strategy can save up to 60 percent in backup costs, and reduce backup times by 80 percent. IBM offers advanced analytics and visualization for archive data.
An analysis of a global insurance company found that they kept, on average, 120 copies of every email sent. This was the combination of an average of 12 copies of the email, multipled by 10 backups of the email repository.
Banjercito, a bank in Mexico, has a 10-year retention requirement from government regulations.
The new LTFS Library Edition allows Library-based access to files stored on tape cartridges. The new TS3500 Library Connector means that a single system of connected tape libraries can hold up to 2.7 Exabytes (EB) of data.
Archive Industry Perspectives
Steve Duplessie from Enterprise Strategy Group [ESG] gave his views on the challenges of volume, access and cost. His definition for archive: the long term retention of information on a separate environment for compliance, eDiscovery and business reference purposes. Steve advocates a purpose-built solutiion for archive. There are three major challenges for implementing an archive solution:
Getting Participation -- Steve feels that key stakeholders have inappropriate expectations of what archive is, or can be.
Define Tasks -- Steve argues that archive is very much a process-oriented approach, and tasks must fit business process and procedures
Prepare for Future Content Types -- the frequent change of standard and proprietary data types poses a real challenge for long term retention of data
For example, the Financial Industry Regulatory Authority [FINRA] oversee 4,000 brokerage firms, and 600,000 broker/dealers. They have mandated the storing of digital data related to stock trades, and this can include text messages, voice messages, and emails. They continue to expand this definition, so soon this could include tweets on Twitter, for example.
Steve feels there are four key requirements for archive:
Support for email, such as an email application plug-in
Off-line access to archived data
Support for mobile devices, such as smartphones
Basic search capabilities
Companies are starting to take archive seriously. About 35 percent of firms surveyed have adopted archive, and another 36 percent plan to in the next 12-24 months. Enterprise archive has grown over 200 percent from 2007 to 2009. Steve agrees that not everything needs to be stored on disk. Retention periods greater than six years dictates the need for tape.
Current systems may not meet today's requirements. Data loss and downtime costs have skyrocketed. Data Protection and Retention projects can represent a gold mine of savings, new capabilities can greatly lower costs, allowing companies to shift resources over to revenue generation.
Big Data, New Physics and Geospatial Super-Food
I would vote this the best session of the day! For all those confused on what the heck "Big Data" means, Jeff has the best explanation. Jeff Jonas is an IBM Distinguished Engineer and the Chief Scientist of Entity Analytics. He had just finished his 17th marathon on Saturday, and his fingers were bandaged.
Jeff had founded the Systems Research and Design (SR&D) company, known for creating NORA (non-obvious relationship awareness) used by Las Vegas casinos to identify fraud. SR&D was acquired by IBM back in 2005. Jeff is focused on sensemaking of streams. He feels many companies are suffering from "Enterprise Amnesia".
"The data must find the data .. and the relevance must find the user."
-- Jeff Jonas
Jeff's metaphor to Big Data is a jigsaw puzzle without the picture on the outside of the box. To demonstrate his point, he presented a pile of jigsaw puzzle pieces and asked four teenagers to put the puzzle together without the advantage of the picture on the box. What he had not told them was that he mixed four different puzzles together, removing out 10 to 20 percent of the pieces from each puzzle. He also added some duplicate pieces from a second identical puzzle, and just to make things fun, included a dozen pieces from a sixth puzzle just to mess with their heads. Within a few hours, the kids had managed to figure out that there were four puzzles, that there were duplicate pieces, and that there were some pieces that did not fit any of the four puzzles.
"You can't squeeze knowledge from a pixel."
-- Jeff Jonas
This approach favors false negatives. New observations reverse out old conceptions. As the picture emerges, this provides added focus on new information. More data can provide better predictions. "Bad" data, including misspelled words and mis-coded categories, was often discarded or corrected on the basis of "Garbage-In, Garbage Out", but can now be useful in a Big Data perspective.
Take for example the 600 billion recordings of the "location data" captured on cell phones every day. With regular triangulation of cell phone towers, the information can pinpoint you within 60 meters, add GPS and this improved to within 20 meters, and add Wi-Fi is further improved to 10 meters. While this data is "de-identified" so as not to identify individual users, the process of re-identification is relatively trivial. Jeff's system is able to predict a person will be next Thursday at 5:35pm with 87 percent accuracy.
Thus, Big Data represents an asset, accumulation of context. Real-time analytics can be a competitive advantage. These streams of data will need persistent storage and massive I/O capabilities. In one example, Jeff processed 4,200 separate sources of information and was able to identify "dead votes". These are votes cast by people that died in years prior, indicating voter fraud.
Jeff's latest project, codenamed G2, will tackle not just people, but everything from proteins to asteroids.
Normally, the worst time slot is the hour after lunch, but these presentations kept people's attention.
Down the street, in Times Square, IBM made it on the big board.
Continuous Data Availability
Jeanine Cotter, IBM VP for Data Center Services, started out with a video about Sabre. IBM developed this revolutionary airline reservation system to handle the huge volume of transactions. Today, 18 percent of organizations consider downtime unacceptable for their tier-1 applications, and 53 percent would be seriously impacted by an outage lasting an hour or more.
Eventually, companies cross the "Continuous Availability" threshold, the point where they discover that the possibility of downtime is too costly to ignore. IBM has clients using 3-site Metro/Global Mirror that can fail-over an entire data center in just five mouse clicks.
Jeanine also mentioned Euronics, which is using SAN Volume Controller's Stretched Cluster capability, which allows them to easily vMotion virtual guest images from one data center to another. SVC has had this capability for a while, but now, with full VMcenter plug-in and VAAI support, the capability is fully integrated with VMware.
A final example was a mid-sized University, they are using IBM Storwize V7000 with Metro Mirror. The primary location's Storwize V7000 manages Solid-state drives with Easy Tier. The secondary location's Storwize V7000 has high-capacity SATA drives and FlashCopy.
Customer Testimonial - University of Rochester Medical Center
Rick Haverty, Director of IT infrastructure at University of Rochester Medical Center [URMC] provided the next client testimonial. The mission of the URMC is to use science, education and technology to improve health. URMC gets over $400 million USD in NIH grants, which puts them around 23rd largest University-based academic medical centers in the country. They have over 900 doctors, general practice and specialists.
URMC has an IBM BlueGene supercomputer, a Cisco network over 45,000 ports, and over 7.5 million square feet of Wi-Fi wireless internet coverage. They have three datacenters. The first is 7500 square feet, the second is 6000 square feet, and the third is just 800 square feet to hold their "off-site tapes".
URMC has digitized all of their records, including Electronic Medical Records (EMR) system, medical dosage history, imaging "priors", calibration of infusion pumps, RFID monitoring, and even provide IT support while the patient is on the operating table. RFID monitoring ensures all of the refrigerators are keeping medications at the right temperature. A single failed refrigerator can lose $20,000 dollars worth of medication.
When is a good time for downtime? At URMC, they handle 90,000 Emergency Room vists per year, so the answer is never. When is the ER busiest? Monday morning. (not what I expected!)
URMC's EMR software (Epic) runs on clustered POWER7 servers, with DS8700 disk systems using Metro Mirror to secondary location. They also keep a third "shadow" POWER7 for read-only purposes, and a separate system that provides web-based read-only access. Finally, they have 90 stand-alone Personal Computers (PCs) that contain information for all the patients that have reservations this week, just in case all the other systems fail.
The exploding volume of data comes from medical imaging. For radiology (X-rays), each image is called a "study" takes 20-30 MB each, and they have 650,000 studies per year. This represents about 16TB storage per year, with 3 second response time access. These must be kept for 7 years since last view, or until the patient reaches the age of 18 years old, which ever is later.
But radiology is just one discipline. Healthcare has a whole bunch of "ologies". Another is "Pathology" which looks at cells between glass slides in a microscope. Each study consumes 10-20GB, and URMC does about 100,000 pathology studies per year, representing 150TB per year.
URMC has identified that they have 42 mission-critical applications. The data for these are stored on DS8000, XIV, Storwize V7000 and DS5000, all managed behind SAN Volume Controller.
During lunch, people were able to take a look at our solutions. Here are Dan Thompson and Brett Cooper striking a pose.
Hyper-Efficient Backup and Recovery
The afternoon was kicked off by Dr. Daniel Sabbah, IBM General Manager of Tivoli software. He started with some shocking statistics: 42 percent of small companies have experienced data loss, 32 percent have lost data forever. IBM has a solution that offers "Unified Recovery Management". This involves a combination of periodic backups, frequent snapshots, and remote mirroring.
IBM Tivoli Storage Manager (TSM) was introduced in 1993, and was the first backup software solution to support backup to disk storage pools. Today, TSM is now also part of Cloud Computing services, including IBM Information Protection Services. IBM announced today a new bundle called IBM Storwize Rapid Application Backup, which combines IBM Storwize V7000 midrange disk system, Tivoli FlashCopy Manager, implementation services, with a full three-year hardware and software warranty. This could be used, for example, to protect a Microsoft Exchange email system with 9000 mailboxes.
IBM also announced that its TS7600 ProtecTIER data deduplication solutions have been enhanced to support many-to-many bi-direction remote mirroring. Last year, University of Pittsburgh Medical Center (UPMC) reported that they were average 24x data deduplication factor in their environment using IBM ProtecTIER.
"You are out of your mind if you think you can live without tape!"
-- Dick Crosby, Director of System Administration, Estes
The new IBM TS1140 enterprise class tape drive process 2.3 TB per hour, and provides a density of 1.2 PB per square foot. The new 3599 tape media can hold 4TB of data uncompressed, which could hold up to 10TB at a 2.5x compression ratio.
The United States Golfers Association [USGA] uses IBM's backup cloud, which manages over 100PB of data from 750 locations across five continents.
Customer Testimonial - Graybar
Randy Miller, Manager of Technical System Administration at Graybar, provided the next client testimonial. Graybar is an employee-owned company focused on supply-chain management, serving as a distributor for electical, lighting, security, power and cooling equipment.
Their problem was that they had 240 different locations, and expecting local staff to handle tape backups was not working out well. They centralized their backups to their main data center. In the event that a system fails in one of their many remote locations, they can rebuild a new machine at their main data center across high-speed LAN, and then ship overnight to the remote location. The result, the remote location has a system up and running by 10:30am, faster than they would have had from local staff trying to figure out how to recover from tape. In effect, Graybar had implemented a "private cloud" for backup in the 1990s, long before the concept was "cool" or "popular".
In 2001, they had an 18TB SAP ERP application data repository. To back this up, they took it down for 1 minute per day, six days a week, and 15 minutes down on Sundays. The result was less than 99.8 percent availability. To fix this, they switched to XIV, and use Snapshots that are non-disruptive and do not impact application performance.
Over 85 percent of the servers at Graybar are virtualized.
Their next challenge is Disaster Recovery. Currently, they have two datacenters, one in St. Louis and the other in Kansas City. However, in the aftermath of Japan's earthquakes, they realize there is a nuclear power plan between their two locations, so a single incident could impact both data centers. They are working with IBM, their trusted advisors, to investigate a three-site solution.
This week, May 15-22, I am in Auckland, New Zealand teaching IBM Storage Top Gun sales class. Next week, I will be in Sydney, Australia.
Optimizing Storage Infrastructure for Growth and Innovation
This session started off with my former boss, Brian Truskowski, IBM General Manager of System Storage and Networking.
We've come a long way in storage. In 1973, the "Winchester Drive" was named after the famous Winchester 3030 rifle. The disk drive was planning to have two 30MB platters, hence the name. When it finally launched, it would have two 35MB platters, for a total raw capacity of 70MB.
Today, IBM announced the verison 6.2 of SAN Volume Controller with support for 10GbE iSCSI. Since 2003, IBM has sold over 30,000 SAN Volume Controllers. An SVC cluster can now manage up to 32PB of disk storage.
IBM also announced new 4TB tape drive (TS1140), LTFS Library Edition, the TS3500 Library Connector, improved TS7600 and TS7700 virtual tape libraries, enhanced Information Archive for email, files and eDiscovery, new Storwize V7000 hardware, new Storwize Rapid Application bundles, new firmware for SONAS and DS8000 disk systems, and Real-Time Compression support for EMC disk systems. I plan to cover each of these in follow-on posts, but if you can't wait, here are [links to all the announcements].
Customer Testimonial - CenterPoint Energy
"CenterPoint is transforming its business from being an energy distribution company that uses technology, to a technology company that distributes energy."
-- Dr. Steve Pratt, CTO of CenterPoint Energy
The next speaker was Dr. Steve Pratt is CTO of [CenterPoint Energy]. CenterPoint is 110 years old (older than IBM!) energy company that is involved in electricity, gasoline distribution, and natural gas pipeline. CenterPoint serves Houston, Texas (the fourth largest city in the USA) and surrounding area.
CenterPoint are transforming to a Smart Grid involving smart meters, and this requires the best IT infrastructure you can buy, including IBM DS8000, XIV and SAN Volume Controller disk systems, IBM Smart Analytics System, Stream Analytics, IBM Virtual Tape Library, IBM Tivoli Storage Manager, and IBM Tivoli Storage Productivity Center.
Dr. Pratt has seen the transition of information over the years:
Data Structure, deciding how to code data to record it in a structured manner
Information Reporting, reporting to upper management what happened
Intelligence Aggregation, finding patterns and insight from the data
Predictive Analytics, monitoring real-time data to take pro-active steps
Autonomics, where automation and predictive analysis allows the system to manage itself
What does the transition to a Smart Grid mean for their storage environment? They will go from 80,000 meter reads, to 230,400,000 reads per day. Ingestion of this will go from MB/day to GB/sec. Reporting will transition to real-time analytics.
Dr. Pratt prefers to avoid trade-offs. Don't lose something to get something else. He also feels that language of the IT department can help. For example, he uses "Factor" like 25x rather than percent reduction (96 percent reduced). He feels this communicates the actual results more effectively.
Today's smarter consumers are driving the need for smarter technologies. Individual consumers and small businesses can make use of intelligent meters to help reduce their energy costs. Everything from smart cars to smart grids will need real-time analytics to deal with the millions of events that occur every day.
IBM's Data Protection and Retention Story
Brian Truskowski came back to provide the latest IBM messaging for Data Protection and Retention (DP&R). The key themes were:
Stop storing so much
Store more with what's on the floor
Move data to the right place
IBM announced today that the IBM Real-Time Compression Appliances now support EMC gear, such as EMC Celerra. While some of the EMC equipment have built-in compression features, these often come at a cost of performance degradation. Instead, the IBM Real-Time compression can offer improved performance as well as 3x to 5x reduction in storage capacity.
OVer 70 percent of data on disk has not be accessed in the last 90 days. IBM Easy Tier on the DS8700 and DS8800 now support FC-to-SATA automated tiering.
IBM is projecting that backup and archive storage will grow at over 50 percent per year. To help address this, IBM is launching a new "Storage Infrastructure Optimization" assessment. All attendees at today's summit are eligible for a free assessment.
Analytics are increasing the value of information, and making it more accessible to the average knowledge worker. The cost of losing data, as well as the effort spent searching for information, has skyrocketed. Users have grown to expect 100 percent uptime availability.
An analysis of IT environments found that only 55 percent was spent on revenue-producing workloads. The remaining 45 percent was spent on Data Protection and Retenion. That means that for every IT dollar spent on projects to generate revenue, you are spending another 90 cents to protect it. Imagine spending 90 percent of your house payments for homeowners' insurance, or 90 percent of your car's purchase price for car insurance.
IBM has organized its solutions into three categories:
Hyper-Efficient Backup and Recovery
Continuous Data Availability
What would it mean to your business if you could shift some of the money spent on DP&R over to revenue-producing projects instead? That was the teaser question posed at the end of these morning sessions for us to discuss during lunch.
Normally, IBM has its announcements on Tuesdays, but this week it was on Monday!
I am here in New York City, at the Kaufmann Theater of the American Museum of Natural History, for the
[IBM Storage Innovation Executive Summit]. We have about 250 clients here, as well as many bloggers and storage analysts.
My day started out being interviewed by Lynda from Stratecast, a division of [Frost & Sullivan]. This interview will be part of a video series that Stratecast is doing about the storage industry.
(About the venue: American Museum of Natural History was built in 1869. It was featured in the film "Night at the Museum". In keeping with IBM's focus on scalability and preservation, the museum here boasts skeletons of the largest dinosaurs. The five-story building takes up several city blocks, and the Kaufmann theater is buried deep in the bottom level, well shielded from cell phone or Wi-Fi signals allowing me to focus on taking notes the traditional way, with pen and paper.)
Deon Newman, IBM VP of Marketing for Northa America, was our Master of Ceremonies. Today would be filled with market insight, best practices, thought leadership, and testimonials of powerful results.
This is my first in a series of blog posts on this event.
Information Explosion on a Smarter Planet
Bridget van Kralingen, IBM General Manager for North America, indicated that storage is finally having its day in the sun, moving from the "back office" to the "front office". According to Google's Eric Schmidt, we now create, capture and replicate more date in two days than all of the information recorded from the dawn of time to the year 2003.
1928: IBM's innovative 80-column punch card stored nearly twice as much as its 50-column predecessor.
1947: Bing Crosby decided to do his radio show by recording it at his convenience on magnetic tape, rather than doing it live. This was the motivation for IBM researches to investigate tape media, delivering the first commercial tape drive in 1952. One tape reel could hold the equivalent of 30,000 punch cards.
1956: the IBM RAMAC mainframe was the first computer to access data randomly with an externally-attached disk system, the "350 Disk Unit", which stored 5 million 7-bit characters (about 5MB) and weighed over 500 pounds. Compare that today's cell phone that can store several GB of data in a handheld device.
1978: IBM invented Redundant Array of Independent Disks (RAID) through a collaboration with University of Berkeley.
1993: IBM introduces the [IBM 9337 Disk Storage Array], the first external disk storage system for distributed operating systems. This was based on the Serial Storage Architecture [SSA] protocol.
1995: IBM launches products that support Storage Area Networks (SAN), based on the Fibre Channel Protocol. IBM's internal codenames for disk products were all names of sharks, and so our internal mantra was that a healthy storage diet was comprised of "Plenty of Fish and Fibre".
2010: IBM ships Easy Tier, the world's easiest-to-use sub-LUN automated tiering capability, for the IBM System Storage DS8700 disk system.
Storage is growing (in capacity) at 40 percent per year, but IT budgets are only growing (in dollars) by a measly 1 to 5 percent. She cited the success at [Sprint], presented at the October 2010 launch. By combining IBM SAN Volume Controller with a three-tier storage architecture, Sprint lowered their raw capacity from 10PB to 8.4PB, increasing utilization from 35 to 78 percent. This involved shrinking from six storage vendors to three, and reducing total number of disk arrays from 166 down to 96. The resulting system has only 38 percent of their data on their most expensive Tier-1 storage, the rest is now living on less expensive Tier-2 and Tier-3 storage.
Companies are entering the era of Big Data with an insatiable appetite for collecting and analyzing data for marketplace insights. IBM [InfoSphere BigInsights], based on the Apache Hadoop, has helped customers make sense of it all. Innovative technology, expertise and marketplace insight will provide the competitive path forward in the coming decade.
Storage Challenges and Opportunities in 2011 and Beyond
I always enjoy hearing Stan Zaffos, Gartner Research VP, present at the annual [Data Center Conference] in Las Vegas every December. His analysis and research focuses on storage systems and emerging storage technologies.
Stan provided his perspective on the storage industry. He suggested a top-down approach, based on the market trends that Gartner is closely monitoring. He suggests focusing heavily on managing data growth, using SLAs to improve efficiency, and to follow Gartner's recommended actions. His statement, "If something is not sustainable, then it is unsustainable." resonated well with the audience. His key three points:
Design to meet but not exceed Service Level Agreements (SLAs)
Re-evaluate your ratio of SAN versus NAS based on growth of unstructured data content,
Explore the variety of Cloud options available.
Those of us who have been in this business a long time recognize that the problems haven't changed, just the dimensions. When in the past three decades were IT budgets generous and plentiful? When was there more than enough IT staff to handle all the requests in a timely manner? When hasn't there been a period of information growth? Gartner's analysis external control block (RAID protected disk systems) is growing revenue at 8.7 percent. Raw TBs of disk capacity is growing at 55 percent, and expected to be 100 Exabytes by 2015.
SAN has four times more revenue than NAS today, but NAS is growing faster. NAS was only 9 percent marketshare in 2010, but is projected to grow to 32 percent by 2015. SAN can offer higher price/performance for traditional OLTP and database workloads, but NAS is better suited for unstructured data, backups and archives, assisted by storage efficiency features like real-time compression and data deduplication. Which industries create the most unstructured data? The ones involved in filling out forms! This includes government, insurance agencies, manufacturing, mining and pharmaceuticals.
The phrase "good enough" should no longer be considered an insult. Too often IT departments design solutions that far exceed negotiated Service Level Agreements (SLAs), and they should instead focus on just meeting them instead. Modular storage systems are often sufficient for most workloads. Slower 7200RPM SATA disks can be one third the price of faster 15K RPM Fibre Channel drives, and often sufficient performance for the tasks required. Unified storage, such as IBM N series, can help simplify capacity planning, as storage can be re-purposed if different workloads grow at different rates. The key is to focus on meeting SLAs based on the price-vs-risk factor. Take a minimalist approach with fewer SLAs, fewer management classes, and fewer storage vendors.
Stan suggests a two-pronged approach: Capacity management through content analytics and classification, and Efficient Utilization through Thin Provisioning, storage virtualization, Quality of Service (QoS), compression and deduplication capabilities. This features will be ubiquitous by 2013. If you are worried that these technologies mean more information packed onto fewer devices, Stan's response was "If it's not there, it can't break." Storing data on fewer disks or tape cartridges means less chance something will fail.
Stan feels IT shops using Thin Provisioning should continue to charge their end-users on what they ask for (the full allocation request) rather than what the thin-provisioned amount actually is on the storage devices themselves. For example, if someone asks for 100GB LUN to be allocated to their system, but this only takes up 30GB of actual data space, chargeback the full 100GB!
It can take five years for new technology to get 50 percent adopted. The Romans took eight years to build the [Colosseum]. His research on "network convergence" found that 42 percent planned to use iSCSI, 32 percent Fibre Channel over Ethernet (FCoE) or other Top-of-Rack(TOR) converged switches, and 16 percent looking for full convergence of servers, switches and storage. Features like IBM Easy Tier automatic sub-LUN tiering were introduced later, and so have not been adopted as widely as other features like Thin Provisioning that have been around since the 1990's IBM RAMAC Virtual Array.
Stan felt that Public and Private clouds were two different approaches. Public clouds offer reservation-less provisioning. Private clouds offer improved agility, but can be more complex to set up, and has the risk of idle capacity similar to traditional IT datacenter deployments. Storage and File virtualization should be considered a pre-req for adopting Cloud technologies.
Storage IT teams need to adopt more than just technical skills. They need to learn about legal and government regulatory compliance issues, financial considerations, and would even benefit doing some "marketing". Why marketing? Because often IT departments need end-users to change their attitudes and behaviours, and this can be accomplished through internal marketing campaigns.
IBM introduced the Linear Tape File System last year, which I explained in my post [IBM Celebrates 10 Year Anniversary for LTO tape], and released it as open source to the rest of the Linear Tape Open [LTO] Consortium so that the entire planet can benefit from IBM's innovation. IBM presented a technology demonstration of its Linear Tape File System - Library Edition at the NAB conference, showing how this new IBM library offering of the file system can put mass archives of rich media video content at the users fingertips with the ease of library automation.
From left to right, here is Atsushi Nagaishi (Toshiba) and Shinobu Fujihara (IBM). Fujihara-san is from IBM's Yamato lab in Japan where some of the LTFS development was done. The Yamato Lab was not damaged by the [Earthquakes in Japan].
With the capabilities of LTFS, IBM has introduced an entirely new role for tape, as an attractive high capacity, easy to use, low cost and shareable storage media. LTFS can make tape usable in a fashion like removable external disk, a giant alternative to floppy diskettes, DVD-RW and USB memory sticks with directory tree access and file-level drag-and-drop capability. LTFS can allow the for passing of information around from one system or employee to another. And as for high video storage capacity, a 1.5TB LTO-5 cartridge can hold about 50 hours of XDCAM HD video!
A group photo of the global IBM LTFS team, from left to right, David Pease from IBM Almaden Research Center, Ed Childers from IBM Tucson, Shinobu Fujihara and Hironobu Nagura from IBM Japan.
IBM was once again #1 leader in Tape worldwide for the year 2010. With this exciting new win, tape is not just for backup and archive anymore!