This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is with IBM Cloud Private and he will be delivering and supporting sessions at Think2019, and Storage Technical University on the Value of IBM storage in this high value IBM solution a part of the IBM Cloud strategy. Lloyd maintains a Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
The developerWorks Connections Platform is now in read-only mode and content is only available for viewing. No new wiki pages, posts, or messages may be added. Please see our FAQ for more information. The developerWorks Connections platform will officially shut down on March 31, 2020 and content will no longer be available. More details available on our FAQ. (Read in Japanese.)
Continuing my week in Chicago, for the IBM Storage Symposium 2008, I attended two presentations on XIV.
XIV Storage - Best Practices
Izhar Sharon, IBM Technical Sales Specialist for XIV, presented best practices using XIV in various environments.He started out explaining the innovative XIV architecture: a SATA-based disk system from IBM can outperformFC-based disk systems from other vendors using massive parallelism. He used a sports analogy:
"The men's world record for running 800 meters was set in 1997 by Wilson Kipketer of Denmark in a time of 1:41.11.
However, if you have eight men running, 100 meters each, they will all cross the finish line in about 10 seconds."
Since XIV is already self-tuning, what kind of best practices are left to present? Izhar presented best practicesfor software, hosts, switches and storage virtualization products that attach to the XIV. Here's some quickpoints:
Use as many paths as possible.
IBM does not require you to purchase and install multipathing software as other competitors might. Instead, theXIV relies on multipathing capabilities inherent to each operating system.For multipathing preference, choose Round-Robin, which is now available onAIX and VMware vSphere 4.0, for example. Otherwise, fixed-path is preferred over most-recently-used (MRU).
Encourage parallel I/O requests.
XIV architecture does not subscribe to the outdated notion of a "global cache". Instead, the cache is distributed across the modules, to reduce performance bottlenecks. Each HBA on the XIV can handle about 1400requests. If you have fewer than 1400 hosts attached to the XIV, you can further increase parallel I/O requests by specifying a large queue depth in the host bus adapter (HBA).An HBA queue depth of 64 is a good start. Additional settings mightbe required in the BIOS, operating system or application for multiple threads and processes.
For sequential workloads, select host stripe size less than 1MB. For random, select host stripe size larger than 1MB. Set rr_min_io between ten(10) and the queue depth(typically 64), setting it to half of the queue depth is a good starting point.
If you have long-running batch jobs, consider breaking them up into smaller steps and run in parallel.
Define fewer, larger LUNs
Generally, you no longer need to define many small LUNs, a practice that was often required on traditionaldisk systems. This means that you can now define just 1 or 2 LUNs per application, and greatly simplifymanagement. If your application must have multiple LUNs in order to do multiple threads or concurrent I/O requests, then, by all means, define multiple LUNs.
Modern Data Base Management Systems (DBMS) like DB2 and Oracle already parallelize their I/O requests, sothere is no need for host-based striping across many logical volumes. XIV already stripes the data for you.If you use Oracle Automated Storage Management (ASM), use 8MB to 16MB extent sizes for optimal performance.
For those virtualizing XIV with SAN Volume Controller (SVC), define manage disks as 1632GB LUNs, in multiple of six LUNs per managed disk group (MDG), to balance across the six interface modules. Define SVC extent size to 1GB.
XIV is ideal for VMware. Create big LUNs for your VMFS that you can access via FCP or iSCSI.
Organize data to simplify Snapshots.
You no longer need to separate logs from databases for performance reasons. However, for some backup productslike IBM Tivoli Storage Manager (TSM) for Advanced Copy Services (ACS), you might want to keep them separatefor snapshot reasons. Gernally, putting all data for an application on one big LUNgreatly simplifies administration and snapshot processing, without losing performance.If you define multiple LUNs for an application, simply put them into the same "consistencygroup" so that they are all snapshot together.
OS boot image disks can be snapshot before applying any patches, updates or application software, so that ifthere are any problems, you can reboot to the previous image.
Employ sizing tools to plan for capacity and performance.
The SAP Quicksizer tool can be used for new SAP deployments, employing either the user-based orthroughput-based sizing model approach. The result is in mythical unit called "SAPS", which represents0.4 IOPS for ERP/OLTP workloads, and 0.6 IOPS for BI/BW and OLAP workloads.
If you already have SAP or other applications running, use actual I/O measurements. IBM Business Partners and field technical sales specialists have an updated version of Disk Magic that can help size XIV configurations fromPERFMON and iostat figures.
Lee La Frese, IBM STSM for Enteprise Storage Performance Engineering, presented internal lab test results forthe XIV under various workloads, based on the latest hardware/software levels [announced two weeks ago]. Three workloadswere tested:
Web 2.0 (80/20/40) - 80 percent READ, 20 percent WRITE, 40 percent cache hits for READ.YouTube, FlickR, and the growing list at [GoWeb20] are applications with heavy read activity, but because of[long-tail effects], may not be as cache friendly.
Social Networking (50/50/50) - 50 percent READ, 50 percent WRITE, 50 percent cache hits for READ.Lotus Connections, Microsoft Sharepoint, and many other [social networking] usage are more write intensive.
Database (70/30/50) - 70 percent READ, 30 percent WRITE, 50 percent cache hits for READ.The traditional workload characteristics for most business applications, especially databases like DB2 andOracle on Linux, UNIX and Windows servers.
The results were quite impressive. There was more than enough performance for tier 2 application workloads,and most tier 1 applications. The performance was nearly linear from the smallest 6-module to the largest 15-module configuration. Some key points:
A full 15-module XIV overwhelms a single SVC 8F4 node-pair. For a full XIV, consider 4 to 8 nodes 8F4 models, or 2 to 4 nodes of an 8G4. For read-intensive cache-friendly workloads, an SVC in front of XIV was able to deliver over 300,000 IOPS.
A single node TS7650G ProtecTIER can handle 6 to 9 XIV modules. Two nodes of TS7650G were needed to drivea full 15-module XIV. A single node TS7650 in front of XIV was able to ingest 680 MB/sec on the seventh day with17 percent per-day change rate test workload using 64 virtual drives. Reading the data back got over 950 MB/sec.
For SAP environments where response time 20-30 msec are acceptable, the 15-module XIV delivered over 60,000 IOPS. Reducing this down to 25,000-30,000 cut the msec response time to a faster 10-15 msec.
These were all done as internal lab tests. Your mileage may vary.
Not surprisingly, XIV was quite the popular topic here this week at the Storage Symposium. There were many moresessions, but these were the only two that I attended.
Continuing my week in Chicago, for the IBM Storage Symposium 2009, I attended what in my opinion was the bestsession of the week. This was by a guy named Chip Copper, who covered IBM's set of Ethernet and Fibre Channelnetworking gear. Attributes are the four P's:
Power and Cooling (electricity usage)
Equipment comes in two flavors: Top-of-Rack (ToR) thin pizza box switches, and Middle-of-Row (MoR) much larger directors.The MoR directors are engineered for up to 50Gbps per half-slot, so 10GbE and the future 40GbE can be easily accommodated in a single half-slot, and the future 100GbE can be done with a full slot (two half-slots).
While many companies might have been contemplating the switch from copper wires to optical fiber, there is a new reason for copper cables: Power-over-Ethernet (PoE). Many IP-phones, digital video surveillance cameras, and other equipment can have a single cable that delivers both signal and electricity over copper. If you have already deployed optical fiber throughout the building, there are "last mile" options where the signals are converted to copper wires and electrical energy added for these types of devices.
Two directors can be connected together with Inter-Chassis Link (ICL) cables to make them look like a single director with twice the number of ports. These are different than Inter-Switch Links (ISL) as they are not counted as an extra "hop" for networking counting purposes, especially important for FICON usage.
Today, we have 1Gbps, 2Gbps, 4Gbps and 8Gbps Fibre Channel. Since these all use 10-for-8 encoding (10 bits represents one 8-bit byte), then in was easy to calculate throughput: 8Gpbs was 800 MB/sec, for example. Auto-negotiation between speeds is not done at the HBA card, switch or director blade itself, but in the Short Form-factor Pluggable (SFP) optical connector. However, you can only auto-negotiate if the encoding matches. The 4/2/1 SFP can run at 4Gbps or auto-negotiate to slower 2Gbps and 1Gbps. The 8/4/2 SFP can run at 8Gbps, or auto-negotiate down to slower 4Gpbs and 2Gbps. Folks who still have legacy 1Gbps equipment, but want to run some things at 8 Gbps, can buy 8Gbps-capable switches or director blades, but then put some 4/2/1 SFPs into them. These 4/2/1 SFP are cheaper, so this might be something to consider if budgets are tight. Some SFPs handle up to 10km distances, but others only 4km, so be careful not to order the wrong ones.
Unfortunately, there are proposals in place for 10Gbps and 40Gbps that would use a different 66-for-64 encoding (66 bits represent 8 bytes), so 10Gbps would be 1200 MB/sec. These are used today for ISL between directors and switches.In theory, the 40Gbps could auto-negotiate down to 10Gbps, but not to any of the 8/4/2/1 Gbps that use different 10-for-8 encoding.
For those who cannot afford a SAN768B, there is a smaller SAN384B that can carry: 192 ports (4Gpbs/2Gbps), 128 ports (8Gbps) or 24 ports (10Gbps). The SAN384B can be ICL connected to another SAN384B or even the SAN768B as your needs grow.
On the entry-level side, the SAN24B-4 offers a feature called "Access Gateway". This makes the SAN24B look like an SAN end-point host, rather than a switch, and makes initial deployment of integrated bundled solutions easier. Once connected to everything, you can convert it over to full "switch" mode.The SAN40B-4 and SAN80B-4 provide midrange level support, including Fibre Channel routing at the 8Gbps level. In fact, all 8Gbps ports include routing capability. IBM offers both single-port and dual-port 8Gbps host bus adapter (HBA) cards to connect to these switches. These HBA offer 16 virtual channels per port, so that if you have VMware running many guests, or want to connect both disk and tape to the same HBA, you can keep the channel traffic separate for Quality of Service (QoS).
Chip wrapped up his session to discuss Fibre Channel over Ethernet (FCoE), and explained why we need to have a loss-less Convergence Enhanced Ethernet (CEE) to meet the needs of storage traffic as well as traditional Fibre Channel does today. IBM offers all of the equipment you need to get started today on this FCoCEE, with Converged Network Ethernet cards for your System x servers, and a new SANB32 that has 24 10GbE CEE ports and 8 traditional 8Gbps FC ports. This means that you can put the CNA card in your existing servers, connect to this switch, and then connect to your existing 10GbE LAN and your existing 8Gpbs or 4Gpbs FC-based SAN to the rest of your storage devices.
Worried that the FCoE or CEE standards could change after you deploy this gear? Aren't most LAN and SAN switches based on Application-specific integrated circuit [ASIC] chips which are created in the factory? Don't worry, IBM's equipment have put all the standards-vulnerable portions of the logic into separate Field-programmable gate array [FPGA] that can be updated with simplya firmware upgrade. This is future-proofing I can agree with!
Continuing my week in Chicago, I decided to attend some of the presentations from the System x side. This is the advantage of running both conferences in the same hotel, attendees can choose how many of each they want to participate in.
Wayne Wigley, IBM Advanced Technical Support (ATS), presented a series of presentations on different server virtualization offerings available for System x and BladeCenter servers. I am very familiar with virtualization implemented on System z mainframes, as well as IBM's POWER systems, and have working knowledge of Linux KVM and Xen, so I was well prepared to handle hearing the latest about Microsoft's Hyper-V and VMware's Vsphere version 4.
Microsoft Hyper-V 2008
Hyper-V can run as part of Windows 2008, are standalone on its own.Different levels of Windows 2008 include licenses for different number of Windows virtual machines (VMs).Windows Server 2008 Standard includes 1 Windows VM, Enterprise includes 4 Windows VMs, and the Datacenter edition includes unlimited number of Windows VMs. If you want to run more Window VMs than come included, you need to pay extra for each additional one. For example, to run 10 Windows VMs on a 2-socket server would cost about $9000 US dollars on Standard but only $6000 US dollars on Datacenter edition (list prices from Microsoft Web site).
Unlike VMware, which takes a monolithic approach as hypervisor, Hyper-V is more like Xen with a microkernelized approach. This means you need a "parent" guest OS image, and the rest of the Guest OS images are then considered "child" images.These child images can be various levels of Windows, from Windows XP Pro to Windows Server 2008, Xen-enabled Linux, or even a non-hypervisor-aware OS.The "parent" guest OS image provides networking and storage I/O services to these "child" images.For the hypervisor-aware versions of Windows and Linux, Hyper-V allows optimized access to the hypervisor, "synthetic devices", and hypercalls. Synthetic devices present themselves as network devices, but only serve to pass data along the VMBus to other networking resources. This process does not require software emulation, and therefore offers higher performance for virtual machines and lower host system overhead.For non-hypervisor-aware OS images, Hyper-V provides device emulation through the "parent" image, which is slower.
Microsoft System Center Virtual Machine Manager (SCVMM) can manage both Hyper-V and VMware VI3 images.Wayne showed various screen shots of the GUI available to manage Hyper-V images.In standalone mode, you lose the nice GUI and management console.
Hyper-V supports external, internal and private virtual LANs (VLAN). External means that VMs can communicate with the outside world over standard ethernet connections. Internal means that VMs can communicate with "parent" and "child" guest images on the same server only. Private means that only "child" guests can communicate with other "child" images.
Hyper-V supports disk attached via IDE, SATA, SCSI, SAS, FC, iSCSI, NFS and CIFS. One mode is "Virtual Hard Disk" (VHD) similar to VMware VMDK files. The other is "pass through" mode, which are actual disk LUNs accessed natively. VHDs can be dynamic (thin provisioned), fixed (fully allocated), or differencing. The concept of differencing is interesting, as you start with a base read-only VHD volume image, and have a separate "delta" file that contains changes from the base image.
Some of the key features of Hyper-V 2008 are:
Being able to run concurrently 32-bit and 64-bit versions of Linux and Windows guest images
Support for 64 GB of memory and 4-way symmetric multiprocessing (SMP) per VM
Clustering for High Availability and Quick Migration of VM images
Live backup with integration with Microsoft's Volume Shadow Copy Services (VSS)
Virtual LAN (VLAN) support, and Virtual and Pass-through physical disk support
A clever VMbus, virtual service parent/client approach to sharing hardware
Optimized performance options for hypervisor-aware versions of Windows and Linux, and emulated supportfor non-hypervisor-aware OS images.
VMware Vsphere v4.0
This was titled as an "Overview" session, but really was an "Update" session on the newest features of this release. The big change appears to be that VMware added "v" in front of everything.
Under vCompute, there are some new features on VMware's Distributed Resource Scheduler (DRS) which includes recommended VM migrations. Dynamic Power Management (DPM) will move VMs during periods of low usageto consolidate onto fewer physical servers so as to reduce energy consumption.
Under vStorage, vSphere introduces an enhanced Plugable Storage Architecture (PSA), with supportfor Storage Array Type Plugins (SATP) and Path Selection Plugins (PSP). This vStorage API allows forthird party plugins for improved fault-tolerance and complex I/O load balancing algorithms. This releasealso has improved support for iSCSI, including Challenge-Handshake Authentication Protocol (CHAP) support.Similar to Hyper-V's dynamic VHD, VMware supports "thin provisioning" for their virtual disk VMDK files.A feature of "Storage Vmotion" allows conversion between "thick" and "thin" provisioning formats.
The vStorage API for Data Protection provide all the features of VMware Consolidated Backup (VCB). The APIprovides full, incremental and differential file-level backups for Windows and Linux guests, including supportfor snapshots and Volume Shadow Copy Services (VSS) quiescing.
VMware introduces direct I/O pass-through for both NIC and HBA devices. While thisallows direct access to SAN-attached LUNs similar to Hyper-V, you lose a lot of features like Vmotion, High Availability and Fault Tolerance. Wayne felt that these restrictions are temporary, that hopefully VMwarewill resolve this over the next 12 months.
Under vNetwork, VMware has virtual LAN switches called vSwitches. This includes support for IPv6and VLAN offloading.
The vSphere server can now run with up to 1TB of RAM and 64 logical CPUs to support up to 320 VM guest images.Each VM can have up to 255GB RAM and up to 8-way SMP.Vsphere ESX 4 introduces a new virtual hardware platform called VM Hardware v7. While Vsphere 4.0 can run VMs from ESX 2 and ESX 3, the problem is if you have new VMs based on this newer VM Hardware v7, you cannot run them on older ESX versions.
Vsphere comes in four sizes: Standard, Advanced, Enterprise, and Enterprise Plus, ranging in list price from $795 US dollars to $3495 US dollars.
While IBM is the #1 reseller of VMware, we also are proud to support Hyper-V, Xen, KVM and other similar products.Analysts expect most companies will have two or more server virtualization solutions in their data center, and it is good to see that IBM supports them all.
Continuing my week in Chicago for the IBM Storage and Storage Networking Symposium and System x and BladeCenter Technical Conference, I presented a variety of topics.
Hybrid Storage for a Green Data Center
The cost of power and cooling has risen to be a #1 concern among data centers. I presented the following hybrid storage solutions that combine disk with tape. These provide the best of both worlds, the high performance access time of disk with the lower costs and reduced energy consumption of tape.
IBM [System Storage DR550] - IBM's Non-erasable, Non-rewriteable (NENR) storage for archive and compliance data retention
IBM Grid Medical Archive Solution [GMAS] - IBM's multi-site grid storage for PACS applications and electronic medical records[EMR]
IBM Scale-out File Services [SoFS] - IBM's scalable NAS solution that combines a global name space with a clustered GPFS file system, serving as the ideal basis for IBM's own[Cloud Computing and Storage] offerings
Not only do these help reduce energy costs, they provide an overall lower total cost of ownership (TCO) thantraditional WORM optical or disk-only storage configurations.
The Convergence of Networks - Understanding SAN, NAS and iSCSI in the Data Center Network
This turned out to be my most popular session. Many companies are at a crossroads in choosing data and storage networking solutions in light of recent announcements from IBM and others. In the span of 75 minutes, I covered:
Block storage concepts, storage virtualization and RAID levels
File system concepts, how file systems map files to block storage
Network Attach Storage, the history of the NFS and CIFS protocols, Pros and Cons of using NAS
Storage Area Networks, the history of SAN protocols including ESCON, FICON and FCP, Pros and Cons of using SAN
IP SAN technologies, iSCSI and Fibre Channel over Ethernet (FCoE), Pros and Cons of using this approach
Network Convergence with Infiniband and Fibre Channel over Convergence Enhanced Ethernet (FCoCEE), why Infiniband was not adopted historically in the marketplace as a storage protocol, and the features and enhancements of Convergence Enhanced Ethernet (CEE) needed to merge NAS, SAN and iSCSI traffic onto a single converged data center network [DCN]
Yes, it was a lot of information to cover, but I managed to get it done on time.
IBM Tivoli Storage Productivity Center version 4.1 Overview and Update
In conferences like these, there are two types of product-level presentations. An "Overview" explains howproducts work today to those who are not familiar with it. An "Update" explains what's new in this version of the product for those who are already familiar with previous releases. I decided to combine these into one sessionfor IBM's new version of [Tivoli Storage Productivity Center].I was one of the original lead architects of this product many years ago, and was able to share many personalexperiences about its evolution in development and in the field at client facilities.Analysts have repeatedly rated IBM Productivity Center as one of the top Storage Resource Management (SRM) tools available in the marketplace.
Information Lifecycle Management (ILM) Overview
Can you believe I have been doing ILM since 1986? I was the lead architect for DFSMS which provides ILM support for z/OS mainframes. In 2003-2005, I spent 18 months in the field performingILM assessments for clients, and now there are dozens of IBM practitioners in Global Technology Services andSTG Lab Services that do this full time. This is a topic I cover frequently at the IBM Executive Briefing Center[EBC], because it addressesseveral top business challenges:
Reducing costs and simplifying management
Improving efficiency of personnel and application workloads
Managing risks and regulatory compliance
IBM has a solution based on five "entry points". The advantage of this approach is that it allows our consultants to craft the right solution to meet the specific requirements of each client situation. These entry points are:
Tiered Information Infrastructure - we don't limit ourselves to just "Tiered Storage" as storage is only part of a complete[information infrastructure] of servers,networks and storage
Storage Optimization and Virtualization - including virtual disk, virtual tape and virtual file solutions
Process Enhancement and Automation - an important part of ILM are the policies and procedures, such as IT Infrastructure Library [ITIL] best practices
Archive and Retention - space management and data retention solutions for email, database and file systems
I did not get as many attendees as I had hoped for this last one, as I was competing head-to-head in the same time slot as Lee La Frese covering IBM's DS8000 performance with Solid State Disk (SSD) drives, John Sing covering Cloud Computing and Storage with SoFS, and Eric Kern covering IBM Cloudburst.
I am glad that I was able to make all of my presentations at the beginning of the week, so that I can then sit back and enjoy the rest of the sessions as a pure attendee.
This week I am in Chicago for the IBM Storage and Storage Networking Symposium, which coincides with the System x and BladeCenter Technical Conference. This allows the 800 attendees to attend both storage or server presentations at their convenience. There were hundreds of sessions, over 20 time slots, so for each time slot, you have 15 or so topics to choose from.Mike Kuhn kicked off the series of keynote sessions. Here's my quick recap of each one:
Curtis Tearte, General Manager, IBM System Storage
Curtis replaced Andy Monshaw as General Manager for IBM System Storage. His presentation focused on how storage fits into IBM's Dynamic Infrastructure strategy. Some interesting points:
a billion camera-enabled cell phones were sold in 2007, compared to 450 million in 2006.
IBM expects that there will be 2 billion internet users by 2011, as well as trillions of "things".
In the US, there were 2.2 million medical pharmacy dispensing errors resulting for handwritten prescriptions.
Time wasted looking for parking spaces in Los Angeles consumed 47,000 gallons of gasoline, and generated 730 tons of carbon dioxide.
In the US, 4.2 billion hours are lost, and 2.9 billion gallons of gas consumed, due to traffic congestion.
Over the past decade, servers went from 8 watts to 100 watts per $1000 US dollars.
Data growth appears immune to the economic recession. The digital footprint per person is expected to grow from 1TB today to over 15TB by 2020.
10 hours of YouTube videos are uploaded every minute.
Bank of China manages 380 million bank accounts, processing over 10,000 transactions per second.
At the end of the session, Curtis transitioned from demonstrating his knowledge and passion of storage to his knowledge and passion in his favorite sport: baseball. Chicago is home to both the Cubs and the White Sox.
Roland Hagan, Vice President Business Line Executive, System x
IBM sets the infrastructure agenda for the entire industry. The Dynamic Infrastructure initiative is not just IT, but a complete end-to-end view across all of the infrastructures in play, including transportation, manufacturing, services and facilities.Companies spent over $60 billion US dollars on servers last year. Of these, 53 percent for x86-based servers, 9 percent for Itanium-based, 26 percent for RISC-based (POWER6, SPARC, etc.), and 11 percent mainframe. Theeconomic downturn has impacted revenues, but the percentages continue about the same.
The dominant deployment model remains one application per server. As a result, power, cooling and management costs have grown tremendously. There are system admins opposed to consolidating server images with VMware, Hyper-V, Xen or other server virtualizaition technologies. Roland referred to these admins as "server huggers".To help clients adopt cloud computing technologies, IBM introduced [Cloudburst] appliances. IBM plans to offer specialized versions for developers, for service providers, and for enterprises.
IBM's Enterprise-X Architecture is what differentiates IBM's x86-based servers from all the competitors, surrounding Intel and AMD processors with technology that provides distinct advantages. For example, to support server virtualization, IBM's eX4 provides support for more memory, which often is more critical than CPU resources when deploying large number of guest OS images. IBM System x servers have an integrated management module (IMM) and was the first to change over from BIOS to the new Unified Extensible Firmware Interface [UEFI] standard.
IBM servers offer double the performance, consume half the power, and cost a third less to manage, than comparably priced servers from competitors. Of the top 20 more energy efficient server deployments, 19 are from IBM. Roland cited customer reference SciNet, a 4000-server supercomputer with 30,000 cores based on IBM [iDataPlex] servers. At 350 TeraFLOPs it is ranked #16 fastest supercomputer in the world, and #1 in Canada. With apower usage effectiveness (PUE) less than 1.2, it also is very energy efficient. This means that for every 12 watts of electricity going in to the data center, 10 watts are used for servers, storage and networking gear, andonly 2 watts used for power and cooling. Traditional data centers have PUE around 2.5, consuming 25 watts total for every 10 watts used by servers, storage and networking gear.
Clod Barrera, Distinguished Engineer, Chief Technical Strategist for IBM System Storage
Clod presented trends and directions for disk and tape technology, disk and tape systems, and the direction towards cloud computing.
And, it's not too late to sign up for IBM Tivoli's [Pulse 2008] conference that will be heldin Orlando, Florida, May 18-22, 2008. I'll be there Sunday and Monday only, in the Tivoli Storage track, so if you are planning to attend and wish to meet up with me while I am there, please send me a note!
If you missed the [IBM System Storage and Storage Networking Symposium] in San Diego, California last month (like I did because I was in Japan and India), here is your chance to attend the one next month in Europe, September 8-11, in beautiful[Montpellier, France]. Several of my colleagues from the IBM Tucson Executive Briefing Center are scheduled to speak at this event.
And maybe, perhaps, some IBM executives, will have something important to say next month also! Stay tuned!
For a list of other IBM events this year, see the [2008 schedule].
Continuing my coverage of the IT Security and Storage Expo in Brussels, Belgium, we had some great storage solutions on display at the IBM and I.R.I.S-ICT booth.
Here my IBM colleague Tom Provost is showing the front of the "Smarter Office" solution. The second photo gives the view from behind. While I always explained the solution from the front of the box, many of the more technical attendees at this conference wanted to inspect the ports in the back.
This sound-isolated 11U solution combines the following:
The [IBM Storwize V3700] with 300GB small-form-factor (SFF) drives provides shared storage for the servers.
Two [IBM System x3550 M4 servers] that can run VMware, Hyper-V or Linux KVM server hypervisor software for your Windows and/or Linux applications. These are two socket servers that can have up to 16 x86 cores each.
A Juniper EX2200 switch to network the servers and storage together.
A Local Console Manager (LCM) with rackable keyboard, video, and mouse.
In this next example, the IBM team combined a BladeCenter S chassis that can hold six blade servers, with a Storwize V7000 Unified which offers FCP, iSCSI, FCoE, NFS, CIFS, HTTPS, SCP and FTP block and file protocols.
If those configurations are too small for your needs, consider the Flex System chassis or full PureFlex system frame. The rack-mountable 10U chassis can hold the Flex System V7000 and 10 compute notes. The PureFlex frame can hold up to four of these chasses.
IBM and I.R.I.S-ICT also had an IBM XIV Gen3 and a TS3500 Tape library on display.
"With Cisco Systems, EMC, and VMware teaming up to sell integrated IT stacks, Oracle buying Sun Microsystems to create its own integrated stacks, and IBM having sold integrated legacy system stacks and rolling in profits from them for decades, it was only a matter of time before other big IT players paired off."
Once again we are reminded that IBM, as an IT "supermarket", is able to deliver integrated software/server/storage solutions, and our competitors are scrambling to form their own alliances to be "more like IBM." This week, IBM announced new ordering options for storage software with System x servers, including BladeCenter blade servers and IntelliStation workstations. Here's a quick recap:
IBM Tivoli Storage Manager FastBack v6.1 supports both Windows and Linux! FastBack is a data protection solution for ROBO (Remote Office, Branch Office) locations. It can protect Microsoft Exchange, Lotus Domino, DB2, Oracle applications. FastBack can provide full volume-level recovery, as well as individual file recovery, and in some cases Bare Machine Recovery. FastBack v6.1 can be run stand-alone, or integrated with a full IBM Tivoli Storage Manager (TSM) unified recovery management solution.
FlashCopy Manager v2.1
FlashCopy Manager uses point-in-time copy capabilities, such as SnapShot or FlashCopy, to protect application data using an application-aware approach for Microsoft Exchange, Microsoft SQL server, DB2, Oracle, and SAP. It can be used with IBM SAN Volume Controller (SVC), DS8000 series, DS5000 series, DS4000 series, DS3000 series, and XIV storage systems. When applicable, FlashCopy manager coordinates its work with Microsoft's Volume Shadow Copy Services (VSS) interface. FlashCopy Manager can provide data protection using just point-in-time disk-resident copies, or can be integrated with a full IBM Tivoli Storage Manager (TSM) unified recovery management solution to move backup images to external storage pools, such as low-cost, energy-efficient tape cartridges.
General Parallel File System (GPFS) v3.3 Multiplatform
GPFS can support AIX, Linux, and Windows! Version 3.3 adds support for Windows 2008 Server on 64-bit chipset architectures from AMD and Intel. Now you can have a common GPFS cluster with AIX, Linux and Windows servers all sharing and accessing the same files. A GPFS cluster can have up to 256 file systems. Each of these file systems can be up to 1 billion files, up to 1PB of data, and can have up to 256 snapshots. GPFS can be used stand-alone, or integrated with a full IBM Tivoli Storage Manager (TSM) unified recovery management solution with parallel backup streams.
For full details on these new ordering options, see the IBM [Press Release].
In general, people agree that IBM, HP and EMC are the top three vendors in storage,with HDS, Sun and Dell rounding out the top six.
The fun begins when a respected analyst like IDC Corp. publishes their calculations,and individual vendors re-swizzle the results because they are not happy with theirfindings.
I thought it would be helpful to illustrate how this all works. First, you need to comeup with a defintion of what you are going to count. You could count units sold, revenue dollars, or capacity Terabytes, or some other generally accepted metric.
Next, you need to define what's in and what's out. For example, you can say "storage"which would include both disk drives and tape drives, both internal to servers, orexternal to servers, or you can choose a more narrow definition, say external disksystems, which might suit you better if you aren't in the tape business, and don't sell servers.
By some definitions, my Apple iPod, Motorolla cell phone, and Canon digital camera,could all be counted as external disk systems, as they all connect via USB cableto my IBM laptop, and act like a disk drive to my Windows operating system, allowingme to read and write data back and forth. It is necessary to define exactly what you plan to include,and what to exclude, based on the reported numbers available.
The last rule is that nothing gets double-counted. In our complicated industry ofmanufacturers and vendors, sometimes storage is manufactured by one company, but soldby another, typically under the vendor's brand, not the manufacturer's brand. Youcan either count manufactured units, or vendor units, but you can't mix and match.
IBM is both manufacturer and vendor. However, IDC only counts vendor units, so storagemanufactured by someone else, but sold by IBM is counted as IBM, and storage manufacturedby IBM but branded by someone else goes to that other vendor. Likewise, HP and Sun re-brandHitachi storage, and Dell re-brands EMC storage.
EMC would like to treat all EMC-manufactured storage re-branded by Dell as EMC vended storage,so that it can move up in the ratings. But Dell wants to count it too, so that it can appearin the top six. You can't have it both ways.
But are these ratings just "bragging rights"? Not always. When big purchases are planned fornew projects, or a client decides its time to throw out the current vendor and shop for a newone, the ratings could influence that decision. In that regard, IDC 4Q05 Storage Tracker reportedIBM as number one over all in storage hardware at the end of 2005, which includes both internal and external disk systems, as well as tape drives sold under the IBM brand, based on dollar revenues. By this method of counting, HP came in at number 2, EMC at number 3, and the rest round out thetop six as before.
In the end, this is just one factor when deciding which brand to choose for your storage needs.
This week and next, I am down under in Australia and New Zealand for a seven-city Storage Optimisation Breakfast series of presentations to clients and prospects. My first city for this seven-city tour was Sydney, Australia.
Here is the view from my room at the [Shangri-La hotel], including the famous [Sydney Opera House] and Circular Quay, from which to take a water taxi or ride the Manly Ferry. [Sydney harbour] is the deepest harbour in the Southern Hemisphere, allowing boats of all sizes to enter. This section of the city is known as "The Rocks".
Sydney is a very modern metropolis. The last time I was in Sydney was in May 2007 to teach an IBM Top Gun class. My post back then on [Dealing with Jet Lag] is as relevant now as it was back then. In addition to being 9 hours off-shifted from last week in Dallas, Texas, I also have to deal with the colder climate, about 40 degrees F cooler down here. The weather is crisp and clear, it is Winter going into Spring down here as the seasons are flipped below the equator.
Many of the buildings are recognizable from the movie ["The Matrix"] which was filmed here. We joked that this seven-city trip was also similar to [The Adventures of Priscilla, Queen of the Desert], in that both journeys started in Sydney. If you haven't seen the latter, I highly recommend it to get to learn more about Australia as a country.
(Completely useless trivia: Actor Hugo Weaving appeared in both movies. While most people associate him with Australia, where he has lived since 1976, he actually was born in Nigeria, and traveled extensively because his father worked in the computer industry.)
Here I am standing next to our banner.
The line-up for each event is simple. After all the attendees sit down for breakfast, we have the following three sessions:
First, Anna Wells, local IBM Executive for Storage Sales in Australia and New Zealand presents IBM's strategy for storage, and how IBM plans to address Storage Efficiency, Data Protection and Service Delivery. She then highlights various products that are currently available to help meet customer needs, including XIV and the SAN Volume Controller (SVC).
Second, we have a client or two share their success story. We will have different speakers at the different locations.
Third, I present on future trends that will impact the storage marketplace. With only 40 minutes for my section, I decided to focus on just three specific trends, with a mix of some colorful analogies to help emphasize my key points.
We had a great turn-out for our first event in Sydney, lots of clients and prospects came out for this. There is a lot of enthusiasm for IBM's vision, thought leadership, and broad portfolio of storage solutions.
When I was a kid, I used to love old spy movies where they would hide a small microchip or microfiche behind the stamp on a letter or postcard. "Yeah right," I would think to myself, "how much information could that little thing possibly hold."On their post[Bringing the "New Intelligence" Down to Earth: Intro to Semantic Web, Internet-of-Thing], My fellow IBM bloggers Jack Mason and Adam Christensen pointed me to a crazy new product called "Mir:ror" that connects to your PC or laptop.
At first, I thought it was a another product spoof, like Onion News Networks'video of the [Apple MacBook Wheel] that eliminatesthe need for a keyboard.But no, this product is real, from a company called [Violet]. The mir:ror, the internet-connected rabbits, and the tiny postage stamps called "ztamps" with embedded RFID chips that allow everything to be interconnected.I can see a lot of interesting uses for the ztamps. Squishing CD-romsor memory sticks inside presentation folders was always awkward. Butthese are small, flat and discrete. I don't know how many GBs of storage each ztamp holds, but they look cool, don't they?
Just another example of becoming a smarter planet!
Greg and 3PAR's Marc Farley did an "ambush" interview with the folks at the IBM booth at SNW, including Paula Koziol about Twitter, and [Rich Swain] about IBM's latest SONAS product. Here is their post [Storage Monkey business with IBM]:
You can learn more about SONAS from my post [More Details about IBM Clustered NAS]. SONAS is based on software that has been available since 1996, on commodity off-the-shelf server and storage systems, but building a complete system was left as an exercise to the end-user, which many of the top 500 Supercomputers have done.
Back in November 2007, IBM announced Scale-Out File Services (SoFS) which was a set of IBM Global Technical Services to build a customized solution from the software and a set of servers, disk and tape storage. Customized configurations were done for a variety of workloads from Digital Media to Scientific Research High Performance Computing (HPC). Last year, SoFS was renamed to IBM Smart Business Storage Cloud (SBSC).
This year, IBM was able to package all of the software and hardware into an easy to order machine-type model that has everything cabled and ready to use. This is what SONAS is today.
IBM Master Inventor, Senior IT Architect, and Event Content Manager
Well, it's Tuesday again, and you know what that means? IBM Announcements! We have a lot today, so I will just give you the quick highlights, and then Chris and Lloyd will follow-up with more detailed posts.
New IBM Storwize V5000 models
IBM introduces several new entry-level models.
The Storwize V5010E and V5030E are the "Express" models that allow for hybrid configurations, mixing Flash and spinning HDD disk. The Storwize V5010E is a single controller, two-canister model, with basic features. The Storwize V5030E adds more memory, more CPU power, and additional features like Data Reduction Pools and data-at-rest Encryption. Hosts can attach via SAS, 16Gb FCP or iSCSI.
The Storwize V5100 is the baby model of the FlashSystem 9100, supporting both FlashCore and industry-standard NVMe Flash drives, with the option to SAS-attach expansion drawers, mixing Flash and spinning HDD disk. The Storwize V5100F is the all-flash version. Hosts can attach via 32Gbps FCP, 25GbE RoCE, 25GbE iWarp, and iSCSI.
IBM Spectrum Virtualize for Public Cloud on Amazon Web Services (AWS)
The IBM Spectrum Virtualize for Public Cloud, or what our young folks shorten unofficially to SV4PC, that has been available on IBM Cloud will now also be available on Amazon Web Services.
For those readers asking "What took so long?" Amazon was not going to put specialized equipment in their data centers, so IBM had to make Spectrum Virtualize software container-native. Yes, the SVC code now runs in its own Docker container.
Basically a 2-node cluster is represented as two AWS EC2 instances, virtualizing EBS storage. The Transparent Cloud Tiering (TCT) that let's you "FlashCopy-to-the-Cloud" can be used to go directly to Amazon's S3 object storage.
This conversion to container-native has worked so well, IBM now plans to offer container-native software-defined storage capability across the board, for object storage, block storage, and file storage.
Did you notice that the Storwize V5100/F models support 32Gbps FCP in the section above? If that raised your eyebrow, I am pleased to tell you that IBM will be supporting 32Gbps FCP on these new Storwize V5100/F, the Storwize V7000 Gen3 and the FlashSystem 9100 devices.
We have also added a new b-type SAN switch, the SAN18B-6 which is Broadcom's Gen6 technology in a sleek 1U configuration, sporting 12 FCP ports that support 32Gbps and auto-negotiate to slower speeds as needed for compatibility with 8Gbps and 16Gbps devices. The other six ports are Ethernet, and can be used for disaster recovery replication, either using native TCP/IP or FCIP protocols.
IBM has enhanced the alerting capabilities of both the on-premise IBM Spectrum Control and its "as-a-Service" sister offering IBM Storage Insights. This allows you to set up alerts for "device groups" across multiple storage devices, as well as setting up filters to make the alerts more meaningful, eliminating some of the noise.
When IBM first introduced IBM Storage Insights, it was intended as an alternative to the on-premise solution. Now, clients demand both, so now if you have one, we can offer you the other! The new [IBM Storage Insights for IBM Spectrum Control] is an IBM Cloud service that can help you predict and prevent storage problems before they impact your business.
It is complementary to IBM Spectrum Control and is available at no additional cost if you have an active license with a current subscription and support agreement for IBM Virtual Storage Center, IBM Spectrum Storage Suite, or any edition of IBM Spectrum Control.
As an on-premises application, IBM Spectrum Control doesn't send the metadata about monitored devices offsite, which is ideal for dark shops and sites that don't want to open ports to the cloud. However, if your organization allows for communication between its network and the cloud, you can use IBM Storage Insights for IBM Spectrum Control to transform your support experience for IBM block storage.
IBM Spectrum Scale has been certified to run with with Hortonworks Data Platform (HDP) 3.1 release.
(Ha, I probably could have fit all that in the title of this section, but instead I just said "IBM Spectrum Scale" and you are thinking "Oh Boy!" and then you see something that could have fit in the title and feel all disappointed. It is kind of like when the local news asks "Was the restaurant you had lunch today contaminated with Salmonella?" and then follows up with the answer "Find out at 11:00pm evening news!" And then you wait until 11:00pm for then to say, "No, there was no salmonella found in any of the restaurants.".)
So, I would not have mentioned Spectrum Scale certification of HDP 3.1 unless there was at least something else worth mentioning. There is! IBM Spectrum Scale now also enhanced its performance for SMB and NFS, and has enhanced the scalability and resiliency of its Active File Management (AFM) feature.
The IBM FlashSystem A9000 and A9000R are targeted to Cloud and Managed Service Providers (CSP/MSP). The 12.3.2 release now supports VLAN tagging for iSCSI deployments. This VLAN tagging allows multiple virtual networks and IP addresses to share iSCSI ports, making it ideal for multi-tenancy for CSP/MSP clients.
IBM manages over a hundred Blockchain networks for its clients. For those not familiar with Blockchain, it is a way to record transactions, whenever money or product changes hands, an entry is recorded into the blockchain ledger for all to see.
This has two drawbacks. Information stored in the ledger may contain information you do not want everyone to see. The other is scalability, storing photos, and other supporting documents may be nice to have, but takes up a lot of space, and slows down transaction rates.
The solution is "off-chain" data. These are supporting documents that aren't needed in the blockchain itself. To connect them, you store a checksum hash of the supporting document in the ledge, then store the supporting document as off-chain data on-premises. If you need to produce the document for an audit, its checksum hash will match what is in the ledger.
In the beginning, people thought Docker containers would just be used for microservices with no persistent storage. Then clients realized they needed persistent storage, and they needed to orchestrate that storage provisioning. The IT industry has a variety of different orchestrators like Kubernetes, Docker Swarm, and Mesos. All of these manage persistent storage differently. IBM has focused on Kubernetes, using Ubiquity open source project to manage FlexVolumes.
Container Standard Interface (CSI) is an effort to standardize the provisioning of persistent storage. Allowing containerized applications to have access to storage that persists, even after the container is shutdown or crashes. For the next few years, I suspect IBM will need to support both the old way (FlexVolumes) and the new way (CSI) until all the standards settle.
You can hear all about these exciting announcements at the upcoming IBM Systems Technical University (TechU) in Atlanta, GA (USA), April 29-May 3. Visit [ibm.biz/Atlanta2019] to learn more and register. The three of us all plan to be there! Stop by and say hello.
IBM Senior Certified Executive IT Architect
Well, it's Tuesday again, and you know what that means? IBM Announcements!
Today I want to write about a very recent enhancement for IBM storage clients. This announcement was specific to all IBM Storage clients using IBM Spectrum Control.
IBM recently announced a new solution for existing Spectrum Control clients to obtain a cloud based version and get even more value from their Spectrum Control investment. Whether you have Spectrum Control Standard, Advanced, Select or Virtual Storage Center all versions are covered under this new solution.
In this new solution for existing clients of Spectrum Control, they are entitled to this new solution using existing Spectrum Control licensed capacity. This new solution which is a cloud based Software as a Service (SaaS) offering is titled Storage Insights for Spectrum Control. For current Spectrum Control clients this new solution is provided with no additional costs.
Since the release of Storage Insights & Storage Insights Pro existing clients with Spectrum Control have asked for a similar cloud based option. Today we have that option.
Other devices: Non-IBM storage devices, VMWare, SAN Switches
On Premises / Cloud
Asset management (Type, model, serial number, firmware)
On Premises / Cloud
Support management: Ticket creation / log upload
On Premises / Cloud
Health (show status of entity) direct / call home
On Premises / Cloud
Alerting (send mesg to user) status / thresholds / eMail / SNMP / scripts
On Premises / Cloud
Storage / Fabric Performance and Error Reporting
On Premises / Cloud
Performance Interval, retention
5 min, 24 hour
5 min, 1 year
1 min, customizable
On premises / Cloud
Provisoning using Service Classes & Capacity Pools with automatic zoning
Reclamation analysis of unused volumes
On Premises / Cloud
Service Management (Chargeback & Consumer reports)
On Premises / Cloud
Custom Reporting: GUI / API
On Premises / Cloud
Tiering support across pools
Recommend and Implement
On Premises / Cloud
Balance workload across pools
User ManagementL Active DIrectory/LDAP integration
Cloud portal SLA
If your existing Spectrum Control instance is providing your requirements then consider Storage Insights for Spectrum Control for the added value of enhance IBM Storage Support, and constantly getting access to the latest features of Storage Insights for Spectrum Control without any of the maintenance or upgrade activities.
Whatever reason you may need to reach out to IBM Storage Support, with IBM Support having immediate access to your storage configuration details will reduce the time and your teams effort to get a resolution or recommendation from IBM on how to proceed.
Dan Galvan, IBM VP of Marketing for Storage, was the next speaker. With 300 billion emails being sent per day, 4.6 billion cell phones in the world, and 26 million MRIs per year, there is going to be a huge demand for file-based storage. In fact, a recent study found that file-based storage will grow at 60 percent per year, compared to 15 percent growth for block-based storage.
Dan positioned IBM's Scale-out Network Attached Storage (SONAS) as the big "C:" drive for a company. SONAS offers a global namespace, a single point of management, with the ability to scale capacity and performance tailored for each environment.
The benefits of SONAS are great. We can consolidate dozens of smaller NAS filers, we can virtualize files across different storage pools, and increase overall efficiency.
Powering advanced genomic research to cure cancer
The next speaker was supposed to be Bill Pappas, Senior Enterprise Network Storage Architect, Research Informatics at [St. Jude Children’s Research Hospital]. Unfortunately, St. Jude is near the flooding of the Mississippi river, and he had to stay put. An IBM team was able to capture his thoughts on video that was shown on the big screen.
Thanks to the Human Genome project, St. Jude is able to cure people. They see 5700 patients per year, and have an impressive 70 percent cure rate. The first genetic scan took 10 years, now the technology allows a genome to be mapped in about a week. Having this genomic information is making vast strides in healthcare. It is the difference of fishing in a river, versus putting a wide net to catch all the fish in the Atlantic ocean all at once.
Recently, St. Jude migrated 250 TB of files from other NAS to an IBM SONAS solution. The SONAS can handle a mixed set of workloads, and allows internal movement of data from fast disk, to slower high-capacity disk, and then to tape. SONAS is one of the few storage systems that supports a blended disk-and-tape approach, which is ideal for the type of data captured by St. Jude.
IBM's own IT transformation
Pat Toole, IBM's CIO, presented the internal transformation of IBM's IT operations. He started in 2002 in the midst of IBM's effort to restructure its process and procedures. They identified four major data sources: employee data, client data, product data, and financial data. They put a focus to understand outcomes and set priorities.
The result? A 3-to-1 payback on CIO investments. This allowed IBM to go from server sprawl to consolidated pooling of resources with the right levels of integration. In 1997, IBM had 15,000 different applications running across 155 separate datacenters. Today, they have reduced this down to 4,500 applications and 7 datacenters. Their goal is to reduce down to 2,225 applications by 2015. Of these, only 250 are mission critical.
Pat's priorities today: server and storage virtualization, IT service management, cloud computing, and data-centered consolidation. IBM runs its corporate business on the following amount of data:
9 PB of block-based storage, SVC and XIV
1 PB of file-based storage, SONAS
15 PB of tape for backup and archive
Pat indicated that this environment is growing 25 percent per year, and that an additional 70-85 PB relates to other parts of the business.
By taking this focused approach, IBM was able to increase storage utilization from 50 to 90 percent, and to cut storage costs by 50 percent. This was done through thin provisioning, storage virtualization and pooling.
Looking forward to the future, Pat sees the following challenges: (a) that 120,000 IBM employees have smart phones and want to connect them to IBM's internal systems; (b) the increase in social media; and (c) the use of business analytics.
After the last session, people gathered in the "Hall of the Universe" for the evening reception, featuring food, drinks and live music. It was a great day. I got to meet several bloggers in person, and their feedback was that this was a very blogger-friendly event. Bloggers were given the same level of access as corporate executives and industry analysts.
During the break, I talked with some of the other bloggers at this event. From left to right: Stephen Foskett [Pack Rat] blog, Devang Panchigar [StorageNerve], and yours truly, Tony Pearson. (Picture courtesy of Stephen Foskett)
Meet the Experts
This next segment was a Q&A panel, with a moderator posing questions to four experts. Originally, I was scheduled to be the moderator, but this was changed to Doug Balog. The experts on the panel were:
Rich Castagna, Editorial Director for Storage Media, TechTarget. TechTarget is the group that runs the [SearchStorage] website.
Stan Zaffos, Gartner VP of Research, who spoke earlier today. I have worked with Stan for years as well, and have attended the last four Gartner Data Center Conferences held every December in Las Vegas.
Steve Duplessie, Founder and Senior Analyst, Enterprise Strategy Group (ESG). Steve's blog is titled [The Bigger Truth].
Jon clarified a statement Doug Balog said earlier in the day attributed to his study. Doug had said that 40 percent of all data should be archived. The study that Jon Toigo had done found that, on average, for the data on disk systems, about 30 percent is useful data, 40 percent is not active and could be eligible for archive, and the remaining 30 percent was crap.
The other experts introduced themselves. Rich felt that "Cloud" was still the biggest buzzword in the IT industry. Stan felt that CIOs should ask their storage administrators "What are you doing to improve my agility and efficiency". Steve felt that it was better to focus on improving process and procedures, rather than trying to deploy the best technology.
How can you best reduce backup costs per TB?
Jon- use tape.
Rich- Clean up your environment.
Stan- Don't rehydrate your deduplicated data, adopt archive approach, and revisit your backup schedules.
Steve- Deduplication covers up stupidity. No band-aids! Companies need to address the cause.
Does Backup as a Public Service for large enterprises makes sense?
Rich- Yes, especially for those with Remote Office/Branch Office (ROBO).
Stan- It depends. You should implement client-side dedupe. Get the Cloud Provider to waive telecom bandwidth charges.
Steve- Consider recovery scenarios, and try to maintain control.
Jon- "Clouds" are bulls@#$ marketing. WAN latency will pile up.
What are the top issues IT leaders should be discussing with the Storage Managers?
Stan- To ensure SLAs meet but not exceed design, to automate, and to evaluate SAN/NAS ratios.
Steve- Server virtualization is putting the spotlight on storage. Failure to implement storage virtualization is becoming the gate that slows down sever virtualization adoption.
Jon- Insist on management features from all storage vendors, try to separate feature/function from the underlying hardware layer. See IBM's [Project Zero].
Rich- Efficiency, Archiving, Thin Provisioning, Compression, Data Protection & Retention, Backup Redesign to protect endpoints like laptops and cell phones.
When does Archive eliminate Backup?
The need for protection never goes away. There are two kinds of data: "originals" and "derivatives", and two kinds of disk: "failed" and "not yet failed".
Given SATA and SAS drives, what is the future of 10K/15K RPM drives?
There is no future for these faster drives, they are going away.
What is the biggest challenge for adopting archive?
It is easy to move data out of production systems, but difficult to make these archives accessible for eDiscovery and Search. There is also concern about changing data formats. Adobe has changed the format of PDF a whopping 33 times.
This was by far the most entertaining section of the day! Hand-held devices allowed the audience to vote which answers they liked best.
Doug Balog, IBM VP and Business Level Executive for Storage, presented Smart Archiving. Citing research by Jon Toigo, Doug indicated that 40 percent of data on disk should be archived. Sadly, a vast majority of companies continue to use their backups as archives. There is a better way to do archives, to address the needs of four use cases:
The IBM Information Archive for email, files and eDiscovery offers full text indexing. A well-deployed archive strategy can save up to 60 percent in backup costs, and reduce backup times by 80 percent. IBM offers advanced analytics and visualization for archive data.
An analysis of a global insurance company found that they kept, on average, 120 copies of every email sent. This was the combination of an average of 12 copies of the email, multipled by 10 backups of the email repository.
Banjercito, a bank in Mexico, has a 10-year retention requirement from government regulations.
The new LTFS Library Edition allows Library-based access to files stored on tape cartridges. The new TS3500 Library Connector means that a single system of connected tape libraries can hold up to 2.7 Exabytes (EB) of data.
Archive Industry Perspectives
Steve Duplessie from Enterprise Strategy Group [ESG] gave his views on the challenges of volume, access and cost. His definition for archive: the long term retention of information on a separate environment for compliance, eDiscovery and business reference purposes. Steve advocates a purpose-built solutiion for archive. There are three major challenges for implementing an archive solution:
Getting Participation -- Steve feels that key stakeholders have inappropriate expectations of what archive is, or can be.
Define Tasks -- Steve argues that archive is very much a process-oriented approach, and tasks must fit business process and procedures
Prepare for Future Content Types -- the frequent change of standard and proprietary data types poses a real challenge for long term retention of data
For example, the Financial Industry Regulatory Authority [FINRA] oversee 4,000 brokerage firms, and 600,000 broker/dealers. They have mandated the storing of digital data related to stock trades, and this can include text messages, voice messages, and emails. They continue to expand this definition, so soon this could include tweets on Twitter, for example.
Steve feels there are four key requirements for archive:
Support for email, such as an email application plug-in
Off-line access to archived data
Support for mobile devices, such as smartphones
Basic search capabilities
Companies are starting to take archive seriously. About 35 percent of firms surveyed have adopted archive, and another 36 percent plan to in the next 12-24 months. Enterprise archive has grown over 200 percent from 2007 to 2009. Steve agrees that not everything needs to be stored on disk. Retention periods greater than six years dictates the need for tape.
Current systems may not meet today's requirements. Data loss and downtime costs have skyrocketed. Data Protection and Retention projects can represent a gold mine of savings, new capabilities can greatly lower costs, allowing companies to shift resources over to revenue generation.
Big Data, New Physics and Geospatial Super-Food
I would vote this the best session of the day! For all those confused on what the heck "Big Data" means, Jeff has the best explanation. Jeff Jonas is an IBM Distinguished Engineer and the Chief Scientist of Entity Analytics. He had just finished his 17th marathon on Saturday, and his fingers were bandaged.
Jeff had founded the Systems Research and Design (SR&D) company, known for creating NORA (non-obvious relationship awareness) used by Las Vegas casinos to identify fraud. SR&D was acquired by IBM back in 2005. Jeff is focused on sensemaking of streams. He feels many companies are suffering from "Enterprise Amnesia".
"The data must find the data .. and the relevance must find the user."
-- Jeff Jonas
Jeff's metaphor to Big Data is a jigsaw puzzle without the picture on the outside of the box. To demonstrate his point, he presented a pile of jigsaw puzzle pieces and asked four teenagers to put the puzzle together without the advantage of the picture on the box. What he had not told them was that he mixed four different puzzles together, removing out 10 to 20 percent of the pieces from each puzzle. He also added some duplicate pieces from a second identical puzzle, and just to make things fun, included a dozen pieces from a sixth puzzle just to mess with their heads. Within a few hours, the kids had managed to figure out that there were four puzzles, that there were duplicate pieces, and that there were some pieces that did not fit any of the four puzzles.
"You can't squeeze knowledge from a pixel."
-- Jeff Jonas
This approach favors false negatives. New observations reverse out old conceptions. As the picture emerges, this provides added focus on new information. More data can provide better predictions. "Bad" data, including misspelled words and mis-coded categories, was often discarded or corrected on the basis of "Garbage-In, Garbage Out", but can now be useful in a Big Data perspective.
Take for example the 600 billion recordings of the "location data" captured on cell phones every day. With regular triangulation of cell phone towers, the information can pinpoint you within 60 meters, add GPS and this improved to within 20 meters, and add Wi-Fi is further improved to 10 meters. While this data is "de-identified" so as not to identify individual users, the process of re-identification is relatively trivial. Jeff's system is able to predict a person will be next Thursday at 5:35pm with 87 percent accuracy.
Thus, Big Data represents an asset, accumulation of context. Real-time analytics can be a competitive advantage. These streams of data will need persistent storage and massive I/O capabilities. In one example, Jeff processed 4,200 separate sources of information and was able to identify "dead votes". These are votes cast by people that died in years prior, indicating voter fraud.
Jeff's latest project, codenamed G2, will tackle not just people, but everything from proteins to asteroids.
Normally, the worst time slot is the hour after lunch, but these presentations kept people's attention.
Down the street, in Times Square, IBM made it on the big board.
Continuous Data Availability
Jeanine Cotter, IBM VP for Data Center Services, started out with a video about Sabre. IBM developed this revolutionary airline reservation system to handle the huge volume of transactions. Today, 18 percent of organizations consider downtime unacceptable for their tier-1 applications, and 53 percent would be seriously impacted by an outage lasting an hour or more.
Eventually, companies cross the "Continuous Availability" threshold, the point where they discover that the possibility of downtime is too costly to ignore. IBM has clients using 3-site Metro/Global Mirror that can fail-over an entire data center in just five mouse clicks.
Jeanine also mentioned Euronics, which is using SAN Volume Controller's Stretched Cluster capability, which allows them to easily vMotion virtual guest images from one data center to another. SVC has had this capability for a while, but now, with full VMcenter plug-in and VAAI support, the capability is fully integrated with VMware.
A final example was a mid-sized University, they are using IBM Storwize V7000 with Metro Mirror. The primary location's Storwize V7000 manages Solid-state drives with Easy Tier. The secondary location's Storwize V7000 has high-capacity SATA drives and FlashCopy.
Customer Testimonial - University of Rochester Medical Center
Rick Haverty, Director of IT infrastructure at University of Rochester Medical Center [URMC] provided the next client testimonial. The mission of the URMC is to use science, education and technology to improve health. URMC gets over $400 million USD in NIH grants, which puts them around 23rd largest University-based academic medical centers in the country. They have over 900 doctors, general practice and specialists.
URMC has an IBM BlueGene supercomputer, a Cisco network over 45,000 ports, and over 7.5 million square feet of Wi-Fi wireless internet coverage. They have three datacenters. The first is 7500 square feet, the second is 6000 square feet, and the third is just 800 square feet to hold their "off-site tapes".
URMC has digitized all of their records, including Electronic Medical Records (EMR) system, medical dosage history, imaging "priors", calibration of infusion pumps, RFID monitoring, and even provide IT support while the patient is on the operating table. RFID monitoring ensures all of the refrigerators are keeping medications at the right temperature. A single failed refrigerator can lose $20,000 dollars worth of medication.
When is a good time for downtime? At URMC, they handle 90,000 Emergency Room vists per year, so the answer is never. When is the ER busiest? Monday morning. (not what I expected!)
URMC's EMR software (Epic) runs on clustered POWER7 servers, with DS8700 disk systems using Metro Mirror to secondary location. They also keep a third "shadow" POWER7 for read-only purposes, and a separate system that provides web-based read-only access. Finally, they have 90 stand-alone Personal Computers (PCs) that contain information for all the patients that have reservations this week, just in case all the other systems fail.
The exploding volume of data comes from medical imaging. For radiology (X-rays), each image is called a "study" takes 20-30 MB each, and they have 650,000 studies per year. This represents about 16TB storage per year, with 3 second response time access. These must be kept for 7 years since last view, or until the patient reaches the age of 18 years old, which ever is later.
But radiology is just one discipline. Healthcare has a whole bunch of "ologies". Another is "Pathology" which looks at cells between glass slides in a microscope. Each study consumes 10-20GB, and URMC does about 100,000 pathology studies per year, representing 150TB per year.
URMC has identified that they have 42 mission-critical applications. The data for these are stored on DS8000, XIV, Storwize V7000 and DS5000, all managed behind SAN Volume Controller.
During lunch, people were able to take a look at our solutions. Here are Dan Thompson and Brett Cooper striking a pose.
Hyper-Efficient Backup and Recovery
The afternoon was kicked off by Dr. Daniel Sabbah, IBM General Manager of Tivoli software. He started with some shocking statistics: 42 percent of small companies have experienced data loss, 32 percent have lost data forever. IBM has a solution that offers "Unified Recovery Management". This involves a combination of periodic backups, frequent snapshots, and remote mirroring.
IBM Tivoli Storage Manager (TSM) was introduced in 1993, and was the first backup software solution to support backup to disk storage pools. Today, TSM is now also part of Cloud Computing services, including IBM Information Protection Services. IBM announced today a new bundle called IBM Storwize Rapid Application Backup, which combines IBM Storwize V7000 midrange disk system, Tivoli FlashCopy Manager, implementation services, with a full three-year hardware and software warranty. This could be used, for example, to protect a Microsoft Exchange email system with 9000 mailboxes.
IBM also announced that its TS7600 ProtecTIER data deduplication solutions have been enhanced to support many-to-many bi-direction remote mirroring. Last year, University of Pittsburgh Medical Center (UPMC) reported that they were average 24x data deduplication factor in their environment using IBM ProtecTIER.
"You are out of your mind if you think you can live without tape!"
-- Dick Crosby, Director of System Administration, Estes
The new IBM TS1140 enterprise class tape drive process 2.3 TB per hour, and provides a density of 1.2 PB per square foot. The new 3599 tape media can hold 4TB of data uncompressed, which could hold up to 10TB at a 2.5x compression ratio.
The United States Golfers Association [USGA] uses IBM's backup cloud, which manages over 100PB of data from 750 locations across five continents.
Customer Testimonial - Graybar
Randy Miller, Manager of Technical System Administration at Graybar, provided the next client testimonial. Graybar is an employee-owned company focused on supply-chain management, serving as a distributor for electical, lighting, security, power and cooling equipment.
Their problem was that they had 240 different locations, and expecting local staff to handle tape backups was not working out well. They centralized their backups to their main data center. In the event that a system fails in one of their many remote locations, they can rebuild a new machine at their main data center across high-speed LAN, and then ship overnight to the remote location. The result, the remote location has a system up and running by 10:30am, faster than they would have had from local staff trying to figure out how to recover from tape. In effect, Graybar had implemented a "private cloud" for backup in the 1990s, long before the concept was "cool" or "popular".
In 2001, they had an 18TB SAP ERP application data repository. To back this up, they took it down for 1 minute per day, six days a week, and 15 minutes down on Sundays. The result was less than 99.8 percent availability. To fix this, they switched to XIV, and use Snapshots that are non-disruptive and do not impact application performance.
Over 85 percent of the servers at Graybar are virtualized.
Their next challenge is Disaster Recovery. Currently, they have two datacenters, one in St. Louis and the other in Kansas City. However, in the aftermath of Japan's earthquakes, they realize there is a nuclear power plan between their two locations, so a single incident could impact both data centers. They are working with IBM, their trusted advisors, to investigate a three-site solution.
This week, May 15-22, I am in Auckland, New Zealand teaching IBM Storage Top Gun sales class. Next week, I will be in Sydney, Australia.