Continuing my week in Chicago for the IBM Storage Symposium 2009, I attended two presentations on XIV.
XIV Storage - Best Practices
Izhar Sharon, IBM Technical Sales Specialist for XIV, presented best practices using XIV in various environments. He started out explaining the innovative XIV architecture: a SATA-based disk system from IBM can outperform FC-based disk systems from other vendors using massive parallelism. He used a sports analogy:
"The men's world record for running 800 meters was set in 1997 by Wilson Kipketer of Denmark in a time of 1:41.11.
However, if you have eight men running, 100 meters each, they will all cross the finish line in about 10 seconds."
Since XIV is already self-tuning, what kind of best practices are left to present? Izhar presented best practices for software, hosts, switches and storage virtualization products that attach to the XIV. Here are some quick points:
Use as many paths as possible.
IBM does not require you to purchase and install multipathing software as other competitors might. Instead, the XIV relies on multipathing capabilities inherent to each operating system. For multipathing preference, choose Round-Robin, which is now available on AIX and VMware vSphere 4.0, for example. Otherwise, fixed-path is preferred over most-recently-used (MRU).
Encourage parallel I/O requests.
XIV architecture does not subscribe to the outdated notion of a "global cache". Instead, the cache is distributed across the modules to reduce performance bottlenecks. Each HBA on the XIV can handle about 1400 requests. If you have fewer than 1400 hosts attached to the XIV, you can further increase parallel I/O requests by specifying a large queue depth in the host bus adapter (HBA). An HBA queue depth of 64 is a good start. Additional settings might be required in the BIOS, operating system or application for multiple threads and processes.
For sequential workloads, select a host stripe size less than 1MB. For random workloads, select a host stripe size larger than 1MB. Set rr_min_io between ten (10) and the queue depth (typically 64); setting it to half of the queue depth is a good starting point (see the short sketch below).
If you have long-running batch jobs, consider breaking them up into smaller steps and running them in parallel.
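To make the queue-depth and rr_min_io rules of thumb concrete, here is a minimal sketch in Python (my own illustration, not an IBM-supplied tool) that derives starting-point values from the HBA queue depth:

    # Rule of thumb from above: start rr_min_io at half the HBA queue depth,
    # keeping it between 10 and the queue depth itself.
    def suggested_host_settings(queue_depth=64):
        rr_min_io = max(10, min(queue_depth, queue_depth // 2))
        return {"queue_depth": queue_depth, "rr_min_io": rr_min_io}

    print(suggested_host_settings())      # {'queue_depth': 64, 'rr_min_io': 32}
    print(suggested_host_settings(16))    # {'queue_depth': 16, 'rr_min_io': 10}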
Define fewer, larger LUNs
Generally, you no longer need to define many small LUNs, a practice that was often required on traditional disk systems. This means that you can now define just 1 or 2 LUNs per application, and greatly simplify management. If your application must have multiple LUNs in order to do multiple threads or concurrent I/O requests, then, by all means, define multiple LUNs.
Modern Database Management Systems (DBMS) like DB2 and Oracle already parallelize their I/O requests, so there is no need for host-based striping across many logical volumes. XIV already stripes the data for you. If you use Oracle Automated Storage Management (ASM), use 8MB to 16MB extent sizes for optimal performance.
For those virtualizing XIV with SAN Volume Controller (SVC), define managed disks as 1632GB LUNs, in multiples of six LUNs per managed disk group (MDG), to balance across the six interface modules. Define the SVC extent size as 1GB.
XIV is ideal for VMware. Create big LUNs for your VMFS that you can access via FCP or iSCSI.
Organize data to simplify Snapshots.
You no longer need to separate logs from databases for performance reasons. However, for some backup products like IBM Tivoli Storage Manager (TSM) for Advanced Copy Services (ACS), you might want to keep them separate for snapshot reasons. Generally, putting all data for an application on one big LUN greatly simplifies administration and snapshot processing, without losing performance. If you define multiple LUNs for an application, simply put them into the same "consistency group" so that they are all snapshot together.
OS boot image disks can be snapshot before applying any patches, updates or application software, so that if there are any problems, you can reboot to the previous image.
Employ sizing tools to plan for capacity and performance.
The SAP Quicksizer tool can be used for new SAP deployments, employing either the user-based or throughput-based sizing model approach. The result is in a mythical unit called "SAPS", which represents 0.4 IOPS for ERP/OLTP workloads, and 0.6 IOPS for BI/BW and OLAP workloads.
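As a quick worked example of that conversion (a sketch of my own, using the 0.4 and 0.6 ratios above):

    # Convert an SAP Quicksizer result (in SAPS) to a rough IOPS estimate.
    IOPS_PER_SAPS = {"ERP/OLTP": 0.4, "BI/BW-OLAP": 0.6}

    def saps_to_iops(saps, workload="ERP/OLTP"):
        return saps * IOPS_PER_SAPS[workload]

    print(saps_to_iops(20000))                  # 8000.0 IOPS for a 20,000 SAPS ERP sizing
    print(saps_to_iops(20000, "BI/BW-OLAP"))    # 12000.0 IOPS for the same SAPS on BI/BW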
If you already have SAP or other applications running, use actual I/O measurements. IBM Business Partners and field technical sales specialists have an updated version of Disk Magic that can help size XIV configurations from PERFMON and iostat figures.
Lee La Frese, IBM STSM for Enterprise Storage Performance Engineering, presented internal lab test results for the XIV under various workloads, based on the latest hardware/software levels [announced two weeks ago]. Three workloads were tested:
Web 2.0 (80/20/40) - 80 percent READ, 20 percent WRITE, 40 percent cache hits for READ. YouTube, FlickR, and the growing list at [GoWeb20] are applications with heavy read activity, but because of [long-tail effects], may not be as cache friendly.
Social Networking (50/50/50) - 50 percent READ, 50 percent WRITE, 50 percent cache hits for READ. Lotus Connections, Microsoft SharePoint, and many other [social networking] applications are more write intensive.
Database (70/30/50) - 70 percent READ, 30 percent WRITE, 50 percent cache hits for READ. The traditional workload characteristics for most business applications, especially databases like DB2 and Oracle on Linux, UNIX and Windows servers.
The results were quite impressive. There was more than enough performance for tier 2 application workloads, and most tier 1 applications. The performance was nearly linear from the smallest 6-module to the largest 15-module configuration. Some key points:
A full 15-module XIV overwhelms a single SVC 8F4 node-pair. For a full XIV, consider 4 to 8 nodes of the 8F4 model, or 2 to 4 nodes of the 8G4. For read-intensive cache-friendly workloads, an SVC in front of XIV was able to deliver over 300,000 IOPS.
A single-node TS7650G ProtecTIER can handle 6 to 9 XIV modules. Two nodes of TS7650G were needed to drive a full 15-module XIV. A single-node TS7650G in front of XIV was able to ingest 680 MB/sec on the seventh day of a 17 percent per-day change-rate test workload using 64 virtual drives. Reading the data back got over 950 MB/sec.
For SAP environments where response times of 20-30 msec are acceptable, the 15-module XIV delivered over 60,000 IOPS. Reducing this down to 25,000-30,000 IOPS cut the response time to a faster 10-15 msec.
These were all done as internal lab tests. Your mileage may vary.
Not surprisingly, XIV was quite the popular topic here this week at the Storage Symposium. There were many more sessions, but these were the only two that I attended.
Continuing my week in Chicago for the IBM Storage Symposium 2009, I attended what in my opinion was the best session of the week. This was by a guy named Chip Copper, who covered IBM's set of Ethernet and Fibre Channel networking gear. The attributes to consider are the four P's:
Power and Cooling (electricity usage)
Equipment comes in two flavors: Top-of-Rack (ToR) thin pizza-box switches, and Middle-of-Row (MoR) much larger directors. The MoR directors are engineered for up to 50Gbps per half-slot, so 10GbE and the future 40GbE can be easily accommodated in a single half-slot, and the future 100GbE can be done with a full slot (two half-slots).
While many companies might have been contemplating the switch from copper wires to optical fiber, there is a new reason for copper cables: Power-over-Ethernet (PoE). Many IP-phones, digital video surveillance cameras, and other equipment can have a single cable that delivers both signal and electricity over copper. If you have already deployed optical fiber throughout the building, there are "last mile" options where the signals are converted to copper wires and electrical energy added for these types of devices.
Two directors can be connected together with Inter-Chassis Link (ICL) cables to make them look like a single director with twice the number of ports. These are different from Inter-Switch Links (ISL), as they are not counted as an extra "hop" for network hop-counting purposes, which is especially important for FICON usage.
Today, we have 1Gbps, 2Gbps, 4Gbps and 8Gbps Fibre Channel. Since these all use 10-for-8 encoding (10 bits represent one 8-bit byte), it was easy to calculate throughput: 8Gbps was 800 MB/sec, for example. Auto-negotiation between speeds is not done at the HBA card, switch or director blade itself, but in the Small Form-factor Pluggable (SFP) optical connector. However, you can only auto-negotiate if the encoding matches. The 4/2/1 SFP can run at 4Gbps or auto-negotiate to slower 2Gbps and 1Gbps. The 8/4/2 SFP can run at 8Gbps, or auto-negotiate down to slower 4Gbps and 2Gbps. Folks who still have legacy 1Gbps equipment, but want to run some things at 8Gbps, can buy 8Gbps-capable switches or director blades, but then put some 4/2/1 SFPs into them. These 4/2/1 SFPs are cheaper, so this might be something to consider if budgets are tight. Some SFPs handle up to 10km distances, but others only 4km, so be careful not to order the wrong ones.
Unfortunately, there are proposals in place for 10Gbps and 40Gbps that would use a different 66-for-64 encoding (66 bits represent 64 bits, or 8 bytes), so 10Gbps would be 1200 MB/sec. These are used today for ISLs between directors and switches. In theory, the 40Gbps could auto-negotiate down to 10Gbps, but not to any of the 8/4/2/1 Gbps speeds that use the different 10-for-8 encoding.
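To illustrate the encoding arithmetic, here is a simple sketch (my own, using the nominal line rates cited above):

    # Usable throughput = nominal line rate x encoding efficiency, divided by 8 bits/byte.
    def payload_mb_per_sec(line_rate_gbps, data_bits, total_bits):
        return line_rate_gbps * (data_bits / total_bits) * 1000 / 8

    print(payload_mb_per_sec(8, 8, 10))     # 800.0 -> 8Gbps FC with 10-for-8 encoding
    print(payload_mb_per_sec(4, 8, 10))     # 400.0 -> 4Gbps FC
    print(payload_mb_per_sec(10, 64, 66))   # ~1212 -> 10Gbps with 66-for-64 encoding, roughly the 1200 MB/sec cited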
For those who cannot afford a SAN768B, there is a smaller SAN384B that can carry: 192 ports (4Gbps/2Gbps), 128 ports (8Gbps) or 24 ports (10Gbps). The SAN384B can be ICL-connected to another SAN384B or even the SAN768B as your needs grow.
On the entry-level side, the SAN24B-4 offers a feature called "Access Gateway". This makes the SAN24B look like a SAN end-point host, rather than a switch, and makes initial deployment of integrated bundled solutions easier. Once connected to everything, you can convert it over to full "switch" mode. The SAN40B-4 and SAN80B-4 provide midrange-level support, including Fibre Channel routing at the 8Gbps level. In fact, all 8Gbps ports include routing capability. IBM offers both single-port and dual-port 8Gbps host bus adapter (HBA) cards to connect to these switches. These HBAs offer 16 virtual channels per port, so that if you have VMware running many guests, or want to connect both disk and tape to the same HBA, you can keep the channel traffic separate for Quality of Service (QoS).
Chip wrapped up his session discussing Fibre Channel over Ethernet (FCoE), and explained why we need a loss-less Convergence Enhanced Ethernet (CEE) to meet the needs of storage traffic as well as traditional Fibre Channel does today. IBM offers all of the equipment you need to get started today on this FCoCEE, with Converged Network Adapter (CNA) cards for your System x servers, and a new SAN32B that has 24 10GbE CEE ports and 8 traditional 8Gbps FC ports. This means that you can put the CNA card in your existing servers, connect to this switch, and then connect to your existing 10GbE LAN and your existing 8Gbps or 4Gbps FC-based SAN to the rest of your storage devices.
Worried that the FCoE or CEE standards could change after you deploy this gear? Aren't most LAN and SAN switches based on Application-Specific Integrated Circuit [ASIC] chips that are created in the factory? Don't worry: IBM's equipment puts all the standards-vulnerable portions of the logic into a separate Field-Programmable Gate Array [FPGA] that can be updated with simply a firmware upgrade. This is future-proofing I can agree with!
Continuing my week in Chicago, I decided to attend some of the presentations from the System x side. This is the advantage of running both conferences in the same hotel: attendees can choose how many of each they want to participate in.
Wayne Wigley, IBM Advanced Technical Support (ATS), delivered a series of presentations on different server virtualization offerings available for System x and BladeCenter servers. I am very familiar with virtualization implemented on System z mainframes, as well as IBM's POWER systems, and have working knowledge of Linux KVM and Xen, so I was well prepared to hear the latest about Microsoft's Hyper-V and VMware's vSphere version 4.
Microsoft Hyper-V 2008
Hyper-V can run as part of Windows 2008, or standalone on its own. Different levels of Windows 2008 include licenses for different numbers of Windows virtual machines (VMs). Windows Server 2008 Standard includes 1 Windows VM, Enterprise includes 4 Windows VMs, and the Datacenter edition includes an unlimited number of Windows VMs. If you want to run more Windows VMs than come included, you need to pay extra for each additional one. For example, to run 10 Windows VMs on a 2-socket server would cost about $9000 US dollars on Standard but only $6000 US dollars on Datacenter edition (list prices from the Microsoft Web site).
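Here is a hedged reconstruction of that arithmetic in Python; the per-license prices are round-number assumptions of mine for illustration, not quoted Microsoft figures:

    # Assumed 2008-era list prices, for illustration only:
    # roughly $1,000 per additional Windows VM license on Standard,
    # and roughly $3,000 per processor socket for Datacenter (unlimited VMs).
    vms_needed = 10
    sockets = 2
    included_on_standard = 1                     # Standard edition includes 1 Windows VM

    standard_cost = (vms_needed - included_on_standard) * 1000   # 9 x $1,000 = $9,000
    datacenter_cost = sockets * 3000             # 2 sockets x $3,000 = $6,000
    print(standard_cost, datacenter_cost)        # 9000 6000 -> matches the figures cited above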
Unlike VMware, which takes a monolithic approach to its hypervisor, Hyper-V is more like Xen with a microkernelized approach. This means you need a "parent" guest OS image, and the rest of the guest OS images are then considered "child" images. These child images can be various levels of Windows, from Windows XP Pro to Windows Server 2008, Xen-enabled Linux, or even a non-hypervisor-aware OS. The "parent" guest OS image provides networking and storage I/O services to these "child" images. For the hypervisor-aware versions of Windows and Linux, Hyper-V allows optimized access to the hypervisor, "synthetic devices", and hypercalls. Synthetic devices present themselves as network devices, but only serve to pass data along the VMBus to other networking resources. This process does not require software emulation, and therefore offers higher performance for virtual machines and lower host system overhead. For non-hypervisor-aware OS images, Hyper-V provides device emulation through the "parent" image, which is slower.
Microsoft System Center Virtual Machine Manager (SCVMM) can manage both Hyper-V and VMware VI3 images. Wayne showed various screen shots of the GUI available to manage Hyper-V images. In standalone mode, you lose the nice GUI and management console.
Hyper-V supports external, internal and private virtual LANs (VLAN). External means that VMs can communicate with the outside world over standard Ethernet connections. Internal means that VMs can communicate with "parent" and "child" guest images on the same server only. Private means that only "child" guests can communicate with other "child" images.
Hyper-V supports disks attached via IDE, SATA, SCSI, SAS, FC, iSCSI, NFS and CIFS. One mode is "Virtual Hard Disk" (VHD), similar to VMware VMDK files. The other is "pass through" mode, which uses actual disk LUNs accessed natively. VHDs can be dynamic (thin provisioned), fixed (fully allocated), or differencing. The concept of differencing is interesting, as you start with a base read-only VHD volume image, and have a separate "delta" file that contains changes from the base image.
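The differencing idea is easy to model. Here is a toy sketch in Python (my own illustration of the concept, not Microsoft's on-disk VHD format): writes land only in the child's delta, and reads fall back to the read-only base.

    class DifferencingDisk:
        def __init__(self, base):
            self.base = base      # read-only parent image: block number -> data
            self.delta = {}       # child image: only the blocks written since the base

        def write(self, block, data):
            self.delta[block] = data              # the base image is never modified

        def read(self, block):
            # Serve the changed block if present, otherwise fall back to the base.
            return self.delta.get(block, self.base.get(block, b"\x00" * 512))

    disk = DifferencingDisk(base={0: b"boot", 1: b"os"})
    disk.write(1, b"patched")
    print(disk.read(0), disk.read(1))   # b'boot' b'patched'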
Some of the key features of Hyper-V 2008 are:
Being able to run concurrently 32-bit and 64-bit versions of Linux and Windows guest images
Support for 64 GB of memory and 4-way symmetric multiprocessing (SMP) per VM
Clustering for High Availability and Quick Migration of VM images
Live backup with integration with Microsoft's Volume Shadow Copy Services (VSS)
Virtual LAN (VLAN) support, and Virtual and Pass-through physical disk support
A clever VMbus, virtual service parent/client approach to sharing hardware
Optimized performance options for hypervisor-aware versions of Windows and Linux, and emulated support for non-hypervisor-aware OS images.
VMware vSphere v4.0
This was titled as an "Overview" session, but really was an "Update" session on the newest features of this release. The big change appears to be that VMware added "v" in front of everything.
Under vCompute, there are some new features in VMware's Distributed Resource Scheduler (DRS), which includes recommended VM migrations. Distributed Power Management (DPM) will move VMs during periods of low usage to consolidate onto fewer physical servers so as to reduce energy consumption.
Under vStorage, vSphere introduces an enhanced Pluggable Storage Architecture (PSA), with support for Storage Array Type Plugins (SATP) and Path Selection Plugins (PSP). This vStorage API allows for third-party plugins for improved fault-tolerance and complex I/O load balancing algorithms. This release also has improved support for iSCSI, including Challenge-Handshake Authentication Protocol (CHAP) support. Similar to Hyper-V's dynamic VHD, VMware supports "thin provisioning" for their virtual disk VMDK files. A feature of "Storage VMotion" allows conversion between "thick" and "thin" provisioning formats.
The vStorage API for Data Protection provides all the features of VMware Consolidated Backup (VCB). The API provides full, incremental and differential file-level backups for Windows and Linux guests, including support for snapshots and Volume Shadow Copy Services (VSS) quiescing.
VMware introduces direct I/O pass-through for both NIC and HBA devices. While this allows direct access to SAN-attached LUNs similar to Hyper-V, you lose a lot of features like VMotion, High Availability and Fault Tolerance. Wayne felt that these restrictions are temporary, and hoped VMware will resolve them over the next 12 months.
Under vNetwork, VMware has virtual LAN switches called vSwitches. This includes support for IPv6 and VLAN offloading.
The vSphere server can now run with up to 1TB of RAM and 64 logical CPUs to support up to 320 VM guest images. Each VM can have up to 255GB RAM and up to 8-way SMP. vSphere ESX 4 introduces a new virtual hardware platform called VM Hardware v7. While vSphere 4.0 can run VMs from ESX 2 and ESX 3, the problem is that new VMs based on this newer VM Hardware v7 cannot run on older ESX versions.
vSphere comes in four editions: Standard, Advanced, Enterprise, and Enterprise Plus, ranging in list price from $795 US dollars to $3495 US dollars.
While IBM is the #1 reseller of VMware, we are also proud to support Hyper-V, Xen, KVM and other similar products. Analysts expect most companies will have two or more server virtualization solutions in their data center, and it is good to see that IBM supports them all.
Continuing my week in Chicago for the IBM Storage and Storage Networking Symposium and System x and BladeCenter Technical Conference, I presented a variety of topics.
Hybrid Storage for a Green Data Center
The cost of power and cooling has risen to become a top concern among data centers. I presented the following hybrid storage solutions that combine disk with tape. These provide the best of both worlds: the high-performance access time of disk with the lower costs and reduced energy consumption of tape.
IBM [System Storage DR550] - IBM's Non-erasable, Non-rewriteable (NENR) storage for archive and compliance data retention
IBM Grid Medical Archive Solution [GMAS] - IBM's multi-site grid storage for PACS applications and electronic medical records [EMR]
IBM Scale-out File Services [SoFS] - IBM's scalable NAS solution that combines a global namespace with a clustered GPFS file system, serving as the ideal basis for IBM's own [Cloud Computing and Storage] offerings
Not only do these help reduce energy costs, they provide an overall lower total cost of ownership (TCO) than traditional WORM optical or disk-only storage configurations.
The Convergence of Networks - Understanding SAN, NAS and iSCSI in the Data Center Network
This turned out to be my most popular session. Many companies are at a crossroads in choosing data and storage networking solutions in light of recent announcements from IBM and others. In the span of 75 minutes, I covered:
Block storage concepts, storage virtualization and RAID levels
File system concepts, how file systems map files to block storage
Network Attached Storage, the history of the NFS and CIFS protocols, Pros and Cons of using NAS
Storage Area Networks, the history of SAN protocols including ESCON, FICON and FCP, Pros and Cons of using SAN
IP SAN technologies, iSCSI and Fibre Channel over Ethernet (FCoE), Pros and Cons of using this approach
Network Convergence with Infiniband and Fibre Channel over Convergence Enhanced Ethernet (FCoCEE), why Infiniband was not adopted historically in the marketplace as a storage protocol, and the features and enhancements of Convergence Enhanced Ethernet (CEE) needed to merge NAS, SAN and iSCSI traffic onto a single converged data center network [DCN]
Yes, it was a lot of information to cover, but I managed to get it done on time.
IBM Tivoli Storage Productivity Center version 4.1 Overview and Update
In conferences like these, there are two types of product-level presentations. An "Overview" explains how a product works today to those who are not familiar with it. An "Update" explains what's new in this version of the product for those who are already familiar with previous releases. I decided to combine these into one session for IBM's new version of [Tivoli Storage Productivity Center]. I was one of the original lead architects of this product many years ago, and was able to share many personal experiences about its evolution in development and in the field at client facilities. Analysts have repeatedly rated IBM Productivity Center as one of the top Storage Resource Management (SRM) tools available in the marketplace.
Information Lifecycle Management (ILM) Overview
Can you believe I have been doing ILM since 1986? I was the lead architect for DFSMS, which provides ILM support for z/OS mainframes. In 2003-2005, I spent 18 months in the field performing ILM assessments for clients, and now there are dozens of IBM practitioners in Global Technology Services and STG Lab Services that do this full time. This is a topic I cover frequently at the IBM Executive Briefing Center [EBC], because it addresses several top business challenges:
Reducing costs and simplifying management
Improving efficiency of personnel and application workloads
Managing risks and regulatory compliance
IBM has a solution based on five "entry points". The advantage of this approach is that it allows our consultants to craft the right solution to meet the specific requirements of each client situation. These entry points are:
Tiered Information Infrastructure - we don't limit ourselves to just "Tiered Storage", as storage is only part of a complete [information infrastructure] of servers, networks and storage
Storage Optimization and Virtualization - including virtual disk, virtual tape and virtual file solutions
Process Enhancement and Automation - an important part of ILM is the policies and procedures, such as IT Infrastructure Library [ITIL] best practices
Archive and Retention - space management and data retention solutions for email, database and file systems
I did not get as many attendees as I had hoped for this last one, as I was competing head-to-head in the same time slot as Lee La Frese covering IBM's DS8000 performance with Solid State Disk (SSD) drives, John Sing covering Cloud Computing and Storage with SoFS, and Eric Kern covering IBM Cloudburst.
I am glad that I was able to make all of my presentations at the beginning of the week, so that I can then sit back and enjoy the rest of the sessions as a pure attendee.
This week I am in Chicago for the IBM Storage and Storage Networking Symposium, which coincides with the System x and BladeCenter Technical Conference. This allows the 800 attendees to attend either storage or server presentations at their convenience. There were hundreds of sessions across over 20 time slots, so for each time slot, you have 15 or so topics to choose from. Mike Kuhn kicked off the series of keynote sessions. Here's my quick recap of each one:
Curtis Tearte, General Manager, IBM System Storage
Curtis replaced Andy Monshaw as General Manager for IBM System Storage. His presentation focused on how storage fits into IBM's Dynamic Infrastructure strategy. Some interesting points:
A billion camera-enabled cell phones were sold in 2007, compared to 450 million in 2006.
IBM expects that there will be 2 billion internet users by 2011, as well as trillions of "things".
In the US, there were 2.2 million medical pharmacy dispensing errors resulting from handwritten prescriptions.
Time wasted looking for parking spaces in Los Angeles consumed 47,000 gallons of gasoline, and generated 730 tons of carbon dioxide.
In the US, 4.2 billion hours are lost, and 2.9 billion gallons of gas consumed, due to traffic congestion.
Over the past decade, servers went from 8 watts to 100 watts per $1000 US dollars.
Data growth appears immune to the economic recession. The digital footprint per person is expected to grow from 1TB today to over 15TB by 2020.
10 hours of YouTube videos are uploaded every minute.
Bank of China manages 380 million bank accounts, processing over 10,000 transactions per second.
At the end of the session, Curtis transitioned from demonstrating his knowledge and passion for storage to his knowledge and passion for his favorite sport: baseball. Chicago is home to both the Cubs and the White Sox.
Roland Hagan, Vice President Business Line Executive, System x
IBM sets the infrastructure agenda for the entire industry. The Dynamic Infrastructure initiative is not just IT, but a complete end-to-end view across all of the infrastructures in play, including transportation, manufacturing, services and facilities. Companies spent over $60 billion US dollars on servers last year. Of this, 53 percent was for x86-based servers, 9 percent for Itanium-based, 26 percent for RISC-based (POWER6, SPARC, etc.), and 11 percent for mainframes. The economic downturn has impacted revenues, but the percentages continue about the same.
The dominant deployment model remains one application per server. As a result, power, cooling and management costs have grown tremendously. There are system admins opposed to consolidating server images with VMware, Hyper-V, Xen or other server virtualization technologies. Roland referred to these admins as "server huggers". To help clients adopt cloud computing technologies, IBM introduced [Cloudburst] appliances. IBM plans to offer specialized versions for developers, for service providers, and for enterprises.
IBM's Enterprise X-Architecture is what differentiates IBM's x86-based servers from all the competitors, surrounding Intel and AMD processors with technology that provides distinct advantages. For example, to support server virtualization, IBM's eX4 provides support for more memory, which often is more critical than CPU resources when deploying large numbers of guest OS images. IBM System x servers have an integrated management module (IMM) and were the first to change over from BIOS to the new Unified Extensible Firmware Interface [UEFI] standard.
IBM servers offer double the performance, consume half the power, and cost a third less to manage than comparably priced servers from competitors. Of the top 20 most energy-efficient server deployments, 19 are from IBM. Roland cited customer reference SciNet, a 4000-server supercomputer with 30,000 cores based on IBM [iDataPlex] servers. At 350 TeraFLOPs, it is ranked the #16 fastest supercomputer in the world, and #1 in Canada. With a power usage effectiveness (PUE) of less than 1.2, it also is very energy efficient. This means that for every 12 watts of electricity going into the data center, 10 watts are used for servers, storage and networking gear, and only 2 watts are used for power distribution and cooling. Traditional data centers have a PUE around 2.5, consuming 25 watts total for every 10 watts used by servers, storage and networking gear.
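To unpack the PUE arithmetic, here is a one-line formula in Python (my own sketch; the watt figures are the ones Roland cited):

    # PUE = total power entering the data center / power reaching the IT equipment.
    def pue(total_facility_watts, it_equipment_watts):
        return total_facility_watts / it_equipment_watts

    print(pue(12, 10))   # 1.2 -> SciNet: only 2 of every 12 watts go to power and cooling
    print(pue(25, 10))   # 2.5 -> a traditional data center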
Clod Barrera, Distinguished Engineer, Chief Technical Strategist for IBM System Storage
Clod presented trends and directions for disk and tape technology, disk and tape systems, and the direction towards cloud computing.
And, it's not too late to sign up for IBM Tivoli's [Pulse 2008] conference that will be held in Orlando, Florida, May 18-22, 2008. I'll be there Sunday and Monday only, in the Tivoli Storage track, so if you are planning to attend and wish to meet up with me while I am there, please send me a note!
If you missed the [IBM System Storage and Storage Networking Symposium] in San Diego, California last month (like I did because I was in Japan and India), here is your chance to attend the one next month in Europe, September 8-11, in beautiful [Montpellier, France]. Several of my colleagues from the IBM Tucson Executive Briefing Center are scheduled to speak at this event.
And maybe, perhaps, some IBM executives will have something important to say next month also! Stay tuned!
For a list of other IBM events this year, see the [2008 schedule].
Continuing my coverage of the IT Security and Storage Expo in Brussels, Belgium, we had some great storage solutions on display at the IBM and I.R.I.S-ICT booth.
Here my IBM colleague Tom Provost is showing the front of the "Smarter Office" solution. The second photo gives the view from behind. While I always explained the solution from the front of the box, many of the more technical attendees at this conference wanted to inspect the ports in the back.
This sound-isolated 11U solution combines the following:
The [IBM Storwize V3700] with 300GB small-form-factor (SFF) drives provides shared storage for the servers.
Two [IBM System x3550 M4 servers] that can run VMware, Hyper-V or Linux KVM server hypervisor software for your Windows and/or Linux applications. These are two socket servers that can have up to 16 x86 cores each.
A Juniper EX2200 switch to network the servers and storage together.
A Local Console Manager (LCM) with rackable keyboard, video, and mouse.
In this next example, the IBM team combined a BladeCenter S chassis that can hold six blade servers, with a Storwize V7000 Unified which offers FCP, iSCSI, FCoE, NFS, CIFS, HTTPS, SCP and FTP block and file protocols.
If those configurations are too small for your needs, consider the Flex System chassis or full PureFlex System frame. The rack-mountable 10U chassis can hold the Flex System V7000 and 10 compute nodes. The PureFlex frame can hold up to four of these chassis.
IBM and I.R.I.S-ICT also had an IBM XIV Gen3 and a TS3500 Tape library on display.
"With Cisco Systems, EMC, and VMware teaming up to sell integrated IT stacks, Oracle buying Sun Microsystems to create its own integrated stacks, and IBM having sold integrated legacy system stacks and rolling in profits from them for decades, it was only a matter of time before other big IT players paired off."
Once again we are reminded that IBM, as an IT "supermarket", is able to deliver integrated software/server/storage solutions, and our competitors are scrambling to form their own alliances to be "more like IBM." This week, IBM announced new ordering options for storage software with System x servers, including BladeCenter blade servers and IntelliStation workstations. Here's a quick recap:
IBM Tivoli Storage Manager FastBack v6.1 supports both Windows and Linux! FastBack is a data protection solution for ROBO (Remote Office, Branch Office) locations. It can protect Microsoft Exchange, Lotus Domino, DB2, Oracle applications. FastBack can provide full volume-level recovery, as well as individual file recovery, and in some cases Bare Machine Recovery. FastBack v6.1 can be run stand-alone, or integrated with a full IBM Tivoli Storage Manager (TSM) unified recovery management solution.
FlashCopy Manager v2.1
FlashCopy Manager uses point-in-time copy capabilities, such as SnapShot or FlashCopy, to protect application data using an application-aware approach for Microsoft Exchange, Microsoft SQL Server, DB2, Oracle, and SAP. It can be used with IBM SAN Volume Controller (SVC), DS8000 series, DS5000 series, DS4000 series, DS3000 series, and XIV storage systems. When applicable, FlashCopy Manager coordinates its work with Microsoft's Volume Shadow Copy Services (VSS) interface. FlashCopy Manager can provide data protection using just point-in-time disk-resident copies, or can be integrated with a full IBM Tivoli Storage Manager (TSM) unified recovery management solution to move backup images to external storage pools, such as low-cost, energy-efficient tape cartridges.
General Parallel File System (GPFS) v3.3 Multiplatform
GPFS can support AIX, Linux, and Windows! Version 3.3 adds support for Windows Server 2008 on 64-bit chipset architectures from AMD and Intel. Now you can have a common GPFS cluster with AIX, Linux and Windows servers all sharing and accessing the same files. A GPFS cluster can have up to 256 file systems. Each of these file systems can hold up to 1 billion files and up to 1PB of data, and can have up to 256 snapshots. GPFS can be used stand-alone, or integrated with a full IBM Tivoli Storage Manager (TSM) unified recovery management solution with parallel backup streams.
For full details on these new ordering options, see the IBM [Press Release].
In general, people agree that IBM, HP and EMC are the top three vendors in storage, with HDS, Sun and Dell rounding out the top six.
The fun begins when a respected analyst like IDC Corp. publishes their calculations, and individual vendors re-swizzle the results because they are not happy with their findings.
I thought it would be helpful to illustrate how this all works. First, you need to come up with a definition of what you are going to count. You could count units sold, revenue dollars, capacity in terabytes, or some other generally accepted metric.
Next, you need to define what's in and what's out. For example, you can say "storage", which would include both disk drives and tape drives, both internal and external to servers; or you can choose a narrower definition, say external disk systems, which might suit you better if you aren't in the tape business and don't sell servers.
By some definitions, my Apple iPod, Motorola cell phone, and Canon digital camera could all be counted as external disk systems, as they all connect via USB cable to my IBM laptop, and act like a disk drive to my Windows operating system, allowing me to read and write data back and forth. It is necessary to define exactly what you plan to include, and what to exclude, based on the reported numbers available.
The last rule is that nothing gets double-counted. In our complicated industry of manufacturers and vendors, sometimes storage is manufactured by one company, but sold by another, typically under the vendor's brand, not the manufacturer's brand. You can either count manufactured units or vendor units, but you can't mix and match.
IBM is both manufacturer and vendor. However, IDC only counts vendor units, so storage manufactured by someone else but sold by IBM is counted as IBM, and storage manufactured by IBM but branded by someone else goes to that other vendor. Likewise, HP and Sun re-brand Hitachi storage, and Dell re-brands EMC storage.
EMC would like to treat all EMC-manufactured storage re-branded by Dell as EMC-vended storage, so that it can move up in the ratings. But Dell wants to count it too, so that it can appear in the top six. You can't have it both ways.
But are these ratings just "bragging rights"? Not always. When big purchases are planned for new projects, or a client decides it's time to throw out the current vendor and shop for a new one, the ratings could influence that decision. In that regard, the IDC 4Q05 Storage Tracker reported IBM as number one overall in storage hardware at the end of 2005, which includes both internal and external disk systems, as well as tape drives sold under the IBM brand, based on dollar revenues. By this method of counting, HP came in at number 2, EMC at number 3, and the rest round out the top six as before.
In the end, this is just one factor when deciding which brand to choose for your storage needs.
This week and next, I am down under in Australia and New Zealand for a seven-city Storage Optimisation Breakfast series of presentations to clients and prospects. My first city for this seven-city tour was Sydney, Australia.
Here is the view from my room at the [Shangri-La hotel], including the famous [Sydney Opera House] and Circular Quay, from which to take a water taxi or ride the Manly Ferry. [Sydney harbour] is the deepest harbour in the Southern Hemisphere, allowing boats of all sizes to enter. This section of the city is known as "The Rocks".
Sydney is a very modern metropolis. The last time I was in Sydney was in May 2007 to teach an IBM Top Gun class. My post back then on [Dealing with Jet Lag] is as relevant now as it was back then. In addition to being 9 hours off-shifted from last week in Dallas, Texas, I also have to deal with the colder climate, about 40 degrees F cooler down here. The weather is crisp and clear; it is winter going into spring down here, as the seasons are flipped below the equator.
Many of the buildings are recognizable from the movie ["The Matrix"] which was filmed here. We joked that this seven-city trip was also similar to [The Adventures of Priscilla, Queen of the Desert], in that both journeys started in Sydney. If you haven't seen the latter, I highly recommend it to get to learn more about Australia as a country.
(Completely useless trivia: Actor Hugo Weaving appeared in both movies. While most people associate him with Australia, where he has lived since 1976, he actually was born in Nigeria, and traveled extensively because his father worked in the computer industry.)
Here I am standing next to our banner.
The line-up for each event is simple. After all the attendees sit down for breakfast, we have the following three sessions:
First, Anna Wells, local IBM Executive for Storage Sales in Australia and New Zealand, presents IBM's strategy for storage, and how IBM plans to address Storage Efficiency, Data Protection and Service Delivery. She then highlights various products that are currently available to help meet customer needs, including XIV and the SAN Volume Controller (SVC).
Second, we have a client or two share their success story. We will have different speakers at the different locations.
Third, I present on future trends that will impact the storage marketplace. With only 40 minutes for my section, I decided to focus on just three specific trends, with a mix of some colorful analogies to help emphasize my key points.
We had a great turn-out for our first event in Sydney, lots of clients and prospects came out for this. There is a lot of enthusiasm for IBM's vision, thought leadership, and broad portfolio of storage solutions.
When I was a kid, I used to love old spy movies where they would hide a small microchip or microfiche behind the stamp on a letter or postcard. "Yeah right," I would think to myself, "how much information could that little thing possibly hold?" In their post [Bringing the "New Intelligence" Down to Earth: Intro to Semantic Web, Internet-of-Things], my fellow IBM bloggers Jack Mason and Adam Christensen pointed me to a crazy new product called "Mir:ror" that connects to your PC or laptop.
At first, I thought it was another product spoof, like Onion News Network's video of the [Apple MacBook Wheel] that eliminates the need for a keyboard. But no, this product is real, from a company called [Violet]. Violet makes the mir:ror, the internet-connected rabbits, and the tiny postage stamps called "ztamps" with embedded RFID chips that allow everything to be interconnected. I can see a lot of interesting uses for the ztamps. Squishing CD-ROMs or memory sticks inside presentation folders was always awkward. But these are small, flat and discreet. I don't know how many GBs of storage each ztamp holds, but they look cool, don't they?
Just another example of becoming a smarter planet!
Greg and 3PAR's Marc Farley did an "ambush" interview with the folks at the IBM booth at SNW, including Paula Koziol about Twitter, and [Rich Swain] about IBM's latest SONAS product. Here is their post [Storage Monkey business with IBM]:
You can learn more about SONAS from my post [More Details about IBM Clustered NAS]. SONAS is based on software that has been available since 1996, on commodity off-the-shelf server and storage systems, but building a complete system was left as an exercise to the end-user, which many of the top 500 Supercomputers have done.
Back in November 2007, IBM announced Scale-Out File Services (SoFS), which was a set of IBM Global Technology Services offerings to build a customized solution from the software and a set of servers, disk and tape storage. Customized configurations were done for a variety of workloads, from Digital Media to Scientific Research High Performance Computing (HPC). Last year, SoFS was renamed to IBM Smart Business Storage Cloud (SBSC).
This year, IBM was able to package all of the software and hardware into an easy-to-order machine-type model that has everything cabled and ready to use. This is what SONAS is today.
Well, it's Tuesday again, and you know what that means? IBM Announcements! We have a lot today, so I will just give you the quick highlights, and then Chris and Lloyd will follow-up with more detailed posts.
New IBM Storwize V5000 models
IBM introduces several new entry-level models.
The Storwize V5010E and V5030E are the "Express" models that allow for hybrid configurations, mixing Flash and spinning HDD disk. The Storwize V5010E is a single controller, two-canister model, with basic features. The Storwize V5030E adds more memory, more CPU power, and additional features like Data Reduction Pools and data-at-rest Encryption. Hosts can attach via SAS, 16Gb FCP or iSCSI.
The Storwize V5100 is the baby model of the FlashSystem 9100, supporting both FlashCore and industry-standard NVMe Flash drives, with the option to SAS-attach expansion drawers, mixing Flash and spinning HDD disk. The Storwize V5100F is the all-flash version. Hosts can attach via 32Gbps FCP, 25GbE RoCE, 25GbE iWARP, and iSCSI.
IBM Spectrum Virtualize for Public Cloud on Amazon Web Services (AWS)
IBM Spectrum Virtualize for Public Cloud, or what our young folks unofficially shorten to SV4PC, has been available on IBM Cloud and will now also be available on Amazon Web Services.
For those readers asking "What took so long?": Amazon was not going to put specialized equipment in its data centers, so IBM had to make the Spectrum Virtualize software container-native. Yes, the SVC code now runs in its own Docker container.
Basically, a 2-node cluster is represented as two AWS EC2 instances, virtualizing EBS storage. The Transparent Cloud Tiering (TCT) that lets you "FlashCopy-to-the-Cloud" can be used to go directly to Amazon's S3 object storage.
This conversion to container-native has worked so well, IBM now plans to offer container-native software-defined storage capability across the board, for object storage, block storage, and file storage.
Did you notice that the Storwize V5100/F models support 32Gbps FCP in the section above? If that raised your eyebrow, I am pleased to tell you that IBM will be supporting 32Gbps FCP on these new Storwize V5100/F, the Storwize V7000 Gen3 and the FlashSystem 9100 devices.
We have also added a new b-type SAN switch, the SAN18B-6, which is Broadcom's Gen6 technology in a sleek 1U configuration, sporting 12 FCP ports that support 32Gbps and auto-negotiate to slower speeds as needed for compatibility with 8Gbps and 16Gbps devices. The other six ports are Ethernet, and can be used for disaster recovery replication, using either native TCP/IP or FCIP protocols.
IBM has enhanced the alerting capabilities of both the on-premises IBM Spectrum Control and its "as-a-Service" sister offering, IBM Storage Insights. This allows you to set up alerts for "device groups" across multiple storage devices, as well as set up filters to make the alerts more meaningful, eliminating some of the noise.
When IBM first introduced IBM Storage Insights, it was intended as an alternative to the on-premises solution. Now clients demand both, so if you have one, we can offer you the other! The new [IBM Storage Insights for IBM Spectrum Control] is an IBM Cloud service that can help you predict and prevent storage problems before they impact your business.
It is complementary to IBM Spectrum Control and is available at no additional cost if you have an active license with a current subscription and support agreement for IBM Virtual Storage Center, IBM Spectrum Storage Suite, or any edition of IBM Spectrum Control.
As an on-premises application, IBM Spectrum Control doesn't send the metadata about monitored devices offsite, which is ideal for dark shops and sites that don't want to open ports to the cloud. However, if your organization allows for communication between its network and the cloud, you can use IBM Storage Insights for IBM Spectrum Control to transform your support experience for IBM block storage.
IBM Spectrum Scale has been certified to run with the Hortonworks Data Platform (HDP) 3.1 release.
(Ha, I probably could have fit all that in the title of this section, but instead I just said "IBM Spectrum Scale", so you were thinking "Oh boy!" and then saw something that could have fit in the title and felt all disappointed. It is kind of like when the local news asks "Was the restaurant where you had lunch today contaminated with Salmonella?" and then follows up with "Find out at the 11:00pm evening news!" And then you wait until 11:00pm for them to say, "No, there was no Salmonella found in any of the restaurants.")
So, I would not have mentioned the Spectrum Scale certification for HDP 3.1 unless there was at least something else worth mentioning. There is! IBM Spectrum Scale has also enhanced its performance for SMB and NFS, and has enhanced the scalability and resiliency of its Active File Management (AFM) feature.
The IBM FlashSystem A9000 and A9000R are targeted to Cloud and Managed Service Providers (CSP/MSP). The 12.3.2 release now supports VLAN tagging for iSCSI deployments. This VLAN tagging allows multiple virtual networks and IP addresses to share iSCSI ports, making it ideal for multi-tenancy for CSP/MSP clients.
IBM manages over a hundred Blockchain networks for its clients. For those not familiar with Blockchain, it is a way to record transactions: whenever money or product changes hands, an entry is recorded into the blockchain ledger for all to see.
This has two drawbacks. The first is privacy: information stored in the ledger may include things you do not want everyone to see. The other is scalability: storing photos and other supporting documents may be nice to have, but takes up a lot of space and slows down transaction rates.
The solution is "off-chain" data. These are supporting documents that aren't needed in the blockchain itself. To connect them, you store a checksum hash of the supporting document in the ledge, then store the supporting document as off-chain data on-premises. If you need to produce the document for an audit, its checksum hash will match what is in the ledger.
In the beginning, people thought Docker containers would just be used for microservices with no persistent storage. Then clients realized they needed persistent storage, and they needed to orchestrate that storage provisioning. The IT industry has a variety of different orchestrators, like Kubernetes, Docker Swarm, and Mesos. All of these manage persistent storage differently. IBM has focused on Kubernetes, using the Ubiquity open source project to manage FlexVolumes.
The Container Storage Interface (CSI) is an effort to standardize the provisioning of persistent storage, allowing containerized applications to have access to storage that persists even after the container is shut down or crashes. For the next few years, I suspect IBM will need to support both the old way (FlexVolumes) and the new way (CSI) until all the standards settle.
You can hear all about these exciting announcements at the upcoming IBM Systems Technical University (TechU) in Atlanta, GA (USA), April 29-May 3. Visit [ibm.biz/Atlanta2019] to learn more and register. The three of us all plan to be there! Stop by and say hello.
Well, it's Tuesday again, and you know what that means? IBM Announcements!
Today I want to write about a very recent enhancement for IBM storage clients, specific to all IBM Storage clients using IBM Spectrum Control.
IBM recently announced a way for existing Spectrum Control clients to obtain a cloud-based version and get even more value from their Spectrum Control investment. Whether you have Spectrum Control Standard, Advanced, Select or Virtual Storage Center, all versions are covered.
This new offering, a cloud-based Software as a Service (SaaS) solution titled Storage Insights for Spectrum Control, is provided at no additional cost: existing Spectrum Control clients are entitled to it using their existing Spectrum Control licensed capacity.
Since the release of Storage Insights and Storage Insights Pro, existing Spectrum Control clients have asked for a similar cloud-based option. Today we have that option.
Here is a quick summary of the capabilities and where each is available (On Premises / Cloud):
Other devices (non-IBM storage devices, VMware, SAN switches) - On Premises / Cloud
Asset management (type, model, serial number, firmware) - On Premises / Cloud
Support management (ticket creation / log upload) - On Premises / Cloud
Health (show status of entity), direct / call home - On Premises / Cloud
Alerting (send message to user): status / thresholds / email / SNMP / scripts - On Premises / Cloud
Storage / fabric performance and error reporting - On Premises / Cloud
Performance interval and retention: 5 min with 24-hour retention, 5 min with 1-year retention, or 1 min with customizable retention - On Premises / Cloud
Provisioning using Service Classes and Capacity Pools with automatic zoning
Reclamation analysis of unused volumes - On Premises / Cloud
Service management (chargeback and consumer reports) - On Premises / Cloud
Custom reporting: GUI / API - On Premises / Cloud
Tiering support across pools (recommend and implement) - On Premises / Cloud
Balance workload across pools
User management: Active Directory/LDAP integration
Cloud portal SLA
If your existing Spectrum Control instance is meeting your requirements, then consider Storage Insights for Spectrum Control for the added value of enhanced IBM Storage Support, and constant access to the latest features of Storage Insights for Spectrum Control without any of the maintenance or upgrade activities.
Whatever the reason you may need to reach out to IBM Storage Support, having IBM Support with immediate access to your storage configuration details will reduce the time and your team's effort to get a resolution or recommendation from IBM on how to proceed.
Dan Galvan, IBM VP of Marketing for Storage, was the next speaker. With 300 billion emails being sent per day, 4.6 billion cell phones in the world, and 26 million MRIs per year, there is going to be a huge demand for file-based storage. In fact, a recent study found that file-based storage will grow at 60 percent per year, compared to 15 percent growth for block-based storage.
Dan positioned IBM's Scale-out Network Attached Storage (SONAS) as the big "C:" drive for a company. SONAS offers a global namespace, a single point of management, with the ability to scale capacity and performance tailored for each environment.
The benefits of SONAS are great. We can consolidate dozens of smaller NAS filers, we can virtualize files across different storage pools, and increase overall efficiency.
Powering advanced genomic research to cure cancer
The next speaker was supposed to be Bill Pappas, Senior Enterprise Network Storage Architect, Research Informatics at [St. Jude Children’s Research Hospital]. Unfortunately, St. Jude is near the flooding of the Mississippi River, and he had to stay put. An IBM team was able to capture his thoughts on video, which was shown on the big screen.
Thanks to the Human Genome Project, St. Jude is able to cure people. They see 5700 patients per year, and have an impressive 70 percent cure rate. The first genome scan took 10 years; now the technology allows a genome to be mapped in about a week. Having this genomic information is making vast strides in healthcare possible. It is the difference between fishing in a river and casting a wide net to catch all the fish in the Atlantic Ocean at once.
Recently, St. Jude migrated 250 TB of files from other NAS to an IBM SONAS solution. The SONAS can handle a mixed set of workloads, and allows internal movement of data from fast disk, to slower high-capacity disk, and then to tape. SONAS is one of the few storage systems that supports a blended disk-and-tape approach, which is ideal for the type of data captured by St. Jude.
IBM's own IT transformation
Pat Toole, IBM's CIO, presented the internal transformation of IBM's IT operations. He started in 2002, in the midst of IBM's effort to restructure its processes and procedures. They identified four major data sources: employee data, client data, product data, and financial data. They put a focus on understanding outcomes and setting priorities.
The result? A 3-to-1 payback on CIO investments. This allowed IBM to go from server sprawl to consolidated pooling of resources with the right levels of integration. In 1997, IBM had 15,000 different applications running across 155 separate datacenters. Today, they have reduced this to 4,500 applications and 7 datacenters. Their goal is to reduce this further to 2,225 applications by 2015. Of these, only 250 are mission critical.
Pat's priorities today: server and storage virtualization, IT service management, cloud computing, and data-centered consolidation. IBM runs its corporate business on the following amount of data:
9 PB of block-based storage, SVC and XIV
1 PB of file-based storage, SONAS
15 PB of tape for backup and archive
Pat indicated that this environment is growing 25 percent per year, and that an additional 70-85 PB relates to other parts of the business.
By taking this focused approach, IBM was able to increase storage utilization from 50 to 90 percent, and to cut storage costs by 50 percent. This was done through thin provisioning, storage virtualization and pooling.
Looking forward to the future, Pat sees the following challenges: (a) that 120,000 IBM employees have smart phones and want to connect them to IBM's internal systems; (b) the increase in social media; and (c) the use of business analytics.
After the last session, people gathered in the "Hall of the Universe" for the evening reception, featuring food, drinks and live music. It was a great day. I got to meet several bloggers in person, and their feedback was that this was a very blogger-friendly event. Bloggers were given the same level of access as corporate executives and industry analysts.
During the break, I talked with some of the other bloggers at this event. From left to right: Stephen Foskett [Pack Rat] blog, Devang Panchigar [StorageNerve], and yours truly, Tony Pearson. (Picture courtesy of Stephen Foskett)
Meet the Experts
This next segment was a Q&A panel, with a moderator posing questions to four experts. Originally, I was scheduled to be the moderator, but this was changed to Doug Balog. The experts on the panel were:
Rich Castagna, Editorial Director for Storage Media, TechTarget. TechTarget is the group that runs the [SearchStorage] website.
Stan Zaffos, Gartner VP of Research, who spoke earlier today. I have worked with Stan for years as well, and have attended the last four Gartner Data Center Conferences held every December in Las Vegas.
Steve Duplessie, Founder and Senior Analyst, Enterprise Strategy Group (ESG). Steve's blog is titled [The Bigger Truth].
The fourth expert was Jon Toigo, independent analyst and blogger at [DrunkenData]. Jon clarified a statement Doug Balog had made earlier in the day that was attributed to Jon's study. Doug had said that 40 percent of all data should be archived. The study Jon had done found that, on average, for the data on disk systems, about 30 percent is useful data, 40 percent is not active and could be eligible for archive, and the remaining 30 percent was crap.
The other experts introduced themselves. Rich felt that "Cloud" was still the biggest buzzword in the IT industry. Stan felt that CIOs should ask their storage administrators "What are you doing to improve my agility and efficiency". Steve felt that it was better to focus on improving process and procedures, rather than trying to deploy the best technology.
How can you best reduce backup costs per TB?
Jon- use tape.
Rich- Clean up your environment.
Stan- Don't rehydrate your deduplicated data, adopt an archive approach, and revisit your backup schedules.
Steve- Deduplication covers up stupidity. No band-aids! Companies need to address the cause.
Does Backup as a Public Service make sense for large enterprises?
Rich- Yes, especially for those with Remote Office/Branch Office (ROBO).
Stan- It depends. You should implement client-side dedupe. Get the Cloud Provider to waive telecom bandwidth charges.
Steve- Consider recovery scenarios, and try to maintain control.
Jon- "Clouds" are bulls@#$ marketing. WAN latency will pile up.
What are the top issues IT leaders should be discussing with the Storage Managers?
Stan- Ensure designs meet, but do not exceed, SLAs; automate; and re-evaluate SAN/NAS ratios.
Steve- Server virtualization is putting the spotlight on storage. Failure to implement storage virtualization is becoming the gate that slows down server virtualization adoption.
Jon- Insist on management features from all storage vendors, try to separate feature/function from the underlying hardware layer. See IBM's [Project Zero].
Rich- Efficiency, Archiving, Thin Provisioning, Compression, Data Protection & Retention, Backup Redesign to protect endpoints like laptops and cell phones.
When does Archive eliminate Backup?
The need for protection never goes away. There are two kinds of data: "originals" and "derivatives", and two kinds of disk: "failed" and "not yet failed".
Given SATA and SAS drives, what is the future of 10K/15K RPM drives?
There is no future for these faster drives; they are going away.
What is the biggest challenge for adopting archive?
It is easy to move data out of production systems, but difficult to make these archives accessible for eDiscovery and Search. There is also concern about changing data formats. Adobe has changed the format of PDF a whopping 33 times.
This was by far the most entertaining section of the day! Hand-held devices allowed the audience to vote which answers they liked best.
Doug Balog, IBM VP and Business Line Executive for Storage, presented Smart Archiving. Citing research by Jon Toigo, Doug indicated that 40 percent of data on disk should be archived. Sadly, a vast majority of companies continue to use their backups as archives. There is a better way to do archives, one that addresses the needs of four use cases.
The IBM Information Archive for email, files and eDiscovery offers full text indexing. A well-deployed archive strategy can save up to 60 percent in backup costs, and reduce backup times by 80 percent. IBM offers advanced analytics and visualization for archive data.
An analysis of a global insurance company found that they kept, on average, 120 copies of every email sent. This was the combination of an average of 12 copies of each email, multiplied by 10 backups of the email repository.
Banjercito, a bank in Mexico, has a 10-year retention requirement from government regulations.
The new LTFS Library Edition allows Library-based access to files stored on tape cartridges. The new TS3500 Library Connector means that a single system of connected tape libraries can hold up to 2.7 Exabytes (EB) of data.
Archive Industry Perspectives
Steve Duplessie from Enterprise Strategy Group [ESG] gave his views on the challenges of volume, access and cost. His definition of archive: the long-term retention of information in a separate environment for compliance, eDiscovery and business reference purposes. Steve advocates a purpose-built solution for archive. There are three major challenges in implementing an archive solution:
Getting Participation -- Steve feels that key stakeholders have inappropriate expectations of what archive is, or can be.
Define Tasks -- Steve argues that archive is very much a process-oriented approach, and tasks must fit business processes and procedures.
Prepare for Future Content Types -- the frequent change of standard and proprietary data types poses a real challenge for long-term retention of data.
For example, the Financial Industry Regulatory Authority [FINRA] oversees 4,000 brokerage firms and 600,000 broker/dealers. They have mandated the storing of digital data related to stock trades, which can include text messages, voice messages, and emails. They continue to expand this definition, so it could soon include tweets on Twitter, for example.
Steve feels there are four key requirements for archive:
Support for email, such as an email application plug-in
Off-line access to archived data
Support for mobile devices, such as smartphones
Basic search capabilities
Companies are starting to take archive seriously. About 35 percent of firms surveyed have adopted archive, and another 36 percent plan to in the next 12-24 months. Enterprise archive has grown over 200 percent from 2007 to 2009. Steve agrees that not everything needs to be stored on disk; retention periods greater than six years dictate the need for tape.
Current systems may not meet today's requirements. Data loss and downtime costs have skyrocketed. Data Protection and Retention projects can represent a gold mine of savings; new capabilities can greatly lower costs, allowing companies to shift resources over to revenue generation.
Big Data, New Physics and Geospatial Super-Food
I would vote this the best session of the day! For all those confused about what the heck "Big Data" means, Jeff Jonas has the best explanation. Jeff is an IBM Distinguished Engineer and the Chief Scientist of Entity Analytics. He had just finished his 17th marathon on Saturday, and his fingers were bandaged.
Jeff founded Systems Research & Development (SRD), known for creating NORA (Non-Obvious Relationship Awareness), used by Las Vegas casinos to identify fraud. SRD was acquired by IBM back in 2005. Jeff is focused on sensemaking of streams. He feels many companies are suffering from "Enterprise Amnesia".
"The data must find the data .. and the relevance must find the user."
-- Jeff Jonas
Jeff's metaphor for Big Data is a jigsaw puzzle without the picture on the outside of the box. To demonstrate his point, he presented a pile of jigsaw puzzle pieces and asked four teenagers to put the puzzle together without the advantage of the picture on the box. What he had not told them was that he had mixed four different puzzles together, removing 10 to 20 percent of the pieces from each. He also added duplicate pieces from a fifth, identical copy of one of the puzzles and, just to mess with their heads, threw in a dozen pieces from a sixth, unrelated puzzle. Within a few hours, the kids had figured out that there were four puzzles, that there were duplicate pieces, and that some pieces did not fit any of the four puzzles.
"You can't squeeze knowledge from a pixel."
-- Jeff Jonas
This approach favors false negatives. New observations reverse out old conceptions. As the picture emerges, this provides added focus on new information. More data can provide better predictions. "Bad" data, including misspelled words and mis-coded categories, was often discarded or corrected on the basis of "Garbage-In, Garbage Out", but can now be useful in a Big Data perspective.
Take, for example, the 600 billion "location data" records captured from cell phones every day. With regular triangulation of cell phone towers, the information can pinpoint you to within 60 meters; add GPS and this improves to within 20 meters; add Wi-Fi and it improves further to within 10 meters. While this data is "de-identified" so as not to identify individual users, the process of re-identification is relatively trivial. Jeff's system is able to predict where a person will be next Thursday at 5:35pm with 87 percent accuracy.
Thus, Big Data represents an asset: an accumulation of context. Real-time analytics can be a competitive advantage. These streams of data will need persistent storage and massive I/O capabilities. In one example, Jeff processed 4,200 separate sources of information and was able to identify "dead votes", votes cast in the names of people who had died in prior years, indicating voter fraud.
Jeff's latest project, codenamed G2, will tackle not just people, but everything from proteins to asteroids.
Normally, the worst time slot is the hour after lunch, but these presentations kept people's attention.
Down the street, in Times Square, IBM made it on the big board.
Continuous Data Availability
Jeanine Cotter, IBM VP for Data Center Services, started out with a video about Sabre. IBM developed this revolutionary airline reservation system to handle the huge volume of transactions. Today, 18 percent of organizations consider downtime unacceptable for their tier-1 applications, and 53 percent would be seriously impacted by an outage lasting an hour or more.
Eventually, companies cross the "Continuous Availability" threshold, the point where they discover that the possibility of downtime is too costly to ignore. IBM has clients using 3-site Metro/Global Mirror that can fail-over an entire data center in just five mouse clicks.
Jeanine also mentioned Euronics, which is using SAN Volume Controller's Stretched Cluster capability, allowing them to easily vMotion virtual guest images from one data center to another. SVC has had this capability for a while, but now, with the full vCenter plug-in and VAAI support, the capability is fully integrated with VMware.
A final example was a mid-sized university using IBM Storwize V7000 with Metro Mirror. The primary location's Storwize V7000 manages solid-state drives with Easy Tier. The secondary location's Storwize V7000 has high-capacity SATA drives and FlashCopy.
Customer Testimonial - University of Rochester Medical Center
Rick Haverty, Director of IT Infrastructure at University of Rochester Medical Center [URMC], provided the next client testimonial. The mission of the URMC is to use science, education and technology to improve health. URMC receives over $400 million USD in NIH grants, which makes it roughly the 23rd largest university-based academic medical center in the country. They have over 900 doctors, both general practice and specialists.
URMC has an IBM BlueGene supercomputer, a Cisco network with over 45,000 ports, and over 7.5 million square feet of Wi-Fi wireless internet coverage. They have three datacenters: the first is 7500 square feet, the second is 6000 square feet, and the third is just 800 square feet, holding their "off-site tapes".
URMC has digitized all of their records, including the Electronic Medical Records (EMR) system, medical dosage history, imaging "priors", calibration of infusion pumps, and RFID monitoring, and they even provide IT support while the patient is on the operating table. RFID monitoring ensures all of the refrigerators keep medications at the right temperature; a single failed refrigerator can lose $20,000 worth of medication.
When is a good time for downtime? At URMC, they handle 90,000 Emergency Room visits per year, so the answer is never. When is the ER busiest? Monday morning. (Not what I expected!)
URMC's EMR software (Epic) runs on clustered POWER7 servers, with DS8700 disk systems using Metro Mirror to a secondary location. They also keep a third "shadow" POWER7 for read-only purposes, and a separate system that provides web-based read-only access. Finally, they have 90 stand-alone Personal Computers (PCs) that contain information for all the patients with reservations that week, just in case all the other systems fail.
The exploding volume of data comes from medical imaging. For radiology (X-rays), each image, called a "study", takes 20-30 MB, and they have 650,000 studies per year. This represents about 16TB of storage per year, with 3-second response time access. These must be kept for 7 years since last viewed, or until the patient reaches the age of 18, whichever is later.
But radiology is just one discipline. Healthcare has a whole bunch of "ologies". Another is "Pathology" which looks at cells between glass slides in a microscope. Each study consumes 10-20GB, and URMC does about 100,000 pathology studies per year, representing 150TB per year.
URMC has identified that they have 42 mission-critical applications. The data for these are stored on DS8000, XIV, Storwize V7000 and DS5000, all managed behind SAN Volume Controller.
During lunch, people were able to take a look at our solutions. Here are Dan Thompson and Brett Cooper striking a pose.
Hyper-Efficient Backup and Recovery
The afternoon was kicked off by Dr. Daniel Sabbah, IBM General Manager of Tivoli software. He started with some shocking statistics: 42 percent of small companies have experienced data loss, and 32 percent have lost data forever. IBM has a solution that offers "Unified Recovery Management", a combination of periodic backups, frequent snapshots, and remote mirroring.
IBM Tivoli Storage Manager (TSM) was introduced in 1993, and was the first backup software solution to support backup to disk storage pools. Today, TSM is also part of Cloud Computing services, including IBM Information Protection Services. IBM announced today a new bundle called IBM Storwize Rapid Application Backup, which combines the IBM Storwize V7000 midrange disk system, Tivoli FlashCopy Manager, and implementation services with a full three-year hardware and software warranty. This could be used, for example, to protect a Microsoft Exchange email system with 9000 mailboxes.
IBM also announced that its TS7600 ProtecTIER data deduplication solutions have been enhanced to support many-to-many bi-directional remote mirroring. Last year, University of Pittsburgh Medical Center (UPMC) reported that they were averaging a 24x data deduplication factor in their environment using IBM ProtecTIER.
"You are out of your mind if you think you can live without tape!"
-- Dick Crosby, Director of System Administration, Estes
The new IBM TS1140 enterprise-class tape drive processes 2.3 TB per hour, and provides a density of 1.2 PB per square foot. The new 3599 tape media can hold 4TB of data uncompressed, or up to 10TB at a 2.5x compression ratio.
The United States Golf Association [USGA] uses IBM's backup cloud, which manages over 100PB of data from 750 locations across five continents.
Customer Testimonial - Graybar
Randy Miller, Manager of Technical System Administration at Graybar, provided the next client testimonial. Graybar is an employee-owned company focused on supply-chain management, serving as a distributor for electrical, lighting, security, power and cooling equipment.
Their problem was that they had 240 different locations, and expecting local staff to handle tape backups was not working out well. They centralized their backups to their main data center. In the event that a system fails in one of their many remote locations, they can rebuild a new machine at their main data center across a high-speed LAN, and then ship it overnight to the remote location. The result: the remote location has a system up and running by 10:30am, faster than local staff could have managed trying to figure out how to recover from tape. In effect, Graybar implemented a "private cloud" for backup in the 1990s, long before the concept was "cool" or "popular".
In 2001, they had an 18TB SAP ERP application data repository. To back this up, they took it down for 1 minute per day, six days a week, and for 15 minutes on Sundays. (That is 21 minutes of downtime out of the 10,080 minutes in a week, which works out to just under 99.8 percent availability.) To fix this, they switched to XIV, and use snapshots that are non-disruptive and do not impact application performance.
Over 85 percent of the servers at Graybar are virtualized.
Their next challenge is Disaster Recovery. Currently, they have two datacenters, one in St. Louis and the other in Kansas City. However, in the aftermath of Japan's earthquakes, they realized there is a nuclear power plant between their two locations, so a single incident could impact both data centers. They are working with IBM, their trusted advisors, to investigate a three-site solution.
This week, May 15-22, I am in Auckland, New Zealand teaching IBM Storage Top Gun sales class. Next week, I will be in Sydney, Australia.
Optimizing Storage Infrastructure for Growth and Innovation
This session started off with my former boss, Brian Truskowski, IBM General Manager of System Storage and Networking.
We've come a long way in storage. In 1973, the "Winchester" drive was named after the famous Winchester 30-30 rifle: the disk drive was planned to have two 30MB platters, hence the name. When it finally launched, it had two 35MB platters, for a total raw capacity of 70MB.
Today, IBM announced version 6.2 of SAN Volume Controller, with support for 10GbE iSCSI. Since 2003, IBM has sold over 30,000 SAN Volume Controllers. An SVC cluster can now manage up to 32PB of disk storage.
IBM also announced a new 4TB tape drive (TS1140), LTFS Library Edition, the TS3500 Library Connector, improved TS7600 and TS7700 virtual tape libraries, an enhanced Information Archive for email, files and eDiscovery, new Storwize V7000 hardware, new Storwize Rapid Application bundles, new firmware for SONAS and DS8000 disk systems, and Real-Time Compression support for EMC disk systems. I plan to cover each of these in follow-on posts, but if you can't wait, here are [links to all the announcements].
Customer Testimonial - CenterPoint Energy
"CenterPoint is transforming its business from being an energy distribution company that uses technology, to a technology company that distributes energy."
-- Dr. Steve Pratt, CTO of CenterPoint Energy
The next speaker was Dr. Steve Pratt, CTO of [CenterPoint Energy]. CenterPoint is a 110-year-old (older than IBM!) energy company involved in electricity, natural gas distribution, and natural gas pipelines. CenterPoint serves Houston, Texas (the fourth largest city in the USA) and the surrounding area.
CenterPoint is transforming to a Smart Grid involving smart meters, and this requires the best IT infrastructure money can buy, including IBM DS8000, XIV and SAN Volume Controller disk systems, IBM Smart Analytics System, Stream Analytics, IBM Virtual Tape Library, IBM Tivoli Storage Manager, and IBM Tivoli Storage Productivity Center.
Dr. Pratt has seen the transition of information over the years:
Data Structure, deciding how to code data to record it in a structured manner
Information Reporting, reporting to upper management what happened
Intelligence Aggregation, finding patterns and insight from the data
Predictive Analytics, monitoring real-time data to take pro-active steps
Autonomics, where automation and predictive analysis allows the system to manage itself
What does the transition to a Smart Grid mean for their storage environment? They will go from 80,000 meter reads to 230,400,000 reads per day. Ingestion will go from MB/day to GB/sec. Reporting will transition to real-time analytics.
Dr. Pratt prefers to avoid trade-offs: don't lose something to get something else. He also feels that the language of the IT department can help. For example, he uses a "factor", like 25x, rather than a percent reduction (96 percent reduced). He feels this communicates the actual results more effectively.
Today's smarter consumers are driving the need for smarter technologies. Individual consumers and small businesses can make use of intelligent meters to help reduce their energy costs. Everything from smart cars to smart grids will need real-time analytics to deal with the millions of events that occur every day.
IBM's Data Protection and Retention Story
Brian Truskowski came back to provide the latest IBM messaging for Data Protection and Retention (DP&R). The key themes were:
Stop storing so much
Store more with what's on the floor
Move data to the right place
IBM announced today that the IBM Real-Time Compression Appliances now support EMC gear, such as EMC Celerra. While some of the EMC equipment has built-in compression features, these often come at the cost of performance degradation. Instead, IBM Real-Time Compression can offer improved performance as well as a 3x to 5x reduction in storage capacity.
Over 70 percent of data on disk has not been accessed in the last 90 days. IBM Easy Tier on the DS8700 and DS8800 now supports FC-to-SATA automated tiering.
IBM is projecting that backup and archive storage will grow at over 50 percent per year. To help address this, IBM is launching a new "Storage Infrastructure Optimization" assessment. All attendees at today's summit are eligible for a free assessment.
Analytics are increasing the value of information, and making it more accessible to the average knowledge worker. The cost of losing data, as well as the effort spent searching for information, has skyrocketed. Users have grown to expect 100 percent uptime availability.
An analysis of IT environments found that only 55 percent of spending went to revenue-producing workloads. The remaining 45 percent was spent on Data Protection and Retention. That means that for every IT dollar spent on projects to generate revenue, you are spending roughly another 82 cents to protect it. Imagine spending over 80 percent of your house payment on homeowners' insurance, or over 80 percent of your car's purchase price on car insurance.
IBM has organized its solutions into three categories:
Smart Archiving
Hyper-Efficient Backup and Recovery
Continuous Data Availability
What would it mean to your business if you could shift some of the money spent on DP&R over to revenue-producing projects instead? That was the teaser question posed at the end of these morning sessions for us to discuss during lunch.
Normally, IBM has its announcements on Tuesdays, but this week it was on Monday!
I am here in New York City, at the Kaufmann Theater of the American Museum of Natural History, for the [IBM Storage Innovation Executive Summit]. We have about 250 clients here, as well as many bloggers and storage analysts.
My day started out being interviewed by Lynda from Stratecast, a division of [Frost & Sullivan]. This interview will be part of a video series that Stratecast is doing about the storage industry.
(About the venue: American Museum of Natural History was built in 1869. It was featured in the film "Night at the Museum". In keeping with IBM's focus on scalability and preservation, the museum here boasts skeletons of the largest dinosaurs. The five-story building takes up several city blocks, and the Kaufmann theater is buried deep in the bottom level, well shielded from cell phone or Wi-Fi signals allowing me to focus on taking notes the traditional way, with pen and paper.)
Deon Newman, IBM VP of Marketing for North America, was our Master of Ceremonies. Today would be filled with market insight, best practices, thought leadership, and testimonials of powerful results.
This is my first in a series of blog posts on this event.
Information Explosion on a Smarter Planet
Bridget van Kralingen, IBM General Manager for North America, indicated that storage is finally having its day in the sun, moving from the "back office" to the "front office". According to Google's Eric Schmidt, we now create, capture and replicate more data in two days than all of the information recorded from the dawn of time to the year 2003.
1928: IBM's innovative 80-column punch card stored nearly twice as much as its 50-column predecessor.
1947: Bing Crosby decided to do his radio show by recording it at his convenience on magnetic tape, rather than doing it live. This was the motivation for IBM researchers to investigate tape media, delivering the first commercial tape drive in 1952. One tape reel could hold the equivalent of 30,000 punch cards.
1956: the IBM RAMAC mainframe was the first computer to access data randomly from an externally-attached disk system, the "350 Disk Unit", which stored 5 million 7-bit characters (about 5MB) and weighed over 500 pounds. Compare that to today's cell phones, which store several GB of data in a handheld device.
1978: IBM invented Redundant Array of Independent Disks (RAID) through a collaboration with the University of California, Berkeley.
1993: IBM introduces the [IBM 9337 Disk Storage Array], the first external disk storage system for distributed operating systems. This was based on the Serial Storage Architecture [SSA] protocol.
1995: IBM launches products that support Storage Area Networks (SAN), based on the Fibre Channel Protocol. IBM's internal codenames for disk products were all names of sharks, and so our internal mantra was that a healthy storage diet was comprised of "Plenty of Fish and Fibre".
2010: IBM ships Easy Tier, the world's easiest-to-use sub-LUN automated tiering capability, for the IBM System Storage DS8700 disk system.
Storage is growing (in capacity) at 40 percent per year, but IT budgets are only growing (in dollars) by a measly 1 to 5 percent. She cited the success at [Sprint], presented at the October 2010 launch. By combining IBM SAN Volume Controller with a three-tier storage architecture, Sprint lowered their raw capacity from 10PB to 8.4PB, increasing utilization from 35 to 78 percent. This involved shrinking from six storage vendors to three, and reducing total number of disk arrays from 166 down to 96. The resulting system has only 38 percent of their data on their most expensive Tier-1 storage, the rest is now living on less expensive Tier-2 and Tier-3 storage.
Companies are entering the era of Big Data with an insatiable appetite for collecting and analyzing data for marketplace insights. IBM [InfoSphere BigInsights], based on Apache Hadoop, has helped customers make sense of it all. Innovative technology, expertise and marketplace insight will provide the competitive path forward in the coming decade.
Storage Challenges and Opportunities in 2011 and Beyond
I always enjoy hearing Stan Zaffos, Gartner Research VP, present at the annual [Data Center Conference] in Las Vegas every December. His analysis and research focuses on storage systems and emerging storage technologies.
Stan provided his perspective on the storage industry, suggesting a top-down approach based on the market trends that Gartner is closely monitoring. He suggests focusing heavily on managing data growth, using SLAs to improve efficiency, and following Gartner's recommended actions. His statement, "If something is not sustainable, then it is unsustainable," resonated well with the audience. His three key points:
Design to meet but not exceed Service Level Agreements (SLAs)
Re-evaluate your ratio of SAN versus NAS based on the growth of unstructured data content
Explore the variety of Cloud options available.
Those of us who have been in this business a long time recognize that the problems haven't changed, just the dimensions. When in the past three decades were IT budgets generous and plentiful? When was there more than enough IT staff to handle all the requests in a timely manner? When hasn't there been a period of information growth? Gartner's analysis shows external controller-based disk storage (RAID-protected disk systems) growing revenue at 8.7 percent. Raw TBs of disk capacity are growing at 55 percent per year, and are expected to reach 100 Exabytes by 2015.
SAN has four times more revenue than NAS today, but NAS is growing faster. NAS had only 9 percent market share in 2010, but is projected to grow to 32 percent by 2015. SAN can offer higher price/performance for traditional OLTP and database workloads, but NAS is better suited for unstructured data, backups and archives, assisted by storage efficiency features like real-time compression and data deduplication. Which industries create the most unstructured data? The ones involved in filling out forms! This includes government, insurance agencies, manufacturing, mining and pharmaceuticals.
The phrase "good enough" should no longer be considered an insult. Too often IT departments design solutions that far exceed negotiated Service Level Agreements (SLAs), and they should instead focus on just meeting them instead. Modular storage systems are often sufficient for most workloads. Slower 7200RPM SATA disks can be one third the price of faster 15K RPM Fibre Channel drives, and often sufficient performance for the tasks required. Unified storage, such as IBM N series, can help simplify capacity planning, as storage can be re-purposed if different workloads grow at different rates. The key is to focus on meeting SLAs based on the price-vs-risk factor. Take a minimalist approach with fewer SLAs, fewer management classes, and fewer storage vendors.
Stan suggests a two-pronged approach: capacity management through content analytics and classification, and efficient utilization through thin provisioning, storage virtualization, Quality of Service (QoS), compression and deduplication capabilities. These features will be ubiquitous by 2013. If you are worried that these technologies mean more information packed onto fewer devices, Stan's response was "If it's not there, it can't break." Storing data on fewer disks or tape cartridges means less chance that something will fail.
Stan feels IT shops using thin provisioning should continue to charge their end users for what they ask for (the full allocation request) rather than for the thin-provisioned amount actually consumed on the storage devices. For example, if someone asks for a 100GB LUN to be allocated to their system, but it only takes up 30GB of actual data space, charge back the full 100GB! (A small sketch of this policy follows.)
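Here is that chargeback policy as a few lines of Python. The function name and the rate are my own illustration of Stan's recommendation, not a feature of any IBM tool:

    RATE_PER_GB_MONTH = 0.50    # assumed chargeback rate, $/GB/month

    def monthly_charge(allocated_gb, consumed_gb):
        # Bill the full allocation request; the thin-provisioned
        # consumption is deliberately ignored.
        return allocated_gb * RATE_PER_GB_MONTH

    print(monthly_charge(allocated_gb=100, consumed_gb=30))   # 50.0, not 15.0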
It can take five years for a new technology to reach 50 percent adoption; the Romans took eight years to build the [Colosseum]. His research on "network convergence" found that 42 percent planned to use iSCSI, 32 percent Fibre Channel over Ethernet (FCoE) or other Top-of-Rack (TOR) converged switches, and 16 percent were looking for full convergence of servers, switches and storage. Features like IBM Easy Tier automatic sub-LUN tiering were introduced later, and so have not been adopted as widely as features like thin provisioning, which has been around since the IBM RAMAC Virtual Array of the 1990s.
Stan felt that Public and Private clouds are two different approaches. Public clouds offer reservation-less provisioning. Private clouds offer improved agility, but can be more complex to set up, and carry a risk of idle capacity similar to traditional IT datacenter deployments. Storage and file virtualization should be considered prerequisites for adopting Cloud technologies.
Storage IT teams need more than just technical skills. They need to learn about legal and government regulatory compliance issues and financial considerations, and would even benefit from doing some "marketing". Why marketing? Because IT departments often need end users to change their attitudes and behaviors, and this can be accomplished through internal marketing campaigns.
Over on the Tivoli Storage Blog, there is an exchange over the concept of a "Storage Hypervisor". This started with fellow IBMer Ron Riffe's blog post [Enabling Private IT for Storage Cloud -- Part I], with a promise to provide parts 2 and 3 in the next few weeks. Here's an excerpt:
"Storage resources are virtualized. Do you remember back when applications ran on machines that really were physical servers (all that “physical” stuff that kept everything in one place and slowed all your processes down)? Most folks are rapidly putting those days behind them.
In August, Gartner published a paper [Use Heterogeneous Storage Virtualization as a Bridge to the Cloud] that observed “Heterogeneous storage virtualization devices can consolidate a diverse storage infrastructure around a common access, management and provisioning point, and offer a bridge from traditional storage infrastructures to a private cloud storage environment” (there’s that “cloud” language). So, if I’m going to use a storage hypervisor as a first step toward cloud enabling my private storage environment, what differences should I expect? (good question, we get that one all the time!)
The basic idea behind hypervisors (server or storage) is that they allow you to gather up physical resources into a pool, and then consume virtual slices of that pool until it’s all gone (this is how you get the really high utilization). The kicker comes from being able to non-disruptively move those slices around. In the case of a storage hypervisor, you can move a slice (or virtual volume) from tier to tier, from vendor to vendor, and now, from site to site all while the applications are online and accessing the data. This opens up all kinds of use cases that have been described as “cloud”. One of the coolest is inter-site application migration.
A good storage hypervisor helps you be smart.
Application owners come to you for storage capacity because you’re responsible for the storage at your company. In the old days, if they requested 500GB of capacity, you allocated 500GB off of some tier-1 physical array – and there it sat. But then you discovered storage hypervisors! Now you tell that application owner he has 500GB of capacity… What he really has is a 500GB virtual volume that is thin provisioned, compressed, and backed by lower-tier disks. When he has a few data blocks that get really hot, the storage hypervisor dynamically moves just those blocks to higher tier storage like SSD’s. His virtual disk can be accessed anywhere across vendors, tiers and even datacenters. And in the background you have changed the vendor storage he is actually sitting on twice because you found a better supplier. But he doesn’t know any of this because he only sees the 500GB virtual volume you gave him. It’s 'in the cloud'."
"Let’s start with a quick walk down memory lane. Do you remember what your data protection environment looked like before virtualization? There was a server with an operating system and an application… and that thing had a backup agent on it to capture backup copies and send them someplace (most likely over an IP network) for safe keeping. It worked, but it took a lot of time to deploy and maintain all the agents, a lot of bandwidth to transmit the data, and a lot of disk or tapes to store it all. The topic of data protection has modernized quite a bit since then.
Fast forward to today. Modernization has come from three different sources – the server hypervisor, the storage hypervisor and the unified recovery manager. The end result is a data protection environment that captures all the data it needs in one coordinated snapshot action, efficiently stores those snapshots, and provides for recovery of just about any slice of data you could want. It’s quite the beautiful thing."
At this point, you might scratch your head and ask, "Does this Storage Hypervisor exist, or is this just a theoretical exercise?" The answer, of course, is "Yes, it does exist!" Just as VMware offers vSphere and vCenter, IBM offers block-level disk virtualization through the SAN Volume Controller (SVC) and Storwize V7000 products, with full management support from Tivoli Storage Productivity Center Standard Edition.
SVC has supported every release of VMware since version 2.5. IBM is the leading reseller of VMware, so it makes sense for IBM and VMware development teams to collaborate and make sure all the products run smoothly together. SVC presents volumes that can be formatted with the VMFS file system to hold your VMDK files, accessible via the FCP protocol. IBM and VMware have some key synergies:
Management integration with Tivoli Storage Productivity Center and VMware vCenter plug-in
VAAI support: Hardware-assisted locking, hardware-assisted zeroing, and hardware-assisted copying. Some of the competitors, like EMC VPLEX, don't have this!
Space-efficient FlashCopy. Let's say you need 250 VM images, all running a particular level of Windows. A boot volume of 20GB each would consume 5000GB (5TB) of capacity. Instead, create a Golden Master volume, then take 249 copies with space-efficient FlashCopy, which only consumes space for the modified portions of the new volumes. For each copy, make the necessary changes, such as a unique hostname and IP address, changing only a few blocks of data each. The end result? 250 unique VM boot volumes in less than 25GB of space, a 200:1 reduction! (See the sketch after this list.)
Support for VMware's Site Recovery Manager using SVC's Metro Mirror or Global Mirror features for remote-distance replication.
Data center federation. SVC allows you to seamlessly do vMotion from one datacenter to another using its "stretched cluster" capability. Basically, SVC makes a single image of the volume available to both locations, and stores two physical copies, one in each location. You can lose either datacenter and still have uninterrupted access to your data. VMware's HA or Fault Tolerance features can kick in, same as usual.
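To make the space-efficient FlashCopy arithmetic above concrete, here is a minimal back-of-the-envelope model in Python. It only sketches copy-on-write space accounting; the grain size and the per-clone change rate are illustrative assumptions of mine, not measured SVC values:

    GIB = 1024 ** 3
    GRAIN = 256 * 1024            # assumed copy-on-write grain size (256KB)
    VOLUME = 20 * GIB             # boot volume size
    CLONES = 249
    CHANGED_FRACTION = 0.001      # assume ~0.1% of grains modified per clone

    grains_per_volume = VOLUME // GRAIN
    changed_grains = int(grains_per_volume * CHANGED_FRACTION)

    full_copies = (1 + CLONES) * VOLUME                        # 250 full volumes
    space_efficient = VOLUME + CLONES * changed_grains * GRAIN # master + deltas

    print("Full copies:     %7.1f GiB" % (full_copies / GIB))
    print("Space-efficient: %7.1f GiB" % (space_efficient / GIB))
    print("Reduction:       %6.0f:1" % (full_copies / space_efficient))

With these assumed numbers, the model lands right around the 25GB and 200:1 figures quoted above.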
But unlike tools that work only with VMware, IBM's storage hypervisor works with a variety of server virtualization technologies, including Microsoft Hyper-V, Xen, OracleVM, Linux KVM, PowerVM, z/VM and PR/SM. This is important, as a recent poll on the Hot Aisle blog indicates that [44 percent run 2 or more server hypervisors]!
Join the conversation! The virtual dialogue on this topic will continue in a [live group chat] this Friday, September 23, 2011 from 12 noon to 1pm EDT. Join me and about 20 other top storage bloggers, key industry analysts and IBM Storage subject matter experts to discuss storage hypervisors and get questions answered about improving your private storage environment.
Last week, fellow IBMer Ron Riffe started his three-part series on the Storage Hypervisor. I discussed Part I already in my previous post [Storage Hypervisor Integration with VMware]. We wrapped up the week with a Live Chat with over 30 IT managers, industry analysts, independent bloggers, and IBM storage experts.
"The idea of shopping from a catalog isn’t new and the cost efficiency it offers to the supplier isn’t new either. Public storage cloud service providers seized on the catalog idea quickly as both a means of providing a clear description of available services to their clients, and of controlling costs. Here’s the idea… I can go to a public cloud storage provider like Amazon S3, Nirvanix, Google Storage for Developers, or any of a host of other providers, give them my credit card, and get some storage capacity. Now, the “kind” of storage capacity I get depends on the service level I choose from their catalog.
Most of today’s private IT environments represent the complete other end of the pendulum swing – total customization. Every application owner, every business unit, every department wants to have complete flexibility to customize their storage services in any way they want. This expectation is one of the reasons so many private IT environments have such a heavy mix of tier-1 storage. Since there is no structure around the kind of requests that are coming in, the only way to be prepared is to have a disk array that could service anything that shows up. Not very efficient… There has to be a middle ground.
Private storage clouds are a little different. Administrators we talk to aren’t generally ready to let all their application owners and departments have the freedom to provision new storage on their own without any control. In most cases, new capacity requests still need to stop off at the IT administration group. But once the request gets there, life for the IT administrator is sweet!
Here comes the request from an application owner for 500GB of new “Database” capacity (one of the options available in the storage service catalog) to be attached to some server. After appropriate approvals, the administrator can simply enter the three important pieces of information (type of storage = “Database”, quantity = 500GB, name of the system authorized to access the storage) and click the “Go” button (in TPC SE it’s actually a “Run now” button) to automatically provision and attach the storage. No more complicated checklists or time consuming manual procedures.
A storage hypervisor increases the utilization of storage resources, and optimizes what is most scarce in your environment. For Linux, UNIX and Windows servers, you typically see utilization rates of 20 to 35 percent, and this can be raised to 55 to 80 percent with a storage hypervisor. But what is most scarce in your environment? Time! In a competitive world, it is not big animals eating smaller ones as much as fast ones eating the slow.
Want faster time-to-market? A storage hypervisor can help reduce the time it takes to provision storage, from weeks down to minutes. If your business needs to react quickly to changes in the marketplace, you certainly don't want your IT infrastructure to slow you down like a boat anchor.
Want more time with your friends and family? A storage hypervisor can migrate data non-disruptively, during the week, during the day, during normal operating hours, instead of requiring scheduled down-time on evenings and weekends. As companies adopt a 24-by-7 approach to operations, there are fewer and fewer opportunities in the year for scheduled outages. Some companies get stuck paying maintenance after their warranty expires, because they were not able to move the data off in time.
Want to take advantage of new Solid-State Drives? Most admins don't have time to figure out which applications, workloads or indexes would benefit most from this new technology. Let your storage hypervisor's automated tiering do this for you! In fact, a storage hypervisor can gather enough performance and usage statistics to characterize your workload in advance, so you can predict whether solid-state drives are right for you, and how much benefit you would get from them.
Want more time spent on strategic projects? A storage hypervisor allows any server to connect to any storage. This eliminates the time wasted determining when and how, and lets you focus on the what and why of your more strategic transformational projects.
If this all sounds familiar, it is because these are similar to the benefits one gets from a server hypervisor: better utilization of CPU resources, optimized management and administration time, and the agility and flexibility to deploy new technologies and decommission older ones.
"Server virtualization is a fairly easy concept to understand: Add a layer of software that allows processing capability to work across multiple operating environments. It drives both efficiency and performance because it puts to good use resources that would otherwise sit idle.
Storage virtualization is a different animal. It doesn't free up capacity that you didn't know you had. Rather, it allows existing storage resources to be combined and reconfigured to more closely match shifting data requirements. It's a subtle distinction, but one that makes a lot of difference between what many enterprises expect to gain from the technology and what it actually delivers."
Jon Toigo on his DrunkenData blog brings back the sanity with his post [Once More Into the Fray]. Here is an excerpt:
"What enables me to turn off certain value-add functionality is that it is smarter and more efficient to do these functions at a storage hypervisor layer, where services can be deployed and made available to all disk, not to just one stand bearing a vendor’s three letter acronym on its bezel. Doesn’t that make sense?
I think of an abstraction layer. We abstract away software components from commodity hardware components so that we can be more flexible in the delivery of services provided by software rather than isolating their functionality on specific hardware boxes. The latter creates islands of functionality, increasing the number of widgets that must be managed and requiring the constant inflation of the labor force required to manage an ever expanding kit. This is true for servers, for networks and for storage.
Can we please get past the BS discussion of what qualifies as a hypervisor in some guy’s opinion and instead focus on how we are going to deal with the reality of cutting budgets by 20% while increasing service levels by 10%. That, my friends, is the real challenge of our times."
Did you miss out on last Friday's Live Chat? We are doing it again this Friday, covering parts I and II of Ron's posts, so please join the conversation! The virtual dialogue on this topic will continue in another [Live Chat] on September 30, 2011 from 12 noon to 1pm Eastern Time.
Wrapping up my week's theme of storage optimization, I thought I would help clarify the confusion between data reduction and storage efficiency. I have seen many articles and blog posts that either use these two terms interchangeably, as if they were synonyms for each other, or as if one is merely a subset of the other.
Data Reduction is LOSSY
By "Lossy", I mean that reducing data is an irreversible process. Details are lost, but insight is gained. In his paper, [Data Reduction Techniques", Rajana Agarwal defines this simply:
"Data reduction techniques are applied where the goal is to aggregate or amalgamate the information contained in large data sets into manageable (smaller) information nuggets."
Data reduction has been around since the 18th century.
Take for example this histogram from [SearchSoftwareQuality.com]. We have taken ninety individual student scores and reduced them down to just five numbers: the counts in each range. This provides for easier comprehension and comparison with other distributions.
The process is lossy. I cannot determine or re-create an individual student's score from these five histogram values.
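A minimal sketch of that reduction in Python (the ninety scores here are randomly generated stand-ins, not real data):

    import random

    random.seed(42)
    scores = [random.randint(50, 99) for _ in range(90)]   # ninety student scores

    # Reduce ninety values to five: the count in each 10-point range
    counts = [0] * 5
    for s in scores:
        counts[(s - 50) // 10] += 1

    print(counts)   # five bin counts: 50-59, 60-69, 70-79, 80-89, 90-99
    # Lossy: no individual score can be recovered from the five counts.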
This next example, courtesy of [Michael Hardy], represents another form of data reduction known as ["linear regression analysis"]. The idea is to take a large set of data points of two variables, x along the horizontal axis and y along the vertical, and find the line that best fits them. The data is thus reduced from many points down to just two numbers, the slope (a) and intercept (b), resulting in the equation y=ax+b.
The process is lossy. I cannot determine or re-create any original data point from this slope and intercept equation.
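The same reduction as a quick sketch, with made-up points (numpy's least-squares polyfit stands in here for whatever fitting method you prefer):

    import numpy as np

    # Fabricated sample points: y roughly follows 2x + 1, plus noise
    x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
    y = np.array([3.1, 4.9, 7.2, 9.1, 10.8, 13.2, 15.1, 16.8])

    a, b = np.polyfit(x, y, 1)          # degree-1 fit returns slope, intercept
    print("y = %.2fx + %.2f" % (a, b))  # many points reduced to two numbers

    # Lossy: none of the original points can be re-created from (a, b).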
This last example, from [Yahoo Finance], reduces millions of stock trades to a single point per day, typically the closing price, to show the overall growth trend over the course of the past year.
The process is lossy. Even if I knew the low, high and closing price of a particular stock on a particular day, I would not be able to determine or re-create the actual price paid for individual trades that occurred.
Storage Efficiency is LOSSLESS
By contrast, there are many IT methods that can be used to store data in ways that are more efficient, without losing any of the fine detail. Here are some examples:
Thin Provisioning: Instead of storing 30GB of data on 100GB of disk capacity, you store it on 30GB of capacity. All of the data is still there, just none of the wasteful empty space.
Space-efficient Copy: Instead of copying every block of data from source to destination, you copy over only those blocks that have changed since the copy began. The blocks not copied are still available on the source volume, so there is no need to duplicate this data.
Archiving and Space Management: Data can be moved out of production databases and stored elsewhere on disk or tape. Enough XML metadata is carried along so that there is no loss in the fine detail of what each row and column represent.
Data Deduplication: The idea is simple. Find large chunks of data that contain the same exact information as an existing chunk already stored, and merely set a pointer to avoid storing the duplicate copy. This can be done in-line as data is written, or as a post-process task when things are otherwise slow and idle.
When data deduplication first came out, some lawyers were concerned that this was a "lossy" approach, that somehow documents were coming back without some of their original contents. How else can you explain storing 25PB of data on only 1PB of disk?
(In some countries, companies must retain data in its original file format, as there is concern that converting business documents to PDF or HTML would lose some critical "metadata" information such as modification dates, authorship information, underlying formulae, and so on.)
Well, the concern applies only to those data deduplication methods that rely solely on a calculated hash code or fingerprint, such as EMC Centera or EMC Data Domain. If the hash code of new incoming data matches the hash code of existing data, the new data is discarded and assumed to be identical. Collisions are rare; I have only read of a few occurrences of unique data being discarded in the past five years. To ensure full integrity, the IBM ProtecTIER data deduplication solution and IBM N series disk systems chose instead to do full byte-for-byte comparisons.
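To illustrate the difference, here is a toy chunk-level deduplicator in Python. The hash-only path trusts the fingerprint alone; the verify path adds the byte-for-byte comparison described above. The chunk size and the choice of SHA-256 are assumptions for the sketch, not how any particular product works:

    import hashlib

    CHUNK = 64 * 1024        # assumed fixed 64KB chunk size
    store = {}               # fingerprint -> stored chunk bytes

    def dedupe(data, verify=True):
        """Split data into chunks, store each unique chunk once,
        and return the list of fingerprints referencing them."""
        refs = []
        for i in range(0, len(data), CHUNK):
            chunk = data[i:i + CHUNK]
            fp = hashlib.sha256(chunk).hexdigest()
            if fp in store:
                # Hash-only dedupe stops here; an (extremely rare) collision
                # would silently discard unique data. A byte-for-byte check
                # catches it, at the cost of re-reading the stored chunk.
                if verify and store[fp] != chunk:
                    raise ValueError("hash collision: must store both chunks")
            else:
                store[fp] = chunk
            refs.append(fp)
        return refs

    refs = dedupe(b"hello world" * 100000)    # duplicate-heavy sample data
    print(len(refs), "chunk references,", len(store), "unique chunks stored")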
Compression: There are both lossy and lossless compression techniques. The lossless Lempel-Ziv algorithm is the basis for the LTO-DC algorithm used in IBM's Linear Tape Open [LTO] tape drives, the Streaming Lossless Data Compression (SLDC) algorithm used in IBM's [Enterprise-class TS1130] tape drives, and the Adaptive Lossless Data Compression (ALDC) algorithm used by the IBM Information Archive for its disk pool collections.
Last month, IBM announced that it was [acquiring Storwize]. Its Random Access Compression Engine (RACE) is also a lossless compression algorithm based on Lempel-Ziv. As servers write files, Storwize compresses them and passes them on to the destination NAS device. When files are read back, Storwize retrieves and decompresses the data to its original form.
As with tape, the savings from compression can vary, typically from 20 to 80 percent. In other words, 10TB of primary data could take up anywhere from 2TB to 8TB of physical space. To estimate what savings you might achieve for your mix of data types, try out the free [Storwize Predictive Modeling Tool].
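You can get a feel for how widely the ratio swings with any Lempel-Ziv-family compressor. This snippet uses Python's zlib (DEFLATE, an LZ77 derivative) purely as a stand-in; it is not the RACE algorithm itself:

    import os
    import zlib

    samples = {
        "repetitive text": b"the quick brown fox " * 5000,
        "random data": os.urandom(100000),    # models already-compressed data
    }

    for name, data in samples.items():
        packed = zlib.compress(data, 6)
        saved = 100.0 * (1 - len(packed) / len(data))
        print("%-16s %7d -> %7d bytes (%.0f%% saved)"
              % (name, len(data), len(packed), saved))

The repetitive sample compresses away almost entirely, while the random sample barely shrinks at all, which is exactly why modeling against your own data mix is more useful than any quoted average.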
So why am I making a distinction on terminology here?
Data reduction is already a well-known concept among specific industries, like High-Performance Computing (HPC) and Business Analytics. IBM has the largest marketshare in supercomputers that do data reduction for all kinds of use cases, for scientific research, weather prediction, financial projections, and decision support systems. IBM has also recently acquired a lot of companies related to Business Analytics, such as Cognos, SPSS, CoreMetrics and Unica Corp. These use data reduction on large amounts of business and marketing data to help drive new sources of revenues, provide insight for new products and services, create more focused advertising campaigns, and help understand the marketplace better.
There are certainly enough methods of reducing the quantity of storage capacity consumed, like thin provisioning, data deduplication and compression, to warrant an "umbrella term" that refers to all of them generically. I would prefer we do not "overload" the existing phrase "data reduction" but rather come up with a new phrase, such as "storage efficiency" or "capacity optimization" to refer to this category of features.
IBM is certainly quite involved in both data reduction as well as storage efficiency. If any of my readers can suggest a better phrase, please comment below.
IBM is doing a bit of year-end housekeeping. The Storage Community (storagecommunity.org) will be discontinued as of January 1, 2017.
IBM will continue to host a community for all of its followers and contributors to share insights on the latest trends in storage at [ibm.co/StorageSolutions].
All of the most recent IBM content from storagecommunity.org will now be available at this new domain. IBM hopes that you will continue to engage in its community of storage industry thought leaders.
If you would like to contribute to the new community, please [register here]. Simply click the silhouette icon in the top right-hand corner of the page and select "register." Input your email address and create a password, then sign in. You will receive an email from IBM with further instructions to get you set up.
IBM's twitter handle (@SmarterStorage) will also be sunset as of January 1, 2017, but I encourage you to follow @IBMStorage, or my own twitter handle @az990tony, for the latest storage news and announcements from IBM.
Rich Bourdeau has written a nice article on InfoStor titled [Software as a Service (SaaS) meets Storage]. Last year, IBM acquired Arsenal Digital, and he mentions both in this article. It is interesting how this has evolved over the years.
Rent warehouse space for tapes
I remember when various companies offered remote storage for tapes. These would be temperature- and humidity-controlled rooms, with access lists on who could bring tapes in, who could take tapes out, and so on. In the event of a disaster, someone would collect the appropriate tapes and take them to a recovery site.
Rent online/nearline storage from a Storage Service Provider (SSP)
SSPs rented storage space on disk, or provided automated tape libraries that could be written to, with tapes being ejected and stored in temperature/humidity-controlled vaults. Electronic vaulting eliminates a lot of the issues with cartridge handling and transportation, and is more secure and faster. Rented disk space, based on a gigabytes-per-month rate, could be used for whatever the customer wanted. If it was used for backups or archives, the customer had to have their own software, do their own processing at their own location, send the data to the remote storage as appropriate, and manage their own administration.
Backup-as-a-Service and Archive-as-a-Service
We are now seeing the SaaS model applied to mundane and routine storage management tasks. New providers can offer the software to send backups, the disk to write them to, and, as needed, the tape libraries and cartridges to roll over to when the disk space is full. Disk capacity can be sized so that the most recent backups are immediately accessible for fast recovery.
The same concept can be applied to archives. The key difference between a backup and an archive is that backups are version-based. You might keep three versions of a backup: the most recent, and two older copies in case something is wrong with the most recent one, whether from undetected corruption of the data itself or from problems with the disk or tape media. An archive, on the other hand, is time-based: you want the data kept for a specific period, based on an event or a fixed number of years.
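The distinction is easy to see as policy code. In this sketch (the function names and policy values are mine, purely illustrative), backups expire by version count while archives expire by retention date:

    from datetime import date

    def expire_backups(versions, keep=3):
        """Version-based: versions is a list of (date, copy_id), newest
        first; keep only the N most recent copies."""
        return versions[:keep]

    def expire_archives(archives, today):
        """Time-based: archives is a list of (retain_until, object_id);
        keep everything whose retention date has not yet passed."""
        return [(until, obj) for (until, obj) in archives if until >= today]

    backups = [(date(2008, 9, 3), "b3"), (date(2008, 9, 2), "b2"),
               (date(2008, 9, 1), "b1"), (date(2008, 8, 31), "b0")]
    archives = [(date(2015, 1, 1), "tax-2007"),
                (date(2008, 1, 1), "old-project")]

    print(expire_backups(backups))                       # keeps b3, b2, b1
    print(expire_archives(archives, date(2008, 9, 4)))   # keeps tax-2007 only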
Because BaaS and AaaS providers know what the data is, and have some idea of what the policies and usage patterns will be, they can optimize a storage solution that best meets service level agreements.
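To make the version-based versus time-based distinction concrete, here is a minimal Python sketch. The function names and retention numbers are my own illustration, not any particular vendor's product:

```python
from datetime import date, timedelta

def prune_backups(copies, versions_to_keep=3):
    """Version-based retention: keep only the N most recent backup copies."""
    return sorted(copies, reverse=True)[:versions_to_keep]

def prune_archives(items, retention_years=7, today=None):
    """Time-based retention: keep anything younger than the retention period."""
    today = today or date.today()
    cutoff = today - timedelta(days=365 * retention_years)
    return [d for d in items if d >= cutoff]

# Four weekly backups: only the three most recent survive pruning.
backups = [date(2008, 1, n) for n in (5, 12, 19, 26)]
print(prune_backups(backups))

# Archives are judged by age, not by how many newer copies exist.
archives = [date(2000, 6, 1), date(2005, 6, 1)]
print(prune_archives(archives, today=date(2008, 1, 1)))
```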
Continuing this week's theme on Storage Area Networks, today I thought I would talk about the various terms we use for our equipment.
One area of confusion is the adjectives "entry-level", "midrange" and "enterprise-class". What do these mean? Well, as in the case of disk and tape, these three are relative terms that combine "small, medium, large" with "good, better, best".
Entry-level switches typically have a maximum of 8-16 ports. Ports can connect the switch to a server, a storage device, or another switch. These are sometimes called "edge" switches, as they might be found in the most remote sections of an office campus, remote branches, or other isolated areas outside the primary data center.
Midrange switches typically have a maximum of 32-64 ports. More ports on a single switch means fewer switches (and fewer cables) to manage.
Enterprise-class switches are called "directors" to distinguish them from entry-level and midrange offerings. Directors have a maximum of 140-528 ports, and because so many devices or switches can be connected to them, they need to be extremely reliable. Directors are designed for 24x7 operation, with the ability to make most upgrades and configuration changes while the box is running (often referred to as "non-disruptive upgrades"). Availability is typically better than "five nines", or 99.999 percent, which means the box will be up 99.999 percent of the time, or conversely, down less than 5 minutes per year.
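If you want to check that "less than 5 minutes" figure yourself, the arithmetic is a one-liner:

```python
# Back-of-the-envelope check of the "five nines" downtime figure.
availability = 0.99999
minutes_per_year = 365.25 * 24 * 60            # ~525,960 minutes
downtime = minutes_per_year * (1 - availability)
print(f"{downtime:.2f} minutes of downtime per year")   # ~5.26 minutes
```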
If you are asking yourself "which size is right for my company?" or "is my company big enough for a director?" you are asking the wrong questions! Instead, determine a SAN configuration that meets your workload, and then decide on the components for that design.
McDATA coined the phrase "core/edge" design, which is considered a best practice throughout the industry today. A good write-up can be found at SearchStorage.com. Basically, you put your big beefy "core" directors in the center of the room, surround them with midrange switches, and connect those to "edge" switches, which in turn connect to the servers and storage near them. As you grow, this design can easily scale with you.
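To see why this scales, here is a toy Python model of a core/edge fabric. All the port counts and uplink numbers below are made up for illustration, not vendor specifications:

```python
# Toy core/edge model: each edge switch spends a few ports on uplinks
# (inter-switch links) into the core directors, and the rest on devices.
CORE_PORTS_PER_DIRECTOR = 256
EDGE_PORTS_PER_SWITCH = 32
UPLINKS_PER_EDGE = 4

def device_ports(directors, edge_switches):
    core_ports = directors * CORE_PORTS_PER_DIRECTOR
    uplinks_needed = edge_switches * UPLINKS_PER_EDGE
    if uplinks_needed > core_ports:
        raise ValueError("not enough core ports for the edge uplinks")
    return edge_switches * (EDGE_PORTS_PER_SWITCH - UPLINKS_PER_EDGE)

# Growing from 4 to 16 edge switches quadruples device capacity
# without touching the two core directors in the center of the room.
print(device_ports(directors=2, edge_switches=4))    # 112
print(device_ports(directors=2, edge_switches=16))   # 448
```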
So, if you need help implementing a SAN for the first time, or upgrading the one you have, call IBM; we can help!
I would fall into the "not for me" category, at least at this time. The iPhone is a GSM-capable phone with the ability to store 4GB or 8GB of music, photos and video, and it incorporates a 2 megapixel camera. Currently, I have separate components:
A cell phone that is GSM plus CDMA, with features like "speakerphone", which I use quite a lot, but NO camera.
A 7 megapixel camera, also very small, with removable memory cards.
A 60GB iPod, with music and photos. My model is older and doesn't handle videos.
Since I visit government agencies, research and development labs, and other places that don't allow cameras, I have to either choose a cell phone that does not have camera capability, or have a camera phone that I leave behind in the car or at the front desk. I have chosen to get cell phones with NO camera. So, NOT having a camera is a primary feature I look for, but this is getting harder and harder these days. I don't know if Apple plans to offer a non-camera version of the iPhone, but without one, the built-in camera is a deal-breaker for me.
I do carry a separate camera, and where it is permissible, use it separately. This is especially useful if you do a lot of whiteboard or flipchart presentations, and want to capture what you have written for later. (For a great example of how effectively whiteboards can be used, check out these videos from UPS.) A picture is worth a thousand words, and it is easier to convey an idea with pictures, especially in countries where people may not speak English. Last month, I got a 7 megapixel camera to replace my 5 megapixel one. For my work, the 2 megapixels found in the iPhone are not detailed enough.
As for my iPod, I enjoy that I can carry 60GB of music and photos. When I go on vacations, I can bring my camera and iPod, and connect the two, transferring and viewing the pictures that I take. I can easily free up 5-10 GB of space on my iPod for photos in preparation for a trip, then replace that with music when I am back at home. I also use my iPod as a remote disk drive for my laptop on business trips. Again, the 4GB and 8GB may not be enough for what I need.
Printers were never converged into Personal Computers, but they did have their own convergence. I have a multi-function printer/scanner/fax machine. I used to have separate printer, scanner and fax machines, but now the technology is so inexpensive that it got all combined into one solution.
The same is happening for Storage Area Networking gear.
Thanks to Fibre Channel, switches and directors can handle both SCSI commands (FCP) and CCW commands (FICON). This allows the mainframe and distributed systems to converge their traffic onto a single network, which is less expensive than trying to maintain one network for the mainframes and another for the distributed platforms.
On the SCSI side, there are now switches that let you have pluggable ports of different flavors. For example, some ports can be Fibre Channel to receive FCP, and other ports can be Ethernet to carry iSCSI. iSCSI is a protocol co-developed by IBM and Cisco to carry SCSI commands over Ethernet. Since most computers already have Ethernet "network interface cards" and most buildings are already wired with an Ethernet infrastructure, this provides a less expensive alternative to Fibre Channel.
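The layering idea is easy to show in code. In this toy sketch, a plain SCSI READ(10) command is just ten bytes; an iSCSI initiator ships those same bytes inside a PDU over an ordinary TCP connection (the well-known iSCSI port is 3260). This only illustrates the encapsulation concept; a real initiator adds a 48-byte PDU header with sequence numbers, task tags, and so on:

```python
import struct

def read10_cdb(lba: int, blocks: int) -> bytes:
    """Build a 10-byte SCSI READ(10) command descriptor block."""
    # opcode 0x28, flags, 4-byte logical block address,
    # group number, 2-byte transfer length, control byte
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, blocks, 0)

cdb = read10_cdb(lba=2048, blocks=8)
print(cdb.hex())   # 28000000080000000800

# With Fibre Channel, these same ten bytes ride inside an FCP frame;
# with iSCSI, they ride inside a PDU on a TCP stream to port 3260.
```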
Routers, and combination router/switches, can send all the FCP/FICON/iSCSI traffic over long distances to remote data centers, using either the iFCP or FCIP protocols. This is a less expensive alternative to laying your own private "dark fiber" between the two locations, which often involves negotiating access rights to dig trenches through other people's property.
Which brings me back to Apple's iPhone. One device can make calls, play video, and download webpages, all because the networks have converged on sending all data in "packets". The network just routes packets from one place to another; it doesn't care whether a packet carries voice, video or a webpage.
"Users can pay for groceries and other purchases by swiping a phone over a reader that electronically communicates with a microchip on the phone. Phone owners confirm the purchase with the push of a button and the deal is complete.
The platform is the result of many years of trials around the world and will enable mobile contactless payments, remote payments, person-to-person payments, and mobile coupons."
Continuing on my theme of storage area networking, today I thought I would cover storage networking at home.
Before the PC, corporate end-users had dumb terminals (displays) connected to mainframes (servers) that were then connected to external disk and tape (storage devices). This was all done with direct cable connections, and later through networks. The PC changed this by putting the display, server and storage into one unit, making it more accessible to smaller businesses and individuals.
Many years ago, Microsoft started out with the vision "A PC on every desktop". The primary reason we even have networks is that while everyone might have had their own PC, not everyone had their own printer. (Printers used to be part of IBM's storage division, which we explained as "storage on PAPER"!) Maybe if Microsoft's vision had been "A PC and printer on every desktop", history might have turned out differently.
Disclaimer: IBM has close business relationships with both Apple and Microsoft, among others, providing the chips inside some of their products. I discuss them here not only because I am trying to get you to buy their products, letting IBM benefit indirectly from their success, but also because they are newsworthy and relevant to the topic at hand.
The "Apple TV" is not a TV at all, but rather a server, one that lets your television (your dumb terminal)access the video, audio and photos stored on your Mac or iPod (the storage device), all through a home network.(Sound familiar?)
Bill Gates from Microsoft gave the keynote, probably his last appearance, as he is retiring in 2008 (as we are reminded by this funny video) to move on to bigger and better things. It is perhaps fitting that his retirement aligns with the end of the era of the PC.
Microsoft unveiled their Microsoft Windows Home Server, again a server that connects your television (dumb terminal) with your PC or Zune (storage device), all through your home network. (Sound familiar, again?)
Whereas Apple, as noted above, pretty much shunned the gaming community, Microsoft embraced it with their internet-enabled Xbox 360. Microsoft sold 10.4 million of these last year, 400,000 more than they projected.
Our SAN technology partner Cisco wants to get in on this "home networking craze", as written about in InfoWorld and CNET.
My take on all this: the consumer electronics industry is taking cues from IBM's mainframe business. Not the first time this has happened, and probably not the last.
I already access photos and audio with my TiVo, from both my Mac AND my PC, so not much new here for me. Getting my home network connected was one of my tech highlights of 2006, and organizing my audio content was done with ILM for my iPod.
Bypassing the PC, by having your television, handheld or phone access data directly, will greatly increase the demand for storage from businesses that provide information and content, and for storage networking technology in the home. It will be interesting to see how this all plays out in 2007.