This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems Technical University] events. With over 30 years at IBM Systems, Tony is a frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles during his 19-plus years at IBM. Most recently, he has been leading efforts across the Communication/CSI market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years, Lloyd supported industry accounts as a Storage Solution Architect, and before that as a Storage Software Solutions specialist in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role on the Washington Systems Center team. His current focus is IBM Cloud Private, and he will be delivering and supporting sessions at Think 2019 and Storage Technical University on the value of IBM storage in this high-value solution, part of the IBM Cloud strategy. Lloyd maintains Subject Matter Expert status across the IBM Spectrum Storage software solutions. You can follow Lloyd on Twitter @ldean0558 and on LinkedIn at Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Well, it's Tuesday again, and you know what that means? IBM Announcements!
You might be thinking, didn't IBM just have a [huge storage announcement October 8, 2013]? You would be right! IBM's $1B additional investment in Storage has been like a shot of adrenaline, getting new features and functions out to our clients sooner.
DS8870 Disk System Release 7.2
New IBM POWER7+ controllers. The previous models of DS8870 were based on POWER7 controllers, and these new models have POWER7+ processors. This change enhances performance across the board, from mainframe to distributed systems, from sequential to random. Customers with existing POWER7-based models will be able to do an MES upgrade to the new POWER7+ next year.
For comparison with older DS8000 models, here are some internal IBM measurements we took for database workloads on both z/OS (mainframe) and distributed systems, with a typical 70% read, 30% write and 50% cache hit ratio:
(Table: IBM internal measurements, in thousands of IOPS, for z/OS and distributed database workloads.)
New 1.2TB (10K RPM) and 4TB (7200 RPM) self-encrypting enterprise drives (SED). This is a 33 percent capacity boost over the 900GB and 3TB drives previously available. As with all the other drives in the DS8870, these new drives include the encryption chip right on the drive itself, offering encryption with scalability.
Improved security. Release 7.2 will support the U.S. National Institute of Standards and Technology [NIST.gov] 800-131A specification, raising the 96-bit encryption to the required 112 bits on the customer IP network. This involves updates to the security firmware, management software and digital signatures on code loads.
Metro Mirror enhancement for System z. By avoiding serial conflicts of updated blocks, this enhancement can boost performance up to 100 percent when using Metro Mirror with z/OS applications on System z mainframes.
Easy Tier™ reporting and graphs to determine optimal mix. Now you can see for yourself how sub-LUN automated tiering is helping your applications.
Easy Tier Workload Categorization
New workload visuals help clients and IBM technical specialists compare activity across tiers, within and across pools, to determine the optimal drive mix for current workloads
Easy Tier Data Movement Daily Report
New Easy Tier summary report, generated every 24 hours, illustrates data migration activity (at 5-minute intervals) to help visualize migration types and patterns for current workloads
Easy Tier Workload Skew Curve
Shows the skew of all workloads across the system in a graph to help clients visualize and accurately tier configurations when adding capacity or a new system. Clients can also import this data into Disk Magic
All-Flash Optimization. Yesterday, in my post [IBM FlashSystem versus EMC XtremeIO], I mentioned that any hybrid systems like the IBM Storwize V7000 that can support a mix of SSD and HDD can obviously be configured as SSD-only. Apparently, that was not obvious to many readers, so I apologize. For the DS8870, you can configure an all-Flash (SSD only) configuration, and Release 7.2 added some optimization when configured with SSD only.
Compare 1,056 drives (15K RPM, 146GB each) in RAID-10 against 224 SSDs (400GB each) in RAID-5. Both provide the same 72 TB of usable capacity, but the all-Flash configuration is:
70 percent faster
33 percent less floor space required
62 percent less energy consumed
(Note: Performance results based on measurements and projections using IBM benchmarks in a controlled environment.)
OpenStack™ support. The DS8870 now offers the [OpenStack Cinder] interface for block LUN allocations in OpenStack environments. IBM is a Platinum sponsor of OpenStack, and OpenStack is the strategic platform for IBM private and hybrid clouds.
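For readers curious what block LUN allocation looks like from the Cinder consumer side, below is a minimal sketch using the openstacksdk Python library. The cloud name and volume details are hypothetical placeholders; the DS8870 back-end itself would be configured separately by the cloud administrator.

```python
# Minimal sketch: allocating a block LUN through the OpenStack Cinder API
# with the openstacksdk library. The cloud name "mycloud" is a placeholder
# for a clouds.yaml entry; the DS8870 back-end is configured by the admin.
import openstack

conn = openstack.connect(cloud="mycloud")

# Ask Cinder for a 100 GB volume; the scheduler places it on a back-end
volume = conn.block_storage.create_volume(size=100, name="db-lun-01")
conn.block_storage.wait_for_status(volume, status="available")
print(volume.id, volume.status)
```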
XIV Storage System
Following on the heels of the [XIV enhancements announced], IBM has now added 800GB Solid State Drives (SSD) as Read cache for its 4TB drive-based models.
DCS3860 Disk System
The DCS3860 is the next generation of the DCS3700 disk system. Designed with Linux-x86 servers in mind, the system offers direct SAS host attachment, 24GB of cache, and 60 drives in a compact 4U drawer. Like its predecessor, it stores drives on five pull-out trays, with twelve hot-swappable drives per tray. You can add up to five more expansion units, with 60 drives each, for a total of 360 drives in 24U of rack space.
These new models will help our clients deploy new workloads and consolidate existing workloads.
Well, it's Tuesday again, and you know what that means? IBM Announcements!
Today I want to write about the latest IBM Flash Storage for the IBM DS8880.
IBM supports three types of Flash drives in the DS8880 storage servers. The first is the High-Performance Flash (HPF) drive, and the second is the High-Capacity Flash (HCF) drive; both of these reside in the High Performance Flash Enclosure (HPFE) for low-latency access. The third is the regular Solid-State Drive (SSD), which can be intermixed with the 10K and 15K drives attached via Device Adapter (DA) loop pairs.
Last week, IBM announced a new smaller high-capacity drive size, the 1.92TB High-Capacity flash drive. This flash drive is available for the DS8882F or DS8884/F models.
The expected use case is clients seeking smaller, high-performing solutions below the 20 to 40 TB capacity level that still provide the full enterprise capabilities of the IBM DS8882 and DS8884/F storage solutions. One example is a small configuration that keeps certain sensitive data in specific countries to comply with government regulations.
The new drive is fully supported as part of the DS8880 Easy Tier solution as a Tier-2 drive, because the performance numbers on the 1.92TB HCF drives are expected to be similar to the 7.6/15.3 TB HCF drives.
This week, I am attending the [InterConnect Conference] in Las Vegas, Feb 21-25, 2016. This is IBM's premier Cloud & Mobile conference for the year.
The last day of the conference had fewer people; many stayed for the Elton John concert and then left. I am glad to be one of the few who squeezed every last bit of learning out of the money it cost my employer to send me here.
2419A Enhance the Agility of Your Cloud with IBM FlashSystem
Kristy Ortega and Shaluka Perera, IBM FlashSystem Solutions team, presented. Cloud Service Providers (CSP) and Managed Service Providers (MSP) are leveraging flash technology for a variety of reasons:
To meet Service Level Agreements (SLAs)
To handle unpredictable workloads
To minimize noisy neighbor interference
To offer premium performance as an up-sell feature
To be able to scale faster to meet incoming requests
To reduce server count
To keep customers delighted and reduce customer churn
To offer data-rich features without sacrificing performance
Kristy gave three practical client use cases:
IP-Only -- an MSP in the Nordic countries, employed IBM FlashSystem and Storwize V5000. They achieved five times VMware density on their servers and 300 percent improved application performance. Nearly all of the cost of the new storage hardware was offset by the savings in VMware license costs!
Cageka -- an MSP in Europe, employed IBM FlashSystem and SAN Volume Controller. They achieved 66 percent reduced SAP ERP response time, 97 percent reduction in floorspace, and 95 percent reduced power and cooling costs.
COCC -- formerly the Connecticut On-Line Computer Center, a CSP for bank and credit unions, employed IBM FlashSystem with IBM POWER servers. They achieved 10x faster OLTP transaction processing times, 80 percent reduction in power and cooling costs. The payback period for this was less than 3 months!
IBM sells SAN switches featuring Brocade Gen5 "Fabric Vision" technology, and resells Cisco MDS switches like the 9396S model. Both of these have been enhanced to handle the lower latency and higher throughput that IBM FlashSystem provides.
IBM Data Engine for NoSQL employs Redis with the Coherent Accelerator Processor Interface (CAPI), which allows POWER8 servers to connect directly to IBM FlashSystem as an extension of memory rather than as bus-attached external storage. This reduces the code path length to read/write IBM FlashSystem by 97 percent, resulting in solutions that use one-sixth the rack space at one-third the cost. This solution reduces CPU core requirements by 20-30 cores for every 1M IOPS of workload!
Spectrum Scale supports IBM FlashSystem in a variety of configurations. First, IBM FlashSystem can serve as a high-speed cache when Spectrum Scale virtualizes other NFS storage devices. Second, IBM FlashSystem can serve as a low-latency storage pool to direct new or hot data to. Third, Spectrum Scale can separate its metadata from the content of files and objects, putting the metadata on IBM FlashSystem. This greatly improves searching through directory structures or for specific object attributes.
Last year, IBM, Hewlett-Packard, and VMware launched Project Capstone to "leave no application behind". They made a concerted effort to make sure that all relevant applications that run on bare metal can also run on VMware hypervisor. IBM FlashSystem has support for VMware features, including VAAI, VASA, and VVols.
IBM has partnered with Atlantis ILIO to offer in-line data deduplication for Virtual Desktop Infrastructure (VDI). A single 2U IBM FlashSystem can support 5,000 users and 10,000 virtual desktops, running at 382 IOPS per desktop.
Lastly, healthcare provider TriZetto has used IBM FlashSystem to reduce OPEX by 90 percent, shrinking from a 20U disk system array to a 2U IBM FlashSystem device.
4331A Leverage zOS and Cloud Storage for Backup/Archive Efficiency and Cost Reduction
Eddie Lin, IBM Senior Technical Staff Member for DS8000 development team, presented this technology preview. Taking advantage of cloud storage is not limited to the distributed storage world alone. The ability to connect existing archive and backup solutions in z/OS to on-premise object storage platforms provides huge efficiency gains, enabling clients to do more during their critical batch windows.
IBM is integrating cloud gateway software into its DS8870 and DS8880 Enterprise Disk Systems, in conjunction with DFSMShsm and DFSMSdss, for a complete end-to-end solution to optimize this space. A live demonstration of this capability was included in the session.
This solution uses the Storage-as-the-Storage-Cloud methodology I mentioned in my session yesterday. The DS8000 is the #1 storage provider for mainframe environments. Eddie explained the current, inefficient process of moving cold data to tape using 37-year-old DFSMShsm functionality.
A new approach moves data directly from DS8870 storage systems to object storage, either on-premises or off-premises. This eliminates MIPS used for data movement, and reduces the record-keeping normally done by DFSMShsm. z/OS data sets migrated to the Cloud will continue to be designated as MIGRAT in the ICF Catalog, and recall times from the Cloud are similar to those from tape.
There will also be options for DFSMSdss to invoke the function. However, you will need to supply in the DFSMSdss command parameters all of the information needed to connect to the Cloud that would normally be handled by DFSMShsm.
To make this all happen, you will need a certain level of DFSMS, and a certain level of DS8000 firmware. No new hardware is required, as it uses 1GbE Ethernet ports that already exist in DS8870 and DS8880 models. If you still have DS8100, DS8300, DS8700 or DS8800 models, now is a good time to start upgrade!
Internal tests on a 5GB data set compared MIPS consumption: DFSMShsm used 0.127 CPU, versus only 0.068 CPU with the new "Transparent Cloud Storage Tiering" method, a 46 percent reduction in MIPS. DFSMShsm is often the #2 biggest consumer of MIPS (DB2 is #1), so any reduction here is a big deal.
IBM plans to support Spectrum Scale, Cleversafe, IBM SoftLayer, Amazon S3, Rackspace, and Microsoft Azure. Full encryption of data-in-flight is included, with keys managed using IBM SKLM. This capability will be fully supported by z/OS security products (RACF, Top Secret, etc.) and z/OS audit logging.
Eddie wrapped up with a live demo.
7341A IBM Storage and Catalogic: Software Defined Solutions for Hybrid Cloud and DevOps
Third party Catalogic ECX software supports IBM, NetApp and EMC storage devices. I was hoping to hear how it works specifically with IBM storage models, but instead the speaker explained why Copy Data Management (CDM) was helpful for Bi-Modal environments.
Basically, copies of data taken to protect production data sit idle until needed. With Copy Data Management, the copies are available to development and test personnel. While traditional production IT operations are like Marathon runners, the new DevOps is like short-distance sprinters, needing to be agile in developing and testing new applications. Having ready access to copies of production data can speed this process.
4921A Radical Storage Simplicity for Your Cloud and How it Can Impact Your Customers
Diane Benjuya and Yafit Sami, both from IBM, presented IBM Spectrum Accelerate, the software "de-coupled" from traditional XIV hardware.
The XIV grid architecture automatically distributes data, eliminates hot-spots, and provides enterprise-class features like thin provisioning, VMware support, snapshots and remote mirroring. Its "Distributed RAID-10" capability can rebuild after the loss of a 6TB disk drive in less than an hour.
Spectrum Accelerate has nearly the same set of features, minus Microsoft Hyper-V integration, FCP host access support, VMware vSphere v6 VVol support, Real-time Compression, and Encryption. Spectrum Accelerate adds a feature not available to XIV called Hyperconvergence. This allows application Virtual Machines to run on the same servers used for Spectrum Accelerate. Spectrum Accelerate can run on-premises on customer-choice hardware, or in the Cloud, such as IBM SoftLayer.
In response to complaints that IBM XIV was a single-frame storage array, IBM introduced Hyper-Scale, a series of features that allow up to 144 XIV Gen3 frames as a single system. With the introduction of Spectrum Accelerate, Hyper-Scale Manager can now manage any combination of XIV Gen3 and Spectrum Accelerate clusters, on-premises or off-premises, up to 144 total.
Hyper-Scale Mobility can migrate volumes from one XIV to another without the need for external virtualization such as IBM SAN Volume Controller. For iSCSI volumes, Hyper-Scale Mobility can migrate data between XIV and Spectrum Accelerate, or from one Spectrum Accelerate cluster to another, on-premises or off-premises.
Hyper-Scale Consistency allows snapshots to be taken of a group of volumes across multiple XIV frames. Now, snapshots can be taken of a group of volumes across both XIV and Spectrum Accelerate clusters.
Remote Mirroring is fully supported. You can replicate data from XIV to Spectrum Accelerate, Spectrum Accelerate to XIV, or from one Spectrum Accelerate cluster to another.
The IBM XIV Mobile Dashboard for Apple and Android phones can support any mix of XIV and Spectrum Accelerate clusters. This includes monitoring your environment, as well as push notifications.
IBM has also introduced flexible licensing options. With newly purchased XIV boxes and Spectrum Accelerate, you can choose to buy the software license as "perpetual", allowing you to move it to new hardware when your old hardware kicks the bucket. This license can be moved to new XIV hardware, or to a Spectrum Accelerate cluster deployment.
For Spectrum Accelerate, an additional license option is "monthly", allowing you to elastically add or reduce the amount of storage you manage, either on-premises or off-premises.
Like the idea of Spectrum Accelerate but don't want to build it yourself? Third party SuperMicro offers hardware pre-certified and pre-installed with Spectrum Accelerate. You license Spectrum Accelerate directly from IBM, and SuperMicro will take care of the rest.
Spectrum Accelerate is a component of the Spectrum Storage suite, which offers a single flat per-TB price covering all six Spectrum Storage products.
Want to try IBM Spectrum Accelerate yourself? Here are three options:
Free 90-day trial with self-destruct. After 90 days, the code stops working. You can download this and try it out.
90-day evaluation copy. Your authorized IBM Seller works with you to install, and if you like it, you buy it after 90 days to continue to use it.
Special promotion before June 30, 2016 -- Purchase IBM Spectrum Accelerate for production, and your first 20TB are free. No strings attached.
IBM's Silverpop uses IBM Spectrum Accelerate to deploy their marketing analytics solution. They can spin up a new customer with 250TB of capacity in 24-48 hours on IBM SoftLayer, and they found they use half as many storage admins managing storage with IBM Spectrum Accelerate as with their previous method.
Well, that's the end of the conference. I have to go back and submit all of my survey responses, which I should have done every day all along, but was too busy writing blog posts!
The presentations are also now available for download for those who attended the conference. (Go to Session Preview on the IBM InterConnect attendee website and hit the Download Presentation button)
This week, I am attending the [InterConnect Conference] in Las Vegas, Feb 21-25, 2016. This is IBM's premier Cloud & Mobile conference for the year.
Wednesday morning I attended more break-out sessions.
1273: New IBM DS8880 Family: Always-On Data at Cloud Speed
Brian Sherman (with support from Eddie Lin) explored the business value that the IBM System Storage DS8000 series provides to organizations requiring ultimate performance and availability.
Brian reviewed the DS8000 advanced functions, including those that have recently become available, and explained what benefits they provide. While he focused on the latest DS8880 family, some of these were also available on the prior DS8870 models.
Cloud-related features include OpenStack Cinder drivers, REST interfaces, Mobile app monitoring, zKVM and PowerVC support, use of IBM Spectrum Control Base, VMware VAAI primitives, SRA and Web-admin plugin support.
3015A Open Doors with an OpenStack Approach
Mohammed "Mo" Abdula, IBM, presented this overview of IBM's involvement with OpenStack, including BlueBox, which provides a private on-premises OpenStack deployment.
Most enterprises know that a single approach to cloud adoption, whether public or private, will not optimize business results. Connecting one or more clouds to traditional systems, or other clouds, is a realistic and achievable strategy.
OpenStack, being an open technology, is making it easy for enterprises to customize the way they deploy mission-critical business applications.
Code, Community and Culture enable innovation - Cloud should hide the details so that people can focus on what is important. OpenStack is opening the doors for enterprises to quickly get on the Cloud journey.
The automotive industry heavily uses OpenStack. Mo gave an example of a successful promotion by a car dealer that generated great sales revenue through social media. The app was developed on IBM SoftLayer, then moved on-premises; OpenStack interfaces made that possible.
7186A IBM Spectrum Storage Experiences
Douglas O'Flaherty, IBM, served as emcee for this exciting discussion. Three clients presented their success stories with various Spectrum Storage software. Each speaker had 20 minutes to present their story.
Paul Rafferty, IBM Silverpop
Silverpop was a startup providing marketing automation, empowering marketers with cloud-based capabilities and cutting-edge big data analytics that deliver personalized customer engagements that scale for any sized business. It was acquired by IBM in 2014, but Paul presented as a client of IBM Spectrum Accelerate.
To support clients, Silverpop does everything in the Cloud. With their acquisition by IBM, they have switched to using IBM SoftLayer. To that end, they needed robust storage that provides snapshots, consistency groups, and remote disk-to-disk replication, so they selected bare-metal servers running with IBM Spectrum Accelerate, which is the software-only implementation of XIV storage systems.
Silverpop deploys Spectrum Accelerate on either 7-node or 15-node clusters, with an additional spare-node pre-configured in case of failure. Each node is a 2U x86 server with dual 8-core Intel Xeon E5-2650 processors, 128 GB RAM, two 800GB Solid-State Drives (SSD) and 10 SATA drives 4TB capacity each. The 7-node provides about 120TB of usable capacity, and the 15-node about 255 TB.
Worldwide, Silverpop has 1,500 nodes deployed across 10 IBM SoftLayer datacenters, running 15,000 virtual machines. The virtual machines run on the same nodes as Spectrum Accelerate, including Oracle database, DB2 database, HDFS file system, and Spark analytics. They use Chef and UrbanCode for orchestration and code deployment.
If you ask 10 different Spectrum Protect architects how to design a system, you get a wide variety of answers. Blueprints reduce this complexity down to three "T-shirt" sizes based on the amount of backup traffic per day: Small for deployments of less than 6 TB per day, Medium for 6-20 TB per day, and Large for over 20 TB per day.
The blueprints can be deployed on Windows, Linux-x86, Linux on POWER, and AIX. They are disk-based storage pools using either IBM Storwize family or Elastic Storage Server models. The blueprints include configuration scripts that can be customized, and Joe suggested tips for those who want to incorporate tape storage pools.
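To make the T-shirt sizing rule concrete, here is a tiny, hypothetical Python helper that maps daily backup traffic onto the three blueprint sizes described above:

```python
# Hypothetical helper mirroring the Spectrum Protect Blueprint "T-shirt"
# sizing rule described above (thresholds in TB of backup traffic per day).
def blueprint_size(tb_per_day: float) -> str:
    if tb_per_day < 6:
        return "Small"
    if tb_per_day <= 20:
        return "Medium"
    return "Large"

print(blueprint_size(4))    # Small
print(blueprint_size(12))   # Medium
print(blueprint_size(25))   # Large
```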
Bob Oesterlin, Nuance
Nuance develops the Nuance Dragon® voice-recognition dictation software. They process 7,500 TB per day, with 85% read and 15% write traffic, and they have 6 PB of Spectrum Scale file system.
To free up space and reduce costs, Nuance stood up their own OpenStack Swift object store on storage-rich servers. Files that turned cold were moved out of Spectrum Scale and into this object store, which has now grown to over 4 PB of capacity. Unfortunately, there was no way for end-users who had files on Spectrum Scale to find them after they were moved to the object store.
IBM has solved this with Transparent Cloud Storage Tiering, which is currently in open beta. With this new approach, files are "migrated" from Spectrum Scale to Cleversafe object-store, but a stub is left behind in the file system directory so that they can be "recalled" back to Spectrum Scale. This is the same methodology IBM uses to migrate/recall data to tape.
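To illustrate the migrate/recall idea, here is a toy Python sketch of stub-based tiering. This is purely illustrative, assuming a local directory standing in for the object store; it is not how Spectrum Scale or Transparent Cloud Storage Tiering is actually implemented.

```python
# Toy sketch of stub-based migrate/recall (illustrative only -- not the
# actual Spectrum Scale / Transparent Cloud Storage Tiering code).
# A local directory stands in for the object store.
import json
import os
import shutil

STUB_MARKER = "#STUB#"

def migrate(path, object_store_dir):
    """Copy the file body to the 'object store', leave a stub behind."""
    key = os.path.basename(path)
    shutil.copy(path, os.path.join(object_store_dir, key))
    with open(path, "w") as f:
        f.write(STUB_MARKER + json.dumps({"key": key}))

def recall(path, object_store_dir):
    """If the file is a stub, pull its body back from the 'object store'."""
    with open(path) as f:
        head = f.read()
    if head.startswith(STUB_MARKER):
        key = json.loads(head[len(STUB_MARKER):])["key"]
        shutil.copy(os.path.join(object_store_dir, key), path)
```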
I would vote this the best session I have seen all week! Each client solved real-world business problems with Spectrum Storage software.
To encourage traffic through the Solutions EXPO, foot traffic was re-directed through the booths to get to lunch. This reminds me of having to go through the "gift shop" when you leave amusement rides or museums.
This week, I am attending the [InterConnect Conference] in Las Vegas, Feb 21-25, 2016. This is IBM's premier Cloud & Mobile conference for the year.
Here is my recap of the lunch-time sessions Wednesday afternoon.
5663A Beyond Hyperconvergence to a Hyperscale Converged Infrastructure
Bernard "Bernie" Spang, IBM, presented. Organizations continue to face challenges with efficiently managing unprecedented volumes and varieties of data. Meanwhile, new frameworks such as Spark and Hadoop are emerging to efficiently exploit that data. These offerings have the potential to deliver significant benefits, but they can also increase data center complexity and cluster sprawl.
Bernie covered the evolution of Hyperconvergence to a Hyperscale converged technology. By extending software-defined infrastructure concepts to a converged application- and data-optimized fabric, IBM is enabling organizations to reduce costs and accelerate time to insight by efficiently storing, analyzing and protecting their data.
Hyperconvergence is the concept of running hypervisor software on storage-rich servers. Software-only versions include IBM Spectrum Accelerate and VMware VSAN, whereas pre-built systems are available from Nutanix, Simplivity and others.
But not everything is x86 or Hypervisor based. Some applications are better served on bare metal, while others might be better served on containers like Docker or LXC. IBM Spectrum Scale provides for all of these additional platforms, works on both x86 and POWER systems, and can handle storage tiering from flash to disk to tape. It can work across locations, representing any mix of on-premises and off-premises facilities.
1841A IBM Cloud Storage Options
I was pleased to have a standing-room only crowd attend my session!
The term "Cloud Storage" can be misleading. I spell out four unique types of storage:
Ephemeral Storage - storage that exists only as long as the Virtual Machine using it is running. This is ideal for boot volumes and temporary work space.
Persistent Storage - typically block/transactional/high-speed storage that continues to live beyond the life of the Virtual Machine.
Hosted Storage - files, documents and backup copies that are read/written in the Cloud
Reference Storage - files and objects that are written once, and never modified thereafter, such as archives, financial records, and photographs. Since the term Write-Once-Read-Many (WORM) applies only to tape and optical media, the IT Industry now uses Non-Erasable-Non-Rewriteable (NENR) to include flash and disk media protected in some manner through software to avoid tampering.
The first two I refer to as "Storage for the Computer Cloud" and the latter two I refer to as "Storage as the Storage Cloud".
I also discussed the differences between block, file and object access, and why different Cloud storage types use different access methods.
I wrapped up the session covering the various IBM storage solutions that we offer for all four Cloud Storage types.
This week, I am attending the [InterConnect Conference] in Las Vegas, Feb 21-25, 2016. This is IBM's premier Cloud & Mobile conference for the year.
Here is my recap of the sessions Wednesday afternoon.
1013A Trends in Encryption of Data at Rest: On-Premise and in the Cloud
Rick Robinson and Walid Rjaibi, both from IBM, co-presented. As the storage of data across seamless on-premise, mobile, and cloud systems platforms becomes ubiquitous, the data must be protected regardless of its location through the use of encryption, and that means centralized key management.
How has industry adopted encryption, especially in the cloud? What applications have adopted centralized key management in the cloud? What are the standards?
There are two types of encryption: symmetric and asymmetric. Symmetric algorithms like AES or 3DES use the same key for both encryption and decryption. They are faster and designed for large amounts of data, but the symmetric key must be kept private and secure.
Asymmetric algorithms like RSA, ECC and Diffie-Hellman use two keys: a public key for encryption, and a private key for decryption. They are slower and intended for smaller amounts of data. However, you can freely share the public key with anyone, publish it on your website, or print it on your business cards, because it cannot be used to decrypt any data!
Don't let the size of the key fool you. AES 256-bit has more security strength than RSA-2048 or ECC-384.
Initial implementations used Electronic Codebook (ECB) mode, which encrypts each block of data independently. Two identical plain-text blocks are encrypted to identical cipher-text blocks. Good for deduplication, but bad for security, as hackers love to find patterns.
To solve this, Cipher Block Chaining (CBC) uses a bit of the previous block to randomize the data, so that even identical plain-text blocks encrypt to different results. This is like making sourdough bread: a piece of yesterday's dough is saved and used to rise today's loaf. To get the sequence started, you need an "Initialization Vector", which is either randomly generated or a "nonce" (short for a number-only-used-once).
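A short sketch with the Python cryptography library makes the difference visible. This is a generic illustration of ECB versus CBC, not any particular IBM product's implementation:

```python
# ECB leaks patterns; CBC hides them. Generic demo with AES-256.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)
block = b"identical block!"          # one 16-byte plain-text block
plaintext = block * 2                # the same block, twice

# ECB: identical plain-text blocks yield identical cipher-text blocks
ecb = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
ct_ecb = ecb.update(plaintext) + ecb.finalize()
print(ct_ecb[:16] == ct_ecb[16:])    # True -- the pattern leaks

# CBC: the IV plus chaining randomizes each block
iv = os.urandom(16)                  # the "Initialization Vector"
cbc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
ct_cbc = cbc.update(plaintext) + cbc.finalize()
print(ct_cbc[:16] == ct_cbc[16:])    # False
```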
For handshake sessions, the TLS protocol generates a symmetric key that both the sender and recipient will use for bulk data transfer. The sender uses the receiver's public key to send the symmetric key securely, and the receiver uses the sender's public key to acknowledge. Once the handshake is complete, both sender and receiver use the shared symmetric key to transfer the rest of the data.
This notion of wrapping the Symmetric key with an Asymmetric key is also used on tape and disk. The Symmetric key is often randomly assigned per disk drive or tape cartridge, and the Asymmetric key is referred to as the Key-Encrypting-Key (KEK) or "Master Key".
(The best way to explain this is a Real Estate agent that shows different houses to prospective buyers. Rather than having the agent carry 50 different house keys, she carries a single "master key". At each house, there is a locked box hanging on the door knob that can be opened with the master key, and inside this box is the key that opens that particular house.)
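Here is a minimal Python sketch of that key-wrapping pattern, using the cryptography library. The KEK is generated locally for the demo; in practice the master key would live in a key manager or HSM, as discussed below:

```python
# Minimal sketch of "envelope" key wrapping: a symmetric data key protects
# the data, and an asymmetric "master key" (the KEK) protects the data key.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Per-cartridge/per-drive symmetric data key (the per-house key)
data_key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)               # a number-used-once, as described above
ciphertext = AESGCM(data_key).encrypt(nonce, b"data at rest", None)

# The Key-Encrypting-Key pair; the public half wraps the data key
kek = rsa.generate_private_key(public_exponent=65537, key_size=2048)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = kek.public_key().encrypt(data_key, oaep)

# Only the holder of the private master key can unwrap and decrypt
unwrapped = kek.decrypt(wrapped_key, oaep)
assert AESGCM(unwrapped).decrypt(nonce, ciphertext, None) == b"data at rest"
```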
The other challenge to encryption is managing the keys. If you lose the key, you lose access to the data. If the keys are divulged to the wrong parties, you may need to re-encrypt your data to avoid inadvertent exposure. Master keys can be rotated every 90 days, just like passwords.
Where do you store your keys? There are several options:
Public Key Cryptography Standard (PKCS) #12 -- defines a method to store keys in a password-protected file, such as on a USB thumb drive. IBM GSKit is available to assist with this.
Enterprise Key Manager (EKM) refers to a set of software packages that manage and distribute encryption keys. IBM Security Key Lifecycle Manager (SKLM), Safenet KeySecure, and Thales EKM are three popular examples.
Hardware Security Module (HSM) is hardware designed to securely store keys. IBM z13 Crypto and Safenet Luna are two examples.
Cloud-KMS are key management systems for Cloud providers. IBM Key Protect, Amazon Web Services KMS, and Microsoft Azure Key Vault are three examples.
In a survey done by Thales, the statistics are scary: Only 36 percent of companies have consistent encryption policy. Nearly half (49 percent) of companies use encryption, but inconsistently across their organization. The remaining 15 percent have no encryption strategy whatsoever.
Here is what IBM offers for zSystems, as well as Linux, UNIX and Windows (collectively referred to as LUW):
For zSystems data-at-rest: Enterprise Key Management Foundation (EKMF)
For z and LUW data-at-rest: IBM Security Key Lifecycle Manager (SKLM), Guardium Data Encryption (GDE), and IBM Key Protect (backed by a Safenet Luna HSM)
3318A System of Systems Transformation at the Boeing Company
Thomas Kelley and Mahendra Velchuru, both from Boeing, co-presented. The Boeing Company celebrates its 100th year in business in 2016. Over this time, Boeing has traced the history of computing systems within the industry and has utilized IBM as a strategic partner for many decades.
Boeing found themselves with a large inventory of computing systems and technologies required to support their business and drive innovation. As they begin their second century, they are launching several critical systems modernizations and technology initiatives to maintain their role as the world's leading aerospace provider.
(While other rooms at this conference packed 80 people in a room with only 50 chairs, this session was scheduled in a room that could hold two Boeing 747 airplanes and hundreds of chairs.)
Over the years, Boeing transitioned from Remote Procedure Call (RPC), to Common Object Request Broker Architecture (CORBA), to Integration Brokers, to an Enterprise Service Bus (ESB) Service Oriented Architecture (SOA).
(At this point, I could have gotten up and left the room, as obviously the "Systems" referred to in the title were not referring to IBM Systems, like server, network or storage systems, as I had anticipated. However, I decided to stay and learn more.)
Boeing explained their "Six Pillars" of SOA transformation: starting with a Maturity Assessment of where they were, then a four-year transformation roadmap, adopting a Bi-Modal SOA method, and adopting the right level of SOA Governance to keep it running correctly.
2154A Expert Panel on Hybrid Cloud Data Protection: Who Is the Service Provider?
David "Greg" Van Hise, IBM, served as emcee for this expert panel. Experts on our panel perform over five million backups per month. Who better to ask about what's new in cloud data protection? The experts were:
Richard Spurlock, Cobalt Iron -- which provides Cloud Backup for Business Data Protection
Thomas Bak, Front-safe A/S -- a third party that provides Backup-as-a-Service using IBM Spectrum Protect
Daniel Witteveen, IBM Resiliency Services for Cloud Managed Backup -- Formerly known as SmartCloud Managed Backup (SCMB), this is IBM's version of Backup-as-a-Service, also using IBM Spectrum Protect
This session was for people interested in enhancing their own backup capability or understanding how cloud providers can deliver data protection services. The panel offered new insights on how hybrid solutions can help you take advantage of the cloud without losing sight of your data. IBM Spectrum Protect can help you keep pace with the flexibility, improved service levels and low cost available from cloud backup providers.
The evening wrapped up with a 2-hour concert by Sir Elton John! There were 23,000 attendees at this conference, but the MGM Grand Garden Arena only holds 16,800 people, so the rest were directed to MGM's Hakkasan Night Club. Next to my hotel at the Monte Carlo, they are constructing a new "Las Vegas Arena" that will hold 20,000 people for events such as these.
This week, I am attending the [InterConnect Conference] in Las Vegas, Feb 21-25, 2016. This is IBM's premier Cloud & Mobile conference for the year.
4955A IBM and Box: Delivering Hybrid Solutions for Enterprise Content Management
Rich Howarth, IBM VP of Enterprise Content Management, and Rand Wacker, Vice President at Box, co-presented this session on the [IBM-and-Box partnership]. The partnership integrates content management, social and analytics products with the Box cloud content management offering, enabling enterprise customers to deploy hybrid solutions that leverage the best of their existing on-premise technologies along with new cloud technologies.
IBM and Box are partnering to re-imagine content management, case management and governance in the cloud. For example, IBM StoredIQ, which scans various data sources to find documents and evidence needed to defend against lawsuits, can now be run against files uploaded to Box.
On a personal note, the IBM Tucson Executive Briefing Center where I work now uses Box to upload presentation files that are then sent to the client attendees.
6524A The Role of Tape in a Cloud-Based World for Economical and Secure Data Retention
This was a 50/50 session. The first half was presented by Shawn Brume, IBM, who covered the Linear Tape File System (LTFS) and IBM Spectrum Archive.
Like the cloud, tape has made great strides -- evolving independently in capacity, durability and data access capability while maintaining its economic benefits. As a result, today's tape is just as well suited to cloud service providers as it is to the enterprises and midsize organizations that rely on it to support their production and data protection strategies.
If a cloud service provider does not use tape, the provider and its customers are almost guaranteed to experience higher long-term costs than necessary, and a disk-only MSP model puts their oldest and most compliance-sensitive data at risk. See how incorporating tape into your storage strategy can reduce costs and improve MSP margins.
How does tape compare to disk for Cloud providers? A [Zettabyte] of data would cost $41 billion USD per year on disk, but only $8 billion USD per year on tape. Powering a Zettabyte of data requires 1.2 Gigawatts for disk, but only 300 Megawatts for tape.
For access to files that require a tape mount, an access time to first byte averages 45 seconds, with a worst case around 75 seconds. After that, tape can stream data as fast as the Internet can deliver it, so performance is not an issue beyond first byte access.
The second half was presented by Michael Piltoff, from value-added reseller Champion Solutions Group, covering their latest product, EchoLeaf. It runs on Windows or Linux, attaches to any IBM tape library, and exports the files on those cartridges as NFS or CIFS/SMB.
In other words, the entire library appears as a single mount point or drive letter, and each tape cartridge appears as a sub-directory. This uses IBM Spectrum Archive Library Edition under the covers.
4759A Cloud Storage Success: MSPs and Enterprises Reveal their Secrets
How do you distinguish fact from fiction when it comes to claims made by vendors about storage for cloud? Eric Herzog, IBM Vice President Marketing for IBM Storage Systems, served as emcee for a panel of experts using IBM Storage solutions across different industries for their Hybrid Cloud deployments.
The panel shared their experiences using various technologies to get the most out of their private and hybrid clouds, discussed how they are building out their next-gen data centers to cope with today's business needs, and talked about how they are using flash and software-defined storage to position themselves for future success.
On the panel were:
Richard Spurlock, Cobalt Iron, using PB of storage on Spectrum Scale and Cleversafe
Paul Rafferty, IBM Silverpop, using Spectrum Accelerate with different Cloud providers
Johnny Oldenburg, Tieto Sweden AB, using SVC, Storwize V7000 and FlashSystem
Keith Dobbins, Time Warner Cable/Navisite, over 30 fully-populated XIV storage systems
Here were some of the nuggets of wisdom:
Eliminate the debate between private or public cloud. Consider everything to be a unique shade of Hybrid Cloud.
Get the network right; in the Cloud, all data and management control flows through the network.
Take an "Outside-In" approach, focusing on the business problems being solved, rather than trying to exploit specific technologies.
Workloads are unpredictable in the Cloud, and clouds can sometimes be unreliable in responding to workload changes. Partner with vendors like IBM to provide the support and scalability to handle the unexpected.
Ensure that you comply with government and industry regulations. For example, Payment Card Industry Data Security Standard [PCI-DSS] for credit card transactions.
Use VMware Storage vMotion and VVols to migrate data from one Cloud to another.
Software-defined networking (SDN) and software-defined storage (SDS) greatly automate the provisioning process, pushing many storage admin tasks down to NOC personnel.
Use tools like Spectrum Control to provide a single-pane-of-glass management of your entire environment.
Build abstraction layers at touch points to avoid being impacted by external changes, and use documented reference architectures to ensure success.
Educate your clients and end-users on what is possible, and what is probable, in the Cloud.
Use "Flash Cache" technologies, such as IBM XIV, Oracle, Spectrum Scale, and VMware.
Analytics can help with "data rationalization", which identifies the business value of the data.
Object Store is a first-class citizen and should seriously be considered for new projects.
5467A My Data is Out of Control! Managing the Lifecycle of Your Data with "Big Storage" Cloud Archive
Jeff Karmiol and Quaid Nasir, both from IBM, presented a technology preview of a deep archive to be launched later this year.
A staggering 80 percent of data is never touched after 90 days of capture or creation. However, the data may need to be kept for business, compliance or regulatory reasons.
"Big Storage" offers cloud storage for customers who need to store large amounts of data and retrieve it on-demand at the lowest cost possible. This easy-to-use cloud service provides fast retrieval times with affordable, transparent pricing and retrieval rates.
This service uses standard OpenStack Swift and POSIX interfaces so you don't need to learn any new APIs. Files and objects remain visible while archived, making it easy and affordable to continue to extract business value from your archived data.
This deep archive is located in a secure, IBM-managed data center. How deep? The facility is 350 feet under a mountain, which allows the tape cartridges to be kept at constant humidity and a 40-degree Fahrenheit temperature.
Multiple resiliency and data protection options will be available. The data can be part of a global namespace, with some data on premises, connected to data migrated to the archive. Data movement can be either manually-initiated or policy-managed.
7256: Blogging 301: The Art of Opinion
"Turbo" Todd Watson and I started blogging 10 years ago, and we have both been ranked in the top-10 bloggers for IBM. He presented a series covering the basics of blogging. This session was a deeper dive into best writing practices and structures for being confident, engaging, and convincing in their writing.
Here are some of his bits of wisdom:
Base your opinions on facts and well-researched information.
Educate your readers, without being "preachy"
Generate interest and enthusiasm, and encourage readers to participate
Don't equivocate: pick a position or side of a debate and stick with it
Leave your reader with the next logical step, a call to action, or pointer to additional information
This week, I am attending the [InterConnect Conference] in Las Vegas, Feb 21-25, 2016. This is IBM's premier Cloud & Mobile conference for the year.
Monday morning I attended the General session and a break-out session.
7030A General Session Day 1: Digital Business Transformation
The General Session was kicked off by several clients:
Richard Holmes, Westpac Group, a 200-year-old bank with 21,000 branch locations across Australia and New Zealand. They have migrated 70 percent of their applications to the Hybrid Cloud. Provisioning server and storage resources went from 84 days to just minutes.
Matthias Rebellius, Siemens AG, Building Services. They use IBM Watson IoT to monitor the energy usage of their buildings. They have reduced energy consumption 20 to 30 percent, eliminating over 10 million cubic feet of CO2 greenhouse gas.
Robert LeBlanc, IBM Senior Vice President for Cloud, took the stage and welcomed the 23,000 attendees. Developers are turning to IBM Cloud to deliver timely, knowledgeable, and secure experiences for their customers and end-users. Business leaders are seeking new ways to enable their companies to securely implement hybrid cloud strategies that integrate mobile, IoT, and cognitive. He focused on five areas:
Choice, but with Consistency
Hybrid Cloud Integration
Powerful, Accessible data and analytics
Robert indicated that 100 percent of IBM's strategic software products are now Hybrid-Cloud enabled. IBM handles over 3.2 billion API calls per month and adds 20,000 new IBM Bluemix users per month. More than 7,000 startups are now running on IBM SoftLayer. IBM was once again ranked #1 for Hybrid Cloud by industry analysts.
IBM predicts that 80 percent of Internet traffic will be video by year 2019. To that end, IBM offers Aspera, Ustream, and Cloudleap.
New IBM Watson APIs can analyze "tone", "emotion" and "vision".
IBM has partnered with GitHub to offer an Enterprise-class GitHub-as-a-Service offering suitable for business use.
IBM "Open for Data" has over 150 pre-populated public data sources for use with analytics. This allows applications to analyze their own data in context with public sources.
Carl Eschenbach, VMware, emphasized VMware's partnership with IBM, announcing the ability to run VMware on IBM SoftLayer "bare metal" systems, enabling features like NSX networking and VSAN virtual storage.
Brian Cross, Apple Vice President of Product Marketing, presented the enormity of Apple's developer ecosystem:
1.5 million apps on Apple iOS application store
11 million developers making these apps
100 billion downloads of these apps
1 billion Apple devices
In the past, these developers used Xcode development environment. To take the most advantage of Apple hardware features, many developers use C or C++ programming languages to develop "Native Apps".
Apple developed a new programming language called Swift that has already made it into the top 20 development languages. He gave a demo of "Swift Playground", which allows developers to see their apps running while they develop and edit the code.
Apple has made Swift open source, and extended its use across iOS, Mac OS X, Watch OS, tvOS and even Linux operating systems. This means you can write code for devices, client workstations and even servers in your datacenter or Cloud. Download it at [Swift.org].
John Ponzo, IBM Fellow, Vice President and CTO of MobileFirst, wrapped up the General Session. He mentioned the "IBM Swift Sandbox" service that helps developers learn Swift programming, along with two related offerings:
Kitura -- This open source framework lets developers build, deploy, and collaborate on end-to-end web services and applications written in Swift. By using Swift for both front-end and back-end code, Kitura helps simplify modern application development.
OpenWhisk -- A feature on IBM Bluemix that provides an event-driven computing service for dynamic applications. It competes against Amazon's Lambda service.
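As a flavor of what event-driven computing looks like on OpenWhisk, here is a minimal Python action. The action name and parameters are hypothetical; OpenWhisk calls main() with the event parameters and expects a JSON-serializable dict back:

```python
# hello.py -- a minimal OpenWhisk action in Python (hypothetical example).
# OpenWhisk invokes main() with the invocation parameters as a dict and
# expects a JSON-serializable dict as the result.
def main(params):
    name = params.get("name", "world")
    return {"greeting": "Hello, " + name + "!"}
```

Such an action would be deployed with `wsk action create hello hello.py` and invoked with `wsk action invoke hello --param name Tony --result`.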
With new ways to deploy Hybrid Cloud, using new composable development tools, it is clear that "Cloud" is not merely a destination, but a new innovation platform.
1581A University of Chicago Taps into IBM Cloud Object Storage for More Effective Patient Treatments
This session was 30 minutes of client testimonial from Piers Nash, University of Chicago - Center for Data Intensive Science (CDIS), followed by Russell Kennedy, IBM, who covered an overview of the Cleversafe technology used in the solution.
University of Chicago's Center for Data Intensive Science (CDIS) accelerates medical discoveries by democratizing access to data for scientific research. Utilizing an object storage solution, CDIS centrally stores and manages vast amounts of genomic and clinical data at web-scale, allowing researchers to collaborate via shared access to harmonized data sets, speeding discovery and enabling precision medicine.
Their initial focus is cancer research. Cancer costs over $100 billion USD per year in healthcare costs. It is the #1 killer among people under 85 years old, affecting half of all men and a third of all women. There are 1.7 million new cancer cases in the USA every year, and 15 million worldwide.
There is no "single cure" for cancer. Whereas all humans share nearly identical 3.2 billion base pairs of genetic material, there are over 15,000 different kinds of cancers, each with its own genome. Capturing RNA sequences of patients results in images 10-20 GB in size, and over the course of treatment could add up to 1 TB of image data per patient. A million patients with 1TB of data each would be an Exabyte of data (1,000 Petabytes).
To store all of this data, CDIS created the Bionimbus Protected Data Cloud, using Cleversafe as the underlying storage technology. This system goes live June 2016, and they plan to keep the data forever.
(We'll see how well that goes 10 years from now! It might be cheaper just to re-sequence a human's DNA as needed, rather than storing it forever, since an individual's DNA never changes.)
The data is "de-identified" meaning that researches using the data are unable to identify individual people associated with each case study or genomic result. They have already collected 1.66 PB of this data.
Most cancer treatments that have been effective have focused on specific genetics. The problem is targeting precise therapies to the right patients. For example, there are two very similar lung cancers, and about 20 percent of the time a lung cancer is mis-identified, such that the patient has an adverse reaction to the wrong treatment. With more analytics-based medicine, the hope is to reduce this trial-and-error approach.
Russ Kennedy, IBM, wrapped up the session explaining Cleversafe, a Chicago-based company formed in 2004 that was acquired last year by IBM. Why did University of Chicago choose Cleversafe? Several reasons:
University of Chicago's attempts to use open source projects like Gluster or Ceph failed around the 1-2 PB mark. They knew they would need much more than this!
Cleversafe was a Chicago-based company, offering local support
IDC ranked Cleversafe the #1 marketshare leader in object storage for 2014 and 2015! It beats out competitors like Dell/EMC and Scality, as well as Cloud Service Providers like Amazon or Google.
Why object storage? IBM predicts a 332 percent growth in data generated from Mobile devices. As much as 90 percent of traffic on Mobile devices will be from Cloud apps rather than voice or text messages. There will be a 10-fold increase of data stored by year 2020, and at least 80 percent of this data will be unstructured content. Cleversafe estimates that managing object storage requires 15x fewer administrators than traditional storage.
Cleversafe consists of three components. The "Accessor" is software that runs on bare metal, as a Virtual Machine, or as a Docker container. It offers the OpenStack Swift, HTTP/REST and Amazon S3 object-based interfaces to ingest data. The data is encrypted, divided into pieces, and then converted to slices through a process called [Erasure Coding]. Those slices are stored on storage-rich servers called "storage nodes".
For example, five pieces of data converted to nine slices could be stored on nine machines: three at Site 1, three at Site 2, and three at Site 3. You only need to read back any five slices to reconstruct the data, so you could lose any four of the nine machines and still have full recoverability. In this 5/9 example, you could lose an entire site, plus a machine in one of the two surviving sites, and still retrieve all of your data.
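The arithmetic behind "any five of nine" is classic erasure coding: treat the pieces as points on a polynomial, and any k points determine it. Below is a toy Python sketch of that idea; it is a generic Reed-Solomon-style illustration, not Cleversafe's actual Information Dispersal code:

```python
# Toy sketch of the k-of-n "slicing" idea (a generic Reed-Solomon-style
# illustration, NOT Cleversafe's actual Information Dispersal code).
# k=5 data pieces become n=9 slices; ANY 5 slices rebuild the data.
import random

P = 2**31 - 1  # a prime; all arithmetic happens modulo P

def interp_at(points, x0):
    """Evaluate the unique degree < len(points) polynomial passing
    through `points` at x0, via Lagrange interpolation mod P."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # den^-1 mod P
    return total

k, n = 5, 9
data = [72, 101, 108, 108, 111]        # the five data "pieces"
anchors = list(enumerate(data))        # the data sits at x = 0..4

# Slice x is the polynomial through the data, evaluated at x; the first
# five slices equal the data itself (a "systematic" code).
slices = [(x, interp_at(anchors, x)) for x in range(n)]

survivors = random.sample(slices, k)   # lose ANY four of the nine slices
recovered = [interp_at(survivors, x) for x in range(k)]
assert recovered == data
```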
There is now an "open beta" called the Transparent Cloud Storage Tiering that bridges GPFS and Spectrum Scale over to Cleversafe.
I wrapped up the morning with a lunch at Border Grill with storage clients and IBM Business Partners. This was the best steak I have had this week!
This week, I am attending the [InterConnect Conference] in Las Vegas, Feb 21-25, 2016. This is IBM's premier Cloud & Mobile conference for the year.
Monday afternoon, I attended various break-out sessions.
1441A Data Resiliency: Data-Driven Analytics and Beyond
Ramani Routray and B.J. Klingenberg, both from IBM, co-presented. Aggressive and differentiated Recovery Point Objectives (RPOs) and Recovery Time Objectives (RTOs) create data protection silos. Resiliency for an enterprise data center is often achieved via redundant components, periodic backup, continuous replication and/or highly available architectures. With the emergence of cloud delivery models, Backup-as-a-Service and DR-as-a-Service have gained wide acceptance. This uniquely challenges service providers to quickly analyze all the metadata from these environments to enable problem determination, fault isolation, capacity management, detection of SLA violations, and so on. We learned about a big data analytics framework that analyzes millions of resiliency metadata tuples in near real-time to generate actionable insights.
1267A Prudential and IBM: Integrating Application and Storage Management to Drive Cloud Service Levels
This was a 50/50 presentation, with the first half covered by client OJ Dua, supported by his boss, Scott Singerline, both from Prudential Financial.
Prudential explored their successful approach for optimizing storage and improving service. The Prudential Financial experts described their experiences integrating IBM Spectrum Control v5.2 (formerly IBM Tivoli Storage Productivity Center) inventory, availability, and performance data with Tivoli Application Dependency Discovery Manager (TADDM) and Netcool OMNIbus to improve services for core business applications.
(Over 10 years ago, I was the chief architect for IBM TotalStorage Productivity Center v1. The clients from Prudential could not emphasize enough how much better Spectrum Control v5.2 was compared to their experiences with the prior versions. It has come a long way, baby!)
The second half was covered by Brian Sherman, IBM Distinguished Engineer. He described how related IBM Spectrum Storage solutions are transforming storage. IBM Spectrum Storage solutions deliver reliable, flexible service levels at a significantly lower cost than traditional storage.
6523A VersaStack: Because Time and Cost are of the Essence for Cloud Service Providers
This was more of a 25/75 presentation. Ian Shave, IBM Business Line Executive for Spectrum Virtualize and VersaStack, kicked off the session with a quick overview of VersaStack, which combines Cisco UCS x86 blade servers and Cisco network switches with IBM Spectrum Virtualize storage solutions. This is often referred to as "Integrated Infrastructure" or "Converged Systems". While overall Integrated Infrastructure adoption is growing at 15 percent, storage within Integrated Infrastructure solutions is growing faster, at 44 percent.
VersaStack can be implemented as follows:
Cisco UCS Mini with Storwize V5000, either iSCSI or FCP
Cisco UCS with Storwize V7000 (block-only) or V7000 Unified (file and block access)
Cisco UCS with FlashSystem V9000, for high-speed, low-latency application requirements
John Buskermolen and Dan Simunic, both from i-Virtualize, covered their experiences with VersaStack. Founded in 2009, i-Virtualize is a Managed Services Provider (MSP), Cloud Service Provider (CSP) and value-added reseller for clients in both the USA and Canada, growing 41 percent year over year.
They reduced time to market from weeks to days, cut new environment provisioning time from days to minutes, and simplified management when they implemented VersaStack, an integrated infrastructure solution that combines Cisco UCS Integrated Infrastructure with IBM storage solutions built with IBM Spectrum Virtualize to deliver extraordinary levels of performance and efficiency.
Why did i-Virtualize choose VersaStack?
79 percent reduced provisioning time
60 percent lower costs
10x performance acceleration
Higher flexibility, with clustered systems that scale up and out
Lets i-Virtualize administrators and management sleep at night
47 percent capacity savings with Real-time Compression
IBM Spectrum Virtualize HyperSwap for high availability
Storage-based replication across multiple datacenters
Cisco UCS Director provides single-pane-of-glass management
Their latest project is called VIXO, a Cloud Managed Services Console which stacks Cloud Foundry, Docker, OpenStack, VMware and other 3rd party services on top of their VersaStack. This is a collaboration with Oxbury Group.
VersaStack is an ideal solution for Cloud Service Providers (CSP) or for any client interested in "cloud-in-a-box."
3690A Meet the Experts on IBM Cloud Storage Services
Ann Corrao and Mike Fork, both from IBM, presented IBM's various storage capabilities on SoftLayer and Cloud Managed Services (CMS). Of IBM's 43 Cloud datacenters, 28 are SoftLayer, and the other 15 are CMS.
For block-based volume storage, SoftLayer offers "Endurance" and "Performance". These are backed by multi-pathed iSCSI volumes.
With "Endurance" option, you purchase a fixed I/O density, either 0.5 IOPS/GB, 1 IOPS/GB or 4 IOPS/GB. If you choose a 100 GB volume, you are guaranteed 400 IOPS. Typical business applications like database or email consume about 0.7 IOPS/GB.
With the "Performance" option, you pick the IOPS for your volume, up to 6,000 IOPS, and then pick the size to match your needs, say 100 GB. This is best suited for clients who know their application well enough to specify this.
IBM Bluemix also has a block service, based on OpenStack Cinder drivers. These are backed by internal disk on storage-rich servers. IBM SoftLayer can pack 4 drives into a 1U server, 12 drives into a 2U server and 36 drives into a 3U server.
For object store, IBM SoftLayer supports OpenStack Swift. They support content expiration, versioning and metadata search.
(When asked if this was Cleversafe or something else, Mike was quick to point out that IBM SoftLayer focuses on the "Service Level Agreement (SLA), the client experience, and the APIs" so however they chose to back this storage is internally determined. The client should not have to specify product xyz in their contract.)
An extra feature for object store is "Content Delivery Network" (CDN) which uses EdgeCast to cache content at the edges of the network to improve performance delivery. You designate which object containers you want to accelerate performance, and you pay for the amount of bandwidth consumed.
For file space, IBM SoftLayer supports NFS and SFTP only. Supporting CIFS, or rather its replacement SMB, is a known requirement. In the meantime, there are a variety of 3rd party "Cloud Gateway" solutions, like NetApp AltaVault, Panzura global namespace, or CTERA.
For file sync-and-share, IBM has partnered with Box to provide Enterprise-class service.
How do clients ingest data into their IBM SoftLayer account? One option is to use Aspera, a recent IBM acquisition that is 3x faster than traditional SCP. Another option is to ship disk or tape cartridges to an IBM SoftLayer facility.
EMC Corporation (NYSE:EMC) today announced it has been positioned as a leader in the Forrester Wave™: Enterprise Open Systems Virtual Tape Library (VTL), Q1 2008 by Forrester Research, Inc. (January 31, 2008), an independent market and technology research firm. EMC achieved a position as a leader in the Forrester Wave report on virtual tape libraries based on the largest installed base of the EMC® Disk Library family of systems and its broad ecosystem interoperability. Virtual tape libraries emulate tape drives and work in conjunction with existing backup software applications, enabling fast backup and restoration of data by using high-capacity, low-cost disk drives.
EMC was the first major vendor in the open systems virtual tape library market as it introduced the EMC Disk Library in April 2004 and today is a leading provider of open systems virtual tape solutions, with systems that are designed for businesses and organizations of all sizes.
While the press release implies that "EDL equals VTL", Chuck tries to explain they are in fact very different. Here is an excerpt from his blog post:
Virtual Tape Libraries vs. Disk Libraries
As many of you know, VTLs have been around for a while. They use disk as a cache -- they buffer the incoming backup streams, do some housekeeping and stacking, then turn around and write tape efficiently. When you go to restore, you're usually coming back off of tape, unless the backup image in question is sitting in the disk cache.
Now, there is nothing wrong with the VTL approach, but it was conceived in a time when disks were horribly expensive. It was also pretty clear to many of us that disks were going to be a whole lot cheaper in the near future, and this fundamental assumption wouldn't be valid for much longer.
I kept thinking in terms of disk as a direct target for a backup application. No modifications to the backup application. Native speed of sequential disks for both backup and restore. Tape positioned as a backup to the backup. Use the strengths of the underlying array (e.g. CLARiiON) for performance, availability, management, etc.
We ended up calling the concept a "disk library" to differentiate from the VTLs that had come before it. It was a different value proposition and offering, based on the emergence of lower-cost disk media.
... It's nice to see we're at 1,100+ customers, and still going strong.
For those new to the blogosphere, there is a difference between "Press Releases", which are formal corporate communications, versus "Blog Posts", which are informal opinions of the individual blogger that may or may not match exactly the views of their respective employer. As we've learned many times before, one should not treat terms like "first" or "leader" in corporate press releases literally! Let's explore each.
Was EDL the first "open systems" Virtual Tape Library?
This is implied by the Forrester report. Chuck mentions the "VTLs that had come before it" in his blog, and many people are aware that IBM and StorageTek had introduced mainframe-attached VTLs in the 1990s. But what about VTL for "open systems"?
(Hold aside for the moment that the IBM System z mainframe is an open system itself, with z/OS certified as a bona fide UNIX operating system by the [Open Group] standards body. Most analysts and research firms usually refer only to the non-mainframe versions of UNIX and Windows. Alternative definitions for "open systems" can be found in [Web definitions or Wikipedia]. I will assume Forrester meant non-mainframe servers.)
IBM announced AIX non-mainframe attachment via SCSI connectivity to the IBM 3494 Virtual Tape Server (VTS) on Feb 16, 1999, with general availability on May 28, 1999. That's nearly FIVE YEARS before the April 2004 introduction of EDL. IBM VTS support for Sun Solaris and Microsoft Windows came shortly thereafter in November 2000, and support for HP-UX a bit later in June 2001. One of my 17 patents is for the software inside the IBM 3494 VTS, so like Chuck, I can take some pride in the success of that product.
(I don't remember if StorageTek, which was subsequently acquired by Sun, ever supported non-mainframe operating systems with their Virtual Storage Manager [VSM] offering, but if they did, I am sure it was also before EMC.)
Last week, another EMC blogger, BarryB (aka [the Storage Anarchist]), took me to task in comments on my post [IBM now supports 1TB SATA drives]. He felt that IBM should not claim support, given that the software inside the IBM System Storage N series is developed by NetApp. He compared this to the situation of HP and Sun re-badging the HDS USP-V disk system. If someone else wrote the software, BarryB opines, IBM should not claim credit for it. I tried to explain how IBM provides added value and has full-time employees dedicated to N series development and support, but doubt I have changed his mind.
Why do I bring that up? Because the EMC Disk Library runs OEM software from FalconStor. Basically, EMC is assembling a hardware/software solution with components provided by OEM suppliers. Hmmm? Sound familiar? Talk about the pot calling the kettle black!
If there is a clear winner here, it is FalconStor itself. Perhaps one of the worst kept industry secrets is that FalconStor software is also used in VTL offerings from Sun, Copan, and IBM, the latter embodied as the [IBM TS7520 Virtualization Engine] offering. If you like the concept of an EDL, but prefer instead one-stop shopping from an "information infrastructure" vendor, IBM can offer the TS7520 along with servers, software and services for a complete end-to-end solution.
Can EMC claim to be "a leader" in Virtual Tape Libraries?
During the measured quarter, IBM shipped its 10 millionth LTO-4 tape cartridge to Getty Images, the world's leading creator and distributor of still imagery, footage and multi-media products, as well as a recognized provider of other forms of premium digital content, including music. Getty Images is using the LTO-4 drives as part of a tiered infrastructure of IBM disk and tape solutions that help support the backup needs of their digital imagery;
IBM shipped more than 1,500 Petabytes of tape storage in Q3'07 alone;
During Q3'07, IBM shipped the 10,000th IBM System Storage TS3500 Tape Library. The TS3500 is a highly scalable tape library with support from 1 to 192 tape drives and up to 6,400 cartridge slots for open system, mainframe and virtual tape system attachment.
Let's take a look at the numbers. IBM has sold over 5,400 virtual tape libraries. Sun/STK has sold over 4,000 virtual tape libraries. Both are drastically more than the 1,100 mentioned in Chuck's post. Does IDC recognize EMC in third place? No, EMC chooses instead to declare EDL units as disk arrays (probably to prop up their IDC "Disk Tracker" numbers), so they don't even earn an honorable mention under the virtual tape library category. The IBM and Sun/STK counts do include mainframe-attached models. So, if EMC did call these tape systems instead, they might show up in third place, and as such EMC could claim to be "a leader" in much the same way an athlete can claim to be an "Olympic medalist" for winning the bronze in third place. (If you limit the count to just the FalconStor-based models from IBM, EMC, Sun and Copan, then EMC moves up to first or second, but then press release titles like "EMC a Leader in FalconStor-based non-mainframe Virtual Tape Libraries" can get too confusing.)
Chuck, if you are reading this, I feel you have every right to celebrate your involvement with the EDL. Despite having common software and hardware components, both IBM and EMC can rightfully declare their own unique value-add through their respective VTL offerings. Like the IBM N series, the EMC Disk Library is not diminished by the fact the software was written by someone else. BarryB might disagree.
Hello everyone! I am back, fully well-rested from a wonderful 3-week vacation touring the lovely state of Tennessee. Here's a quick recap:
(FCC Disclosure: I mention various companies and products in this blog post. I have no financial interest in any of them, nor have I received any compensation to mention or endorse them here.)
Our first stop was Lynchburg, TN, home of [Jack Daniel's], America's oldest whiskey distillery. Our tour guide, Ron (who both looked and sounded like [John Goodman]) took us first to see how they burn wood to make charcoal, then the natural water spring which supplies the iron-free water used for the whiskey. We then got a whiff of the mash at various stages of fermentation. Lastly, we had samples of Original No. 7, Gentleman's Jack, and Single Barrel.
(A word of caution: Domestic airlines only allow FIVE LITERS of Bourbon, Whiskey or Rum in your checked luggage. That is only six bottles at the 750ml size, of beverages that are between 24 and 70 percent alcohol by volume [ABV]. Anything above 70 percent is considered too flammable to take on the plane. Excess bottles can be custom packed and shipped, but that can be quite expensive. Nearly everyone we met drove all the way to Tennessee instead of flying, and now I understand why.)
While in the area, we had a nice lunch at [Miss Mary Bobo's], a boarding house turned into a restaurant. They only serve one meal a day at 1pm, by reservation only. We were paired up with eight others and served food "family style" at a large round table with a [Lazy Susan].
Jack Daniel's is not the only attraction in the area. We also visited [Falls Mill], a grist mill that grinds corn, wheat and rye for the other distilleries. Mo and I visited [Prichard's Distillery], where they make Whiskey, Rye and Rum. We highly recommend their molasses-flavored "Sweet Lucy"!
We stopped at the famous historic landmark, the [Chattanooga Choo Choo], which was formerly a train station, and now renovated into a hotel. We asked to see the inside of one of the train cars converted into a hotel room.
Gatlinburg and Pigeon Forge
We stayed in a cabin in the [Smokey Mountains] near Gatlinburg. In addition to pleasant rides through the National park, we also walked around the small town, looking at all the shops and amusements.
The next town over is Pigeon Forge, and driving down the main parkway is like Las Vegas in a slightly alternate universe. One person called it the Redneck Riviera!
We spent two days at Dollywood theme park, named after its founder, famous country singer Dolly Parton. We arrived after 3pm the first day, so they gave us the second day free!
In addition to roller coaster rides, artisan shops and restaurants, we found zip lines! Mo and I put on harness, attached to a pulley, and zipped over roller coasters, trees and rivers throughout Dollywood park. It was a lot of fun!
We also went to Dolly Parton's other attraction: Stampede. This was a dinner show with horses. It was similar to the Excalibur show we saw in Las Vegas last year during the week of Edge 2013 Conference.
On our way from Gatlinburg, we stopped into Knoxville to have lunch with clients. We had a choice to make, we could either drive up into Kentucky and visit the distilleries in the Bourbon trail, or drive straight to Nashville and spend more time there. We opted for Nashville, saving the Bourbon trail for a future trip.
Our final stop was Nashville, known as Music City. Our hotel was on Broadway, walking distance between Vanderbilt University and the [honky-tonks] downtown.
We had purchased advanced tickets for the [Grand Ole Opry]. This is not your typical concert. Instead, you have no idea who will play until just a few days before. The three hour show had about a dozen different musical acts, some famous, some new to the country music scene.
We went to the Johnny Cash Museum. People with ticket stubs from the Grand Ole Opry get in for a discount!
Searching [TripAdvisor] for things to do in Nashville, I found [The Escape Game]. You pay them money to lock you up in a room with a bunch of strangers, and then collectively as a team you need to figure out how to escape by solving puzzles and clues.
Each room has a different theme. First, we tried the "Underground Playground". You know that TV show [Are you Smarter than a Fifth Grader?] Well, in this case, the majority of our so-called team was not, and after 60 minutes the referee told us we had failed and unlocked the door.
We had so much fun that we came back two days later to try a different room. This time we tried "The Heist" which is all about art theft. The strangers we were teamed up with were very motivated to get out of the room in time, and we succeeded, getting out in just 54 minutes!
Mo and I had a great time, but are glad to be back home!
Well, I'm going to take a two week break from blogging. Not because my clarification of storage terminology got me Marc Farley's finger wagging of shame.
No, I'm going on vacation. I'll be going to a third-world country, possibly outside the reaches of cell phones, e-mail and the internet, so I won't be blogging until I get back later this month. Since Clark Hodge has discovered a pattern that I am suspiciously close to massive power failures, I think it best not to tell people exactly where I am going.
So, until I get back, I leave you with a nice piece from Kirby at Storage Sanity who has discovered that IBMers are very nice.
Over the past ten years, my co-workers have asked to write a "guest post" on this blog. This time, Moshe Weiss, IBM Senior Manager, Development and Design, has offered the following post, not in his own voice, but in the voice of his "baby", the Hyper-Scale Manager software.
You might think this is a strange approach, but today we have robots that can dance, and cars that can drive themselves! If software could talk, this is what IBM Hyper-Scale Manager would say:
"I was born a year ago.
It wasn't an easy birth… there were many complications. In fact, so many, that I was almost prematurely born!
Most of my development, in preparation for labor and delivery, was done within the last 6 months of the overall 18 months. I was shaped and designed, and sometimes re-shaped, three times. Lots of assumptions had to be made in hopes of easing a successful delivery and bringing me to full term.
During my first year of maturity, I focused on learning how customers used me; what frustrated them the most, and what they loved or 'almost' loved, while still needing refinement and redesign.
The number of customers adopting me grew higher and higher, as did the number of complaints and bugs that I had to deal with, and my users’ frustrations and dislikes because I wasn't yet a complete solution and still had some missing features.
I was renewed four times! Each renewal improved me, made my senses better and faster, and added new capabilities that helped make me more approachable, intuitive and delightful.
Choosing how to renew, and what to add to each renewal, is not an easy task. Basically, it was about prioritizing user experience versus gaps that were deferred from my birth, versus differentiators to make me unique and sell more, versus features in my roadmap, versus investing huge efforts in my quality.
Each renewal was a complex process with lots of features and behaviors to add, while trying to make my customers’ life a bit easier, since features that were important to them were sometimes considered low priority.
But, there were also good times during my first year:
Huge customer adoption rate
100 new customers in two months!
Growing was a great thing and my parents were and are still so proud! But, like with most things, it came with a price - a lot of sustain issues from the field, requests for changes and bad feedback that I am hard to use and missing core elements.
Being a new baby in the Storage world is not a simple thing, as expectations are huge (mainly because of my successful elder brother, the XIV GUI) and I must quickly keep up with all of them.
Still, I am getting tons of good feedback for being revolutionary and unique. People are emotionally engaged with me, and being that I’m a baby, I love to see emotions!
Huge marketing efforts to put me center stage
However, because of some initial problems at the start -- I am a new product, remember? -- I was thrown out of multiple customer sites, and some sales/marketing guys just stopped believing in me. That made me sad.
My parents did a great job, though, in talking, explaining and demonstrating what I can do, together with what I can’t do now, but will do soon. This really helped in some areas, and customers began to see what my parents saw in me for so many years.
I’m really enthusiastic to hear what people will think of me when I’m two years old!
As part of the renewal I had four times during my first year, design elements were reconsidered, redesigned and rewritten to find the best solutions ever. No product has come even close to what I suggest to the world… I am so proud of myself!
Additionally, my parents wrote approximately 20 patents on my User Interface (UI) elements and User Experience (UX) concepts, which makes me extremely unique.
Prioritizing what goes in and what doesn't, especially during a year when fewer and fewer babysitters were handling me, was a real challenge. Read my parent's post [How to drive forward an exhausted team?] for more details.
But my parents did it! They succeeded to add cool features like:
Filter analytics and free text, making the filter a great experience that everyone is using.
Great UX improvements like redesigning the tabs, adding right click menus, and adding more on-boarding enablers
Improving the dashboard.
Improving my core business, capacity management (four different times!), and still working on it.
Adding features that were initially deferred in my birth. Deferring features back then was the way to make my birth go smoother. Now, these missing features annoy people.
Improving quality dramatically, adding automation to the way people test me.
Adding differentiators, like the health widget, with more than 20 best practices that provide helpful tips to the customer when there’s a need to change something in their environment, to avoid future issues.
Continuing to bring added value for the 'A-family'. I am monitoring FlashSystem A9000/R, XIV and Spectrum Accelerate, both on and off premises. This added value makes for a family with the most powerful management solutions and experience."
If you are planning to attend the upcoming IBM Systems Technical University in Orlando, Florida, May 22-26, there will be a variety of hands-on labs. I recommend participating in the hands-on session to experience first-hand the next release of IBM Hyper-Scale Manager.
You may not be the right person to ask but I am asking everyone so "How do you see hybrid disk drives?"
(For the record, I am not immediately related to Robert. At one point, "Pearson" was the 12th most common surname in the USA, but now doesn't even make the Top 100.)
Robert, I would like to encourage you and everyone else to ask questions. Don't worry if I am the wrong person to ask, as I probably know the right person within IBM. Some people have called me the "Kevin Bacon" of Storage, as I am often less than six degrees away from the right person, having worked in IBM Storage for over 20 years.
For those not familiar with hybrid drives, there is a good write-up in Wikipedia.
Unfortunately, most of the people I would consult on this question, such as those from Market Intelligence or Research, are on vacation for the holidays, so, Robert, I will have to rely on my trusted 78-card Tarot deck and answer you with a five-card throw.
Your first card, Robert, is the Hermit. This card represents "introspection". The best I/O is no I/O, which means that if applications can keep the information they need inside server memory, you can avoid the bus bandwidth limitations of going to external storage devices. Where external storage makes sense is when data is shared between servers, or when a single server is limited to a set amount of internal memory. So, consider maxing out the memory in your server first (IBM would be glad to sell you more internal memory!!!), then consider outside solid-state or hybrid devices. 32-bit Windows, for example, has an architectural limit of 4GB.
Your second card, Robert, is the Four of Cups, representing "apathy". On the card, you see three cups together, with the fourth cup being delivered from a cloud. This reminds me that we have three storage tiers already (memory, disk, tape), and introducing a fourth tier into the mix may not garner much excitement. For the mainframe, IBM introduced a solid-state device, called the Coupling Facility, which can be accessed from multiple System z servers. It is used heavily by DFSMS and DB2 to hold shared information. However, given some customers' apathy towards Information Lifecycle Management, which includes "tiered storage", introducing yet another tier that forces people to decide what data goes where may be another challenge.
Your third card, Robert, is the Chariot, which represents "Speed, Determination, and Will". In some cases, solid state disks are faster for reading, but can be slower for writing. In the case of a hybrid drive, where the memory acts as a front-end cache, read-hits would be faster, but read-misses might be slower. While the idea of stopping the drives during inactivity will reduce power consumption, spinning the disk up and down may incur additional performance penalties. At the time of this post, the fastest disk system remains the IBM SAN Volume Controller, based on SPC-1 and SPC-2 benchmarks in excess of those published for other devices.
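To see the Chariot's point in numbers, here is a tiny sketch of average latency with a front-end cache. Every number below is invented purely for illustration; these are not measurements of any actual hybrid drive.

```python
# Average latency with a front-end cache: hits are fast, misses pay the
# disk time plus a small lookup penalty. All numbers are invented for
# illustration, not measurements of any real hybrid drive.

def avg_latency_ms(hit_ratio, hit_ms, miss_ms, penalty_ms):
    return hit_ratio * hit_ms + (1 - hit_ratio) * (miss_ms + penalty_ms)

print(avg_latency_ms(0.50, 0.2, 8.0, 0.1))  # 4.15 ms, versus 8 ms with no cache
print(avg_latency_ms(0.05, 0.2, 8.0, 0.1))  # 7.705 ms: a low hit ratio barely helps
```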
Your fourth card, Robert, is the Eight of Pentacles, which represents "Diligence, Hard work". The pentacles are coins with five-sided stars on them, and this often represents money. Our research team has projected that spinning disk will continue to be a viable and profitable storage medium for at least another eight years.
Your fifth and last card, Robert, is the World, which normally represents "Accomplishment", but since it is turned upside down, the meaning is reversed to "Limitation". Some hybrid disks, and some types of solid state memory in general, do have limitations on the number of write cycles they can handle. Those unhappy with the frequency and slowness of rebuilds on SATA disks may find similar problems with hybrid drives. For that reason, businesses may not trust hybrid drives for their busiest, mission-critical applications, but certainly might use them for archive data with lower write-cycle requirements.
The tarot cards are never wrong, but certainly interpretations of the cards can be.
Several readers have asked me what is the difference between Hybrid Cloud and Multi-Cloud. The two phrases are used in various contexts, not just by IBM, but also by our competitors, as well as the press and industry analysts.
A Hybrid Cloud attempts to provide a single platform to run a specific Cloud workload. This single platform combines two or more of the following resources:
on-premise private Cloud
off-premise private Cloud
off-premise public Cloud
A Hybrid Cloud is like the United Nations peacekeeping force. A single force, with a single mission, representing the combined resources of many countries.
A Hybrid Cloud is a deployment model that might offer advantages over just using a Private Cloud, or just using a Public Cloud.
A practical example is Tennis Australia. For three weeks every January, they run the Australian Open, a tennis tournament, with over 4,000 employees, and millions of views to their website each day. For the rest of the year, they have only about 300 employees, and manage quite well to run smaller tournaments for high-school and college students, as well as plan for next year's event.
In this case, a Hybrid Cloud that combines perhaps two racks of on-premise private Cloud with the incredible power of IBM Cloud gives them the variability and agility needed to run smoothly without wasting CAPEX on equipment they don't need.
Many "Hybrid Cloud" products focus on being the "glue" that combines two different resources together. This can be at the management layer, the data layer, the application layer, or the infrastructure layer.
In contrast, a Multi-Cloud represents a deployment strategy for different Cloud workloads. One workload might be better served on a Private Cloud, another workload might be better served on a Public Cloud, and a third workload, as we saw above, might benefit from the combined resources of a Hybrid Cloud.
In the past, people felt that all Cloud Service Providers were the same. Just as people buy gasoline from whichever gas station offers the lowest prices, many chose their Cloud Service Provider based entirely on the costs involved. Loyalty can change the minute new price tables are published.
But today, Cloud Service Providers have made an effort to provide differentiation. For example, your Multi-Cloud might have three Hybrid Clouds. One cloud platform combines your on-premise Private Cloud with IBM Cloud, another combines your on-premise Private Cloud with Amazon Web Services, and a third combines your on-premise Cloud with Microsoft Azure.
In this case, a Multi-Cloud is like the various armed forces. You might deploy the Army for one mission, the Navy for another, and the Air Force or Marines for a third.
Many "Multi-Cloud" products focus on being versatile and multi-purpose. For example, the same FlashSystem 9100 that you deploy in your "Analytics Cloud" platform could also be useful for your "Docker Container Cloud" platform, or your "DevOPS Cloud" platform. IBM's various Multi-Cloud Solutions provide the additional software and services needed to complement the FlashSystem 9100 to pull this off.
Deciding to use a Multi-Cloud strategy is mostly a business decision. Deploying a Hybrid Cloud as one of your Multi-Cloud platforms could be a combination of business and technical decision.
While most of the post is accurate and well-stated, two opinions in particular caught my eye. I'll be nice and call them opinions, since these are blogs, and always subject to interpretation. I'll put quotes around them so that people will correctly attribute them to Hu, and not me.
"Storage virtualization can only be done in a storage controller. Currently Hitachi is the only vendor to provide this." -- Hu Yoshida
Hu, I enjoy all of your blog entries, but you should know better. HDS is a fairly recent newcomer to the storage virtualization arena, and since IBM has been doing this for decades, I will bring you and the rest of the readers up to speed. I am not starting a blog-fight, I just want to provide some additional information for clients to consider when making choices in the marketplace.
First, let's clarify the terminology. I will use 'storage' in the broad sense, including anything that can hold 1's and 0's, including memory, spinning disk media, and plastic tape media. These all have different mechanisms and access methods, based on their physical geometry and characteristics. The concept of 'virtualization' is any technology that makes one set of resources look like another set of resources with more preferable characteristics, and this applies to storage as well as servers and networks. Finally, 'storage controller' is any device with the intelligence to talk to a server and handle its read and write requests.
Second, let's take a look at all the different flavors of storage virtualization that IBM has developed over the past 30 years.
IBM introduces the S/370 with the OS/VS1 operating system. "VS" here refers to virtual storage, and in this case internal server memory was swapped out to physical disk. Using a table mapping, disk was made to look like an extension of main memory.
IBM introduces the IBM 3850 Mass Storage System (MSS). Until this time, programs that ran on mainframes had to be acutely aware of the device types being written, as each device type had different block, track and cylinder sizes, so a program written for one device type would have to be modified to work with a different device type. The MSS was able to take four 3350 disks, and a lot of tapes, and make them look like older 3330 disks, since most programs were still written for the 3330 format. The MSS was a way to deliver new 3350 disk to a 3330-oriented ecosystem, and greatly reduce the cost by handling tape on the back end. The table mapping was one virtual 3330 disk (100 MB) to two physical tapes (50 MB each). Back then, all of the mainframe disk systems had separate controllers. The 3850 used a 3831 controller that talked to the servers.
IBM invents Redundant Array of Independent Disks (RAID) technology. The table mapping is one or more virtual "Logical Units" (or "LUNs") to two or more physical disks. Data is striped, mirrored and parity-protected across the physical drives, making the LUNs look and feel like disks, but with faster performance and higher reliability than the physical drives they were mapped to. RAID could be implemented in the server as software, on top of or embedded into the operating system, in the host bus adapter, or on the controller itself. The vendor that provided the RAID software or HBA did not have to be the same as the vendor that provided the disk, so in a sense, this avoided "vendor lock-in". Today, RAID is almost always done in the external storage controller.
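For readers who have never seen parity in action, here is a minimal sketch of the XOR trick behind it, using a toy three-data-strips-plus-parity layout of my own; real RAID implementations are far more involved.

```python
# Toy RAID-style parity: XOR the data strips to get a parity strip;
# any single lost strip can be rebuilt by XOR-ing the survivors.
# A deliberately simplified sketch, not a real RAID implementation.

def xor_strips(strips):
    out = bytearray(len(strips[0]))
    for strip in strips:
        for i, byte in enumerate(strip):
            out[i] ^= byte
    return bytes(out)

data = [b"AAAA", b"BBBB", b"CCCC"]   # three data strips
parity = xor_strips(data)            # parity strip, written to a fourth drive

# Simulate losing strip 1 and rebuilding it from the survivors plus parity:
rebuilt = xor_strips([data[0], data[2], parity])
assert rebuilt == data[1]
```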
IBM introduces the Personal Computer. One of the features of DOS is the ability to make a "RAM drive". This is technology that runs in the operating system to make internal memory look and feel like an external drive letter. Applications that already knew how to read and write to drive letters could work unmodified with these new RAM drives. This had the advantage that the files would be erased when the system was turned off, so it was perfect for temporary files. Of course, other operating systems today have this feature, UNIX has a /tmp directory in memory, and z/OS uses VIO storage pools.
This is important, as memory would be made to look like disk externally, as "cache", in the 1990s.
IBM AIX v3 introduces Logical Volume Manager (LVM). LVM maps the LUNs from external RAID controllers into virtual disks inside the UNIX server. The mapping can combine the capacity of multiple physical LUNs into a large internal volume. This was all done by software within the server, completely independent of the storage vendor, so again no lock-in.
IBM introduces the Virtual Tape Server (VTS). This was a disk array that emulated a tape library. A mapping of virtual tapes to physical tapes was done to allow full utilization of larger and larger tape cartridges. While many people today mistakenly equate "storage virtualization" with "disk virtualization", in reality it can be implemented on other forms of storage. The disk array was referred to as the "Tape Volume Cache". By using disk, the VTS could mount an empty "scratch" tape instantaneously, since no physical tape had to be mounted for this purpose.
Contradicting its "tape is dead" mantra, EMC later developed its CLARiiON disk library that emulates a virtual tape library (VTL).
IBM introduces the SAN Volume Controller. It maps virtual disks to managed disks that can come from different frames from different vendors. Like other controllers, the SVC has multiple processors and cache memory, with the intelligence to talk to servers, and is similar in functionality to the controller components you might find inside monolithic "controller+disk" configurations like the IBM DS8300, EMC Symmetrix, or HDS TagmaStore USP. SVC can map a virtual disk to a physical disk one-for-one in "image mode", as HDS does, or can map virtual disks across physical managed disks, using a similar mapping table, to provide advantages like performance improvement through striping. You can take any virtual disk out of the SVC system simply by migrating it back to "image mode" and disconnecting the LUN from management. Again, no vendor lock-in.
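To make the mapping idea concrete, here is a minimal sketch of a logical-to-physical extent table. The table structure and the mdisk names are my own invention for illustration; this is not how SVC internals actually look.

```python
# A sketch of virtual-to-managed disk mapping. The table structure and
# mdisk names are hypothetical, invented to illustrate the concept.

# "Image mode": the virtual disk maps one-for-one onto a managed LUN.
image_mode = {"vdisk7": [("mdisk_vendor_a", 0, 1_000_000)]}

# Striped mode: the same virtual disk spread across managed disks from
# different vendors, in extents, for striping and easy migration.
striped = {"vdisk7": [("mdisk_vendor_a", 0, 400_000),
                      ("mdisk_vendor_b", 0, 300_000),
                      ("mdisk_vendor_c", 0, 300_000)]}

def resolve(table, vdisk, lba):
    """Translate a virtual block address to (managed disk, physical lba)."""
    offset = 0
    for mdisk, start, length in table[vdisk]:
        if lba < offset + length:
            return mdisk, start + (lba - offset)
        offset += length
    raise ValueError("address beyond end of virtual disk")

print(resolve(striped, "vdisk7", 450_000))  # ('mdisk_vendor_b', 50000)
```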
The HDS USP and NSC can run as regular disk systems without virtualization, or the virtualization can be enabled to allow external disks from other vendors. HDS usually counts all USP and NSC units sold, but never mentions what percentage of these have external disks attached in virtualization mode. Either they don't track this, or they are too embarrassed to publish the number. (My guess: single digit percentage.)
Few people remember that IBM also introduced virtualization in both controller+disk and SAN switch form factors. The controller+disk version was called "SAN Integration Server", but people didn't like the "vendor lock-in" of having to buy the internal disk from IBM. They preferred all external disk, with plenty of vendor choices. This is perhaps why Hitachi now offers a disk-less version of the NSC 55, in an attempt to be more like IBM's SVC.
IBM had also introduced the IBM SVC for Cisco 9000 blade. Our clients didn't want to upgrade their SAN switch networking gear just to get the benefits of disk virtualization. Perhaps this is the same reason EMC has done so poorly with its "Invista" offering.
So, bottom line, storage virtualization can be, and has been, delivered in operating system software, in the server's host bus adapter, inside SAN switches, and in storage controllers. It can be delivered anywhere in the path between application and physical media. Today, the two major vendors that provide disk virtualization "in the storage controller" are IBM and HDS, and the three major vendors that provide tape virtualization "in the storage controller" are IBM, Sun/STK, and EMC. All of these involve a mapping of logical to physical resources. Hitachi uses a one-for-one mapping, whereas IBM additionally offers more sophisticated mappings.
Last week, in Computer Technology Review's article [Tiering: Scale Up? Scale Out? Do Both], Mark Ferelli interviews fellow blogger Hu Yoshida, CTO of Hitachi Data Systems (HDS). Here's an excerpt:
"MF/CTR: A global cache should be required to implement that common pool that you’re talking about going across all tiers.
Hu/HDS: Right. So that is needed to get to all the resources. Now with our system, we can also attach external storage behind it for capacity so that as the storage ages out or becomes less active we can move it to the external storage. They would certainly have less performance capability, but you don’t need it for the stale data that we’re aging down. Right now we’re the only vendor that can provide this type of tiering.
If you look at other people who do virtualization like IBM’s SVC, the SVC has no storage within it because it’s sitting so if you attach any storage behind it, there is some performance degradation because you have this appliance sitting in front. That appliance is also very limited in cache and very limited in the number of storage boards on it. It cannot really provide you additional performance than what is attached behind it. And in fact, it will always degrade what is attached behind it because it’s not storage, where as our USP is storage and it has a global cache and it has thousands of port connections, load balancing and all that. So our front end can enhance existing storage that sits behind it."
This is not the first time I have had to correct Hu and others of misperceptions of IBM's SAN Volume Controller (SVC). This month marks my four year "blogoversary", and I seem to spend a large portion of my blogging time setting the record straight. Here are just a few of my favorite posts setting the record straight on SVC back in 2007:
Since day 1, the SAN Volume Controller has focused primarily on external storage. Initially, the early models had just battery-protected DRAM cache memory, but the most recent model of the SVC, the 2145-CF8, adds support for internal SLC NAND flash solid state drives. To fully appreciate how SVC can help improve the performance of the disks it manages, I need to use some visual aids.
In this first chart, we look at a 70/30/50 workload. This indicates that 70 percent of the IOPS are reads, 30 percent writes, and 50 percent can be satisfied as cache hits directly from the SVC. For the reads, this means that 50 percent are read-hits satisfied from SVC DRAM cache, and 50 percent are read-miss that have to get the data from the managed disk, either from the managed disk's own cache, or from the actual spinning drives inside that managed disk array.
For writes, all writes are cache-hits, but some of them will be destaged to the managed disk. Typically, we find that a third of writes are over-written before this happens, so only two-thirds are written down to managed disk.
In this example, the SVC reduced the burden of the managed disk from 100,000 IOPS down to 55,000, which is 35,000 reads and 20,000 writes. Some have argued against putting one level of cache (SVC) in front of another level of cache (managed disk arrays). However, CPU processor designers have long recognized the value of hierarchical cache with L1, L2, L3 and sometimes even L4 caches. The cache-hits on SVC are faster than most disk system's cache-hits.
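Here is the same arithmetic in a few lines of Python, in case you want to plug in your own workload mix. The 70/30/50 figures and the one-third overwrite rule of thumb come from the example above; the function itself is just my own convenience.

```python
# Re-deriving the 70/30/50 example: how much I/O still reaches the
# managed disk once the SVC cache absorbs hits and overwrites.

def backend_iops(total, read_fraction, cache_hit_fraction, destage_fraction=2/3):
    reads = total * read_fraction
    read_misses = reads * (1 - cache_hit_fraction)   # must go to managed disk
    writes = total * (1 - read_fraction)
    destaged = writes * destage_fraction             # ~1/3 are over-written first
    return read_misses, destaged

misses, destaged = backend_iops(100_000, 0.70, 0.50)
print(round(misses), round(destaged), round(misses + destaged))  # 35000 20000 55000
```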
The next chart is a Ponder curve, mapping millisecond response (MSR) times for different levels of I/O per second, named after the IBM scientist John Ponder who created them. Most disk array vendors publish similar curves for each of their products. In this case, we see that 100,000 IOPS would cause a 25 millisecond response time, but when the load is reduced to 55,000 IOPS, the average response time drops to only 7 msec.
To be fair, the SVC does introduce 0.06 msec of additional latency on read-misses, so let's call this 7.06 msec. This tiny amount of latency could be what Hu Yoshida was referring to when he said there was "some performance degradation". There are other storage virtualization products in the market that do not provide caching to boost performance, but rather just map incoming requests to outgoing requests, and these can indeed slow down every I/O they process. Perhaps Hu was thinking of those instead of IBM's SVC when he made his comments.
Of course, not all workloads are 70/30/50, and not every disk array is driven to its maximum capability, so your mileage may vary. As we slide down to the left of the curve where things are flatter, the performance improvement diminishes.
(Chart legend: IOPS and millisecond response [MSR] times, before and after SVC.)
Hitachi's offerings, including the HDS USP-V, USP-VM and their recently announced Virtual Storage Platform (VSP), also sold by HP under the name P9500, have a similar architecture to the SVC and can offer similar benefits, but oddly the Hitachi engineers have decided to treat externally attached storage as second-class citizens. Hu mentions data that "ages out or becomes less active we can move it to the external storage." IBM has chosen not to impose this "caste" system onto the design of the SAN Volume Controller.
The SVC has been around since 2003, before the USP-V came to market, and has sold over 20,000 SVC nodes over the past seven years. The SVC can indeed improve performance of managed disk systems, in some cases by a substantial amount. The 0.06 msec latency on read-miss requests represents less than 1 percent of total performance in production workloads. SVC nearly always improves performance, and in the worst case, provides same performance but with added functionality and flexibility. For the most part, the performance boost comes as a delightful surprise to most people who start using the SVC.
To learn more about IBM's upcoming products and how IBM will lead in storage this decade, register for next week's webcast "Taming the Information Explosion with IBM Storage" featuring Dan Galvan, IBM Vice President, and Steve Duplessie, Senior Analyst and Founder of Enterprise Storage Group (ESG).
Last month, HP and Oracle jointly announced their new "Exadata Storage Server". This solution involves HP server and storage hardware paired up with Oracle software, designed for Data Warehouse and Business Intelligence (DW/BI) workloads.
I immediately recognized the Exadata Storage Server as a "me too" product, copying the idea from IBM's [InfoSphere Balanced Warehouse], which combines IBM servers, IBM storage and IBM's DB2 database software to accomplish this, but from a single vendor, rather than a collaboration of two vendors. The Balanced Warehouse has been around for a while. I even blogged about this last year, in my post [IBM Combo trounces HP and Sun] when IBM announced its latest E7100 model. IBM offers three different sizes: C-class for smaller SMB workloads, D-class for moderate size workloads, and E-class for large enterprise workloads.
One would think that since IBM and Oracle are the top two database software vendors, and IBM and HP are the top two storage hardware vendors, IBM would be upset or nervous about this announcement. We're not. I would gladly recommend comparing IBM offerings with anything HP and Oracle have to offer. And with IBM's acquisition of Cognos, IBM has made a bold statement that it is serious about competing in the DW/BI market space.
But apparently, it struck a nerve over at EMC.
Fellow blogger Chuck Hollis from EMC went on the attack, and Oracle blogger Kevin Closson went on the defensive. For those readers who do not follow either, here is the latest chain of events:
When it comes to blog fights like these, there are no clear winners or losers, but hopefully, if done respectfully, they can benefit everyone involved, giving readers insight into the products as well as the company cultures that produce them. Let's see how each side fared:
Chuck implies that HP doesn't understand databases and Oracle doesn't understand server and storage hardware, so cobbling together a solution based on this two-vendor collaboration doesn't make sense to him. The few people I know who work at HP and Oracle are smart, so I suspect this is more a claim against each company's "core strengths". Few would associate HP with database knowledge, or Oracle with hardware expertise, so I give Chuck a point on this one.
Of course, Chuck doesn't have deep, inside knowledge of this new offering, nor do I for that matter, and Kevin is patient enough to correct all of Chuck's mistaken assumptions and assertions. Kevin understands that EMC's "core strengths" aren't in servers or databases, so he explains things in simple enough terms that EMC employees can understand, so I give Kevin a point on this one.
If two is bad, then three is worse! How much bubble gum and baling wire do you need in your data center? The better option is to go to the one company that offers it all and brings it together into a single solution: IBM InfoSphere Balanced Warehouse.
Are you going to the [IBM Edge 2015 conference]? This is IBM's premiere conference covering IBM System Storage, z Systems and POWER Systems.
Here are some secrets for winning prizes while you attend!
Sit in the first FIVE rows of the Technical Kickoffs - Monday 8:30am
Funding has been approved to give out a few nice prizes. To be eligible, you need to show up on time, and sit in the first five rows of any of the following three Kickoffs. I will be in the one for Storage!
Attend sessions by Edge Event Sponsor companies
Brocade, Cisco and others often present lectures at Technical Edge, and they often give out prizes at those sessions, as part of their sponsorship to the event.
Take a "Selfie" with IBM z13 System mainframe
Yes, we will actually have a z13 System on display at the Solution Center for you to take pictures with. Post it on Instagram, Twitter, Facebook or your other favorite social media websites and be eligible to win prizes.
Get your handwriting analyzed with an IBM POWER8 system
Get your handwriting analyzed at the Solution Center and be eligible to win prizes.
Get your badge scanned at as many booths as you can at the Solution Center
Yes, this means you might get an email from the companies involved, but it will also add you to the list of people eligible for some raffles and drawings for prizes.
Participate in the #IBMEdgeHunt scavenger hunt!
Follow the Twitter hashtag #IBMEdgeHunt to see what else the "Hunt Organizers" have in store during the week!
I arrive Sunday afternoon! Below are some of the hashtags I will be using during the event. You can follow me on @az990tony Twitter handle.
"How can I participate in IBM's Smarter Planet, specifically Smarter Cities?"
With a lot of college students graduating next month, I thought this would be a good question to answer.
Apply for a Job at IBM
The best way to participate in IBM Smarter Cities is to get a job within IBM, and then get assigned to one of the many IBM Smarter Cities projects. Visit IBM's [Employment Page] to learn why IBM is recognized as one of the top 50 most attractive employers in the world. Mention "Smarter Cities" on your Resume so it can be routed to the appropriate manager.
Join the Conversation
Another way to participate in Smarter Cities is to "join the conversation". Each of IBM's 25 different programs has folks that are focused on that area, with blogs, forums and case studies. Here is the conversations page for [Smarter Cities]. Watch the videos at [ibm.com/theSmarterCity]. Play [City One], IBM's Smarter Planet game for Smarter Cities. Provide IBM feedback on any ideas you might have to help make cities smarter.
You can also join in one of the many upcoming [IBM Jam events]. Jams are not restricted to generating business ideas. Their methods, tools and technology can also be applied to social issues. In 2005, over three days, the Government of Canada, UN-HABITAT and IBM hosted Habitat Jam. Tens of thousands of participants - from urban specialists, to government leaders, to residents from cities around the world - discussed issues of urban sustainability. Their ideas shaped the agenda for the UN World Urban Forum, held in June 2006. People from 158 countries registered for the jam and shared their ideas for action to improve the environment, health, safety and quality of life in the world's burgeoning cities.
Buy Products and/or Services from IBM
IBM has the resources to help the planet in so many ways that NGOs and non-profit agencies only dream of. With IBM's advocacy for causes like global public education, universal healthcare, and improved infrastructures, people often forget that IBM is not itself a non-profit organization. IBM has learned early on that creating value for the world can also be good business. The more people buy from IBM, the more skills and resources IBM will have to solve the world's toughest challenges.
(FTC Disclosure: I do not work for, nor have any financial investments in, ENC Security Systems. ENC Security Systems did not pay me to mention them on this blog. Their mention in this blog is not an endorsement of either their company or any of their products. Information about EncryptStick was based solely on publicly available information and my own personal experiences. My friends at ENC Security Systems provided me a full-version pre-loaded stick for this review.)
The EncryptStick software comes in two flavors, a free/trial version, and a full/paid version. The free trial version has [limits on capacity and time] but provides enough of a glimpse of the product to let you decide before you buy the full version. You can download the software yourself and put it on your own USB device, or purchase the pre-loaded stick that comes with the full-version license.
Whichever you choose, the EncryptStick offers three nice protection features:
Encryption for data organized in "storage vaults", which can be either on the stick itself, or on any other machine the stick is connected to. That is a nice feature, because you are not limited to the capacity of the USB stick.
Encrypted password list for all your websites and programs.
A secure browser that protects against any key-logging or malware that might be on the host Windows machine.
I have tried out all three functions and everything works as advertised. However, there is always room for improvement, so here are my suggestions.
The first problem is that the pre-loaded stick looks like it is worth a million dollars. It is in a shiny bronze color with "EncryptStick" emblazoned on it. This is NOT subtle advertising! This 8GB capacity stick looks like it would be worth stealing solely on being a nice piece of jewelry, and then the added bonus that there might be "valuable secrets" just makes that possibility even more likely.
If you want to keep your information secure, it would help to have "plausible deniability" that there is nothing of value on a stick. Either have some corporate logo on it, or have the stick look like a cute animal, like these pig or chicken USB sticks.
It reminds me how the first Apple iPods were in bright [Mug-me White]. I use black headphones with my black iPod to avoid this problem.
Of course, you can always install the downloadable version of EncryptStick software onto a less conspicuous stick if you are concerned about theft. The full/paid version of EncryptStick offers an option for "lost key recovery" which would allow you to backup the contents of the stick and be able to retrieve them on a newly purchased stick in the event your first one is lost or stolen.
Imagine how "unlucky" I felt when I notice that I had lost my "rabbits feet" on this cute animal-themed USB stick.
I foresee trouble losing the cap on my EncryptStick as well. This might seem trivial, but it is a pet-peeve of mine that USB stick designs should plan for. Not only is there nothing to keep the cap on (it slides on and off quite smoothly), but there is no loop to attach the cap to anything if you wanted to.
Since then, I got smart and now look for ways to keep the cap connected. Some designs, like the IBM-logoed stick shown above, just rotate around an axle, giving you access when you need it, and protection when it is folded closed.
Alternatively, get a little chain that allows you to attach the cap to the main stick. In the case of the pig and chicken, the memory section had a hole pre-drilled and a chain to put through it. I drilled an extra hole in the cap section of each USB stick, and connected the chain through both pieces.
(Warning: Kids, be sure to ask for assistance from your parents before using any power tools on small plastic objects.)
The EncryptStick can run on either Microsoft Windows or Mac OS. The instructions indicate that you can install both versions of the downloadable software onto a single stick, so why not do that for the pre-loaded full version? The stick I have had only the Windows version pre-loaded. I don't know if the Windows and Mac OS versions can unlock the same "storage vaults" on the stick.
Certainly, I have been to many companies where either everyone runs Windows or everyone runs Mac OS. If the primary target audience is to use this stick at work in one of those places, then no changes are required. However, at IBM, we have employees using Windows, Mac OS and Linux. In my case, I have all three! Ideally, I would like a version of EncryptStick that I could take on trips with me that would allow me to use it regardless of the Operating System I encountered.
Since there isn't a Linux-version of EncryptStick software, I decided to modify my stick to support booting Linux. I am finding more and more Linux kiosks when I travel, especially at airports and high-traffic locations, so having a stick that works both in Windows or Linux would be useful. Here are some suggestions if you want to try this at home:
Use fdisk to change the FAT32 partition type from "b" to "c". Apparently, Grub2 requires type "c", but the pre-loaded EncryptStick was set to "b". The Windows version of EncryptStick seems to work fine in either mode, so this is a harmless change.
Install Grub2 with "grub-install" from a working Linux system.
Once Grub2 is installed, you can boot ISO images of various Linux Rescue CDs, like [PartedMagic] which includes the open-source [TrueCrypt] encryption software that you could use for Linux purposes.
This USB stick could also be used to help repair a damaged or compromised Windows system. Consider installing [Ophcrack] or [Avira].
Certainly, 8GB is big enough to run a full Linux distribution. The latest 32-bit version of [Ubuntu] could run on any 32-bit or 64-bit Intel or AMD x86 machine, and have enough room to store an [encrypted home directory].
Since the stick is formatted FAT32, you should be able to run your original Windows or Mac OS version of EncryptStick with these changes.
Depending on where you are, you may not have the luxury to reboot a system from the USB memory stick. Certainly, this may require changes to the boot sequence in the BIOS and/or hitting the right keys at the right time during the boot sequence. I have been to some "Internet Cafes" that frown on this, or have blocked this altogether, forcing you to boot only from the hard drive.
Well, those are my suggestions. Whether you go on a trip with or without your laptop, it can't hurt to take this EncryptStick along. If you get a virus on your laptop, or have your laptop stolen, then it could be handy to have around. If you don't bring your laptop, you can use this at Internet cafes, hotel business centers, libraries, or other places where public computers are available.
Today, I met with Teresa Ferraro and Mike Buttrum from FirstRain in their Manhattan office in downtown New York City. IBM recently contracted FirstRain to provide IBMers like myself with analytics on publicly-available news to keep us informed for business meetings. Here's how IBMers can get the most out of this service.
Basically, FirstRain takes a list of companies and topics and generates summaries of the most relevant publicly-available news. You can organize these into different channels; here I have seven channels.
Companies to watch refers to existing or prospective clients that I plan to be talking with soon. Some of my colleagues are assigned to specific clients, so they can set this up once and enjoy the news for the rest of the year. I, on the other hand, meet with different clients every week, so I will be updating this list frequently.
I have divided the Competitors between major ones, and smaller startups. Since I am often working with business partners and distributors, I made that a separate channel as well.
For product lines, I picked three: Data migration, Data storage solutions, and Software defined storage.
For conferences where I don't know which companies will attend, such as the IBM Technical University, I can set up information by territory. Here is one for Brazil.
I also attend industry-oriented events, so I can pick those vertical markets that might be helpful with dinner conversations. In this example, I chose Energy, Electric Utilities and Gas Utilities.
Once you have your channels configured, you get your results in various sections:
Management Changes lists any changes in top C-level positions: who left the company and who was recently hired.
Key Developments indicates news like mergers and acquisitions and government regulations.
First Reads prioritizes the top six articles for your channel. You can access more, but these six will get you started as you have your morning coffee.
First Tweets gives you the six most relevant tweets, in case the articles above were just "TL;DR" for you.
A section on Business Influencers and Market Drivers shows who the big players are and what topics are driving the most conversation. Here's an example from my Energy/Electric/Gas channel:
The Most Talked About section covers quotes and commentary about the most talked about companies in your channel.
With most news sources focused on politics, weather and celebrity gossip, it is nice to have a quicker, more focused approach to get the news I need to prepare for my client briefings. Special thanks to my hosts Teresa and Mike for their hospitality!
Use server virtualization, such as VMware
Use more efficient disk media, such as high-capacity SATA disk drives
Both are great recommendations, but why limit yourself to what EMC offers? Your x86-based machines are only a subset of your servers, and disk is only a subset of your storage. IBM takes a more holistic approach, looking at the entire data center.
Server virtualization
VMware is a great product, and IBM is its top reseller. But in addition to VMware, there are other solutions for x86-based servers, like Xen and Microsoft Virtual Server. IBM's System p, System i, and System z product lines all support logical partitioning.
To compare the energy effectiveness of server virtualization, consider a metric that can apply across platforms. For example, for an e-mail server, consider watts per mailbox. If you have, say, 15,000 users, you can calculate how many watts you are consuming to manage their mailboxes on your current environment, and compare that with running them on VMware, or logical partitions on other servers. Some people find it surprising that it is often more cost-effective, and power-efficient, to run workloads on mainframe logical partitions (LPARs) than a stack of x86 servers running VMware.
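As a back-of-the-envelope illustration (the figures below are hypothetical, not measurements from any benchmark):

\[ \text{watts per mailbox} = \frac{\text{total server power draw}}{\text{number of mailboxes}} \]

\[ \frac{50 \text{ servers} \times 400\text{ W}}{15{,}000 \text{ mailboxes}} \approx 1.3 \text{ W/mailbox} \qquad \text{vs.} \qquad \frac{6{,}000\text{ W of LPARs}}{15{,}000 \text{ mailboxes}} = 0.4 \text{ W/mailbox} \]

The same division works for watts per database transaction or watts per web request, which is what makes the metric portable across platforms.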
More efficient media
SATA and FATA disks support higher capacities and run at slower RPM speeds, thus using fewer watts per terabyte. A terabyte stored on 73GB high-speed 15K RPM drives consumes more watts than the same terabyte stored on 500GB SATA drives. Chuck correctly identifies that tape is more power-efficient than disk, but then argues that paper is more power-efficient than tape. Paper, however, is not necessarily more efficient than tape.
ESG analyst Steve Duplessie divides data between Dynamic and Persistent. The best place to put dynamic data is on disk, and here is where the evaluation of FC/SAS versus SATA/FATA comes into play. Persistent data, on the other hand, can be stored on paper, microfiche, optical or tape media. All of these shelf-resident media consume no electricity, nor generate any heat that would require additional cooling.
A study by scientists at the Lawrence Berkeley National Laboratory, titled High-Tech Means High-Efficiency: The Business Case for Energy Management in High-Tech Industries, indicates that data centers consume 15 to 100 times more energy per square foot than traditional office space. Storing persistent data in traditional office space can therefore save a huge amount of energy. Steve Duplessie feels the ratio of dynamic to persistent data is 1:10 today, but is likely to grow to 1:100 in the near future, making energy-efficient storage of persistent data ever more important to our environment.
Data centers consume nearly 5,000 Megawatts in the USA alone, and 14,000 Megawatts worldwide. To put that in perspective, Hungary, where I was last week, can generate up to 8,000 Megawatts for the entire country (and was using 7,400 Megawatts that week as a result of its heat wave, causing grave concern).
Back in the 1990s, one of the insurance companies IBM worked with kept data on paper in manila folders, and armies of young adults on roller skates were dispatched throughout large warehouses of shelves to fetch the appropriate folder in response to customer service inquiries. Digitizing this paper into electronic format greatly reduced the need for warehouse space, and improved the time to retrieve the data.
A typical file storage box (12 inches by 12 inches by 18 inches) of typed pages, single-spaced, double-sided, in 12-point font, could hold perhaps 100MB. The same box could hold a hundred or more LTO or 3592 tape cartridges, each storing hundreds of GB of information. That's roughly a million-to-one improvement in space efficiency, and on a watts-per-TB basis, the boxed tape needs nothing more than standard office air conditioning and lighting.
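To sanity-check that ratio, assume (hypothetically) 100 cartridges per box at a round 1 TB per cartridge:

\[ \frac{100 \times 1\text{ TB}}{100\text{ MB}} = \frac{10^{8}\text{ MB}}{10^{2}\text{ MB}} = 10^{6} \]

a factor of a million, matching the claim.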
To learn more about IBM's Project Big Green, watch this introductory video, which used Second Life for the animation.
The concept that there should be a linear "storage administrators per TB" rule-of-thumb has been around for a while. Back in 1992, I visited a customer in Germany who had FIVE storage admins for a 90 GB (yes, GB, not TB) disk array. I told them they only needed three admins, but they cited German laws that prohibit "overtime" work on evenings and weekends.
Later, in 1996, I visited an insurance company in Ohio to talk about IBM Tivoli Storage Manager. They had TWO admins to manage 7TB on their mainframe, and another 45 people managing the 7TB across their distributed systems running Linux, UNIX, and Windows. My first question: why TWO? Only one would be needed for the mainframe, but they responded that they back each other up when one takes a two-week vacation. My second question, to the rest of the audience, was... "When was the last time you guys took a two-week vacation?"
Today, admins manage many TBs of storage. But TBs are turning out not to be a fair ruler for estimating the number of admins you need. It's a moving target, and other factors have more influence than sheer quantity of data. Let's take a look at some of those factors, which we call "the three V's":
Variety of information types
In the beginning, there were just flat text files. In today's world, we have structured databases, semi-structured e-mail systems, hypertext documents, composite applications, audio and video formats that require streaming, and so on. Variety adds to the complexity of the environment. Different data requires different treatment, different handling, and perhaps even different storage technologies.
Volume of data
Data on disk and tape is growing 60 percent year over year. It's growing on paper also. It's growing on film, like photos and X-rays. The problem is not the amount, but the rate of growth. Imagine if the population and traffic in your city or town increased 60 percent in one year; most governments just aren't prepared for that level of growth.
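To see why the rate is the problem, compound 60 percent growth for a few years:

\[ 1.6^{1} = 1.6, \qquad 1.6^{3} \approx 4.1, \qquad 1.6^{5} \approx 10.5 \]

A shop managing one PB today is managing more than ten PB five years from now, often with roughly the same budget and staff.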
Velocity of change
Back in the 1950s and 1960s, shops only had to make updates once a year, scheduling time during the holidays. Now, people are making changes every month, sometimes every weekend. One customer we spoke with recently said they make about 8,000 changes PER WEEKEND!
So, the key is that there is no simple rule-of-thumb. Fewer admins are needed per TB on the mainframe than on distributed systems. Fewer admins per TB are needed when you deploy productivity software, like IBM TotalStorage Productivity Center. Fewer admins per TB are needed when you deploy storage virtualization, like IBM SAN Volume Controller or IBM virtual tape libraries.
Well it's Tuesday again, and you know what that means? IBM Announcements!
Today, IBM announced an exciting new addition to the IBM System Storage™ product line, [IBM Spectrum Storage™], a family of software defined storage offerings.
To understand its significance, I need to explain a few things first. Software defined storage is part of a larger concept of software defined environment.
How is a software defined environment different from what you have now? In every data center, you need to map the business requirements of an application workload to an appropriate set of IT infrastructure, including server, network and storage resources.
The traditional approach involves an application owner or database administrator reviewing the business requirements documented for the application, calling the server, network and storage administrators, who match those requirements to appropriate IT hardware and notify the folks in facilities to rack and stack the gear accordingly.
In a software defined environment, Application Programming Interfaces (API), Service Level Agreements (SLA) and orchestration workflows can automate the request for the appropriate resources. This is referred to as the "Control Plane".
Responding to these requests, the software can provision the appropriate server, network and storage resources required. Server, network and storage virtualization, standard interfaces and deployment technologies exist to make this practical. This is referred to as the "Data Plane".
Any time a new way of doing things is introduced into the world, there can be some resistance. Let's tackle the three most frequently stated objections:
"IT infrastructure resources are rare and expensive! Administrators need to control or approve how resources are doled out!" An objection to self-service automation is the fear that employees would take too much.
If you have a bank account, Automated Teller Machines (ATM) can restrict the amount of cash you can take out, based on what is appropriate per request, or per day, with an upper limit of what you have in your personal checking or savings account. You enter your debit card and PIN into the "Control Plane" keypad and out comes a stack of 20-dollar bills from the "Data Plane" slot. In a software defined environment, you can limit requests through quotas and resource pools.
"Some application workloads are more important than others! Another objection is that every workload will be treated in the same standard way, mission critical workloads and dev/test would be treated alike.
At the gas station, you can select different levels of octane gasoline. You enter your credit card and zip code into the "Control Plane" keypad and selected octane comes out of the "Data Plane" hose. In a software defined environment, resources can be provisioned with different Quality of Service (QoS) levels.
"Different applications require different combinations of resources!" Another objection is the fear that fixed combinations of server, storage and network resources will be stifling to innovation and productivity.
At the vending machine, you can choose which candy bar and which chips to have with whatever soft drink you choose for lunch. You enter your bills and coins into the "Control Plane" slot, select the row letter and column number for your snack of choice, and then fetch your purchases from the "Data Plane" flap. In a software defined environment, a Service Catalog can offer a virtual menu of different server, network and storage resources to be combined together as needed.
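To make the Control Plane idea concrete, here is a purely illustrative sketch of what a self-service provisioning request might look like. The endpoint, fields and values are hypothetical, invented for this example; they do not represent an actual IBM Spectrum Storage API.

    # Hypothetical request: 500 GB of "gold" QoS storage from the dev-test quota pool
    curl -X POST https://orchestrator.example.com/api/v1/volumes \
         -H "Content-Type: application/json" \
         -d '{"capacity_gb": 500, "qos": "gold", "pool": "dev-test"}'
    # The control plane validates the request against the pool's quota,
    # and the data plane provisions the volume and returns its address

The quota, the QoS level, and the catalog of allowed combinations map directly to the ATM, gas pump and vending machine analogies above.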
These concerns are addressed well enough in software defined environments, in general, and with IBM Spectrum Storage family of products, in particular.
(Nostalgia: I remember the days before self-service automation. At the bank, I had to stand in line until I could talk to a human teller to get cash from my savings account. At the gas station, attendants would come out to pump the gas, check my oil and wash my windshield. And at a restaurant, I felt like I waited an eternity from the time I ordered my meal to the time the short-order cook had it ready and the wait staff delivered it to my table. These all seem silly today, don't they?)
How do you define success? For some, it is based on their salary, or perhaps revenue they helped close for their company.
For others, their family life and the flexibility to handle work/life issues might be more important.
Still others look for certifications and awards from official agencies.
As a side gig, I sometimes do bartending on the weekends. Typically, these are for weddings or corporate parties.
I took weeks of bartender training and passed a three-hour exam to become state-certified in Arizona. We Arizonans take our liquor seriously! If you think about it, bartending is just a notch below being a pharmacist dispensing other drugs.
Surprisingly, some of my patrons can be condescending: "Don't you wish you could do more with your life than be a bartender?"
I am also a certified "Laughter Yoga" instructor, and am called in at times to substitute for other instructors. Again, I took formal training and was certified to do so.
Again, some of my students will ask, "Don't you wish you could do more with your life than be a yoga instructor?"
In both cases, I would respond, "Dude, I earn six figures, and am happy to meet new people every week, how about you?" This usually shuts them up!
(For those interested, here are [my top 10 posts] which served as the basis of the interview!)
I am happy to be recognized externally and within IBM for my success as a blogger. Since I started blogging over 10 years ago, I have helped close over $4 Billion USD in revenue for IBM, written five books on IBM Storage, mentored dozens of other successful bloggers, and presented to thousands of clients at conferences, workshops and briefings.
This week, I was reminded that back in 2011, Watson beat two human players, Ken Jennings and Brad Rutter on the TV game show "Jeopardy!" On his last response, Ken wrote "I for one welcome our new computer overlords." With IBM investing heavily in Cognitive Solutions, should people be worried, or welcome the new technology?
Back in 1950, Isaac Asimov published his "Three Laws of Robotics":
A robot may not injure a human being or, through inaction, allow a human being to come to harm.
A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Let's take a look at how Artificial Intelligence has been represented in the movies over the past few decades. I have put these in chronological order when they were initially released in the United States.
(FCC Disclosure and Spoiler Alert: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for cognitive solutions made by IBM. While IBM may have been involved or featured in some of these movies, I have no financial interest in them. I have seen them all and highly recommend them. I am hoping that you have all seen these, or at least familiar enough with their plot lines that I am not spoiling them for you.)
2001: A Space Odyssey
Back in 1968, Stanley Kubrick and Arthur C. Clarke made a masterpiece movie about a mysterious monolith floating near Jupiter. To investigate, a crew of human beings travels on a spaceship managed by a sentient computer named [HAL-9000].
(Many people thought HAL was a subtle reference to IBM. Stanley Kubrick clarifies:
"By the way, just to show you how interpretation can sometimes be bewildering: A cryptographer went to see the film, and he said, 'Oh. I get it. Each letter of HAL's name is one letter ahead of IBM. The H is one letter in front of I, the A is one letter in front of B, and the L is one letter in front of M.'
Now this is a pure coincidence, because HAL's name is an acronym of heuristic and algorithmic, the two methods of computer programming...an almost inconceivable coincidence. It would have taken a cryptographer to have noticed that."
Source: The Making of 2001: A Space Odyssey, Eye Magazine Interview, Modern Library, p. 249)
The problem arises when HAL-9000 refuses commands from the astronauts. The astronauts are not in control: HAL-9000 was given separate orders from ground control back on Earth, and it has determined it would be more successful without the crew.
Westworld
In 1973, Michael Crichton wrote and directed this movie about an amusement park with three uniquely themed areas: Medieval World, Roman World, and Westworld. Robots staff the parks to make them more realistic, interacting with guests in character appropriate to each time period.
A malfunction spreads like a computer virus among the robots, causing them to harm or kill the park's guests. Yul Brynner plays a robot called simply "the Gunslinger". Equipped with fast reflexes and infrared vision, the Gunslinger proves especially deadly!
(Michael Crichton also wrote "Jurassic Park", which had a similar story line involving dinosaurs with catastrophic results!)
Last year, HBO launched a TV series called "Westworld", based on the same themes covered in this movie. The first season of 10 episodes just finished, and the next season is scheduled for 2018.
Blade Runner
Directed by Ridley Scott, this 1982 movie stars Harrison Ford as Rick Deckard, a law enforcement officer. Rick is tasked to hunt down and "retire" four cognitive androids called "replicants" that have killed some humans and are now in search of their creator, Dr. Eldon Tyrell.
(I enjoy the euphemisms used in these movies. Terms like kill, murder or assassinate apply to humans but not machines. The word "retire" in this movie refers to destruction of the robots. As we say in IBM, "retirement is not something you do, it is something done to you!")
Destroying machines does not carry the same emotional toll as killing humans, but this movie explores that empathy. A sequel called "Blade Runner 2049" will be released later this year.
WarGames
In 1983, Matthew Broderick plays David, a young high school student who hacks into the U.S. military's War Operation Plan Response (WOPR) computer. The WOPR was designed to run various strategic games, including war game simulations, learning as it goes. David decides to initiate the game "Global Thermonuclear War", and the military responds as if the threats were real.
Can the computer learn that the only way to win a war is not to wage it in the first place? And if a computer can learn this, can our human leaders learn this too?
The Terminator
In this series of movies, a franchise spanning from 1984 to 2009, the US military builds a defense grid computer called [Skynet]. After cognitive learning at an alarming rate, Skynet becomes self-aware and decides to launch missiles, starting a nuclear war that kills over 3 billion people.
Arnold Schwarzenegger plays the Terminator model T-800, a cognitive solution in human form designed by Skynet to finish the job and kill the remainder of humanity.
I, Robot
In this 2004 movie, Will Smith plays Del Spooner, a technophobic cop who investigates a crime committed by a cognitive robot.
(Many people associate the title with author Isaac Asimov. A short story called "I, Robot", written by Earl and Otto Binder, was published in the January 1939 issue of 'Amazing Stories', well before the more well-known book 'I, Robot' (1950), a collection of short stories by Asimov.
Asimov admitted to being heavily influenced by the Binder short story. The title of Asimov's collection was changed to "I, Robot" by the publisher, against Asimov's wishes. Source: IMDB)
Del Spooner uncovers a bigger threat to humanity, not just a single malfunctioning robot, but rather the Virtual Interactive Kinesthetic Interface, or simply VIKI for short, a cognitive solution that controls all robots. VIKI interprets Asimov's three laws in a manner not originally intended.
Ex Machina
In this 2015 movie, Domhnall Gleeson plays Caleb, a 26-year-old programmer at the world's largest internet company. Caleb wins a competition to spend a week at a private mountain retreat. However, when Caleb arrives, he discovers that he must interact with Ava, the world's first true artificial intelligence, a beautiful robot played by Alicia Vikander.
(The title derives from the Latin phrase "deus ex machina," meaning "god from the machine," a phrase that originated in Greek tragedies. Source: IMDB)
Nathan, the reclusive CEO of this company, relishes this opportunity to have Caleb participate in this experiment, explaining how Artificial Intelligence (AI) will transform the world.
(The three main characters all have appropriate biblical names. Ava is a form of Eve, the first woman; Nathan was a prophet in the court of David; and Caleb was a spy sent by Moses to evaluate the Promised Land. Source: IMDB)
The premise is based in part on the famous [Turing Test], developed by Alan Turing. This is designed to test a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
Movies that depict the bad guys as a particular nationality, ethnicity or religion may be offensive to some movie audiences. Instead, having dinosaurs, monsters, aliens or robots provides a villain that all people can fear equally. This helps movie makers reach a more global audience!
Of course, if robots, androids and other forms of Artificial Intelligence did exactly what humans expect them to, we would not have the tense, thrilling action movies to watch on the big screen.
This is not a complete list of movies. Post in the comments below your favorite movie featuring Artificial Intelligence and why it is your favorite!
With all the excitement of the [IBM Challenge], where the [IBM Watson computer] will compete against humans on [Jeopardy!], I thought it would be good to provide the following homework exercise to help you appreciate how challenging the game is and the strategies required.
Overview of the game of Jeopardy!
If you are familiar with the show, you can safely skip this section.
Known as "America's Favorite Quiz Show", the Jeopardy pits three contestants against each other. The board is divided into six columns and five rows of answers. Each column indicates the category for that column of answers. The rows are ranked from easiest to most difficult, with more difficult answers being worth more money to wager.
The contestants take turns. The returning champion gets to select a spot on the board by indicating the category (column) and dollar value (row), such as "I will take Animals for 800 dollars!" Contestants must then press a button to "buzz in", be recognized by the host, and respond correctly. If a contestant responds incorrectly, the other two contestants have the opportunity to respond. The contestant with the correct response gets to choose the next answer.
For each turn, the host, Alex Trebek, shows the answer on the board and spends three seconds reading it aloud to give everyone a chance to come up with a corresponding question. This is perhaps what Jeopardy is most famous for. In a traditional quiz show, the host asks questions, and the contestants answer them. On Jeopardy, however, the host poses "answers", and the contestants provide their response in the form of a "question" that best fits the category and answer clues. For example, if the category were "Large Corporations" and the answer was "Sam Palmisano", the contestant would respond "Who is the CEO of IBM Corporation?" Both the categories and the answers are filled with puns, slang and humor to make it more challenging. Often, the answer itself is not a sufficient clue; you have to factor in the category as well to have a complete set of information.
The game is played in three rounds:
In the first round, there are six categories, and the rows are worth $200, $400, $600, $800 and $1000. If you respond correctly on all five answers in a category column, you win $3000. If you respond to all thirty answers correctly, you earn $18,000.
In the second round, there are six different categories, and the rows are worth twice as much.
The final round has a single category and a single answer. Each player can wager up to the full amount of their score in the game. The wager is made after they see the category, but before they see the answer.
After the host finishes reading the answer aloud, the buzzers are lit so that the contestants can buzz in. If a contestant responds correctly, he earns the corresponding money for the row it was in. If the contestant guesses incorrectly, the money is subtracted from his score, and the buzzers are re-lit so the other two contestants can buzz in with their responses, learning from the previous failed attempt.
To provide added challenge, some of the answers are surprise "Daily Doubles". Instead of the dollar amount for the row, the contestant can wager any amount, up to the total score they have won so far in that game or the largest dollar amount for that round, whichever is higher, based on their confidence in that category. There is one "Daily Double" surprise in the first round, and two in the second round.
In the final round, each contestant wagers an amount up to their total score, based on their confidence in the final category. A common strategy for the contestant with the highest score is to wager a low amount, so that even if he responds incorrectly, he still finishes with a large dollar amount. For example, if the leader has $2000 and second place has $900, the leader can wager just $100 while second place wagers his full $900. If the leader misses, he still has $1900, beating second place regardless of how well the other contestant does. (See the arithmetic below.)
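In general, if the leader has L dollars and second place has S, the trailing contestant can at most double to 2S, so any wager w satisfying

\[ L - w > 2S \quad\Longleftrightarrow\quad w \le L - 2S - 1 \]

guarantees the win even on a miss. In the example, w ≤ 2000 − 2(900) − 1 = 199, so the $100 wager is safely under the limit. (This only works in a "runaway" game where L > 2S; otherwise the leader must risk enough to cover a doubled second-place score.)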
Whoever has the most money at the end of all three rounds wins that amount in cash, and gets to return for another game the next day to continue the winning streak. The other two contestants are given consolation prizes and a nominal appearance fee for being on the show, and are never seen again.
The show is only 30 minutes long, so the folks at Sony Pictures who produce it can film a full week's worth of shows in just two days, Tuesday and Wednesday, allowing the host Alex Trebek and his "Clue Crew" time to research new categories and answers.
So, here is your homework assignment. Record a full episode of Jeopardy on your VCR or Digital Video Recorder (DVR) and have your thumb ready to press the pause button. For each round, listen to each category, pause, and try to guess what all the answers in that column will have in common. For each category, write down a statement like "All the responses in this category are ...".
The answers could be people, places or things. Suppose the category is "Chicks Dig Me". In English, "chicks" can be slang for women, or refer to young chickens. The term "dig" can be slang for admires or adores, so this could be "male celebrities" that women find attractive, objects of desire that women fancy (diamonds, puppies, etc.), or places that women like to go. As it turns out, the "dig" referred to archaeology, and the responses were all famous female archaeologists.
Once you have all your statements written down, press the play button again.
Next, as each answer is shown, you have three seconds to hit pause again, so that you have the answer on the screen before any contestant has responded. Go to your favorite search engine, like Google or Bing, and try to determine the correct response based on the category and answer. Consider these [tips for being an Internet Search ninja]. Once you think you have figured out your response, write it down along with the dollar amount you wager, or decide not to respond for that answer if you are unsure of your findings.
Even if you think you already know the correct response, you may decide to gain more confidence of your response by finding confirming or supporting evidence on the Internet.
Press play. Either one of the contestants will get it right, or the host will provide the question that was expected as the correct response.
How well did you do? Were you able to find the correct response online, or at least confirm that what you knew was correct? If you got it right, add the dollar amount to your score. If you got it wrong, subtract the amount.
At the end of each round, look back at your statements for each category. Did you guess correctly the common theme for each category column of answers? Did you misinterpret the slang, pun or humor intended?
At the end of the game, you might have done better than the contestant who won. However, check how much added time you took to do those Internet searches. The average winner responds to only half of the answers and gets only 80 percent of those correct.
If you are really brave, take the [Jeopardy Online Test]. If you do this homework assignment, feel free to post your insights in the comments below.
We are only days away from the big IBM Challenge of Watson computer against two human contestants on the show Jeopardy!
I watched two episodes of Jeopardy! on my Tivo, pausing it to follow the [homework assignment] I suggested in my last post. Here are my own results and observations.
The first episode involved a web programmer, a customer service representative, and a bank teller.
Of the six categories in Round 1, I correctly guessed four of the themes. For the category "Diamonds are Forever", I wrote down "All answers are some kind of gem or mineral", but in reality the answers were all physical characteristics of diamonds specifically. For the category "...Fame is not", I wrote down "All answers are TV or movie celebrities". I was close, but it was actually famous celebrities, rock bands and pop culture of the 1980s. (The movie "Fame" came out in 1980.)
In this round, 27 of the 30 answers were given before time ran out. Of these, I was able to get 24 correct by searching the Internet; that is 88 percent. Here are the ones that eluded me:
Answer related to a "multi-chambered mollusk". I could not find anything definitive on the Internet, so I abstained from wagering. The correct question was "What is Nautilus?"
Answer was the Irish variant of "Kathryne". I found Kathleen as a variant, but did not investigate if it had Irish origins. The correct question was "What is Caitlin?"
Answer was this Norse name for "ruler", whether you had red hair or not. I found "Roy" and "Rory", so I guessed "What is Rory?" The correct question was "What is Eric?"
In the second round, I guessed three of the six themes for the categories. For the category "Musical Titles Letter Drop" I wrote down "All the answers are titles of musical songs", but it was actually "Musicals" as in Broadway shows. For the category "Place called Carson", I wrote down "All the answers are places" and was way off; the answers were people, places and names of corporations. And for "State University Alums", I wrote down "All the answers are college graduates", but instead they were all state universities, such as the University of Arizona.
In this second round, only 26 answers were posed. I got 80 percent correct with Internet searching. I missed three in "Musical Titles", one in "Pope-pourri" and one state university (sorry, SMU). The "Musical Titles Letter Drop" category was especially difficult: for each title of a musical, you had to drop a single letter out of it to form the correct response.
For the answer "Good luck when you ask the singers "What I Did For Love"; they never tell the truth", you would need to take "Chorus Line" the musical, where the song "What I did for Love" appears, and ask "What is Chorus Lie?" Note that "line" changed to "lie" and the letter "n" was dropped out.
For the answer "Embrace the atoms as Simba and company lose and gain electrons en masse in this production", you would need to recognize that Simba was the main character of "The Lion King" and change it to "What is The Ion King".
I think these plays-on-words are the clues that would stump the IBM Watson computer.
In the final round, the category was "Ancient Quotes". I thought the answer would be a famous adage or quotation, but it was instead famous people who uttered those phrases. The answer was "He said, to leave this stream uncrossed will breed manifold distress for me; to cross it, for all mankind". I was able to determine the correct response readily from searching the Internet: The river was the Rubicon, the border of the Gaul region governed by an ambitious general. The correct response "Who was Julius Caesar?"
Total time for the entire exercise: 87 minutes.
The following night, the second episode brought back Paul Wampler, the returning champion web programmer, against two new contestants: an actor and a high school principal.
Of the six categories in Round 1, I correctly guessed five of the themes. For the category "Nonce Words", I wrote that all the answers would be nonsense words. I was close: the clues contained words invented for a particular occasion, but the correct responses did not.
I was able to get 29 of 30 correct by searching the Internet. That is 96 percent. The one I missed was in the category "Nonce Words": the answer was "In an arithmocracy, this portion of the population rules, not trigonometry teachers." My response was "What is math?" but the correct response was "What are the majority?" It did not occur to me to even look up [Arithmocracy] as a legitimate word, but it is real.
In the second round, I guessed five of the six themes for the categories. For the category ""Hawk" Eyes", the word "Hawk" was in quotation marks, so I wrote that all answers would start with the word "Hawk" or end with the word "eyes". I was close; the correct theme was that the word "hawk" would appear at the front, middle or end of the correct response.
In this second round, I got 28 of 30 correct with Internet searching, or 93 percent. Ironically, it was the category "German Foods" that caught me off guard.
One answer was "Pichelsteiner Fleisch, a favorite of Otto von Bismarck, is this one-pot concoction, made with beef & pork". I know that "Fleisch" is the German word for meat, so I guessed "What is sausage?", but the correct response was "What is stew?" I should have paid more attention to the "one-pot concoction" part of the answer.
For the answer was "Mimi Sheraton says German stuffed hard-boiled eggs are always made with a great deal of this creamy product". I didn't realize that "stuffed eggs" was German for "deviled eggs". Instead, I found Mimi Sheraton's "The German Cookbook" on Google Books, and jumped to the page for "Stuffed Eggs" The ingredients I read included whippedc cream, cognac, and worcestershire sauce. Taking the "creamiest" ingredient of these, I wrote down "What is whipped cream?" However, it turned out I was actually reading the ingredients for "Crabmeat Cocktail" that was coninuing from the previous page. I thought it was gross to put whipped cream with eggs, and should have known better. The correct response was "What is mayonnaise?"
In the final round, the category was "Political Parties". This could either be political organizations like Republicans and Democrats, or festivities like the Whitehouse Correspondents Dinner. The answer was "Only one U.S. president represented this party, and he said, I dread...a division of the republic into two great parties." So, we can figure out the answer refers to political organizations, but both Democrat and Republican are ruled out because each has had multiple presidents. So, looking at a [List of Political Parties of each US President], I found that there were four presidents in the Whig party, four in the Democrat-Republic party, but only one president in the Federalist party (John Adams), and one in the War Union party (Andrew Johnson). Looking at [famous quotes from John Adams] first, I found the quote, it matched, and so I wrote down "What is the Federalist party?". I got it right, as did two of the three contestants. Ironically, the one contestant who got it wrong, the returning champion web programmer, wagered a small amount, so he still had more money after the round and won the game overall.
Total time for the entire exercise: 75 minutes. I was able to finish faster because I skipped searching the Internet for the responses I was confident about.
To find out when Jeopardy is playing in your town, consult the [Interactive Map].