Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson, Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
How do you define success? For some, it is based on their salary, or perhaps revenue they helped close for their company.
For others, their family life and the flexibility to handle work/life issues might be more important.
Still others look for certifications and awards from official agencies.
As a side gig, I sometimes do bartending on the weekends. Typically, these are for weddings or corporate parties.
I took weeks of bartender training and passed a three-hour exam to become state-certified to do so in Arizona. We Arizonans take our liquor seriously! If you think about it, bartending is just a notch below being a Pharmacist dispensing other drugs.
Surprisingly, some of my patrons will be condescending: "Don't you wish you could do more with your life than be a bartender?"
I am also a certified "Laughter Yoga" instructor, and am called in at times to substitute for other instructors. Again, I took formal training and was certified to do so.
Again, some of my students will ask, "Don't you wish you could do more with your life than be a yoga instructor?"
In both cases, I would respond, "Dude, I earn six figures, and am happy to meet new people every week, how about you?" This usually shuts them up!
(For those interested, here are [my top 10 posts] which served as the basis of the interview!)
I am happy to be recognized externally and within IBM for my success as a blogger. Since I started blogging over 10 years ago, I have helped close over $4 Billion USD in revenue for IBM, written five books on IBM Storage, mentored dozens of other successful bloggers, and presented to thousands of clients at conferences, workshops and briefings.
Well, it's Tuesday again, and you know what that means? IBM Announcements! I am here in New York for the exciting news!
(FCC Disclosure: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for the IBM z14 mainframe and DS8880 Storage System.)
In support of the [IBM z14] mainframe announcement, IBM has also disclosed R8.3 enhancements for the DS8880 Storage System. Here is a quick recap:
New Tier-1 Flash Capacities available for HPFE Gen2 drawers
IBM introduces a new Tier-1 flash card with 3.84 TB capacity. In the past, IBM DS8880 only supported Tier-0 cards that support 10 Drive Writes per Day (10 DWPD), with capacities 400, 800, 1600 and 3200 GB. The Tier-1 flash card is rated for only 1 DWPD. Such cards are often dubbed "Read-Intensive" devices, but can actually handle about 90 percent of production workloads.
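To make the endurance ratings concrete, here is a minimal sketch of the arithmetic behind DWPD. It assumes decimal units (1 TB = 1000 GB) and a five-year period of 365-day years, which is an illustrative assumption and not an IBM warranty figure. The function names are hypothetical.

```python
def daily_write_budget_gb(capacity_gb: float, dwpd: float) -> float:
    """Data that can be written per day without exceeding the DWPD rating."""
    return capacity_gb * dwpd


def lifetime_writes_pb(capacity_gb: float, dwpd: float, years: float = 5.0) -> float:
    """Total rated writes over the period, in decimal petabytes.

    The five-year default is an assumption for illustration only.
    """
    return daily_write_budget_gb(capacity_gb, dwpd) * 365 * years / 1_000_000


# Largest Tier-0 card mentioned in the post: 3200 GB at 10 DWPD
tier0_daily = daily_write_budget_gb(3200, 10)
# New Tier-1 card mentioned in the post: 3.84 TB (3840 GB) at 1 DWPD
tier1_daily = daily_write_budget_gb(3840, 1)

print(f"Tier-0 3200 GB card: {tier0_daily:,.0f} GB/day")
print(f"Tier-1 3840 GB card: {tier1_daily:,.0f} GB/day")
```

A workload that rewrites less than one full card capacity per day fits within the Tier-1 rating under these assumptions.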
zHyperLink™ drastically reduces the latency between the IBM z14 mainframe and the DS8880 storage systems. Traditional FICON paths through SAN switches or directors introduced about 140 to 175 microseconds of latency between systems. This new system is a direct cable, with 20 microsecond latency.
The I/O bays on the DS8880 used for HPFE Gen2 already have zHyperLink ports on them. This direct cable is limited to 150 meters, however, so plan accordingly.
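As a rough illustration of why link latency matters, the sketch below computes the upper bound on synchronous I/O operations per second for a single thread that waits for each I/O to complete before issuing the next. It uses only the link latencies quoted above and ignores device service time and software overhead, which is a simplifying assumption.

```python
def max_sync_ops_per_second(latency_us: float) -> float:
    """Upper bound on back-to-back synchronous I/Os for one thread."""
    return 1_000_000 / latency_us


ficon_slow = max_sync_ops_per_second(175)   # FICON through a SAN, high end
ficon_fast = max_sync_ops_per_second(140)   # FICON through a SAN, low end
zhyperlink = max_sync_ops_per_second(20)    # direct zHyperLink cable

print(f"FICON:      {ficon_slow:,.0f} to {ficon_fast:,.0f} ops/sec per thread")
print(f"zHyperLink: {zhyperlink:,.0f} ops/sec per thread")
print(f"Ratio:      {zhyperlink / ficon_fast:.2f}x to {zhyperlink / ficon_slow:.2f}x")
```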
Transparent Cloud Tiering
IBM already announced Transparent Cloud Tiering to IBM Bluemix, IBM Cloud Object Storage and the IBM TS7760 virtualization engine in the R8.2.3 release. The new Release 8.3 of DS8880 now adds support for Amazon S3, providing yet another choice of where to migrate data sets. IBM also adds replication, allowing the data set to be migrated to two separate target locations, for added availability, much like writing to separate ML2 tape cartridges.
Cascading FlashCopy is a feature that has existed for a while now on IBM XIV and SAN Volume Controller platforms, so this is just a port of that concept over to the DS8880 microcode. Now, a FlashCopy target can become the source of a follow-on FlashCopy request. You can make copies of copies. This applies to both the volume and data set level functions.
Why would anyone do this? Well, you might suspend your application at midnight and create a clean FlashCopy of a 24-by-7 ever-changing database. Then in the following morning, workers who need a static "midnight version" of the database now can use this as their source and perform additional FlashCopy requests for their own needs.
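The "copies of copies" idea can be sketched with a toy copy-on-write model. This is an assumption-laden illustration only: the `Volume` class and its methods are hypothetical and do not represent how the DS8880 microcode implements FlashCopy.

```python
from typing import Dict, List, Optional


class Volume:
    """Toy copy-on-write volume (hypothetical model, not IBM's design)."""

    def __init__(self, name: str, source: Optional["Volume"] = None):
        self.name = name
        self.blocks: Dict[int, Optional[str]] = {}  # blocks physically held here
        self.source = source                        # unresolved reads fall through
        self.targets: List["Volume"] = []           # point-in-time copies of this volume

    def read(self, block: int) -> Optional[str]:
        if block in self.blocks:
            return self.blocks[block]
        return self.source.read(block) if self.source else None

    def write(self, block: int, data: str) -> None:
        # Before overwriting, preserve the old contents in every target
        # that has not yet received its own copy of this block.
        old = self.read(block)
        for target in self.targets:
            if block not in target.blocks:
                target.blocks[block] = old
        self.blocks[block] = data

    def flashcopy(self, name: str) -> "Volume":
        """Create a point-in-time copy; a target may itself be copied."""
        target = Volume(name, source=self)
        self.targets.append(target)
        return target


db = Volume("production")
db.write(0, "balance=100")
midnight = db.flashcopy("midnight")      # clean copy taken at midnight
db.write(0, "balance=250")               # production keeps changing
report = midnight.flashcopy("report")    # a copy of the copy

print(db.read(0), midnight.read(0), report.read(0))
```

In this sketch the production volume returns the new value, while both the midnight copy and the copy made from it still return the midnight value.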
IBM DS8880 MES Support
MES is an abbreviation for "Miscellaneous Equipment Specification", one of the many Three Letter Acronyms [TLA] where knowing what the words stand for doesn't help. In short, an MES is a formally supported option to upgrade a piece of hardware that is already installed and running at a client location. IBM will offer MES to upgrade existing DS8880 systems to have the additional HPFE Gen2 drawers, and to upgrade the I/O bays to support zHyperLink connections.
(Final note: you might notice the change in upper and lower case. The IBM z14 (lower case) refers to the specific mainframe model, consistent with its predecessors the z13 and z13s, but the family name "IBM z Systems" has been shortened to "IBM Z®" (upper case). IBM Storage Systems and IBM POWER Systems were already upper case, so the mainframe guys just wanted to follow suit. I suspect "IBM i" will remain lower case, however.)
Well, it's Tuesday again, and you know what that means? IBM Announcements!
IBM Elastic Storage Server
Replacing the older "GSn" and "GLn" models, IBM announces the "Second Generation" GSnS and GLnS models (the second "S" stands for Second Generation). The "n" continues to refer to the number of storage drawers. All of these have a pair of POWER8 servers to drive amazing performance at a low price point.
The "GSnS" models are based on smaller 2U, 24-drive storage drawers, with 3.84 and 15.36 TB Tier-1 Read-intensive Solid-State Drives (SSD). The "GLnS" models are based on larger 5U, 84-drive storage drawers, with 4TB, 8TB and 10TB nearline (7200 rpm) spinning disk.
These new models have the latest IBM Spectrum Scale software pre-installed.
In addition to IBM's two existing Hyperconverged offerings--IBM Spectrum Accelerate for x86 servers, and IBM Spectrum Scale for x86, POWER and z Systems servers--IBM Power Systems now offers a third option. This integrated offering combines Nutanix's Enterprise Cloud Platform software with IBM Power Systems™ hardware to deliver a turnkey hyperconverged solution that targets critical workloads in large enterprises.
Nutanix is offered and will be defaulted/required on these Power® servers only:
While "Hyperconvergence" is still fairly new, and only about 1 percent of data centers have deployed this new technology, I am glad that IBM is a leader in this space with multiple offerings across both x86 and POWER systems platforms.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here is my recap of the sessions on the morning of Day 5, the last day of the conference.
Integrating IBM Storage in Container Environments
Dr. Robert Haas, IBM CTO Storage for Europe, presented IBM Storage for Docker containers. These are different from containers in IBM Cloud Object Storage, and different from the Container Pools used in Spectrum Protect.
Robert gave an overview of IBM Spectrum Conductor, part of the IBM Software Defined Infrastructure (SDI) Spectrum Compute family of software products. The goal is to analyze large amounts of data, access these data efficiently, and protect the data, results and insights as intellectual property.
IBM Spectrum Compute comes in several offerings. IBM Spectrum LSF (Load Sharing Facility) manages long-running batch jobs for modeling, design and simulations. IBM Spectrum Symphony provides low-latency for risk analytics in the financial services sector. IBM Spectrum Conductor comes in two flavors. Conductor for Spark (CFS) manages Spark analytics. Conductor for Containers (CFC) handles Docker and Kubernetes containers.
Docker is the run-time platform. While there are other container run-time platforms like RKT and LXD, Docker is clearly the marketshare leader, growing 40 percent per year.
Statistics from the latest DockerCon2016 conference showed the most popular use cases and workloads for Docker. What can run in Docker: Lots of applications can be "containerized", including Redis, MongoDB, PostgreSQL, OracleDB, Java, to name a few. Docker is well established in enterprises, including service providers, healthcare, insurance and financial services, public sector, and technology firms.
Kubernetes, Mesos and Docker/Swarm are a layer above, as orchestrators. Spectrum Conductor for Containers uses Kubernetes and other open source tools to coordinate activity. Orchestrators restart failed applications, and can scale up or scale down the number of instances as needed. Orchestrators can manage groups of applications, across clusters on-premises and off-premises Cloud.
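The restart and scale behavior described above boils down to comparing a desired state against what is actually running. The sketch below shows that idea in its simplest form. The `reconcile` function and the application names are hypothetical, and this is not the Kubernetes API or the Spectrum Conductor implementation.

```python
from typing import Dict, List


def reconcile(desired: Dict[str, int], running: Dict[str, int]) -> List[str]:
    """Return the actions needed to bring running instances to the desired count."""
    actions = []
    for app, want in desired.items():
        have = running.get(app, 0)
        if have < want:
            # Covers both scale-up and restarting instances that failed.
            actions.append(f"start {want - have} x {app}")
        elif have > want:
            actions.append(f"stop {have - want} x {app}")
    return actions


# Two web instances failed and one database instance too many is running.
print(reconcile({"web": 3, "db": 1}, {"web": 1, "db": 2}))
```

A real orchestrator runs a loop like this continuously, so a failed container simply shows up as a shortfall on the next pass.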
From a storage perspective, containers access storage like bare-metal operating systems, bypassing all of the layers normally associated with bloated Virtual Machine hypervisors. It also eliminates single root I/O virtualization (SR-IOV) that VMs use to compensate.
Persistent storage can be isolated, so that containers cannot see the files of other containers. This provides multi-tenancy.
Internal persistent storage (directory on host file system). However, if you move a container from one host to another, you may lose access to this internal storage.
External volume, manually mounted.
Volume driver plug-in REST API that automatically mounts it.
The fourth method is preferred. Plug-ins are available for IBM Spectrum Scale, GlusterFS, Portworx, Rancher Convoy, RexRay, and Contiv. The start-up Flocker went out of business last year.
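To give a feel for what a volume driver plug-in does, here is a minimal sketch of a request dispatcher. The endpoint names and the `Name`, `Mountpoint`, `Err` and `Implements` fields follow Docker's volume plug-in protocol as best understood here, so verify them against the current Docker documentation. The in-memory state and the `MOUNT_ROOT` path are hypothetical, and a real plug-in would serve these requests as JSON over HTTP and actually mount the external storage.

```python
from typing import Dict

# In-memory state for the sketch: volume name -> mount point on the host.
_volumes: Dict[str, str] = {}
MOUNT_ROOT = "/mnt/demo-volumes"  # hypothetical path, for illustration only


def handle(endpoint: str, body: dict) -> dict:
    """Dispatch one plug-in request and return the JSON-serializable reply."""
    name = body.get("Name", "")
    if endpoint == "/Plugin.Activate":
        return {"Implements": ["VolumeDriver"]}
    if endpoint == "/VolumeDriver.Create":
        _volumes[name] = f"{MOUNT_ROOT}/{name}"
        return {"Err": ""}
    if endpoint == "/VolumeDriver.Mount":
        if name not in _volumes:
            return {"Err": f"no such volume: {name}"}
        # A real driver would mount the external file system here.
        return {"Mountpoint": _volumes[name], "Err": ""}
    if endpoint == "/VolumeDriver.Unmount":
        return {"Err": ""}
    if endpoint == "/VolumeDriver.Remove":
        _volumes.pop(name, None)
        return {"Err": ""}
    return {"Err": f"unsupported endpoint: {endpoint}"}


print(handle("/Plugin.Activate", {}))
print(handle("/VolumeDriver.Create", {"Name": "pgdata"}))
print(handle("/VolumeDriver.Mount", {"Name": "pgdata"}))
```

Because the mount happens on whichever host asks for it, a container that moves to another host can reach the same external volume by name.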
The Docker hosts can attach to IBM Spectrum Scale in all of its supported offerings, including POSIX, NFS and SMB protocol. Containerized applications can move from one Docker host to another, and continue to access the IBM Spectrum Scale namespace.
IBM has created the "Ubiquity Volume Service" that provides a consistent API for Docker and Kubernetes. This will use IBM Spectrum Control Base Edition to support IBM Spectrum Scale, Spectrum Accelerate, Spectrum Virtualize and DS8000 storage systems. For IBM Spectrum Scale, volumes are mapped to iSCSI volumes, filesets or directories. For other devices, volumes are mapped to block LUNs. Ubiquity is publicly available on GitHub.
Enterprise Applications for IBM Cloud Object Storage
Andy Kutner, IBM Cloud Architect, presented the various options available for NAS gateways that can front IBM Cloud Object Storage.
Ctera offers NAS gateways, and Endpoint agents for backup and Enterprise File Sync & Share (EFSS). This vendor targets Remote Office/Branch Office (ROBO) and small NAS consolidation that have less than 60 TB per office. IBM is a reseller of Ctera, so you can get both Ctera and IBM COS from the same IBM sales rep.
Nasuni offers a global file system, accessible from any device, smartphone, tablet or desktop. They are focused on taking out EMC and NetApp NAS solutions. Performance at the edge, combined with capacity in the client's chosen Cloud (including IBM Cloud Object Storage or IBM Bluemix). Infinite snapshots replace backups, offering RPO of 1 minute for Disaster Recovery. Their global file system "UniFS" offers file locking.
Panzura focuses on Cloud Integrated NAS, File Distribution, and Collaboration. This can help eliminate "islands of storage". The File Distribution can be any type of file, but was originally designed for Media and Entertainment, such as videos. Collaboration employs EFSS features for workgroup shared file folders, such as CAD/CAM or engineering blueprints.
IBM Spectrum Scale can provide NFS and SMB access to files, and then move colder, less active data to IBM Cloud Object Storage, using the Transparent Cloud Tiering feature. Spectrum Scale offers WAN caching across locations.
IBM COS now offers a native NFS v3 interface. This allows read/write NFS access, with S3 API read of the same content. Each file is mapped to a single object.
This is targeted for large scale archive, static-and-stable data, NFS-based backup software, and applications going through the transition from file-based to object-based. This is not intended for multi-site collaboration or primary NAS replacement. Regardless of the number of geographically dispersed IBM COS sites, the NAS can run on only one or two sites initially.
To provide NFS v3 support, IBM introduces new F5100 File Accessers, which talk to an IBM COS Accesser, which in turn acts on specific Vaults in the storage pools. The file-to-object mapping metadata is replicated on-premises across three File Accessers, and optionally replicated asynchronously to a second site for High Availability. The S3 API can read a file either by file name or by Object URI.
Initially, the "File Accesser" is only available as a pre-built system, not as software-only.
There was not enough time to cover other solutions, including Avere, NetApp AltaVault, or Open Source S3FS.
This was a great event, just the right size, between 1,500 and 2,000 attendees. Similar IBM Technical University events coming up later this year:
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Thursday evening, we had the "Meet The Experts" sessions. There were four: Storage, Power Systems, z/OS, and a fourth one focused on z/VM and Linux on z Systems. I was on the expert panel for Storage.
Mo McCullough was the emcee. Special thanks to Shelly Howrigon for her help with this event.
(Disclaimer: Do not shoot the messenger! We had a dozen or so experts on the panel, representing System Storage hardware, software and services. I took notes, trying to capture the essence of the questions, and the answers given by the various IBM experts. The answers from individual IBMers may not reflect the official position of IBM management. I leave out any references to unannounced plans or products. Where appropriate, my own commentary will be in italics.)
When will IBM offer a single pane of glass management for all of its IBM storage products?
IBM is working hard on this. Our strategy is to focus on IBM Spectrum Control as the primary answer. We have extended support across block, file and object, with support for IBM Spectrum Scale and IBM Cloud Object Storage System. We have also provided plug-ins for VMware, Cisco UCS Director, and OpenStack Horizon, for those who prefer those management systems instead.
What we really need are REST APIs!
Good point. IBM already has some REST APIs for the DS8000, XIV and Spectrum Protect. Now that IBM has a browser-based GUI across its entire product line, it is our strategy to offer REST APIs across our product line as well.
What is the next generation of ProtecTIER Data Deduplication going to look like?
IBM is focused on providing "data deduplication" for backup workloads directly through IBM Spectrum Protect backup software. IBM continues to sell IBM ProtecTIER.
(Virtual Tape Libraries like IBM ProtecTIER and Dell EMC Data Domain were created to handle the fact that many backup software products back then were designed only for tape drives and libraries. A VTL was disk that pretended to be a tape library. Now that IBM Spectrum Protect, NetBackup, Commvault, and all of the other modern backup products write natively to disk, object storage or Cloud services, there really isn't a need for VTL products any more.)
Why does IBM bother with all-Flash version of DS8000 when it already has IBM FlashSystem?
Different products for different workloads. IBM DS8000 offers unique support for z System mainframe FICON attachment and 520-byte block support for IBM i. IBM also offers all-Flash Elastic Storage Server, all-Flash SVC and Storwize products, that complement the IBM FlashSystem product line.
We like how XIV can hot-enable encryption, even with existing data on it. Why doesn't DS8000 offer this?
Two separate implementations. At the time IBM DS8000 encryption was designed, it was decided that the client needed to enable encryption before writing any data.
Will we see a spinning disk version of the FlashSystem A9000?
Flash is now less expensive than spinning disk, I don't see why IBM would go backwards. The future is Flash.
We would like Spectrum Control to manage our Dell EMC Isilon
Yes, we have heard that from others. We are working on extending our third party support. Send in your cards and letters to help us prioritize. Or, better yet, submit a "Request For Enhancement" (RFE).
The difference between Tier 0 (Write Endurance) flash and Tier 1 (Read Intensive) flash is confusing, are there any plans in the IT industry to simplify this?
No, if anything it will get worse. Today, IBM's Tier 0 is 10 Drive Writes Per Day (DWPD), and Tier 1 is 1 DWPD. Other SSD drives offer 2, 3, 5, 10, 15 and 25 DWPD. As people buy more Flash, and less disk, expect more differentiation in this area.
We would like to tune Easy Tier on the Storwize products
Understood. IBM typically implements new features on the DS8000 platform first, then rolls them over to Spectrum Virtualize. The ability to influence allocation order, pin or avoid tiers, and have application API to influence the placement are already in DS8000.
What will the future of Storwize look like?
We don't have enough time to cover that in this meeting.
Recently, you raised the maximum Storwize FlashCopy background copy rate from 64 MB/sec to 2 GB/sec, but is that realistic?
The setting provides the background task a target "grains per second" to try to achieve. It may not be possible depending on your configuration and the number of concurrent tasks. Your Storwize may be so busy with background activity that it won't take host I/O.
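For a sense of scale, the sketch below converts a background copy rate into grains per second and a best-case copy time. The 256 KiB grain size and the 2 TiB volume are assumptions chosen for illustration, and the rates are treated as binary units (MiB/sec). Check the grain size of your own FlashCopy mappings before relying on these numbers.

```python
GRAIN_KIB = 256  # assumed grain size, for illustration only


def grains_per_second(rate_mib_s: float, grain_kib: int = GRAIN_KIB) -> float:
    """Target grains per second implied by a background copy rate."""
    return rate_mib_s * 1024 / grain_kib


def copy_hours(volume_gib: float, rate_mib_s: float) -> float:
    """Best-case hours to background-copy a volume at a sustained rate."""
    return volume_gib * 1024 / rate_mib_s / 3600


for label, rate in (("64 MB/sec", 64), ("2 GB/sec", 2048)):
    print(f"{label}: {grains_per_second(rate):,.0f} grains/sec, "
          f"{copy_hours(2048, rate):.2f} hours for a 2 TiB volume")
```

These are ceilings: as the answer above notes, the configured rate is a target that the system may not reach while it is also serving host I/O.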
We have been giving you our wishlist, but are there any questions the IBM experts have for the audience?
Yes, are there any clients being asked to secure storage against Ransomware and insider threats from disgruntled employees?
(Several hands went up, and we collected their names to have further discussions.)
How should we assign business value to data?
IBM Spectrum Virtualize allows you to assign metadata tags to files, so that these can be used to drive different policies.
(The process of assigning business value is often called "Data Rationalization" and is part of ILM, BC/DR, and Data Governance efforts.)
I am concerned that AES 256 encryption is not good enough now that there is Quantum Computing.
It will be decades before Quantum Computing will be good enough to break these codes.
Will Blockchain drive huge or unique storage requirements?
No. The entries are small. You are appending small transactions to the end of existing ledgers. Nothing unique or different.
Were there any topics not adequately covered at this conference?
IBM didn't have much to offer for Spectrum Compute family of software, the Software Defined Infrastructure (SDI) that runs on both x86 and POWER systems. This should be done under the POWER brand, but many clients use Spectrum Compute with x86 servers. Ironically, Spectrum Compute products are managed under the Storage division, since Spectrum Compute and Spectrum Storage work well together.
We would like Storwize's clever NPIV to be implemented in all of the other IBM arrays, starting with DS8000.
That probably won't happen, as they are different architectures. Whereas Storwize and the rest of IBM Spectrum Virtualize family were designed for nodes to fail, and take their ports down with them, the DS8000 has independent I/O bays that continue to run independent of either POWER8 node. Likewise, FlashSystem 900 has similar separation between the FCP adapters and the processing nodes.
Can we have consistent licensing across the entire IBM Spectrum Virtualize set of products, please?
We have a task force to investigate this, and will gladly add your name to the list for input and feedback.
While the conference continues Friday morning, for many attendees, this was the last event.
IBM Spectrum Scale was formerly called GPFS and has been around since 1998. I am glad it was renamed, as GPFS suffered from "guilt by association" with other file systems, AFS, DFS, XFS, ZFS, and so on.
Spectrum Scale does so much more. It supports volume, file and object level access; supports POSIX standards for Windows, AIX and Linux; supports Hadoop and Spark with a 100 percent compatible HDFS Transparency Connector; and supports NFS, SMB and iSCSI protocols, as well as OpenStack Swift and Amazon S3 object-based access.
Initially designed for video streaming and High Performance Computing (HPC), IBM has extended its reach to work in a variety of workloads across different industries. More than 5,000 production systems are running at client locations.
IBM Spectrum Protect solution design: Server, Deduplication and Disaster Recovery decisions
Dan Thompson, IBM Storage Software Technical Sales Specialist, presented this session.
To make it easier to deploy, IBM Spectrum Protect now has a set of tested "blueprints" that are organized into small, medium and large. Find the one that fits your needs, and it will tell you exactly how the server should be configured. Dan recommends having a "test system" to try out new releases of IBM Spectrum Protect.
For multiple server configurations, Dan recommends adopting a standard naming convention, and making use of Enterprise Configuration and server-side Client Option Sets. You may want to consider discrete instances for special non-backup functions, like library manager or Operations Center hub server, which allows you to upgrade more aggressively without affecting your backup clients.
If you plan to run multiple Spectrum Protect instances on the same VMware host, set the DBmemPercent to avoid having DB2 consume all of the memory, which will interfere with other Spectrum Protect instances.
For clustered servers, IBM supports Active/Passive, Active/Active, Many/One, and Many/Few configurations. You can mix and match these as needed.
For data spill remediation, consider NIST 800-88 data shredding. This depends on the type of storage media used.
IBM Spectrum Protect for Data Retention, formerly called System Storage Archive Manager (SSAM), offers Non-erasable, Non-Rewriteable (NENR) enforced immutability protection. (This used to be called Write-Once-Read-Many or WORM for short, but since WORM applies only to tape and optical media, and IBM Spectrum Protect now supports Flash, Disk, Object Storage and Cloud repositories, IBM has adopted the term NENR instead.) Third party KPMG has certified that IBM Spectrum Protect for Data Retention meets, to their satisfaction, the requirements of SEC 17a-4 regulations.
When sizing your server, Dan recommends that you always "over-size" it and grow into it. Use the published "Performance Optimization Guide" to help. Monitor the server and storage using OS and device specific monitoring, in combination with IBM Spectrum Protect reports.
If you are still on BC Tiers 1 or 2, transmitting tapes to a remote vaulting facility or secondary data center, consider upgrading to BC Tier 3 at least. This can be done via electronic vaulting to an Automated Tape Library (ATL), Virtual Tape Library (VTL) or IBM Cloud Object Storage, or a Cloud service provider such as IBM Bluemix or Amazon Web Services. This can be supplemented using DB2 HADR for the IBM Spectrum Protect database.
While Spectrum Protect server can run bare-metal or as a VM, the VM instance will not have support for FCP-based tape or Virtual Tape Library. Many people are moving off tape, especially VTL, and using native Disk, Directory or Cloud container pools instead.
Lastly, take advantage of the fact that Operations Center can view all Spectrum Protect servers across all locations, which can be helpful.
Enabling Mission Critical NoSQL workloads using IBM trillions of operations technology
TJ Harris, from the IBM Storage CTO office, and Scott Brewer, FlashSystem Team Lead, co-presented this session.
They gave a background on NoSQL, the most popular being MongoDB. The IT industry estimates that NoSQL will grow 38 percent CAGR from 2015-2020.
The problem occurs when NoSQL applications go through a full file system stack to work with low-latency devices like Flash, especially when the writes are small, often just a few dozen bytes to 100 KB. Fortunately, IBM Research has created the "Trillions of Operations" project to explore ways to reduce the software stack, and make use of the NVMe protocol.
The top three challenges for NoSQL deployments are: (a) Cost, (b) Data management and retention, and (c) Data relevancy.
To enable innovation, MongoDB offers a "Storage Engine API" that allows others to compete in this space. Currently MMAP v1 and WiredTiger are supported. IBM Research implemented its "Trillions of Operations" project as a plug-in to this API, optimized for high rates of data ingest. Compared to Facebook's RocksDB, IBM was 14x faster on writes, and 2.1x faster on reads.
Another challenge is coordinating backups and disaster recovery when applications mix traditional RDBMS with these new NoSQL databases.
The week is nearly over, and I can see the light at the end of the tunnel. Everyone had a great time at last night's event at the Universal City Walk and Blue Man Group.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here is my recap of the sessions on the morning of Day 4.
Configurable IBM Spectrum Scale
Kent Koeninger presented IBM Spectrum Scale software, which Kent refers to as "Configurable Spectrum Scale" (or CSS for short), as opposed to the pre-built system known as Elastic Storage Server (ESS).
Why choose CSS versus ESS? Lower entry price. You can start with just two single-socket servers and a drawer of disk.
IBM Spectrum Scale was formerly called IBM General Parallel File System (GPFS). Many who tried earlier versions of GPFS found it difficult to configure, because it only had a command line interface. Now, Spectrum Scale has a fully-functional GUI, and clients have been able to install and configure Spectrum Scale in just 30 minutes!
How big can Spectrum Scale grow? As much as your budget can afford! With an architecture that can support YottaBytes of data and 900 quintillion files, you won't hit any limits anytime soon.
There are some unique capabilities of ESS not available in CSS. For example, ESS offers Spectrum Scale Native RAID (erasure coding) with fast rebuild times, and ESS is certified for SAP HANA. You can use any combination of CSS and ESS in the same Spectrum Scale environment to create a "data lake" for mixed workloads.
A good use case for Spectrum Scale, either CSS or ESS, is backup. Kent explained why it is an excellent option to store backups with enterprise backup software such as IBM Spectrum Protect or Commvault.
VersaStack - Hybrid Cloud like no other
This session was jointly presented by Chris Vollmar, IBM Storage Architect, and Brent Anderson, Cisco Global Consulting Systems Engineer. IBM and Cisco have been partners for more than 25 years.
VersaStack combines Cisco UCS x86 servers, Cisco Nexus and MDS switches, and IBM FlashSystem or Spectrum Virtualize storage.
What if you have a SAN Infrastructure built entirely from IBM b-type or Brocade-based switches? Cisco supports their SAN switches for this, but nobody has tested VersaStack in this combination, and UCS Director does not manage this combination, so IBM does not support this. Instead, for this situation, IBM recommends doing external connection via Ethernet, or using direct-attach configurations.
Each Cisco Validated Design undergoes four months of testing, and gives you a bulletproof process to deploy the solution.
There is a difference between Cisco UCS Manager and UCS Director. UCS Manager is available at no additional charge, but only manages the Cisco x86 servers. UCS Director is optionally extra priced, and manages Cisco servers, Cisco networking, and IBM Spectrum Virtualize storage.
Brent explained the benefits of UCS Management through policies and profiles.
Chris covered Cisco CloudCenter, which the Cisco team shortens to just "C3". IBM Spectrum Copy Data Management can be used to move snapshots of data between on-premises and off-premises Cloud to help in Hybrid Cloud configurations.
How to Design an IBM Spectrum Scale solution
Tomer Perry, IBM Spectrum Scale I/O Development, presented this session.
For those who want to bring up a quick IBM Spectrum Scale environment to play around with, you can do this in as little as 30 minutes. But to design a mission critical deployment, additional requirements may need to be addressed. You may need to consult with not just storage admins, but also application owners, network admins and security personnel.
Large companies have hundreds or thousands of applications, so Tomer recommends grouping these into "Workload families", based on data set types, access patterns and performance requirements. For NAS take-out, 80 percent of NAS I/O is "get attribute" that can easily be served directly from cache memory.
For each workload family, you may need to decide on snapshots, quotas, namespace (bind mounts, symlinks, etc.), security (ACL, encryption), estimated capacity, replication BC/DR, backup and ILM requirements.
Unless this is a completely greenfield deployment, the existing infrastructure needs to be evaluated. This includes the LAN and WAN network topology, name resolution (DNS), time services (NTP), Authentication (AD, LDAP, NIS, Keystone), Keyserver (IBM SKLM), Monitoring and Migration requirements.
Tomer suggests designing the environment in this order: Cluster, File System, Storage Pools, Fileset, Replication, and finally Monitoring.
Generally, you need three NSD servers per cluster. For those licensing Spectrum Scale Standard Edition by the socket, you may be tempted to put everything into one big cluster. The new capacity-based Spectrum Scale Data Management Edition eliminates that concern, so Tomer recommends having separate compute clusters and storage clusters, connected by cross-cluster mount. All nodes in a cluster are considered an "ssh" administration domain.
A single Spectrum Scale namespace can support up to 256 file systems. There are various reasons to have multiple file systems: block size, backup/recovery, snapshot, quotas, and cross-cluster isolation. If a file system gets corrupted, it will not affect other file systems. In an internal test, an "fsck" on a file system with 1 billion files and 1 PB of data took only 30 minutes to repair.
Storage Pool design can separate metadata from content, and workloads can be separated to different storage media. With ILM, HSM and TCT, you can move colder data to Cloud, Object Storage, Spectrum Protect or Spectrum Archive.
Filesets are tree branches for each file system. IBM Spectrum Scale supports both dependent and independent filesets. Filesets can be used for Non-erasable, Non-Rewriteable (NENR) Immutability, policies, quotas, and snapshots. Consider using a fileset instead of carving off a new file system.
Spectrum Scale offers both synchronous and asynchronous replication. For Synchronous, the ReadReplicaPolicy can be set to default, local or fastest. For Asynchronous, there are a variety of AFM modes (Read-only, Local-Update, Single-Writer, Independent-Writer, and Disaster Recovery). You may need to decide if your AFM gateways are dedicated or collocated. You will need to tune your TCP buffers for WAN performance to get the RPO you desire.
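As a rough illustration of the TCP buffer tuning mentioned above, a common rule of thumb is to size buffers to at least the bandwidth-delay product (link bandwidth times round-trip time). The sketch below only applies that rule of thumb. The function name and the 10 Gbps / 50 msec link are assumptions chosen for illustration, not figures from Tomer's session.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_sec: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a link of this speed and latency full."""
    return bandwidth_bits_per_sec * rtt_seconds / 8


# Hypothetical WAN link between AFM sites: 10 Gbps with a 50 msec round-trip time.
bdp = bandwidth_delay_product_bytes(10e9, 0.050)
print(f"TCP buffer needed to fill the pipe: {bdp / 1e6:.1f} MB")
```

If the buffers are smaller than this value, the sender stalls waiting for acknowledgements and the replication link runs below its rated speed, which in turn stretches the achievable RPO.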
The nice thing about IBM solutions is that you can start small, and grow big. In all of these examples above, IBM offers sizes to match nearly any IT budget.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the sessions of Day 3.
Ethernet-only SANs -- Myth or Reality?
Anuj Chandra, IBM Advisory Engineer, presented an excellent overview of Ethernet-based SANs. He started with a quick history of Ethernet, starting with Robert Metcalfe's original drawing for his concept.
In the past, Ethernet was used for email and message transfer, and so dropped packets were tolerated. However, with the use of Ethernet for SANs, many standards have been adopted to make Ethernet networks more robust. These meet requirements for Flow Control, Congestion management, low latency, data integrity and confidentiality, network isolation, and high availability.
These standards are known as IEEE 802.1Q "Data Center Bridging", including 802.1Qbb Priority Flow Control, 802.1Qaz Enhanced Transmission Selection, and 802.1Qau Congestion Notification. There is also the IETF Transparent Interconnection of Lots of Links (TRILL) to replace Spanning Tree Protocol (STP). All of these features are negotiated between the server and storage endpoints. Ethernet that supports these new standards is often referred to as "Converged Ethernet" since it handles both traditional email/message traffic as well as SAN data traffic.
In addition to 1GbE and 10GbE, we now have 2.5, 5, 25, 40, 50, 100 Gb Ethernet speeds. By 2020, Anuj estimates over half of all Ethernet ports will be 25 GbE or faster. Amazingly, some of these can work on existing 10BASE-T cables.
Anuj also covered Remote Direct Memory Access (RDMA), and the RDMA-capable Network Interface Cards (RNIC) that support them. In one chart, shown here, Anuj explained Infiniband, RDMA over Converged Ethernet (RoCE) and RoCE v2, and Internet Wide Area RDMA Protocol (iWARP).
While many of these enhancements were intended for Fibre Channel over Ethernet (FCoE), the beneficiary has been iSCSI. Now there is iSCSI Extensions for RDMA (iSER) to take even more advantage of these changes, which can work with Infiniband, RoCE or iWARP. All of these networks can also be used as the basis for NVMe over Fabric (NVMeOF).
Ethernet is the backbone of Cloud usage, and IBM is well positioned to take advantage of these new networking technologies.
Digital Video Surveillance solutions for extended video evidence protection
Dave Taylor, IBM Executive Architect for Software Defined Storage solutions, presented this session on Digital Video Surveillance (DVS).
Most video surveillance is either analog-based, going to standard VHS tapes, or file-based. Sadly, security guards that watch live camera feeds lose their attention span after 22 minutes.
There are an estimated 72 million cameras globally, with 1.5 million more every year.
City governments spend 57 percent of their budget on "public safety". This can include body cams for police departments. Taser International, now called AXON, dominates the body-cam market.
City budgets may not be prepared to store all of this video content into a cloud that complies with Criminal Justice Information Services (CJIS) standards. These Cloud services tend to be more expensive, as the videos must be treated as evidence, tamper-proof, and with appropriate chain of custody.
DVS is not just storing movies. IBM offers Intelligent Video Analytics. It is important to be able to derive insight and actionable response.
Storage capacity adds up quickly. A standard 1080p (1920 by 1080 pixel) camera generates 2.92 GB per hour, 70 GB per day, and over 2 TB per month. If you have 1,000 cameras, that's over 2 PB of data per month.
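The arithmetic behind those figures can be checked with a few lines of Python. This is a minimal sketch that assumes a 30-day month and decimal units (1 TB = 1,000 GB); only the 2.92 GB per hour rate comes from the session.

```python
GB_PER_HOUR = 2.92      # per-camera rate quoted in the session
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30     # assumption: a 30-day month


def camera_storage_gb(cameras: int = 1, days: int = DAYS_PER_MONTH) -> float:
    """Total GB recorded by `cameras` cameras running continuously for `days` days."""
    return GB_PER_HOUR * HOURS_PER_DAY * days * cameras


per_day = camera_storage_gb(days=1)
per_month = camera_storage_gb()
fleet_month = camera_storage_gb(cameras=1000)

print(f"One camera, one day:    {per_day:.1f} GB")
print(f"One camera, one month:  {per_month / 1e3:.2f} TB")
print(f"1,000 cameras, a month: {fleet_month / 1e6:.2f} PB")
```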
For xProtect servers running Windows, the Tiger Bridge Connector can be used to move the video files to either IBM Spectrum Scale or IBM Cloud Object Storage.
Deep Dive into HyperSwap for Active-Active applications and Disaster Recovery
Andrew Greenfield, IBM Global Engineer for Storage, explained the different ways HyperSwap is implemented across the IBM storage portfolio.
For IBM DS8000, HyperSwap is based on Metro Mirror synchronous replication. In the event that the primary DS8000 fails, the host server can automatically re-direct all I/O to the secondary DS8000. This is often referred to as "High Availability" (HA), and in some cases can serve as Disaster Recovery.
For IBM Spectrum Virtualize products, including SAN Volume Controller (SVC), FlashSystem V9000, Storwize V7000 and V5000 products, as well as Spectrum Virtualize sold as software, the implementation is different.
Previously, SVC offered Stretched Clusters, which put one node in one site, and a second node at another site, which allows for an Active/Active configuration. Unfortunately, the nodes in FlashSystem V9000 and Storwize are "connected at the hip", effectively bolted together, so putting separate nodes in different locations was not possible. To solve this, IBM developed HyperSwap that allows one node-pair to replicate across sites to another node-pair in the same Spectrum Virtualize cluster.
However, even though it is called "HyperSwap", it is not implemented in any way similar to the DS8000 method. Instead, Spectrum Virtualize uses the Global Mirror with Change Volumes to replicate data between sites.
IBM Storage and VMware Integration
This session was co-presented by Brian Sherman, IBM Distinguished Engineer, and Steve Solewin, IBM Corporate Solutions Architect.
For nearly two decades, IBM has been a "Technology Alliance Partner" with VMware. To provide consistent integration to all the features and functions of VMware, IBM Spectrum Control Base Edition (SCBE) is provided at no additional charge for IBM DS8000, XIV, FlashSystem and Spectrum Virtualize products.
SCBE is downloadable as an RPM for Red Hat Enterprise Linux (RHEL), and can run bare-metal or as a VM.
For those using Hyper-Scale Manager, it will automatically install a special A-line-only version of SCBE, which will only manage the A-line products (FlashSystem A9000, FlashSystem A9000R, XIV and Spectrum Accelerate).
Storage admins can define "storage services" that can be assigned to vCenter. This allows VMware admins to allocate storage in self-service mode.
After the meetings were over, IBM had a special event at the Universal City Walk to enjoy some drinks, food, and conversation, and to watch Blue Man Group.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the afternoon sessions of Day 2.
IBM Spectrum Protect deep dive into Container Storage Pools
Ron Henkhaus, IBM Certified Consulting IT Specialist, presented the new Spectrum Protect concept of "Container Pools" that can either be "Directory Pools" on SAN or NAS-based disk storage, or "Cloud Pools". Container pools can contain both deduplicated and non-deduplicated data.
Ron cautioned that directory pools should not be placed on the same file system as your Spectrum Protect database or logs. Also, best practice for any directory pool is to assign it an "overflow" pool, which can be any non-directory pool, such as disk, tape or cloud container.
Cloud pools can use OpenStack Swift, V1 Swift, the Amazon S3 protocol, Amazon Web Services, IBM Bluemix, or IBM Cloud Object Storage. You can pre-define the vaults and buckets in the configuration.
For off-premises Cloud pools, the data is encrypted by default. For other container pools, encryption is optional. Performance to Cloud pools has been improved by using "accelerator storage", basically a disk cache to collect data before sending over to the Cloud pool. Backups to Cloud pools can reach 8 TB per hour. Restore rates vary from 500 to 1500 GB per hour.
Container Pools were designed for the new "Deduplication 2.0" feature introduced in version 7. Traditional Dedupe 1.0 to Device Class FILE is still available, but not recommended.
Version 7.1.6 changed the compression algorithm from LZW to LZ4. In all cases, Spectrum Protect performs these actions in this order: deduplication, compression, encryption. Data that is encrypted by the Spectrum Protect client is therefore not deduped.
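To see why the order matters, here is a small self-contained sketch. It is not how Spectrum Protect is implemented: `toy_encrypt` is a hypothetical, insecure stand-in for real encryption, `zlib` stands in for LZ4 (which is not in the Python standard library), and SHA-256 fingerprints stand in for the product's chunk matching. It only demonstrates the general principle that identical data stops looking identical, and stops compressing, once it has been encrypted.

```python
import hashlib
import os
import zlib


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Toy stream cipher for illustration only (NOT secure, NOT the product's cipher)."""
    nonce = os.urandom(16)  # fresh random nonce per call, as real ciphers use
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def unique_chunks(chunks) -> int:
    """Count distinct chunks by fingerprint, as a deduplicating store would."""
    return len({hashlib.sha256(c).hexdigest() for c in chunks})


chunk = b"the same backup data " * 100
backups = [chunk, chunk, chunk]  # three identical chunks arrive at the server

# Dedupe first: the three chunks collapse to one before compression and encryption.
stored_dedupe_first = unique_chunks(backups)

# Encrypt first (as with client-side encryption): every ciphertext is different.
encrypted = [toy_encrypt(c, b"hypothetical-key") for c in backups]
stored_encrypt_first = unique_chunks(encrypted)

# Compression also stops helping once the data looks random.
plain_ratio = len(zlib.compress(chunk)) / len(chunk)
cipher_ratio = len(zlib.compress(encrypted[0])) / len(encrypted[0])

print(stored_dedupe_first, stored_encrypt_first)
print(f"compressed size: plain {plain_ratio:.0%}, encrypted {cipher_ratio:.0%}")
```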
The "Protect Storage Pool" command can replicate a directory pool to either a remote directory pool or Cloud pool. In addition to this remote replication, you can copy a directory pool to tape to offer air-gap protection against ransomware. Such tapes are considered part of the "Copy Container Pool". In the event of directory pool corruption, the data can be repaired from either replication or tape.
IBM Aspera can now be used for replication, using SSL and AES 128-bit encryption. If your latency is greater than 50 msec, and you have more than 0.5 percent packet loss, Aspera might help. This is available for Linux on x86 platforms running v7.1.6 or higher.
For existing customers, IBM Spectrum Protect allows you to convert your FILE, VTL and TAPE device class pools to directory or Cloud pools.
Introduction to IBM Cloud Object Storage (powered by Cleversafe)
In 2015, IBM acquired Cleversafe, recognized as the #1 Object Storage vendor. Their flagship product was officially renamed to the IBM Cloud Object Storage System, which some abbreviate informally as IBM COS. IBM offers the IBM Cloud Object Storage System in three ways: as software, as pre-built systems, and as a cloud service on IBM Bluemix (formerly known as SoftLayer).
Since then, IBM has been busy integrating IBM COS into the rest of the storage portfolio. I explained how IBM COS can be used for all kinds of static-and-stable data, but is not suited for frequently changed data, such as Virtual machines or Databases.
Object storage can be accessed via NFS or SMB NAS-protocols using a gateway product, like IBM Spectrum Scale, or those from third-party partners like Ctera, Avere, Nasuni or Panzura. It can also be used as an alternative to tape for backup copies, and is already supported by the major backup software like IBM Spectrum Protect, Commvault Simpana, or Veritas NetBackup.
While other cloud service providers have offered data storage in the cloud, this new offering also allows hybrid configurations with geographically dispersed erasure coding.
Unlike RAID which protects against the loss of one or two drives, erasure coding can protect against a larger number of concurrent failures. For example, using an Information Dispersal Algorithm (IDA) of "7+5", where seven pieces of data are encoded on twelve independent disks, the system can lose up to five disk drives without losing any data.
Combining this with Geographically Dispersed Configuration across three or more sites means that you can lose an entire data center, four of the twelve disks, and still have instant full access to all of your data from eight drives at the other locations. In the graphic, you see two on-premise data centers combined with a third location in IBM SoftLayer.
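The "7+5" arithmetic above can be written down in a few lines. This is only a sketch of the counting, not of the Information Dispersal Algorithm itself; the function and parameter names are made up for illustration, and it assumes the slices are spread evenly across sites.

```python
def ida_summary(threshold: int, width: int, sites: int):
    """Counting for an erasure-coded layout.

    threshold = slices needed to read the data, width = total slices written.
    Assumes slices are spread evenly across `sites` locations.
    """
    tolerated_losses = width - threshold
    slices_per_site = width // sites
    survives_site_loss = (width - slices_per_site) >= threshold
    raw_per_usable = width / threshold  # raw capacity consumed per unit of usable data
    return tolerated_losses, slices_per_site, survives_site_loss, raw_per_usable


losses, per_site, site_ok, expansion = ida_summary(threshold=7, width=12, sites=3)
print(f"Can lose {losses} drives; {per_site} slices per site; "
      f"survives a full site outage: {site_ok}; raw/usable = {expansion:.2f}x")
```

With seven of twelve slices needed, losing a whole site removes four slices and leaves eight, one more than the minimum. The raw-to-usable ratio of roughly 1.7x is derived here from the 7+5 numbers rather than quoted from the session.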
New Generation of Storage Tiering: Simpler Management, Lower Costs, and Improved Performance
With ever changing amounts of storage, it is hard to find metrics that are consistent year to year. Fortunately, I found I/O density to be the metric on which to focus my efforts, armed with real data from Intelligent Information Lifecycle Management (IILM) studies done at various clients. From that, I was able to talk about storage tiering on three fronts:
Storage tiering between Flash and disk. IBM FlashSystem and IBM Easy Tier on DS8000 and Spectrum Virtualize family for hybrid Flash-and-disk configurations.
Storage tiering between disk, tape, and Cloud. HSM and Information Lifecycle Management (ILM) on Spectrum Scale, Elastic Storage Server (ESS), Spectrum Archive and IBM Cloud Object Storage System.
Storage tiering automation across your entire environment. IILM studies can help identify a target mix of Tier 0, Tier 1, Tier 2 and Tier 3 storage. IBM Spectrum Storage Suite and the Virtual Storage Center (VSC) can recommend or perform the movement of LUNs to more appropriate tiers, based on age and I/O density measurements.
It's hard to say what the correct sequence of presentations should be. Some thought it might have been better for my talk on IBM Cloud Object Storage System to come before Ron's talk on Cloud container pools, but perhaps hearing Ron first helped drive more interest to my session.
I have been involved with Business Continuity and Disaster Recovery my entire career at IBM System Storage. However, with new workloads like Hadoop analytics and new Hybrid Cloud deployments, I thought it would be good to provide a refresh.
The need for Business Continuity and Disaster Recovery has increased recently due to (a) climate change caused by human activity, (b) ransomware and other cyber attacks, and (c) disgruntled employees.
Back in 1983, a task force of IBM clients at a GUIDE conference developed "Seven Business Continuity Tiers for Disaster Recovery", which I refer to as "BC Tiers". I divided the presentation into three sections:
Backup and Restore: BC tiers 1 through 3 are based on backup and restore methodologies. I explained how to back up Hadoop analytics data, all of the various options for IBM Spectrum Protect software, and how to encrypt the tape data that gets sent off premises.
Rapid Data Recovery: BC tiers 4 and 5 reduce the Recovery Point Objective (RPO) and Recovery Time Objective (RTO) with snapshots, database journal shadowing, and IBM Cloud Object Storage.
Continuous Operations: BC tiers 6 and 7 provide data replication mirroring across locations. I covered 2-site, 3-site and 4-site configurations.
IBM Spectrum Virtualize - How it works - Deep dive
Barry Whyte, IBM Master Inventor and ATS for Spectrum Virtualize, covered a variety of internal topics "under the hood" of Spectrum Virtualize. This covers the SAN Volume Controller (SVC), FlashSystem V9000, Storwize V7000 and V5000 products, as well as Spectrum Virtualize sold as software.
In version 7.7, IBM raised the limits. You can now have 10,000 virtual disks per cluster, rather than 2,048 per node-pair. Also, you can now have up to 512 compressed volumes per node-pair. With the new 5U-high 92-drive expansion drawers, Storwize V7000 can now support up to 3,040 drives, and Storwize V5030 can support up to 1,520 drives.
While each Spectrum Virtualize node has redundant components, the architecture is designed to handle entire node failure. The term "I/O Group" was created to refer to the node-pair of Spectrum Virtualize engines and the set of virtual disks it manages. This made sense when virtual disks were dedicated to a single node-pair. Now, virtual disks can be assigned to multiple node-pairs, dynamically adding or removing node-pairs as needed for each virtual disk.
However, even if you have a virtual disk assigned to multiple node-pairs, only one node-pair would manage its cache, causing all other node-pairs to coordinate I/O through the cache-owning node-pair. The other node-pairs are called "access I/O groups".
The architecture allows for linear scalability: double the number of nodes, and you double your performance. Some competitors use n-way caching across four or more nodes, and it is a semi-religious argument on the pros and cons of each approach. Barry feels the 2-way caching implemented by Spectrum Virtualize is the most effective and efficient for performance.
All of the nodes are connected over an IP network, but there is one designated as a "config node", and one, often the same, as a "boss node".
A cluster can have up to three physical quorum disks (either drive or mDisk) and optionally up to five IP-based quorums. The IP-based quorum is just a Java program that runs on any server or Cloud, provided it can respond within 80 msec.
Either IP-based or physical quorum can be used for "tie-breaking" split-brain situations. In the event there is no "active" quorum, the administrator can now serve as the tie-breaker manually. Barry recommends for Storwize clusters, where physical quorum disks are attached to a single node-pair, that you have at least one IP-based quorum for tie-breaking.
However, only physical quorum can be used for T3 Recovery. T3 Recovery happens after power outages. All of the nodes update the quorum disk with critical information of all of the virtual mappings of blocks to volumes, and this is used when bringing up the nodes again.
To protect against one pool consuming all of the cache, Spectrum Virtualize will partition the cache, and prevent any one pool from consuming more than a certain percentage of the total cache. The percentage depends on the number of pools:
Number of Pools
Max percentage of any individual pool
5 or more
Barry explained how failover works in the event of node failure. There is voting involved, and the majority remains in the cluster. In the case of an even split, called a "split brain" situation, the quorum decides. Orphaned nodes in a node-pair go into write-through mode, since the cache is no longer mirrored.
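The voting rule Barry described can be sketched as a tiny decision function. This is a deliberately simplified model based only on the description above (majority wins, quorum breaks ties); the function name and the "A"/"B" labels are hypothetical, and the real Spectrum Virtualize logic involves more state than node counts.

```python
from typing import Optional


def surviving_side(nodes_a: int, nodes_b: int, quorum_pick: Optional[str] = None) -> Optional[str]:
    """Return "A" or "B" for the partition that stays in the cluster.

    nodes_a / nodes_b are the node counts on each side of the break.
    quorum_pick is the side chosen by the active quorum (or by the
    administrator acting as manual tie-breaker); None means nobody has decided.
    """
    if nodes_a > nodes_b:
        return "A"           # majority remains in the cluster
    if nodes_b > nodes_a:
        return "B"
    return quorum_pick       # even split ("split brain"): the quorum decides


print(surviving_side(3, 1))        # majority case
print(surviving_side(2, 2, "B"))   # split brain, quorum picks side B
print(surviving_side(2, 2))        # split brain with no tie-breaker yet
```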
The I/O forwarding layer has been split between upper and lower roles. The upper layer handles access I/O groups. The lower layer handles asymmetric access to drives, mDisks and arrays.
N-port ID Virtualization (NPIV) drastically improves multi-pathing. Perhaps one of the coolest improvements in a while, NPIV allows us to assign "Virtual" WWPN to other ports. When an I/O sent to a single port fails, it retries one or more times, then waits 30 seconds, and then invokes multi-pathing to find a completely different path to the data. With NPIV, when a port fails, its WWPN is re-assigned to a different port, so the retries are likely to be successful before having to wait 30 seconds!
Lastly, Barry covered the delicate art of Software upgrades. Software is rolled forward one node at a time, and the "cluster state" is maintained during this time.
Different presentations this week are at different technical levels. My session was meant to be an overview of the concepts of Business Continuity, independent of specific operating system platform, using specific IBM products to help illustrate specific examples. Barry's was a deep dive into a single product family.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the afternoon sessions of Day 1.
Storage Brand Opening Session - Craig Nelson
Craig Nelson, Brocade manager for IBM Field Sales Channel, indicated the network equipment is the bridge that brings servers and storage together.
The squeeze -- faster servers and Flash storage cause storage networking to become the bottleneck. Fibre Channel will remain the protocol of choice for the next decade.
"Speed is the net currency of Business" -- Marc Benioff, Salesforce CEO.
Craig drew an analogy. We have been focused on making hard disk drives faster, and then Flash changed the game. Likewise, car manufacturers have focused on making gas engines better, and then Tesla Motors introduced an electric car with insane performance. The early models actually had an "Insane Mode".
The new Gen6 models of IBM b-type SAN equipment will support 32Gbps and 128Gbps ports. That's Insane!
Later models of Tesla Motors offer a "Ludicrous Mode". For flash storage, it is NVMe. NVMe can get storage down to 20 microsecond latency. That's Ludicrous!
Craig put in a plug for two Brocade sessions: "BEWARE - The four potholes on your road to success when deploying flash storage" and "Tune up your storage network! Is it healthy enough for flash storage and next-gen server platforms?"
Storage Brand Opening Session - Clod Barrera
Clod Barrera, IBM Distinguished Engineer and Chief Technical Strategist, presented storage industry trends.
IDC predicts data capacity to grow at a 60-80 percent CAGR. This would require a 44 percent drop in $/GB per year to maintain a flat budget. Unfortunately, flash media cost is only dropping 25-30 percent per year, and spinning disk only 19 percent per year.
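The relationship between these percentages is simple arithmetic: if capacity grows by a factor of (1 + g), the price per GB must fall to 1 / (1 + g) of its previous value to keep spending flat. The sketch below works that out. The function names are mine, and pairing the 44 percent figure with the 80 percent end of the growth range is an inference from the math, not something stated in the session.

```python
def required_price_drop(capacity_growth: float) -> float:
    """Annual $/GB decline needed to hold the budget flat at a given capacity growth."""
    return 1 - 1 / (1 + capacity_growth)


def budget_change(capacity_growth: float, price_drop: float) -> float:
    """Year-over-year budget change when capacity grows and $/GB falls."""
    return (1 + capacity_growth) * (1 - price_drop) - 1


for growth in (0.60, 0.80):
    print(f"{growth:.0%} growth needs a {required_price_drop(growth):.1%} drop in $/GB")

# With 80 percent growth, the media price declines quoted above still leave a gap:
print(f"Flash at -30%/yr: budget grows {budget_change(0.80, 0.30):.0%}")
print(f"Disk at -19%/yr:  budget grows {budget_change(0.80, 0.19):.0%}")
```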
Since storage media will not offset capacity growth, we need other technologies to compensate, including compression, deduplication, defensible disposal, and "cold" storage to tape or optical media.
The smallest persistent storage that IBM has been able to achieve is 12 atoms. Current disk technology is 1200 atoms. Since 1956, IBM and the rest of the IT industry have improved storage 9 orders of magnitude, and now there are only 2 orders of magnitude left.
Clod poked fun at the "Star Wars: Rogue One" movie, indicating that their idea of the future of storage was a huge tape library. See my December 2016 blog post [Has your data gone rogue?]
What does it take to store information forever? Tape will certainly be around. IBM Zurich demonstrated a 220TB tape cartridge back in 2015 as proof of technology.
A good example of the need for long-term retention is US films. Of those from the silent era, over 90 percent are lost. Over half of the films prior to 1950 are lost. The silver nitrate film stock that the reels were made of has deteriorated. Now that more movies are made digitally, can we do better?
Clouds will move from 10GbE to 25GbE. No slowdown for FC in datacenters. Flash storage and object storage are both growing quickly.
Move over Software-Defined Storage, Converged and Hyperconverged systems, the new up-and-coming thing is "Composable Systems deployed in Pods" adjustable hourly by workload requirements.
To protect against Ransomware, use "air gap" protection, not on the same network as production workload.
New storage models are needed for Cognitive workloads. Clod put in a plug for Joe Dain's presentation "Introducing cognitive index and search for IBM Cloud Object Storage leveraging Watson"
Storage Brand Opening Session - Axel Koester
Axel Koester, IBM Storage Chief Technologist, presented more storage industry directions.
What will the world look like in 10 years? Today it is mostly procedural programming, with some statistical big data, and a bit of machine learning. In 10 years, it will be mostly statistical and machine learning, with very little procedural programming. Why? Because it is faster to train computers with Machine Learning than to program procedurally.
Examples of machine learning are IBM Watson, Google AlphaGo, drive-AI. Axel would rather be a passenger in a machine-learned self-driving car, than a procedurally-programmed one.
Neural networks to interpret hand-written numbers. Welcome to "Unsupervised learning".
Deep Learning, a major breakthrough in 2006, is a subset of Machine Learning that uses three or more layers of neural networks. For example, face recognition "deep learning" algorithms can also be used to detect defects through visual inspection of circuit boards.
How does this impact storage?
Procedural -- archive test cases used
Statistical -- store all data for parallel processing
Machine Learning -- train sample data, then archive and re-train yearly. Driving 5 minutes = 4 TB of sensor data used for self-driving cars
For Neural processing, x86 CPUs are suitable for prototyping. GPU co-processors are better, efficient but uncommon. IBM has developed the "TrueNorth" chip that does nothing but Neural processing -- 4096 cores with only 70 mW of energy consumption. It has no clock; instead it has dendrites, synapses, axons and neurons.
Instead of "Build or Buy?" the new question is "Train or Buy?" Train with confidential data, or buy ready-to-run 100% pre-trained cognitive systems as a service.
AI Frameworks are available on Docker containers with Kubernetes with Persistent storage (Ubiquity) such as Spectrum Scale. These frameworks include DL4J, Chainer, Caffe, torch, theano, tensorflow.
NVMe -- NVM is local only, how to do HA and DR? There are three options:
DB asynchronous shadowing
DB mirroring over NVMeOF
Cluster file system replication of persistent data, such as IBM Spectrum Scale
One example is a car manufacturer with 50 SAP HANA in-memory instances on 4 Spectrum Scale nodes. IBM achieved 50,000 new files per second. Most NAS systems do much less.
Faster media on smaller electronics: Holmium atoms on Magnesium Oxide on a silver base, resulting in "single atom storage." An STM (scanning tunneling microscope) needle tip magnetizes the atom, which is measured with Tunnel Magneto-resistance. Unfortunately, reading the data causes it to lose its value, so it is not as persistent as the 12-atom method described by Clod earlier.
As the title suggests, I explained why there is so much interest in Software-Defined Storage in the IT industry, what software-defined storage is, and how to deploy these solutions in your existing infrastructure without the full rip-and-replace. I covered which IBM products are available as software, pre-built systems and/or Cloud services.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Day 1 included keynote sessions. Here is my recap for the morning.
General Session "The Quantum Age"
Amy Hirst, IBM Director of Systems Training, served as emcee for the General Session. The theme this week is "Power of Knowledge, Power of Technology, Power of You. You to the IBM'th power".
Chris Schnabel, IBM Q Offering Manager, explained what "IBM Q" is.
Chris feels "our intuition of what we can compute is wrong". Classical (non-Quantum) computing has evolved over the past 100 years.
Consider Molecular geometry. The best supercomputer can only handle the smallest molecules, those with 40 to 50 electrons, and even then is unable to calculate bond lengths within 10 percent accuracy. Quantum computing can.
Another area is what computer scientists call the "Traveling Salesman Problem". If you had a list of 57 cities, what would be the optimal path to minimize the distance traveled to get to all of the cities? Doing an exhaustive search would take on the order of 10 to the 76th power steps. Dynamic Programming techniques provide some shortcuts, reducing this down to 10 to the 20th power, but still, that is impossible on most computers.
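Those two exponents can be reproduced with exact integer arithmetic. The sketch assumes the exhaustive count is the number of orderings of 57 cities (57 factorial) and that the dynamic programming figure is the well-known n-squared times 2-to-the-n bound of the Held-Karp algorithm. That mapping is an interpretation of the numbers quoted, not something Chris spelled out.

```python
import math

n = 57  # number of cities in the example

exhaustive = math.factorial(n)   # every possible ordering of the cities
dynamic_prog = n**2 * 2**n       # Held-Karp dynamic programming bound

# The number of digits minus one gives the power of ten.
print(f"Exhaustive search:   about 10^{len(str(exhaustive)) - 1}")
print(f"Dynamic programming: about 10^{len(str(dynamic_prog)) - 1}")
```

Even at a billion steps per second, 10 to the 20th power steps works out to thousands of years, which is why the problem remains out of reach for most classical machines.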
Chris mentioned that there are easy problems to solve in polynomial time, and hard problems that are exponential, in that they get worse and worse the bigger the input set. There will always be hard problems.
"Nature isn't classical, dammit, and if you want to make a simulation of nature, you'd better make it quantum mechanical, and by golly it's a wonderful problem, because it doesn't look so easy."
-- Richard Feynman
Nature encodes information, but not in ones and zeros. Quantum computers are measured on the number of Qubits, their error rate, etc. The three factors that IBM focuses on are Coherence, Controllability and Connectivity.
Chris explained how Superposition and Entanglement are used in Quantum Computers. I won't bore you with the details here, but rather save this for a future post.
Today: 5 to 16 Qubits (can be simulated with today's classical computers. 5 Qubits is the power of your typical laptop)
Near future: 50-100 Qubits (too big to simulate on supercomputers), with answers that are approximate or correct only 2/3 of the time.
Future: millions of Qubits, fault-tolerant to provide exact, precise answers consistently.
Quantum Computing opens up a new range of problems, what Chris calls "Quantum Easy" problems. Problems that might take years to solve on classical supercomputers could be solved in seconds on a Quantum computer.
Chris showed a picture of [Colossus], the first digital electronic computer used in the 1940s. Quantum computing today is like the 1940s of classical computing.
IBM is now working on Hybrid Quantum-Classical algorithms, for example:
Quantum Chemistry - can be used in material design, healthcare pharmaceuticals
Optimization - logistics/shipping, risk analytics
There are different ways to build a quantum computer. IBM chose a single-junction transmon design, using Josephson junctions. While the chips are small, the refrigerators they are contained in are huge, and have to keep the chips at very cold 15 milliKelvin temperature (minus 459 Fahrenheit)!
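For readers who want to check the temperature conversion, here is the standard Kelvin-to-Fahrenheit formula as a short sketch. The function name is arbitrary.

```python
def kelvin_to_fahrenheit(kelvin: float) -> float:
    """Convert Kelvin to Fahrenheit: shift to Celsius, then scale by 9/5 and add 32."""
    return (kelvin - 273.15) * 9 / 5 + 32


chip_temp_f = kelvin_to_fahrenheit(0.015)   # 15 milliKelvin
absolute_zero_f = kelvin_to_fahrenheit(0.0)

print(f"15 mK is {chip_temp_f:.2f} F, absolute zero is {absolute_zero_f:.2f} F")
```

The chips sit less than three hundredths of a degree Fahrenheit above absolute zero.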
To get people excited about Quantum computing, IBM created the "IBM Q Experience" [ibm.com/ibmq] that allows the public to run algorithms on a basic 5 Qubit system using a simple drag-and-drop interface to put different transformational gates in sequence.
The IBM Research team was shocked to see 17 publications in prestigious journals make practical use of this 5 qubit system! IBM now also offers a Software Developers Kit (SDK) called QISkit (pronounced Cheese-kit) as a text-based alternative to the drag-and-drop interface.
Amy Hirst came back on stage to remind people to use Twitter hashtag #ibmtechu to follow the event. There are two more events like this planned for the end of the year. A Power/Storage conference in New Orleans, October 16-20, and another event focused on z Systems mainframe, November 13-17.
Pendulum Swings Back -- Understanding Converged and Hyperconverged Systems
This presentation has an interesting back-story. At a client briefing, I was asked to explain the difference between "Converged" and "Hyperconverged" Systems, which I did with the analogy of a pendulum. I used the whiteboard, and then later made it into a single chart.
At the far left of the pendulum, I start with mainframe systems of the early 1950s that had internal storage. As the pendulum swings to the middle, I discuss the added benefits of external storage, from RAID protection and Cache memory to centralized management and backup.
To the far right of the pendulum, it swings over to networked storage, from NAS to SAN attached devices for flash, disk and tape. This offers excellent advantages, including greater host connectivity, and greater distances supported to help with things like disaster recovery.
Here is where the pendulum swings back. IBM introduced the AS/400 a long while ago, and more recently IBM PureSystems that combined servers, storage and switches into a single rack configuration. Other vendors had similar offerings, such as VCE Vblock, Flexpod from NetApp and Cisco, and Oracle Exadata.
Lately, the pendulum has swung fully back to internal storage, with storage-rich servers running specialized software on commodity servers. There are two kinds:
Pre-built systems like Nutanix, Simplivity or EVO:Rail which are x86 based server systems, pre-installed with software and internal flash and disk storage.
Software that can be deployed on your own choice of hardware, such as IBM Spectrum Accelerate, IBM Spectrum Scale FPO, or VMware VSAN.
So, over time, my single slide has evolved, and fleshed out into a full blown hour-long presentation!
Cloud storage comes in four flavors: persistent, ephemeral, hosted, and reference. The first two I refer to as "Storage for the Compute Cloud" and the latter two I refer to as "Storage as the Storage Cloud".
I also explained the differences between block, file and object access, and why different Cloud storage types use different access methods.
Finally, I covered some of our new public cloud storage offerings, using OpenStack Swift and Amazon S3 protocols to access objects off premises, including the new Cold Vault and Flex pricing on IBM Cloud Object Storage System in IBM Bluemix Cloud.
(FCC Disclosure: I work for IBM. I have no financial interest in SUSE, Scality, or any other storage vendor mentioned in this post. This blog post can be considered a "paid celebrity endorsement" for IBM Storwize, IBM Cloud Object Storage, and IBM Spectrum Storage software mentioned below.)
The study takes a realistic request for 250 TB of storage, at 25 percent compound annual growth rate (CAGR), to store infrequently accessed data in an online archive, and then looks at the Total Cost of Ownership (TCO) over a five-year period.
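To get a feel for what 25 percent CAGR does to a 250 TB starting point, here is a minimal sketch of the compounding arithmetic. The function name is my own, and I am assuming growth is applied once at the end of each year. The study may count its years differently, so treat the numbers as illustrative.

```python
# Illustrative sketch: compound growth of an archive that starts at 250 TB.
# Assumption: growth is applied once per year, year 0 being the initial request.

def projected_capacity_tb(initial_tb, cagr, years):
    """Return the capacity in TB for year 0 through the given number of years."""
    return [initial_tb * (1 + cagr) ** year for year in range(years + 1)]

capacities = projected_capacity_tb(250, 0.25, 5)
for year, tb in enumerate(capacities):
    print(f"Year {year}: {tb:,.0f} TB")
```

Under these assumptions, the capacity roughly triples by the end of the fifth year, which is why the growth rate matters as much as the initial purchase when comparing TCO.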
The study compares five different Software-Defined Solutions and three pre-built systems. The Software-defined solutions come as software-only, requiring that you purchase the hardware separately and build it yourself. The three pre-built systems were chosen from the top three storage vendors in the marketplace: Dell EMC, IBM and NetApp.
The cost of support is factored in, as it should be. To keep things equal, no data reduction like data deduplication or compression was used.
In an odd approach, the study mixes block, file and object based approaches all in the same study.
You can read the full 14-page study (linked above). I have organized the results into a single table, ranked from best to worst, color coded for the best deals in green ($100K to $200K), moderate solutions in yellow ($200K to $300K) and most expensive in red (over $300K). I put the software-only options on the left and pre-built systems on the right.
SUSE Enterprise Storage 4
IBM Storwize V5010
DataCore SAN Symphony
Red Hat Ceph Storage
Dell EMC Unity 300
I am often asked, "Isn't the software-only, build-it-yourself approach, always the lowest cost option?" Now, I can answer, "Sometimes yes, sometimes no." Fortunately, IBM offers Software-Defined Storage in a variety of packaging options including software-only, pre-built systems, and in the Cloud as a service.
IBM Storwize V5010 is based on IBM Spectrum Virtualize software, which you can deploy as software-only on your own x86 servers. This was not mentioned in the study, and perhaps it is my job to remind people that this option is also available for those who want to build their own storage.
For that matter, IBM Cloud Object Storage System -- available as software-only, pre-built systems, and in the Cloud -- might also be a cost-effective alternative.
Next week I will be in Orlando, Florida for the IBM Systems Technical University. If you are attending, stop by one of my presentations, or look for me at the Solution Center at one of the IBM peds, or attend the "Meet the Experts for IBM Storage" on Thursday!
I have been blogging for more than 10 years now, so I am no stranger to commenting on competitive comparisons. In some cases, I am setting the record straight, and other times, poking fun at competitor results, claims or conclusions. This comparison from Brian Carmody was too juicy to ignore.
(FCC Disclosure: I work for IBM. I have no financial interest in Infinidat, Dell EMC, nor Pure Storage, mentioned in this post. I do have friends and former co-workers who now work for Infinidat. This blog post can be considered a "paid celebrity endorsement" for IBM FlashSystem products.)
Here is an excerpt. I have added (Infinidat) wherever Brian says "we" just so there is no confusion:
"... So last week we (Infinidat) finally got around to running the same profiles against an INFINIDAT F6230 in our Waltham Solution Center, configured with 1.1TB of DDR-4 DRAM, 200TB TLC NAND, and 480 3TB Nearline HDDs.
In summary, we (Infinidat) wrecked the Pure and EMC systems. Here are the results side by side with EMC's data:
EMC Unity 600F
16K IOPS (80% Read)
9x Pure, 5x Unity
256K BW MBps
10.6x Pure, 3x Unity
4.5x Pure, 1.6x Unity
Steady-state latency (ms)
1/7 Pure, 1/2 Unity
By the way, we (Infinidat) took the liberty of running the test with a 200TB data set instead of Pure and EMC's 50TB because modern workloads require performance at scale, and we ran it with in-line compression enabled because our compression algorithm doesn't hurt performance.
This was an interesting test to run, and we (Infinidat) hope it helps the storage industry move away from media type wars and benchmarks (you will lose every time on performance if INFINIDAT is in the mix) ..."
Notice anything wrong here? Anything missing?
The Tortoise beat "Hare 1" and "Hare 2", but did not invite the Cheetah to the race?
Brian was smart enough not to compare their product to anything from IBM. IBM has a wide variety of All-Flash Arrays, including the DS8880F models, the Storwize V7000F and V5030F models, and Elastic Storage Server models. However, for this workload, IBM would probably recommend the FlashSystem V9000, A9000 or A9000R.
Any All-Flash Array with a steady-state latency of 2 milliseconds or greater is embarrassing, but then Infinibox is not really an All-Flash Array.
The architecture of their Infinibox appears much like the original XIV. It has a mix of DRAM memory and SSD cache, combined with spinning drives. It offers only compression, not data deduplication. Unlike the IBM XIV powered by six to 15 servers, the Infinibox appears under-powered with just three servers.
The Infinibox uses software-based in-line compression, which must put a huge tax on the few CPUs they have in those three servers. Infinidat chose not to compress the data in their cache, probably to reduce the additional overhead on their over-taxed CPUs.
The IBM FlashSystem V9000 has an innovative design, based on IBM Spectrum Virtualize, the mature software that you also find in the IBM SAN Volume Controller and Storwize family of products.
The FlashSystem V9000 offers hardware-accelerated compression. IBM takes advantage of the integrated Intel QuickAssist co-processor, which runs the compression algorithm 20 times faster than a standard Intel Broadwell CPU.
IBM compresses its cache, using a two-tier approach. The "upper cache" receives the data uncompressed, so that it can then tell the application to continue, for fastest turn-around time. Then the data is compressed, and stored in the "lower cache", optimizing the value and benefits of DRAM memory. Many databases get up to 80 percent savings, resulting in a 5-to-1 benefit in DRAM cache memory.
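The jump from "80 percent savings" to "5-to-1" is simple arithmetic: if only 20 percent of the original bytes remain, the same DRAM holds five times as much data. Here is a minimal sketch, where the function name is my own invention and not anything from the product.

```python
# Illustrative sketch: convert a compression savings percentage into the
# effective capacity multiplier for a compressed cache.

def effective_cache_multiplier(savings_percent):
    """Return how many times more data fits when savings_percent is saved."""
    if not 0 <= savings_percent < 100:
        raise ValueError("savings_percent must be at least 0 and below 100")
    return 100 / (100 - savings_percent)

multiplier = effective_cache_multiplier(80)
print(f"80 percent savings gives a {multiplier:.0f}-to-1 benefit")
```

Your own ratio depends on how compressible your data is, so plug in the savings you actually measure.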
The IBM FlashSystem A9000 and A9000R also have an innovative design, based on IBM Spectrum Accelerate, the code originally developed for the IBM XIV storage system.
(Fun fact: Infinidat's founder, [Moshe Yanai], was formerly the founder and designer of XIV, and it appears that Infinidat is just a re-design of old XIV technology architecture, re-packaged with a few differences. Since Moshe left, IBM has drastically enhanced the IBM XIV.)
Like the IBM Spectrum Virtualize family, the IBM FlashSystem A9000 and A9000R have hardware-accelerated in-line compression, and a two-tier approach to cache. The "upper cache" receives the data uncompressed, then the data is compressed and deduplicated, and stored in the "lower cache", optimizing the value and benefits of DRAM memory.
The IBM FlashSystem A9000 and A9000R also offer in-line data deduplication. Modern workloads are virtualized, and Virtual Machine (VM) and Virtual Desktop Infrastructure (VDI) get significant benefits from data deduplication. Infinidat does not play here. For the FlashSystem A9000, most of the metadata related to data deduplication is in cache, minimizing the overhead.
IBM FlashSystem A9000 and A9000R have full performance that blows these published Infinibox results away WITH compression and deduplication turned on.
Brian ran a workload that used the DRAM and SSD cache exclusively, eliminating the reality that any REAL WORLD workload would have to tap into those much slower spinning drives. This is not really a side-by-side benchmark. He is comparing his live run on Infinibox to published numbers from a previous comparison run on a completely different set of data.
This raises the question, why pay for all those spinning drives at all, if you plan to only use the DRAM and Flash storage for your workloads?
A week later, Brian followed up with another post [The INFINIDAT Challenge], acknowledging his comparison was bogus. Here's an excerpt. Again, I have added (Infinidat) wherever Brian is referring to his employer just so there is no confusion:
"... It's not likely that a room full of storage engineers will ever agree on parameters for a synthetic benchmark since storage evaluations are competitive and control of test parameters will invariably predetermine the 'winner'. However, I hope we can all agree that synthetic benchmarks are a waste of time, and that real world performance is what matters in the data center.
So, what can we (Infinidat) do about it?
We (Infinidat) cordially invite every enterprise storage customer who wants lower latency and lower storage cost to visit [FasterThanAllFlash.com] and sign up for The INFINIDAT Challenge.
We (Infinidat) will Give you an Infinibox system to test
We (Infinidat) will Help you clone and test your environment with Infinibox
We (Infinidat) Guarantee your applications will run faster on Infinibox than your All-Flash Array.
If we (Infinidat) fail, we'll take the system back and Donate $10,000 to the charity of your choice.
If our technology delivers, you can keep the system, and we'll (Infinidat) Donate $10,000 in your name to the charity of our choice (The American Cancer Society).
Thanks again to all who participated in the dialog over the past week. I know the post generated some controversy. Traditional storage companies are fighting for their lives trying to keep enterprise storage expensive; indeed their business models are predicated upon maintaining price levels from a bygone era...."
As a consolidation play offering a full range of data services, I do not see this Infinibox working out. According to clients I have talked to who have the Infinibox, the performance deteriorates in REAL WORLD workloads as you add more data to the unit.
The Infinibox seems fine for workloads that do not demand high performance, so I was surprised Brian compared it to All-Flash arrays. The Infinibox is out of its league!
(To be fair, Pure Storage and EMC XtremIO aren't really in the same league as IBM FlashSystem, either, given that both of those products are based on commodity SSD. IBM FlashSystem models consistently deliver 4 to 10 times lower latency than these commodity-SSD based competitors.)
The Infinibox also lacks features many people expect in an Enterprise-class storage array, like Call-Home capability to identify problems quickly, and Synchronous remote mirroring for disaster recovery. It is common for startups like Infinidat to deliver a [Minimum Viable Product] as their first offering.
To paraphrase Brian himself, your applications will lose every time on performance if INFINIDAT is in your datacenter.
The new TS1155 enterprise tape drive can write up to 15 TB uncompressed data to existing JD/JZ/JL media.
It can read/write existing 10TB-formatted JD media, and 7TB-formatted JC media, written by former TS1150 drives. It also can offer read-only support for older 4TB-formatted JC media from TS1140 drives.
These are uncompressed capacities, and some clients achieve 2x or 3x compression on top of these capacities. This depends heavily on the type of data. Your mileage may vary, as they say.
Most of the rest of the features of the TS1150 drives carry forward. The performance of 360 MB/sec is similar, encryption via IBM Security Key Lifecycle Manager (SKLM) is similar, and support for IBM Spectrum Archive via Linear Tape File System (LTFS) format is similar.
An interesting development is that the TS1155, in addition to standard 8Gb Fibre Channel attach, is the first IBM enterprise drive to also offer 10Gb Ethernet support. IBM will offer both RDMA over Converged Ethernet (RoCE) as well as iSCSI support.
The newest member of the IBM Spectrum Storage software family, IBM Spectrum Copy Data Management automates the creation of snapshot images (FlashCopy for those familiar with IBM terminology) on IBM, NetApp and EMC storage arrays. These copies can be made for various uses, such as DevOps, Dev/Test, Backup/Restore, and Disaster Recovery.
At some data centers, these copies can consume as much as 60 percent of your total storage space, because often each developer and tester is generating their own copies. Instead, having copies automated, registered, cataloged, and made available to developers and testers eliminates rogue copies.
This release adds support for additional databases, including Microsoft SQL Server on physical machines, SAP HANA in-memory databases, and Epic/Caché from InterSystems used in Electronic Health Records (EHR) management systems.
IBM also adds support for long-distance vMotion for VMware virtual machine images. The target for this movement is IBM Spectrum Accelerate running on IBM Bluemix Cloud, supporting Hybrid Cloud configurations.
Over the past ten years, my co-workers have asked to write a "guest post" on this blog. This time, Moshe Weiss, IBM Senior Manager, Development and Design, has offered the following post, not in his own voice, but in the voice of his "baby", the Hyper-Scale Manager software.
You might think this is a strange approach, but today we have robots that can dance, and cars that can drive themselves! If software could talk, this is what IBM Hyper-Scale Manager would say:
"I was born a year ago.
It wasn't an easy birth… there were many complications. In fact, so many, that I was almost prematurely born!
Most of my development, in preparation for labor and delivery, was done within the last 6 months of the overall 18 months. I was shaped and designed, and sometimes re-shaped, three times. Lots of assumptions had to be made in hopes to ease a successful delivery and help bring me to full term of the birthing process.
During my first year of maturity, I focused on learning how customers used me; what frustrated them the most, and what they loved or 'almost' loved, while still needing refinement and redesign.
The number of customers adopting me grew higher and higher, as did the number of complaints and bugs that I had to deal with, and my users’ frustrations and dislikes because I wasn't yet a complete solution and still had some missing features.
I was renewed four times! Each renewal improved me and made my senses better and faster, adding new capabilities that helped make me more approachable, intuitive and delightful.
Choosing how to renew, and what to add to each renewal, is not an easy task. Basically, it was about prioritizing user experience versus gaps that were deferred from my birth, versus differentiators to make me unique and sell more, versus features in my roadmap, versus investing huge efforts in my quality.
Each renewal was a complex process with lots of features and behaviors to add, while trying to make my customers’ life a bit easier, since features that were important to them were sometimes considered low priority.
But, there were also good times during my first year:
Huge customer adoption rate
100 new customers in two months!
Growing was a great thing and my parents were and are still so proud! But, like with most things, it came with a price - a lot of sustain issues from the field, requests for changes and bad feedback that I am hard to use and missing core elements.
Being a new baby in the Storage world is not a simple thing, as expectations are huge (mainly because of my successful elder brother, the XIV GUI) and I must quickly keep up with all of them.
Although, I am getting tons of good feedback for being revolutionary and unique. People are emotionally engaged with me, and being that I’m a baby, I love to see emotions!
Huge marketing efforts to put me center stage
However, because of some initial problems at the start -- I am a new product, remember? -- I was thrown out of multiple customer sites, and some sales/marketing guys just stopped believing in me. That made me sad.
My parents did a great job, though, in talking, explaining and demonstrating what I can do, together with what I can’t do now, but will do soon. This really helped in some areas, and customers began to see what my parents saw in me for so many years.
I’m really enthusiastic to hear what people will think of me when I’m two years old!
As part of the renewal I had four times during my first year, design elements were reconsidered, redesigned and rewritten to find the best solutions ever. No product has come even close to what I suggest to the world… I am so proud of myself!
Additionally, my parents wrote approximately 20 patents on my User Interface (UI) elements and User Experience (UX) concepts, which makes me extremely unique.
Prioritization of what goes in and what doesn't, especially during a time when fewer and fewer babysitters handled me during that year. It was a real challenge. Read my parent's post [How to drive forward an exhausted team?] for more details.
But my parents did it! They succeeded in adding cool features like:
Filter analytics and free text, making the filter a great experience that everyone is using.
Great UX improvements like redesigning the tabs, adding right click menus, and adding more on-boarding enablers
Improving the dashboard.
Improving my core business, capacity management (four different times!), and still working on it.
Adding features that were initially deferred in my birth. Deferring features back then was the way to make my birth go smoother. Now, these missing features annoy people.
Improving quality dramatically, adding automation to the way people test me.
Adding differentiators, like the health widget, with more than 20 best practices that provide helpful tips to the customer when there’s a need to change something in their environment, to avoid future issues.
Continue to bring added values for the 'A-family'. I am monitoring: FlashSystem A9000/R, XIV and Spectrum Accelerate, both on and off premises. This added value makes for a family with the most powerful management solutions and experience."
If you are planning to attend the upcoming IBM Systems Technical University in Orlando, Florida, May 22-26, there will also be a variety of hands-on labs. I recommend participating in the hands-on session to feel and witness the next release of IBM Hyper-Scale Manager.
This week, I was part of an all-day event called "Healthcare and Research Trends & Directions in a Cognitive World" at the IBM Executive Briefing Center (EBC) in Rochester, MN. I was one of many presenters covering Information Technology to improve healthcare outcomes. Todd Stacy, IBM Director Server Sales for US Public Market, served as our emcee.
This was a great day. Special thanks to Kathy Lehr, Trish Froeschle, and Scott Gass for organizing this event! We had clients from a variety of Health Care and Life Science industry backgrounds. I certainly learned a few things myself.
Dr. Michael Weiner, IBM Chief Medical Information Officer, Watson Health, covered some of the real challenges facing not just the United States, but also other countries. On average, healthcare in the USA [costs over $10,000 USD per American citizen]! Compare that to only $3,700 USD for the folks in the United Kingdom! In fact, nearly all industrial nations spend between $2,000 and $5,000 per person. Where does all the U.S. money go?
A big challenge is our ever-aging population. Every day, there are 10,000 [Baby Boomers] reaching their 65th birthday, with fewer people in the 25-44 age group to work as nurses to take care of them. About 15 percent of the US population are elderly (over age 65) and this is expected to grow to 20 percent in year 2040. The situation is even worse in Japan, where 25 percent of the population today is elderly, and this is expected to be 40 percent by year 2060.
New Care Models
In some countries, like Australia and Japan, post office workers who spent their time delivering mail, now can stop in to check in on elderly people. As people ship less mail, using social media or email instead, this keeps the postal workers employed, in a manner that provides society value.
The USA enjoys one of the lowest costs for food, but then suffers from an epidemic of obesity, with over 34 percent of Americans obese. When New York City eliminated Trans Fats, heart attacks dropped considerably.
In 2009, the Health Information Technology for Economic and Clinical Health [HITECH] Act required the digitization of medical information, known as "Meaningful Use", which has greatly influenced healthcare facilities. This was implemented by a combination of incentives and penalties. Now, more than 92 percent of hospitals in the USA have digitized medical information! The rest are still using paper and X-ray film images. Some places were initially exempted, such as Assisted Living Homes, so there is still more work to be done.
An advantage of using computer-based solutions like Artificial Intelligence is that it eliminates bias. When a woman walks into an Emergency Room complaining about chest pains, few health staff would consider this a sign of heart attack. When a man does the same, health staff consider heart attack as the first diagnosis, at the risk of missing out on other possibilities.
Every year, over a million articles related to healthcare research are published. Who can read all this in a timely manner? IBM Watson! After [winning in Jeopardy], IBM Watson was "sent to medical school" to learn how to assist doctors in diagnosing patients.
Transforming Health Care Data Management with IBM Spectrum Storage
Greg Tevis, IBM Software Defined Storage Architect, and Raj Tandon, IBM Senior Strategist, co-presented this introduction to the IBM Spectrum Storage family of products. They covered examples with IBM Spectrum Virtualize, IBM Spectrum Control, IBM Spectrum Protect, IBM Spectrum Scale, IBM Cloud Object Storage, and IBM Spectrum Copy Data Management. The latter directly supports Epic and Caché databases.
Cognitive Imaging Solutions for Healthcare Providers
Jason Crites, IBM Healthcare and Life Sciences Data Solutions Leader, and Wayland Vacek, Enterprise Sales Manager for Merge, presented IBM Watson Imaging Clinical Review, from IBM's acquisition of the Merge company. The solution is based on IBM Spectrum Scale as the back-end storage repository.
Merge has been around for more than 20 years, with clinical workflow offerings in Cardiology, Radiology, Orthopedics and Eye care. Often, IBM Watson is able to identify things in medical images that escape the review of radiologists or other medical specialists.
At the HIMSS conference earlier this year, human radiologists were shown a collection of images used to train IBM Watson. The human radiologists only identified 20 percent of the images correctly, while IBM Watson got all of them, every time. In many cases, human radiologists have only a few seconds to look at an X-ray image. Computers like IBM Watson are now fast enough to compete directly with human radiologists in the same number of seconds.
Building a Foundation for the Cognitive Era in Healthcare and Life Sciences
Dr. Jane Yu, IBM Systems Architect, Healthcare & Life Sciences, and Dr. Frank Lee, IBM Global Sales Leader, IBM Software Defined Infrastructure & Life Sciences, co-presented this topic. They presented five challenges:
Growing data volumes are making it more difficult to manage, process and store this data.
Scientists find themselves spending more than 80 percent of their time manually integrating data from silos, and less than 20 percent of their time doing actual research and deriving insights from their analyses.
Compute- and data-intensive workflows may take days to complete on existing server and storage systems.
IT organizations must keep up with rapidly evolving applications, development frameworks, and databases for preferred Healthcare and Life Sciences (HCLS) applications. This includes SAS, Matlab, Hadoop, Spark, NoSQL databases, as well as Deep Learning and Machine Learning workloads.
Scientific integrity and government mandates increasingly require collaboration across organizational boundaries.
In one example, Sidra Medical and Research Center plans to map the genomes of all 250,000 citizens in the Middle Eastern country of Qatar. Imagine that processing each Qatari citizen will generate 200 GB of data for this project, resulting in 50 Petabytes (PB) of data!
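The 50 PB figure is easy to check. Here is a quick sketch of the arithmetic, assuming decimal units (1 PB = 1,000,000 GB). The constant names are mine.

```python
# Illustrative sketch: total data for the genome-mapping example above.
# Assumption: decimal units, so 1 PB = 1,000,000 GB.

CITIZENS = 250_000
GB_PER_CITIZEN = 200
GB_PER_PB = 1_000_000

total_gb = CITIZENS * GB_PER_CITIZEN
total_pb = total_gb / GB_PER_PB
print(f"{total_gb:,} GB is {total_pb:.0f} PB")
```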
Combining IBM Spectrum Compute products with IBM Spectrum Scale storage can help address these challenges.
Modernize & Transform Healthcare with IBM Storage Solutions
Finally, I presented a 90-minute breakout session that covered three solution areas:
Flash storage to speed up medical records and research. Those who have already implemented Electronic Health Records (EHR) for "Meaningful Use" compliance recognize the value this provides to improving healthcare. Adding All-Flash Arrays such as IBM FlashSystem, Storwize V7000F or DS8000F can drastically improve application performance.
Spectrum Scale and IBM Cloud Object Storage for Vendor Neutral Archive. It seems silly that each PACS vendor has its own little island of storage. A better approach is to send all PACS data from various vendors into a "Vendor-Neutral" storage repository. Both IBM Spectrum Scale and IBM Cloud Object Storage System, either linked together or used separately, can be part of a VNA solution.
VersaStack to simplify deployments. VersaStack is a Converged System that combines best-of-breed Cisco servers and switches with best-of-breed IBM storage, pre-cabled, pre-configured, and pre-loaded with all the necessary software to manage the environment as a single entity. This can reduce the time it takes to deploy new medical applications from weeks to just hours.
Next month, I will be presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. There will not be an "IBM Edge" conference this year, so this is your best opportunity to hear the latest information on all of the IBM server and storage products at one conference.
I will be there! Here are the topics I will be presenting:
The pendulum swings back -- Understanding Converged and Hyperconverged environments
IBM cloud storage options
Software Defined Storage -- Why? What? How?
Business continuity -- The seven tiers of business continuity and disaster recovery
Introduction to object storage and its applications - Cleversafe
New generation of storage tiering: Less management, lower investment and increased performance
IBM Spectrum Scale for file and object storage
This conference is not all lectures, which some refer to as "Death by Powerpoint".
There will also be a variety of hands-on labs. I recommend participating in the hands-on session to feel and witness the next release of IBM Hyper-Scale Manager, which is the management application for what IBM calls its A-line storage family -- FlashSystem A9000/R, XIV Storage System, and Spectrum Accelerate software.
Hyper-Scale Manager is the most advanced GUI in the market today, and may help cut your management total cost of ownership (TCO) in half!
You can [Enroll Today!] There is an "early-bird" special to save hundreds of dollars if you enroll by April 16!
Well, it's Tuesday again, and you know what that means? IBM Announcements!
Last week, at the InterConnect conference, IBM's premier Cloud and Mobile event, there were announcements regarding IBM Cloud Object Storage.
IBM Cloud Object Storage Cold Vault
It seems that all the major Cloud Service Providers (CSPs) offer storage for warm, cool, and cold data.
IBM Cloud Object Storage Cold Vault service gives clients access to cold storage data on the IBM Cloud and is designed to lead the category for cold data among major competitors. Cold Vault joins the existing Standard and Vault tiers.
Cold Vault: For rarely accessed mixed workloads and applications. Archiving, long-term data retention, historical data compliance.
Vault: For cooler workloads accessed once a month or less. Archiving, short-term data retention, digital asset preservation, tape replacement and disaster recovery.
Standard: For warm data, active workloads accessed multiple times per month. DevOps, Social, mobile, collaboration and analytics.
These tiers are available in both "Regional" and "Cross Regional" deployment models. Customers can choose between Regional Service, which spreads their data across multiple data centers in a given region, and Cross Regional Service, which spans at least three geographically dispersed regions. For example, Regional may have data in three data centers all located in Texas, but Cross-Regional might have data in California, Texas and New York.
IBM Cloud Object Storage Flex
In the past, clients used "Information Lifecycle Management" (ILM) policies to place data where it was initially needed, then move it to less expensive tiers of storage as it ages and is accessed less frequently. This was done between Flash, Disk and Tape storage on premises, but the concept can also be applied to off-premises Cloud storage.
Flex is a new cloud storage service offering simplified pricing for clients whose data usage patterns are difficult to predict. Flex enables clients to benefit from the cost savings of cold storage for rarely accessed data, while maintaining high accessibility to all data.
Data that is accessed several times per month (warm) will be charged at Standard rates, data that is accessed once a month or so (cool) will be charged Vault rates, and data that is rarely accessed (cold) will be charged at Cold Vault rates.
In effect, this gives you ILM cost savings without ever having to move your data between Standard, Vault or Cold Vault tiers.
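To picture how Flex charging works conceptually, here is a minimal sketch that maps how often data is accessed to the rate class it would be charged at. The cutoffs (two or more accesses per month for warm, one for cool) and the function name are my own hypothetical placeholders for "several times", "once a month or so" and "rarely". This is not IBM's actual billing logic.

```python
# Illustrative sketch only: hypothetical thresholds, not IBM's billing rules.

def flex_rate_class(accesses_per_month):
    """Return the rate class a piece of data would be charged at under Flex."""
    if accesses_per_month < 0:
        raise ValueError("accesses_per_month cannot be negative")
    if accesses_per_month >= 2:   # warm: accessed several times per month
        return "Standard"
    if accesses_per_month >= 1:   # cool: accessed once a month or so
        return "Vault"
    return "Cold Vault"           # cold: rarely accessed

for rate in (5, 1, 0.1):
    print(f"{rate} accesses per month is charged at {flex_rate_class(rate)} rates")
```

The point of the sketch is that the data itself never moves. Only the rate applied to it changes with how it is used.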
IBM Cloud Object Storage partnership with NetApp AltaVault
Many backup software products have been enhanced to write directly to IBM Cloud Object Storage, including IBM Spectrum Protect (formerly known as IBM Tivoli Storage Manager or TSM for short), as well as Commvault Simpana and Veritas NetBackup.
IBM extends this partnership to NetApp. NetApp AltaVault cloud backup solutions will now automatically be able to send backups to IBM Cloud Object Storage on IBM Cloud. Incorporating native support for IBM Cloud Object Storage into AltaVault solutions allows NetApp users to deploy backup data to the IBM Cloud.
IBM Cloud Object Storage for Bluemix Garage
For software developers, [IBM Bluemix Garage] provides a global consultancy with the DNA of a startup that combines the best practices of Design Thinking, Agile Development and Lean Startup to accelerate application development and cloud innovation.
Now, you can use IBM Cloud Object Storage in the IBM Cloud to support Garage method projects.
Want to try it out? With Promo Code "COSFREE", you can get a free year of Standard Cross-region IBM Cloud Object Storage with up to 25 GB/month access! See [Free Storage Promotion] for details.
This week, IBM InterConnect conference is going on in Las Vegas, Nevada.
One time in Las Vegas, I took the gondola ride at the Venetian Hotel. These are not boats with a motor on a chain or track, but are actually steered and propelled independently by the gondolier. At various points on our path, our gondolier would serenade our group with beautiful Italian songs.
As the ride was ending, I asked our gondolier how long their training program was to do this job. He told me "six weeks". I said "Wow, I would love to learn how to sing Italian songs like that in six weeks". He corrected me, "No, silly, they only hire experienced singers, and teach them six weeks to manage the gondola by turning the oar in the water."
(FCC Disclosure: I work for IBM. I have no financial interest in the Venetian Hotel, CBS Studios, or the producers of any television shows mentioned in this post. David Spark has provided me a complimentary copy of his book. This blog post can be considered an "unpaid celebrity endorsement" for the book reviewed below.)
InterConnect 2017 includes "Concourse", a trade show floor with people showing off the latest technologies. In the past 25 years, I have attended many conferences, and on occasion I have worked "booth duty". I am not in Las Vegas this week, so this post is advice to those that are.
One time, when the coordinators for an upcoming conference announced at an all-hands meeting they were looking for "a number of knowledgeable and outgoing volunteers" to work the IBM booth, one of the employees in the audience asked "How many of each?" While this might have been meant to draw laughs, it underscored a real problem.
In many IT and engineering fields, the terms "knowledgeable" and "outgoing" are seen as mutually exclusive. People are either one or the other. A study titled [Personality types in software engineering], by Luiz Fernando Capretz of The University of Western Ontario, analyzed Myers-Briggs Type Indicator of personality and found the majority of engineers were "Introverts".
This line of thinking is further reinforced by the various characters on the television shows like "The Big Bang Theory". If you are familiar with the show, you have Sheldon and Amy are the most knowledgeable, but also the most socially awkward, and then you have Penny and Howard, less knowledgeable but at the more outgoing end of the spectrum.
I understand that for many engineers, working a booth at a trade show is far outside their "comfort zone". But what do you think is more likely, that you can train an engineer to work a booth in six weeks, be more outgoing, hold the right conversations, tell the right stories -- or -- train a professional model, a young, good looking man or woman, who is already outgoing and friendly, to answer technical engineering questions about your products and services?
I have been attending conferences for over 25 years, and occasionally have worked a booth or two. I started out as an engineer, but went through extensive training for public speaking, talking to the media and press, and moderating Q&A Expert panels.
Sadly, most people who work the booth get little to no training at all. You might be told your scheduled hours, how to scan bar codes on badges, and where the brochures and swag are stored. Then, you get your official "shirt" and are told to wear it with a certain color pants, so that everyone looks like part of the team.
Fortunately, fellow blogger David Spark, of Spark Media Solutions, has written a book titled "Three feet from Seven Figures" with loads of advice on how to work a booth with one-on-one engagement techniques to qualify more leads at trade shows.
The title of his book warrants a bit of explanation. When you are working a booth, potential buyers and influencers are walking by, often just three feet away from you, and these could represent million-dollar opportunities.
Too often, the folks working a booth take a passive approach. They look down at their phones, chat with their colleagues, and basically wait for complete strangers to ask them a question or request a demo. This non-verbal communication can really be a turn-off. David explains this in all-too-familiar detail and how to be more actively engaged.
David shows how to break the ice and build rapport with each attendee, how to qualify them as legitimate leads, and how to handle each type of situation.
For qualified leads, you need to maximize the opportunity. If you imagine how much a company spends to send its employees to work the booth, plus the cost of the booth itself, and divide it by the limited number of hours that the trade show floor is open, you quickly realize that each hour is precious.
Your time is valuable, and certainly their time is valuable also. Let's not spend too much time on a single lead, but rather capture the information, end the conversation, and move on.
If you are working a booth at IBM InterConnect, or plan to work a booth at an event later this year, I highly recommend getting this book! It is available in a variety of hard copy and online formats at [ThreeFeetBook.com].
I am not in Las Vegas this week for this year's event, but the sessions will be streamed live through [IBM GO].
IBM Systems Technical University - May 22-26, 2017 - Orlando, FL
IBM Systems Technical University is the evolution of a variety of other conferences related to servers, storage and software. It started out as the "IBM Storage Symposium", then added "System x" servers and was renamed "Storage and System x University", then dropped "System x" when IBM sold off that business to Lenovo.
A few years ago, it was renamed "Edge", initially just focused on Storage, but then two years ago combined with System z mainframe servers and POWER Systems for IBM i and AIX platforms. It also covers software products that previously had their own conferences, like IBM Pulse or MaximoWorld.
Last year, the IBM Marketing team tried a daring experiment. Let's change "Edge" to be a "Cognitive Solutions and Cloud Platform" conference, with emphasis on IT Infrastructure.
The experiment failed. Not because IBM Systems don't support these new initiatives, but because the audience was more interested in hearing how IBM Systems help their current day-to-day business. As many attendees told me, "If we wanted to hear about Cognitive or Cloud, we have plenty of other conferences that cover that already!"
While 40 percent of IBM revenues are generated from Cognitive Solutions and Cloud Platform, the other 60 percent are traditional, on-premise, systems-of-record application workloads, the kind that businesses, non-profit groups, and government agencies have been using for the past few decades!
To address this need, IBM offered three-day "IBM Systems Technical University" events at various locations. Last year, I presented storage topics at events in Atlanta, Austin, Bogota, Boston, Chicago, Dubai, Nairobi, and São Paulo.
We will have several of those this year as well. The main one will be a full 5-day event, May 22-26, in Orlando Florida. I will be there presenting various sessions on storage!
IBM World of Watson - October 29-November 2, 2017 - Las Vegas, NV
This is a Cognitive Solutions and Cloud Platform conference, with an emphasis on Analytics and Database technologies.
I did not attend World of Watson, or WoW for short, last year, but it was an evolution of the conference previously called "IBM Insight". I am sure everything from DB2 and Open Source databases to Hadoop and Spark will be covered this year as well.
In writing this post, I realize that this year will be like a "Conference Sandwich". Cognitive-and-Cloud at the top and bottom, with all the meat, veggies and garnish in the middle!
This week, IBM sponsored a nice multi-client event in San Juan, Puerto Rico. I was quite impressed with the quality of this video. Our marketing department has really done a good job on this!
This event was not just multi-client, but also spanned different industry sectors. IBM recently has realigned to five different sectors, and we had clients from different sectors attending the event.
The night before, I was able to meet most of the other IBM executives who came down for the event. Unfortunately, two were delayed because of the snow storms in the Northeast part of the United States, but they were able to arrive the next day.
The venue was the El Touro restaurant, near the Hilton Caribe. The weather was just right, about 75 degrees and breezy. It was a little humid for me, but everyone else was just happy to be out of the cold. Meanwhile it is nearly 90 degrees in Tucson, Arizona where I am from.
This was billed as a "Lunch and Learn" and the food was delicious! In an effort to keep it simple, we had small dishes of fish with fruit-based cream sauce, paella with rabbit meat and rice, pork belly, Crema Catalana and a churro for dessert. This gave everyone a sample taste of everything, without having to order off a menu.
We basically took the same approach with the presentation. First, Marcos Obermaeir and Marcos Otero, the two leads for this event, thanked the audience and explained their new roles. Marcos Obermaeir is focused on the Financial and Insurance sector, while Marcos Otero is focused on the Communications sector.
Next we had Debbie Niven and Roopam Master, both IBM Executives, explain their roles, and how IBM can help both clients and Business Partners in Puerto Rico.
I presented samples of much larger presentations on three topics. First, the excitement over Software Defined Storage with the IBM Spectrum Storage family of products. Second, IBM Spectrum Scale as a better replacement for Hadoop File System (HDFS) for Hadoop, IBM BigInsights and Hortonworks analytics deployments. Third, IBM Cloud Object Storage, and how this can be combined with IBM Spectrum Protect to back up your data to object storage either on premises, or in the Cloud.
I could have easily spoken an hour on each topic, but instead, we shortened each to about 20 minutes, in keeping with the "Tapas" theme of the restaurant. This allowed those clients who wanted to hear more to have a reason to request a follow-up visit or call.
After the clients left, the IBM team had a reception for the IBM Business Partners. About 80 percent of IBM's storage business in Puerto Rico is done through IBM Business Partners, so they are an important link in IBM's "Go-to-Market" strategy.
The moon was nearly full, and the breeze and waves were a spectacular backdrop to the conversations I had with each person I met.
Well, it's Tuesday again, and you know what that means? IBM Announcements!
IBM Storwize V5030F and V7000F all-flash high-density expansion enclosure
The 5U-high, 92-drive expansion enclosure introduced for the IBM Storwize V5000 and V7000 is now available for the all-flash models V5030F and V7000F. High-density expansion enclosure Model A9F requires IBM Spectrum Virtualize Software V7.8, or later, for operation.
The enclosure allows any mix of "Tier 0" write-endurance SSD at 1.6TB and 3.2TB capacities, and "Tier 1" read-intensive SSD at 1.92TB, 3.84TB, 7.68TB and 15.36TB capacities.
Storwize V5030F control enclosure models support attachment of up to 40U of expansion enclosures, which equates to eight high-density expansion enclosures, up to 760 drives per control enclosure, and up to 1,056 per clustered system.
Storwize V7000F control enclosure models support attachment of up to eight high-density expansion enclosures, up to 760 drives per control enclosure, and up to 3,040 drives per clustered system.
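The drive counts above can be checked with a little arithmetic. Here is a minimal sketch in Python. It assumes that each control enclosure itself holds 24 drives and that a V7000F clustered system has up to four control enclosures; neither figure is stated in the announcement text above, so treat both as assumptions chosen because they reproduce the quoted numbers.

```python
# Back-of-the-envelope check of the drive counts quoted above.
RACK_UNITS_PER_HD_ENCLOSURE = 5    # the high-density enclosure is 5U high
DRIVES_PER_HD_ENCLOSURE = 92       # 92 drives per high-density enclosure
MAX_EXPANSION_RACK_UNITS = 40      # up to 40U of expansion per control enclosure

# Assumptions (not stated in the announcement text above):
DRIVES_IN_CONTROL_ENCLOSURE = 24   # assumed drive slots in the control enclosure
V7000F_CONTROL_ENCLOSURES = 4      # assumed control enclosures per clustered system


def max_hd_enclosures():
    """How many 5U enclosures fit in the 40U expansion allowance."""
    return MAX_EXPANSION_RACK_UNITS // RACK_UNITS_PER_HD_ENCLOSURE


def drives_per_control_enclosure():
    """Expansion drives plus the (assumed) drives in the control enclosure."""
    return max_hd_enclosures() * DRIVES_PER_HD_ENCLOSURE + DRIVES_IN_CONTROL_ENCLOSURE


def v7000f_drives_per_cluster():
    """Drives across the (assumed) four control enclosures of a V7000F cluster."""
    return drives_per_control_enclosure() * V7000F_CONTROL_ENCLOSURES


print(max_hd_enclosures())             # 8
print(drives_per_control_enclosure())  # 760
print(v7000f_drives_per_cluster())     # 3040
```

The V5030F figure of 1,056 drives per clustered system does not fall out of this multiplication, so it presumably reflects a separate system limit.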
IBM has adopted an "Agile" process for all of its IBM Spectrum Storage software. Spectrum Virtualize is offered in a variety of forms. IBM offers the FlashSystem V9000, SAN Volume Controller, Storwize family, and Spectrum Virtualize as software that runs on Lenovo and SuperMicro servers. This means quarterly delivery of new features and functions!
Lots of small enhancements were added in this release:
Apply Quality-of-Service (QoS) to a Host Cluster in terms of IOPS and/or MB/s throughput.
SAN Congestion reporting, via buffer credit starvation reporting in Spectrum Control and via the XML statistics reporting, for the 16Gbps FCP Host Bus Adapter (HBA).
Resizing for Metro Mirror and Global Mirror remote copy services of thin provisioned volumes.
Consistency Protection for Metro Mirror and Global Mirror. You can now define "Change Volumes" so that, in the event of problems with MM or GM, the relationship switches over to GMCV mode.
Increased FlashCopy Background Copy Rates
Proactive Host Failover during temporary and permanent node removals from cluster
IBM Aspera® Files cloud service helps to enable fast, easy, and secure exchange of files and folders of any size between users, even across separate organizations. Aspera Files is currently available in three all-inclusive editions of Personal, Business, and Enterprise. Clients can subscribe either to a committed amount of data transferred on a monthly or annual basis or as a pay-per-use option.
Personal edition now includes 20 authorized users and a single workspace.
Business edition now includes 100 authorized users, 100 workspaces, support for IBM Aspera Drive, support for IBM Mobile applications, and support for Single-Sign-On.
Enterprise edition now includes 500 authorized users, no limit on number of workspaces, support for IBM Aspera Drive, support for IBM Mobile applications, and support for Single-Sign-On.
IBM is now introducing a new "Elite edition", which includes 2500 authorized users, no limit on number of workspaces, support for IBM Aspera Drive, support for IBM Mobile applications, support for Single-Sign-On, and access to IBM Aspera Developer Network and nonproduction organization.
With the addition of the new Elite edition, clients have the flexibility to subscribe to additional functionality in Aspera Files that helps provide higher value and greater differentiation. The Elite edition is available as a subscription and on a pay-per-use basis.
In addition to the existing charge metric of data transferred, a user subscription metric is now included for all four editions. Each edition comes with an included number of authorized users in addition to other key features and capabilities.
Well, it's Tuesday again, and you know what that means? IBM Announcements! There were lots of announcements today, so I have split this up into two posts. One for the Tape and Cloud announcements, and the other for the Spectrum Storage family.
IBM Spectrum Virtualize Software V7.8.1
IBM Spectrum Virtualize™ V7.8.1 is the latest software for FlashSystem V9000, SAN Volume Controller and Storwize products.
Last release, IBM introduced "Host Groups" for clusters that needed to share a common set of volumes. This release offers "Host cluster I/O throttling": I/O throttling can be managed at the host level (individual or groups) and at managed disk levels for improved performance management, and GUI support.
Increased background FlashCopy transfer rates: This feature enables you to increase the rate of background FlashCopy transfers, providing faster copies as the infrastructure allows. This takes advantage of the higher performance capabilities of today's systems, processing the copy in a shorter period of time. The default was 64 MB/sec, and now we can go up to 2 GB/sec, for those who want their FlashCopy to be done as fast as possible.
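To put those two rates in perspective, here is a rough calculation in Python. The 1 TB volume is a hypothetical example, the units are decimal (1 MB = 10^6 bytes), and the sketch assumes the background copy sustains the full rate for the whole copy, which real infrastructure may not allow.

```python
MB = 10**6
GB = 10**9
TB = 10**12


def copy_time_seconds(volume_bytes, rate_bytes_per_sec):
    """Time to copy a volume at a constant, sustained background copy rate."""
    return volume_bytes / rate_bytes_per_sec


volume = 1 * TB                               # hypothetical volume size
old = copy_time_seconds(volume, 64 * MB)      # previous default rate
new = copy_time_seconds(volume, 2 * GB)       # new maximum rate

print(f"64 MB/sec: {old / 3600:.1f} hours")   # 64 MB/sec: 4.3 hours
print(f"2 GB/sec:  {new / 60:.1f} minutes")   # 2 GB/sec:  8.3 minutes
print(f"speed-up:  {old / new:.2f}x")         # speed-up:  31.25x
```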
Port Congestion Statistic: Zero buffer credits help detect SAN congestion in performance-related issues, improving support in high-performance environments. IBM had this for the 8Gbps FCP cards, but not for the 16Gbps cards, so now that's fixed.
Resizing of volumes in remote mirror relationships: Target volumes in remote mirror relationships will be automatically resized when source volumes are resized. Lots of clients asked for this, and IBM delivered!
Consistency protection for Metro/Global Mirror relationships: An automatic restart of mirroring relationships after a link fails between the mirror sites improves disaster recovery scenarios, helping to ensure the applications are protected throughout the process.
When IBM introduced "Global Mirror with Change Volumes" (GMCV), I wanted to call it "Trickle Mirror", because the primary site takes a FlashCopy, trickles the data over, then takes a FlashCopy at the remote site. Now, clients using traditional Metro or Global Mirror can add "Change Volumes" as protection. In the unlikely event a network disruption occurs, it drops down to GMCV until the link resumes full speed.
Support of SuperMicro servers for the Spectrum Virtualize as Software Only offering: Support for x86-based Intel™ servers by SuperMicro for Spectrum Virtualize Software is available with this release.
Last year, IBM offered Spectrum Virtualize as software that could run on Lenovo servers. However, now there are clients who want alternative server choices.
Supermicro SuperServer 2028U-TRTP+ is supported to run Spectrum Virtualize Software. This is a great option for end clients, managed service or cloud service providers deploying private clouds, building hosted services, or using software-defined storage on third party Intel servers. This is a fully inclusive license with all key features available on Spectrum Virtualize in a single, downloadable image.
IBM Spectrum Control V5.2.13 and IBM Virtual Storage Center V5.2.13
We often joke that IBM Virtual Storage Center is the [Happy Meal] combining storage virtualization with Spectrum Virtualize hardware like FlashSystem V9000, SAN Volume Controller or Storwize as the "hamburger", Spectrum Control as the "fries" and "Spectrum Protect Snapshot" as the "soft drink". Storage Analytics was included as a "prize inside" only available in the VSC bundle to entice clients to choose this option.
Whenever IBM updates Spectrum Control, they often put out a new version of the Virtual Storage Center bundle as well. I was the Chief Architect for Spectrum Control 2001-2002, and Technical Evangelist for SVC in 2003 when we first introduced the product, so I have a long history with both products.
This release provides additional information and performance metrics on Dell EMC VMAX and EMC VNX devices. This is done natively; they do not need to be virtualized by Spectrum Virtualize as was often done in the past.
IBM now offers better visibility of drives within IBM Cloud Object Storage Slicestor® nodes. IBM acquired Cleversafe 18 months ago, and is working to get it under the Spectrum Control management umbrella.
IBM Spectrum Scale™ file system to external pool correlation. Spectrum Scale can migrate data to three different types of "external pools":
Cloud Object pool, either on-premise Object Storage or off-premise Cloud Service Provider storage.
Spectrum Protect pool, where Spectrum Protect manages the migrated data on one of 700 supported devices, including tape, virtual tape, optical, flash, disk, object storage or cloud.
Spectrum Archive pool, where data is written directly to physical tape using the Industry-standard LTFS format.
This release provides additional information on the copy data panel about SAN Volume Controller (SVC) HyperSwap® and vDisk mirror.
While the "Virtual Storage Center" bundle is an awesome deal, some clients have asked for the "Vegetarian Option" (Fries and Drink only). Why? Because they want the advanced storage analytics (prize inside) for other devices like DS8000, XIV, etc. So, IBM created the "IBM Spectrum Control Advanced Edition", which has everything in VSC except the Spectrum Virtualize itself.
Advanced edition adds improvements to the chargeback report. It also includes IBM Spectrum Protect™ Snapshot V8.1 release.
IBM Spectrum Control Storage Insights Software as a Service
Storage Insights is IBM's "Software-as-a-Service" reporting-only offering subset of Spectrum Control Advanced Edition. It includes direct support for Dell EMC VMAX, VNX, and VNXe storage systems. This is huge! Clients who have only EMC hardware can now, on a monthly basis, figure out where they are wasting money and decrease their costs.
Other features carried over include the enhanced drive support for IBM® Cloud Object Storage, enhanced external capacity views for IBM Spectrum Scale™ and additional replication views for vDisk mirror and HyperSwap® relationships for SAN Volume Controller (SVC) and Storwize® devices that I mentioned above.
Well, it's Tuesday again, and you know what that means? IBM Announcements! There were lots of announcements today, so I have split this up into two posts. One for the Tape and Cloud announcements, and the other for the Spectrum Storage family.
IBM TS7700 Virtual Tape System
IBM TS7700 release 4.1.1 now supports seven- and eight-way grids with approved RPQs. Before this, grids could only have up to six TS7700 systems connected together.
IBM also plans to extend the capacity of the TS7760 base frame to over 600 TB, and to extend the capacity of a fully configured TS7760 system to over 2.45 PB, before compression, by supporting 8 TB disk drives. This is a huge increase over the 4TB and 6TB drives used today.
IBM offers the IBM Cloud Object Storage System in three ways: as software, as pre-built systems, and as a cloud service on IBM Bluemix (formerly known as SoftLayer).
For those not familiar with IBM Cloud Object Storage (IBM COS), consider it "Valet Parking" for your storage. In a valet parking environment, you have valet parking attendants that drive the cars, parking garages that hold the cars, and a manager that oversees the operation. With IBM COS, you have Accesser® nodes that receive and retrieve your data like valet parking attendants, you have Slicestor® nodes that store your objects like cars in a parking garage, and you have IBM COS Manager to oversee the operation.
Today, IBM announced new HDD options for their S01, S02 and S03 models of Slicestor nodes. These are all 7200 rpm, 3.5-inch Nearline drives, at capacities of 4 TB, 6 TB, 8 TB and 10 TB.
In addition, a short-range 40 GbE SFP+ transceiver is available for ordering on IBM Cloud Object Storage Accesser models A00, A01, and A02, and IBM Cloud Object Storage Slicestor models S01 and S02. This improves the performance of data transfer between the Accesser nodes and the Slicestor nodes. Think of it like shortening the distance valet parking attendants have to drive your car to the garage and run back.
I have been presenting Cloud Storage for nearly 10 years now. People are often shocked to learn that most of the major cloud providers -- including Amazon, Google, Microsoft -- do not offer "Data at Rest" encryption on their storage offerings.
Why not? Because it would mean investing in Self-Encrypting Drives, Key management software, and other related technology to make it happen. Instead, Cloud Service Providers (CSPs) expect you to encrypt the data in software. Most users encrypt data before it lands on the cloud, but what if you create the data in the cloud?
IBM solved this by offering IBM Cloud Object Storage in its IBM Cloud (formerly known as SoftLayer). It has integrated encryption software that takes care of this for you.
This new product, IBM Multi-Cloud Data Encryption V1.0, enables you to encrypt files, folders, and volumes in any cloud while maintaining local control of encryption keys. It integrates with IBM Security Key Lifecycle Manager (SKLM). This is designed to allow you to move cipher data between clouds that are running Multi-Cloud Data Encryption without decrypting and re-encrypting the data.
For example, you can use IBM Multi-Cloud Data Encryption to protect your data on Amazon, Google or Microsoft, then later realize that you can save a ton of money moving to IBM Cloud instead, and you are now able to move the data over seamlessly!
(Back in 2010, I poked fun at EMC with my post [VPLEX: EMC's Latest Wheel is Round]. I pointed out that EMC's announcement touted "new features" that already existed in IBM's SAN Volume Controller. Oops! They did it again!)
Basically, Dell EMC is working on a new "2 Tiers" approach that combines high-performance flash tier with high-capacity object storage. Guess what? IBM already offers this! Why wait?
IBM Spectrum Scale, formerly known as the General Parallel File System (GPFS), supports POSIX, HDFS, OpenStack Swift, Amazon S3, NFS, SMB and iSCSI protocols.
Spectrum Scale can provide this front-end abstraction layer between flash and object storage, including IBM Cloud Object Storage system and IBM Bluemix (formerly SoftLayer) cloud services.
But why limit yourself to just two tiers? IBM Spectrum Scale can also support 15K, 10K and 7200 RPM spinning disk drive tiers, as well as virtual or physical tape tier, the ultimate low-cost high-capacity tier!
Several years ago, IBM coined the phrase "FLAPE" to discuss the two-tier approach of combining Flash with Tape using Spectrum Scale as the front-end abstraction layer.
Perhaps we should call combinations of Flash and Object "FLobject" storage? If the name catches on, you read it here first!
IBM is in a transition from being a "Systems, Software and Services" company, to become the leading "Cognitive Solutions and Cloud Platform" company. IBM has been in this transformation for the past three years or so, and [over 40 percent of its revenue] now comes from these strategic initiatives.
The purpose of AI and cognitive systems developed and applied by the IBM company is to augment human intelligence. Our technology, products, services and policies will be designed to enhance and extend human capability, expertise and potential. Our position is based not only on principle but also on science.
Cognitive systems will not realistically attain consciousness or independent agency. Rather, they will increasingly be embedded in the processes, systems, products and services by which business and society function -- all of which will and should remain within human control.
For cognitive systems to fulfill their world-changing potential, it is vital that people have confidence in their recommendations, judgments and uses. Therefore, the IBM company will make clear:
When and for what purposes AI is being applied in the cognitive solutions we develop and deploy.
The major sources of data and expertise that inform the insights of cognitive solutions, as well as the methods used to train those systems and solutions.
The principle that clients own their own business models and intellectual property and that they can use AI and cognitive systems to enhance the advantages they have built, often through years of experience. We will work with our clients to protect their data and insights, and will encourage our clients, partners and industry colleagues to adopt similar practices.
The economic and societal benefits of this new era will not be realized if the human side of the equation is not supported. This is uniquely important with cognitive technology, which augments human intelligence and expertise and works collaboratively with humans.
Therefore, the IBM company will work to help students, workers and citizens acquire the skills and knowledge to engage safely, securely and effectively in a relationship with cognitive systems, and to perform the new kinds of work and jobs that will emerge in a cognitive economy.
This week, I was reminded that back in 2011, Watson beat two human players, Ken Jennings and Brad Rutter on the TV game show "Jeopardy!" On his last response, Ken wrote "I for one welcome our new computer overlords." With IBM investing heavily in Cognitive Solutions, should people be worried, or welcome the new technology?
Back in 1950, Isaac Asimov proposed the "Three Laws of Robotics":
A robot may not injure a human being or, through inaction, allow a human being to come to harm.
A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Let's take a look at how Artificial Intelligence has been represented in the movies over the past few decades. I have put these in chronological order when they were initially released in the United States.
(FCC Disclosure and Spoiler Alert: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for cognitive solutions made by IBM. While IBM may have been involved or featured in some of these movies, I have no financial interest in them. I have seen them all and highly recommend them. I am hoping that you have all seen these, or are at least familiar enough with their plot lines that I am not spoiling them for you.)
2001: A Space Odyssey
Back in 1968, Stanley Kubrick and Arthur C. Clarke made a masterpiece movie about a mysterious monolith floating near Jupiter. To investigate, a crew of human beings takes a space ship managed by a sentient computer named [HAL-9000].
(Many people thought HAL was a subtle reference to IBM. Stanley Kubrick clarifies:
"By the way, just to show you how interpretation can sometimes be bewildering: A cryptographer went to see the film, and he said, 'Oh. I get it. Each letter of HAL's name is one letter ahead of IBM. The H is one letter in front of I, the A is one letter in front of B, and the L is one letter in front of M.'
Now this is a pure coincidence, because HAL's name is an acronym of heuristic and algorithmic, the two methods of computer programming...an almost inconceivable coincidence. It would have taken a cryptographer to have noticed that."
Source: The Making of 2001: A Space Odyssey, Eye Magazine Interview, Modern Library, p. 249)
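The cryptographer's observation is easy to verify for yourself. Here is a small Python sketch; the function name `shift_letters` is just an illustration, and it assumes the input contains only uppercase letters A through Z.

```python
def shift_letters(word, offset=1):
    """Shift each uppercase letter forward by `offset` places, wrapping Z to A."""
    return "".join(
        chr((ord(letter) - ord("A") + offset) % 26 + ord("A"))
        for letter in word
    )


print(shift_letters("HAL"))  # IBM
```

Shifting in the other direction, `shift_letters("IBM", -1)`, gets you back to HAL.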
The problem arises when HAL-9000 refuses commands from the astronauts. The astronauts are not in control: HAL-9000 was given separate orders from ground control back on Earth, and it has determined it would be more successful without the crew.
Westworld
In 1973, Michael Crichton wrote and directed this movie about an amusement park with three uniquely themed areas: Medieval World, Roman World, and Westworld. Robots are used to staff the parks to make them more realistic, interacting with the guests in character appropriate for each time period.
A malfunction spreads like a computer virus among the robots, causing them to harm or kill the park's guests. Yul Brynner played a robot called simply "the Gunslinger". Equipped with fast reflexes and infrared vision, the Gunslinger proves especially deadly!
(Michael Crichton also wrote "Jurassic Park", which had a similar story line involving dinosaurs with catastrophic results!)
Last year, HBO launched a TV series called "Westworld", based on the same themes covered in this movie. The first season of 10 episodes just finished, and the next season is scheduled for 2018.
Blade Runner
Directed by Ridley Scott, this 1982 movie stars Harrison Ford as Rick Deckard, a law enforcement officer. Rick is tasked to hunt down and "retire" four cognitive androids called "replicants" that have killed some humans and are now in search of their creator, a man named Eldon Tyrell.
(I enjoy the euphemisms used in these movies. Terms like kill, murder or assassinate apply to humans but not machines. The word "retire" in this movie refers to destruction of the robots. As we say in IBM, "retirement is not something you do, it is something done to you!")
Destroying machines does not carry the same emotional toll as killing humans, but this movie explores that empathy. A sequel called "Blade Runner 2049" will be released later this year.
WarGames
In this 1983 movie, Matthew Broderick plays David, a young high school student who hacks into the U.S. Military's War Operation Plan Response (WOPR) computer. The WOPR was designed to run various strategic games, including war game simulations, learning as it goes. David decides to initiate the game "Global Thermonuclear War", and the military responds as if the threats were real.
Can the computer learn that the only way to win a war is not to wage it in the first place? And if a computer can learn this, can our human leaders learn this too?
The Terminator
In this series of movies, a franchise spanning from 1984 to 2009, the US Military builds a defense grid computer called [Skynet]. After cognitive learning at an alarming rate, Skynet becomes self-aware, and decides to launch missiles, starting a nuclear war that kills over 3 billion people.
Arnold Schwarzenegger plays the Terminator model T-800, a cognitive solution in human form designed by Skynet to finish the job and kill the remainder of humanity.
I, Robot
In this 2004 movie, Will Smith plays Del Spooner, a technophobic cop who investigates a crime committed by a cognitive robot.
(Many people associate the title with author Isaac Asimov. A short story called "I, Robot" written by Earl and Otto Binder was published in the January 1939 issue of 'Amazing Stories', well before the unrelated and more well-known book 'I, Robot' (1950), a collection of short stories, by Asimov.
Asimov admitted to being heavily influenced by the Binder short story. The title of Asimov's collection was changed to "I, Robot" by the publisher, against Asimov's wishes. Source: IMDB)
Del Spooner uncovers a bigger threat to humanity, not just a single malfunctioning robot, but rather the Virtual Interactive Kinesthetic Interface, or simply VIKI for short, a cognitive solution that controls all robots. VIKI interprets Asimov's three laws in a manner not originally intended.
Ex Machina
In this 2015 movie, Domhnall Gleeson plays Caleb, a 26-year-old programmer at the world's largest internet company. Caleb wins a competition to spend a week at a private mountain retreat. However, when Caleb arrives he discovers that he must interact with Ava, the world's first true artificial intelligence, a beautiful robot played by Alicia Vikander.
(The title derives from the Latin phrase "Deus ex machina," meaning "a god from the machine," a phrase that originated in Greek tragedies. Source: IMDB)
Nathan, the reclusive CEO of this company, relishes this opportunity to have Caleb participate in this experiment, explaining how Artificial Intelligence (AI) will transform the world.
(The three main characters all have appropriate biblical names. Ava is a form of Eve, the first woman; Nathan was a prophet in the court of David; and Caleb was a spy sent by Moses to evaluate the Promised Land. Source: IMDB)
The premise is based in part on the famous [Turing Test], developed by Alan Turing. This is designed to test a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
Movies that depict the bad guys as a particular nationality, ethnicity or religion may be offensive to some movie audiences. Instead, having dinosaurs, monsters, aliens or robots provides a villain that all people can fear equally. This helps movie makers reach a more global audience!
Of course, if robots, androids and other forms of Artificial Intelligence did exactly what humans expect them to, we would not have the tense, thrilling action movies to watch on the big screen.
This is not a complete list of movies. Enter in the comments below your favorite movie that features Artificial Intelligence and why it is your favorite!
(As IBM is focused on its transformation from a "Systems, Software and Services" company to a "Cognitive Solutions and Cloud Platform" company, it seems appropriate to highlight my 1,000th blog post on the concept of cognitive solutions.)
A lot of people ask me to explain what exactly IBM means by "cognitive", which is a fair question. Let's start with the [Dictionary definition]:
of or relating to cognition; concerned with the act or process of knowing, perceiving, etc.
of or relating to the mental processes of perception, memory, judgment, and reasoning, as contrasted with emotional and volitional processes.
What exactly does IBM mean by Cognitive? IBM has taken this definition, and focused on four key strategic areas:
In 1981, I spent the summer debugging a "Pascal" compiler at the University of Texas at Austin. I wasn't told that was what I was doing. Rather, I was tasked with writing sample Pascal programs that would demonstrate the features and capabilities of the language.
Every day, I would come up with a concept of a program, punch up the cards, run it through the CDC hopper, and verify that it would work properly. If I didn't have it working by lunch, I would take it to the "help desk", they would look it over, and tell me how to fix it after I got back.
Most of the time, it was a mistake in my software. A few times, however, it was a flaw in the compiler itself. My programs were basically test cases, and the Pascal Compiler development team was fixing or enhancing the compiler code every time I had a problem.
Compilers basically work by parsing the program text, looking for fixed keywords that are entered in a specifically prescribed order to make sense. Other keywords may represent data types, variables, constants or pre-defined macros.
But compilers are not cognitive. Cognitive solutions can understand natural language, and have to handle all the ambiguity of words not being in the correct order, or different words having different meanings.
As an Electrical Engineer, I had to take many classes on classical analog signal processing. In fact, all computers have some amount of analog components, where threshold processing is used to differentiate a zero (0) from a one (1).
For example, if a "zero" value was represented by 1 volt, and a "one" value by 5 volts, then you can set a threshold at 3 volts. Any voltage less than 3 would be considered a "zero" value, and anything 3 volts or greater a "one" value.
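The fixed-threshold rule above can be sketched in a few lines of Python. This is only an illustration of the 3-volt example; the function names and the sample readings are made up for the sketch.

```python
THRESHOLD_VOLTS = 3.0  # fixed cut-off between the 1 V "zero" and the 5 V "one"

def to_bit(voltage):
    """Return 0 for any voltage below the threshold, otherwise 1."""
    return 0 if voltage < THRESHOLD_VOLTS else 1

def decode(samples):
    """Convert a sequence of voltage readings into a list of bits."""
    return [to_bit(v) for v in samples]

# Hypothetical noisy readings; exactly 3.0 V counts as a "one".
readings = [0.9, 1.4, 4.7, 3.0, 2.9, 5.2]
print(decode(readings))
```

Note that the threshold never moves, no matter how many readings pass through it.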
But threshold processing is not cognitive. Cognitive solutions also use thresholds, but their thresholds are dynamically determined, through advanced analytics and statistical mathematical models, and may adjust up and down as needed, based on machine learning over time.
IBM Research is proud to have developed the world's most advanced caching algorithms for its storage systems. Cache memory is very fast, but also very expensive, so offered in limited quantities. Caching algorithms decide which blocks of data should remain in cache, and which should be kicked out.
Ideally, a block in read cache would be kicked out precisely after the last time it was read, with little or no expectation for being read again anytime soon. Likewise, a block in write cache would be destaged to persistent storage precisely after the last time it was updated, with little or no expectation for being updated again anytime soon.
The traditional approach is "Least Recently Used" or [LRU]. Cache entries that were read or updated recently would be placed at the top of the list, and the least recently referenced would be at the bottom of the list. When space is needed in cache, the entries at the bottom of the list would be kicked out.
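Here is a minimal sketch of the LRU idea in Python, built on the standard library's OrderedDict. It is a teaching example, not how any IBM storage system implements its cache; the class and method names are assumptions for illustration. In this sketch the "top of the list" is the end of the ordered dictionary.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache: the last entry is the most recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def read(self, block_id):
        """Return the cached block, or None on a cache miss."""
        if block_id not in self.entries:
            return None
        self.entries.move_to_end(block_id)  # mark as most recently used
        return self.entries[block_id]

    def write(self, block_id, data):
        """Insert or update a block, evicting the least recently used if full."""
        if block_id in self.entries:
            self.entries.move_to_end(block_id)
        self.entries[block_id] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # kick out the bottom of the list

cache = LRUCache(capacity=2)
cache.write("a", "block A")
cache.write("b", "block B")
cache.read("a")              # "a" is now the most recently used
cache.write("c", "block C")  # cache is full, so "b" is kicked out
print(list(cache.entries))
```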
IBM's [Adaptive Cache Algorithm outperforms LRU]. For example, on a workstation disk drive workload, at 16MB cache, LRU delivers a hit ratio of 4.24 percent while IBM's Adaptive Replacement Cache (ARC) achieves a hit ratio of 23.82 percent, and, for an SPC-1 benchmark, at 4GB cache, LRU delivers a hit ratio of 9.19 percent while ARC achieves a hit ratio of 20 percent.
But caching algorithms, including IBM's Adaptive Cache, are not cognitive. These algorithms respond programmatically based on the current state of the cache. Cognitive solutions learn, and improve with usage. This is often referred to as "Machine Learning".
The human-computer interface (HCI) has much room for improvement in a variety of areas.
Take for example a snack vending machine. In college, we had assignments to simulate the computing logic of these. We had to interact with the buyer, receive coins entered into the slot--nickels, dimes and quarters representing 5, 10 and 25 cents--determine a total monetary balance, and then dispense snacks of various prices and return an appropriate amount of change, if any. There is even a [greedy algorithm] designed to optimize how the change is returned.
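For the curious, the vending machine logic might look something like the sketch below. It assumes only nickels, dimes and quarters, as in the assignment described above; the function names and the sample purchase are invented for illustration. The greedy step hands back the largest coin that fits, over and over, until no change is owed.

```python
COINS = (25, 10, 5)  # quarters, dimes, nickels, largest first

def make_change(amount_cents):
    """Greedy change-making: always return the largest coin that still fits."""
    if amount_cents < 0 or amount_cents % 5 != 0:
        raise ValueError("change must be a non-negative multiple of 5 cents")
    change = {}
    for coin in COINS:
        count, amount_cents = divmod(amount_cents, coin)
        if count:
            change[coin] = count
    return change

def purchase(inserted_coins, price_cents):
    """Total the inserted coins and return the change owed as {coin: count}."""
    balance = sum(inserted_coins)
    if balance < price_cents:
        raise ValueError("insufficient funds")
    return make_change(balance - price_cents)

# Three quarters and a dime (85 cents) for a 45-cent snack leaves 40 cents.
print(purchase([25, 25, 25, 10], 45))
```

Every buyer who inserts the same coins for the same snack gets exactly the same result, which is the point of the comparison that follows.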
But vending machines are not cognitive. Like the caching algorithms, vending machines interact based on fixed programmatic logic, treating all buyers in the same manner. Cognitive solutions can interact with different users in different ways, customized to their needs, and these interactions can improve over time, based on machine learning.
IBM is exploring the use of Cognitive Solutions in a variety of different industries, from Healthcare to Retail, Financial Services to Manufacturing, and more.
Well, it's Tuesday again, and you know what that means? IBM Announcements!
(Yes, OK, it's actually Thursday. I wrote this post weeks ago, but was embargoed until Jan 10, and then was asked to wait until Jan 12 so that the IBM Marketing team could translate my text into 15 different languages.)
This week, the IBM DS8000 team announces a new High Performance Flash Enclosure (HPFE-Gen2) and a series of All-Flash Array DS8880F models that exploit this new technology.
New High Performance Flash Enclosure (HPFE-Gen2)
The original HPFE was 1U high with 16 or 30 flash cards, and could support RAID-5 or RAID-10. Most used RAID-5, resulting in four array sites of 6+P each, leaving two cards for spare. These 1.8-inch cards were only 400 or 800 GB in size, so the maximum raw capacity was only 24TB per 1U enclosure.
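The arithmetic behind those numbers is easy to check. The sketch below assumes decimal units (1 TB = 1000 GB); the function names are made up, and the data-capacity figure is derived here rather than quoted from any IBM specification.

```python
def raw_capacity_tb(cards, card_gb):
    """Raw capacity in TB, assuming decimal units (1 TB = 1000 GB)."""
    return cards * card_gb / 1000

def raid5_cards(arrays, data_per_array, spares):
    """Total cards consumed by RAID-5 arrays (one parity card each) plus spares."""
    return arrays * (data_per_array + 1) + spares

# Fully populated original HPFE: 30 cards of 800 GB each
print(raw_capacity_tb(30, 800))                       # raw TB per 1U enclosure
print(raid5_cards(arrays=4, data_per_array=6, spares=2))  # cards used by four 6+P arrays and two spares
print(raw_capacity_tb(4 * 6, 800))                    # TB on data cards only, before any formatting overhead
```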
The new HPFE-Gen2 enclosure is a complete re-design, consisting of two Microbays and two TeraPacks. The I/O Bays attach to the Microbays via PCIe Gen3. The Microbays in turn attach to both TeraPacks via redundant 6 Gb or 12 Gb SAS.
Each TeraPack holds 24 flash cards. Since the TeraPacks come in pairs, you can install 16, 32 or 48 flash cards per enclosure. Each 16-card set represents two array sites, for a maximum of six array sites per HPFE-Gen2.
RAID-5 for 400/800 GB. Two 6+P arrays, four 7+P arrays, and two spares.
RAID-6 for 400/800/1600/3200 GB. Two 5+P+Q arrays, four 6+P+Q arrays, and two spares.
RAID-10 for 400/800/1600/3200 GB. Two 3+3 arrays, four 4+4 arrays, and four spares.
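As a sanity check, each of the three layouts above accounts for all 48 cards of a fully populated HPFE-Gen2. The sketch below simply tallies the array descriptions listed above; the data structure and function name are illustrative assumptions.

```python
# Each array is (number of arrays, data cards per array, redundancy cards per array).
LAYOUTS = {
    "RAID-5":  {"arrays": [(2, 6, 1), (4, 7, 1)], "spares": 2},
    "RAID-6":  {"arrays": [(2, 5, 2), (4, 6, 2)], "spares": 2},
    "RAID-10": {"arrays": [(2, 3, 3), (4, 4, 4)], "spares": 4},
}

def card_counts(layout):
    """Return (data cards, redundancy cards, spare cards) for one layout."""
    data = sum(n * d for n, d, _ in layout["arrays"])
    redundancy = sum(n * r for n, _, r in layout["arrays"])
    return data, redundancy, layout["spares"]

for name, layout in LAYOUTS.items():
    data, redundancy, spares = card_counts(layout)
    total = data + redundancy + spares
    print(f"{name}: {data} data + {redundancy} redundancy + {spares} spare = {total} cards")
```

The trade-off is visible in the data-card counts: the more protection a RAID level provides, the fewer of the 48 cards hold unique data.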
(Technically, these new "Flash cards" are 2.5-inch Solid State Drives (SSD) placed into the HPFE Gen2 connected to the PCIe Gen3 interface, with 50 percent additional capacity to tolerate up to 10 drive-writes-per-day (DWPD). IBM will continue to call them "Flash Cards" for naming consistency between the two generations of HPFE.)
The new HPFE-Gen2 enclosures are substantially faster, offering up to 90 percent more IOPS, and up to 268 percent more throughput (GB/sec). The Microbays use a new flash-optimized ASIC to perform the RAID calculations.
New All-Flash Array DS8880F models
IBM introduces the DS8884F, DS8886F and DS8888F that are based entirely on the HPFE-Gen2 enclosures described above.
DS8884: Hybrid - HDD/SSD/HPFE mix
DS8886: Hybrid - HDD/SSD/HPFE mix
DS8888: AFA - HPFE only
DS8884F: AFA - HPFE-Gen2 only
DS8886F: AFA - HPFE-Gen2 only
DS8888F: AFA - HPFE-Gen2 only
New zHyperLink connection
Also, as a "Statement of Direction", IBM intends to deliver field upgradable support for zHyperLink on existing IBM System Storage DS8880 machines for connection to z System servers. zHyperLink is a short-distance, mainframe-attach link designed for lower latency than High Performance FICON.
Typical latency with FICON/zHPF is around 140-170 microseconds, and this new zHyperLink is estimated to reduce this down to 20-30 microseconds, but is limited to 150 meter fiber optic cable distance. zHyperLink is intended to speed up DB2® for z/OS® transaction processing and improve active log throughput.
Last month, I had the pleasure to help train Watson in its latest mission: to help answer questions from sellers. These are not just the IBM feet on the street, but also IBM distributors and IBM Business Partners.
"... [survey by SearchYourCloud] revealed 'workers took up to 8 searches to find the right document and information.' Here are a few other statistics that help tell the tale of information overload and wasted time spent searching for correct information -- either external or internal:
'According to a McKinsey report, employees spend 1.8 hours every day -- 9.3 hours per week, on average -- searching and gathering information. Put another way, businesses hire 5 employees but only 4 show up to work; the fifth is off searching for answers, but not contributing any value.' Source: [Time Searching for Information]
'19.8 percent of business time -- the equivalent of one day per working week -- is wasted by employees searching for information to do their job effectively,' according to Interact. Source: [A Fifth of Business Time is Wasted]
IDC data shows that 'the knowledge worker spends about 2.5 hours per day, or roughly 30 percent of the workday, searching for information ... 60 percent [of company executives] felt that time constraints and lack of understanding of how to find information were preventing their employees from finding the information they needed.' Source: [Information: The Lifeblood of the Enterprise]."
In the early days of the Internet, before search engines like Google or Bing, I competed in [Internet Scavenger Hunts]. A dozen or more contestants would be in a room, and would be given a list of 20 questions to find answers for. Each of us would then hunt down answers on the Internet. The person to find the most documented answers before time runs out wins. It was quite the challenge!
Over the years, I have honed my skills as a [Search Ninja]. With over 30 years of experience in IBM Storage, many sellers come to me for answers. Sometimes sellers are just too lazy to look for the answers themselves, too busy trying to meet client deadlines, or too green to know where to look.
A good portion of my 60-hour week is spent helping sellers find the answers they are looking for. Sometimes I dig into the [SSIC], product data sheets, or various IBM Redbooks.
Other times, I would confer with experts, engineers and architects in particular development teams. Often, I learn something new myself. In a few cases, I have turned some questions into ideas for blog posts!
It was no surprise when I was asked to help train Watson for the new "Systems SmartSeller" tool. This will be a tool that runs on smartphones or desktops to help sellers answer questions when responding to an RFP or other client queries.
The premise was simple. Treat Watson as a student at "Cognitive University" taking classes from dozens of IBM professors, in a series of semesters, or "phases".
Phase I involved building the "Corpus", the set of documents related to z Systems, POWER systems, Storage and SDI solutions; and a "Grading Tool" that would be used as the Graphical User Interface. I was not involved in phase I.
Phase II was where I came in. Hundreds of questions were categorized by product area, and I worked on some 500 questions for storage. For each question, Watson had up to eleven different responses, typically a paragraph from the Corpus. My job as a professor was to grade each response:
★ (one star)
Irrelevant, answer not even storage-related
★★ (two stars)
Relevant, at least it is storage-related, but does not answer the question, or answers it poorly
★★★ (three stars)
Relevant, adequately answers the question
★★★★ (four stars)
Relevant, answers the question well
Most of the answers were either 1-star (not storage related) or 2-star (mentioned storage, but poor response). I would search through the existing Corpus looking for a better answer, and at best found only 3-star responses, which I would add to the list and grade as a 3-star response.
I then searched the Internet for better answers. Once I found a good match, I would type up a 4-star response, add it to the list, and point it to the appropriate resources on the Web.
Other professors, who were also looking at these questions, would then get to grade my suggested responses as well. Watson would learn based on the consensus of how appropriate and accurate each response was graded.
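I don't know how the Cognitive University team actually combines the grades, but one simple way to picture "consensus" is to take the median star rating across professors for each response. The sketch below is purely an assumption for illustration; the response labels, the grades, and the choice of the median are all hypothetical.

```python
from statistics import median

def consensus_grades(grades_by_response):
    """Map each response to the median of the star ratings it received."""
    return {resp: median(stars) for resp, stars in grades_by_response.items()}

def best_response(grades_by_response):
    """Return the response with the highest consensus grade."""
    consensus = consensus_grades(grades_by_response)
    return max(consensus, key=consensus.get)

# Hypothetical grades from three professors for one question
grades = {
    "corpus paragraph A": [1, 2, 1],
    "corpus paragraph B": [3, 3, 2],
    "professor-written answer": [4, 3, 4],
}
print(consensus_grades(grades))
print(best_response(grades))
```

A median keeps one unusually harsh or generous grader from swinging the result.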
I don't know where the Cognitive University team got some of the questions, but they were quite representative of the ones I get every week. In some cases, the seller didn't understand the question he heard from the client, making it difficult for me to figure out what they were actually asking for.
It reminds me of that parlor game ["Telephone" or "Chinese Whispers"], in which one person whispers a message to the ear of the next person through a line of people until the last player announces the message to the entire group. I have actually played this at an IBM event in China!
Watson needs to parse the question into nouns and verbs using Natural Language Processing (NLP), and then search the Corpus for an appropriate answer. I identified three challenges for Watson in this case:
The questions are not always fully formed sentences. For example, "Object storage?" Is this asking what is object storage in general, or rather what does IBM offer in this area?
The questions often do not spell the names of products correctly, or use informal abbreviations. "Can Store-wise V7 do RtC?" is a typical example, short for "Can the IBM Storwize V7000 storage controller perform Real-time Compression?"
The questions ask what is planned in the future. "When will IBM offer feature x in product y?" I am sorry, but Watson is not [Zoltar, the fortune teller]!
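To give a feel for the second challenge, here is a small sketch that maps a misspelled product name to a known one using fuzzy string matching from Python's standard difflib module. This is not how Watson does it; the product list, the abbreviation table, and the function names are assumptions made up for the example.

```python
import difflib
import re

# Hypothetical catalog and abbreviation table, for illustration only
PRODUCTS = ["Storwize V7000", "FlashSystem V9000", "SAN Volume Controller", "XIV Storage System"]
ABBREVIATIONS = {"rtc": "Real-time Compression"}

def _key(text):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def canonical_product(name, cutoff=0.6):
    """Return the closest known product name, or None if nothing is close enough."""
    keys = {_key(p): p for p in PRODUCTS}
    hits = difflib.get_close_matches(_key(name), list(keys), n=1, cutoff=cutoff)
    return keys[hits[0]] if hits else None

def expand_abbreviation(token):
    """Expand a known informal abbreviation, otherwise return the token unchanged."""
    return ABBREVIATIONS.get(token.lower(), token)

print(canonical_product("Store-wise V7"))
print(expand_abbreviation("RtC"))
```

Simple string similarity only goes so far. It does nothing for the first challenge (ambiguous intent) or the third (questions about the future).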
I managed to grade the responses in the two weeks we were given. Part of my frustration was that the grading tool itself was a bit buggy, and I spent some time trying to track down some of its flaws.
The next phase is in late January and February. This will give the Cognitive University team a chance to update the Corpus, improve the grading interface, and find more professors and a different set of questions. I volunteered the most recent four years' worth of my blog posts to be added to the Corpus.
Maybe this tool will help me turn my 60-hour week back to the 40-hour week it should be!
Fellow blogger Chris Mellor from The Register has an interesting post titled [It's a ratchet: Old storage guard face incoming tech squeeze]. Chris opines that the big traditional storage vendors -- which he refers to as the "old guard": Dell EMC, HDS, HPE, IBM and NetApp -- are being squeezed out by startups with new technologies.
Last week, I saw the play [Fiddler on the Roof], a musical production by Arizona Theater Company (ATC), and thought of various parallels with Chris's post.
For those not familiar, the story centers around a father named Tevye and his wife trying to stick to tradition, with five daughters who are open to breaking with tradition to get married. The family lives in a small rural town, back in a time long ago when people were persecuted for their religious and ethnic background. Aren't you glad we live in [more enlightened times]!
Back to Chris Mellor, he writes in his post:
"This old guard has so far failed to squash newcomers in the all-flash array, hyperscale, object and software-defined storage areas. This is despite the established firms adopting these technologies and acquiring some startups."
Should the old guard try to squash newcomers? Often, these startups provide much needed innovations that move the IT industry forward.
In the play, Tevye wants to stick to tradition, whereby the town's matchmaker would find a husband for each daughter, and he, as father of each bride, would then provide his permission and blessing to the match.
Obviously, these startups are asking the old guard for neither permission nor blessing. While I can't speak for the rest of the "old guard", IBM is leading in these various spaces. Let's look at each of these new trends.
All-Flash Arrays (AFA)
The category of "All-Flash Arrays" includes both purpose-built hardware as well as traditional devices based on solid-state drives (SSD). While the R&D investment needed for purpose-built hardware can limit this to some of the largest vendors, nearly any startup can slap commodity SSD into traditional HDD controllers and call it AFA.
IBM offers the world's fastest AFA, and has been a leader in the AFA category for the past three years, investing over $1 Billion USD on its FlashSystem, DS8000, Elastic Storage Server (ESS), SVC and Storwize product families.
Software-Defined Storage (SDS)
While the definition for SDS is still in a bit of flux, IDC has tried to identify three characteristics:
Storage software stack that can be installed on commodity resources (x86 hardware, hypervisors, or cloud) and/or off-the-shelf computing hardware
SDS should offer a full suite of storage services
Federation between the underlying persistent data placement resources to enable data mobility of its tenants between these resources
IBM has been ranked [Number 1 in Software Defined Storage] for several years now, investing over $1 Billion USD in its IBM Spectrum Storage family. This collection of software is implemented in a variety of offerings, including pre-built systems, software that you can deploy on commodity off-the-shelf servers, and in the Cloud.
Object Storage
Object storage breaks tradition with block and file-based storage solutions. Rather than reading and writing files using POSIX, NFS or SMB protocols, objects are accessed via HTTP GET and PUT requests. The two most common protocols are Amazon S3 and OpenStack Swift.
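To make the contrast with file protocols concrete, the sketch below builds (but does not send) the kind of HTTP PUT and GET requests an object store would receive, using Python's standard urllib. The endpoint, bucket and key are hypothetical, and real S3 or Swift calls also need authentication headers that are left out here.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://objects.example.com"  # hypothetical object storage endpoint

def object_url(bucket, key):
    """Build the URL that addresses one object inside a bucket."""
    return f"{ENDPOINT}/{urllib.parse.quote(bucket)}/{urllib.parse.quote(key)}"

def put_request(bucket, key, data):
    """Prepare an HTTP PUT that would store `data` as an object."""
    return urllib.request.Request(object_url(bucket, key), data=data, method="PUT")

def get_request(bucket, key):
    """Prepare an HTTP GET that would retrieve the object."""
    return urllib.request.Request(object_url(bucket, key), method="GET")

upload = put_request("my-bucket", "report.csv", b"col1,col2\n1,2\n")
download = get_request("my-bucket", "report.csv")
print(upload.get_method(), upload.full_url)
print(download.get_method(), download.full_url)
```

Notice that the object is written and read whole, addressed by a URL, with no open, seek or partial update as you would have with a POSIX file.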
Object storage is ideal for static and stable data that either never changes, or changes infrequently. A lot of new workloads are based on unstructured data that falls in this category, such as Big Data Analytics, High-performance Computing (HPC), and active archives.
In the latest IDC Marketscape, [IBM is ranked #1 in Object Storage]. IBM actually has three software-defined storage offerings that support object access methods: IBM Spectrum Scale, IBM Spectrum Archive and IBM Cloud Object Storage System. The latter came from the 2015 acquisition of Cleversafe.
Hyperscale Storage
"Hyperscale leverages commodity servers and a software-defined approach, scaling the resources needed for applications and storage separately. As storage needs grow, companies can add servers running software-defined storage (SDS) to the storage tier to expand capacity... Data is automatically distributed across the entire cluster of storage servers as new nodes are added to the system... With hyperscale, .. cluster nodes network together to form a storage resource pool."
This breaks from the tradition of dual-controller high-end arrays, which scale-up, rather than scale-out. IBM offers its IBM Spectrum Accelerate, IBM Spectrum Scale, and IBM Cloud Object Storage System to fill this hyperscale requirement.
In the play, Tevye realizes the world is changing all around him. He can either fight these changes and stick to tradition, or accept that he must change also, and move on. After 105 years, IBM continues to lead the IT industry, primarily by adopting new trends and technologies, moving to new business opportunities as they present themselves.
IBM is doing a bit of year-end housekeeping. The Storage Community (storagecommunity.org) will be discontinued as of January 1, 2017.
IBM will continue to host a community for all of its followers and contributors to share insights on the latest trends in storage at [ibm.co/StorageSolutions].
All of the most recent IBM content from storagecommunity.org will now be available at this new domain. IBM hopes that you will continue to engage in its community of storage industry thought leaders.
If you would like to contribute to the new community, please [register here]. Simply click the silhouette icon in the top right-hand corner of the page and select "register." Input your email address and create a password, then sign in. You will receive an email from IBM with further instructions to get you set up.
IBM's twitter handle (@SmarterStorage) will also be sunset as of January 1, 2017, but I encourage you to follow @IBMStorage, or my own twitter handle @az990tony, for the latest storage news and announcements from IBM.
Last Thursday, Dec 15, I had the pleasure to present to 162 clients and IBM Business Partners, followed by the premiere showing of [Rogue One, a Star Wars movie]!
(FCC Disclosure: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for IBM products and services. I have no financial interest in Lucasfilm Ltd, or its parent company Disney, LEGO company, or any competitor mentioned in this post. I was not compensated to review this film or mention it on my blog. All graphics from the film used in this blog and related presentation were publicly available under the U.S. "fair use" doctrine. There are no spoilers in this blog, so keep reading!)
This event was a collaboration between:
Arrow, one of IBM's distributors
Corus360, an IBM Business Partner
Regal Medlock 18, a theater with comfy seats with a bar that serves beer and wine
As a public speaker for IBM, I get to travel all over the world, and throughout the United States. This trip wraps up my travel for 2016, with 34 weeks on the road!
Normally, when I am asked to present, I am given a list of products or topics to cover. This time, I was just given the title "Has Your Data Gone Rogue? -- Using IBM Flash and solutions to obtain enhanced business insights" and the suggestion to keep within the theme of Star Wars.
I had 45 minutes to cover whatever I thought would be of interest to the clients in the audience, which spanned a variety of industries from Healthcare and Financial Services to Retail and Manufacturing.
I turned to mind-mapping software to brainstorm some ideas. On my smartphone, I use an app called [SimpleMind], and on my laptop, I use [View Your Mind (vym)]. Here is what I came up with:
I arrived at the theater early to set up and mingle with the clients in the lobby. The sponsors that organized this event had gifts to raffle off, including two drones, and three Star Wars themed LEGO sets.
I was told to be done by 7:30pm. It turns out that the movie is streamed electronically, rather than having the actual media distributed physically to the theaters, as a way to prevent piracy.
My PowerPoint charts were in 16:9 format to fill the screen. This was perhaps the biggest screen I had ever presented on! I look so tiny in comparison!
IBM has been a leader in all-flash arrays for the past three years in a row, and as an IBM Business Partner, Corus360 has been one of our top sellers in the Southeastern United States. IBM offers a wide array of choices, from DS8000 to FlashSystem to the new [IBM DeepFlash Elastic Storage Server (ESS)].
Rebels are inquisitive. IBM is considered number one in Analytics. For every type of question, IBM has analytics to help answer. Here are some examples:
What is happening? -- Descriptive Analytics
Why did this happen? -- Diagnostic Analytics
What might happen next? -- Predictive Analytics
What actions should we take? -- Prescriptive Analytics
I focused on the use of Hadoop and Spark with the [IBM Spectrum Scale] software pre-installed on the DeepFlash ESS device. The DeepFlash ESS combines powerful POWER8 servers with the DeepFlash 150, a 3U-high JBOF (Just a Bunch of Flash) that holds up to 64 solid-state boards of 8TB each, optimized for analytics of unstructured data content.
Spectrum Scale is supported on any open source distribution of Hadoop and Spark, and is an optional add-on to [IBM BigInsights]. [IBM HDFS Transparency Connector] has 100 percent compatibility, allowing Hadoop and Spark analytics programs to run directly without modification.
To provide valuable insight to the storage environment itself, IBM offers IBM Spectrum Control. The newest edition is [IBM Spectrum Control Storage Insights], a Software-as-a-Service (SaaS) that charges on a monthly per-capacity basis. Perfect for the Rebel Alliance on a tight budget and schedule!
The Galactic Empire has a different set of problems. They are behind schedule, having worked on the Death Star for the past 20 years, and upper management is growing impatient. A major test is imminent to prove its progress.
To speed development and test efforts, IBM offers a variety of FlashSystem products:
IBM FlashSystem 900
the World's Fastest Storage®, roughly 5 to 10 times faster than competitors based on commodity Solid State Drives (SSD) like Dell EMC XtremIO and PureStorage.
IBM FlashSystem V9000
adds the robust functionality of IBM Spectrum Virtualize, with Real-time Compression, Thin Provisioning, FlashCopy snapshots, and remote mirroring. Like the IBM SAN Volume Controller and Storwize family of products, the FlashSystem V9000 can virtualize almost 400 different storage devices from a variety of vendors.
IBM FlashSystem A9000 and A9000R
add the robust functionality of IBM Spectrum Accelerate, offering Real-time compression and data deduplication, making it ideal for Cloud, Virtual Machine and Virtual Desktop deployments.
As we learned in earlier episodes I to III of the Star Wars saga, a big problem was too many clones. The IBM Spectrum Storage family has introduced its newest member: IBM Spectrum Copy Data Management. This software creates and catalogs database clones to help with development and test efforts, reducing the number of rogue copies.
Lastly, the Empire must keep its secrets safe and protected. I covered the basics of data-at-rest encryption, the use of symmetric and asymmetric keys, [IBM Security Key Lifecycle Manager (SKLM)], and how these are deployed on IBM flash, disk and tape products.
Then, we watched the movie. I found it quite entertaining!
Well, it's Tuesday again, and you know what that means? IBM Announcements!
I just got back from my vacation, so this is a guest post from my colleagues Moshe Weiss, Senior Manager, Development and Design, IBM Storage; and Diane Benjuya, Portfolio Marketing Manager for IBM Spectrum Accelerate.
1. What is IBM announcing?
Today IBM announces another leap forward in storage management, with the availability of IBM Hyper-Scale Manager version 5.1. In April 2016, when IBM announced IBM FlashSystem A9000 and A9000R, they also introduced a fully revamped GUI: IBM Hyper-Scale Manager 5.0. That version brought FlashSystem A9000/A9000R clients a terrific new storage management experience, with advanced look and feel, analytics tools, and other enhancements for managing smarter, with greater simplicity and in less time.
Hyper-Scale Manager 5.x dramatically reduces task time -- by 45% for this task
With Hyper-Scale Manager 5.1, IBM is bringing this exceptional GUI and unified user management experience across the entire set of Spectrum Accelerate-based products, which IBMers internally refer to as the "A family":
IBM Spectrum Accelerate software
IBM XIV Storage System
IBM Hyper-Scale Manager lets you view and move quickly across software-defined, disk based, and all-flash storage in seconds, equipping you with the information you need to ensure every application is performing at its peak.
2. What is innovative about the new GUI -- how does it help clients?
IBM Hyper-Scale Manager 5 makes storage management more insightful and easier in multiple ways, helping clients find info, act and troubleshoot faster. Concepts implemented include: web application with tablet-ready design, single page application, strong navigation scheme, smart filter with analytics, capacity trend/forecast, call for action, better communication using social media. All this helps users make fast, informed decisions while being able to see at a glance the impact of any change on the environment, including into the future. The IBM team designed it over the past three years, working closely with clients and using Design Thinking methodology.
Get a holistic view of your storage
Provisioning, Monitoring and Troubleshooting
Find everything, get anywhere
Call for action!
The IBM team applied an "emotional design" approach that makes users feel emotionally attached to the GUI for its coolness and elegance -- making the experience not just more productive but also more pleasant.
Version 5.1 brings many exciting and important new features to ease the client's day to day activities. Here are some key ones:
Managing your "A Family" in one UI
Instantly gain insights, spot problematic areas
Integrated Capacity Analytics
4. Any unique features that will be focused on?
The IT industry is entering a cognitive era, right? So IBM has brought cognitive into the GUI. The GUI actually learns each user's habits and preferences over time and adapts the experience to the specific user.
5. How does 5.1 add value to the family of products based on Spectrum Accelerate software?
Hyper-Scale Manager makes this powerful family for private, public and hybrid block storage clouds that much more attractive and relevant. Just imagine yourself:
Waking up, driving to the office, opening the UI and seeing that your FlashSystem A9000 systems are doing worse than your XIV in terms of IOPS. Scary, but no worries.
You drill down to the specific FlashSystem A9000 by comparing IOPS. You find that a QoS performance class is deliberately reducing performance for the host. A quick analysis, and you find that it is due to the contract with the host. After a short chat with the host admin, you establish better terms, and decide to stop the IO limitation on the volumes and move them to a disk-based XIV to reduce dollar-per-TB cost.
You look for the best candidate by looking at the capacity trend/forecast charts for each XIV and at growth rate per month. You compare performance metrics and choose the preferred XIV to move the volumes to.
You migrate the volumes from the A9000 to the chosen XIV using the same interface, creating connectivity in one click. You then add the same host configuration as for the A9000 to the XIV in a second click. Then just map and monitor the new IO statistics with a third click. Easy!
Imagine carrying out your daily work and decisions -- creating volumes, monitoring, mirroring, troubleshooting and configuring -- across different systems of different types within the family in single clicks -- without the need to move between user interfaces. You can think of Hyper-Scale Manager 5.1 as a GUI come alive: a dynamic, breathing, thinking work enhancer that simplifies and helps you make the most of your investment.
Come see it in action! Register now for the [Live demo webinar], scheduled for Wednesday, November 9, 2016, from 10am to 11:30am MST!
Download the software from [IBM Fix Central]. Installation is one click and takes just seconds!
Here is an infographic!
Comments? Feedback? Enter them below. Both Moshe and Diane would be pleased to hear from you!
Well, it's Tuesday again, and you know what that means? IBM Announcements!
(OK, yes, today is Friday, but I was busy getting married on Tuesday, so IBM pushed the announcements out one day to Wednesday, and technically I am writing this blog post during my honeymoon vacation, so the IBM marketing team and my new wife both cut me some slack. Work/Life balance is all about compromises, right?)
IBM DS8880 Storage System
The IBM DS8880 comes in three models, the DS8884 entry level, the DS8886 enterprise level, and the DS8888 all-flash array. IBM offers 1, 2, 3 and 4 year warranties.
The new High Performance Flash Enclosure (HPFE) Gen2 delivers more capacity than Gen1. The 2U flash enclosures are configured in pairs with each enclosure supporting up to twenty-four 2.5-inch flash cards in capacities 400 GB, 800 GB, 1.6 TB and 3.2 TB.
The HPFE Gen2 enclosures are currently available for both the DS8884 and DS8886 models. The maximum flash capacity for the DS8886 increases from 96 TB to 614.4 TB, and the new flash enclosure reduces storage costs through a lower cost per IOPS. IBM has made a statement of direction to offer these HPFE Gen2 on the DS8888 as well.
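As a quick sanity check, the 614.4 TB figure is consistent with the enclosure numbers quoted above. The sketch below is only back-of-envelope arithmetic: the count of four enclosure pairs is an assumption inferred from the totals, not a figure stated in the announcement, and the function name is made up for illustration.

```python
# Back-of-envelope check of the HPFE Gen2 capacity figures quoted above.
CARDS_PER_ENCLOSURE = 24   # from the announcement
LARGEST_CARD_TB = 3.2      # largest flash card capacity, in TB
ENCLOSURES_PER_PAIR = 2    # enclosures are configured in pairs
ASSUMED_PAIRS = 4          # ASSUMPTION: inferred from the totals, not stated


def max_flash_tb(pairs: int, card_tb: float = LARGEST_CARD_TB) -> float:
    """Raw flash capacity in TB for a given number of enclosure pairs."""
    cards = pairs * ENCLOSURES_PER_PAIR * CARDS_PER_ENCLOSURE
    return round(cards * card_tb, 1)


if __name__ == "__main__":
    print("One pair:", max_flash_tb(1), "TB")
    print("Assumed maximum:", max_flash_tb(ASSUMED_PAIRS), "TB")
```

Under that assumption, four pairs of fully populated enclosures with 3.2 TB cards land on the quoted maximum.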
To improve security, IBM DS8880 now supports customer-defined digital certificates for authentication, and configurable Hardware Management Console (HMC) firewall support.
For IBM's mainframe clients, IBM now offers "Extents-level" space release support for z/OS®, DSCLI (Command Line Interface) support for z/OS environment, and FICON® Information Unit (IU) pacing improvements.
IBM Spectrum Virtualize™ V7.8 delivers support for the latest SAN Volume Controller, FlashSystem V9000 and Storwize® product family, and adds new software functionality and improvements.
In conjunction with [IBM Spectrum Copy Data Management], Spectrum Virtualize v7.8 offers flexible data protection with transparent cloud tiering to leverage the cloud as FlashCopy targets and restore these snapshots from the cloud on select platforms.
However, the encryption keys are kept on USB thumb drives, which are either left in the USB ports on the back of the hardware, or locked away in a safe, only to be retrieved as needed when rebooting the systems or upgrading the firmware.
Now, IBM Spectrum Virtualize v7.8 supports the IBM Security Key Lifecycle Manager (SKLM) to manage encryption keys. IBM continues to support USB thumb drives if you prefer, but SKLM is used to manage keys for most of the rest of IBM products, and provides centralized management.
The SVC and Storwize models can directly attach via 12Gb SAS to expansion drawers. Previously, we supported two options: a 2U-high 12-bay drawer that supports Large Form Factor (LFF) 3.5-inch Nearline (7200 rpm) drives, and a 2U-high 24-bay drawer that supports Small Form Factor (SFF) 2.5-inch drives (SSD, 15K, 10K and 7200 rpm).
With Spectrum Virtualize v7.8, IBM now offers a third option, the 5U-high 92-bay that supports both LFF and SFF drives. This new expansion can be attached to Storwize V5000 Gen2, Storwize V7000 (models 524/Gen2 and 624/Gen2+), and SVC (models DH8 and SV1).
For the 12-bay and 92-bay, IBM now supports 10TB capacity 3.5-inch Nearline drives. For the 24-bay and 92-bay, IBM now supports 7.68 TB and 15.36 TB capacity Solid State Drives (SSD).
For those concerned about the phrase "lower endurance" in the press release, let me explain. SSD have a bit of extra capacity included. If you write the full capacity of the drive every day for a year, you will "burn up" about one percent of the capacity.
To handle ten "Full Drive Writes per Day" (10 FDWP) over the course of five years, IBM adds 50 percent extra spare capacity above the 400 GB, 800 GB, 1.6 TB and 3.2 TB capacities. So, a 400GB full-endurance drive is really 600 GB inside. These were sometimes referred to as "Enterprise" SSD.
For the larger device sizes, the IT industry has determined that 1 FDWP is sufficient, so instead of 50 percent spare capacity, IBM adds only 5 percent extra. The 7.68 TB is really 8.06 TB inside. These were earlier referred to as "Read-Intensive" SSD. These come in 1.92 TB, 3.84 TB, 7.68 TB and 15.36 TB capacities.
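The rule of thumb above is easy to turn into arithmetic. This is a minimal sketch that assumes the "one percent per year of daily full-drive writes" figure scales linearly; the function names are made up for illustration and are not part of any IBM tool.

```python
def spare_fraction(full_drive_writes_per_day: float, years: float,
                   wear_per_write_year: float = 0.01) -> float:
    """Spare capacity needed, as a fraction of the usable capacity.

    Rule of thumb from the text: one full drive write per day for one
    year "burns up" about one percent of the capacity.
    """
    return full_drive_writes_per_day * years * wear_per_write_year


def raw_capacity(usable: float, full_drive_writes_per_day: float,
                 years: float = 5) -> float:
    """Capacity actually inside the drive, in the same unit as `usable`."""
    spare = spare_fraction(full_drive_writes_per_day, years)
    return round(usable * (1 + spare), 2)


if __name__ == "__main__":
    print("400 GB at 10 writes/day:", raw_capacity(400, 10), "GB inside")
    print("7.68 TB at 1 write/day:", raw_capacity(7.68, 1), "TB inside")
```

Ten full writes per day over five years works out to 50 percent spare, and one full write per day to 5 percent, matching the two drive classes described above.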
IBM is also offering non-disruptive model conversions. Storwize V5010 can now be converted to V5020, and V5020 can be converted to V5030. The Storwize V7000 Model 524 (Gen2) can be converted to model 624 (Gen2+).
The DeepFlash 150 is the perfect JBOF addition to the ESS family. The current ESS models had either 2U-high 24-drive bays, or 4U-high 60-drive bays. This new model is 3U-high with 64 high-capacity (8 TB) Board Solid State Drives (BSSD).
The ESS includes all the features of IBM Spectrum Scale, including both 8+2 and 8+3 Erasure Coding data protection. This provides file and object access to data, including POSIX compliance for Windows, Linux and AIX operating systems, as well as HDFS-compliant access for big data analytics.
Last month, I presented at the "IBM Technical University" event in beautiful Atibaia, Brazil. Here is my recap of the event.
Marcelo Porto, IBM General Manager for Brazil and Client Unit Executive for Retail
What a great way to start a conference! Marcelo asked if everyone was comfortable. Everyone cheered in the affirmative.
He then said "Well, not for long. We will take you out of your comfort zone! You will disrupt yourself, and disrupt your companies. You will learn about new technologies and solutions that will make you very uncomfortable."
He explained how everything is virtual, specifically the three companies Airbnb, Waze, Uber. All of these three have new transformational business models, and he suggested all companies should follow suit.
He then said people need to be focused on four things:
Adopt an "agile attitude"
Act like you own the company
Don't cling to the past
Have the courage to re-invent yourself and your company
Frank Koja, IBM Vice President for Sales, Enterprise Systems Hardware
(Managers and business leaders could probably raise this percentage considerably if they talked to their employees before making decisions, but that's another blog post!)
Frank showed a video of an IBM client, Plenty of Fish (POF). This is a worldwide dating site with three million POF members in Brazil. They now process over 30,000 requests and/or messages per minute. FlashSystem connected to 30 servers makes that possible.
OpenPower consortium started with just 5 companies in 2014 for technology collaboration. Today, 250 members across 26 countries in six continents collaborate to make POWER technology as ubiquitous a commodity as Intel x86.
Frank then switched to "Business models" innovation. Out of the audience of about 800 people, only 10 raised their hands to say they had heard of Blockchain (he asked IBMers not to raise their hands, as all IBMers have heard of Blockchain!).
Frank feels that Blockchain is the most disruptive innovation since Internet banking. Blockchain affects supply chain, finance, insurance, shipping logistics, customs inspections, and government registrations.
A video showed a woman from Everledger, which uses Blockchain for shipping diamonds. IBM offers Blockchain on LinuxOne mainframe servers.
Hybrid Cloud is the point of no return, including Local, Dedicated and Public clouds. Frank feels we need to cloudify all business processes.
Mauro Angelo, IBM Enterprise Strategy & Industry Solutions Director
Mauro explained that ideas are turned into inventions, and inventions are put to good use to bring forth innovations.
If your business is not cognitive, you are a full era behind. Machine learning is not new. IBM Deep Blue beat a chess Grandmaster in a match back in 1997.
Mauro then focused on eight specific trends:
Systems of Engagement (SoE)
This is the combination of Mobile applications and Social business. IBM invented the first smartphone, the Simon, back in 1994. Apple's iPhone came later in 2007. Pokemon Go is an example of augmented reality.
Cloud offers new service and location models. IBM [SoftLayer], [Bluemix], and [Kenexa] are a few examples.
There have been a lot of enhancements in this space, including Natural Language Processing (NLP), visual recognition, even smell recognition. Cognitive solutions can also identify the appropriate context, such as GPS location. And Cognitive solutions can interact with users to ask for clarifications. It can process "Big Data", the collection of non-structured data that normal Relational Database Management Systems (RDBMS) do not touch. Finally, they can learn, something often referred to as "Machine Learning".
In 2011, IBM Watson beat two humans at the TV show game Jeopardy! Today, [Dino, a toy from CogniToys] provides Watson-like capabilities to children.
Mauro got one for his daughter. She naturally interacts with the toy. "How much does an elephant weigh?" she asks. "It depends on the elephant, but a fully grown elephant weighs more than 2,000 kilos" it responds. That's cool.
Wearables like Fitbit can track blood pressure, minutes of exercise, total steps walked. IBM helped Under Armour company develop an app in this space.
Eliminates middlemen or trusted third party (TTP). The hotel chain, Hilton, is testing out a robot called Pepper, which can use Blockchain to book tennis courts.
These are technologies thinner than a strand of hair, measured in nanometers. The focus is to develop stronger, lighter materials, and macromolecules for life sciences for medicine delivery.
Mass customization meets personalization and fast design prototypes. This is not just limited to plastic, but also metal, paper, wood, biomaterials, ceramics, food, and even cement.
Cement? That's right. A Chinese company prints houses using a cement 3D printer. In a country of over one billion people, this company has figured out how to build houses without human laborers.
Internet of Things (IoT)
Olli, a 12-person self-driving bus, is the brainchild of Local Motors. They are testing it out in National Harbor, and hope to roll it out to cities like Copenhagen, Miami, and Las Vegas.
Luis Liguori, IBM Distinguished Engineer and CTO for IBM Brazil
What does IBM mean by "Digital transformation?" What separates success from failure? Developed countries from less developed countries?
Is it culture? Whether people focus on the long term, or just the short term? Does the culture encourage you to foresee the future, and adapt accordingly? Does the culture encourage you to be brave and bold? Do you hide behind Business case return on investments (ROI)? Does your culture consider conflict to be good or bad? The answer: Good!
Does your company have a purpose? When humans no longer serve a purpose, they die. The same is true for companies. He said the secret to success is the four "R's" -- Relevant, Resources, Reputation and Rigor.
For example, in 1996, Kodak was ranked the 4th largest; it filed for bankruptcy in 2012 because it was no longer relevant.
Consider Samsung. Samsung has lost its reputation with the latest "Samsung Galaxy Note7" fiasco of exploding batteries!
Airbnb is an example of Digital Transformation. Who knew that there were lots of people who wanted to rent out their bedrooms and bathrooms to strangers!
Luis feels that successful companies are either born digital, or transforming to digital. Industries are merging. Lines are blurring between industries. The latest acquisition between AT&T and Time Warner is an example.
Cognitive brings intelligence to decision making. For example, Watson Health has been put to task to focus on Leukemia. In one case, Watson was able to [pinpoint a rare form of Leukemia] that had been misdiagnosed and was being treated incorrectly with little effect.
Why cognitive? Because human beings cannot read or remember as well as computers. There are thousands of peer-reviewed articles published every day. People are afraid to act to avoid mistakes. Computers are fearless.
Did you know that Brazil celebrates "Black Friday"? There is no "Thanksgiving" in Brazil, but retailers liked the idea of having people stand outside in the middle of the night to start their Christmas shopping! A few years ago, there were [a few problems], but in most recent years, it has shown to help [boost retail sales.] Based on these initial purchases, Watson can be used to help drive the rest of the Christmas retail season.
Watson can analyze personality based on social media writings. The world will be taken over by digital natives. The last century was focused inward, or "ego-centric", but in this 21st century, we will be focused outward, towards a complete "ecosystem".
Who are your competitors? Are they the companies that make products and services similar to yours? No! They are the companies that are competing for your customer's time and attention.
While I speak English and Spanish fluently, my Brazilian Portuguese is terribly rusty. We had several rooms with a pair of real-time translators. I presented the following:
Software Defined Storage -- Why? What? How?
The Pendulum Swings Back -- Understanding Converged and Hyperconverged Environments
IBM Spectrum Scale for File and Object Storage
IBM Storage integration with OpenStack
Introduction to IBM Cloud Object Storage System and its Applications (powered by Cleversafe)
IBM's Cloud Storage Options
All of my sessions were well received, and well attended!
Photo by Dominique Salomon,
IBM Certified IT Specialist
On Wednesday night, we had a nice pool-side reception. Beers, Caipirinhas, and Caipiroskas. Caipirinhas combine a sugarcane juice-based distilled alcohol called cachaça with muddled limes and added sugar. Caipiroskas combine vodka with muddled kiwi fruit.
(Many of the IBMers from United States skipped this event to get dinner early, so they could then come back in time to watch the third and final US Presidential Debate. Because of the time zone changes, this didn't start until 11:00pm, so they could have easily attended the event and had dinner, with plenty of time to spare!)
There was also a live band! This three-piece band had two guitarists and one lead singer. The lead singer also played maracas and drums while singing. They covered both English and Portuguese language songs.
Rodrigo Giaffredo, IBM Engagement Catalyst
Rodrigo gave the closing session. Wearing jeans and sneakers, he reminded me of the casual storytelling style of Jeff Jonas. He organized his stories around four points:
Consider the battle between Twitter vs. Pownce in 2007. Twitter won because it offered better ways to limit what you read, or who you communicate to, through methods like Hashtags, groups, etc.
Henry Ford disrupted transportation. He realized that time and space are money. However, as he famously said, "If I asked people what they wanted, they would have said faster horses!"
Today the challenge is processing data faster. The company that is able to process faster has economic advantage.
Strong ideas focus on user needs. Weak ideas are tactical and focus on features. Consider the [Hippo Roller]. For centuries, African women and children carried water from far away wells either in their hands or on their heads. Much of it would fall out during the long walks. The Hippo Roller holds 90 liters (about 24 gallons) and rolls easily over rough terrain.
Rodrigo showed a graph. On the y-axis was "Importance" and the x-axis "Feasibility". Solutions in the upper right corner are obvious choices. Solutions in the upper left, important but not very feasible, are considered "big bets". Solutions in the lower right, feasible but not very important, he labeled "amenities".
Most designers, architects and developers know that the later the error is found, the more expensive it is to fix. A prototype is worth a thousand meetings.
Take the company Zappos, which sells shoes online over the Internet. The founder, Nick Swinmurn, tried to get investors, getting a typical response: "What are you drinking?" (In USA, we would ask what are you smoking, but this is the way the Brazilians say it.)
With no investors, Nick built a simple website, took pictures of shoes, and fulfilled orders by purchasing the shoes from local San Francisco retailers and shipping them to the clients.
Nick started this in 1999, and finally got some $20 Million USD in funding in 2004. His simple prototype allowed him to focus on post sales support. Zappos was recognized as having the best call center, moving its operations to Las Vegas, NV.
Consider the challenges of urban mobility.
Both methods eventually result in a car, but the agile prototypes allow for more effective experimental milestones.
As for Zappos, its prototype proved successful. Amazon acquired them for $1.2 Billion USD in 2009.
It is that simple: Understand, explore, prototype, and evaluate. IBM has adopted "Design Thinking" across its development organizations to better meet the needs of the marketplace.
Overall, it was a delightful event. It is nearly summer down in the Southern hemisphere, so a bit warm and humid. The attendees were all looking forward to a turn-around in the Brazilian economy, and the business opportunities that brings.
Well it's Tuesday again, and you know what that means? IBM announcements!
Today, IBM announced a few things related to storage.
IBM Spectrum Copy Data Management
This new member of the IBM Spectrum Storage family helps manage all of those snapshot and FlashCopy images made to support DevOps, data protection, disaster recovery, and Hybrid Cloud computing environments.
The software automates the creation and cataloging of copy data on existing storage infrastructure, such as snapshots, vaults, clones, and replicas. This can be especially useful with Oracle, Microsoft SQL Server, and other databases that are often copied to support application development, testing, and data protection.
Initially, the following storage devices are supported:
IBM storage systems running IBM Spectrum Virtualize™ Software V7.3, and later, including IBM SAN Volume Controller, IBM Storwize®, and IBM FlashSystem® V9000
Storage systems running IBM Spectrum Accelerate™ 11.5.3, and later, including IBM FlashSystem A9000, A9000R, and IBM XIV® and the Supermicro Hyperconverged Appliance
IBM SKLM is IBM's lead offering for creating and managing encryption keys used by various Flash, Disk, Tape and SAN products.
This software release enhances the separation of duties for better alignment with regulatory requirements, simplifying the administrative access, LDAP integration, and device certificate TrustStore management. Device-group key import and export improves the flexibility in key management across multiple organizations.
For those using Hardware Security Modules [HSM], this software now offers HSM-based backup and restore of the encryption key database.
IBM is also enhancing its support of the Key Management Interoperability Protocol [KMIP], an industry standard to support encryption keys and the products that use them. This release now supports integration with any KMIP-compliant device from any vendor, including the introduction of KMIP Opaque and Suite B profiles.
IBM Storage Networking MDS 9000 24/10-port SAN Extension Module
The new MDS 9000 24/10-port SAN Extension Module is supported on MDS 9700 Series Multilayer SAN Fabric Directors. It supports 24 Fibre Channel ports (auto-negotiating 2/4/8/10/16 Gbps), eight 1/10 GbE Fibre Channel over IP (FCIP) ports for long-distance replication, and two 40 GbE FCIP ports.
The modules support virtual SAN (VSAN), Hardware-based encryption to help secure sensitive traffic with Internet Protocol Security (IPsec), and hardware-based compression to dramatically enhance performance for both high-speed and low-speed links. This can help reduce costs for long-distance replication over expensive WAN infrastructure.
Two years ago, the folks at University of Toronto asked me to help their graduate students build a "Watson" running entirely on IBM SoftLayer to see if this would be a worthwhile class project. Needless to say, it was more difficult than they expected, but we managed to pull it off during that summer, able to answer a handful of simple questions from a single page corpus.
Last month, [Industry Leaders Establish Partnership on AI], combining the talents from Amazon, DeepMind/Google, Facebook, IBM and Microsoft, to form a non-profit to explore best practices and ethical questions related to Watson and other Artificial Intelligence applications.
Since data is at the core of any Artificial Intelligence, IBM is pleased to announce today that IBM Cloud Object Storage System is now available on IBM SoftLayer. This is based on the Cleversafe technology IBM acquired last year.
While other cloud service providers have offered data storage in the cloud, this new offering also allows hybrid configurations with geographically dispersed erasure coding. Unlike RAID which protects against the loss of one or two drives, erasure coding can protect against a larger number of concurrent failures. For example, using an Information Dispersal Algorithm of "7+5", where seven pieces of data are encoded on twelve independent disks, the system can lose up to five disk drives without losing any data.
Combining this with Geographically Dispersed Configuration across three or more sites means that you can lose an entire data center, four of the twelve disks, and still have instant full access to all of your data from eight drives at the other locations. In the graphic, you see two on-premise data centers combined with a third location in IBM SoftLayer.
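The fault-tolerance arithmetic in this example can be sketched in a few lines. This is only an illustration under the assumption that the twelve pieces are spread evenly across the sites; the function names are hypothetical and this is not how the product itself is configured or queried.

```python
def is_readable(total_pieces: int, pieces_needed: int, pieces_lost: int) -> bool:
    """Data survives as long as at least `pieces_needed` pieces remain."""
    return total_pieces - pieces_lost >= pieces_needed


def pieces_after_site_loss(total_pieces: int, sites: int, sites_lost: int = 1) -> int:
    """Pieces still reachable when whole sites fail.

    ASSUMPTION: pieces are distributed evenly across the sites.
    """
    per_site = total_pieces // sites
    return total_pieces - per_site * sites_lost


if __name__ == "__main__":
    total, needed = 12, 7          # the "7+5" example from the text
    print("Lose 5 drives, still readable?", is_readable(total, needed, 5))
    print("Lose 6 drives, still readable?", is_readable(total, needed, 6))
    left = pieces_after_site_loss(total, sites=3)
    print("Pieces left after losing one of three sites:", left)
    print("Still readable?", left >= needed)
```

With three sites holding four pieces each, losing a whole site leaves eight pieces, one more than the seven needed to read the data.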
Today, I met with Teresa Ferraro and Mike Buttrum from FirstRain in their Manhattan office in downtown New York City. IBM recently contracted FirstRain to provide IBMers like myself with analytics on publicly-available news to keep us informed for business meetings. Here's how IBMers can get the most out of this service.
Basically, FirstRain takes a list and generates the best summaries of publicly-available news that are most relevant. You can organize into different channels. Here I have seven channels.
Companies to watch refer to existing or prospective clients that I plan to be talking with soon. Some of my colleagues are assigned to specific clients, so they can set this up once and enjoy the news for the rest of the year. I, on the other hand, meet with different clients every week, so I will be updating this list on a frequent basis.
I have divided the Competitors between major ones, and smaller startups. Since I am often working with business partners and distributors, I made that a separate channel as well.
For product lines, I picked three: Data migration, Data storage solutions, and Software defined storage.
For conferences where I don't know which companies will attend, such as the IBM Technical University, I can set up information by territory. Here is one for Brazil.
I also attend industry-oriented events, so I can pick those vertical markets that might be helpful with dinner conversations. In this example, I chose Energy, Electric Utilities and Gas Utilities.
Once you have your channels configured, you get your results in various sections:
Management Changes lists any changes in top C-level positions, who left the company, who got recently hired.
Key Developments indicates news like mergers and acquisitions and government regulations.
First Reads prioritizes the top six articles for your channel. You can access more, but these six will get you started as you have your morning coffee.
First Tweets gives you the six most relevant tweets, if those articles above were just "TL;DR"
A section on Business Influencers and Market Drivers is interesting to see who the big players are, and what topics are driving the most conversation. Here's an example from my Energy/Electric/Gas channel:
The Most Talked About section covers quotes and commentary about the most talked about companies in your channel.
With most news sources focused on politics, weather and celebrity gossip, it is nice to have a quicker, more focused approach to get the news I need to prepare for my client briefings. Special thanks to my hosts Teresa and Mike for their hospitality!
This week, I am in Las Vegas for [Edge 2016], IBM's Premiere IT Infrastructure conference of the year.
Day 4, the last day of the conference, is only a partial day, and many people opted to leave on Wednesday evening, or Thursday morning instead. The breakfast and lunch meals had fewer people than the previous days. Here is my recap of day 4 Thursday breakout sessions.
Building Hyperconverged Infrastructure for Next-Generation Workloads
Supermicro is more than happy to customize these, upgrading the CPU, RAM, disk or networking connectivity as needed. This solution is roughly half the price of Nutanix, and offers a better Next-Business-Day/9am-to-5pm support package.
The last time I was in Las Vegas, I presented this topic at [IBM Interconnect conference]. Back then, I was given only 20 minutes, was placed on the Solutions Expo showroom floor, competing with the noise and traffic of attendees going to lunch.
This time, it was much better, a large room, and a bigger-than-expected audience given that it was scheduled on Thursday morning.
Cloud storage comes in four flavors: persistent, ephemeral, hosted, and reference. The first two I refer to as "Storage for the Computer Cloud" and the latter two I refer to as "Storage as the Storage Cloud".
I also explained the differences between block, file and object access, and why different Cloud storage types use different access methods. I wrapped up the session covering the various storage solutions that IBM offers for all four Cloud Storage types.
IBM Storwize and IBM FlashSystem with VersaStack versus NetApp FlexPod
Norm Patten, part of the IBM Competitive Project Office Storage Team, presented a competitive comparison between VersaStack with IBM storage, versus FlexPod with NetApp storage.
Commodity Solid State Drives (SSD) and Shingled Magnetic Recording [SMR] offer low-cost, high-capacity storage.
However, they have their own set of problems, so IBM is developing software that can be included in IBM Spectrum Accelerate, Spectrum Scale, and Spectrum Virtualize to optimize their utility.
The concept of Log-Structured Array has been around since 1988. The IBM RAMAC Virtual Array back in the 1990s used it. NetApp's Write-Anywhere File System (WAFL) is an implementation of the [Log-Structured File System] general concept.
SALSA combines Log-Structured Array with enhancements borrowed from the IBM FlashSystem design, that I covered in my Monday and Wednesday presentations, to enhance write endurance by as much as 4.6 times!
This was an NDA session, so I cannot blog any of the details.
World-class Flash-optimized Data Reduction and Efficiency with IBM FlashSystem A9000 and A9000R
Tomer Carmeli, IBM Offering Manager for the A9000 and A9000R presented. He presented an overview of these models on Monday, so this session was focused on the data footprint reduction technologies.
Basically, it is a three step process. First, all "standard patterns" are removed. IBM has identified some 260 standard patterns that are 8KB in length, such as all zeros, all ones, or all spaces, and replaces these blocks immediately with a pattern token.
Second, [SHA-1] 20-byte hash codes are computed on 8KB pieces on a rolling 4KB alignment boundary. In other words, if a 64KB block of data is written, bytes 0-to-8KB are hashed and compared to existing hash codes. If there is no match, then bytes 4KB-to-12KB are hashed, and so on. This approach nearly doubles the likelihood of finding duplicates. When a block match is found, the algorithm replaces the block with a pointer and a reference count.
Third, any unique data that still remains is compressed using Lempel-Ziv algorithm. This is done using the [Intel® QuickAssist]. This co-processor can compress data 20 times faster than software algorithms running on general-purpose x86 processors.
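To make the three steps concrete, here is a toy sketch in Python. It is not IBM's implementation: single-byte fills stand in for the roughly 260 standard patterns, the standard-library `zlib` module (DEFLATE, an LZ77-based compressor) stands in for the hardware-assisted Lempel-Ziv compression, and the `reduce_data` and `restore` functions and their token format are made up for illustration.

```python
import hashlib
import zlib

CHUNK = 8 * 1024   # deduplication granularity described in the text
STEP = 4 * 1024    # alignment boundary described in the text


def reduce_data(data: bytes, store: dict) -> list:
    """Turn `data` into tokens; `store` maps SHA-1 digest -> [compressed chunk, refcount]."""
    tokens = []
    pos = 0
    while pos + CHUNK <= len(data):
        chunk = data[pos:pos + CHUNK]
        # Step 1: replace a "standard pattern" (here: one repeated byte) with a token.
        if chunk == chunk[:1] * CHUNK:
            tokens.append(("pattern", chunk[0]))
            pos += CHUNK
            continue
        # Step 2: look for a known 8KB chunk at this offset, then 4KB further on.
        for shift in (0, STEP):
            window = data[pos + shift:pos + shift + CHUNK]
            digest = hashlib.sha1(window).digest()
            if len(window) == CHUNK and digest in store:
                if shift:
                    # The leading 4KB did not match anything; keep it compressed inline.
                    tokens.append(("raw", zlib.compress(data[pos:pos + shift])))
                store[digest][1] += 1          # one more reference to the stored chunk
                tokens.append(("ref", digest))
                pos += shift + CHUNK
                break
        else:
            # Step 3: unique data is compressed and stored once.
            digest = hashlib.sha1(chunk).digest()
            store[digest] = [zlib.compress(chunk), 1]
            tokens.append(("ref", digest))
            pos += CHUNK
    if pos < len(data):
        # Leftover bytes shorter than one chunk are simply compressed.
        tokens.append(("raw", zlib.compress(data[pos:])))
    return tokens


def restore(tokens: list, store: dict) -> bytes:
    """Rebuild the original bytes from the tokens and the chunk store."""
    out = bytearray()
    for kind, value in tokens:
        if kind == "pattern":
            out += bytes([value]) * CHUNK
        elif kind == "ref":
            out += zlib.decompress(store[value][0])
        else:
            out += zlib.decompress(value)
    return bytes(out)
```

Writing the same 8KB of data a second time, even shifted by 4KB, adds only a reference to the stored chunk rather than a second copy, which is the effect the 4KB alignment is meant to capture.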
Do you want an estimate of how much "reduction ratio" you may achieve? IBM has developed two estimator tools to help. The first tool is a complete scan for data expected to be dedupe-friendly. It is a slow process, taking 8 hours per TB. This would be ideal for Virtual Desktop Infrastructure or backup copies.
The second tool is the infamous [Comprestimator] that IBM has had for a while to help estimate compression savings for IBM Spectrum Virtualize storage solutions like SVC, Storwize and FlashSystem V9000. This tool is very fast, looking at only a statistically-valid subset of the data.
The results of both tools are merged, and the result is within five percent accuracy. This allows IBM to offer guidance on which data to place on these new A9000 and A9000R models, as well as offer a "reduction ratio" guarantee.
A client asked me why I bother to attend other sessions, when I probably know most of the material they present. I explained that I can always learn from others. I can honestly say that I learned something new and useful at every session I attended.
This week, I am in Las Vegas for [Edge 2016], IBM's Premiere IT Infrastructure conference of the year. Here is my recap of Day 3 Wednesday.
Become your own Storage Consultant
Gary Graham, IBM Field Technical Specialist for Storage, and Brian Pioreck, IBM Client Technical Specialist for Storage, co-presented this session. This session explained how to use IBM's 30-day free trial of IBM Spectrum Control Storage Insights, a cloud-based services offering.
(Note: 15 years ago, I was the chief architect of version 1 of what we now call IBM Spectrum Control. I am pleased to see how well this product has evolved over the years.)
Storage Insights provides a reporting-only subset of the popular IBM Spectrum Control Standard and Advanced editions. It reports on IBM storage devices, as well as any non-IBM devices that are virtualized behind IBM Spectrum Virtualize products like SAN Volume Controller (SVC), Storwize, and FlashSystem V9000.
If you are a storage administrator, consider trying this out for 30 days to get some immediate results. Since it is cloud-based, you only need a Windows, Linux or AIX system to install a "collector" on site. This collector sends data up to the Cloud at one of the IBM SoftLayer facilities. The installation process takes only 30 minutes, and you can download the code from the Internet.
If you find Storage Insights valuable, helping you reclaim some unused space, or provide other insight that saves your company money, consider buying the service, for only 250 US Dollars per 50 TB monitored. If you want more than just monitoring and reporting, consider one of the on-premise solutions like IBM Spectrum Control Standard, or IBM Spectrum Control Advanced edition, which provide provisioning and configuration capabilities as well.
Enhance your Security posture with At-Rest Encryption using the latest IBM Spectrum Virtualize
All of the IBM Spectrum Virtualize products support Data-at-Rest Encryption. For direct-attached storage, the 12Gb SAS controller performs hardware-assisted encryption.
For SAN-attached storage via FCP, FCoE or iSCSI back-end devices, IBM uses the [AES-NI instruction set] that comes included in certain Intel CPU processors.
Last November 2015, [IBM acquired Cleversafe] for $1.3 Billion US dollars because Cleversafe has the brand name recognition as the #1 Object Storage vendor the past two years in a row (2014 and 2015). On July 1 of this year, the transformation was complete, and their flagship product was officially renamed to the IBM Cloud Object Storage System, which some abbreviate informally as IBM COS.
Since then, IBM has been busy integrating IBM COS into the rest of the storage portfolio. I explained how IBM COS can be used for all kinds of static-and-stable data, but is not suited for frequently changed data, such as Virtual machines or Databases.
Object storage can be accessed via NFS or SMB NAS-protocols using a gateway product, like IBM Spectrum Scale, or those from third-party partners like Ctera, Avere, Nasuni or Panzura. It can also be used as an alternative to tape for backup copies, and is already supported by the major backup software like IBM Spectrum Protect, Commvault Simpana, or Veritas NetBackup.
A few years ago, I explained to a client that Converged and Hyperconverged were like a pendulum swinging back. Over the past few decades, we have gone from internal disk, to externally attached disk, to SAN and LAN networks.
Each time, we gained more flexibility, greater connectivity and longer distances. Then I explained that Converged and Hyperconverged is like going backwards, the pendulum swinging back to the days of internal and direct-attached storage. The analogy was a hit, and thus this session was born!
IBM offers multiple Converged Systems. IBM PureSystems, PureData, PurePower and PureApplication solutions offer racks of compute, storage and network gear. Last year, IBM collaborated with Cisco to create VersaStack, a converged system that combines Cisco's x86 blade servers and switches with IBM FlashSystem and Storwize products.
IBM also offers Hyperconverged solutions. IBM Spectrum Accelerate allows the compute, storage and network functions to run on 3 to 15 VMware ESXi hosts that form a cluster. The cluster can then make iSCSI-based volumes available to other virtual machines running on these same hosts. The volumes can also be made available to servers outside the cluster, such as bare metal servers or other hypervisors. This is available as software-only, or you can get a pre-built system called the Supermicro Hyperconvergence Appliance.
IBM Spectrum Scale provides a clustered file system that allows the compute, storage and network functions to run on 3 to 16,000 machines. Formerly called General Parallel File System (GPFS), IBM Spectrum Scale has been around for over 18 years. Over 200 of the world's largest "Top 500" supercomputers run IBM Spectrum Scale today.
IBM Spectrum Virtualize and IBM Storwize Birds-of-a-Feather
Barry Whyte, fellow blogger and IBM Master Inventor, presented an overview of the latest features, and where IBM is headed in 2017 for the IBM Spectrum Virtualize family of products. Barry now works in Advanced Technical Skills for Storage Virtualization Asia/Pacific Region.
The group then moved to another room offering delicious food and drink, as Eric Stouffer, IBM Director, Storwize Offering Manager and Business Line Exec, presented the future areas that IBM is considering for this product family.
All of this was done under Non-Disclosure Agreements (NDA), preventing me from blogging any details. Back in 2003, Las Vegas started a marketing campaign ["What Happens in Vegas, Stays in Vegas"]. Coincidentally, this is the same year IBM introduced the IBM SAN Volume Controller, the first product in the IBM Spectrum Virtualize family.
This was a long day, but I was pleased with the large audiences I had at my sessions.
This week, I am in Las Vegas for [Edge 2016], IBM's Premiere IT Infrastructure conference of the year. Here is my recap of breakout sessions on Day 2.
Introducing IBM FlashSystem A9000 and A9000R: Grid Architecture Designed for the Hybrid Cloud
Tomer Carmeli, IBM Offering Manager for the A9000 and A9000R, presented. Both models offer data-at-rest encryption, snapshots, remote mirroring, and data footprint reduction through a combination of pattern removal, data deduplication and hardware-assisted Real-time Compression. The capacities below assume a 5.26:1 reduction ratio.
The A9000 is an 8U high pod that can fit into existing racks. It comes in 60TB, 150TB and 300TB effective capacity.
The A9000R includes its own 42U rack. The rack is organized as two to six "grid elements" combined with two InfiniBand switches. Grid elements come in 150TB and 300TB effective capacities, giving you up to a whopping 1.8 PB in a single rack!
Similar to the IBM XIV and IBM Spectrum Accelerate offerings, the A9000 and A9000R support Hyper-Scale features. Hyper-Scale Manager lets you manage up to 144 devices on a single pane of glass. Hyper-Scale Mobility lets you move volumes (LUNs) non-disruptively from one device to another.
Different data compresses or dedupes at different ratios. Your mileage may vary. Unless you are evaluating a JBOF (just a bunch of flash) device, there is a great difference between raw, usable, and effective capacity. Raw capacity can be calculated by the size of each chip, times the number of chips. Usable capacity factors out RAID, and any spare capacity set aside for RAID rebuild and garbage collection. Effective capacity indicates the amount of information that can be stored by taking advantage of data footprint reduction technologies, such as compression or data deduplication.
IBM offers three options:
Measured Estimate -- IBM has a set of data reduction estimator tools that can scan your existing data, and estimate your reduction ratio, within 5 percent accuracy.
Competitive Match -- If a competitor had run their own set of estimator tools, IBM might be able to match the reduction ratio, without repeating the analysis, by just reviewing the competitor results.
"Sight unseen" -- without analyzing your actual data, the reduction ratio is determined by the type of data (DB2, Oracle, SQL Server, etc.), based on experience with similar data at other data centers.
Both A9000 and A9000R models are published at 250 microsecond latency, about 30 times faster than traditional spinning disk, although some workloads actually can run even faster than that. Assuming 5.26:1 reduction, these sell for about $1.50 per effective GB.
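To make the "your mileage may vary" point concrete, here is a minimal Python sketch of how effective capacity and price per effective GB move with the reduction ratio. The function names are my own, and the only inputs taken from the session are the 5.26:1 ratio, the 300TB effective capacity and the roughly $1.50 per effective GB figure; everything else is derived arithmetic, not IBM pricing.

```python
def usable_from_effective(effective_tb: float, reduction_ratio: float) -> float:
    """Usable capacity implied by an effective capacity quoted at a given ratio."""
    return effective_tb / reduction_ratio


def effective_capacity_tb(usable_tb: float, reduction_ratio: float) -> float:
    """Effective capacity is usable capacity multiplied by the reduction ratio."""
    return usable_tb * reduction_ratio


def price_per_effective_gb(price_per_usable_gb: float, reduction_ratio: float) -> float:
    """The same hardware costs less per effective GB as the ratio improves."""
    return price_per_usable_gb / reduction_ratio


QUOTED_RATIO = 5.26            # ratio assumed in the session
QUOTED_PRICE_EFFECTIVE = 1.50  # approximate USD per effective GB at that ratio

# Back out what that implies per usable GB (an inference, not a list price).
implied_price_usable = QUOTED_PRICE_EFFECTIVE * QUOTED_RATIO
usable_tb = usable_from_effective(300, QUOTED_RATIO)

for ratio in (2.0, 3.0, QUOTED_RATIO):
    print(
        f"{ratio:.2f}:1 -> {effective_capacity_tb(usable_tb, ratio):6.1f} TB effective, "
        f"${price_per_effective_gb(implied_price_usable, ratio):.2f} per effective GB"
    )
```

If your data only reduces 3:1, the same configuration holds well under 300TB and costs noticeably more per effective GB, which is why the measured estimate is worth doing.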
Flash Primer - Ready to move from disk storage?
Patricia Crowell, IBM Worldwide FlashSystem Enablement manager, presented. She showed an interesting timeline:
First Solid-State Drive (SSD)
First Flash card, such as for digital cameras
First USB stick
Flash used in specialized IT appliances
Flash for the enterprise - Microsoft and UCSD paper on SSD
In 2012, Microsoft Research and University of California San Diego published ["The Bleak Future of NAND Flash Memory"], 8 pages, by Laura M. Grupp, John D. Davis, and Steven Swanson. Here is an excerpt:
"The technology trends we have described put SSDs in an unusual position for a cutting-edge technology: SSDs will continue to improve by some metrics (notably density and cost per bit), but everything else about them is poised to get worse. This makes the future of SSDs cloudy: While the growing capacity of SSDs and high IOP rates will make them attractive in many applications, the reduction in performance that is necessary to increase capacity while keeping costs in check may make it difficult for SSDs to scale as a viable technology for some applications"
IBM disagreed with this bleak assessment, announced it was investing $1 billion US Dollars into this technology, acquired Texas Memory Systems, and has deployed flash throughout its product line. For the past three years, IBM has been the #1 vendor for Flash storage systems.
Patricia offered the following example. What would it take to run 20 million IOPS? Here's a comparison:
Disk systems 15K rpm
Disk systems 7200 rpm
How to migrate from SONAS to IBM Spectrum Scale/ESS using Active File Manager
Paul Schena, IBM Senior IT Specialist, presented his experiences migrating existing SONAS data to new IBM Spectrum Scale or Elastic Storage Server (ESS) deployments. SONAS is going End-of-Service (EOS) on April 30, 2018, so it is never too soon to start this migration.
Paul gave two different methodologies. The first used Active File Management (AFM):
Set up an IBM Spectrum Scale "Gateway Node" in "Independent-Writer" AFM mode. Paul recommends 10 threads per gateway node.
Issue an AFM pre-fetch, disabling the "cache eviction" feature to ensure data remains. AFM transfers the directory structure, file data including sparse files, Access Control Lists (ACL), and extended attributes.
Define your exports with no-root-squash and move your user mounts to the new systems
Once all the data is moved, convert the cache filesets to regular filesets
Define your quotas, export settings, ILM policies and rules
Decommission the SONAS
The second used Robocopy and Rsync, which may be required if there is a high-latency, long-distance connection that prevents proper AFM connections:
Configure IBM Spectrum Scale CES servers to appropriate NFS and/or SMB protocols
Use Robocopy and/or Rsync as appropriate to move the data to the new system
Decommission the SONAS
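For the Rsync path, a small wrapper that defaults to a dry run can make the copy step less nerve-wracking. This is only a sketch of my own, not something Paul presented: the mount points are hypothetical, and whether ACLs and extended attributes actually survive depends on the ACL type and how both file systems are mounted. The rsync flags themselves are standard: -a (archive), -H (hard links), -A (ACLs), -X (extended attributes), -S (sparse files), --numeric-ids and --dry-run.

```python
import subprocess


def build_rsync_command(source: str, target: str, dry_run: bool = True) -> list:
    """Build an rsync command line that preserves as much metadata as rsync can."""
    cmd = ["rsync", "-aHAXS", "--numeric-ids"]
    if dry_run:
        # Report what would be transferred without changing the target.
        cmd.append("--dry-run")
    # A trailing slash on the source copies its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", target]
    return cmd


def migrate(source: str, target: str, dry_run: bool = True) -> None:
    """Run the copy; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(source, target, dry_run), check=True)


# Hypothetical mount points for the old SONAS export and the new CES export.
command = build_rsync_command("/mnt/sonas/projects", "/mnt/scale/projects")
print(" ".join(command))
```

Run it once as a dry run, review the output, then call migrate(..., dry_run=False) for the real pass, and repeat until the delta is small enough to cut users over.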
Having it all: Hybrid Cloud Storage Services for Block, Power and Backup
Clint Parish, Director of Enterprise Solutions and Services for VSS, and Marc The'berge, Business Development for Supermicro, co-presented this session.
VSS offers POWER8-based Cloud services. They consider themselves "boutique" with POWER8 servers, able to run AIX, IBM i and Linux on POWER applications, but not at the scale and size of larger x86-based clouds like Amazon Web Services or Microsoft Azure.
For IBM i, they attach to IBM Storwize V7000. For AIX and Linux on POWER, they use IBM Storwize V7000 and/or Supermicro Hyperconverged Appliance, a pre-built system based on IBM Spectrum Accelerate.
Supermicro offers three "tee-shirt sizes": small systems with six nodes, medium with nine nodes, and large with 15 nodes. Unlike other Hyperconverged systems, the ones from Supermicro include a rack, and are pre-cabled with all the Ethernet switches necessary to make a complete solution.
To offer backup services, VSS uses IBM Spectrum Protect with the Supermicro appliances.
In the evening, we were treated to a concert by Train, known for songs like "Meet Virginia", "Hey Soul Sister", "Calling all Angels" and "Drops of Jupiter". They played all of these, plus covered some songs by Led Zeppelin, Journey, Queen and Aerosmith.
This week, I am in Las Vegas for [Edge 2016], IBM's Premiere IT Infrastructure conference of the year. Here is my recap of breakout sessions for Monday, Sep 19, 2016:
How do you store a Zettabyte? IBM and Microsoft Know...
A [Zettabyte] is a million Petabytes, or a billion Terabytes, of data. Most clients I deal with have less than 10 PB of centralized storage in their data center, but there are a few that have much larger data repositories.
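For a sense of scale, the arithmetic is easy to check with decimal (power-of-ten) units. The 10 PB figure is just the typical data center size I mentioned above, used here for illustration.

```python
TB = 10**12  # bytes in a Terabyte (decimal units)
PB = 10**15  # bytes in a Petabyte
ZB = 10**21  # bytes in a Zettabyte

petabytes_per_zb = ZB // PB
terabytes_per_zb = ZB // TB
data_centers_per_zb = ZB // (10 * PB)  # how many 10 PB data centers fit in 1 ZB

print(f"1 ZB = {petabytes_per_zb:,} PB = {terabytes_per_zb:,} TB")
print(f"1 ZB = {data_centers_per_zb:,} data centers of 10 PB each")
```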
Ed Childers, IBM STSM and manager for Tape and LTFS development, and Aaron Ogus, Microsoft Architect, discussed different solutions developed by IBM and Microsoft. IBM's solution has been productized, and is available as IBM Spectrum Scale and IBM Spectrum Archive. Microsoft's solution is not productized, but is being "operationalized" to be used within Microsoft's Azure Cloud.
Not surprisingly, to be able to store a Zettabyte of data, you have to be creative and cost-effective with storage media. The current winner is magnetic tape, which continues to be 20 times less expensive than disk. IBM developed the Linear Tape File System (LTFS) and then shared it with other leading IT vendors. Ed also covered some future storage media developments, from using Macro-molecular strands of DNA, to Phase Change Memory (PCM).
All Flash is not Created Equal - Contrasting IBM FlashSystem with Solid State Drives (SSD)
Many IBM FlashSystem presentations focus on the product, but don't explain the underlying technology, specifically what differentiates IBM FlashSystem from substantially slower competitive alternatives like EMC XtremIO and PureStorage that are based instead on fallible commodity Solid State Drives (SSD).
By working closely with our chip vendor, Micron, IBM was able to improve the write endurance of these Multi-level cell (MLC) chips by 9.4x, and reduce write amplification by 45 percent.
I explained IBM's clever asymmetrical wear-level balancing, heat segregation, read disturb mitigation, voltage level shifting, and health binning, all of which contribute to the performance and reliability of this solution. IBM's innovative Error Correcting Code provides LDPC-like correction strength but at much faster BCH-like latency speed.
This was a popular session. Despite being moved to a much larger room, they still had to turn people away, so I will be repeating this session on Wednesday, 11:00am.
Real-time Compression: Bendigo and Adelaide Bank's Perspective
James Harris, Senior Storage Systems Specialist for [Bendigo and Adelaide Bank], presented his success story with the use of Real-time Compression. Oracle RAC databases got 60-70 percent savings. SQL databases got 70-80 percent savings. VMware VMFS datastores average 50 percent savings. For IBM i, he is getting 60-70 percent savings for SYSBAS, and over 70 percent savings on the rest of his IBM i production data.
As a result, the bank has not had to make any Capital Expenditures (CAPEX) for disk for 2-3 years since they started compressing in 2014.
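Savings percentages and reduction ratios are two ways of stating the same thing, and people often mix them up. Here is a small sketch for converting between them; the function names are mine, and the sample values are the savings James quoted plus the 5.26:1 ratio used for the FlashSystem A9000.

```python
def ratio_from_savings(savings_percent: float) -> float:
    """A 50 percent savings means the data takes half the space, or 2:1."""
    return 1.0 / (1.0 - savings_percent / 100.0)


def savings_from_ratio(ratio: float) -> float:
    """A 2:1 ratio means 50 percent of the space is saved."""
    return (1.0 - 1.0 / ratio) * 100.0


for savings in (50, 60, 70, 80):
    print(f"{savings} percent savings = {ratio_from_savings(savings):.2f}:1")

print(f"5.26:1 = {savings_from_ratio(5.26):.1f} percent savings")
```

Note how quickly the ratio climbs: going from 70 to 80 percent savings moves you from about 3.3:1 to 5:1.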
Storage Options for Big Data and Analytics: IBM FlashSystem or Traditional Disk Systems?
Eric Sperley, IBM Software Defined Storage Architect, presented the basics of Hadoop and the Hadoop File System (HDFS), then explained how IBM Spectrum Scale, when combined with the right tiers of flash and disk technology, could be used to optimize an environment for big data analytics.
The Solutions EXPO is open all day, for people to visit the booths in between sessions. I stopped in for the evening reception. This is a great way to catch up on the latest products, re-connect with some clients or colleagues that I haven't seen in person for awhile, and meet new friends.
Shown here is Angie Welchert, who just started working for IBM a few years ago! I took her around to introduce her to some IBM executives at the Solutions EXPO.
This week, I am in Las Vegas for [Edge 2016], IBM's Premiere IT Infrastructure conference of the year.
General Session - Outthink Status Quo
This week's motto is "Outthink the Status Quo... Before the Status Quo disrupts your business!"
Tom Rosamilia, IBM Senior VP for IBM Systems (and my fifth-line manager), kicked off the event. There are about 5,500 people at this event. He mentioned that just like a picture is worth a thousand words, "a prototype is worth a thousand meetings."
He showed a video of our client "Plenty of Fish" [POF], which is a dating site. They have 100 million members, of which 4 million access their site every day. IBM FlashSystem paid for itself, with an ROI payback period of 2 months.
Jason Pontin, Editor in Chief and Publisher of [MIT Technology Review], mentioned three major areas to watch:
Explosive innovation in Artificial Intelligence (AI), including IBM Watson, machine learning, etc.
Pervasive computing, including augmented reality or virtual reality, what IBM calls Internet of Things (IoT)
Re-writing life, directly editing genomes for healthcare and agriculture
Jason feels there are two major challenges for humans. First, what is the "future of work"? People are no longer working for the same company for their entire career. Rather, they come and go, moving in and out of companies. Second, how will we deliver food and water to the 9.6 billion population expected to exist by 2050, with the added challenge of climate change?
Ed Walsh, IBM General Manager for Storage and Software Defined Infrastructure, presented next. Last year, I was asked to throw my hat in the ring to be the next General Manager of IBM Storage. I was up against some strong competition, and in the end upper management selected Ed Walsh instead. He is a good choice, and I support his efforts.
Matt Cadieux, CIO for [Red Bull Racing], presented on the IT challenges of designing, building and racing Formula One racing cars. They have 21 races per year, and each race has slightly different specifications, forcing Red Bull Racing to break down and rebuild their cars for each race.
Michael Lawley, Senior IT Vice President for [HealthPlan Services], explained how his business grew 300 percent in the past four years. Their workloads are very "spiky", so it is good that they can scale up or down their IT infrastructure 3-4x as needed, within minutes.
Jacob Yundt, CIO for University of Pittsburgh Medical Center [UPMC], explained the importance of genomics as the next frontier of medicine. Genomics allows for more accurate cancer determinations, which helps target specific treatments. They moved from x86-based clusters to those based on Power LC models from IBM. For analytics, they chose IBM Power8 S822L servers with Elastic Storage Server (ESS) and the Hadoop Transparency Layer.
Lastly, Terri Virnig welcomed two technology partners to the stage for some major announcements. First, Jim Totton from Red Hat announced that RHEV v4 (based on Linux KVM) will be available for the POWER platform. Second, Scott Gnau, CTO for [Hortonworks], announced that Hortonworks will run on the POWER platform, as part of the IBM and Hortonworks Open Data Platform [ODP] initiative.
Trends & Directions: The Future of Storage in the Cloud and Cognitive Era
Eric Herzog, IBM Vice President, Product Marketing and Management Software Defined Infrastructure, served as emcee for this session.
Ed Walsh, IBM General Manager for IBM Storage and Software Defined Infrastructure, marveled at IBM's "storied history in storage innovation". He suggested clients modernize and transform their business with IBM's storage portfolio, the broadest in the IT industry.
Clod Barrera, IBM Engineer and the Chief Technical Strategist for IBM Systems Storage, explained that in the past 60 years of disk systems, areal density has improved by a factor of one billion. Unfortunately, that is slowing down, and we won't see such improvements anymore.
Bina Hallman, IBM Vice President, Software Defined Storage Solutions Offering Management, hosted a panel of clients, including:
Bob Osterlin, from [Nuance], which has 5-10 PB of data using IBM Spectrum Scale for voice recognition software.
Rich Spurlock, from [Cobalt Iron], which provides Backup-as-a-Service using IBM Spectrum Protect. Their clients experience an 80 percent reduction in operating expenditures (OPEX) using Spectrum Protect.
Moshe Perez, from [RR Media], which distributes television channels like ESPN and BBC to other countries. They use IBM Spectrum Accelerate to handle the demand peaks, such as the Olympics.
Mike Kuhn, IBM Vice President for Storage Solutions Offering Management, also hosted a panel of clients, including:
Kevin Muha, from [UPMC], managing 13 PB of storage, across a variety of IBM storage devices, including 700 TB of FlashSystem V9000.
Bill Reed, CTO for [Arizona State Land Department], that uses VersaStack with IBM FlashSystem V9000 for geographic information system [GIS] applications. They manage over 9.2 million acres to help fund K-12 schools in Arizona.
Owen Morley, from Plenty of Fish [POF] dating website, evaluated nearly every flash device in the market, and chose IBM FlashSystem. "The one metric that matters is Latency!"
These were the two main keynote sessions on Monday morning. During the rest of the week there will be over 285 storage-related breakout sessions, dozens of labs, and 7 panels.
This week, I am in Las Vegas for [Edge 2016], IBM's Premiere IT Infrastructure conference of the year. In previous years, this conference was held in May, June or July, but this year, it was moved back to September, to coincide with the 60th Anniversary of IBM Disk Systems.
I have arrived safely in Las Vegas, and checked in at Edge 2016 Conference Registration.
This year, the Solutions EXPO opens early, on Sunday with a reception. This gives people a chance to go to booth #330 to make appointments for one-on-one with various IBM Executives!
I was able to catch up with co-workers I have not seen in a while! There is a whole section on IBM storage products such as the IBM DS8888 All-Flash Array, as well as software products like IBM Spectrum Protect and IBM Spectrum Control.
On Monday, my session "All Flash is Not Created Equal: Tony Pearson Contrasts IBM FlashSystem and SSD" has moved from the tiny room to a much larger room "Studio A". There was a lot of demand for this session, so I have agreed to present this again, as a repeat session, on Wednesday.
Edge will be different in many ways this year. The past few years we had separate "Executive Edge" for C-level executives, "Winning Edge" for IBM Business Partners, and "Technical Edge" for server, network and storage administrators.
This year, all 1,000 sessions are combined back into one, but with clever hints in the titles. The words "General Session", "Outthink" or "Cognitive" are used to indicate C-level executive talks. Those that use the terms "Winning" or "Community" target IBM Business Partners, Managed Service Providers and Cloud Service Providers. Those that mention z Systems, POWER servers, or Storage solutions, often adding the term "Deep-Dive", are technical.
(Unlike other sessions that might appeal to one portion of the audience or another, mine are suitable for everyone, from C-level executives and IBM Business Partners to storage administrators. To help people find them under the new naming scheme, I have added "Tony Pearson Presents", or words to that effect.)
About 260 breakout sessions relate to IBM Storage, but there are only 20 or so time slots, so obviously you can't see them all in person.
I strongly suggest you pick about three to five topics per time slot, so that you are not overwhelmed by the dozens of choices during the event. This lets you quickly decide which one to attend during each time slot.
Occasionally, a session might get canceled, postponed, or be so full of attendees that nobody else is allowed in, so having three to five topics selected allows you to choose an alternate.
Here is my schedule for next week at Edge 2016.
Trends & Directions: The Future of Storage in the Cloud and Cognitive Era
All Flash is Not Created Equal: Tony Pearson Contrasts IBM FlashSystem and SSD
MGM Grand - Studio 9
Solution EXPO: Reception
Edge at Night: Poolside Reception and Concert "Train"
Tony Pearson Presents IBM Cloud Object Storage System and Its Applications
MGM Grand - Room 114
The Pendulum Swings Back: Tony Pearson Explains Converged and Hyperconverged Environments
MGM Grand - Room 113
Solution EXPO: Reception
Tony Pearson Presents IBM's Cloud Storage Options
MGM Grand - Room 116
My colleagues Dave Dabney or Adam Bergren will be located at the WW Systems Client Centers Booth 125 of the Solution EXPO.
If you are active in Social Media, consider using the hashtags #IBMedge, #IBMstorage, and #IBMcloud. You can follow me on Twitter; my handle is @az990tony.
For those interested in a one-on-one meeting with me, over breakfast, lunch or dinner, or some other time, I have several slots still available. Fill out a request form on BriefingSource at: [https://briefingsource.dst.ibm.com/]
SAP HANA is an in-memory, relational database management system supported on Linux for x86 and POWER servers. The "HANA" acronym is short for "High-Performance Analytic Appliance" software. By keeping the data in memory, analytics and queries can be performed much faster than from traditional disk repositories.
Server memory, however, is volatile storage, so the data needs to be stored on persistent storage such as flash or disk drives. SAP has certified several configurations, some of which involve IBM Spectrum Scale solutions. I will use the following graphic to explain the three configurations.
Linux on x86-64 with Spectrum Scale FPO
With SAP HANA on Lenovo x86-64 servers, SAP has certified internal flash or disk drives running IBM Spectrum Scale in "File Placement Optimization" (FPO) mode. FPO provides a shared-nothing architecture that matches the SAP HANA architecture. IBM Spectrum Protect can back up this configuration, providing data protection and disaster recovery support.
Linux on POWER with Elastic Storage Server
With SAP HANA on POWER servers, SAP has certified external Elastic Storage Server (ESS). Not only is POWER a better platform than x86-64 for running SAP HANA, but Elastic Storage Server also offers erasure coding that provides excellent rebuild times and storage efficiency.
The ESS is a pre-built system that combines IBM Spectrum Scale software with server and storage hardware. IBM Spectrum Protect can also back up this configuration, providing data protection and disaster recovery support.
Block-level Storage over Storage Area Network (SAN)
Various IBM block-level devices are supported for SAP HANA on both Linux on x86-64 and Linux on POWER. Unfortunately, SAP has certified (to date) only the XFS file system for this configuration. The problem many clients mention about this configuration is the lack of end-to-end backup and disaster recovery. This is solved by the Spectrum Scale configurations in the previous two examples.
Other combinations, such as SAP HANA on POWER with Spectrum Scale FPO, or on x86-64 servers with Elastic Storage Server, are either not SAP-certified, or not directly supported by SAP without their approval.
IBM and SAP have worked closely together for many years, and I am glad to see SAP HANA and IBM Spectrum Scale based solutions continue this tradition.
As we get to larger and larger flash and spinning disk drives, a common question I get is whether to use RAID-5 versus RAID-6. Here is my take on the matter.
A quick review of basic probability statistics
Failure rates are based on probabilities. Take for example a traditional six-sided die, with numbers one through six represented as dots on each face. What are the chances that we can roll the die several times in a row without ever rolling a six? You might think that if there is a 1/6 (16.7 percent) chance to roll a six, then you would be guaranteed to hit a six after six rolls. That is not the case.
# of Rolls
Probability of no sixes (percent)
So, even after 24 rolls, there is more than 1 percent chance of not rolling a six at all. The formula is (1-1/6) to the 24th power.
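If you would rather check this than take my word for it, the formula is a one-liner. The roll counts below are my own picks for illustration.

```python
def prob_no_six(rolls: int) -> float:
    """Probability of never rolling a six in the given number of rolls."""
    return (1 - 1 / 6) ** rolls


for rolls in (1, 6, 12, 24):
    print(f"{rolls:2d} rolls: {prob_no_six(rolls) * 100:.2f} percent chance of no sixes")
```

After six rolls there is still roughly a one-in-three chance of having seen no six at all, and after 24 rolls it is about 1.26 percent.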
Let's say that rolling one to five is success, and rolling a six is a failure. Being successful requires that no sixes appear in a sequence of events. This is the concept I will use for the rest of this post. If you don't care for the math, jump down to the "Summary of Results" section below.
Error Correcting Codes (ECC) and Unrecoverable Read Errors (URE)
When I speak to my travel agent, I have to provide my six-character [Record Locator] code. Pronouncing individual letters can be error prone, so we use a "spelling alphabet".
The International Radiotelephony Spelling Alphabet, sometimes known as the [NATO phonetic alphabet], has 26 code words assigned to the 26 letters of the English alphabet in alphabetical order as follows: Alfa, Bravo, Charlie, Delta, Echo, Foxtrot, Golf, Hotel, India, Juliett, Kilo, Lima, Mike, November, Oscar, Papa, Quebec, Romeo, Sierra, Tango, Uniform, Victor, Whiskey, X-ray, Yankee, Zulu.
Foxtrot Golf Mike Oscar Victor Whiskey
Foxtrot Gold Mine Oscar Vector Whisker
Boxcart Golf Miko Boxcart Victor Whiskey
Having five or so characters to represent a single character may seem excessive, but you can see that this can be helpful when the communications link has static, or background noise is loud, as is often the case at the airport!
If spelling words are misheard, then either (a) they are close enough, like "Gold" for "Golf" or "Whisker" for "Whiskey", that the correct word is known, or (b) they are not close enough, like "Boxcart", which could refer to either "Foxtrot" or "Oscar", but we can at least detect that a failure occurred.
For data transfers, or data that is written, and later read back, the functional equivalent is an Error Correcting Code [ECC], used in transmission and storage of data. Some basic ECC can correct a single bit error, and detect double bit errors as failures. More sophisticated ECC can correct multiple bit errors up to a certain number of bits, and detect most anything worse.
When reading a block, sector or page of data from a storage device, if the ECC detects an error, but is unable to correct the bits involved, we call this an "Unrecoverable Read Error", or URE for short.
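The spelling alphabet analogy can be played out in code. This is only a toy illustration of "correct small errors, detect large ones", not how a real ECC works: I use edit distance between words as a stand-in for the distance between code words, and the threshold of two edits is my own assumption.

```python
CODE_WORDS = [
    "Alfa", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot", "Golf", "Hotel",
    "India", "Juliett", "Kilo", "Lima", "Mike", "November", "Oscar", "Papa",
    "Quebec", "Romeo", "Sierra", "Tango", "Uniform", "Victor", "Whiskey",
    "X-ray", "Yankee", "Zulu",
]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def decode(heard: str, max_correctable: int = 2):
    """Return the corrected code word, or None for a detected but uncorrectable error."""
    scored = sorted((edit_distance(heard.lower(), w.lower()), w) for w in CODE_WORDS)
    best_distance, best_word = scored[0]
    ambiguous = scored[1][0] == best_distance
    if best_distance > max_correctable or ambiguous:
        return None
    return best_word


for message in ("Foxtrot Gold Mine Oscar Vector Whisker",
                "Boxcart Golf Miko Boxcart Victor Whiskey"):
    print([decode(word) for word in message.split()])
```

The first message decodes cleanly to the original record locator. In the second, "Boxcart" is too far from every code word to be corrected, so decode returns None, which is the spoken-word equivalent of an Unrecoverable Read Error.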
Bit Error Rate (BER)
Different storage devices have different block, sector or page sizes. Some use 512 bytes, 4096 bytes or 8192 bytes, for example. To normalize likelihood of errors, the industry has simplified this to a single bit error rate or BER, represented often as a power of 10.
Bit Error Rate per read (BER)
Consumer HDD (PC/Laptops)
Enterprise 15k/10k/7200 rpm
Solid-State and Flash
IBM TS1150 tape
In other words, the chance that a bit is unreadable on optical media is 1 in 10 trillion (1E13), on enterprise 15k drives is 1 in 10 quadrillion, and on LTO-7 tape is 1 in 10 quintillion.
There are eight bits per byte, so reading 1 GB of data is like rolling the die eight billion times. The chance of successfully reading 1 GB on DVD, then, would be (1 - 1/1E13) to the 8 billionth power, or 99.92 percent, or conversely a 0.08 percent chance of failure.
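Here is the same calculation as a short Python sketch. The function name is mine; I use log1p rather than raising (1 - 1/1E13) to a power directly, because subtracting a tiny number from 1 loses precision in floating-point arithmetic, and that matters more as the error rates get smaller.

```python
import math


def prob_read_success(bytes_read: float, bits_per_error: float) -> float:
    """Chance of reading every bit successfully, given a BER of 1 in bits_per_error."""
    bits = bytes_read * 8
    # exp(bits * ln(1 - 1/BER)) equals (1 - 1/BER) ** bits, but is numerically stable.
    return math.exp(bits * math.log1p(-1.0 / bits_per_error))


success = prob_read_success(1e9, 1e13)  # 1 GB from optical media
print(f"success {success * 100:.2f} percent, failure {(1 - success) * 100:.2f} percent")
```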
In this paper, Google had studied drive failure using an "Annual Failure Rate" or AFR. Here are two graphs from this paper:
This first graph shows AFR by age. Some drives fail in their first 3-6 months, often called "infant mortality". Then they are fairly reliable for a few years, down to 1.7 percent, then as they get older, they start to fail more often, up to 8.3 percent.
This second graph factors in how busy the drives are. Dividing the drive set into quartiles, "Low" represents the least busy drives (the bottom quartile), "Medium" represents the median two quartiles, and "High" represents the busiest drives, the top quartile. Not surprisingly, the busiest drives tend to fail more often than medium-busy drives.
Given an AFR, what are the chances a drive will fail in the next hour? There are 8,766 hours per year, so the success of a drive over the course of a year is like rolling the die 8,766 times. This allows us to calculate a "Drive Error Rate" or DER:
Drive Error Rate per hour (DER)
For example, an AFR=3 drive has a 1 in 287,800 chance of failing in a particular hour. The probability this drive will fail in the next 24 hours would be like rolling the die 24 times. The formula is (1-1/287,800) to the 24th power, resulting in a failure rate of roughly 0.008 percent.
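The conversion from AFR to an hourly rate is the dice formula run in reverse: if the chance of surviving the whole year is (1 - AFR), the chance of surviving one hour is that value to the power of 1/8,766. A minimal sketch, with function names of my own choosing:

```python
import math

HOURS_PER_YEAR = 8766  # 365.25 days x 24 hours


def hourly_failure_probability(afr: float) -> float:
    """Convert an annual failure rate (0.03 for AFR=3) to a per-hour probability."""
    # 1 - (1 - afr) ** (1 / HOURS_PER_YEAR), written with log1p/expm1 for precision.
    return -math.expm1(math.log1p(-afr) / HOURS_PER_YEAR)


def failure_probability(afr: float, hours: float) -> float:
    """Probability that one drive fails at some point within the given hours."""
    return -math.expm1(hours * math.log1p(-hourly_failure_probability(afr)))


der = hourly_failure_probability(0.03)
print(f"AFR=3: 1 in {1 / der:,.0f} chance of failing in a given hour")
print(f"AFR=3: {failure_probability(0.03, 24) * 100:.4f} percent chance within 24 hours")
```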
Let's take a typical RAID-5 rank with 600GB drives at 15K rpm, in a 7+P RAID-5 configuration.
During normal processing, if a URE occurs on an individual drive, RAID comes to the rescue. The system can rebuild the data from parity, and correct the broken block of data.
When a drive fails, however, we don't have this rescue, so a URE that occurs during the rebuild process is catastrophic. How likely is this? Data is read from the other seven drives, and written to a spare empty drive. At 8 bits per byte, reading 4200 GB of data is rolling the die 33.6 trillion times. The formula is then (1-1/1E16) to the 33.6 trillionth power, or approximately 0.372 percent chance of URE during the rebuild process.
The time to perform the rebuild depends heavily on the speed of the drive, and how busy the RAID rank is doing other work. Under heavy load, the rebuild might only run at 25 MB/sec, and under no workload perhaps 90 MB/sec. If we take a 60 MB/sec moderate rebuild rate, then it would take 10,000 seconds or nearly 3 hours. The chance that any of the seven drives fail during these three hours, at AFR=10 rolling the DER die (7 x 3) 21 times, results in a 0.025 percent chance of failure.
It is nearly 15 times more likely to get a URE failure than a second drive failure. A rebuild failure would happen with either of these, with a probability of 0.397 percent.
The situation gets worse with higher capacity Nearline drives. Let's do a RAID-5 rank with 6TB Nearline drives at 7200 rpm, in a 7+P configuration. The likelihood of URE reading 42 TB of data, is rolling the die 336 trillion times, or approximately 3.66 percent chance of URE failure. Yikes!
The time to rebuild is also going to take longer. A moderate rebuild rate might only be 30 MB/sec, so that rebuilding a 6TB drive would take 55 hours. The chance that one of the other seven drives fails during these 55 hours, assuming again AFR=10, is 0.462 percent.
This time, a URE failure is nearly eight times more likely than a double drive failure. The chance of a rebuild failure is 4.12 percent. Good thing you backed up to tape or object storage!
The math can be done easily using modern spreadsheet software. The URE failure rate is based on the quantity of data read from the remaining drives, so a 4+P with 600GB drives is the same as 8+P with 300GB drives. Both read 2.4 TB of data to recalculate from parity. The double drive failure rate is based on the number of drives being read times the number of hours during the rebuild. Slower, higher capacity drives take longer to rebuild. However, in both the 15K and 7200rpm examples, a URE failure was 8 to 15 times more likely than a double drive failure.
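For readers who prefer a script to a spreadsheet, the two drivers described above can be sketched in Python. The function names are illustrative, gigabytes are assumed to be 10^9 bytes, and the 60 MB/sec default rebuild rate is just the moderate rate used in the 15K example. The URE formula is written with log1p and expm1 so that the tiny per-bit probability is not lost to rounding.

```python
import math

def ure_prob(drives_read, drive_gb, bits_per_error=1e16):
    """URE chance depends only on the total data read from the surviving drives."""
    bits = drives_read * drive_gb * 1e9 * 8
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))

def drive_hours(drives_read, drive_gb, rate_mb_per_sec=60.0):
    """Exposure to a second failure: drives being read times hours of rebuild."""
    rebuild_hours = drive_gb * 1000.0 / rate_mb_per_sec / 3600.0
    return drives_read * rebuild_hours

for label, drives, gb in [("4+P, 600GB", 4, 600), ("8+P, 300GB", 8, 300)]:
    print(label,
          f"URE {ure_prob(drives, gb) * 100:.3f} percent,",
          f"{drive_hours(drives, gb):.1f} drive-hours exposed")
```

Both configurations read 2.4 TB and so share the same URE probability. If the rebuild rate is also the same for both, the drive-hours of exposure match as well, since twice as many drives each take half as long to rebuild.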
Many of the problems associated with RAID-5 above can be mitigated with RAID-6.
After a single drive fails, any URE during rebuild can be corrected from parity. However, if a second drive fails during the rebuild process, then a URE on the remaining drives would be a problem.
Let's start with the 600GB 15K drives in a 6+P+Q RAID-6 configuration. The chance of a second drive failing is 0.0252 percent, as we calculated above. The likelihood of a URE is then based on the remaining six drives, 3600 GB of data. Doing the math, that is a 0.319 percent chance. So, the chance of a URE causing a RAID-6 rebuild failure is the probability of both occurring, roughly 0.0000806 percent. Far more reliable than RAID-5!
Likewise, we can calculate the probability of a triple drive failure. The likelihood of a second drive failing and then a third drive failing during the rebuild, at AFR=10, works out to 0.00000546 percent.
Combining these, the chance of rebuild failure is 0.0000861 percent.
Switching to 6 TB Nearline drives, in a 6+P+Q RAID-6 configuration, we can do the math in the same manner. The likelihood of URE and two drives failing is 0.0145 percent, and for triple drive failure is 0.00183 percent. Chance of rebuild failure is 0.0163 percent.
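The RAID-6 arithmetic can be sketched in Python as well. The names are illustrative, and the 8766-hour year used to turn AFR into hourly odds is an assumption carried over from the 1-in-287,800 figure. The URE probability is passed in as a parameter; the example uses 0.319 percent for the 3600 GB read from six drives, which is the value that reproduces the 0.0000806 percent product for the 600GB case.

```python
HOURS_PER_YEAR = 8766  # assumption: 365.25-day year

def prob_any_drive_fails(drives, hours, afr_percent=10.0):
    """Chance that at least one of `drives` fails within `hours`."""
    survive_year = 1.0 - afr_percent / 100.0
    # drives x hours hourly rolls, each scaled from the annual survival rate
    return 1.0 - survive_year ** (drives * hours / HOURS_PER_YEAR)

def raid6_rebuild_failure(p_second, p_ure, p_third):
    """A second drive fails, then either a URE or a third drive failure follows."""
    return p_second * p_ure + p_second * p_third

# 600GB 15K drives in 6+P+Q, with a 3-hour rebuild
p_second = prob_any_drive_fails(7, 3)  # any of the seven survivors
p_third = prob_any_drive_fails(6, 3)   # any of the six that remain after that
p_ure = 0.00319                        # URE chance reading 3600 GB, passed in
total = raid6_rebuild_failure(p_second, p_ure, p_third)
print(f"{total * 100:.7f} percent")
```

With these inputs the total is roughly 0.000086 percent. Plugging in 55-hour rebuilds for the 6 TB drives gives a triple drive failure chance of about 0.0018 percent, matching the Nearline figure above.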
Summary of Results
Putting all the results in a table, we have the following:
Drive type | RAID-5 rebuild failure (percent) | RAID-6 rebuild failure (percent)
600GB 15K rpm | 0.397 | 0.0000861
6 TB 7200rpm | 4.12 | 0.0163
Hopefully, I have shown you how to calculate these yourself, so that you can plug in your own drive sizes, rebuild rates, and other parameters to convince yourself of this.
In all cases, RAID-6 drastically reduced the probability of rebuild failure. With modern cache-based systems, the write-penalty associated with additional parity generally does not impact application performance. As clients transition from faster 15K drives to slower, higher capacity 10K and 7200 rpm drives, I highly recommend using RAID-6 instead of RAID-5 in all cases.