Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson, Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Well, it's Tuesday again, and you know what that means? IBM Announcements! I am here in New York for the exciting news!
(FCC Disclosure: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for the IBM z14 mainframe and DS8880 Storage System.)
In support of the [IBM z14] mainframe announcement, IBM has also disclosed R8.3 enhancements for the DS8880 Storage System. Here is a quick recap:
New Tier-1 Flash Capacities available for HPFE Gen2 drawers
IBM introduces a new Tier-1 flash card with 3.84 TB of capacity. In the past, IBM DS8880 only supported Tier-0 cards rated for 10 Drive Writes per Day (10 DWPD), with capacities of 400, 800, 1600 and 3200 GB. The Tier-1 flash card is rated for only 1 DWPD. Such devices are often dubbed "Read-Intensive", but they can actually handle about 90 percent of most production workloads.
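To put the DWPD ratings in perspective, here is a rough back-of-the-envelope sketch in Python. The capacities and ratings are the ones quoted above. The function name is just mine for illustration, and real endurance depends on more than this simple multiplication.

```python
# Daily write budget implied by a drive's DWPD (Drive Writes Per Day) rating.
# Capacities and ratings come from the post; the helper name is illustrative only.

def daily_write_budget_tb(capacity_tb: float, dwpd: float) -> float:
    """Terabytes that can be written per day while staying within the rating."""
    return capacity_tb * dwpd

tier0_largest = daily_write_budget_tb(3.2, 10)   # 3200 GB Tier-0 card at 10 DWPD
tier1_new = daily_write_budget_tb(3.84, 1)       # 3.84 TB Tier-1 card at 1 DWPD

print(f"Tier-0 3.2 TB card:  {tier0_largest:.2f} TB of writes per day")
print(f"Tier-1 3.84 TB card: {tier1_new:.2f} TB of writes per day")
```

In other words, the Tier-1 card still allows its full capacity to be rewritten once a day, which is why it suits workloads that mostly read.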
zHyperLink™ drastically reduces the latency between the IBM z14 mainframe and the DS8880 storage systems. Traditional FICON paths through SAN switches or directors introduced about 140 to 175 microseconds of latency between systems. This new system is a direct cable, with 20 microsecond latency.
The I/O bays on the DS8880 used for HPFE Gen2 already have zHyperLink ports on them. This direct cable is limited to 150 meters, however, so plan accordingly.
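For a feel of what those latency numbers mean, here is a small sketch. It assumes a single application thread that issues one I/O at a time and waits for each to finish, and it counts only the link latency quoted above, not the storage system's own service time. Treat the results as an upper bound for illustration, not a benchmark.

```python
# Link latencies quoted in the post, in microseconds.
FICON_LOW_US = 140
FICON_HIGH_US = 175
ZHYPERLINK_US = 20

def max_serial_ios_per_second(latency_us: float) -> float:
    """Upper bound on I/Os per second for one thread waiting on each I/O in turn."""
    return 1_000_000 / latency_us

# How many times lower the zHyperLink latency is than the FICON range.
speedup_low = FICON_LOW_US / ZHYPERLINK_US
speedup_high = FICON_HIGH_US / ZHYPERLINK_US

print(f"Latency reduction: {speedup_low:.2f}x to {speedup_high:.2f}x")
print(f"FICON at 175 us:    {max_serial_ios_per_second(FICON_HIGH_US):,.0f} serial I/Os per second")
print(f"zHyperLink at 20 us: {max_serial_ios_per_second(ZHYPERLINK_US):,.0f} serial I/Os per second")
```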
Transparent Cloud Tiering
IBM already announced Transparent Cloud Tiering to IBM Bluemix, IBM Cloud Object Storage and the IBM TS7760 virtualization engine in the R8.2.3 release. The new Release 8.3 of DS8880 now adds support for Amazon S3, providing yet another choice for where to migrate data sets. IBM also adds replication, allowing the data set to be migrated to two separate target locations for added availability, much like writing to separate ML2 tape cartridges.
Cascading FlashCopy is a feature that has existed for a while now on IBM XIV and SAN Volume Controller platforms, so this is just a port of that concept over to the DS8880 microcode. Now, your FlashCopy target can become the source of a follow-on FlashCopy request. You can make copies of copies. This applies to both the volume and data set level functions.
Why would anyone do this? Well, you might suspend your application at midnight and create a clean FlashCopy of a 24-by-7 ever-changing database. Then, the following morning, workers who need a static "midnight version" of the database can use this as their source and perform additional FlashCopy requests for their own needs.
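Here is a toy Python model of the "copies of copies" idea, using simple copy-on-write. This is only my sketch of the concept. It is not how the DS8880 microcode implements FlashCopy, and the class and function names are made up for illustration.

```python
class Volume:
    """Toy volume: holds only the blocks it owns, and reads the rest from its source."""

    def __init__(self, name, source=None):
        self.name = name
        self.blocks = {}       # block number -> data owned by this volume
        self.source = source   # volume this one was copied from, if any
        self.targets = []      # copies taken directly from this volume

    def read(self, block):
        if block in self.blocks:
            return self.blocks[block]
        if self.source is not None:
            return self.source.read(block)   # fall through the cascade
        return b""                           # never written anywhere

    def write(self, block, data):
        # Copy-on-write: preserve the old contents on each direct copy
        # that has not yet captured this block, then overwrite.
        old = self.read(block)
        for target in self.targets:
            if block not in target.blocks:
                target.blocks[block] = old
        self.blocks[block] = data


def flash_copy(source, name):
    """Create a point-in-time copy of source. The copy can itself be copied."""
    target = Volume(name, source=source)
    source.targets.append(target)
    return target


production = Volume("production")
production.write(0, b"midnight data")

midnight = flash_copy(production, "midnight")   # clean copy taken at midnight
production.write(0, b"morning data")            # production keeps changing

analyst = flash_copy(midnight, "analyst")       # copy of the copy
analyst.write(0, b"analyst scratch")            # private changes

print(production.read(0), midnight.read(0), analyst.read(0))
```

The production volume, the midnight copy, and the analyst's copy each keep their own view of block 0, even though the analyst's copy was taken from a copy.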
IBM DS8880 MES Support
MES is an abbreviation for "Miscellaneous Equipment Specification", one of the many Three Letter Acronyms [TLA] where knowing what the words stand for doesn't help. In short, an MES is a formally supported option to upgrade a piece of hardware that is already installed and running at a client location. IBM will offer MES to upgrade existing DS8880 systems to have the additional HPFE Gen2 drawers, and to upgrade the I/O bays to support zHyperLink connections.
(Final note: you might notice the change in upper and lower case. The IBM z14 (lower case) refers to the specific mainframe model, consistent with its predecessors the z13 and z13s, but the family name "IBM z Systems" has been shortened to "IBM Z®" (upper case). IBM Storage Systems and IBM POWER Systems were already upper case, so the mainframe guys just wanted to follow suit. I suspect "IBM i" will remain lower case, however.)
Well, it's Tuesday again, and you know what that means? IBM Announcements!
IBM Elastic Storage Server
Replacing the older "GSn" and "GLn" models, IBM announces the "Second Generation" GSnS and GLnS models. The second "S" stands for Second Generation, and the "n" continues to refer to the number of storage drawers. All of these have a pair of POWER8 servers to drive amazing performance at a low price point.
The "GSnS" models are based on smaller 2U, 24-drive storage drawers, with 3.84 and 15.36 TB Tier-1 Read-intensive Solid-State Drives (SSD). The "GLnS" models are based on larger 5U, 84-drive storage drawers, with 4TB, 8TB and 10TB nearline (7200 rpm) spinning disk.
These new models have the latest IBM Spectrum Scale software pre-installed.
In addition to IBM's two existing Hyperconverged offerings--IBM Spectrum Accelerate for x86 servers, and IBM Spectrum Scale for x86, POWER and z Systems servers--IBM Power Systems now offers a third option. This integrated offering combines Nutanix's Enterprise Cloud Platform software with IBM Power Systems™ hardware to deliver a turnkey hyperconverged solution that targets critical workloads in large enterprises.
Nutanix is offered and will be defaulted/required on these Power® servers only:
While "Hyperconvergence" is still fairly new, and only about 1 percent of data centers have deployed this new technology, I am glad that IBM is a leader in this space with multiple offerings across both x86 and POWER systems platforms.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here is my recap of the sessions on the morning of Day 5, the last day of the conference.
Integrating IBM Storage in Container Environments
Dr. Robert Haas, IBM CTO Storage for Europe, presented IBM Storage for Docker containers. These are different from containers in IBM Cloud Object Storage, and different from the Container Pools used in Spectrum Protect.
Robert gave an overview of IBM Spectrum Conductor, part of the IBM Software Defined Infrastructure (SDI) Spectrum Compute family of software products. The goal is to analyze large amounts of data, access these data efficiently, and protect the data, results and insights as intellectual property.
IBM Spectrum Compute comes in several offerings. IBM Spectrum LSF (Load Sharing Facility) manages long-running batch jobs for modeling, design and simulations. IBM Spectrum Symphony provides low latency for risk analytics in the financial services sector. IBM Spectrum Conductor comes in two flavors. Conductor for Spark (CFS) manages Spark analytics. Conductor for Containers (CFC) handles Docker and Kubernetes containers.
Docker is the run-time platform. While there are other container run-time platforms like RKT and LXD, Docker is clearly the market share leader, growing 40 percent per year.
Statistics from the latest DockerCon2016 conference showed the most popular use cases and workloads for Docker. What can run in Docker: Lots of applications can be "containerized", including Redis, MongoDB, PostgreSQL, OracleDB, Java, to name a few. Docker is well established in enterprises, including service providers, healthcare, insurance and financial services, public sector, and technology firms.
Kubernetes, Mesos and Docker/Swarm are a layer above, as orchestrators. Spectrum Conductor for Containers uses Kubernetes and other open source tools to coordinate activity. Orchestrators restart failed applications, and can scale up or scale down the number of instances as needed. Orchestrators can manage groups of applications across clusters, both on-premises and in off-premises Cloud.
From a storage perspective, containers access storage like bare-metal operating systems, bypassing all of the layers normally associated with bloated Virtual Machine hypervisors. This also eliminates the need for the single root I/O virtualization (SR-IOV) that VMs use to compensate.
Persistent storage can be isolated, so that containers cannot see the files of other containers. This provides multi-tenancy.
Internal persistent storage (directory on host file system). However, if you move a container from one host to another, you may lose access to this internal storage.
External volume, manually mounted.
Volume driver plug-in REST API that automatically mounts it.
The fourth method is preferred. Plug-ins are available for IBM Spectrum Scale, GlusterFS, Portworx, Rancher Convoy, RexRay, and Contiv. The start-up Flocker went out of business last year.
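To show what a volume driver plug-in looks like from the inside, here is a minimal Python sketch of the request handling. The endpoint names and JSON fields follow Docker's volume plugin protocol. Everything else is my assumption for illustration: the mount root path is hypothetical, volumes live only in memory, and nothing is actually mounted. A real plug-in would serve these endpoints over a socket and call the storage system.

```python
import json

MOUNT_ROOT = "/mnt/example-volumes"   # hypothetical path, for illustration only
_volumes = {}                         # volume name -> options, kept in memory


def handle(path, body):
    """Dispatch one plug-in request and return the JSON-serializable reply."""
    req = json.loads(body) if body else {}

    if path == "/Plugin.Activate":
        # Handshake: tell Docker which plug-in interface we implement.
        return {"Implements": ["VolumeDriver"]}

    if path == "/VolumeDriver.Create":
        _volumes[req["Name"]] = req.get("Opts") or {}
        return {"Err": ""}

    if path == "/VolumeDriver.Mount":
        name = req["Name"]
        if name not in _volumes:
            return {"Mountpoint": "", "Err": "no such volume: " + name}
        # A real driver would mount the external storage here.
        return {"Mountpoint": MOUNT_ROOT + "/" + name, "Err": ""}

    if path == "/VolumeDriver.Unmount":
        return {"Err": ""}

    if path == "/VolumeDriver.Remove":
        _volumes.pop(req["Name"], None)
        return {"Err": ""}

    return {"Err": "unsupported request: " + path}


print(handle("/Plugin.Activate", ""))
print(handle("/VolumeDriver.Create", json.dumps({"Name": "dbdata", "Opts": {}})))
print(handle("/VolumeDriver.Mount", json.dumps({"Name": "dbdata", "ID": "abc123"})))
```

Because Docker asks the plug-in for the mount point each time a container starts, the same volume name can follow a container from one host to another.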
The Docker hosts can attach to IBM Spectrum Scale in all of its supported offerings, including POSIX, NFS and SMB protocols. Containerized applications can move from one Docker host to another, and continue to access the IBM Spectrum Scale namespace.
IBM has created the "Ubiquity Volume Service" that provides a consistent API for Docker and Kubernetes. This will use IBM Spectrum Control Base Edition to support IBM Spectrum Scale, Spectrum Accelerate, Spectrum Virtualize and DS8000 storage systems. For IBM Spectrum Scale, volumes are mapped to iSCSI volumes, filesets or directories. For other devices, volumes are mapped to block LUNs. Ubiquity is publicly available on GitHub.
Enterprise Applications for IBM Cloud Object Storage
Andy Kutner, IBM Cloud Architect, presented the various options available for NAS gateways that can front IBM Cloud Object Storage.
Ctera offers NAS gateways, and Endpoint agents for backup and Enterprise File Sync & Share (EFSS). This vendor targets Remote Office/Branch Office (ROBO) and small NAS consolidation that have less than 60 TB per office. IBM is a reseller of Ctera, so you can get both Ctera and IBM COS from the same IBM sales rep.
Nasuni offers a global file system, accessible from any device, smartphone, tablet or desktop. They are focused on taking out EMC and NetApp NAS solutions. Performance at the edge, combined with capacity in the client's chosen Cloud (including IBM Cloud Object Storage or IBM Bluemix). Infinite snapshots replace backups, offering RPO of 1 minute for Disaster Recovery. Their global file system "UniFS" offers file locking.
Panzura focuses on Cloud Integrated NAS, File Distribution, and Collaboration. This can help eliminate "islands of storage". The File Distribution can be any type of file, but was originally designed for Media and Entertainment, such as videos. Collaboration employs EFSS features for workgroup shared file folders, such as CAD/CAM or engineering blueprints.
IBM Spectrum Scale can provide NFS and SMB access to files, and then move colder, less active data to IBM Cloud Object Storage, using the Transparent Cloud Tiering feature. Spectrum Scale offers WAN caching across locations.
IBM COS now offers a native NFS v3 interface. This allows read/write NFS access, with S3 API read of the same content. Each file is mapped to a single object.
This is targeted for large scale archive, static-and-stable data, NFS-based backup software, and applications going through the transition from file-based to object-based. This is not intended for multi-site collaboration or primary NAS replacement. Regardless of the number of geographically dispersed IBM COS sites, the NAS can run on only one or two sites initially.
To provide NFS v3 support, IBM introduces new F5100 File Accessers, which talk to an IBM COS Accesser, which in turn acts on specific Vaults in the storage pools. The file-to-object mapping metadata is replicated on-premises across three File Accessers, and optionally replicated asynchronously to a second site for High Availability. The S3 API can read a file by file name, or by Object URI.
Initially, the "File Accesser" is only available as a pre-built system, not as software-only.
There was not enough time to cover other solutions, including Avere, NetApp AltaVault, or Open Source S3FS.
This was a great event, just the right size, between 1,500 and 2,000 attendees. Similar IBM Technical University events coming up later this year:
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Thursday evening, we had the "Meet The Experts" sessions. There were four: Storage, Power Systems, z/OS, and a fourth one focused on z/VM and Linux on z Systems. I was on the expert panel for Storage.
Mo McCullough was the emcee. Special thanks to Shelly Howrigon for her help with this event.
(Disclaimer: Do not shoot the messenger! We had a dozen or so experts on the panel, representing System Storage hardware, software and services. I took notes, trying to capture the essence of the questions, and the answers given by the various IBM experts. The answers from individual IBMers may not reflect the official position of IBM management. I leave out any references to unannounced plans or products. Where appropriate, my own commentary will be in italics.)
When will IBM offer a single pane of glass management for all of its IBM storage products?
IBM is working hard on this. Our strategy is to focus on IBM Spectrum Control as the primary answer. We have extended support across block, file and object, with support for IBM Spectrum Scale and IBM Cloud Object Storage System. We have also provided plug-ins for VMware, Cisco UCS Director, and OpenStack Horizon, for those who prefer those management systems instead.
What we really need are REST APIs!
Good point. IBM already has some REST APIs for the DS8000, XIV and Spectrum Protect. Now that IBM has a browser-based GUI across its entire product line, it is our strategy to offer REST APIs across our product line as well.
What is the next generation of ProtecTIER Data Deduplication going to look like?
IBM is focused on providing "data deduplication" for backup workloads directly through IBM Spectrum Protect backup software. IBM continues to sell IBM ProtecTIER.
(Virtual Tape Libraries like IBM ProtecTIER and Dell EMC Data Domain were created because much of the backup software back then was designed only for tape drives and libraries. A VTL was disk that pretended to be a tape library. Now that IBM Spectrum Protect, NetBackup, Commvault, and all of the other modern backup products write natively to disk, object storage or Cloud services, there really isn't a need for VTL products any more.)
Why does IBM bother with all-Flash version of DS8000 when it already has IBM FlashSystem?
Different products for different workloads. IBM DS8000 offers unique support for z Systems mainframe FICON attachment and 520-byte block support for IBM i. IBM also offers all-Flash Elastic Storage Server, all-Flash SVC and Storwize products, which complement the IBM FlashSystem product line.
We like how XIV can hot-enable encryption, even with existing data on it. Why doesn't DS8000 offer this?
Two separate implementations. At the time IBM DS8000 encryption was designed, it was decided that the client needed to enable encryption before writing any data.
Will we see a spinning disk version of the FlashSystem A9000?
Flash is now less expensive than spinning disk, so I don't see why IBM would go backwards. The future is Flash.
We would like Spectrum Control to manage our Dell EMC Isilon
Yes, we have heard that from others. We are working on extending our third party support. Send in your cards and letters to help us prioritize. Or, better yet, submit a "Request For Enhancement" (RFE).
The difference between Tier 0 (Write Endurance) flash and Tier 1 (Read Intensive) flash is confusing. Are there any plans in the IT industry to simplify this?
No, if anything it will get worse. Today, IBM's Tier 0 is 10 Drive Writes Per Day (DWPD), and Tier 1 is 1 DWPD. Other SSD drives offer 2, 3, 5, 10, 15 and 25 DWPD. As people buy more Flash, and less disk, expect more differentiation in this area.
We would like to tune Easy Tier on the Storwize products
Understood. IBM typically implements new features on the DS8000 platform first, then rolls them over to Spectrum Virtualize. The ability to influence allocation order, to pin or avoid tiers, and to use an application API to influence placement is already available in DS8000.
What will the future of Storwize look like?
We don't have enough time to cover that in this meeting.
Recently, you raised the maximum Storwize FlashCopy background copy rate from 64 MB/sec to 2 GB/sec, but is that realistic?
The setting provides the background task a target "grains per second" to try to achieve. It may not be possible depending on your configuration and the number of concurrent tasks. Your Storwize may be so busy with background activity that it won't take host I/O.
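To see what the two copy rate ceilings would mean if they were actually achieved, here is some quick arithmetic in Python. The 1 TiB volume and the 256 KiB grain size are my assumptions for illustration, and I treat the quoted rates as 64 MiB/sec and 2048 MiB/sec. As the answer above says, the real rate depends on your configuration and on what else the system is doing.

```python
# Assumptions for illustration only: a 1 TiB volume and a 256 KiB grain size.
VOLUME_MIB = 1024 * 1024
GRAIN_KIB = 256


def copy_seconds(volume_mib, rate_mib_per_sec):
    """Seconds to copy the whole volume if the target rate is sustained."""
    return volume_mib / rate_mib_per_sec


def grains_per_second(rate_mib_per_sec, grain_kib):
    """Target grains per second implied by a given copy rate."""
    return rate_mib_per_sec * 1024 / grain_kib


for rate in (64, 2048):
    minutes = copy_seconds(VOLUME_MIB, rate) / 60
    grains = grains_per_second(rate, GRAIN_KIB)
    print(f"{rate:>5} MiB/sec: {grains:,.0f} grains/sec, {minutes:,.1f} minutes for 1 TiB")
```

At the old ceiling, a full background copy of this hypothetical volume takes hours. At the new one it takes minutes, provided the back-end storage can keep up.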
We have been giving you our wishlist, but are there any questions the IBM experts have for the audience?
Yes, are there any clients being asked to secure storage against Ransomware and insider threats from disgruntled employees?
(Several hands went up, and we collected their names to have further discussions.)
How should we assign business value to data?
IBM Spectrum Virtualize allows you to assign metadata tags to files, so that these can be used to drive different policies.
(The process of assigning business value is often called "Data Rationalization" and is part of ILM, BC/DR, and Data Governance efforts.)
I am concerned that AES 256 encryption is not good enough now that there is Quantum Computing.
It will be decades before Quantum Computing will be good enough to break these codes.
Will Blockchain drive huge or unique storage requirements?
No. The entries are small. You are appending small transactions to the end of existing ledgers. Nothing unique or different.
Were there any topics not adequately covered at this conference?
IBM didn't have much to offer for the Spectrum Compute family of software, the Software Defined Infrastructure (SDI) that runs on both x86 and POWER systems. This should be done under the POWER brand, but many clients use Spectrum Compute with x86 servers. Ironically, Spectrum Compute products are managed under the Storage division, since Spectrum Compute and Spectrum Storage work well together.
We would like Storwize's clever NPIV to be implemented in all of the other IBM arrays, starting with DS8000.
That probably won't happen, as they are different architectures. Whereas Storwize and the rest of IBM Spectrum Virtualize family were designed for nodes to fail, and take their ports down with them, the DS8000 has independent I/O bays that continue to run independent of either POWER8 node. Likewise, FlashSystem 900 has similar separation between the FCP adapters and the processing nodes.
Can we have consistent licensing across the entire IBM Spectrum Virtualize set of products, please?
We have a task force to investigate this, and will gladly add your name to the list for input and feedback.
While the conference continues Friday morning, for many attendees, this was the last event.
IBM Spectrum Scale was formerly called GPFS and has been around since 1998. I am glad it was renamed, as GPFS suffered from "guilt by association" with other file systems, AFS, DFS, XFS, ZFS, and so on.
Spectrum Scale does so much more. It supports volume, file and object level access, supports POSIX standards for Windows, AIX and Linux, supports Hadoop and Spark with a 100 percent compatible HDFS Transparency Connector, and supports NFS, SMB and iSCSI protocols, as well as OpenStack Swift and Amazon S3 object-based access.
Initially designed for video streaming and High Performance Computing (HPC), IBM has extended its reach to work in a variety of workloads across different industries. More than 5,000 production systems are running at client locations.
IBM Spectrum Protect solution design: Server, Deduplication and Disaster Recovery decisions
Dan Thompson, IBM Storage Software Technical Sales Specialist, presented this session.
To make it easier to deploy, IBM Spectrum Protect now has a set of tested "blueprints" that are organized into small, medium and large. Find the one that fits your needs, and it will tell you exactly how the server should be configured. Dan recommends having a "test system" to try out new releases of IBM Spectrum Protect.
For multiple server configurations, Dan recommends adopting a standard naming convention, and making use of Enterprise Configuration and server-side Client Option Sets. You may want to consider discrete instances for special non-backup functions, like library manager or Operations Center hub server, which allows you to upgrade more aggressively without affecting your backup clients.
If you plan to run multiple Spectrum Protect instances on the same VMware host, set the DBmemPercent to avoid having DB2 consume all of the memory, which will interfere with other Spectrum Protect instances.
For clustered servers, IBM supports Active/Passive, Active/Active, Many/One, and Many/Few configurations. You can mix and match these as needed.
For data spill remediation, consider NIST 800-88 data shredding. This depends on the type of storage media used.
IBM Spectrum Protect for Data Retention, formerly called System Storage Archive Manager (SSAM), offers Non-Erasable, Non-Rewriteable (NENR) enforced immutability protection. (This used to be called Write-Once-Read-Many or WORM for short, but since WORM applies only to tape and optical media, and IBM Spectrum Protect now supports Flash, Disk, Object Storage and Cloud repositories, IBM has adopted the term NENR instead.) Third party KPMG has certified that IBM Spectrum Protect for Data Retention meets, to their satisfaction, the requirements of SEC 17a-4 regulations.
When sizing your server, Dan recommends that you always "over-size" it and grow into it. Use the published "Performance Optimization Guide" to help. Monitor the server and storage using OS and device specific monitoring, in combination with IBM Spectrum Protect reports.
If you are still on BC Tiers 1 or 2, transmitting tapes to a remote vaulting facility or secondary data center, consider upgrading to BC Tier 3 at least. This can be done via electronic vaulting to an Automated Tape Library (ATL), Virtual Tape Library (VTL) or IBM Cloud Object Storage, or a Cloud service provider such as IBM Bluemix or Amazon Web Services. This can be supplemented using DB2 HADR for the IBM Spectrum Protect database.
While the Spectrum Protect server can run bare-metal or as a VM, the VM instance will not have support for FCP-based tape or Virtual Tape Library. Many people are moving off tape, especially VTL, and using native Disk, Directory or Cloud container pools instead.
Lastly, take advantage of the fact that Operations Center can view all Spectrum Protect servers across all locations. This can be helpful.
Enabling Mission Critical NoSQL workloads using IBM trillions of operations technology
TJ Harris, from the IBM Storage CTO office, and Scott Brewer, FlashSystem Team Lead, co-presented this session.
They gave a background on NoSQL, the most popular being MongoDB. The IT industry estimates that NoSQL will grow 38 percent CAGR from 2015-2020.
The problem occurs when NoSQL applications go through a full file system stack to work with low-latency devices like Flash, especially when the writes are small, often just a few dozen bytes to 100 KB. Fortunately, IBM Research has created the "Trillions of Operations" project to explore ways to reduce the software stack, and make use of the NVMe protocol.
The top three challenges for NoSQL deployments are: (a) Cost, (b) Data management and retention, and (c) Data relevancy.
To enable innovation, MongoDB offers a "Storage Engine API" that allows others to compete in this space. Currently MMAP v1 and WiredTiger are supported. IBM Research implemented its "Trillions of Operations" project as a plug-in to this API, optimized for high rates of data ingest. Compared to Facebook's RocksDB, IBM was 14x faster on writes, and 2.1x faster on reads.
Another challenge is coordinating backups and disaster recovery when applications mix traditional RDBMS with these new NoSQL databases.
The week is nearly over, and I can see the light at the end of the tunnel. Everyone had a great time at last night's event at the Universal City Walk and Blue Man Group.