Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Systems Client Experience Center in Tucson, Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
The article starts out giving background history of the current mess we are in. Here is an excerpt:
"Throughout most of U.S. history, American high school students were routinely taught vocational and job-ready skills along with the three Rs: reading, writing and arithmetic...
...But in the 1950s, a different philosophy emerged: the theory that students should follow separate educational tracks according to ability...
Ability tracking did not sit well with educators or parents, who believed students were assigned to tracks not by aptitude, but by socio-economic status and race. ...
...The backlash against tracking, however, did not bring vocational education back to the academic core. Instead, the focus shifted to preparing all students for college, and college prep is still the center of the U.S. high school curriculum..."
My father was a mechanical engineer who enjoyed fixing cars and woodworking on the weekends. I had plenty of "vocational training" growing up at home, so there was no need for me to have this in school, which allowed me to focus on getting ready for college.
Nicholas asks legitimate questions at this stage: "So what’s the harm in prepping kids for college? Won’t all students benefit from a high-level, four-year academic degree program?" His initial response is:
"... As it turns out, not really. For one thing, people have a huge and diverse range of different skills and learning styles. Not everyone is good at math, biology, history and other traditional subjects that characterize college-level work.
Not everyone is fascinated by Greek mythology, or enamored with Victorian literature, or enraptured by classical music. Some students are mechanical; others are artistic. Some focus best in a lecture hall or classroom; still others learn best by doing, and would thrive in the studio, workshop or shop floor..."
Hard to argue that people are different, and learn in different ways. Not everyone is meant for college.
"...And not everyone goes to college. The latest figures from the U.S. Bureau of Labor Statistics (BLS) show that about 68 percent of high school students attend college. That means over 30 percent graduate with neither academic nor job skills..."
Here is what I have the most problems with. To think that the 30 percent of high school students who graduate, but do not go to college, have neither academic nor job skills? I disagree with this, as there are many jobs where the academic and job skill training they received in high school is more than adequate. Nicholas then doubled down:
"...But even the 68 percent aren't doing so well. Almost 40 percent of students who begin four-year college programs don’t complete them, which translates into a whole lot of wasted time, wasted money, and burdensome student loan debt. Of those who do finish college, one-third or more will end up in jobs they could have had without a four-year degree. The BLS found that 37 percent of currently employed college grads are doing work for which only a high school degree is required.
It is true that earnings studies show college graduates earn more over a lifetime than high school graduates. However, these studies have some weaknesses. For example, over 53 percent of recent college graduates are unemployed or under-employed. And income for college graduates varies widely by major – philosophy graduates don’t nearly earn what business studies graduates do. Finally, earnings studies compare college graduates to all high school graduates. But the subset of high school students who graduate with vocational training – those who go into well-paying, skilled jobs – the picture for non-college graduates looks much rosier.
Yet despite the growing evidence that four-year college programs serve fewer and fewer of our students, states continue to cut vocational programs..."
There are a lot of successful billionaires who did not complete four years of college: Bill Gates, Steve Jobs, Michael Dell, Henry Ford, and Howard Hughes, just to name a few.
If you feel that the only purpose of attending high school or college is to get job-specific skills, then you are missing out on all the other aspects of those that teach you valuable life lessons, getting along with others, teamwork, communications, and other "soft skills" that aren't necessarily job-specific.
Teenagers entering college are still growing up, trying to figure out what they want to do with their lives, discovering new ideas, new ways of thinking, and networking with people of different backgrounds and cultures.
"...The U.S. economy has changed. The manufacturing sector is growing and modernizing, creating a wealth of challenging, well-paying, highly skilled jobs for those with the skills to do them. The demise of vocational education at the high school level has bred a skills shortage in manufacturing today, and with it a wealth of career opportunities for both under-employed college grads and high school students looking for direct pathways to interesting, lucrative careers. Many of the jobs in manufacturing are attainable through apprenticeships, on-the-job training, and vocational programs offered at community colleges. They don’t require expensive, four-year degrees for which many students are not suited..."
The skills shortage is real, but until employers are willing to pay people for what they're worth, the situation will not be resolved. The free market has a way to fix skills shortages. High demand raises salaries, and causes people to invest in high school and college education in part to vie for these positions. That is in part why medical doctors are paid so much.
"...The modern workplace favors those with solid, transferable skills who are open to continued learning. Most young people today will have many jobs over the course of their lifetime, and a good number will have multiple careers that require new and more sophisticated skills..."
A few years ago, I was hosting clients for dinner in Tucson. The sales rep had brought his daughter and her roommate along, as there was a shooting at their college campus and classes were canceled for the week. The daughter asserted, "In 18 months, I will no longer have to learn anything again. I will be done with school." Her roommate chimed in, "Ha! I am a year ahead of you, and only six months away from that!"
I was the bearer of bad news. "Ladies," I said, "you will have to get used to learning new things the rest of your lives." The highest ranking client at the table overheard me, and she reiterated, "Ladies, that is probably the best advice I have heard in a while. I suggest you heed it carefully."
A big part of high school and college education is to teach you how to learn on your own. Learn to read, search out information, take measurements, gather data, make plans, and ask the right questions. These are skills that are useful in a wide variety of careers.
Nicholas concludes with:
"...Just a few decades ago, our public education system provided ample opportunities for young people to learn about careers in manufacturing and other vocational trades. Yet, today, high-schoolers hear barely a whisper about the many doors that the vocational education path can open. The “college-for-everyone” mentality has pushed awareness of other possible career paths to the margins. The cost to the individuals and the economy as a whole is high. If we want everyone’s kid to succeed, we need to bring vocational education back to the core of high school learning."
I agree the educational system in the United States is broken, but I am not sure I agree with everything that Nicholas writes in this article.
How do you define success? For some, it is based on their salary, or perhaps revenue they helped close for their company.
For others, their family life and the flexibility to handle work/life issues might be more important.
Still others look for certifications and awards from official agencies.
As a side gig, I sometimes do bartending on the weekends. Typically, these are for weddings or corporate parties.
I took weeks of bartender training and passed a three-hour exam to become state-certified to do so in Arizona. We Arizonans take our liquor seriously! If you think about it, bartending is just a notch below being a Pharmacist dispensing other drugs.
Surprisingly, some of my patrons will be condescending, "Don't you wish you can do more with your life than be a bartender?"
I am also a certified "Laughter Yoga" instructor, and am called in at times to substitute for other instructors. Again, I took formal training and was certified to do so.
Again, some of my students will ask, "Don't you wish you could do more with your life than be a yoga instructor?"
In both cases, I would respond, "Dude, I earn six figures, and am happy to meet new people every week, how about you?" This usually shuts them up!
(For those interested, here are [my top 10 posts] which served as the basis of the interview!)
I am happy to be recognized externally and within IBM for my success as a blogger. Since I started blogging over 10 years ago, I have helped close over $4 Billion USD in revenue for IBM, written five books on IBM Storage, mentored dozens of other successful bloggers, and presented to thousands of clients at conferences, workshops and briefings.
Well, it's Tuesday again, and you know what that means? IBM Announcements! I am here in New York for the exciting news!
(FCC Disclosure: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for the IBM z14 mainframe and DS8880 Storage System.)
In support of the [IBM z14] mainframe announcement, IBM has also disclosed R8.3 enhancements for the DS8880 Storage System. Here is a quick recap:
New Tier-1 Flash Capacities available for HPFE Gen2 drawers
IBM introduces a new Tier-1 flash card with 3.84 TB capacity. In the past, IBM DS8880 only supported Tier-0 cards that support 10 Drive Writes per Day (10 DWPD), with capacities of 400, 800, 1600 and 3200 GB. The Tier-1 flash card only handles 1 DWPD, and such devices are often dubbed "Read-Intensive", but it can actually handle about 90 percent of most production workloads.
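To make the DWPD ratings concrete, here is a minimal sketch of the arithmetic: the daily write budget is simply capacity multiplied by the DWPD rating. The capacities and ratings come from the paragraph above; the function name is my own, chosen for illustration only.

```python
# Sketch: daily write budget implied by a Drive Writes Per Day (DWPD) rating.
# Capacities and ratings are from the post; the function name is illustrative.

def daily_write_budget_gb(capacity_gb: float, dwpd: float) -> float:
    """Return how many GB can be written per day without exceeding the rating."""
    return capacity_gb * dwpd

tier0_gb = daily_write_budget_gb(3200, 10)  # largest Tier-0 card, 10 DWPD
tier1_gb = daily_write_budget_gb(3840, 1)   # 3.84 TB Tier-1 card, 1 DWPD

print(f"Tier-0 3200 GB card: {tier0_gb:,.0f} GB written per day")
print(f"Tier-1 3840 GB card: {tier1_gb:,.0f} GB written per day")
```

In other words, the Tier-1 card is rated for one full overwrite of its capacity per day, which is why it suits workloads that read far more than they write.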
zHyperLink™ drastically reduces the latency between the IBM z14 mainframe and the DS8880 storage systems. Traditional FICON paths through SAN switches or directors introduced about 140 to 175 microseconds of latency between systems. This new system is a direct cable, with 20 microsecond latency.
The I/O bays on the DS8880 used for HPFE Gen2 already have zHyperLink ports on them. This direct cable is limited to 150 meters, however, so plan accordingly.
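For a rough sense of what the latency figures above mean, the sketch below compares them. It assumes, purely for illustration, a single thread that issues one synchronous I/O at a time and is bounded only by the link latency quoted above; device service time and everything else is ignored, so these are not performance predictions.

```python
# Sketch: compare the link latencies quoted in the post.
# Assumption: one outstanding synchronous I/O at a time, bounded only by link latency.

def max_sync_ios_per_second(latency_us: float) -> float:
    """Upper bound on back-to-back synchronous I/Os for a given latency in microseconds."""
    return 1_000_000 / latency_us

FICON_LATENCY_US = (140, 175)   # range quoted for FICON through SAN switches or directors
ZHYPERLINK_LATENCY_US = 20      # figure quoted for the direct zHyperLink cable

for ficon_us in FICON_LATENCY_US:
    ratio = ficon_us / ZHYPERLINK_LATENCY_US
    print(f"FICON at {ficon_us} us is {ratio:.2f}x the zHyperLink latency "
          f"({max_sync_ios_per_second(ficon_us):,.0f} vs "
          f"{max_sync_ios_per_second(ZHYPERLINK_LATENCY_US):,.0f} I/Os per second)")
```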
Transparent Cloud Tiering
IBM already announced Transparent Cloud Tiering to IBM Bluemix, IBM Cloud Object Storage and the IBM TS7760 virtualization engine in R8.2.3 release. The new Release 8.3 of DS8880 now adds support for Amazon S3, providing yet another choice for where to migrate data sets to. IBM also adds replication, allowing the data set to be migrated to two separate target locations, for added availability, much like writing to separate ML2 tape cartridges.
Cascading FlashCopy is a feature that has existed for a while now on IBM XIV and SAN Volume Controller platforms, so this is just a port of that concept over to the DS8880 microcode. Now, a FlashCopy target can become the source of a follow-on FlashCopy request. You can make copies of copies. This applies to both the volume and data set level functions.
Why would anyone do this? Well, you might suspend your application at midnight and create a clean FlashCopy of a 24-by-7 ever-changing database. Then in the following morning, workers who need a static "midnight version" of the database now can use this as their source and perform additional FlashCopy requests for their own needs.
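The "copies of copies" idea can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class and function names are hypothetical, and the copy is a plain dictionary copy rather than the copy-on-write and background-copy machinery a real storage system uses.

```python
# Toy model of cascading point-in-time copies. Names are hypothetical;
# a real FlashCopy does not duplicate data eagerly like this.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Volume:
    name: str
    blocks: dict = field(default_factory=dict)
    source: Optional["Volume"] = None  # the volume this one was copied from

def flashcopy(source: Volume, target_name: str) -> Volume:
    """Create a point-in-time copy; the result may itself be used as a source."""
    return Volume(target_name, dict(source.blocks), source)

def lineage(vol: Volume) -> list:
    """Walk back from a copy to the original volume."""
    chain = [vol.name]
    while vol.source is not None:
        vol = vol.source
        chain.append(vol.name)
    return chain

prod = Volume("prod_db", {0: "midnight data"})
midnight = flashcopy(prod, "midnight_copy")   # clean copy taken at midnight
prod.blocks[0] = "changed at 9am"             # production keeps changing
report = flashcopy(midnight, "report_copy")   # copy of the copy, next morning

print(report.blocks[0])
print(" <- ".join(lineage(report)))
```

The second copy still sees the midnight contents, even though production has moved on, because its source is the midnight copy rather than the live database.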
IBM DS8880 MES Support
MES is an abbreviation for "Miscellaneous Equipment Specification", one of the many Three Letter Acronyms [TLA] where knowing what the words stand for doesn't help. In short, an MES is a formal supported option to upgrade a piece of hardware that is already installed and running at a client location. IBM will offer MES to upgrade existing DS8880 systems to have the additional HPFE Gen2 drawers, and to upgrade the I/O bays to support zHyperLink connections.
(Final note: you might notice the change in upper and lower case. The IBM z14 (lower case) refers to the specific mainframe model, consistent with its predecessors the z13 and z13s, but the family name "IBM z Systems" has been shortened to "IBM Z®" (upper case). IBM Storage Systems and IBM POWER Systems were already upper case, so the mainframe guys just wanted to follow suit. I suspect "IBM i" will remain lower case, however.)
Well, it's Tuesday again, and you know what that means? IBM Announcements!
IBM Elastic Storage Server
Replacing the older "GSn" and "GLn" models, IBM announces the "Second Generation" GSnS and GLnS models (the second "S" stands for Second Generation). The "n" continues to refer to the number of storage drawers. All of these have a pair of POWER8 servers to drive amazing performance at a low price point.
The "GSnS" models are based on smaller 2U, 24-drive storage drawers, with 3.84 and 15.36 TB Tier-1 Read-intensive Solid-State Drives (SSD). The "GLnS" models are based on larger 5U, 84-drive storage drawers, with 4TB, 8TB and 10TB nearline (7200 rpm) spinning disk.
These new models have the latest IBM Spectrum Scale software pre-installed.
In addition to IBM's two existing Hyperconverged offerings--IBM Spectrum Accelerate for x86 servers, and IBM Spectrum Scale for x86, POWER and z Systems servers--IBM Power Systems now offers a third option. This integrated offering combines Nutanix's Enterprise Cloud Platform software with IBM Power Systems™ hardware to deliver a turnkey hyperconverged solution that targets critical workloads in large enterprises.
Nutanix is offered and will be defaulted/required on these Power® servers only:
While "Hyperconvergence" is still fairly new, and only about 1 percent of data centers have deployed this new technology, I am glad that IBM is a leader in this space with multiple offerings across both x86 and POWER systems platforms.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here is my recap of the sessions on the morning of Day 5, the last day of the conference.
Integrating IBM Storage in Container Environments
Dr. Robert Haas, IBM CTO Storage for Europe, presented IBM Storage for Docker containers. These are different from containers in IBM Cloud Object Storage, and different from the Container Pools used in Spectrum Protect.
Robert gave an overview of IBM Spectrum Conductor, part of the IBM Software Defined Infrastructure (SDI) Spectrum Compute family of software products. The goal is to analyze large amounts of data, access these data efficiently, and protect the data, results and insights as intellectual property.
IBM Spectrum Compute comes in several offerings. IBM Spectrum LSF (Load Sharing Facility) manages long-running batch jobs for modeling, design and simulations. IBM Spectrum Symphony provides low-latency for risk analytics in the financial services sector. IBM Spectrum Conductor comes in two flavors. Conductor for Spark (CFS) manages Spark analytics. Conductor for Containers (CFC) handles Docker and Kubernetes containers.
Docker is the run-time platform. While there are other container run-time platforms like RKT and LXD, Docker is clearly the marketshare leader, growing 40 percent per year.
Statistics from the latest DockerCon2016 conference showed the most popular use cases and workloads for Docker. What can run in Docker: Lots of applications can be "containerized", including Redis, MongoDB, PostgreSQL, OracleDB, Java, to name a few. Docker is well established in enterprises, including service providers, healthcare, insurance and financial services, public sector, and technology firms.
Kubernetes, Mesos and Docker/Swarm are a layer above, as orchestrators. Spectrum Conductor for Containers uses Kubernetes and other open source tools to coordinate activity. Orchestrators restart failed applications, and can scale up or scale down the number of instances as needed. Orchestrators can manage groups of applications across clusters, both on premises and in off-premises Cloud.
From a storage perspective, containers access storage like bare-metal operating systems, bypassing all of the layers normally associated with bloated Virtual Machine hypervisors. This also eliminates the need for the single root I/O virtualization (SR-IOV) that VMs use to compensate.
Persistent storage can be isolated, so that containers cannot see the files of other containers. This provides multi-tenancy.
Internal persistent storage (directory on host file system). However, if you move a container from one host to another, you may lose access to this internal storage.
External volume, manually mounted.
Volume driver plug-in REST API that automatically mounts it.
The fourth method is preferred. Plug-ins are available for IBM Spectrum Scale, GlusterFS, Portworx, Rancher Convoy, RexRay, and Contiv. The start-up Flocker went out of business last year.
The Docker hosts can attach to IBM Spectrum Scale in all of its supported offerings, including POSIX, NFS and SMB protocol. Containerized applications can move from one Docker host to another, and continue to access the IBM Spectrum Scale namespace.
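To show roughly what a volume driver plug-in does, here is a minimal in-memory sketch of the request handling. The endpoint names (/Plugin.Activate, /VolumeDriver.Create, /VolumeDriver.Mount and so on) follow Docker's volume plug-in protocol, but everything else is an assumption for illustration: the class name and mount root are hypothetical, there is no HTTP server, and nothing is actually mounted.

```python
# Sketch of a Docker volume plug-in's request handling, kept in memory.
# Hypothetical names: InMemoryVolumeDriver, MOUNT_ROOT. No real mounts occur.
import json

MOUNT_ROOT = "/mnt/example-volumes"  # hypothetical path, not from the post

class InMemoryVolumeDriver:
    def __init__(self):
        self.volumes = {}  # volume name -> number of active mounts

    def handle(self, path: str, body: str) -> dict:
        """Dispatch one plug-in call and return the JSON-style response."""
        req = json.loads(body) if body else {}
        if path == "/Plugin.Activate":
            return {"Implements": ["VolumeDriver"]}
        name = req.get("Name", "")
        if path == "/VolumeDriver.Create":
            self.volumes.setdefault(name, 0)
            return {"Err": ""}
        if name not in self.volumes:
            return {"Err": f"volume {name!r} not found"}
        if path == "/VolumeDriver.Mount":
            self.volumes[name] += 1
            return {"Mountpoint": f"{MOUNT_ROOT}/{name}", "Err": ""}
        if path == "/VolumeDriver.Unmount":
            self.volumes[name] = max(0, self.volumes[name] - 1)
            return {"Err": ""}
        if path == "/VolumeDriver.Remove":
            del self.volumes[name]
            return {"Err": ""}
        return {"Err": f"unsupported call {path}"}

driver = InMemoryVolumeDriver()
print(driver.handle("/Plugin.Activate", ""))
print(driver.handle("/VolumeDriver.Create", json.dumps({"Name": "pgdata"})))
print(driver.handle("/VolumeDriver.Mount", json.dumps({"Name": "pgdata"})))
```

The point of the plug-in approach is that the container engine makes these calls itself when a container starts, so nobody has to mount the external volume by hand on each host.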
IBM has created the "Ubiquity Volume Service" that provides a consistent API for Docker and Kubernetes. This will use IBM Spectrum Control Base Edition to support IBM Spectrum Scale, Spectrum Accelerate, Spectrum Virtualize and DS8000 storage systems. For IBM Spectrum Scale, volumes are mapped to iSCSI volumes, filesets or directories. For other devices, volumes are mapped to block LUNs. Ubiquity is publicly available on GitHub.
Enterprise Applications for IBM Cloud Object Storage
Andy Kutner, IBM Cloud Architect, presented the various options available for NAS gateways that can front IBM Cloud Object Storage.
Ctera offers NAS gateways, and Endpoint agents for backup and Enterprise File Sync & Share (EFSS). This vendor targets Remote Office/Branch Office (ROBO) and small NAS consolidation with less than 60 TB per office. IBM is a reseller of Ctera, so you can get both Ctera and IBM COS from the same IBM sales rep.
Nasuni offers a global file system, accessible from any device, smartphone, tablet or desktop. They are focused on taking out EMC and NetApp NAS solutions. Performance at the edge, combined with capacity in the client's chosen Cloud (including IBM Cloud Object Storage or IBM Bluemix). Infinite snapshots replace backups, offering RPO of 1 minute for Disaster Recovery. Their global file system "UniFS" offers file locking.
Panzura focuses on Cloud Integrated NAS, File Distribution, and Collaboration. This can help eliminate "islands of storage". The File Distribution can be any type of file, but was originally designed for Media and Entertainment, such as videos. Collaboration employs EFSS features for workgroup shared file folders, such as CAD/CAM or engineering blueprints.
IBM Spectrum Scale can provide NFS and SMB access to files, and then move colder, less active data to IBM Cloud Object Storage, using the Transparent Cloud Tiering feature. Spectrum Scale offers WAN caching across locations.
IBM COS now offers a native NFS v3 interface. This allows read/write NFS access, with S3 API read of the same content. Each file is mapped to a single object.
This is targeted for large scale archive, static-and-stable data, NFS-based backup software, and applications going through the transition from file-based to object-based. This is not intended for multi-site collaboration or primary NAS replacement. Regardless of the number of geographically dispersed IBM COS sites, the NAS can run on only one or two sites initially.
To provide NFS v3 support, IBM introduces new F5100 File Accessers, which talk to an IBM COS Accesser, which in turn acts on specific Vaults in the storage pools. The file-to-object mapping metadata is replicated on-premises across three File Accessers, and optionally replicated asynchronously to a second site for High Availability. The S3 API has read access to each file by file name, or by Object URI.
Initially, the "File Accesser" is only available as a pre-built system, not as software-only.
There was not enough time to cover other solutions, including Avere, NetApp AltaVault, or Open Source S3FS.
This was a great event, just the right size, between 1,500 and 2,000 attendees. Similar IBM Technical University events coming up later this year:
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Thursday evening, we had the "Meet The Experts" sessions. There were four: Storage, Power Systems, z/OS, and a fourth one focused on z/VM and Linux on z Systems. I was on the expert panel for Storage.
Mo McCullough was the emcee. Special thanks to Shelly Howrigon for her help with this event.
(Disclaimer: Do not shoot the messenger! We had a dozen or so experts on the panel, representing System Storage hardware, software and services. I took notes, trying to capture the essence of the questions, and the answers given by the various IBM experts. The answers from individual IBMers may not reflect the official position of IBM management. I leave out any references to unannounced plans or products. Where appropriate, my own commentary will be in italics.)
When will IBM offer a single pane of glass management for all of its IBM storage products?
IBM is working hard on this. Our strategy is to focus on IBM Spectrum Control as the primary answer. We have extended support across block, file and object, with support for IBM Spectrum Scale and IBM Cloud Object Storage System. We have also provided plug-ins for VMware, Cisco UCS Director, and OpenStack Horizon, for those who prefer those management systems instead.
What we really need are REST APIs!
Good point. IBM already has some REST APIs for the DS8000, XIV and Spectrum Protect. Now that IBM has a browser-based GUI across its entire product line, it is our strategy to offer REST APIs across our product line as well.
What is the next generation of ProtecTIER Data Deduplication going to look like?
IBM is focused on providing "data deduplication" for backup workloads directly through IBM Spectrum Protect backup software. IBM continues to sell IBM ProtecTIER.
(Virtual Tape Libraries like IBM ProtecTIER and Dell EMC Data Domain were created to handle the fact that many backup software products back then were designed only for tape drives and libraries. A VTL was disk that pretended to be a tape library. Now that IBM Spectrum Protect, NetBackup, Commvault, and all of the other modern backup products write natively to disk, object storage or Cloud services, there really isn't a need for VTL products any more.)
Why does IBM bother with an all-Flash version of DS8000 when it already has IBM FlashSystem?
Different products for different workloads. IBM DS8000 offers unique support for z System mainframe FICON attachment and 520-byte block support for IBM i. IBM also offers all-Flash Elastic Storage Server, all-Flash SVC and Storwize products, that complement the IBM FlashSystem product line.
We like how XIV can hot-enable encryption, even with existing data on it. Why doesn't DS8000 offer this?
Two separate implementations. At the time IBM DS8000 encryption was designed, it was decided that the client needed to enable encryption before writing any data.
Will we see a spinning disk version of the FlashSystem A9000?
Flash is now less expensive than spinning disk, so I don't see why IBM would go backwards. The future is Flash.
We would like Spectrum Control to manage our Dell EMC Isilon
Yes, we have heard that from others. We are working on extending our third party support. Send in your cards and letters to help us prioritize. Or, better yet, submit a "Request For Enhancement" (RFE).
The difference between Tier 0 (Write Endurance) flash and Tier 1 (Read Intensive) flash is confusing. Are there any plans in the IT industry to simplify this?
No, if anything it will get worse. Today, IBM's Tier 0 is 10 Drive Writes Per Day (DWPD), and Tier 1 is 1 DWPD. Other SSD drives offer 2, 3, 5, 10, 15 and 25 DWPD. As people buy more Flash, and less disk, expect more differentiation in this area.
We would like to tune Easy Tier on the Storwize products
Understood. IBM typically implements new features on the DS8000 platform first, then rolls them over to Spectrum Virtualize. The ability to influence allocation order, pin or avoid tiers, and use an application API to influence placement is already in DS8000.
What will the future of Storwize look like?
We don't have enough time to cover that in this meeting.
Recently, you raised the maximum Storwize FlashCopy background copy rate from 64 MB/sec to 2 GB/sec, but is that realistic?
The setting provides the background task a target "grains per second" to try to achieve. It may not be possible depending on your configuration and the number of concurrent tasks. Your Storwize may be so busy with background activity that it won't take host I/O.
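As a rough illustration of how a copy rate translates into a "grains per second" target, the sketch below does the conversion. The grain sizes (64 KiB and 256 KiB) and the use of binary units are my assumptions for the example, not figures given by the panel, and the function name is illustrative.

```python
# Sketch: convert a background copy rate into a grains-per-second target.
# Assumptions: grain sizes of 64 KiB and 256 KiB, and binary units (1 MB = 1024 KiB).

KIB_PER_MIB = 1024

def grains_per_second(rate_mib_per_s: float, grain_kib: int) -> float:
    """Grains the background task must copy each second to sustain the rate."""
    return rate_mib_per_s * KIB_PER_MIB / grain_kib

OLD_MAX_MIB_S = 64          # previous maximum: 64 MB/sec
NEW_MAX_MIB_S = 2 * 1024    # new maximum: 2 GB/sec

for grain_kib in (64, 256):
    old = grains_per_second(OLD_MAX_MIB_S, grain_kib)
    new = grains_per_second(NEW_MAX_MIB_S, grain_kib)
    print(f"{grain_kib} KiB grains: {old:,.0f}/sec at the old maximum, "
          f"{new:,.0f}/sec at the new maximum")
```

Either way, the new maximum asks the background task for 32 times as many grains per second, which is why the setting is a target rather than a guarantee.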
We have been giving you our wishlist, but are there any questions the IBM experts have for the audience?
Yes, are there any clients being asked to secure storage against Ransomware and insider threats from disgruntled employees?
(Several hands went up, and we collected their names to have further discussions.)
How should we assign business value to data?
IBM Spectrum Virtualize allows you to assign metadata tags to files, so that these can be used to drive different policies.
(The process of assigning business value is often called "Data Rationalization" and is part of ILM, BC/DR, and Data Governance efforts.)
I am concerned that AES 256 encryption is not good enough now that there is Quantum Computing.
It will be decades before Quantum Computing will be good enough to break these codes.
Will Blockchain drive huge or unique storage requirements?
No. The entries are small. You are appending small transactions to the end of existing ledgers. Nothing unique or different.
Were there any topics not adequately covered at this conference?
IBM didn't have much to offer for Spectrum Compute family of software, the Software Defined Infrastructure (SDI) that runs on both x86 and POWER systems. This should be done under the POWER brand, but many clients use Spectrum Compute with x86 servers. Ironically, Spectrum Compute products are managed under the Storage division, since Spectrum Compute and Spectrum Storage work well together.
We would like Storwize's clever NPIV to be implemented in all of the other IBM arrays, starting with DS8000.
That probably won't happen, as they are different architectures. Whereas Storwize and the rest of IBM Spectrum Virtualize family were designed for nodes to fail, and take their ports down with them, the DS8000 has independent I/O bays that continue to run independent of either POWER8 node. Likewise, FlashSystem 900 has similar separation between the FCP adapters and the processing nodes.
Can we have consistent licensing across the entire IBM Spectrum Virtualize set of products, please?
We have a task force to investigate this, and will gladly add your name to the list for input and feedback.
While the conference continues Friday morning, for many attendees, this was the last event.
IBM Spectrum Scale was formerly called GPFS and has been around since 1998. I am glad it was renamed, as GPFS suffered from "guilt by association" with other file systems, AFS, DFS, XFS, ZFS, and so on.
Spectrum Scale does so much more. It supports volume, file and object level access; supports POSIX standards for Windows, AIX and Linux; supports Hadoop and Spark with a 100 percent compatible HDFS Transparency Connector; and supports NFS, SMB and iSCSI protocols, as well as OpenStack Swift and Amazon S3 object-based access.
Initially designed for video streaming and High Performance Computing (HPC), IBM has extended its reach to work in a variety of workloads across different industries. More than 5,000 production systems are running at client locations.
IBM Spectrum Protect solution design: Server, Deduplication and Disaster Recovery decisions
Dan Thompson, IBM Storage Software Technical Sales Specialist, presented this session.
To make it easier to deploy, IBM Spectrum Protect now has a set of tested "blueprints" that are organized into small, medium and large. Find the one that fits your needs, and it will tell you exactly how the server should be configured. Dan recommends having a "test system" to try out new releases of IBM Spectrum Protect.
For multiple server configurations, Dan recommends adopting a standard naming convention, and to make use of Enterprise Configuration and server-side Client Option Sets. You may want to consider discrete instances for special non-backup functions, like library manager or Operations Center hub server, which allows you to upgrade more aggressively without affecting your backup clients.
If you plan to run multiple Spectrum Protect instances on the same VMware host, set the DBmemPercent to avoid having DB2 consume all of the memory, which will interfere with other Spectrum Protect instances.
For clustered servers, IBM supports Active/Passive, Active/Active, Many/One, and Many/Few configurations. You can mix and match these as needed.
For data spill remediation, consider NIST 800-88 data shredding. This depends on the type of storage media used.
IBM Spectrum Protect for Data Retention, formerly called System Storage Archive Manager (SSAM), offers Non-erasable, Non-Rewriteable (NENR) enforced immutability protection. (This used to be called Write-Once-Read-Many or WORM for short, but since WORM applies only to tape and optical media, and IBM Spectrum Protect now supports Flash, Disk, Object Storage and Cloud repositories, IBM has adopted the term NENR instead). Third party KPMG has certified that IBM Spectrum Protect for Data Retention meets, to their satisfaction, the requirements for SEC 17a-4 regulations.
When sizing your server, Dan recommends that you always "over-size" it and grow into it. Use the published "Performance Optimization Guide" to help. Monitor the server and storage using OS and device specific monitoring, in combination with IBM Spectrum Protect reports.
If you are still on BC Tiers 1 or 2, transmitting tapes to a remote vaulting facility or secondary data center, consider upgrading to BC Tier 3 at least. This can be done via electronic vaulting to an Automated Tape Library (ATL), Virtual Tape Library (VTL) or IBM Cloud Object Storage, or a Cloud service provider such as IBM Bluemix or Amazon Web Services. This can be supplemented using DB2 HADR for the IBM Spectrum Protect database.
While Spectrum Protect server can run bare-metal or as a VM, the VM instance will not have support for FCP-based tape or Virtual Tape Library. Many people are moving off tape, especially VTL, and using native Disk, Directory or Cloud container pools instead.
Lastly, take advantage of the fact that Operations Center can view all Spectrum Protect servers across all locations from a single place.
Enabling Mission Critical NoSQL workloads using IBM trillions of operations technology
TJ Harris, from the IBM Storage CTO office, and Scott Brewer, FlashSystem Team Lead, co-presented this session.
They gave a background on NoSQL, the most popular being MongoDB. The IT industry estimates that NoSQL will grow at a 38 percent CAGR from 2015 to 2020.
The problem occurs when NoSQL applications go through a full file system stack to work with low-latency devices like Flash, especially when the writes are small, often just a few dozen bytes to 100 KB. Fortunately, IBM Research has created the "Trillions of Operations" project to explore ways to reduce the software stack, and make use of the NVMe protocol.
The top three challenges for NoSQL deployments are: (a) Cost, (b) Data management and retention, and (c) Data relevancy.
To enable innovation, MongoDB offers a "Storage Engine API" that allows others to compete in this space. Currently MMAP v1 and WiredTiger are supported. IBM Research implemented its "Trillions of Operations" project as a plug-in to this API, optimized for high rates of data ingest. Compared to Facebook's RocksDB, IBM was 14x faster on writes, and 2.1x faster on reads.
Another challenge is coordinating backups and disaster recovery when applications mix traditional RDBMS with these new NoSQL databases.
The week is nearly over, and I can see the light at the end of the tunnel. Everyone had a great time at last night's event at the Universal City Walk and Blue Man Group.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here is my recap of the sessions on the morning of Day 4.
Configurable IBM Spectrum Scale
Kent Koeninger presented IBM Spectrum Scale software, which Kent refers to as "Configurable Spectrum Scale" (or CSS for short), as opposed to the pre-built system known as Elastic Storage Server (ESS).
Why choose CSS versus ESS? Lower entry price. You can start with just two single-socket servers and a drawer of disk.
IBM Spectrum Scale was formerly called IBM General Parallel File System (GPFS). Many who tried earlier versions of GPFS found it difficult to configure, because it only had a command line interface. Now, Spectrum Scale has a fully-functional GUI, and clients have been able to install and configure Spectrum Scale in just 30 minutes!
How big can Spectrum Scale grow? As much as your budget can afford! With an architecture that can support YottaBytes of data and 900 quintillion files, you won't hit any limits anytime soon.
There are some unique capabilities of ESS not available in CSS. For example, ESS offers Spectrum Scale Native RAID (erasure coding) with fast rebuild times, and ESS is certified for SAP HANA. You can mix CSS and ESS in any combination in the same Spectrum Scale environment to create a "data lake" for mixed workloads.
A good use case for Spectrum Scale, either CSS or ESS, is backup. Kent explained why it is an excellent option to store backups with enterprise backup software such as IBM Spectrum Protect or Commvault.
VersaStack - Hybrid Cloud like no other
This session was jointly presented by Chris Vollmar, IBM Storage Architect, and Brent Anderson, Cisco Global Consulting Systems Engineer. IBM and Cisco have been partners for more than 25 years.
VersaStack combines Cisco UCS x86 servers, Cisco Nexus and MDS switches, and IBM FlashSystem or Spectrum Virtualize storage.
What if you have a SAN Infrastructure built entirely from IBM b-type or Brocade-based switches? Cisco supports their SAN switches for this, but nobody has tested VersaStack in this combination, and UCS Director does not manage this combination, so IBM does not support this. Instead, for this situation, IBM recommends doing external connection via Ethernet, or using direct-attach configurations.
The Cisco Validated Design process involves four months of testing, and gives you a bulletproof process to deploy the solution.
There is a difference between Cisco UCS Manager and UCS Director. UCS Manager is available at no additional charge, but only manages the Cisco x86 servers. UCS Director is an optional, separately priced product, and manages Cisco servers, Cisco networking, and IBM Spectrum Virtualize storage.
Brent explained the benefits of UCS Management through policies and profiles.
Chris covered Cisco CloudCenter, which the Cisco team shortens to just "C3". IBM Spectrum Copy Data Management can be used to move snapshots of data between on-premises and off-premises Cloud to help in Hybrid Cloud configurations.
How to Design an IBM Spectrum Scale solution
Tomer Perry, IBM Spectrum Scale I/O Development, presented this session.
For those who want to bring up a quick IBM Spectrum Scale environment to play around with, you can do this in as little as 30 minutes. But to design a mission critical deployment, additional requirements may need to be addressed. You may need to consult with not just storage admins, but also application owners, network admins and security personnel.
Large companies have hundreds or thousands of applications, so Tomer recommends grouping these into "Workload families", based on data set types, access patterns and performance requirements. For NAS take-out, 80 percent of NAS I/O is "get attribute" that can easily be served directly from cache memory.
For each workload family, you may need to decide on snapshots, quotas, namespace (bind mounts, symlinks, etc.), security (ACL, encryption), estimated capacity, replication BC/DR, backup and ILM requirements.
Unless this is a completely greenfield deployment, the existing infrastructure needs to be evaluated. This includes the LAN and WAN network topology, name resolution (DNS), time services (NTP), Authentication (AD, LDAP, NIS, Keystone), Keyserver (IBM SKLM), Monitoring and Migration requirements.
Tomer suggests designing the environment in this order: Cluster, File System, Storage Pools, Fileset, Replication, and finally Monitoring.
Generally, you need three NSD servers per cluster. For those licensing Spectrum Scale Standard Edition by the socket, you may be tempted to put everything into one big cluster. The new capacity-based Spectrum Scale Data Management Edition eliminates that concern, so Tomer recommends having separate compute clusters and storage clusters, connected by cross-cluster mount. All nodes in a cluster are considered an "ssh" administration domain.
A single Spectrum Scale namespace can support up to 256 file systems. There are various reasons to have multiple file systems: block size, backup/recovery, snapshot, quotas, and cross-cluster isolation. If a file system gets corrupted, it will not affect other file systems. In an internal test, an "fsck" on a file system with 1 billion files and 1 PB of data took only 30 minutes to repair.
Storage Pool design can separate metadata from content, and workloads can be separated to different storage media. With ILM, HSM and TCT, you can move colder data to Cloud, Object Storage, Spectrum Protect or Spectrum Archive.
Filesets are tree branches for each file system. IBM Spectrum Scale supports both dependent and independent filesets. Filesets can be used for Non-erasable, Non-Rewriteable (NENR) Immutability, policies, quotas, and snapshots. Consider using a fileset instead of carving off a new file system.
Spectrum Scale offers both synchronous and asynchronous replication. For Synchronous, the ReadReplicaPolicy can be set to default, local or fastest. For Asynchronous, there are a variety of AFM modes (Read-only, Local-Update, Single-Writer, Independent-Writer, and Disaster Recovery). You may need to decide if your AFM gateways are dedicated or collocated. You will need to tune your TCP buffers for WAN performance to get the RPO you desire.
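For the TCP buffer tuning mentioned above, a common general networking rule of thumb (not something specific to this session or to Spectrum Scale) is to size buffers to at least the bandwidth-delay product of the WAN link. Here is a minimal Python sketch of that arithmetic. The 1 Gbps link speed and 50 msec round-trip time are hypothetical values chosen only for illustration.

```python
def bandwidth_delay_product_bytes(bandwidth_mbps: float, rtt_ms: float) -> float:
    """Bytes that can be in flight on a link of the given speed and round-trip time."""
    bits_in_flight = bandwidth_mbps * 1_000_000 * (rtt_ms / 1000)
    return bits_in_flight / 8


# Hypothetical WAN link: 1 Gbps with a 50 msec round trip.
bdp = bandwidth_delay_product_bytes(bandwidth_mbps=1000, rtt_ms=50)
print(f"Bandwidth-delay product: {bdp / 1_000_000:.2f} MB")
```

A TCP buffer smaller than this figure cannot keep the link full, so replication throughput, and with it the achievable RPO, would be capped by the buffer rather than by the network.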
The nice thing about IBM solutions is that you can start small, and grow big. In all of these examples above, IBM offers sizes to match nearly any IT budget.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the sessions of Day 3.
Ethernet-only SANs -- Myth or Reality?
Anuj Chandra, IBM Advisory Engineer, presented an excellent overview of Ethernet-based SANs. He started with a quick history of Ethernet, starting with Robert Metcalfe's original drawing for his concept.
In the past, Ethernet was used for email and message transfer, and so dropped packets were tolerated. However, with the use of Ethernet for SANs, many standards have been adopted to make Ethernet networks more robust. These meet requirements for Flow Control, Congestion management, low latency, data integrity and confidentiality, network isolation, and high availability.
These standards are known as IEEE 802.1Q "Data Center Bridging", including 802.1Qbb Priority Flow Control, 802.1Qaz Enhanced Transmission Selection, and 802.1Qau Congestion Notification. There is also the IETF Transparent Interconnection of Lots of Links (TRILL) to replace Spanning Tree Protocol (STP). All of these features are negotiated between the server and storage endpoints. Ethernet that supports these new standards is often referred to as "Converged Ethernet" since it handles both traditional email/message traffic as well as SAN data traffic.
In addition to 1GbE and 10GbE, we now have 2.5, 5, 25, 40, 50, and 100 Gb Ethernet speeds. By 2020, Anuj estimates over half of all Ethernet ports will be 25 GbE or faster. Amazingly, some of these can work on existing 10BASE-T cables.
Anuj also covered Remote Direct Memory Access (RDMA), and the RDMA-capable Network Interface Cards (RNIC) that support them. In one chart, shown here, Anuj explained Infiniband, RDMA over Converged Ethernet (RoCE) and RoCE v2, and Internet Wide Area RDMA Protocol (iWARP).
While many of these enhancements were intended for Fibre Channel over Ethernet (FCoE), the beneficiary has been iSCSI. Now there is iSCSI Extensions for RDMA (iSER), which takes even more advantage of these changes, and can work with Infiniband, RoCE or iWARP. All of these networks can also be used as the basis for NVMe over Fabrics (NVMe-oF).
Ethernet is the backbone of Cloud usage, and IBM is well positioned to take advantage of these new networking technologies.
Digital Video Surveillance solutions for extended video evidence protection
Dave Taylor, IBM Executive Architect for Software Defined Storage solutions, presented this session on Digital Video Surveillance (DVS).
Most video surveillance is either analog-based, going to standard VHS tapes, or file-based. Sadly, security guards that watch live camera feeds lose their attention span after 22 minutes.
There are an estimated 72 million cameras globally, with 1.5 million more every year.
City governments spend 57 percent of their budget on "public safety". This can include body cams for police departments. Taser International, now called AXON, dominates the body-cam market.
City budgets may not be prepared to store all of this video content into a cloud that complies with Criminal Justice Information Services (CJIS) standards. These Cloud services tend to be more expensive, as the videos must be treated as evidence, tamper-proof, and with appropriate chain of custody.
DVS is not just storing movies. IBM offers Intelligent Video Analytics. It is important to be able to derive insight and actionable response.
Storage capacity adds up quickly. A standard 1080p (1920 by 1080 pixel) camera generates 2.92 GB per hour, 70 GB per day, and over 2 TB per month. If you have 1,000 cameras, that's over 2 PB of data per month.
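To check the arithmetic, here is a minimal Python sketch that starts from the 2.92 GB per hour figure quoted in the session. It assumes continuous recording, a 30-day month, and decimal units (1 TB = 1,000 GB); those assumptions are mine, not Dave's.

```python
GB_PER_HOUR = 2.92  # per-camera rate for 1080p, as quoted in the session


def camera_storage_gb(cameras: int, days: int, gb_per_hour: float = GB_PER_HOUR) -> float:
    """Total GB written by `cameras` recording around the clock for `days` days."""
    return cameras * days * 24 * gb_per_hour


per_day = camera_storage_gb(cameras=1, days=1)
per_month = camera_storage_gb(cameras=1, days=30)
fleet_month = camera_storage_gb(cameras=1000, days=30)

print(f"One camera: {per_day:.1f} GB/day, {per_month / 1000:.2f} TB/month")
print(f"1,000 cameras: {fleet_month / 1_000_000:.2f} PB/month")
```

Under these assumptions the numbers line up with the session: about 70 GB per day and a little over 2 TB per month per camera, which scales to a little over 2 PB per month for 1,000 cameras.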
For xProtect servers running Windows, the Tiger Bridge Connector can be used to move the video files to either IBM Spectrum Scale or IBM Cloud Object Storage.
Deep Dive into HyperSwap for Active-Active applications and Disaster Recovery
Andrew Greenfield, IBM Global Engineer for Storage, explained the different ways HyperSwap is implemented across the IBM storage portfolio.
For IBM DS8000, HyperSwap is based on Metro Mirror synchronous replication. In the event that the primary DS8000 fails, the host server can automatically re-direct all I/O to the secondary DS8000. This is often referred to as "High Availability" (HA), and in some cases can serve as Disaster Recovery.
For IBM Spectrum Virtualize products, including SAN Volume Controller (SVC), FlashSystem V9000, Storwize V7000 and V5000 products, as well as Spectrum Virtualize sold as software, the implementation is different.
Previously, SVC offered Stretched Clusters, which put one node in one site, and a second node at another site, which allows for an Active/Active configuration. Unfortunately, the nodes in FlashSystem V9000 and Storwize are "connected at the hip", effectively bolted together, so putting separate nodes in different locations was not possible. To solve this, IBM developed HyperSwap that allows one node-pair to replicate across sites to another node-pair in the same Spectrum Virtualize cluster.
However, even though it is called "HyperSwap", it is not implemented in any way similar to the DS8000 method. Instead, Spectrum Virtualize uses the Global Mirror with Change Volumes to replicate data between sites.
IBM Storage and VMware Integration
This session was co-presented by Brian Sherman, IBM Distinguished Engineer, and Steve Solewin, IBM Corporate Solutions Architect.
For nearly two decades, IBM has been a "Technology Alliance Partner" with VMware. To provide consistent integration to all the features and functions of VMware, IBM Spectrum Control Base Edition (SCBE) is provided at no additional charge for IBM DS8000, XIV, FlashSystem and Spectrum Virtualize products.
SCBE is downloadable as an RPM for RedHat Enterprise Linux (RHEL), and can run bare-metal or as a VM.
For those using Hyper-Scale Manager, it will automatically install a special A-line-only version of SCBE, which will only manage the A-line products (FlashSystem A9000, FlashSystem A9000R, XIV and Spectrum Accelerate).
Storage admins can define "storage services" that can be assigned to vCenter. This allows VMware admins to allocate storage in self-service mode.
After the meetings were over, IBM had a special event at the Universal City Walk to enjoy some drinks, food, and conversation, and to watch Blue Man Group.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the afternoon sessions of Day 2.
IBM Spectrum Protect deep dive into Container Storage Pools
Ron Henkhaus, IBM Certified Consulting IT Specialist, presented the new Spectrum Protect concept of "Container Pools" that can either be "Directory Pools" on SAN or NAS-based disk storage, or "Cloud Pools". Container pools can contain deduplicated and non-dedupe data.
Ron cautioned that directory pools should not be placed on the same file system as your Spectrum Protect database or logs. Also, best practice for any directory pool is to assign it an "overflow" pool, which can be any non-directory pool, such as disk, tape or cloud container.
Cloud pools can use OpenStack Swift, V1 Swift, Amazon S3 protocol, Amazon Web Services, IBM Bluemix, or IBM Cloud Object Storage. You can pre-define the vaults and buckets in the configuration.
For off-premises Cloud pools, the data is encrypted by default. For other container pools, encryption is optional. Performance to Cloud pools has been improved by using "accelerator storage", basically a disk cache to collect data before sending over to the Cloud pool. Backups to Cloud pools can reach 8 TB per hour. Restore rates vary from 500 to 1500 GB per hour.
Container Pools were designed for the new "Deduplication 2.0" feature introduced in version 7. Traditional Dedupe 1.0 to Device Class FILE is still available, but not recommended.
Version 7.1.6 changed the compression algorithm from LZW to LZ4. In all cases, Spectrum Protect performs these actions in this order: deduplication, compression, encryption. Data that is encrypted by the Spectrum Protect client is therefore not deduped.
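To see why the order matters, here is a toy Python sketch showing that encrypting data first leaves nothing for deduplication or compression to find. This is not how Spectrum Protect is implemented: the fixed 4 KB chunks, the SHA-256 fingerprints, the use of zlib as a stand-in for LZ4, and the `toy_encrypt` function (which is not secure) are all my own assumptions for illustration.

```python
import hashlib
import os
import zlib

CHUNK = 4096  # fixed-size chunks keep the sketch simple


def split(data: bytes) -> list[bytes]:
    """Cut data into fixed-size chunks."""
    return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]


def unique_chunks(data: bytes) -> int:
    """Count distinct chunks by SHA-256 fingerprint, as a dedup index might."""
    return len({hashlib.sha256(c).digest() for c in split(data)})


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Illustration only, NOT secure."""
    nonce = os.urandom(16)
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


block = bytes(range(256)) * 16   # one 4096-byte chunk
payload = block * 8              # eight identical chunks

plain_unique = unique_chunks(payload)
cipher = toy_encrypt(payload, b"demo-key")
cipher_unique = unique_chunks(cipher)
compressed_plain = len(zlib.compress(payload))
compressed_cipher = len(zlib.compress(cipher))

print(f"Unique chunks before encryption: {plain_unique}, after: {cipher_unique}")
print(f"Compressed size before encryption: {compressed_plain}, after: {compressed_cipher}")
```

The eight identical plaintext chunks collapse to a single fingerprint, while the encrypted copy looks like random data, so every chunk is distinct and the compressor cannot shrink it. That is the general reason for doing deduplication and compression before encryption.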
The "Protect Storage Pool" command can replicate a directory pool to either a remote directory pool or Cloud pool. In addition to this remote replication, you can copy a directory pool to tape to offer air-gap protection against ransomware. Such tapes are considered part of the "Copy Container Pool". In the event of directory pool corruption, the data can be repaired from either replication or tape.
IBM Aspera can now be used for replication, using SSL and 128-bit AES encryption. If your latency is greater than 50 msec, and you have more than 0.5 percent packet loss, Aspera might help. This is available for Linux on x86 platforms running v7.1.6 or higher.
For existing customers, IBM Spectrum Protect allows you to convert your FILE, VTL and TAPE device class pools to directory or Cloud pools.
Introduction to IBM Cloud Object Storage (powered by Cleversafe)
In 2015, IBM acquired Cleversafe, recognized as the #1 Object Storage vendor. Their flagship product was officially renamed to the IBM Cloud Object Storage System, which some abbreviate informally as IBM COS. IBM offers the IBM Cloud Object Storage System in three ways: as software, as pre-built systems, and as a cloud service on IBM Bluemix (formerly known as SoftLayer).
Since then, IBM has been busy integrating IBM COS into the rest of the storage portfolio. I explained how IBM COS can be used for all kinds of static-and-stable data, but is not suited for frequently changed data, such as Virtual machines or Databases.
Object storage can be accessed via NFS or SMB NAS-protocols using a gateway product, like IBM Spectrum Scale, or those from third-party partners like Ctera, Avere, Nasuni or Panzura. It can also be used as an alternative to tape for backup copies, and is already supported by the major backup software like IBM Spectrum Protect, Commvault Simpana, or Veritas NetBackup.
While other cloud service providers have offered data storage in the cloud, this new offering also allows hybrid configurations with geographically dispersed erasure coding.
Unlike RAID which protects against the loss of one or two drives, erasure coding can protect against a larger number of concurrent failures. For example, using an Information Dispersal Algorithm (IDA) of "7+5", where seven pieces of data are encoded on twelve independent disks, the system can lose up to five disk drives without losing any data.
Combining this with a Geographically Dispersed Configuration across three or more sites means that you can lose an entire data center, four of the twelve disks, and still have instant full access to all of your data from eight drives at the other locations. In the graphic, you see two on-premises data centers combined with a third location in IBM SoftLayer.
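The "7+5" example comes down to simple counting, which this minimal Python sketch spells out. The width and threshold come from the example above; spreading the twelve slices evenly across three sites is assumed to match the "four of the twelve disks" case, and the function name is my own.

```python
WIDTH = 12      # total slices written (7 + 5)
THRESHOLD = 7   # slices needed to rebuild the data
SITES = 3       # data centers holding an equal share of the slices


def readable(slices_lost: int, width: int = WIDTH, threshold: int = THRESHOLD) -> bool:
    """Return True if enough slices survive to reconstruct the data."""
    return width - slices_lost >= threshold


max_tolerated = WIDTH - THRESHOLD                 # drives that can fail at once
slices_per_site = WIDTH // SITES                  # slices lost if a site goes down
left_after_site_loss = WIDTH - slices_per_site    # slices still reachable
raw_per_usable = WIDTH / THRESHOLD                # raw capacity per unit of data

print(f"Tolerates {max_tolerated} concurrent drive failures")
print(f"Losing one site removes {slices_per_site} slices, leaving {left_after_site_loss}")
print(f"Still readable after a site loss: {readable(slices_per_site)}")
print(f"Raw capacity per unit of usable data: {raw_per_usable:.2f}x")
```

With a full site down, the eight remaining slices are one more than the seven required, so the data stays readable even if one additional drive fails elsewhere.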
New Generation of Storage Tiering: Simpler Management, Lower Costs, and Improved Performance
With ever changing amounts of storage, it is hard to find metrics that are consistent year to year. Fortunately, I found I/O density to be the metric on which to focus my efforts, armed with real data from Intelligent Information Lifecycle Management (IILM) studies done at various clients. From that, I was able to talk about storage tiering on three fronts:
Storage tiering between Flash and disk. IBM FlashSystem and IBM Easy Tier on DS8000 and Spectrum Virtualize family for hybrid Flash-and-disk configurations.
Storage tiering between disk, tape, and Cloud. HSM and Information Lifecycle Management (ILM) on Spectrum Scale, Elastic Storage Server (ESS), Spectrum Archive and IBM Cloud Object Storage System.
Storage tiering automation across your entire environment. IILM studies can help identify a target mix of Tier 0, Tier 1, Tier 2 and Tier 3 storage. IBM Spectrum Storage Suite and the Virtual Storage Center (VSC) can recommend or perform the movement of LUNs to more appropriate tiers, based on age and I/O density measurements.
It's hard to say what the correct sequence of presentations should be. Some thought it might have been better for my talk on IBM Cloud Object Storage System to come before Ron's talk on Cloud container pools, but perhaps hearing Ron first helped drive more interest to my session.