Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Systems Client Experience Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
My session on IBM Cloud Object Storage had three sections. First, I covered an overview of what "Object Storage" was in general, how this differs from traditional block or file storage approaches.
Second, I explained what is unique and different of IBM Cloud Object Storage System, formerly called DsNet from Cleversafe. IBM acquired Cleversafe in 2015.
Third, I explained the various applications, use cases and industries that can take advantage of Object Storage.
IBM Storage and the NVMe Revolution
Brian Sherman, IBM Distinguished Engineer for Storage Advanced Technical Services, presented an overview of NVMe, NVMe Over Fabric (NVMeOF) and what IBM is doing in this area.
How to Build a Rockstar Personal Brand
Andrea Edwards, The Digital Conversationalist, is a globally award winning B2B communications professional with more than 20 years' worth of experience from around the globe, including 12 years exclusively in Asia Pacific. IBM has hired her in the Asia Pacific region to train many IBMers in Social Media.
She condensed her normal 5-6 hour training down to a single hour for this event. She explained why building a personal brand was important, how to do it, and why businesses and organizations should encourage their employees to do so.
For example, who has the most influence on most people? Behind friends and family are bloggers. Bloggers are more influential than journalists, religious leaders, celebrities and politicians.
(As the #1 blogger of IBM, I am considered to already have a "rockstar personal brand". I am pleased to see that IBM is taking social media seriously. I have been blogging since 2006, and have influenced over $4 billion US dollars in IBM revenue in the past 11 years.)
IBM Spectrum Virtualize technical updates
Andrew Martin, IBM Spectrum Virtualize Support Architect, presented the last 18 months of enhancements to Spectrum Virtualize, from v7.6.1 introduced in March 2016 to v7.8.1 released earlier this year.
He managed to highlight quite a few enhnacements:
Distributed RAID 5 and RAID 6
Integrated Compresstimator tool
New hardware: SVC, Storwize V7000 Gen2+, Storwize V5000 Gen 2, and 92-drive 5U High Density Expansion Enclosure
N-Port ID Virtualization (NPIV)
Virtualization Over iSCSI
Encryption for Distributed RAID Arrays
64GB Read Cache
Tier 1 Flash Support
Compressed IP Replication
Spectrum Virtualize as Software for Lenovo and SuperMicro servers
Host Clusters and Throttling
Raised limit to 10,000 Volumes
Transparent Cloud Tiering
Storwize Model Conversions
IBM SKLM Support for Encryption
Consistency Protection for Metro and Global Mirror remote-distance replication
Andrew called this a "reverse roadmap", rather than a session that presents where we are going in the next 18 months, he presented where we have been.
Solution Center Reception
Here I am with Morgan Tracey and Jenna Brooker from Computer Merchants, an IBM Business Partner.
Not only were Computer Merchants a sponsor with a booth at the Solution Center, but they also gave a customer testimonial at one of the breakout sessions on how they were able to use IBM Artificial Intelligence to help with their business.
I also spent time at the SuSE booth. SuSE is a distributor of Linux that runs on x86, POWER and IBM Z mainframe systems.
While I was working, Mo took a tour to Phillip Island. On the way, they stopped at Maru to feed kangaroos and take pictures with Koala bears.
At Phillip Island, Mo watched penguins come out of the ocean, waddle up on shore and march to their burroughs. This happens every evening and is one of the top tourist attractions near Melbourne.
Last week, I was in São Paulo, Brazil for IBM Systems Technical University.
Did the resort ask these two security guards to dress up as clowns? No, it turns out these were clowns dressed up as security guards! On other days, they were dressed in drag as housewives, or as Jamaican Rastafari in dreadlocks and tie-dyed tee shirts. Some of the attendees enjoyed their comic relief.
Here is my recap of Day 3 breakout sessions:
Demystifying Transparent Cloud Tiering for DS8000 and DFSMShsm
Ricardo Alan, IBM Client Technical Specialist, covered this recently announced synergy between DS8000 firmware and DFSMShsm, a part of the z/OS operating system for IBM Z mainframes.
(Historical note: I started my career as a software engineer for DFHSM, which was later renamed DFSMShsm, working my way up to lead architect for DFSMShsm, and later as chief architect for DFSMS overall. A good portion of my 19 patents are related to these products.)
Since the 1970s, mainframe clients were able to move less active data from expensive disk storage to lower cost tape media. DFSMShsm would be read data sets into the mainframe processor, chop them up into 16KB blocks, and then write them out to tape, often through an automated tape library.
Transparent Cloud Tiering introduces an alternative option. DFSMShsm now identifies which tracks of data need to be re-located, sends the request to IBM DS8000 storage device, and the IBM DS8000 sends the tracks as objects to the Cloud. Any application that references these data sets would automatically trigger a recall to bring the data back from the Cloud.
This feature is available for the DS8870 and DS8880 models, using the existing Ethernet ports already installed. No additional hardware is required. Enhancements to DFSMShsm will be rolled out via SPEs on z/OS releases. Initially, the system uses OpenStack Swift object protocol, but IBM has plans to support Amazon S3 protocol as well.
Data Migration Challenges and Solutions with IBM Enterprise Storage
Sidney Varoni Jr. presented this session on data migration methods. Data is migrated for three reasons. First, to re-balance across multiple storage arrays. If you bring in a new storage array, you often want to move data from older arrays to balance the workload.
The second reason is to get rid of old hardware altogether, you need to migrate the data to new hardware. With Dell acquisition of EMC, for example, many clients are using tools like TDMF to move data off of EMC and onto IBM DS8000 storage systems. IBM DS8000 storage systems are faster, easier to use and less expensive to operate from a total cost of ownership (TCO) than comparable capacity of EMC VMAX devices.
The third reason is to migrate from one data center to another. The average data center was built 10-15 years ago, and many no longer meet the needs and requirements of newer IT operations. Some clients are building new data centers, while others are moving their data to co-location facilities.
NVMe Over Fabrics: The next evolution in high performance for SSD interfaces is NVMe
Waner Dall Averde, Territory Representative from Brocade, presented this session on NVMe and NVMe Over Fabric (NVMeOF). As a joke, he showed this chart in Japanese.
(Fun Fact: The first Japanese immigrants arrived in Brazil in 1908. Brazil is home to the largest Japanese population outside Japan. Source: Wikipedia)
For the past 20 years, the Advanced Host Controller Interface (AHCI) served as the communication mechanism to send SCSI commands to SAS and SATA disk devices.
Unfortunately, AHCI is now the bottleneck between faster servers and faster Non-Volatile Memory such as Flash and Solid State Drive (SSD) storage devices. It only supports a handful of commands on a single command queue.
NVMe offers a replacement for the SCSI command set. It can support up to 64,000 commands on as many as 64,000 parallel command queues. Designed for 32 Gbps PCIe bus speeds, it is faster than traditional 6 Gbps and 12 Gbps SAS connections, reducing latency by 200 microseconds.
Unfortunately, PCIe cables are limited to just a few inches. PCIe Gen 1 supported 15 inches, PCIe Gen 2 supported 12 inches, and PCIe Gen 3 only 8 inches. To provide greater distances, NVMeOF allows the NVMe command set to be carried over long-distance networks, such as Ethernet, Infiniband or Fibre Channel.
Brocade Gen5 (16 Gbps) and Gen6 (32 and 128 Gbps) Fibre Channel switches and directors already support NVMeOF, and are designed to allow co-existence between NVMe and SCSI commands for smooth transition in mixed environments. Clients can buy their networking gear directly from IBM.
IBM Power Systems Flash Cache Acceleration
Petra Bührer, IBM Offering Manager for Power Systems software, explained recent the performance enhancement called "Flash Cache Acceleration".
This is a feature on POWER8 servers running AIX 7.1 TL4 SP2, AIX 7.2 TL0 SP0 – or higher. By using internal or direct-attach SSD, the operating system can cache most active blocks of data from external storage systems.
While this is certified for use with Oracle, it supports only single-instance databases. Oracle RAC and other active/active configurations are not supported at this time.
The Secret to IBM Disk Encryption - Deep Dive
As if Mo McCullough, one of the event coordinators for this conference, was not busy enough with keeping the conference going, he also gave technical presentations.
With the excitement over the IBM z14 end-to-end encryption announcement, there has been increased demand for everything related to encryption and security.
Unfortunately, I had to leave for the airport before the "Closing Session". The Club Med Lake Paradise resort was 60-90 minutes away from the GRU airport, and rush hour traffic in a city of 12 million people can get really bad.
Last week, I was in São Paulo, Brazil for IBM Systems Technical University.
Instead of separate physical rooms for each breakout session, this event had "virtual rooms". One speaker called it the "Software Defined Stage". Basically, there were five "rooms" in the main ballroom, and another eight rooms in a second ballroom.
Rather than blasting out each speaker's voice over loudspeakers, each speaker spoke softly into a headset microphone. All attendees wore headsets. Rooms 1 through 4 offered real-time translation, so attendees could chose to hear in English or Brazilian Portuguese.
In the other 13 "rooms", local speakers spoke in Brazilian Portuguese, but you still had to wear headsets to avoid speaking louder than the speaker next to you. For many of these, the charts were written in English.
My translators, Luciana and Marilia, explained to me the advantage of this approach. When speakers use English language, those who prefer must hear the real-time translation wore the "headphone of shame" which advertised to all others that an attendee's English proficiency was poor.
Sometimes, those who did not understand English well would not wear their headsets, nod or laugh with other attendees, but fail to understand the message. By forcing everyone to wear headsets, there is no stigma associated, and everyone can discreetly select the language they prefer to listen in.
Here is my recap for the breakout sessions on Day 2:
In this presentation, I gave an overview of interest in Cloud technologies, including OpenStack and RESTful APIs to manage server and storage resources. I then covered IBM Hybrid Cloud Storage configurations in five categories:
Cold storage for data infrequently accessed
Backup and Snapshot storage
Disaster Recovery storage
Daily Operations and Reporting
Special thanks to Chris Vollmar and Brian Sherman for their help in preparing this presentation.
Data Optimization: How to verify your data is being used efficiently
It is hard to believe that it was over 15 years ago that I was the chief architect for the software we now call IBM Spectrum Control. There are a variety of editions and bundles for this product, but my focus on this talk was on the advanced storage analytics found in IBM Virtual Storage Center and IBM Spectrum Control Advanced Edition.
I covered three use cases:
What storage tier to put your workload in, and how to move existing data into a faster or slower tier to meet business requirements and IT budgets.
For steady state environments, how to re-balance storage pools within a single tier to keep things even for optimal performance.
When it is time to decommission storage, how to transform volumes from one storage pool to another without downtime or outages.
Special thanks to Bryan Odom for his help in preparing this presentation.
IBM Hyperconverged Systems powered by Nutanix: Technical Overview
Ricardo Matinata, IBM Senior Technical Staff Member for Linux, KVM and Cloud on POWER, presented the latest IBM CS models for POWER systems that are pre-installed with Nutanix software running their Acropolis Hypervisor (AHV) to run Linux on POWER application virtual machines.
Managing Risks with Thin Provisioning, Compression, and Data Deduplication
This session had four parts. First, an overview of "Data Footprint Reduction" technologies, like compression, data deduplication, space-efficient snapshots and thin provisioning.
Second, a look at how these technologies can get storage administrators in trouble. Much like airlines selling more tickets than seats on the airplane, storage administrators may over-provision based on data reduction estimates, and then suddenly run out of storage capacity.
Third, an overview of IBM FlashSystem A9000 and A9000R products, often referred to as "A9000/R" to cover both as a family. These models offer data footprint reduction for all data.
Finally, I explain how the Hyper-Scale Manager GUI can help with reporting and analytics to avoid these risks. This GUI is available for the FlashSystem A9000/R, as well as XIV Gen3 and Spectrum Accelerate software clusters.
Special thanks to Rivka Matosevich for her help in preparing this presentation.
The Right Flash for the Right Workload
Fabiano Gomes, IBM Client Technical Specialist, presented IBM's portfolio of All-Flash Arrays, from FlashSystem and DS8000F to Elastic Storage Server and Storwize V7000F and V5000F models. Each of these have their own characteristics, which might favor one over the others for particular workloads and use cases.
The day was capped off with a nice evening reception at the pool bar. Bartenders were serving Caiparinhas, a Brazilian cocktail traditionally made sugar cane liquor, sugar and lime, but in this case offered in other flavors, such as pineapple or passion fruit.
Last week, I was in São Paulo, Brazil for IBM Systems Technical University.
Luciana and Marilia
While I speak Spanish fluently, my Brazilian Portuguese is a bit rusty, so I was asked to present in English language, and let these two real-time translators, Luciana and Marilia, speak on my behalf.
A big challenge is that English is a terse language, but Brazilian Portuguese is more verbose. It takes more syllables, and thus more time, to perform real-time translation. I have learned to pause at the end of each sentence to give a chance for my translators to catch up.
Servers (2 syllables)
Servidores (4 syllables)
Storage (2 syllables)
Armazenamento (6 syllables)
In this table, you can see that some technical terms take more syllables in Brazilian Portuguese than English. Often, I heard the local speakers just say "Servers" or "Storage" for convenience.
Here is my recap of breakout sessions on Day 1.
IBM Storage Trends and Directions
Alcides Bertazi, IBM Executive IT Specialist, presented the latest in Storage Trends and Directions.
Introduction to Object Storage and its Applications
This session had three sections. First, I covered an overview of what "Object Storage" was in general, how this differs from traditional block or file storage approaches.
Second, I explained what is unique and different of IBM Cloud Object Storage System, formerly called DsNet from Cleversafe. IBM acquired Cleversafe in 2015.
Third, I explained the various applications, use cases and industries that can take advantage of Object Storage.
IBM Spectrum Copy Data Management for Beginners
Eduardo Tomaz, IBM Client Technical Sales for Software Defined Storage solutions, presented an overview of IBM Spectrum Copy Data Management (CDM), the newest member of the IBM Spectrum Storage family.
IBM Spectrum Protect Update
Rosane Lagnor, IBM Certified IT Specialist - Storage Consultant Lab Services, and her two colleagues co-presented this session on the latest of IBM Spectrum Protect. The review went chronologically, from v7.1.4 introduced in late 2015, all the way to v8.1.1 release, the latest generally available.
(Note: IBM just announced v8.1.2 but is not generally available yet in Brazil.)
I managed to understand the local speakers in their native Brazilian Portuguese language. In many cases, the charts were in English language, so I was able to read in English what I may not have understood was spoken.
Last week, I was in São Paulo, Brazil for IBM Systems Technical University. With over 12 million people, it is the most-populous city in the Americas. Our venue was the Club Med Lake Paradise resort on the outskirts of town. We had about 700 attendees.
We had several local speakers do the opening session. Here is my recap:
Marcelo Porto, IBM General Manager for Brazil
This year, IBM Brazil celebrates 100 year anniversary. This all happened because Valentim Boucas persuaded IBM then-President Thomas Watson, Sr. to approve the establishment of a Rio de Janeiro office for the sale of IBM machines beginning in 1917.
For 100 years now, IBM has thrived with a set of core values. In every era in the past, IBM systems have been perfect for the business needs at the time, from punch cards to personal computers. But what got us here won't get us there in the future. The biggest challenge to transformation is people and culture. We must break the chains that hold us to the past. IBM drives disruption.
To prepare for the future, Marcelo recommended the following. First, learn English, because the English language is the "API of Business". Second, keep a curious mind. Seek out new things to learn. The new world needs skills and expertise in a variety of areas. Third, watch the movie "Hidden Figures", starring the IBM mainframe computer.
IBM Watson computer now speaks and understands Brazilian Portuguese language. Groupo Fleury uses Watson for genomics research. MRV Engineering uses this for chatbots. Mae de deus Hospital uses this for Oncology, as cancer patients now dominate the percentage of patients there. Walmart uses Blockchain to focus on food safety.
IBM Watson is used at Pinacoteca de São Paulo Museum to offer "Voz de Arte", the ability to ask IBM Watson about each painting in handheld smartphone devices. An example of this was available in the Solution Center.
In addition to natural language processing (NLP), IBM Watson can also do image recognition, a task normally only humans could do.
Watson can validate signatures, perform facial recognition at different angles, and even identify shirts, pants and shoes of fashion models in photographs.
Companies and organizations that are unable to transform data into insights and business decisions will fail.
Mauro D'Angelo, IBM Strategy and Business Development for Brazil
Why are companies like Uber and Airbnb successful? Mauro felt that it was because they had a proper Cloud infrastructure combined with the right data architecture.
(In this case, "success" is based on company valuation, often billions of US dollars. However, many of these companies are not profitable, losing millions of dollars in an aggressive effort to gain customers and establish their platform. It might take 12 to 24 months before a new customer becomes profitable.)
The data explosion is driving digital transformation. Cognitive systems must understand natural language, reason, learn and interact with humans. Machine Learning is much like training a puppy. You need to reward good behavior and fix bad behavior, and be patient, as it takes a long time.
In USA, patients asking Doctors for a diagnosis get only 50 percent correct on first consultation. Often, additional doctors or additional tests are needed to finally get correct assessments. In Brazil, it is probably less than 50 percent. Hopefully, Watson will help improve this.
Watson can also detect emotional tone and personality in social media. Is a customer angry? This could help prioritize which customer issues to address first.
Schools have not changed since the days of Aristotle. Mauro showed a picture of a school taken in 1934, and a picture of the same classroom, taken recently, showing it is nearly the same. Students want to learn anytime, anywhere, and from any channel.
At Georgia Tech University, a professor told his engineering students that there were nine "Teacher Assistants" (TAs) available to help answer questions online. One of these was [Jill Watson], which was the IBM Watson computer responding to the students. The students could not tell that Jill was not human!
In traditional schools, a teacher may reach only 50 to 60 students. Compare this to [Khan Academy] that offers video instruction that have had over 1.3 million views!
Frank Koja, IBM Systems Vice President for Brazil
When you buy something over the internet, what is your decision criteria? Often, it is lowest cost. Digital transformation often requires re-invention.
Trust beats risk. The new IBM z14 mainframe focuses on trust, with end-to-end encryption, Blockchain and Machine Learning. zHyperLink drastically improves the connection between mainframe and IBM DS8880 storage. IBM is helping over 400 clients adopting Blockchain.
The FlashSystem A9000 and A9000R models are 30x faster than traditional disk systems, and more dense, able to consolidate 20 racks down to one.
The new "PowerAI" bundle combines together a complete offering for Machine Learning and Deep Learning (ML/DL) for Power systems, taking advantage of GPU and NVlink capabilities.
The "waitless" world has arrived.
This was a good start for the conference. The three speakers of the opening session were passionate of what they were talking about, and people were excited to learn more as the week progressed.
The article starts out giving background history of the current mess we are in. Here is an excerpt:
"Throughout most of U.S. history, American high school students were routinely taught vocational and job-ready skills along with the three Rs: reading, writing and arithmetic...
...But in the 1950s, a different philosophy emerged: the theory that students should follow separate educational tracks according to ability...
Ability tracking did not sit well with educators or parents, who believed students were assigned to tracks not by aptitude, but by socio-economic status and race. ...
...The backlash against tracking, however, did not bring vocational education back to the academic core. Instead, the focus shifted to preparing all students for college, and college prep is still the center of the U.S. high school curriculum..."
My father was a mechanical engineer who enjoyed fixing cars and woodworking on the weekends. I had plenty of "vocational training" growing up at home, no need for me to have this in school, allowing me to focus on getting ready for college.
Nicholas asks legitimate questions at this stage: "So what’s the harm in prepping kids for college? Won’t all students benefit from a high-level, four-year academic degree program?" His initial response is:
"... As it turns out, not really. For one thing, people have a huge and diverse range of different skills and learning styles. Not everyone is good at math, biology, history and other traditional subjects that characterize college-level work.
Not everyone is fascinated by Greek mythology, or enamored with Victorian literature, or enraptured by classical music. Some students are mechanical; others are artistic. Some focus best in a lecture hall or classroom; still others learn best by doing, and would thrive in the studio, workshop or shop floor..."
Hard to argue that people are different, and learn in different ways. Not everyone is meant for college.
"...And not everyone goes to college. The latest figures from the U.S. Bureau of Labor Statistics (BLS) show that about 68 percent of high school students attend college. That means over 30 percent graduate with neither academic nor job skills..."
Here is what I have most problems with. To think that the 30 percent of high schools students graduate, but do not go to college, have neither academic nor job skills? I disagree with this, as there are many jobs where the academic and job skill training they received in high school is more than adequate. Nicholas then doubled down:
"...But even the 68 percent aren't doing so well. Almost 40 percent of students who begin four-year college programs don’t complete them, which translates into a whole lot of wasted time, wasted money, and burdensome student loan debt. Of those who do finish college, one-third or more will end up in jobs they could have had without a four-year degree. The BLS found that 37 percent of currently employed college grads are doing work for which only a high school degree is required.
It is true that earnings studies show college graduates earn more over a lifetime than high school graduates. However, these studies have some weaknesses. For example, over 53 percent of recent college graduates are unemployed or under-employed. And income for college graduates varies widely by major – philosophy graduates don’t nearly earn what business studies graduates do. Finally, earnings studies compare college graduates to all high school graduates. But the subset of high school students who graduate with vocational training – those who go into well-paying, skilled jobs – the picture for non-college graduates looks much rosier.
Yet despite the growing evidence that four-year college programs serve fewer and fewer of our students, states continue to cut vocational programs..."
There are a lot of successful billionaires who did not complete four yeas of college: Bill Gates, Steve Jobs, Michael Dell, Henry Ford, and Howard Hughes, just to name a few.
If you feel that the only purpose of attending high school or college is to get job-specific skills, then you are missing out on all the other aspects of those that teach you valuable life lessons, getting along with others, teamwork, communications, and other "soft skills" that aren't necessarily job-specific.
Teenagers entering college are still growing up, trying to figure out what they want to do with their lives, discovering new ideas, new ways of thinking, and networking with people of different backgrounds and cultures.
"...The U.S. economy has changed. The manufacturing sector is growing and modernizing, creating a wealth of challenging, well-paying, highly skilled jobs for those with the skills to do them. The demise of vocational education at the high school level has bred a skills shortage in manufacturing today, and with it a wealth of career opportunities for both under-employed college grads and high school students looking for direct pathways to interesting, lucrative careers. Many of the jobs in manufacturing are attainable through apprenticeships, on-the-job training, and vocational programs offered at community colleges. They don’t require expensive, four-year degrees for which many students are not suited..."
The skills shortage is real, but until employers are willing to pay people for what they're worth, the situation will not be resolved. The free market has a way to fix skills shortages. High demand raises salaries, and causes people to invest in high school and college education in part to vie for these positions. That is in part why medical doctors are paid so much.
"...The modern workplace favors those with solid, transferable skills who are open to continued learning. Most young people today will have many jobs over the course of their lifetime, and a good number will have multiple careers that require new and more sophisticated skills..."
A few years ago, I was hosting clients for dinner in Tucson. The sales rep had brought his daughter and her roommate along, as there was a shooting at their college campus and classes were canceled for the week. The daughter asserted, "In 18 months, I will no longer have to learn anything again. I will be done with school." Her roommate chimed in, "Ha! I am a year ahead of you, and only six months away from that!"
I was the bearer of bad news. "Ladies," I said, "you will have to get used to learning new things the rest of your lives." The highest ranking client at the table overheard me, and she re-iterated, "Ladies, that is probably the best advice I have heard in awhile. I suggest you heed it carefully."
A big part of high school and college education is to teach you how to learn on your own. Learn to read, search out information, take measurements, gather data, make plans, and ask the right questions. These are skills that are useful in a wide variety of careers.
Nicholas concludes with:
"...Just a few decades ago, our public education system provided ample opportunities for young people to learn about careers in manufacturing and other vocational trades. Yet, today, high-schoolers hear barely a whisper about the many doors that the vocational education path can open. The “college-for-everyone” mentality has pushed awareness of other possible career paths to the margins. The cost to the individuals and the economy as a whole is high. If we want everyone’s kid to succeed, we need to bring vocational education back to the core of high school learning."
I agree the educational system in United States is broken, but I am not sure I agree with everything that Nicholas writes in this article.
How do you define success? For some, it is based on their salary, or perhaps revenue they helped close for their company.
For others, their family life and the flexibility to handle work/life issues might be more important.
Still others look for certifications and awards from official agencies.
As a side gig, I sometimes do bartending on the weekends. Typically, these are for weddings or corporate parties.
I took weeks of bartender training and passed a three-hour exam to become state-certified to do so in Arizona. We Arizonans take our liquor seriously! If you think about it, bartending is just a notch below being a Pharmacist dispensing other drugs.
Surprisingly, some of my patrons will be condescending, "Don't you wish you can do more with your life than be a bartender?"
I am also certified "Laughter Yoga" instructor, and am called in at times to substitue for other instructors. Again, I took formal training and was certified to do so.
Again, some of my students will ask, "Don't you wish you could do more with your life than be a yoga instructor?"
In both cases, I would respond, "Dude, I earn six figures, and am happy to meet new people every week, how about you?" This usually shuts them up!
(For those interested, here are [my top 10 posts] which served as the basis of the interview!)
I am happy to be recognized externally and within IBM for my success as a blogger. Since I started blogging over 10 years ago, I have helped close over $4 Billion USD in revenue for IBM, written five books on IBM Storage, mentored dozens of other successful bloggers, and presented to thousands of clients at conferences, workshops and briefings.
Well, it's Tuesday again, and you know what that means? IBM Announcements! I am here in New York for the exciting news!
(FCC Disclosure: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for the IBM z14 mainframe and DS8880 Storage System.)
In support of the [IBM z14] mainframe announcement, IBM has also disclosed R8.3 enhancements for the DS8880 Storage System. Here is a quick recap:
New Tier-1 Flash Capacities available for HPFE Gen2 drawers
IBM introduces the new Tier-1 flash card capacity 3.84 TB flash card. In the past, IBM DS8880 only supported Tier-0 cards that support 10 Drive Writes per Day (10 DWPD), with capacities 400, 800, 1600 and 3200 GB. The Tier-1 flash card only handles 1 DWPD, often dubbed "Read-Intensive" devices, but can actually handle about 90 percent of most production workloads.
zHyperLink™ drastically reduces the latency between the IBM z14 mainframe and the DS8880 storage systems. Traditional FICON paths through SAN switches or directors introduced about 140 to 175 microseconds of latency between systems. This new system is a direct cable, with 20 microsecond latency.
The I/O bays on the DS8880 used for HPFE Gen2 already have zHyperLink ports on them. This direct cable is limited to 150 meters, however, so plan accordingly.
Transparent Cloud Tiering
IBM already announced Transparent Cloud Tiering to IBM Bluemix, IBM Cloud Object Storage and the IBM TS7760 virtualization engine in R8.2.3 release. The new Release 8.3 of DS8880 now adds support for Amazon S3, providing yet another choice for where to migrate data sets to. IBM also adds replication, allowing the data set to be migrated to two separate target locations, for added availability, much like writing to separate ML2 tape cartridges.
Cascading FlashCopy is a feature that has existing for awhile now on IBM XIV and SAN Volume Controller platforms, so this is just a port of that concept over to the DS8880 microcode. Now, if you FlashCopy target can become the source of a follow-on FlashCopy request. You can make copies of copies. This applies to both the volume and data set level functions.
Why would anyone do this? Well, you might suspend your application at midnight and create a clean FlashCopy of a 24-by-7 ever-changing database. Then in the following morning, workers who need a static "midnight version" of the database now can use this as their source and perform additional FlashCopy requests for their own needs.
IBM DS8880 MES Support
MES is an abbreviation for "Miscellaneous Equipment Specification", one of the many Three Letter Acronyms [TLA] that doesn't help knowing what the words stand for. In short, an MES is a formal supported option to upgrade a piece of hardware that is already installed and running at a client location. IBM will offer MES to upgrade existing DS8880 systems to have the additional HPFE Gen2 drawers, and to upgrade the I/O bays to support zHyperLink connections.
(Final note: you might notice the change in upper and lower case. The IBM z14 (lower case) refers to the specific mainframe model, consistent with its predecessors the z13 and z13s, but the family name "IBM z Systems" has been shortened to "IBM Z®" (upper case). IBM Storage Systems and IBM POWER Systems were already upper case, so the mainframe guys just wanted to follow suit. I suspect "IBM i" will remain lower case, however.)
Well, it's Tuesday again, and you know what that means? IBM Announcements!
IBM Elastic Storage Server
Replacing the older "GSn" and "GLn" models, IBM announces the "Second Generation" GSnS and GLnS models (the second "S" stands for Second Generation), the "n" continues to refer to the number of storage drawers. All of these have a pair of POWER8 servers to drive amazing performance at a low price point.
The "GSnS" models are based on smaller 2U, 24-drive storage drawers, with 3.84 and 15.36 TB Tier-1 Read-intensive Solid-State Drives (SSD). The "GLnS" models are based on larger 5U, 84-drive storage drawers, with 4TB, 8TB and 10TB nearline (7200 rpm) spinning disk.
These new models have the latest IBM Spectrum Scale software pre-installed.
In addition to IBM's two existing Hyperconverged offerings--IBM Spectrum Accelerate for x86 servers, and IBM Spectrum Scale for x86, POWER and z Systems servers--IBM Power Systems now offers a third option. This integrated offering combines Nutanix's Enterprise Cloud Platform software with IBM Power Systems™ hardware to deliver a turnkey hyperconverged solution that targets critical workloads in large enterprises.
Nutanix is offered and will be defaulted/required on these Power® servers only:
While "Hyperconvergence" is still fairly new, and only about 1 percent of data centers have deployed this new technology, I am glad that IBM is a leader in this space with multiple offerings across both x86 and POWER systems platforms.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here is my recap of the sessions on the morning of Day 5, the last day of the conference.
Integrating IBM Storage in Container Environments
Dr. Robert Haas, IBM CTO Storage for Europe, presented IBM Storage for Docker containers. These are different from containers in IBM Cloud Object Storage, and different from the Container Pools used in Spectrum Protect.
Robert gave an overview of IBM Spectrum Conductor, part of the IBM Software Defined Infrastructure (SDI) Spectrum Compute family of software products. The goal is to analyze large amounts of data, access these data efficiently, and protect the data, results and insights as intellectual property.
IBM Spectrum Compute comes in several offerings. IBM Spectrum LSF (Load Sharing Facility) manages long-running batch jobs for modeling, design and simulations. IBM Spectrum Symphony provides low-latency for risk analytics in the financial services sector. IBM Spectrum Conductor comes in two flavors. Conductor for Spark (CFS) manages Spark analytics. Conductor for Containers (CFC) handles Docker and Kubernetes containers.
Docker is the run-time platform. While there are other container run-time platforms like RKT and LXD, Docker is clearly the marketshare leader, growing 40 percent per year.
Statistics from the latest DockerCon2016 conference showed the most popular use cases and workloads for Docker. What can run in Docker: Lots of applications can be "containerized", including Redis, MongoDB, PostgreSQL, OracleDB, Java, to name a few. Docker is well established in enterprises, including service providers, healthcare, insurance and financial services, public sector, and technology firms.
Kubernetes, Mesos and Docker/Swarm are a layer above, as orchestrators. Spectrum Conductor for Containers uses Kubernetes and other open source tools to coordinate activity. Orchestrators restart failed applications, and can scale up or scale down the number of instances as needed. Orchestrators can manage groups of applications, across clusters on-premises and off-premises Cloud.
From a storage perspective, containers access storage like bare-metal operating systems, bypassing all of the layers normally associated with bloated Virtual Machine hypervisors. It also eliminates single root I/O virtualization (SR-IOV) that VMs use to compensate.
Persistent storage can be isolated, so that containers cannot see the files of other containers. This provides multi-tenancy.
Internal persistent storage (directory on host file system). However, if you move a container from one host to another, you may lose access to this internal storage.
External volume, manually mounted.
Volume driver plug-in REST API that automatically mounts it.
The fourth method is preferred. Plug-ins are available for IBM Spectrum Scale, GlusterFS, Portworx, Rancher Convoy, RexRay, and Contiv. The start-up Flocker have gone out of business last year.
The Docker hosts can attach to IBM Spectrum Scale in all of its supported offerings, including POSIX, NFS and SMB protocol. Containerized applications can move from one Docker host to another, and continue access the IBM Spectrum Scale namespace.
IBM has created the "Ubiquity Volume Service" that provides a consistent API for Docker and Kubernetes. This will use IBM Spectrum Control Base Edition to support IBM Spectrum Scale, Spectrum Accelerate, Spectrum Virtualize and DS8000 storage systems. For IBM Spectrum Scale, volumes are mapped to iSCSI volumes, filesets or directories. For other devices, volumes are mapped to block LUNs. Ubiquity is publicly available on GitHub.
Enterprise Applications for IBM Cloud Object Storage
Andy Kutner, IBM Cloud Architect, presented the various options available for NAS gateways that can front IBM Cloud Object Storage.
Ctera offers NAS gateways, and Endpoint agents for backup and Enterprise File Sync & Share (EFSS). This vendor targets Remote Office/Branch Office (ROBO) and small NAS consolidation that have less than 60 TB per office IBM is a reseller of Ctera, so you can get both Ctera and IBM COS from the same IBM sales rep.
Nasuni offers a global file system, accessible from any device, smartphone, tablet or desktop. They are focused on taking out EMC and NetApp NAS solutions. Performance at the edge, combined with capacity in the client's chosen Cloud (including IBM Cloud Object Storage or IBM Bluemix). Infinite snapshots replace backups, offering RPO of 1 minute for Disaster Recovery. Their global file system "UniFS" offers file locking.
Panzura focuses on Cloud Integrated NAS, File Distribution, and Collaboration. This can help eliminate "islands of storage". The File Distribution can be any type of file, but was originally designed for Media and Entertainment, such as videos. Collaboration employs EFSS features for workgroup shared file folders, such as CAD/CAM or engineering blueprints.
IBM Spectrum Scale can provide NFS and SMB access to files, and then move colder, less active data to IBM Cloud Object Storage, using Transparent Cloud Tiering feature. Spectrum Scale offers WAN caching across locations.
IBM COS now offers a native NFS v3 interface. This allows read/write NFS access, with S3 API read of the same content. Each file is mapped to a single object.
This is targeted for large scale archive, static-and-stable data, NFS-based backup software, and applications going through the transition from file-based to object-based. This is not intended for multi-site collaboration or primary NAS replacement. Regardless of the number of geographically dispersed IBM COS sites, the NAS can run on only one or two sites initially.
To provide NFS v3 support, IBM introduces new F5100 File Accessers, which talk to an IBM COS Accesser, which in turn acts on specific Vaults in the storage pools. The file-to-object mapping metadata is replicated on-premises across three File Accessers, and optionally replicated asynchronously to a second site for High Availability. S3 API can read access the file by file name, or by Object URI.
Initially, the "File Accesser" is only available as pre-built system, not as software-only.
There was not enough time to cover other solutions, including Avere, NetApp AltaVault, or Open Source S3FS.
This was a great event, just the right size, between 1,500 and 2,000 attendees. Similar IBM Technical University events coming up later this year:
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Thursday evening, we had the "Meet The Experts" sessions. There were four: Storage, Power Systems, z/OS, and a fourth one focused on z/VM and Linux on z Systems. I was on the expert panel for Storage.
Mo McCullough was the emcee. Special thanks for Shelly Howrigon in her help with this event.
(Disclaimer: Do not shoot the messenger! We had a dozen or so experts on the panel, representing System Storage hardware, software and services. I took notes, trying to capture the essence of the questions, and the answers given by the various IBM experts. The answers from individual IBMers may not reflect the official position of IBM management. I leave out any references to unannounced plans or products. Where appropriate, my own commentary will be in italics.)
When will IBM offer a single pane of glass management for all of its IBM storage products?
IBM is working hard on this. Our strategy is to focus on IBM Spectrum Control as the primary answer. We have extended support across block, file and object, with support for IBM Spectrum Scale and IBM Cloud Object Storage System. We have also provided plug-ins for VMware, Cisco UCS Director, and OpenStack Horizon, for those who prefer those management systems instead.
What we really need are REST APIs!
Good point. IBM already has some REST APIs for the DS8000, XIV and Spectrum Protect, now that IBM has browser-based GUI across its entire product line, it is our strategy to offer REST API across our product line as well.
What is the next generation of ProtecTIER Data Deduplication going to look like?
IBM is focused on provided "data deduplication" for backup workloads directly through IBM Spectrum Protect backup software. IBM continues to sell IBM ProtecTIER.
(Virtual Tape Libraries like IBM ProtecTIER and Dell EMC Data Domain were created to handle the fact that many backup software back only were designed for tape drives and libraries. VTL was disk that pretended to be tape library. Now that IBM Spectrum Protect, NetBackup, Commvault, and all of the other modern backup products write natively to disk, object storage or Cloud services, there really isn't a need for VTL products any more.)
Why does IBM bother with all-Flash version of DS8000 when it already has IBM FlashSystem?
Different products for different workloads. IBM DS8000 offers unique support for z System mainframe FICON attachment and 520-byte block support for IBM i. IBM also offers all-Flash Elastic Storage Server, all-Flash SVC and Storwize products, that complement the IBM FlashSystem product line.
We like how XIV can hot-enable encryption, even with existing data on it. Why doesn't DS8000 offer this?
Two separate implementations. At the time IBM DS8000 encryption was designed, it was decided that the client needed to enable encryption before writing any data.
Will we see a spinning disk version of the FlashSystem A9000
Flash is now less expensive than spinning disk, I don't see why IBM would go backwards. The future is Flash.
We would like Spectrum Control to manage our Dell EMC Isilon
Yes, we have heard that from others. We are working on extending our third party support. Send in your cards and letters to help us prioritize. Or, better yet, submit a "Request For Enhancement" (RFE).
The difference between Tier 0 (Write Endurance) flash and Tier 1 (Read Intensive) flash is confusing, are there any plans in the IT industry to simplify this?
No, if anything it will get worse. Today, IBM's Tier 0 is 10 Drive Write Per Day (DWPD), and Tier 1 is 1 DWPD. Other SSD drives offer 2, 3, 5, 10, 15 and 25 DWPD. As people buy more Flash, and less disk, expect more differentiation in this area.
We would like to tune Easy Tier on the Storwize products
Understood. IBM typically implements new features on the DS8000 platform first, then rolls them over to Spectrum Virtualize. The ability to influence allocation order, pin or avoid tiers, and have application API to influence the placement are already in DS8000.
What will the future of Storwize look like?
We don't have enough time to cover that in this meeting.
Recently, you raised the maximum Storwize FlashCopy background copy rate from 64 MB/sec to 2 GB/sec, but is that realistic?
The setting provides the background task a target "grains per second" to try to achieve. It may not be possible depending on your configuration and the number of concurrent tasks. Your Storwize may be so busy with background activity that it won't take host I/O.
We have been giving you our wishlist, but are there any questions the IBM experts have for the audience
Yes, are there any clients being asked to secure storage against Ransomware and insider threats from disgruntled employees?
(Several hands went up, and we collected their names to have further discussions.)
How should we assign business value to data?
IBM Spectrum Virtualize allows you to assign metadata tags to files, so that these can be used to drive different policies.
(The process of assigning business value is often called "Data Rationalization" and is part of ILM, BC/DR, and Data Governance efforts.)
I am concerned that AES 256 encryption is not good enough now that there is Quantum Computing.
It will be decades before Quantum Computing will be good enough to break these codes.
Will Blockchain drive huge or unique storage requirements?
No. The entries are small. You are appending small transactions to the end of existing ledgers. Nothing unique or different.
Were there any topics not adequately covered at this conference?
IBM didn't have much to offer for Spectrum Compute family of software, the Software Defined Infrastructure (SDI) that runs on both x86 and POWER systems. This should be done under the POWER brand, but many clients use Spectrum Compute with x86 servers. Ironically, Spectrum Compute products are managed under the Storage division, since Spectrum Compute and Spectrum Storage work well together.
We would like Storwize's clever NPIV to be implemented in all of the other IBM arrays, starting with DS8000.
That probably won't happen, as they are different architectures. Whereas Storwize and the rest of IBM Spectrum Virtualize family were designed for nodes to fail, and take their ports down with them, the DS8000 has independent I/O bays that continue to run independent of either POWER8 node. Likewise, FlashSystem 900 has similar separation between the FCP adapters and the processing nodes.
Can we have consistent licensing across the entire IBM Spectrum Virtualize set of products, please?
We have a task force to investigate this, and will gladly add your name to the list for input and feedback.
While the conference continues Friday morning, for many attendees, this was the last event.
IBM Spectrum Scale was formerly called GPFS and has been around since 1998. I am glad it was renamed, as GPFS suffered from "guilt by association" with other file systems, AFS, DFS, XFS, ZFS, and so on.
Spectrum Scale does so much more, supports volume, file and object level access, supports POSIX standards for Windows, AIX and Linux, support Hadoop and Spark with 100 percent compatible HDFS Transparency Connector, support NFS, SMB and iSCSI protocols, as well as OpenStack Swift and Amazon S3 object based access.
Initially designed for video streaming and High Performance Computing (HPC), IBM has extended its reach to work in a variety of workloads across different industries. More than 5,000 production systems are running at client locations.
IBM Spectrum Protect solution design: Server, Deduplication and Disaster Recovery decisions
Dan Thompson, IBM Storage Software Technical Sales Specialist, presented this session.
To make it easier to deploy, IBM Spectrum Protect now has a set of tested "blueprints" that are organized into small, medium and large. Find the one that fits your needs, and it will tell you exactly how the server should be configured. Dan recommends having a "test system" to try out new releases of IBM Spectrum Protect.
For multiple server configurations, Dan recommends adopting a standard naming convention, and to make use of Enterprise Configuration and server-side Client Option Sets. You may want to consider discrete instances for special non-backup functions, like library manager or Operations Center hub server, which allows you to upgrade more aggressively without affecting your backup clients.
If you plan to run multiple Spectrum Protect instances on the same VMware host, set the DBmemPercent to avoid having DB2 consume all of the memory, which will interfere with out Spectrum Protect instances.
For clustered servers, IBM supports Active/Passive, Active/Active, Many/One, and Many/Few configurations. You can mix and match these as needed.
For data spill remediation, consider NIST 800-88 data shredding. This depends on the type of storage media used.
IBM Spectrum Protect for Data Retention, formerly called System Storage Archive Manager (SSAM), offers For Non-erasable, Non-Rewriteable (NENR) enforced Immutability protection. (This used to be called Write-Once-Read-Many or WORM for short, but since WORM applies only to tape and optical media, and IBM Spectrum Protect now supports Flash, Disk, Object Storage and Cloud repositories, IBM has adopted the term NENR instead). Third party KPMG has certified IBM Spectrum Protect for Data Retention meets to their satisfaction the requirements for SEC 17a-4 regulations.
When sizing your server, Dan recommends that you always "over-size" it and grow into it. Use the published "Performance Optimization Guide" to help. Monitor the server and storage using OS and device specific monitoring, in combination with IBM Spectrum Protect reports.
If you are still on BC Tiers 1 or 2, transmitting tapes to a remote vaulting facility or secondary data center, consider upgrading to BC Tier 3 at least. This can be done via electronic vaulting to an Automated Tape Library (ATL), Virtual Tape Library (VTL) or IBM Cloud Object Storage, or a Cloud service provider such as IBM Bluemix or Amazon Web Services. This can be supplemented using DB2 HADR for the IBM Spectrum Protect database.
While Spectrum Protect server can run bare-metal or as a VM, the VM instance will not have support for FCP-based tape or Virtual Tape Library. Many people are moving off tape, especially VTL, and using native Disk, Directory or Cloud container pools instead.
Lastly, take advantage that Operations Center can view all Spectrum Protect servers across all locations. This can be helpful.
Enabling Mission Critical NoSQL workloads using IBM trillions of operations technology
TJ Harris, from the IBM Storage CTO office, and Scott Brewer, FlashSystem Team Lead, co-presented this session.
They gave a background on NoSQL, the most popular being MongoDB. The IT industry estimates that NoSQL will grow 38 percent CAGR from 2015-2020.
The problem occurs when NoSQL applications go through a full file system stack to work with low-latency devices like Flash, especially when the writes are small, often just a few dozen bytes to 100 KB. Fortunately, IBM Research has created the "Trillions of Operations" project to explore ways to take reduce the software stack, and make use of NVMe protocol.
The top three challenges for NoSQL deployments are: (a) Cost, (b) Data management and retention, and (c) Data relevancy.
To enable innovation, MongoDB offers a "Storage Engine API" that allows others to compete at this space. Currently MMAP v1 and WiredTiger are supported. IBM Research implemented its "Trillion Operations" project as a plug-in to this API, optimized for high rates of ingest for data. Compared to Facebook's RocksDB, IBM was 14x faster write, and 2.1x faster read.
Another challenge is coordinate backups and disaster recovery when applications mix traditional RDBMS with these new NoSQL databases.
The week is nearly over, and I can see the light at the end of the tunnel. Everyone had a great time last night's event at the Universal City Walk and Blue Man Group.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here is my recap of the sessions on the morning of Day 4.
Configurable IBM Spectrum Scale
Kent Koeninger presented IBM Spectrum Scale software, which Kent refers to as "Configurable Spectrum Scale" (or CSS for short), as opposed to the pre-built system known as Elastic Storage Server (ESS).
Why choose CSS versus ESS? Lower entry price. You can start with just two single-socket servers and a drawer of disk.
IBM Spectrum Scale was formerly called IBM General Parallel File System (GPFS). Many who tried earlier versions of GPFS found it difficult to configure, because it only had a command line interface. Now, Spectrum Scale has a fully-functional GUI, and clients have been able to install and configure Spectrum Scale in just 30 minutes!
How big can Spectrum Scale grow? As much as your budget can afford! With an architecture that can support YottaBytes of data and 900 quintillion files, you won't hit any limits anytime soon.
There are some unique capabilities of ESS not available in CSS. For example, ESS offers Spectrum Scale Native RAID (erasure coding) with fast rebuild times, and ESS is certified for SAP HANA. You can combine any combination of CSS and ESS in the same Spectrum Scale to create a "data lake" for mixed workloads.
A good use case for Spectrum Scale, either CSS or ESS, is backup. Kent explained why it is an excellent option to store backups with enterprise backup software such as IBM Spectrum Protect or Commvault.
VersaStack - Hybrid Cloud like no other
This session was jointly presented by Chris Vollmar, IBM Storage Architect, and Brent Anderson, Cisco Global Consulting Systems Engineer. IBM and Cisco have been partners for more than 25 years.
VersaStack combines Cisco UCS x86 servers, Cisco Nexus and MDS switches, and IBM FlashSystem or Spectrum Virtualize storage.
What if you have a SAN Infrastructure built entirely from IBM b-type or Brocade-based switches? Cisco supports their SAN switches for this, but nobody has tested VersaStack in this combination, and UCS Director does not manage this combination, so IBM does not support this. Instead, for this situation, IBM recommends doing external connection via Ethernet, or using direct-attach configurations.
The Cisco Validated Design spends four months testing, and gives you bulletproof process to deploy the solution.
There is a difference between Cisco UCS Manager and UCS Director. UCS Manager is available at no additional charge, but only manages the Cisco x86 servers. UCS Director is optionally extra priced, and manages Cisco servers, Cisco networking, and IBM Spectrum Virtualize storage.
Brent explained the benefits of UCS Management through policies and profiles.
Chris covered Cisco CloudCenter, which the Cisco team shortens to just "C3". IBM Spectrum Copy Data Management can be used to move snapshots of data between on-premises and off-premises Cloud to help in Hybrid Cloud configurations.
How to Design an IBM Spectrum Scale solution
Tomer Perry, IBM Spectrum Scale I/O Development, presented this session.
For those who want to bring up a quick IBM Spectrum Scale environment to play around with, you can do this in as little as 30 minutes. But to design a mission critical deployment, additional requirements may need to be addressed. You may need to consult with not just storage admins, but also application owners, network admins and security personnel.
Large companies have hundreds or thousands of applications, so Tomer recommends to group these into "Workload families", based on data set types, access patterns and performance requirements. For NAS take-out, 80 percent of NAS I/O is "get attribute" that can easily be served directly from cache memory.
For each workload family, you may need to decide on snapshots, quotas, namespace (bind mounts, symlinks, etc.), security (ACL, encryption), estimated capacity, replication BC/DR, backup and ILM requirements.
Unless this is completely greenfield deployment, the existing infrastructure needs to be evaluated. This includes the LAN and WAN network topology, name resolution (DNS), time services (NTP), Authentication (AD, LDAP, NIS, Keystone), Keyserver (IBM SKLM), Monitoring and Migration requirements.
Tomer suggests designing the environment in this order: Cluster, File System, Storage Pools, Fileset, Replication, and finally Monitoring.
Generally, you need three NSD servers per cluster. For those licensing Spectrum Scale Standard Edition by the socket, you may be tempted to put everything into one big cluster. The new capacity-based Spectrum Scale Data Management Edition eliminates that concern, so Tomer recommends having separate computer clusters and storage clusters, connected by cross-cluster mount. All nodes in a cluster are considered an "ssh" administration domain.
A single Spectrum Control namespace can support up to 256 file systems. There are various reasons to have multiple file systems: block size, backup/recovery, snapshot, quotas, and cross-cluster isolation. If a file system gets corrupted, it will not affect other file systems. In an internal test, an "fsck" on 1 billion, 1 PB of data file system took only 30 minutes to repair.
Storage Pool design can separate metadata from content, and workloads can be separated to different storage media. With ILM, HSM and TCT, you can move colder data to Cloud, Object Storage, Spectrum Protect or Spectrum Archive.
Filesets are tree branches for each file system. IBM Spectrum Scale supports both dependent and independent filesets. Filesets can be used for Non-erasable, Non-Rewriteable (NENR) Immutability, policies, quotas, snapshots. Consider using a fileset instead of carving off a new file system.
Spectrum Scale offers both synchronous and asynchronous replication. For Synchronous, the ReadReplicaPolicy can be set to default, local or fastest. For Asynchronous, there are a variety of AFM modes (Read-only, Local-Update, Single-Writer, Independent-Writer, and Disaster Recovery). You may need to decide if your AFM gateways are dedicated or collocated. You will need to tune your TCP buffers for WAN performance to get the RPO you desire.
The nice thing about IBM solutions is that you can start small, and grow big. In all of these examples above, IBM offers sizes to match nearly any IT budget.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the sessions of Day 3.
Ethernet-only SANs -- Myth or Reality?
Anuj Chandra, IBM Advisory Engineer, presented an excellent overview of Ethernet-based SANs. He started with a quick history of Ethernet, starting with Robert Metcalfe's original drawing for his concept.
In the past, Ethernet was used for email and message transfer, and so dropped packets were tolerated. However, with the use of Ethernet for SANs, many standards have been adopted to make Ethernet networks more robust. These meet requirements for Flow Control, Congestion management, low latency, data integrity and confidentiality, network isolation, and high availability.
These standards are known as IEEE 802.1Q "Data Center Bridging", including 8012.Qbb Priority Flow Control, 802.1Qaz Enhanced Transmission Selection, 802.1Qau Congestion Notification. There is also the IETF Transparent Interconnection of Lots of Links (TRILL) to replace Spanning Tree Protocol (STP). All of these features are negotiated between endpoints server and storage. Ethernet that supports these new standards is often referred to as "Converged Ethernet" since it handles both traditional email/message traffic as well as SAN data traffic.
In addition to 1GbE and 10GbE, we now have 2.5, 5, 20, 40, 50, 100 Gb Ethernet speeds. By 2020, Anuj estimates over half of all Ethernet ports will be 25 GbE or faster. Amazingly, some of these can work on existing 10BASE-T cables.
Anuj also covered Remote Direct Memory Access (RDMA), and the RDMA-capable Network Interface Cards (RNIC) that support them. In one chart, shown here, Anuj explained Infiniband, RDMA over Converged Ethernet (RoCE) and RoCE v2, and Internet Wide Area RDMA Protocol (iWARP).
While many of these enhancements were intended for Fibre Channel over Ethernet (FCoE), the beneficiary has been iSCSI. Now there is iSCSI Extensions for RDMA (iSER) to take even more advantage of these changes, and can work with Infiniband, RoCE or iWARP. All of these networks can also be used as the basis for NVMe over Fabric (NVMeOF).
Ethernet is the backbone of Cloud usage, and IBM is well positioned to take advantage of these new networking technologies.
Digital Video Surveillance solutions for extended video evidence protection
Dave Taylor, IBM Executive Architect for Software Defined Storage solutions, presented this session on Digital Video Surveillance (DVS).
Most video surveillance is either analog-based, going to standard VHS tapes, or file-based. Sadly, security guards that watch live camera feeds lose their attention span after 22 minutes.
There are an estimated 72 million cameras globally, with 1.5 million more every year.
City governments spend 57 percent of their budget on "public safety". This can include body cams for police departments. Taser International, now called AXON, dominates the body-cam market.
City budgets may not be prepared to store all of this video content into a cloud that complies with Criminal Justice Information Services (CJIS) standards. These Cloud services tend to be more expensive, as the videos must be treated as evidence, tamper-proof, and with appropriate chain of custody.
DVS is not just storing movies. IBM offers Intelligent Video Analytics. It is important to be able to derive insight and actionable response.
Storage capacity adds up quickly. Standard 1080p (1920 by 1080 pixel) camera generates 2.92 GB per hour, 70 GB per day, and over 2TB per month. If you have 1,000 cameras, that's over 2PB of data.
For xProtect servers running Windows, the Tiger Bridge Connector can be used to move the video files to either IBM Spectrum Scale or IBM Cloud Object Storage.
Deep Dive into HyperSwap for Active-Active applications and Disaster Recovery
Andrew Greenfield, IBM Global Engineer for Storage, explained the different ways HyperSwap is implemented across the IBM storage portfolio.
For IBM DS8000, HyperSwap is based on Metro Mirror synchronous replication. In the event that the primary DS8000 fails, the host server can automatically re-direct all I/O to the secondary DS8000. This is often referred to as "High Availability" (HA), and in some cases can serve as Disaster Recovery.
For IBM Spectrum Virtualize products, including SAN Volume Controller (SVC), FlashSystem V9000, Storwize V7000 and V5000 products, as well as Spectrum Virtualize sold as software, the implementation is different.
Previously, SVC offered Stretched Clusters, which put one node in one site, and a second node at another site, which allows for an Active/Active configuration. Unfortunately, the nodes in FlashSystem V9000 and Storwize are "connected at the hip", effectively bolted together, so putting separate nodes in different locations was not possible. To solve this, IBM developed HyperSwap that allows one node-pair to replicate across sites to another node-pair in the same Spectrum Virtualize cluster.
However, even though it is called "HyperSwap", it is not implemented in any way similar to the DS8000 method. Instead, Spectrum Virtualize uses the Global Mirror with Change Volumes to replicate data between sites.
IBM Storage and VMware Integration
This session was co-presented by Brian Sherman, IBM Distinguished Engineer, and Steve Solewin, IBM Corporate Solutions Architect.
For nearly two decades, IBM is a "Technology Alliance Partner" with VMware. To provide consistent integration to all the features and functions of VMware, IBM Spectrum Control Base Edition (SCBE) is provided at no additional charge for IBM DS8000, XIV, FlashSystem and Spectrum Virtualize products.
SCBE is downloadable as an RPM for RedHat Enterprise Linux (RHEL) can run bare-metal or as a VM.
For those using Hyper-Scale Manager, it will automatically install a special A-line-only version of SCBE. It will install SCBE, but it will only manage the A-line products (FlashSystem A9000, FlashSystem A9000R, XIV and Spectrum Accelerate).
Storage admins can define "storage services" that can be assigned to vCenter. This allows VMware admins to allocate storage in self-service mode.
After the meetings were over, IBM had a special event at the Universal City Walk to enjoy some drinks, food, and conversation, and to watch Blue Man Group.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the afternoon sessions of Day 2.
IBM Spectrum Protect deep dive into Container Storage Pools
Ron Henkhaus, IBM Certified Consulting IT Specialist, presented the new Spectrum Protect concept of "Container Pools" that can either be "Directory Pools" on SAN or NAS-based disk storage, or "Cloud Pools". Container pools can contain deduplicated and non-dedupe data.
Ron cautioned that directory pools should not be placed on the same file system as your Spectrum Protect database or logs. Also, best practice for any directory pool is to assign an "overflow" pool to any non-directory pool, such as disk, tape or cloud container.
Cloud pools can use either OpenStack Swift, V1 Swift, Amazon S3 protocol, Amazon Web Services, IBM Bluemix, and IBM Cloud Object Storage. You can pre-define the vaults and buckets in the configuration.
For off-premises Cloud pools, the data is encrypted by default. For other container pools, encryption is optional. Performance to Cloud pools have been improved by using "accelerator storage", basically a disk cache to collect data before sending over to the Cloud pool. Backups to Cloud pools can reach 8 TB per hour. Restore times varies from 500 to 1500 GB per hour.
Container Pools were designed for the new "Deduplication 2.0" feature introduced in version 7. Traditional Dedupe 1.0 to Device Class FILE is still available, but not recommended.
Version 7.1.6 changed the compression algorithm from LZW to LZ4. In all cases, Spectrum Protect performs these actions in this order: deduplication, compression, encryption. Data that is encrypted by the Spectrum Protect client is therefore not deduped.
The "Protect Storage Pool" command can replicate a directory pool to either a remote directory pool or Cloud pool. In addition to this remote replication, you can copy a directory pool to tape to offer air-gap protection against ransomware. Such tapes are considered part of the "Copy Container Pool". In the event of directory pool corruption, the data can be repaired from either replication or tape.
IBM Aspera can now be used for replication, using SSL and AES-128 bit encryption. If your latency is greater than 50 msec, and have more than 0.5 percent packet loss, Aspera might help. This is available for Linux on x86 platforms running v7.1.6 or higher.
For existing customers, IBM Spectrum Protect allows you to convert your FILE, VTL and TAPE device class pools to directory or Cloud pools.
Introduction to IBM Cloud Object Storage (powered by Cleversafe)
In 2015, IBM acquired Cleversafe, recognized as the #1 Object Storage vendor. Their flagship product was officially renamed to the IBM Cloud Object Storage System, which some abbreviate informally as IBM COS. IBM offers the IBM Cloud Object Storage System in three ways: as software, as pre-built systems, and as a cloud service on IBM Bluemix (formerly known as SoftLayer).
Since then, IBM has been busy integrating IBM COS into the rest of the storage portfolio. I explained how IBM COS can be used for all kinds of static-and-stable data, but not suited for frequently changed data, such as Virtual machines or Databases.
Object storage can be access via NFS or SMB NAS-protocols using a gateway product, like IBM Spectrum Scale, or those from third-party partners like Ctera, Avere, Nasuni or Panzura. It can also be used as an alternative to tape for backup copies, and is already supported by the major backup software like IBM Spectrum Protect, Commvault Simpana, or Veritas NetBackup.
While other cloud service providers have offered data storage in the cloud, this new offering also allows hybrid configurations with geographically dispersed erasure coding.
Unlike RAID which protects against the loss of one or two drives, erasure coding can protect against a larger number of concurrent failures. For example, using an Information Dispersal Algorithm (IDA) of "7+5", where seven pieces of data are encoded on twelve independent disks, the system can lose up to five disk drives without losing any data.
Combining this with Geographically Dispersed Configuration across three or more sites means that you can lose an entire data center, four of the twelve disks, and still have instant full access to all of your data from eight drives at the other locations. In the graphic, you see two on-premise data centers combined with a third location in IBM SoftLayer.
New Generation of Storage Tiering: Simpler Management, Lower Costs, and Improved Performance
With ever changing amounts of storage, it is hard to find metrics that are consistent year to year. Fortunately, we found I/O density as the metric to focus my efforts, armed with real data from Intelligent Information Lifecycle Management (IILM) studies done at various clients. From that, I was able to talk about storage tiering on three fronts:
Storage tiering between Flash and disk. IBM FlashSystem and IBM Easy Tier on DS8000 and Spectrum Virtualize family for hybrid Flash-and-disk configurations.
Storage tiering between disk, tape, and Cloud. HSM and Information Lifecycle Management (ILM) on Spectrum Scale, Elastic Storage Server (ESS), Spectrum Archive and IBM Cloud Object Storage System.
Storage tiering automation across your entire environment. IILM studies can help identify a target mix of Tier 0, Tier 1, Tier 2 and Tier 3 storage. IBM Spectrum Storage Suite and the Virtual Storage Center (VSC) can recommend or perform the movement of LUNs to more appropriate tiers, based on age and I/O density measurements.
It's hard to say what the correct sequence of presentations should be. Some thought it might have been better for my talk on IBM Cloud Object Storage System prior to Ron's talk on Cloud container pools, but perhaps hearing Ron first helped drive more interest to my session.
I have been involved with Business Continuity and Disaster Recovery my entire career at IBM System Storage. However, with new workloads like Hadoop analytics and new Hybrid Cloud deployments, I thought it would be good to provide a refresh.
The need for Business Continuity and Disaster Recovery has increased recently due to (a) climate change caused by human activity, (b) ransomware and other cyber attacks, and (c) disgruntled employees.
Back in 1983, a task force of IBM clients at a GUIDE conference developed "Seven Business Continuity Tiers for Disaster Recovery", which I refer to as "BC Tiers". I divided the presentation into three sections:
Backup and Restore: BC tiers 1 through 3 are based on backup and restore methodologies. I explained how to backup Hadoop analytics data, all of the various options for IBM Spectrum Protect software, and how to encrypt the tape data that gets sent off premises.
Rapid Data Recovery: BC tiers 4 and 5 reduce the Recovery Point Objective (RPO) and Recovery Time Objective (RTO) with snapshots, database journal shadowing, and IBM Cloud Object Storage.
Continuous Operations: BC tiers 6 and 7 provide data replication mirroring across locations. I covered 2-site, 3-site and 4-site configurations.
IBM Spectrum Virtualize - How it works - Deep dive
Barry Whyte, IBM Master Inventor and ATS for Spectrum Virtualize, covered a variety of internal topics "under the hood" of Spectrum Virtualize. This covers the SAN Volume Controller (SVC), FlashSystem V9000, Storwize V7000 and V5000 products, as well as Spectrum Virtualize sold as software.
In version 7.7, IBM raised the limits. You can now have 10,000 virtual disks per cluster, rather than 2,048 per node-pair. Also, you can now have up to 512 compressed volumes per node-pair. With the new 5U-high 92-drive expansion drawers, Storwize V7000 can now support up to 3,040 drives, and Storwize V5030 can support up to 1,520 drives.
While each Spectrum Virtualize node has redundant components, the architecture is designed to handle entire node failure. The term "I/O Group" was created to refer to the node-pair of Spectrum Virtualize engines and the set of virtual disks it manages. This made sense when virtual disks were dedicated to a single node-pair. Now, virtual disks can be assigned to multiple node-pairs, dynamically adding or removing node-pairs as needed for each virtual disk.
However, even if you have a virtual disk assigned to multiple node-pairs, only one node-pair would manage its cache, causing all other node-pairs to coordinate I/O through the cache-owning node-pair. The other node-pairs are called "access I/O groups".
The architecture allows for linear scalability, double the number of nodes, and you double your performance. Some competitors use n-way caching across four or more nodes, and it is a semi-religious argument on the pros and cons of each approach. Barry feels the 2-way caching implemented by Spectrum Virtualize is the most effective and efficient for performance.
All of the nodes are connected over IP network, but there is one designated as a "config node", and one, often the same, as a "boss node".
A cluster can have up to three physical quorum disks (either drive or mDisk) and optionally up to five IP-based quorums. The IP-based is just a Java program that runs on any server or Cloud, provided it can respond within 80 msec.
Either IP-based or physical quorum can be used for "tie-breaking" a split-brain situations. In the event there is no "active" quorum, the administrator can now serve as the tie-breaker manually. Barry recommends for Storwize clusters, where physical quorum disks are attached to a single node-pair, that you have at least one IP-based quorum for tie-breaking.
However, only physical quorum can be used for T3 Recovery. T3 Recovery happens after power outages. All of the nodes update the quorum disk with critical information of all of the virtual mappings of blocks to volumes, and this is used when bringing up the nodes again.
To protect against one pool consuming all of the cache, Spectrum Virtualize will partition the cache, and prevent any one pool from consuming more than a certain percentage of the total cache. The percentage depends on the number of pools:
Number of Pools
Max percentage of any individual pool
5 or more
Barry explained how failover works in the event of node failure. There is voting involved, and the majority remains in the cluster. In the case of an even split, called a "split brain" situation, the quorum decides. Orphaned nodes in a node-pair go into write-through mode, since the cache is no longer mirrored.
The I/O forwarding layer has been split between upper and lower roles. The upper layer handles access I/O groups. The lower layer handles asymmetric access to drives, mDisks and arrays.
N-port ID Virtualization (NPIV) drastically improves multi-pathing. Perhaps one of the coolest improvements in awhile, NPIV allows us to assign "Virtual" WWPN to other ports. When an I/O sent to a single port fails, it retries one or more times again, then waits 30 seconds, and then invokes multi-pathing to find a completely different path to the data. With NPIV, when a port fails, its WWPN is re-assigned to a different port, so the retries are likely to be successful before having to wait 30 seconds!
Lastly, Barry covered the delicate art of Software upgrades. Software is rolled forward one node at a time, and the "cluster state" is maintained during this time.
Different presentations this week are at different technical levels. My session was meant to be an overview of the concepts of Business Continuity, independent of specific operating system platform, using specific IBM products to help illustrate specific examples. Barry's was a deep dive into a single product family.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the afternoon sessions of Day 1.
Storage Brand Opening Session - Craig Nelson
Craig Nelson, Brocade manager for IBM Field Sales Channel, indicated the network equipment is the bridge that brings servers and storage together.
The squeeze -- faster servers and Flash storage causes storage networking to become the bottleneck. Fibre Channel will remain the protocol of choice for the next decade.
"Speed is the net currency of Business" -- Marc Benioff, Salesforce CEO.
Craig drew an analogy. We have been focused on making hard disk drives faster, and then Flash changed the game. Likewise, car manufacturers have focused on making gas engines better, and then Tesla Motors introduces an electric car with insane performance. The early models actually had an "Insane Mode".
The new Gen6 models of IBM b-type SAN equipment will support 32Gbps and 128Gbps ports. That's Insane!
Later models of Tesla Motors offer a "Ludicrous Mode". For flash storage, it is NVMe. NVMe can get storage down to 20 microsecond latency. That's Ludicrous!
Craig put in a plug for two Brocade sessions: "BEWARE - The four potholes on your road to success when deploying flash storage" and "Tune up your storage network! Is it healthy enough for flash storage and next-gen server platforms?"
Storage Brand Opening Session - Clod Barrera
Clod Barrera, IBM Distinguished Engineer and Chief Technical Strategist, presenting storage industry trends.
IDC predicts data capacity to grow 60-80% CAGR. This would require 44 percent drop in $/GB per year to maintain flat budget. Unfortunately, flash media cost is only dropping 25-30 percent per year, and spinning disk only 19 percent per year.
Since storage media will not offset capacity growth, we need other technologies to compensate, including compression, deduplication, defensible disposal, and "cold" storage to tape or optical media.
The smallest persistent storage that IBM has been able to achieve is 12 atoms. Current disk technology is 1200 atoms. Since 1956, IBM and the rest of the IT industry have improved storage 9 orders of magnitude, and now there are only 2 orders of magnitude left.
Clod poked fun at the "Star Wars: Rogue One" movie, indicating that their idea of the future of storage was a huge tape library. See my December 2016 blog post [Has your data gone rogue?]
What does it take to storage information forever? Tape will certainly be around. IBM Zurich demonstrated a 220TB back in 2015 as proof of technology.
A good example of the need for long-term retention are US films. Of those from the silent era, over 90 percent are lost. Over half of the films prior to 1950 are lost. The silver nitrate film stock that the reels were made of have deteriorated. Now that more movies are made digitally, can we do better?
Clouds will move from 10GbE to 25GbE. No slow down for FC in datacenters. Flash storage and object storage are both growing quickly
Move over Software-Defined Storage, Converged and Hyperconverged systems, the new up-and-coming thing are "Composable Systems deployed in Pods" adjustable hourly by workload requirements.
To protect against Ransomware, use "air gap" protection, not on the same network as production workload.
New storage models are needed for Cognitive workloads. Clod put in a plug for Joe Dain's presentation "Introducing cognitive index and search for IBM Cloud Object Storage leveraging Watson"
Storage Brand Opening Session - Axel Koester
Axel Koester, IBM Storage Chief Technologist, presented more storage industry directions.
What will the world look like in 10 years. Today mostly procedural programming, with some statistical big data, and a bit of machine learning. In 10 years, it will be mostly statistical and machine learning, with very little procedural programming. Why? Because it is faster to train computers with Machine Learning, than to program procedurally.
Examples of machine learning are IBM Watson, Google AlphaGo, drive-AI. Axel would rather be a passenger in a machine-learned self-driving car, than a procedurally-programmed one.
Neural networks to interpret hand-written numbers. Welcome to "Unsupervised learning".
A subset of Machine Learning is Deep Learning, a major breakthrough in 2006. Deep Learning is a subset of Machine Learning that uses three or more layers of neural networks. For example, face recognition "deep learning" algorithms can also be used to detect defects through visual inspection of circuit boards.
How does this impact storage?
Procedural -- archive test cases used
Statistical -- store all data for parallel processing
Machine Learning - train sample data, then archive and re-train yearly. Driving 5 minutes = 4 TB of sensor data used for self-driving cars
For Neural processing, x86 CPU are suitable for prototyping. GPU co-processors better, efficient but uncommon. IBM has developed the "TrueNorth" chip does nothing by Neural - 4096 cores with only 70 mW of energy consumption. No clock, instead dendrites, synapses, axons and neurons.
Instead of "Build or Buy?" the new question is "Train or Buy?" Train with confidential data, or buy ready-to-run 100% pre-trained cognitive systems as a service.
AI Frameworks are available on Docker containers with Kubernetes with Persistent storage (Ubiquity) such as Spectrum Scale. These frameworks include DL4J, Chainer, Caffe, torch, theano, tensorflow.
NVMe -- NVM is local only, how to do HA and DR? There are three options:
DB asynchronous shadowing
DB mirroring over NVMeOF
Cluster file system replication of persistent data, such as IBM Spectrum Scale
Example car manufacturer with 50 SAP HANA in memory instances on 4 Spectrum Scale nodes. IBM achieved 50,000 new files per second. Most NAS systems do much less.
Faster media on smaller electronics Holmium atoms on Magnesium Oxide on silver base, resulting in "single atom storage." ATM needle tip magnetizes, measured with Tunnel Magneto-resistance. Unfortunately, reading the data causes it to lose its value, so it is not as persistent as the 12-atom method described by Clod earlier.
As the title suggests, I explained why there is so much interest in Software-Defined Storage in the IT industry, what software-defined storage is, and how to deploy these solutions in your existing infrastructure without the full rip-and-replace. I covered which IBM products are available as software, pre-built systems and/or Cloud services.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Day 1 included keynote sessions. Here is my recap for the morning.
General Session "The Quantum Age"
Amy Hirst, IBM Director of Systems Training, served as emcee for the General Session. The theme this week is "Power of Knowledge, Power of Technology, Power of You. You to the IBM'th power".
Chris Schnabel, IBM Q Offering Manager, explained what "IBM Q" is.
Chris feels "our intuition of what we can compute is wrong". Classical (non-Quantum) computing has evolved over past 100 years.
Consider Molecular geometry. The best supercomputer can only handle the smallest molecules, those with 40 to 50 electrons, and even then are unable to calculate bond lengths within 10 percent accuracy. Quantum computing can.
Another area is what computer scientists call the "Traveling Salesman Problem". If you had a list of 57 cities, what would be the optimal path to minimize the distance traveled to get to all of the cities. Doing an exhaustive search would be 10 to the 76th power. Dynamic Programming techniques provide some shortcuts, reducing this down to 10 to the 20th power, but still, that is impossible on most computers.
Chris mentioned that there are easy problems to solve in polynomial time, and hard problems that are exponential, in that they get worse and worse the bigger the input set. There will always be hard problems.
"Nature isn't classical, dammit, and if you want to make a simulation of nature, you'd better make it quantum mechanical, and by golly it's a wonderful problem, because it doesn't look so easy."
-- Richard Feynman
Nature encodes information, but not in ones and zeros. Quantum computers are measured on the number of Qubits, their error rate, etc. The three factors that IBM focuses on are Coherence, Controllability and Connectivity.
Chris explained how Superposition and Entanglement are used in Quantum Computers. I won't bore you with the details here, but rather save this for a future post.
Today: 5 to 16 Qubits (can be simulated with today's classical computers. 5 Qubits is the power of your typical laptop)
Near future: 50-100 Qubits (too big to simulate on supercomputers), with answers that are approximate or correct only 2/3 of the time.
Future: millions of Qubits, fault-tolerant to provide exact, precise answers consistently.
Quantum Computing opens up a new range of problems, what Chris call "Quantum Easy" problems. Problems that might take years to solve on classical supercomputers could be solved in seconds on a Quantum computer.
Chris showed a picture of [Colossus], the first digital electronic computer used in the 1940s. Quantum computing today is like 1940's of classical computing.
IBM is now working on Hybrid Quantum-Classical algorithms, for example:
Quantum Chemistry - can be used in material design, healthcare pharmaceuticals
Optimization - logistics/shipping, risk analytics
There are different ways to build a quantum computer. IBM chose a single-junction transmon design, using Josephson junctions. While the chips are small, the refrigerators they are contained in are huge, and have to keep the chips at very cold 15 milliKelvin temperature (minus 459 Fahrenheit)!
To get people excited about Quantum computing, IBM created the "IBM Q Experience" [ibm.com/ibmq] that allows the public to run algorithms on a basic 5 Qubit system using a simple drag-and-drop interface to put different transformational gates in sequence.
IBM Research team were shocked to see 17 publications in prestigious journals make practical use of this 5 qubit system! Since then, IBM now offers a Software Developers Kit (SDK) called QISkit (pronounced Cheese-kit) as a text-based alternative to the drag-and-drop interface.
Amy Hirst came back on stage to remind people to use Twitter hashtag #ibmtechu to follow the event. There are two more events like this planned for the end of the year. A Power/Storage conference in New Orleans, October 16-20, and another event focused on z Systems mainframe, November 13-17.
Pendulum Swings Back -- Understanding Converged and Hyperconverged Systems
This presentation has an interesting back-story. At a client briefing, I was asked to explain the difference between "Converged" and "Hyperconverged" Systems, which I did with the analogy of a pendulum. I used the whiteboard, and then later made it into a single chart.
At the far left of the pendulum, I start with mainframe systems of the early 1950s that had internal storage. As the pendulum swings to the middle, I discuss the added benefits of external storage, from RAID protection and Cache memory to centralized management and backup.
To the far right of the pendulum, it swings over to networked storage, from NAS to SAN attached devices for flash, disk and tape. This offers excellent advantages, including greater host connectivity, and greater distances supported to help with things like disaster recovery.
Here is where the pendulum swings back. IBM introduced the AS/400 a long while ago, and more recently IBM PureSystems that combined servers, storage and switches into a single rack configuration. Other vendors had similar offerings, such as VCE Vblock, Flexpod from NetApp and Cisco, and Oracle Exadata.
Lately, the pendulum has swung fully back to internal storage, with storage-rich servers running specialized software on commodity servers. There are two kinds:
Pre-built systems like Nutanix, Simplivity or EVO:Rail which are x86 based server systems, pre-installed with software and internal flash and disk storage.
Software that can be deployed on your own choice of hardware, such as IBM Spectrum Accelerate, IBM Spectrum Scale FPO, or VMware VSAN.
So, over time, my single slide has evolved, and fleshed out into a full blown hour-long presentation!
Cloud storage comes in four flavors: persistent, ephemeral, hosted, and reference. The first two I refer to as "Storage for the Computer Cloud" and the latter two I refer to as "Storage as the Storage Cloud".
I also explained the differences between block, file and object access, and why different Cloud storage types use different access methods.
Finally, I covered some of our new public cloud storage offerings, using OpenStack Swift and Amazon S3 protocols to access objects off premises, including the new Cold Vault and Flex pricing on IBM Cloud Object Storage System in IBM Bluemix Cloud.
(FCC Disclosure: I work for IBM. I have no financial interest in SUSE, Scality, or any other storage vendor mentioned in this post. This blog post can be considered a "paid celebrity endorsement" for IBM Storwize, IBM Cloud Object Storage, and IBM Spectrum Storage software mentioned below.)
The study takes a realistic request for 250 TB of storage, at 25 percent compound annual growth rate (CAGR), to store infrequently accessed data in an online archive, and then looks at the Total Cost of Ownership (TCO) over five year period.
The study compares five different Software-Defined Solutions and three pre-built systems. The Software-defined solutions come as software-only, requiring that you purchase the hardware separately and build it yourself. The three pre-built systems were chosen from the top three storage vendors in the marketplace: Dell EMC, IBM and NetApp.
The cost of support is factored in, as it should be. To keep things equal, no data reduction like data deduplication or compression were used.
In an odd approach, the study mixes block, file and object based approaches all in the same study.
You can read the full 14-page study (linked above). I have organized the results into a single table, ranked from best to worst, color coded for the best deals in green ($100K to $200K), moderate solutions in yellow ($200K to $300K) and most expensive in red (over $300K). I put the software-only options on the left and pre-built systems on the right.
SUSE Enterprise Storage 4
IBM Storwize V5010
DataCore SAN Symphony
Red Hat Ceph Storage
Dell EMC Unity 300
I am often asked, "Isn't the software-only, build-it-yourself approach, always the lowest cost option?" Now, I can answer, "Sometimes yes, sometimes no." Fortunately, IBM offers Software-Defined Storage in a variety of packaging options including software-only, pre-built systems, and in the Cloud as a service.
IBM Storwize V5010 is based on IBM Spectrum Virtualize software, which you can deploy as software-only on your own x86 servers. This was not mentioned in the study, and perhaps it is my job to remind people that this option is also available for those who want to build their own storage.
For that matter, IBM Cloud Object Storage System -- available as software-only, pre-built systems, and in the Cloud -- might also be a cost-effective alternative.
Next week I will be in Orlando, Florida for the IBM Systems Technical University. If you are attending, stop by one of my presentations, or look for me at the Solution Center at one of the IBM peds, or attend the "Meet the Experts for IBM Storage" on Thursday!
I have been blogging for more than 10 years now, so I am no stranger to commenting on competitive comparisons. In some cases, I am setting the record straight, and other times, poking fun at competitor results, claims or conclusions. This comparison from Brian Carmody was too juicy to ignore.
(FCC Disclosure: I work for IBM. I have no financial interest in Infinidat, Dell EMC, nor Pure Storage, mentioned in this post. I do have friends and former co-workers who now work for Infinidat. This blog post can be considered a "paid celebrity endorsement" for IBM FlashSystem products.)
Here is an excerpt, I have added (Infinidat) wherever Brian says "we" just so there is no confusion:
"... So last week we (Infinidat) finally got around to running the same profiles against an INFINIDAT F6230 in our Waltham Solution Center, configured with 1.1TB of DDR-4 DRAM, 200TB TLC NAND, and 480 3TB Nearline HDDs.
In summary, we (Infinidat) wrecked the Pure and EMC systems. Here are the results side by side with EMC's data:
EMC Unity 600F
16K IOPS (80% Read)
9x Pure, 5x Unity
256K BW MBps
10.6x Pure, 3x Unity
4.5x Pure, 1.6x Unity
Steady-state latency (ms)
1/7 Pure, 1/2 Unity
By the way, we (Infinidat) took the liberty of running the test with a 200TB data set instead of Pure and EMC's 50TB because modern workloads require performance at scale, and we ran it with in-line compression enabled because our compression algorithm doesn't hurt performance.
This was an interesting test to run, and we (Infinidat) hope it helps the storage industry move away from media type wars and benchmarks (you will lose every time on performance if INFINIDAT is in the mix) ..."
Notice anything wrong here? anything missing?
The Tortoise beat "Hare 1" and "Hare 2", but did not invite the Cheetah to the race?
Brian was smart enough not to compare their product to anything from IBM. IBM has a wide variety of All-Flash Arrays, including the DS8880F models, the Storwize V7000F and V5030F models, and Elastic Storage Server models. However, for this workload, IBM would probably recommend the FlashSystem V9000, A9000 or A9000R.
Any All-Flash Array with a steady-state latency of 2 milliseconds or greater is embarassing, but then Infinibox is not really an All-Flash Array.
The architecture of their Infinibox appears much like the original XIV. It has a mix of DRAM memory and SSD cache, combined with spinning drives. It offers only compression, not data deduplication. Unlike the IBM XIV powered by six to 15 servers, the Infinibox appears under-powered with just three servers.
The Infinibox uses software-based in-line compression, which must put a huge tax on the few CPUs they have in those three servers. Infinidat chose not to compress the data in their cache, probably to reduce the additional overhead on their over-taxed CPUs.
The IBM FlashSystem V9000 has an innovative design, based on IBM Spectrum Virtualize, the mature software that you also find in the IBM SAN Volume Controller and Storwize family of products.
The FlashSystem V9000 offers hardware-accelerated compression. IBM takes advantage of the integrated Intel QuickAssist co-processor which runs the compression algorithm 20 times faster than standard Intel Broadwell CPU.
IBM compresses its cache, using a two-tier approach. The "upper cache" receives the data uncompressed, so that it can then tell the application to continue, for fastest turn-around time. Then the data is compressed, and stored in the "lower cache", optimizing the value and benefits of DRAM memory. Many databases get up to 80 percent savings, resulting in a 5-to-1 benefit in DRAM cache memory.
The IBM FlashSystem A9000 and A9000R also have an innovative, based on IBM Spectrum Accelerate, the code originally developed for IBM XIV storage system.
(Fun fact: Infinidat's founder, [Moshe Yanai], was formerly the founder and designer of XIV, and it appears that Infinidat is just a re-design of old XIV technology architecture, re-packaged with a few differences. Since Moshe left, IBM has drastically enhanced the IBM XIV.)
Like the IBM Spectrum Virtualize family, the IBM FlashSystem A9000 and A9000R have hardware-accelerated in-line compression, and two-tier approach to cache. The "upper cache" receives the data uncompressed, then the data is compressed and deduplicated, and stored in the "lower cache", optimizing the value and benefits of DRAM memory.
The IBM FlashSystem A9000 and A9000R also offer in-line data deduplication. Modern workloads are virtualized, and Virtual Machine (VM) and Virtual Desktop Infrastructure (VDI) get significant benefits from data deduplication. Infinidat does not play here. For the FlashSystem A9000, most of the metadata related to data deduplication is in cache, minimizing the overhead.
IBM FlashSystem A9000 and A9000R have full performance that blows these published Infinibox results away WITH compression and deduplication turned on.
Brian ran a workload that used the DRAM and SSD cache exclusively, eliminating the reality that any REAL WORLD workload would have to tap into those much slower spinning drives. This is not really a side-to-side benchmark. He is comparing his live run on Infinibox to published numbers from a previous comparison run on a completely different set of data.
This raises the question, why pay for all those spinning drives at all, if you plan to only use the DRAM and Flash storage for your workloads?
A week later, Brian followed up with another post [The INFINIDAT Challenge], acknowledging his comparison was bogus. Here's an excerpt. Again, I have added (Infinidat) wherever Brian is referring to his employer just so there is no confusion:
"... It's not likely that a room full of storage engineers will ever agree on parameters for a synthetic benchmark since storage evaluations are competitive and control of test parameters will invariably predetermine the 'winner'. However, I hope we can all agree that synthetic benchmarks are a waste of time, and that real world performance is what matters in the data center.
So, what can we (Infinidat) do about it?
We (Infinidat) cordially invite every enterprise storage customer who wants lower latency and lower storage cost to visit [FasterThanAllFlash.com] and sign up for The INFINIDAT Challenge.
We (Infinidat) will Give you an Infinibox system to test
We (Infinidat) will Help you clone and test your environment with Infinibox
We (Infinidat) Guarantee your applications will run faster on Infinibox than your All-Flash Array.
If we (Infinidat) fail, we'll take the system back and Donate $10,000 to the charity of your choice.
If our technology delivers, you can keep the system, and we'll (Infinidat) Donate $10,000 in your name to the charity of our choice (The American Cancer Society).
Thanks again to all who participated in the dialog over the past week. I know the post generated some controversy. Traditional storage companies are fighting for their lives trying to keep enterprise storage expensive; indeed their business models are predicated upon maintaining price levels from a bygone era...."
As consolidation play doing full range of data services, I do not see this Infinibox working out. Talking to clients who have the Infinibox, the performance deteriorates in REAL WORLD workloads as you add more data to the unit.
The Infinibox seems fine for workloads that do not demand high performance, so I was surprised Brian compared it to All-Flash arrays. The Infinibox is out of its league!
(To be fair, Pure Storage and EMC XtremeIO aren't really in the same league as IBM FlashSystem, either, given that both of those products are based on commodity SSD. IBM FlashSystem models are consistently 4 to 10 times lower latency than these Commodity-SSD based competitors.)
The Infinibox also lacks features many people expect in an Enterprise-class storage array, like Call-Home capability to identify problems quickly, and Synchronous remote mirroring for disaster recovery. It is often common for startups like Infinidat to deliver a [Minimum Viable Product] as their first offering.
To paraphrase Brian himself, your applications will lose every time on performance if INFINIDAT is in your datacenter.