Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Systems Client Experience Center in Tucson, Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
IBM has been holding various "Hackathons" and "Meetups" as a new way to reach out to prospective clients. IBM sponsored a meetup at the Austin Executive Briefing Center (EBC) to discuss Machine Learning with TensorFlow on IBM Power systems, October 26, 2017.
This was a joint event, co-sponsored by [IBM Watson/Cognitive Austin] and [Big Data/AI Revealed] meetup groups. Special thanks to my colleague Cathy Cocco, IBM Executive IT Architect with the IBM Austin EBC, for coordinating this event with their organizers.
(What is a Meetup? [Meetup.com] is an online social networking website that facilitates in-person local group meetings. Meetup allows members to find and join groups unified by a common interest, such as books, games, pets, technology, careers or hobbies. In 2017, there are 32 million users with 280 thousand groups available across 182 countries.)
Here was the agenda for the event:
Registration, Pizza & Soft drinks
Tensorflow 101 presentation
Demo: Using TensorFlow for Financial Market Predictions on IBM POWER Systems
Lightning Talk: IBM Data Science Experience
Clarisse Taaffe-Hedglin: Intro to TensorFlow on IBM Power servers
Our guest speaker was my colleague Clarisse Taaffe-Hedglin, IBM Cognitive Senior Technical Architect, part of the same Worldwide Client Centers team that I work in. She flew in from Charlotte, NC.
Her topic was TensorFlow, an open source [Machine Learning] framework. TensorFlow was originally developed by Google, but was made open source in November 2015.
Machine Learning is popular in a variety of industries, from self-driving cars and trucks, speech recognition and video surveillance, to what movie to watch next on Netflix. There are three aspects to Machine Learning:
Data: Start with the data you want to analyze. This could be IoT sensor data, security logs, or social media feeds. Check out all that happens in an "Internet Minute"!
Compute: While mathematical computations can be performed on traditional CPUs, some frameworks are optimized and accelerated with Graphics Processing Units (GPUs). These GPUs can perform teraflops of single and double precision calculations.
Technique: As methodologies have gotten more complicated over the years, frameworks have evolved to match.
The [TensorFlow] framework is now one of the most popular among data scientists. You can download it for free at [Github].
Clarisse showed the various programming/calculation tools used by data scientists. The top five were: Python, R, SQL language, MapReduce, and Microsoft Excel.
Mathematical models come in many flavors. Clarisse explained they can be used to identify clusters of data that might have similar properties, or to perform classification, or linear regression. The results can be "descriptive", gaining a better understanding of what already is, or "predictive" for what might be.
Some frameworks like Chainer or Torch are more flexible, using a dynamic Define-by-Run approach. However, these do not scale well. Theano and TensorFlow, on the other hand, employ a Define-then-Run approach, which scales better for larger projects. With the growth in popularity of TensorFlow, the Theano framework has been "functionally stabilized".
Clarisse Taaffe-Hedglin: Financial Markets Demo
For the demo, Clarisse had historical stock closing data for USA, Australia and Asian stock markets. The hypothesis: Can we determine a Buy/Sell for USA stocks based on the closing results of non-American stock markets? This is a classic "Binary Classification" model. The other stock markets close 4-16 hours before the U.S. markets open, so this has real-world applicability.
Since the data was in different monetary units, she did some cleanup to normalize the data, removing the trends, and converting everything to U.S. Dollars (USD).
Clarisse used "Supervised Learning" on an 80 percent subset of the data, and then used the remaining 20 percent to validate how well it did.
As with any model, you measure how good it is by how close its results are to the correct answer. Wrong answers are weighted by how bad they are. This is often referred to as "Loss" or "Cost". Different models can therefore be compared by how well they minimize the loss.
Using a simple y=wx+b mathematical model, she ran 30,000 iterations. After 5,000 iterations, the model was already guessing correctly 55 percent of the time. By the time we hit 30,000, this was up to 68 percent accuracy.
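The train, validate and measure-the-loss cycle described above can be sketched in a few lines. This is not Clarisse's TensorFlow code. It is a minimal pure-Python stand-in that pushes y=wx+b through a sigmoid (logistic regression) and fits it with plain gradient descent. The made-up data, the learning rate, the iteration count and all function names are assumptions for illustration only.

```python
import math
import random


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    z = max(-60.0, min(60.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def make_rows(count, rng):
    """Made-up data: x stands in for an overseas market's move, label 1 means Buy."""
    rows = []
    for _ in range(count):
        x = rng.gauss(0.0, 1.0)
        label = 1 if x + rng.gauss(0.0, 1.0) > 0 else 0  # noisy on purpose
        rows.append((x, label))
    return rows


def loss(rows, w, b):
    """Average cross-entropy: confident wrong answers cost the most."""
    total = 0.0
    for x, label in rows:
        p = min(max(sigmoid(w * x + b), 1e-12), 1.0 - 1e-12)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(rows)


def accuracy(rows, w, b):
    """Fraction of rows where the predicted Buy/Sell matches the label."""
    hits = sum(1 for x, label in rows if (w * x + b >= 0.0) == (label == 1))
    return hits / len(rows)


def train(rows, iterations=300, rate=0.5):
    """Plain gradient descent on w and b for the model sigmoid(w*x + b)."""
    w, b = 0.0, 0.0
    for _ in range(iterations):
        errors = [(sigmoid(w * x + b) - label, x) for x, label in rows]
        w -= rate * sum(err * x for err, x in errors) / len(rows)
        b -= rate * sum(err for err, _ in errors) / len(rows)
    return w, b


rng = random.Random(42)
rows = make_rows(500, rng)
cut = int(len(rows) * 0.8)                       # 80 percent to train on
train_rows, valid_rows = rows[:cut], rows[cut:]  # remaining 20 percent to validate

w, b = train(train_rows)
print("loss before training:", round(loss(train_rows, 0.0, 0.0), 3))
print("loss after training: ", round(loss(train_rows, w, b), 3))
print("validation accuracy: ", round(accuracy(valid_rows, w, b), 3))
```

Because the labels are deliberately noisy, no setting of w and b can reach 100 percent, which mirrors the point of the demo: the goal is to beat a coin flip on data the model has not seen.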
TensorFlow also supports "hidden layers", basically intermediate variables that are then used in subsequent layers for more complicated calculations. This is the way our brain works with neural networks. With two added layers, she re-ran the 30,000 iterations, and now was up to 73 percent accuracy.
Normally, this kind of analysis would take hours or days, but since TensorFlow takes advantage of the IBM Power8 CPU and NVidia Tesla K80 GPU in the IBM Power server, the whole thing finished in five minutes!
Tuhin Mahmed: Lightning Talk on IBM Data Science Experience (DSX)
Tuhin Mahmed, IBM Software Developer, is the organizer for the Big Data/AI meetup group. He wants to promote the idea of "Lightning Talks" where each person presents for just 10-15 minutes. This is a variant of the popular [Pecha Kucha] events.
To get things started, he presented 10-15 minutes on [IBM Data Science Experience], or DSX for short. Taking Multiple Listing Service (MLS) real estate data of closing prices on houses sold in a range of zip codes from the Austin area, he plotted these on x-y axes. The x axis was square feet, and the y axis was closing price.
Using DSX, he was able to develop a mathematical model that estimates house closing prices based on their zip code and square footage.
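For readers who want to see the idea without DSX, here is a rough sketch of one way to estimate a closing price from zip code and square footage: fit a separate least-squares line for each zip code. This is an assumption about the general approach, not Tuhin's actual notebook. The zip labels and sale figures below are invented purely for illustration.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least squares for price = slope * sqft + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented sales records: (zip label, square feet, closing price in USD).
sales = [
    ("zip_A", 1000, 300000), ("zip_A", 1500, 400000), ("zip_A", 2000, 500000),
    ("zip_B", 1200, 230000), ("zip_B", 1800, 320000), ("zip_B", 2400, 410000),
]

# Group the sales by zip code, then fit one line per group.
by_zip = defaultdict(list)
for zip_label, sqft, price in sales:
    by_zip[zip_label].append((sqft, price))
models = {zip_label: fit_line(points) for zip_label, points in by_zip.items()}


def estimate(zip_label, sqft):
    """Estimated closing price for a house of the given size in the given zip."""
    slope, intercept = models[zip_label]
    return slope * sqft + intercept


print(round(estimate("zip_A", 1600)))
print(round(estimate("zip_B", 1600)))
```

Fitting one line per zip code keeps the sketch simple. A real model would likely treat zip code as a categorical feature and use far more data.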
This was a simple example, but it showed the power of Jupyter Notebooks, and how anyone can get a 30-day free trial of DSX for their own experimentation.
Currently, being a data scientist is more of an art than a science. This is one of those fields that takes only a few months to learn, but years to master.
Rather than building a model from scratch, data scientists can take existing models, and modify them to fit their needs. There are a variety of existing models available in what is called the "Model Zoo". Google has over 2,000 projects already.
Those interested in trying out TensorFlow for themselves were directed to [Nimbix], a Cloud Service Provider that offers POWER servers with NVidia GPUs.
There were about 50 attendees, and more than half identified themselves as data scientists. As the inaugural sponsored event for the IBM Austin EBC, I think this was a success!
If you are in the Austin area, the next meetup will be at the [Capital Factory] on Brazos Street on November 30, 2017.
Well, it's Tuesday again, and you know what that means? IBM Announcements!
Today, IBM announces a complete refresh of its IBM FlashSystem® all-flash array product line.
(FCC Disclosure: I work for IBM. Compression, data footprint reduction, and performance results, based here on internal IBM tests, vary widely by data and workload type. Your mileage may vary. This blog post can be considered a "paid celebrity endorsement".)
New FlashSystem 900 model AE3
The new AE3 model introduces new Microlatency cards at larger capacities: 3.6, 8.5 and 18 TB. Compare that to the previous model AE2 at 1.2, 2.9 and 5.7 TB.
These capacities are achieved by combining three-dimensional (3D) chip layout with Triple-Level Cell (TLC) transistors, often referred to as 3D-TLC. The previous technology was single-layer 2-dimensional, multi-level cells (MLC).
Last week, at IBM Systems Technical University in New Orleans, Clod Barrera, IBM Distinguished Engineer and Chief Technical Strategist, explained this via an analogy. The 2-dimensional layout is like a bungalow. If you want to pack in more people, you need to make the rooms smaller, which is getting more difficult. Alternatively, you could build a multi-story skyscraper. Adding more floors relieves the pressure to shrink the rooms down.
Triple-level cell holds three bits per transistor. In the past, we had Single-level Cell (SLC) that stored one bit, and Multi-level Cell (MLC) that stored two bits. A future technology, Quad-level Cell (QLC) is not yet ready for production workloads in a datacenter.
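One way to see why each step is harder than the last: a cell that stores n bits has to distinguish 2 to the power n different charge levels. The short sketch below just does that arithmetic for the four cell types named above. The function and dictionary names are my own.

```python
# Bits stored per cell for each flash cell type named in the text.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def charge_levels(bits_per_cell):
    """A cell holding n bits must tell apart 2**n distinct charge levels."""
    return 2 ** bits_per_cell


for name, bits in BITS_PER_CELL.items():
    print(f"{name}: {bits} bit(s) per cell, {charge_levels(bits)} charge levels")
```

Going from MLC to TLC adds only one bit per cell, but doubles the number of levels the cell must hold apart.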
The new AE3 models also offer Embedded inline Compression (EiC), with "Always-On" compression being done right on the Microlatency cards. With a fully-loaded 12 card 2U drawer, in a 10+P+S RAID-5 configuration, the amount of effective capacity is drastically increased:
(Table: FlashSystem 900 Model AE3, showing 2U Drawer (Usable TB) and 2U Drawer (Effective TB) w/EiC)
The compression gets 2x to 3.5x on typical data, but your mileage may vary. The small latency cards are capped at 110 TB, and the medium and large at 220 TB effective capacity, to avoid overwhelming the on-board DRAM cache. For clients who need smaller amounts of flash, IBM will continue to sell the AE2 models with 1.2 TB MLC Microlatency cards.
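The capacity arithmetic can be sketched from the numbers already given: a 10+P+S layout leaves ten cards' worth of data capacity, and effective capacity is usable capacity times the compression ratio, up to the stated cap. This is a back-of-the-envelope sketch, not an IBM sizing tool. It assumes usable capacity is simply ten times the card size, ignoring any formatting overhead, and the names are my own.

```python
# Card sizes and effective-capacity caps as stated in the text (TB).
CARD_SIZE_TB = {"small": 3.6, "medium": 8.5, "large": 18.0}
EFFECTIVE_CAP_TB = {"small": 110.0, "medium": 220.0, "large": 220.0}

# 12 cards in a 10+P+S layout: ten hold data, one parity, one spare.
DATA_CARDS = 10


def usable_tb(size):
    """Assumed usable capacity: data cards times card size, no overhead."""
    return DATA_CARDS * CARD_SIZE_TB[size]


def effective_tb(size, compression_ratio):
    """Usable capacity scaled by compression, limited by the stated cap."""
    return min(usable_tb(size) * compression_ratio, EFFECTIVE_CAP_TB[size])


for size in CARD_SIZE_TB:
    print(size, round(usable_tb(size), 1), "TB usable,",
          round(effective_tb(size, 2.0), 1), "TB effective at 2x")
```

Note how the large cards reach the 220 TB cap well before 2x compression, so the cap, not the data, decides their effective capacity.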
After the compression, the data is encrypted with AES 256-bit encryption. This is the same as the previous AE2 models, so nothing changes there.
The EiC compression and encryption do not impact performance. The new Microlatency cards achieve as low as 95 microsecond latency, about 10x faster than traditional Solid-State Drives (SSD) found in Dell EMC XtremIO and Pure Storage competitive offerings, and 40 percent faster than the new NVMe Solid-State drives. A 2U drawer can deliver up to 1.2 million IOPS, slightly more than the AE2 models (1.1 Million IOPS).
The new FlashSystem V9000 takes advantage of the new FlashSystem 900 AE3 models, effectively tripling the usable capacity.
The interesting thing now is compression. Both are hardware-accelerated, with EiC being done on the Flash cards, and Real-time Compression (RtC) being done by the Intel QuickAssist chips in the controllers.
The EiC method works on 4KB blocks, so it only gets 2.5x to 3.5x on typical data. The RtC method works on larger 32KB blocks, so it is able to find more repeated sequences of characters and gets up to a 5x ratio. The compressed data is also kept in the controller node cache for better cache hit ratios.
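The effect of block size on compression ratio is easy to try at home. The sketch below uses Python's zlib as a stand-in, which is neither the EiC nor the RtC algorithm, and compresses the same data in independent 4 KB and 32 KB blocks. The test data is synthetic: a small set of random records repeated in random order, so that repeats are often further apart than 4 KB.

```python
import random
import zlib


def compression_ratio(data, block_size):
    """Compress each block on its own; return original size / compressed size."""
    compressed = 0
    for start in range(0, len(data), block_size):
        compressed += len(zlib.compress(data[start:start + block_size]))
    return len(data) / compressed


rng = random.Random(7)
# 64 distinct 512-byte records of random bytes. Each record is incompressible
# by itself, so any savings must come from spotting a repeated record.
records = [bytes(rng.randrange(256) for _ in range(512)) for _ in range(64)]
data = b"".join(rng.choice(records) for _ in range(512))  # 256 KB in total

ratio_4k = compression_ratio(data, 4 * 1024)
ratio_32k = compression_ratio(data, 32 * 1024)
print("4 KB blocks: ", round(ratio_4k, 2))
print("32 KB blocks:", round(ratio_32k, 2))
```

A 4 KB block only holds eight of these records, so it rarely sees the same one twice. A 32 KB block holds 64 of them and finds many more repeats, which is the same reason the larger block size helps in the paragraph above.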
However, RtC is limited to only 512 volumes, so admins would run the [Comprestimator tool] and select the cache friendly workloads with the best compression, such as Databases and CAD/CAM images.
With new FlashSystem V9000, you now get the benefits of both. Continue to use RtC for data that is better served with 4x-5x compression, and let EiC compress everything else!
(Table: FlashSystem V9000 model AE3, showing Usable (1 drawer) TB and Usable (8 drawers) TB)
Running a typical 70/30 workload, representing 70 percent reads and 30 percent writes, each controller pair can deliver up to 600,000 IOPS. With four V9000 controller pairs clustered together, that is 2.4 Million IOPS. For more read-intensive, cache-friendlier workloads, IBM has clocked the system up to 1.3 million IOPS per controller node-pair, and 5.2 million for a four-pair cluster.
As with the previous model, the FlashSystem V9000 offers "Easy Tier" automatic sub-LUN tiering, and "storage virtualization" to manage both SAS-attached and SAN-attached storage. Over 400 different devices from major vendors are supported. This means that the busiest blocks will be moved up to low-latency Flash, and less active data will be moved to spinning disk.
As with the FlashSystem V9000, the A9000/R model 425 uses the new FlashSystem 900, increasing the effective capacity.
The A9000/R models will continue to do "Data Footprint Reduction" of pattern removal, data deduplication and RtC compression for data to achieve up to 5x compression ratio. However, to improve performance, internal metadata will not be compressed with RtC, allowing the underlying Flash cards to do EiC instead. This reduces CPU workload.
The FlashSystem A9000 model 425, aka "The Pod", has three grid controllers combined with the new FlashSystem 900 model AE3 for a compact 8U solution that can store nearly a Petabyte. For smaller deployments, IBM also offers an 8-card partially-filled drawer for a lower entry system size.
(Table: A9000 Model 425, showing Number of cards/drawer and Effective @5x TB)
The FlashSystem A9000R model 425, aka "The Rack", has two to four grid elements. Each grid element has two grid controllers and one FlashSystem 900 AE3 drawer. The previous 415 model supported five and six grid elements, but for now, model 425 is limited to just two, three or four. The A9000R model 425 supports all three Microlatency sizes, whereas the previous 415 model only supported medium (2.9 TB) and large (5.7 TB) sizes.
(Table: FlashSystem A9000R model 425, showing Usable TB for 2, 3 and 4 grid elements)
Performance of both the A9000 and A9000R is based on the number of grid controllers. Each grid controller gets about 300,000 IOPS. The A9000 pod with three controllers gets up to 900,000 IOPS. Each A9000R grid element has two controllers, so 600,000 IOPS per element, with 2.4 million IOPS for a maxed out four-element A9000R rack.
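The scaling rule above is simple enough to write down. The sketch below just multiplies out the figures quoted in this post, assuming IOPS grow linearly with the number of grid controllers. The function names are my own, and the two-to-four element limit comes from the model 425 description above.

```python
IOPS_PER_GRID_CONTROLLER = 300_000  # approximate figure quoted in the text


def a9000_pod_iops():
    """The A9000 pod has three grid controllers."""
    return 3 * IOPS_PER_GRID_CONTROLLER


def a9000r_rack_iops(grid_elements):
    """Each A9000R grid element has two grid controllers."""
    if not 2 <= grid_elements <= 4:
        raise ValueError("model 425 supports two, three or four grid elements")
    return grid_elements * 2 * IOPS_PER_GRID_CONTROLLER


print("A9000 pod:", a9000_pod_iops())
for elements in (2, 3, 4):
    print("A9000R with", elements, "elements:", a9000r_rack_iops(elements))
```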
Along with the hardware changes, IBM released version 12.2 of the Spectrum Accelerate software that runs in the FlashSystem A9000/R models.
This version supports Asynchronous mirroring between FlashSystem A9000/R systems and IBM XIV Gen3 storage. The replication can go in either direction, but the intent is to use FlashSystem for production, replicating to XIV Gen3 at a disaster recovery facility. Version 12.2 also increased the number of volumes, snapshots, and consistency groups supported.
24,000 volumes and snaps
1024 consistency groups, 512 volumes per consistency group
The new version applies to both the new model 425 and the previous 415 models!
This week, I am presenting at the IBM Systems Technical University for IBM Storage and POWER Systems. This conference is being held in New Orleans, Louisiana, October 16-20, 2017, at the beautiful Hyatt Regency. There were about 800 clients attending.
This is my recap for the last few sessions before I left town, spanning Tuesday afternoon and Wednesday afternoon.
Reasons why IBM hyperconverged systems powered by Nutanix surpass other HCI from HPE, Cisco and more
Rob Simpson, Senior Strategic Marketing Manager at Nutanix, presented Nutanix hyperconverged systems. Nutanix runs on both x86 and POWER. For x86, it supports VMware, Microsoft Hyper-V, and Citrix XenServer, as well as their own Acropolis Hypervisor (AHV) derived from Linux KVM. For POWER, it uses AHV re-compiled for the POWER chip set.
Hyperconverged systems can be sold in full rack configurations, as individual appliances, or as software that can be deployed on your own servers. Rob compared Nutanix against three competitive appliances: Dell EMC VxRAIL based on VMware VSAN, HPE Simplivity, and Cisco HyperFlex.
Everything you wanted to know about IBM Spectrum Scale metadata but didn't know to ask
Eric Sperley, IBM Software Defined Infrastructure Architect, presented the internal metadata structures used in IBM Spectrum Scale.
Why, oh why, did I attend this presentation? I had worked on Spectrum Scale back when it was called GPFS over 15 years ago, and thought I already knew everything about "inodes" that I ever wanted to, but Eric proved me wrong!
"Laws, like sausages, cease to inspire respect in proportion as we know how they are made."
--John Godfrey Saxe
A lot has changed! There have been a lot of improvements to the internal structures to improve parallel I/O performance, and reduce latency of administrative tasks.
IBM Spectrum Scale can be divided into different file systems, each of which can be configured with different performance characteristics and block size, such as random small files for scanned images, versus large sequential files for streaming videos.
My presentation was nowhere near as technical as Eric's above. I provided an overview of how IBM Spectrum Scale is configured, how it works, and how it interacts with IBM Cloud Object Storage System, Spectrum Protect, and Spectrum Archive.
I also covered the latest GSxS and GLxS models of the Elastic Storage Server, or ESS for short. These models provide awesome performance at low cost. The GSxS models are all-flash arrays for high performance. The GLxS models are hybrid with 2 Solid-State Drives and the rest NL-SAS 7200 rpm spinning disk for high capacity.
IBM COS new features
Andy Kutner, IBM Channel and Alliances Architect, presented the latest features in IBM Cloud Object Storage, IBM COS for short.
Compliance Enabled Vaults, or CEV for short, offer Non-Erasable, Non-Rewriteable (NENR) tamperproof protection for objects. Objects written to a CEV vault cannot be deleted or replaced with newer versions for a specific retention period.
(Note: Some folks mistakenly use the term "Write Once, Read Many" (WORM) for this. WORM applies only to tape, optical, paper tape, punched cards, and non-erasable ROM chips. For this reason, the term "Non-Erasable, Non-Rewriteable" (NENR), used in the U.S. Securities and Exchange Commission (SEC 17a-4) regulation, has been created to extend this tamperproof protection to flash, disk and cloud-based storage architectures.)
The entry-level system lowers the minimum capacity required. Before, IBM recommended at least 500 TB capacity to consider IBM COS. Now, the combination of embedded Accessers and Concentrated Dispersal mode can lower the starting point to as low as 72 TB, but still allow you to grow to multiple PBs.
This week, I am presenting at the IBM Systems Technical University for Storage and POWER Systems. This conference is being held in New Orleans, Louisiana, October 16-20, 2017, at the beautiful Hyatt Regency.
This is my recap for sessions on Day 2 morning.
FlashSystem A9000 and A9000R Overview
Andy Walls, IBM Fellow, CTO and Chief Architect, and Brent Yardley, IBM STSM and Master Inventor, co-presented this session. This was the "deep dive" of the A9000/R, basically a continuation of the one they did yesterday.
The Pendulum Swings Back -- Understanding converged and hyperconverged integrated systems
With IBM's partnership with Nutanix, this has become a particularly popular topic. I cover the last 50 years of storage evolution, from internal storage and external storage to NAS and SAN storage networks.
More recently, people have been willing to give up all those gains for something simpler, less powerful, less reliable, less expensive. Enter Converged and Hyperconverged Systems. IBM PureSystems and VersaStack lead the pack for Converged Systems, along with IBM Spectrum Scale, Spectrum Accelerate and Nutanix on IBM Power Systems for Hyperconverged Integrated Systems.
New Generation of Storage Tiering -- Less Management, Lower Costs, and Improved Performance
There are orders of magnitude between the fastest All-Flash Array and the least expensive tape storage. Ideally, there would be a "slider bar" that allowed people to select from the fastest to the least expensive. IBM offers a variety of solutions to offer this "slider bar", with automation to move data as needed between tiers.
I start with IBM Easy Tier, available on DS8000 and Spectrum Virtualize products, move on to IBM Virtual Storage Center, where advanced analytics moves data to the right location, and finish with IBM Spectrum Scale, which provides the ultimate tiering, across multiple locations, between flash, disk and tape.
The lunches at these conferences are amazing, but then the "Big Easy" is known for its food!
This week, I am presenting at the IBM Systems Technical University for Storage and POWER systems. This conference is being held in New Orleans, Louisiana, October 16-20, 2017, at the beautiful Hyatt Regency.
The afternoon sessions on Monday were all about Cloud.
Back in 2009, I was designated the IBM Cloud Storage Center of Competency for all of the IBM Systems client centers. That was nearly a decade ago, and I am still talking about Cloud Storage!
Since then, IBM has decided to be a "Cloud Platform" company, and now everyone wants to know about Cloud Storage. Cloud is not just about lowering costs, as it once started out, but is now about innovation and business value.
Nearly all of IBM Storage is enabled for cloud, from our high-end FlashSystem, DS8000 and XIV flash and disk storage arrays, to our Spectrum Storage software suite, to our various tape products.
Building Private Cloud with Ubuntu and OpenPOWER
Ivan Dobos, from Canonical--the company that makes Ubuntu--presented Ubuntu on OpenPOWER. Other Linux distributions like Red Hat and SuSE offer both a "community supported" version (CentOS or OpenSUSE) and an "enterprise" version (RHEL and SLES). Ubuntu doesn't fork its versions. There is a single version for everyone.
Ubuntu 14.04 LTS was made available as a Little-Endian distribution for IBM POWER and OpenPOWER. Ubuntu was the first Linux distribution to support CAPI and PowerKVM for the POWER8 platform.
(A note on release numbers. Ubuntu releases every April and October, so 14.04 represents 2014/April release. Every two years, a release is designated "Long Term Support" (LTS) which is supported for five years.)
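The numbering scheme is regular enough to decode in code. The sketch below parses a version string like 14.04 into a year and month, and flags Long Term Support releases. It assumes LTS releases are the April releases of even-numbered years, which matches 14.04 and 16.04 mentioned in this post, and it uses the five-year support window stated above. The function names are my own.

```python
def parse_release(version):
    """Split 'YY.MM' into (year, month), for example '14.04' -> (2014, 4)."""
    year_text, month_text = version.split(".")
    return 2000 + int(year_text), int(month_text)


def is_lts(version):
    """Assumption: LTS releases are the April releases of even years."""
    year, month = parse_release(version)
    return month == 4 and year % 2 == 0


def lts_support_ends(version):
    """An LTS release is supported for five years from its release month."""
    if not is_lts(version):
        raise ValueError(version + " is not an LTS release")
    year, month = parse_release(version)
    return year + 5, month


for version in ("14.04", "16.04", "17.10"):
    print(version, parse_release(version), "LTS" if is_lts(version) else "interim")
```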
Since version 16.04, Ubuntu offers the LXD Container Hypervisor, based on LXC, similar to Solaris Zones, but running as a daemon. Virtual Machines are heavy because they have their own kernel. Containers instead use the kernel of the underlying hypervisor, but are limited to Linux guests. The Linux guests can be older versions of Debian, Red Hat or SuSE, but with the latest, most secure kernel of Ubuntu for safety and security.
(Canonical gives Ubuntu away for free, but offers "Enterprise Services" for a fee to companies that want this added level of support. One of the features with Enterprise Services is "Live Kernel Update". Normally, updating the Linux kernel requires a reboot, which would cause an outage to all of the VMs and containers running on that host server.)
Like VMs, you can launch containers, switch to bash shell, install software, run applications, and shut down containers, all isolated from other containers. The LXD daemon can run LXC and Docker containers. Some advantages of doing this:
Lift and Shift, live mobility from one system to another
Collocation of different workloads on same node
More efficient to use containers than Virtual Machines
14x greater density with LXD than traditional KVM or VMware (tested on x86)
Based on open source LXC containers
Ubuntu is designed for the "Elastic Hybrid Cloud". Canonical recommends combining on-premises data center with two or more public cloud providers. Scarcity has shifted from "code" to "operations". Are you ready to run applications you don't understand?
Total Cost of Ownership is shifting from code license costs to operational costs. Canonical offers a free, downloadable, operations orchestration platform called "Juju" to help install, configure and scale applications. Juju means "magic" in Swahili.
Scripts on Juju are called charms. There are Juju charms to install and configure things like MongoDB and IBM Spectrum Scale. Furthermore, Juju charms can be bundled together for more complicated deployments.
Juju is not limited to LXD. It can be used with VMware, OpenStack, bare metal servers, and public clouds. It is available on Ubuntu, Red Hat and Windows. As a demo, Ivan built an entire working OpenStack environment, with 20 applications on 4 bare metal servers, all installed and launched with Juju.
For OpenStack, you can use the basic "Ubuntu OpenStack", or a more complete "Canonical OpenStack", or even have Canonical folks manage your environment for you.
Canonical MaaS (Metal-as-a-Service) uses hardware APIs to manage bare metal servers, providing physical provisioning, dynamic allocation for workloads, and even Ubuntu and CentOS operating system installs. Canonical has clients with over 100,000 servers managed with MaaS.
Introduction to IBM Cloud Object Storage System and its applications (powered by Cleversafe)
Before 2015, IBM offered two "Object Storage" products: IBM Spectrum Scale and IBM Spectrum Archive, and I was constantly having to compare and contrast IBM products to Cleversafe.
Not any more! With the IBM acquisition of Cleversafe, IBM now offers all three!
This session explained all of the features and functions of IBM Cloud Object Storage System, available as software, as pre-built systems, including a VersaStack CVD, and as Storage-as-a-Service (STaaS) in the IBM Cloud.
(IBM renamed Cleversafe DSnet to "IBM Cloud Object Storage System". I joked that if IBM ever acquired Coca-Cola, they would probably rename their signature soft drink as the "Brown Carbonated Sugar Liquid", or BroCarb SugarLiq for short!)
In the evening, we had a nice reception with food and drink at the Solution Center. The Solution Center has booths where all of the IBM and Business Partners have their experts answering questions and handing out brochures of their offerings.
This week, I am presenting at the IBM Systems Technical University for Storage and POWER Systems. This conference is being held in New Orleans, Louisiana, October 16-20, 2017, at the beautiful Hyatt Regency.
Storage: Opening Keynote Session
Clod Barrera, IBM Distinguished Engineer and Chief Technical Strategist, and Craig Nelson, Brocade, co-presented this session.
Clod Barrera presented the latest in Storage trends. He organized his talk around four layers: Infrastructure, Storage Management, Storage Systems, and Storage Media.
Craig Nelson presented the changes in Storage Networking. With advancements in both server and storage bandwidth, the storage network becomes the bottleneck. Insane flash storage performance requires insanely fast storage networks. IBM offers Brocade-manufactured switches and directors that now support 32 Gbps. Combining four paths together, these can offer Interswitch Connection Links (ICL) at 128 Gbps.
The Seven Tiers of Business Continuity and Disaster Recovery
With the recent Hurricanes Harvey, Irma, Jose, and Maria, my topic on Business Continuity and Disaster Recovery (BC/DR) was well attended. I have been working in BC/DR for most of my career, including the "High Availability Center of Competency" or HACOC.
Back in 2005, I was here in New Orleans, the week before Hurricane Katrina, for the IBM Storage Symposium, August 22-26, the predecessor of this conference. I left on Friday, August 26, and the storm hit that weekend.
I met with people photographing all the buildings, in hopes of selling "before pictures" to insurance companies and filmmakers after the hurricane hit. Film director Spike Lee bought much of this footage. Smart!
However, natural disasters like hurricanes, tornados and floods represent less than 20 percent of all disasters. The majority of disasters, nearly 75 percent, arise from electrical power outages, human error, system failure and ransomware.
IBM FlashSystem Overview
Andy Walls, IBM Fellow, CTO and Chief Architect, and Brent Yardley, IBM STSM and Master Inventor, co-presented this session. Andy started with FlashSystem 900, V9000 and A9000/R.
The room was packed with standing room only, and Andy was answering so many questions that he never finished his portion, and Brent Yardley never had a chance to cover his portion.
Fortunately, there were "deep dive" sessions on FlashSystem 900, V9000 and A9000/R later in the week, so Andy suggested everyone go to lunch and attend these other more detailed sessions.
Tomorrow, I will be presenting at the IBM Systems Technical University (STU) for Storage and Cognitive Systems (formerly POWER servers). This conference will be held in New Orleans, Louisiana, October 16-20, 2017.
Here is my speaking schedule:
The Seven Tiers of Business Continuity and Disaster Recovery (BC/DR)
IBM's Cloud Storage Options
Introduction of IBM Cloud Object Storage System and its Applications (powered by Cleversafe)
The Pendulum Swings Back -- Understanding Converged and Hyperconverged Integrated Systems
New generation of storage tiering: Simpler management, lower costs, and increased performance
Introduction of IBM Cloud Object Storage System and its Applications (powered by Cleversafe) **repeat**
IBM Spectrum Scale for File and Object storage
If these topics seem familiar, I have presented them at prior events earlier this year, including the STU Orlando in Orlando, Florida, and the one in Melbourne, Australia. However, I have made updates! New products have been announced!
If you are planning to attend, here are some of my past blog posts to help you get up to speed:
STU Orlando - Orlando, Florida
This event was a large 5-day event to replace the technical portion of IBM's previous "Edge" conference.
This event was a smaller 3-day event to bring STU to other countries. We used to call these "Edge Comes to You" events, but now we call them "IBM Systems Technical University" just like the ones in the USA.
The STU at New Orleans will be a 5-day event. Instead of a "Meet the Experts" session, they are having a "Poster Session" in its place. Many of the posters will have QR codes, so make sure you have a "QR Scanner" application installed on your smartphone so you can scan them quickly!
Everyone, speakers and attendees alike, should consider making a QR code for themselves for this event. Go to [any number of websites] that generate a QR code. This could be a VCF file with all of your contact information, a link to your blog or website, or a pointer to your presentations on Slideshare or IBM@Box.
The next time someone at the event asks for this information, display the QR code on your smartphone, and let them scan it. Alternatively, you can send the image via MMS text message.
(My QR Code is fully functional, so go ahead and practice scanning it with your smartphone!)
I arrive in New Orleans Sunday afternoon, so if you are in town, give me a shout! Or tweet me at @az990tony
IBM introduces the eighth generation of Linear Tape Open (LTO) tape drive technology, with corresponding support in all of the IBM tape libraries.
Fellow blogger Jon Toigo, of Drunkendata.com fame, came to Tucson to interview Lee Jesionowski, Ed Childers, Calline Sanchez, and me about this. Check out the various segments on YouTube or his website.
The LTO-8 cartridges are not yet available, but when they are, they will hold 12 TB raw capacity, or 30 TB effective capacity at 2.5-to-1 compression ratio. The new drives are N-1 compatible to read/write LTO-7 cartridge media.
Previous generations also supported reading N-2 generation tapes, but LTO-8 breaks from that tradition and will not support LTO-6 cartridges at all.
LTO-8 comes in both "Full-Height" (FH) and "Half-Height" (HH) models. The FH models can transfer data at 360 MB/sec (or 900 MB/sec effective at 2.5-to-1 compression), and the HH models at 300 MB/sec (or 750 MB/sec effective at 2.5-to-1).
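The effective figures are simply the native numbers multiplied by the assumed 2.5-to-1 compression ratio. Here is a quick Python sketch of that arithmetic, using decimal units. The fill-time estimate is a back-of-envelope addition that assumes the drive streams at its full native rate the whole time, which real workloads may not achieve.

```python
COMPRESSION = 2.5                       # compression ratio quoted for LTO-8
NATIVE_TB = 12                          # raw cartridge capacity in TB
NATIVE_MBPS = {"FH": 360, "HH": 300}    # native transfer rates in MB/sec

def effective(native, ratio=COMPRESSION):
    """Scale a native capacity or transfer rate by the compression ratio."""
    return native * ratio

def hours_to_fill(capacity_tb, rate_mbps):
    """Hours to write a full cartridge at a sustained rate (1 TB = 1,000,000 MB)."""
    return capacity_tb * 1_000_000 / rate_mbps / 3600

print("Effective capacity (TB):", effective(NATIVE_TB))
for model, rate in NATIVE_MBPS.items():
    print(model, "effective MB/sec:", effective(rate),
          "hours to fill:", round(hours_to_fill(NATIVE_TB, rate), 1))
```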
LTO-8 supports IBM Spectrum Archive and the "Linear Tape File System" (LTFS) tape format for self-describing long-term retention of data.
Compliance storage has come under many names. For tape and optical media, we had "WORM" for Write-Once, Read-Many. For disk-based storage, we had "Fixed-Content" or "Content-Addressable Storage". For file systems, we had "Immutable Storage".
Fortunately, the clever folks who crafted SEC Rule 17a-4 came up with an umbrella term: "Non-Erasable, Non-Rewriteable" (NENR) that covers all storage media, from WORM tape and optical, to tamperproof flash, disk and cloud-based solutions.
The other major change is "Concentrated Dispersal" mode, or "CD mode" for short. Erasure Coding works best when data is dispersed across three or more sites. When this happens, you can lose all of the data at one site, and still have 100 percent access to all data from the other locations.
IBM's "Information Dispersal Algorithm", or IDA for short, scattered slices of data across many servers. This was great for high availability and performance, but often meant that the minimum deployment was 500 TB or greater.
Not every organization is ready for such a large purchase. Some want to just [dip their toe in the water] with something smaller and less expensive. Well, IBM delivered!
The new CD mode means that instead of one slice per Slicestor node, you can pack lots of slices on each node. Each slice will be on distinct disk drives, for high availability.
Entry-level configurations now can be as little as 72-104 TB, across 1, 2 or 3 sites.
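The claim that Erasure Coding works best across three or more sites can be checked with simple arithmetic. The sketch below assumes a hypothetical configuration of 12 slices per object, any 8 of which can rebuild the data, spread as evenly as possible across sites. These numbers and the function name are illustrative only and are not taken from IBM documentation.

```python
def survives_site_loss(width, threshold, sites):
    """True if data stays readable after losing every slice at one site.

    width     -- total slices written per object (hypothetical value)
    threshold -- minimum number of slices needed to rebuild the object
    sites     -- number of sites, with slices spread as evenly as possible
    """
    # Worst case: the site that is lost holds the largest share of slices.
    largest_site = -(-width // sites)   # ceiling division
    return width - largest_site >= threshold

for sites in (1, 2, 3):
    print(sites, "site(s):", survives_site_loss(width=12, threshold=8, sites=sites))
```

With these assumed values, only the three-site layout leaves enough slices to rebuild the data after a full site outage.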
Next month, I will be presenting at the IBM Systems Technical University for Storage and POWER. This conference will be held in New Orleans, Louisiana, October 16-20, 2017.
Instead of a "Meet the Experts" Q&A panel, this event will feature a "Poster Session". I had the pleasure of doing one of these down in Melbourne, Australia last month. For those who missed it, here are my blog posts:
By now, you have already decided on a title and abstract of your poster. You will need to figure out a quick and easy way to explain your poster, and as always, shorter is better. It reminds me of a famous quote:
"Sorry this letter is too long...
If I had more time, I could have made it shorter!"
-- Blaise Pascal
The event team asked me to write some instructions on the mechanics of how to put together a poster for this, since it is new for many people. I use Microsoft PowerPoint 2013 and ImageMagick tools to accomplish this.
Arrangement of Slides
Posters for the IBM Systems Technical University in New Orleans will be 24x36 inches in size. If you print out your poster in 8.5x11 inch standard size letter pages, that would be eight slides, 2 columns, 4 rows. This leaves one inch border all around.
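The border arithmetic can be verified with a few lines of Python. This sketch assumes the poster hangs in portrait orientation (24 inches wide, 36 inches tall) and the Letter pages are printed in landscape (11 inches wide, 8.5 inches tall); the function name is just for illustration.

```python
POSTER_W, POSTER_H = 24, 36      # poster size in inches, portrait
PAGE_W, PAGE_H = 11, 8.5         # Letter page in landscape orientation
COLS, ROWS = 2, 4                # grid of printed slides

def borders(poster_w, poster_h, page_w, page_h, cols, rows):
    """Return (side, top_bottom) border in inches when the grid is centered."""
    side = (poster_w - cols * page_w) / 2
    top_bottom = (poster_h - rows * page_h) / 2
    return side, top_bottom

print("Slides:", COLS * ROWS)
print("Borders (side, top/bottom):",
      borders(POSTER_W, POSTER_H, PAGE_W, PAGE_H, COLS, ROWS))
```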
The event will provide both the foam board and double-sided sticky tape. You can bring your poster as a stack of Letter-sized pages in a folder, and assemble your poster at the event.
You can increase the size of an individual image to 17x22 inches, to offer the "Big Picture" view. Basically, we can take a standard 8.5x11 Letter size page, expand it onto four separate pages, and then put them on the poster! I will show you how in the steps below.
Lastly, you can have two big slides. If your poster is organized as "Before/After" or "Problem/Solution" then this arrangement could be perfect for you.
Setting Custom Paper Size on PowerPoint
In Melbourne, I had to use European A4 standard paper, and had to figure out how to do this in PowerPoint. I was surprised to learn that the PowerPoint default is 4:3 ratio of 10x7.5 inch, and that this is stretched to be whatever paper size you print on.
The difference is slight, but I prefer [WYSIWYG], so we will change the slide to "Custom size" and force it to 8.5x11 inches, with "Landscape" orientation. This will avoid anything looking stretched or squished on the big poster.
Converting a PowerPoint Slide to PNG Image file
If you would like to resize one or more of your PowerPoint slides, you will need to save those slides as images. Select "File" and "Save As" and as the format, choose "PNG" format. You can also select GIF or JPG, but I prefer PNG.
You can export all of your slides as images, in which case it will create a folder and number each slide individually. Or, you can select "Just This One" for the current slide.
By default, it will use the same name as your PPT file, with the extension changed to PNG. I suggest you name the file something meaningful to you. In my examples below, I use "small.png" as the file name.
I am using PowerPoint 2013, which defaults to 96 dpi. So, an 8.5x11 paper becomes 1056x816 pixels in size.
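The pixel dimensions follow directly from inches multiplied by dots per inch. A minimal sketch, assuming a landscape Letter slide (11 inches wide, 8.5 inches tall); the function name is hypothetical.

```python
def pixels(width_in, height_in, dpi):
    """Convert a page size in inches to (width, height) in pixels at a given DPI."""
    return round(width_in * dpi), round(height_in * dpi)

print(pixels(11, 8.5, 96))     # landscape Letter slide at the 96 dpi default
print(pixels(11, 8.5, 300))    # the same slide exported at 300 dpi
```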
If you have PowerPoint 2003 or higher, you can change the Windows registry to specify image resolutions. Not recommended for the faint of heart. Or anyone else. But here's the deal if you want to try (if the following doesn't make any sense, it might be better not to mess with the registry):
Quit PowerPoint if it's running
Navigate to HKEY_CURRENT_USER\Software\Microsoft\Office\X.0\PowerPoint\Options
(For X.0 above, substitute 16.0 for PowerPoint 2016, 15.0 for PowerPoint 2013, 14.0 for PowerPoint 2010, 12.0 for PowerPoint 2007 and 11.0 for PowerPoint 2003.)
Add a new DWORD value named ExportBitmapResolution and set its DECIMAL value to the DPI value you want (for example, 300 means 300 dots per inch)
Close REGEDIT, start PowerPoint and test. Your files will be 3300x2550 pixels instead.
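Cutting an enlarged "Big Picture" slide into four page-sized pieces comes down to four crop boxes. Here is a minimal sketch of that geometry, assuming the slide was first enlarged to twice its width and height (2112x1632 pixels at 96 dpi) and that the tiles are numbered row by row, left to right. The actual resize and crop would be done in an image tool such as ImageMagick; this only computes the coordinates.

```python
def tile_boxes(page_w, page_h, cols=2, rows=2):
    """Return (left, top, right, bottom) pixel boxes, row by row, for an image
    of size (cols * page_w, rows * page_h) cut into page-sized tiles."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left, top = c * page_w, r * page_h
            boxes.append((left, top, left + page_w, top + page_h))
    return boxes

# One page is 1056x816 pixels at 96 dpi
for i, box in enumerate(tile_boxes(1056, 816)):
    print(f"big_{i}.png", box)
```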
Since the resulting four pieces are exactly the size of a page, you can put them back into your PowerPoint deck. Create four blank slides, select Insert then Pictures. Insert each picture (big_0.png, big_1.png, big_2.png, and big_3.png) as a separate page.
You can print this out, and bring with you to the event, or send it to someone to have them print for you.
Upload files to IBM@Box
This next step is completely optional, but I found it adds a nice touch. As an IBMer, you can upload your presentation, and any documents, whitepapers or other materials, to [IBM@Box]. Create a directory that is unique to you, such as your last name and the conference. For example, I have "Pearson-STU-NOLA-2017" as my folder name.
You can create a "URL Link" to this folder. Select "Share", then "Share Link" to create a dialog box. It is important to specify "People with this link" if you want those outside of IBM, such as clients and IBM Business Partners, to have access.
Press the little "gear" button on the upper right, and it gives you options to customize the URL. Normally the URL is some long random sequence of characters, but you can rename it to something meaningful and easier to remember.
Generate a QR Code
Since you have a URL Share Link for your files on IBM@Box, you can generate a QR Code for this link, and include on your poster!
There are several online websites that can generate a QR Code for free. I use [QRme.com] in this example. Go to the website, copy in the URL, and press the "Generate" button.
Once the QR Code is generated, right-click and "Save Image" to a file on your hard drive. This image can be inserted as a picture onto any slide, like we did above. You can resize as needed.
In Melbourne, one of the posters had the QR Code at the top, next to the title, where it was hard to see and difficult to scan with a smartphone. For this reason, I recommend putting the QR code in the lower right corner of your poster, between shoulder and waist height for the audience, so that it is comfortable to scan.
I am looking forward to going back to New Orleans to speak at this conference!
Well, it's Tuesday again, and you know what that means? IBM Announcements!
IBM announced a new product, IBM Spectrum Protect Plus. To understand why, I will need to discuss a bit of history related to Data Protection.
(FCC Disclosure: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for IBM Spectrum Protect, IBM Spectrum Protect Snapshot, IBM Spectrum Protect for Virtual Environments, and IBM Spectrum Copy Data Management products. I was not paid in any manner to promote Geoffrey Moore's book mentioned below.)
IBM Spectrum Protect was originally developed as the Workstation Data Save Facility (WDSF) back in the 1980s, back when Personal Computers were just getting deployed.
I started in 1986 developing mainframe software, so we all had bulky 3270 terminals. When our area was offered 120 PCs to replace them, I was tasked with determining how to roll these out, 24 at a time, over five months.
My job was to determine who would get a PC in the first round, the second round, and so on. I handed out a simple one-page survey, asking everyone basic questions. Are you familiar with Personal Computers? Do you have one at home? Are you comfortable using a mouse? My plan was to give PCs to those most familiar with them sooner, and to those less familiar in later rounds.
However, it was my final question that sealed the deal:
How soon do you want a PC to replace your 3270 terminal?
[ ]Immediately [ ]Next month [ ]No Hurry [ ]Put me last [ ]Never!
Surprisingly, I had roughly 24 folks choosing each option on this last question, which made my decision process easy for me!
(In his book Crossing the Chasm, fellow author Geoffrey Moore would come up with similar groups: Innovators, Early Adopters, Early Majority, Late Majority, and Laggards. This is a great book and I highly recommend it!)
Of course, we used WDSF to back up the files. WDSF would later morph into DFDSM, then ADSM, then TSM, and now it is called IBM Spectrum Protect.
Over the decades, the product has evolved from just backing up data on personal computers. IBM Spectrum Protect can now protect all kinds of machines, from tablets, mobile devices, and smartphones, to virtual machines, databases, and application servers in the data center.
Besides creating backup versions of files, IBM Spectrum Protect can also migrate older, less frequently used files to less expensive media, as well as archive files for long-term retention.
Different files can be assigned to different "management classes" that determine policies to be applied and enforced on the backup, migration and archive copies. For backups, this includes how many versions to keep while the file exists, how many versions to keep after the original file is deleted, and how long to keep those inactive versions.
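To make the backup policy idea concrete, here is a simplified Python sketch of how the three settings above might decide which versions survive. The class, field names and aging rule are assumptions for illustration, not the product's actual parameters or expiration logic. In particular, it measures the age of each version from its backup date, which is a simplification.

```python
from dataclasses import dataclass

@dataclass
class BackupPolicy:
    versions_while_exists: int     # versions kept while the original file exists
    versions_after_delete: int     # versions kept once the original is deleted
    inactive_retention_days: int   # how long to keep inactive versions

def versions_to_keep(policy, version_ages_days, file_exists):
    """Return the ages (in days) of the backup versions the policy retains.

    version_ages_days lists versions newest first. When the file still exists,
    the first entry is the active version and is always kept.
    """
    limit = policy.versions_while_exists if file_exists else policy.versions_after_delete
    kept = []
    for index, age in enumerate(version_ages_days[:limit]):
        is_active = file_exists and index == 0
        if is_active or age <= policy.inactive_retention_days:
            kept.append(age)
    return kept

# Hypothetical policy: 3 versions while the file exists, 1 after deletion, 30 days
policy = BackupPolicy(versions_while_exists=3, versions_after_delete=1,
                      inactive_retention_days=30)
print(versions_to_keep(policy, [0, 20, 45, 60], file_exists=True))
print(versions_to_keep(policy, [2, 9, 40], file_exists=False))
```

In the first call, the fourth version falls outside the version limit and the 45-day-old version is past the retention period, so two versions remain. In the second call, only the newest version of the deleted file is kept.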
Instead of a grandfather-father-son [backup tape rotation], full-plus-incremental, or full-plus-differential scheme employed by other backup software, IBM Spectrum Protect has a unique "Incremental-Forever" approach that reduces backup time, LAN bandwidth requirements, and backup storage media.
While most companies still back up to tape, IBM Spectrum Protect can back up to flash, disk, tape, virtual and physical tape libraries, object storage, and even to public Cloud Service Providers such as IBM Bluemix, Amazon S3, and Microsoft Azure.
IBM Spectrum Protect offers both client-side and server-side data footprint reduction technologies including compression and deduplication, eliminating the need for expensive, single-purpose data deduplication devices like Dell-EMC Data Domain.
IBM Spectrum Protect is recognized as a leader in Data Protection software, able to scale up to meet the demands of the largest enterprises. However, the parameters and options that IBM Spectrum Protect has acquired over time have been compared to the cockpit or flight deck of an airplane!
For clients with Virtual Machines, IBM offered three solutions:
IBM Spectrum Protect Snapshot
Formerly called Tivoli Storage FlashCopy Manager (FCM), [IBM Spectrum Protect Snapshot] provides frequent, near-instant, non-disruptive, application-aware backups and restores for SAP, Oracle and Db2. It can also be used for VMware using advanced snapshot technology, on both IBM and non-IBM storage systems.
IBM Spectrum Protect Snapshot can be used as a stand-alone product, or integrated with IBM Spectrum Protect to move the snapshots and FlashCopy targets to other storage media.
IBM Spectrum Protect for Virtual Environments (VE)
Formerly called IBM Tivoli Storage Manager for Virtual Environments, [IBM Spectrum Protect VE] protects both VMware and Microsoft Hyper-V virtual machines.
IBM Spectrum Protect VE safely moves backup workloads to a centralized IBM Spectrum Protect server and enables administrators to create backup policies or restore virtual machines with just a few clicks. It allows you to protect data without a traditional backup window.
IBM Spectrum Copy Data Management
IBM Spectrum Copy Data Management makes copies available to DBAs, Developers and VM administrators when and where they need them. While this product is focused on DevOps and Dev/Test workflows, it can also be used to automate and schedule snapshots that can serve as backups.
Surprisingly, many companies do not take advantage of these solutions. Even clients who already have IBM Spectrum Protect deployed either (a) simply use Spectrum Protect clients on individual VM guests, or (b) use third-party products to back up VMs outside of the Spectrum Protect infrastructure.
"Problems cannot be solved with the same mind set that created them."
-- Albert Einstein
Smaller clients want something simpler to deploy, and easier to use and administer. Rather than simplify the products above, a process called "kneecapping" in the IT industry, IBM opted for a clean slate, [start-from-scratch] approach.
The result is IBM Spectrum Protect Plus, new software that was preview announced last Wednesday in time for this week's VMworld 2017 conference in Las Vegas, and next month's VMworld conference in Barcelona, Spain.
IBM Spectrum Protect Plus is available as either a stand-alone product, or integrated with IBM Spectrum Protect for long-term protection. It is focused exclusively on VMware and Hyper-V environments. General Availability is expected some time in 4Q 2017.
Key features include:
Simple to install in less than 15 minutes, configured in an hour
Easy to use by DBA, VM or application administrator. No IBM Spectrum Protect skills required for stand-alone deployment
Pre-defined Gold, Silver and Bronze policies are ready to use. Additional customized policies can be configured as needed
Supports both application-aware and crash-consistent methods
Data Footprint Reduction technologies including compression and deduplication
Instant data recovery to support DevOps, Dev/Test, Reporting, Analytics and Training
Granular search and restore of entire Virtual Machines, VMDKs, and individual files
As for the name, I would have preferred "IBM Spectrum Protect Basic Edition". The "Plus" implies that the new product is more advanced, or offers more features, than the existing Spectrum Protect editions.