This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is a frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is IBM Cloud Private, and he will be delivering and supporting sessions at Think2019 and Storage Technical University on the value of IBM storage in this high-value IBM solution, which is part of the IBM Cloud strategy. Lloyd maintains a Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
This week, I am presenting at the IBM Systems Technical University for Storage and POWER Systems. This conference is being held in New Orleans, Louisiana, October 16-20, 2017, at the beautiful Hyatt Regency.
This is my recap for sessions on Day 2 morning.
FlashSystem A9000 and A9000R Overview
Andy Walls, IBM Fellow, CTO and Chief Architect, and Brent Yardley, IBM STSM and Master Inventor, co-presented this session. This was the "deep dive" of the A9000/R, a basic continuation of the one they did yesterday.
The Pendulum Swings Back -- Understanding converged and hyperconverged integrated systems
With IBM's partnership with Nutanix, this has become a particularly popular topic. I cover the last 50 years of storage evolution, from internal storage and external storage to NAS and SAN storage networks.
More recently, people have been willing to give up all those gains for something simpler, less powerful, less reliable, less expensive. Enter Converged and Hyperconverged Systems. IBM PureSystems and VersaStack lead the pack for Converged Systems, along with IBM Spectrum Scale, Spectrum Accelerate and Nutanix on IBM Power Systems for Hyperconverged Integrated Systems.
New Generation of Storage Tiering -- Less Management, Lower Costs, and Improved Performance
There are orders of magnitude between the fastest All-Flash Array and the least expensive tape storage. Ideally, there would be a "slider bar" that allowed people to select from the fastest to the least expensive. IBM offers a variety of solutions to offer this "slider bar", with automation to move data as needed between tiers.
I start with IBM Easy Tier, available on DS8000 and Spectrum Virtualize products, move on to IBM Virtual Storage Center, where advanced analytics move data to the right location, and finish with IBM Spectrum Scale, which provides the ultimate tiering, across multiple locations, between flash, disk and tape.
The lunches at these conferences are amazing, but then the "Big Easy" is known for its food!
My session on IBM Cloud Object Storage had three sections. First, I covered an overview of what "Object Storage" was in general, how this differs from traditional block or file storage approaches.
Second, I explained what is unique and different about the IBM Cloud Object Storage System, formerly called DsNet from Cleversafe. IBM acquired Cleversafe in 2015.
Third, I explained the various applications, use cases and industries that can take advantage of Object Storage.
IBM Storage and the NVMe Revolution
Brian Sherman, IBM Distinguished Engineer for Storage Advanced Technical Services, presented an overview of NVMe, NVMe Over Fabric (NVMeOF) and what IBM is doing in this area.
How to Build a Rockstar Personal Brand
Andrea Edwards, The Digital Conversationalist, is a globally award winning B2B communications professional with more than 20 years' worth of experience from around the globe, including 12 years exclusively in Asia Pacific. IBM has hired her in the Asia Pacific region to train many IBMers in Social Media.
She condensed her normal 5-6 hour training down to a single hour for this event. She explained why building a personal brand was important, how to do it, and why businesses and organizations should encourage their employees to do so.
For example, who has the most influence on most people? Behind friends and family are bloggers. Bloggers are more influential than journalists, religious leaders, celebrities and politicians.
(As the #1 blogger of IBM, I am considered to already have a "rockstar personal brand". I am pleased to see that IBM is taking social media seriously. I have been blogging since 2006, and have influenced over $4 billion USD in IBM revenue in the past 11 years.)
IBM Spectrum Virtualize technical updates
Andrew Martin, IBM Spectrum Virtualize Support Architect, presented the last 18 months of enhancements to Spectrum Virtualize, from v7.6.1 introduced in March 2016 to v7.8.1 released earlier this year.
He managed to highlight quite a few enhancements:
Distributed RAID 5 and RAID 6
Integrated Compresstimator tool
New hardware: SVC, Storwize V7000 Gen2+, Storwize V5000 Gen 2, and 92-drive 5U High Density Expansion Enclosure
N-Port ID Virtualization (NPIV)
Virtualization Over iSCSI
Encryption for Distributed RAID Arrays
64GB Read Cache
Tier 1 Flash Support
Compressed IP Replication
Spectrum Virtualize as Software for Lenovo and SuperMicro servers
Host Clusters and Throttling
Raised limit to 10,000 Volumes
Transparent Cloud Tiering
Storwize Model Conversions
IBM SKLM Support for Encryption
Consistency Protection for Metro and Global Mirror remote-distance replication
Andrew called this a "reverse roadmap": rather than presenting where we are going in the next 18 months, he presented where we have been.
Solution Center Reception
Here I am with Morgan Tracey and Jenna Brooker from Computer Merchants, an IBM Business Partner.
Not only were Computer Merchants a sponsor with a booth at the Solution Center, but they also gave a customer testimonial at one of the breakout sessions on how they were able to use IBM Artificial Intelligence to help with their business.
I also spent time at the SuSE booth. SuSE is a distributor of Linux that runs on x86, POWER and IBM Z mainframe systems.
While I was working, Mo took a tour to Phillip Island. On the way, they stopped at Maru to feed kangaroos and take pictures with Koala bears.
At Phillip Island, Mo watched penguins come out of the ocean, waddle up on shore and march to their burrows. This happens every evening and is one of the top tourist attractions near Melbourne.
Well, it's Tuesday again, and you know what that means! IBM Announcements!
Starting today, April 1, 2014, the IBM Executive Briefing Centers (EBC) are adopting a new self-hosted model. In the past, each briefing was assigned a "Briefing Host", a member of the EBC staff, who acted as [master of ceremonies] for the day (or more) for the clients. At some locations, if there were three rooms, there would be three or more briefing hosts so that concurrent briefings could be held.
However, the method does not scale. Having a person per briefing means that you are limited to the number of total concurrent briefings. Inspired by self-service provisioning and scalability of the Cloud, IBM has adopted a new methodology.
In the new model, the visiting client rep, sales rep, or IBM Business Partner will be handed instructions and a map. This will include the agenda, the schedule, biographies of each speaker, the locations of the nearest restrooms, and so on.
I can take partial credit for the idea. In 2012, I made the analogy that having briefing centers at each development lab made a lot of sense, because it allowed clients to interact directly with the engineers and executives that made development decisions. I also made the analogy that having a fully-staffed EBC was like a fire department: whether you have five briefings per month, or fifty, you need a team that is ready, staying abreast of the latest technological changes.
In my post, [Like animals in the zoo], I argued there are two kinds of zoos, the self-guided kind, where visitors are handed a map, versus the docent-guided kind, where a member of the zoo staff introduces you to each animal.
The EBC briefing hosts in this analogy were the docents, and the animals that people came to visit were the engineers and executives.
As for the fire department, IBM management flipped the analogy around. They argued that many smaller communities had "volunteer fire departments", eliminating the need to keep full-time employees doing nothing but playing cards and sliding down brass poles in between fire fighting sessions. When a fire happens, phone calls are made, and this will help get everyone notified to get involved.
In my past 28 years at IBM, I have to say that you know you have good analogies when they can be used in both directions. The zoo analogy was used to prevent management from consolidating all of the EBC staff to Austin, TX. The fire department analogy helped us keep all of our lab equipment to run demonstrations.
The new self-hosted model will address both scheduling and scalability issues. We often had two-day and three-day briefings, and scheduling the rooms, and the briefing managers, based on their availability, was quite challenging.
There are three advantages to the new method:
A coordinator will merely assign rooms, no longer worrying if a briefing host is available for those days. Now, each EBC location can run at full capacity, limited only by real estate and floor space.
Subject matter experts, like myself, that often did double-duty serving as briefing hosts as needed, will have more free time. I personally will be doing more "outbound briefings" to attend conferences and visit clients at their location, eliminating the time I need to be in Tucson to host "inbound" briefings.
The awkward silence that happens when the client rep, sales or IBM Business Partner invites all the clients and presenters, but forgets to invite the briefing host, is completely eliminated.
Ken Gibson has written a four-part series about where the storage industry is going, on his Storage Thoughts blog. You can find the four parts here (Part 1, Part 2, Part 3, Part 4).
His analysis of the storage industry is based on the concepts in Clayton Christensen's latest book, "Seeing What's Next", which follows on the heels of his last two successes, "The Innovator's Dilemma" and "The Innovator's Solution". I've only read the first book, "The Innovator's Dilemma", but need to check out these other two.
Ken explores the efforts of the incumbent players, and I agree IBM is farthest along, but not only for our "Storage Tank" architecture. For those not aware of Storage Tank, it was the code-name of a project from IBM's Almaden Research Center, productized as IBM System Storage SAN File System (SFS). Earlier this year the advanced policy-based data placement, movement and expiration features of SFS were copied over to IBM's General Parallel File System (GPFS) which has wide adoption among the High-Performance Technical Computing (HPTC) community. As I've said before, switching from one file system to another is hard, so it makes sense for HPTC clients who already use GPFS to make use of these new features by staying with GPFS, rather than trying to get them to move to SFS.
I also like Ken's analysis of "overshot" and "undershot" clients. Overshot clients are those that find what the marketplace delivers already "good enough" for their needs, and are price sensitive against paying for features they don't think they need. Undershot clients are those for whom the current marketplace offerings are not yet good enough, and who are willing to pay a premium to the vendor or supplier that can get them closer to what they are looking for.
Changes are afoot, and it is an exciting time to be involved in the storage industry.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the afternoon sessions of Day 2.
IBM Spectrum Protect deep dive into Container Storage Pools
Ron Henkhaus, IBM Certified Consulting IT Specialist, presented the new Spectrum Protect concept of "Container Pools" that can either be "Directory Pools" on SAN or NAS-based disk storage, or "Cloud Pools". Container pools can contain deduplicated and non-dedupe data.
Ron cautioned that directory pools should not be placed on the same file system as your Spectrum Protect database or logs. Also, best practice for any directory pool is to assign an "overflow" pool to any non-directory pool, such as disk, tape or cloud container.
Cloud pools can use OpenStack Swift, V1 Swift, the Amazon S3 protocol, Amazon Web Services, IBM Bluemix, or IBM Cloud Object Storage. You can pre-define the vaults and buckets in the configuration.
For off-premises Cloud pools, the data is encrypted by default. For other container pools, encryption is optional. Performance to Cloud pools has been improved by using "accelerator storage", basically a disk cache to collect data before sending it over to the Cloud pool. Backups to Cloud pools can reach 8 TB per hour. Restore rates vary from 500 to 1500 GB per hour.
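To put those rates in perspective, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 TB = 1000 GB) and that the quoted rates are sustained for the whole transfer; the function name and the 24 TB data set are hypothetical, chosen only for illustration.

```python
def hours_to_transfer(size_gb: float, rate_gb_per_hour: float) -> float:
    """Return the hours needed to move size_gb at a sustained rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gb_per_hour


# Rates quoted in the session (assumed sustained, decimal units).
BACKUP_RATE_GB_PER_HOUR = 8000         # "8 TB per hour"
RESTORE_RATE_SLOW_GB_PER_HOUR = 500
RESTORE_RATE_FAST_GB_PER_HOUR = 1500

data_gb = 24000  # hypothetical 24 TB data set

backup_hours = hours_to_transfer(data_gb, BACKUP_RATE_GB_PER_HOUR)
restore_best = hours_to_transfer(data_gb, RESTORE_RATE_FAST_GB_PER_HOUR)
restore_worst = hours_to_transfer(data_gb, RESTORE_RATE_SLOW_GB_PER_HOUR)

print(f"backup: {backup_hours:.1f} h")
print(f"restore: {restore_best:.1f} to {restore_worst:.1f} h")
```

The gap between backup and restore rates is the point: a data set that backs up in an afternoon could take a day or two to bring back, which matters when setting recovery time objectives.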
Container Pools were designed for the new "Deduplication 2.0" feature introduced in version 7. Traditional Dedupe 1.0 to Device Class FILE is still available, but not recommended.
Version 7.1.6 changed the compression algorithm from LZW to LZ4. In all cases, Spectrum Protect performs these actions in this order: deduplication, compression, encryption. Data that is encrypted by the Spectrum Protect client is therefore not deduped.
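Why does client-side encryption defeat deduplication? A small sketch may help. Everything here is illustrative rather than how Spectrum Protect is implemented: `toy_encrypt` is a deliberately insecure stand-in for a real cipher, `zlib` stands in for LZ4 (which is not in the Python standard library), and the key and chunk contents are made up. The assumption is that each encrypted chunk carries its own random nonce, so identical plaintext no longer produces identical bytes.

```python
import hashlib
import os
import zlib


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for encryption (NOT secure): XOR with a SHA-256 keystream.

    A fresh random nonce is used per call, so the same input encrypts
    to different output every time.
    """
    nonce = os.urandom(16)
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        stream += block
        counter += 1
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def unique_chunks(chunks) -> int:
    """Count distinct chunks by SHA-256 fingerprint, as a dedupe store might."""
    return len({hashlib.sha256(c).hexdigest() for c in chunks})


chunk = b"the same block of backup data " * 100
chunks = [chunk] * 10                 # ten identical chunks
key = b"hypothetical-client-key"

plain_unique = unique_chunks(chunks)
encrypted_unique = unique_chunks([toy_encrypt(c, key) for c in chunks])

# Order from the session: deduplicate, then compress, then encrypt.
deduped = list({hashlib.sha256(c).digest(): c for c in chunks}.values())
stored = [toy_encrypt(zlib.compress(c), key) for c in deduped]

print(plain_unique, encrypted_unique, len(stored))
```

Fingerprinting the plaintext finds one unique chunk out of ten, while fingerprinting the encrypted copies finds ten. Doing the steps in the order dedupe, compress, encrypt keeps the savings and still leaves the stored data encrypted.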
The "Protect Storage Pool" command can replicate a directory pool to either a remote directory pool or Cloud pool. In addition to this remote replication, you can copy a directory pool to tape to offer air-gap protection against ransomware. Such tapes are considered part of the "Copy Container Pool". In the event of directory pool corruption, the data can be repaired from either replication or tape.
IBM Aspera can now be used for replication, using SSL and AES 128-bit encryption. If your latency is greater than 50 msec and you have more than 0.5 percent packet loss, Aspera might help. This is available for Linux on x86 platforms running v7.1.6 or higher.
For existing customers, IBM Spectrum Protect allows you to convert your FILE, VTL and TAPE device class pools to directory or Cloud pools.
Introduction to IBM Cloud Object Storage (powered by Cleversafe)
In 2015, IBM acquired Cleversafe, recognized as the #1 Object Storage vendor. Their flagship product was officially renamed to the IBM Cloud Object Storage System, which some abbreviate informally as IBM COS. IBM offers the IBM Cloud Object Storage System in three ways: as software, as pre-built systems, and as a cloud service on IBM Bluemix (formerly known as SoftLayer).
Since then, IBM has been busy integrating IBM COS into the rest of the storage portfolio. I explained how IBM COS can be used for all kinds of static-and-stable data, but is not suited for frequently changed data, such as Virtual machines or Databases.
Object storage can be accessed via NFS or SMB NAS-protocols using a gateway product, like IBM Spectrum Scale, or those from third-party partners like Ctera, Avere, Nasuni or Panzura. It can also be used as an alternative to tape for backup copies, and is already supported by the major backup software like IBM Spectrum Protect, Commvault Simpana, or Veritas NetBackup.
While other cloud service providers have offered data storage in the cloud, this new offering also allows hybrid configurations with geographically dispersed erasure coding.
Unlike RAID which protects against the loss of one or two drives, erasure coding can protect against a larger number of concurrent failures. For example, using an Information Dispersal Algorithm (IDA) of "7+5", where seven pieces of data are encoded on twelve independent disks, the system can lose up to five disk drives without losing any data.
Combining this with Geographically Dispersed Configuration across three or more sites means that you can lose an entire data center, four of the twelve disks, and still have instant full access to all of your data from eight drives at the other locations. In the graphic, you see two on-premise data centers combined with a third location in IBM SoftLayer.
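The arithmetic behind the "7+5" example can be sketched in a few lines. This is only the counting argument, not the dispersal algorithm itself; the function names are hypothetical, and the sketch assumes slices are spread evenly across sites and that any seven of the twelve slices are enough to read the data, as described above.

```python
def tolerated_failures(width: int, threshold: int) -> int:
    """Number of slices that can be lost while the data stays readable."""
    return width - threshold


def survives_site_loss(width: int, threshold: int, sites: int) -> bool:
    """True if losing one whole site still leaves at least `threshold` slices.

    Assumes slices are spread evenly across the sites.
    """
    if width % sites != 0:
        raise ValueError("this sketch assumes an even spread across sites")
    slices_per_site = width // sites
    return width - slices_per_site >= threshold


WIDTH = 12       # total slices written ("7+5")
THRESHOLD = 7    # slices needed to read the data

print(tolerated_failures(WIDTH, THRESHOLD))           # disks that can fail
print(survives_site_loss(WIDTH, THRESHOLD, sites=3))  # three-site layout
print(survives_site_loss(WIDTH, THRESHOLD, sites=2))  # two-site layout
print(round(WIDTH / THRESHOLD, 2))                    # raw capacity per usable unit
```

With three sites, losing one site removes four slices and leaves eight, which is above the threshold of seven. With only two sites, losing one leaves six, which is why the geographically dispersed configuration calls for three or more locations.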
New Generation of Storage Tiering: Simpler Management, Lower Costs, and Improved Performance
With ever-changing amounts of storage, it is hard to find metrics that are consistent year to year. Fortunately, I found I/O density to be a metric worth focusing on, armed with real data from Intelligent Information Lifecycle Management (IILM) studies done at various clients. From that, I was able to talk about storage tiering on three fronts:
Storage tiering between Flash and disk. IBM FlashSystem and IBM Easy Tier on DS8000 and Spectrum Virtualize family for hybrid Flash-and-disk configurations.
Storage tiering between disk, tape, and Cloud. HSM and Information Lifecycle Management (ILM) on Spectrum Scale, Elastic Storage Server (ESS), Spectrum Archive and IBM Cloud Object Storage System.
Storage tiering automation across your entire environment. IILM studies can help identify a target mix of Tier 0, Tier 1, Tier 2 and Tier 3 storage. IBM Spectrum Storage Suite and the Virtual Storage Center (VSC) can recommend or perform the movement of LUNs to more appropriate tiers, based on age and I/O density measurements.
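As a rough illustration of how an I/O density measurement could drive tier placement, here is a minimal sketch. It assumes I/O density means IOPS per GB of capacity. The thresholds, volume names and numbers are all hypothetical; they are not taken from IILM studies or from Virtual Storage Center, which also considers data age.

```python
def io_density(iops: float, capacity_gb: float) -> float:
    """I/O density, assumed here to be IOPS per GB of capacity."""
    return iops / capacity_gb


# Hypothetical thresholds (minimum IOPS per GB for each tier), highest first.
TIER_THRESHOLDS = [(1.0, "Tier 0"), (0.5, "Tier 1"), (0.1, "Tier 2")]


def recommend_tier(density: float) -> str:
    """Return the first tier whose minimum density is met, else Tier 3."""
    for minimum, tier in TIER_THRESHOLDS:
        if density >= minimum:
            return tier
    return "Tier 3"


# Hypothetical volumes: (name, measured IOPS, capacity in GB).
volumes = [
    ("db_logs", 5000, 2000),
    ("mail", 600, 1000),
    ("home_dirs", 400, 2000),
    ("archive", 50, 5000),
]

recommendations = {
    name: recommend_tier(io_density(iops, gb)) for name, iops, gb in volumes
}
for name, tier in recommendations.items():
    print(f"{name}: {tier}")
```

Because the metric is a ratio, it stays comparable from year to year even as capacities grow, which is what makes it useful for tracking.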
It's hard to say what the correct sequence of presentations should be. Some thought it might have been better to have my talk on IBM Cloud Object Storage System before Ron's talk on Cloud container pools, but perhaps hearing Ron first helped drive more interest to my session.
While some might be familiar with mashups that combine public Web 2.0 sources of information, enterprise mashups go one step further, integrating with the "information infrastructure" of your data center. It's not just enough to deliver the right information to the right person at the right time, it has to be in the right format, in a manner that can be readily understood and acted upon. Enterprise mashups can help.
Last year in Beijing, China, one of my colleagues told me "When it rains here, cabs dry up". Normally, there are enough taxi cabs to handle normal conditions, but when it rains, people who normally walk now want to take a cab instead, and the demand goes up, making it more difficult to find one when you need one.
I'm wrapping up my week here in Chicago, and it snowed yesterday. Cabs were scarce. I walked. Many others walked too, about half with umbrellas to protect themselves against the snowflakes.
Most systems are designed to handle typical average conditions. Taxi cabs in a city, for example, handle typical amounts of traffic.
IT is different. In many cases, IT infrastructures are designed for the peaks, not the averages. Peaks can be where you need performance the most, and failure to design for peaks can be disastrous. As with any business decision, this represents a trade-off. Design for the average, and suffer through the peaks, or design for the peak, and be over-allocated and under-utilized most of the time otherwise.
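The trade-off can be made concrete with a few lines of arithmetic. The hourly demand figures below are hypothetical, picked only to show the shape of the problem.

```python
# Hypothetical demand per hour (for example, thousands of IOPS).
hourly_demand = [20, 20, 25, 30, 90, 100, 40, 25]

average = sum(hourly_demand) / len(hourly_demand)
peak = max(hourly_demand)

# Sized for the peak: how much of the capacity is used on average?
utilization_if_sized_for_peak = average / peak

# Sized for the average: how many hours does demand exceed capacity?
hours_over_capacity = sum(1 for d in hourly_demand if d > average)

print(f"average={average}, peak={peak}")
print(f"utilization when sized for peak: {utilization_if_sized_for_peak:.0%}")
print(f"hours over capacity when sized for average: {hours_over_capacity}")
```

In this made-up example, sizing for the peak leaves the system under half utilized on average, while sizing for the average leaves two hours where demand cannot be met.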
Well, it's the end of the year, so I thought a recap of year 2014 would be in order.
The year started out with some January announcements, including the IBM FlashSystem 840. IBM is proud to be ranked #1 in All-Flash Arrays, and the IBM acquisition of Texas Memory Systems has caused all of the other competitors to scramble to assemble their own wanna-be offerings. IBM also announced it was going to sell off its System x division to Lenovo.
In February, I wrapped up a project to build a Linux-based PC for a kindergarten class. IBM announced some exciting new things at Pulse 2014 conference, including IBM Bluemix Platform-as-a-Service (PaaS), new IBM SmartCloud Virtual Storage Center offerings, and acquisition of Cloudant Database. Also, on Valentine's day, IBM announced the FlashSystem V840, which combines the software-defined storage features of SAN Volume Controller, with the Microlatency of the FlashSystem 840. IBM sold its 10,000th PureSystems converged expert-integrated system.
In March, I completed a six-month film project ["A Tucson Executive Briefing Center: A Quick Visual Tour"]. I was writer/director/actor for this quick 3-minute film posted on YouTube. I wrote the script and had it reviewed by a professional script reviewer, hired a professional cinematographer, paid royalties for background music, located a voice-over expert for narration, and trained the actors (all IBM employees) how to read their lines and stand on their mark for the camera. It was a big success!
In April, I presented at the Systems Technical University in Istanbul, Turkey. I had been to Turkey before, but this was my first time to the city of Istanbul itself. The owner of my local [Savaya Coffee] is from Istanbul, and was able to introduce me to someone who was able to arrange for a full tour my first day! Meanwhile, on the other side of the pond, IBMers in New York were celebrating the 50th anniversary of the IBM mainframe, including a cameo appearance on the TV show "Mad Men".
In May, I was busy presenting at the IBM Edge conference in Las Vegas. IBM celebrated the sixth anniversary of IBM ProtecTIER data deduplication device, announced "Codename: Elastic Storage" and new features on the DS8870 disk system, and presented analyst findings that IBM Software Defined Storage was substantially less expensive than competitive offerings.
In July, I took a nice summer vacation, [a road trip across the state of Tennessee]. IBM made a strategic partnership with Apple to offer mobile apps for the data center enterprise for the iOS operating system on iPhones and iPad tablets.
In August, I completed a summer partnership with University of Toronto and IBM Softlayer to build "Concept IBM Watson", a scaled down version of IBM Watson based on my infamous 2011 blog post [How to replicate Watson hardware and systems design for your own use in your basement]. Rather than using three physical servers, however, we had virtual x86 machines running on IBM Softlayer cloud. The system was only asked the simplest "How many...?" questions against a single text document, but proved to the University that teaching analytics by replicating IBM's historic achievement was effective and possible.
In September, I celebrated my eight-year "Blogoversary". That's right, I have been blogging for the past eight years! With over 800 posts, and five published books, I continue to be ranked #1 most-read blog on IBM developerWorks. IBM was ranked #1 for Software Defined Storage!
In October, I presented at the Systems Technical University in Dublin, Ireland. This was my first time in Ireland, and I found Dublin to be quite a beautiful city, with friendly people and delicious food.
The rest of October, and much of November and December, I spent on the road, visiting clients to help close deals! (Sorry folks... Due to SEC black-out rules, I am prohibited from telling you how well I did) Since I am not allowed to talk about on-going discussions that I have with clients, my blog has been noticeably silent during these months. I apologize for any stress or anxiety this might have caused any of my readers!
Despite too-much-candy, too-much-turkey and too-many-cookies that the year-end often brings, I managed to lose twenty pounds on a low-carb, gluten-free, Paleo diet and exercise.
For a while now, IBM has been trying to explain to clients that focusing on just storage hardware acquisition costs is not enough. You need to consider the "Total Cost of Ownership" or TCO of a purchase decision. For active data, a 3-5 year TCO assessment can give you a better comparison of costs between IBM and competitive choices. For long-term archive retention, 7-10 year TCO assessment may be necessary.
Now, IBM has a cute [2-minute video] that brings an appropriate analogy to help IT and non-IT executives understand.
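A simplified sketch shows why the time horizon matters. It assumes TCO is just the acquisition cost plus a flat yearly operating cost; real assessments include many more factors. The two offers and all dollar figures are hypothetical.

```python
def tco(acquisition: float, annual_operating: float, years: int) -> float:
    """Simplified TCO: purchase price plus a flat operating cost per year."""
    return acquisition + annual_operating * years


# Hypothetical offers: A is cheaper to buy, B is cheaper to run.
OFFER_A = {"acquisition": 100_000, "annual_operating": 40_000}
OFFER_B = {"acquisition": 150_000, "annual_operating": 20_000}

tco_a_5yr = tco(years=5, **OFFER_A)
tco_b_5yr = tco(years=5, **OFFER_B)

# Year at which the two cumulative costs are equal.
breakeven_years = (OFFER_B["acquisition"] - OFFER_A["acquisition"]) / (
    OFFER_A["annual_operating"] - OFFER_B["annual_operating"]
)

print(f"5-year TCO: A={tco_a_5yr}, B={tco_b_5yr}")
print(f"break-even after {breakeven_years} years")
```

In this made-up comparison, the offer with the lower purchase price becomes the more expensive choice after two and a half years, well inside a 3-5 year assessment window.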
Today, I met with Teresa Ferraro and Mike Buttrum from FirstRain in their Manhattan office in downtown New York City. IBM recently contracted FirstRain to provide IBMers like myself with analytics on publicly-available news to keep us informed for business meetings. Here's how IBMers can get the most out of this service.
Basically, FirstRain takes a list and generates the best summaries of publicly-available news that are most relevant. You can organize into different channels. Here I have seven channels.
Companies to watch refer to existing or prospective clients that I plan to be talking with soon. Some of my colleagues are assigned to specific clients, so they can set this up once and enjoy the news for the rest of the year. I, on the other hand, meet with different clients every week, so I will be updating this list on a frequent basis.
I have divided the Competitors between major ones, and smaller startups. Since I am often working with business partners and distributors, I made that a separate channel as well.
For product lines, I picked three: Data migration, Data storage solutions, and Software defined storage.
For conferences where I don't know which companies will attend, such as the IBM Technical University, I can set up information by territory. Here is one for Brazil.
I also attend industry-oriented events, so I can pick those vertical markets that might be helpful with dinner conversations. In this example, I chose Energy, Electric Utilities and Gas Utilities.
Once you have your channels configured, you get your results in various sections:
Management Changes lists any changes in top C-level positions, who left the company, who got recently hired.
Key Developments indicates news like mergers and acquisitions and government regulations.
First Reads prioritizes the top six articles for your channel. You can access more, but these six will get you started as you have your morning coffee.
First Tweets gives you the six most relevant tweets, if those articles above were just "TL;DR"
A section on Business Influencers and Market Drivers is interesting to see who the big players are, and what topics are driving the most conversation. Here's an example from my Energy/Electric/Gas channel:
The Most Talked About section covers quotes and commentary about the most talked about companies in your channel.
With most news sources focused on politics, weather and celebrity gossip, it is nice to have a quicker, more focused approach to get the news I need to prepare for my client briefings. Special thanks to my hosts Teresa and Mike for their hospitality!
My colleague, Marissa Benekos, is on location with her video camera in Orlando, Florida for the ComputerWorld [Storage Networking World] conference.
The IT specialists from the IBM booth were excited at David Bricker's debut on YouTube. Here's the rest of the gang in this [video].
Here's Andy Monshaw, General Manager of IBM System Storage and keynote speaker at this SNW event, summarizing IBM's "Information Infrastructure" strategy in 60 seconds in this [Youtube video].
This last video is Clod Barrera talking about the importance of security. Clod is an IBM Distinguished Engineer and Chief Technical Strategist for IBM System Storage product line. Here is his [Youtube video].
It looks like Marissa is having a lot of fun taking these videos at the event. More videos, as we get them, will be posted to the [IBM videos channel].
IBM has been holding various "Hackathons" and "Meetups" as a new way to reach out to prospective clients. IBM sponsored a meetup at the Austin Executive Briefing Center (EBC) to discuss Machine Learning with TensorFlow on IBM Power systems, October 26, 2017.
This was a joint event, co-sponsored by [IBM Watson/Cognitive Austin] and [Big Data/AI Revealed] meetup groups. Special thanks to my colleague Cathy Cocco, IBM Executive IT Architect with the IBM Austin EBC, for coordinating this event with their organizers.
(What is a Meetup? [Meetup.com] is an online social networking website that facilitates in-person local group meetings. Meetup allows members to find and join groups unified by a common interest, such as books, games, pets, technology, careers or hobbies. In 2017, there are 32 million users with 280 thousand groups available across 182 countries.)
Here was the agenda for the event:
Registration, Pizza & Soft drinks
Tensorflow 101 presentation
Demo: Using TensorFlow for Financial Market Predictions on IBM POWER Systems
Lightning Talk: IBM Data Science Experience
Clarisse Taaffe-Hedglin: Intro to TensorFlow on IBM Power servers
Our guest speaker was my colleague Clarisse Taaffe-Hedglin, IBM Cognitive Senior Technical Architect, part of the same Worldwide Client Centers team that I work in. She flew in from Charlotte, NC.
Her topic was TensorFlow, an open source [Machine Learning] framework. TensorFlow was originally developed by Google, but was made open source in November 2015.
Machine Learning is popular in a variety of industries, from self-driving cars and trucks, speech recognition and video surveillance, to what movie to watch next on Netflix. There are three aspects to Machine Learning:
Data: Start with the data you want to analyze. This could be IoT sensor data, security logs, or social media feeds. Check out all that happens in an "Internet Minute"!
Compute: While mathematical computations can be performed on traditional CPUs, some frameworks are optimized and accelerated with Graphics Processing Units (GPUs). These GPUs can perform Teraflops of single and double precision calculations.
Technique: As methodologies have gotten more complicated over the years, frameworks have evolved to match.
The [TensorFlow] framework is now one of the most popular among data scientists. You can download it for free at [Github].
Clarisse showed the various programming/calculation tools used by data scientists. The top five were: Python, R, SQL language, MapReduce, and Microsoft Excel.
Mathematical models come in many flavors. Clarisse explained they can be used to identify clusters of data that might have similar properties, or to perform classification, or linear regression. The results can be "descriptive", gaining a better understanding of what already is, or "predictive" for what might be.
Some frameworks like Chainer or Torch are more flexible, using a dynamic Define-by-Run approach. However, these do not scale well. Theano and TensorFlow, on the other hand, employ a Define-and-Run approach, which scales better for larger projects. With the growth in popularity of TensorFlow, the Theano framework has been "functionally stabilized".
Clarisse Taaffe-Hedglin: Financial Markets Demo
For the demo, Clarisse had historical stock closing data for USA, Australia and Asian stock markets. The hypothesis: Can we determine a Buy/Sell signal for USA stocks based on the closing results of non-American stock markets? This is a classic "Binary Classification" model. The other stock markets close 4-16 hours before the U.S. markets open, so this has real-world applicability.
Since the data was in different monetary units, she did some cleanup to normalize the data, removing the trends, and converting everything to U.S. Dollars (USD).
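The post does not say exactly how the cleanup was done. One common way to take both the trend and the currency scale out of a price series is to work with day-over-day log returns, so here is a minimal sketch of that idea in plain Python. The technique choice, the function name `log_returns` and the sample prices are all assumptions for illustration, not the demo's actual code.

```python
import math

def log_returns(closes):
    """Convert closing prices into day-over-day log returns.

    A log return depends only on the ratio of consecutive closes,
    so the currency unit and the absolute price level drop out.
    """
    return [math.log(today / yesterday)
            for yesterday, today in zip(closes, closes[1:])]

# Hypothetical closing prices for two indexes quoted on very different scales.
us_index = [100.0, 102.0, 101.0, 103.0]
asia_index = [20000.0, 20400.0, 20200.0, 20600.0]

us_returns = log_returns(us_index)
asia_returns = log_returns(asia_index)

for day, (us, asia) in enumerate(zip(us_returns, asia_returns), start=1):
    print(f"day {day}: US {us:+.5f}  Asia {asia:+.5f}")
```

In this made-up example the two indexes move by the same percentage each day, so their log returns match even though the raw prices differ by a factor of 200.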
Clarisse used "Supervised Learning" on an 80 percent subset of the data, and then used the remaining 20 percent to validate how well the model did.
As with any model, you measure how good it is by how close its results come to the correct answer. Wrong answers are weighted by how bad they are. This is often referred to as "Loss" or "Cost". Different models can therefore be compared by minimizing the loss.
Using a simple y=wx+b mathematical model, she ran 30,000 iterations. After 5,000 iterations, the model was already guessing correctly 55 percent of the time; by the time we hit 30,000, this was up to 68 percent accuracy.
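To make the 80/20 split, the loss and the y=wx+b model concrete, here is a small sketch in plain Python rather than TensorFlow. It is not the demo code. The data is synthetic, the loss is assumed to be cross-entropy (the post only says "Loss"), and the learning rate and the 2,000 iterations are arbitrary choices that keep the run short.

```python
import math
import random

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, xs, ys):
    """Average cross-entropy: confident wrong answers cost the most."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)

def accuracy(w, b, xs, ys):
    """Fraction of samples where the predicted class matches the label."""
    hits = sum((sigmoid(w * x + b) >= 0.5) == (y == 1) for x, y in zip(xs, ys))
    return hits / len(xs)

def train(xs, ys, steps=2000, lr=0.5):
    """Fit w and b by gradient descent on the cross-entropy loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the loss w.r.t. the score
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Synthetic stand-in for "overseas close" (x) and "U.S. Buy/Sell" (y).
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(500)]
ys = [1 if x + random.gauss(0, 0.5) > 0 else 0 for x in xs]

split = int(0.8 * len(xs))  # 80 percent to train, 20 percent to validate
train_x, train_y = xs[:split], ys[:split]
valid_x, valid_y = xs[split:], ys[split:]

w, b = train(train_x, train_y)
print("training loss:", loss(w, b, train_x, train_y))
print("validation accuracy:", accuracy(w, b, valid_x, valid_y))
```

Holding the validation samples out of training is the point of the split: accuracy is measured only on data the model has never seen.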
TensorFlow also supports "hidden layers", basically intermediate variables that are then used in subsequent layers for more complicated calculations. This is loosely modeled on the way neurons in our brain are networked. With two added layers, she re-ran the 30,000 iterations, and now was up to 73 percent accuracy.
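As a rough picture of what "intermediate variables used in subsequent layers" means, the sketch below pushes one input through two hidden layers and a final output in plain Python. The weights are hand-picked placeholders, and the layer sizes and ReLU activation are assumptions. A real framework would learn the weights during training.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Keep positive values, replace negatives with zero."""
    return [max(0.0, v) for v in values]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(features, params):
    """Two hidden layers feeding a single Buy/Sell probability."""
    hidden1 = relu(dense(features, params["W1"], params["b1"]))
    hidden2 = relu(dense(hidden1, params["W2"], params["b2"]))
    (score,) = dense(hidden2, params["W3"], params["b3"])
    return sigmoid(score)

# Hypothetical, hand-picked weights purely for illustration.
params = {
    "W1": [[1.0, -1.0], [0.5, 0.5]], "b1": [0.0, 0.0],
    "W2": [[1.0, 1.0], [-1.0, 1.0]], "b2": [0.0, 0.0],
    "W3": [[1.0, -1.0]],             "b3": [0.0],
}

print(forward([2.0, 1.0], params))
```

Each hidden layer only ever sees the outputs of the layer before it, which is what lets the stacked model represent relationships that a single y=wx+b line cannot.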
Normally, this kind of analysis would take hours or days, but since TensorFlow takes advantage of the IBM POWER8 CPU and NVIDIA Tesla K80 GPU in the IBM Power server, the whole thing finished in five minutes!
Tuhin Mahmed: Lightning Talk on IBM Data Science Experience (DSX)
Tuhin Mahmed, IBM Software Developer, is the organizer for the Big Data/AI meetup group. He wants to promote the idea of "Lightning Talks" where each person presents for just 10-15 minutes. This is a variant of the popular [Pecha Kucha] events.
To get things started, he presented 10-15 minutes on [IBM Data Science Experience], or DSX for short. Taking Multiple Listing Service (MLS) real estate data of closing prices on houses sold in a range of zip codes from the Austin area, he mapped these on x-y axes. The x axis was square feet, and the y axis was closing price.
Using DSX, he was able to develop a mathematical model that estimates house closing prices based on their zip code and square footage.
This was a simple example, but it showed the power of Jupyter Notebooks, and how anyone can get a 30-day free trial of DSX for their own experimentation.
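For readers who want to try the idea outside of DSX, here is a minimal sketch that fits one straight line per zip code with ordinary least squares, using only the Python standard library. The sales figures are invented, and fitting a separate line per zip code is an assumption, since the talk did not spell out how the zip code entered the model.

```python
from collections import defaultdict

def fit_line(points):
    """Ordinary least squares for price = slope * sqft + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up sales: (zip code, square feet, closing price in USD).
sales = [
    ("78701", 1000, 400000), ("78701", 1500, 550000), ("78701", 2000, 700000),
    ("78745", 1000, 250000), ("78745", 1500, 350000), ("78745", 2000, 450000),
]

by_zip = defaultdict(list)
for zip_code, sqft, price in sales:
    by_zip[zip_code].append((sqft, price))

models = {zip_code: fit_line(points) for zip_code, points in by_zip.items()}

def estimate(zip_code, sqft):
    """Estimate a closing price from the fitted line for that zip code."""
    slope, intercept = models[zip_code]
    return slope * sqft + intercept

print(estimate("78701", 1800))
print(estimate("78745", 1800))
```

The same few lines would run unchanged inside a Jupyter Notebook cell.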
Currently, being a data scientist is more of an art than a science. This is one of those fields that takes only a few months to learn, but years to master.
Rather than building a model from scratch, data scientists can take existing models, and modify them to fit their needs. There are a variety of existing models available in what is called the "Model Zoo". Google has over 2,000 projects already.
Those interested in trying out TensorFlow for themselves were directed to [Nimbix], a Cloud Service Provider that offers POWER servers with NVIDIA GPUs.
There were about 50 attendees, more than half of whom identified themselves as data scientists. As the first sponsored event for the IBM Austin EBC, I think this was a success!
If you are in the Austin area, the next meetup will be at the [Capital Factory] on Brazos Street on November 30, 2017.
Well, it's Tuesday again, and you know what that means? IBM Announcements!
(This week I am in Pennsylvania and New York speaking to clients. The weather this week has not been cooperative!)
Spectrum Protect Plus 10.1.2
Just in time for the upcoming VMworld conference, IBM announces the following features added to Spectrum Protect Plus, a snapshot-based backup software for VMware, Hyper-V and databases.
Data-at-Rest Encryption for local backups stored in the vSnap repository
IBM Db2 support with point-in-time recovery
VMware vSphere 6.7 support
Alerting for backup and restore jobs and storage threshold limits
Drill-down capabilities for dashboard widgets
Spectrum Protect 8.1.6
IBM also continues to enhance its traditional file-based backup product. Here are some of the features:
Tier data by backup state for container pools. When you have multiple backup versions, the most recent version is called the "active" version, and the older versions are called "inactive" versions. You rarely recover inactive versions, so this feature allows them to be migrated off to object or cloud storage.
Ransomware detection for Virtual Environment workloads. This is an enhancement of the "Ransomware detection" introduced earlier this year, but for VMware and Hyper-V images.
IBM DS8882F All-Flash Array
When IBM announced the DS8880, it shocked folks by changing from the previous 33-inch-wide frames to a standard 19-inch width. The IBM Z team followed up with 19-inch wide models of its mainframe servers.
Now, IBM can bring these together. There are two flavors of the new DS8882F:
The "Rackless" model is 17U in height with the optional keyboard/monitor, and can be put into existing 19-inch racks. These can be used with VMware, Linux, Windows, AIX and z/OS.
The "Flex Frame" model is 16U, allowing it to fit nicely inside a single-rack IBM z14 ZR1 model, or LinuxONE Rockhopper II model. It is 16U instead of 17U because it shares the existing 1U-high keyboard/monitor unit.
Like the DS8888F, DS8886F, and DS8884F models, the new DS8882F uses the High Performance Flash Enclosure (HPFE) gen2 drawers, supporting either high-performance/high-endurance drives (400GB to 3.2TB each), or high-capacity/standard-endurance drives (3.8TB to 15.3 TB each).
The R8.5 release of firmware that accompanies this announcement also supports data-in-flight encryption for Transparent Cloud Tiering. It also supports a new feature called "Safeguarded Copies", up to 500 copies to protect against hackers and ransomware.
IBM Spectrum Access blueprints have been extended to support IBM Z and LinuxOne. These blueprints show how to run IBM Cloud Private with Spectrum Connect with IBM block storage, including IBM DS8880/F, SVC, Storwize and FlashSystem models.
IBM Storage Solutions for Virtual Desktop Infrastructures (VDI)
IBM offers a new blueprint to configure Virtual Desktops with its newly announced IBM FlashSystem 9100 device. The low latency/high IOPS capability of the FlashSystem 9100 is perfect for the type of "boot storms" that are often encountered with VDI deployments.
IBM Spectrum Scale 5.0.2 and Elastic Storage Server
At a recent IBM Technical University, I joked that the IBM Elastic Storage Server is only "part of a complete breakfast" because it only supported the NSD POSIX interface. To make it useful in most situations, you needed to buy additional servers outside of the ESS to run Spectrum Scale protocol nodes to provide industry-standard file and object protocols.
Today, IBM announced that you can order a new "IBM Elastic Storage Server Data Server" (5148-22L) which is a POWER server with the Spectrum Scale software pre-installed for protocol node support. It has [similar specifications] to the IBM Elastic Storage Server Management Server (5148-21L).
If you prefer to run Spectrum Scale in the cloud, you can "Bring your own license" (BYOL) to Amazon Web Services.
I have been involved with Business Continuity and Disaster Recovery my entire career at IBM System Storage. However, with new workloads like Hadoop analytics and new Hybrid Cloud deployments, I thought it would be good to provide a refresh.
The need for Business Continuity and Disaster Recovery has increased recently due to (a) climate change caused by human activity, (b) ransomware and other cyber attacks, and (c) disgruntled employees.
Back in 1983, a task force of IBM clients at a GUIDE conference developed "Seven Business Continuity Tiers for Disaster Recovery", which I refer to as "BC Tiers". I divided the presentation into three sections:
Backup and Restore: BC tiers 1 through 3 are based on backup and restore methodologies. I explained how to back up Hadoop analytics data, all of the various options for IBM Spectrum Protect software, and how to encrypt the tape data that gets sent off premises.
Rapid Data Recovery: BC tiers 4 and 5 reduce the Recovery Point Objective (RPO) and Recovery Time Objective (RTO) with snapshots, database journal shadowing, and IBM Cloud Object Storage.
Continuous Operations: BC tiers 6 and 7 provide data replication mirroring across locations. I covered 2-site, 3-site and 4-site configurations.
IBM Spectrum Virtualize - How it works - Deep dive
Barry Whyte, IBM Master Inventor and ATS for Spectrum Virtualize, covered a variety of internal topics "under the hood" of Spectrum Virtualize. This covers the SAN Volume Controller (SVC), FlashSystem V9000, Storwize V7000 and V5000 products, as well as Spectrum Virtualize sold as software.
In version 7.7, IBM raised the limits. You can now have 10,000 virtual disks per cluster, rather than 2,048 per node-pair. Also, you can now have up to 512 compressed volumes per node-pair. With the new 5U-high 92-drive expansion drawers, Storwize V7000 can now support up to 3,040 drives, and Storwize V5030 can support up to 1,520 drives.
While each Spectrum Virtualize node has redundant components, the architecture is designed to handle entire node failure. The term "I/O Group" was created to refer to the node-pair of Spectrum Virtualize engines and the set of virtual disks it manages. This made sense when virtual disks were dedicated to a single node-pair. Now, virtual disks can be assigned to multiple node-pairs, dynamically adding or removing node-pairs as needed for each virtual disk.
However, even if you have a virtual disk assigned to multiple node-pairs, only one node-pair would manage its cache, causing all other node-pairs to coordinate I/O through the cache-owning node-pair. The other node-pairs are called "access I/O groups".
The architecture allows for linear scalability: double the number of nodes, and you double your performance. Some competitors use n-way caching across four or more nodes, and it is a semi-religious argument on the pros and cons of each approach. Barry feels the 2-way caching implemented by Spectrum Virtualize is the most effective and efficient for performance.
All of the nodes are connected over an IP network, but there is one designated as a "config node", and one, often the same, as a "boss node".
A cluster can have up to three physical quorum disks (either drive or mDisk) and optionally up to five IP-based quorums. The IP-based is just a Java program that runs on any server or Cloud, provided it can respond within 80 msec.
Either IP-based or physical quorum can be used for "tie-breaking" a split-brain situation. In the event there is no "active" quorum, the administrator can now serve as the tie-breaker manually. Barry recommends for Storwize clusters, where physical quorum disks are attached to a single node-pair, that you have at least one IP-based quorum for tie-breaking.
However, only physical quorum can be used for T3 Recovery. T3 Recovery happens after power outages. All of the nodes update the quorum disk with critical information of all of the virtual mappings of blocks to volumes, and this is used when bringing up the nodes again.
To protect against one pool consuming all of the cache, Spectrum Virtualize will partition the cache, and prevent any one pool from consuming more than a certain percentage of the total cache. The percentage depends on the number of pools:
(Table: number of pools, up to "5 or more", versus the maximum percentage of cache that any individual pool may use.)
Barry explained how failover works in the event of node failure. There is voting involved, and the majority remains in the cluster. In the case of an even split, called a "split brain" situation, the quorum decides. Orphaned nodes in a node-pair go into write-through mode, since the cache is no longer mirrored.
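The voting rule can be pictured with a toy model: a side with a strict majority of nodes keeps running, and when no side has a majority, whichever side holds the quorum wins. The sketch below is a deliberate simplification in plain Python, not the actual Spectrum Virtualize algorithm, and the function and parameter names are made up for illustration.

```python
def surviving_partition(partitions, quorum_holder=None):
    """Decide which side of a split cluster keeps running.

    partitions:    dict mapping a partition name to the set of node ids in it
    quorum_holder: name of the partition that reached the quorum first,
                   or None if no active quorum is available
    """
    total_nodes = sum(len(nodes) for nodes in partitions.values())
    for name, nodes in partitions.items():
        if len(nodes) * 2 > total_nodes:
            return name  # strict majority stays in the cluster
    # No majority, for example an even "split brain": the quorum decides.
    # A result of None means an administrator must break the tie manually.
    return quorum_holder

# Three nodes can still see each other, one is isolated.
print(surviving_partition({"site_a": {1, 2, 3}, "site_b": {4}}))

# Even split: the side holding the quorum continues.
print(surviving_partition({"site_a": {1, 2}, "site_b": {3, 4}},
                          quorum_holder="site_b"))
```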
The I/O forwarding layer has been split between upper and lower roles. The upper layer handles access I/O groups. The lower layer handles asymmetric access to drives, mDisks and arrays.
N-port ID Virtualization (NPIV) drastically improves multi-pathing. Perhaps one of the coolest improvements in a while, NPIV allows us to assign "Virtual" WWPNs to other ports. When an I/O sent to a single port fails, the host retries one or more times, then waits 30 seconds, and then invokes multi-pathing to find a completely different path to the data. With NPIV, when a port fails, its WWPN is re-assigned to a different port, so the retries are likely to succeed before having to wait 30 seconds!
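The reason the retries succeed is that the host keeps addressing the same WWPN while the WWPN itself moves. A toy model in plain Python makes this visible. The port names, WWPN labels and the `fail_over` helper are hypothetical, and real NPIV failover involves the fabric and the partner node in ways this sketch ignores.

```python
def fail_over(wwpn_location, failed_port, surviving_port):
    """Return a new mapping with the failed port's virtual WWPNs moved to surviving_port."""
    return {wwpn: (surviving_port if port == failed_port else port)
            for wwpn, port in wwpn_location.items()}

# Hypothetical layout: each virtual WWPN is currently hosted on one physical port.
wwpn_location = {
    "virtual-wwpn-A": "node1:port1",
    "virtual-wwpn-B": "node2:port1",
}

after = fail_over(wwpn_location, "node1:port1", "node2:port1")

# The host still sends I/O to "virtual-wwpn-A"; only the physical port changed.
print(after["virtual-wwpn-A"])
```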
Lastly, Barry covered the delicate art of Software upgrades. Software is rolled forward one node at a time, and the "cluster state" is maintained during this time.
Different presentations this week are at different technical levels. My session was meant to be an overview of the concepts of Business Continuity, independent of specific operating system platform, using specific IBM products to help illustrate specific examples. Barry's was a deep dive into a single product family.
Mark your calendars! IBM plans to have back-to-back Technical University events in Hollywood, Florida:
October 8-12 will focus on the IBM Z mainframe, and a subset of IBM Storage that offers synergy for IBM Z, such as the DS8880 storage system and the TS7760 Virtual Tape Engine.
October 15-19 will focus on IBM Power Systems and the entire IBM Storage portfolio.
When I first learned of this, I was not aware there was a city called Hollywood in Florida. The Hollywood in Florida is situated between Fort Lauderdale and Miami, so you can fly into either of those two airports to get to the conference.
(Did you know? The Hollywood most people know in California is no longer its own city, but rather incorporated as a neighborhood district into Los Angeles back in 1910. There are actually thirty different places called "Hollywood" around the world, two dozen in the United States, with the rest scattered in Ireland, Turkey, Russia, Singapore and the Philippines. Not all of these are formally "cities", but in some cases neighborhoods, districts, unincorporated areas, or other populated places. The Hollywood in Maryland claims to be the first, established in 1867!)
I plan to attend the second week only, October 15-19. Here are some highlights:
In the past, IBM had keynote sessions for each brand, for example, one focused on IBM Power systems, and another on IBM Storage. However, these were scheduled during the same time slot, forcing some people to make a tough choice.
To solve this, the two keynote sessions will be staggered, so attendees can attend both!
The storage keynote will take on a new format, with a panel of experts. I have been invited as one of the experts to participate! If there is a particular topic you want to hear about on the panel, please enter your comments below.
As with most conferences, there is a "Call for Papers" requesting speakers submit the topics they can present, and then conference coordinators accept, adjust or reject them in building the final agenda.
Here are the topics I submitted:
Build your personal brand! Social Media tips from an experienced blogger
The Pendulum Swings Back - Understanding Converged and Hyperconverged Systems
IBM Hybrid and Multi-Cloud storage solutions
IBM Cloud Object Storage (powered by Cleversafe)
Managing Risks with Data Footprint Reduction
Information Lifecycle Management: Why Archive is different than Backup
The Seven Tiers of Business Continuity and Disaster Recovery
If you attended the IBM Technical University in Orlando last May, the conference in October will have six months' worth of new announcements and products to cover.
I also plan to be at the IBM Technical University events in Johannesburg, South Africa (September 11-13), and Rome, Italy (October 22-26). If you plan to be at any of these events, let me know! If not, you can follow along with Twitter hashtag: #IBMtechU
Continuing my coverage of the IBM Systems Technical University in Orlando, here are the sessions that I presented or attended on Day 3 (Wednesday).
What is Big Data? Architectures and Use Cases
Tony Pearson explained what Big Data analytics are, and IBM's various products to support this, including BigInsights, BigSQL and Spectrum Scale with the Hadoop Connector.
Why use IBM Spectrum Virtualize for High Availability
John Wilkinson, IBM Storage Software Engineer from the UK Hursley lab, presented the latest enhancements to Spectrum Virtualize-based products, such as SVC and Storwize V7000, related to Stretch Cluster and HyperSwap functions for High Availability.
IBM Systems Hybrid Cloud Strategy, POV and Showcase
Dave Willoughby, IBM z System Hardware Architect for Systems Cloud Emerging Technologies, provided a high-level "Point-of-View" for Hybrid Cloud, and why IBM is focused on helping clients transition from traditional IT infrastructures.
Data Footprint Reduction - Understanding IBM Storage Efficiency Options
Tony Pearson presented an overview of Thin Provisioning, Space-efficient snapshots, Data deduplication and Real-time Compression features.
IBM Spectrum Virtualize - Understanding SVC, Storwize and FlashSystem V9000
Tony Pearson provided an overview of SAN Volume Controller, the Storwize family of products and FlashSystem V9000, all of which are based on Spectrum Virtualize software.
The day ended with a trip to Universal Studios. Dinner on the City Walk offered entertainment with Dueling Pianos. This was then followed by a trip to Hogsmeade, the Harry Potter themed portion of the resort.
"Do you know what I do?" Mr. Mondavi recalls Mr. Gallo asked him when they first met. "Yes, you run the largest winery in the country," recalls Mr. Mondavi, then in his mid-20s. "No," Ernest corrected him. "I go out and visit customers in stores."
Robert Smith (aka Radio Voom) reports on National Public Radio that Second Life is now being used for campaigning for political candidates. It used to be that political candidates took trains and buses across the country, meeting people, discussing their issues, and getting a feel for what is going on in the hearts and minds of their potential voters. With the development of TV and Radio, candidates traveled less, hoping to get their word out to people who would listen to them. Using Second Life and other social networking tools brings candidates back to having conversations with the people they hope to represent.
Of course, many of these candidates are old, and are learning internet social networking skills for the first time. John McCain, my senator from Arizona, is running for President at 70 years old! It's true that old dogs CAN learn new tricks.
IBM is investing heavily into Second Life, as are many other forward-thinking companies, to explore the age-old human need for connectedness, community and dialog. I've asked my team to all get their avatars up and running in Second Life. Granted there is a bit of a learning curve, but everybody handles change in different ways, some better than others.
"Knowledge is the antidote to fear." -Ralph Waldo Emerson
Why are most of these guys (and girls) with over a billion US dollars in net worth still working? Perhaps because they embrace new ideas, and are on the thrill seeking side of humanity. I guess I am too. I'll be thrill-seeking in Chicago this weekend, celebrating St. Patrick's day.
Happy [Thanksgiving], everyone! Yes, I mean everyone. Even if you are not an American from the United States eating turkey and stuffing this week, it wouldn't hurt to take stock of what you are thankful for.
And before I forget, I want to thank all of you, my readers, for making this blog the #1 most read blog at IBM developerWorks, and one of the top blogs in the IT storage industry!
A few years ago, I was stuck in Venice, Italy for the holiday. Not by choice, but because I was the victim of a car accident on my business trip. My neck in a brace, I was unable to fly home in time to celebrate Thanksgiving.
The local IBMers directed me to a wonderful restaurant where I would dine alone, on Thanksgiving, and insisted that I ask the waiter to have some butter for my bread. The joke was on me! A collection of waiters came out, banging on pot lids, with a huge six-pound "cube" of butter on a tray, with a fork and knife stuck into the top. They do like to make fun of the tourists in Venice, don't they?
In other years past, I have found myself spending the holiday working at client locations, baby-sitting their datacenter. Why? Since Thanksgiving always lands on a Thursday in the USA, the day after is known as "Black Friday", the official kick-off of consumerism craziness.
The next day is "Small Business Saturday", to give small local businesses a chance to compete for some revenue. Two days after that is "Cyber Monday", where many people shop online from their office, rather than fight all the crowds in the traditional brick-and-mortar stores. My job was to make sure the systems ran smoothly, from Thursday to Monday, for our largest clients in the retail industry.
(Note: This is not the first time I have mentioned [Cyber Monday] on my blog. For the past few years, I have reminded people that the perfect holiday gift is one or more of my books from the Inside System Storage series. Volumes I through V are available in hardcover, paperback and eBook formats from my [Spotlight page on Lulu.com].)
This year, I am thankful that I will be in Tucson with my friends and family. The weather here is expected to be a beautiful 72 degrees Fahrenheit.
Last year, I hosted my friends and family in my home for the big meal. It went so well that I invited everyone back again this year. In August, I started the contractual process to remodel my kitchen, and the company I hired assured me repeatedly it would be ready by Thanksgiving.
Unfortunately, due to a combination of sloppy project management by the company that I hired to do the work and a few unforeseen circumstances that caused some delays, I have no kitchen.
I have an empty room where my kitchen used to be, the floor partially tiled, the walls clean and freshly panelled and painted. My new cabinets and sink are stacked up inside cardboard boxes in my garage.
So, instead, I am taking everyone out. I am thankful there are restaurants open tomorrow, and I was able to make a last-minute reservation for the six of us. Construction will resume on Friday.
I got an interesting email from a new blogger asking me for advice on how frequently to post entries. I am probably not the right person to ask, as I blog whenever a thought comes to mind that I think others would enjoy reading, and sometimes that means several times a day, and other times only a few per month. I actually have a day job, busy doing other things, and blogging is just now part of my general set of activities. My focus is quality not quantity.
With that in mind, I was delightfully surprised that this blog was ranked among the Top 10 Storage Blogs by Network World, which explains my recent spike in traffic.
I shared the news with my 72-year-old father, and he exclaimed "There are actually 10 or more blogs to cover the IT storage industry?" He couldn't understand why the world would read more than two or three. I personally track thirty-five of them, and I suspect there are hundreds of others out there. Of these, some blog quite regularly, while others do not, so I am in good company. Deni Connor, the author who selected these top 10, gave a nice general compliment to the entire list of blogs:
The blogs written by storage company executives can be surprisingly vendor-agnostic, though the analysts and consultants still tend to pull fewer punches.
And this was my goal as well, to enlighten and entertain, in a fair and balanced manner, that adds value to the blogosphere, rather than just repeat the IBM press releases of each day. If you are just looking for "announcements" there is an RSS feed for IBM System Storage you can subscribe to.
Not surprisingly, two of the blog entries that Deni mentions are the ones I get the most comments on:
ILM for my iPod tried to explain Information Lifecycle Management (ILM) in layman's terms that everyone could understand. As an engineer-turned-marketeer, explaining technology and concepts in layman's terms is something I find myself doing a lot to help others grasp what is otherwise a rather complex industry. Not surprisingly, many IBMers were not aware they were eligible for discounts on Apple products like the iPod, and thanked me for pointing them to this.
Aperi is "Viagra" for SMI-S which has now become my infamous blog entry within the halls of IBM. I chose this term over "steroids" given the various scandals involving famous athletes that were going on at the time. To this day, if you search Google for "Tony Pearson" AND "Viagra" you get this blog entry at the top of the list. One co-worker overheard that I had "used Viagra" only to later find out they were referring to the fact that I "used Viagra as a metaphor in the title of a blog entry". And that was the real issue, not that I used the term in a popular vernacular that might not translate well into other languages, or that I failed to attribute this as a trademark that belonged to its respective manufacturer, but that it was in the title itself, and thus the URL became "aperi_is_viagra_for_smi" when published in newspapers and press releases. I have since learned to be more careful when phrasing the titles of my blog entries.
I began my year-end vacation today, but like exercising at the gym, I will try to keep up with my blogging over these next two weeks. Especially for those readers out there doing end-of-year storage infrastructure changes. This blog is for you.
My friends over at Appcessories sent me an awesome infographic on the Internet of Things. If you happen to receive any gifts this holiday related to any of these categories, mention them in the comments below!
The State of Internet of Things in 6 Visuals – By the team at Appcessories
The blogosphere has quieted down a bit over the two papers on MTBF estimates for Disk Drive Modules (DDM). One article on SearchStorage.com by Arun Taneja asks Is RAID passé? Disk capacity is growing at a faster rate than DDM reliability. During the hours to rebuild a DDM, companies are at risk of additional failures that could require recovery from a copy, or result in data loss, depending on how well your Business Continuity (BC) plan is written and followed.
... The problem with that is that it's the DISK ARRAY that determines when a drive has failed and starts the rebuild process. That IS under the control of IBM, specifically the controller. But more importantly, it affects my risk of data loss.
As I see it, my risk of data loss with RAID-5 is influenced by two main factors. 1 - The drive replacement rate and 2 - The rebuild time (which to a great extent is a function of the drive size) both of which IBM has some control over.
So, I think that the question in my mind is, what's the tipping point? Where does the risk of using RAID-5 protection exceed what I'm willing to accept, and I need to move to some other protection mechanism like RAID-6? Is it when the rebuild times exceed 12 hours? 24 hours? 48 hours?
Also, I wonder why IBM isn't publishing some information to help me make these kinds of decisions?
Oh, dear - while Tony doesn’t seem to be parrying vigorously (as Seagate, Hitachi, and Chunk were doing), his contribution sounds more like IBM marketing than the kind of detailed, technical response one might have hoped for
(... well, he *is* a manager, and a marketing one at that, so perhaps we shouldn’t expect more).
Both are fair comments. Disk arrays do run microcode to assist or perform the RAID function, detect failures and start the rebuild process, and so clever designs to support spare disks, process the rebuild quickly, and so on, can differentiate one vendor's offering from another.
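The "tipping point" question from the first comment can at least be roughed out with arithmetic. The sketch below assumes drive failures are independent with a constant rate, which is exactly the kind of assumption the MTBF papers call into question, so treat the numbers as illustrative only. The 8-drive array and the 3 percent annual replacement rate are hypothetical values, not IBM figures.

```python
import math

HOURS_PER_YEAR = 8760

def second_failure_probability(drives_in_array, annual_replacement_rate, rebuild_hours):
    """Chance that another drive in a RAID-5 array fails during the rebuild.

    Assumes independent failures at a constant rate (an exponential model).
    """
    surviving = drives_in_array - 1
    rate_per_hour = annual_replacement_rate / HOURS_PER_YEAR
    return 1.0 - math.exp(-surviving * rate_per_hour * rebuild_hours)

# Compare the rebuild windows mentioned in the comment for a hypothetical array.
for hours in (12, 24, 48):
    p = second_failure_probability(drives_in_array=8,
                                   annual_replacement_rate=0.03,
                                   rebuild_hours=hours)
    print(f"{hours:>2} hour rebuild: {p:.4%} chance of a second failure")
```

Under this model the risk grows almost linearly with rebuild time, so doubling the rebuild window roughly doubles the exposure per rebuild.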
On the issue of what IBM provides to help its clients make the right decisions for their environments, Jon William Toigo at DrunkenData points his readers to IBM's Business Continuity Self-Assessment tool. In normal data center conditions, DDMs will fail, and a Business Continuity plan should be written and developed to handle this fact. Using 2-site and 3-site mirroring, complemented with versions of tape backups, can help address some of these concerns and mitigate some of the risks involved with using disk systems.
For those who want a more technical answer, IBM has just published a series of IBM Redbooks.
Each client's situation is different, so no simple answer is possible. However, IBM does have a lot of experience in this area, and would be glad to help you write or update your existing Business Continuity plan.
In the 2004 comedy ["A Day Without a Mexican"], the director envisions how disruptive life would be in California if all the Mexicans suddenly disappeared. The point is that sometimes you take things in the background for granted.
I was reminded of this when I saw Mark Underwood's blog post [Mainframe: Still Not Crazy After All These Years]. The article reminds us how critical IBM z Systems mainframes (and related storage like the IBM DS8880 disk systems) are in our lives. Here's an excerpt:
"Warren Buffett's Berkshire Hathaway started buying up IBM stock in 2011 and bought still more of IBM later. Despite its disappointing short-term valuation, Berkshire Hathaway is standing by its IBM investment, which is one of Berkshire's top four plays. ... To make this case, some statistics may be needed:
The z13 can withstand an 8.0 earthquake.
z Systems enjoy the highest standardized security certification (FIPS 140-2, highest level 4 of 4).
23 of the world's top 25 retailers use a mainframe.
92 of the top 100 banks are mainframe users.
All 10 of the top 10 insurers have commitments in mainframe technologies.
Around 80 percent of all corporate data is managed by mainframes.
The z13 can process 2.5 billion transactions daily (that's 100 [Cyber Mondays], as IBM's Mark Anzani, VP of z Systems Strategy, Resilience and Ecosystems, observed)."
... In fact, and notwithstanding perceptions to the contrary, the mainframe's center-stage position in large corporations around the world has not budged. That's the conclusion of an industry survey sponsored by Syncsort Inc. and conducted in 2015 by Enterprise Systems Media, a publisher of magazines for IT managers and technical professionals. Seven out of 10 respondents (IT planners, architects and managers at global enterprises with $1 billion or more in annual revenues) ranked the use of the mainframe for large-scale transaction processing as very important."
What would a comparable film depicting "A Day without a Mainframe" be like? I would imagine it somewhere between a disaster movie like  and an end-of-the-world zombie horror movie like [28 Days Later]. I would gladly take a million dollars to write the screenplay!
(FCC Disclosure: I work for IBM and am a filmmaker as well. Earlier in my career, I was chief architect of IBM's Data Facility Storage Management Subsystem (DFSMS) which manages around 80 percent of the world's corporate data. This blog post can be considered a "paid celebrity endorsement" for IBM's z13 System mainframes and DS8880 Disk Systems. I have personal experience with both and highly recommend them. I am neither a Mexican nor resident of California, but work regularly with both in my job responsibilities. Like Warren Buffett, I also own stock in both IBM and Berkshire Hathaway companies. I had no involvement in the making of any of the major motion pictures mentioned in this blog post, have no financial interest in their distribution, and have not been provided any compensation for mentioning them in this blog post. They are all great movies worth watching!)
What do you think the movie would be like? Enter your comments below!
Well, it's Tuesday again, and you know what that means? IBM Announcements!
IBM announced a new product, IBM Spectrum Protect Plus. To understand why, I will need to discuss a bit of history related to Data Protection.
(FCC Disclosure: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for IBM Spectrum Protect, IBM Spectrum Protect Snapshot, IBM Spectrum Protect for Virtual Environments, and IBM Spectrum Copy Data Management products. I was not paid in any manner to promote Geoffrey Moore's book mentioned below.)
IBM Spectrum Protect was originally developed as the Workstation Data Save Facility (WDSF) back in the 1980s, back when Personal Computers were just getting deployed.
I started in 1986 developing mainframe software, so we all had bulky 3270 terminals. When our area was offered 120 PCs to replace them, I was tasked with determining how to roll these out, 24 at a time, over five months.
My job was to determine who would get a PC in the first round, the second round, and so on. I handed out a simple one-page survey, asking everyone basic questions. Are you familiar with Personal Computers? Do you have one at home? Are you comfortable using a mouse? My plan was to give PCs to those most familiar with them in the early rounds, and to those less familiar in later rounds.
However, it was my final question that sealed the deal:
How soon do you want a PC to replace your 3270 terminal?
[ ]Immediately [ ]Next month [ ]No Hurry [ ]Put me last [ ]Never!
Surprisingly, I had roughly 24 folks choosing each option on this last question, which made my decision process easy for me!
(In his book Crossing the Chasm, fellow author Geoffrey Moore would come up with similar groups: Innovators, Early Adopters, Early Majority, Late Majority, and Laggards. This is a great book and I highly recommend it!)
Of course, we used WDSF to back up the files. WDSF would later morph into DFDSM, then ADSM, then TSM, and now it is called IBM Spectrum Protect.
Over the decades, the product has evolved from just backing up data on personal computers. IBM Spectrum Protect can now protect all kinds of machines, from tablets, mobile devices, and smartphones, to virtual machines, databases, and application servers in the data center.
Besides creating backup versions of files, IBM Spectrum Protect can also migrate older, less frequently used files to less expensive media, as well as archive files for long-term retention.
Different files can be assigned to different "management classes" that determine policies to be applied and enforced on the backup, migration and archive copies. For backups, this includes how many versions to keep while the file exists, how many versions to keep after the original file is deleted, and how long to keep those inactive versions.
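To make the three backup settings concrete, here is a minimal sketch in Python. The class and field names (ManagementClass, versions_if_exists, and so on) are hypothetical names chosen for illustration, not the product's actual parameters. As a simplification, the age of an inactive version is measured from its backup date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ManagementClass:
    """Hypothetical policy record; the field names are illustrative only."""
    versions_if_exists: int    # versions kept while the original file exists
    versions_if_deleted: int   # versions kept after the original is deleted
    retain_inactive_days: int  # how long inactive versions are kept


def versions_to_keep(backup_dates, file_exists, today, policy):
    """Return the backup dates retained under the policy, newest first."""
    newest_first = sorted(backup_dates, reverse=True)
    if file_exists:
        limit = policy.versions_if_exists
    else:
        limit = policy.versions_if_deleted
    kept = []
    for index, backed_up in enumerate(newest_first[:limit]):
        # The newest version of a file that still exists is the active one
        # and is always kept; every other version is inactive and ages out.
        is_active = file_exists and index == 0
        age_in_days = (today - backed_up).days
        if is_active or age_in_days <= policy.retain_inactive_days:
            kept.append(backed_up)
    return kept
```

Calling versions_to_keep with the same list of backup dates, once with file_exists=True and once with file_exists=False, shows how deleting the original file shrinks the set of retained versions.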
Instead of a grandfather-father-son [backup tape rotation], full-plus-incremental, or full-plus-differential scheme employed by other backup software, IBM Spectrum Protect has a unique "Incremental-Forever" approach that reduces backup time, LAN bandwidth requirements, and backup storage media.
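The general idea behind an incremental-forever scheme can be sketched in a few lines of Python. This is only an illustration of the concept, not how IBM Spectrum Protect is implemented. The function name, the catalog dictionary, and the use of a per-file "fingerprint" (for example, a checksum or a size-plus-timestamp pair) are all assumptions made for the example.

```python
def incremental_forever(catalog, current):
    """Decide what one backup run has to do.

    catalog: dict mapping path -> fingerprint of the last version backed up
    current: dict mapping path -> fingerprint of the file as it is now
    Returns (to_send, expired) and updates the catalog in place.
    """
    # Only new or changed files travel over the LAN; unchanged files are skipped.
    to_send = sorted(path for path, fingerprint in current.items()
                     if catalog.get(path) != fingerprint)
    # Files that have disappeared from the client are flagged, so that a
    # retention policy can later expire their remaining versions.
    expired = sorted(path for path in catalog if path not in current)
    for path in to_send:
        catalog[path] = current[path]
    for path in expired:
        del catalog[path]
    return to_send, expired
```

The first run against an empty catalog sends everything, which is the only "full" backup ever taken. Every later run sends just the differences, which is where the savings in backup time, bandwidth and media come from.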
While most companies still back up to tape, IBM Spectrum Protect can back up to flash, disk, tape, virtual and physical tape libraries, object storage, and even to public Cloud Service Providers such as IBM Bluemix, Amazon S3, and Microsoft Azure.
IBM Spectrum Protect offers both client-side and server-side data footprint reduction technologies, including compression and deduplication, eliminating the need for expensive, single-purpose data deduplication devices like Dell-EMC Data Domain.
IBM Spectrum Protect is recognized as a leader in Data Protection software, able to scale up to meet the demands of the largest enterprises. However, the parameters and options that IBM Spectrum Protect has acquired over time have been compared to the cockpit or flight deck of an airplane!
For clients with Virtual Machines, IBM offered three solutions:
IBM Spectrum Protect Snapshot
Formerly called Tivoli Storage FlashCopy Manager (FCM), [IBM Spectrum Protect Snapshot] takes frequent, near-instant, non-disruptive, application-aware backups and restores for SAP, Oracle and Db2. It can also be used for VMware using advanced snapshot technology, on both IBM and non-IBM storage systems.
IBM Spectrum Protect Snapshot can be used as a stand-alone product, or integrated with IBM Spectrum Protect to move the snapshots and FlashCopy targets to other storage media.
IBM Spectrum Protect for Virtual Environments (VE)
Formerly called IBM Tivoli Storage Manager for Virtual Environments, [IBM Spectrum Protect VE] protects both VMware and Microsoft Hyper-V virtual machines.
IBM Spectrum Protect VE safely moves backup workloads to a centralized IBM Spectrum Protect server and enables administrators to create backup policies or restore virtual machines with just a few clicks. It allows you to protect data without a traditional backup window.
IBM Spectrum Copy Data Management makes copies available to DBAs, Developers and VM administrators when and where they need them. While this product is focused on DevOps and Dev/Test workflows, it can also be used to automate and schedule snapshots that can serve as backups.
Surprisingly, many companies do not take advantage of these solutions. Even clients who already have IBM Spectrum Protect deployed either (a) simply use Spectrum Protect clients on individual VM guests, or (b) use third-party products to back up VMs outside of the Spectrum Protect infrastructure.
"Problems cannot be solved with the same mind set that created them."
-- Albert Einstein
Smaller clients want something simpler to deploy, and easier to use and administer. Rather than simplify the products above, a process called "kneecapping" in the IT industry, IBM opted for a clean slate, [start-from-scratch] approach.
The result is IBM Spectrum Protect Plus, new software that was preview announced last Wednesday in time for this week's VMworld 2017 conference in Las Vegas, and next month's VMworld conference in Barcelona, Spain.
IBM Spectrum Protect Plus is available as either a stand-alone product, or integrated with IBM Spectrum Protect for long-term protection. It is focused exclusively on VMware and Hyper-V environments. General Availability is expected some time in 4Q 2017.
Key features include:
Simple to install in less than 15 minutes, configured in an hour
Easy to use by DBA, VM or application administrator. No IBM Spectrum Protect skills required for stand-alone deployment
Pre-defined Gold, Silver and Bronze policies are ready to use. Additional customized policies can be configured as needed
Supports both application-aware and crash-consistent methods
Data Footprint Reduction technologies including compression and deduplication
Instant data recovery to support DevOps, Dev/Test, Reporting, Analytics and Training
Granular search and restore of entire Virtual Machines, VMDKs, and individual files
As for the name, I would have preferred "IBM Spectrum Protect Basic Edition". The "Plus" implies that the new product is more advanced, or offers more features, than the existing Spectrum Protect editions.
For those who participated in Clark Hodge's "Where's Tony" contest, I was in Croatia, Bosnia and Herzegovina. There were optional side-tours to Montenegro and Slovenia, but I decided not to incur the added time and expense with those.
For those wondering where to go this Summer for vacation, I recommend Croatia. It is a beautiful country, with clean cities, good road conditions, and a calm Adriatic sea as we went from island to island.
And if you get to Mostar, don't let them talk you into jumping off the "old bridge". The water is terribly cold down there!
Based on our success with the Second Life launch event last week (see my previous blog posts here and here), people have asked me what tools and software we used. The ones that were the most useful were:
GIMP - available at gimp.org - is an open-source alternative to Adobe PhotoShop or Corel PaintShopPro, and is useful for editing photos and graphics, such as the surfaces of 3-D objects and clothing.
Avimator - available at avimator.com - for the gestures to animate your avatar. This allowed us to hold microphones up to our mouth to speak, hold a pointing stick to focus attention on specific things, or to drink coffee afterwards.
FRAPS - available at fraps.com - to capture video and screen shots. The free version is limited, so our designated "camera crew" purchased the full-price version, which worked very well.
The smart people at the University of Pittsburgh manage five campuses and over 33,000 students, and needed to create an enterprise storage solution that would give them three key benefits. Of course, they turned to IBM, the number one overall storage hardware vendor, to deliver.
A new storage infrastructure with the capacity to grow with the University of Pittsburgh as needed
Improved system reliability with reduced downtime, and availability 24/7/365
A significantly more manageable storage solution that could lower costs and provide better system efficiency through virtualization
As a result, IBM shipped its 25,000th high-end disk storage system, in this case two IBM System Storage DS8300 models, along with storage virtualization, and other related hardware, software and services, to provide a complete end-to-end solution.
Here is what Jinx Walton, Director of Computing Services and Systems Development at the University of Pittsburgh, had to say about it...
"The University of Pittsburgh supports large enterprise systems, and the number and complexity of new systems continue to grow. To effectively manage these systems it was necessary to identify an enterprise storage solution that would leverage our existing investments in storage, make allocation of storage flexible and responsive to project needs, provide centralized management, and offer the reliability and stability we require. The integrated IBM storage solution met these requirements"
This week, I was in beautiful Melbourne, Australia for IBM Systems Technical University. On Wednesday evening, we had a poster session.
(I have so many photos that I will split this post up into topics. This post will focus on IBM Z systems, see my other posts for storage and IBM Power systems.)
Topics can be anything that is of interest to your peers and colleagues. It can be research-related, a specific solution you implemented or an interesting customer case you want to share.
Linux Scalability at a Small Scale (or, An Adventure In Minimalist Multitudinousness)
Vic Cross, IBM Senior Systems Engineer, used the Ganglia Monitor System to generate traffic and measure 1,680 Linux guests on a single IBM Z mainframe LPAR with only 16GB of memory! His poster consisted of 18 pages of material, a mix of traditional presentation slides, screen shots of web pages, and densely detailed performance results.
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes. Learn more at [http://ganglia.sourceforge.net/]
Spectrum Scale 2 site cluster
Antony Steel, IBM Senior Consulting IT Specialist, presents an option to configure a 2 site GPFS (Spectrum Scale) "almost active-active" cluster when a 3rd site is not available. This option requires only simple administrative tasks to make the DR filesystem available should the Production site fail. Spectrum Scale runs on IBM Z, IBM Power and x86 servers.
The poster used 13 traditional landscape slides, printed on what appears to be A4 paper. A4 is 297 mm wide, so three side by side exceeds the 841 mm width of the poster foam board. These were arranged with a title slide on top, and then 12 content slides in four rows of three.
While I was glad that someone else had a QR code on their poster, the placement was way at the top, and difficult for anyone to actually scan it. I thought of this, and had mine at waist level in the middle right side of my poster.
Life is better with Linux
I couldn't resist taking a photo of the back of this guy's tee-shirt, which says "Life is better with Linux"
In effect, tee-shirts can also be "posters", although that would make for an awkward "poster session" if everyone wore them. Pointing at your chest would be weird, and pointing to your back would be near impossible!
In 1999-2001, I helped the port of Linux to IBM S/390 mainframe chip-set architecture by testing and debugging the disk and tape device drivers. I was the first to install Linux on an IBM mainframe in Tucson, AZ!
I would then go on to work with SAN Volume Controller, Tivoli Storage Manager (now called Spectrum Protect), Tivoli Storage Productivity Center (now called Spectrum Control), and the General Parallel File System (GPFS, now called Spectrum Scale). All of these run on Linux!
I would become the "Linux storage expert" at conferences like SHARE and GUIDE. While my co-workers in DFSMS and z/OS felt Linux was just a fad, I predicted that Linux was going to be a major force in the IT industry. I was right. Not only does Linux run on all of our IBM Z and Power servers, it is the underlying operating system for nearly all IBM storage devices.
Today, I run Linux directly on my laptop, using a Windows KVM guest image as needed for specific projects or applications.
Erina Araki poses for a photo with one of the attendees, Marco. Erina was the organizer for this poster event, and was my primary contact to answer all of my questions. I think the poster session was a big success!
I hope everyone enjoyed the French Open in Second Life! Here are some upcoming events:
Rational Software Development Conference comes to Second Life
As part of its commitment to the developer community, IBM is broadening the experience for conference visitors and avatars visiting IBM CODESTATION, in the virtual world of Second Life. During RSDC this year, visitors can view the General Sessions, catch Rational product demonstrations, interact with Rational experts, and learn about the first CODESTATION "Coder's Challenge" kicking off in July.
For Rational Software Development Conference (RSDC) information and registration, running June 10-14: here
Virtual Technical Briefing in Second Life: Web 2.0
Join IBM developerWorks in Second Life for a virtual Web 2.0 Briefing on June 21, 2007 at 12:30 pm EDT/ 9:30 am PDT. During this briefing from IBM developerWorks you'll see presentations on Web 2.0 technologies, a flash demo of associated hot technologies and have a chance to have your questions answered by IBM experts.
In the last two years Web 2.0 has created one of the most remarkable growth surges in Web application history. The transition of consumer Web sites from isolated information silos to sources of shared content and functionality, make the Web a true computing platform serving web applications to end-users. Now it's time to take the lessons learned from that success and see how it can bring value to you and your business.
Based on the success of our April 26 event, we decided to have the next event in September. More details to follow, but we plan to have it open to customers, analysts and business partners. If you are interested in participating, now is a good time to get your avatar in Second Life up and running. If you need "System Storage" or "IBM Business Partner" logo clothing for your avatar, send me a note.
Continuing this week's theme on Business Continuity, I thought I would explore more on the identification of scenarios to help drive appropriate planning. As I mentioned in my last post, this should be done first.
A recent post in Anecdote talks about the long list of cognitive biases which affect business decision making. This list is a good explanation of why so many people have a difficult time identifying appropriate recovery scenarios as the basis for Business Continuity planning. Their "cognitive biases" get in the way.
Again, using my IBM Thinkpad T60 laptop as an example, here are a variety of different scenarios:
Corrupted File System
Some file systems are more fragile than others. If your NTFS file system gets corrupted, you might be able to run
CHKDSK C: /F
but this just puts damaged blocks into dummy files; it doesn't really repair your files back to their pre-damage level. All kinds of things can damage the file system, including viruses, software defects, and user error.
I keep my programs and data in separate file systems. C: has my Windows operating system and applications, and D: holds my pure data. If one file system is corrupted, the other one might be intact, mitigating the risk.
Hard Disk Crash
Hopefully, you will have temporary read/write errors to provide warning prior to a complete failure. In theory, if I kept a spare hard disk in my laptop bag, I could swap out the bad drive with the good drive. I don't have that. The three times that I have had a disk failure all occurred while I was in Tucson.
Instead, I keep the few files I need for my trip on a separate USB key, and carry a bootable Live CD, which allows you to boot entirely from the CDrom drive, either to run applications, or perform rescue operations.
The latest one that I am trying out is Ubuntu Linux, which has OpenOffice 2.2 that can read/write PowerPoint, Word, and Excel files; Firefox web browser; Gimp graphics software; and a variety of other applications, all in a 700MB CDrom image. I have even been able to get Wireless (Wi-Fi) working with it, and the process to create your own customized Live CD with your own application packages is fairly straightforward. Combined with a writeable USB key, you can actually get work done this way. Special thanks to IBM blogger Bob Sutor for pointing me to this.
(If you have a DVD-RAM drive, there are bigger Live CDs from SUSE and RedHat Fedora that provide even more applications)
Laptop Shell Failure
This might catch some people by surprise. I have had the keyboard, LCD screen, or some essential port/plug fail on my laptop. The disk drive and CDrom drive work fine, but unless you have another "laptop" to stick them into, they don't help you recover. This can also happen if the motherboard fails, or the battery is unable to hold a charge.
IBM provides a 24-hour turnaround fix. Basically, IBM sends me a laptop shell, no drive, no CDrom, with instructions to move the disk drive and CDrom drive from my broken shell to the new shell, then send the bad shell back in the same shipping box.
Here, again, I am thankful that I keep my key files on a USB key. Often I travel with other IBMers, and can borrow their laptop to make presentations, check my e-mail, or do other work, until I can get my replacement shell. If you are travelling outside the US, you might be able to move your disk drive into a colleague's laptop, access the data, and copy it to your USB key or burn a copy on CD or DVD.
In a data center, many outages are really "failures to access data", but the data is safe. For example, power outages, network outages, and so on, can prevent people from using their IT systems, but the data is safe when these are re-established.
At times, I have been temporarily separated from my laptop. Three examples:
A higher level executive had technical difficulties with his laptop, and usurped mine instead.
A colleague forgot his power supply for his laptop, and borrowed my laptop instead. (I wish there were a standard for laptop power plug connectors)
Customs agents confiscate your laptop, give you a receipt, and eventually you get it back.
In all cases, I was glad that no "recovery" was required, and that the few files I needed were on my USB key. A few times, I was able to get by on the machines available at the nearest Internet Cafe, in the meantime.
With some imagination, you can recognize that this scenario is similar to the previous one for laptop shell failure. This is a good example of how you can identify different scenarios, then later discover they have similar properties in terms of recovery, and can be treated as one.
Laptops are stolen every day. Luckily, I've only had this happen twice to me in my career at IBM, and I managed to get a replacement soon enough. The key lesson here is to keep your USB key and recovery media in separate luggage. I know it is more convenient to keep all computer-related stuff in one place, but a thief is going to take your whole laptop bag, to make sure that all cables and power supplies are included, and is not going to leave anything behind. That would just slow them down.
In each case, some brainstorming, or personal experience, can help identify scenarios, identify what makes them unique from a recovery perspective, and plan accordingly. If you are looking to create or upgrade your Business Continuity plan, give IBM a call, we can help!