This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is a frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is IBM Cloud Private, and he will be delivering and supporting sessions at Think2019 and Storage Technical University on the value of IBM storage in this high-value IBM solution, a part of the IBM Cloud strategy. Lloyd maintains Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
In the 2004 comedy ["A Day Without a Mexican"], the director envisions how disruptive life would be in California if all the Mexicans suddenly disappeared. The point is that sometimes you take things in the background for granted.
I was reminded of this when I saw Mark Underwood's blog post [Mainframe: Still Not Crazy After All These Years]. The article reminds us how critical IBM z Systems mainframes (and related storage like the IBM DS8880 disk systems) are in our lives. Here's an excerpt:
"Warren Buffett's Berkshire Hathaway started buying up IBM stock in 2011 and bought still more of IBM later. Despite its disappointing short-term valuation, Berkshire Hathaway is standing by its IBM investment, which is one of Berkshire's top four plays. ... To make this case, some statistics may be needed:
The z13 can withstand an 8.0 earthquake.
z Systems enjoy the highest standardized security certification (FIPS 140-2, highest level 4 of 4).
23 of the world's top 25 retailers use a mainframe.
92 of the top 100 banks are mainframe users.
All 10 of the top 10 insurers have commitments in mainframe technologies.
Around 80 percent of all corporate data is managed by mainframes.
The z13 can process 2.5 billion transactions daily (that's 100 [Cyber Mondays], as IBM's Mark Anzani, VP of z Systems Strategy, Resilience and Ecosystems, observed)."
"... In fact, and notwithstanding perceptions to the contrary, the mainframe's center-stage position in large corporations around the world has not budged. That's the conclusion of an industry survey sponsored by Syncsort Inc. and conducted in 2015 by Enterprise Systems Media, a publisher of magazines for IT managers and technical professionals. Seven out of 10 respondents (IT planners, architects and managers at global enterprises with $1 billion or more in annual revenues) ranked the use of the mainframe for large-scale transaction processing as very important."
What would a comparable film depicting "A Day without a Mainframe" be like? I would imagine it somewhere between a disaster movie like  and an end-of-the-world zombie horror movie like [28 Days Later]. I would gladly take a million dollars to write the screenplay!
(FCC Disclosure: I work for IBM and am a filmmaker as well. Earlier in my career, I was chief architect of IBM's Data Facility Storage Management Subsystem (DFSMS) which manages around 80 percent of the world's corporate data. This blog post can be considered a "paid celebrity endorsement" for IBM's z13 System mainframes and DS8880 Disk Systems. I have personal experience with both and highly recommend them. I am neither a Mexican nor resident of California, but work regularly with both in my job responsibilities. Like Warren Buffett, I also own stock in both IBM and Berkshire Hathaway companies. I had no involvement in the making of any of the major motion pictures mentioned in this blog post, have no financial interest in their distribution, and have not been provided any compensation for mentioning them in this blog post. They are all great movies worth watching!)
What do you think the movie would be like? Enter your comments below!
Well, it's Tuesday again, and you know what that means? IBM Announcements!
IBM announced a new product, IBM Spectrum Protect Plus. To understand why, I will need to discuss a bit of history related to Data Protection.
(FCC Disclosure: I work for IBM. This blog post can be considered a "paid celebrity endorsement" for IBM Spectrum Protect, IBM Spectrum Protect Snapshot, IBM Spectrum Protect for Virtual Environments, and IBM Spectrum Copy Data Management products. I was not paid in any manner to promote Geoffrey Moore's book mentioned below.)
IBM Spectrum Protect was originally developed as the Workstation Data Save Facility (WDSF) back in the 1980s, when Personal Computers were just getting deployed.
I started in 1986 developing mainframe software, so we all had bulky 3270 terminals. When our area was offered 120 PCs to replace them, I was tasked with determining how to roll these out, 24 at a time, over five months.
My job was to determine who would get a PC in the first round, the second round, and so on. I handed out a simple one-page survey, asking everyone basic questions. Are you familiar with Personal Computers? Do you have one at home? Are you comfortable using a mouse? My plan was to give PCs to those most familiar with them in the early rounds, and to those less familiar in later rounds.
However, it was my final question that sealed the deal:
How soon do you want a PC to replace your 3270 terminal?
[ ]Immediately [ ]Next month [ ]No Hurry [ ]Put me last [ ]Never!
Surprisingly, I had roughly 24 folks choosing each option on this last question, which made my decision process easy for me!
(In his book Crossing the Chasm, fellow author Geoffrey Moore would come up with similar groups: Innovators, Early Adopters, Early Majority, Late Majority, and Laggards. This is a great book and I highly recommend it!)
Of course, we used WDSF to back up the files. WDSF would later morph into DFDSM, then ADSM, then TSM, and now it is called IBM Spectrum Protect.
Over the decades, the product has evolved from just backing up data on personal computers. IBM Spectrum Protect can now protect all kinds of machines, from tablets, mobile devices, and smartphones, to virtual machines, databases, and application servers in the data center.
Besides creating backup versions of files, IBM Spectrum Protect can also migrate older, less frequently used files to less expensive media, as well as archive files for long-term retention.
Different files can be assigned to different "management classes" that determine policies to be applied and enforced on the backup, migration and archive copies. For backups, this includes how many versions to keep while the file exists, how many versions to keep after the original file is deleted, and how long to keep those inactive versions.
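To make the three backup settings concrete, here is a minimal Python sketch of how such a policy might be enforced. The class, field and function names are hypothetical and are not IBM Spectrum Protect's actual parameters. As a simplification, the age of a version is measured from its backup date, and the newest version is treated as "active" while the file still exists.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ManagementClass:
    # Hypothetical field names, chosen only for this sketch.
    versions_if_exists: int        # versions kept while the original file exists
    versions_if_deleted: int       # versions kept after the original is deleted
    inactive_retention_days: int   # how long an inactive version is kept


def versions_to_keep(backup_dates, file_exists, policy, today):
    """Return the backup dates that survive policy enforcement, newest first."""
    newest_first = sorted(backup_dates, reverse=True)
    limit = policy.versions_if_exists if file_exists else policy.versions_if_deleted
    kept = []
    for index, made_on in enumerate(newest_first[:limit]):
        # The newest version is "active" only while the file still exists.
        is_active = file_exists and index == 0
        age_days = (today - made_on).days  # simplification: age since backup date
        if is_active or age_days <= policy.inactive_retention_days:
            kept.append(made_on)
    return kept


policy = ManagementClass(versions_if_exists=3, versions_if_deleted=1,
                         inactive_retention_days=30)
backups = [date(2017, 8, 31), date(2017, 8, 20), date(2017, 7, 1), date(2017, 6, 1)]
print(versions_to_keep(backups, True, policy, date(2017, 9, 1)))
```

With these illustrative numbers, the two August versions survive while the file exists; once the file is deleted, only the single newest version is retained.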
Instead of a grandfather-father-son [backup tape rotation], full-plus-incremental, or full-plus-differential scheme employed by other backup software, IBM Spectrum Protect has a unique "Incremental-Forever" approach that reduces backup time, LAN bandwidth requirements, and backup storage media.
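A rough back-of-the-envelope comparison shows where the savings come from. The Python sketch below counts the data sent over the LAN under a weekly-full-plus-daily-incremental scheme versus an incremental-forever scheme. The function names and the figures (1,000 GB of data, 20 GB changing per day) are made up for illustration, and the model ignores compression and deduplication.

```python
def weekly_full_plus_incremental(total_gb, daily_change_gb, days):
    """GB sent when a full backup runs every 7th day and incrementals run otherwise."""
    sent = 0
    for day in range(days):
        sent += total_gb if day % 7 == 0 else daily_change_gb
    return sent


def incremental_forever(total_gb, daily_change_gb, days):
    """GB sent with one initial full backup followed only by incrementals."""
    if days == 0:
        return 0
    return total_gb + daily_change_gb * (days - 1)


# Hypothetical workload: 1,000 GB of data, 20 GB changes per day, four weeks.
print(weekly_full_plus_incremental(1000, 20, 28))  # 4 fulls + 24 incrementals
print(incremental_forever(1000, 20, 28))           # 1 full + 27 incrementals
```

Under these assumptions the weekly-full scheme sends 4,480 GB in four weeks, while incremental-forever sends 1,540 GB.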
While most companies still back up to tape, IBM Spectrum Protect can back up to flash, disk, tape, virtual and physical tape libraries, object storage, and even to public Cloud Service Providers such as IBM Bluemix, Amazon S3, and Microsoft Azure.
IBM Spectrum Protect offers both client-side and server-side data footprint reduction technologies including compression and deduplication, eliminating the need for expensive, single-purpose data deduplication devices like Dell-EMC Data Domain.
IBM Spectrum Protect is recognized as a leader in Data Protection software, able to scale up to meet the demands of the largest enterprises. However, the parameters and options that IBM Spectrum Protect has acquired over time have been compared to the cockpit or flight deck of an airplane!
For clients with Virtual Machines, IBM offered three solutions:
IBM Spectrum Protect Snapshot
Formerly called Tivoli Storage FlashCopy Manager (FCM), [IBM Spectrum Protect Snapshot] takes frequent, near-instant, non-disruptive, application-aware backups and restores for SAP, Oracle and Db2. It can also be used for VMware using advanced snapshot technology, on both IBM and non-IBM storage systems.
IBM Spectrum Protect Snapshot can be used as a stand-alone product, or integrated with IBM Spectrum Protect to move the snapshots and FlashCopy targets to other storage media.
IBM Spectrum Protect for Virtual Environments (VE)
Formerly called IBM Tivoli Storage Manager for Virtual Environments, [IBM Spectrum Protect VE] protects both VMware and Microsoft Hyper-V virtual machines.
IBM Spectrum Protect VE safely moves backup workloads to a centralized IBM Spectrum Protect server and enables administrators to create backup policies or restore virtual machines with just a few clicks. It allows you to protect data without a traditional backup window.
IBM Spectrum Copy Data Management
IBM Spectrum Copy Data Management makes copies available to DBAs, Developers and VM administrators when and where they need them. While this product is focused on DevOps and Dev/Test workflows, it can also be used to automate and schedule snapshots that can serve as backups.
Surprisingly, many companies do not take advantage of these solutions. Even clients who already have IBM Spectrum Protect deployed either (a) simply use Spectrum Protect clients on individual VM guests, or (b) use third-party products to back up VMs outside of Spectrum Protect infrastructure.
"Problems cannot be solved with the same mind set that created them."
-- Albert Einstein
Smaller clients want something simpler to deploy, and easier to use and administer. Rather than simplify the products above, a process called "kneecapping" in the IT industry, IBM opted for a clean slate, [start-from-scratch] approach.
The result is IBM Spectrum Protect Plus, new software that was preview announced last Wednesday in time for this week's VMworld 2017 conference in Las Vegas, and next month's VMworld conference in Barcelona, Spain.
IBM Spectrum Protect Plus is available as either a stand-alone product, or integrated with IBM Spectrum Protect for long-term protection. It is focused exclusively on VMware and Hyper-V environments. General Availability is expected some time in 4Q 2017.
Key features include:
Simple to install in less than 15 minutes, configured in an hour
Easy to use by DBA, VM or application administrator. No IBM Spectrum Protect skills required for stand-alone deployment
Pre-defined Gold, Silver and Bronze policies are ready to use. Additional customized policies can be configured as needed
Supports both application-aware and crash-consistent methods
Data Footprint Reduction technologies including compression and deduplication
Instant data recovery to support DevOps, Dev/Test, Reporting, Analytics and Training
Granular search and restore of entire Virtual Machines, VMDKs, and individual files
As for the name, I would have preferred "IBM Spectrum Protect Basic Edition". The "Plus" implies that the new product is more advanced, or offers more features, than the existing Spectrum Protect editions.
"Do you know what I do?" Mr. Mondavi recalls Mr. Gallo asked him when they first met. "Yes, you run the largest winery in the country," recalls Mr. Mondavi, then in his mid-20s. "No," Ernest corrected him. "I go out and visit customers in stores."
Robert Smith (aka Radio Voom) reports on National Public Radio that Second Life is now being used for campaigning for political candidates. It used to be that political candidates took trains and buses across the country, meeting people, discussing their issues, and getting a feel for what is going on in the hearts and minds of their potential voters. With the development of TV and Radio, candidates traveled less, hoping to get their word out to people who would listen to them. Using Second Life and other social networking tools brings candidates back to having conversations with the people they hope to represent.
Of course, many of these candidates are old, and are learning internet social networking skills for the first time. John McCain, my senator from Arizona, is running for President at 70 years old! It's true that old dogs CAN learn new tricks.
IBM is investing heavily into Second Life, as are many other forward-thinking companies, to explore the age-old human need for connectedness, community and dialog. I've asked my team to all get their avatars up and running in Second Life. Granted there is a bit of a learning curve, but everybody handles change in different ways, some better than others.
"Knowledge is the antidote to fear." -Ralph Waldo Emerson
Why are most of these guys (and girls) with over a billion US dollars in net worth still working? Perhaps because they embrace new ideas, and are on the thrill seeking side of humanity. I guess I am too. I'll be thrill-seeking in Chicago this weekend, celebrating St. Patrick's day.
My colleague, Marissa Benekos, is on location with her video camera in Orlando, Florida for the ComputerWorld [Storage Networking World] conference.
The IT specialists from the IBM booth were excited at David Bricker's debut on YouTube. Here's the rest of the gang in this [video].
Here's Andy Monshaw, General Manager of IBM System Storage and keynote speaker at this SNW event, summarizing IBM's "Information Infrastructure" strategy in 60 seconds in this [Youtube video].
This last video is Clod Barrera talking about the importance of security. Clod is an IBM Distinguished Engineer and Chief Technical Strategist for IBM System Storage product line. Here is his [Youtube video].
It looks like Marissa is having a lot of fun taking these videos at the event. More videos, as we get them, will be posted to the [IBM videos channel].
I have been involved with Business Continuity and Disaster Recovery my entire career at IBM System Storage. However, with new workloads like Hadoop analytics and new Hybrid Cloud deployments, I thought it would be good to provide a refresh.
The need for Business Continuity and Disaster Recovery has increased recently due to (a) climate change caused by human activity, (b) ransomware and other cyber attacks, and (c) disgruntled employees.
Back in 1983, a task force of IBM clients at a GUIDE conference developed "Seven Business Continuity Tiers for Disaster Recovery", which I refer to as "BC Tiers". I divided the presentation into three sections:
Backup and Restore: BC tiers 1 through 3 are based on backup and restore methodologies. I explained how to back up Hadoop analytics data, all of the various options for IBM Spectrum Protect software, and how to encrypt the tape data that gets sent off premises.
Rapid Data Recovery: BC tiers 4 and 5 reduce the Recovery Point Objective (RPO) and Recovery Time Objective (RTO) with snapshots, database journal shadowing, and IBM Cloud Object Storage.
Continuous Operations: BC tiers 6 and 7 provide data replication mirroring across locations. I covered 2-site, 3-site and 4-site configurations.
IBM Spectrum Virtualize - How it works - Deep dive
Barry Whyte, IBM Master Inventor and ATS for Spectrum Virtualize, covered a variety of internal topics "under the hood" of Spectrum Virtualize. This covers the SAN Volume Controller (SVC), FlashSystem V9000, Storwize V7000 and V5000 products, as well as Spectrum Virtualize sold as software.
In version 7.7, IBM raised the limits. You can now have 10,000 virtual disks per cluster, rather than 2,048 per node-pair. Also, you can now have up to 512 compressed volumes per node-pair. With the new 5U-high 92-drive expansion drawers, Storwize V7000 can now support up to 3,040 drives, and Storwize V5030 can support up to 1,520 drives.
While each Spectrum Virtualize node has redundant components, the architecture is designed to handle entire node failure. The term "I/O Group" was created to refer to the node-pair of Spectrum Virtualize engines and the set of virtual disks it manages. This made sense when virtual disks were dedicated to a single node-pair. Now, virtual disks can be assigned to multiple node-pairs, dynamically adding or removing node-pairs as needed for each virtual disk.
However, even if you have a virtual disk assigned to multiple node-pairs, only one node-pair would manage its cache, causing all other node-pairs to coordinate I/O through the cache-owning node-pair. The other node-pairs are called "access I/O groups".
The architecture allows for linear scalability: double the number of nodes, and you double your performance. Some competitors use n-way caching across four or more nodes, and it is a semi-religious argument on the pros and cons of each approach. Barry feels the 2-way caching implemented by Spectrum Virtualize is the most effective and efficient for performance.
All of the nodes are connected over an IP network, but there is one designated as a "config node", and one, often the same, as a "boss node".
A cluster can have up to three physical quorum disks (either drive or mDisk) and optionally up to five IP-based quorums. The IP-based quorum is just a Java program that runs on any server or Cloud, provided it can respond within 80 msec.
Either IP-based or physical quorum can be used for "tie-breaking" a split-brain situation. In the event there is no "active" quorum, the administrator can now serve as the tie-breaker manually. Barry recommends for Storwize clusters, where physical quorum disks are attached to a single node-pair, that you have at least one IP-based quorum for tie-breaking.
However, only physical quorum can be used for T3 Recovery. T3 Recovery happens after power outages. All of the nodes update the quorum disk with critical information of all of the virtual mappings of blocks to volumes, and this is used when bringing up the nodes again.
To protect against one pool consuming all of the cache, Spectrum Virtualize will partition the cache, and prevent any one pool from consuming more than a certain percentage of the total cache. The percentage depends on the number of pools:
Number of Pools
Max percentage of any individual pool
5 or more
Barry explained how failover works in the event of node failure. There is voting involved, and the majority remains in the cluster. In the case of an even split, called a "split brain" situation, the quorum decides. Orphaned nodes in a node-pair go into write-through mode, since the cache is no longer mirrored.
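The voting rule described above can be sketched in a few lines of Python. This is only a toy model of majority voting with a quorum tie-breaker, not Spectrum Virtualize's actual implementation. The function names are hypothetical, and the "write-back" label for the normal, mirrored case is an assumption, since the text only names the write-through mode.

```python
def surviving_partition(partitions, quorum_winner=None):
    """Pick which group of nodes keeps running as the cluster.

    partitions: list of sets of node names that can still see each other.
    quorum_winner: index of the partition the quorum (or the administrator)
                   sided with; consulted only when there is an even split.
    Returns the surviving set, or None if a tie cannot be broken.
    """
    total = sum(len(group) for group in partitions)
    for group in partitions:
        if 2 * len(group) > total:   # a strict majority always wins
            return group
    # No majority: an even split ("split brain") is decided by the quorum.
    halves = [group for group in partitions if 2 * len(group) == total]
    if quorum_winner is not None and partitions[quorum_winner] in halves:
        return partitions[quorum_winner]
    return None


def cache_mode(partner_online):
    """An orphaned node cannot mirror its cache, so it stops caching writes."""
    return "write-back" if partner_online else "write-through"


print(surviving_partition([{"n1", "n2", "n3"}, {"n4"}]))
print(surviving_partition([{"n1", "n2"}, {"n3", "n4"}], quorum_winner=1))
print(cache_mode(partner_online=False))
```

In the second call, neither side has a majority, so the outcome depends entirely on which side the tie-breaker chooses.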
The I/O forwarding layer has been split between upper and lower roles. The upper layer handles access I/O groups. The lower layer handles asymmetric access to drives, mDisks and arrays.
N-port ID Virtualization (NPIV) drastically improves multi-pathing. Perhaps one of the coolest improvements in a while, NPIV allows us to assign "Virtual" WWPNs to other ports. When an I/O sent to a single port fails, the host retries one or more times, then waits 30 seconds, and then invokes multi-pathing to find a completely different path to the data. With NPIV, when a port fails, its WWPN is re-assigned to a different port, so the retries are likely to succeed before having to wait 30 seconds!
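The difference NPIV makes can be illustrated with a toy Python model. The `Fabric` class and its methods are invented for this sketch; only the 30-second wait comes from the description above. The time spent on the retries themselves is treated as zero, and real behavior depends on the host's multi-path driver.

```python
MULTIPATH_WAIT_SECONDS = 30  # from the text: the wait before multi-pathing kicks in


class Fabric:
    """Toy model: each WWPN is presented by exactly one physical port."""

    def __init__(self, port_of_wwpn, npiv_enabled):
        self.port_of_wwpn = dict(port_of_wwpn)   # WWPN -> physical port
        self.healthy = set(self.port_of_wwpn.values())
        self.npiv_enabled = npiv_enabled

    def fail_port(self, port):
        self.healthy.discard(port)
        if self.npiv_enabled and self.healthy:
            # With NPIV, the failed port's virtual WWPNs move to a surviving port.
            spare = sorted(self.healthy)[0]
            for wwpn, current in list(self.port_of_wwpn.items()):
                if current == port:
                    self.port_of_wwpn[wwpn] = spare

    def reachable(self, wwpn):
        return self.port_of_wwpn[wwpn] in self.healthy


def delay_before_io_succeeds(fabric, wwpn):
    """Retries target the same WWPN, so they succeed only if it is still reachable."""
    if fabric.reachable(wwpn):
        return 0
    return MULTIPATH_WAIT_SECONDS


for npiv in (False, True):
    fabric = Fabric({"wwpn-a": "port1", "wwpn-b": "port2"}, npiv_enabled=npiv)
    fabric.fail_port("port1")
    print(npiv, delay_before_io_succeeds(fabric, "wwpn-a"))
```

Without NPIV, the host pays the 30-second wait before finding another path. With NPIV, the same WWPN answers from a different port, so the retry succeeds.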
Lastly, Barry covered the delicate art of Software upgrades. Software is rolled forward one node at a time, and the "cluster state" is maintained during this time.
Different presentations this week are at different technical levels. My session was meant to be an overview of the concepts of Business Continuity, independent of specific operating system platform, using specific IBM products to help illustrate specific examples. Barry's was a deep dive into a single product family.
For a while now, IBM has been trying to explain to clients that focusing on just storage hardware acquisition costs is not enough. You need to consider the "Total Cost of Ownership" or TCO of a purchase decision. For active data, a 3-5 year TCO assessment can give you a better comparison of costs between IBM and competitive choices. For long-term archive retention, 7-10 year TCO assessment may be necessary.
Now, IBM has a cute [2-minute video] that brings an appropriate analogy to help IT and non-IT executives understand.
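To see why acquisition cost alone can mislead, consider this simple Python sketch. The formula is deliberately minimal (purchase price plus a flat yearly operating cost), and the dollar figures for the two options are entirely hypothetical. A real TCO assessment would include more factors, such as power, cooling, floor space, maintenance and migration.

```python
def total_cost_of_ownership(acquisition, annual_operating, years):
    """Purchase price plus a flat operating cost for each year of ownership."""
    return acquisition + annual_operating * years


# Hypothetical options: A is cheaper to buy, B is cheaper to run.
option_a = {"acquisition": 100_000, "annual_operating": 40_000}
option_b = {"acquisition": 150_000, "annual_operating": 20_000}

for years in (1, 3, 5):
    cost_a = total_cost_of_ownership(years=years, **option_a)
    cost_b = total_cost_of_ownership(years=years, **option_b)
    print(years, cost_a, cost_b)
```

With these made-up numbers, option A looks better after one year (140,000 versus 170,000), but option B is already ahead at three years (220,000 versus 210,000).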
Well, it's Tuesday again, and you know what that means? IBM Announcements!
(This week I am in Pennsylvania and New York speaking to clients. The weather this week has not been cooperative!)
Spectrum Protect Plus 10.1.2
Just in time for the upcoming VMworld conference, IBM announces the following features added to Spectrum Protect Plus, a snapshot-based backup software for VMware, Hyper-V and databases.
Data-at-Rest Encryption for local backups stored in the vSnap repository
IBM Db2 support with point-in-time recovery
VMware vSphere 6.7 support
Alerting for backup and restore jobs and storage threshold limits
Drill-down capabilities for dashboard widgets
Spectrum Protect 8.1.6
IBM also continues to enhance its traditional file-based backup product. Here are some of the features:
Tier data by backup state for container pools. When you have multiple backup versions, the most recent version is called the "active", the older versions are called "inactive" versions. Rarely do you recover inactive versions, so this feature allows them to be migrated off to object or cloud storage.
Ransomware detection for Virtual Environment workloads. This is an enhancement of the "Ransomware detection" introduced earlier this year, but for VMware and Hyper-V images.
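The idea behind tiering by backup state can be sketched as follows. This Python example is only an illustration: the function name and the "local" and "cloud" tier labels are hypothetical, and it simply treats the newest version of each object as active and every older version as inactive.

```python
from collections import defaultdict


def plan_tiering(versions):
    """Map each object name to its versions (newest first) and a target tier.

    versions: list of (object_name, backup_timestamp) pairs.
    The newest version of each object is "active" and stays local;
    older, "inactive" versions are candidates for cloud or object storage.
    """
    by_name = defaultdict(list)
    for name, stamp in versions:
        by_name[name].append(stamp)
    plan = {}
    for name, stamps in by_name.items():
        stamps.sort(reverse=True)
        plan[name] = [(stamp, "local" if position == 0 else "cloud")
                      for position, stamp in enumerate(stamps)]
    return plan


print(plan_tiering([("a.txt", 1), ("a.txt", 3), ("b.txt", 2)]))
```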
IBM DS8882F All-Flash Array
When IBM announced the DS8880, it shocked folks by changing from the previous 33-inch-wide frame to a standard 19-inch width. The IBM Z team followed up with 19-inch wide models of its mainframe servers.
Now, IBM can bring these together. There are two flavors of the new DS8882F:
The "Rackless" model is 17U in height with the optional keyboard/monitor, and can be put into existing 19-inch racks. These can be used with VMware, Linux, Windows, AIX and z/OS.
The "Flex Frame" model is 16U, allowing it to fit nicely inside a single-rack IBM Z z14 ZR1 model, or LinuxOne RockHopper II model. It is 16U instead of 17U because it shares the existing 1U-high keyboard/monitor unit.
Like the DS8888F, DS8886F, and DS8884F models, the new DS8882F uses the High Performance Flash Enclosure (HPFE) gen2 drawers, supporting either high-performance/high-endurance drives (400GB to 3.2TB each), or high-capacity/standard-endurance drives (3.8TB to 15.3 TB each).
The R8.5 release of firmware that accompanies this announcement also supports data-in-flight encryption for Transparent Cloud Tiering. It also supports a new feature called "Safeguarded Copies", up to 500 copies to protect against hackers and ransomware.
IBM Spectrum Access blueprints have been extended to support IBM Z and LinuxOne. These blueprints show how to run IBM Cloud Private with Spectrum Connect with IBM block storage, including IBM DS8880/F, SVC, Storwize and FlashSystem models.
IBM Storage Solutions for Virtual Desktop Infrastructures (VDI)
IBM offers a new blueprint to configure Virtual Desktops with its newly announced IBM FlashSystem 9100 device. The low latency/high IOPS capability of the FlashSystem 9100 is perfect for the type of "boot storms" that are often encountered with VDI deployments.
IBM Spectrum Scale 5.0.2 and Elastic Storage Server
At a recent IBM Technical University, I joked that the IBM Elastic Storage Server is only "part of a complete breakfast" because it only supported the NSD POSIX interface. To make it useful in most situations, you needed to buy additional servers outside of the ESS to run Spectrum Scale protocol nodes to provide industry-standard file and object protocols.
Today, IBM announced that you can order a new "IBM Elastic Storage Server Data Server" (5148-22L) which is a POWER server with the Spectrum Scale software pre-installed for protocol node support. It has [similar specifications] to the IBM Elastic Storage Server Management Server (5148-21L).
If you prefer to run Spectrum Scale in the cloud, you can "Bring your own license" (BYOL) to Amazon Web Services.
Well, it's the end of the year, so I thought a recap of year 2014 would be in order.
The year started out with some January announcements, including the IBM FlashSystem 840. IBM is proud to be ranked #1 in All-Flash Arrays, and the IBM acquisition of Texas Memory Systems has caused all of the other competitors to scramble their own wanna-be offerings. IBM also announced it was going to sell off its System x division to Lenovo.
In February, I wrapped up a project to build a Linux-based PC for a kindergarten class. IBM announced some exciting new things at Pulse 2014 conference, including IBM Bluemix Platform-as-a-Service (PaaS), new IBM SmartCloud Virtual Storage Center offerings, and acquisition of Cloudant Database. Also, on Valentine's day, IBM announced the FlashSystem V840, which combines the software-defined storage features of SAN Volume Controller, with the Microlatency of the FlashSystem 840. IBM sold its 10,000th PureSystems converged expert-integrated system.
In March, I completed a six-month film project ["A Tucson Executive Briefing Center: A Quick Visual Tour"]. I was writer/director/actor for this quick 3-minute film posted on YouTube. I wrote the script and had it reviewed by a professional script reviewer, hired a professional cinematographer, paid royalties for background music, located a voice-over expert for narration, and trained the actors (all IBM employees) how to read their lines and stand on their mark for the camera. It was a big success!
In April, I presented at the Systems Technical University in Istanbul, Turkey. I had been to Turkey before, but this was my first time to the city of Istanbul itself. The owner of my local [Savaya Coffee] is from Istanbul, and was able to introduce me to someone who was able to arrange for a full tour my first day! Meanwhile, on the other side of the pond, IBMers in New York were celebrating the 50th anniversary of the IBM mainframe, including a cameo appearance on the TV show "Mad Men".
In May, I was busy presenting at the IBM Edge conference in Las Vegas. IBM celebrated the sixth anniversary of IBM ProtecTIER data deduplication device, announced "Codename: Elastic Storage" and new features on the DS8870 disk system, and presented analyst findings that IBM Software Defined Storage was substantially less expensive than competitive offerings.
In July, I took a nice summer vacation, [a road trip across the state of Tennessee]. IBM made a strategic partnership with Apple to offer mobile apps for the data center enterprise for the iOS operating system on iPhones and iPad tablets.
In August, I completed a summer partnership with University of Toronto and IBM Softlayer to build "Concept IBM Watson", a scaled down version of IBM Watson based on my infamous 2011 blog post [How to replicate Watson hardware and systems design for your own use in your basement]. Rather than using three physical servers, however, we had virtual x86 machines running on IBM Softlayer cloud. The system was only asked the simplest "How many...?" questions against a single text document, but proved to the University that teaching analytics by replicating IBM's historic achievement was effective and possible.
In September, I celebrated my eight-year "Blogoversary". That's right, I have been blogging for the past eight years! With over 800 posts, and five published books, I continue to be ranked #1 most-read blog on IBM developerWorks. IBM was ranked #1 for Software Defined Storage!
In October, I presented at the Systems Technical University in Dublin, Ireland. This was my first time in Ireland, and I found Dublin to be quite a beautiful city, with friendly people and delicious food.
The rest of October, and much of November and December, I spent on the road, visiting clients to help close deals! (Sorry folks... Due to SEC black-out rules, I am prohibited from telling you how well I did) Since I am not allowed to talk about on-going discussions that I have with clients, my blog has been noticeably silent during these months. I apologize for any stress or anxiety this might have caused any of my readers!
Despite too-much-candy, too-much-turkey and too-many-cookies that the year-end often brings, I managed to lose twenty pounds on a low-carb, gluten-free, Paleo diet and exercise.
IBM has been holding various "Hackathons" and "Meetups" as a new way to reach out to prospective clients. IBM sponsored a meetup at the Austin Executive Briefing Center (EBC) to discuss Machine Learning with TensorFlow on IBM Power systems, October 26, 2017.
This was a joint event, co-sponsored by [IBM Watson/Cognitive Austin] and [Big Data/AI Revealed] meetup groups. Special thanks to my colleague Cathy Cocco, IBM Executive IT Architect with the IBM Austin EBC, for coordinating this event with their organizers.
(What is a Meetup? [Meetup.com] is an online social networking website that facilitates in-person local group meetings. Meetup allows members to find and join groups unified by a common interest, such as books, games, pets, technology, careers or hobbies. In 2017, there are 32 million users with 280 thousand groups available across 182 countries.)
Here was the agenda for the event:
Registration, Pizza & Soft drinks
Tensorflow 101 presentation
Demo: Using TensorFlow for Financial Market Predictions on IBM POWER Systems
Lightning Talk: IBM Data Science Experience
Clarisse Taaffe-Hedglin: Intro to TensorFlow on IBM Power servers
Our guest speaker was my colleague Clarisse Taaffe-Hedglin, IBM Cognitive Senior Technical Architect, part of the same Worldwide Client Centers team that I work in. She flew in from Charlotte, NC.
Her topic was TensorFlow, an open source [Machine Learning] framework. TensorFlow was originally developed by Google, but was made open source in November 2015.
Machine Learning is popular in a variety of industries, from self-driving cars and trucks, speech recognition and video surveillance, to what movie to watch next on Netflix. There are three aspects to Machine Learning:
Data: Start with the data you want to analyze. This could be IoT sensor data, security logs, or social media feeds. Check out all that happens in an "Internet Minute"!
Compute: While mathematical computations can be performed on traditional CPUs, some frameworks are optimized and accelerated with Graphics Processing Units (GPUs). These GPUs can perform teraflops of single- and double-precision calculations.
Technique: As methodologies have become more complicated over the years, frameworks have evolved to match.
The [TensorFlow] framework is now one of the most popular among data scientists. You can download it for free at [GitHub].
Clarisse showed the various programming/calculation tools used by data scientists. The top five were: Python, R, SQL, MapReduce, and Microsoft Excel.
Mathematical models come in many flavors. Clarisse explained they can be used to identify clusters of data that might have similar properties, or to perform classification, or linear regression. The results can be "descriptive", gaining a better understanding of what already is, or "predictive" for what might be.
Some frameworks like Chainer or Torch are more flexible, using a dynamic Build-by-Run approach. However, these do not scale well. Theano and TensorFlow, on the other hand, employ a Define-then-Run approach, which scales better for larger projects. With the growth in popularity of TensorFlow, the Theano framework has been "functionally stabilized".
Clarisse Taaffe-Hedglin: Financial Markets Demo
For the demo, Clarisse had historical stock closing data for U.S., Australian and Asian stock markets. The hypothesis: can we determine a Buy/Sell signal for U.S. stocks based on the closing results of non-American stock markets? This is a classic "Binary Classification" model. The other stock markets close 4-16 hours before the U.S. markets open, so this has real-world applicability.
Since the data was in different monetary units, she did some cleanup to normalize the data, removing the trends, and converting everything to U.S. Dollars (USD).
Clarisse used "Supervised Learning" on an 80 percent subset of the data, and then used the remaining 20 percent to validate how well the model did.
As with any model, you measure how good it is by how close its results come to the correct answer. Wrong answers are weighted by how bad they are. This is often referred to as "Loss" or "Cost". Different models can therefore be compared by their loss, with lower being better.
Using a simple y=wx+b mathematical model, she ran 30,000 iterations. After 5,000 iterations, the model was already guessing correctly 55 percent of the time; by the time we hit 30,000 iterations, this was up to 68 percent accuracy.
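For readers who want to see the moving parts, here is a minimal sketch of the same idea in plain Python rather than TensorFlow. It is not Clarisse's demo: the data is synthetic and made up, and the function names, learning rate and iteration count are my own assumptions. It shows the 80/20 split, a y=wx+b model squashed into a Buy/Sell probability, the "Loss" being minimized, and the accuracy check on the held-out 20 percent.

```python
import math
import random

def sigmoid(z):
    """Squash a real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))

def mean_log_loss(samples, w, b):
    """Average cross-entropy loss: confident wrong answers cost the most."""
    eps = 1e-12
    total = 0.0
    for x, y in samples:
        p = min(max(sigmoid(w * x + b), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(samples)

def accuracy(samples, w, b):
    """Fraction of samples where the predicted class matches the label."""
    hits = sum(1 for x, y in samples if (sigmoid(w * x + b) >= 0.5) == (y == 1))
    return hits / len(samples)

def train(samples, iterations=500, learning_rate=0.5):
    """Plain gradient descent on w and b to minimize the mean log loss."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(iterations):
        grad_w = grad_b = 0.0
        for x, y in samples:
            error = sigmoid(w * x + b) - y
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b

# Synthetic, made-up data: x stands in for an overseas market's normalized
# daily change, y is 1 when the (hypothetical) U.S. market rose that day.
rng = random.Random(42)
data = []
for _ in range(500):
    x = rng.gauss(0.0, 1.0)
    y = 1 if x + rng.gauss(0.0, 0.5) > 0 else 0
    data.append((x, y))

split = int(len(data) * 0.8)  # 80 percent for training, the rest to validate
training, validation = data[:split], data[split:]

loss_before = mean_log_loss(training, 0.0, 0.0)
w, b = train(training)
loss_after = mean_log_loss(training, w, b)
validation_accuracy = accuracy(validation, w, b)
print(f"loss {loss_before:.3f} -> {loss_after:.3f}, "
      f"validation accuracy {validation_accuracy:.0%}")
```

The numbers this prints say nothing about real markets; the point is only that the loss falls as the iterations run, and that accuracy is judged on data the model never saw during training.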
TensorFlow also supports "hidden layers", basically intermediate variables that are then used in subsequent layers for more complicated calculations. This loosely mimics the way neurons in our brains work, hence the term "neural networks". With two added layers, she re-ran the 30,000 iterations, and now was up to 73 percent accuracy.
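To make "intermediate variables" concrete, here is a tiny hand-wired sketch in plain Python. It is my own illustration, not part of the demo: it uses a single hidden layer with weights I picked by hand rather than learned. The two hidden values h1 and h2 let the network compute exclusive-or, something a lone y=wx+b model cannot represent.

```python
def relu(z):
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return max(0.0, z)

def forward(x1, x2):
    # Hidden layer: two intermediate variables computed from the inputs.
    h1 = relu(x1 + x2)
    h2 = relu(x1 + x2 - 1.0)
    # Output layer: combines the hidden values, not the raw inputs.
    return h1 - 2.0 * h2

# Exclusive-or truth table: the output is 1 only when exactly one input is 1.
outputs = [forward(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)
```

In a real framework the weights would be learned from the training data, and more layers can be stacked the same way.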
Normally, this kind of analysis would take hours or days, but since TensorFlow takes advantage of the IBM Power8 CPU and NVidia Tesla K80 GPU in the IBM Power server, the whole thing finished in five minutes!
Tuhin Mahmed: Lightning Talk on IBM Data Science Experience (DSX)
Tuhin Mahmed, IBM Software Developer, is the organizer for the Big Data/AI meetup group. He wants to promote the idea of "Lightning Talks" where each person presents for just 10-15 minutes. This is a variant of the popular [Pecha Kucha] events.
To get things started, he presented 10-15 minutes on [IBM Data Science Experience], or DSX for short. Taking Multiple Listing Service (MLS) real estate data of closing prices on houses sold in a range of zip codes from the Austin area, he mapped these on x-y axes. The x axis was square feet, and the y axis was closing price.
Using DSX, he was able to develop a mathematical model that estimates house closing prices based on their zip code and square footage.
This was a simple example, but it showed the power of Jupyter Notebooks, and how anyone can get a 30-day free trial of DSX for their own experimentation.
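A model like the one Tuhin described can be sketched in a few lines of plain Python. This is not his notebook: the sales figures below are invented for illustration, and fitting one straight line per zip code is my own simplifying assumption about how zip code and square footage might be combined.

```python
from collections import defaultdict

def fit_line(points):
    """Ordinary least squares for price = slope * sqft + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_by_zip(sales):
    """Group sales by zip code and fit one line per group."""
    grouped = defaultdict(list)
    for zip_code, sqft, price in sales:
        grouped[zip_code].append((sqft, price))
    return {zip_code: fit_line(points) for zip_code, points in grouped.items()}

def estimate(models, zip_code, sqft):
    """Estimated closing price for a house of the given size and zip code."""
    slope, intercept = models[zip_code]
    return slope * sqft + intercept

# Made-up closing prices: (zip code, square feet, price in USD).
sales = [
    ("78701", 1000, 250000), ("78701", 1500, 350000), ("78701", 2000, 450000),
    ("78745", 1200, 230000), ("78745", 1800, 320000), ("78745", 2400, 410000),
]
models = fit_by_zip(sales)
print(estimate(models, "78701", 1800))
print(estimate(models, "78745", 2000))
```

Real MLS data would be far noisier than this, which is where the notebook tooling for plotting and checking the fit earns its keep.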
Currently, being a data scientist is more of an art than a science. This is one of those fields that takes only a few months to learn, but years to master.
Rather than building a model from scratch, data scientists can take existing models, and modify them to fit their needs. There are a variety of existing models available in what is called the "Model Zoo". Google has over 2,000 projects already.
Those interested in trying out TensorFlow for themselves were directed to [Nimbix], a Cloud Service Provider that offers POWER servers with NVidia GPUs.
There were about 50 attendees, more than half of whom identified themselves as data scientists. As the first sponsored event for the IBM Austin EBC, I think this was a success!
If you are in the Austin area, the next meetup will be at the [Capital Factory] on Brazos Street on November 30, 2017.
While some might be familiar with mashups that combine public Web 2.0 sources of information, enterprise mashups go one step further, integrating with the "information infrastructure" of your data center. It's not just enough to deliver the right information to the right person at the right time, it has to be in the right format, in a manner that can be readily understood and acted upon. Enterprise mashups can help.
Last year in Beijing, China, one of my colleagues told me "When it rains here, cabs dry up". Normally, there are enough taxi cabs to handle normal conditions, but when it rains, people who normally walk now want to take a cab instead, and the demand goes up, making it more difficult to find one when you need one.
I'm wrapping up my week here in Chicago, and it snowed yesterday. Cabs were scarce. I walked. Many others walked too, about half with umbrellas to protect themselves against the snowflakes.
Most systems are designed to handle typical average conditions. Taxi cabs in a city, for example, handle typical amounts of traffic.
IT is different. In many cases, IT infrastructures are designed for the peaks, not the averages. Peaks can be where you need performance the most, and failure to design for peaks can be disastrous. As with any business decision, this represents a trade-off. Design for the average, and suffer through the peaks, or design for the peak, and be over-allocated and under-utilized most of the time otherwise.
Mark your calendars! IBM plans to have back-to-back Technical University events in Hollywood, Florida:
October 8-12 will focus on the IBM Z mainframe, and the subset of IBM Storage products that offer synergy with IBM Z, such as the DS8880 storage system and the TS7760 Virtual Tape Engine.
October 15-19 will focus on IBM Power Systems and the entire IBM Storage portfolio.
When I first learned of this, I was not aware there was a city called Hollywood in Florida. The Hollywood in Florida is situated between Fort Lauderdale and Miami, so you can fly into either of those two airports to get to the conference.
(Did you know? The Hollywood most people know in California is no longer its own city, but rather incorporated as a neighborhood district into Los Angeles back in 1910. There are actually thirty different places called "Hollywood" around the world, two dozen in the United States, with the rest scattered in Ireland, Turkey, Russia, Singapore and the Philippines. Not all of these are formally "cities", but in some cases neighborhoods, districts, unincorporated areas, or other populated places. The Hollywood in Maryland claims to be the first, established in 1867!)
I plan to attend only the second week, October 15-19. Here are some highlights:
In the past, IBM had keynote sessions for each brand, for example, one focused on IBM Power systems, and another on IBM Storage. However, these were scheduled during the same time slot, forcing some people to make a tough choice.
To solve this, the two keynote sessions will be staggered, so attendees can attend both!
The storage keynote will take on a new format, with a panel of experts. I have been invited as one of the experts to participate! If there is a particular topic you want to hear about on the panel, please enter your comments below.
As with most conferences, there is a "Call for Papers" requesting speakers submit the topics they can present, and then conference coordinators accept, adjust or reject them in building the final agenda.
Here are the topics I submitted:
Build your personal brand! Social Media tips from an experienced blogger
The Pendulum Swings Back - Understanding Converged and Hyperconverged Systems
IBM Hybrid and Multi-Cloud storage solutions
IBM Cloud Object Storage (powered by Cleversafe)
Managing Risks with Data Footprint Reduction
Information Lifecycle Management: Why Archive is different than Backup
The Seven Tiers of Business Continuity and Disaster Recovery
If you attended the IBM Technical University in Orlando last May, the conference in October will have six months' worth of new announcements and products to cover.
I also plan to be at the IBM Technical University events in Johannesburg, South Africa (September 11-13), and Rome, Italy (October 22-26). If you plan to be at any of these events, let me know! If not, you can follow along with Twitter hashtag: #IBMtechU
Well, it's Tuesday again, and you know what that means! IBM Announcements!
Starting today, April 1, 2014, the IBM Executive Briefing Centers (EBC) are adopting a new self-hosted model. In the past, each briefing was assigned a "Briefing Host", a member of the EBC staff, who acted as [master of ceremonies] for the day (or more) for the clients. At some locations, if there were three rooms, there would be three or more briefing hosts so that concurrent briefings could be held.
However, this method does not scale. Having a person per briefing limits the total number of concurrent briefings. Inspired by self-service provisioning and scalability of the Cloud, IBM has adopted a new methodology.
In the new model, the visiting client rep, sales rep, or IBM Business Partner will be handed instructions and a map. This will include the agenda, the schedule, biographies of each speaker, the locations of the nearest restrooms, and so on.
I can take partial credit for the idea. In 2012, I made the analogy that having briefing centers at each development lab made a lot of sense, because it allowed clients to interact directly with the engineers and executives that made development decisions. I also made the analogy that having a fully-staffed EBC was like a fire department: whether you have five briefings per month, or fifty, you need a team that is ready, staying abreast of the latest technological changes.
In my post, [Like animals in the zoo], I argued there are two kinds of zoos, the self-guided kind, where visitors are handed a map, versus the docent-guided kind, where a member of the zoo staff introduces you to each animal.
The EBC briefing hosts in this analogy were the docents, and the animals that people came to visit were the engineers and executives.
As for the fire department, IBM management flipped the analogy around. They argued that many smaller communities had "volunteer fire departments", eliminating the need to keep full-time employees doing nothing but playing cards and sliding down brass poles in between fire fighting sessions. When a fire happens, phone calls are made to notify everyone to get involved.
In my past 28 years at IBM, I have to say that you know you have good analogies when they can be used in both directions. The zoo analogy was used to prevent management from consolidating all of the EBC staff to Austin, TX. The fire department analogy helped us keep all of our lab equipment to run demonstrations.
The new self-hosted model will address both scheduling and scalability issues. We often had two-day and three-day briefings, and scheduling the rooms, and the briefing managers, based on their availability, was quite challenging.
There are three advantages to the new method:
A coordinator will merely assign rooms, no longer worrying if a briefing host is available for those days. Now, each EBC location can run at full capacity, limited only by real estate and floor space.
Subject matter experts like myself, who often did double-duty serving as briefing hosts as needed, will have more free time. I personally will be doing more "outbound briefings" to attend conferences and visit clients at their location, eliminating the time I need to be in Tucson to host "inbound" briefings.
The awkward silence that happens when the client rep, sales rep or IBM Business Partner invites all the clients and presenters, but forgets to invite the briefing host, is completely eliminated.
Ken Gibson has written a four-part series about where the storage industry is going, on his Storage Thoughts blog. You can find the four parts here (Part 1, Part 2, Part 3, Part 4).
His analysis of the storage industry is based on the concepts in Clayton Christensen's latest book, "Seeing What's Next", which follows on the heels of his last two successes, "The Innovator's Dilemma" and "The Innovator's Solution". I've only read the first book, "The Innovator's Dilemma", but need to check out the other two.
Ken explores the efforts of the incumbent players, and I agree IBM is farthest along, but not only for our "Storage Tank" architecture. For those not aware of Storage Tank, it was the code-name of a project from IBM's Almaden Research Center, productized as IBM System Storage SAN File System (SFS). Earlier this year the advanced policy-based data placement, movement and expiration features of SFS were copied over to IBM's General Parallel File System (GPFS) which has wide adoption among the High-Performance Technical Computing (HPTC) community. As I've said before, switching from one file system to another is hard, so it makes sense for HPTC clients who already use GPFS to make use of these new features by staying with GPFS, rather than trying to get them to move to SFS.
I also like Ken's analysis of "overshot" and "undershot" clients. Overshot clients are those that find what the marketplace delivers already "good enough" for their needs, and are price sensitive against paying for features they don't think they need. Undershot clients are those for whom the current offerings in the marketplace are not yet good enough, and who are willing to pay a premium to the vendor or supplier that can get them closer to what they are looking for.
Changes are afoot, and it is an exciting time to be involved in the storage industry.
SNW wrapped up Thursday. As is often the case, a lot of people have left already.
I saw two presentations worth discussing here in this blog.
Angus MacDonald, CEO of Mathon Systems, presented "Litigation Readiness: How prepared are you for the demands of eDiscovery?"
The process of eDiscovery is to take a large volume of data and get the small bits of relevance, as it relates to a case, investigation or litigation. In 2004, there were 64 billion emails per day, and this is expected to be 103 billion by 2008. There are growing concerns about the "spoliation" of evidence, which I thought was a typo, until I looked it up. He encouraged everyone to check out the Electronic Discovery Reference Model, which is trying to standardize the way IT and legal communicate with each other.
The problem is often miscommunication over semantics and terminology. For example, in eDiscovery, the term "production" describes the delivery of relevant documents to a judge or opposing party. This may involve printing them out on paper, delivering them electronically in their original format, or converting to a more standard electronic format like Adobe PDF. The judge or opposing party reserves the right to request how they want the documents produced. Of course, in any format other than the original format, authenticity needs to be affirmed.
He gave two example lawsuits related to this.
In Zubulake v. UBS Warburg, Zubulake was awarded $29 million because UBS stored old emails on backup tapes, rather than an archiving system, and could not locate seven of these backup tapes. This is not the first time I have seen some IT department, or some legal department, think that keeping backups of email repositories for many years is the same as keeping an "archive".
In Coleman Holdings v. Morgan Stanley, Coleman was awarded $1.45 billion because the judge felt that Morgan Stanley failed to do proper eDiscovery. This was after they tried to reconstruct their email system from 5000 old backup tapes.
Angus suggests identifying the types of documents most often requested, and planning from there. In an interesting twist, the CEO/CFO/CIO might go to jail if the IT department doesn't do something correctly, so perhaps IT managers will now get the respect/funding/technology they need to get the job done.
Bruce Kornfeld, Compellent Technologies, presented "Building Systems that Scale: Imagining the one Petabyte per Admin management ratio."
Bruce did a good job staying generic, and not mentioning his company's products too much. Specifically, Compellent makes a frame similar to what IBM used to call the "SAN Integration Server". Back in 2003, IBM introduced the SAN Volume Controller, which had no disk, and the "SAN Integration Server" which had controller + disk. What IBM learned was that customers prefer the diskless model, minimizing the amount of disk that has to be purchased from the original vendor, and instead opting to have the freedom to choose any vendor they like for the managed capacity.
An interesting feature of the Compellent solution is that they chop up the virtual disk into 2MB pieces, and allow these pieces to be moved automatically from high-speed (FC) to low-speed (SATA) disk, based on their reference frequency. This is similar to HSM, but at the block level, rather than the file level.
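The idea of block-level tiering by reference frequency can be sketched roughly as follows. This is purely illustrative and not Compellent's implementation: the threshold, the per-piece access counters and the function names are all my own assumptions. Only the 2MB piece size and the FC/SATA tiers come from the talk.

```python
PAGE_SIZE_MB = 2  # piece size mentioned in the talk

def page_count(volume_size_mb):
    """Number of 2MB pieces needed to cover a virtual disk (rounded up)."""
    return -(-volume_size_mb // PAGE_SIZE_MB)

def place_pages(access_counts, hot_threshold):
    """Map each piece to 'FC' if referenced often enough, else 'SATA'."""
    return {page: "FC" if hits >= hot_threshold else "SATA"
            for page, hits in access_counts.items()}

# Hypothetical reference counts per piece over some measurement window.
access_counts = {0: 120, 1: 3, 2: 0, 3: 57, 4: 1}
placement = place_pages(access_counts, hot_threshold=50)
print(page_count(10), placement)
```

The contrast with file-level HSM is that the unit being moved is a fixed-size piece of a volume, so a single large file can end up with its busy regions on fast disk and its idle regions on slow disk.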
Every advantage Bruce listed for his box already exists from IBM: improved capacity planning, improved performance, ease of data migration, flexible volumes, and a single pane of glass GUI administration tool.
Perhaps more interesting were the questions from the audience:
Q1. Do you have any customers that have 1PB of your solution? No, we have several in the 200-500TB range.
Q2. You only have a single two-node cluster, can we have more clusters? No, that is all we support, but if you need that you would have to go to one of the major storage vendors (like IBM).
Q3. Do we have to buy Compellent storage to go with the Compellent controllers? Yes, it is designed so it is an integrated solution. If you need to virtualize your existing storage, you have to go to one of the major storage vendors (like IBM).
Q4. Having data migrate automatically from FC to SATA behind the scenes lowers performance and raises the risk of disk failure? Our box is designed for inactive data, so performance is not an issue.
Q5. How do you protect against double-disk failures? We don't, and these would be even more detrimental to our solution than traditional solutions. Other vendors offer RAID6, but we don't have that yet.
It was a fun week, and good to see people I have communicated with, but never met in person.
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the afternoon sessions of Day 2.
IBM Spectrum Protect deep dive into Container Storage Pools
Ron Henkhaus, IBM Certified Consulting IT Specialist, presented the new Spectrum Protect concept of "Container Pools" that can either be "Directory Pools" on SAN or NAS-based disk storage, or "Cloud Pools". Container pools can contain deduplicated and non-dedupe data.
Ron cautioned that directory pools should not be placed on the same file system as your Spectrum Protect database or logs. Also, best practice for any directory pool is to assign it an "overflow" pool that is a non-directory pool, such as disk, tape or cloud container.
Cloud pools can use OpenStack Swift, V1 Swift, Amazon S3 protocol, Amazon Web Services, IBM Bluemix, or IBM Cloud Object Storage. You can pre-define the vaults and buckets in the configuration.
For off-premises Cloud pools, the data is encrypted by default. For other container pools, encryption is optional. Performance to Cloud pools has been improved by using "accelerator storage", basically a disk cache to collect data before sending over to the Cloud pool. Backups to Cloud pools can reach 8 TB per hour. Restore rates vary from 500 to 1500 GB per hour.
Container Pools were designed for the new "Deduplication 2.0" feature introduced in version 7. Traditional Dedupe 1.0 to Device Class FILE is still available, but not recommended.
Version 7.1.6 changed the compression algorithm from LZW to LZ4. In all cases, Spectrum Protect performs these actions in this order: deduplication, compression, encryption. Data that is encrypted by the Spectrum Protect client is therefore not deduped.
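Here is a small sketch of why the order matters. It is not Spectrum Protect code: `toy_encrypt` is a deliberately simple stand-in that I made up (not real cryptography), zlib stands in for LZ4 because it ships with Python, and the fixed 1 KB chunks are hypothetical. It assumes encryption adds a random nonce, in which case identical chunks stop looking identical once encrypted.

```python
import hashlib
import os
import zlib

def toy_encrypt(chunk, key):
    """Toy stream cipher with a random nonce. NOT real cryptography."""
    nonce = os.urandom(16)
    stream = b""
    counter = 0
    while len(stream) < len(chunk):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return nonce + bytes(a ^ b for a, b in zip(chunk, stream))

def dedupe(chunks):
    """Keep one copy of each distinct chunk, keyed by its SHA-256 digest."""
    return {hashlib.sha256(c).hexdigest(): c for c in chunks}

key = b"hypothetical-key"
chunks = [b"A" * 1024] * 4 + [b"B" * 1024] * 4  # eight chunks, two distinct

# Server-side order: deduplicate, then compress, then encrypt.
unique_plain = dedupe(chunks)
stored = [toy_encrypt(zlib.compress(c), key) for c in unique_plain.values()]

# Encrypting first: every chunk becomes unique, so nothing deduplicates.
unique_encrypted = dedupe([toy_encrypt(c, key) for c in chunks])

print(len(unique_plain), len(unique_encrypted), sum(len(s) for s in stored))
```

The same reasoning applies to compression: well-encrypted data looks random, so compressing after encryption gains little.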
The "Protect Storage Pool" command can replicate a directory pool to either a remote directory pool or Cloud pool. In addition to this remote replication, you can copy a directory pool to tape to offer air-gap protection against ransomware. Such tapes are considered part of the "Copy Container Pool". In the event of directory pool corruption, the data can be repaired from either replication or tape.
IBM Aspera can now be used for replication, using SSL and AES-128 bit encryption. If your latency is greater than 50 msec, and you have more than 0.5 percent packet loss, Aspera might help. This is available for Linux on x86 platforms running v7.1.6 or higher.
For existing customers, IBM Spectrum Protect allows you to convert your FILE, VTL and TAPE device class pools to directory or Cloud pools.
Introduction to IBM Cloud Object Storage (powered by Cleversafe)
In 2015, IBM acquired Cleversafe, recognized as the #1 Object Storage vendor. Their flagship product was officially renamed to the IBM Cloud Object Storage System, which some abbreviate informally as IBM COS. IBM offers the IBM Cloud Object Storage System in three ways: as software, as pre-built systems, and as a cloud service on IBM Bluemix (formerly known as SoftLayer).
Since then, IBM has been busy integrating IBM COS into the rest of the storage portfolio. I explained how IBM COS can be used for all kinds of static-and-stable data, but is not suited for frequently changed data, such as virtual machines or databases.
Object storage can be accessed via NFS or SMB NAS protocols using a gateway product, like IBM Spectrum Scale, or those from third-party partners like Ctera, Avere, Nasuni or Panzura. It can also be used as an alternative to tape for backup copies, and is already supported by the major backup software like IBM Spectrum Protect, Commvault Simpana, or Veritas NetBackup.
While other cloud service providers have offered data storage in the cloud, this new offering also allows hybrid configurations with geographically dispersed erasure coding.
Unlike RAID which protects against the loss of one or two drives, erasure coding can protect against a larger number of concurrent failures. For example, using an Information Dispersal Algorithm (IDA) of "7+5", where seven pieces of data are encoded on twelve independent disks, the system can lose up to five disk drives without losing any data.
Combining this with Geographically Dispersed Configuration across three or more sites means that you can lose an entire data center, four of the twelve disks, and still have instant full access to all of your data from eight drives at the other locations. In the graphic, you see two on-premise data centers combined with a third location in IBM SoftLayer.
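The arithmetic behind the "7+5" example is easy to check. The sketch below is just that arithmetic, not the IDA itself; the function and field names are mine, and it assumes the twelve pieces are spread evenly across the sites.

```python
def ida_summary(width, threshold, sites):
    """Describe an IDA where any `threshold` of `width` pieces rebuild the data."""
    slices_per_site = width // sites  # assumes an even spread across sites
    return {
        "tolerated_failures": width - threshold,
        "expansion_factor": width / threshold,
        "slices_per_site": slices_per_site,
        "survives_site_loss": width - slices_per_site >= threshold,
    }

# 7 data + 5 extra = 12 pieces, spread over three sites.
summary = ida_summary(width=12, threshold=7, sites=3)
print(summary)
```

With twelve pieces over three sites, losing a whole site removes four pieces and leaves eight, one more than the seven needed. The raw capacity consumed is 12/7, or roughly 1.7 times the data stored.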
New Generation of Storage Tiering: Simpler Management, Lower Costs, and Improved Performance
With ever-changing amounts of storage, it is hard to find metrics that are consistent year to year. Fortunately, I found I/O density to be a metric on which to focus my efforts, armed with real data from Intelligent Information Lifecycle Management (IILM) studies done at various clients. From that, I was able to talk about storage tiering on three fronts:
Storage tiering between Flash and disk. IBM FlashSystem and IBM Easy Tier on DS8000 and Spectrum Virtualize family for hybrid Flash-and-disk configurations.
Storage tiering between disk, tape, and Cloud. HSM and Information Lifecycle Management (ILM) on Spectrum Scale, Elastic Storage Server (ESS), Spectrum Archive and IBM Cloud Object Storage System.
Storage tiering automation across your entire environment. IILM studies can help identify a target mix of Tier 0, Tier 1, Tier 2 and Tier 3 storage. IBM Spectrum Storage Suite and the Virtual Storage Center (VSC) can recommend or perform the movement of LUNs to more appropriate tiers, based on age and I/O density measurements.
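As a rough illustration of the metric, I/O density can be computed as IOPS per GB of capacity, and a tier suggested from it. The thresholds and LUN figures below are hypothetical values I chose for the example, not numbers from any IILM study or IBM product, and the sketch ignores data age, which the real tools also weigh.

```python
# Hypothetical minimum I/O density (IOPS per GB) for each tier, highest first.
DEFAULT_THRESHOLDS = ((1.0, "Tier 0"), (0.5, "Tier 1"), (0.1, "Tier 2"))

def io_density(iops, capacity_gb):
    """I/O operations per second for each GB of capacity."""
    return iops / capacity_gb

def recommend_tier(density, thresholds=DEFAULT_THRESHOLDS):
    """Return the first tier whose minimum density is met, else Tier 3."""
    for minimum, tier in thresholds:
        if density >= minimum:
            return tier
    return "Tier 3"

# Made-up LUNs: name -> (average IOPS, capacity in GB).
luns = {"db_log": (4000, 2000), "file_share": (300, 1000), "archive": (20, 4000)}
recommendations = {name: recommend_tier(io_density(iops, gb))
                   for name, (iops, gb) in luns.items()}
print(recommendations)
```

Because the metric is a ratio, it stays comparable from year to year even as raw capacities and IOPS figures grow.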
It's hard to say what the correct sequence of presentations should be. Some thought it might have been better to have my talk on IBM Cloud Object Storage System precede Ron's talk on Cloud container pools, but perhaps hearing Ron first helped drive more interest to my session.
Today, I met with Teresa Ferraro and Mike Buttrum from FirstRain in their Manhattan office in downtown New York City. IBM recently contracted FirstRain to provide IBMers like myself with analytics on publicly-available news to keep us informed for business meetings. Here's how IBMers can get the most out of this service.
Basically, FirstRain takes a list and generates the best summaries of publicly-available news that are most relevant. You can organize into different channels. Here I have seven channels.
Companies to watch refer to existing or prospective clients that I plan to be talking with soon. Some of my colleagues are assigned to specific clients, so they can set this up once and enjoy the news for the rest of the year. I, on the other hand, meet with different clients every week, so I will be updating this list on a frequent basis.
I have divided the Competitors between major ones, and smaller startups. Since I am often working with business partners and distributors, I made that a separate channel as well.
For product lines, I picked three: Data migration, Data storage solutions, and Software defined storage.
For conferences where I don't know which companies will attend, such as the IBM Technical University, I can set up information by territory. Here is one for Brazil.
I also attend industry-oriented events, so I can pick those vertical markets that might be helpful with dinner conversations. In this example, I chose Energy, Electric Utilities and Gas Utilities.
Once you have your channels configured, you get your results in various sections:
Management Changes lists any changes in top C-level positions, who left the company, who got recently hired.
Key Developments indicates news like mergers and acquisitions and government regulations.
First Reads prioritizes the top six articles for your channel. You can access more, but these six will get you started as you have your morning coffee.
First Tweets gives you the six most relevant tweets, if those articles above were just "TL;DR"
A section on Business Influencers and Market Drivers is interesting to see who the big players are, and what topics are driving the most conversation. Here's an example from my Energy/Electric/Gas channel:
The Most Talked About section covers quotes and commentary about the most talked about companies in your channel.
With most news sources focused on politics, weather and celebrity gossip, it is nice to have a quicker, more focused approach to get the news I need to prepare for my client briefings. Special thanks to my hosts Teresa and Mike for their hospitality!
I am back safely from my travels to New Zealand and Australia, and would like to wish everyone today a Happy [Earth Day]!
The Tucson area has been continuously inhabited by people for the past 3,500 years. One of the great challenges for this arid desert region is water. Recently, Tucson was selected for a [2013 IBM Smarter Cities Challenge] grant. Here is an excerpt from a blog post by Tucson Mayor Jonathan Rothschild titled [Ensuring Tucson's Water Future]:
"One critical area for cost-effective investment is technology. We are converting all of our customer water meters to digital in order to reduce the amount of labor required to manually read all the 225,000 customer meters each month. And we are replacing our Supervisory Control and Data Acquisition (SCADA) system in order to improve our ability to control and manage our water distribution system.
I was pleased that Tucson was selected for a 2013 IBM Smarter Cities Challenge grant. As a result, a team of senior IBM executives came to Tucson for three weeks to listen to our story, learn about our water system and lend their expertise. They came from North Carolina, Texas, New York, California and Virginia to learn about how one of the most arid American cities is setting the standard for wise water use. The IBM team lived in our community and worked with the Tucson Water Department. They learned a great deal and helped us even more.
The Smarter Cities team's final report delivered exactly what we were looking for. It contained a roadmap with both shorter and longer term recommendations. The report did not recommend additional investments beyond our means, but it did make an effective case for the timing and scheduling of our planned investments – recommendations which will help us achieve better near-term results while we develop sustainable practices for this ongoing project. The four areas of improvement detailed in the roadmap were:
Improve customer service with automated metering
Modernize our meter management systems
Implement advanced operations management systems
Build additional capacities for our existing information technology systems
It's clear that IBM has made a strategic decision to focus on the opportunities and challenges facing cities around the world through its Smarter Cities program. They understand that a city is a 'system of systems,' and that comprehensive analyses of the ways these systems interact with one another and with the populations they serve are critical to improving the quality of life of citizens everywhere. IBM's selection of Tucson as a global smarter city has given us the chance to demonstrate that we have some of the highest standards for resource management, conservation, financial planning and community engagement for municipal water departments anywhere in the United States."
While this is certainly good for the environment, IBM's focus on helping the Earth become a smarter planet has been good for its bottom line as well. According to the latest 1Q 2013 financial results, IBM revenues related to Smarter Planet initiatives, including the Smarter Cities campaign, have increased 25 percent year-to-year.
This week, I am presenting at the IBM Systems Technical University for Storage and POWER Systems. This conference is being held in New Orleans, Louisiana, October 16-20, 2017, at the beautiful Hyatt Regency.
This is my recap for sessions on Day 2 morning.
FlashSystem A9000 and A9000R Overview
Andy Walls, IBM Fellow, CTO and Chief Architect, and Brent Yardley, IBM STSM and Master Inventor, co-presented this session. This was the "deep dive" of the A9000/R, a basic continuation of the one they did yesterday.
The Pendulum Swings Back -- Understanding converged and hyperconverged integrated systems
With IBM's partnership with Nutanix, this has become a particularly popular topic. I cover the last 50 years of storage evolution, from internal storage and external storage to NAS and SAN storage networks.
More recently, people have been willing to give up all those gains for something simpler, less powerful, less reliable, less expensive. Enter Converged and Hyperconverged Systems. IBM PureSystems and VersaStack lead the pack for Converged Systems, along with IBM Spectrum Scale, Spectrum Accelerate and Nutanix on IBM Power Systems for Hyperconverged Integrated Systems.
New Generation of Storage Tiering -- Less Management, Lower Costs, and Improved Performance
There are orders of magnitude between the fastest All-Flash Array and the least expensive tape storage. Ideally, there would be a "slider bar" that allowed people to select from the fastest to the least expensive. IBM offers a variety of solutions to provide this "slider bar", with automation to move data as needed between tiers.
I start with IBM Easy Tier, available on DS8000 and Spectrum Virtualize products, move on to IBM Virtual Storage Center, where advanced analytics moves data to the right location, and finish with IBM Spectrum Scale, which provides the ultimate tiering across multiple locations, between flash, disk and tape.
The lunches at these conferences are amazing, but then the "Big Easy" is known for its food!
This week, I am presenting at the IBM Systems Technical University in Orlando, Florida, May 22-26, 2017. Here's my recap of the afternoon sessions of Day 1.
Storage Brand Opening Session - Craig Nelson
Craig Nelson, Brocade manager for IBM Field Sales Channel, indicated the network equipment is the bridge that brings servers and storage together.
The squeeze -- faster servers and Flash storage cause storage networking to become the bottleneck. Fibre Channel will remain the protocol of choice for the next decade.
"Speed is the net currency of Business" -- Marc Benioff, Salesforce CEO.
Craig drew an analogy. We have been focused on making hard disk drives faster, and then Flash changed the game. Likewise, car manufacturers have focused on making gas engines better, and then Tesla Motors introduces an electric car with insane performance. The early models actually had an "Insane Mode".
The new Gen6 models of IBM b-type SAN equipment will support 32Gbps and 128Gbps ports. That's Insane!
Later models of Tesla Motors offer a "Ludicrous Mode". For flash storage, it is NVMe. NVMe can get storage down to 20 microsecond latency. That's Ludicrous!
Craig put in a plug for two Brocade sessions: "BEWARE - The four potholes on your road to success when deploying flash storage" and "Tune up your storage network! Is it healthy enough for flash storage and next-gen server platforms?"
Storage Brand Opening Session - Clod Barrera
Clod Barrera, IBM Distinguished Engineer and Chief Technical Strategist, presented storage industry trends.
IDC predicts data capacity to grow 60-80% CAGR. This would require 44 percent drop in $/GB per year to maintain flat budget. Unfortunately, flash media cost is only dropping 25-30 percent per year, and spinning disk only 19 percent per year.
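The 44 percent figure follows from simple arithmetic: if capacity grows by a factor of (1 + g) each year, the price per GB has to fall to 1/(1 + g) of its previous level to hold spending flat. Here is a minimal sketch, assuming "flat budget" means constant annual spend on raw capacity. The function name is my own, not something from Clod's slides.

```python
def required_price_drop(capacity_growth: float) -> float:
    """Fractional yearly drop in $/GB that keeps spend flat
    when capacity grows by `capacity_growth` (0.80 means 80%)."""
    return 1.0 - 1.0 / (1.0 + capacity_growth)


# The two ends of the 60-80% CAGR range quoted above.
for growth in (0.60, 0.80):
    drop = required_price_drop(growth)
    print(f"{growth:.0%} growth -> {drop:.1%} drop in $/GB needed")
```

At 80 percent growth this works out to about 44 percent, and at 60 percent growth to 37.5 percent, both well above the media cost declines quoted above.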
Since storage media will not offset capacity growth, we need other technologies to compensate, including compression, deduplication, defensible disposal, and "cold" storage to tape or optical media.
The smallest persistent storage that IBM has been able to achieve is 12 atoms. Current disk technology is 1200 atoms. Since 1956, IBM and the rest of the IT industry have improved storage 9 orders of magnitude, and now there are only 2 orders of magnitude left.
Clod poked fun at the "Star Wars: Rogue One" movie, indicating that their idea of the future of storage was a huge tape library. See my December 2016 blog post [Has your data gone rogue?]
What does it take to store information forever? Tape will certainly be around. IBM Zurich demonstrated a 220TB tape cartridge back in 2015 as proof of technology.
A good example of the need for long-term retention is US films. Of those from the silent era, over 90 percent are lost. Over half of the films prior to 1950 are lost. The silver nitrate film stock that the reels were made of has deteriorated. Now that more movies are made digitally, can we do better?
Clouds will move from 10GbE to 25GbE. No slowdown for FC in datacenters. Flash storage and object storage are both growing quickly.
Move over Software-Defined Storage, Converged and Hyperconverged systems: the new up-and-coming thing is "Composable Systems deployed in Pods", adjustable hourly by workload requirements.
To protect against Ransomware, use "air gap" protection, not on the same network as production workload.
New storage models are needed for Cognitive workloads. Clod put in a plug for Joe Dain's presentation "Introducing cognitive index and search for IBM Cloud Object Storage leveraging Watson"
Storage Brand Opening Session - Axel Koester
Axel Koester, IBM Storage Chief Technologist, presented more storage industry directions.
What will the world look like in 10 years? Today it is mostly procedural programming, with some statistical big data, and a bit of machine learning. In 10 years, it will be mostly statistical and machine learning, with very little procedural programming. Why? Because it is faster to train computers with Machine Learning than to program procedurally.
Examples of machine learning are IBM Watson, Google AlphaGo, drive-AI. Axel would rather be a passenger in a machine-learned self-driving car, than a procedurally-programmed one.
Neural networks to interpret hand-written numbers. Welcome to "Unsupervised learning".
Deep Learning, a subset of Machine Learning that uses three or more layers of neural networks, had a major breakthrough in 2006. For example, face recognition "deep learning" algorithms can also be used to detect defects through visual inspection of circuit boards.
How does this impact storage?
Procedural -- archive test cases used
Statistical -- store all data for parallel processing
Machine Learning - train sample data, then archive and re-train yearly. Driving 5 minutes = 4 TB of sensor data used for self-driving cars
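To put the self-driving figure in perspective, here is a quick back-of-the-envelope conversion. It is a sketch that assumes decimal terabytes and a constant data rate, neither of which Axel specified.

```python
TB = 10**12  # assumption: decimal terabytes
GB = 10**9


def sensor_rate_gb_per_s(total_tb: float, minutes: float) -> float:
    """Average data rate in GB/s for `total_tb` collected over `minutes`."""
    return total_tb * TB / (minutes * 60) / GB


rate = sensor_rate_gb_per_s(total_tb=4, minutes=5)
tb_per_hour = 4 * (60 / 5)

print(f"{rate:.1f} GB/s sustained, or {tb_per_hour:.0f} TB per hour of driving")
```

That is roughly 13 GB per second, which helps explain why the sample data is archived after training rather than kept online.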
For Neural processing, x86 CPUs are suitable for prototyping. GPU co-processors are better: efficient, but uncommon. IBM has developed the "TrueNorth" chip, which does nothing but Neural processing, with 4096 cores and only 70 mW of energy consumption. It has no clock; instead it has dendrites, synapses, axons and neurons.
Instead of "Build or Buy?" the new question is "Train or Buy?" Train with confidential data, or buy ready-to-run 100% pre-trained cognitive systems as a service.
AI Frameworks are available on Docker containers with Kubernetes with Persistent storage (Ubiquity) such as Spectrum Scale. These frameworks include DL4J, Chainer, Caffe, torch, theano, tensorflow.
NVMe -- NVM is local only, how to do HA and DR? There are three options:
DB asynchronous shadowing
DB mirroring over NVMeOF
Cluster file system replication of persistent data, such as IBM Spectrum Scale
Example: a car manufacturer with 50 SAP HANA in-memory instances on 4 Spectrum Scale nodes. IBM achieved 50,000 new files per second. Most NAS systems do much less.
Faster media on smaller electronics: Holmium atoms on Magnesium Oxide on a silver base, resulting in "single atom storage." An STM (scanning tunneling microscope) needle tip magnetizes the atom, and the result is measured with Tunnel Magneto-resistance. Unfortunately, reading the data causes it to lose its value, so it is not as persistent as the 12-atom method described by Clod earlier.
As the title suggests, I explained why there is so much interest in Software-Defined Storage in the IT industry, what software-defined storage is, and how to deploy these solutions in your existing infrastructure without the full rip-and-replace. I covered which IBM products are available as software, pre-built systems and/or Cloud services.
My session on IBM Cloud Object Storage had three sections. First, I covered an overview of what "Object Storage" was in general, how this differs from traditional block or file storage approaches.
Second, I explained what is unique and different about IBM Cloud Object Storage System, formerly called DsNet from Cleversafe. IBM acquired Cleversafe in 2015.
Third, I explained the various applications, use cases and industries that can take advantage of Object Storage.
IBM Storage and the NVMe Revolution
Brian Sherman, IBM Distinguished Engineer for Storage Advanced Technical Services, presented an overview of NVMe, NVMe Over Fabric (NVMeOF) and what IBM is doing in this area.
How to Build a Rockstar Personal Brand
Andrea Edwards, The Digital Conversationalist, is a globally award winning B2B communications professional with more than 20 years' worth of experience from around the globe, including 12 years exclusively in Asia Pacific. IBM has hired her in the Asia Pacific region to train many IBMers in Social Media.
She condensed her normal 5-6 hour training down to a single hour for this event. She explained why building a personal brand was important, how to do it, and why businesses and organizations should encourage their employees to do so.
For example, who has the most influence on most people? Behind friends and family are bloggers. Bloggers are more influential than journalists, religious leaders, celebrities and politicians.
(As the #1 blogger of IBM, I am considered to already have a "rockstar personal brand". I am pleased to see that IBM is taking social media seriously. I have been blogging since 2006, and have influenced over $4 billion US dollars in IBM revenue in the past 11 years.)
IBM Spectrum Virtualize technical updates
Andrew Martin, IBM Spectrum Virtualize Support Architect, presented the last 18 months of enhancements to Spectrum Virtualize, from v7.6.1 introduced in March 2016 to v7.8.1 released earlier this year.
He managed to highlight quite a few enhancements:
Distributed RAID 5 and RAID 6
Integrated Compresstimator tool
New hardware: SVC, Storwize V7000 Gen2+, Storwize V5000 Gen 2, and 92-drive 5U High Density Expansion Enclosure
N-Port ID Virtualization (NPIV)
Virtualization Over iSCSI
Encryption for Distributed RAID Arrays
64GB Read Cache
Tier 1 Flash Support
Compressed IP Replication
Spectrum Virtualize as Software for Lenovo and SuperMicro servers
Host Clusters and Throttling
Raised limit to 10,000 Volumes
Transparent Cloud Tiering
Storwize Model Conversions
IBM SKLM Support for Encryption
Consistency Protection for Metro and Global Mirror remote-distance replication
Andrew called this a "reverse roadmap": rather than presenting where we are going in the next 18 months, he presented where we have been.
Solution Center Reception
Here I am with Morgan Tracey and Jenna Brooker from Computer Merchants, an IBM Business Partner.
Not only were Computer Merchants a sponsor with a booth at the Solution Center, but they also gave a customer testimonial at one of the breakout sessions on how they were able to use IBM Artificial Intelligence to help with their business.
I also spent time at the SuSE booth. SuSE is a distributor of Linux that runs on x86, POWER and IBM Z mainframe systems.
While I was working, Mo took a tour to Phillip Island. On the way, they stopped at Maru to feed kangaroos and take pictures with Koala bears.
At Phillip Island, Mo watched penguins come out of the ocean, waddle up on shore and march to their burrows. This happens every evening and is one of the top tourist attractions near Melbourne.
New Generation Storage Tiering: Less Management, Lower Investment and Increased Performance
This was not just an update to my session last year in Brussels, Belgium. Rather, I decided to start over and focus on I/O density as the guiding metric, armed with real data from Intelligent Storage Tiering Analysis (ISTA) studies done at various clients. From that, I was able to talk about storage tiering on three fronts:
Storage tiering between Flash and disk. IBM FlashSystem and IBM Easy Tier on DS8000 and Storwize family for hybrid Flash-and-disk configurations.
Storage tiering between disk and tape. HSM and Information Lifecycle Management (ILM) on SONAS, Storwize V7000 Unified and LTFS-EE.
Storage tiering automation across your entire environment. ISTA studies can help identify a target mix of Tier 0, Tier 1, Tier 2 and Tier 3 storage. SmartCloud Virtual Storage Center can recommend or perform the movement of LUNs to more appropriate tiers, based on age and I/O density measurements.
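To make the I/O density idea concrete, here is a minimal sketch of how a volume might be matched to a tier. I define I/O density as average IOPS per GB of capacity. The threshold values and the `Volume` fields are hypothetical, chosen only for illustration; they are not taken from ISTA or SmartCloud Virtual Storage Center, and a real study would derive cut-offs from each client's measurements.

```python
from dataclasses import dataclass

# Hypothetical cut-offs in IOPS per GB, highest tier first.
TIER_FLOORS = [(1.0, "Tier 0"), (0.5, "Tier 1"), (0.1, "Tier 2")]


@dataclass
class Volume:
    name: str
    capacity_gb: float
    avg_iops: float


def io_density(vol: Volume) -> float:
    """Average IOPS per GB of allocated capacity."""
    return vol.avg_iops / vol.capacity_gb


def suggest_tier(vol: Volume) -> str:
    """Return the first tier whose density floor the volume meets."""
    density = io_density(vol)
    for floor, tier in TIER_FLOORS:
        if density >= floor:
            return tier
    return "Tier 3"


volumes = [
    Volume("oltp_db", capacity_gb=100, avg_iops=500),
    Volume("file_share", capacity_gb=2000, avg_iops=400),
    Volume("archive", capacity_gb=1000, avg_iops=10),
]
for vol in volumes:
    print(f"{vol.name}: {io_density(vol):.2f} IOPS/GB -> {suggest_tier(vol)}")
```

The point of the metric is that a small, busy volume can deserve Flash while a much larger but quiet one does not, regardless of raw capacity.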
Next Generation FlashSystem 840 and V840, Architecture Deep Dive
Detlef Helmbrecht, from the IBM Advanced Technical Skills team in Germany, presented this deep dive in our latest IBM FlashSystem offerings. He started with an analogy. Latency is like a single car driving down an empty highway. IOPS, on the other hand, is like a lot of cars stuck in slow traffic, with all lanes filled on the autobahn. While there are more cars transported on a full highway, the individual cars are not driving very fast. Flash versus disk has similar comparisons.
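The highway analogy maps onto Little's Law, which relates throughput to concurrency and latency: IOPS equals the number of outstanding I/Os divided by the time each one takes. Here is a small sketch with purely illustrative numbers, not measurements of any FlashSystem or disk array.

```python
def iops(outstanding_ios: int, latency_seconds: float) -> float:
    """Little's Law: throughput = concurrency / time per operation."""
    return outstanding_ios / latency_seconds


# Illustrative values only.
one_fast_car = iops(outstanding_ios=1, latency_seconds=0.0002)  # 0.2 ms per I/O
full_highway = iops(outstanding_ios=64, latency_seconds=0.008)  # 8 ms per I/O

print(f"low latency, queue depth 1:   {one_fast_car:,.0f} IOPS")
print(f"high latency, queue depth 64: {full_highway:,.0f} IOPS")
```

In this made-up case the congested highway still moves more cars in total (8,000 versus 5,000 IOPS), but each individual I/O takes 40 times longer to arrive.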
Detlef explained the differences between the previous FlashSystem 810/820 and the new 840, and talked about the FlashAdapter 90 now available as a PCIe card.
Finally, we talked about SAN Volume Controller combined with Flash, and the new FlashSystem V840 which combines SVC and FlashSystem 840 to have an incredibly function-rich, robust solution.
Data Footprint Reduction - Understanding IBM Storage Efficiency Options
My last session of the week! This session covered all of the various technologies for data footprint reduction, including Thin Provisioning, Space-efficient FlashCopy and snapshots, Real-time compression and data deduplication. Frankly, I wasn't expecting many people to attend the last session of the last day, but nearly 50% of the seats were filled, so I was quite pleased with the turnout.
Fun Fact: Istanbul is considered by TripAdvisor in 2014 as the #1 most popular city to visit in Europe!
"Information is moving—you know, nightly news is one way, of course, but it's also moving through the blogosphere and through the Internets." --- George W. Bush
As multinational companies transition to become globally integrated enterprises, information is going to move across national boundaries. Laws that pertain to how data is stored and accessed need to be addressed.
Jon W Toigo over at DrunkenData.com discusses an Interesting proposal on Google Censorship. The New York Sun reports that NYC comptroller William Thompson Jr. is targeting both Google and Yahoo over their policies of abiding by the local laws in each country they do business in. The proposal includes asking Google to fight local laws, publicize when Google complies with local laws, and publicize when local governments ask Google to comply with their laws. While Toigo focuses on Google, this issue applies to Yahoo, Microsoft, and many other companies that do business in multiple countries.
I admire when government officials use diplomacy to influence the policy of other governments, and when individuals act to influence the policies of those who govern them, but Thompson is doing neither. In this matter, Thompson is trying to influence the policies of another government outside his jurisdiction, as a manager of investments in companies that do business there. Investors have two choices when trying to influence how companies do business.
Stop investing in those companies
Purchase shares, and vote your portion of the shares.
It appears Thompson is exercising the latter, proposing that this issue be brought to shareholder vote via proxy. There can only be two results from such a vote, either:
Shareholders vote for it, and Google changes the way it does business in this and other countries, possibly stops doing business in countries that don't appreciate hegemony.
Shareholders vote against it, and Google continues to do a great balancing act, complying with laws and their own corporate culture.
Did we forget that we have censorship in the USA as well? Would Thompson's proposals apply to the rules and regs that our own government requires?
IBM does business in most, not all, countries on this planet. In the countries we don't do business in, we have good reason not to. For the countries we do, we comply with all the laws that apply in each case. When I travel to these countries, including some of the countries specifically targeted by this proposal, I must abide by their laws. No exceptions.
The world is shrinking, and technologies now allow companies to become globally integrated. Before writing "The World Is Flat", Thomas Friedman wrote a book titled The Lexus and The Olive Tree, which covers all the various issues related to conflicts between global companies and the countries and cultures they do business in.
This reminds me of the wisdom of the Prime Directive introduced in the late 1960s on the popular TV show "Star Trek". The concept was simple: honor the sovereignty of other cultures, on other worlds, and play by their rules when you are on their planet. I say "wisdom" in that it took me years to truly appreciate this idea. Initially, I considered this just a plot device to introduce conflict each time the captain and crew of the starship "Enterprise" visit a new location and discover a culture different than their own. But over the years, as I have traveled to many countries, I began to see and understand the wisdom of the "Prime Directive", and it applies as much now, in real life, as it did back then in the futuristic 1960s TV show.
Who are we to say that our way of doing things is the one and only way to do them?
Amy Hirst, IBM Director, z Systems, Power, & Storage Technical Training, kicked off the general session.
Dr. Seshadri "Sesha" Subbanna, IBM Corporate Innovation and Technology Evaluation, asked the audience what capability is needed to drive business growth. A recent poll indicated that the ability for businesses to innovate was the number one response.
The IT industry has had its own version of growth. Consider that the Apollo 11 [Guidance Computer] used to land a man on the moon had just 4KB of RAM and 36KB of ROM. A typical smartphone has 62,000,000 times as much.
The Apollo missions led and motivated the Integrated-Circuit technology, but soon, maybe in the next 10 years, Dr. Subbanna feels that Silicon may run its course. Today, both POWER8 and z13 servers are based on 22nm. IBM has projected possible reductions to 17nm, 13nm, 10nm, and finally 7nm. That's it; smaller than 7nm may not be possible without hitting atomic issues.
The City of Rio de Janeiro, Brazil is a good example. In 2010, heavy rains resulted in flooding and landslides that killed over 110 residents. To prevent such high death rates in the future, IBM helped the city government with predictive analytics and forecasting that allow "rain simulations" to see how well the city can handle different situations.
IBM is already looking for a more holistic view of systems, and new technologies like cognitive computing. New 3D technology allows various chip technologies to be stacked as layers on a single chip. For example, you could have compute on the bottom layer, flash non-volatile storage in middle layers, and networking at the top layer. Connecting the layers is merely a matter of drilling holes and filling them with metal.
The idea that compute is the center of the universe, with a mainframe server surrounded by input and output "peripheral" storage devices, is giving way to a more storage-centric model, where central storage repositories (or data lakes) are accessed by "peripheral" smartphones, tablets and a variety of servers. For example, the IBM DB2 Acceleration Appliance acts as a storage-centric model that IBM z System mainframes can connect to, send data in, process complex database queries, and get the results 2000x faster.
In another client example, IBM helped a bank in China to determine optimal placement of bank branches, based on public information of average salary levels of each neighborhood.
CPU processors are also getting help from co-processor accelerators like GPU (Graphical Processing Unit) and FPGA (Field Programmable Gate Arrays). Comparing a single IBM POWER8 server that is CAPI-attached to an IBM FlashSystem to a stack of x86 servers with internal SSD, the POWER8 solution consumes 12x less rackspace, consumes 12x less electricity, and reduces per-user costs from $24/user for x86 down to $7.50/user on POWER8.
Social media, mobile phones and the Internet of Things (IoT) generate a lot of data. If you then factor in the "context multiplier effect" of all the links, connections and cross-references, you quickly see that data is growing at incredible rates.
Another issue is the difficulty of identifying application inter-dependencies. Forecasting disruptive anomalies can be quite difficult. In one example, administrators received warning messages 65 minutes before a major outage, but they did not respond in time because they were unable to understand the full implications.
Cognitive computing is different than the tabulating and programming paradigms of prior decades. It is focused on Natural Language Processing, citing evidence to support responses, and the ability to learn and improve from experience. The IBM Watson group is working with Memorial Sloan Kettering to help oncology doctors with cancer patients.
In an interesting demo, the IBM Watson computer analyzed thousands of "TED Talk" videos, and was able to respond to search queries by playing a 30-second video clip that most closely addresses the search topic.
Cognitive computing is also looking at "Neuro-Synaptic" chips that work very much like the neurons and synapses in the brain. I have seen some of this work already at the IBM Almaden Research Center in California.
The general session ended with a Q&A panel with Dr. Subbanna, Frank De Gilio, and Bill Starke.
This week, I am attending the [InterConnect Conference] in Las Vegas, Feb 21-25, 2016. This is IBM's premier Cloud & Mobile conference for the year.
Sunday, I attended a series from IBM Research talking about the latest research areas.
7110A Future Directions in Enterprise Mobile Computing
Gabi Zodik (IBM) presented. Mobile and wearables are transforming all industries. Enabling technologies are required to support the new computing models that are cognitive in nature. Real-time proactive decisions can be made based on the mobile context of a user. Driven by the huge amounts of data produced by mobile devices, the next wave in computing will need to exploit data and computing at the edge of the network.
Future mobile apps will have to be cognitive to "understand" user intentions based on all the available interactions and unstructured data. A new distributed programming paradigm is emerging to meet these needs, which has to deal with massive amounts of data and devices. While the compute and storage capacity on individual devices is small, collectively they exceed all of the servers and storage in Cloud datacenters.
7107A Wearables in the Enterprise
Asaf Adi (IBM) presented. Wearable technology is booming. It is only our imagination that will limit the number of industrial, military, consumer and healthcare applications for this new emerging technology. Wearables are transforming industries and professions, enabling new business opportunities. From a show of hands, half the audience was wearing smart technology already.
In one example, he focused on the construction industry. In the USA alone, there are thousands of workplace injuries, costing $190 billion. Wearable technologies can be incorporated into anything from a hardhat to a bright orange vest. In a steel mill, heat stress can be determined from ambient temperature and an employee's heart rate. Over time, we will have multiple wearables, communicating with each other.
In another example, he was able to make a hand gesture (waving his hand in front of his smartphone), and use that to generate a code fragment that software developers can use in any application to detect when that particular hand gesture is made.
Wearables cannot assume they are always connected to the Cloud. Take for example mining, where miners are deep below the ground. Technology to ensure safety needs to work regardless of connectivity.
Privacy is also a big concern. Wearables should not be used by employers to monitor every movement and activity of the employees.
7152A Cognitive IoT -- Today, Tomorrow and Beyond
Alessandro Curioni (IBM) presented. Today's sensors aren't up to the task of unlocking the complex links between people, places and things. To reach the next level, we need technologies that enable them to gather and integrate data from many sources, to reason over that data, and to learn from it. IBM calls this the Cognitive Internet of Things (IoT).
We already know IoT data can be used to predict maintenance needs, but what if it can also help designers engineer more reliable products from scratch? In addition, with advancements in nanotechnology and machine learning we can bring the power of cognitive to the edge—where the data is collected. Imagine tiny edge computers providing Watson services on every sensor?
It is estimated that we have 13 billion IoT sensors today, and that this will more than double to 29 billion by year 2020. This introduces new security threats, new levels of employee engagement, and fundamental shifts in business models.
Sadly, 88 percent of all IoT data is dark, meaning that it is not collected or processed for analysis. While the IT industry has done amazing things with the other 12 percent, we realize that programming techniques are too limited.
That is why cognitive is needed to unleash the value of the data. IBM Watson offers excellent capabilities, including Natural Language Processing (NLP), Machine Learning (ML), Image/Video analytics, and Text Analytics.
Manufacturers like Whirlpool are investigating use of IoT for home appliances, like refrigerators, washers and dryers. This is just the beginning, other industries including Healthcare, Retail, Oil, Mining and Farming will also benefit.
7108A Blockchain and the Future of Finance
Ramesh Gopinath (IBM) presented. Transferring products and funds today is inefficient, expensive, and vulnerable. Blockchain is an emerging fabric for transaction services. It has the potential to radically transform multi-party business networks, enabling significant cost and risk reduction and innovative new business models.
About 18 months ago, the "Blockchain" concept was not ready for business. Since then, the Linux Foundation has accepted the "HyperLedger" project, with 17 founding companies.
Imagine a company in China or India exporting a product to a company in USA. There may be 10 or so companies or agencies involved, including multiple banks, port authorities, trucking companies, etc. To hand off the equipment, and ensure all parties are paid, some 30 different paper documents may be needed. Each company maintains their own set of records, and all the middlemen take their cut.
Blockchain represents a digitally-signed, encrypted, immutable "ledger" that records all of the steps related to a particular transaction. Since each new block has a checksum of all of the previous blocks, it prevents tampering and fraud. All parties have access to all of the ledger, eliminating discrepancies between different repositories of records.
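The tamper-evidence property can be sketched in a few lines. This is a toy illustration of hash chaining only, using SHA-256 from the Python standard library; it leaves out the digital signatures, encryption and consensus that a real ledger such as HyperLedger would add, and the block layout is my own invention.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, data: str) -> None:
    """Add a block that records the hash of its predecessor."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev})


def is_valid(chain: list) -> bool:
    """Check every block's stored hash against its actual predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


ledger = []
for step in ("goods shipped", "customs cleared", "payment released"):
    append_block(ledger, step)

print(is_valid(ledger))  # True

ledger[0]["data"] = "goods never shipped"  # tamper with an early record
print(is_valid(ledger))  # False
```

Because each block's hash covers the previous block's hash, changing any earlier record invalidates every link that follows it, which is what makes quiet edits to the ledger detectable.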
This can be used to sell stocks, buy real estate, or transfer financial funds to your family overseas. Each party involved in a Blockchain has a node in a peer-to-peer network of nodes that can access a shared Blockchain request. A user initiates the transaction, and the nodes in the network use a Practical Byzantine Fault Tolerance [PBFT] protocol.
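For a sense of scale, PBFT as originally published by Castro and Liskov needs 3f + 1 nodes to tolerate f faulty or malicious ones, with 2f + 1 matching replies forming a quorum. The sketch below just works through that arithmetic; whether a particular Blockchain deployment uses exactly these parameters was not covered in the session.

```python
def max_faulty(nodes: int) -> int:
    """Largest f such that nodes >= 3f + 1."""
    return (nodes - 1) // 3


def quorum(f: int) -> int:
    """Matching replies needed in a network of exactly 3f + 1 nodes."""
    return 2 * f + 1


for f in (1, 2, 3):
    n = 3 * f + 1
    print(f"{n} nodes tolerate {max_faulty(n)} faulty, quorum of {quorum(f)}")
```

So a trade network with ten participating nodes could keep agreeing on the ledger even if three of them misbehaved.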
By providing [disintermediation], fewer middlemen in the process reduces costs, processing time, and risks. The method allows for the user's transactional privacy, but also ensures accountability and auditability.
7234A Building Cloud Infrastructure for Next-Generation Workloads
Krishna Nathan (IBM) presented. Today's cloud providers are efficient at providing today's cloud services at low costs. However, this efficiency comes with the penalty of inflexible instance types and no real guarantees on performance or quality of service.
Today's systems are organized and optimized for transactional processing, a result of evolution of the past 60 years. Relational Databases offer specific features like Atomicity, Consistency, Isolation, and Durability, known collectively as [ACID].
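As a reminder of what the "A" in ACID buys you, here is a small sketch using the SQLite module from the Python standard library. The table, account names and amounts are made up for illustration. A transfer that would overdraw an account fails part-way through, and the database rolls back the half that had already been applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


try:
    transfer(conn, "alice", "bob", 500)  # would overdraw alice
except sqlite3.IntegrityError:
    pass  # the credit to bob has been rolled back as well

print(balances(conn))
```

Both balances come back unchanged, even though the credit to bob had already been executed when the debit failed.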
However, we are expanding beyond "automating our world" to "understanding our world". This means tapping into workloads that are 90% unstructured, with multi-modal scanning, noise tolerance, variable precision and probabilistic outcomes.
Cloud Providers have used the "best practices" of transactional datacenters. Consequently, next-generation workloads that often do not share the characteristics of traditional workloads are limited in expressing their full potential because of these infrastructure limitations. Now they need to focus on four characteristics: Locality, Composability, Heterogeneity, and Dynamic resource allocation.
New workloads need a combination of CPU, GPU, NVMe, and other resources. How do you schedule which equipment to deploy for incoming workload processing that optimizes performance? By taking these factors into account, clever Cloud providers can optimize performance results to provide best fit for each workload request.
7135A Storing and Using Data in the Cloud -- Putting Together the Puzzle Pieces
Michael Factor (IBM) presented. What do OpenStack Swift, Spark, CouchDB, Kafka and ElasticSearch have in common? They are all open source, they all are available on IBM's cloud today, and they all focus on storing and using data. The trick, though, is putting these puzzle pieces together to solve real problems. You need smart integration between data services motivated by real examples from domains such as IoT, transport and retail.
There is a plethora of open services to manage data. A recent IDC Analyst study indicates that the world's data will grow from 8.6 Zettabytes today to 40 Zettabytes in 2020. Michael gave some eye-opening comparisons. If the data was stored on 10-TB hard disk drives, we could make some physical comparisons:
Imagine stacking all of those disk drives one on top of another like a stack of books. The stack today would be 22,000 kilometers high, more than halfway to geosynchronous orbiting satellites, but would be over 100,000 kilometers, way past those satellites, in 2020.
The weight of those drives today would be comparable to the weight of 1,450 Airbus 380 airplanes. In 2020, they would weigh 6,755 Airbus 380 airplanes.
If the drives were spread across the entire Mandalay Bay convention center floor, they would be 1.7 meters deep today (about 5 feet), but would be 8 meters deep in 2020.
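The stacking comparison is easy to check. This sketch assumes decimal units and a drive height of one inch (2.54 cm), which is my assumption for a typical 3.5-inch drive rather than a figure Michael gave; the geosynchronous altitude of 35,786 km is the standard value.

```python
ZB = 10**21  # zettabyte, decimal
TB = 10**12  # terabyte, decimal
DRIVE_TB = 10
DRIVE_HEIGHT_M = 0.0254  # assumption: about one inch per drive
GEO_ALTITUDE_KM = 35_786  # geosynchronous orbit altitude


def stack_height_km(zettabytes: float) -> float:
    """Height of a stack of 10-TB drives holding `zettabytes` of data."""
    drives = zettabytes * ZB / (DRIVE_TB * TB)
    return drives * DRIVE_HEIGHT_M / 1000


for label, zb in (("today", 8.6), ("2020", 40)):
    height = stack_height_km(zb)
    print(f"{label}: {height:,.0f} km, {height / GEO_ALTITUDE_KM:.1f}x geosynchronous altitude")
```

Under those assumptions, 8.6 Zettabytes needs 860 million drives and a stack just under 22,000 km, while 40 Zettabytes reaches about 101,600 km, consistent with the figures in the talk.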
Michael gave an example of the EMT Madrid bus company using real-time sensors to react to traffic conditions.
Here are the various pieces:
OpenStack Swift -- provides object storage
ElasticSearch, based on Apache Lucene - search engine, such as for metadata or queries
Apache Spark - combines SQL, streams and complex analytics, with filter pushdown support
Apache Parquet -- a column-based data format to replace row-based Comma-Separated Values (CSV) format
Apache Kafka - a message bus, works with dashDB and Secor
Beyond programming "glue", we need smart integration to get an order of magnitude boost in performance.
The first official day of the [Systems Technical University 2014] conference had keynote sessions in the morning. The conference features experts from IBM Power Systems, IBM System x, IBM PureSystems, and IBM System Storage.
The keynote sessions were started with Amy Purdy, IBM Director of Technical Training Services, the group that is running this conference.
This conference is not focused on System z solutions, as many of the System z clients were in New York City for this birthday event, but it came up several times during the keynote sessions.
(FTC Disclosure: I work for IBM, and this blog post may be considered a paid, celebrity endorsement of IBM products and services. IBM has business relationship with both Intel and Amazon mentioned during the course of the keynote sessions, but I have no financial stake in either company. I was the chief architect for DFSMS, the storage management component of the z/OS mainframe operating system, and was part of the team that ported Linux to the System z mainframe.)
Nicolas Sekkaki, IBM Vice President of Systems and Technology Group in Europe, discussed IBM's commitment to clients' privacy, the x86 and POWER server platforms, and a variety of mind-boggling announcements. He is focused on three trends: Big Data, Cloud, and Mobile.
IBM is focusing its hardware efforts on high-value, high-margin solutions such as System Storage, POWER Systems and System zEnterprise mainframe environments. Did you know that 65 percent of the world's business transactions are processed by either POWER systems or System zEnterprise mainframe?
IBM is also extending its continued focus on Linux and Open Source initiatives. For the System zEnterprise mainframes, 78 percent of our clients run Linux on System z. Over 290 clients have added the "zBX" option that allows them to run Windows and AIX on the mainframe as well. It is now less expensive to run workloads on System zEnterprise -- about 1 dollar per day per server -- than public cloud offerings from Amazon Web Services. Linux on POWER also has lower Total Cost of Ownership (TCO) than Linux-x86.
Nicolas also mentioned major changes for the POWER Systems, starting with the [OpenPOWER Consortium], formed by IBM, Google, Mellanox, NVIDIA and Tyan.
The move makes POWER hardware and software available to open development for the first time as well as making POWER Intellectual Property licensable to others, greatly expanding the ecosystem of innovators on the platform. The consortium will offer open-source POWER firmware, the software that controls basic chip functions. By doing this, IBM and the consortium can offer unprecedented customization in creating new styles of server hardware for a variety of computing workloads.
IBM POWER has switched from being "Big Endian" to being "Bi-Endian", allowing operating systems to choose between "Big Endian" or "Little Endian" modes. The Big Endian mode allows for Linux compatibility with the System zEnterprise mainframe, and the Little Endian mode for compatibility with Linux-x86.
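For readers less familiar with the terms, here is a small sketch of what "Big Endian" and "Little Endian" mean for a 32-bit integer, using Python's standard `struct` module. The sample value `0x0A0B0C0D` and the helper name `reinterpret` are chosen only for illustration.

```python
import struct
import sys

value = 0x0A0B0C0D

# Big Endian: most significant byte is stored first.
big = struct.pack(">I", value)
# Little Endian: least significant byte is stored first.
little = struct.pack("<I", value)

print(big.hex())
print(little.hex())
print(sys.byteorder)  # byte order of the machine running this script

def reinterpret(raw, byteorder):
    """Read raw bytes back as an unsigned integer in the given byte order."""
    return int.from_bytes(raw, byteorder)

# Reading big-endian bytes as little-endian gives a different number,
# which is why binary data formats must agree on byte order.
print(hex(reinterpret(big, "little")))
```

This is the kind of mismatch that makes porting software with binary data between platforms of different byte order more work, and why offering both modes eases compatibility in each direction.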
Thorston Kahrmann, Intel Account Director for EMEA, presented Intel's rich history of collaboration with IBM, from technologies like Bluetooth and PCIe Generation 3, to platforms like BladeCenter and NeXtScale, to Industry Standards.
IBM had a lot of "firsts" in the x86 server area, including the first 16-processor server, the first to offer hot-swap memory, and over 100 leading performance benchmarks.
The latest Intel Xeon chip is the E7 version 2. For example, changing from DB2 v10.1 on the old E7, to running DB2 BLU columnar acceleration on the new E7 version 2, resulted in a 148 times increase in performance. A query on a 10TB database that previously took four hours was completed in under 90 seconds.
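As a quick sanity check on how the quoted query times relate to a speedup factor, the arithmetic can be sketched as follows. The function name `speedup` is just for illustration, and whether the 148 times figure and the four-hour query come from the same measurement is not stated here, so treat the comparison as approximate.

```python
def speedup(before_seconds, after_seconds):
    """Return how many times faster the new run is than the old one."""
    return before_seconds / after_seconds

before = 4 * 60 * 60   # four hours, in seconds
after = 90             # upper bound: "under 90 seconds"

# Speedup implied by going from four hours to 90 seconds.
print(speedup(before, after))

# Run time that a 148 times speedup of a four-hour query would give, in seconds.
print(before / 148)
```

Four hours down to 90 seconds is a factor of 160, in the same range as the 148 times figure quoted for the benchmark.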
Thorston also wanted to remind the audience that nearly every System Storage product from IBM, from the high-end XIV, SAN Volume Controller, SONAS and FlashSystem V840, to midrange and entry-level Storwize products, is based on Intel's x86 processors.
Louise covered the findings from the latest 2012 CEO study, gathering insight from 1709 CEO interviews. The major focus areas for CEOs are:
Empowering employees through company-wide values
Engaging customers as individuals, rather than via demographics
Amplifying innovation with strategic and tactical partnerships
With smartphones, tablets and ubiquitous Internet access, everyone is now a technologist, so that IT is now becoming a competitive differentiator. IT projects and Business projects are no longer separate. If your IT department is seen as an expense, it will continue to get its budget cut. If, however, your IT department is part of your revenue stream, then it can be viewed as an asset.
Sadly, over 75 percent of IT projects fail: they are way over budget, delivered late, or some combination of the two. Business leaders are pushing for IT improvements, but often CIOs are too afraid to take the risks to move the business forward. Louise cited three reasons for this, which she called the three C's:
The IT and Business leaders did not fully understand the context of the project.
The content of the project was not properly defined between IT and Business architects.
The collaboration between IT and Business personnel was not properly established.
Louise wrapped up her session by asking a simple question: How much does a light bulb cost? Some might focus on the cost of the bulb itself, while others might add the cost of maintenance, having ladders and personnel to replace them as needed, and others might include the electricity consumed. Both Business and IT leaders need to focus on Total Cost of Ownership (TCO) in their planning.
(FTC Disclosure: I work for IBM. I have no financial interest in SUSE, Scality, or any other storage vendor mentioned in this post. This blog post can be considered a "paid celebrity endorsement" for IBM Storwize, IBM Cloud Object Storage, and IBM Spectrum Storage software mentioned below.)
The study takes a realistic request for 250 TB of storage, at a 25 percent compound annual growth rate (CAGR), to store infrequently accessed data in an online archive, and then looks at the Total Cost of Ownership (TCO) over a five-year period.
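To get a feel for what 25 percent CAGR does to a 250 TB starting point, here is a minimal sketch of the compound growth arithmetic. It assumes growth is applied once per year and that the starting capacity is "year 0"; the study may count years differently, and the function name `capacity_by_year` is only for illustration.

```python
def capacity_by_year(initial_tb, growth_rate, years):
    """Return capacity in TB for year 0 through the given year, compounding annually."""
    return [initial_tb * (1 + growth_rate) ** year for year in range(years + 1)]

for year, tb in enumerate(capacity_by_year(250, 0.25, 5)):
    print(f"Year {year}: {tb:,.1f} TB")
```

Under these assumptions, capacity roughly triples over five years of growth, which is why the hardware and support costs for later years weigh so heavily in a TCO comparison.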
The study compares five different Software-Defined Solutions and three pre-built systems. The Software-defined solutions come as software-only, requiring that you purchase the hardware separately and build it yourself. The three pre-built systems were chosen from the top three storage vendors in the marketplace: Dell EMC, IBM and NetApp.
The cost of support is factored in, as it should be. To keep things equal, no data reduction such as data deduplication or compression was used.
In an odd approach, the study mixes block, file and object based approaches all in the same study.
You can read the full 14-page study (linked above). I have organized the results into a single table, ranked from best to worst, color coded for the best deals in green ($100K to $200K), moderate solutions in yellow ($200K to $300K) and most expensive in red (over $300K). I put the software-only options on the left and pre-built systems on the right.
SUSE Enterprise Storage 4
IBM Storwize V5010
DataCore SANsymphony
Red Hat Ceph Storage
Dell EMC Unity 300
I am often asked, "Isn't the software-only, build-it-yourself approach, always the lowest cost option?" Now, I can answer, "Sometimes yes, sometimes no." Fortunately, IBM offers Software-Defined Storage in a variety of packaging options including software-only, pre-built systems, and in the Cloud as a service.
IBM Storwize V5010 is based on IBM Spectrum Virtualize software, which you can deploy as software-only on your own x86 servers. This was not mentioned in the study, and perhaps it is my job to remind people that this option is also available for those who want to build their own storage.
For that matter, IBM Cloud Object Storage System -- available as software-only, pre-built systems, and in the Cloud -- might also be a cost-effective alternative.
Next week I will be in Orlando, Florida for the IBM Systems Technical University. If you are attending, stop by one of my presentations, or look for me at the Solution Center at one of the IBM peds, or attend the "Meet the Experts for IBM Storage" on Thursday!
Continuing my coverage of the IBM Systems Technical University in Orlando, here are the sessions that I presented or attended on Day 2 (Tuesday).
Andrew Greenfield, IBM Global XIV Storage and Networking Client Technical Specialist, presented IBM's future plans for XIV and FlashSystem products. This was a special NDA session.
Eric Aquaronne, IBM Systems and Cloud Business Development lead, explained what OpenStack is, and why IBM is so heavily invested in its success. OpenStack is cloud management software that can be used to manage both on-premises and off-premises environments, including compute, storage and networking resources.
Software Defined Storage - Why? What? How?
Tony Pearson presented an overview of Software Defined Environments and how storage fits into this.
Suspiciously, there was a lot of overlap with Brian Sherman's presentation on Day 1. As Charles Caleb Colton would say, "Imitation is the sincerest form of flattery."
Making Sense of IBM Cloud Offerings
Jay Kruemcke, IBM Cloud Program Executive Client Collaboration Market Management Offering Manager, gave a high-level overview of IBM's various Cloud offerings from SoftLayer to Managed Cloud Services.
The Pendulum Swings Back - Understanding Converged and Hyperconverged environments
Tony Pearson presented IBM's involvement with Converged Systems like VersaStack and Hyperconverged systems with Spectrum Accelerate and Spectrum Scale software.
Next Generation Storage Tiering: Less Management, Lower Cost and Increased Performance
Tony Pearson presented Easy Tier, Storage Analytics Engine in Spectrum Control Advanced Edition, and Spectrum Scale tiering across flash, disk and tape media.
The second day ended with a "Networking" Reception in the Solution Center, serving food and my favorite grape-flavored beverages.