Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson, Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
IBM hired independent analyst firm Enterprise Strategy Group [ESG] to validate the box and run workload-specific benchmarks. I agree with Chris, the results are impressive! The report includes results from the Microsoft Exchange JetStress tool to provide insight into email performance, and another benchmark to simulate Web server IOPS.
Also, the published SPC-1 benchmark for the DS5300 puts it at about 29 percent improvement over the DS4800. Chris argues the DS5300 is similar in class to NetApp FAS3170, which IBM sells as the IBM System Storage N6070.
If you are interested in either the DS5300 or N6070, contact your local IBM Business Partner or sales rep.
Well, this is completely off-topic, but now that I have a bluetooth-enabled Thinkpad T60, I have been interested in this new wireless technology. I have a bluetooth cell phone, a bluetooth wireless headset, and my Thinkpad, and they all work together seamlessly. I am able to speak on my cell phone through my headset, listen to music and videos on my laptop through my headset, and even dial in to the IBM network through my cell phone, all without any cables!
A variation of the Wi-Fi soup-cantenna has emerged for intercepting bluetooth signals. Check out this cool BlueSniper Rifle.
"Our survey data shows that over the past 12 months, more firms have bought their storage from a single vendor. While this might not be for everyone, it's worth serious consideration for your environment. Maybe you won't get the best price per gigabyte every time, but you'll probably save money in the long run because of simpler management, increased staff specialization, increased capacity utilization, and better customer service."
A Forrester survey of 170 companies ranging from SMBs to large enterprises in North America and Europe found that more than 80 percent bought their primary storage from one vendor over the last year. That includes 64 percent of the companies with more than 500 TB of raw storage.
The report, written by analyst Andrew Reichman, says using more than one primary storage vendor can make it more complex to manage, provision and support the storage environment. And while using multiple vendors can often bring better pricing, buying from one vendor can result in volume discounts.
“You may have tried to contain costs by forcing multiple incumbent vendors to continuously compete against each other, with price as the primary differentiator,” Reichman writes. “This strategy can reduce prices and limit vendor lock-in, but it can also lead to management complexity and poor capacity utilization.”
The report recommends keeping things simple by using fewer vendors when possible. However, that advice comes with several caveats: buying all storage from one vendor means taking the bad with the good, and some vendors’ product families differ so much “they may as well come from different vendors.”
As if by coincidence, fellow blogger from EMC Chuck Hollis gives his reflections on this same topic. Here's an excerpt:
When it comes to buying storage (or any infrastructure technology, for that matter), there seem to be two camps:
Best-of-breed (i.e. multivendor): -- buy what's best, get the best price, keep all the vendors on their toes, etc. etc.
Single vendor: primarily use one vendor's offerings, and hold them accountable for the outcome.
If Chuck had said "multivendor" versus "single vendor", then that would have been a true dichotomy, but interestingly he equates best-of-breed with a multivendor approach. Let's consider two examples:
Disk from one vendor, Tape from another
Here is a multivendor strategy, and if you have a clear idea of what best-of-breed means to you, then you could pick the best disk in the market, and the best tape in the market. However, I don't think this keeps either vendor "on their toes", or helps you negotiate lower prices by threatening to switch to the other vendor. In shops like this, the staffing usually matches, so there are disk administration and tape operations teams, with little or no overlap, and little or no interest in retraining to use a new set of gear. It is true that disk-based VTL could be used where real tape libraries are used, but this may not be enough to credibly threaten your existing vendors that you will switch all your disk to tape, or all your tape to disk.
One could argue that the vendor that sells the best tape could be the exact same vendor that sells the best disk. In this case, your multivendor strategy would actually work against you, forcing you away from one of your best-of-breed choices.
Disk and Tape from one vendor for some workloads, Disk and Tape from another vendor for other workloads
Here is a different multivendor strategy. Having disk and tape from the same vendor allows you to take advantage of possible synergies. The IT staff knows how to use the products from both vendors. This strategy does let you keep your vendors "on their toes". You can legitimately threaten to shift your budget from one vendor to the other. However, whatever your definition of best-of-breed is, chances are the product from one vendor meets it, and the product from the other vendor does not. Both meet some lowest common denominator, a minimum set of requirements that allows you to swap out one for the other.
I guess I look at it differently. The equipment in your data center should be thought of as a team. Do your servers, storage and software work well together?
While Americans like to celebrate the accomplishments of individual musicians, athletes or executives, it is actually bands that compete against other bands, sports teams that compete against other sport teams, and companies that compete against other companies. Teamwork in the data center is not just for the people who work there, but also for the IT equipment. Just as a new incoming athlete may not get along well with teammates, shiny new equipment may not get along with your existing gear. Conversely, your existing infrastructure may not let the talents or features of your new equipment shine through.
While putting together the best parts from different teams might serve as a great diversion for those who enjoy ["fantasy football"], it may not be the best approach for the data center. Instead, focus on managing your data center as a team, perhaps with the use of IBM TotalStorage Productivity Center to minimize the heterogeneity of your different equipment. Pick an IT vendor that sells "team players" for your servers, storage and software, with broad support for interoperability and compatibility.
I'm glad this is the final day of the IBM Systems Technical Conference (STC08) here in Los Angeles. While I enjoyed the conference, one quickly reaches saturation point with all the information presented.
XIV Architecture Overview
Before this conference, many of the attendees didn't understand IBM's strategy, didn't understand Web 2.0 and Digital archive workloads, and didn't understand why IBM acquired XIV to offer "yet another disk system that serves LUNs to distributed server platforms." Brian Sherman changed all that!
Brian Sherman, IBM Advanced Technical Support (ATS), is part of the exclusive dedicated XIV technical team that installs these boxes at client locations, so he is very knowledgeable about the technical aspects of the architecture. He presented what the current XIV-branded model offers that clients can purchase now in select countries, and what will change in the IBM-branded model when it becomes available worldwide.
Those who missed my earlier series on XIV can find them here:
Beyond this, Brian gave additional information on how thin provisioning, storage pools, disk mirroring, consistency groups, management consoles, and microcode updates are implemented.
N series and VMware Deep Dive
Norm Bogard, IBM Advanced Technical Support, presented why the IBM N series makes such great disk storage for VMware deployments. This was clearly labeled as a "deep dive", so anyone who got lost in all of the acronyms could not blame Norm for misrepresentation.
IBM has been doing server virtualization for over 40 years, so it makes sense that it happens to be the number one reseller of VMware offerings. VMware ESX server is a hypervisor that runs on an x86 host, and provides an emulation layer for "guest Operating Systems". Each guest can have one or more virtual disks, which are represented by VMware as VMDK files. VMware ESX server accepts read/write requests from the guests, and forwards them on to physical storage. Many of VMware's most exciting features require storage to be external to the host machine. [VMotion] allows guests to move from one host to another, [Distributed Resource Scheduler (DRS)] allows a set of hosts to load-balance the guests across the hosts, and [High Availability (HA)] allows the guests on a failed host to be resurrected on a surviving host. All of these require external disk storage.
ESX server allows up to 256 LUNs, attached via FCP and/or iSCSI, and up to 32 NFS mount points. Across LUNs, ESX server uses the VMFS file system, which is a clustered file system like IBM GPFS that allows multiple hosts to access the same LUNs. ESX server has its own built-in native multipathing driver, and even provides FCP-iSCSI and iSCSI multipathing. In other words, you can have a LUN on an IBM System Storage N series that is attached over both FCP and iSCSI, so if the SAN switch or HBA fails, ESX server can fail over to the iSCSI connection.
ESX server can use the NFS protocol to access the VMDK files instead. While the default is only 8 NFS mount points, you can increase this to 32 mount points. NAS can take advantage of Link Aggregation Control Protocol [LACP] groups, what some call "trunking" or "EtherChannel". This is the ability to consolidate multiple streams onto fewer inter-switch Ethernet links, similar to what happens on SAN switches. For the IBM N series, IBM recommends a "fixed" path policy, rather than "most recently used".
IBM recommends disabling Snapshot schedules, and setting the Snap reserve to 0 percent. Why? A snapshot of an ESX server datastore has the VMDK files of many guests, all of which would have had to quiesce or stop to make the data "crash consistent" for the Snapshot of the datastore to even make any sense. So, if you want to take Snapshots, it should be something you coordinate with the ESX server and its guest OS images, and not scheduled by the N series itself.
If you are running NFS protocol to N series, you can turn off the "access time" updates. In normal file systems, when you read a file, it updates the "access time" in the file directory. This can be useful if you are looking for files that haven't been read in a while, such as software that migrates infrequently accessed files to tape. Assuming you are not doing that on your N series, you might as well turn off this feature, and reduce the unnecessary write activity to the IBM N series box.
ESX server can also support "thin provisioning" on the IBM N series. There is a checkbox for "space reserved". Checked means "thick provisioning" and unchecked means "thin provisioning". If you decide to use "thin provisioning" with VMware, you should consider setting AutoSize to automatically increase your datastore when needed, and to auto-delete-snap your oldest snapshots first.
The key advantage of using NFS rather than FCP or iSCSI is that it eliminates the use of the VMFS file system. IBM N series has the WAFL file system instead, and so you don't have to worry about the VMFS partition alignment issue. Most VMDKs are misaligned, so the performance is sub-optimal. If you can align each VMDK to a 32KB or 64KB boundary (depending on guest OS), then you can get better performance. WAFL does this for you automatically, but VMFS does not. For Windows guests, use "Windows PE" to configure correctly-aligned disks. For UNIX or Linux guests, use the "fdisk" utility.
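To make the alignment idea concrete, here is a minimal sketch in Python of how one might check whether a partition's starting offset falls on a 32KB or 64KB boundary. The function names are hypothetical, and the 512-byte sector size and the starting sector of 63 (a common legacy default for older partitioning tools) are assumptions for illustration, not figures from Norm's session.

```python
# Illustrative only: check whether a partition start lands on an alignment boundary.
# Assumptions: 512-byte sectors; sector 63 as a typical legacy (misaligned) start.

SECTOR_BYTES = 512


def is_aligned(start_sector: int, boundary_bytes: int, sector_bytes: int = SECTOR_BYTES) -> bool:
    """Return True if the starting byte offset is a multiple of the boundary."""
    return (start_sector * sector_bytes) % boundary_bytes == 0


def next_aligned_sector(start_sector: int, boundary_bytes: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Return the first sector at or after start_sector that sits on the boundary."""
    sectors_per_boundary = boundary_bytes // sector_bytes
    # Ceiling division, then scale back up to a sector number.
    return -(-start_sector // sectors_per_boundary) * sectors_per_boundary


KB = 1024
for boundary in (32 * KB, 64 * KB):
    print(
        f"boundary {boundary // KB}KB: sector 63 aligned? {is_aligned(63, boundary)}; "
        f"next aligned sector: {next_aligned_sector(63, boundary)}"
    )
```

A start at sector 63 is 32,256 bytes into the disk, which misses both boundaries; moving the start to sector 64 (for 32KB) or sector 128 (for 64KB) would satisfy the check.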
What Industry Analysts are saying about IBM
Vic Peltz gave a presentation highlighting the accolades from securities analysts, IT analysts, and news agencies about IBM and IBM storage products. For example, analysts like that IBM offers many of the exciting new technologies their clients are demanding, like "thin provisioning", RAID-6 double-drive protection, SATA and Solid State Disk (SSD) drive technology. Analysts also like that IBM is open to non-IBM heterogeneous environments. Whereas EMC Celerra gateways support only EMC disk, IBM N series gateways and IBM SAN Volume Controller support a mix of IBM and non-IBM equipment.
Analysts also like IBM's "datacenter-wide" approach to issues like security and "Green IT". Rather than focusing on these issues with individual point solutions, IBM attacks these challenges with a complete "end-to-end" solution approach. A typical 25,000 square foot data center consumes $2.6 million USD in power and cooling today, and IBM has proven technologies to cut this cost in half. IBM's DS8000 on average consumes 26.5 to 27.8 percent less electricity than a comparable EMC DMX-4 disk system. IBM's tape systems consume less energy than comparable Sun or HP models.
IBM iDataPlex product technical presentation
Vallard Benincosa, IBM Technical Sales Specialist, presented the recently-announced [IBM System x iDataPlex]. This is designed for our clients that have thousands of x86 servers, that buy servers "racks at a time", to support Web 2.0 and digital archive workloads. The iDataPlex is designed for efficient power and cooling, rapid scalability, and usable server density.
iDataPlex is such a radical design departure that it might be difficult to describe in words. Most racks take up two floor tiles, each tile being two feet by two feet. In that space, a traditional rack would have 19-inch-wide servers that slide in horizontally, with flashing lights and hot-swappable disks in the front, and all the power supply, fans and networking connections in the back. Even with IBM BladeCenter, you have chassis in these racks, and then servers slide in vertically in the front, with all of the power supply, fan and networking connections in the back. To access these racks, you have to be able to open the door on both the front and back. And the cooling air has to travel at least 26.5 inches from the front of the equipment to the back.
iDataPlex turns the rack sideways. Instead of two feet wide and four feet deep, it is four feet wide and two feet deep. This gives you two 19 inch columns to slide equipment into, and the air only has to travel 15 inches from front to back. Less distance makes cooling more efficient.
Next, iDataPlex makes the power cord the only thing in the back, controlled by an intelligent power distribution unit (iPDU) so you can turn the power off without having to physically pull the plug. Everything else is serviced from the front door. This means that the back door can now be an optional "Rear Door Heat Exchanger" [RDHX] that is filled with running water to make cooling the rack extremely efficient. Water from a coolant distribution unit (CDU) can supply about three to four RDHX doors.
Let's say you wanted to compare traditional racks with iDataPlex for 84 servers. You can put 42 "1U" servers in each of two racks. Each rack requires 10 kVA (kilo-volt-amps), so you give each rack two 8.6 kVA feeds. That is four feeds, and at $1500-2000 USD per feed per month, will cost you $6000-8000 per month. With iDataPlex, you can fit 84 servers in one 20 kVA rack with only three 8.6 kVA feeds, saving you $1500-2000 USD per month.
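The feed arithmetic in this comparison can be written out as a short Python sketch. The function name is hypothetical, and the only inputs are the figures quoted above: four feeds for the two traditional racks, three feeds for the iDataPlex rack, and $1500-2000 USD per feed per month.

```python
# Illustrative only: monthly cost of power feeds, using the figures quoted above.

COST_PER_FEED_USD = (1500, 2000)  # (low, high) monthly cost per 8.6 kVA feed


def monthly_feed_cost(feeds: int, cost_range=COST_PER_FEED_USD) -> tuple:
    """Return the (low, high) monthly cost in USD for a number of feeds."""
    low, high = cost_range
    return feeds * low, feeds * high


traditional = monthly_feed_cost(2 * 2)  # two racks, two feeds each
idataplex = monthly_feed_cost(3)        # one rack, three feeds
savings = (traditional[0] - idataplex[0], traditional[1] - idataplex[1])

print("traditional:", traditional)  # (6000, 8000)
print("iDataPlex:  ", idataplex)    # (4500, 6000)
print("savings:    ", savings)      # (1500, 2000)
```

The saving is simply the cost of the one feed that is no longer needed.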
Fans are also improved. Fan efficiency depends on diameter, so small fans in 1U servers aren't as effective as iDataPlex's 2U fans, saving about 12-49W per server. Whereas typical 1U server racks spend 10-20 percent of their energy on the fans, the iDataPlex spends only about 1 percent, saving 8 to 36 MWh per year per rack.
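As a rough cross-check of the fan savings, here is a small Python sketch that scales the 12-49W per-server figure up to a rack-year. It assumes a fully populated rack of 84 servers running around the clock (8,760 hours per year); both are assumptions for illustration. Under those assumptions the savings come to roughly 8.8 to 36 megawatt-hours per rack per year.

```python
# Illustrative only: scale per-server fan savings (watts) to a rack-year.
# Assumptions: 84 servers per rack, running 24 hours a day, 365 days a year.

HOURS_PER_YEAR = 24 * 365  # 8760


def annual_rack_savings_mwh(watts_saved_per_server: float, servers_per_rack: int = 84) -> float:
    """Return the energy saved per rack per year, in megawatt-hours."""
    watt_hours = watts_saved_per_server * servers_per_rack * HOURS_PER_YEAR
    return watt_hours / 1_000_000


low = annual_rack_savings_mwh(12)
high = annual_rack_savings_mwh(49)
print(f"{low:.2f} to {high:.2f} MWh per rack per year")
```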
Each 2U chassis snaps into a single power supply and a bank of 2U fans. A "Y" power cord allows you to have one cord for two power supplies. A chassis can hold either two small server "flexnodes" or one big "flexnode". An iDataPlex rack can hold up to 84 small servers or 42 big servers. Since each "Y" cord can power up to four "flexnode" servers, you greatly reduce the number of PDU sockets taken, leaving some sockets available for traditional 1U switches.
The small "flexnode" server can have one 3.5 inch HDD, or two 2.5 inch HDD, either SAS or SATA, and the big "flexnode" can have twice these. If you need more storage, there is a 2U chassis that holds five 3.5 inch HDD or eight 2.5 inch HDD. These are all "simple-swappable" (servers must be powered down to pull out the drives). For hot-swappable drives, there is a 3U chassis with twelve 3.5 inch SAS or SATA drives.
The small "flexnode" server has one [PCI Express] slot, the big servers have two. These could be used for [Myrinet] clustering. With only 25W power, the PCI Express slots cannot support graphics cards.
The iDataPlex is managed using the "Extreme Cluster Administration Toolkit" [XCAT]. This is an open source project under Eclipse that IBM contributes to.
Finally, there was the concept of "pitch". This is the distance from the center of one "cold aisle" to the next "cold aisle". In typical data centers, a pitch is 9 to 11 tiles. With the iDataPlex it is only three tiles when using the RDHX doors, or six tiles without. Most data centers run out of power and cooling before they run out of floor space, so having more dense equipment doesn't help if it doesn't also use less electricity. Since the iDataPlex uses 40 percent less power and cooling, you can pack more racks per square foot of an existing data center floor with the existing power and cooling available. That is what IBM calls "usable density"!
What Did You Say? Effective Questioning and Listening Techniques
Maria L. Anderson, IBM Human Resources Learning, gave this "professional development" talk. I deal with different clients every week, so I fully understand that there is a mix of art and science in crafting the right questions and listening to the responses. The focus was on how to ask better questions and improve the understanding and communication during consultative engagements. This involves the appropriate mix of closed and open-ended questions, exchanging or prefacing as needed. This was a good overview of the ERIC technique (Explore, Refine, Influence, and Confirm).
Well, that wraps up my week here in Los Angeles. Special thanks to my two colleagues, Jack Arnold and Glenn Hechler, both from the Tucson Executive Briefing Center, who helped me prepare and review my presentations!
Instead, the new VBC was designed to supplement in-person briefings, maintained by technical folks like myself who are part of the physical briefing centers. The site has four major sections:
This section features presentations for the five product lines and eight solution areas. The voiced-over presentations are recorded as flash videos.
Dozens of subject matter experts (SME) are contributing to this site with their own voices, blogs, presentations and videos. If you have been to a briefing center recently, you might recognize some familiar faces.
Scheduling and Planning
The VBC can also help plan and schedule for online and in-person visits. Not familiar with IBM's products and solutions? The VBC can help you get familiar before your briefing. Interested in discussing more about a particular topic? The VBC can identify webinars and briefings that you can attend. Already attended an IBM briefing in person and now you want to share the excitement with your colleagues back in the office? Use the VBC to help share your knowledge with others.
Not sure which briefing center to visit? Systems and Technology Group has 13 of them worldwide, and this section describes each one.
The VBC is available worldwide to all companies looking to buy IBM products and solutions, as well as IBM Business Partners and sales reps. Check it out!
I had attended this conference the past four years, but sadly will not be attending the one this year. If you are attending this conference for the first time, perhaps a quick look at my blog posts from last year will help you get oriented:
Continuing this week's theme on dealing with the global economic meltdown, recession and financial crisis, I found a great video that recaps IBM CEO Sam Palmisano's recommendations for being more competitive in this environment.
In a recent speech to business leaders, Sam outlined what he sees as the four most important steps to thriving in the global economy. The highlights can be seen here in this [2-minute video] on IBM's "Forward View" eMagazine.
This week, I presented at the "IBM TechU Comes to You" event in beautiful Dubai, United Arab Emirates. This was a three-day event, so here is my recap of Day 2.
Introducing the Spectrum Storage Suite
Mike Griese (IBM WW Spectrum Storage Software Evangelist) presented an overview of the IBM Spectrum Storage family of products, and the new IBM Spectrum Storage Suite license which drastically simplifies the TB-licensing of all six products into a single number.
Spectrum Scale - introduction, use cases, competitive advantages
I presented an overview of IBM Spectrum Scale v4.2.1 release. I covered our support for POSIX, NFS, SMB, Hadoop, OpenStack Swift and Amazon S3 interfaces.
IBM Spectrum Scale is an ideal solution to replace NetApp filers, EMC Isilon or DataDomain storage devices. Use cases include clustered NAS, Object store, and Hadoop repository for analytics.
IBM Spectrum Archive -- Integration with Spectrum Scale, and its Applications in CCTV and Media
This was another special request from the UAE team, and I had a lot of fun putting it together. I started talking about IBM's recent acquisitions in video technologies, including LiveStream and ClearLeap.
I then explained how Spectrum Scale works, and how Spectrum Archive works either separately, or in combination with Spectrum Scale.
A live demo was planned to show this all off, but sadly I had network, firewall and/or VPN issues that prevented me from attaching to my Tucson-based systems. I then wrapped up with client references that have successfully used IBM Spectrum Archive in this area.
IBM Virtual Storage Center - Prepare your existing storage for the future
Mike Griese presented IBM Virtual Storage Center, which combines the "Control Plane" product of IBM Spectrum Control Advanced Edition, with the "Data Plane" products under IBM Spectrum Virtualize.
Introduction to Object Storage and its Applications - Cleversafe
I presented the basics of object store, a radical new way of storing information and how it is different from block or file-based storage alternatives.
I then covered the features of IBM Cleversafe solutions, available as software, pre-built appliances, and in the Cloud. I wrapped up with practical use cases for Content Repository, Enterprise Collaboration, Active Archive, Storage as a Service, and Backup storage pool.
Integration between IBM Spectrum Scale and Cleversafe
This was a fun session.
I presented an overview of IBM Spectrum Scale which provides volume, file and object-level storage interfaces on data that can span various flash, hybrid and spinning disk storage devices.
I gave a quick recap of Cleversafe for those who missed my earlier session.
I then showed how files can be migrated from IBM Spectrum Scale to either Cleversafe on-premises, Cleversafe in the Cloud on IBM SoftLayer, or LTFS-enabled tape using Spectrum Archive, or to any combination of disk, tape, object storage, Cleversafe and Cloud through IBM Spectrum Protect HSM and Space Management features.
Tuesday evening I went out to dinner with the z Systems team. Earlier in my career, I was the chief architect of DFSMS, the storage management element of z/OS operating system, so I continue to have close ties with the folks from Poughkeepsie.
Was Dubai too far away for you to attend? Want to hear the latest technical information about IBM Storage, but not willing to wait until the big [IBM Edge Conference] this September? We will have several more "IBM TechU Comes to You" events in May and June.
(Note: While Lenovo has officially taken over the System x on October 1st back in the United States, China, and several other countries in Asia and the Americas, it has not yet happened in Europe. This is expected to happen this December. This results in some awkwardness during this period of transition.)
Day 1 started off with some keynote sessions. Amy Purdy, IBM Director of Training Services, was the emcee.
Gareth Tucker, Director of EMEA for Intel
Gareth focused on the strong partnership between IBM, Lenovo and Intel. For example, a client query that took 4 hours with a traditional DB2 database on Intel Xeon took only 90 seconds on DB2 BLU with the new Xeon V2 chip.
10 years ago, some storage vendors warned clients not to use any Intel-based storage devices. Today, over 85 percent of storage is Intel-based, including most of the IBM System Storage portfolio. IBM SoftLayer also uses Intel to offer both bare metal and virtual x86 servers, and was the first cloud provider to use Intel's "Trusted Execution" mode.
Next year, Microsoft will drop support for Windows 2003 server on July 15, 2015. This represents an excellent selling opportunity to get clients to upgrade their x86 server hardware. Intel estimates there are 24 million instances of Windows 2003 worldwide. On average, it takes 150 days to migrate to Windows 2012, so get clients to start now!
Jeff Howard, Vice President of Lenovo Flex and BladeCenter
Jeff was a last-minute stand-in for Adalio Sanchez who is busy getting thousands of employees and hundreds of trailer trucks full of IT equipment from IBM's Raleigh location to Lenovo's new building in Morrisville.
Lenovo's goal is simple: to be the #1 vendor of x86 enterprise servers. Lenovo sees a $44 Billion USD opportunity in x86 servers, with an additional $14B opportunity selling IBM System Storage attached to these servers. Lenovo is already #1 for Personal Computers in the consumer space, and is #1 for customer satisfaction. IBM System x is #1 in reliability and up-time for x86 servers. In a client survey of how many clients had an outage lasting four hours or more, the figure was less than 1 percent for IBM System x, compared to 13 percent for HP servers. That's a big difference!
There is a 40 percent growth in "Converged Systems" such as the Flex System and PureFlex systems. Lenovo will take over the x86-only versions of these, while IBM will retain the POWER-based and Power-and-x86 hybrid models. IBM will also retain the PureApplication and PureData models of the PureSystems line.
Lenovo is also focused on security. Their "Trusted Platform" includes Self-encrypting Drives (SED) managed by IBM Security Key Lifecycle Manager software, and Crypto-assist co-processors.
Jeff also mentioned new reference architectures for VMware's VSAN, Microsoft's Fast-track Data warehouse for SQL Server, SmartCloud Desktop Infrastructure VDI with Atlantis ILIO, and Flex Systems for Hyper-V.
Greg Lotko, VP of IBM Storage Systems Development
Greg is the new VP of Storage Systems Development, about 11 months on the job, but I am glad to hear that he recognizes that IBM System Storage has a huge portfolio of products.
He focused on those areas where IBM is ranked #1:
IBM is #1 for All-Flash arrays.
IBM is #1 for Software Defined Storage (SDS).
IBM is #1 for Tape, including tape drives, tape libraries and virtual tape systems
The weather here in Dublin is great, although I have not had much time to enjoy the outdoors with all the awesome and interesting sessions inside!
A Brief History of SVC and Storwize Family: What, How and Why?
Fellow IBM Master Inventor [and blogger] Barry Whyte gave an excellent session on the past 10 years of development history for IBM SAN Volume Controller and the rest of the Storwize Family based on its binary code. The SAN Volume Controller represents the start of a movement, what is now called "Software-Defined Storage", with a layer of abstraction that completely hides the differences between different back-end devices. The Storwize family is the most successful Software-defined Storage solution in the IT industry!
IBM Cloud Storage Architectures
IBM Clod Barrera presented an updated version of his "Cloud Storage Architecture" pitch from a technical and strategic viewpoint. From 2011 to 2015, external storage spend is increasing 25 percent for public cloud, and 17 percent for private cloud deployments, and that is not including all of the Do-it-yourselfers like Facebook who build their own storage devices from piece parts.
This year, Clod has expanded his "Cloud Storage Taxonomy" to six different categories:
OLTP/transactional, typically block-based
General purpose storage
Ephemeral storage that exists only while a specific virtual machine (VM) is running
Analytics, which tends to be more sequential than random in I/O pattern
IBM is a platinum sponsor of OpenStack, and is proud to have hundreds of contributors assigned to improve this open source initiative.
IBM Linear Tape File System - Enterprise Edition
IBM Ed Childers presented the latest announcement on Linear Tape File System [LTFS]. For a quick recap, IBM first introduced LTFS Standard Drive Edition [LTFS-SDE] in April 2010, which allowed workstations attached to single tape drives to use cartridges much like USB memory sticks. Then, IBM introduced LTFS Library Edition [LTFS-LE] which allows an entire tape library to be mounted as a file system, with each resident tape cartridge listed as a sub-directory.
Now, IBM has LTFS Enterprise Edition, which combines disk-based General Parallel File System [GPFS] with LTFS-LE, resulting in a combined hybrid disk-and-tape file system.
To provide a client's perspective, Konstantin Arnold with Biozentrum, the Life Sciences Research department of the University of Basel, Switzerland and SIB Swiss Institute of Bioinformatics, presented some shocking information on their data growth. Biozentrum studies 3D protein folding, with information from the Worldwide Protein Data Bank [PDB] and [UniProt], which combines manually annotated protein entries from Swiss-Prot with computationally analyzed and automatically annotated entries from TrEMBL.
Combining lab data, proteomics, deep-sequencing, imaging and high-content analysis, their storage requirements have grown exponentially, from less than 50 TB in 2009, to over 350 TB in 2013. With the need to have such a large repository of unstructured data, it made sense to use LTFS-EE for this project!
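To put that growth in perspective, here is a back-of-the-envelope Python sketch of the implied compound annual growth rate. It treats the quoted figures as exactly 50 TB and 350 TB over four years, which is an approximation since they were given as "less than" and "over". On those assumptions, capacity grew by roughly 63 percent per year.

```python
# Illustrative only: compound annual growth rate implied by two capacity figures.
# Assumption: exactly 50 TB in 2009 and 350 TB in 2013 (the text gives bounds).

import math


def compound_annual_growth_rate(start: float, end: float, years: float) -> float:
    """Return the constant yearly growth rate that turns start into end."""
    return (end / start) ** (1 / years) - 1


def doubling_time_years(rate: float) -> float:
    """Return how many years it takes to double at a constant yearly growth rate."""
    return math.log(2) / math.log(1 + rate)


rate = compound_annual_growth_rate(50, 350, 2013 - 2009)
print(f"growth rate: {rate:.1%} per year")
print(f"doubling time: {doubling_time_years(rate):.1f} years")
```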
IBMers presented the use of SAN Volume Controller (SVC) in a "stretch cluster" for a production environment at a bank in the Middle East. Before going into the technical details of the solution, they explained the challenges of running a bank under Sharia law. For example, Sharia law does not allow charging interest on borrowed money, but banks can charge fees for services. Debit cards are automatically denied at "black-listed" shops, such as liquor stores, that are not consistent with the precepts of the Islamic religion.
The SVC implementation was rather straightforward. IBM has offered Stretch Cluster since 2009 with version 5.1, but it only gained popularity years later when VMware pointed out that this can be used for datacenter-to-datacenter vMotion activity. The IBM team tested this out locally over a short 500-meter distance, before stretching it out between the two locations where it is now implemented. They have three SVC nodes managing 60 TB of managed disk capacity at each data center, made up of a mix of DS8870, Storwize V7000 and DS3950 disk systems.
To demonstrate the robustness of the solution, the client requested that the IBM team demonstrate various recovery scenarios while running live in production mode! As you would expect, IBM SVC successfully handled every one.
IBM Cloud Storage with OpenStack and IBM System Storage
IBM Michael Factor presented this overview of OpenStack, and how IBM already supports various aspects of the open source initiative with products like SAN Volume Controller, XIV, and Storwize V7000.
This was the best overview of OpenStack I had heard. IBM is a platinum sponsor of this open source initiative, managed by the [OpenStack Foundation]. In traditional open source fashion, the twice-yearly releases are given alphabetically-ascending names. The last release was named Folsom, the current release is Grizzly, and the next release planned will be named Havana.
OpenStack is designed to manage your data center or cloud across four capabilities: Compute, Network, Storage and Shared Services. For Compute, the "Nova" project focuses on managing running VM instances, and "Glance" manages VM images that can be launched. The "Networking" project focuses on providing network connectivity. This was formerly called "Quantum", but Quantum (the company) felt there might be some confusion, so it was renamed to just "Networking".
For Storage, there are two projects, "Cinder" and "Swift". Cinder refers to persistent, external block storage, accessible via iSCSI or Fibre Channel. IBM's SAN Volume Controller, XIV and Storwize V7000 already support the Cinder API interface. Swift is focused on "object storage", which can provide an alternative way of storing information for cloud-based applications. SNIA's Cloud Data Management Interface (CDMI) is working with OpenStack to bring object storage into the mainstream.
With the Cinder API, applications can create volumes, take snapshots, set quotas, and attach these volumes to VM instances.
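To make those four operations concrete, here is a toy in-memory sketch. To be clear, this is not the Cinder API or any OpenStack client library: the class `ToyBlockStore`, its method names, and the quota rule are all hypothetical, made up only to show how create, snapshot, quota and attach relate to each other.

```python
class ToyBlockStore:
    """In-memory stand-in for a block storage service (NOT the Cinder API)."""

    def __init__(self, quota_gb):
        self.quota_gb = quota_gb   # total capacity this tenant may allocate
        self.volumes = {}          # volume name -> size in GB
        self.snapshots = {}        # snapshot name -> source volume name
        self.attachments = {}      # volume name -> VM instance name

    def used_gb(self):
        """Capacity currently counted against the quota."""
        return sum(self.volumes.values())

    def create_volume(self, name, size_gb):
        """Create a volume, refusing requests that would exceed the quota."""
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        if self.used_gb() + size_gb > self.quota_gb:
            raise ValueError("quota exceeded")
        self.volumes[name] = size_gb

    def snapshot(self, volume, snapshot_name):
        """Record a point-in-time snapshot of an existing volume."""
        if volume not in self.volumes:
            raise KeyError(volume)
        self.snapshots[snapshot_name] = volume

    def attach(self, volume, instance):
        """Attach a volume to one VM instance at a time."""
        if volume not in self.volumes:
            raise KeyError(volume)
        if volume in self.attachments:
            raise ValueError(f"volume {volume!r} is already attached")
        self.attachments[volume] = instance


# Example: a 100 GB quota, one 60 GB volume, snapshotted and attached.
store = ToyBlockStore(quota_gb=100)
store.create_volume("data", 60)
store.snapshot("data", "data-snap")
store.attach("data", "vm-1")
print(store.used_gb(), store.snapshots, store.attachments)
```

A second request for a 50 GB volume would be rejected here, since it would push usage past the quota. A real deployment would go through the Cinder service and the storage back-end driver rather than anything like this.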
I realize there is a big time gap between this post and my last. Where have I been? "Where haven't I been?"... might be the better question! After my week at Edge, I flew from Las Vegas to Sao Paulo, Brazil where various protests delayed my departure, then visited clients in the Midwestern USA, then London to watch a bit of tennis. From there, I flew to Athens, Greece (and yes, more protests!), took some overdue time-off on the beach on various Greek islands, then taught a Storage Top Gun class in Bangalore, India. So, yes, I have been quite busy. I will try to catch up on typing up all my notes from the IBM Edge conference over the next few weeks!
Continuing my week in Tokyo, Japan, I was going to title this post "Chunks, Extents and Grains", but decided instead to use the fairy tale above.
Fellow blogger BarryB from EMC, on his The Storage Anarchist blog, once again shows off his [PhotoShop talents], in his post [the laurel and hardy of thin provisioning]. This time, BarryB depicts fellow blogger and IBM master inventor, Barry Whyte, as Stan Laurel and fellow blogger Hu Yoshida from HDS as Oliver Hardy.
At stake is the comparison in various implementations of thin provisioning among the major storage vendors. On the "thick end", Hu presents his case for 42MB chunks on his post [When is Thin Provisioning Too Thin]. On the "thin end", IBMer BarryW presents the "fine-grained" details of Space-efficient Volumes (SEV), made available with the IBM System Storage SAN Volume Controller (SVC) v4.3, in his series of posts:
BarryB paints both implementations as "extremes" in inefficiency. Some excerpts from his post:
"... Hitachi's "chubby" provisioning is probably more performance efficient with external storage than is the SVC's "thin" approach. But it is still horribly inefficient in context of capacity utilization.
... the "thin extent" size used by Symmetrix Virtual Provisioning is both larger than the largest that SVC uses, and (significantly) smaller than what Hitachi uses."
"free" may be the most expensive solution you can buy...
Before you rush off to put a bunch of SVCs running (free) SEV in front of your storage arrays, you might want to consider the performance implications of that choice. Likewise, for Hitachi's DP, you probably want to understand the impact on capacity utilization that DP will have. DP isn't free, and it isn't very space efficient, either."
BarryB would like you to think that since EMC has chosen an "extent" size between 257KB and 41MB it must therefore be the optimal setting, not too hot, and not too cold. As I mentioned last January in my post [Does Size Really Matter for Performance?], EMC engineers had not yet decided what that extent size should be, and BarryB is noticeably vague on the current value. According to this [VMware whitepaper], the thin extent size is currently 768 KB in size. Future versions of the EMC Enginuity operating environment may change the thin extent size. (I am sure the EMC engineers are smarter and more decisive than BarryB would lead us to believe!)
BarryB is correct that any thin provisioning implementation is not "free", even though IBM's implementation is offered at no additional charge. Some writes may be slowed down waiting for additional storage to be allocated to satisfy the request, and some amount of storage must be set aside to hold the metadata directory to point to all these chunks, extents or grains. For the convenience of not having to dynamically expand LUNs manually as more space is needed, you will pay both a performance and capacity "price".
However, as they say, the [proof of the pudding is in the eating], or perhaps I should say porridge in this case. Given that the DMX4 is slower than both HDS USP-V and IBM SVC, you won't see EMC publishing industry-standard [SPC benchmarks] using their "thin extent" implementation anytime soon. IBM allows a choice of grain size, from 32KB to 256KB, in an elegant design that keeps the metadata directory overhead to less than 0.1 to 0.5 percent. I would be surprised if EMC can make a case to be more efficient than that! The performance tests are still being run, but from what I have seen so far, people will be very pleased with the minimal impact from IBM SEV, an acceptable trade-off for improved utilization and reduced out-of-space conditions.
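To see why the allocation unit matters for capacity utilization, here is a minimal sketch that counts how much physical space gets allocated when every unit touched by a write is allocated whole. It is not any vendor's actual implementation: the workload is made up, the function name is mine, and I am treating the KB and MB sizes mentioned above as binary units.

```python
KIB = 1024
MIB = 1024 * KIB

def allocated_bytes(writes, unit_size):
    """Bytes allocated if each allocation unit touched by a write is allocated whole."""
    touched = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // unit_size
        last = (offset + length - 1) // unit_size
        touched.update(range(first, last + 1))
    return len(touched) * unit_size

# Hypothetical workload: 1,000 scattered 4 KiB writes, one every 64 MiB.
writes = [(i * 64 * MIB, 4 * KIB) for i in range(1000)]

# Allocation unit sizes mentioned in this post.
unit_sizes = {
    "32 KB grain": 32 * KIB,
    "256 KB grain": 256 * KIB,
    "768 KB extent": 768 * KIB,
    "42 MB chunk": 42 * MIB,
}

written = sum(length for _, length in writes)
print(f"{'data written':>14}: {written / MIB:10.1f} MiB")
for label, size in unit_sizes.items():
    used = allocated_bytes(writes, size)
    print(f"{label:>14}: {used / MIB:10.1f} MiB allocated")
```

This scattered pattern is close to a worst case for large units. A mostly sequential workload would fill each unit before moving on, and the differences would shrink considerably. The sketch also ignores the metadata directory and the performance side of the trade-off entirely.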
So if you are a client waiting for your EMC equipment to be fully depreciated so you can replace it with faster equipment from IBM or HDS, you can at least improve its performance and capacity utilization today by virtualizing it with IBM SAN Volume Controller.
Continuing my coverage of the [IBM Edge2014 conference], IBM's premier conference for System Storage and related products, I attended EdgeTalks: Innovation That Impacts Our World, which offered a series of inspiring talks styled after the famous [TED] conferences.
Surjit Chana, IBM Chief Marketing Officer (CMO) and VP of Strategy for IBM Systems and Technology Group, served as emcee to introduce the speakers.
Ron Finley, Renegade Gardener
Back in 2003, "South Central" was [renamed to South Los Angeles]. But as everyone in IT knows, merely renaming something doesn't fix any of its problems. Ron was tired of seeing empty lots filled with old mattresses, used condoms and discarded tires, and wanted to beautify his immediate surroundings by planting vegetables in his front yard.
Ron's army of volunteers, the [L.A. Green Grounds], filed a petition. As of October 2013, it is now legal to grow food on your parkway in Los Angeles.
Ron explained that South Los Angeles is a [food desert], where it is nearly impossible to get healthy, organic food. He is concerned the "drive-thrus" of fast food restaurants kill more of his neighbors than [drive-by] shootings.
Ron has discovered this problem is not limited to Los Angeles. The American food system is designed to fill you with processed food and chemicals, made worse by a health care system happy to cut you open or prescribe you more chemicals and drugs. Everywhere processed food goes, chronic disease follows. The USA exports obesity to the rest of the world.
"To change a community, you must first change the composition of the soil." -- Ron Finley
The rise in cancer, diabetes, and childhood cardiac arrests inspired Ron to start the [Ron Finley Project] consisting of community farms, a marketplace that accepts EBT, SNAP and other government food programs, and portable "container cafes" based on standard shipping containers that could be placed near a garden to help sell the food grown locally.
John Wilbanks, Chief Commons Officer at Sage Bionetworks and Senior Fellow in Entrepreneurship for Faster Cures
We live in the age of cheap data. John prefers the term "cheap data" rather than "big data". Mapping the first human genome cost $3 billion USD; now John can get his own genome mapped for about $1200.
John feels this cheap data changes the way we justify our opinions. Baseball has shifted from scouts to the analytics demonstrated in the movie [Moneyball]. President Barack Obama used social media to help win elections. And cheap data is coming to health and medicine.
John gave an interesting example. A grad student wanted to study alcoholism among undergraduate students. The traditional method would have been to gather privacy permission slips from volunteers. Instead, he "friended" 4,000 undergraduates, and searched social media for photos containing the [distinctive color of red beer cups] that were taken on Monday through Wednesday, indicative of a drinking problem. This innovative approach allowed the grad student to complete his research in less than six weeks.
Cheap data doesn't mean we have wisdom. John explained the wrong way of doing things. There are several machine-learning apps for smartphones to check for melanoma. Take a photo of your suspected mole, and the app will determine if it detects skin cancer, and recommend a biopsy. Incentives to sell apps, and to perform biopsies, result in 90 percent false positive rates. There is no financial incentive to improve accuracy.
Sharing is the innovation that converts cheap data into wisdom. Get the world's smartest people to compete to create wisdom. In collaboration with IBM on the Dialogue for Reverse Engineering Assessments and Methods [DREAM] platform, a competition for modeling breast cancer was launched. Requiring all participants to share their code in real-time allowed the accuracy of the model to jump three orders of magnitude in just nine days. Over 60 teams participated. The winning team was awarded an article and the cover of [Science Translational Medicine] magazine.
John feels that there are very few genius [data scientists] in the world, and they are isolated, hideously overpaid, managing hedge funds or search engines, but would probably rather be looking for cures for cancer.
Progress is not made if every company only has its own people looking at its own data. John wants data to be shared amongst the world's scientists to create wisdom. However, collaboration flies in the face of the competition that all the reward systems in health care are based on.
As an experiment, John wanted to make his own genome public. However, that requires "informed consent" for others to use his private health information, and it took him six months of working through legal and ethical rules to develop a system for providing this consent for public use.
In much the same way that gardens and fields were the first [commons] shared by farmers, John feels we need to cultivate the public domain, the "digital commons". This can truly transform medicine and health care.
Peter Singer, Technology Expert and Best-selling Author
The first web page appeared in 1991, and now there are over 30 trillion pages. Over 98 percent of military communications travel over the civilian internet. The [Internet of Things] adds everything from smart cars to medical devices into the equation.
But along with all the benefits the web has brought society, there are also risks. Every second, nine new pieces of malware are discovered. An astounding 97 percent of Fortune 500 companies have admitted to being hacked. Over 100 governments have established a cybermilitary force.
(Instead of PowerPoint slides, Peter had a slideshow of his personal collection of the world's best and worst cybersecurity art. Studies show that audiences remember 60 percent more if they are looking at pictures when they hear a speaker.)
While IT folks are good at dealing with both hardware and software, they traditionally don't do well with "wetware", the human side of things. Essential cybersecurity terms and concepts are often misunderstood.
Business leaders over-react to some threats, but completely ignore others. Consider that 70 percent of cybersecurity decisions at companies are made by executives who have no training in cybersecurity. No single MBA program offers cybersecurity courses.
There is a shortage of talent to deal with cybersecurity. Hiring managers are only satisfied with 40 percent of the employees they hire in the cybersecurity space.
Incentives help explain why some industries like financial services do security well, while others like health care do poorly.
In an effort to find which employees do not take cybersecurity seriously enough, companies have resorted to sending [phishing] emails to their own employees. Those that click are caught, and must attend mandatory training, or are subject to dismissal. Unfortunately, senior executives are twice as likely to click on phishing emails as the general workforce.
Peter recommends companies focus on resilience. You can never build high enough walls to eliminate threats. Instead, focus on bouncing back after attacks, similar to how antibodies in the human body deal with illness.
Ben Franklin said that an ounce of prevention is worth a pound of cure. Peter cited a study that found proper cyberhygiene would have prevented 94 percent of attacks. The most successful foreign military attack on the U.S. military happened when a soldier saw a memory stick in a parking lot, and was curious enough to connect it to the secure military network to see what it contained.
We need to build an ethic. We teach our kids to cover their mouths when they cough. This does not protect your child in any way, but is an ethic to avoid spreading disease. We need to teach the same ethics related to cybersecurity.
All three were excellent talks focused on innovation. Ron Finley used gardening in otherwise empty urban spaces to help grow people as well as food. John Wilbanks used innovation to help bring the smartest minds to determine models for identifying cancer from genomes. Peter Singer marveled at the innovation of the Internet, and how proper cyberhygiene is needed to keep it secure.
These talks were recorded and are available on this [98-minute YouTube video]. For those on Twitter, my handle is @az990tony and the hashtag for this session was #ibmedgetalks.
According to Gartner data (from 2005!), host-based storage accounts for 34 percent of the overall market for external storage, with the remaining 66 percent going to "fabric-attached" (network) storage, and this share was expected to grow from 66 percent to 77 percent by 2007. What is the current reality? SAN vs. NAS, FC vs. iSCSI?
IBM subscribes to a lot of data from different analysts, and they all have their own methods for collecting this data, from taking surveys of customers to reviewing the financial results of each vendor. While they might not agree entirely, there are some common threads that lead one to believe they represent "reality". Here are some numbers from an IDC December 2007 report:
Worldwide Disk Storage
While the 32/68 split is similar to the 34/66 split you mentioned before, you can see that external storage is growing faster, so internal host-based storage will drop to 25 percent by 2011, with external storage growing to 75 percent, very close to the 77 percent predicted. Looking at just the external disk storage, there are basically three kinds: DAS (direct cable attachment), NAS (file level protocols such as NFS, CIFS, HTTP and FTP), and SAN (block-level protocols like FC, iSCSI, ESCON and FICON):
Worldwide External Disk Storage
At these rates, fabric-attached (SAN and NAS) will continue to dominate the storage landscape. Looking more closely now at the block-oriented protocols:
Worldwide External Disk Storage
Fibre Channel (FC)
At these rates, iSCSI will overtake FC by 2011. IBM System Storage N series, DS3300 and XIV Nextra all support iSCSI attachment.
Jon Toigo over at DrunkenData offers some additional data from ex-STKer: [Fred Moore Outlook on Storage 2008]. I met Fred at a conference. He had left STK back in 1998, and started his own company called Horison. Neither Jon nor Fred cites the sources of these statistics, but the following comment leads me to assume he hasn't been paying close attention to the tape market:
With the demise of STK, who will be the leader in the tape industry?
Depending on how old you are, you might remember exactly where you were when a significant event occurred, for example the [Space Shuttle Challenger] explosion. For many IBMers, it was the day our friends at Sun Microsystems announced they were [putting our lead tape competitor out of its misery]. I was in New York that day, but there was still some confetti on the floor in the halls of the IBM Tucson lab when I got home a few days later. IBM has been the number one market share leader in tape for over the past four years.