This blog is for the open exchange of ideas relating to IBM Systems, storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
Tony Pearson is a Master Inventor, Senior IT Architect and Event Content Manager for [IBM Systems for IBM Systems Technical University] events. With over 30 years with IBM Systems, Tony is frequent traveler, speaking to clients at events throughout the world.
Lloyd Dean is an IBM Senior Certified Executive IT Architect in Infrastructure Architecture. Lloyd has held numerous senior technical roles at IBM during his 19 plus years at IBM. Lloyd most recently has been leading efforts across the Communication/CSI Market as a senior Storage Solution Architect/CTS covering the Kansas City territory. In prior years Lloyd supported the industry accounts as a Storage Solution architect and prior to that as a Storage Software Solutions specialist during his time in the ATS organization.
Lloyd currently supports North America storage sales teams in his Storage Software Solution Architecture SME role in the Washington Systems Center team. His current focus is with IBM Cloud Private and he will be delivering and supporting sessions at Think2019, and Storage Technical University on the Value of IBM storage in this high value IBM solution a part of the IBM Cloud strategy. Lloyd maintains a Subject Matter Expert status across the IBM Spectrum Storage Software solutions. You can follow Lloyd on Twitter @ldean0558 and LinkedIn Lloyd Dean.
Tony Pearson's books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
The developerWorks Connections platform will be sunset on December 31, 2019. On January 1, 2020, this blog will no longer be available. More details available on our FAQ.
Well, I'm back safely from my tour of Asia. I am glad to report that Tokyo, Beijing and Kuala Lumpur are pretty much how I remember them from the last time I was there in each city. I have since been fighting jet lag by watching the last thirteen episodes of LOST season 6 and the series finale.
Recently, I have started seeing a lot of buzz on the term "Storage Federation". The concept is not new, but rather based on the work in database federation, first introduced in 1985 by [A federated architecture for information management] by Heimbigner and McLeod. For those not familiar with database federation, you can take several independent autonomous databases, and treat them as one big federated system. For example, this would allow you to issue a single query and get results across all the databases in the federated system. The advantage is that it is often easier to federate several disparate heterogeneous databases than to merge them into a single database. [IBM Infosphere Federation Server] is a market leader in this space, with the capability to federate DB2, Oracle and SQL Server databases.
Storage expansion: You want to increase the storage capacity of an existing storage system that cannot accommodate the total amount of capacity desired. Storage Federation allows you to add additional storage capacity by adding a whole new system.
Storage migration: You want to migrate from an aging storage system to a new one. Storage Federation allows the joining of the two systems and the evacuation from storage resources on the first onto the second and then the first system is removed.
Safe system upgrades: System upgrades can be problematic for a number of reasons. Storage Federation allows a system to be removed from the federation and be re-inserted again after the successful completion of the upgrade.
Load balancing: Similar to storage expansion, but on the performance axis, you might want to add additional storage systems to a Storage Federation in order to spread the workload across multiple systems.
Storage tiering: In a similar light, storage systems in a Storage Federation could have different capacity/performance ratios that you could use for tiering data. This is similar to the idea of dynamically re-striping data across the disk drives within a single storage system, such as with 3PAR's Dynamic Optimization software, but extends the concept to cross storage system boundaries.
To some extent, IBM SAN Volume Controller (SVC), XIV, Scale-Out NAS (SONAS), and Information Archive (IA) offer most, if not all, of these capabilities. EMC claims its VPLEX will be able to offer storage federation, but only with other VPLEX clusters, which brings up a good question. What about heterogenous storage federation? Before anyone accuses me of throwing stones at glass houses, let's take a look at each IBM solution:
IBM SAN Volume Controller
The IBM SAN Volume Controller has been doing storage federation since 2003. Not only can IBM SAN Volume Controller bring together storage from a variety of heterogenous storage, the SVC cluster itself can be a mix of different hardware models. You can have a 2145-8A4 node pair, 2145-8G4 node pair, and the new 2145-CF8 node pair, all combined together into a single SVC cluster. Upgrading SVC hardware nodes in an SVC cluster is always non-disruptive.
IBM XIV storage system
The IBM XIV has two kinds of independent modules. Data modules have processor, cache and 12 disks. Interface modules are data modules with additional processor, FC and Ethernet (iSCSI) adapters. Because these two modules play different roles in an XIV "colony", that number of each type is predetermined. Entry-level six-module systems have 2 interface and 4 data modules. Full 15-module systems have 6 interface and 9 data modules. Individual modules can be added or removed non-disruptively in an XIV.
IBM Scale-Out NAS
The SONAS is comprised of three kinds of nodes that work together in concert. A management node, one or more interface nodes, and two or more storage nodes. The storage nodes are paired to manage up to 240 nodes in a storage pod. Individual interface or data nodes can be added or removed non-disruptively in the SONAS. The underlying technology, the General Parallel File System, has been doing storage federation since 1996 for some of the largest top 500 supercomputers in the world.
IBM Information Archive (IA)
For the IA, there are 1, 2 or 3 nodes, which manages a set of collections. A collection can either be file-based using industry-standard NAS protocols, or object-based using the popular System Storage™ Archive Manager (SSAM) interface. Normally, you have as many collections as you have nodes, but nodes are powerful enough to manage two collections to provide N-1 availability. This allows a node to be removed, and a new node added into the IA "colony", in a non-disruptive manner.
Even in an ant colony, there are only a few types of ants, with typically one queen, several males, and lots of workers. But all the ants are red. You don't see colonies that mix between different species of ants. For databases, federation was a way to avoid the much harder task of merging databases from different platforms. For storage, I am surprised people have latched on to the term "federation", given our mixed results in the other "federations" we have formed, which I have conveniently (IMHO) ranked from least effective to most effective:
The Union of Soviet Socialist Republics (USSR)
My father used to say, "If the Soviet Union were in charge of the Sahara desert, they would run out of sand in 50 years." The [Soviet Union] actually lasted 68 years, from 1922 to 1991.
The United Nations (UN)
After the previous League of Nations failed, the UN was formed in 1945 to facilitate cooperation in international law, international security, economic development, social progress, human rights, and the achieving of world peace by stopping wars between countries, and to provide a platform for dialogue.
The European Union (EU)
With the collapse of the Greek economy, and the [rapid growth of debt] in the UK, Spain and France, there are concerns that the EU might not last past 2020.
The United States of America (USA)
My own country is a federation of states, each with its own government. California's financial crisis was compared to the one in Greece. My own state of Arizona is under boycott from other states because of its recent [immigration law]. However, I think the US has managed better than the EU because it has evolved over the past 200 years.
The Organization of the Petroleum Exporting Countries [OPEC]
Technically, OPEC is not a federation of cooperating countries, but rather a cartel of competing countries that have agreed on total industry output of oil to increase individual members' profits. Note that it was a non-OPEC company, BP, that could not "control their output" in what has now become the worst oil spill in US history. OPEC was formed in 1960, and is expected to collapse sometime around 2030 when the world's oil reserves run out. Matt Savinar has a nice article on [Life After the Oil Crash].
United Federation of Planets
The [Federation] fictitiously described in the Star Trek series appears to work well, an optimistic view of what federations could become if you let them evolve long enough.
Given the mixed results with "federation", I think I will avoid using the term for storage, and stick to the original term "scale-out architecture".
Well, it's Tuesday again, and that means IBM announcements! Right on the heels of our big storage launch on February 9, today IBM announced some exciting options for its modular disk systems. Let's take a look:
2TB SATA-II drives
That's right, you can now DOUBLE your capacity with 2TB SATA type-II drives on the DS3950, DS4200, DS4700, DS5020, DS5100 and DS5300 disk controllers, as well as the DS4000 EXP420, EXP520, EXP810, EXP5000 and EXP5060 expansion drawers. Here are the Announcement Letters for the [HVEC] and [AAS] ordering systems.
300GB Solid State Drives
IBM also announces 300GB solid state drives (SSD) for the DS5100 and DS5300. These are four times larger than the 73GB drives IBM offered last year, for those workloads that need high read IOPS such as Online Transaction Processing (OLTP) and Enterprise Resource Planning (ERP) applications. Here is the [Announcement Letter].
New N series model N3400
For customers that need less than the minimum 21TB that our IBM Scale-Out Network Attach Storage (SONAS) can provide, IBM offers the new N3400 unified storage disk system, with support for NFS, CIFS, iSCSI and FCP. This is a 2U high 12 drive model that can be expanded up to 136 drives, basically doubling all the stats from last year's N3300 model. Fellow blogger, Rich Swain (IBM), does a great job recapping the speeds and feeds over on his blog [News and Information about IBM N series].
It also appears that the reports and rumors of the death of the DS6800 are premature. Don't believe misleading statements from competitors, such as those found written by fellow blogger BarryB (EMC), aka "the Storage Anarchist", in his latest post [Bring Out Your Dead] showing a cute little tombstone with "Feb 2010" on the bottom. Actually, if he had bothered to read IBM's [Announcement Letter], he would have realized that IBM plans to continue to sell these until June. Of course, IBM will continue to support both new and existing DS6800 customers for many years to come.
Technically, BarryB does not make any factually incorrect statements for me to correct on his blog. The idea that a product is "dead" is, of course, just opinion, and competitors poke fun at each others' announcements every day. One could argue that the EMC V-Max was "dead" after the ITG whitepaper [Cost/Benefit Case for IBM XIV Storage System - Comparing Costs for IBM XIV and EMC V-Max Systems] demonstrated that the IBM XIV cost 63 percent less than a comparable EMC V-Max over the life of three years total cost of ownership (TCO) back in July 2009. The comparison was made with data from clients in a variety of industries including manufacturing, health care, life sciences, telecommunications, financial services, and the public sector. This could explain why so many EMC customers are buying or investigating the IBM XIV and the rest of the IBM storage portfolio.
Am I dreaming? On his Storagezilla blog, fellow blogger Mark Twomey (EMC) brags about EMC's standard benchmark results, in his post titled [Love Life. Love CIFS.]. Here is my take:
A Full 180 degree reversal
For the past several years, EMC bloggers have argued, both in comments on this blog, and on their own blogs, that standard benchmarks are useless and should not be used to influence purchase decisions. While we all agree that "your mileage may vary", I find standard benchmarks are useful as part of an overall approach in comparing and selecting which vendors to work with, and which architectures or solution approaches to adopt, and which products or services to deploy. I am glad to see that EMC has finally joined the rest of the planet on this. I find it funny this reversal sounds a lot like their reversal from "Tape is Dead" to "What? We never said tape was dead!"
Impressive CIFS Results
The Standard Performance Evaluation Corporation (SPEC) has developed a series of NFS benchmarks, the latest, [SPECsfs2008] added support for CIFS. So, on the CIFS side, EMC's benchmarks compare favorably against previous CIFS tests from other vendors.
On the NFS side, however, EMC is still behind Avere, BlueArc, Exanet, and IBM/NetApp. For example, EMC's combination of Celerra gateways in front of V-Max disk systems resulted in 110,621 OPS with overall response time of 2.32 milliseconds. By comparison, the IBM N series N7900 (tested by NetApp under their own brand, FAS6080) was able to do 120,011 OPS with 1.95 msec response time.
Even though Sun invented the NFS protocol in the early 1980s, they take an EMC-like approach against standard benchmarks to measure it. Last year, fellow blogger Bryan Cantrill (Sun) gives his [Eulogy for a Benchmark]. I was going to make points about this, but fellow blogger Mike Eisler (NetApp) [already took care of it]. We can all learn from this. Companies that don't believe in standard benchmarks can either reverse course (as EMC has done), or continue their downhill decline until they are acquired by someone else.
(My condolences to those at Sun getting laid off. Those of you who hire on with IBM can get re-united with your former StorageTek buddies! Back then, StorageTek people left Sun in droves, knowing that Sun didn't understand the mainframe tape marketplace that StorageTek focused on. Likewise, many question how well Oracle will understand Sun's hardware business in servers and storage.)
What's in a Protocol?
Both CIFS and NFS have been around for decades, and comparisons can sometimes sound like religious debates. Traditionally, CIFS was used to share files between Windows systems, and NFS for Linux and UNIX platforms. However, Windows can also handle NFS, while Linux and UNIX systems can use CIFS. If you are using a recent level of VMware, you can use either NFS or CIFS as an alternative to Fibre Channel SAN to store your external disk VMDK files.
The Bigger Picture
There is a significant shift going on from traditional database repositories to unstructured file content. Today, as much as [80 percent of data is unstructured]. Shipments this year are expected to grow 60 percent for file-based storage, and only 15 percent for block-based storage. With the focus on private and public clouds, NAS solutions will be the battleground for 2010.
So, I am glad to see EMC starting to cite standard benchmarks. Hopefully, SPC-1 and SPC-2 benchmarks are forthcoming?
Those that prefer to work with one-stop shopping of an IT Supermarket, with companies like IBM, HP and Dell who offer a complete set of servers, storage, switches, software and services, what we call "The Five S's".
Those that perfer shopping for components at individual specialty shops, like butchers, bakers, and candlestick makers, hoping that this singular focus means the products are best-of-breed in the market. Companies like HDS for disk, Quantum for tape, and Symantec for software come to mind.
My how the IT landscape for vendors has evolved in just the past five years! Cisco starts to sell servers, and enters a "mini-mall" alliance with EMC and VMware to offer vBlock integrated stack of server, storage and switches with VMware as the software hypervisor. For those not familiar with the concept of mini-malls, these are typically rows of specialty shops. A shopper can park their car once, and do all their shopping from the various shops in the mini-mall. Not quite "one-stop" shopping of a supermarket, but tries to address the same need.
("Who do I call when it breaks?" -- The three companies formed a puppet company, the Virtual Computing Environment company, or VCE, to help answer that question!)
Among the many things IBM has learned in its 100+ years of experience, it is that clients want choices. Cisco figured this out also, and partnered with NetApp to offer the aptly-named FlexPod reference architecture. In effect, Cisco has two boyfriends, when she is with EMC, it is called a Vblock, and when she is with NetApp, it is called a FlexPod. I was lucky enough to find this graphic to help explain the three-way love triangle.
Did this move put a strain on the relationship between Cisco and EMC? Last month, EMC announced VSPEX, a FlexPod-like approach that provides a choice of servers, and some leeway for resellers to make choices to fit client needs better. Why limit yourself to Cisco servers, when IBM and HP servers are better? Is this an admission that Vblock has failed, and that VSPEX is the new way of doing things? No, I suspect it is just EMC's way to strike back at both Cisco and NetApp in what many are calling the "Stack Wars". (See [The Stack Wars have Begun!], [What is the Enterprise Stack?], or [The Fight for the Fully Virtualized Data Center] for more on this.)
(FTC Disclosure: I am both an employee and shareholder of IBM, so the U.S. Federal Trade Commission may consider this post a paid, celebrity endorsement of the IBM PureFlex system. IBM has working relationships with Cisco, NetApp, and Quantum. I was not paid to mention, nor have I any financial interest in, any of the other companies mentioned in this blog post. )
Last month, IBM announced its new PureSystems family, ushering in a [new era in computing]. I invite you all to check out the many "Paterns of Expertise" available at the [IBM PureSystems Centre]. This is like an "app store" for the data center, and what I feel truly differentiates IBM's offerings from the rest.
The trend is obvious. Clients who previously purchased from specialty shops are discovering the cost and complexity of building workable systems from piece-parts from separate vendors has proven expensive and challenging. IBM PureFlex™ systems eliminate a lot of the complexity and effort, but still offer plenty of flexibility, choice of server processor types, choice of server and storage hypervisors, and choice of various operating systems.
The old adage applies "You can't please everyone. Presidents can't. Prostitutes can't. Nobody can." I am reminded of that as I fielded a variety of interesting comments and emails about, of all things, my choice of order of things in recent blog posts.
Certainly, there are times when the order of things matters greatly. In my now-infamous blog post [Sock Sock Shoe Shoe], I use a scene from a popular 1970's television show to explain why compression should be done before encryption.
In my case, I put things in the order that I felt made sense to me, but not everyone agrees. Here are three recent examples:
In my blog post [Two IBMers Earn Their Retirement], I congratulated two of my colleagues on their retirement. Since their retirement happened on the same day, I decided to mention Mark Doumas first, and Jim Rymarczyk second.
However, one of my readers, who I will assume is a member of the unofficial "Jim Rymarczyk fan club", felt that I should have listed Jim first, as Jim served IBM for 44 years, and Mark only 32 years.
Really? I realize that movie stars insist on having their name listed first on the poster, but neither of these guys would be confused with George Clooney!
So, to Jim and all his fans out there, I assure you I did not mean this as a slight in any way. I have updated the post to indicate that the ordering was strictly alphabetical by last name.
In my blog post [IBM Announcements for February 2012], I presented tape products first, and disk second. Normally, I cover them alphabetically, disk first, then tape. However, I was asked to promote tape this year in preparation for the upcoming 60th anniversary of tape, so I mentioned the tape announcements first, and the disk second.
The feedback from the XIV community was swift. Many felt that I [buried the lede] in not mentioning the XIV Gen3 SSD caching first.
(Note: For those not familiar with the phrase used in journalism, 'burying the lede' refers to the failure to mention the most interesting or attention grabbing elements of a story in the first paragraph. In American news journalism, it is spelled "lede" and elsewhere it is spelled "lead". Major US dictionaries apparently accept both spellings for this phrase.)
Technically, my lead paragraph stated clearly that: "This week we have announcements for both disk and tape, but since 2012 is the 60th Diamond Anniversary for tape, I will start with tape systems first."
So, while I don't claim to be a journalist by any means, I think the lead paragraph accurately reflected that I would talk about both disk and tape products in the rest of the blog post, and if a reader didn't care to learn more about tape could bypass those sections and go directly to the section on disk instead.
I have had my head handed to me on a platter so many times here at IBM that I am considering installing a zipper around my neck. My friends in XIV land insisted that I write a secondary post about XIV Gen3 SSD caching that had no mention of tape whatsoever. One suggestion was to compare and contrast XIV Gen3 SSD caching with EMC's announcement for VFCache. The result was my blog post [IBM XIV Gen3 SSD Caching versus EMC VFCache].
What could go wrong with an apples-to-orange comparison of two different storage products sprinkled with a small amount of FUD against a major competitor?
I had two complaints on this one. First, is the order of products in my side-by-side table of comparisons. I put EMC VFCache in the left column, and IBM XIV Gen3 SSD caching in the right. I meant nothing sinister by this. Alphabetically, EMC comes before IBM, and VFCache comes before XIV. Chronologically, EMC's announcement came out on Monday, and IBM's announcement came out the following day.
(Note: The term [sinster] comes from the Latin word sinistra meaning "left hand". In the Middle Ages it was believed that when a person was writing with their left hand they were possessed by the Devil. Left-handed people were therefore considered to be evil. My poor mother was born left-handed and was forced as a child to write with her right hand to be accepted by society.)
Apparently, an unwritten convention within IBM is that comparison tables always have the newer product on the left column, followed by one or more older products to the right, or the IBM product on the left column, with one or more competitive alternatives to the right.
The second complaint came from a reader in the comments section: "... I think [what] you're doing is trying to ride EMC's release for your own marketing, did you really need to? XIV is an excellent array; adding SSD Cache to the Gen3 takes it further, Moshe would be fuming (which I think is a good thing), can you just stick to that and not ride someone else's wave?"
Both announcements relate to reducing latency of read IOPS through the use of Solid State Drives. That both companies would announce these were no surprise to any employee at either company, as both IBM and EMC have been talking about their intent to do so last year. IBM's announcement of XIV SSD Gen3 caching was certainly not in response to EMC's VFCache announcement, and I doubt EMC rushed out their VFCache announcement the day before as a pre-emptive strike against IBM's announcement of the XIV Gen3 SSD Caching feature.
(Note: I don't know her personally, but she has thousands of followers!)
There you have it. I will gladly fix false or misleading information, but I am not going to re-arrange the order of things just to please some readers, only to have other readers complain that they liked it better in the original order. As always, feel free to comment on any of this in the section below.