Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Am I dreaming? On his Storagezilla blog, fellow blogger Mark Twomey (EMC) brags about EMC's standard benchmark results, in his post titled [Love Life. Love CIFS.]. Here is my take:
A Full 180 degree reversal
For the past several years, EMC bloggers have argued, both in comments on this blog, and on their own blogs, that standard benchmarks are useless and should not be used to influence purchase decisions. While we all agree that "your mileage may vary", I find standard benchmarks are useful as part of an overall approach in comparing and selecting which vendors to work with, and which architectures or solution approaches to adopt, and which products or services to deploy. I am glad to see that EMC has finally joined the rest of the planet on this. I find it funny this reversal sounds a lot like their reversal from "Tape is Dead" to "What? We never said tape was dead!"
Impressive CIFS Results
The Standard Performance Evaluation Corporation (SPEC) has developed a series of NFS benchmarks, the latest, [SPECsfs2008] added support for CIFS. So, on the CIFS side, EMC's benchmarks compare favorably against previous CIFS tests from other vendors.
On the NFS side, however, EMC is still behind Avere, BlueArc, Exanet, and IBM/NetApp. For example, EMC's combination of Celerra gateways in front of V-Max disk systems resulted in 110,621 OPS with overall response time of 2.32 milliseconds. By comparison, the IBM N series N7900 (tested by NetApp under their own brand, FAS6080) was able to do 120,011 OPS with 1.95 msec response time.
Even though Sun invented the NFS protocol in the early 1980s, they take an EMC-like approach against standard benchmarks to measure it. Last year, fellow blogger Bryan Cantrill (Sun) gives his [Eulogy for a Benchmark]. I was going to make points about this, but fellow blogger Mike Eisler (NetApp) [already took care of it]. We can all learn from this. Companies that don't believe in standard benchmarks can either reverse course (as EMC has done), or continue their downhill decline until they are acquired by someone else.
(My condolences to those at Sun getting laid off. Those of you who hire on with IBM can get re-united with your former StorageTek buddies! Back then, StorageTek people left Sun in droves, knowing that Sun didn't understand the mainframe tape marketplace that StorageTek focused on. Likewise, many question how well Oracle will understand Sun's hardware business in servers and storage.)
What's in a Protocol?
Both CIFS and NFS have been around for decades, and comparisons can sometimes sound like religious debates. Traditionally, CIFS was used to share files between Windows systems, and NFS for Linux and UNIX platforms. However, Windows can also handle NFS, while Linux and UNIX systems can use CIFS. If you are using a recent level of VMware, you can use either NFS or CIFS as an alternative to Fibre Channel SAN to store your external disk VMDK files.
The Bigger Picture
There is a significant shift going on from traditional database repositories to unstructured file content. Today, as much as [80 percent of data is unstructured]. Shipments this year are expected to grow 60 percent for file-based storage, and only 15 percent for block-based storage. With the focus on private and public clouds, NAS solutions will be the battleground for 2010.
So, I am glad to see EMC starting to cite standard benchmarks. Hopefully, SPC-1 and SPC-2 benchmarks are forthcoming?
The technology industry is full of trade-offs. Take for example solar cells that convert sunlight to electricity. Every hour, more energy hits the Earth in the form of sunlight than the entire planet consumes in an entire year. The general trade-off is between energy conversion efficiency versus abundance of materials:
Get 9-11 percent efficiency using rare materials like indium (In), gallium (Ga) or cadmium (Cd).
Get only 6.7 percent efficiency using abundant materials like copper (Cu), tin (Sn), zinc (Zn), sulfur (S), and selenium (Se)
A second trade-off is exemplified by EMC's recent GeoProtect announcement. This appears similar to the geographic dispersal method introduced by a company called [CleverSafe]. The trade-off is between the amount of space to store one or more copies of data and the protection of data in the event of disaster. Here's an excerpt from fellow blogger Chuck Hollis (EMC) titled ["Cloud Storage Evolves"]:
"Imagine a average-sized Atmos network of 9 nodes, all in different time zones around the world. And imagine that we were using, say, a 6+3 protection scheme.
The implication is clear: any 3 nodes could be completely lost: failed, destroyed, seized by the government, etc.
-- and the information could be completely recovered from the surviving nodes."
For organizations worried about their information falling into the wrong hands (whether criminal or government sponsored!), any subset of the nodes would yield nothing of value -- not only would the information be presumably encrypted, but only a few slices of a far bigger picture would be lost.
Seized by the government?falling into the wrong hands? Is EMC positioning ATMOS as "Storage for Terrorists"? I can certainly appreciate the value of being able to protect 6PB of data with only 9PB of storage capacity, instead of keeping two copies of 6PB each, the trade-off means that you will be accessing the majority of your data across your intranet, which could impact performance. But, if you are in an illicit or illegal business that could have a third of your facilities "seized by the government", then perhaps you shouldn't house your data centers there in the first place. Having two copies of 6PB each, in two "friendly nations", might make more sense.
(In reality, companies often keep way more than just two copies of data. It is not unheard of for companies to keep three to five copies scattered across two or three locations. Facebook keeps SIX copies of photographs you upload to their website.)
ChuckH argues that the governments that seize the three nodes won't have a complete copy of the data. However, merely having pieces of data is enough for governments to capture terrorists. Even if the striping is done at the smallest 512-byte block level, those 512 bytes of data might contain names, phone numbers, email addresses, credit cards or social security numbers. Hackers and computer forensics professionals take advantage of this.
You might ask yourself, "Why not just encrypt the data instead?" That brings me to the third trade-off, protection versus application performance. Over the past 30 years, companies had a choice, they could encrypt and decrypt the data as needed, using server CPU cycles, but this would slow down application processing. Every time you wanted to read or update a database record, more cycles would be consumed. This forced companies to be very selective on what data they encrypted, which columns or fields within a database, which email attachments, and other documents or spreadsheets.
An initial attempt to address this was to introduce an outboard appliance between the server and the storage device. For example, the server would write to the appliance with data in the clear, the appliance would encrypt the data, and pass it along to the tape drive. When retrieving data, the appliance would read the encrypted data from tape, decrypt it, and pass the data in the clear back to the server. However, this had the unintended consequences of using 2x to 3x more tape cartridges. Why? Because the encrypted data does not compress well, so tape drives with built-in compression capabilities would not be able to shrink down the data onto fewer tapes.
(I covered the importance of compressing data before encryption in my previous blog post
[Sock Sock Shoe Shoe].)
Like the trade-off between energy efficiency and abundant materials, IBM eliminated the trade-off by offering compression and encryption on the tape drive itself. This is standard 256-bit AES encryption implemented on a chip, able to process the data as it arrives at near line speed. So now, instead of having to choose between protecting your data or running your applications with acceptable performance, you can now do both, encrypt all of your data without having to be selective. This approach has been extended over to disk drives, so that disk systems like the IBM System Storage DS8000 and DS5000 can support full-disk-encryption [FDE] drives.
In addition to dominating the gaming world, producing chips for the Nintendo Wii, Sony PlayStation, and Microsoft Xbox 360, IBM also dominates the world of Linux and UNIX servers. Today, IBM announced its new POWER7 processor, and a line of servers that use this technology. Here is a quick [3-minute video] about the POWER7.
While others might be [Dancing on Sun's grave], IBM instead is focused on providing value to the marketplace. Here is another quick [2-minute video] about why thousands of companies have switched from Sun, HP and Dell over to IBM.
Continuing on the [IBM Storage Launch of February 9], John Sing has offered to write the following guest post about the [announcement] of IBM Scale Out Network Attached Storage [IBM SONAS]. John and I have known each other for a while, traveled the world to work with clients and speak at conferences. He is an Executive IT Consultant on the SONAS team.
Guest Post written by John Sing, IBM San Jose, California
What is IBM SONAS? It’s many things, so let’s start with this list:
It’s IBM’s delivery of a productized, pre-packaged Scale Out NAS global virtual file server, delivered in a easy-to-use appliance
IBM’s solution for large enterprise file-based storage requirements, where massive scale in capacity and extreme performance is required, especially for today’s modern analytics-based Competitive Advantage IT applications
Scales to many petabytes of usable storage and billions of files in a single global namespace
Provides integrated central management, central deployment of petabyte levels of storage
Modular commercial-off-the-shelf [COTS] building blocks. I/O, storage, network capacity scale independently of each other. Up to 30 interface nodes and 60 storage nodes, in an IBM General Parallel File System [GPFS]-based cluster. Each 10Gb CEE interface node port is capable of streaming at 900 MB/sec
Files are written in block-sized chunks, striped over as many multiple disk drives in parallel – aggregating throughput on a massive scale (both read and write), as well as providing auto-tuning, auto-balancing
Functionality delivered via one program product, IBM SONAS Software, which provides all of above functions, along with clustered CIFS, NFS v2/v3 with session auto-failover, FTP, high availability, and more
IBM SONAS makes automated tiered storage achievable and realistic at petabyte levels:
Integrated high performance parallel scan engine capable of identifying files at over 10 million files per minute per node
Integrated parallel data movement engine to physically relocate the data within tiered storage
And we’re just scratching the surface. IBM has plans to deploy additional protocols, storage hardware options, and software features.
However, the real question of interest should be, “who really needs that much storage capacity and throughput horsepower?”
The answer may surprise you. IMHO, the answer is: almost any modern enterprise that intends to stay competitive. Hmmm…… Consider this: the reason that IT exists today is no longer to simply save cost (that may have been true 10 years ago). Everyone is reducing cost… but how much competitive advantage is purchased through “let’s cut our IT budget by 10% this year”?
Notice that in today’s world, there are (many) bright people out there, changing our world every day through New Intelligence Competitive Advantage analytics-based IT applications such as real time GPS traffic data, real time energy monitoring and redirection, real time video feed with analytics, text analytics, entity analytics, real time stream computing, image recognition applications, HDTV video on demand, etc. Think of how GPS industry, cell phone / Twitter / Facebook, iPhone and iPad applications, as examples, are creating whole new industries and markets almost overnight.
Then start asking yourself, “What's behind these Competitive Advantage IT applications – as they are the ones that are driving all my storage growth? Why do they need so much storage? What do those applications mean for my storage requirements?”
To be “real-time”, long-held IT paradigms are being broken every day. Things like “data proximity”: we can no longer can extract terabytes of data from production databases and load them to a data warehouse – where’s the “real-time” in that? Instead, today’s modern analytics-based applications demand:
Multiple processes and servers (sometimes numbering in the 100s) simultaneously ….
Running against hundreds of terabytes of data of live production data, streaming in from expanding number of smarter sensors, input devices, users
Producing digital image-intensive results that must be programatically sent to an ever increasing number of mobile devices in geographically dispersed storage
Requiring parallel performance levels, that used to be the domain only of High Performance Computing (HPC)
This is a major paradigm shift in storage – and that is the solution and storage capabilities that IBM SONAS is designed to address. And of course, you should be able to save significant cost through the SONAS global virtual file server consolidation and virtualization as well.
Certainly, this topic warrants more discussion. If you found it interesting, contact me, your local IBM Business Partner or IBM Storage rep to discuss Competitive Advantage IT applications and SONAS further.
Well, it's Tuesday again, and that means IBM announcements! Today we had a major launch, with so many products, services and offerings
that I can't fit them all into a single post, so I will split them up into several posts to give the attention they deserve. So, in this
post, I will focus on just the networking gear.
IBM Converged Switch B32
The "Converged" part of this switch refers to Converged Enhanced Ethernet (CEE), which is just a lossless Ethernet that meets certain standards to allow Fibre Channel over Ethernet (FCoE) that are still being discussed between Brocade and Cisco. Thankfully, IBM demanded both Brocade and Cisco stick to open agreed-upon standards, and the rest of the world gets to benefit from IBM's leadership in keeping everything as open and non-proprietary as possible.
The B32 ("B" because it was made by Brocade) starts with 24 10Gb Converged Enhanced Ethernet (CEE) ports, and then you can add eight Fibre Channel ports, for a total of 32 ports, hence the name B32. These are designed to be Top-of-Rack (TOR) switches. Basically, instead of having expensive optical cables for Ethernet and/or Fibre Channel out of each server, you have cheap twinax copper cables connecting the server's Converged Network Adapters (CNA) to this TOR switch, and then you can have the 10Gb Ethernet go to your regular Ethernet LAN, and your 8Gbps FC traffic go to your regular FC SAN. In other words, the CNA serves both the role of an Ethernet Network Interface Card (NIC) as well as a Fibre Channel Host Bus Adapter (HBA) card.
(You might see 8Gbps Fibre Channel represented as 8/4/2 or 2/4/8, this is just to remind you that these 8Gb FC ports can auto-negotiate down to 2Gbps and 4Gbps legacy hardware, but not 1Gbps. If you are still using 1Gbps FC, you need 4Gpbs SFP transceivers instead, shown often as 1/2/4 or 4/2/1.)
New SSN-16 module for Cisco directors and switches
When I present SAN gear to sales reps, I often get the question, "What is the difference between a switch and a director?" My quick and simple answer is that switches have fixed ports, but directors have slots that you can slide in different blades or expansion modules. The Cisco MDS9500 series are directors with slots, the three models provide a hint to their capacity. The last two digits represent the number of total slots, but the first two slots are already taken. In other words, model 9513 has 11 slots, model 9509 has seven slots, and model 9506 has four slots. You can have a 48-port blade in a slot, so in theory, you can have a maximum of 528 ports on the biggest model 9513.
However, if you want FCIP for disaster recovery, or I/O Acceleration (IOA) for remote e-vaulting tape libraries, you need a special 18/4 blade. This has 18 FC ports, four 1GbE ports and a special service processor that speaks FCIP or IOA. If you wanted two service processors for FCIP and two for IOA, you would need four of these blades, and that takes up slots that could have been used for 48-port blades instead. The solution? The new SSN-16 has sixteen 1GbE ports and four service processors, so with one slot, you can handle the FCIP and IOA processing that you previously used four cards, giving you three slots back to use with higher port-density cards.
Even better, you can put this new SSN-16 in the Cisco 9222i. The model 9222i is a "hybrid" switch with 22 fixed ports (18 FC ports, four fixed 1GbE ports, and a service processor, so basically the fixed port version of the 18/4 blade above), but it also has one slot! That one slot can be used for the SSN-16 to give you added FCIP or IOA capability.
For our mainframe clients, the FICON package includes four 24-port FICON blades and 96 SFP 4Gbps transceivers to fully populate them. Here is the IBM [Press Release].
Cisco Nexus 5000 series for IBM System Storage
The Cisco Nexus 5000 series is Cisco's entry into the Converged Enhanced Ethernet world, although Cisco sometimes refers to this as Data Center Ethernet (DCE), IBM will continue to use CEE when referring to either Brocade and Cisco gear. These are also Top-of-Rack aggregators that support CNA connections over cheaper twinax copper wires. Model 5010 has 10 ports that can be configured for either 1GbE or 10Gb CEE, 10 ports that are 10Gb CEE, and a slot for an expansion module. The Model 5020 has basically twice as much of everything, including two slots instead of one. Since 10Gb Ethernet does not auto-negotiate down to 1GbE, half the ports can be configured to run 1GbE instead. Frankly, that can be seen as wasting your precious Nexus ports with 1GbE connections, so you might find a 1GbE-to-10GbE aggregator that combines a dozen or more 1GbE to a few 10GbE links instead.
Today's announcement is that in addition to 10GbE and 4Gbps FC expansion modules, there is now an expansion module that supports 8Gbps Fibre Channel. Here is the IBM [Press Release].
Whether you choose Brocade or Cisco, nearly all of IBM System Storage disk and tape products can work today with Converged Enhanced Ethernet environments, either directly using iSCSI, NFS or CIFS, or using the FCoE methodology.
As you can see, it took me a whole post just to cover just our networking gear announcements, and I haven't even covered our disk, tape and cloud storage offerings. I'll get to these in later posts.