Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th year anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is a an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
To avoid overwhelming people with too many features and functions, IBM decided to keep things simple for the first release. Let's take a look:
The base frame (2231-IA3) supports a single collection, from as small as 3.6 TB to as large as 72 TB of usable capacity. You can attach one expansion frame (2231-IS3) that holds two additional collections, 63 TB usable capacity for each collection. Disk capacity is increased in eight-drive (half-drawer) increments of 3.6 TB usable capacity each. A full configured IA system (304 drives, 1 TB raw capacity per drive) provides 198 TB usable capacity.
Of course, that is just the disk side of the solution. Like its predecessor, the IBM System Storage DR550, the IA v1.1 can also attach to external tape storage to store and protect petabytes (PB) of archive data. Hundreds of different IBM and non-IBM tape drives and libraries are supported, so that this can be easily incorporated into existing tape environments.
Each collection can be configured to one of three protection levels: basic, intermediate, and maximum.
Basic protection provides RAID protection of data using standard NFS group/user controls for access to read and write data. This can be useful for databases that need full read/write access. Users can assign expiration dates, but in Basic mode they can delete the data before the expiration date is reached.
Intermediate adds Non-Erasable Non-Rewriteable (NENR) protection against user actions to delete or modify protected data. However, similar to IBM N series "Enterprise SnapLock", intermediate mode allows authorized storage admins to clean up the mess, increase or reduce retention periods, and delete data if it is inadvertently protected. I often refer to this as "training wheels" for those who are trying to work out their workflow procedures before moving on to Maximum mode.
Maximum provides the strictest NENR protection for business, legal, government and industry requirements, comparable to IBM N series "Compliance SnapLock" mode, for data that traditionally were written to WORM optical media. Data cannot be deleted until the retention period ends. Retention periods of individual files and objects can be increased, but not decreased. Retention Hold (often referred to as Litigation Hold) can be used to keep a set of related data even longer in specific circumstances.
You can decide to upgrade your protection after data is written to a collection. Basic mode can be upgraded to Intermediate mode, for example, or Intermediate mode upgraded to Maximum.
To keep things simple, v1.1 of the Information Archive supports only two industry standard protocols: NFS and SSAM API. The NFS option allows standard file commands to read/write data. The System Storage Archive Manager (SSAM) API allows smooth transition from earlier IBM System Storage DR550 deployments. With this announcement, IBM will [discontinue selling the DR550 DR2 models].
As we say here at IBM, "Today is the best day to stop using EMC Centera." For more information, see the
IBM [Announcement Letter].
Am I dreaming? On his Storagezilla blog, fellow blogger Mark Twomey (EMC) brags about EMC's standard benchmark results, in his post titled [Love Life. Love CIFS.]. Here is my take:
A Full 180 degree reversal
For the past several years, EMC bloggers have argued, both in comments on this blog, and on their own blogs, that standard benchmarks are useless and should not be used to influence purchase decisions. While we all agree that "your mileage may vary", I find standard benchmarks are useful as part of an overall approach in comparing and selecting which vendors to work with, and which architectures or solution approaches to adopt, and which products or services to deploy. I am glad to see that EMC has finally joined the rest of the planet on this. I find it funny this reversal sounds a lot like their reversal from "Tape is Dead" to "What? We never said tape was dead!"
Impressive CIFS Results
The Standard Performance Evaluation Corporation (SPEC) has developed a series of NFS benchmarks, the latest, [SPECsfs2008] added support for CIFS. So, on the CIFS side, EMC's benchmarks compare favorably against previous CIFS tests from other vendors.
On the NFS side, however, EMC is still behind Avere, BlueArc, Exanet, and IBM/NetApp. For example, EMC's combination of Celerra gateways in front of V-Max disk systems resulted in 110,621 OPS with overall response time of 2.32 milliseconds. By comparison, the IBM N series N7900 (tested by NetApp under their own brand, FAS6080) was able to do 120,011 OPS with 1.95 msec response time.
Even though Sun invented the NFS protocol in the early 1980s, they take an EMC-like approach against standard benchmarks to measure it. Last year, fellow blogger Bryan Cantrill (Sun) gives his [Eulogy for a Benchmark]. I was going to make points about this, but fellow blogger Mike Eisler (NetApp) [already took care of it]. We can all learn from this. Companies that don't believe in standard benchmarks can either reverse course (as EMC has done), or continue their downhill decline until they are acquired by someone else.
(My condolences to those at Sun getting laid off. Those of you who hire on with IBM can get re-united with your former StorageTek buddies! Back then, StorageTek people left Sun in droves, knowing that Sun didn't understand the mainframe tape marketplace that StorageTek focused on. Likewise, many question how well Oracle will understand Sun's hardware business in servers and storage.)
What's in a Protocol?
Both CIFS and NFS have been around for decades, and comparisons can sometimes sound like religious debates. Traditionally, CIFS was used to share files between Windows systems, and NFS for Linux and UNIX platforms. However, Windows can also handle NFS, while Linux and UNIX systems can use CIFS. If you are using a recent level of VMware, you can use either NFS or CIFS as an alternative to Fibre Channel SAN to store your external disk VMDK files.
The Bigger Picture
There is a significant shift going on from traditional database repositories to unstructured file content. Today, as much as [80 percent of data is unstructured]. Shipments this year are expected to grow 60 percent for file-based storage, and only 15 percent for block-based storage. With the focus on private and public clouds, NAS solutions will be the battleground for 2010.
So, I am glad to see EMC starting to cite standard benchmarks. Hopefully, SPC-1 and SPC-2 benchmarks are forthcoming?
Well, it's Tuesday again, and that means IBM announcements! Today we had a major launch, with so many products, services and offerings
that I can't fit them all into a single post, so I will split them up into several posts to give the attention they deserve. So, in this
post, I will focus on just the networking gear.
IBM Converged Switch B32
The "Converged" part of this switch refers to Converged Enhanced Ethernet (CEE), which is just a lossless Ethernet that meets certain standards to allow Fibre Channel over Ethernet (FCoE) that are still being discussed between Brocade and Cisco. Thankfully, IBM demanded both Brocade and Cisco stick to open agreed-upon standards, and the rest of the world gets to benefit from IBM's leadership in keeping everything as open and non-proprietary as possible.
The B32 ("B" because it was made by Brocade) starts with 24 10Gb Converged Enhanced Ethernet (CEE) ports, and then you can add eight Fibre Channel ports, for a total of 32 ports, hence the name B32. These are designed to be Top-of-Rack (TOR) switches. Basically, instead of having expensive optical cables for Ethernet and/or Fibre Channel out of each server, you have cheap twinax copper cables connecting the server's Converged Network Adapters (CNA) to this TOR switch, and then you can have the 10Gb Ethernet go to your regular Ethernet LAN, and your 8Gbps FC traffic go to your regular FC SAN. In other words, the CNA serves both the role of an Ethernet Network Interface Card (NIC) as well as a Fibre Channel Host Bus Adapter (HBA) card.
(You might see 8Gbps Fibre Channel represented as 8/4/2 or 2/4/8, this is just to remind you that these 8Gb FC ports can auto-negotiate down to 2Gbps and 4Gbps legacy hardware, but not 1Gbps. If you are still using 1Gbps FC, you need 4Gpbs SFP transceivers instead, shown often as 1/2/4 or 4/2/1.)
New SSN-16 module for Cisco directors and switches
When I present SAN gear to sales reps, I often get the question, "What is the difference between a switch and a director?" My quick and simple answer is that switches have fixed ports, but directors have slots that you can slide in different blades or expansion modules. The Cisco MDS9500 series are directors with slots, the three models provide a hint to their capacity. The last two digits represent the number of total slots, but the first two slots are already taken. In other words, model 9513 has 11 slots, model 9509 has seven slots, and model 9506 has four slots. You can have a 48-port blade in a slot, so in theory, you can have a maximum of 528 ports on the biggest model 9513.
However, if you want FCIP for disaster recovery, or I/O Acceleration (IOA) for remote e-vaulting tape libraries, you need a special 18/4 blade. This has 18 FC ports, four 1GbE ports and a special service processor that speaks FCIP or IOA. If you wanted two service processors for FCIP and two for IOA, you would need four of these blades, and that takes up slots that could have been used for 48-port blades instead. The solution? The new SSN-16 has sixteen 1GbE ports and four service processors, so with one slot, you can handle the FCIP and IOA processing that you previously used four cards, giving you three slots back to use with higher port-density cards.
Even better, you can put this new SSN-16 in the Cisco 9222i. The model 9222i is a "hybrid" switch with 22 fixed ports (18 FC ports, four fixed 1GbE ports, and a service processor, so basically the fixed port version of the 18/4 blade above), but it also has one slot! That one slot can be used for the SSN-16 to give you added FCIP or IOA capability.
For our mainframe clients, the FICON package includes four 24-port FICON blades and 96 SFP 4Gbps transceivers to fully populate them. Here is the IBM [Press Release].
Cisco Nexus 5000 series for IBM System Storage
The Cisco Nexus 5000 series is Cisco's entry into the Converged Enhanced Ethernet world, although Cisco sometimes refers to this as Data Center Ethernet (DCE), IBM will continue to use CEE when referring to either Brocade and Cisco gear. These are also Top-of-Rack aggregators that support CNA connections over cheaper twinax copper wires. Model 5010 has 10 ports that can be configured for either 1GbE or 10Gb CEE, 10 ports that are 10Gb CEE, and a slot for an expansion module. The Model 5020 has basically twice as much of everything, including two slots instead of one. Since 10Gb Ethernet does not auto-negotiate down to 1GbE, half the ports can be configured to run 1GbE instead. Frankly, that can be seen as wasting your precious Nexus ports with 1GbE connections, so you might find a 1GbE-to-10GbE aggregator that combines a dozen or more 1GbE to a few 10GbE links instead.
Today's announcement is that in addition to 10GbE and 4Gbps FC expansion modules, there is now an expansion module that supports 8Gbps Fibre Channel. Here is the IBM [Press Release].
Whether you choose Brocade or Cisco, nearly all of IBM System Storage disk and tape products can work today with Converged Enhanced Ethernet environments, either directly using iSCSI, NFS or CIFS, or using the FCoE methodology.
As you can see, it took me a whole post just to cover just our networking gear announcements, and I haven't even covered our disk, tape and cloud storage offerings. I'll get to these in later posts.
Continuing on the [IBM Storage Launch of February 9], John Sing has offered to write the following guest post about the [announcement] of IBM Scale Out Network Attached Storage [IBM SONAS]. John and I have known each other for a while, traveled the world to work with clients and speak at conferences. He is an Executive IT Consultant on the SONAS team.
Guest Post written by John Sing, IBM San Jose, California
What is IBM SONAS? It’s many things, so let’s start with this list:
It’s IBM’s delivery of a productized, pre-packaged Scale Out NAS global virtual file server, delivered in a easy-to-use appliance
IBM’s solution for large enterprise file-based storage requirements, where massive scale in capacity and extreme performance is required, especially for today’s modern analytics-based Competitive Advantage IT applications
Scales to many petabytes of usable storage and billions of files in a single global namespace
Provides integrated central management, central deployment of petabyte levels of storage
Modular commercial-off-the-shelf [COTS] building blocks. I/O, storage, network capacity scale independently of each other. Up to 30 interface nodes and 60 storage nodes, in an IBM General Parallel File System [GPFS]-based cluster. Each 10Gb CEE interface node port is capable of streaming at 900 MB/sec
Files are written in block-sized chunks, striped over as many multiple disk drives in parallel – aggregating throughput on a massive scale (both read and write), as well as providing auto-tuning, auto-balancing
Functionality delivered via one program product, IBM SONAS Software, which provides all of above functions, along with clustered CIFS, NFS v2/v3 with session auto-failover, FTP, high availability, and more
IBM SONAS makes automated tiered storage achievable and realistic at petabyte levels:
Integrated high performance parallel scan engine capable of identifying files at over 10 million files per minute per node
Integrated parallel data movement engine to physically relocate the data within tiered storage
And we’re just scratching the surface. IBM has plans to deploy additional protocols, storage hardware options, and software features.
However, the real question of interest should be, “who really needs that much storage capacity and throughput horsepower?”
The answer may surprise you. IMHO, the answer is: almost any modern enterprise that intends to stay competitive. Hmmm…… Consider this: the reason that IT exists today is no longer to simply save cost (that may have been true 10 years ago). Everyone is reducing cost… but how much competitive advantage is purchased through “let’s cut our IT budget by 10% this year”?
Notice that in today’s world, there are (many) bright people out there, changing our world every day through New Intelligence Competitive Advantage analytics-based IT applications such as real time GPS traffic data, real time energy monitoring and redirection, real time video feed with analytics, text analytics, entity analytics, real time stream computing, image recognition applications, HDTV video on demand, etc. Think of how GPS industry, cell phone / Twitter / Facebook, iPhone and iPad applications, as examples, are creating whole new industries and markets almost overnight.
Then start asking yourself, “What's behind these Competitive Advantage IT applications – as they are the ones that are driving all my storage growth? Why do they need so much storage? What do those applications mean for my storage requirements?”
To be “real-time”, long-held IT paradigms are being broken every day. Things like “data proximity”: we can no longer can extract terabytes of data from production databases and load them to a data warehouse – where’s the “real-time” in that? Instead, today’s modern analytics-based applications demand:
Multiple processes and servers (sometimes numbering in the 100s) simultaneously ….
Running against hundreds of terabytes of data of live production data, streaming in from expanding number of smarter sensors, input devices, users
Producing digital image-intensive results that must be programatically sent to an ever increasing number of mobile devices in geographically dispersed storage
Requiring parallel performance levels, that used to be the domain only of High Performance Computing (HPC)
This is a major paradigm shift in storage – and that is the solution and storage capabilities that IBM SONAS is designed to address. And of course, you should be able to save significant cost through the SONAS global virtual file server consolidation and virtualization as well.
Certainly, this topic warrants more discussion. If you found it interesting, contact me, your local IBM Business Partner or IBM Storage rep to discuss Competitive Advantage IT applications and SONAS further.
Wrapping up my coverage of the annual [2010 System Storage Technical University], I attended what might be perhaps the best session of the conference. Jim Nolting, IBM Semiconductor Manufacturing Engineer, presented the new IBM zEnterprise mainframe, "A New Dimension in Computing", under the Federal track.
The zEnterprises debunks the "one processor fits all" myth. For some I/O-intensive workloads, the mainframe continues to be the most cost-effective platform. However, there are other workloads where a memory-rich Intel or AMD x86 instance might be the best fit, and yet other workloads where the high number of parallel threads of reduced instruction set computing [RISC] such as IBM's POWER7 processor is more cost-effective. The IBM zEnterprise combines all three processor types into a single system, so that you can now run each workload on the processor that is optimized for that workload.
IBM zEnterprise z196 Central Processing Complex (CPC)
Let's start with the new mainframe z196 central processing complex (CPC). Many thought this would be called the z11, but that didn't happen. Basically, the z196 machine has a maximum 96 cores versus z10's 64 core maximum, and each core runs 5.2GHz instead of z10's cores running at 4.7GHz. It is available in air-cooled and water-cooled models. The primary operating system that runs on this is called "z/OS", which when used with its integrated UNIX System Services subsystem, is fully UNIX-certified. The z196 server can also run z/VM, z/VSE, z/TPF and Linux on z, which is just Linux recompiled for the z/Architecture chip set. In my June 2008 post [Yes, Jon, there is a mainframe that can help replace 1500 servers], I mentioned the z10 mainframe had a top speed of nearly 30,000 MIPS (Million Instructions per Second). The new z196 machine can do 50,000 MIPS, a 60 percent increase!
The z196 runs a hypervisor called PR/SM that allows the box to be divided into dozens of logical partitions (LPAR), and the z/VM operating system can also act as a hypervisor running hundreds or thousands of guest OS images. Each core can be assigned a specialty engine "personality": GP for general processor, IFL for z/VM and Linux, zAAP for Java and XML processing, and zIIP for database, communications and remote disk mirroring. Like the z9 and z10, the z196 can attach to external disk and tape storage via ESCON, FICON or FCP protocols, and through NFS via 1GbE and 10GbE Ethernet.
IBM zEnterprise BladeCenter Extension (zBX)
There is a new frame called the zBX that basically holds two IBM BladeCenter chassis, each capable of 14 blades, so total of 28 blades per zBX frame. For now, only select blade servers are supported inside, but IBM plans to expand this to include more as testing continues. The POWER-based blades can run native AIX, IBM's other UNIX operating system, and the x86-based blades can run Linux-x86 workloads, for example. Each of these blade servers can run a single OS natively, or run a hypervisor to have multiple guest OS images. IBM plans to look into running other POWER and x86-based operating systems in the future.
If you are already familiar with IBM's BladeCenter, then you can skip this paragraph. Basically, you have a chassis that holds 14 blades connected to a "mid-plane". On the back of the chassis, you have hot-swappable modules that snap into the other side of the mid-plane. There are modules for FCP, FCoE and Ethernet connectivity, which allows blades to talk to each other, as well as external storage. BladeCenter Management modules serve as both the service processor as well as the keyboard, video and mouse Local Console Manager (LCM). All of the IBM storage options available to IBM BladeCenter apply to zBX as well.
Besides general purpose blades, IBM will offer "accelerator" blades that will offload work from the z196. For example, let's say an OLAP-style query is issued via SQL to DB2 on z/OS. In the process of parsing the complicated query, it creates a Materialized Query Table (MQT) to temporarily hold some data. This MQT contains just the columnar data required, which can then be transferred to a set of blade servers known as the Smart Analytics Optimizer (SAO), then processes the request and sends the results back. The Smart Analytics Optimizer comes in various sizes, from small (7 blades) to extra large (56 blades, 28 in each of two zBX frames). A 14-blade configuration can hold about 1TB of compressed DB2 data in memory for processing.
IBM zEnterprise Unified Resource Manager
You can have up to eight z196 machines and up to four zBX frames connected together into a monstrously large system. There are two internal networks. The Inter-ensemble data network (IEDN) is a 10GbE that connects all the OS images together, and can be further subdivided into separate virtual LANs (VLAN). The Inter-node management network (INMN) is a 1000 Mbps Base-T Ethernet that connects all the host servers together to be managed under a single pane of glass known as the Unified Resource Manager. It is based on IBM Systems Director.
By integrating service management, the Unified Resource Manager can handle Operations, Energy Management, Hypervisor Management, Virtual Server Lifecycle Management, Platform Performance Management, and Network Management, all from one place.
IBM Rational Developer for System z Unit Test (RDz)
But what about developers and testers, such as those Independent Software Vendors (ISV) that produce mainframe software. How can IBM make their lives easier?
Phil Smith on z/Journal provides a history of [IBM Mainframe Emulation]. Back in 2007, three emulation options were in use in various shops:
Open Mainframe, from Platform Solutions, Inc. (PSI)
FLEX-ES, from Fundamental Software, Inc.
Hercules, which is an open source package
None of these are viable options today. Nobody wanted to pay IBM for its Intellectual Property on the z/Architecture or license the use of the z/OS operating system. To fill the void, IBM put out an officially-supported emulation environment called IBM System z Professional Development Tool (zPDT) available to IBM employees, IBM Business Partners and ISVs that register through IBM Partnerworld. To help out developers and testers who work at clients that run mainframes, IBM now offers IBM Rational Developer for System z Unit Test, which is a modified version of zPDT that can run on a x86-based laptop or shared IBM System x server. Based on the open source [Eclipse IDE], the RDz emulates GP, IFL, zAAP and zIIP engines on a Linux-x86 base. A four-core x86 server can emulate a 3-engine mainframe.
With RDz, a developer can write code, compile and unit test all without consuming any mainframe MIPS. The interface is similar to Rational Application Developer (RAD), and so similar skills, tools and interfaces used to write Java, C/C++ and Fortran code can also be used for JCL, CICS, IMS, COBOL and PL/I on the mainframe. An IBM study ["Benchmarking IDE Efficiency"] found that developers using RDz were 30 percent more productive than using native z/OS ISPF. (I mention the use of RAD in my post [Three Things to do on the IBM Cloud]).
What does this all mean for the IT industry? First, the zEnterprise is perfectly positioned for [three-tier architecture] applications. A typical example could be a client-facing web-server on x86, talking to business logic running on POWER7, which in turn talks to database on z/OS in the z196 mainframe. Second, the zEnterprise is well-positioned for government agencies looking to modernize their operations and significantly reduce costs, corporations looking to consolidate data centers, and service providers looking to deploy public cloud offerings. Third, IBM storage is a great fit for the zEnterprise, with the IBM DS8000 series, XIV, SONAS and Information Archive accessible from both z196 and zBX servers.