Tony Pearson is a Master Inventor and Senior IT Architect for the IBM Storage product line at the
IBM Executive Briefing Center in Tucson, Arizona, and featured contributor
to IBM's developerWorks. In 2016, Tony celebrates his 30th anniversary with IBM Storage. He is
author of the Inside System Storage series of books. This blog is for the open exchange of ideas relating to storage and storage networking hardware, software and services.
(Short URL for this blog: ibm.co/Pearson )
My books are available on Lulu.com! Order your copies today!
Safe Harbor Statement: The information on IBM products is intended to outline IBM's general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on IBM products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for IBM products remains at IBM's sole discretion.
Tony Pearson is an active participant in local, regional, and industry-specific interests, and does not receive any special payments to mention them on this blog.
Tony Pearson receives part of the revenue proceeds from sales of books he has authored listed in the side panel.
Tony Pearson is not a medical doctor, and this blog does not reference any IBM product or service that is intended for use in the diagnosis, treatment, cure, prevention or monitoring of a disease or medical condition, unless otherwise specified on individual posts.
Clod Barrera is an IBM Distinguished Engineer and Chief Technical Strategist for IBM System Storage. He predicts that by 2015, 10 percent of the servers and storage purchases, as well as 25 percent of the network gear purchases, will be related to Cloud deployments. Cloud Storage is expected to grow at a compound annual growth rate (CAGR) of 32 percent through 2015, compared to only 3.8 percent growth for non-Cloud storage.
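To put those two growth rates side by side, here is a minimal sketch of the compounding arithmetic. The four-year horizon (2011 through 2015) is my assumption, and the function name is just for illustration.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Return the factor by which a quantity grows at a given CAGR."""
    return (1.0 + cagr) ** years


YEARS = 4  # assumption: compounding from 2011 through 2015

cloud = growth_multiple(0.32, YEARS)       # 32 percent CAGR for Cloud storage
non_cloud = growth_multiple(0.038, YEARS)  # 3.8 percent CAGR for non-Cloud storage

print(f"Cloud storage grows {cloud:.2f}x, non-Cloud storage {non_cloud:.2f}x")
```

Under that assumption, Cloud storage roughly triples over the period while non-Cloud storage grows by about 16 percent.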
Cloud Computing is allowing companies to rethink their IT infrastructure, and reinvent their business. Clod presented an interesting chart on the "Taxonomy" of storage in Cloud environments. On the left he had examples of Storage that was part of a Cloud Compute application. On the right he had storage that was accessed directly through protocols or APIs. Under each he had several examples for transactional data, stream data, backups and archives.
Clod feels the only difference between Private and Public clouds is a matter of ownership. Private clouds are owned by the company that uses them, and are accessed over its private Intranet network. Public clouds are owned by Cloud Service providers and are accessed over the public Internet. Clod presented IBM's strategy to deliver Cloud at five levels:
Private Cloud: on-site equipment, behind company firewall, managed by IT staff
Managed Private Cloud: on-site equipment, behind company firewall, managed by IBM or other Cloud Service provider
Hosted Private Cloud: dedicated, off-premises equipment, located and managed by IBM or other Cloud Service Provider, and accessed through VPN
Shared Cloud Services: shared, off-premises equipment, located at IBM or other Cloud Service Provider, managed by IBM or Cloud Service provider, and accessed through VPN. The facility is intended for enterprises only, on a contractual basis, and will be auditable for compliance with government regulations, etc.
Public Cloud: shared, off-premises equipment, located and managed by IBM or other Cloud Service provider, targeted to offer cloud compute and storage resources, with standardized platforms of operating systems and middleware, for individuals, small and medium sized businesses.
As with storage in traditional data center deployments, storage in clouds will be tiered, from Tier 0, the fastest tier, down to Tier 4 for "deep and cheap" archive storage. IBM SONAS is an example of Cloud-ready storage that can help make these tiers accessible through standard Ethernet protocols. Cloud Service providers will use metering and Service Level Agreements (SLAs) to offer different rates for different tiers of storage in the cloud.
Clod wrapped up his session explaining IBM's Cloud Computing Reference Architecture (CCRA). This is an all-encompassing diagram that shows how all of IBM's hardware, software and services fit into Cloud deployments.
Since the [IBM System Storage Technical University 2011] runs concurrently with the System x Technical University, attendees are allowed to mix-and-match. I attended several presentations regarding server virtualization and hypervisors.
Matt Archibald is an IT Management Consultant in IBM's Systems Agenda Delivery team. He started with a history of hypervisors, from IBM's early CP/CMS in 1967, through the latest VMware vSphere 5 just announced.
He explained that there are three types of Hypervisor architectures today:
Type 1 - often referred to as "Bare Metal", this type runs directly on the server host hardware, and allows different operating system virtual machines to run as guests. IBM's System z [PR/SM] and [PowerVM] as well as the popular VMware ESXi are examples of this type.
Type 2 - often referred to as "Hosted", this type runs above an existing operating system, and allows different operating system virtual machines to run as guests. The popular [Oracle/Sun VirtualBox] is an example of this type.
OS Containers - runs above an existing operating system base, and allows multiple "guests" that all run the same operating system as the base. This affords some isolation between applications. [Parallels Virtuozzo Containers] is an example of this type.
The dominant architecture is Type 1. For x86, IBM is the number one reseller of VMware. VMware recently announced [vSphere 5], which changes its licensing model from CPU-based to memory-based. For example, a virtual machine with 32 virtual CPUs and 1TB of virtual RAM (vRAM) would cost over $73,000 per year to license the VMware "Enterprise Plus" software. The only plus-side to this new licensing is that the "memory" entitlement transfers during Disaster Recovery to the remote location.
"Xen is dead." was the way Matt introduced the section discussing Hybrid Type-1 hypervisors like Xen and Hyper-V. These run bare-metal, but require networking and storage I/O to be processed by a single bottleneck partition referred to as "Dom 0". As such, this hybrid approach does not scale well on larger multi-socket host servers. So, his Xen-is-dead message was referring to all Hybrid-based Hypervisors including Hyper-V, not just those based on Xen itself.
The new up-and-comer is "Linux KVM". Last year, in my blog post about [System x KVM solutions], I mentioned the confusion over the KVM acronym being used with two different meanings. Many people use KVM to refer to Keyboard-Video-Mouse switches that allow access to multiple machines. IBM has renamed these switches to Local Console Managers (LCM) and Global Console Managers (GCM). This year, the System x team have adopted the use of "Linux KVM" to refer to the second meaning, the [Kernel-based Virtual Machine] hypervisor.
Linux KVM is not a product, but an open-source project. As such, it is built into every Linux kernel. Red Hat has created two specific deliverables under the name Red Hat Enterprise Virtualization (RHEV):
RHEV-H, a tiny ESXi-like bare-metal hypervisor that fits in 78MB, making it small enough to be on a USB stick, CD-ROM or memory chip.
RHEV-M, a vCenter-like management software to manage multiple virtual machines across multiple hosts.
Personally, I run RHEL 6.1 with KVM on my IBM laptop as my primary operating system, with a Windows XP guest image to run a few Windows-specific applications.
A complaint about the current RHEV 2.2 release from Linux fanboys is that RHEV-M requires a Windows server, and uses Windows PowerShell for scripting. The next release of RHEV is likely to provide a Linux-based option for the management server.
Of the various hypervisors evaluated, KVM appears to be poised to offer the best scalability for multi-socket host machines. The next release is expected to support up to 4096 threads, 64TB of RAM, and over 2000 virtual machines. Compare that to VMware vSphere 5, which supports only 160 threads, 2TB of RAM and up to 512 virtual machines.
Linux KVM Overview
Matt also presented a session focused on Linux KVM. While IBM is the leading reseller of VMware for the x86 server platform, it has chosen Linux KVM to run all of its internal x86 Cloud Computing facilities, as it can offer 40 to 80 percent savings, based on Total Cost of Ownership (TCO).
Linux KVM can run unmodified Windows and Linux guest operating systems as guest images with less than 5 percent overhead. Since KVM is built into the Linux kernel, any certification testing automatically benefits KVM as well. KVM takes advantage of modern CPU extensions like Intel's VT and AMD's AMD-V.
For high availability, in the event that a host fails, KVM can restart the guest images on other KVM hosts. RHEV offers "prioritized restart order" which allows mission-critical images to be started before less important ones.
RHEV also provides "Virtual Desktop Infrastructure", known as VDI. This allows a lightweight client with a browser to access an OS image running on a KVM host. Matt was able to demonstrate this with Firefox browser running on his Android-based Nexus One smartphone.
RHEV also adds features that make it ideal for cloud deployments, including hot-pluggable CPU, network and storage; Service Level Agreement monitoring for CPU, memory and I/O resources; storage live migrations to move the raw image files while guests are running; and a self-service user portal.
IBM has been doing server virtualization for decades. When I first started at IBM in 1986, I was doing z/OS development and testing on z/VM guest images. Later, around 1999, I started working with the "Linux on z" team, running multiple Linux images under PR/SM and z/VM. While the server virtualization solutions most people are familiar with (VMware, Hyper-V, Xen) have only been around the last five years or so, IBM has a much deeper and more robust understanding and a long heritage. This helps to set IBM apart from the competition when helping clients.
I have gotten several emails expressing worry that I have fallen off the face of the earth. The last two weeks have been educational and eye-opening for me. I can't provide details in my blog, so I will just say that it involved government agencies that IBM refers to as "dark accounts", and that I am now back safely in the USA. Between adjusting to time zone differences, ridiculously long hours, and restricted access to the internet, I was unable to blog lately.
Instead, I will resume my coverage of the [IBM System Storage Technical University 2011]. The "Solutions Expo" runs Monday evening through Wednesday lunch. This is a chance for people to explore all the solutions that are part of IBM's large "eco-system" for IBM System storage and System x products. There were several sponsors for this event.
As is often the case at these conferences, the various booths hand out fun items. The hot items this year were tie-dyed tee-shirts from QLogic, and propeller beanies from the IBM rack and power systems team. Here is Amanda, one of the bartenders, showing off the latter.
After the expo on Tuesday night, my friends at [Texas Memory Systems] held an after-party. Unlike the pens, tee-shirts and keychains at the Expo, these guys had a raffle for real storage products. Here is Erik Eyberg handing out a RamSan PCIe card, valued at $14,000 or so. IBM recently certified the TMS RamSan as External SSD storage for the IBM SAN Volume Controller (SVC). The SVC can optimize performance using this for automated sub-LUN tiering with the IBM System Storage Easy Tier feature.
I always try to catch a session from Jim Blue, who works in our "SAN Central" center of competency team. This session was a long list of useful hints and tips, based on his many years of experience helping clients.
SAN Zoning works by inclusion, limiting the impact of failing devices. The best approach is to zone by individual initiator port. The default policy for your SAN zoning should be "deny".
Ports should be named to identify who, what, where and how.
While many people know not to mix both disk and tape devices on the same HBA, Jim also recommends not mixing dissimilar disks, test and production, or FCP and FICON.
The sweet spot is FOUR paths. Too many paths can impact performance.
When making changes to redundant fabrics, make changes to the first fabric, then allow sufficient time before making the same changes to the other fabric.
Use software tools like Tivoli Storage Productivity Center (Standard Edition) to validate all changes to your SAN fabric.
Do not mix 62.5 and 50.0 micron technology.
Use port caps to disable inactive ports. In one amusing anecdote, he mentioned that an uncovered port was hit by sunlight every day, sending error messages that took a while to figure out.
Save your SAN configuration to non-SAN storage for backup.
Consider firmware about two months old to be stable.
Rule of thumb for estimating IOPS: 75-100 IOPS per 7200 RPM drive, 120-150 IOPS per 10K RPM drive, and 150-200 IOPS per 15K RPM drive.
Decide whether your shop does just-in-time or just-in-case provisioning. Just-in-time gets additional capacity on demand as needed, and just-in-case over-provisions to avoid scrambling at the last minute.
Avoid oversubscribing your inter-switch links (ISL). Aim for around 7:1 to 10:1 ratio.
Don't go cheap on bandwidth between sites for long-distance replication.
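Two of Jim's rules of thumb above lend themselves to quick arithmetic. The following is a rough sketch only: the function names, the 24-drive array, and the switch port counts are my own made-up examples, and the oversubscription ratio is taken here as total edge-port bandwidth divided by total ISL bandwidth.

```python
# Rule-of-thumb IOPS per drive by rotational speed, (low, high), from Jim's tips.
IOPS_PER_DRIVE = {
    7200: (75, 100),
    10000: (120, 150),
    15000: (150, 200),
}


def estimate_iops(rpm: int, drive_count: int) -> tuple:
    """Return a (low, high) IOPS estimate for a group of identical drives."""
    low, high = IOPS_PER_DRIVE[rpm]
    return low * drive_count, high * drive_count


def isl_oversubscription(edge_ports: int, edge_gbps: float,
                         isl_count: int, isl_gbps: float) -> float:
    """Ratio of edge-port bandwidth to inter-switch link bandwidth."""
    return (edge_ports * edge_gbps) / (isl_count * isl_gbps)


# Hypothetical example: 24 drives at 15K RPM.
low, high = estimate_iops(15000, 24)
print(f"Estimated {low} to {high} IOPS")

# Hypothetical example: 28 device ports and 4 ISLs, all at 8 Gbps.
ratio = isl_oversubscription(28, 8, 4, 8)
print(f"ISL oversubscription is {ratio:.0f}:1")
```

In this made-up case the switch sits at 7:1, the low end of the range Jim suggested. These estimates ignore cache hits and RAID overhead, so treat them as a starting point for sizing discussions, not a guarantee.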
Next Generation Network Fabrics - Strategy and Innovations
Mike Easterly, IBM Director of Global Field Marketing, presented IBM System Networking strategy, in light of IBM's recent acquisition of Blade Network Technologies (BNT). BNT is used in 350 of the Fortune 500 companies, and is ranked #2 behind Cisco in sales of non-core Ethernet switches (based on number of units sold).
Based on a recent survey, companies are upgrading their Ethernet networks for a variety of reasons:
56 percent for Live Partition Mobility and VMware vMotion
45 percent for integrated compute stacks, like IBM CloudBurst
43 percent for private, public and hybrid cloud computing deployments
40 percent for network convergence
Many companies adopt a three-level approach, with core directors, distribution switches, and then access switches at the edge that connect servers and storage devices. IBM's BNT allows you to flatten the network to lower latency by collapsing the access and distribution levels into one.
IBM's strategy is to focus on BNT for the access/distribution level, and to continue its strategic partnerships for the core level.
IBM BNT provides better price/performance and lower energy consumption. To help with hot-aisle/cold-aisle rack deployments, IBM BNT provides both F and R models. F models have ports on the front, and R models have ports in the rear.
IBM BNT supports virtual fabric and HW-offload iSCSI traffic, and is future-enabled for FCoE. Support for TRILL (Transparent Interconnection of Lots of Links) and OpenFlow will be implemented through software updates to the switches.
While Cisco Nexus 1000v is focused on VMware Enterprise Plus, IBM BNT's VMready works with VMware, Hyper-V, Linux KVM, Xen, OracleVM, and PowerVM. This allows a single pane of management for VMready and ESX vSwitches.
In preparation for Converged Enhanced Ethernet (CEE), IBM BNT will provide full 40GbE support sometime next year, and offer switches that support 100GbE uplinks. IBM offers extended length cables, including passive SFP+ DAC at 8.5 meters, and 10GBASE-T Cat7 cables up to 100 meters.
Inter-datacenter Workload Mobility with VMware vSphere and SAN Volume Controller (SVC)
This session was co-presented by Bill Wiegand, IBM Advanced Technical Services, and Rawley Burbridge, IBM VMware and midrange storage consultant. IBM is the leader in storage virtualization with SVC, and is the leading reseller of VMware.
Like MetroCluster on IBM N series, or EMC's VPLEX Metro, the IBM SAN Volume Controller can support a stretched cluster across distance that allows virtual machines to move seamlessly from one datacenter to another. This is a feature IBM introduced with SVC 5.1 back in 2009. This can be used for PowerVM Live Partition Mobility, VMware vMotion, and Hyper-V Quick Migration.
SVC stretched cluster can help with both Disaster Avoidance and Disaster Recovery. For Disaster Avoidance, in anticipation of an outage, VMs can be moved to the secondary datacenter. For Disaster Recovery, additional automation, such as VMware High Availability (HA), is needed to restart the VMs at the secondary datacenter.
IBM stretched cluster is further improved with a feature called Volume Mirroring (formerly vDisk Mirroring) which creates two physical copies of one logical volume. To the VMware ESX hosts, there is only one volume, regardless of which datacenter it is in. The two physical copies can be on any kind of managed disk, as there is no requirement or dependency of copy services on the back-end storage arrays.
Another recent improvement is the idea of spreading the three quorum disks to three different locations or "failure domains": one in each data center, and a third one in a separate building, somewhere in between the other two, perhaps.
Of course, there are regional disasters that could affect both datacenters. For this reason, SVC stretched cluster volumes can be replicated to a third location up to 8000 km away. This can be done with any back-end disk arrays, as again there is no requirement for copy services from the managed devices. SVC takes care of it all.
Networking is going to be very important for a variety of transformational projects going forward in the next five years.
I have been working on Information Lifecycle Management (ILM) since before they coined the phrase. There were several break-out sessions on the third day at the [IBM System Storage Technical University 2011] related to new twists to ILM.
The Intelligent Storage Service Catalog (ISSC) and Smarter ILM
Hans Ammitzboll, Solution Rep for IBM Global Technology Services (GTS), presented an approach to ILM focused on using different storage products for different tiers. Is this new? Not at all! The phrase "Information Lifecycle Management" was coined in the early 1990s by StorageTek to help sell automated tape libraries.
Unfortunately, disk-only vendors started using the term ILM to refer to disk-to-disk tiering inside the disk array. Hans feels it does not make sense to put the least expensive penny-per-GB 7200 RPM disk inside the most expensive enterprise-class high-end disk arrays.
IBM GTS manages not only IBM's internal operations, but the IT operations of hundreds of other clients. To help manage all this storage, they developed software to supplement reporting, monitoring and movement of data from one tier to another.
The Intelligent Storage Service Catalog (ISSC) can save up to 80 percent of planning time for managing storage. What did people use before? Hans poked fun at chargeback and showback systems that "offer savings" but don't actually "impose savings". He referred to these as Name-and-Shame, where the top 10 offenders of storage usage are called out.
His storage pyramid involves a variety of devices, with IBM DS8000, SVC and XIV for the high-end, midrange disk like Storwize V7000, and blended disk-and-tape solutions like SONAS and Information Archive (IA) for the lower tiers.
Mark Taylor, IBM Advanced Technical Services, presented the policy-driven automation of IBM's Scale-Out NAS (SONAS). A SONAS system can hold 1 to 256 file systems, and each file system is further divided into fileset containers. Think of fileset containers like 'tree branches' of the file system.
SONAS supports policies for file placement, file movement, and file deletion. These are SQL-like statements that are then applied to specific file systems in the SONAS. Input variables include date last modified, date last accessed, file name, file size, fileset container name, user id and group id. You can choose to have the rules be case-sensitive or case-insensitive. The rules support macros. A macro pre-processor can help simplify calculations and other definitions that are used repeatedly.
Each file system in SONAS consists of one or more storage pools. For file systems with multiple pools, file placement policies can determine in which pool to place each file. Normally, when a set of files is in a specific sub-directory on other NAS systems, all the files will be on the same type of disk. With SONAS, some files can be placed on 15K RPM drives, and other files on slower 7200 RPM drives. This file virtualization separates the logical grouping of files from the physical placement of them.
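To make the idea of placement rules concrete, here is a conceptual sketch in Python. It is not SONAS policy syntax: the pool names, the fileset name, the rule order, and the first-match-wins evaluation are all assumptions I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    size_bytes: int
    fileset: str


ONE_GIB = 1024 ** 3


def choose_pool(f: FileInfo) -> str:
    """Pick a storage pool for a new file; the first matching rule wins."""
    # Hypothetical rule 1: everything in the 'database' fileset goes to fast disk.
    if f.fileset == "database":
        return "fast_15k"
    # Hypothetical rule 2: large media files, matched case-insensitively by name.
    if f.name.lower().endswith((".mp4", ".iso")):
        return "capacity_7200"
    # Hypothetical rule 3: anything over 1 GiB goes to the capacity pool.
    if f.size_bytes > ONE_GIB:
        return "capacity_7200"
    # Default placement when no earlier rule matches.
    return "fast_15k"


files = [
    FileInfo("Movie.MP4", 700 * 1024 ** 2, "media"),
    FileInfo("notes.txt", 4096, "media"),
]
for f in files:
    print(f.name, "->", choose_pool(f))
```

Both files sit in the same fileset, yet they land in different pools, which is the separation of logical grouping from physical placement described above.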
Once files are placed, other policies can be written to migrate from one disk pool to another, migrate from disk to tape, or delete the file. Migrating from one disk pool to another is done by relocation. The next time the file is accessed, it will be accessed directly from the new pool. When migrating from disk to tape, a stub is left in the directory structure metadata, so that subsequent access will cause the file to be recalled automatically from tape, back to disk. Policies can determine which storage pool files are recalled to when this happens.
Migrating from disk to tape involves sending the data from SONAS to an external storage pool manager, such as an IBM Tivoli Storage Manager (TSM) server connected to a tape library. SONAS supports pre-migration, which allows the data to be copied to tape but left on disk until space needs to be freed up. For example, a policy with THRESHOLD(90,70,50) will kick in when the file system is 90 percent full: files will be migrated (moved) to tape until it reaches 70 percent, and then files will be pre-migrated (copied) to tape until it reaches 50 percent.
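The three THRESHOLD numbers can be easier to follow as a small simulation. This is a sketch of my reading of the behavior, not SONAS code: I assume the pre-migration target of 50 percent refers to data that has no copy on tape yet, and the function name, the 100 GB capacity, and the file sizes are invented for the example.

```python
def apply_threshold(capacity_gb, file_sizes_gb, high=90, low=70, premigrate=50):
    """Return (migrated, premigrated, resident) lists of file sizes in GB.

    Files are processed in the order given, standing in for whatever
    candidate ordering the policy would choose.
    """
    resident = list(file_sizes_gb)
    migrated, premigrated = [], []
    used = sum(resident)

    # The policy only triggers once occupancy reaches the high threshold.
    if used * 100 < high * capacity_gb:
        return migrated, premigrated, resident

    # Phase 1: migrate (move) files to tape until occupancy drops to `low`.
    while resident and used * 100 > low * capacity_gb:
        size = resident.pop(0)
        migrated.append(size)
        used -= size

    # Phase 2: pre-migrate (copy) files to tape but leave them on disk,
    # until the data with no tape copy drops to `premigrate`.
    not_on_tape = used
    while resident and not_on_tape * 100 > premigrate * capacity_gb:
        size = resident.pop(0)
        premigrated.append(size)
        not_on_tape -= size

    return migrated, premigrated, resident


# Hypothetical 100 GB file system that is 92 percent full.
moved, copied, untouched = apply_threshold(100, [10] * 9 + [2])
print("migrated:", moved)
print("pre-migrated:", copied)
print("resident only:", untouched)
```

In this example, three files are moved to tape, two more are copied to tape but remain on disk, and the rest are untouched. The pre-migrated files can later be released quickly, since their data is already on tape.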
Policies to delete files can apply to both disk and tape pools. Files deleted on tape remove the stub from the directory structure metadata and notify the external storage pool manager to clean up its records for the tape data.
If this all sounds like a radically new way of managing data, it isn't. Many of these functions are based on IBM's Data Facility Storage Management Subsystem (DFSMS) for the mainframe. In effect, SONAS brings mainframe-class functionality to distributed systems.
Understanding IBM SONAS Use Cases
For many, the concept of a scale-out NAS is new. Stephen Edel, IBM SONAS product offering manager, presented a variety of use cases where SONAS has been successful.
First, let's consider backup. IBM SONAS has built-in support for Tivoli Storage Manager (TSM), as well as supporting the NDMP industry standard protocol, for use with Symantec NetBackup, Commvault Simpana, and EMC Legato Networker. While many NAS solutions support NDMP, IBM SONAS can support up to 128 sessions per interface node, and up to 30 interface nodes, for parallel processing. SONAS has a high-speed file scan to identify files to be backed up, and will pre-fetch the small files into cache to speed up the backup process. A SONAS system can support up to 256 file systems, and each file system can be backed up on its own unique schedule if you like. Different file systems can be backed up to different backup servers.
SONAS also has anti-virus support, with your choice of Symantec or McAfee. An anti-virus scan can be run on demand, as needed, or as files are individually accessed. When a Windows client reads a file, SONAS will determine if it has been already scanned with the most recent anti-virus signatures, and if not, will scan before allowing the file to be read. SONAS will also scan new files created.
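The scan-on-access decision described above can be sketched in a few lines. This is only an illustration of the logic, not how SONAS implements it: the record structure, the integer signature version, and the scan callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FileRecord:
    path: str
    # Signature version used for the last clean scan, or None if never scanned.
    scanned_with: Optional[int] = None


def read_allowed(record: FileRecord, current_signatures: int,
                 scan: Callable[[str], bool]) -> bool:
    """Scan on access only when the last scan predates the current signatures."""
    if record.scanned_with == current_signatures:
        return True                       # already scanned with latest signatures
    if scan(record.path):                 # True means the scanner found it clean
        record.scanned_with = current_signatures
        return True
    return False                          # scan failed: deny the read


scan_calls = []


def fake_scan(path: str) -> bool:
    """Stand-in for a call out to the anti-virus scan engine."""
    scan_calls.append(path)
    return True


rec = FileRecord("report.doc")
first = read_allowed(rec, 7, fake_scan)    # never scanned, so it is scanned now
second = read_allowed(rec, 7, fake_scan)   # same signatures, no rescan
third = read_allowed(rec, 8, fake_scan)    # new signatures, scanned again
print(first, second, third, "scans performed:", len(scan_calls))
```

The point of tracking the signature version is that unchanged files are not rescanned on every read, only after the signatures are updated.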
Successful SONAS deployments addressed the following workloads:
content capture including video capture
high performance computing, research and business analytics
"Cheap and Deep" archive
worldwide information exchange and geographically distant collaboration
SONAS is selling well in Government, Universities, Healthcare, and Media/Entertainment, but is not limited to these industries. It can be used for private cloud deployments and public cloud deployments. Having centralized management for Petabytes of data can be cost-effective either way.
IBM SONAS uses the latest technologies to bring a Smarter ILM to a variety of workloads and use cases.