IBM has today announced a whole swag of planned new features across the entire IBM Storage product line. You can read the announcement letter here and I have also dropped the text at the bottom of this blog post (to save you clicking on the link).
It's a very impressive list, but to home in on a few of the more exciting offerings:
IBM Easy Tier will be enhanced to cache hot data in SSD storage installed in a client server. It looks like it will initially be a combination of DS8700/DS8800 with AIX or Linux servers. I am sure plenty of people will immediately think of EMC VFCache, so I am keen to get more details so I can see how the two compare. If you are curious in the meantime, check out this EMC fact sheet and then read this fascinating interview with the CMO of Fusion-io.
A new high density storage module will be made available, initially I suspect for the DS8800. This is a really important step as we are seeing a lot of new technologies emerging in the SSD space. This is because the technical requirements of SSD don't always line up with the architectures of existing storage controllers, so a custom built enclosure designed just for SSD makes perfect sense.
The IBM XIV will be enhanced with the ability to cluster multiple XIVs together and migrate volumes non-disruptively between them. Non-disruptive volume migration is a great new feature which should definitely help with swapping XIVs out as new models become available.
There are plenty of other new features as well, so check out the announcement letter reproduced below:
IBM® intends to support a number of new enhancements to a variety of IBM storage systems in the future. These enhancements will leverage innovative research on intelligent algorithms, automation, and virtualization that is being incorporated into products in the IBM storage portfolio. The statements of direction highlighted here are intended to provide a glimpse into the IBM storage roadmap for selected product capabilities.
IBM intends to deliver:
Advanced Easy Tier™ capabilities on selected IBM storage systems, including the IBM System Storage® DS8000® , designed to leverage direct-attached solid-state storage on selected AIX® and Linux™ servers. Easy Tier will manage the solid-state storage as a large and low latency cache for the "hottest" data, while preserving advanced disk system functions, such as RAID protection and remote mirroring.
An application-aware storage application programming interface (API) to help deploy storage more efficiently by enabling applications and middleware to direct more optimal placement of data by communicating important information about current workload activity and application performance requirements.
A new high-density flash storage module for selected IBM disk systems, including the IBM System Storage DS8000 . The new module will accelerate performance to another level with cost-effective, high-density solid-state drives (SSDs).
IBM intends to extend IBM Active Cloud Engine™ capabilities to:
Allow files on selected NAS devices to be virtualized by SONAS and Storwize® V7000 Unified. Virtualization capabilities provide access across a unified global namespace, while facilitating transparent file migrations in parallel with normal operations. This capability will help provide customer investment protection as clients continue to leverage their existing NAS assets while exploiting the capabilities of IBM Active Cloud Engine .
Enable file collaboration globally via IBM Active Cloud Engine . This capability will help enhance productivity where users at geographically dispersed locations can both share and modify the same file.
IBM intends to deliver Cloud features to SONAS and Storwize V7000 Unified to support:
Web Storage Services, a standards-based object store and API that implements the Cloud Data Management Interface (CDMI) standard from Storage Networking Industry Association (SNIA) to support the implementation of storage cloud services.
Self-service portal designed to speed storage provisioning, monitoring, and reporting.
IBM intends to support an increased scalability of capacity, performance, and host bandwidth by clustering IBM XIV® Gen3 systems together and providing the capability to migrate volumes across the cluster without disrupting applications. Management of the cluster will remain simple with consolidated views and shared configurations across the systems. These capabilities are intended to help clients address the scalability and management requirements for effective cloud computing.
IBM intends to extend NAS data retention enhancements for IBM Storwize V7000 Unified and IBM SONAS to provide file "immutability" to help support file integrity from the time the file is designated as immutable through its lifecycle. Immutability is intended to secure files from inadvertent or malicious change or deletion.
IBM intends to enable Real-time Compression for block and file workloads on Storwize V7000 Unified systems. This enhancement is designed to help clients experience the same high-performance compression for active primary block and file workloads on Storwize V7000 Unified that is being announced for block workloads on Storwize V7000. IBM Storwize V7000 Real-time Compression is designed to deliver enhanced storage efficiency with potential benefits including lower storage acquisition cost (because of the ability to purchase less hardware), reduced storage growth, and lower rack space, power, and cooling requirements.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. The information in the above paragraphs are intended to outline our general product direction and should not be relied on in making a purchasing decision. The information is for informational purposes only and may not be incorporated into any contract. This information is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion.
One common question that I hear on a regular basis regards the availability of an SRA for VMware SRM 5.0 when using Storwize V7000 or IBM SVC running V6.3 firmware. This combination is currently unsupported as per the alert found here.
The good news is that there are now IBM SRAs available for clients running SRM in combination with V6.3 firmware. While this combination is still not listed on the VMware support matrix found here, you can download the SRAs direct from IBM if your need is urgent.
So you need to do some disk performance testing? Maybe some benchmarking? What tools are out there to help you out? Well I am glad you asked... here are some that I use on my daily travels:
IOmeter is an old classic, with emphasis on the word old. At time of writing, the most recent update was from 2006. However it remains very popular mainly because it is free and easy to use.
Some tips when using IOmeter:
On Windows, IOmeter needs to be run as an Administrator; not doing so is the most common mistake people make (if you don't run it as Administrator, you won't see any drives). Only one instance of IOmeter can run at a time in Windows, which means if multiple users log on to the same server, only one of them can run IOmeter. You also really need to run IOmeter with a queue depth (number of outstanding I/Os) greater than one, and with multiple workers. If you don't, you will not be able to drive the storage to saturation.

For instance, here are some results running 75% read I/O, 0% random, 4 KB blocks on a Windows 2008 machine with 4 workers, in each case against the same 128 GB volume on a Storwize V7000 backed by 4 x 300 GB SSDs in a RAID10 array. In each case I let the machine run for 10 minutes before taking the screen capture to ensure the performance was steady state and not peaking.
Firstly I used a queue depth of one. Aggregate performance was around 27000 IOPS.
Then I used a queue depth of 10. Aggregate performance was around 81000 IOPS.
I then used a queue depth of 20. Aggregate performance was around 113000 IOPS.
What I am trying to show is that taking the defaults (one worker with a queue depth of 1) will not drive the storage to a useful value for comparison... you need to do some tuning and some experimenting to get valid results. At some point increasing queue depths will not improve performance (it may actually decrease it).
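The relationship between queue depth and throughput follows directly from Little's Law: outstanding I/Os = IOPS x average latency. Here is a rough sketch of the ceiling this implies; the 0.15 ms service time is my assumption for illustration, not a measured value from these tests:

```python
# Little's Law: outstanding I/Os = IOPS x average latency, so the best-case
# IOPS you can drive is (queue depth x workers) / service time.
def max_iops(queue_depth, workers, service_time_ms):
    """Theoretical IOPS ceiling for a given amount of outstanding I/O."""
    outstanding = queue_depth * workers
    return outstanding / (service_time_ms / 1000.0)

# Hypothetical numbers: 4 workers against an SSD array with ~0.15 ms reads.
print(round(max_iops(1, 4, 0.15)))   # queue depth 1: ceiling around 27000 IOPS
print(round(max_iops(10, 4, 0.15)))  # queue depth 10: much higher ceiling
```

In practice service time rises as the array gets busier, which is why the observed results flatten out well below the queue depth 10 and 20 ceilings.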
There is an alternative to IOmeter called IOrate (created by an EMC employee). It is also very popular and appears to still be in active development. It is not unusual to see IBM performance whitepapers that used IOrate to generate the workload.
This is a fairly recent tool that I have not had a chance to try out (due to time pressures). The tool uses virtual machines under VMware to generate the I/O and includes some very nice workload capture and playback tools as well as reporting tools.
Jetstress is a benchmarking tool created by Microsoft to simulate Microsoft Exchange workloads. I like the fact you can configure it to run for very long periods, and it has a more real-world feel about it than just running empty I/Os. You can get the base software here, but you will also need some files from a Microsoft Exchange install DVD (or from an installed instance of Microsoft Exchange). If you cannot get to those files you cannot complete the startup process inside Jetstress.
Oracle offer a tool on their website called Orion, which will simulate the workload of an Oracle database. You can get the tool from here (although you will need to create a free Oracle user account before you can download it).
SDelete is not a benchmarking tool or a performance modelling tool, but it is a great way to generate I/O with very little effort. Just create a new drive in Windows and then run SDelete against it with the -c parameter. This parameter is used for secure deletion, so it generates random data patterns (which is realistic traffic, albeit 100% sequential writes). The syntax is as simple as sdelete -c x: (where x: is the drive you just created).
(updated April 20, 2012 - I found in version 1.6 of SDelete the meaning of the -z and -c parameters got swapped. In version 1.6 if you want random patterns use -c, if you want zeros use -z. In previous versions it is the other way around!).
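If you want to reproduce the same effect without SDelete, the pattern is simple: sequential writes of random data until the target is full. Here is a minimal sketch; the file name and size are just examples, and writing to a file is only an approximation of what SDelete does to a drive:

```python
import os

# Sequentially write random patterns to a file until it reaches the target
# size - roughly the kind of traffic that SDelete's random-pattern pass
# generates (100% sequential writes of incompressible data).
def fill_random(path, total_bytes, block_size=1024 * 1024):
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = os.urandom(min(block_size, total_bytes - written))
            f.write(chunk)
            written += len(chunk)
    return written

# Example: fill_random("testfile.bin", 64 * 1024 * 1024)
```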
Just doing file copies is probably the worst way to generate benchmarks, especially as a single copy is usually a single threaded operation.
I am sure there are plenty of other tools out there to generate benchmarks and simulate workload. My main concern with many of them is that synthetic (artificial) workloads do not reflect real world workloads.
Right now I am working on giving a client a recommended version of firmware for their Cisco MDS Fibre Channel switches. For FICON, the recommendations are easy, but for Open Systems there are so many choices. So what am I going to recommend?
FICON Switches and Directors
For FICON switches, sticking to the FICON (IBM Mainframe Fibre Connection) recommended versions (which are determined by the IBM System z Mainframe team), is a very good strategy. The best place to get these is here (standard IBM logon is required). Just look along the right hand column for the release letters.
The SAN-OS and NX-OS release notes found on the Cisco website also show recommended versions for FICON. For instance, have a look at the FICON recommendations table in the release notes for version 5.2.2a, which you can find here. The upgrade path is just below the table I have linked to. This link will get outdated over time (as newer versions come out), but you can list all the release notes here.
If you are using an IBM TS7700 you should also be aware of this page on the IBM Techdocs site.
So based on current versions, if you are running SAN-OS 3.3.1c or below you need to move to 4.2.7b (as per the non-disruptive upgrade path). I strongly recommend you get to at least version 4.2.7b and start planning to move to release 5.2.2 (provided your hardware supports it).
For open systems attached Fibre Channel switches there are a number of versions to choose from, and several things to consider:
Being on the very latest version carries a small potential risk (of undiscovered bugs). However, being on a very old version carries a greater implicit risk (of being exposed to KNOWN bugs). Just because you have not hit a bug yet does not protect you from potential issues, especially if your SAN is growing.
Your hardware. Some older-generation hardware is not supported at higher levels (for example, Supervisor-1 cards cannot go past SAN-OS 3.3.5b), while later-generation hardware is not supported at lower levels (for example, Fabric 3 modules need NX-OS 5.2.2). The Cisco recommended versions page is the best place to confirm this.
End of life. As SAN-OS reached end of development in 2011, 3.3.5b is the best choice for all hardware that cannot upgrade to NX-OS. However be aware that some Cisco Generation 1 hardware (such as 2 Gbps capable hardware) will go end of service in September 2012 (for example Supervisor-1 cards and MDS 9120 switches). Links for this are below. Of course your service provider may choose to offer support beyond the Cisco end of life date, but instead of updating code, maybe you should be updating hardware.
You also need to upgrade your Fabric Manager to the same or a higher version than your switches are running. One important thing to be aware of is that from version 5.2, Cisco Fabric Manager has been merged into a new product called Cisco Data Center Network Manager (DCNM).
It is ironic that only days after I wrote that 497 is the IT number of the beast, I learn that Linux has another unfortunate number: 208.
The reason for this is a defect in the internal Linux kernel used in recent firmware levels of SVC, Storwize V7000 and Storwize V7000 Unified nodes. This defect will cause each node to reboot after 208 days of uptime. This issue exists in unfixed versions of the 6.2 and 6.3 level of firmware, so a large number of users are going to need to take some action on this (except those who are still on a 4.x, 5.x, 6.0 or 6.1 release). If you have done a code update after June 2011, then you are probably affected. This means that if you are an IBM client you need to read this alert now and determine how far you are into that 208 day period. If you are an IBMer or an IBM Business Partner, you need to make sure your clients are aware of this issue, though hopefully they have signed up for IBM My Notifications and have already been notified by e-mail.
In short what needs to happen is that you must:
Determine your current firmware level.
Check the table in the alert to determine if you are affected at all, and if so, how far you are potentially into the 208 day period.
Prior to the 208 day period finishing, either reboot your nodes (one at a time, with a decent interval between them) or install a fixed level of software (as detailed in the alert).
To give you an example of the process, my lab machine is on software version 126.96.36.199 which you can see in the screen capture below. So when I check the table in the alert, I see that version 188.8.131.52 was made available on January 24, 2012, which means the 208 day period cannot possibly end before August 19, 2012.
Earliest possible date that a system running each release could hit the 208 day reboot:

SAN Volume Controller and Storwize V7000 Version 6.3
Released 30 November 2011: earliest possible reboot 25 June 2012
Released 24 January 2012: earliest possible reboot 19 August 2012
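The dates in the alert table are easy to sanity-check yourself, since the earliest possible reboot is just the release date plus 208 days:

```python
from datetime import date, timedelta

# The earliest possible date a node could hit the 208 day reboot is the day
# the release became available plus 208 days of uptime.
def earliest_reboot(release_date):
    return release_date + timedelta(days=208)

print(earliest_reboot(date(2011, 11, 30)))  # 2012-06-25
print(earliest_reboot(date(2012, 1, 24)))   # 2012-08-19
```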
Regardless, I need to know the uptime of my nodes, so I download the Software Upgrade Test Utility (in case you have an older copy, we need at least version 7.9) and run it using the Upgrade Wizard (NOTE! We are NOT updating anything here, just checking):
I launch the Upgrade Wizard, use it to upload the tool, and follow the prompts to run it so that I get to see its output. The output in this example shows the uptime of each node is 56 days, so I have a maximum of 152 days remaining before I have to take any action. At this point I select Cancel. You can run this tool as often as you like to keep checking uptime.
Note if you are on 6.1 or 6.2 code you may see a timeout error when running the tool, especially the first time. If you do see an error, please follow the instructions in the section titled "When running the upgrade test utility v7.5 or later on Storwize V7000 v6.1 or v6.2" at the Test Utility download site.
As per the Alert:
If you are running a 6.0 or 6.1 level of firmware, you are not affected.
If you are running a 6.2 level of firmware, the fix level is v184.108.40.206 which is available here for Storwize V7000 and here for SVC.
If you are running a 6.3 level of firmware, the fix level is v220.127.116.11 which is available here for Storwize V7000 and here for SVC.
If you are using a Storwize V7000 Unified, the fix level is v18.104.22.168 which is available here.
You should keep checking the alert to find out any new details as they come to hand. If you are curious about Linux and 208 day bugs, try this Google search.
*** Updated April 4, 2012 with links to fix levels ***
If you have any questions or need help, please reach out to your IBM support team or leave me a comment or a tweet.
*** April 10: The IBM Web Alert has been updated with new information on what to do if your uptime has actually gone past 208 days without a reboot. In short you still need to take action. Please read the updated alert and follow the instructions given there. ***
We just updated our Cisco MDS9509s to NX-OS 4.2.7b (from Cisco SAN-OS 3.3.1c) and now we are getting emails from this source: GOLD-major.
The actual message looks like this:
Time of Event: 2012-03-05 15:07:21 GMT+00:00 Message Name: GOLD-major Message Type: diagnostic System Name: xxxx Contact Name: xxx@xxx.com Contact Email: xx@xxx.com Contact Phone: +61-3-xxxx-xxxx Street Address: x Road, xxxx, VIC, Australia Event Description: RMON_ALERT
WARNING(4) Falling:iso.22.214.171.124.126.96.36.199.1.10.18366464=2401032512 <= 4680000000:135, 4 Event Owner:ifHCOutOctets.fc4/5@w5c260a03c162
So who is GOLD-major?
GOLD actually stands for Generic OnLine Diagnostics. From Cisco's website: GOLD verifies that hardware and internal data paths are operating as designed. Boot-time diagnostics, continuous monitoring, and on-demand and scheduled tests are part of the Cisco GOLD feature set. GOLD allows rapid fault isolation and continuous system monitoring. GOLD was introduced in Cisco NX-OS Release 4.0(1). GOLD is enabled by default and Cisco do not recommend disabling it.
So in our example GOLD is actually reporting a major event (to do with exceeded thresholds, in this example utilisation on interface fc4/5).
Most clients using Cisco MDS switches are now moving to NX-OS (from SAN-OS, the name Cisco used for MDS firmware between versions 1 and 3), so this question will become more common. I am working on a post that discusses recommended versions (and the sunsetting of SAN-OS), so expect something soon. If on the other hand you are thinking.... how do I set up call home on a Cisco MDS switch? The information for NX-OS is here.
Curiously my brain cannot help itself, when I hear Gold Major I think it means Gold Leader which leads me to Red Leader which leads me to Red October. Maybe it's just me? Enjoy:
Because if a product uses a 32 bit counter to record uptime, and that counter records a tick every 10 msec, then that 32-bit counter will overflow after approximately 497.1 days. This is because a 32 bit counter equates to 2^32, which equals 4,294,967,296 ticks. If a tick is counted every 10 msec, we create 8,640,000 ticks per day (100*60*60*24). So after 497.102696 days, the counter will overflow. What happens next depends on good programming: normally the counter just starts again, but worst case a function might stop working or the product might even reboot.
Fortunately we are seeing fewer and fewer of these issues, but just occasionally one still slips out. Recently IBM released details of a 994 day reboot bug in the ESM code of some of their older disk enclosures (EXP100, EXP700 and EXP710). Details about this bug can be found here. What I find interesting is the number of days it takes to occur, since 994 is exactly 497 times two. This suggests that this product records a tick every 20 msec, which meant we got past 497 days without an issue but hit a problem after exactly double that number. So if you still have these older storage enclosures, you will need to reboot the ESMs (after checking the alert).
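The arithmetic behind both numbers is easy to verify:

```python
# A 32-bit counter holds 2^32 ticks. At one tick per 10 ms (100 per second)
# that is 8,640,000 ticks per day, so the counter overflows after ~497.1 days.
TICKS_PER_DAY_10MS = 100 * 60 * 60 * 24
print(2 ** 32 / TICKS_PER_DAY_10MS)   # ~497.1 days

# A 20 ms tick halves the rate and doubles the horizon to ~994.2 days.
TICKS_PER_DAY_20MS = 50 * 60 * 60 * 24
print(2 ** 32 / TICKS_PER_DAY_20MS)   # ~994.2 days
```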
I googled 497 to see what images that number brings up and was amazed to find the M-497 jet powered train. More details on this rather interesting attempt at speeding up the commute home can be found here and here. It adds a whole new meaning to keeping behind the yellow line.
If you have combined vSphere 5.0 with XIV, then you may want to try out the new IBM Storage Provider for VMware VASA (vSphere Storage APIs for Storage Awareness). You can download the installation instructions, the release notes and the current version of the IBM VASA provider from here. Clearly because VASA is introduced in vSphere 5.0 your VMware vCenter also needs to be on version 5.0.
Now IBM have had a vCenter plugin for a very long time (which I have written about here, here and here) and while you still need that plugin if you want to do storage volume creation and mapping from within vCenter (as opposed to using the XIV GUI), the VASA provider makes storage awareness more native to vCenter. This is a very important step. It means instead of using vendor added icons and tabs (like the IBM Storage icon and the IBM Storage tab that are added by the IBM Storage Management Console for vCenter), you just use the default vCenter tabs.
Right now version 1.1.1 of the IBM VASA provider delivers information about storage topology, capabilities, and state, as well as events and alerts to VMware. This means you will see new additional information in three tabs: Storage Views, Alarms and Events.
After installing and setting up the VASA provider, select your VMware cluster in vCenter, go to the Storage Views tab and select the view Show all SCSI Volumes (LUNs); there are four columns with more information. The Committed, Thin Provisioned, Storage Array and Identifier on Array columns (indicated with red arrows) come straight from the XIV (hit the Update button at upper right if you are not seeing anything yet). This is really useful information, as it lets you correlate the SCSI ID of a LUN to an actual volume on a source array. Here is a cut-down view of that extra information:
If you want a larger screen capture you can find one here.
The Tasks & Events and Alarms tabs will also now contain events reported by the VASA provider, such as thin provisioning threshold alerts (although if you have just installed the provider you may see nothing new, as nothing has occurred yet to provoke an alert or event).
As usual I have some handy tips on the steps you will need to take to get VASA going:
First up you will need to identify a virtual machine to run the provider on (or just create a new one). I chose to deploy a new instance of Windows 2008 from a template. Because the VASA provider communicates to vCenter via an Apache Tomcat server listening on port 8443, that port needs to be free and unblocked. This also means you should not run the VASA provider in the same instance of Windows as the vCenter server (see below for more information as to why).
Download the IBM Storage Provider for VMware VASA as per the link above (use version 1.1.1, see the user comments in this post for details about a bug in version 1.1.0).
Install the provider in the Windows VM you created in step 1. The tasks are detailed in the Installation Instructions, but it is a simple follow-your-nose application installation. As per most XIV software packages, it will install a runtime environment (xPYV, which is Python) as part of the install.
Now we need to define the credentials that VMware vCenter will use to authenticate to the IBM VASA Storage Provider. These should be unique (they are not an XIV userid and password; this is only between vCenter and the provider software). In my example I use xivvasa and pa55w0rd. The truststore password is used to encrypt the username and password details (so that they are not stored in plain text). Open a Windows command prompt (make sure to right-click and open it as an Administrator) and enter the following commands:
cd "C:\Program Files (x86)\IBM\IBM Storage Provider for VMware VASA\bin"
vasa_util register -u xivvasa -p pa55w0rd -t changeit
Don't close the command prompt, because we now need to define the XIV to the IBM VASA provider.
You need the IP address of your XIV and a valid user and password on the XIV that can be used to logon to the XIV. So in this example my XIV is using 10.1.60.100 and I am using the default admin username and password (which I know does not set a good example). This is the command you need to run:
If this command fails, reporting your firmware is invalid, you are probably using the original 1.1.0 version of the VASA provider, go back to the IBM Fix Central website and make sure you have the latest version (at least version 1.1.1). If it reports the firmware cannot be read, make sure you are running the Command Prompt as an Administrator.
Once you successfully added the XIV to the provider, you need to restart the Apache webserver. Do this by starting the services.msc panel and looking for the Apache Tomcat IBMVASA service as pictured below. Stop it and then start it. Once you have done that you can logoff from the VASA VM.
Now connect to your vSphere Client (which needs to be on at least version 5.0.0) and from the Home panel, open the Storage Providers panel. Then select the option to Add a new provider. The URL needs to include the correct port number (by default 8443), so it will look something like this (where the provider is running on 10.1.60.193). Note also that the VASA provider version number is in the URL, so if you upgrade the provider you will need to change the URL (currently v1.1.1):
The Login and password should match the user id and password you defined in step 4 (remember it is not logging into the XIV, it is logging into the VASA provider).
If you get a message saying your user id and password are wrong, you probably forgot to stop and start Apache in step 6 above. If you succeed you should see a new provider listed. Highlight the provider and select sync to update the last sync time.
Your setup tasks are now all completed. Now go and explore the panels I detailed above to see what new information you have available to your vCenter server.
Why a separate server for the VASA provider?
The IBM VASA provider uses Apache Tomcat, which by default listens on port 8443. However, since vCenter already has a service listening on port 8443, we have a clash. I googled and found the Dell and NetApp VASA providers also listen on port 8443, and they also recommend separate servers. I noted Fujitsu's provider uses a different port but still requires a separate server. So it seems if you have multiple vendors you will either have to spin up a separate server for each vendor's provider, or start playing with changing the port number. The installation instructions for the IBM VASA Provider explain how to change the default port number if you are truly keen.
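If you are not sure whether a candidate server already has something on the Tomcat port, a quick check like this will tell you before you install anything (the port number and host are just the defaults discussed above):

```python
import socket

# Check whether anything (such as a local vCenter service) is already
# listening on the port the VASA provider's Tomcat wants (8443 by default).
def port_in_use(port, host="127.0.0.1"):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0

print(port_in_use(8443))
```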
I always laugh when people say to me: I wouldn't know what to blog about!
When you work in pre-sales support, you constantly get asked questions and each one of them could be the subject of a new blog post. Right now the most common question I am getting is:
I am implementing VMware Site Recovery Manager (SRM). One of the components I need is the vendor-specific Storage Replication Adapter (SRA). I have searched IBM's website but cannot find them. Where are they?
So the short answer is: you get them from the VMware SRM download site. However before downloading, there is a key task that absolutely needs to be performed:
Visit the VMware vCenter Site Recovery Manager Storage Partner Compatibility Matrix. This site will confirm what products are supported by each version of SRM. You can find it here, but clearly you need to check back regularly to ensure you have the latest information.
Now find your storage device in the matrix and confirm what firmware levels are supported. This is really important. For example, the Feb 27, 2012 edition of the matrix tells me that the Storwize V7000 is supported for SRM version 5.0, but only when running Storwize V7000 firmware version 6.1 or 6.2. This is significant because if you upgrade to version 6.3 you are not supported. In fact that combination doesn't actually work yet, as detailed here. Clearly something you need to be aware of when planning firmware updates.
So where are the SRAs? On each of the pages below use the Show Details button to see what version SRAs are being shipped with that SRM (although sometimes the pages take a few days between an SRA being added and the page being updated):
There are a few more questions I routinely get asked:
Does IBM actually have an SRA download site?
The answer is yes, but it is an FTP site only for SRAs written by IBM. It is principally a repository for older SRAs and beta SRAs but you can also find the current SRAs on it. You can find the site here. Note however that it is NOT the official source. For that you need to use the VMware site.
What about the SRA for LSI/Engenio based products like the DS4800?
These used to also be found on the LSI site, but since LSI sold Engenio to NetApp, it is no longer available from the LSI or NetApp websites. You need to download the current version from the VMware sites listed above. There is a version for SRM 5 on the VMware download site.
What about nSeries SRAs?
If you need an nSeries SRA, again you should go to the VMware download pages. There are separate SRAs listed and available for IBM nSeries (as opposed to an SRA for NetApp branded filers).
What about an SRA for XIV with SRM version 5?
The answer: The SRA for XIV with SRM 5 (and 5.0.1) is now available from VMware. If you have access to download SRM, you will be able to download SRA version 2.1.0. It is the same SRA for both XIV Generation2 and Gen3.
What about an SRA for Storwize V7000 and SVC version 6.3 code?
The answer: It is coming. We are working to make it available as soon as possible. I will update this post as soon as I have a date for you (we are talking weeks, not months).
*** Update March 23, 2012 - Added details on SRM 5.0.1 ***
I have updated my IBM Storage WWPN Determination Guide to version 6.5. You can find the updated guide on IBM Techdocs here.
The main change is that new DS8800s are now presenting slightly different WWPNs, so I added three new pages to describe the changes.
If this guide is new to you, its purpose is to let you take a WWPN and decode it so you can work out not only which type of storage that WWPN came from, but the actual port on that storage. People doing implementation services, problem determination, storage zoning and day to day configuration maintenance will get a lot of use out of this document. If you think there is an area that could be improved or products you would like added, please let me know.
It is also important to point out that IBM Storage uses persistent WWPN, which means if a host adapter in an IBM Storage device has to be replaced, it will always present the same WWPNs as the old adapter. This means no changes to zoning are needed after a hardware failure.
I also host the book on slideshare, so you can also view and download it from there:
It's been a long time coming, but I finally joined the cult of Mac in the form of a new MacBook Pro. Having not used an Apple Mac for over 15 years, I must say I am truly loving what they have done with the operating system and the hardware (my last Mac was a Mac SE bought in 1990).
Now this post is not a rant from a new convert to everything Apple. In fact my main gripe is that what you rapidly discover when you move to Mac OS is that not every piece of software is going to work in your new world. While Lotus Notes and Sametime have very nice Mac versions, my day to day work involves IBM Storage and there are several tools that I need that are Windows only. These include Capacity and Disk Magic (used to size solutions) and eConfig (used to order IBM products). This means for certain applications I need to use a Hypervisor (such as VMware Fusion or Parallels).
But what about managing IBM Storage? Well I have some good news on that front:
SAN Volume Controller and Storwize V7000: Because these products are managed from a web page, they are operating system agnostic. To be clear, officially only Firefox 3.5 (and above) and IE 7.0 and 8.0 are supported (support details are right at the bottom of this page while setup details are here). Since IE is no longer available for Mac, you should install Firefox (or try out Safari or Chrome; I have tried all three without issue).
XIV: The XIV GUI is available in a native Mac OS version from here. The release notes state that the XIV GUI works on Mac OS X 10.6 but I am happily using it on Mac OS X 10.7 (Lion). The Mac OS X installation process is simply beautiful (just drag and drop, one of the truly nice features of Mac OS X) and of course it works just as nicely on Mac as it does on the other supported operating systems.
Drag and drop done right.
Attaching OS X to IBM Storage
Of course maybe you want to attach your Mac OS X box to IBM Storage. If you visit the SSIC you will find IBM supports OS X on pretty well its entire range including SVC, Storwize V7000, XIV, DS3500 and DCS3700. Mainly these use the ATTO HBA and multipath device driver. If your particular setup is not there, get your IBM pre-sales support to open a support request; depending on your request, approvals are normally very fast.
Of course I have to mention the iPhone and iPad. IBM have the XIV Mobile Dashboard for both devices, which I previously blogged about here (iPad) and here (iPhone). These are really elegant apps that even have a cool YouTube video.
Of course now I want all the goodies promised in Mountain Lion. With the convergence of OS X and iOS, I would love to see even more converged tools. A man can dream....
1) Demo mode
There is a demo mode, but right now there is no tick box to activate it. Simply use the word demo in all three fields at login. In other words:
IP address: demo
User ID: demo
Password: demo
2) Retina display requirement
The Mobile Dashboard was written for the Retina display (that comes with an iPhone 4 or iPhone 4S). This sadly means that the iPhone 3GS and earlier will not be able to use the new Mobile Dashboard. This wasn't done as part of some devious plan on IBM's part to force you to buy a new iPhone, the developers simply needed the better resolution to draw those graphs and provide the richest and clearest display of information on a single page (you will believe it when you see it, the detail is quite stunning).
The App Store clearly states the hardware and iOS requirements on the download page, however you can still try to install it on an iPhone 3GS. Curiously, what you get is this rather bizarre message:
The reason you get this message is simple: There is no way to specify when uploading a new app to Apple that you need the Retina display. So instead the developer needs to specify a feature that is not found on earlier iPhones, such as a camera flash. So it is not that the XIV Mobile Dashboard needs a flash in your camera, it is simply a quirk of the Apple store.
And for those of you who are using Android devices, your calls are being heard. Watch this space for developments in that direction.
My good friend Rob Jackard from the ATS Group has compiled this list of updates to the IBM Support site. It is a very comprehensive list of updates, flashes, tips and warnings and it is well worth spending a few minutes scanning the list to see if any apply to your environment. There is even a warning about a 497 day bug.
Ideally none of these tips should be news to you if you are getting regular emails via IBM My Notifications, so please sign up (or maybe check that your notification list has the correct products) and then read on.
(2011.11.07) DS8700/DS8800 users running with 8Gb host adapters on Release 6.1 exposed to potential loss of access condition. NOTE: The firmware fix is available in R6.1 for DS8700- Bundle 188.8.131.52 or higher (184.108.40.206 is recommended), and for DS8800- Bundle 220.127.116.11 or higher. http://www-01.ibm.com/support/docview.wss?uid=ssg1S1003931
(2011.11.01) DS8700 / DS8800 internal error recovery can result in loss of access when 2145 SVC is attached. NOTE-1: Firmware fixed for DS8700 [Bundle 18.104.22.168 or higher, recommended 22.214.171.124]. NOTE-2: Firmware fixed for DS8800 [Bundle 126.96.36.199 or higher, recommended 188.8.131.52]. http://www-01.ibm.com/support/docview.wss?uid=ssg1S1003743
(2011.11.13) Storwize V7000- Performance Degradation and Loss of GUI/CLI Access Due to Excessive Numbers of Socket Connections. NOTE: This issue exists in all V6.1.0.x and V184.108.40.206-V220.127.116.11 releases. This issue was fixed in the V18.104.22.168 PTF release. http://www-01.ibm.com/support/docview.wss?uid=ssg1S1003930
The big question of course is which drive type to choose? The answer is that ideally you should possess three pieces of information:
How much usable space do you need in GB or TiB? Don't confuse binary and decimal!
What is your typical I/O profile? For instance, 70% reads / 30% writes with a 32KB block size.
What are your IOPS and response time requirements?
Armed with this information, get your IBM Sales Rep or Business Partner to model your requirements using Capacity Magic and Disk Magic. These modelling tools will tell you how much usable capacity a particular configuration will give you and what performance you can expect from it (given a particular I/O profile). If you don't know your I/O profile or IOPS requirements, you can still get performance modelling based on industry standard benchmarks.
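On the binary versus decimal point: disks are sold in decimal terabytes (powers of ten) but operating systems usually report binary tebibytes (powers of two), and the gap is almost 10% at this scale. A quick illustration:

```python
def tb_to_tib(tb):
    # 1 TB (decimal) = 10**12 bytes; 1 TiB (binary) = 2**40 bytes.
    return tb * 10**12 / 2**40

# A "10 TB" requirement is only about 9.09 TiB once the OS reports it.
print(round(tb_to_tib(10), 2))
# → 9.09
```

So if someone asks for "10 TB usable", always confirm whether they mean decimal TB or binary TiB before you size the configuration.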
Now this is interesting: IBM is offering a ratings system that allows customers who bought IBM products to write reviews and leave ratings (out of five) on IBM Storage, Power and System Z products, straight from the main ibm.com website.
Many of the preventable issues that occur in a SAN fabric can be avoided by using the right management and monitoring software. One way to get this software is to create or adapt open source packages. While I really like the idea (and price) of roll-your-own solutions, it is not always practical. Apart from the fact that you need to have staff with the relevant skills to do this, long-term maintenance can prove difficult when key people move on. Unfortunately the other extreme (which is far more common) is that many shops actually do nothing at all, ending up without any overall SAN management and monitoring methodology.
An ideal off-the-shelf alternative in a Brocade SAN fabric is to use IBM Network Advisor, the successor product to Data Center Fabric Manager (DCFM). IBM Network Advisor actually has its heritage in a great product called EFCM (Enterprise Fabric Connectivity Manager) that Brocade picked up when they bought McData. I loved working with EFCM and McData switches, especially the McData 6140, which was truly a great SAN director. When Brocade purchased McData they combined EFCM with their own Fabric Manager to create DCFM. They have since combined it with their network switch management software to create Network Advisor, bringing things to a whole new level. The IBM announcement letter for this software is here.
Now the first thing you may be wondering is: OK, so this software sounds great, but how much will it cost? The good news is that trying it out won't cost you anything. It's free to download and trial for 75 days. You can find the download site here.
To demo it, you can spin up a Windows 2008 guest from a template in your favorite hypervisor. This means you don't even need to request separate hardware to do this trial.
So what benefits should you expect to see? Well first up I am talking about preventing issues like these:
Mistakes made when performing zoning updates
Failure to create regular configuration backups (which especially hurts after a switch failure)
Difficulties upgrading firmware or simply too many upgrades to get through
Poor (or no) switch and performance monitoring
Poor (or no) error notification (including notification back to IBM)
Difficulty collecting log data
Lack of report creation software
In some ways you can sum up the benefits of the software quite easily by looking at the three central menus of IBM Network Advisor: Configure, Monitor, Report.
To give you a view of some of the menu choices, you can see just how rich the options are:
From a configuration perspective you can manage the zonesets of all your fabrics from the one place. This means you don't need to jump between switches. More importantly it gives you a clear indication of what a zoning update is adding AND removing. Accidental removal of a required zone is a very common cause of zoning related SAN issues:
Do you mean to remove that zone?
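That add-and-remove view is essentially a set difference between the old and new zonesets. If you ever want to sanity-check a pending update yourself, the idea can be sketched in a few lines (the zone names below are invented; a real check would parse the switch's zone configuration output):

```python
def zoning_delta(old_zoneset, new_zoneset):
    """Return (added, removed) zone names between two zonesets."""
    added = sorted(set(new_zoneset) - set(old_zoneset))
    removed = sorted(set(old_zoneset) - set(new_zoneset))
    return added, removed

old = ["host1_xiv", "host2_xiv", "host3_v7000"]
new = ["host1_xiv", "host3_v7000", "host4_xiv"]
print(zoning_delta(old, new))  # adds host4_xiv, removes host2_xiv
```

It is the "removed" list that saves you: an update that silently drops a zone a production host depends on is exactly the mistake Network Advisor's comparison view flags for you.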
It can automatically backup your switch configurations. Backing up your configs is frankly a mandatory task that is routinely never done. If a switch fails, then any customization and zoning (if it is a single switch fabric) is lost. This can be a major issue, especially if a business partner or former employee set the switch up. If we schedule a regular backup you won't need to remember, because IBM Network Advisor will do it for you:
Firmware updates also become a far simpler affair. IBM Network Advisor has a built-in FTP server and happily acts as a firmware repository. If you're facing a set of kangaroo hops (a multi-step upgrade path), this is a great way to make the whole process very, very simple. It will perform compatibility checks before you start and also act as a repository for both firmware and release notes (which is a really nice touch).
From a monitoring perspective, the ability to set up call home to IBM is a huge advantage and a vital step in building a SAN with the highest levels of availability. The added bonus is that you can use IBM Network Advisor to generate a supportsave (a log offload file that you will invariably be asked for during troubleshooting) off every switch in your fleet in one go (you can also set it up to perform this on a regular basis), significantly boosting productivity and aiding in troubleshooting. You can also set up Fabric Watch across the entire fleet of switches, all from a single interface.
If you own DCFM already, then you are eligible for a free upgrade. If, after trialling the software, you feel the significant availability benefits are worth having, talk to your IBM Sales Rep or Business Partner to get a price. I personally think you will find it very reasonable, plus I guarantee that it will not be shelfware and will prove to be a vital tool in getting the most from your SAN.
But... if after trialling IBM Network Advisor you're still determined to avoid paying for software, then you could always consider the open-source alternative (rather than do nothing). Check out this document written by Andy Loftus and Chad Kerner from the National Center for Supercomputing Applications at the University of Illinois. It's a great example of a lessons-learned document that describes how they built their own monitoring solution. You will find all of their documents and scripts here. As I said, roll-your-own might avoid vendor costs, but it has costs all of its own. Does your team have the skills, willpower and time to do this and maintain it? I would love to hear about your experiences either way.
On Friday November 18, 2011, IBMers around the world engaged in the world's first group therapy session held entirely on Twitter! (well maybe not the first, and not really group therapy, but it sounds more dramatic when I put it like that).
It focused entirely on tweeting classic lines heard in day to day life at IBM, using the hashtag #stuffibmerssay. The result was an amusing outpouring that kept growing as the day went on (and has not stopped). Karl Roche did a great summary write-up here where he captured some of the more classic stuff. Holly Neilson also wrote a nice blog post on the subject here.
You will notice many of the tweets focus on phone conferences, which are without a doubt the greatest contributor to, and destroyer of, productivity in IBM. Classics such as this one came up again and again (and it's a common problem for me):
VMware vSphere 5.0 brought in a considerable number of storage related improvements. One of these is VASA, which stands for VMware APIs for Storage Awareness - in which VMware yet again manages to place an acronym (API) inside an acronym (someone needs to send Grammar Girl down there to beat them up). But I digress...
VASA improves VMware vSphere’s ability to monitor and automate storage related operations. The VASA Provider delivers information about storage topology, capabilities, and state, as well as events and alerts to VMware. The VASA Provider is a standard vSphere management plug-in that is deployed once on each vCenter server to interact with VMware APIs for Storage Awareness.
You will of course need a VMware vCenter and an ESXi server both running version 5.0. Your XIV can be a Generation2 running 10.2.2 or 10.2.4 firmware or an XIV Gen3.
You can download the installation instructions for the IBM VASA provider here. You can download the release notes for the IBM VASA provider here. You can download the IBM VASA provider itself here.
If none of these links work, then the IBM Fix Central page for every XIV related file is here.
I am unsure about unnatural love, but perhaps the level of enthusiasm he is seeing comes from: ease of use, awesome GUI, consistent performance, freedom from planning RAID groups, simple growth and upgrade path... I could keep going... it all adds up.
So if you are a member of the cult of XIV, I have a little present for you: A really nice and simple reporting tool.
Here is what you need to do:
1) Download XIV Capacity Report 3.7 from this link. Click where it says Downloading this file.
2) You will get a zip file with five files in it. Unzip them into a folder on a Windows workstation. The Windows workstation also needs the XIV GUI installed on it (actually you only need the XCLI, but the Windows version of the GUI will give you that).
3) Of the five files you just unzipped, you need to edit the file called xiv_capacity_report_get_files.vbs. Open that file with a text editor (such as Notepad). The easiest way to do this is to right-click the file and choose Edit.
4) You need to edit the section that looks like this:
' *********** Edit this list of IP/names and user/password for your own configs ************************
myConfigs.Add "1", "-m 22.214.171.124 -u admin -p adminadmin"
myConfigs.Add "2", "-m 126.96.36.199 -u admin -p adminadmin"
Let's say you have two XIVs, the details for which are:
XIV1: Management IP: 10.1.10.100 User ID: admin Password: passw0rd
XIV2: Management IP: 10.1.20.100 User ID: admin Password: passw0rd
So we edit the section I mentioned above and make it look like this:
' *********** Edit this list of IP/names and user/password for your own configs ************************
myConfigs.Add "1", "-m 10.1.10.100 -u admin -p passw0rd"
myConfigs.Add "2", "-m 10.1.20.100 -u admin -p passw0rd"
Now save the file and we are done editing. If you only have one XIV, then delete the line starting with myConfigs.Add "2" (or put an apostrophe at the start of the line to comment it out). If you have more than two XIVs, just add extra lines for myConfigs.Add "3", myConfigs.Add "4" and so on, adding details for each machine as shown above. You can ignore the lines further down in the file that start with an apostrophe, these are just examples.
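The myConfigs.Add lines follow a simple fixed format, so if you manage a larger fleet you could generate them rather than hand-edit. A throwaway Python sketch (the IPs and credentials are the invented examples from above):

```python
def myconfigs_lines(systems):
    """Emit myConfigs.Add lines in the format the script expects.

    systems: list of (ip, user, password) tuples, one per XIV.
    """
    return ['myConfigs.Add "{}", "-m {} -u {} -p {}"'.format(i, ip, user, pw)
            for i, (ip, user, pw) in enumerate(systems, start=1)]

for line in myconfigs_lines([("10.1.10.100", "admin", "passw0rd"),
                             ("10.1.20.100", "admin", "passw0rd")]):
    print(line)
```

Paste the output over the example lines in xiv_capacity_report_get_files.vbs and you are done; the sequential numbering ("1", "2", ...) matches what the script expects.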
Unless you acquire another XIV, you will not have to do this file editing again.
5) Now double-click on the icon: xiv_create_capacity_report.bat. This is a Windows bat file that will create a Windows command prompt while it is running. It uses XCLI commands, so if the XIV GUI or XCLI is not installed, it won't work. The output will be a new folder with today's date and time. Inside that folder will be a report that will be named something like: xiv_capacity_report_2011_10_30_17_6_36.xls
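Because the run date and time are baked into the report file name, keeping a history is easy. If you ever want to sort or prune old reports programmatically, the timestamp can be parsed back out of a name like the one above (a hypothetical helper, not part of the tool itself):

```python
import datetime
import re

def report_timestamp(filename):
    """Parse the run time out of a report name like
    xiv_capacity_report_2011_10_30_17_6_36.xls
    (year_month_day_hour_minute_second, no zero padding)."""
    m = re.search(
        r"xiv_capacity_report_(\d+)_(\d+)_(\d+)_(\d+)_(\d+)_(\d+)\.xls$",
        filename)
    if m is None:
        raise ValueError("not a capacity report file name")
    return datetime.datetime(*map(int, m.groups()))

print(report_timestamp("xiv_capacity_report_2011_10_30_17_6_36.xls"))
# → 2011-10-30 17:06:36
```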
You can now open the report and check it out (presuming you have Microsoft Excel or some other software that can open XLS files). On my laptop I get a message talking about file formats when I open the file.
You can ignore this message. If you save the file as an XLS you won't get this message again.
The report itself will have five tabs as shown below:
For every column in every tab, filtering (or sorting) is already set up. This makes it really easy to re-arrange the data to suit what you're looking for.
Arrays tab Lists details about all your XIVs including serial numbers, code versions, soft and hard capacity, how much of the soft and hard space is allocated, how much is free and how much space is being consumed. It is a great place to grab the machine serial number or confirm which machine has space available.
Pools tab Lists every pool in every XIV showing every possible sizing metric you could possibly want. Cells will be coloured red or yellow if limits are being reached. It is a great place to confirm if your pools are filling up and whether a pool is a good candidate to be changed to Thin Provisioning. Sort column L (allocated vs used) or column N (Hard Capacity Utilization) to identify good candidates for swapping to Thin Provisioning. These are the pools that can give up some hard space.
Hosts tab Will list every defined host for every XIV. You can straight away spot how much space has been allocated to each host and more importantly, how much is being used. Cells will be coloured yellow or red if limits are being reached. Some nice tricks:
Sort by column F (Allocated vs Used) to identify hosts that have asked for lots of space, but not used much of it.
Compare column G (# of volumes) with column I (# volumes mirrored). You may have critical hosts that require every volume to be mirrored, so a quick compare will confirm if there are exceptions.
Volumes tab Will list every volume defined on every XIV. This is a great tab to check which volumes are being mirrored, how many snapshots exist for each volume and how much space is being used by each volume. Again cells in the Used column will be coloured red or yellow if space is becoming short. Some great tricks here:
Sort column F or G (Used GB and %) to identify volumes with no or little data in them. Perhaps they are not really needed? Perhaps they are over-sized or should be in a Thin Provisioning pool.
Sort column H (Mirrored) to identify all volumes where Mirrored = No. Should they be mirrored?
Sort column K (Host Mapped) to identify all volumes not mapped to a host. Unmapped volumes are a great potential source of space!
Failures tab The Failures tab shows any failed components in your machines (like failed disks).
So please download the tool and try it out. Service providers love using this tool for reporting; it is so quick and easy to set up and run. Every time you run the tool you get a new report, so you can automate report creation and keep a nice history.
If you were signed into IBM developerWorks when you downloaded the tool and an update is made available, you should be notified by email, provided your IBM ID is set-up properly with a valid e-mail address.
And as for cults... there is only one cult I ever really liked and they really were called The Cult. The video takes about 15 seconds to get going and yes, the lead singer is dressed like a pirate. Enjoy! (if you like 80s rock...)
For those of you with Apple iPads, you might consider dropping by the App Store and picking up your free IBM XIV Mobile Dashboard.
The IBM XIV Mobile Dashboard application can be used to securely monitor the performance and health of your XIV over a Wi-Fi or 3G link. Having downloaded and installed the Mobile Dashboard you will get a lovely XIV Icon:
When you start the Mobile Dashboard you will have the choice to either run in Demo Mode or to connect to an actual XIV. Demo mode can be accessed by selecting the Demo Mode option down in the lower right-hand corner. So you don't actually need an XIV to give it a test drive.
To logon to a real XIV you will need a valid username, password and IP address.
Once connected you have the choice of viewing volume performance or host performance. If you view (hold) the iPad in portrait mode you get a list of up to 27 volumes or hosts ordered by performance metrics (it defaults to ordering by IOPS). If you view the iPad in landscape mode you will get a more graphical output (as per the examples below). There are no options to perform configuration, the dashboard is intended only for monitoring. This means each panel will show the performance and redundancy state of the XIV.
The volume performance panel is shown by default. The example below shows the output when the iPad is operated in landscape mode. From this panel you can see up to 120 seconds worth of performance for a highlighted volume. Use your finger to rotate the arrow on the blue volume icon to switch the display between IOPS, bandwidth (in megabytes per second or MBps) and latency (in milliseconds or MS). The data redundancy state of the XIV is shown in the upper right hand corner (in this example it is in Full Redundancy, but it could be Rebuilding or Redistributing).
The example above shows the output when the iPad is operated in landscape mode. If you instead rotate the iPad to portrait mode, you will get a list of the performance of up to 27 of your busiest volumes.
Now swipe to the left to navigate to the Hosts panel as shown below.
From this panel you can see up to 120 seconds worth of performance for a highlighted host. Use your finger to rotate the arrow on the purple host icon to switch the display between IOPS, bandwidth (in megabytes per second or MBps) and latency (in milliseconds or MS). The data redundancy state of the XIV is shown in the upper right hand corner (in this example it is in Full Redundancy, but it could potentially also be Rebuilding or Redistributing). Swipe to the right to navigate to the Volumes panel.
The example above shows the output when the iPad is operated in landscape mode. If you instead rotate the iPad to portrait mode, you will get a list of the performance of up to 27 of your busiest hosts.
From either the volumes or the hosts panels you can log off from the Mobile Dashboard using the icon in the upper right-hand corner of the display. When you log back on, the last used XIV IP address and username will be displayed (but not the password, which will need to be entered again).
I can see some nice use cases here. You get a call regarding performance but you are on the road. Are there any problems with the XIV? You can quickly logon with your iPad and confirm if response times are normal and the redundancy state is Full Redundancy.
A better use case... now you can ask your manager to buy you an iPad, so you can monitor your XIV! Let me know how that goes.
The eternal question: Which hardware/software combinations are tested and supported? If you use IBM Storage hardware and you need to answer this question you need to be using the IBM System Storage Interoperation Center, or SSIC, which you will find here:
I use this site a lot and rely heavily on the output it creates. I thought I knew the site well, but I recently learnt some really handy tricks that you might find helpful...
1) Export all the interop data for a single product version.
If you want to download every interoperability test result for a specific product version, you can select the relevant version from the Product Version box of the SSIC and then select Export Selected Product Version (xls).
In the example below we want to see all the results for XIV Gen3 which uses XIV Software version 11.
a) Use the scroll bar in the Product Version box to bring up the XIV product versions. You don't need to make a selection from the Product Family or Product Model boxes.
b) Select IBM Storage System (11) from the Product Version list.
c) Select the option to Export Selected Product Version (xls). A spreadsheet compressed into a ZIP file will be downloaded in your browser.
So that is just two clicks and the result is a giant spreadsheet. Reminds me of when interop matrices were giant PDFs.
2) Changing your selections from an existing search
As you make selections, the webpage leaves a trail of what are called breadcrumbs. They will appear at the top of the page and can be seen in the example below, numbered 1 to 6. You can use those breadcrumbs to go backwards at any point, to any point.
3) Start anywhere
It seems to be human nature to always start at the top and work downwards. But in fact you can start anywhere on the SSIC and work in any direction. There are no real restrictions on the combinations you can attempt to build. Every time you make a selection in a different box, the number of configuration results will drop. For instance just click FICON in the Connection Protocol box... or just select IBM AIX 7.1 from the Operating System box. Then work up or down from there.
Hopefully these suggestions will help you work more effectively with the SSIC.
The IBM XIV Gen3 was announced July 12, 2011 with a planned availability date of September 8, 2011. So far I have written blog articles about the changes to the rack, the layout and the disk drives. So it's now time to head around to the back of the rack!
The first thing I like to point out when showing clients a 2nd Generation XIV is the patch panel. This is a really nice innovation that places all the external connections (Fibre Channel SAN, iSCSI LAN, Management LAN and Remote and Local Support connections) into the one easily accessible place. Users not only like the simplified layout, but also appreciate that they can run cables to the patch panel through the top of the machine or from under the floor.
The big change in the IBM XIV Gen3 is that the developers got very excited about iSCSI connections, provisioning a staggering 22 ports (on a fully provisioned machine). This means the patch panel had to be redesigned to accommodate these extra ports.
In the examples below (taken from the GUI), active ports are coloured white while inactive ports are yellow.
With the 2nd Generation XIV patch panel, the 24 fibre channel ports are at the top followed by the 6 iSCSI connections, the management ports, remote support ports and local support ports.
With the IBM XIV Gen3 patch panel, there are two sections. The top section has the 24 fibre channel ports (all the ports on the left hand side) while the 22 iSCSI ports are all on the right hand side. Note the bottom two iSCSI ports are grayed out because they are used for internal connections (thus 22 ports, not 24). The lower section of the panel has the management and local and remote support ports.
Here is a picture showing the rear view of the rack with the two section patch panel indicated.
As I have explained in previous posts, you can get Visios of the XIV patch panels and the XIV racks from here. If there is another Visio stencil you would like to see, feel free to leave a comment and I will get busy.
IBM have offered Brocade switch modules for their BladeCenters for many years. One common question I get asked is which of these switch modules are supported with the IBM Storwize V7000 or IBM XIV.
Your first port of call is the System Storage Interoperation Center, or SSIC. However, it will only list the switch modules by the IBM feature part number (for example 26K5601), which may confuse you. So to help you determine which switch module you possess, here is a history of Brocade SAN switch modules for IBM BladeCenter:
2 Gbps Brocade Switch Modules for IBM BladeCenter
There were two switches, 26K5601 and 90P0165, but they are actually the same switch with different software features. The Enterprise version had extra licenses like Trunking and Extended Fabrics. These products were withdrawn from marketing on May 26, 2006.
4 Gbps Brocade Switch Modules for IBM BladeCenter
IBM offered two models which were physically identical. You could upgrade from the 10 port to the 20 port with a software activation key. These products were withdrawn from marketing on December 31, 2010.
Brocade 10-port SAN Switch Module for IBM eServer™ BladeCenter: 3 external and 7 internal ports
Brocade 20-port SAN Switch Module for IBM eServer BladeCenter: 6 external and 14 internal ports
For each module, the IBM feature part number, Brocade product number, Brocade switch type number, IBM FRU number, Brocade ASIC type and the minimum and maximum supported FOS versions are listed in the spreadsheet linked at the end of this post.
8 Gbps Brocade Switch Modules for IBM BladeCenter
There are actually three models available but I have collapsed them down to two. The 20-port switch is offered in both Enterprise and non-Enterprise versions. The 42C1828 switch module is the Enterprise version, which has more licensed software features such as Trunking, Fabric Watch and Extended Fabrics.
Brocade 10-port 8Gb SAN Switch Module for IBM BladeCenter: 3 external and 7 internal ports
Brocade 20-port 8Gb SAN Switch Module for IBM BladeCenter: 6 external and 14 internal ports
As with the 4 Gbps modules, the feature part numbers, Brocade product and switch type numbers, FRU numbers, ASIC types and supported FOS version ranges are in the spreadsheet linked at the end of this post.
At the moment the SSIC does not list every switch module. However support is available for many configurations via the IBM SCORE request system (sometimes called an RPQ). Your IBM pre-sales storage specialist can raise one of these. Depending on your request you may get a support statement as quickly as overnight.
If you're wondering what an IBM FRU number is: a FRU or Field Replaceable Unit is the part number used in the IBM spare parts system to replace your part under warranty or maintenance agreement.
If you want the information above in a spreadsheet format, you can find it here.
I was inspired by this article on CBC News regarding the 30th anniversary of the launch of the IBM PC. That's right, on Friday August 12, 2011 the IBM PC turned 30.
The IBM 5150 Personal Computer
Of course there were plenty of alternatives out there, but the IBM PC set standards that changed the industry forever (and IBM!). There is some great material in the IBM archives. Check them out here.
My first computer? An Exidy Sorcerer that I purchased around 1982. There it is in my bedroom (check out the wood paneling and macrame plant holder!). It had 32KB of RAM plus pluggable cartridges and a cassette tape recorder for storage.
The Exidy Sorcerer
I sold it in 1984 to a doctor who paid far more than I initially did. He was running his whole surgery on a Sorcerer and desperately needed another one for parts. Tells you something about the risks of writing software for a closed platform.
My next computer was a 512 KB Apple Macintosh that I bought in 1985 through the University of Western Australia (UWA). UWA was an all Apple campus with Macs and then Mac SEs in every faculty. The library had Macs you could rent by the hour.
The original Apple Macintosh (image from Wikipedia)
I remember paying $400 Australian for an external floppy disk drive. There was no hard drive and definitely no web browser!
My first employer (a High School) had networked BBC Micros running CP/M. There were four 5.25" floppy disk drives in the main unit: the A, B, C and D drives.
The BBC Micro (image from Wikipedia)
My second employer (also a High School) had IBM JXs running DOS.
The IBM JX (image from Wikipedia)
And my first computer at IBM was not a PC at all. It was an IBM 3290 Gas Plasma terminal that gave you four mainframe logons at the same time. I still remember that console with great affection. I found an image on Flickr if you want to see what one looked like.
VMware vSphere 4.1 brings in a brilliant new function to offload storage related workload. This function is called VAAI (vStorage APIs for Array Integration) and requires that your SAN storage supports VAAI and that your ESX or ESXi server has a driver installed to utilize it.
IBM first supported VAAI with the IBM XIV using an IBM supplied VAAI driver. IBM then added support to the Storwize V7000 and SVC, so IBM has now released a new VAAI driver to support all three products at once. You can find the driver, installation guide and release notes at this URL.
I discovered some quirks in the process of updating the IBM VAAI driver from the previous version to the latest one on VMware ESXi. The benefit of moving to the updated driver is that it supports the IBM XIV as well as the Storwize V7000 and IBM SVC.
I downloaded the new driver from here; the files use the following naming convention:
Version 18.104.22.168 is named IBM-ibm_vaaip_module-268846-offline_bundle-395553.zip
Version 22.214.171.124 is named IBM-ibm_vaaip_module-268846-offline_bundle-406056.zip
Version 126.96.36.199 is named IBM-ibm_vaaip_module-268846-offline_bundle-613937.zip
The last six digits in the file name are what differentiate them. However, when I ran the --query command against an ESXi box, I got confused:
Both the uplevel and downlevel VAAI driver files start with IBM-ibm_vaaip_module-268846, so which one is installed? I ran the following command to confirm whether the updated bulletin (the one ending in 613937) applies. This confirmed my ESXi server was running the older driver and needed the upgrade.
vihostupdate.pl --server 10.1.60.11 --username root --password passw0rd --scan --bundle IBM-ibm_vaaip_module-268846-offline_bundle-613937.zip
The bulletins which apply to but are not yet installed on this ESX host are listed.
---------Bulletin ID--------- ----------------Summary-----------------
IBM-ibm_vaaip_module-268846 vmware-esx-ibm-vaaip-module: ESX release
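Rather than eyeballing file names, the differentiating build number can be pulled out programmatically. Here is a quick sketch (the parsing logic is mine, not part of any IBM or VMware tooling):

```python
import re

# Bundle names copied from this post -- only the trailing build number differs.
bundles = [
    "IBM-ibm_vaaip_module-268846-offline_bundle-395553.zip",
    "IBM-ibm_vaaip_module-268846-offline_bundle-406056.zip",
    "IBM-ibm_vaaip_module-268846-offline_bundle-613937.zip",
]

def build_number(name):
    """Return the six-digit build suffix that tells the bundles apart."""
    m = re.search(r"offline_bundle-(\d{6})\.zip$", name)
    return m.group(1) if m else None

for name in bundles:
    print(name, "->", build_number(name))
```

Handy if you keep a directory full of these bundles and cannot remember which is which.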
To perform the upgrade, I first used vMotion to move all guests off the server I was upgrading. I then placed the server in maintenance mode and installed the new driver.
There are no commands needed to activate VAAI or claim VAAI-capable devices in ESXi. You simply need to confirm that both boxes shown in the example below contain the number 1 (one for hardware accelerated move and one for fast init):
To test VAAI I normally do a storage migration (Storage vMotion), moving a VMDK between datastores on the same storage device. What you should see is very little VMware-to-storage I/O, as I showed in this blog post and this one.
My colleague Alexandre Chabrol from the Montpellier Benchmarking Center also helped me out with the ESXCLI commands to control VAAI. We can confirm the state of each of the three VAAI functions and switch them off and on. We use the -g switch to display them, the -s 0 switch to turn them off and the -s 1 switch to turn them on. In this example I first confirm that VAAI is active for hardware accelerated moves, hardware accelerated initialization (write zeros) and hardware assisted locking. I then disable and re-enable hardware accelerated moves.
esxcfg-advcfg.pl --server 10.1.60.11 --username root --password password -g /DataMover/HardwareAcceleratedMove
Value of HardwareAcceleratedMove is 1
esxcfg-advcfg.pl --server 10.1.60.11 --username root --password password -g /DataMover/HardwareAcceleratedInit
Value of HardwareAcceleratedInit is 1
esxcfg-advcfg.pl --server 10.1.60.11 --username root --password password -g /VMFS3/HardwareAcceleratedLocking
Value of HardwareAcceleratedLocking is 1
esxcfg-advcfg.pl --server 10.1.60.11 --username root --password password -s 0 /DataMover/HardwareAcceleratedMove
Value of HardwareAcceleratedMove is 0
esxcfg-advcfg.pl --server 10.1.60.11 --username root --password password -s 1 /DataMover/HardwareAcceleratedMove
Value of HardwareAcceleratedMove is 1
Final thought: most if not all of these commands can be done via the vSphere Client GUI; you do not need to use the CLI. But I am surprised how many people like to use the CLI and want to see example syntax. Got a preference yourself? I'd love to hear about your experiences.
*** Update February 20, 2012 ***
The IBM Storage Device Driver for VMware VAAI was updated in February 2012. The new version fixes a rare case where XIV, Storwize V7000, or SVC LUNs are not claimed by the IBM Storage device driver. If you are using the previous version without issue, there is no need to upgrade. I have updated this post to reflect the new version.
Last week I talked about the differences between the XIV Generation 2 and XIV Gen3 by just looking at the rack. This week we open the front door to see if we can spot any more differences...
First up, you notice that it looks almost exactly the same... but appearances can be deceiving.
So what actually is different? From the front there are three obvious visible differences, two of which are not that interesting....
The XIV Generation 2 has a storage grid that uses two 48-port Gigabit Ethernet network switches for interconnection between the modules (these are only visible from the back of the rack). These switches get redundant power via an RPS-600 Redundant Power Supply (RPS), which sits at the front of the rack (directly above module 6). The XIV Gen3 on the other hand uses two 36-port InfiniBand switches that have redundant power supplies built in, so the Gen3 does not need the RPS. But its spot has not remained empty...
The XIV Generation 2 has a special server called a Maintenance Module located at the rear of the rack. You may notice the USB modem plugged into it. The XIV Gen3 uses an IBM System x3250 M3 mounted at the front of the rack. This server is used for maintenance, upgrades and remote access (if necessary, via modem). You can spot it here directly below the nameplate, where the RPS used to be:
If you look closely at the disks in a Gen3 you will notice they are marked as SAS drives, not SATA. This gives us a performance boost even though the rotation speed remains the same. If you want to see this close up for yourself, check out this Kaon 3D model of the XIV Gen3.
This got me wondering why SAS drives, which have the same rotational speed and seek time as SATA drives, could perform better. The two main reasons are that SAS is full duplex and that SAS supports tagged command queuing. There is a great article on the differences here that references SPC testing performed by Seagate. A quote from the article:
SAS drives offer a significant improvement in performance over SATA drives in both throughput and IOPs primarily due to their full duplex, bi-directional I/O capabilities. Published Storage Performance Council (SPC) benchmark results demonstrate this feature with up to 64 percent improvement in the SPC-2 benchmark (based on multiple workload testing).
So is that it for differences between XIV Generation 2 and Gen3? Well visibly from the front... yes it is. The big changes are around the back and inside the modules, which I will cover in a future blog post.
In the meantime, take a look at my new Visio stencils for XIV. I have added three new ones (which I am still refining), so check them out and let me know what you think. You will find them here; if and when I update them, you will get a notification so you can stay up to date.
The IBM Storage Management Console for VMware vCenter version 2.5.1 is now available for download and install. This version supports XIV, SVC and Storwize V7000 as per the versions in the following table (the big change being support for version 6.2):
If you want to see a video showing the capabilities of the new console, check out this link.
After installing the console, you will get this lovely new icon:
Start it up and select the option to add new storage; you now get three choices:
If you're using SVC or Storwize V7000, you need to specify an SSH private key. This key MUST be in OpenSSH format. This caused me a problem, as I kept getting this message when trying to add my Storwize V7000 to the plug-in:
Unable to connect to 10.1.60.107. Please check your network connection, user name, and other credentials.
I could use the same IP address, user ID and SSH private key to log on to the Storwize V7000 using PuTTY, so I knew none of these things was wrong.
I reread the Installation Instructions closely and realized my mistake. It clearly states:
Important: The private SSH key must be in the OpenSSH format.
If your key is not in the OpenSSH format, you can use a certified
OpenSSH conversion utility.
I pondered what conversion utility I could use, then realized I had it all along: PuTTYgen. I opened PuTTYgen, imported my private key (the .ppk file) and exported it in OpenSSH format. You don't need to do anything with the public key.
I was then able to add the Storwize V7000 by specifying the private SSH key exported using OpenSSH format.
Now I have both the IBM XIV and Storwize V7000 in the vCenter plug-in and can get detailed information about, and manipulate, both. In this example I have highlighted the Storwize V7000, which also reveals its firmware level.
I was tempted to detail the many things you can do with the plug-in, but you're better off watching the video via this link.
So are you using the plug-in? Have you upgraded to version 2.5.1 yet? Comments very welcome!
Hopefully if you were in Melbourne last week you made it to the IBM Pulse 2011 conference at the Crown Promenade. It was a great success, and with 850 attendees the facilities were packed, especially the main hall.
My highlights? Well apart from visiting the IBM developerWorks stand and getting a free IBM floppy disk T-Shirt...
... it was listening to customers. There were 14 customer case study presentations where attendees could hear real-world experiences from real-world customers. For the storage track we were lucky to have Angus Griffin from Edith Cowan University talking about how they use IBM solutions, including IBM SVC with VMware SRM, to build their disaster recovery solution. Angus is a great presenter who used a sort of Takahashi Method PowerPoint deck where each slide was just one sentence. Below is an example. Can you guess what he was talking about?
It was of course why clients sometimes do not have a comprehensive disaster recovery strategy.
I presented on Storage Virtualization and the Storwize V7000. You can check out my presentation on Slideshare. I have struggled for some time to match my presentation style to the sort of material that IBM produces, and I am moving to a more pared-back approach. If you view this presentation on my Slideshare channel you will also get some speaker notes.
If you want a copy of the presentation and you're an IBMer, you can find it on Cattail. For everyone else, please send me an email or leave a comment.
The other client who presented in the storage track was Richard Whybrow from Hertz Australia. Richard's presentation on how Hertz uses IBM solutions to manage their backup and encryption requirements was short and to the point. But the highlight was Richard's movies. I want to point you to two of them, which you can find on his YouTube channel. The first one is hilarious... here is the SAL 9000 restoring 1.6 TB of data in seconds!
If you're looking for something slightly more serious, here is Richard's winning entry in the IBM Tivoli Software Products Rock competition. Richard is sitting at Southbank, close to the IBM building here in Melbourne. There is also a great shot of Melbourne's Flinders Street Station at the end (as well as a tribute to the film Minority Report).
Rob Jackard from the ATS Group does a great job amalgamating IBM storage site updates so I am sharing them here with you. Here is my high level view:
AIX users: review the service dates for your technology level.
DS3500 users: upgrade your firmware to 7.70.45.00 or 7.77.19.00.
DS8000 users: take note of the limitation on resizing a space efficient repository. I dealt with this recently at a client by writing a script to delete the FlashCopy targets, delete and recreate the repositories, and then create the FlashCopy targets again.
SVC and Storwize V7000 users: upgrade to 184.108.40.206 or 220.127.116.11. Be aware of the limitations on split cluster and Global Mirror intra-cluster mirroring.
(2011.06.28) Technical Bulletin: AIX 5.3 Support Lifecycle Notice:
NOTE-1: After October 1, 2011, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 5300-11.
NOTE-2: End of Support for AIX 5.3 has been announced as 04/30/2012.
NOTE-3: IBM is no longer providing generally available fixes or interim fixes for new defects on systems at AIX 53-TL06, -TL07, -TL08, -TL09, -TL10.
http://www14.software.ibm.com/webapp/set2/subscriptions/onvdq?mode=18&ID=2110&myns=pwraix53
(2011.06.28) Technical Bulletin: AIX 6.1 Support Lifecycle Notice:
NOTE-1: After October 1, 2011, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 6100-04.
NOTE-2: Sometime after May 1, 2012, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 6100-05.
NOTE-3: IBM is no longer providing generally available fixes or interim fixes for new defects on systems at AIX 61-TL00, -TL01, -TL02, -TL03.
http://www14.software.ibm.com/webapp/set2/subscriptions/pqvcmjd?mode=18&ID=5488&myns=paix61
(2011.06.29) Space Efficient Flash Copy Repository size should not be changed for an existing repository. NOTE: The code fix, to fail the chsestg command if the size of the repository is changed, is available for Release 4.3 (Bundle 18.104.22.168) and Release 5.1.5 (Bundle 22.214.171.124). https://www-304.ibm.com/support/docview.wss?uid=ssg1S1003793
(2011.07.11) Storwize V7000 and SAN Volume Controller Software Upgrades to V126.96.36.199 May Stall if Performance Monitoring Activities are Performed During the Upgrade Process. NOTE: This issue has been fixed by APAR IC77000 in the V188.8.131.52 PTF release. https://www-304.ibm.com/support/docview.wss?uid=ssg1S1003846
(2011.06.10) Storwize V7000 and SAN Volume Controller FlashCopy Replication Operations Involving Volumes Greater Than 2 TB in Size Will Result in Incorrect Data Being Written to the FlashCopy Target Volume. NOTE: This issue is fixed by APAR IC76806 in the 184.108.40.206 and 220.127.116.11 PTF releases. https://www-304.ibm.com/support/docview.wss?uid=ssg1S1003840
As a child I used to love the spot the difference cartoon in the Sunday paper. You usually had 10 differences to circle... and I could never find the last one. Look carefully at the two machines below. Can you spot the differences?
It's an XIV Generation 2 on the left and an XIV Gen3 on the right. For me it's the side panels that give it away (of course the Gen3 printed on the front panel helps).
The big change is the rack that the product uses:
The Generation 2 IBM XIV uses an APC AR3100 (also called a NetShelter). The Gen3 IBM XIV uses an IBM T42 rack.
So why the change?
Three good reasons:
Using the T42 lets us offer an optional Ruggedized Rack Feature, providing additional hardware that reinforces the rack and anchors it to the floor. This hardware is designed primarily for use in locations where earthquakes are a concern. As you may be aware there have been some major earthquakes around the world recently (with tragic results). Clearly our clients in earthquake prone areas need us to provide a model that can be hardened for use in earthquake zones.
Using the IBM T42 rack lets us offer an optional IBM Rear Door Heat Exchanger, which is an effective way to assist your Air Conditioning system in keeping your datacenter cool. It removes heat generated by the modules in the XIV before the heat enters the room. Inside the door of the heat exchanger are sealed tubes filled with circulating chilled water. Its unique design uses standard fittings and couplings and because there are no moving or electrical parts, helps increase reliability. It can be opened like any rear cover, so serviceability of an XIV Gen3 fitted with a heat exchanger is as easy as the standard air cooled version.
Using the T42 lets us offer a rack that matches our standard rack offering. It's a sturdier rack that travels far better over both short and long distances. To put it simply: it's a more substantial rack.
One nice feature that both products offer is feature code 0200 (weight reduction for shipping). When ordered, it tells the plant to ship the XIV in a weight-reduced format. For Generation 2 this means the rack that IBM ships will weigh around 300 kg (unpacked from the shipping crate); the rest of the machine (the modules and the UPSs) ships in separate boxes. The XIV Gen3 will weigh more, as less hardware is removed, although I am still confirming what that weight will be. The advantage is that you can use lower-rated goods lifts and move the XIV across floors that are not rated for the maximum weight. You just have to ensure that the planned location of the XIV can support the final weight. And the really nice thing? This feature is available at no extra cost.
(edited 27/7/11 to clarify feature code 0200 will be different for XIV Gen3).
Tiny toast or giant hand? It's not an optical illusion.
When IBM offered 2 TB drives on the XIV, I thought I was seeing an illusion: the overall power consumption had dropped (not risen). Guess what? With XIV Gen3, power consumption drops yet again.
Using worst case power consumption numbers I can see the following maximums:
180 drive XIV Generation 2 with 1 TB drives(79TB usable): 8.4 kVA
180 drive XIV Generation 2 with 2 TB drives(161TB usable): 7.1 kVA
180 drive XIV GEN3 with 2 TB drives (161TB usable): 6.7 kVA
So with every update to the product, power consumption has kept dropping. Let's compare the first Generation 2 to the XIV Gen3:
Usable capacity up by 103%
Power consumption down by 20%
Heat output down by 20%
Noise level down by 33%
Performance up by up to 400%
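The headline percentages can be sanity-checked from the raw capacity and kVA figures quoted above. A quick sketch (the figures are copied from this post; the arithmetic helper is mine):

```python
# Worst-case figures quoted above for fully populated (180-drive) racks.
gen2_1tb_usable, gen2_1tb_kva = 79, 8.4   # Generation 2, 1 TB drives
gen3_2tb_usable, gen3_2tb_kva = 161, 6.7  # Gen3, 2 TB drives

def pct_change(old, new):
    """Percentage change, truncated toward zero (matching the figures above)."""
    return int((new - old) / old * 100)

print("Usable capacity change: %d%%" % pct_change(gen2_1tb_usable, gen3_2tb_usable))  # 103
print("Power consumption change: %d%%" % pct_change(gen2_1tb_kva, gen3_2tb_kva))      # -20
```

Capacity up by 103% and power down by 20%, exactly as listed.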
How about this as a measure: let's compare the Microsoft Exchange ESRP report for XIV Generation 2 found here with the newly released report for XIV Gen3 found here. While ESRP is not a benchmarking program, the results are truly impressive.
For more information on Power Consumption and the cost of running XIV, check out these white papers:
If you're an existing XIV user and you're interested in measuring your current power consumption, check out my tutorial here. If you want the spreadsheet shown in the video, drop me a comment (your email address will appear in my comments dashboard but will not be visible to anyone else).
IBM has been selling IBM-branded Brocade switches since 2001, when we announced the 8-port 2109-S08 and 16-port 2109-S16. These were classic switches that ran at 1 Gbps. They had a front operator panel with a small keypad (a feature which, in the rush to fit in more SFPs, did not appear in later models). Since then IBM has gone on to sell many of Brocade's switches and directors.
Sometimes you need to convert a Brocade model name to an IBM model name (or the other way around). One way to assure yourself with scientific accuracy which type of switch you are working on is to telnet or SSH to the switch and issue a switchShow command. You will get a switchType value. In this example, my switch is switchType 27.2.
Or if you are using the Web GUI, you can also see the switch type on the opening screen. In this example the switch is a type 34.0.
Having scientifically determined the type of switch, we can now use my decoder ring to determine the IBM machine type, IBM model name and Brocade model name. I have ordered the switches by type number. There are three things to note:
Brocade have dropped the Silkworm branding, so I have dropped it too.
Each switch type has sub-types, for example 34.0 and 34.1. The difference is a sub-version number which is normally not published or documented.
IBM announced 16 Gbps SAN switches on August 16, 2011 so I updated the chart on that date.
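If you script against many switches, the decoder-ring lookup is easy to automate. The sketch below shows the idea only: the table entries are PLACEHOLDERS, not real mappings, so substitute the rows from my chart:

```python
# Illustrative decoder-ring lookup. The entries below are placeholders,
# not the real table -- fill in the mappings from the chart in this post.
DECODER = {
    27: {"ibm_type": "EXAMPLE-A", "brocade_model": "Example Model A"},
    34: {"ibm_type": "EXAMPLE-B", "brocade_model": "Example Model B"},
}

def decode(switch_type):
    """Map a switchShow switchType (e.g. '34.0') to model names.

    The digits after the dot are a sub-version that Brocade does not
    normally document, so only the integer part is used as the key."""
    major = int(str(switch_type).split(".")[0])
    return DECODER.get(major)

print(decode("34.0"))
print(decode("27.2"))
```

The point is simply that the sub-version can be safely discarded before the lookup.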
If you use Data Center Fabric Manager (DCFM), it actually displays the Switch Type using Brocade model names. Here is an example report from the DCFM we are running in my lab. This level of information is very helpful.
XIV Gen 3 modules are built on a new generation of Intel microprocessors based on the Nehalem micro-architecture. Nehalem is the most profound architecture change that Intel has introduced in the 21st century. Some of the key changes and their benefits are:
Integrated memory controller: The memory controller now sits on the same silicon die as the processors. It runs at the same clock-speed as the processors instead of at the lower speed of an external front-side bus. This dramatically improves memory performance and therefore overall system performance.
No need for buffered memory: Previously, buffered memory was required to improve the performance of the memory sub-system. Buffered memory is relatively expensive and energy hungry. With the faster Nehalem integrated memory controller, the system can deliver improved performance without needing buffered memory, saving cost as well as energy. XIV Gen 3 will be faster and cooler at the same time using unbuffered DDR3 RAM. And since the memory is cheaper, we can put more in.
Increased memory capacity: Nehalem supports more memory chips at higher speeds. In XIV Gen 3 this translates into a 50 to 200% increase in system cache, significantly lifting the performance headroom of an already stellar performer.
No more front-side bus: Memory, second CPU package and peripherals no longer have to share and wait on a single bus to communicate. The connections are now direct or switched, enabling increased parallelism and the ability to do more work simultaneously.
PCI Express Generation 2: The I/O sub-system doubles in speed with the introduction of PCI Gen-2. This enables faster network and I/O adapters for XIV Gen 3: - 8 Gbps Fibre Channel host connections. - More iSCSI host connections (including at the entry configuration of 6 modules). - Multi-channel, low-latency InfiniBand as the inter-module connection. - A slot for solid state disk (SSD).
Better systems management instrumentation: The system supports increased monitors for sub-systems for more sophisticated self diagnostics and healing. Remote management capability has also been improved.
Furthermore, the new motherboards have additional expansion capacity (more processors, memory and I/O) that can be utilized to deliver future improvements in performance and increased software functionality.
XIV Gen 3 is not the first storage sub-system to adopt the Nehalem architecture. Some of our competitors (EMC and NetApp for example) have already done so with their dual-controller arrays. XIV Gen 3 takes the Nehalem architecture advantage forward, not twice, but six to fifteen times.
Many thanks to Patrick Lee for writing up this great summation.
Today IBM is announcing a new member of the XIV family, which we are calling XIV Gen3. I thought I would give a brief history of how we got here before I get too carried away with details.
What was Generation 1 of the XIV?
In 2002 an Israeli startup began work on a revolutionary new grid storage architecture. They devoted three years to developing this unique architecture, which they called XIV. They delivered their first system to a customer in 2005. Their product was called Nextra (does it look familiar?).
What was Generation 2 of the XIV?
In December 2007, the IBM Corporation acquired XIV, renaming the product the IBM XIV Storage System. The first IBM version of the product was launched publicly on September 8, 2008. Unofficially within IBM we refer to this as Generation 2 of the XIV.
The differences between Gen1 and Gen2 were not architectural; they were mainly physical. We introduced new disks, new controllers, new interconnects, improved management and additional software functions.
As anyone who has read my blog knows, I have been working on the Generation 2 XIV since the day IBM began planning to release it as an IBM product. So it is very exciting to be able to share with you that we are now releasing Generation 3 of the IBM XIV Storage System.
What is Generation 3 of the XIV?
Generation 3 of the XIV is a new member of the XIV family that will be an alternative to the Generation 2 XIVs we currently offer. It does not change the fundamental architecture; that remains the same. What it does do is bring significant updates to almost every part of the XIV, including:
Introducing InfiniBand interconnections between the modules.
Upgrading the modules to add 2.4 GHz quad-core Nehalem CPUs, new DDR3 RAM and PCI Gen 2 (using 8x slots that can operate at 40 Gbps).
Upgrading the host HBAs to operate at 8 Gbps.
Upgrading the SAS adapter.
Upgrading the disks to native SAS.
A new rack.
A new dedicated SSD slot (per module) for future SSD upgrades.
Enhancements to the GUI plus a native Mac OS version.
I will be blogging about each of these changes over the coming days and weeks as we move toward the general availability date, so watch this space. In the meantime, why not visit the official XIV page here and check out the ITG Report linked there.
Over at SearchStorage.com.AU they recently published an article entitled Six reasons to adopt storage virtualisation. You can find the article here. The six given reasons are:
Storage virtualisation reduces complexity
Storage virtualisation makes it easier to allocate storage
Better disaster recovery
Better tiered storage
Virtual storage improves server virtualisation
Virtual storage lets you take advantage of advanced virtualisation features
It's a well-written article and I agree with every point. But one could be forgiven for reading the article and thinking that either storage virtualisation is new, or that storage virtualisation is something you might consider AFTER doing server virtualisation. Neither is true.
IBM embraced storage virtualisation in June 2003 when we announced our SAN Volume Controller (the IBM SVC). I even found a CNET.com article from way back then. You can find it here (the image below is a screen capture of that CNET website).
IBM's SVC product has been enhanced repeatedly since 2003 with an enormous list of supported host servers and backend storage controllers. We have added new functions every year including Easy Tier, split cluster, VAAI, an enhanced GUI and a new form factor for the SVC code in the form of the Storwize V7000.
So let me give you a seventh reason for adopting storage virtualisation: a vendor who has shown genuine support for this technology. No vendor has embraced storage virtualisation with more enthusiasm than IBM. We have an industry-leading solution with phenomenal SPC benchmarks, an enormous number of case studies and an architecture that does not lock you in. Indeed, it is an architecture that can grow as you grow and that can be upgraded without disruption.
So please consider storage virtualisation from IBM, using either the SVC or the Storwize V7000. If you're in Australia, we have demo centers dotted around the country. Many of our Business Partners can also demonstrate IBM storage virtualisation using their own Storwize V7000s. If you're in Melbourne, feel free to give me a call and schedule a time to drop into Southgate.
I read a great blog post recently on Written Impact that talked about how to create effective presentations. It's well worth reading and can be found here. They describe several different formats that will help you develop interesting presentations, ones that don't put your subjects to sleep.
Talking of presenting, I recently presented at the IBM Power and Storage Symposium in Manila. It was a great event and was very well attended. We even had cake to celebrate IBM's 100th birthday.
There are two IBM Symposiums coming up in Australia that I would love for you to attend:
The next IBM Power Systems Symposium will be held in Sydney running from August 16 to 19, 2011. We are currently finalizing the agenda on this one and while this symposium is dedicated mainly to IBM Power Systems... I will be attending and presenting on storage related topics. To check out the details and enroll, please head over to here.
An IBM Storage Symposium will be held in Melbourne running from November 15 to 17, 2011. The agenda is still being set, so if you have ideas about what you would like to see, please let me know. To check out the details and enroll, please head over to here. And yes! I will be attending and I will be presenting.
If you're a user of XIV, or you're considering purchasing one, there is one tool that you will truly love. It's called XIVTop. The XIVTop application comes packaged with the XIV GUI and is one of the handiest add-ons I have ever seen. It lets you monitor your XIV in real time, seeing exactly how much I/O or throughput is being achieved and at what response time (in milliseconds). You can immediately answer questions like:
Is poor application response time being caused by poor storage response time?
What application is currently generating so much traffic on the SAN?
What effect has performing file defragmentation had on performance?
Are the backups running and how much traffic are they generating?
What happens when I run multiple application batch jobs at the same time?
The ability to get this information in real time is what makes XIVTop so invaluable.
So in the tradition of always pushing my boundaries, I thought I would create a narrated video about XIVTop. What I discovered is just how hard making narrated videos is: you need to write a script... you need to stick to the script... you must not fluff any words... you need to speak slowly and clearly and not start talking in a strange accent. I had trouble with all of these, so I made take after take after take, until I was heartily sick of the process. I now have a much greater respect for newsreaders and film actors. This narration stuff is hard!
So please check out my final take. It's still far from perfect, but all feedback is very welcome. The only other strange thing is YouTube's choice of videos to watch after mine. It's worth watching just to see the list. I think the term "performance" confuses the algorithm.
I thought I would quickly check out two of the announced features of the 6.2 release: the new Performance Monitor panel and support for greater-than-2 TiB MDisks. So on Sunday I got busy and upgraded my lab Storwize V7000 to the new 6.2 code.
Remember that in nearly every aspect the firmware for the SVC and Storwize V7000 are functionally identical, so while I am showing you a Storwize V7000, it equally applies to an SVC.
Firstly I tried the performance monitor panel, and what better way to show you what I saw than on YouTube? This is my first YouTube video, so please forgive me if it's not slick. I started the performance monitor and captured two minutes of performance data using Camtasia Recorder. Because it is fairly boring to stare at graphs slowly moving right to left, I then sped it up eight times, and this is the result:
The video is shot in HD, so if what you're seeing is grainy or hard to read, change the display to 720p or 1080p. If you want to see the performance monitor at its actual speed, here is the original normal-speed video. Remember, this is the same video as above, just slower. It can also be viewed in 720p.
The top right hand quadrant is volume throughput in MBps as well as current volume latency and current IOPS.
The bottom left hand quadrant is Interface throughput (FC, SAS and iSCSI).
The bottom right hand quadrant is MDisk throughput in MBps as well as current MDisk latency and current IOPS.
You will note that each metric has a large number (which is the current metric in real time) and a historical graph showing the previous five minutes. You can also change the display to show either node in the I/O group.
I found the monitor to be genuinely real time: the moment I changed something in the SAN (such as starting or stopping IOMeter, or starting or stopping a volume mirror), I immediately saw a change.
Greater than 2 TB MDisk support
Next I logged onto my lab DS4800 and created two 3.3 TiB volumes to present to the Storwize V7000. I chose this size because I had exactly 6.6 TiB of available free space on the DS4800 and I wanted to demonstrate multiple large MDisks. On versions 6.1 and below, the reported size of the MDisks would have been 2 TiB (as I discussed here). Now that I am on release 6.2 with a supported backend controller, I can present larger MDisks. In the example below you can clearly see that the detected (and usable) size is 3.3 TiB per MDisk.
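The pre-6.2 behaviour can be pictured as a simple size clamp. This is a toy illustration of what I described above, not the actual firmware logic:

```python
TIB = 2 ** 40  # bytes per tebibyte

def detected_mdisk_size(actual_bytes, release):
    """Toy model of the behaviour described above: before release 6.2 an
    MDisk larger than 2 TiB was detected as only 2 TiB; from 6.2 (with a
    supported backend controller) the full size is usable."""
    if release < (6, 2):
        return min(actual_bytes, 2 * TIB)
    return actual_bytes

mdisk = int(3.3 * TIB)  # one of the 3.3 TiB volumes from the DS4800
print(detected_mdisk_size(mdisk, (6, 1)) // TIB)   # clamped to 2 TiB
print(detected_mdisk_size(mdisk, (6, 2)) == mdisk) # full size visible
```

This is why both of my 3.3 TiB MDisks would have shown up as 2 TiB before the upgrade.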
What controllers are supported for huge MDisks?
The supported controller list for large MDisks has been updated. The links for Storwize V7000 6.2 are here and for SVC here. If your backend controller is not on the list, then talk to your IBM Sales Representative about submitting a support request (known as an RPQ).
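To make the 2 TiB boundary concrete, here is a small illustrative helper (my own sketch, not part of any IBM tool) that reports the size the cluster would detect for a given backend volume, assuming the pre-6.2 behaviour of truncating MDisks at 2 TiB and 6.2+ detecting the full size on a supported controller:

```python
# Hypothetical helper (not an IBM tool): given a backend volume size in TiB,
# report the MDisk size a Storwize V7000 / SVC would detect, assuming pre-6.2
# code truncates at 2 TiB and 6.2+ detects the full size.
PRE_62_LIMIT_TIB = 2.0

def detected_mdisk_size(volume_tib: float, code_level: str) -> float:
    """Return the MDisk size the cluster would report, in TiB."""
    major, minor = (int(x) for x in code_level.split(".")[:2])
    if (major, minor) >= (6, 2):
        return volume_tib                       # full size detected on 6.2+
    return min(volume_tib, PRE_62_LIMIT_TIB)    # truncated on 6.1 and below

print(detected_mdisk_size(3.3, "6.1"))  # truncated to the old limit
print(detected_mdisk_size(3.3, "6.2"))  # full size detected
```

So the same 3.3 TiB volume that shows up as 2 TiB on 6.1 is seen at full size once you are on 6.2 with a supported backend.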
(edited 24/5/2011 --> removed old Visio Stencils link).
VisioCafe has been updated with IBM's latest official stencils for use with Microsoft Visio. These include all models of the Storwize V7000, including the newest models: the 2076-312 and 2076-324 (which have the dual port 10 Gbps iSCSI card).
I really enjoy teaching, particularly when the students are coming from a non-IBM background. It gives me the chance to better learn how IBM's products compare to our competitors', because the experiences and viewpoints come from real end users. It also helps me to reconfirm my knowledge and understanding of our own products. There is a very basic rule in IT: if you cannot explain a concept to someone else, you probably don't understand it yourself.
The course consists of a day of lectures and a day of labs (using the XIV labs in Montpellier, France). Here is the course layout.
Unit 1 - IBM XIV Storage System
Unit 2 - IBM XIV administration
Unit 3 - Implementation and configuration
Unit 4 - Host systems attachment and mappings using FCP
Unit 5 - Host systems attachment and mapping using iSCSI
Unit 6 - Copy Services
Lab 0 - Lab setup and preliminary instructions
Lab 1 - IBM XIV Storage Management: Installation
Lab 2 - IBM XIV Storage Management: Configuration
Lab 3 - Host definition and mappings: Attaching a Windows server to an XIV
Lab 4 - Host definition and mappings: Attaching an AIX server to an XIV
Lab 5 - Host definition and mappings: Attaching a Linux server to an XIV
Lab 6 - IBM XIV configuration: Monitoring
Lab 7 - IBM XIV Copy Services: Snapshots
Lab 8 - IBM XIV Copy Services: Remote mirror
The idea is to teach all the concepts on day one and then let the students hit real machines in a remote lab environment on day two. The hands on part is always the best bit as far as I am concerned (learning by doing always beats learning by listening). Students who have never touched the XIV GUI always enjoy this part.
A bigger challenge is when you have a student who already has lots of hands on experience. In those cases I work to consolidate what they have already learned.
I am curious: how often have you waited so long to do a course that there was not much left to learn by the time you actually got to do it?
In a previous blog entry I mentioned a new iPhone and Blackberry app that gives you info on IBM Storage. I actually now have three IBM supplied iPhone apps that you can get through the Apple Store. The dW app is a social networking app that lets you interact with your contacts on the IBM developerWorks website. I didn't realize that IBM effectively had its own social networking site..... but that's exactly what the developerWorks site is! For more information, check out the October 13 developerWorks podcast, available here.
The IBM Storage and IBM System x iPhone apps are very similar in design and layout. They both list product types by family, giving specifications for each machine type. For example, these are the specifications listed for the Storwize V7000. For each product you also get a Description page and Web link pages. You also get links to Facebook, YouTube, Twitter, LinkedIn and other contacts.
There are still some areas where things can be improved. Not all of the products have their specifications listed yet. They instead direct you to the web.
Nevertheless I think this is a great start. It shows IBM's commitment both to social media and to being as informative, open and communicative with our customers as possible. As for Android users, we are listening... Expect an Android version hopefully before the end of the year.
Oh.... and to find these apps... just open the Apple App Store and search for IBM.
Just a quick note about using Oracle Solaris with IBM XIV (I so want to say "Sun Solaris"; I need to retrain my brain). When using IBM XIV with Solaris, you need to install IBM's XIV Host Attachment Kit (delightfully called a HAK). This ensures multi-pathing is correctly configured (regardless of whether you're using DMP or MPxIO). The relevant software, release notes and instruction guides are found here. Anyway... the whole point of this blog entry is to correct a shortcoming in the release notes: they currently fail to mention some minimum system requirements. I am getting this corrected, but until then, please note the following:
1. The HAK for Solaris 10 supports only Solaris 10 U4 and greater (also referred to as Solaris 10 08/07). This means if, for instance, you're on 11/06 (update 3) or 03/05, you will need to first perform a Solaris update.
2. The following patch is mandatory for Solaris 9/SPARC: 118462-03 (it's a prerequisite for HAK installation).
3. Solaris 9 is supported for SPARC only.
4. Solaris 8 is not supported.
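To make those requirements easy to check at a glance, here is a small hypothetical pre-install check (my own sketch, not part of the HAK or its installer) that encodes the minimum levels listed above:

```python
# Hypothetical pre-install check (not part of the HAK) encoding the minimum
# Solaris levels listed above: Solaris 10 needs U4 (08/07) or later,
# Solaris 9 is SPARC only (and needs patch 118462-03), Solaris 8 is out.
def hak_supported(release: int, update: int = 0, arch: str = "sparc") -> bool:
    """Return True if this Solaris release/update/arch can run the HAK."""
    if release == 8:
        return False                # Solaris 8 is not supported
    if release == 9:
        return arch == "sparc"      # SPARC only; patch 118462-03 required
    if release == 10:
        return update >= 4          # U4 (08/07) or greater
    return False                    # anything else: check the release notes

print(hak_supported(10, 4))             # Solaris 10 U4: supported
print(hak_supported(10, 3))             # Solaris 10 11/06 (U3): update first
print(hak_supported(9, arch="x86"))     # Solaris 9 x86: not supported
```

The function names and structure are purely illustrative; the authoritative source is (or will be, once corrected) the HAK release notes.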
I am pleased to announce IBM will have a Symposium dedicated just to IBM System Storage. It will be held at:

Crowne Plaza Darling Harbour
150 Day Street
Sydney, Australia
DURATION: 2.5 days
DATE: 6-8 December 2010 (that's a Monday to Wednesday)

I am personally organising the session list in conjunction with our USA based Symposium organisation team. I am also helping to line up the speakers. The focus is definitely on helping clients see the IBM vision while getting practical information on how to get the best use of IBM System Storage. The session list is still being finalised; once it is, I will be posting it.

Jenny Morris and Danielle Lofting will be organising the venue. They have a fantastic, enthusiastic and very experienced team, so the organisation will be at a very high level. I have posted the current symposium description below. As I said... more news to come, so stay tuned.

Come away from our first Annual System Storage® Technical Symposium in Asia Pacific with strategic and tactical knowledge of how to optimise every aspect of your customers' server/storage network. Let IBM® executives, developers and industry expert speakers from all around the world help you realise what the very latest IBM Storage technology means to your bottom line.

TRACKS
The IBM System Storage Technical Symposium in Sydney will offer sessions and tracks in Archiving & Compliance, Business Continuity, Disk Systems, Open Systems & System z® Storage Management, Security, Storage Networking (Data Center Networking (DCN), Network Attached Storage (NAS), Storage Area Network (SAN)), Tape Systems, SONAS and Virtualization solutions. Test-drive the latest technologies through hands-on labs. Roll up your sleeves and get involved with the latest systems and software.

OVERVIEW
Attend this Technical Symposium and find out how you can address the growing challenge of managing and securing retention-managed data using storage solutions from IBM and IBM Business Partners. Learn about the latest enhancements to the IBM System Storage portfolio, all of which support virtualization, openness and collaborative innovation.
Learn how to dramatically improve storage utilisation by using IBM's unique and patented HyperFactor® technology in the IBM System Storage ProtecTIER® Deduplication solutions.
Discover how to rapidly expand storage infrastructure by hundreds of terabytes up to multiple petabytes at a time with minimal effort using IBM's new Scale Out Network Attached Storage (SONAS) solution.
Learn about the impact of cloud computing on storage infrastructure.
Discover how to protect information with IBM self-encrypting disk and tape storage solutions.
Learn about the IBM XIV® Storage System, a revolutionary high-end open disk system designed to support key current and future business requirements for a highly available information infrastructure.
Discover new Tivoli® Storage Productivity Center functions that deliver enhanced management capabilities for virtualization technologies, including AIX®, VMware® and SAN Volume Controller (SVC).
Learn how to use IBM Tivoli Storage Manager (TSM) to automate data backup and restore functions, and centralise storage management operations.
Discover how to reduce cost and complexity through consolidation and virtualization using the new DS8700 with solid state drives (SSDs).
The XIV GUI is all about simplicity. It's about taking tasks which on other products are difficult or time consuming, and either eliminating them or making them as simple as possible. But for those who like to issue commands via a command line interface (a CLI), the XIV also has a very rich CLI called XCLI. If you're familiar with the XCLI, you're hopefully aware that list commands can produce much more detailed output if the -x option is used (-x requests XML output). So here is something you can try out. If your XIV is on 10.2.1 firmware, you can use the module_list -x command to display how much server memory each XIV module has. If your XIV has 2 TB disks, you should find that you have 16 GB of server memory per module. This means a 15 module machine has a whopping 240 GB of server RAM. To be clear, I am not referring to all of this as 'cache', because a small portion (around 2.5 GB) of the RAM in each module is used by the module's internal Linux operating system. This means that a 15 module XIV with 2 TB drives and 16 GB of server memory per module has over 200 GB of cache. As former Australian Prime Minister Paul Keating once said: "it's a beautiful set of numbers".
XIV>>module_list -x
<XCLIRETURN STATUS="SUCCESS" COMMAND_LINE="module_list -x">
  <OUTPUT>
    <module id="100156">
      <component_id value="1:Module:1"/>
      <status value="OK"/>
      <currently_functioning value="yes"/>
      <requires_service value="no"/>
      <target_status value=""/>
      <type value="p10hw_regular"/>
      <disk_bay_count value="12"/>
      <fc_port_count value="0"/>
      <ethernet_port_count value="0"/>
      <io_allowed value="yes"/>
      <io_enabling_priority value="0"/>
      <serial value="SHM0946265L0QPK"/>
      <last_serial value=""/>
      <original_serial value="SHM0946265L0QPK"/>
      <part_number value="45W8430"/>
      <last_part_number value=""/>
      <original_part_number value="45W8430"/>
      <bmc_version value="0.100-343"/>
      <sdr_version value="SDR Package 46"/>
      <bios_version value="S5000.86B.10.00.0094.101320081858"/>
      <ses_version value="60"/>
      <sas_version value="1.27.00"/>
      <memory_gb value="16"/>
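One nice thing about XML output is that it scripts easily. As a sketch (the XML shape is assumed to match the fragment above; real output wraps many module elements inside OUTPUT), here is how you could pull the memory_gb value out of saved module_list -x output with Python's standard library and redo the cache arithmetic:

```python
# Sketch: extract per-module memory from saved "module_list -x" output.
# The XML shape is assumed from the fragment above; a real full-rack XIV
# would have 15 <module> elements.
import xml.etree.ElementTree as ET

sample = """<XCLIRETURN STATUS="SUCCESS" COMMAND_LINE="module_list -x">
  <OUTPUT>
    <module id="100156">
      <component_id value="1:Module:1"/>
      <memory_gb value="16"/>
    </module>
  </OUTPUT>
</XCLIRETURN>"""

root = ET.fromstring(sample)
per_module = [int(m.find("memory_gb").get("value"))
              for m in root.iter("module")]

modules = 15                                # a full-rack XIV
total_ram = per_module[0] * modules         # 16 GB x 15 = 240 GB
cache = (per_module[0] - 2.5) * modules     # ~2.5 GB per module for the OS

print(total_ram)   # 240
print(cache)       # 202.5
```

Which confirms the arithmetic above: 240 GB of server RAM, of which over 200 GB is available as cache.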
So it's that time of the month again. Rob Jackard from the ATS group does a fantastic job summarizing changes to the IBM Storage Support site, and you get all the benefit of his hard work (via me!). So cast your eyes down the list and look for issues that may affect you....
AIX:
(2010.08.21) AIX Support Lifecycle Notice - AIX 5.3 TL9 & TL10.
NOTE-1: After November 2010, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 5300-09 (applies to all Service Packs within TL9). Sometime after May 1, 2011, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 5300-10 (applies to all Service Packs within TL10).
NOTE-2: As a reminder, IBM is no longer providing generally available fixes or interim fixes for new defects on systems at AIX 5300-06, AIX 5300-07 or AIX 5300-08.
(2010.08.21) AIX Support Lifecycle Notice - AIX 6.1 TL2 & TL3.
NOTE-1: After November 2010, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 6100-02 (applies to all Service Packs within TL2). Sometime after May 1, 2011, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 6100-03 (applies to all Service Packs within TL3).
NOTE-2: As a reminder, IBM is no longer providing generally available fixes or interim fixes for new defects on systems at AIX 6100-00 or AIX 6100-01.
NOTE: Users of MPIO storage running the 5300-12 TL: an operation to change the preferred path of a LUN could hang. A similar hang could be experienced during LPAR migration when it tries to switch the preferred paths. Install APAR IZ77907.
NOTE: Users of MPIO storage running the 6100-02 TL: an operation to change the preferred path of a LUN could hang. A similar hang could be experienced during LPAR migration when it tries to switch the preferred paths. Install APAR IZ77908.
(2010.08.17) NPIV clients of SDDPCM hosts may experience permanent application errors during SVC concurrent code upgrade or node reset with certain APARs and SDDPCM versions. The risk, although rare, exists in any AIX SDDPCM host or client.
NOTE: The changes made for VIOS client hangs in Technote SSG1S1003579 require additional AIX driver and SDDPCM code updates for a specific SVC error condition.
IBM recently announced the new System Storage DS3500 Express. The DS3500 is an entry level storage system that can be easily serviced and managed by an end-user. It is a very worthy successor to the DS3200/DS3300/DS3400 product line. So I thought I would share with you 10 things I really like about the new IBM DS3500 (in no particular order).
1) It's small
The base unit is only 2U in size and can hold either 12 of the 3.5" disks or 24 of the smaller 2.5" disks (depending on model). Each expansion drawer can also hold 12 of the 3.5" or 24 of the 2.5" disks (depending on model), and you can have three of them. So that's a potential 96 disks in 8U of rack space.
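The density arithmetic above is worth a quick sanity check (purely illustrative, using the 2.5" model at maximum expansion):

```python
# Quick check of the density claim: base unit plus three expansion
# drawers, each 2U, each holding up to 24 of the 2.5" disks.
enclosures = 1 + 3           # base unit + maximum expansion drawers
rack_units = enclosures * 2  # every enclosure is 2U
disks = enclosures * 24      # 2.5" model, 24 drives per enclosure

print(rack_units, disks)     # 8 96
```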
2) It's all SAS
In my opinion, Serial Attached SCSI (SAS) is the future of disk attachment. Traditional parallel SCSI is so 20th century and FATA didn't work out too well. I think SATA and FC-AL attached disk will eventually be replaced by SAS, and the DS3500 is all SAS at the disk back end and SAS by default at the host front end as well.
3) It's got FlashCopy
The DS3500 can create two FlashCopies without any extra licenses. I really like the fact that if you're doing an OS or application upgrade, you can give yourself a quick roll-back point by just reserving some space for a FlashCopy repository. This is also a great way to test whether FlashCopy is right for your business and, if so, buy the license to create more than two copies at a time.
4) It's got remote mirror
The DS3000 range up until now did not offer remote mirror capability. This meant that if you wanted a DR solution you needed to buy something to go over the top, such as IBM SVC or Softek Replicator. The DS3500 now offers its own native replication that not only fills a gap but is also compatible with existing DS4000s and DS5000s that you may already have in your business.
5) It's got nearline
So FATA disk may not have worked out, but nearline SAS is a far better alternative. The 2.5" model offers a 500 GB 7.2 K RPM nearline SAS drive. Or how about a 2 TB drive in the 3.5" form factor? Want some archive disk using nearline where the spindle count will still deliver good performance? Here's the solution.
6) It's green
If we accept that MAID was not the solution for the masses, the better thing is to simply do more with less, which is exactly what the DS3500 does. We are talking around 500 W of power usage for a 48 disk two drawer solution (with 2.5" disks). That's around half the power consumption of the equivalent model with 3.5" disks. This means less power drawn in and less hot air blown out.
7) One model to rule them all
The DS3500 comes in one model: SAS. You want fibre channel? No problem, just add the card. You instead want iSCSI? Same deal, just add the card. All models retain the SAS adapters which are proving so popular in the rack and blade server space.
8) It's got encryption
You need a point solution to provide data-at-rest encryption? Here it is, with 300 GB and 600 GB self-encrypting drives that protect your data with no performance impact. Even better, the software to manage encryption is rolled into the DS Storage Manager. Talking of which...
9) Easy Management
The DS3500 continues to use an intuitive and easy to use GUI which now includes all the dynamic volume management functions. This is an improvement over previous models, where this had to be done via the command line.
10) It's cheap
Being entry level it is priced for that market. You could also place it behind the SVC for a quick encryption solution or as a VDisk mirror repository.
I received a question the other day about how the XIV interacts with a customer's building UPS.
So I thought I would share my answer with you.
The XIV has two separate line cords (there is an option to have four line cords, but I am trying not to complicate this). This means the client's building power provides the XIV with two separate power sources.
As long as one of those two line cords provides input power, then the XIV will continue to operate normally.
If both power sources stop supplying input power then the client is not providing any electricity to the XIV (none at all).
This would suggest the client's computer room has suffered a severe building facility failure and that all of their other equipment has lost power too.
In this situation the XIV will continue to operate normally for 30 seconds on battery power, waiting in the hope that the client's power will come back on at least one of the two line cords.
If after 30 seconds the XIV has not detected the return of any input power, it must take action to ensure it does not flatten its internal UPS batteries, so it performs a graceful shutdown and powers itself off. Why wait only 30 seconds? The main reason is brown-out protection. If the client loses power for 20 seconds, then power returns, and this happens repeatedly, they could progressively flatten the batteries to the point where the XIV may not be able to gracefully shut down. This is not desirable, so the 30 second timer is a good compromise.
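The brown-out logic above can be sketched as a simple decision rule (my own illustration of the behaviour described, not XIV firmware):

```python
# Illustrative sketch (not XIV firmware) of the 30 second grace period:
# short outages are ridden through on battery, anything longer triggers
# a graceful shutdown to protect the UPS batteries.
GRACE_SECONDS = 30

def react_to_outage(outage_seconds: float) -> str:
    """Return the action the array takes for a single loss of input power."""
    if outage_seconds < GRACE_SECONDS:
        return "ride through on battery"   # power returned within the window
    return "graceful shutdown"             # destage cache, then power off

print(react_to_outage(20))   # a 20 second brown-out is ridden through
print(react_to_outage(45))   # a sustained outage forces a clean shutdown
```

The fixed window is the design compromise: long enough to absorb most brown-outs, short enough that repeated outages cannot drain the batteries below what a graceful shutdown needs.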
Overall this design allows the client the greatest levels of availability and data protection.
In terms of site EPO, the XIV does not have an EPO switch or interface, because the XIV design has a strict requirement to perform a graceful shutdown prior to power off.
If the client wants to manually power the machine off, they could instead issue a CLI or GUI command to the machine to request shutdown.
Shutdown takes about 30 seconds to complete because the machine needs time to destage cache and metadata to disk prior to shutting down the Linux OS that runs on each module.
So how do you power the XIV back on?
Just press the On switch on each of the three UPS modules (preferably all at once).
So how do you manually power the XIV off?
Always use the XCLI or XIV GUI to shut the XIV down. There are power off buttons on each XIV UPS, but these should be covered by a plate and never used (if they are not covered up, please contact IBM to have this done). We don't use these buttons because they don't let the modules shut down gracefully.
If you launch an xCLI session from the XIV GUI, issue the following command and then respond to the prompts:
If you want to script the command then you need a script that looks like this:
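The screenshots of the command and script did not survive here, but as a hedged sketch: the XCLI flag names (-m for the management IP, -u/-p for credentials) and the "shutdown -y" syntax are from memory and may differ by XCLI version, so verify them against your release notes. This example only builds and prints the command rather than running it:

```python
# Hedged sketch: build an unattended XCLI shutdown invocation.
# The flags (-m, -u, -p) and "shutdown -y" are assumptions from memory;
# check your XCLI release notes before using this for real.
import subprocess

def build_shutdown_cmd(mgmt_ip: str, user: str, password: str) -> list[str]:
    """Return the argv list for an unattended XIV shutdown."""
    return ["xcli", "-m", mgmt_ip, "-u", user, "-p", password,
            "shutdown", "-y"]    # -y pre-answers the confirmation prompt

cmd = build_shutdown_cmd("10.0.0.1", "admin", "secret")  # placeholder values
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Printing the command first (a dry run) is a sensible habit for anything that powers off a production array.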