IBM SAN Volume Controller (SVC) has offered Fibre Channel storage virtualization since June 2003. Two SVC nodes communicate with each other via Fibre Channel to form a high-availability I/O group. They also use Fibre Channel to communicate with the storage they virtualize and with the hosts they serve that virtual storage to. When IBM added real-time (Metro Mirror) and near real-time (Global Mirror) replication, that too was done over Fibre Channel, with each SVC cluster connecting to its partner either over dark fibre (with or without a WDM) or via FCIP (Fibre Channel over IP) routers.
Each Fibre Channel port on an SVC node can be a SCSI initiator to backend storage and a SCSI target to hosts, all while communicating with its peer nodes over those same ports. With every generation of SVC node these ports got faster, going from 2 Gbps to 4 Gbps to 8 Gbps. In SVC firmware V5.1 IBM added iSCSI capability to the SVC using the two 1 Gbps Ethernet ports in each node, which allowed each node to also be an iSCSI target for LAN-attached hosts.
When the Storwize V7000 came out in October 2010 it offered all of this capability, plus two fundamental changes to the design.
Firstly the two controllers in a Storwize V7000 can communicate with each other across an internal bus, eliminating the need to zone them together (or even attach the Storwize V7000 to Fibre Channel fabrics).
The other more obvious difference is that a Storwize V7000 comes with its own disks, which it communicates with via multi-lane 6 Gbps SAS.
When IBM added 10 Gbps Converged Enhanced Ethernet adapters to the SVC and the Storwize V7000, these adapters operated as iSCSI targets, allowing clients to access their volumes via a high-speed iSCSI network. In V6.4 code IBM allowed these adapters to also be used for FCoE (Fibre Channel over Ethernet). These effectively act as additional SCSI target ports, allowing hosts with CEE adapters to connect to the SVC or V7000 over a converged network.
If you have a look at the Configuration limits page for SVC and Storwize V7000 version 6.4 (the Storwize V7000 one is here), you will see this interesting comment:
"Partnerships between systems, for Metro Mirror or Global Mirror replication, do not require Fibre Channel SAN connectivity and can be supported using only FCoE if desired"
So does this mean we can stop using FCIP routers to achieve near real-time replication between SVC clusters or Storwize V7000s? The short answer is: most likely not. Let's look at why...
The whole reason Fibre Channel became the standard method of interconnecting enterprise storage with enterprise hosts is simple: packet loss is prevented by buffer credit flow control. Frames are not allowed to enter a Fibre Channel network unless there are buffers in the system to hold them, so frames are normally only dropped if there is no destination to accept them. Fibre Channel is a highly reliable, scalable and mature architecture. When we extend Fibre Channel over a WAN we do not want to lose this reliability, so we use FCIP routers like Brocade 7800s, which continue to ensure frames are reliably delivered, in order, from one end point to the other.
Converged enhanced ethernet allows Fibre Channel to be transported inside enhanced ethernet frames. The one fundamental that CEE brings to the table is the same principle that a frame should not enter the network without a buffer to hold it. Extending FCoE over distance has the same challenge: the moment you start moving those frames over a WAN connection you need to ensure frames are not lost due to congestion. How do we do this? The same way we did with Fibre Channel: we use Dark Fibre, we use WDMs or we use routers. The same issues and requirements exist.
For more information on FCoE over distance check out this fantastic Q&A from Cisco:
It's a story that has been repeated many times: You buy a shiny new storage system..... and it is beautiful.
Then... a disk fails, which takes just the tiniest bit of shine off.
No problem you declare! You place a service call and the disk is replaced. So far so good.
But then as the vendor service representative is walking out the door, it suddenly occurs to you... hey, that person is taking away the failed disk. Doesn't that disk have my data on it?
The short answer is that unless you have purchased self-encrypting drives, or are encrypting your data prior to writing it, that failed drive will almost certainly contain some readable data. How readable will depend on the product. If the disk contains de-duplicated, compressed data, it would present a great (but I suppose not insurmountable) challenge to any would-be data snooper. But a failed disk removed from a standard RAID array would contain data in sequential chunks (perhaps 256 KB in size). Whether that would be useful is another question.
So what to do?
First up, every responsible vendor takes great pains to ensure failed hard drives are not simply thrown in the dumpster or sold in job lots. As RailCorp in Australia found out the hard way (when it started selling off the media held by its lost-and-found department), not controlling media with client data on it is a very bad idea. Instead, responsible vendors usually return failed drives either to the original manufacturer (to get a warranty rebate) or to a reutilization center (either their own, or a third party's). In either case, there is a financial benefit to them in doing this. The shipment will be done in a secure fashion and any disk drive that can be repaired will be thoroughly wiped; if not, it will be securely destroyed. Again, all the major vendors should be able to produce a policy document explaining how this is done. For the majority of clients out there, I personally think this is good enough.
But what if you don't think this is good enough? What if your data is way too sensitive to take any risks?
Simple answer: Keep the failed disks.
A quick Google search came up with lots of easy to find programs from most major storage vendors. Just search for something like disk retention service (retention is the key word here). Here are some examples:
The only fly in the ointment is that these services are generally not free... and if you realize this only after the first drive has failed, you may find yourself negotiating with your vendor on price, well after the main purchase is complete. The only exception I have found so far is that IBM Australia lets you retain failed drives for free, provided the machine is covered by a Service Pac.
Of course maybe you knew this already and have always retained failed drives, but now your store-room is slowly filling with failed disks. Now what? Well I do not suggest you do this, but I sure laughed while watching it (sorry if there is an advertisement before-hand):
Instead, Google search for secure hard disk shredding or secure hard disk recycling. Examples I found in Australia very quickly (I have not contacted or dealt with either of these) included this one and this one. I am sure there are plenty of choices out there.
IBM has today announced a whole swag of planned new features across the entire IBM Storage product line. You can read the announcement letter here and I have also dropped the text at the bottom of this blog post (to save you clicking on the link).
It's a very impressive list, but to home in on a few of the more exciting offerings:
IBM Easy Tier will be enhanced to cache hot data in SSD storage installed in a client server. It looks like it will initially be a combination of DS8700/DS8800 and AIX or Linux servers. I am sure there are plenty who will immediately think of EMC VFCache, so I am keen to get more details so I can see how the two compare. If you are curious in the meantime, check out this EMC fact sheet and then read this fascinating interview with the CMO of Fusion-io.
A new high density storage module will be made available, initially I suspect for the DS8800. This is a really important step as we are seeing a lot of new technologies emerging in the SSD space. This is because the technical requirements of SSD don't always line up with the architectures of existing storage controllers, so a custom built enclosure designed just for SSD makes perfect sense.
The IBM XIV will be enhanced with the ability to cluster multiple XIVs together and migrate volumes non-disruptively between them. The non-disruptive volume migration is a great new feature which should definitely help with swapping XIVs out as new models become available.
There are plenty of other new features as well, so check out the announcement letter reproduced below:
IBM® intends to support a number of new enhancements to a variety of IBM storage systems in the future. These enhancements will leverage innovative research on intelligent algorithms, automation, and virtualization that is being incorporated into products in the IBM storage portfolio. The statements of direction highlighted here are intended to provide a glimpse into the IBM storage roadmap for selected product capabilities.
IBM intends to deliver:
Advanced Easy Tier™ capabilities on selected IBM storage systems, including the IBM System Storage® DS8000®, designed to leverage direct-attached solid-state storage on selected AIX® and Linux™ servers. Easy Tier will manage the solid-state storage as a large and low latency cache for the "hottest" data, while preserving advanced disk system functions, such as RAID protection and remote mirroring.
An application-aware storage application programming interface (API) to help deploy storage more efficiently by enabling applications and middleware to direct more optimal placement of data by communicating important information about current workload activity and application performance requirements.
A new high-density flash storage module for selected IBM disk systems, including the IBM System Storage DS8000. The new module will accelerate performance to another level with cost-effective, high-density solid-state drives (SSDs).
IBM intends to extend IBM Active Cloud Engine™ capabilities to:
Allow files on selected NAS devices to be virtualized by SONAS and Storwize® V7000 Unified. Virtualization capabilities provide access across a unified global namespace, while facilitating transparent file migrations in parallel with normal operations. This capability will help provide customer investment protection as clients continue to leverage their existing NAS assets while exploiting the capabilities of IBM Active Cloud Engine .
Enable file collaboration globally via IBM Active Cloud Engine . This capability will help enhance productivity where users at geographically dispersed locations can both share and modify the same file.
IBM intends to deliver Cloud features to SONAS and Storwize V7000 Unified to support:
Web Storage Services, a standards-based object store and API that implements the Cloud Data Management Interface (CDMI) standard from Storage Networking Industry Association (SNIA) to support the implementation of storage cloud services.
Self-service portal designed to speed storage provisioning, monitoring, and reporting.
IBM intends to support an increased scalability of capacity, performance, and host bandwidth by clustering IBM XIV® Gen3 systems together and providing the capability to migrate volumes across the cluster without disrupting applications. Management of the cluster will remain simple with consolidated views and shared configurations across the systems. These capabilities are intended to help clients address the scalability and management requirements for effective cloud computing.
IBM intends to extend NAS data retention enhancements for IBM Storwize V7000 Unified and IBM SONAS to provide file "immutability" to help support file integrity from the time the file is designated as immutable through its lifecycle. Immutability is intended to secure files from inadvertent or malicious change or deletion.
IBM intends to enable Real-time Compression for block and file workloads on Storwize V7000 Unified systems. This enhancement is designed to help clients experience the same high-performance compression for active primary block and file workloads on Storwize V7000 Unified that is being announced for block workloads on Storwize V7000. IBM Storwize V7000 Real-time Compression is designed to deliver enhanced storage efficiency with potential benefits including lower storage acquisition cost (because of the ability to purchase less hardware), reduced storage growth, and lower rack space, power, and cooling requirements.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. The information in the above paragraphs is intended to outline our general product direction and should not be relied on in making a purchasing decision. The information is for informational purposes only and may not be incorporated into any contract. This information is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion.
One common question that I hear on a regular basis regards the availability of an SRA for VMware SRM 5.0 when using Storwize V7000 or IBM SVC running V6.3 firmware. This combination is currently unsupported as per the alert found here.
The good news is that there are now IBM SRAs available for clients running SRM in combination with V6.3 firmware. While this combination is still not listed on the VMware support matrix found here, you can download the SRAs direct from IBM if your need is urgent.
A few weeks ago I wrote a piece called How to spot an old IBMer. It was a sort of reminiscence about my early days with IBM, but it turned out to be one that really struck a chord with many Big Blue veterans. In fact the response was overwhelming; I have never received more hits or more comments for anything I have written. It was also pleasing that these responses were almost universally positive.
So it's ironic that today I am becoming an ex-IBMer.
Yes it's time for me to move on, so I have decided to try something new. I am joining a really exciting IT startup called Actifio.
So for all of you who have worked with me and helped me over the past 23 years: Thank you. It has been an honor to work at IBM and I wish Big Blue and all who continue to work there, nothing but success and happiness.
So you need to do some disk performance testing? Maybe some benchmarking? What tools are out there to help you out? Well I am glad you asked... here are some that I use on my daily travels:
IOmeter is an old classic, with emphasis on the word old. At time of writing, the most recent update was from 2006. However it remains very popular mainly because it is free and easy to use.
Some tips when using IOmeter:
On Windows, IOmeter needs to be run as an Administrator; not doing so is the most common mistake people make (without Administrator rights you don't see any drives). You can only run one instance of IOmeter in Windows, which means that if multiple users log on to the same server, only one of them can run IOmeter. You also really need to run IOmeter with a queue depth (or number of outstanding I/Os) greater than one, and with multiple workers; if you don't, you will not be able to drive the storage to saturation. For instance, here are some results running 75% read, 0% random, 4 KB block I/O on a Windows 2008 machine with 4 workers, in each case against the same 128 GB volume on a Storwize V7000 backed by 4 x 300 GB SSDs in a RAID 10 array. In each case I let the machine run for 10 minutes before taking the screen capture, to ensure the performance was at steady state and not peaking.
Firstly I used a queue depth of one. Aggregate performance was around 27000 IOPS.
Then I used a queue depth of 10. Aggregate performance was around 81000 IOPS.
I then used a queue depth of 20. Aggregate performance was around 113000 IOPS.
What I am trying to show is that taking the defaults (one worker with a queue depth of 1) will not drive the storage to a useful value for comparison... you need to do some tuning and some experimenting to get valid results. At some point increasing queue depths will not improve performance (it may actually decrease it).
There is an alternative to IOmeter called IOrate (created by an EMC employee). It is also very popular and appears to still be in active development. It is not unusual to see IBM performance whitepapers that used IOrate to generate the workload.
This is a fairly recent tool that I have not had a chance to try out (due to time pressures). The tool uses virtual machines under VMware to generate the I/O and includes some very nice workload capture and playback tools as well as reporting tools.
Jetstress is a benchmarking tool created by Microsoft to simulate Microsoft Exchange workloads. I like the fact that you can configure it to run for very long periods, and it has a more real-world feel about it than just running empty I/Os. You can get the base software here, but you will also need some files from a Microsoft Exchange install DVD (or from an installed instance of Microsoft Exchange). If you cannot get to those files, you cannot complete the startup process inside Jetstress.
Oracle offer a tool on their website called Orion, which will simulate the workload of an Oracle database. You can get the tool from here (although you will need to create a free Oracle user account before you can download it).
SDelete is not a benchmarking tool or a performance modelling tool. But it is a great way to generate I/O with very little effort. Just create a new drive in Windows and then run SDelete against it with the -c parameter. This parameter is used for secure deletion, so generates random patterns (which is real traffic - albeit 100% sequential writes). The syntax is like:
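For example, to fill the free space on a drive with random patterns (the drive letter here is just an illustration, and as noted in the update below, check which of -c and -z means random patterns in your SDelete version):

```shell
REM Example only: overwrite the free space on drive D: to generate I/O.
REM In SDelete 1.6 the random-pattern option is -c (in earlier versions it is -z).
sdelete -c d:
```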
(updated April 20, 2012 - I found in version 1.6 of SDelete the meaning of the -z and -c parameters got swapped. In version 1.6 if you want random patterns use -c, if you want zeros use -z. In previous versions it is the other way around!).
Just doing file copies is probably the worst way to generate benchmarks, especially as a single copy is usually a single threaded operation.
I am sure there are plenty of other tools out there to generate benchmarks and simulate workload. My main concern with many of them is that synthetic (artificial) workloads do not reflect real world workloads.
Right now I am working on giving a client a recommended version of firmware for their Cisco MDS Fibre Channel switches. For FICON, the recommendations are easy, but for Open Systems there are so many choices. So what am I going to recommend?
FICON Switches and Directors
For FICON switches, sticking to the FICON (IBM Mainframe Fibre Connection) recommended versions (which are determined by the IBM System z Mainframe team), is a very good strategy. The best place to get these is here (standard IBM logon is required). Just look along the right hand column for the release letters.
The SAN-OS and NX-OS release notes found on the Cisco website also show recommended versions for FICON. For instance, have a look at the FICON recommendations table in the release notes for version 5.2.2a, which you can find here. The upgrade path is just below the table I have linked to. This link will get outdated over time (as newer versions come out), but you can list all the release notes here.
If you are using a IBM TS7700 you should also be aware of this page on the IBM Techdocs site.
So based on current versions, if you are running SAN-OS 3.3.1c or below you need to move to 4.2.7b (as per the non-disruptive upgrade path). I strongly recommend you get to at least version 4.2.7b and start planning to move to release 5.2.2 (provided your hardware supports it).
For open systems attached Fibre Channel switches there are a number of versions to choose from. Here are the main things to consider:
Being on the very latest version carries a small potential risk (of undiscovered bugs). However, being on very old versions carries a greater implicit risk (of being exposed to KNOWN bugs). Just because you have not hit a bug yet does not insulate you from potential issues, especially if your SAN is growing.
Your hardware. Some older Generation hardware is not supported at higher levels (for example Supervisor-1 cards cannot go past SAN-OS 3.3.5b) but later generation hardware is not supported at lower levels (for example Fabric 3 modules need NX-OS 5.2.2). The Cisco recommended versions page is the best place to confirm this.
End of life. As SAN-OS reached end of development in 2011, 3.3.5b is the best choice for all hardware that cannot upgrade to NX-OS. However be aware that some Cisco Generation 1 hardware (such as 2 Gbps capable hardware) will go end of service in September 2012 (for example Supervisor-1 cards and MDS 9120 switches). Links for this are below. Of course your service provider may choose to offer support beyond the Cisco end of life date, but instead of updating code, maybe you should be updating hardware.
You need to also upgrade your Fabric Manager to at least the same or a higher version than your switches are running. One important thing to be aware of is that from version 5.2, Cisco Fabric Manager has been merged into a new product called Cisco Data Center Network Manager (DCNM).
If you work (or have worked) for IBM then you have probably met many old timers. IBMers who have been with the company for 25 years or more (or even 50!).
But how do you spot an old IBMer?
Is it by the cut of their suit? Not sure about that anymore.
An IBM General Systems Division marketing rep in New Jersey in 1978.
It's certainly not by their extensive beards.
Development of the 3800 printer, taken in the early 1970s by Ray Froess (http://www.froess.com/IBM/3800printer.htm)
Is it by the size of their laptop? I hope not!
IBM 5100 Portable Computer (1975)
No... you can spot them by their use of certain words and phrases.
Here are a few I can think of... you may know more. Try this out as a test on someone who you think is an old IBMer and see how they go:
1) While showing a PowerPoint presentation they keep saying they are showing foils (despite not having seen an overhead projector in over 10 years).
2) They refer to disk storage as DASD (pronounced Dazz-Dee).
3) They still call a Sales Rep a Marketing Rep (check out Buck Roger's book The IBM Way to see why).
4) They refer to their inbox as their reader (see #6 below).
5) They refer to the IBM corporate personnel database as callup (it has been a Web based application called BluePages for around 15 years).
6) If you say I will PROFS you (or I will send you a PROFS mail), they don't blink an eye-lid (PROFs was IBM's Mainframe based mail system, replaced by OfficeVision which was replaced by Lotus Notes in the 1990s).
7) If you say you F4ed or PF4ed an email... they know what you mean (it meant that you deleted it in PROFS/OfficeVision).
8) They reveal they are a veteran of IBM Typewriters by regaling you with their knowledge of Selectric Rotate Tapes.
It is ironic that only days after I wrote that 497 is the IT number of the beast, I learn that Linux has another unfortunate number: 208.
The reason for this is a defect in the internal Linux kernel used in recent firmware levels of SVC, Storwize V7000 and Storwize V7000 Unified nodes. This defect will cause each node to reboot after 208 days of uptime. This issue exists in unfixed versions of the 6.2 and 6.3 level of firmware, so a large number of users are going to need to take some action on this (except those who are still on a 4.x, 5.x, 6.0 or 6.1 release). If you have done a code update after June 2011, then you are probably affected. This means that if you are an IBM client you need to read this alert now and determine how far you are into that 208 day period. If you are an IBMer or an IBM Business Partner, you need to make sure your clients are aware of this issue, though hopefully they have signed up for IBM My Notifications and have already been notified by e-mail.
In short what needs to happen is that you must:
Determine your current firmware level.
Check the table in the alert to determine if you are affected at all, and if so, how far you are potentially into the 208 day period.
Prior to the 208 day period finishing, either reboot your nodes (one at a time, with a decent interval between them) or install a fixed level of software (as detailed in the alert).
To give you an example of the process, my lab machine is on software version 18.104.22.168 which you can see in the screen capture below. So when I check the table in the alert, I see that version 22.214.171.124 was made available on January 24, 2012, which means the 208 day period cannot possibly end before August 19, 2012.
The table pairs each release availability date with the earliest possible date that a system running that release could hit the 208 day reboot:

SAN Volume Controller and Storwize V7000 Version 6.3
  Available 30 November 2011: earliest possible reboot 25 June 2012
  Available 24 January 2012: earliest possible reboot 19 August 2012
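The reboot dates above are simply the availability date plus 208 days of continuous uptime, which is easy to sanity-check yourself (a quick sketch, nothing product-specific):

```python
from datetime import date, timedelta

# A node reboots after 208 days of uptime, so the earliest possible
# reboot date is the release availability date plus 208 days.
MAX_UPTIME = timedelta(days=208)

print(date(2011, 11, 30) + MAX_UPTIME)  # 2012-06-25
print(date(2012, 1, 24) + MAX_UPTIME)   # 2012-08-19
```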
Regardless, I need to know the uptime of my nodes, so I download the Software Upgrade Test Utility (in case you have an older copy, we need at least version 7.9) and run it using the Upgrade Wizard (NOTE! We are NOT updating anything here, just checking):
I launch the Upgrade Wizard, use it to upload the tool and follow the prompts to run it, so that I get to see the output of that tool. The output in this example shows the uptime of each node is 56 days, so I have a maximum of 152 days remaining before I have to take any action. At this point I select Cancel. You can run this tool as often as you like to keep checking uptime.
Note: if you are on 6.1 or 6.2 code you may see a timeout error when running the tool, especially the first time. If you do see an error, please follow the instructions in the section titled "When running the upgrade test utility v7.5 or later on Storwize V7000 v6.1 or v6.2" at the Test Utility download site.
As per the Alert:
If you are running a 6.0 or 6.1 level of firmware, you are not affected.
If you are running a 6.2 level of firmware, the fix level is v126.96.36.199 which is available here for Storwize V7000 and here for SVC.
If you are running a 6.3 level of firmware, the fix level is v188.8.131.52 which is available here for Storwize V7000 and here for SVC.
If you are using a Storwize V7000 Unified, the fix level is v184.108.40.206 which is available here.
You should keep checking the alert to find out any new details as they come to hand. If you are curious about Linux and 208 day bugs, try this Google search.
*** Updated April 4, 2012 with links to fix levels ***
If you have any questions or need help, please reach out to your IBM support team or leave me a comment or a tweet.
*** April 10: The IBM Web Alert has been updated with new information on what to do if your uptime has actually gone past 208 days without a reboot. In short you still need to take action. Please read the updated alert and follow the instructions given there. ***
We just updated our Cisco MDS9509s to NX-OS 4.2.7b (from Cisco SAN-OS 3.3.1c) and now we are getting emails from this source: GOLD-major.
The actual message looks like this:
Time of Event: 2012-03-05 15:07:21 GMT+00:00
Message Name: GOLD-major
Message Type: diagnostic
System Name: xxxx
Contact Name: xxx@xxx.com
Contact Email: xx@xxx.com
Contact Phone: +61-3-xxxx-xxxx
Street Address: x Road, xxxx, VIC, Australia
Event Description: RMON_ALERT
WARNING(4) Falling:iso.220.127.116.11.18.104.22.168.1.10.18366464=2401032512 <= 4680000000:135, 4
Event Owner: ifHCOutOctets.fc4/5@w5c260a03c162
So who is GOLD-major?
GOLD actually stands for Generic OnLine Diagnostics. From Cisco's website: GOLD verifies that hardware and internal data paths are operating as designed. Boot-time diagnostics, continuous monitoring, and on-demand and scheduled tests are part of the Cisco GOLD feature set. GOLD allows rapid fault isolation and continuous system monitoring. GOLD was introduced in Cisco NX-OS Release 4.0(1). GOLD is enabled by default and Cisco do not recommend disabling it.
So in our example GOLD is actually reporting a major event (to do with exceeded thresholds, in this example utilisation on interface fc4/5).
Most clients using Cisco MDS switches are now moving to NX-OS (over SAN-OS, the name Cisco used for MDS firmware between version 1 and version 3) so this question will become more common. I am working on a post that discusses recommended versions (and the sunsetting of SAN-OS), so expect something soon. If on the other hand you are thinking.... how do I setup call home on a Cisco MDS switch? The information for NX-OS is here.
Curiously my brain cannot help itself, when I hear Gold Major I think it means Gold Leader which leads me to Red Leader which leads me to Red October. Maybe it's just me? Enjoy:
Because if a product uses a 32 bit counter to record uptime, and that counter records a tick every 10 msec, then that 32-bit counter will overflow after approximately 497.1 days. This is because a 32 bit counter equates to 2^32, which equals 4,294,967,296 ticks. If a tick is counted every 10 msec, we create 8,640,000 ticks per day (100*60*60*24). So after 497.102696 days, the counter will overflow. What happens next depends on good programming: normally the counter just starts again, but worst case a function might stop working or the product might even reboot.
Fortunately we are seeing fewer and fewer of these issues, but occasionally one still slips out. Recently IBM released details of a 994 day reboot bug in the ESM code of some of their older disk enclosures (EXP100, EXP700 and EXP710). Details about this bug can be found here. What I find interesting is the number of days it takes to occur, since 994 is exactly 497 times two. This suggests that this product records a tick every 20 msec, which meant it got past 497 days without an issue but hit a problem after exactly double that number. So if you still have these older storage enclosures, you will need to reboot the ESMs (after checking the alert).
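The arithmetic behind both the 497 day and 994 day figures is easy to verify; here is a quick sketch in plain Python:

```python
# A 32-bit counter overflows after 2**32 ticks.
TICKS_32BIT = 2**32

def days_to_overflow(tick_ms):
    """Days until a 32-bit tick counter overflows, given one tick every tick_ms milliseconds."""
    ticks_per_day = (1000 // tick_ms) * 60 * 60 * 24  # e.g. 8,640,000/day for a 10 ms tick
    return TICKS_32BIT / ticks_per_day

print(round(days_to_overflow(10), 1))  # 497.1 (10 ms tick)
print(round(days_to_overflow(20), 1))  # 994.2 (20 ms tick: exactly twice as long)
```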
I googled 497 to see what images that number brings up and was amazed to find the M-497 jet powered train. More details on this rather interesting attempt at speeding up the commute home can be found here and here. It adds a whole new meaning to keeping behind the yellow line.
If you have combined vSphere 5.0 with XIV, then you may want to try out the new IBM Storage Provider for VMware VASA (vSphere Storage APIs for Storage Awareness). You can download the installation instructions, the release notes and the current version of the IBM VASA provider from here. Clearly because VASA is introduced in vSphere 5.0 your VMware vCenter also needs to be on version 5.0.
Now IBM have had a vCenter plugin for a very long time (which I have written about here, here and here) and while you still need that plugin if you want to do storage volume creation and mapping from within vCenter (as opposed to using the XIV GUI), the VASA provider makes storage awareness more native to vCenter. This is a very important step. It means instead of using vendor added icons and tabs (like the IBM Storage icon and the IBM Storage tab that are added by the IBM Storage Management Console for vCenter), you just use the default vCenter tabs.
Right now version 1.1.1 of the IBM VASA provider delivers information about storage topology, capabilities, and state, as well as events and alerts to VMware. This means you will see new additional information in three tabs: Storage Views, Alarms and Events.
After installing and setting up the VASA provider, select your VMware cluster in vCenter, go to the Storage Views tab and select the view Show all SCSI Volumes (LUNs); there are four columns with extra information. The Committed, Thin Provisioned, Storage Array and Identifier on Array columns (indicated with red arrows) come straight from the XIV (hit the Update button at upper right if you are not seeing anything yet). This is really useful information, as it lets you correlate the SCSI ID of a LUN to an actual volume on a source array. Here is a cut-down view of that extra information:
If you want a larger screen capture you can find one here.
The Tasks & Events and Alarms tabs will also now contain events reported by the VASA provider, such as thin provisioning threshold alerts (although if you have just installed the provider you may see nothing new, as nothing has occurred yet to provoke an alert or event).
As usual I have some handy tips on the steps you will need to take to get VASA going:
First up you will need to identify a virtual machine to run the provider on (or just create a new one). I chose to deploy a new instance of Windows 2008 from a template. Because the VASA provider communicates to vCenter via an Apache Tomcat server listening on port 8443, that port needs to be free and unblocked. This also means you should not run the VASA provider in the same instance of Windows as the vCenter server (see below for more information as to why).
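Before installing, it is worth confirming that nothing is already listening on 8443 in your chosen VM. Here is a minimal sketch of one way to check (the host and port are just the defaults discussed above):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on the given TCP port."""
    # connect_ex returns 0 only if something accepted the connection
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        return s.connect_ex((host, port)) == 0

# On the VM that will host the VASA provider, this should print False:
print("8443 already taken:", port_in_use(8443))
```

If it prints True on the vCenter server itself, that is exactly the port clash I talk about further down, which is why the provider gets its own VM.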
Download the IBM Storage Provider for VMware VASA as per the link above (use version 1.1.1, see the user comments in this post for details about a bug in version 1.1.0).
Install the provider in the Windows VM you created in step 1. The tasks are detailed in the Installation Instructions, but it is a simple follow-your-nose application installation. As with most XIV software packages, it will install a runtime environment (xPYV, which is Python) as part of the install.
Now we need to define the credentials that VMware vCenter will use to authenticate to the IBM VASA Storage Provider. These should be unique (and are not an XIV userid and password - this is only between vCenter and the provider software). In my example I use xivvasa and pa55w0rd. The truststore password is used to encrypt the username and password details (so that they are not stored in plain text). Open a Windows command prompt (make sure to right-click and open it as an Administrator) and enter the following commands:
cd "C:\Program Files (x86)\IBM\IBM Storage Provider for VMware VASA\bin"
vasa_util register -u xivvasa -p pa55w0rd -t changeit
Don't close the command prompt, because we now need to define the XIV to the IBM VASA provider.
You need the IP address of your XIV and a valid user and password on the XIV that can be used to logon to the XIV. So in this example my XIV is using 10.1.60.100 and I am using the default admin username and password (which I know does not set a good example). This is the command you need to run:
If this command fails, reporting your firmware is invalid, you are probably using the original 1.1.0 version of the VASA provider; go back to the IBM Fix Central website and make sure you have the latest version (at least version 1.1.1). If it reports the firmware cannot be read, make sure you are running the Command Prompt as an Administrator.
Once you have successfully added the XIV to the provider, you need to restart the Apache web server. Do this by starting the services.msc panel and looking for the Apache Tomcat IBMVASA service as pictured below. Stop it and then start it. Once you have done that you can log off from the VASA VM.
Now connect to your vSphere Client (which needs to be on at least version 5.0.0) and from the Home panel, open the Storage Providers panel. Then select the option to Add a new provider. The URL needs to include the correct port number (by default 8443), so it will look something like this (where the provider is running on 10.1.60.193). Note also that the VASA provider version number is in the URL, so if you upgrade the provider you will need to change the URL (currently v1.1.1):
The Login and password should match the user id and password you defined in step 4 (remember it is not logging into the XIV, it is logging into the VASA provider).
If you get a message saying your user id and password are wrong, you probably forgot to stop and start Apache in step 6 above. If you succeed you should see a new provider listed. Highlight the provider and select sync to update the last sync time.
Your setup tasks are now all completed. Now go and explore the panels I detailed above to see what new information you have available to your vCenter server.
Why a separate server for the VASA provider?
The IBM VASA provider uses Apache Tomcat, which by default listens on port 8443. However since vCenter already has a service listening on port 8443, we have a clash. I googled and found that the Dell and NetApp VASA providers also listen on port 8443, and they also recommend separate servers. I noted Fujitsu's provider uses a different port but still requires a separate server. So it seems that if you have multiple vendors you will either have to spin up a separate server for each vendor's provider, or start playing with changing the port numbers. The installation instructions for the IBM VASA Provider explain how to change the default port number if you are truly keen.
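For reference, a Tomcat HTTPS listener is defined by a Connector element in Tomcat's conf/server.xml, so changing the port generally looks like the fragment below. The attribute values shown are illustrative only; for the provider's bundled Tomcat, follow the IBM installation instructions rather than treating this as gospel:

```text
<!-- conf/server.xml: change the port attribute from 8443 to a free port -->
<Connector port="9443" protocol="HTTP/1.1" SSLEnabled="true"
           scheme="https" secure="true" />
```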
I always laugh when people say to me: I wouldn't know what to blog about!
When you work in pre-sales support, you constantly get asked questions and each one of them could be the subject of a new blog post. Right now the most common question I am getting is:
I am implementing VMware Site Recovery Manager (SRM). One of the components I need are vendor specific Site Recovery Agents (SRA). I have searched IBM's website but cannot find them. Where are they?
So the short answer is: you get them from the VMware SRM download site. However before downloading, there is a key task that absolutely needs to be performed:
Visit the VMware vCenter Site Recovery Manager Storage Partner Compatibility Matrix. This site will confirm what products are supported by each version of SRM. You can find it here, but clearly you need to check back regularly to ensure you have the latest information.
Now find your storage device in the matrix and confirm what firmware levels are supported. This is really important. For example, the Feb 27, 2012 edition of the matrix tells me that the Storwize V7000 is supported for SRM version 5.0, but only when running Storwize V7000 firmware version 6.1 or 6.2. This is significant because if you upgrade to version 6.3 you are not supported. In fact that combination doesn't actually work yet, as detailed here. Clearly something you need to be aware of when planning firmware updates.
So where are the SRAs? On each of the pages below use the Show Details button to see what version SRAs are being shipped with that SRM (although sometimes the pages take a few days between an SRA being added and the page being updated):
There are a few more questions I routinely get asked:
Does IBM actually have an SRA download site?
The answer is yes, but it is an FTP site only for SRAs written by IBM. It is principally a repository for older SRAs and beta SRAs but you can also find the current SRAs on it. You can find the site here. Note however that it is NOT the official source. For that you need to use the VMware site.
What about the SRA for LSI/Engenio based products like the DS4800?
These used to also be found on the LSI site, but since LSI sold Engenio to NetApp, it is no longer available from the LSI or NetApp websites. You need to download the current version from the VMware sites listed above. There is a version for SRM 5 on the VMware download site.
What about nSeries SRAs?
If you need an nSeries SRA, again you should go to the VMware download pages. There are separate SRAs listed and available for IBM nSeries (as opposed to an SRA for NetApp branded filers).
What about an SRA for XIV with SRM version 5?
The answer: The SRA for XIV with SRM 5 (and 5.0.1) is now available from VMware. If you have access to download SRM, you will be able to download SRA version 2.1.0. It is the same SRA for both XIV Generation 2 and Gen3.
What about an SRA for Storwize V7000 and SVC version 6.3 code?
The answer: It is coming. We are working to make it available as soon as possible. I will update this post as soon as I have a date for you (we are talking weeks, not months).
*** Update March 23, 2012 - Added details on SRM 5.0.1 ***
Many years ago I picked up a book that blew my mind: The Cuckoo's Egg by Clifford Stoll. It's a genuine classic, a true tale of hackers and how one was tracked down in the very early days of the internet.
Now the story is about events in 1986, so it captures the state of technology at the time (which rather dates the book), but wow, what a great story.
So why mention the book? Well apart from the fact that it is well worth a read, the key issue that Clifford saw again and again was default passwords. The hacker would identify a target and then try to logon using default IDs and default passwords, usually with great success.
Now I have blogged in the past about the determined (but often ignored) way that Brocade switches berate you into changing default passwords. But pretty well all products need to do this, as they all have the same issue (and a truly problematic counter-point). You absolutely need to do two things with every product in your data center:
Change the default passwords on every device you deploy.
Record what those passwords got set to (preferably using a logical or physical password safe).
Now don't laugh, but forgotten/lost passwords on data center kit (like switches) is a VERY common problem. When I worked in the IBM Storage Support team I took calls EVERY WEEK from clients who had devices they could not logon to, for all manner of reasons. For some, supplying them with the default passwords saved them (and condemned their employer?), but for others they needed much more detailed assistance.
My preferred solution to this challenge is to use external authentication (like LDAP) but being able to reset passwords with an external tool is also a nice option to have available.
The reason I started thinking about this is a nice tool IBM offer for the Storwize V7000 called the Initialization Tool that you can download from here. Using this tool you can reset the password of the Superuser ID on a Storwize V7000 back to the default (passw0rd). The tool runs on a USB key. After requesting the tool to help you to reset the superuser password, you insert the USB key into the Storwize V7000, wait for the orange indicator light on the relevant node canister to stop blinking and the task is complete. Then put the USB key back into your laptop and run the init tool again to get a completion report that should look like this:
This is great to rescue customers who have lost their passwords, but the question then gets raised: Can I block this?
My first response is: if you are concerned about unauthorized people with malicious intent placing USB keys into your Storwize V7000, then don't let them into your computer room (presuming you can spot them by the colour of the hat they are wearing). If that is not an option, lock the rack that the Storwize V7000 resides in (change control does have its benefits). If that is not an option, there is one more alternative, but it is a tad extreme.
What we can do is prevent password reset via USB key (or in the case of the SVC, via the front panel). We do this by issuing the following CLI command: setpwdreset -disable
In the following example, I confirm that password reset is possible (value 1), I then disable it and confirm that password reset is no longer possible (value 0). If curious I could then get some help on that command:
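A session along these lines confirms the setting before and after (the cluster name in the prompt is made up, and the exact output format varies by product and code level, so treat this as a sketch):

```text
IBM_2076:mycluster:admin> setpwdreset -show
Password status: [1]
IBM_2076:mycluster:admin> setpwdreset -disable
IBM_2076:mycluster:admin> setpwdreset -show
Password status: [0]
IBM_2076:mycluster:admin> help setpwdreset
```

A value of 1 means the USB key (or SVC front panel) can still reset the password; 0 means it cannot.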
So should you issue this command? Only if your paranoia is matched by your attention to detail.
My reason to hesitate recommending it is simple: If you prevent password reset and then forget your password (and have no other local Security Administrator accounts), you have locked the door and thrown away the key. Far better to physically lock the rack.
In the end though, your company needs to set a policy that is actively enforced (with no exceptions). So get to it.
The updated XIV GUI that supports version 11.1 of the XIV software (which adds support for SSD Read Cache) is now available for download. This brings the XIV GUI to version 3.1 and you can download it for Windows, Mac, Linux, Solaris, AIX and HP-UX from here.
So what benefits will you get?
The new GUI will display information about the SSD read cache. For instance the statistics panel will now also report on SSD cache hits (as opposed to memory cache hits). The GUI will also display the presence and health of the SSD in each module (presuming they have been ordered for that machine). You can clearly see that it is located at the rear of the module!
It supports the IPv6 protocol. So if your XIV system has code level 11.1.0 or above, you can manage that XIV over an IPv6 connection (after using the updated GUI to define the new addresses).
The GUI can now manage up to 81 systems from a single console. Yes you read that right: 81 systems. So let's think about that: IBM would only take the GUI to that number if there were clients who were approaching that number. Outstanding!
Enhanced search and filtering. This allows you to search across all managed devices and also filter what gets displayed. The search function is really nice. You get to it from the View menu as shown: In this example I search for the term test and get a considerable number of hits. Notice that the first column uses some very nice icons to indicate the resource type (such as a volume, pool or host cluster):
The GUI now displays un-mapped LUNs as a separate category. This is also a very nice enhancement.
One other change is that if you start the XIV GUI in demo mode it now also displays an XIV Gen3 (so you can see the Gen3 patch panel).
If you are running Generation 2 XIVs (on 10.x.x code) you will still benefit from those last three improvements, so there is something for everyone.
When IBM brought out the SAN Volume Controller (SVC) in 2003, the goal was clear: support as many storage vendors and products as possible. Since then IBM has put a huge ongoing effort into interoperation testing, which has allowed them to continue expanding the SVC support matrix, making it one of the most comprehensive in the industry. When the Storwize V7000 was released in 2010 it was able to leverage that testing heritage, allowing it to have an amazingly deep interoperation matrix on launch date. It almost felt like cheating.
However I recently got challenged on this with a simple question: Where is the VNX? If you check out the Supported Hardware list for SVC V6.3 or Storwize V7000 V6.3 you can find the Clariion up to a CX4-960, but no VNX.
The short answer is that while the VNX is not listed there yet, IBM are actively supporting customers using VNX virtualized behind SVC and Storwize V7000. If you have a VNX 5100, 5300, 5500, 5700 or 7500 then ask your IBM pre-sales Technical Support to open an Interoperation Support Request. The majority are being approved very quickly. The official support sites that I referenced above will be updated soon (but don't wait, if you need support, ask for it). IBM is working methodically with EMC to be certain that when a general publication of support is released for VNX (soon!), both companies will agree with the published details.
And for the wags who think that this is a ringing endorsement to buy VNX, you would be missing the point. You cannot be a serious Storage Virtualization vendor if you are not willing to support your clients' purchasing decisions, regardless of which vendor they buy their storage from. IBM have been staying that course and demonstrating that willingness since 2003. It's a pretty good track record and one that they are determined to maintain.
I have updated my IBM Storage WWPN Determination Guide to version 6.5. You can find the updated guide on IBM Techdocs here.
The main change is that new DS8800s are now presenting slightly different WWPNs, so I added three new pages to describe the changes.
If this guide is new to you, its purpose is to let you take a WWPN and decode it, so you can work out not only which type of storage that WWPN came from, but also the actual port on that storage. People doing implementation services, problem determination, storage zoning and day-to-day configuration maintenance will get a lot of use out of this document. If you think there is an area that could be improved or products you would like added, please let me know.
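As a taste of what the guide covers, the first fields of an NAA format 5 WWPN can be pulled apart mechanically; the vendor-specific tail is where the guide's per-product tables come in. A minimal sketch (the sample WWPN below is made up for illustration, though 005076 is a genuine IBM-registered OUI):

```python
def decode_wwpn(wwpn):
    """Split an NAA type-5 WWPN into its NAA nibble, 24-bit OUI and vendor field."""
    w = wwpn.replace(":", "").lower()
    return {"naa": w[0], "oui": w[1:7], "vendor_specific": w[7:]}

# The 'oui' field (005076) identifies IBM; the guide maps the rest to a port.
print(decode_wwpn("50:05:07:68:01:40:aa:bb"))
```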
It is also important to point out that IBM Storage uses persistent WWPN, which means if a host adapter in an IBM Storage device has to be replaced, it will always present the same WWPNs as the old adapter. This means no changes to zoning are needed after a hardware failure.
I also host the book on slideshare, so you can also view and download it from there:
It's been a long time coming, but I finally joined the cult of Mac in the form of a new MacBook Pro. Having not used an Apple Mac for over 15 years, I must say I am truly loving what they have done with the operating system and the hardware (my last Mac was a Mac SE bought in 1990).
Now this post is not a rant from a new convert to everything Apple. In fact my main gripe is that what you rapidly discover when you move to Mac OS is that not every piece of software is going to work in your new world. While Lotus Notes and Sametime have very nice Mac versions, my day to day work involves IBM Storage and there are several tools that I need that are Windows only. These include Capacity and Disk Magic (used to size solutions) and eConfig (used to order IBM products). This means for certain applications I need to use a Hypervisor (such as VMware Fusion or Parallels).
But what about managing IBM Storage? Well I have some good news on that front:
SAN Volume Controller and Storwize V7000: Because these products are managed from a web page, they are operating system agnostic. To be clear, officially only Firefox 3.5 (and above) and IE 7.0 and 8.0 are supported (support details are right at the bottom of this page while setup details are here). Since IE is no longer available for Mac, you should install Firefox (or try out Safari or Chrome, I have tried all three without issue).
XIV: The XIV GUI is available in a native Mac OS version from here. The release notes state that the XIV GUI works on Mac OS X 10.6 but I am happily using it on Mac OS X 10.7 (Lion). The Mac OS X installation process is simply beautiful (just drag and drop, one of the truly nice features of Mac OS X) and of course it works just as nicely on Mac as it does on the other supported operating systems.
Drag and drop done right.
Attaching OS X to IBM Storage
Of course maybe you want to attach your Mac OS X box to IBM Storage. If you visit the SSIC you will find IBM supports OS X on pretty well its entire range including SVC, Storwize V7000, XIV, DS3500 and DCS3700. Mainly these use the ATTO HBA and multipath device driver. If your particular setup is not there, get your IBM pre-sales support to open a support request; depending on your request, approvals are normally very fast.
Of course I have to mention the iPhone and iPad. IBM have the XIV Mobile Dashboard for both devices, which I previously blogged about here (iPad) and here (iPhone). These are really elegant apps that even have a cool YouTube video.
Of course now I want all the goodies promised in Mountain Lion. With the convergence of OS X and iOS, I would love to see even more converged tools. A man can dream....
There is a demo mode, but right now there is no tick box to activate it. Simply use the word demo in all three fields at login. In other words:
IP address: demo
User ID: demo
Password: demo
2) Retina display requirement
The Mobile Dashboard was written for the Retina display (that comes with an iPhone 4 or iPhone 4S). This sadly means that the iPhone 3GS and earlier will not be able to use the new Mobile Dashboard. This wasn't done as part of some devious plan on IBM's part to force you to buy a new iPhone, the developers simply needed the better resolution to draw those graphs and provide the richest and clearest display of information on a single page (you will believe it when you see it, the detail is quite stunning).
The Apple Store clearly states the hardware and iOS requirements on the download page, however you can still try to install it on an iPhone 3GS. Curiously what you get is this rather bizarre message:
The reason you get this message is simple: There is no way to specify when uploading a new app to Apple that you need the Retina display. So instead the developer needs to specify a feature that is not found on earlier iPhones, such as a camera flash. So it is not that the XIV Mobile Dashboard needs a flash in your camera, it is simply a quirk of the Apple store.
And for those of you who are using Android devices, your calls are being heard. Watch this space for developments in that direction.
Here are two common statements I often hear from clients:
I don't just want SAS drives, I also want SATA drives. SATA drives are cheaper than SAS drives.
Nearline SAS drives are just SATA drives with some sort of converter on them.
So is this right? Is this the actual situation?
First up, if your storage uses a SAS-based controller with a SAS backplane, then normally you can plug either SAS drives or SATA drives into that enclosure. This is great because when you plug SATA drives into a SAS backplane, you can actually send SCSI commands to the drive plus you can send native SATA commands too (which is handy when you are writing software for RAID array drivers).
But (and this is a big but) what we do know is that equivalent (size and RPM) SAS drives perform better than SATA drives. This is because:
SAS is full-duplex; SATA is half-duplex.
SAS uses the native SCSI command set which has more functionality (which leads to the next point).
A SAS drive uses SCSI error checking and reporting which is much more robust than the SATA error reporting. This allows your storage system to collect richer information from the drive if errors are occurring (such as a failing or marginal disk).
SAS drives are dual ported which is vital in dual controller enclosures.
So given a choice (and a very small price differential), why choose SATA over SAS? SAS is the clear winner. What we should instead differentiate on is speed (7.2K RPM vs 10K RPM vs 15K RPM vs SSD) and size (2.5" vs 3.5" form factor).
Which leads us to Nearline SAS
It is a common belief that if you buy a Nearline SAS (or NL-SAS) drive, it is really a SATA drive with a SAS connector (interposer) stuck on it. But this is confusion from the past.
What led to the confusion?
Most midrange and enterprise storage controllers and enclosures, up until recent years, used disks that had Fibre Channel interfaces on them. We plugged those disks into Fibre Channel enclosures. Examples include the DS4700 and the DS8100. And yet these devices also offered SATA drives. How did they do this?
They took a SATA drive and added a SATA-to-Fibre-Channel converter card to the disk. We call this extra piece of hardware an interposer or bridge card. So people started assuming that this is common practice in every product. In fact we are now seeing SAS drives being put into Fibre Channel disk enclosures by using a SAS-to-Fibre-Channel interposer.
There are indeed older products that did take a SATA drive and add a SATA-to-SAS interposer to achieve a similar thing. But that really is not necessary any more. The reason? The same hard drive can now be ordered from the factory as either a SAS drive or a SATA drive.
Seagate have a nice selector tool to let you see all their possible combinations. For instance you can order a 2 TB drive with a 6 Gbps SAS interface, which is a model ST32000444SS:
Or you can order a 2 TB drive with a 6 Gbps SATA interface, which is a model ST2000NM0011:
So what you get is very similar drive hardware (same spindles, heads and motors) but with different interface hardware, built with the desired interface at manufacture time. This means that if we install the SAS model into a SAS enclosure, there is no need to add an interposer or bridge card to the drive after you buy it.
This leads to the next question:
OK. So this is good, so Nearline SAS drives are MADE as SAS drives. Does that mean a drive manufactured with a SAS adapter is a SAS drive or a Nearline SAS drive?
Now we are mixing up two different things. SAS as a standard is a combination of a connection technology (the Serial Attached part) and a command set (the SCSI part). Actually SCSI as a standard also defines both connection methods and command sets. So SAS is really talking about how we connect to the disk and what command set we use to control the disk.
Nearline on the other hand is a statement about the disk's rotational speed and its mean time between failures (MTBF). A Nearline-SAS drive is Nearline because:
It rotates slower (7,200 RPM) than the higher-specified Enterprise drives (which spin at 10K or 15K RPM). Because they are slower they can also hold way more data.
It has a lower MTBF (1.2 million hours) than the higher-specified Enterprise drives (which are normally specified at 1.6 million hours).
So we have now gone full circle. A Nearline-SAS drive can use the same physical disk hardware as a SATA drive, but with a superior adapter that uses a superior command set, built onto the drive at manufacture time.
Still confused or want to read some more? Check out these links:
I got a question about Veritas DMP and XIV, so I thought I would write a quick post with some details on the subject.
A fundamental requirement for a host attached to a Fibre Channel SAN is the use of multi-pathing software. One alternative to achieve this (that IBM support for most operating systems attaching to XIV) is Symantec Dynamic Multi Pathing (DMP). A nice way to find out whether this is the case for your particular operating system is to head to the SSIC, choose Enterprise Disk → XIV Storage System → your product version and then Export the Selected Product Version to get a spreadsheet of every supported environment. Under the multi-path heading of each page you will see what choices are supported.
So why choose DMP? It works with heterogeneous storage and server platforms (so you could have EMC and IBM storage attached to the same server at the same time).
You can centrally manage all storage paths from one central management GUI.
Then the question becomes, if I choose to go down the DMP route, do we still need the XIV Host Attachment Kit (HAK)?
The answer is a definite yes!
Veritas DMP and Solaris
If you're using DMP with Solaris, when you run the XIV HAK wizard, it will scan for existing dynamic multi-pathing solutions. Valid solutions for the Solaris operating system are Solaris Multiplexed I/O (MPxIO) or Veritas Dynamic Multi-Pathing (VxDMP). If VxDMP is already installed and configured on the host, it is preferred over MPxIO.
Veritas DMP and Windows
For a Windows host the important point is that Veritas Storage Foundation Dynamic Multipathing (DMP) does not rely on the native multipath I/O (MPIO) capabilities of the Windows Server operating system. Instead, it provides its own custom multipath I/O solution. Because these two solutions cannot coexist on the same host, perform the following procedure if you intend to use the Veritas solution:
Install the Veritas Storage Foundation package (if it is not already installed).
Restart the host.
Install the IBM XIV Host Attachment Kit (or run the portable version).
The HAK will perform whatever system changes it detects are necessary while still allowing DMP to perform the multipathing. This may require a reboot (to install Windows hot fixes).
As I said, the HAK will ensure that the required hot fixes are present. These hot fixes are fairly important. To understand what tasks the HAK will want to perform WITHOUT performing them, use the portable HAK and run:
This will tell you what tasks will be undertaken when you run the command without the -i parameter. I detailed this behaviour here.
One benefit of the HAK is the wonderful xiv_devlist command. Even if you are using DMP, the xiv_devlist command will still work, although you may need to specify veritas as per this example:
xiv_devlist -m veritas
Need more documentation?
This is all documented in the XIV Host Attachment Users Guide which you can find here .
I love USB keys: free ones, ones given away at conferences and ones shaped like Lego blocks. The exciting thing (for me) is that if you buy a Storwize V7000, you also get a USB key: a key which has two fundamental purposes:
It's used to make installation very quick and easy (which it does very well!).
It's used to reset the superuser password (in case you forget what it is) or to set the service IP addresses (in case you didn't set them like I suggested you do).
This is all well and good but what happens when you lose it, borrow it or accidentally throw it out? (oops) If you are searching for it, yours may well have looked like this:
So what to do? The answer is: It's OK, there is nothing magic about this key. In fact the key contains just one piece of software, which you can get from here. Just download the initialization tool and copy it onto your own USB key. The original key also had an Autorun file, but you don't need that (actually I object to auto-running USB keys anyway).
BUT... and there is always a but... I cannot guarantee that EVERY USB key you try will work. Why not? Because some USB keys are formatted strangely or insist on running unique applications before they will work. There is some good, simple advice on the InfoCenter that you can find here. The main trick is to use a USB key that is formatted with the FAT32, EXT2, or EXT3 file system on its first partition and does not need to auto-run any applications before working.
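If you need to reformat a key to FAT32 on a Linux box, the usual tool is mkfs.vfat (on Windows, a standard FAT32 format from Explorer does the job). The device name below is a placeholder, and mkfs will destroy whatever is on that partition, so triple-check the device name before running anything like this:

```text
# /dev/sdX1 is a placeholder for the key's first partition
sudo mkfs.vfat -F 32 /dev/sdX1
```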
In that post I detailed how the XIV began as the Nextra, was then released as the IBM XIV and then updated to the XIV Gen3. So this means last year we saw release 3.0 of the XIV.
At the risk of getting over excited, some of the achievements of the IBM XIV have been truly remarkable:
There are 59 Clients with more than 1 PB each of usable XIV capacity
There are 16 Clients with more than 2 PB each of usable XIV capacity
I am sure some competitors will find larger numbers to try and drown out this achievement, but the point is this: These are FANTASTIC numbers. It shows that despite all the FUD, the XIV is a success for IBM and a success for IBM's customers.
So at the time of the Gen3 release, IBM made no secret of the fact that they planned to add the option of SSD as a read cache layer. In fact each and every Gen3 shipped so far has the mounting and attachment hardware needed to support those SSDs.
Now with release 3.1 IBM turns that promise into a reality.
So... to answer some possible questions:
How can I get some of this SSD goodness?
Order the feature! For existing machines, IBM will need to update the firmware of your XIV Gen3 (non-disruptively) to add SSD support. There will also be an updated version of the XIV GUI. Once those are in place, an IBM Service Representative will add an SSD to each interface module. All of this will be completed without interruption to your operations.
How much read cache will I get?
Each XIV Gen3 module already has 24 GB of server RAM. Since an XIV can vary from 6 to 15 modules (based on capacity), that gives you between 144 GB and 360 GB of server RAM to provide read and write cache. If you add the SSD option you will get a 400 GB SSD per module. This means we get between 2.4 TB and 6 TB of additional read cache (depending on module count). The SSDs are not used as write cache.
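The arithmetic here is simple enough to sanity check yourself. This little sketch just multiplies out the per-module numbers quoted above for the smallest and largest Gen3 configurations:

```python
RAM_PER_MODULE_GB = 24   # server RAM per Gen3 module
SSD_PER_MODULE_GB = 400  # one SSD per module with the read cache option

for modules in (6, 15):  # smallest and largest Gen3 configurations
    ram_gb = modules * RAM_PER_MODULE_GB
    ssd_tb = modules * SSD_PER_MODULE_GB / 1000
    print(f"{modules} modules: {ram_gb} GB RAM cache, {ssd_tb:.1f} TB SSD read cache")
# 6 modules: 144 GB RAM cache, 2.4 TB SSD read cache
# 15 modules: 360 GB RAM cache, 6.0 TB SSD read cache
```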
What administration will I need to perform?
How about none? This is XIV: it's all about making it simple. It's no surprise that practically every IBM Storage device now uses the XIV GUI. These guys wrote the book on making things easy to use.
But seriously, no administration? Well... there are two things you may want to do:
Check how many SSD based read hits you are getting (versus memory based read hits). It's always nice to see just how effective these SSDs are proving themselves to be.
Turn SSD read caching off or on at a per volume level (by default it is on for all volumes). I don't anticipate many clients will need or want to do this, but the option is there and it is very easy to do.
Won't these SSDs wear out or slow down over time?
These are the two great fears of SSD... and XIV development has combined their art with some great work from IBM Research to make sure this is not an issue. The way data is written out to the SSD is handled in a very sophisticated manner. The end result will be consistent and predictable performance with a very long operational life. I will give you more details about exactly how this is done in a future post.
What happens if one of these SSD fails?
Because the SSD is not used as write cache, no data can be lost. Data in memory cache is de-staged by that module to both SAS disk and asynchronously to SSD (although not all data will necessarily go to SSD). So there are no bottlenecks and there is no risk. The other modules will keep using their SSDs and IBM will replace the failed SSD non-disruptively.
What sort of performance improvement will I see?
Depending on application and data patterns you should see your IOPS more than double. A three times improvement is quite possible. Response times could drop by more than two thirds. In many ways these are obvious results.
IBM intend to demonstrate using industry standard benchmarks what the performance of an XIV Gen3 with SSD will be. I can tell you these numbers are going to be very impressive. Watch this space.
Is that it? Anything else in this release?
Release 3.1 also adds:
The ability to mirror between Generation 2 and Gen3 XIVs.
All the base support for IPv6 is now in place (although there are still some certification tests to complete).
Improvements to system thresholds (such as maximum pool size)
GUI enhancements (mainly to add panels for the SSD cache)
A new iPhone app (in addition to the existing iPad app)
If you are interested in the current state of play with XIV, there are a huge number of new resources that have been created or updated as part of the XIV 3.1 update, so I thought I would give you a list. If you are a customer then please scan down to see if there is anything here that interests you. If you are an IBMer or IBM Business Partner (or IBM competitor!), this is all mandatory reading. Either way, check out the new YouTube video, it is very cool.
As promised here is the new video on YouTube that shows the new XIV iPhone App!
I just checked the Apple App Store and cannot see the application yet (only the iPad version). I will update you the moment the iPhone version becomes available for download (and yes it will have a demo mode).
For more XIV related materials (white papers, demos, videos, case studies...), I invite you to pop over to the XIV area of the ibm web site: ibm.com/storage/disk/xiv. You'll find links to materials throughout, such as the SPC report and ISV white papers; click on the Resources tab for a consolidated list of the most recent materials.
I have previously blogged about two XIV report generation tools that you can download and start using. This is just a short update to let you know there are updated versions of both tools, plus a new one that has just been added. These tools are all on my files section at the IBM developerWorks site (where you can also find my Visios).
To sum up what these tools do:
XIV Capacity Report
This Script creates an XLS or CSV file that contains 4 very useful tabs: Systems, Pools, Hosts, Volumes. You can use this to report on your storage, find un-mapped or un-mirrored volumes, check your consumption, etc. Clients, Business Partners and Cloud providers love this nice and simple tool.
It is currently up to version 3.9 and you can find it here.
XIV Performance Report
This Script creates an XLS or CSV file that gives the same information as the XIV Top utility but for a range of days (so we are looking at historic versus current performance). You could for example see which were the busiest volumes for the past 3 days or for the previous week. You can easily spot if host HBAs are not being used or if XIV interface traffic is not being balanced.
It is currently up to version 3.9 and you can find it here.
XIV Usage Report - NEW!
This Script creates an Excel file that shows you the current and historic usage of your volumes and pools. It also gives a trend prediction that will help estimate when your pools or volumes will be full of data. This is great for trend and growth analysis.
It is currently on version 3.9 and you can find it here.
I listened to another great podcast from the Freakonomics team recently in which they recounted the story of Doctor Ignaz Semmelweis, which inspired me to make a connection to something I see in my day to day job.
Doctor Semmelweis worked at the Vienna General Hospital in the 1840s, delivering babies, teaching students and performing autopsies. Now while working there he realized there was something going horribly wrong at the hospital: up to 1 in 6 of the women whose babies were delivered by the male doctors were dying either during or after childbirth. This rate was far higher than the death rate for women whose babies were delivered by midwives and much higher even than the death rate for women who gave birth on the street!
Semmelweis studied this issue very closely and concluded (quite rightly) that the issue was invisible cadaverous particles on the hands of the doctors. The doctors were going straight from performing autopsies to delivering babies... and transmitting all sorts of foul material to the birthing mothers, killing some of them in the process.
His solution was simple: He made the doctors wash their hands.
The result? The rate of women dying after giving birth at that hospital went from a peak of 15% to less than 2%.
So you would like to think that this story ends with Semmelweis declared a hero and hospital hygiene achieving new heights. Sadly it instead ends with Semmelweis being mostly ignored, going mad and dying from injuries sustained from a beating he received in a mental asylum. His discoveries only really began getting wider recognition after work by greats such as Louis Pasteur and Joseph Lister.
So what on earth does this have to do with Fibre Channel attached storage?
Well the answer is invisible dirt particles and their role in causing hard-to-explain issues (work with me here, your honour, I will make my point).
Fibre optic cable relies on the exposed fibre being absolutely clean. The center of the image below is the light coming from a light source being used with a fibre microscope. While that lit spot looks large, it is actually only 62.5 microns (which is tiny).
If you are using single mode (9 micron) fibre (commonly used with long wave adapters) that lit spot is even smaller:
So what does a dirty fibre look like? How about this:
Contaminated error generating cable
What about a badly cleaned one?
Badly cleaned cable
Now these images are scary. Even worse, the contamination is invisible to the naked eye. It is almost impossible to see dirt on your fibres (and staring at the end of a cable is not recommended anyway, regardless of what is at the other end). So this leads to some obvious questions:
How can I keep my cables from getting dirty?
Quite simply, don't expose them to dirt. Always leave dust covers in place on the cable ends and in the SFPs until they need to be used. Don't drag unprotected cables under the floor or leave them hanging in the racks. Don't re-use cables without cleaning them. In fact I recommend cleaning new cables before you start using them. Finally, your dust covers need to be protected from dust too. Store dust covers in a sealed bag so that if you re-use them, they have not become contaminated.
How can I clean my cables?
Cleaning kits are something every site should have onsite and always available (like hand sanitizer for doctors!). Google "fibre optic cleaning kit" for lots of products. I have used Cletop devices but there are plenty of other choices on the market.
Can I create images like the ones above?
You sure can. Google "fibre microscope" for lots of products that can do the job for less than $500. There are plenty of choices on the market. Even if you are not willing to incur the expense yourself, make sure your cable provider has one available. If they are testing your cables with a flashlight, get another provider.
Can my SAN switch tell me I have dirty cables?
The two most common commands I use are porterrshow and statsclear (on Brocade switches). If you see any values in the highlighted six columns of evil, you may have bad SFPs, damaged cabling or dirty cables. Just be careful it is not ancient history. Clear the stats (with statsclear) and wait a decent interval before checking again with porterrshow.
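As a sketch of that workflow: clear the stats, wait, then flag any port whose error counters have moved. The column layout and values below are simplified and made up for illustration (real porterrshow output has many more columns), but the awk filter idea carries straight over to captured switch output:

```shell
#!/bin/sh
# Parse saved porterrshow-style counters and flag suspect ports.
# This sample "output" is illustrative, not real FOS output.
cat > /tmp/porterr.txt <<'EOF'
port  crc_err  enc_out  disc_c3
0     0        0        0
1     37       1024     0
2     0        0        0
EOF
# Any port with non-zero CRC or encoding-out errors deserves a cable inspection/clean.
awk 'NR > 1 && ($2 > 0 || $3 > 0) { print "check port " $1 }' /tmp/porterr.txt
```

On a real switch you would capture `porterrshow` output to a file over ssh and point the awk filter at the columns you care about.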
I could talk in even more detail about monitoring at the switch, but I think that is a whole other blog post.
Feel free to share your horror stories. Who knows, maybe dirty cables are causing your current horror story?
The IBM SVC has been setting records in SPC-1 (OLTP-like) benchmarks for many years. However, recently HP stole the crown with a 3PAR benchmark of 450,212.66 IOPS.
But in breaking news, the SVC is back on top with the very first SPC benchmark that exceeded 500,000 IOPS (520,043.99 to be precise!). You can see the executive summary here.
This benchmark used eight of the current SVC engines (model CG8s) with Storwize V7000 as the backend disk. It shows the awesome power of SVC, its ability to scale and to handle very large configurations with very large throughput requirements. It also shows the power of IBM pSeries which was used to drive these IOPS.
My good friend Rob Jackard from the ATS Group has compiled this list of updates to the IBM Support site. It is a very comprehensive list of updates, flashes, tips and warnings and it is well worth spending a few minutes scanning the list to see if any apply to your environment. There is even a warning about a 497 day bug.
Ideally none of these tips should be news to you if you are getting regular emails using IBM My Notifications so please sign up (or maybe check that your notification list has the correct products) and then read on.
(2011.11.07) DS8700/DS8800 users running with 8Gb host adapters on Release 6.1 exposed to potential loss of access condition. NOTE: The firmware fix is available in R6.1 for DS8700- Bundle 22.214.171.124 or higher (126.96.36.199 is recommended), and for DS8800- Bundle 188.8.131.52 or higher. http://www-01.ibm.com/support/docview.wss?uid=ssg1S1003931
(2011.11.01) DS8700 / DS8800 internal error recovery can result in loss of access when 2145 SVC is attached. NOTE-1: Firmware fixed for DS8700 [Bundle 184.108.40.206 or higher, recommended 220.127.116.11]. NOTE-2: Firmware fixed for DS8800 [Bundle 18.104.22.168 or higher, recommended 22.214.171.124]. http://www-01.ibm.com/support/docview.wss?uid=ssg1S1003743
(2011.11.13) Storwize V7000- Performance Degradation and Loss of GUI/CLI Access Due to Excessive Numbers of Socket Connections. NOTE: This issue exists in all V6.1.0.x and V126.96.36.199-V188.8.131.52 releases. This issue was fixed in the V184.108.40.206 PTF release. http://www-01.ibm.com/support/docview.wss?uid=ssg1S1003930
The first update for Storwize V7000 and SVC release 6.3 is now available. You will find it here for Storwize V7000 and here for SVC (note both links will require you to login to Fix Central with your IBM ID). As usual the new release contains a combination of new features and fixes. The new features are:
New features in this update:
* Support for multi-session iSCSI host attachment
* Language Support for Brazilian Portuguese, French, German, Italian, Japanese, Korean, Spanish, Turkish, Simplified Chinese and Traditional Chinese
There are also several fixes (with some variation between SVC and Storwize V7000, mainly around the platform hardware). The release notes (which you can find at the links above) detail them all. Two fixes I have been looking forward to are:
IC80253 Unable to log into the GUI if password contains special characters. This meant that a password with a comma in it could not be used in the GUI (you got a backspace instead). Passwords with commas could be used in the CLI. This bug was picked up by one of my clients when trying out LDAP and is now fixed in this release.
IC80501 Performance statistics collection fails to record read and write response times for internal drives. This issue meant that SVC internal SSD drives always showed 0 ms response times in TPC.
Note that the Drive firmware does not need to be updated with this release. The new upgrade test tool (version 7.3) will not ask you to update them. I will let you know when that situation changes.
Having survived Wikipedia's 24 hour protest against SOPA and PIPA, a lot of people have suddenly discovered just how much they have come to rely on Wikipedia as an information source. Not that this is a bad thing, after all I love Wikipedia, I truly do. I think it is one of the greatest achievements of the world-wide web and represents a marvelous opportunity for each of us to preserve a record of our world and all the things in it.
I would use Wikipedia every day (often many times per day), as part of my job, for blogging and for general interest. But I grow increasingly saddened by the lack of tech company history (especially SAN) in Wikipedia. Truly great IT companies and truly great IT products have come and gone and are often not really represented. For instance:
When you search for McData you end up on the Brocade page. Where is the corporate history of McData? Do the McData 6064 and 6140 deserve nothing but a passing mention? Ex-McData employees, where are you?
When you search for Engenio you end up on the NetApp page. Where is the corporate history of Engenio?
There are no pages for Inrange, CNT, Nishan, DDN. I could keep going, the list of missing companies is very long.
The first laser printer ever released by Xerox, the 9700, does not have its own page. This was a major milestone in the development of office technology.
The first IBM CMOS mainframe (the one that ended IBM's love affair with water cooling), the IBM 9672, does not have its own page. This product started a major change in the mainframe market.
I could keep listing other significant products that need far more details than they currently have.
Now of course there has been a huge amount of work already done, so please don't think I am denigrating the work of Wikipedia's many contributors. But more needs to be done, and the good news is that a central tenet of Wikipedia is "just fix it". So anyone can contribute, either by creating an article, editing an existing article, or by participating in the talk pages that exist for each article (and those can also be VERY interesting).
If you are keen enough to create an article, you need to:
Ensure your article is properly referenced and cited. This requirement is the biggest challenge. Many large corporations simply don't leave enough artifacts to make this easy. But the artifacts are there, you just need to look. Google is not always your friend; you may need to do some genuine research.
Don't want to write? Here is something else you could do: if you have any photos that you took yourself of IT gear or computer rooms, upload them to Wikimedia Commons. If you don't know how to, or feel uncomfortable doing so yourself, let me know, I can help. Provided the photos are your own work, are not scans of other people's images and you are happy to share them, then we can preserve them and use them freely in Wikipedia articles. Almost every IT related article would benefit from photos of equipment. I particularly want images of IBM printers and copiers, do you have any?
But why do Wikipedia and Lady Gaga leave me speechless? As a benchmark, look at the article on the Lady Gaga song, Speechless. If Gaga's fans can find the enthusiasm to create such a long article (21 KB) on just a single song, surely some budding IT historians can tip the balance for some of the great contributors to SAN history.
It is time to get writing....
The many articles on Lady Gaga in Wikipedia leave me speechless (in a good way).
One of the many tools in your XIV toolkit is the Host Attachment Kit or HAK. Two of my favorite commands provided by the HAK are xiv_attach and xiv_fc_admin, which we use to configure our hosts. Of course users want to know exactly what changes these commands might make, and while the current output gives some good information, some of our users wanted more. So the good news is that XIV Host Attachment Kit version 1.7.1 is now available, and the Windows and Linux versions now have a new extra-verbose mode which you can access with the -i parameter.
By using -i with the commands you will see in even more detail what changes are needed to configure your host to attach to an XIV, but without any changes actually being performed (which is a very cool thing). If you still need more information on the output you can use Appendix A of the updated XIV Host Attachment Kit Users Guide, which you can get from here. The guide has some very useful information on best practices and VMware setup and is really mandatory reading for people using XIV.
To be clear the new verbose parameter only works with HAK version 1.7.1, which you can download from here. Other notable changes in version 1.7.1 are over 40 bug fixes and over 20 improvements, including support for RHEL 5.7, 6.1, and 6.2 (but please check the SSIC to confirm support).
As I previously blogged, the XIV Host Attachment Kit now comes in a portable version. This means you can run the HAK commands without having to first install any software or make any changes on your server. When you download the HAK you get both the portable and full installer versions.
In other news the XIV Host Software team now have their own blog! Check it out here and add it to your RSS list.
The big question of course is which drive type to choose? The answer is that ideally you should possess three pieces of information:
How much usable space do you need in GB or TiB? Don't confuse binary and decimal!
What is your typical I/O profile? For instance, 70% reads / 30% writes, 32 KB block size.
What are your IOPS and response time requirements?
Armed with this information, get your IBM Sales Rep or Business Partner to model your requirements using Capacity Magic and Disk Magic. These modelling tools will tell you how much usable capacity a particular configuration will give you and what performance you can expect from it (given a particular I/O profile). If you don't know your I/O profile or IOPS requirements, you can still have performance modelled using industry standard benchmarks.
This will be my last blog post for the year as I am taking a break for Christmas and New Years. I really want to thank you all for reading and following my blog. I certainly plan to keep blogging next year so don't think I have given up! You should see new updates in mid January.
2011 has been an amazing year with some great highs and some terrible lows. I certainly hope that 2012 is a bright year for everyone in the world, especially those whose lives have been affected by natural disaster and political upheaval.
To close, I want to wish you all a great holiday season, a Merry Christmas, a happy Hanukkah and a Happy New year and share with you three of the most important messages to take into the holiday season:
Firstly, stay safe. Here is a brilliantly put together (and totally serious) road safety message from New Zealand:
Secondly, stay healthy. If you can dance like these guys in a clip from the classic 1973 film of Jesus Christ Superstar, you will be very fit indeed! It is like a Zumba class on steroids (the video has a slow start, it really kicks off at 0:37 but it is worth the wait).
Thirdly, let's hope that in 2012 all of the world's leaders will focus on peace and love. Here is a 1980s power ballad (yet again I show my age) with a very Christmas-themed video clip. Enjoy.
I am getting this question on a very regular basis:
"We have just upgraded to ESXi 5.0 but we cannot find the VAAI driver on the IBM Website"
The answer? There is no vendor supplied driver because no driver is needed. ESXi 5.0 uses a SCSI T10 compliant set of commands that all vendors need to support for VAAI to work.
But of course in the tradition of all answered questions, it leads to another question:
"Once I have upgraded to ESXi 5.0 how can I tell if VAAI is really working?"
The good news is that it is very easy to spot if ESXi 5.0 has detected a VAAI capable LUN. The moment a new LUN is detected by ESXi 5.0 it tries out an Atomic Test and Set command. If that works, you will see that Hardware Acceleration shows as Supported in vCenter. In the screen capture below I have three datastores, two from XIV and one from Storwize V7000, all presented to an ESXi 5.0 server. I dragged the Hardware Acceleration column over from the right hand side to help with the screen capture (in case your vCenter looks different), but you can see the Hardware Acceleration column shows each DataStore as Supported (and did so the moment the volume was detected).
Of course, having seen the Hardware Acceleration Supported message only proves that Atomic Test and Set works. To confirm whether XCopy (Hardware Accelerated Move) is working, on SVC or Storwize V7000 we can use the Performance monitoring panel. In the example below I first performed a storage vMotion, moving a virtual machine between two datastores located on the same Storwize V7000. I then performed a clone of the same virtual machine, where the source was on one datastore and the target was placed on another (but both located on the same Storwize V7000). What you can clearly see is that both operations (storage vMotion and cloning) generated no volume traffic, only MDisk traffic. This means that the ESXi server is doing none of the work and the storage is doing all of it.
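You can also check from the ESXi 5.0 side rather than vCenter: `esxcli storage core device vaai status get` reports the per-primitive VAAI status for each device. The sketch below parses a saved copy of that output; the device name and status values are made up for illustration:

```shell
#!/bin/sh
# On the ESXi host itself you would run:
#   esxcli storage core device vaai status get
# and inspect the per-primitive status. Here we parse a saved copy of
# that output (the device name and values below are illustrative).
cat > /tmp/vaai.txt <<'EOF'
naa.60017380000000000000000000000a01
   ATS Status: supported
   Clone Status: supported
   Zero Status: supported
   Delete Status: unsupported
EOF
# Count how many primitives this device reports as supported.
grep -c ': supported$' /tmp/vaai.txt
```

ATS is the primitive that drives the Hardware Acceleration column in vCenter; Clone Status corresponds to the XCopy behaviour described above.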
The Storwize V7000 and SVC have a command line interface that you access via SSH. Every time you log on, whether it is to transfer a file (using a tool like pscp), issue a single-shot command from a script (using a tool like plink) or log on to issue commands interactively (using a tool like PuTTY), you clearly need to authenticate yourself. Since June 2003, the way you did this was to use a public/private key pair, where the SVC or Storwize V7000 had the public key and the SSH client (such as PuTTY) authenticated using the private key (the PPK file).
However with release 6.3 of the SVC and Storwize V7000 firmware, the use of key files is now optional. A user can now authenticate purely by using a password. This includes using your domain ID. So if you have defined LDAP to your machine, as I documented here, you can now SSH directly to your Storwize V7000 or SVC, use your domain user ID and password, and not go through the key file setup task. Nice!
The choice to continue to authenticate just with an SSH key remains available. If a user has both a password and a configured key file, then either method will work (you only need to use one - not both). Existing scripts will be unaffected by this change, so nothing gets broken because of this.
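In practice the styles look something like this. The cluster name, user names and key paths below are all hypothetical, and the `run=echo` prefix makes it a dry run (remove the `echo` to actually connect):

```shell
#!/bin/sh
# Dry-run sketch of the CLI authentication styles; remove 'echo' to really connect.
run=echo

# Key-based authentication (works on all releases):
$run ssh -i /home/me/.ssh/svc_key admin@v7000-cluster lssystem

# Password (or LDAP domain) authentication, new in 6.3 - ssh simply prompts:
$run ssh myDomainUser@v7000-cluster lssystem

# Single-shot command from a script using plink:
$run plink -i svc_key.ppk admin@v7000-cluster lsvdisk
```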
I think this is a very positive change and one I openly welcome. Combined with LDAP, this really makes user account setup an easy and simple task.
IBM recently released a new version of firmware for the SAN Volume Controller and Storwize V7000. This is known as release 6.3 and continues the tradition of two major updates per year, each adding significant new functions.
So the 6.3 release notes for both Storwize V7000 and SVC listed the following new feature:
Support for 4096 host WWPNs
Since I blithely listed this feature in a recent post I have received lots of emails asking exactly what it means, so I thought I had better explain.
The IBM SVC and Storwize V7000 have always had very clearly published maximum capabilities such as the ones listed here for Storwize V7000 release 6.3 and here for SVC release 6.3.
Most of these numbers are very high and few customers actually approach these maximums. The main issue I am seeing for some of our larger AIX customers is this one:
Total Fibre Channel ports (WWPNs) per I/O group: 512
The reason this can become an issue is the combination of NPIV and AIX Live Partition Mobility. NPIV allows one physical HBA to be shared among multiple operating system instances, each one believing it has exclusive access to the HBA with each one allocated its own unique WWPNs. Suddenly a single HBA which used to present just one WWPN through the SAN to the SVC, can now present vast numbers of them. In addition AIX Live Partition Mobility (which lets you move AIX operating systems between LPARs on the fly) needs additional pre-configured WWPNs defined on the target LPAR to support the move. This further increases the quantity of WWPNs that need to be defined to the SVC (one easy way to spot NPIV generated WWPNs is they normally start with the letter C).
So the bottom line is that IBM needs to make this limit bigger and SVC and Storwize V7000 6.3 code contains the necessary architectural changes to allow this. The first phase is to start potentially supporting up to 2048 WWPNs per I/O group although clearly based on the initial version of the release notes, the long-term plan is to support 4096.
But there is a problem and it has nothing to do with the SVC or Storwize V7000. The problem is that there are certain SAN configurations which may have issues with these large numbers of WWPNs (mainly around older SAN switches not having the CPU power for the switch fibre channel name-server and login-server to handle vast numbers of WWPNs coming out of one HBA).
So what should you do if you need to push the limits?
Contact your IBM pre-sales support and ask for a SCORE request to be opened (also known as an RPQ). You will need to detail your current SAN configuration (especially switch models and firmware levels) so that SVC development can ensure you won't overwhelm your switches. It will also allow our development team to learn how many clients out there need this support. All approvals will include a requirement to upgrade to release 6.3, so you should include this in your planning.
Any questions? Feel free to leave a comment or send me a tweet or an email.
If you want to read something visionary, I just finished The Medium is the Massage by Marshall McLuhan. With dramatic changes occurring in the world, often driven by social media, it is amazing to read a book that understood this so well yet was written in 1967. In fact Marshall passed away in 1980, well before most people got near a TCP/IP address and when the role of the Internet in societal change was still many many years away. A fascinating read (with some amazing art work!).
I have also been reading the excellent Steve Jobs bio by Walter Isaacson, what a fantastic book! I did not know about the friendship between Steve Jobs and Larry Ellison, nor about Al Gore's role in Apple. The details of the early collaboration between Steve Jobs and Bill Gates are really interesting, as is the amazing back story of how Pixar came to be. The book is as much a history of the personal computer as it is a bio of Jobs himself. Reading the book was also quite thought-provoking, especially on themes like motivation, ignoring the impossible, creating A-teams, and the impact of office layout on creativity and teamwork.
At one point the book refers to the Macintosh software dating game, which led me to watch this video. It is also a revelation. Enjoy!
The XIV GUI (which you can download here) is available for a very wide variety of platforms:
AIX 5.3, 6.1, 7.1
HP-UX 11i v2, 11i v3
Linux (RHEL 5, SLES 10 and 11)
Mac OS X 10.6
Solaris 9 and 10
Windows (various versions including Windows ME!)
My gut feel is that the truly vast number of installations and downloads are for the Windows GUI version. I suspect the rest rarely get a look-in. I mean, do people actually use the XIV GUI on AIX, Solaris or HP-UX hosts? I really doubt it.
However you can also get just the XIV command line interface (also called the XCLI) for the following operating systems:
Now I can see why these would be popular. Being able to script XIV CLI commands and execute them locally makes perfect sense, so all the major Operating Systems are represented. But it is curious that a separate Windows CLI installer is not listed. Of course you get the XIV CLI when you install the Windows GUI, but I am equally curious if there are users who want to avoid having the GUI present on servers that only need the CLI.
An example of this would be if you use Commvault SnapProtect with XIV, since each client will need to issue XCLI commands to drive the hardware based snapshots. So the good news is that you can actually force the XIV GUI installer to install only the CLI component. You can do this by using the following command (in a command prompt with the XIV GUI installer file in that directory):
You will need to change the file name at the start of the command to suit the version you have downloaded. I tested it on version 3.01 and it worked just fine; all it installed was the XIV CLI. So keep this in your back pocket and use it when required.
And if you ARE using the XIV GUI on AIX, HP-UX or Solaris, I would love to hear which platform you are using and why. And if you are still using Windows ME.... your persistence is admirable.
With the 6.3 release of the Storwize V7000 and SVC code (which I blogged about here), there are so many new features and functions that I have plenty more to blog about!
The first new feature I blogged about was LDAP support, but an existing feature that has been enhanced is the performance monitor (introduced with release 6.2). When this first came out I put a video on YouTube showing what metrics could be displayed in that release. This is a sped-up recording with no voiceover:
Now with release 6.3 IBM has added separate graphs for reads and writes, plus the ability to display IOPS or MBps, plus the ability to display graphs of read and write latency. Nice! I got so excited I made another YouTube video, this one with narration. So now you can compare the new to the old:
Now this is interesting: IBM is offering a ratings system that allows customers who bought IBM products to write reviews and leave ratings (out of five) on IBM Storage, Power and System Z products, straight from the main ibm.com website.
Let's imagine a new rack server or a new blade server has been added to your Fibre Channel SAN. The first job for the SAN administrator is to zone it to the storage it requires access to. The task normally runs something like this:
Identify the WWPNs for the new server HBA. We can do this using QLogic SANsurfer or Emulex HBAnyware, or by looking at the WWPNs reported by the Fibre Channel switch, or by using datapath query wwpn (with SDD and SDDDSM), or by using the xiv_fc_admin -P command with the XIV HAK. There are lots of different ways, you get the idea.
On fabric 1 create a new alias for the server HBA port cabled to that fabric.
For each storage device that the server needs access to on fabric 1 (or possibly just switch 1), create a new zone and include the new server alias and the alias for every relevant storage port on that device. Repeat if you have other storage devices (so two XIVs means two new zones).
Put the new zone (or zones) into the active zoneset (or a clone of it) and activate it.
Repeat on fabric 2 (after waiting a decent interval to ensure no mistakes were made in fabric 1... well I hope you wait.... you do don't you?).
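On a Brocade switch, the fabric 1 steps above map to FOS CLI commands something like this. Every alias, zone, config name and WWPN below is hypothetical, and the `run=echo` prefix keeps it a dry run (remove the `echo` to run on a real switch):

```shell
#!/bin/sh
# Dry-run sketch of the fabric 1 zoning steps (Brocade FOS CLI).
# All names and WWPNs are made up; remove 'echo' to execute for real.
run=echo
$run alicreate "x3850_hba1","10:00:00:00:c9:aa:bb:01"
$run alicreate "XIV01_fab1","50:01:73:80:aa:bb:01:40; 50:01:73:80:aa:bb:01:50; 50:01:73:80:aa:bb:01:60"
$run zonecreate "x3850_XIV01","x3850_hba1; XIV01_fab1"
$run cfgadd "PROD_CFG","x3850_XIV01"
$run cfgenable "PROD_CFG"
```

Note the second alicreate is the multi-WWPN storage alias trick discussed below: one alias holding all the XIV target ports in that fabric.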
The main trap here is that when creating a zone, you need to ensure you select all of the correct storage aliases for your selected storage device. For instance we could have a simple layout like this:
Fabric 1 contains our new server (in this example an IBM x3850) and three XIV ports:
This means when creating the zone I need to identify and select four separate aliases. What I could do instead is create an alias with all my XIV target ports in it. Now I only have two aliases to select in that fabric:
In a Storwize V7000 example with two target ports in the fabric, creating the zone means identifying and selecting three separate aliases. What I could do instead is create an alias with both my Storwize V7000 WWPNs in it. Now I only have two aliases to select in that fabric:
This method of amalgamating multiple storage port aliases works fine for devices like DS8000, SVC, Storwize V7000 and XIV. I use this method all the time to simplify zoning and I find it reduces both mistakes and the time required to complete zoning tasks.
The only exceptions are:
Don't do it for DS3000, DS4000, DS5000 or DCS3700 as the controllers on these devices do not like to see each other through the switch.
Don't combine ports from different storage devices, so if you have two XIVs in a fabric create one alias for the target ports of each XIV (although you could combine ports from different SVC I/O groups within the same SVC cluster into one alias). You should still use individual aliases for ports being used for migration or replication purposes.
Don't use the WWNN to create an alias. Always create multi-WWPN aliases so you have granular control of which ports go into the alias. If you use the WWNN from an XIV you will also implicitly include any ports that are being used for replication or migration and thus zone them to the host, which makes no sense.
I would love to hear any techniques you have to make your (and my) life easier.
Once your SVC or Storwize V7000 is upgraded to version 6.3 you can start using LDAP for authentication. This means that when you logon, you authenticate with your domain user-id and password rather than a locally created user-id and password.
So why is this important?
It saves you having to configure every user on every SVC or Storwize V7000. If you have multiple machines this makes it far more efficient to set up authentication.
It means that when commands are executed on the SVC or Storwize V7000, the audit log will show the domain username that issued that command, rather than a local username, or worse, just superuser (i.e. who mapped that volume? The superuser did... who?)
It gives you central control over access. If someone leaves the company you just need to remove access at the domain controller, meaning there won't be orphan user-ids left on your Storage equipment.
So as an exercise I added my lab Storwize V7000 to our domain to show how it is done. This example also applies to an SVC so don't be confused if I only refer to Storwize V7000 from now on.
The first task is to negotiate with your Domain administrator to get a new group set up on the domain. In this example I use a group called IBM_Storage_Admins, which lets me use this group for various storage devices (such as an XIV or a SAN Switch).
To create this group we need to logon to the Domain Controller and configure Active Directory. An easy way to do this from the AD controller is to go to Start → Run and type dsa.msc and hit OK. The Active Directory Users and Computers Management Console should open.
Select the groups icon to create a new group.
Enter your group name, in my case: IBM_Storage_Admins and hit OK.
Now right-click the relevant users who need access to the storage and add them to the IBM_Storage_Admins group. In this example I have selected Anthony (whose username is anthonyv).
In this example we are adding anthony into the IBM_Storage_Admins group:
Now it is time to configure the Storwize V7000 so start the Web GUI and logon as Superuser.
Firstly we go to Settings → Directory Services:
We choose the button to Configure Remote Authentication:
We choose LDAP and hit next.
We choose Microsoft Active Directory with no Transport Layer Security. We then expand the Advanced Settings. My lab domain is ad.mel.stg.ibm so I use the Administrator ID on the Domain Controller to authenticate access. You could use any user that has authority to query the LDAP directory. We then hit Next.
We then add the domain controller, which in this example is 10.1.60.50, and the base DN, which is simply the domain name chopped into pieces (so ad.mel.stg.ibm becomes dc=ad,dc=mel,dc=stg,dc=ibm), and hit Finish.
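The "chopped into pieces" rule is just a mechanical transformation: prefix dc= and replace each dot with ,dc=. Here is a quick sketch of that rule as a one-liner (a convenience of my own, not an IBM tool):

```shell
# Convert an AD domain name to the base DN string the wizard expects.
domain="ad.mel.stg.ibm"
base_dn="dc=$(printf '%s' "$domain" | sed 's/\./,dc=/g')"
echo "$base_dn"   # dc=ad,dc=mel,dc=stg,dc=ibm
```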
Provided the command completes successfully we have defined the domain controller to the Storwize V7000. Now we need to add a group. Go to Access → Users.
Select the option to add a New User Group.
In this example we want to add a group for users allowed full admin access to the Storwize V7000. This matches the group we created on the Domain Controller. So we call the group IBM_Storage_Admins and we use the Security Administrator role (which is the most powerful role) and tick the box to enable LDAP for this group.
Now to test, I logon to the Storwize V7000 using the domain user-id anthonyv with that user's domain password. Remember this user is not defined on the Storwize V7000 itself and that if it all goes wrong, we can still logon as Superuser.
Now I create a volume and delete it. Then I check the audit log from Access → Audit log.
Sure enough, we see exactly who did that command.
This is a great outcome for security, auditing and easy access administration.
If you have issues, from the Settings → Directory Services menu, use the Global Actions dropdown on the right hand side to Test LDAP Connections and Authentication or re-configure LDAP.
If you already have existing users (what we call Local users), configuring remote authentication using LDAP does not disable or invalidate those local user-ids. This means you can either logon with a local user-id or logon with a Domain user-id. This is handy if the domain controller fails but can confuse you if your local user name and your domain user name are the same name (for example both anthonyv). The Storwize V7000 will look you up in the local user name list first. I suggest removing all local users (except superuser) as this will reduce confusion but still leave you a backdoor in case remote authentication stops working.
If you see any mistakes or have suggestions to improve the way I described this, please let me know.
The latest release of SVC and Storwize V7000 firmware is now available for download. The major new features that are added with this release are:
Global Mirror with Change Volumes
Native LDAP Authentication
Extended distance split clusters (for SVC)
Support for 4096 host WWPNs
These are some great new features. The ability to use Global Mirror with Change Volumes means clients can now mirror across far smaller pipes, while the increase in host WWPNs is very welcome news for NPIV installations that are suffering from WWPN sprawl.
If you plan to upgrade, firstly grab the new Upgrade Test Utility from here. The links to the Storwize V7000 and SVC versions are both on that page. Remember you can run this test as many times as you want whenever you want, to check the health of your device for upgrade. When you run the upgrade test utility on a Storwize V7000 you may get a message that your disks have down-level firmware. The process to update them is documented here.
If you're using a Storwize V7000 you can grab the V6.3 code from here. If you're using an SVC you can grab the V6.3 code from here. I am sending you to the compatibility matrix page because you should always check that your from level is ok for your to level.
To run the upgrade go to Configuration (the spanner icon) → Advanced → Upgrade Software → Launch Upgrade Wizard
I have not shown all the panels you will see because it is very much a follow-your-nose task, but in essence, first we feed it the Upgrade Test Utility file and run that test.
If you get warnings you may need to act on these. If you are unsure what to do to resolve a warning message, place a service call.
Once the test passes or you're happy you understand the warnings, we now point it at the code package and wait for it to copy across and keep hitting Next.
The application of the code shuts down and reboots each controller, with a 30 minute gap in between. You will transition from this (both nodes down-level, node1 being upgraded):
To this (node1 upgraded, node2 still online but waiting for 30 minutes):
When node2 starts the upgrade the GUI will fail over to node1 and be upgraded to the new version. You will notice the difference immediately, it has a different look and feel. Please don't be tempted to play with the new functions until both controllers are upgraded! Wait until you see this (note a slight change, the GUI flow is now Settings (the spanner icon) → General → Upgrade Software):
Now that the upgrade is complete, it is time to start checking out what is new... but that's a whole different blog post!
For some time I have maintained a document that details how to determine a WWPN for most common examples of Fibre Channel storage sold by IBM. This document is really useful when you're staring at a WWPN being reported by a Fibre Channel switch or device and you're trying to work out which port on your IBM Storage it comes from. It's great for when you're performing zoning, creating a build document or setting up mirroring between DS8000s.
One of the changes referenced in the new WWPN determination document is a corner case you might encounter if setting up mirroring from a DS8100 or DS8300 to a new DS8800, but only where the DS8800 has 8 port host adapters and where the mirroring is to the lower four ports of that adapter. If I have lost you already, then the following document will not interest you. However if you need to configure such an environment, check out my document detailing a variation in the way the adapter ID is reported, which you can find here.
WordPress (where I mirror this blog) as a blogging platform has some very nice features. One of them is that you can find out what search terms led people to your blog. I have noticed that searches like these have become very common:
XIV GUI and XIV GUI V3 or version 3
IBM vcenter plugin
This suggests to me that people are having trouble finding these files, or more importantly: maybe Google is having trouble helping them to find them!
The reason is simple: They have been moved to IBM FixCentral rather than being posted on separate easily trawled web pages. The good news is that once you know about Fix Central, finding any file becomes very very easy.
First up you HAVE to bookmark this URL (Do it now! Yes NOW!):
Many of the preventable issues that occur in a SAN fabric can be avoided by using the right management and monitoring software. One way to get this software is to create or adapt open source packages. While I really like the idea (and price) of roll-your-own solutions, it is not always practical. Apart from the fact that you need to have staff with the relevant skills to do this, long-term maintenance can prove difficult when key people move on. Unfortunately the other extreme (which is far more common) is that many shops actually do nothing at all, ending up without any overall SAN management and monitoring methodology.
An ideal off-the-shelf alternative in a Brocade SAN fabric is IBM Network Advisor, the successor product to Data Center Fabric Manager (DCFM). IBM Network Advisor actually has its heritage in a great product called EFCM (Enterprise Fabric Connectivity Manager), which Brocade picked up when they bought McData. I loved working with EFCM and McData switches, especially the McData 6140, which was truly a great SAN director. When Brocade purchased McData they combined EFCM with their own Fabric Manager to create DCFM. They have since combined it with their network switch management software to create Network Advisor, bringing things to a whole new level. The IBM announcement letter for this software is here.
Now the first thing you may be wondering is: OK so this software sounds great, but how much will it cost? The good news is that trying it out won't cost you anything. It's free to download and trial for 75 days. You can find the download site here.
To demo it, you can spin up a Windows 2008 guest from a template in your favorite Hypervisor. This means you don't even need to request separate hardware to do this trial.
So what benefits should you expect to see? Well first up I am talking about preventing issues like these:
Mistakes made when performing zoning updates
Failure to create regular configuration backups (which especially hurts after a switch failure)
Difficulties upgrading firmware or simply too many upgrades to get through
Poor (or no) switch and performance monitoring
Poor (or no) error notification (including notification back to IBM)
Difficulty collecting log data
Lack of report creation software
In some ways you can sum up the benefits of the software quite easily by looking at the three central menus of IBM Network Advisor: Configure, Monitor, Report.
To give you a view of some of the menu choices, you can see just how rich the options are:
From a configuration perspective you can manage the zonesets of all your fabrics from the one place. This means you don't need to jump between switches. More importantly it gives you a clear indication of what a zoning update is adding AND removing. Accidental removal of a required zone is a very common cause of zoning related SAN issues:
Do you mean to remove that zone?
It can automatically backup your switch configurations. Backing up your configs is frankly a mandatory task that is routinely never done. If a switch fails, then any customization and zoning (if it is a single switch fabric) is lost. This can be a major issue, especially if a business partner or former employee set the switch up. If we schedule a regular backup you won't need to remember, because IBM Network Advisor will do it for you:
Firmware updates also become a far simpler affair. IBM Network Advisor has a built-in FTP server and happily acts as a firmware repository. If you're facing a set of Kangaroo hops, this is a great way to make the whole process very very simple. It will perform compatibility checks before you start and also act as a repository for both firmware and release notes (which is a really nice touch).
From a monitoring perspective, the ability to set up call home to IBM is a huge advantage and a vital step in building a SAN with the highest levels of availability. The added bonus is that you can use IBM Network Advisor to generate a supportsave (a log offload file that you will invariably be asked for during troubleshooting) off every switch in your fleet in one go (you can also set it up to perform this on a regular basis), significantly boosting productivity and aiding in troubleshooting. You can also set up Fabric Watch across the entire fleet of switches, all from a single interface.
If you own DCFM already, then you are eligible for a free upgrade. If after trialing the software you feel that the significant availability benefits this software will give you are worth achieving, talk to your IBM Sales Rep or Business Partner to get a price. I personally think you will find it very reasonable, plus I guarantee that it will not be shelfware and will prove to be a vital tool in getting the most from your SAN.
But... if after trialling IBM Network Advisor you're still determined to avoid paying for software, then you could always consider the open-source alternative (rather than do nothing). Check out this document written by Andy Loftus and Chad Kerner from the National Center for Supercomputing Applications at the University of Illinois. It's a great example of a lessons-learned document that describes how they built their own monitoring solution. You will find all of their documents and scripts here. As I said, roll-your-own might avoid vendor costs, but it has costs all of its own. Does your team have the skills, willpower and time to do this and maintain it? I would love to hear about your experiences either way.
IBM XIV GUI version 3.01 is now available for download from here. Fixes in this release include:
An increased timeout value to prevent disconnects in slow network environments.
Some compatibility fixes for using the GUI with XIVs on older firmware levels (10.2.1 or 10.1.0a or lower).
An LDAP fix.
The GUI installation will no longer include its JRE folder in the user's path environment variable (since this could break other Java-based apps).
I recommend everyone who uses the XIV GUI, regardless of their XIV firmware level, to upgrade to this level. Don't forget that even if you don't have an XIV, you can download and install the GUI and run it in demo mode to see just what everyone is talking about.
Removing the XIV JRE Path statements
The path variable changes are not retrospective, meaning installing the v3.01 XIV GUI will not remove the offending JRE path statement. If you find this is needed (because you have some broken Java applets), here is what you have to do.
Download the XIV V3.01 GUI
Edit your path variable to remove all XIV statements.
Install XIV GUI 3.01
To give you a more detailed set of instructions:
Open your Control Panel, go to System, open Advanced System Settings, then go to Environment Variables.
Highlight your Path entry.
I recommend you then click in the Path variable box, right-click and choose Select All, then copy the entire path statement and paste it into a text file.
Save that text file to keep a record of your paths (in case you make a mistake).
Now remove every entry that relates to the XIV GUI. I had several.
Now paste the edited path statement back in and hit OK.
Now install the XIV GUI v3.01.
After install is complete, check the PATH variable, you should see just one reference to XIV: C:\Program Files (x86)\IBM\Storage\XIV\XIVGUI
On Friday November 18, 2011, IBMers around the world engaged in the world's first group therapy session held entirely in Twitter! (well maybe not the first, and not really group therapy, but it sounds more dramatic when I put it like that).
It focused entirely on tweeting classic lines heard in day to day life at IBM, using the hashtag #stuffibmerssay. The result was an amusing outpouring that kept growing as the day went on (and has not stopped). Karl Roche did a great summary write-up here where he captured some of the more classic stuff. Holly Neilson also wrote a nice blog post on the subject here.
You will notice many of the tweets focus on phone conferences, which are without a doubt the greatest contributor to, and destroyer of, productivity in IBM. Classics such as this one came up again and again (and it's a common problem for me):
VMware vSphere 5.0 brought in a considerable number of storage related improvements. One of these is VASA, which stands for VMware APIs for Storage Awareness - in which VMware yet again manages to place an acronym (API) inside an acronym (someone needs to send Grammar Girl down there to beat them up). But I digress...
VASA improves VMware vSphere’s ability to monitor and automate storage related operations. The VASA Provider delivers information about storage topology, capabilities, and state, as well as events and alerts to VMware. The VASA Provider is a standard vSphere management plug-in that is deployed once on each vCenter server to interact with VMware APIs for Storage Awareness.
You will of course need a VMware vCenter and an ESXi server both running version 5.0. Your XIV can be a Generation2 running 10.2.2 or 10.2.4 firmware or an XIV Gen3.
You can download the installation instructions for the IBM VASA provider here. You can download the release notes for the IBM VASA provider here. You can download the IBM VASA provider itself here.
If none of these links work, then the IBM Fix Central page for every XIV related file is here.
Henry Ford has long been quoted as having said: "Any customer can have a car painted any colour that he wants so long as it is black."
While there is some debate on what Mr Ford exactly said, it's clear that for some time now IBM has heartily embraced this philosophy with a succession of all black machines (occasionally graced with a coloured stripe). So I was rather excited to spot something new in the IBM Melbourne demo center: an IBM Netezza (pronounced net-ease-a). Its rack door is one of the coolest IBM covers I have seen in years!
Even the internal blades look cool (I love the big N).
In case you're curious, IBM® Netezza® Analytics is a purpose-built advanced analytics platform that enables your enterprise to get the most out of its data, giving you quicker answers to increasingly complex questions. It is the simple appliance for serious analytics.
Of course while I should have been thinking about big data and smart analytics, instead I have been reminiscing about IBM machines with coloured covers. For instance the IBM 3350 (storage from the 1970s) could be ordered with covers that were red... (Actually I think the correct name was garnet rose).
As far as I can tell, IBM have not offered coloured panels on Enterprise kit since June 28, 2002.
Prior to this, devices could be ordered with feature codes like:
#9060 Willow green
#9061 Garnet rose
#9062 Sunrise yellow
#9063 Classic blue
#9064 Charcoal brown
#9065 Pebble gray
While it is easy to find pictures of machines with Classic Blue covers like these 3380s (with 3880 control unit)
And even visions of a red computer room (with an all white 3800 printer on the left hand side):
The only picture I have found so far that shows a yellow machine appears to have faded to orange over the years (I don't think IBM sold orange System/38s?).
I did some more digging and found this great YouTube video. You can see some old System 360 kit with red covers and at 00:46 there are some machines in custom bright yellow! The client literally ordered the machines painted with a custom tint. That takes case modding to a whole new level.
So should IBM be embracing the new cool and coming out with a bright orange XIV? How about a Storwize V7000 in fluorescent blue? A man can dream....
And if you want to see more about Netezza and its incredibly cool rack (and even cooler architecture), check this video out:
If you're in Melbourne (that's Melbourne Australia, not Melbourne Florida) why not come along to the next Australian IBM Tivoli User Group meeting. The subject? Tivoli Storage!
The time and date: 10am to 1pm on Friday Nov 25th, 2011 (lunch included!). The location: IBM building, Seminar Room, 60 City Rd, Southgate.
Now we need you to register to ensure that enough food is ordered for lunch, so please hit the following website, sign up if necessary and then register your intention to attend (Customers and Business Partners are very welcome):
10:00am - The meeting will open with a welcome and introductions from group leaders Nik Hatzikos (from IAG) and Richard Whybrow (from Hertz).
Session 1: Tivoli Storage "Latest Release" session. There is LOTS to talk about. This session will cover TSM 6.3, TSM for VE 6.3, Tivoli Storage FlashCopy Manager V3.1 and TPC 4.2.2 - Presented by Jacques Butcher, IBM Tivoli Storage Specialist. Jacques is a fountain of knowledge with lots of real world experience.
Session 2: MemberTalk - TSM V6 Upgrade. We will hear some interesting feedback from one of our members on a recent TSM V6 upgrade project. Presented by Richard Whybrow from Hertz. Richard is a great presenter who loves to create multi-media.
Session 3: Round table discussion - anyone attending is welcome to bring up a topic for discussion (preferably about Tivoli Storage).
Lunch will be supplied courtesy of IBM.
This is a great opportunity to meet and network with like-minded peers at the end of year Tivoli User Group meeting. And hopefully Richard will show us a movie or two!
************************************************************************** Update 17/12/2011: A flash reporting a possible issue that could occur if a drive fails during drive firmware update can be found here.
Until the flash is updated showing how to avoid this issue, only update drive firmware when installing a new machine or if all hosts are offline.
IBM recently released new drive firmware for the Storwize V7000, so I thought I would share the process of how I update that firmware. You can download it from here. The details for this new package can be found here. I recommend you perform the drive update before you next update your Storwize V7000 microcode.
I want to be clear that one of the central goals of the Storwize V7000 is to ensure that drive firmware updates can be done online without host disruption. This is possible because each drive can be updated in around 4 seconds. The scripts I share below leave a 10 second delay between drives just to be safe. I would still prefer that you did the update during a quiet period.
We need to perform this procedure using the command line as there is no way to do this procedure from the GUI (yet).
There are four steps:
Upload the Software Upgrade Test Utility to determine which drives need updating.
Upload the drive microcode package.
Apply the drive software.
Confirm all drives are updated.
Step 1: Upload and run the upgrade utility
You will need the upgrade test utility which you can get from here.
You will need the Putty utility PSCP which you can get from here (although most of you should already have it).
You will need to have created a public/private key pair and assigned it to a user. In all the examples the user name I use is anthonyv. You need to use your own user-id, although you could also use admin. The process to create and associate the key pair is described here. Place the PPK file into the putty folder along with the upgrade test utility.
From the Putty folder we need to upload the test utility. You will need to change the key file name, userid and IP address (all highlighted in red) to suit your installation.
NOTE: The following command is being run in a Windows command prompt. You need to be in the C:\Program Files\Putty or C:\Program Files (x86)\Putty folder.
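As a sketch, the upload command looks like this; the key file name, user-id, IP address and utility file name shown here are all examples, so substitute your own:

```
pscp -i icat.ppk IBM2076_INSTALL_upgradetest anthonyv@10.1.1.20:/home/admin/upgrade/
```

Once the file is uploaded, you install it from a PuTTY SSH session with svctask applysoftware -file (naming the utility file you uploaded) and then run svcupgradetest -f -d.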
If you get a warning window like the one shown below, indicating we have down-level drives, we need to proceed to the next step (note that the enclosure and slot numbers are not the same as drive IDs). If you have a lot of drives, you can drop the -d from the svcupgradetest command to get a summary list.
******************* Warning found *******************
| Model | Latest FW | Current FW | Drive Info |
| HK230041S | 2920 | 291E | Drive in slot 24 in enclosure 1 |
| | | | Drive in slot 23 in enclosure 1 |
| ST9450404SS | B548 | B546 | Drive in slot 22 in enclosure 1 |
| | | | Drive in slot 21 in enclosure 1 |
| | | | Drive in slot 20 in enclosure 1 |
| | | | Drive in slot 19 in enclosure 1 |
| | | | Drive in slot 18 in enclosure 1 |
| | | | Drive in slot 17 in enclosure 1 |
| | | | Drive in slot 16 in enclosure 1 |
| | | | Drive in slot 15 in enclosure 1 |
| | | | Drive in slot 14 in enclosure 1 |
| | | | Drive in slot 13 in enclosure 1 |
| | | | Drive in slot 12 in enclosure 1 |
| | | | Drive in slot 11 in enclosure 1 |
| | | | Drive in slot 10 in enclosure 1 |
| | | | Drive in slot 9 in enclosure 1 |
| | | | Drive in slot 8 in enclosure 1 |
| | | | Drive in slot 5 in enclosure 1 |
| | | | Drive in slot 6 in enclosure 1 |
Step 2: Upload the drive microcode package
Download the drive update package from here. Put it into the PuTTY folder. From a Windows command prompt we need to upload the package using the following command. You will need to change the key file name, userid and IP address (all highlighted in red) to suit your installation. Note yet again that you are running this in a Windows command prompt from the PuTTY folder (not from inside an SSH session):
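Again as a sketch (key file, user-id and IP address are examples; the package name matches the one used in the applydrivesoftware commands below):

```
pscp -i icat.ppk IBM2076_DRIVE_20110928 anthonyv@10.1.1.20:/home/admin/upgrade/
```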
I have written some scripts to help you list the drive IDs that need to be updated and perform the updates. You can upgrade the drives one at a time, or in bulk, depending on how you want to do this. All the remaining commands are run in a PuTTY session.
Firstly run this script to list all the drive IDs and current firmware levels. We need the drive IDs if we want to update individual drives.
svcinfo lsdrive -nohdr |while read did error use;do svcinfo lsdrive $did |while read id value;do if [[ $id == "firmware_level" ]];then echo $did" "$value;fi;done;done
The output will look something like this, showing the drive ID and that drive's current firmware level. From step 1 we know what the latest firmware level is, so we can compare to the current firmware level:
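For instance, using the down-level firmware levels flagged in Step 1 (the drive IDs here are illustrative; remember they do not match enclosure slot numbers):

```
0 B546
1 B546
2 291E
3 B546
4 291E
```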
However you may have a lot of drives and want to upgrade them in bulk. So you could use this command, which updates drive IDs 19 and 20 (highlighted in red). You can change the IDs or add extra drives to the list as required:
for did in 19 20;do echo "Updating drive "$did;svctask applydrivesoftware -file IBM2076_DRIVE_20110928 -type firmware -drive $did;sleep 10s;done
If we just wanted to upgrade every single drive in the machine (regardless of their level), we could run this command:
svcinfo lsdrive -nohdr |while read did name IO_group_id;do echo "Updating drive "$did;svctask applydrivesoftware -file IBM2076_DRIVE_20110928 -type firmware -drive $did;sleep 10s;done
When updating multiple drives, I have inserted a 10 second sleep between updates, just to ensure the process runs smoothly. This means each drive takes about 13-15 seconds.
Once we have upgraded every drive, it is time for a final check.
Step 4: Confirm all drives are updated
You have two ways to confirm this. Firstly run the following command to list the firmware level of each drive. Is each drive reflecting the levels reported in Step 1?
svcinfo lsdrive -nohdr |while read did error use;do svcinfo lsdrive $did |while read id value;do if [[ $id == "firmware_level" ]];then echo $did" "$value;fi;done;done
Now run the software upgrade test utility again:
svcupgradetest -f -d
Provided you receive no warnings about drives not being at the recommended levels, you are now finished with the drive updates. Of course you could now proceed to install the V6.3 firmware, but you can do that from the GUI.
The IBM XIV has some great performance monitoring tools. These include:
Realtime IOPS displayed on the front panel of the XIV GUI
Realtime performance statistics using XIV Top (as detailed here)
Realtime performance statistics using the XIV Mobile Dashboard (as detailed here)
Historical performance information with up to 1 year of statistics, all accessible using the performance tab in the XIV GUI.
The historical performance information is a great source of data and the XIV GUI has some very clever ways to display this data. However you may want to offload these statistics to a spreadsheet. The XCLI allows you to offload stats using a command that is formatted like the one below. This particular example will collect 1440 minutes (24 hours) of statistics in 1 minute intervals starting from midnight on November 6, from the XIV at IP address 10.10.1.10 piping the output to a CSV file.
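The XCLI command follows this general shape. I am writing the parameter names from memory, so treat this as a sketch and confirm against the XCLI help for statistics_get before using it:

```
xcli -m 10.10.1.10 -u admin -p password -s statistics_get start=2011-11-06.00:00:00 count=1440 interval=1 resolution_unit=minute > xiv_stats.csv
```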
However the subsequent output may require further manipulation before it can be used. So we have produced a tool, structured very much like the configuration collection tool I described here, that will help you produce a very useful report.
Here is what you need to do:
1) Download XIV Performance Report version 6.8 zip file from this link. Click where it says Downloading this file.
2) You will get a zip file with six files in it. Unzip them into a folder on a Windows workstation. The Windows workstation also needs the XIV GUI installed on it (actually you only need the XCLI, but the Windows version of the XIV GUI will give you that).
3) Of the six files you just unzipped, you first need to edit the file called: xiv_perf_report_get_files.vbs. Open that file with a text editor (such as Notepad). The easiest way to do this is to right-click the file and choose Edit.
4) You need to edit the section that looks like this:
' ****** Edit this list of IP/names and user/password for your own configs
myConfigs.Add "1", "-m 220.127.116.11 -u admin -p adminadmin"
'myConfigs.Add "2", "-m 18.104.22.168 -u admin -p adminadmin"
'myConfigs.Add "3", "-m 22.214.171.124 -u admin -p adminadmin"
'myConfigs.Add "4", "-m 126.96.36.199 -u admin -p adminadmin"
'myConfigs.Add "5", "-m 188.8.131.52 -u admin -p adminadmin"
Let's say you have two XIVs, the details for which are:
XIV1 : Management IP: 10.1.10.100 Userid: admin Password: passw0rd
XIV2 : Management IP: 10.1.20.100 Userid: admin Password: passw0rd
So we edit the section I mentioned above and make it look like this:
' ****** Edit this list of IP/names and user/password for your own configs
myConfigs.Add "1", "-m 10.1.10.100 -u admin -p passw0rd"
myConfigs.Add "2", "-m 10.1.20.100 -u admin -p passw0rd"
Now save the file. You can ignore the additional myConfigs lines further down in the file that start with an apostrophe; these are just examples.
Unless you acquire another XIV, you will not need to edit this particular file again.
5) The default behaviour of the script is to collect performance stats for the previous 24 hours. If this is what you want, then you do not need to change anything else. Just proceed to step 6. If however you want to collect stats for a specific period, then you will need to edit the xiv_create_perf_report.bat file to change the data collection time period. In that file, look for this section. You will need to make two changes.
REM ************ 2 options ahead, choose 1, leave the other in remark! ***************
REM *** 1. set your own time frame ***
REM edit the following line to the date you need.
REM cscript xiv_perf_report_get_files.vbs %folder% "2011-11-04.21:00:00" "minute" "10" "1"
REM *** 2. or, get the last 24h ***
FOR /F %%i IN ('cscript "%~dp0get_last_24h_start_time.vbs" //Nologo') do SET START_TIME=%%i
cscript xiv_perf_report_get_files.vbs %folder% %START_TIME% "minute" "1440" "1"
Firstly, un-REM the cscript line under option 1 (delete the leading REM). Then REM the two lines under option 2 so that neither runs. Finally, edit the time period in the cscript line. The parameters "2011-11-04.21:00:00" "minute" "10" "1" will collect 10 samples of 1 minute duration starting at 21:00 on November 4, 2011.
So we could change the 10 to a larger number to get more samples and change the start date and time to get a different time period. You can also change the resolution (from minutes to hours for instance) and interval size (from 1 minute to 10 minutes).
REM ************ 2 options ahead, choose 1, leave the other in remark! ***************
REM *** 1. set your own time frame ***
REM edit the following line to the date you need.
cscript xiv_perf_report_get_files.vbs %folder% "2011-11-04.21:00:00" "minute" "10" "1"
REM *** 2. or, get the last 24h ***
REM FOR /F %%i IN ('cscript "%~dp0get_last_24h_start_time.vbs" //Nologo') do SET START_TIME=%%i
REM cscript xiv_perf_report_get_files.vbs %folder% %START_TIME% "minute" "1440" "1"
Once you have finished editing, save the file and proceed to the next step.
6) Now double-click the icon xiv_create_perf_report.bat (this is the file you may have edited in step 5). This is a Windows bat file that opens a command prompt while it runs. It uses XCLI commands, so if the XIV GUI or XCLI is not installed, it won't work. It will take some time to collect the data (a whole day's stats can take up to an hour to collect), so be patient. Don't close the command prompt while it is running, or you will need to start the data collection again. The output will be a new folder named with today's date and time. Inside that folder will be a report named something like 2011_11_7_8_7_9_xiv_perf_report.xls, which will contain a number of worksheets.
Each sheet is already set up for filtering and contains nicely formatted data. The most common question I get is what a write hit is (versus a write miss); I explain that here.
If a host or port does not appear, the simple reason is normally that there are no stats to report on that port, so no entry appears.
The SVC and Storwize V7000 offer a command line interface that you access via SSH. You start your favourite SSH client (such as PuTTY or MindTerm) and then log on as admin or as your own user ID. Right now you need to generate a private/public key pair to do this, although with release 6.3 (which will be available November 2011), you will be able to log on via SSH using just a user ID and password.
Having logged on there are three categories of commands you can issue:
svcinfo: Informational commands that let you examine your configuration.
svctask: Task commands that let you change your configuration.
satask: Service commands that are only used in specific circumstances.
There are several CLI usability features that I routinely find users are not aware of, so I thought I would share some of them here:
1) Listing all possible commands
If you cannot remember a command, here is a simple trick to list them all. Issue one of the following commands:
svcinfo -h or svcinfo -?
svctask -h or svctask -?
You can also type either svcinfo or svctask and then hit the tab key twice to get a full listing. With svctask you will need to type y to list them all, as per the example shown below:
IBM_2076:STG_V7000:admin>svctask (HIT TAB twice!)
Display all 139 possibilities? (y or n) y
2) Getting help on a particular command
Having found the command you want, issue that command with either -? or -h to get help information. For instance:
svctask mkvdisk -?
svctask mkvdisk -h
You will be shown the same help information that you can find in the Infocenter, including examples of syntax.
3) Drop the svctask and svcinfo prefixes
In release 6.2 of the SVC and Storwize V7000 firmware, the requirement to prefix a command with svcinfo or svctask has been removed. However I tend to keep using them because I write a lot of example commands for clients and I cannot be sure which version of firmware they are running.
4) Use the shell
When we SSH to the SVC or Storwize V7000 we are connecting to a Linux operating system through a special restricted shell. Some common Unix commands don't work (such as ls or grep or awk), but constructs provided by the shell itself will work, such as while, if, read, echo and pipes.
We can use this to construct some really clever commands.
For instance creating volume copies is very popular, but the default copy rate is rather slow (50, which equals 2 MBps). It is not unusual for end users to speed up the background copy and then forget to slow it down when they are finished. So I wrote two commands to help me out. Firstly I run a command to display the copy rate of every volume. Ideally I should see 50 alongside each volume. However I often find that some volumes are set to higher numbers, such as the maximum value of 100 (which is 64 MBps).
svcinfo lsvdisk -nohdr |while read id name IO_group_id;do svcinfo lsvdisk $id |while read id value;do if [[ $id == "sync_rate" ]];then echo $value" "$name;fi;done;done
Let's break down this command:
We start with svcinfo lsvdisk -nohdr. This gives us a list of every VDisk in column format with no header information.
We pipe the output of that lsvdisk command to while read. This reads the output one line at a time and lets us work with it. We read the first three columns of output and label the data in the first column id, the second column name and the third column IO_group_id. I find we need to label at least three columns. We could read extra columns if we wanted to, but all I want is the VDisk id and name.
For every line of data we issue an lsvdisk command against each listed VDisk using the VDisk id. This output is not in column format so we need to do something different here.
We now examine the output of the lsvdisk command for each VDisk by piping the output to while read. Since each line contains a descriptor and a value, we label them id and value. We use if to look for a line that starts with sync_rate.
When we find the sync_rate for a VDisk we print the value of the sync_rate and the VDisk name. We are done for this VDisk.
We now examine the next VDisk and again look for the sync_rate for that VDisk.
Once we have examined every VDisk, we are done.
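If you want to rehearse this while/read plumbing without a live cluster, you can mimic the command locally with stub functions standing in for svcinfo (the VDisk names and the sync_rate value below are invented for illustration):

```shell
# Stand-ins for 'svcinfo lsvdisk -nohdr' and 'svcinfo lsvdisk <id>'
lsvdisk_nohdr() { printf '0 dbvol 0\n1 logvol 0\n'; }
lsvdisk_detail() { printf 'id %s\nsync_rate 80\ncapacity 100\n' "$1"; }

# Same shape as the real command: the outer loop walks the VDisk list,
# the inner loop hunts for the sync_rate line of each VDisk.
lsvdisk_nohdr | while read id name IO_group_id; do
  lsvdisk_detail "$id" | while read key value; do
    if [ "$key" = "sync_rate" ]; then echo "$value $name"; fi
  done
done
```

Run it and you get one line per VDisk showing the sync rate and name, exactly the shape of output the real command produces.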
I then run the following command which sets the copy rate for every volume to the default value of 50 (2 MBps):
svcinfo lsvdisk -nohdr |while read id name IO_group_id;do svctask chvdisk -syncrate 50 $id;done
Clearly you could edit this second command to change the copy rate to any value between 0 and 100. In each case you just paste the entire command in, and hit Enter.
Let's break down this command:
We start with svcinfo lsvdisk -nohdr. This gives us a list of every VDisk in column format with no header information.
We pipe the output of that lsvdisk command to while read. This reads the output one line at a time and lets us work with it. We read the data in the first three columns of output and label the data in the first column id, the second column name and the third column IO_group_id. I find we need to label at least three columns. We could read extra columns if we wanted to, but all I want is the VDisk id.
For every line of data we read, we do the following command: svctask chvdisk -syncrate 50 $id. Since we labelled the first column of output from the lsvdisk command as id, and that column contains VDisk IDs, we are going to issue this command against every VDisk that got listed.
Once we have run the chvdisk command against every VDisk listed, we are done.
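A footnote on those copy rate numbers: the setting does not scale linearly. Per the SVC documentation, values 1-10 give 128 KBps and each further band of 10 doubles the rate, which is why 50 equals 2 MBps and 100 equals 64 MBps. A quick sketch of that mapping in shell arithmetic:

```shell
# Background copy rate bands: 1-10 = 128 KBps, each band of 10 doubles
# the rate (so 41-50 = 2 MBps and 91-100 = 64 MBps).
copy_rate_kbps() {
  band=$(( ($1 + 9) / 10 ))         # 1..10 -> band 1, 91..100 -> band 10
  echo $(( 128 * (1 << (band - 1)) ))
}

copy_rate_kbps 50    # prints 2048 (KBps), i.e. 2 MBps, the default
copy_rate_kbps 100   # prints 65536 (KBps), i.e. 64 MBps, the maximum
```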
There are lots of possible clever combinations and I will list a few more in upcoming posts.
I have also been getting lots of requests to write a post about updating drive firmware, so expect something on that very soon.
I am unsure about unnatural love, but perhaps the level of enthusiasm he is seeing comes from: ease of use, awesome GUI, consistent performance, freedom from planning RAID groups, simple growth and upgrade path... I could keep going... it all adds up.
So if you are a member of the cult of XIV, I have a little present for you: A really nice and simple reporting tool.
Here is what you need to do:
1) Download XIV Capacity Report 3.7 from this link. Click where it says Downloading this file.
2) You will get a zip file with five files in it. Unzip them into a folder on a Windows workstation. The Windows workstation also needs the XIV GUI installed on it (actually you only need the XCLI, but the Windows version of the GUI will give you that).
3) Of the five files you just unzipped, you need to edit the file called: xiv_capacity_report_get_files.vbs. Open that file with a text editor (such as Notepad); the easiest way to do this is to right-click the file and choose Edit.
4) You need to edit the section that looks like this:
' *********** Edit this list of IP/names and user/password for your own configs ************************
myConfigs.Add "1", "-m 184.108.40.206 -u admin -p adminadmin"
myConfigs.Add "2", "-m 220.127.116.11 -u admin -p adminadmin"
Let's say you have two XIVs, the details for which are:
XIV1 : Management IP: 10.1.10.100 Userid: admin Password: passw0rd
XIV2 : Management IP: 10.1.20.100 Userid: admin Password: passw0rd
So we edit the section I mentioned above and make it look like this:
' *********** Edit this list of IP/names and user/password for your own configs ************************
myConfigs.Add "1", "-m 10.1.10.100 -u admin -p passw0rd"
myConfigs.Add "2", "-m 10.1.20.100 -u admin -p passw0rd"
Now save the file and we are done editing. If you only have one XIV, then delete the line starting with myConfigs.Add "2" (or put an apostrophe at the start of the line to comment it out). If you have more than two XIVs, just add extra lines for myConfigs.Add "3", myConfigs.Add "4" and so on, adding details for each machine as shown above. You can ignore the lines further down in the file that start with an apostrophe, these are just examples.
Unless you acquire another XIV, you will not have to do this file editing again.
5) Now double-click on the icon: xiv_create_capacity_report.bat. This is a Windows bat file that will create a Windows command prompt while it is running. It uses XCLI commands, so if the XIV GUI or XCLI is not installed, it won't work. The output will be a new folder with today's date and time. Inside that folder will be a report that will be named something like: xiv_capacity_report_2011_10_30_17_6_36.xls
You can now open the report and check it out (presuming you have Microsoft Excel or some other software that can open XLS files). When I open the file on my laptop, I get a message about file formats.
You can ignore this message. If you save the file as an XLS you won't get this message again.
The report itself will have five tabs as shown below:
For every column in every tab, filtering (or sorting) is already setup. This makes it really easy to re-arrange the data to suit what you're looking for.
Arrays tab: Lists details about all your XIVs, including serial numbers, code versions, soft and hard capacity, how much of the soft and hard space is allocated, how much is free and how much space is being consumed. A great place to grab the machine serial number or confirm which machine has space available.
Pools tab: Lists every pool in every XIV, showing every sizing metric you could possibly want. Cells are coloured red or yellow if limits are being reached. It is a great place to confirm whether your pools are filling up and whether a pool is a good candidate for Thin Provisioning. Sort column L (Allocated vs Used) or column N (Hard Capacity Utilization) to identify good candidates for swapping to Thin Provisioning; these are the pools that can give up some hard space.
Hosts tab: Lists every defined host for every XIV. You can see straight away how much space has been allocated to each host and, more importantly, how much is being used. Cells are coloured yellow or red if limits are being reached. Some nice tricks:
Sort by column F (Allocated vs Used) to identify hosts that have asked for lots of space, but not used much of it.
Compare column G (# of volumes) with column I (# volumes mirrored). You may have critical hosts that require every volume to be mirrored, so a quick compare will confirm if there are exceptions.
Volumes tab: Lists every volume defined on every XIV. This is a great tab for checking which volumes are being mirrored, how many snapshots exist for each volume and how much space each volume is using. Again, cells in the Used column are coloured red or yellow if space is running short. Some great tricks here:
Sort column F or G (Used GB and %) to identify volumes with no or little data in them. Perhaps they are not really needed? Perhaps they are over-sized or should be in a Thin Provisioning pool.
Sort column H (Mirrored) to identify all volumes where Mirrored = No. Should they be mirrored?
Sort column K (Host Mapped) to identify all volumes not mapped to a host. Unmapped volumes are a great potential source of space!
Failures tab: Shows any failed components in your machines (such as failed disks).
So please download the tool and try it out. Service providers love using this tool for reporting; it is quick and easy to set up and run. Every time you run the tool you get a new report, so you can automate report creation and keep a nice history.
If you were signed into IBM developerWorks when you downloaded the tool and an update is made available, you should be notified by email, provided your IBM ID is set-up properly with a valid e-mail address.
And as for cults... there is only one cult I ever really liked and they really were called The Cult. The video takes about 15 seconds to get going and yes, the lead singer is dressed like a pirate. Enjoy! (if you like 80s rock...)
The IBM ProtecTIER performs in-line de-duplication of your backup data, enabling much faster backups and much faster restore times. De-duplicating your backups allows you to store a lot "more on the floor".
One of the advertised capabilities of ProtecTIER is that you can get a de-duplication ratio of up to 25 to 1. This sounds great, but advertising this sort of ratio is a blessing and a curse. On the one hand it shows the potential capability of the device, but it can also create very high expectations. In reality the ratio you achieve is totally dependent on the type of data you back up (video versus database versus big empty files, etc.) and the way you back it up (full backups versus incremental backups). In my experience, somewhere between 8:1 and 16:1 is a realistic expectation. The reason is that your backup data needs to actually contain duplicate data, that is, data the ProtecTIER has already stored in its repository, for de-duplication to work. If every piece of data you back up is unique, encrypted or somehow obfuscated to appear different from the last backup, then no duplicate data will be detected. The result? Your de-dup ratio will be very low.
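To make the arithmetic concrete, the ratio is simply the nominal data written by the backup application divided by the physical space consumed in the repository, so you can sanity-check your own numbers like this (the figures below are invented for illustration):

```shell
# Dedup ratio = data written by the backup application / space used on disk
nominal_gb=24000   # e.g. a month of nightly "full" backups sent to the VTL
stored_gb=2000     # what actually landed in the repository after dedup
echo "$(( nominal_gb / stored_gb )):1"   # prints 12:1
```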
Backing up Lotus Domino databases is a good case in point. When backing up your Lotus Notes databases you may only see a 2.5:1 dedup ratio, which is clearly not a good result. The issue may well be with a function called compaction. Compaction re-arranges all of the data contained in the NSF (Notes Storage Format) databases to reclaim space. While this function helps to reduce space utilization from the perspective of Lotus Domino, it also changes the layout and data pattern of every single NSF. So the next time ProtecTIER receives blocks from these databases, they all look unique, and the de-dup ratio naturally ends up being very low. However, running compaction is a best practice for Lotus Domino, so disabling it is not a solution.
The solution involves using a tool called DAOS (Domino Attachment and Object Service), which removes all the email attachments from the NSF files and stores them separately. This not only provides substantial space savings for Domino (because it only stores each unique attachment once) but also means that compaction can still run on the NSF files (which are now attachment free). The result at one customer? The combined de-dup ratio went up to 8.5:1 (which was about 2.5:1 on the NSF but almost 20:1 on the attachment files).
The only caveat is that Lotus Domino needs to be at version 8.5 to use DAOS. More information on performing backups with DAOS can be found here.
Thanks to Francois Morin for sharing this with me.
In case you hear different, it's time for a few simple facts about XIV:
XIV was founded in 2002 and shipped its first product in 2005. XIV is now up to its third generation in the development process. After over 6 years, XIV is a mature and established product in the marketplace.
IBM has sold over 5000 XIVs: A number IBM is proud and happy to disclose. In fact IBM has been open and honest about sales numbers throughout the program, which speaks volumes about how pleased they are with the success of the product.
Another point about sales numbers: Compared to the XIV, the Storwize V7000 can be sold with a starting capacity of less than 1 TB. The smallest XIV Generation 2 has a starting capacity of 27 TB, while the smallest XIV Gen3 starts at 55 TB. So clearly lower entry point products like the Storwize V7000 will outsell larger Enterprise products like the XIV. The sales numbers of both products continue to be outstanding for their size and class.
There are over 2000 XIV customers, including a considerable number of reference accounts. There are 75 success stories on the IBM XIV website, which you can check out here.
IBM has announced the first Storage Performance Council (SPC) result with XIV, the very first on the SPC-2/E benchmark. The XIV Storage System demonstrated its ability to handle Big Data as well as providing associated energy use data. The SPC-2/E result showed that the XIV Storage System provides outstanding enterprise price-performance and Large File Processing (LFP) performance. The numbers? 8259.94 MBPS SPC-2 (LFP) Data Rate and $137.07 SPC-2 (LFP) Price-Performance. In case these numbers don't mean much to you, they are truly outstanding; there is only one other competitor even in the same ball park. (Price-Performance Source: Storage Performance Council SPC-2 Benchmark Results: http://www.storageperformance.org/results/benchmark_results_spc2, results current as of 10/20/11.) (Thanks to Elizabeth Stahl for the SPC-2/E info.)
IBM have offered Enterprise Storage Virtualization since June 2003 with the IBM SAN Volume Controller (SVC). October 2010 saw IBM releasing the Storwize V7000, taking the SVC code and packaging it into a midrange disk product. So now you have four possible choices:
Use SVC to virtualize your storage.
Use Storwize V7000 to provide internal SAS drives plus virtualize your storage.
Use Storwize V7000 as a midrange disk product.
Use Storwize V7000 virtualized behind SVC.
The great thing is that all four choices are valid and all four work just fine. But for customers already using SVC, or considering SVC, the question becomes: should I virtualize a Storwize V7000 behind an SVC? Does this make sense?
The short answer: YES!
We have a great many customers happily doing this, so I thought I would share some common questions I get around configuration. Firstly, there is an InfoCenter page on this, which you will find here. Secondly, there is a debate about whether we should create individual volumes/arrays on the Storwize V7000 or just create a single pool on the Storwize V7000 (which equates to striping on striping). More benchmarking is being done to see if one method is truly better than the other, so until then I recommend the method described below. If you have already done stripe on stripe, don't go changing anything until I update this post.
How many ports should I use for Zoning?
The Storwize V7000 has 8 Fibre Channel ports, 4 from each node canister. You need to zone at least two ports from each node canister to your SVC cluster. This is no different to how you would zone a DS5100 or an EMC VNX.
How will the SVC detect the Storwize V7000?
On the SVC you will see two storage controllers, one for each node canister. This is quite normal. The reason for this is that each node canister reports its own WWNN. This is not a problem and will not affect volume failover if one node canister goes offline.
In the example below the SVC has detected two new controllers. The confusing factor is that both report as 2145s, but they are a Storwize V7000. Rename them to reflect what they really are (something like StorwizeV7000_1_Node1 and StorwizeV7000_1_Node2).
How should I define the SVC on the Storwize V7000?
You need to create a new host on the Storwize V7000 and call it something like SVC_1. If the SVC WWPNs don't appear in the WWPN dropdown, you will need to manually add them as shown below:
You can get the SVC WWPNs from your existing zoning, by running svcinfo lsnode against each SVC node, or by displaying them in the SVC GUI as shown below:
What size Storwize V7000 volumes should I create?
My recommendation is to do the following on the Storwize V7000:
Create arrays of preferably 8 disks in size. The ideal number will depend on how many disks you have. On my machine I have 22 disks, so I create three arrays each with seven disks (and one hot spare):
Create one pool for each array:
Create one volume out of each pool (using all space in the pool).
Define the SVC to the Storwize V7000 as a host (as described above) and map all volumes to the SVC.
On the SVC detect all the Storwize V7000 LUNs as MDisks and create one pool.
Now you should have a pool on the SVC that you can use to create volumes to present to your hosts. They will be striped by default, which is exactly what you want.
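The steps above can be sketched at the CLI like this (the command names are from the 6.x command set, but the pool names, drive IDs, WWPN, volume size and extent size are placeholders to adapt to your own configuration):

```shell
# On the Storwize V7000: one array per pool, one full-size volume per pool
svctask mkmdiskgrp -name array_pool_1 -ext 256
svctask mkarray -level raid5 -drive 0:1:2:3:4:5:6 array_pool_1
svctask mkvdisk -name svc_lun_1 -mdiskgrp array_pool_1 -iogrp 0 -size 100 -unit gb

# Define the SVC as a host on the V7000 and map the volume to it
svctask mkhost -name SVC_1 -fcwwpn 5005076801234567
svctask mkvdiskhostmap -host SVC_1 svc_lun_1

# On the SVC: discover the new LUNs and gather them into a single pool
svctask detectmdisk
svcinfo lsmdisk -filtervalue mode=unmanaged
svctask mkmdiskgrp -name V7000_pool -ext 256 -mdisk mdisk1:mdisk2:mdisk3
```

Size each mkvdisk to use all the free capacity in its pool, and repeat the first block once per array.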
Hopefully all of this makes sense. Questions and comments very welcome.
I got an email today from the SNIA (Storage Networking Industry Association) announcing the availability of the SNIA Emerald™ Power Efficiency Measurement Specification. I have blogged about this in the past as something the industry definitely needs to do: a standardized way to compare the power consumption of different vendor products.
I have seen many clients try and do this (normally with some sort of spreadsheet) but the power numbers that vendors release are often worst case or for fully blown configurations. For instance the DS8000 power usage numbers are for a fully configured machine (where every possible slot is populated). Most clients do not buy this configuration, so the power numbers appear far higher than they actually would be for the machine they purchased.
Even worse is that power consumption needs to be matched by relative performance. How much bang are you getting per watt of power consumed?
So far only HP and IBM have submitted results, and only for one product each. But it is a start, and it shows that vendors are willing to participate and contribute. Let's hope the results table fills up in quick order.
For completeness, here is what the SNIA shared with me (everything below is from the SNIA and is not my own work):
The Storage Networking Industry Association (SNIA) Green Storage Initiative (GSI) has announced the availability of the SNIA Emerald™ Power Efficiency Measurement Specification which was developed collaboratively among more than 25 member companies. Additionally, the GSI announced the SNIA Emerald Program. In combination, the SNIA Emerald specification provides a vendor-neutral power efficiency test measurement set of methods and the SNIA Emerald Program provides an industry-wide repository of measured test data.
"The SNIA is proud to deliver the SNIA Emerald program to a global IT industry and national bodies concerned with energy efficiency," said Leah Schoeb, chair of the SNIA Green Storage Initiative. "We are providing end-users and the industry at-large with a means to test, measure and evaluate storage systems power usage and efficiency, at a time when datacentre energy usage is projected to increase by 19% in 2012 according to Data Center Dynamics 2011 Census Report.”
The SNIA Emerald Power Efficiency Measurement Specification consists of the following elements:
Taxonomy: An industry-wide means of segmenting storage systems for products that span the range from consumer solutions to enterprise configurations which will be used to categorize the test results.
Test Methodology: A detailed and consistent means of testing various types of storage systems with load generators and power measurement instruments.
Test Metrics - Idle Measurement Test: The idle test applies to storage systems and components which are configured, powered up, connected to one or more hosts and capable of satisfying externally initiated, application-level initiated IO requests within normal response time constraints, but no such IO requests are being submitted.
Test Metrics - Active Measurement Test: Testing of storage products and components are said to be in an “active” state when they are processing externally initiated, application-level requests for data transfer between host(s) and the storage product(s).
The SNIA Emerald Program website will provide the industry with the resources needed to learn about, evaluate, test and submit storage system power usage and efficiency test results acquired by using the SNIA Emerald Power Efficiency Measurement Specification. The Program is available to the industry at large, with no requirements of membership. At time of public unveiling of the SNIA Emerald Program website, both SNIA GSI members HP and IBM have submitted test results for their respective storage systems commonly found deployed in data centers around the world. Storage system manufacturers and industry testing labs can download the SNIA Emerald Power Efficiency Measurement Specification from the SNIA Emerald website (www.sniaemerald.com). SNIA also recommends downloading the SNIA Emerald User Guide, which provides step-by-step guidance on how to set up a test and measurement environment for a storage system under test, and then submit measured test results to the SNIA Emerald Program. Once submitted test results are approved for public posting, manufacturers will obtain a SNIA Emerald Program logo to highlight their program participation. In turn, the industry at large can view the posted test results of various storage systems and review products that have met and undergone the SNIA Emerald testing requirements. The specification and program test report do address disclosing configuration information for the system under test about energy-saving storage capacity optimizations that the system may have, including features such as deduplication and thin provisioning.
Craig Scroggie, Chairman of SNIA ANZ commented “The program that SNIA has announced is the culmination of many years of collaboration between the industry manufacturers and represents a significant step forward in helping data centre managers and storage administrators get a consistent view of storage power consumption. Australia and New Zealand customers will benefit from this announcement and it is timely given the recent concerns over the impact of carbon tax legislation.”
Reference note 1: 2011 Data Center Industry Census
So lets be honest here. Watching corporate advertising on YouTube is not my favourite thing to do. But watching people I know... People who are passionate and articulate and who are talking about a subject they understand with tremendous depth... that's worth taking your time to check out.
Brian Carmody is the XIV Technical Product Manager. There are few people who can explain a deeply technical concept as well as he can. I have spotted him in two videos so far. If you can forgive the (very) cheesy music and the shaky handheld-camera-like graphics, listen to Brian, Yossi Siles and Robert Cancilla talking about XIV.
One of the many popular features of the XIV is the ability to replicate using iSCSI. On XIV Gen3 there are now at least 10 and up to 22 active iSCSI ports on each machine (depending on how many modules you order).
Implementation of the iSCSI connection between two XIVs is a piece of cake. If both XIVs are defined to the XIV GUI (which they should be), you just need to drag and drop links between the XIVs to bring the iSCSI mirroring connections alive. If the network gods are with you, the link goes green. But if the networking gods are against you, the link stays red, and then the question is: what to do?
Old fashioned problem diagnosis leads us straight to the ping command. However I routinely find that the ping command works fine (all interfaces respond), but the link stubbornly remains red.
The first possible problem is that iSCSI uses TCP port 3260, so hopefully there are no firewalls blocking that port.
The second possible problem is the MTU size (Maximum Transmission Unit). When we define the iSCSI interfaces on the XIV we set the MTU as a value of up to 4500 bytes. When we establish connections between two XIVs, each XIV will send test packets that are sized to the MTU. If the intervening network does not support that packet size, the packets will be dropped by the network, because the XIV sets the don't fragment flag to ON.
So how to work out what the MTU is? Well the first thing to do is ask your friendly networking team member. But sometimes I find that the intervening networks are controlled by third parties, which means that getting a straight (and reliable) answer can prove difficult. Even worse, some of these third parties charge a fee every time you call them, so there may be hesitation to even get them involved!
One simple trick is to re-use the ping command but play with payload sizes. We can use a command that looks like this:
ping -f -l 1472 192.168.0.1
That command sends a ping with a payload of 1472 bytes to IP address 192.168.0.1. We add the -f parameter to prevent packet fragmentation. What you then do is slowly increase the payload until you no longer get a reply.
This process works fine and is a great way to determine the maximum payload size the end-to-end network will support. However, if you're using the payload size to determine the maximum transmission unit, there is a little trick. The MTU is the maximum packet size, but a ping sends a payload wrapped in 28 bytes of IP and ICMP headers. So our example:
ping -f -l 1472 192.168.0.1
sends a 1500 byte packet to the 192.168.0.1 IP address (1472 bytes of payload plus 28 bytes of headers).
If this command succeeds, you can set an MTU of 1500 in the XIV GUI or XCLI (not 1472; remember to add the 28 bytes of headers back onto the largest successful payload).
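The "slowly increase the payload until pings fail" process can be automated with a binary search. A sketch, assuming the Windows ping syntax from above; the probe is separated from the search so the logic is testable on its own:

```python
import subprocess

def ping_with_payload(host: str, size: int) -> bool:
    """Send one don't-fragment ping with the given payload size.
    Windows syntax: -n 1 = one ping, -f = don't fragment, -l = payload bytes.
    (On Linux the equivalent is: ping -c 1 -M do -s SIZE host.)
    Returns True if a reply came back."""
    result = subprocess.run(
        ["ping", "-n", "1", "-f", "-l", str(size), host],
        capture_output=True,
    )
    return result.returncode == 0

def find_max_payload(probe, lo: int = 0, hi: int = 4472) -> int:
    """Binary-search the largest payload size for which probe(size) succeeds.
    The default upper bound of 4472 corresponds to the XIV's 4500-byte
    maximum MTU minus 28 bytes of headers."""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if probe(mid):
            lo = mid   # mid works; the answer is mid or larger
        else:
            hi = mid - 1  # mid was dropped; the answer is smaller
    return lo

# The path MTU is the largest working payload plus the 28 header bytes, e.g.:
#   find_max_payload(lambda s: ping_with_payload("192.168.0.1", s)) + 28
```

A handful of probes pins down the answer instead of dozens of manual pings, and the +28 correction is applied once at the end rather than being forgotten.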
For those who are wondering how I did the network sniffing to get the screen captures above, I used a brilliant piece of freeware called Wireshark. My only warning is that your corporate security policies may have rules about sniffing the network, so don't take my blog post as permission to use it. And for the networking geeks among you: yes, I know that extra bytes could actually be wrapped around our Ethernet packet for things like VLAN tags or encapsulation, but hopefully this should not affect our mathematics.
Controlling the background traffic
One final pointer. Having finally gotten the link up and going, you are now free to start replicating volumes. But how much traffic can the cross-site link support? The XIV can limit the background copy bandwidth with a parameter called max_initialization_rate. This is useful to stop you from flooding the cross-site link and annoying your link co-tenants. To display the current setting, open an XCLI window and issue the following command:
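From memory, the relevant XCLI commands are target_list (to see the defined mirroring targets) and target_config_sync_rates (to change the rates). Treat the exact syntax below as an assumption and verify it against your XCLI reference; the target name is a placeholder:

```
# Show the defined remote targets (the rate settings live against the target):
target_list

# Cap the background copy (initialization) rate, in MB/s, for a target
# named "XIV_DR" (a placeholder name):
target_config_sync_rates target="XIV_DR" max_initialization_rate=50
```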
With the announcement that you can order an XIV with 3 TB SAS disks, IBM now have some amazing capacity options and some equally clever growth options with XIV Gen3.
As you hopefully know, the XIV consists of modules that each contain 12 disks. An XIV can have 6, 9, 10, 11, 12, 13, 14 or 15 modules (all modules must have the same size disk). You can start at any of those points and then grow without interruption or outage up to 15 modules (that's 243.3 TB!). There is practically no planning required to do a capacity upgrade and the data relocation to re-balance between the nodes is done automatically by the machine (without any end-user intervention).
The useable capacity sizing with 3 TB drives stretches from 84.1 TB with 6 modules to 243.3 TB with 15 modules (these are decimal TB).
However the Capacity on Demand (CoD) options are far more interesting. With CoD you effectively buy a certain amount of capacity up front, but also get up to 3 more modules shipped with the machine. You can start using this extra capacity when your business requirements demand it, at which point you will be asked by IBM to purchase it. The advantage is that you physically get a bigger machine up front, with all the performance benefits that bestows, plus you don't have to contact IBM to start using that extra capacity. Let's look at the possible configurations.
So let's take a scenario. You need 100 TB today, but you know this will grow to 130 TB over the next 12 months. You could purchase an XIV with 9 physical modules (using 3 TB drives) with 7 CoD activations. This means IBM ships a machine that physically has 132 TB and 108 drives in 9 modules. Your data will be spread over all of these drives, and all of these modules will be active and working. However, you have effectively only paid for 103 TB of that space up front. If you order extra CoD activations, you can also order extra physical modules. As long as you stick to the chart above and have at least one un-activated module, you stay in the CoD program.
When your data requirements exceed 103 TB you just start using the extra space, no license keys or special tasks required. Nice!
So having told you how great it is... are there any disadvantages?
1) You need to actually buy the storage... eventually. Depending on the CoD contract, there will be a point when IBM expects you to purchase this extra capacity. The whole point of CoD is that it is like pre-ordering capacity without actually paying for it up front. If you're really not certain you need extra capacity, you're probably better off not ordering CoD capacity in the first place. Instead, order capacity upgrades as you require them.
2) There is nothing to stop you using the storage. Now this is a curious disadvantage, because it means that if you have paid for 103 TB and you start using 105 TB, the machine will not tell you off or yell at you. So is this a good thing or a bad thing? Well, I really like the flexibility, so I think it is a good thing. Plus there is a nice command called cod_list which displays consumed capacity to help keep you on the right path. You can also display it in the GUI. It just means you need to keep an eye on volume and pool creation to ensure you don't start configuring extra capacity until you're prepared to pay for it.
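That "keep an eye on it" discipline is easy to script once you have the consumed figure out of cod_list. A trivial sketch (the function name is mine, and the numbers come from the scenario above):

```python
def cod_headroom(purchased_tb: float, configured_tb: float) -> float:
    """Remaining capacity (TB) before you cross into un-purchased CoD space.
    A negative result means you are already configuring capacity that IBM
    will expect you to purchase."""
    return purchased_tb - configured_tb

# Scenario above: 103 TB purchased, 105 TB configured.
print(cod_headroom(103, 105))  # -2: already 2 TB into un-purchased capacity
```

Feed it into whatever monitoring you already run, and alert well before the headroom reaches zero rather than after.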
You can also use CoD with 2 TB drives on XIV Gen3 so this is another option. With 2 TB drives, the useable capacities look like this:
For those of you with Apple iPads, you might consider dropping by the Apple Store and picking up your free IBM XIV Mobile Dashboard.
The IBM XIV Mobile Dashboard application can be used to securely monitor the performance and health of your XIV over a Wi-Fi or 3G link. Having downloaded and installed the Mobile Dashboard you will get a lovely XIV Icon:
When you start the Mobile Dashboard you will have the choice to either run in Demo Mode or connect to an actual XIV. Demo Mode can be accessed by selecting the Demo Mode option down in the lower right-hand corner, so you don't actually need an XIV to give it a test drive.
To logon to a real XIV you will need a valid username, password and IP address.
Once connected you have the choice of viewing volume performance or host performance. If you hold the iPad in portrait mode you get a list of up to 27 volumes or hosts ordered by performance metrics (it defaults to ordering by IOPS). If you hold the iPad in landscape mode you get a more graphical output (as per the examples below). There are no options to perform configuration; the dashboard is intended only for monitoring. Each panel also shows the performance and redundancy state of the XIV.
The volume performance panel is shown by default. The example below shows the output when the iPad is operated in landscape mode. From this panel you can see up to 120 seconds worth of performance for a highlighted volume. Use your finger to rotate the arrow on the blue volume icon to switch the display between IOPS, bandwidth (in megabytes per second, MBps) and latency (in milliseconds, ms). The data redundancy state of the XIV is shown in the upper right-hand corner (in this example it is Full Redundancy, but it could be Rebuilding or Redistributing).