Anthony's Blog: Using System Storage - An Aussie Storage Blog
For many years I have been working on a document that lets you translate a World Wide Port Name (WWPN) into a physical location on an IBM Storage System.
I blogged recently about how SVC and Storwize V7000 WWPNs have a slightly different layout.
The contents of that blog entry come straight from that document.
I have now pushed that document out to IBM Techdocs.
You can download it from here:
Feel free to share any feedback you have and share it with your colleagues.
IBM is announcing a set of remarkable new storage products and enhancements, built around three themes:
· Storage Efficiency
· Ease of use
· Smart technology
The announcements show not only IBM's significant investment in storage but also IBM's tremendous depth of knowledge and experience.
You will rapidly see that the focus is on our new Midrange Storage product, the Storwize V7000. However, this is only one of four major releases that you will see (plus many more incremental releases). From a product perspective the big new announcements are:
XIV. The XIV will support the VMWare VAAI API by updating the firmware to version 10.2.4. To remind you what I am talking about, check out my earlier blog on this subject here.
Storwize V7000. This is a major new product offering in the midrange space. It takes the intelligence and history of the SVC; brings in some disk controller technology from the DS8000; adds SAS version 2 disk enclosures; provides the sub-LUN performance benefits delivered by Easy Tier; uses a simplified GUI influenced strongly by XIV and has a simplified licensing structure. This is all put into a 2U modular form factor. Because the Storwize V7000 uses the same code base as the SAN Volume Controller (SVC), it brings all the smarts of SVC including virtualized disk (using both internal SAS disks and external storage controllers), thin provisioning, transparent data migration and mirroring (including Metro and Global Mirror). Right now there is no RACE technology in the Storwize V7000 (despite IBM using the Storwize brand). But I think you can take the name as a hint of things to come.
DS8800. This is a fantastic incremental new development in the DS8000 family. It takes the long history of DS8000 development and combines it with small form factor (2.5”) SAS version 2 disks connected via 8 Gbps host adapters and 8 Gbps device adapters. The performance numbers, the environmental and floor-space requirements are all improved by a significant factor. It positions the DS8000 for many years of new functions and features.
SVC. For SVC we are releasing SVC version 6.1. This is a major software update to the SVC code. It delivers a remarkable new GUI with Easy Tier and a whole raft of functional improvements.
Other announcements will include enhancements to TPC, IBM Director, the DS3500 and Softek TDMF.
As soon as I have the announcement letter URLs, I will post them. There is clearly plenty more to come.
In part three of my series on AIX and XIV, I will explore the recommended configuration changes you should make to AIX when attaching XIV disk.
So let's get started. First, display the current attributes of each fibre channel adapter:
lsattr -El fcs0
Two of the attributes will look like this:
max_xfer_size 0x100000 Maximum Transfer Size                               True
num_cmd_elems 200      Maximum number of COMMANDS to queue to the adapter True
lsattr -El fscsi0
Two of the attributes will look like this:
dyntrk       no           Dynamic Tracking of FC Devices True
fc_err_recov delayed_fail FC Fatal Error Recovery Policy True
I suggest you change these values as follows: set max_xfer_size to 0x200000, num_cmd_elems to 2048, fc_err_recov to fast_fail and dyntrk to yes (the same values the chhba script applies further down).
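If you prefer to make these changes yourself rather than with the scripts described later, a minimal sketch using chdev might look like this (fcs0 and fscsi0 are just example device names; the -P flag updates only the ODM, so the new values take effect after a reboot):
chdev -l fcs0 -a max_xfer_size=0x200000 -a num_cmd_elems=2048 -P
chdev -l fscsi0 -a fc_err_recov=fast_fail -a dyntrk=yes -P
Repeat for every fcs/fscsi pair that is connected to the XIV.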
lsattr -El hdisk26
Two of the attributes will look like this:
max_transfer 0x40000 Maximum TRANSFER Size True
queue_depth  32      Queue DEPTH           True
I suggest you change these values as follows: set queue_depth to 64, max_transfer to 0x100000 and algorithm to round_robin (again, the same values the chxiv script applies further down).
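Again as a sketch, the equivalent chdev for the hdisk attributes would be something like this (hdisk26 is just an example; the values match what the chxiv script applies below):
chdev -l hdisk26 -a queue_depth=64 -a max_transfer=0x100000 -a algorithm=round_robin -P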
When the LVM receives a request for an I/O, it breaks the I/O down into what are called logical track group (LTG) sized pieces before it passes the request down to the device driver of the underlying disks. The LTG is the maximum transfer size of an LV and is common to all the LVs in the VG. The LTG size of a VG cannot be larger than the smallest max_transfer size of all the hdisks that make up that VG, so by increasing the max_transfer size we allow the maximum LTG size of each VG to be larger.
You can display the LTG size by using the lsvg command against the relevant VG.
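For example (datavg is a hypothetical volume group name):
lsvg datavg | grep -i "LTG"
On a typical system this returns a line along the lines of 'LTG size (Dynamic): 256 kilobyte(s)'.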
AIX XIV Utils
root@testserver [/home/anthonyv/aix_xiv_utils-2.0/bin] # ./lshba -x
We use the chhba command to change the fcs and fscsi attributes. We issue a single command to change all the HBAs at once. This command only changes the ODM, so we need to reboot for the changes to take effect. Note that we need to type yes when prompted for the script to run.
root@testserver [/home/anthonyv/aix_xiv_utils-2.0/bin] # ./chhba -d yes -f fast_fail -m 0x200000 -n 2048 -P
In this example we display the relevant attributes of all the XIV hdisks. The settings in this example are NOT correct (queue depth still 32 and max transfer size still 0x40000) so we need to use the chxiv command to correct them.
We use the chxiv command to change the attributes of every XIV hdisk. This command only changes the ODM for XIV disks, so we need to reboot for the changes to take effect. Note that we need to type yes when prompted for the script to run.
[/home/anthonyv/aix_xiv_utils-2.0/bin] # ./chxiv -r 64 -m 0x100000 -P
Change algorithm to round_robin with a queue depth of 64 for these disks? yes
Getting new XIV disk information...
AIX_SIZE(MB) ALGORITHM Q_DEPTH SERIAL
Conclusions and gotchas
So at the conclusion of this process, you should have an AIX system with settings much better suited to XIV. There are a couple of gotchas.
1) HBAs being used for tape. The chhba command will not change HBAs in private loop mode. This is to prevent errors like this:
Date/Time: Thu Feb 4 11:59:08 2010
Sequence Number: 250846
Machine Id: 00CB6AC44C00
Node Id: us04od03
Resource Name: fscsi3
Probable Causes
SOFTWARE DEVICE DRIVER
Failure Causes
SOFTWARE DEVICE DRIVER
INCORRECT HARDWARE CONFIGURATION
Recommended Actions
IDENTIFY OFFENDING SOFTWARE COMPONENT
VERIFY SYSTEM CONFIGURATION IS VALID
REFER TO PRODUCT DOCUMENTATION FOR ADDITIONAL INFORMATION
2) Your queue depth settings may still not be deep enough. Periodically run iostat -D 5 and if you notice that avgwqsz or sqfull is consistently non-zero, increase the queue depth (you can go up to 256; see the iostat sketch after this list). Don't be tempted to start at 256 and work down, as you may flood the XIV with commands. For the vast majority of clients, 64 is a good number.
3) Do you need to use these scripts? No you don't. You can use smit or the command line to change the attributes.
Do you always need to reboot? No you don't, but you will need to change the relevant devices to a Defined state before changing them. For instance, you could change the queue depth on an hdisk with the commands below, but only if the hdisk is not part of an online volume group. It remains easier to just change the ODM and reboot for the changes to take effect.
rmdev -l hdisk25
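followed by (my sketch of the remaining steps; a queue depth of 64 is just an illustration):
chdev -l hdisk25 -a queue_depth=64
mkdev -l hdisk25
The rmdev puts the hdisk into a Defined state, the chdev changes the attribute, and the mkdev brings the disk back online.
For gotcha 2 above, a quick way to keep an eye on the queue statistics is something like this (hdisk26, the 5 second interval and the 3 samples are all just examples):
iostat -D hdisk26 5 3
Check the avgwqsz and sqfull fields in the queue section of the output.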
So the next challenge when connecting your XIV to your AIX LPAR is how to zone the SAN (or hopefully, each SAN fabric).
The XIV consists of a number of modules (from 6 to 15), of which a subset are Interface Modules (meaning they have fibre channel and iSCSI interfaces).
Zone the SAN so that each HBA in an LPAR has 3 paths to the XIV.
If an LPAR has two HBAs, then zone the first HBA to modules 4, 6 and 8 and the second HBA to modules 5, 7 and 9.
For the next LPAR, do the reverse and zone the first HBA to modules 5, 7 and 9 and the second HBA to modules 4, 6 and 8.
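One way this might look on an IBM SAN b-type (Brocade) switch in fabric 1, with placeholder alias names and WWPNs rather than real values:
alicreate "LPAR1_fcs0", "10:00:00:00:c9:aa:bb:01"
alicreate "XIV_M4_P1", "50:01:73:80:xx:xx:01:40"
alicreate "XIV_M6_P1", "50:01:73:80:xx:xx:01:60"
alicreate "XIV_M8_P1", "50:01:73:80:xx:xx:01:80"
zonecreate "LPAR1_fcs0_XIV", "LPAR1_fcs0; XIV_M4_P1; XIV_M6_P1; XIV_M8_P1"
cfgadd "FABRIC1_CFG", "LPAR1_fcs0_XIV"
cfgenable "FABRIC1_CFG"
This gives that HBA exactly three paths to the XIV, one to each of interface modules 4, 6 and 8.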
If we look at a typical dual fabric 15 module XIV, it would be cabled as pictured below.
Port 1 on each XIV module attaches to Fabric 1
Port 3 on each XIV module attaches to Fabric 2.
Ports 2 and 4 are reserved for replication and mirroring.
But look closely at the use of colours in this lovely Visio diagram I created.
Host 1 is zoned using the links in blue.
Host 2 is zoned using the links in red.
Notice how they round robin between the odd and even numbered interface modules.
Rules of thumb
Stick to three paths per HBA, because there is some evidence that having excessive paths can slightly raise CPU utilisation.
The other reality is that having multiple duplicate paths won't make your system more reliable.
So you're planning to attach your XIV to an AIX host?
Here are some best practices for you to follow.
1) Native XIV detection
The XIV uses a path control module (PCM) that plugs into AIX MPIO. Depending on your AIX level the XIV will be recognised natively by AIX without additional software.
This is nice because it means you can simply run cfgmgr and detect the XIV hdisks without doing any system changes.
If you're on the following AIX levels (with TL and SP) then your AIX system will detect the XIV natively. Frankly it's a good excuse to perform a system update.
AIX Release APAR Bundled in
AIX 5.3 TL 10 IZ69239 SP 3
AIX 5.3 TL 11 IZ59765 SP 0
AIX 6.1 TL3 IZ63292 SP 3
AIX 6.1 TL 4 IZ59789 SP 0
If you're running VIOS, you need to be on VIOS v2.1.2 FP22 to recognise the XIV natively.
Natively detected XIV devices will look like this when displayed using the command:
lsdev -Cc disk
hdisk2 Available 03-00-02 MPIO 2810 XIV Disk
2) XIV Host attachment kit
If you are not on the levels listed above, you can install the XIV Host Attachment Kit to get XIV support.
However, at lower AIX and VIOS levels there are issues with queue depth when using round robin (the queue depth is effectively limited to 1).
The following releases do not have the queue depth issue, so they are better levels to be on:
AIX 5.3 TL 10 SP 0,1 and 2
AIX 6.1 TL 4 SP 0,1 and 2
VIOS v2.1.1.x FP-21.x
If you're on a level lower than those, you can still install the Host Attachment Kit to get XIV device support.
To detect XIV volumes when using the XIV Host Attachment Kit, you use the command xiv_attach.
The very first time you run xiv_attach you will need to reboot the host. After that you can use xiv_attach or cfgmgr (without reboot).
XIV devices detected by the xiv_attach command will look like this when displayed using the command:
lsdev -Cc disk
hdisk3 Available 02-01-02 IBM 2810XIV Fibre Channel Disk
3) The xiv_devlist command
Regardless of what level of AIX you're running, you should install the Host Attachment Kit (HAK) to get the wonderful xiv_devlist command.
The HAK uses a specially packaged version of Python which is renamed XPYV (to not get in the way of any system Python already installed).
Just installing the kit does not require a reboot.
The xiv_devlist command is the equivalent of what SDD gave you with datapath query device.
It lets you map an AIX device (an hdisk) to an XIV volume. It's a tool you don't want to live without.
In the example below you can see the hdisk number on the left,
but all the other information (volume size, number of paths, volume name, XIV host) all come from the XIV itself.
This is really useful information.
root@system] # xiv_devlist
Device Size Paths Vol Name Vol Id XIV Id XIV Host
/dev/hdisk26 204.0GB 6/6 PROD-3050 188 7802844 PROD-prd
/dev/hdisk27 42.9GB 6/6 PROD-3051 189 7802844 PROD-prd
In my next blog entries I will tell you about zoning and what fcs, fscsi and hdisk attributes work best with XIV.
I will also share a great way to update them.
IBM has announced some new XIV power features while withdrawing others.
The changes are being made to simplify the ordering process while making the power choices more robust and better suited to client requirements.
So what changed?
This is pretty well an industry standard for Enterprise class disk.
The USA Announcement letter is here.
The Asia Pacific Announcement letter is here
The European Announcement letter is here.
I have no idea what this website is all about, but you have to love what they have they done with an XIV.
My favorite is the U2 model. With 2TB drives it can hold around 161 million minutes of music!
Plug that sucker into your iPod and put it on shuffle!!
The XIV GUI is all about simplicity. It's about taking tasks which on other products are difficult or time consuming, and either eliminating them or making them as simple as possible.
But for those who like to issue commands via a command line interface (a CLI), the XIV also has a very rich CLI called XCLI.
If you're familiar with the XCLI, you're hopefully aware that list commands can produce much more detailed output if the -x option is used (-x requests XML output).
So here is something you can try out.
If your XIV is on 10.2.1 firmware you can use the module_list -x command to display how much server memory each XIV module has.
If your XIV has 2 TB disks, you should find that you have 16 GB of server memory per module.
This means a 15 module machine has a whopping 240 GB of server RAM.
To be clear, I am not referring to this as 'cache' because a small portion (around 2.5 GB) of the RAM in each module is used by the module's internal Linux operating system.
This means that a 15 module XIV with 2 TB drives and 16 GB of server memory per module has over 200 GB of cache (15 × 13.5 GB ≈ 202 GB).
As former Australian Prime Minister Paul Keating once said: "its a beautiful set of numbers"
<XCLIRETURN STATUS="SUCCESS" COMMAND_LINE="module_list -x">
<sdr_version value="SDR Package 46"/>
So its that time of the month again. Rob Jackard from the ATS group does a fantastic job summarizing changes to the IBM Storage Support site
and you get all the benefit of his hard work (via me!).
So cast your eyes down the list and look for issues that may affect you....
AIX:
(2010.08.21) AIX Support Lifecycle Notice- AIX 5.3 TL9 & TL10.
NOTE-1: After November 2010, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 5300-09 (applies to all Service Packs within TL9). Sometime after May 1, 2011, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 5300-10 (applies to all Service Packs within TL10).
NOTE-2: As a reminder, IBM is no longer providing generally available fixes or interim fixes for new defects on systems at AIX 5300-06, AIX 5300-07 or AIX 5300-08.
(2010.08.21) AIX Support Lifecycle Notice- AIX 6.1 TL2 & TL3.
NOTE-1: After November 2010, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 6100-02 (applies to all Service Packs within TL2). Sometime after May 1, 2011, IBM will no longer provide generally available fixes or interim fixes for new defects on systems at AIX 6100-03 (applies to all Service Packs within TL3).
NOTE-2: As a reminder, IBM is no longer providing generally available fixes or interim fixes for new defects on systems at AIX 6100-00 or AIX 6100-01.
(2010.07.29) Disable the TCP/IP port for SDDSRV/PCMSRV if the port is enabled.
(2010.07.23) SDDSRV respawn incorrectly causes system log filesystem overflow with AIX SDD version 220.127.116.11.
NOTE- Contact IBM Support Center to obtain the required temporary fix 18.104.22.168, which is now available.
(2010.07.23) AIX SDD Version 22.214.171.124 cannot be deinstalled due to sddsrv respawning incorrectly.
NOTE: Contact IBM Support Center to obtain the required temporary fix 126.96.36.199, which is now available.
(2010.07.19) IBM TechDoc- Diagnosing Oracle DB performance on AIX using IBM NMON and Oracle Statspack Reports.
(2010.07.19) IBM TechDoc- Maintaining two switch fabrics on NPIV migrations.
(2010.07.07) AIX 5.3 HIPER Alert- devices.common.IBM.mpio.rte 188.8.131.52.
NOTE: Users of MPIO storage running the 5300-12 TL- An operation to change the preferred path of a LUN could hang. A similar hang could be experienced during LPAR migration where it will try to switch the preferred paths. Install APAR IZ77907.
(2010.07.07) AIX 6.1 HIPER Alert- devices.common.IBM.mpio.rte 184.108.40.206.
NOTE: Users of MPIO storage running the 6100-02 TL- An operation to change the preferred path of a LUN could hang. A similar hang could be experienced during LPAR migration where it will try to switch the preferred paths. Install APAR IZ77908.
DS3000 / DS4000 / DS5000:
(2010.08.23) Updated ESM and HDSS Firmware v1.69 package.
(2010.08.20) Updated Disk Controller Firmware v7.70.23.00 code package.
NOTE: Code for DS3950, DS5020, DS5100, DS5300 subsystems.
(2010.08.17) RETAIN Tip# H197049- Issues on full synchronization of RVM LUNS > 2 TB.
(2010.08.17) SAS connectivity to IBM System Storage DS3200 is not supported- IBM BladeCenter JS23, JS43.
(2010.08.16) RETAIN Tip# H197402- Multi-node server ports require same host group for failover- IBM Disk Systems.
(2010.08.09) Updated Disk Controller Firmware v7.60.40.00 code package.
NOTE: Code for DS3950, DS4200 Express, DS4700 Express, DS4800, DS5020, DS5100, DS5300 subsystems.
(2010.07.23) IBM TechDoc- Power Cord Technical Guide for DS5000 Systems.
(2010.07.20) RETAIN Tip# H197172- UEFI systems with HBA and >2 TB LUN may have data errors.
(2010.07.07) RETAIN Tip# H196538: Controller reboots if multiple CASDs attempted.
DS6000 / DS8000:
(2010.08.24) Potential Data Error using Fast Reverse Restore following Establish FlashCopy without Change Recording.
(2010.08.21) DS8000 Code Bundle Information.
(2010.08.20) DS8700 Code Bundle Information.
(2010.08.04) IBM TechDoc- Effective Capacity for IBM DS8700 R5.1.
(2010.07.29) IBM Whitepaper- IBM DS8000 Metro Mirror DR within a Remote Cluster.
(2010.07.28) IBM TechDoc- DS6000 and DS8000 Data Replication.
(2010.07.15) IBM Whitepaper- IBM Handbook using DS8000 Data Replication for Data Migration.
(2010.07.14) DS6000 Microcode Release 220.127.116.11.
N series:
(2010.08.26) Important information for N series support.
(2010.08.18) IBM System Storage N series FRU lists.
(2010.08.17) Data ONTAP 7.3.4 Filer Publication Matrix.
(2010.08.17) Data ONTAP 7.3.4 Gateway Publication Matrix.
(2010.07.29) Data ONTAP 8.0 7-Mode Gateway Publication Matrix.
(2010.07.29) Data ONTAP 8.0 7-Mode Filer Publication Matrix.
(2010.07.20) Data Fabric Manager (DFM) 4.0 Publications Matrix.
(2010.07.17) RLM Update to Firmware Version 4.0 Fails with “Error Flashing linux“.
(2010.07.12) IBM System Storage N series FRU (Field Replaceable Unit) lists.
SAN / Switches:
(2010.08.24) IBM SAN b-type Firmware Version 6.x Release Notes.
(2010.08.20) Cisco MDS Supervisor 1-to-2 Upgrade Process.
(2010.07.15) Cisco MDS9000 Field Notice: FN-63132. Potential DIMM Memory Issue in a Small Number of DS-X9530-SF2-K9 Supervisor Cards Manufactured between September 2007 and February 2008.
SVC:
(2010.08.17) NPIV clients of SDDPCM hosts may experience permanent application errors during SVC concurrent code upgrade or node reset with certain APARs and SDDPCM versions. The risk, although rare, exists in any AIX SDDPCM host or client.
NOTE: The changes made for VIOS client hangs in Technote SSG1S1003579 require additional AIX driver and SDDPCM code updates for a specific SVC error condition.
(2010.08.11) SVC V4.3.x and V5.1.x Cluster Nodes May Repeatedly Reboot and CLI/GUI Access Loss May Occur When Shrinking Space Efficient VDisks.
(2010.08.05) SVC Console (GUI) Requirements for using IPv6.
(2010.08.05) Guidance for Identifying and Changing Managed Disks Assigned as Quorum Disk Candidates.
(2010.08.05) Offline or Degraded Disks May Result in Loss of I/O Access During Code Upgrade.
(2010.08.05) IBM System Storage SAN Volume Controller V5.1.0- Software Installation and Configuration Guide (English Version).
(2010.08.05) IBM System Storage SVC Code V18.104.22.168.
(2010.08.05) IBM System Storage SVC Console (SVCC) V22.214.171.1246.
(2010.08.02) IBM System Storage SVC Code V126.96.36.199.
(2010.08.02) IBM System Storage SVC Console (SVCC) V188.8.131.523.
(2010.08.02) SAN Volume Controller Concurrent Compatibility and Code Cross Reference.
(2010.07.23) SNMP MIB file for SVC V5.1.0.
(2010.07.21) 2145-CF8 Nodes May Repeatedly Loop Between Boot Codes 100 and 137 When Upgrading to SVC V184.108.40.206 or Later.
NOTE: This issue is resolved in SVC v220.127.116.11.
(2010.07.17) Changes in handling of SSH keys in SVC V5.1.
(2010.07.16) Incorrect 2145-8G4 Node Hardware Shutdown Temperature Setting in V18.104.22.168 – V22.214.171.124.
NOTE: This issue is resolved by APAR IC60083 in SVC V126.96.36.199.
(2010.07.16) Incorrect 2145-8A4 Node Hardware Shutdown Temperature Setting in V188.8.131.52 – V184.108.40.206.
NOTE: This issue is resolved by APAR IC68234 in the SVC V220.127.116.11 release.
(2009.11.19) 20091015 Drive Microcode Package for Solid State Drive.
SSPC / TPC / TPC-R:
(2010.08.30) Open HyperSwap status may report incorrectly via the Tivoli Productivity Center for Replication GUI.
(2010.08.25) IBM TechDoc- Basic Automation of TPC Performance Graphs.
(2010.08.04) TPC 4.1.x – Platform Support: Agents, Servers and GUI.
(2010.08.04) Q3, 2010- IBM Tivoli TotalStorage Productivity Center Suite Customer Support Technical Information Update.
XIV:
(2010.08.25) IBM TechDoc- Utilizing IBM XIV Storage System snapshot technology in SAP environments.
(2010.08.24) How to Avoid Potential Problems During a Data Migration to an XIV Storage System.
(2010.08.02) IBM XIV Remote Support Proxy version 1.1.0.
(2010.07.19) XIV Volume Sizing Spreadsheet Tool.
(2010.07.08) Potential to inadvertently overwrite volumes using IBM XIV Management Tools (XIVGUI, XIVTop, XCLI) version 2.4.3.
NOTE: This issue is resolved with release 2.4.3.a.
(2010.07.01) IBM Certification: IBM Certified Specialist – XIV Storage System Technical Solutions Version 2.
(2010.07.01) IBM Certification: IBM Certified Specialist – XIV Storage System Replication and Migration Services Version 1.
I have been asked this question a few times now, so it's worth a blog entry.
Clients love being able to easily view XIV performance statistics.
There is a simple panel that lets you display IOPS, throughput and response times for each host or volume or for the entire machine.
When viewing XIV performance statistics using the built in GUI panels, write I/Os are broken into two types: write hits and write misses.
The question that comes up is... what is the difference? And should I be worried about misses?
The use of the term miss can have negative connotations. To explain why:
So what about a write miss? Does it mean that the write I/O 'missed' the cache?
The answer is.... no!
To explain the difference:
A write hit is the situation where a host write generates fewer back-end disk operations. This is because:
It's been an interesting week in IT retractions.
Microsoft seriously went off the rails with their Meter Maid Booth Babes on the Gold Coast.
Check out the story here or here.
I mention this story not because I want to embarrass Microsoft (who I don't think quite realised what they had signed up for).
To their credit they quickly apologised and moved to correct their mistake.
Instead I mention this because several Microsoft people were more than willing to (quite rightly) publicly express their opinions on the subject.
I thought this was fantastic.
But with great power comes great responsibility.
As an IBMer I have never been told what I can or cannot blog.
However I do of course follow IBM Business Conduct Guidelines as well as IBM Social Networking Guidelines.
So I have to say that I viewed with dismay HDS blogger Pete Gerr's extraordinary attack on Moshe Yanai and the IBM XIV.
He has since rather gracelessly withdrawn the blog entry but his follow on comments need some response.
The XIV has been (and continues to be) a fantastic product for IBM.
Not only is it a great sales success, it has also allowed us to talk to clients who would not normally purchase IBM storage.
Far from damaging IBM's existing product line, it has resulted in those lines growing stronger (just wait and see).
We have a new focus on usability and simplicity, on making the experience of using and managing storage easier and smarter.
To some part, XIV has brought that focus. I personally think we needed it and that we are stronger for it.
As the year comes to a close you will see the benefits of this reinvigoration with some truly fantastic storage product announcements (across the board).
So while hopefully Pete can take some lessons from his very creditable and measured fellow blogger Hu Yoshida,
I will patiently wait for Barry Burke to post that he was wrong about DS8000.
And I will keep trying to get it right the first time.
So I had the pleasure last night of observing a capacity upgrade on a client's XIV (so yes this is all real world).
The client in question had an 11 module XIV with 1 TB drives. This meant they had 54,649 GB of useable capacity (approx 54 TB).
The client had ordered one new XIV module as an incremental capacity upgrade.
All hardware upgrades are performed by IBM, so the upgrade work was all done by an IBM System Service Representative (SSR).
Task one for the IBM SSR was to remove the blanking plate at the front of the machine and slide the new module into place.
The module was then secured into place with its two captive screws (the only time a tool was needed).
The next task was delegated to me.... which was attaching the sticker (decal) which showed the relevant module number.
In our case the new module was module number 12.
Here you can see the new module with all the available decals (I think I did a good job).
Note the cables have not been plugged in yet.
Because the XIV is pre-cabled, all that the IBM Service Representative needed to do was plug the ethernet and power cables into the new module.
You can see all the cables plugged in and the lights are now on:
Once this was done, the module booted up and became available in the XIV GUI.
The IBM Service Representative needed to issue two commands to complete the upgrade (literally two mouse clicks in the GUI).
The first command, called an equip, introduces the module to the XIV and places it into the Ready state, as shown below:
The second command issued by the IBM SSR starts a process known as redistribution.
At bottom right the message changed from 'Full redundancy' to 'Redistributing'.
What does this mean?
It means the XIV is automatically spreading existing data across the new disks to load level the amount of data on each disk.
The machine will then be in a state of workload balance without any user intervention or host interruption.
This process is done with low priority, meaning the predicted end time kept jumping around as host I/O workload rose and fell.
To monitor the process, we used the XCLI command, monitor_redist as shown below:
The process actually ran all night, moving 8TB of data around the machine to ensure the most ideal data layout.
In the morning the redistribution had finished.
The machine now reported 'Full redundancy' and the useable capacity had risen from 54649 GB to 61744 GB
(the increase varies according to which module is being added).
Two tasks remained for the customer to complete:
1) Increase the size of the relevant storage pool(s) - takes a couple of mouse clicks, perhaps 20 seconds work.
2) Create new volumes in those pools, which again takes just seconds to perform (for those who prefer the command line, see the XCLI sketch below).
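For those who prefer the XCLI over the GUI, those two steps might look roughly like this (the pool and volume names and the sizes are made up, and the exact parameters of pool_resize depend on whether the pool is thin provisioned, so check the XCLI help on your firmware level):
pool_resize pool=PROD_POOL size=61000
vol_create vol=PROD-3052 size=204 pool=PROD_POOL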
The XIV has two concepts when it comes to space:
Hard limit --> How much useable capacity you ACTUALLY have
Soft limit --> How much space you can allocate to volumes.
The ability to set the Soft Limit above the Hard Limit means you can over allocate the hard capacity (if you so choose!).
In the screen captures, the Hard and Soft Limits went up because the amount of useable capacity went up.
So to sum up..... to increase the capacity of the XIV while maintaining perfect load balance, the IBM Service representative performed about
2-3 minutes of physical work and then used two mouse clicks. Redistribution began and ran as a background task.
In the morning the client then increased the size of a storage pool (two mouse clicks) and the space was now ready for new volumes.
Easy as..... XIV.
So something I truly hate to see when visiting computer rooms is fibre cables hanging in the breeze with no dust covers, their precious glass connectors exposed to the world.
Even worse are fibre patch panels and HBAs without dust covers.
When new equipment arrives, every HBA, every patch panel port and every fibre optic cable will have a dust cover.
So what to do with these little guys once you remove them?
When you unplug a cable later you need to immediately re-install those precious covers both onto the cable and into the HBA or patch panel port, to protect the fibre optics from contamination.
I recommend storing dust covers in sealed plastic bags, preferably kept in the relevant rack so they are close to hand.
The picture below (taken from the rear of an XIV) is cute in that it shows a clever re-use of the XIV power cable covers,
but the dust covers are now exposed to contamination from the open air.
Since we are on the subject, take note of the colour coded power cables in the XIV.
Its another example of clever design to make power redundancy visually obvious.
The ends of the power cords are red, yellow and green to indicate which of the three UPSs these cables come from.
The unused cords at the top of the image are free because this machine is not fully populated with modules.
I recently listened to a great podcast on VAAI from Greg Knierieman over at the Storage Monkeys website. You can find the podcast here (the whole site is well worth a visit).
He was talking with Marc Farley (his co-host, from 3PAR), Chad Sakac (from EMC) and Chris Evans (the 'Storage Architect'). The topic was VMware's newly announced VAAI.
To quote from the podcast, VAAI is a set of APIs focused on the VMWare kernel to off-load various functions onto the storage.
To get the newly announced VAAI functions you need to upgrade to the newly announced VSphere 4.1 and you will need to upgrade your storage hardware firmware to a version that supports it (when such a version comes out).
Some of the major new functions are:
Hardware accelerated locking (to avoid the need for ESX to use SCSI reserves when doing meta-data updates)
Hardware accelerated full copy (to help VMWare clone data without having to do lots of read and writes)
Hardware accelerated zero (to avoid the need to send vast numbers of 'empty' SCSI write I/Os to zero out blocks)
Given who was on the call, the conversation focused mainly on what EMC, 3PAR and to some extent what NetApp are doing in regards to this development.
Apart from some (good natured?) digging at HP, no other Vendor was really mentioned.
One thing that was mentioned was that some storage hardware architectures will lend themselves far better to VAAI than others.
In particular Chad mentioned that he would expect fullcopy and hardware accelerated zero would work better on V-Max than CLARiiON, due to hardware architecture differences that also benefit 3PAR.
I found that a really interesting observation.
What wasn't mentioned on the podcast was XIV.
So to be clear, the architecture of XIV lends itself very very well to the changes required to support VAAI.
To give an example of how we have done this with other vendors, XIV firmware 10.2.0a brought in support for Symantec Storage Foundation Thin Reclamation.
XIV support for Symantec's Storage Foundation and Thin Reclamation API means that when data is deleted by a user who uses the thin provisioning aware Veritas File System (VxFS), XIV will immediately free unutilised blocks and reclaim such blocks, rather than leaving them with 'garbage' data that wastes space.
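Purely as an illustration of that integration (the mount point and disk group names are hypothetical, and the exact options depend on your Storage Foundation level), reclamation on a VxFS file system is typically triggered with something like:
fsadm -F vxfs -R /vxfs_mountpoint
or, at the VxVM level, with:
vxdisk reclaim mydg
Either way, the blocks freed by deleted files are handed back to the XIV and the hard capacity they consumed becomes available again.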
So have no doubts, XIV's architecture is very 'friendly' to the sorts of things VMware are trying to achieve with VAAI. To underscore this, Chad also said that a VMware goal was that VMware admins should need "no requisite knowledge of the underlying infrastructure for any task". The goal is to use policies instead. Given this goal, XIV is also a perfect match. With XIV there is no need to think about RAID types, RAID sizes, disk types, disk sizes, LUN allegiances and trespassing, controller workload balancing, or hot spot detection, prevention or correction. All of these concerns simply don't exist.
The good news is that XIV is working with the VMware Reference Architecture Lab and the statement of direction is that we will announce VAAI support for XIV later this year. XIV continues to be an excellent choice for VMware environments and when VAAI support is added to XIV, this will only improve.
Finally, Chad made a great quote on the podcast. He said: "Never trust any vendor when they talk about what other vendors are doing"
I think this is a really great statement and one that everyone should take to heart.
I spent the first week of my recent vacation in Sydney (I live in Melbourne).
For someone who has spent many weeks in Sydney on business, this was the first time I actually went there as a real tourist.
First up, it was a great break and my family and I can heartily recommend Sydney to anyone (from any part of the world) who is looking for a "holiday with the lot".
As part of our planning, we clearly had a budget to work to as well as many ideas about things to do.
Of course one of the big considerations was where to stay. I did a great deal of Googling around but relied on TripAdvisor heavily to help make our final decision.
I really love TripAdvisor for two reasons:
So what has that got to do with IT? Well those who follow storage blogs will have seen a spike recently in discussions on two subjects:
I hope that it's obvious that I work for IBM, so I accept that you may choose to view anything I say as potentially coloured by my relationship with my employer.
It has left me pondering where clients should go to get a good handle on what really matters most to them.
To chose the hotel that best matched my holiday requirements I used TripAdvisor.
But what can clients do?
I personally read many vendor and 'independent' blogs to try and ensure my 'world view' is as realistic and informed as possible.
There are some great blogs out there, but I do not know many of these people or their organizations personally. So I always read them through a haze of mild cynicism.
So is there a TripAdvisor for Storage IT? A place for genuine end-users to share their experience with specific solutions?
I have to say that XIV is one product that is crying out for such a beast. My experience is that client satisfaction levels with XIV are remarkably high, but is that always reflected on the Web?
So far the best place I have seen for shared experiences from 'real' people is the XIV group on LinkedIn.
It would be great to see more actual end users head over there and contribute (so please do!).
Finally I should also point out that if there is a business relationship between IBM (my employer) and TripAdvisor or LinkedIn, I am not aware of it.
My opinions on these organizations are totally my own.
And the hotel we stayed in? The Quay Grand on Circular Quay. The view from our balcony was priceless. Check it out: