Anthony's Blog: Using System Storage - An Aussie Storage Blog
So let's be honest here. Watching corporate advertising on YouTube is not my favourite thing to do. But watching people I know... people who are passionate and articulate and who are talking about a subject they understand with tremendous depth... that's worth your time. Brian Carmody is the XIV Technical Product Manager, and there are few people who can explain a deeply technical concept as well as he can. I have spotted him in two videos so far. If you can forgive the (very) cheesy music and the shaky handheld-camera-like graphics, listen to Brian, Yossi Siles and Robert Cancilla talking about XIV.
One of the many popular features of the XIV is the ability to replicate using iSCSI. On XIV Gen3 there are now at least 10 and up to 22 active iSCSI ports on each machine (depending on how many modules you order).
Implementation of the iSCSI connection between two XIVs is a piece of cake. If both XIVs are defined to the XIV GUI (which they should be), you just need to drag and drop links between the XIVs to bring the iSCSI mirroring connections alive. If the network gods are with you, the link goes green. But if the networking gods are against you, the links stay red and then the question is... what to do?
Old-fashioned problem diagnosis leads us straight to the ping command. However, I routinely find that the ping command works fine (all interfaces respond), but the link stubbornly remains red.
The first possible problem is a firewall: iSCSI uses TCP port 3260, so make sure nothing in the network path is blocking that port.
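If you want to quickly confirm the port is open end to end, a simple trick is to telnet to port 3260 from a server that sits in the same network path (the IP address below is just an example; use one of the remote XIV iSCSI port addresses). If the session opens, the port is not being blocked:
telnet 10.1.1.50 3260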
The second possible problem is the MTU (Maximum Transmission Unit) size. When we define the iSCSI interfaces on the XIV we set the MTU to a value of up to 4500 bytes. When we establish connections between two XIVs, each XIV sends test packets that are sized to the MTU. If the intervening network does not support that packet size, the packets will be dropped by the network, because the XIV sets the Don't Fragment (DF) flag to ON.
So how to work out what the MTU is? Well the first thing to do is ask your friendly networking team member. But sometimes I find that the intervening networks are controlled by third parties, which means that getting a straight (and reliable) answer can prove difficult. Even worse, some of these third parties charge a fee every time you call them, so there may be hesitation to even get them involved!
One simple trick is to re-use the ping command but play with payload sizes. We can use a command that looks like this:
ping -f -l 1472 192.168.0.1
That command sends a ping with a payload of 1472 bytes to IP address 192.168.0.1. We add the -f parameter to prevent packet fragmentation. You then slowly increase the payload until you no longer get a reply.
This process works fine and is a great way to determine the maximum payload size the end-to-end network will support. However, if you're using the payload size to determine the maximum transmission unit, there is a little trick. The MTU is the maximum packet size, but a ping sends its payload wrapped in 28 bytes of IP and ICMP headers. So our example:
ping -f -l 1472 192.168.0.1
sends a 1500 byte IP packet to 192.168.0.1 (1472 bytes of payload plus 28 bytes of headers).
If this command succeeds, you can use an MTU of 1500 in the XIV GUI or XCLI (rather than an MTU of 1472, which is 28 bytes smaller).
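If you are impatient, you can get the command shell to do the stepping for you. Here is a minimal sketch for a Windows command prompt (the IP address and the size range are just examples; adjust them to suit your network). Payload sizes that produce a Reply line fit within the path MTU; the first size that gets no Reply has exceeded it:
for /L %i in (1440,4,1480) do @(echo Payload %i & ping -f -l %i -n 1 192.168.0.1 | find "Reply")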
For those who are wondering how I did the network sniffing to get the screen captures above, I used a brilliant piece of freeware called Wireshark. My only warning is that your corporate security policies may have rules about sniffing the network, so don't take my blog post as permission to use it. And for the networking geeks among you, yes, I know that extra bytes could actually be wrapped around our Ethernet packet for things like VLAN tags or encapsulation, but hopefully this does not affect our mathematics.
Controlling the background traffic
Final pointer. Having finally gotten the link up and going, you are now free to start replicating volumes. But how much traffic can the cross-site link support? The XIV can limit the background copy bandwidth with a parameter called max_initialization_rate. This is useful to stop you flooding the cross-site link and annoying your link co-tenants. To display the current settings, open an XCLI window and issue the following command (I am assuming here that your XCLI supports target_list with the -x flag for XML output):
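target_list -x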
For each target you should see three parameters:
<max_initialization_rate value="100"/>
<max_resync_rate value="300"/>
<max_syncjob_rate value="300"/>
These three settings should be tuned to reflect the possible throughput of the cross-site links.
To change the settings use a command like this (change the target name and the rates to suit):
target_config_sync_rates target="Remote_XIV" max_initialization_rate=120 max_syncjob_rate=240 max_resync_rate=240
If you want to see the current throughput rate, open XIV Top on the remote machine. You should see how much write I/O is being sent to the mirror target volumes in MBps.
So hopefully you're now better positioned to diagnose iSCSI link issues, maximize your MTU, and tune and monitor your link speed.
Questions? Fire away...
With the announcement that you can order an XIV with 3 TB SAS disks, IBM now have some amazing capacity options and some equally clever growth options with XIV Gen3.
As you hopefully know, the XIV consists of modules that each contain 12 disks. An XIV can have 6, 9, 10, 11, 12, 13, 14 or 15 modules (all modules must have the same size disk). You can start at any of those points and then grow, without interruption or outage, up to 15 modules (that's 243.3 TB!). There is practically no planning required to do a capacity upgrade, and the data relocation to re-balance across the modules is done automatically by the machine (without any end-user intervention).
The useable capacity sizing with 3 TB drives stretches from 84.1 TB with 6 modules to 243.3 TB with 15 modules (these are decimal TB).
However the Capacity on Demand (CoD) options are far more interesting. With CoD you effectively buy a certain amount of capacity up front but also get up to 3 more modules shipped with the machine. You can start using this extra capacity when your business requirements demand it, at which point you will be asked by IBM to purchase it. The advantage here is that you physically get a bigger machine up front, with all the performance benefits that bestows, plus you don't have to contact IBM to start using that extra capacity. Let's look at the possible configurations.
So let's take a scenario. You need 100 TB today, but you know this will grow to 130 TB over the next 12 months. So you could purchase an XIV with 9 physical modules (using 3 TB drives), with 7 CoD activations. This means IBM ship a machine that physically has 132 TB and 108 drives in 9 modules. Your data will be spread over all these drives, and all of these modules will be active and working. However, you have effectively only paid for 103 TB of that space up front. If you order extra CoD activations, you could also order extra physical modules. As long as you stick to the chart above and have at least one un-activated module, you stay in the CoD program.
When your data requirements exceed 103 TB you just start using the extra space, no license keys or special tasks required. Nice!
So having told you how great it is... are there any disadvantages?
1) You need to actually buy the storage... eventually. Depending on the CoD contract, there will be a point when IBM expect you to purchase this extra capacity. The whole point of CoD is that it is like pre-ordering capacity without actually paying for it up front. If you're really not certain you need extra capacity, you're probably better off not ordering CoD capacity in the first place. Instead, order capacity upgrades as you require them.
2) There is nothing to stop you using the storage. Now this is a curious disadvantage, because it means that if you have paid for 103 TB and you start using 105 TB, the machine will not tell you off or yell at you. So is this a good thing or a bad thing? Well, I really like the flexibility, so I think it is a good thing. Plus there is a nice command called cod_list which displays consumed capacity to help keep you on the path (you can also display it in the GUI). So it just means you need to keep an eye on volume and pool creation to ensure you don't start configuring extra capacity until you're prepared to pay for it.
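Checking is as simple as this from an XCLI window (the exact output columns will vary with your software level, so treat this as a sketch):
cod_list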
You can also use CoD with 2 TB drives on XIV Gen3 so this is another option. With 2 TB drives, the useable capacities look like this:
Questions? Fire away....
It is that time of the year - IBM Storage announcements time.
The number and depth of announcements is a bit overwhelming... there are so many things to talk about.
The good news is that my fellow bloggers are producing mountains of great content so please check them out:
You can also check out a list of all the announcements here.
It seems like only yesterday that it was November 2010.
It was a very clever decision, taking the mature and very popular SAN Volume Controller (SVC) and packaging it with the latest SAS enclosure technology and offering it as a midrange storage product.
Since IBM started shipping the Storwize V7000, they have sold (up until September 2011):
You can find Storwize V7000s in every major country of the world, in every major industry, in every common environment. A phenomenal success, all achieved in less than a year.
So what does the future hold? Watch this space... some very cool news is really close...
For those of you with Apple iPads, you might consider dropping by the App Store and picking up your free IBM XIV Mobile Dashboard.
The IBM XIV Mobile Dashboard application can be used to securely monitor the performance and health of your XIV over a Wi-Fi or 3G link. Having downloaded and installed the Mobile Dashboard you will get a lovely XIV Icon:
When you start the Mobile Dashboard you will have the choice to either run in Demo Mode or to connect to an actual XIV. Demo Mode can be accessed via the option down in the lower right-hand corner, so you don't actually need an XIV to give it a test drive.
Once connected you have the choice of viewing volume performance or host performance. If you hold the iPad in portrait mode you get a list of up to 27 volumes or hosts ordered by performance metrics (it defaults to ordering by IOPS). If you hold the iPad in landscape mode you get a more graphical output (as per the examples below). There are no options to perform configuration; the dashboard is intended only for monitoring. Each panel also shows the performance and redundancy state of the XIV.
The volume performance panel is shown by default. The example below shows the output when the iPad is operated in landscape mode. From this panel you can see up to 120 seconds worth of performance for a highlighted volume. Use your finger to rotate the arrow on the blue volume icon to switch the display between IOPS, bandwidth (in megabytes per second or MBps) and latency (in milliseconds or ms). The data redundancy state of the XIV is shown in the upper right-hand corner (in this example it is Full Redundancy, but it could also be Rebuilding or Redistributing).
The example above shows the output when the iPad is operated in landscape mode. If you instead rotate the iPad to portrait mode, you will get a list of the performance of up to 27 of your busiest volumes.
Now swipe to the left to navigate to the Hosts panel as shown below.
From this panel you can see up to 120 seconds worth of performance for a highlighted host. Use your finger to rotate the arrow on the purple host icon to switch the display between IOPS, bandwidth (in megabytes per second or MBps) and latency (in milliseconds or ms). The data redundancy state of the XIV is shown in the upper right-hand corner (in this example it is Full Redundancy, but it could also be Rebuilding or Redistributing). Swipe to the right to navigate back to the Volumes panel.
The example above shows the output when the iPad is operated in landscape mode. If you instead rotate the iPad to portrait mode, you will get a list of the performance of up to 27 of your busiest hosts.
From either the Volumes or the Hosts panel you can log off from the Mobile Dashboard using the icon in the upper right-hand corner of the display. When you log back on, the last used XIV IP address and username will be displayed (but not the password, which will need to be entered again).
I can see some nice use cases here. You get a call regarding performance but you are on the road. Are there any problems with the XIV? You can quickly log on with your iPad and confirm whether response times are normal and the redundancy state is Full Redundancy.
A better use case... now you can ask your manager to buy you an iPad so you can monitor your XIV! Let me know how that goes.
For those of you planning to move to ESXi 5.0, IBM have found an annoying (but not show-stopping) issue with the way XCOPY is implemented in the VAAI driver. With ESX/ESXi 4.1, IBM supplied the VAAI driver, but with ESXi 5.0 this changed and VMware now manage this themselves. It has since emerged that the way VMware implemented XCOPY in this driver does not totally work with the way IBM implemented XCOPY in the XIV, Storwize V7000 and SVC.
This is the current situation with the first three VAAI primitives in ESXi 5.0:
Hardware accelerated locking: Also known as Atomic Test and Set (ATS), this function works fine when ESXi 5.0 detects a volume from an XIV, Storwize V7000 or SVC. In fact the moment ESXi 5.0 detects a LUN from any of these products it uses ATS to confirm that VAAI is possible. So this is goodness.
Hardware accelerated initialization: Also known as write same, this function offloads almost all effort on the part of ESXi to write zeros across disks. This function works fine when ESXi 5.0 works with XIV, Storwize V7000 or SVC. So this is also goodness.
Hardware Accelerated Move: Also known as XCOPY, full copy or clone blocks, this function works fine with XIV, Storwize V7000 and SVC if you clone a virtual machine and place the new copy into the same datastore as the source. This means creating multiple clones of a VMDK inside the one datastore will still be accelerated by VAAI. So far so good, but unfortunately on XIV, if you place the clone in a different datastore on the same XIV, it will not be hardware accelerated. This means the clone is still created, but in the old-fashioned way (reading from the source and writing to the target). As for storage vMotion, it also reverts to working in the old-fashioned way, reading from the source and writing to the target, rather than the hardware accelerated way.
So to be clear, this issue with XCOPY only affects clones placed into a different datastore and storage vMotion; clones created within the same datastore are still hardware accelerated.
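By the way, if you want to check which primitives your ESXi 5.0 host believes a given device supports, there is an esxcli query for exactly that. The naa identifier below is a placeholder; substitute one of your own device names (you can list them with esxcli storage core device list). The output shows the status of the ATS, Clone, Zero and Delete primitives:
~ # esxcli storage core device vaai status get -d naa.0017380000691234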
How will this be fixed? Well right now it looks like it will be fixed in new firmware on the IBM hardware. Watch this space and I will update you as soon as I have more news to hand.
As for the fourth VAAI primitive, unmap, which is used for space reclamation on thin-provisioning capable hardware, please also watch this space for news on when IBM hardware will support it... BUT in my opinion it does not matter right now, because this new unmap function in ESXi 5.0 can potentially cause issues, as described here: http://kb.vmware.com/kb/2007427
So until VMware confirm a fix, I recommend you disable unmap on all ESXi 5.0 boxes that connect to IBM Storage. I tested the following syntax to confirm it works:
First confirm the value of the unmap setting (1 means enabled, 0 means disabled):
~ # esxcli system settings advanced list -o /VMFS3/EnableBlockDelete | grep "Int Value"
   Int Value: 1
   Default Int Value: 1
Then disable it:
~ # esxcli system settings advanced set --int-value 0 --option /VMFS3/EnableBlockDelete
Then confirm it is disabled:
~ # esxcli system settings advanced list -o /VMFS3/EnableBlockDelete | grep "Int Value"
   Int Value: 0
   Default Int Value: 1
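Then later, once VMware confirm a fix and you want unmap back, the same command with an int value of 1 reverses the change:
~ # esxcli system settings advanced set --int-value 1 --option /VMFS3/EnableBlockDelete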
I just discovered that you can find freaky blue aliens wearing sunglasses in release 3.0 of the XIV GUI.
Just go to the Users icon (the padlock) and take the option to Add User Group. I think you will find the new grouped users icon rather amusing. I don't know if I prefer the dude with sunglasses or the two faceless people behind him (or her... or it).
Here are two groups I created, inspired by the icon:
If you're running in demo mode and you want to have a closer look, just go to the padlock icon on the XIV Demo2 machine.
You gotta love that icon...
Actually... the purpose of groups is to let you group together users in the Application Admin category who manage the same hosts. Application Admin users can only work with hosts that are assigned to them, which limits their access to only the volumes that are mapped to those hosts. A great (and safe) way to hand out user-ids to, say, SAP developers who want to snapshot as they go. With the added bonus of a cool spaceman icon.
The eternal question: which hardware/software combinations are tested and supported? If you use IBM Storage hardware and you need to answer this question, you need to be using the IBM System Storage Interoperation Center, or SSIC, which you will find here:
I use this site a lot and rely heavily on the output it creates. I thought I knew the site well, but I recently learnt some really handy tricks that you might find helpful...
1) Export all the data for a single product version.
If you want to download every interoperability test result for a specific product version, you can select the relevant version from the Product Version box of the SSIC and then select Export Selected Product Version (xls).
In the example below we want to see all the results for XIV Gen3 which uses XIV Software version 11.
a) Use the scroll bar in the Product Version box to bring up the XIV product versions. You don't need to make a selection from the Product Family or Product Model boxes.
b) Select IBM XIV Storage System (11) from the Product Version list.
c) Select the option to Export Selected Product Version (xls). A spreadsheet compressed into a ZIP file will be downloaded by your browser.
So that is just two clicks and the result is a giant spreadsheet. Reminds me of when matrices were giant .
2) Changing your selections from an existing search
As you make selections, the webpage leaves a trail of what are called breadcrumbs. They appear at the top of the page and can be seen in the example below, numbered 1 to 6. You can use those breadcrumbs to step backwards at any time, to any point.
3) Start anywhere
It seems human nature to always start at the top and work downwards. But in fact you can start anywhere on the SSIC and work in any direction. There are no real restrictions on the combinations you can attempt to build. Every time you make a selection in a different box, the number of configuration results will drop. For instance, just click in the Connection Protocol box... or just select IBM AIX 7.1 from the Operating System box. Then work up or down from there.
Hopefully these suggestions will help you work more effectively with the SSIC.
I know the walls are coming down... but there are still many organizational barriers that can exist in IT. How about:
Teamwork and co-operation? Sure, it's an option... but then an option means it's optional... right?
So when vendors come along with plug-ins and products that dare to connect two worlds... is this a unifying force, or is it anti-matter, or do they just get ignored and not used?
A case in point is the IBM Storage Management Console for VMware vCenter, which you can download from here. I have written about this plug-in before, but with the release of version 2.6 (which supports vSphere 5.0), I thought I would try something out. Installing the plug-in potentially offloads a lot of storage management from the storage admin to the VMware admin. But what if the storage admin does not WANT to offload this work?
The answer is to give the VMware admin read-only access.
When you configure your IBM storage device to the plug-in, you supply the plug-in with log-in credentials (so it can log into your IBM storage device and collect the required information). If the user-id supplied has only read-only access to the XIV, for instance, the plug-in still works... but not for any operations that change resources. You cannot see the pools on the XIV, but you can still see your volumes and any snapshots that have been created (though annoyingly you cannot see mirrors).
This does have one big advantage. You can clearly match the VMware datastore name to the XIV volume name. You can also identify which XIV supplied the volume.
I also tested this with the Storwize V7000 using a user in the Monitor category and got pretty well the same results. A nice bonus is that I could also see the state of the mirrors as well as the FlashCopies. In the example below, all of this information would normally not be visible to the VMware admin, so this is very handy stuff.
Of course I also get to visit one-man bands, where the same (exhausted) individual manages the VMware servers, the operating system guests, the network, the firewall, the Exchange server, the SQL servers and pretty well everything else, including getting the elevators and coffee machine fixed. Those people need all the help they can get.