I am in Singapore this week running a teach-the-teacher seminar on Storwize V7000. We are creating more instructors as demand for courses on the product continues to increase.
I am fortunate to be staying at the Marina Bay Sands Resort, which is one of the most mind-blowing facilities I have ever seen. Check out this view of the Infinity Pool from the Skydeck up on the 57th floor.
Out my hotel window they are building a huge new facility known as Gardens by the Bay, but frankly I think it looks more like a spaceport!
According to Wikipedia these are Supertrees: tree-like structures that dominate the Gardens landscape with heights that range between 25 and 50 metres. They are vertical gardens that perform a multitude of functions, which include planting, shading and working as environmental engines for the gardens.
But doesn't this look like two huge crashed spaceships?
Actually they will be giant conservatories, again according to Wikipedia they are the Flower Dome and the Cloud Forest.
They certainly know how to think big in Singapore!
XIV Gen 3 modules are built on a new generation of Intel microprocessors based on the Nehalem micro-architecture. Nehalem is the most profound architecture change that Intel has introduced in the 21st century. Some of the key changes and their benefits are:
- Integrated memory controller: The memory controller now sits on the same silicon die as the processors. It runs at the same clock-speed as the processors instead of at the lower speed of an external front-side bus. This dramatically improves memory performance and therefore overall system performance.
- No need for buffered memory: Previously, buffered memory was required to improve the performance of the memory sub-system. Buffered memory is relatively expensive and energy hungry. With the faster Nehalem integrated memory controller, the system can deliver improved performance without needing buffered memory, saving cost as well as energy. XIV Gen 3 will be faster and cooler at the same time using unbuffered DDR3 RAM. And since the memory is cheaper, we can put more in.
- Increased memory capacity: Nehalem supports more memory chips at higher speeds. In XIV Gen 3 this translates into a 50 to 200% increase in system cache, significantly lifting the performance headroom of an already stellar performer.
- No more front-side bus: Memory, second CPU package and peripherals no longer have to share and wait on a single bus to communicate. The connections are now direct or switched, enabling increased parallelism and the ability to do more work simultaneously.
- PCI Express Generation 2: The I/O sub-system doubles in speed with the introduction of PCI Gen-2. This enables faster network and I/O adapters for XIV Gen 3:
- 8 Gbps fibre-channel host connections.
- More iSCSI host connections (including at the entry configuration of 6 modules).
- Multi-channel, low latency infiniband as the inter-module connection.
- A slot for solid state disk (SSD).
- Better systems management instrumentation: The system supports increased monitors for sub-systems for more sophisticated self diagnostics and healing. Remote management capability has also been improved.
Furthermore, the new motherboards have additional expansion capacity (more processors, memory and I/O) that can be utilized to deliver future improvements in performance and increased software functionality.
XIV Gen 3 is not the first storage sub-system to adopt the Nehalem architecture. Some of our competitors (EMC and NetApp for example) have already done so with their dual-controller arrays. XIV Gen 3 takes the Nehalem architecture advantage forward, not twice, but six to fifteen times.
Many thanks to Patrick Lee for writing up this great summation.
Today IBM is announcing a new member of the XIV family, which we are calling XIV Gen3. I thought I would give a brief history of how we got here before I get too carried away with details.
What was Generation 1 of the XIV?
In 2002 an Israeli startup began work on a revolutionary new grid storage architecture. They devoted three years to developing this unique architecture that they called XIV.
They delivered their first system to a customer in 2005. Their product was called Nextra (does it look familiar?).
What was Generation 2 of the XIV?
In December 2007, the IBM Corporation acquired XIV, renaming the product the IBM XIV Storage System. The first IBM version of the product was launched publicly on September 8, 2008. Unofficially within IBM we refer to this as Generation 2 of the XIV.
The differences between Gen1 and Gen2 were not architectural; they were mainly physical. We introduced new disks, new controllers, new interconnects, improved management and additional software functions.
As anyone who has read my blog knows, I have been working on the Generation 2 XIV since the day IBM began planning to release it as an IBM product. So it is very exciting to be able to share with you that we are now releasing Generation 3 of the IBM XIV Storage System.
What is Generation 3 of the XIV?
Generation 3 of the XIV is a new member of the XIV family that will be an alternative to the Generation 2 XIVs we currently offer. It does not change the fundamental architecture; that remains the same. What it does do is bring significant updates to almost every part of the XIV, including:
- Introducing InfiniBand interconnections between the modules.
- Upgrading the modules to add 2.4 GHz quad-core Nehalem CPUs, new DDR3 RAM and PCI Gen 2 (using 8x slots that can operate at 40 Gbps).
- Upgrading the host HBAs to operate at 8 Gbps.
- Upgrading the SAS adapter.
- Upgrading the disks to native SAS.
- A new rack.
- A new dedicated SSD slot (per module) for future SSD upgrades.
- Enhancements to the GUI plus a native Mac OS version.
I will be blogging about each of these changes over the coming days and weeks as we move to the general availability date, so watch this space. In the meantime, why not visit the official XIV page here and check out the ITG Report linked there.
I have received this question several times, so it's clearly something people are interested in.
The Storwize V7000 has two controllers known as node canisters. It's an active/active storage controller, in that both node canisters are processing I/O at any time and any volume can be happily accessed via either node canister.
The question then gets asked: what happens if a node canister fails and can I test this? The answer to the question of failure is that the second node canister will handle all the I/O on its own. Your host multipathing driver will switch to the remaining paths and life will go on. We know this works because doing a firmware upgrade takes one node canister offline at a time, so if you have already done a firmware update, then you have already tested node canister fail over. But what if you want to test this discretely? There are four ways:
- Walk up to the machine and physically pull out a node canister. This is a bit extreme and is NOT recommended.
- Power off a node canister using the CLI (using the satask stopnode command). This will work for the purposes of testing node failure, but the only way to power the node canister back on is to pull it out and reinsert it. This is again a bit extreme and is not recommended. This is also different to an SVC, since each SVC node has its own power on/off button.
- Use the CLI to remove one node from the I/O group (using the svctask rmnode command). This works on an SVC because the nodes are physically separate. On a Storwize V7000 the nodes live in the same enclosure and a candidate node will immediately be added back to the cluster, so as a test this is not that helpful.
- Place one node into service state and leave it there while you check all your hosts. This is my recommended method.
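To make the options above concrete, here is a rough sketch of the commands involved. The panel names are placeholders, and exact flags can vary by code level, so check the command reference for your release:

```shell
# Option 2 (not recommended): power off a node canister from the service CLI
satask stopnode -poweroff <panel_name>

# Option 3: remove a node from the I/O group (on Storwize V7000 the candidate
# node is added straight back into the cluster, so this is not a useful test)
svctask rmnode <node_id_or_name>
```

Option 4, the service state method, is what the rest of this post walks through.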
First up, this test assumes there is NOTHING else wrong with your Storwize V7000; we are not testing multiple failures here. You need to confirm that the Recommended Actions panel, as shown below, contains no items. If there are errors listed, fix them first.
Once we are certain our Storwize V7000 is clean and ready for test, we need to connect via the Service Assistant Web GUI. If you have not set up access to the service assistant, please read this blog post first.
So what's the process?
First, log on to the service assistant on node 1 and place node 2 into service state. I chose node 2 because normally node 1 is the configuration node (the node that owns the cluster IP address). You need to confirm you're connected to node 1 (check at top right), select node 2 (from the Change Node menu), then choose Enter Service State from the drop-down and hit GO.
You will get this message confirming you're placing node 2 into service state. If it looks correct, select OK.
The GUI will pause on this screen for a short period. Wait for the OK button to un-grey.
You will eventually get to this with Node 1 Active and Node 2 in Service.
Node 2 is now offline. Go and confirm that everything is working as desired on your hosts (half your paths will be offline but your hosts should still be able to access the Storwize V7000 via the other node canister).
When your host checking is complete, you can use the same drop-down to Exit Service State on node 2 and select GO.
You will get a pop up window to confirm your selection. If the window looks correct, select OK.
You will get the following panel. You will need to wait for the OK button to become available (to un-grey).
Provided both nodes now show as Active, your test is now complete.
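If you prefer the CLI, the same test can be driven from the service command set. A rough equivalent is sketched below; the panel name is a placeholder, and as always, verify the exact syntax against the command reference for your code level:

```shell
# list the nodes and their panel names
sainfo lsservicenodes

# place node 2 into service state
satask startservice <node2_panel_name>

# ...check your host paths while node 2 is offline...

# bring node 2 back out of service state
satask stopservice <node2_panel_name>
```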
I think this picture speaks for itself: Three XIVs. Three cities. Three way iSCSI.
All the mirror connections were created in seconds using drag and drop in the XIV GUI.
I can now take a volume in one city and mirror it to another.
And yes.... IBM Australia now has a demo XIV in each of three major cities, so why not drop by and have a look?
Over at SearchStorage.com.AU they recently published an article entitled Six reasons to adopt storage virtualisation. You can find the article here. The six given reasons are:
- Storage virtualisation reduces complexity
- Storage virtualisation makes it easier to allocate storage
- Better disaster recovery
- Better tiered storage
- Virtual storage improves server virtualisation
- Virtual storage lets you take advantage of advanced virtualisation features
It's a well-written article and I agree with every point. But one could be forgiven for reading the article and thinking that either storage virtualisation is new, or that it is something you might consider AFTER doing server virtualisation. Neither of which is true.
IBM embraced storage virtualisation in June 2003 when we announced our SAN Volume Controller (the IBM SVC). I even found a CNET.com article from way back then. You can find it here (the image below is a screen capture of that CNET website).
IBM's SVC product has been enhanced repeatedly since 2003 with an enormous list of supported host servers and backend storage controllers. We have added new functions every year including Easy Tier, split cluster, VAAI, an enhanced GUI and a new form factor for the SVC code in the form of the Storwize V7000.
So let me give you a seventh reason for adopting storage virtualisation: a vendor who has shown genuine support for this technology. No vendor has embraced storage virtualisation with more enthusiasm than IBM. We have an industry-leading solution with phenomenal SPC benchmarks, an enormous number of case studies and an architecture that does not lock you in. Indeed it is an architecture that can grow as you grow and that can be upgraded without disruption.
So please consider storage virtualisation from IBM, using either the SVC or the Storwize V7000. If you're in Australia, we have demo centers dotted around the country. Many of our Business Partners can also demonstrate IBM storage virtualisation using their own Storwize V7000s. If you're in Melbourne, feel free to give me a call and schedule a time to drop into Southgate.
Bob Leah is one of our leading lights in the developerWorks team. His blog (found here) is a great resource for Web designers. He recently created a new set of templates to enable a mobile page for developerWorks blogs. You can read his article about the new template here.
This morning I boldly went and installed the new templates and so far I think it looks fantastic, not only on the iPhone, but also the iPad and on regular browsers. My only complaint is that I lost the banner image of my Golden Retriever (my loyal hound Suzie). Bob assures me she will reappear soon. In the meantime, I would love to hear feedback about the new template. This is what it looks like on my iPhone:
Tivoli Pulse is coming to Melbourne July 27 and 28, 2011 at the Crown Promenade in Melbourne.
How many chances do you get to listen to the following speakers in one place?
- Nigel Phair, Director, Centre for Internet Safety, University of Canberra
- Steve Van Aperen, Human Lie Detector and Director of SVA Training and Australian Polygraph Services
- Laura Guio, Vice President, Storage Sales, STG Growth Markets, IBM
- Joao Perez, Vice President of Worldwide Tivoli Software Sales, IBM USA
- Jamie Thomas, Vice President Tivoli Strategy and Development, IBM USA
- Glenn Wightwick, Director, IBM R&D Australia
There are 12 customer case studies presented mainly by customers. There are over 70 sessions in eight streams, expert keynote speakers, case studies and presentations. Pulse 2011 provides the tools to help advance your infrastructure goals.
Registration is free so what excuse do you have? Find out more here. You can enroll here. The agenda front page is here. The detailed agenda is here.
I will be presenting on day two, talking about Virtualization and Storwize V7000, so maybe I will see you there!
I have an admission: I am a bit of an Apple fanboy. Well, not quite a full-on Apple fanboy: I have an iPhone and an iPad but I don't have a MacBook (although if IBM start offering a cash payment instead of giving laptops to mobile employees, that might change). But not everything is perfect in the land of Apple. Let me give you an example, one that I routinely find people are not aware of (apologies if you learnt all of this months ago).
The picture below appears to show three identical Apple charger packs (with Australian pins). You may have a similar collection. But are they all identical? Sadly not.
Only those with very good eyes can spot the difference by reading the rather pale decal on the bottom section of each charger. The text is so small and faint, I struggled to take a decent picture, but here is my sad attempt for one of them (they are all different):
So how are my three power adapters different?
- The first is marked as a 10 Watt USB Power Adapter (it came with an iPad). Its output is amusingly marked as 5.1 Volts DC at 2.1 Amps, which suggests 10.7 Watts.
- The second one is marked as a 5 Watt USB Power Adapter (it came with my iPhone). Its output is marked as 5 Volts DC at 1 Amp, which is indeed 5 Watts.
- The third is marked as an iPod USB Power Adapter. No stated wattage, but its output is marked as 5 Volts DC at 1 Amp, which again suggests 5 Watts. So perhaps my 5 Watt adapter and my iPod adapter are actually the same.
The big question that comes up: Are they interchangeable? The answer: Yes but with caveats.
If you have an iPad you should use the 10W adapter. If you use the 5W adapter it will still charge but at a much slower rate. Apple confirm this here where they state:
iPad will also charge, although more slowly, when attached to an iPhone Power Adapter (by which they mean a 5 Watt adapter).
If you have an iPhone or an iPod can you use the 10W adapter? The answer is yes! It will recharge with no ill effects. Apple confirm this here, where they state:
While designed for use with the iPad, you can use the iPad 10W USB Power Adapter to charge all iPhone and iPod models.
So I am putting my 5 Watt and iPod adapters in the cupboard and using the 10 Watt adapter exclusively. If you have an iPad and find it's recharging slowly, you may be using an older 5 Watt adapter (but you may need a magnifying glass to spot the difference!).
My suggestion to Apple? A few more cents worth of ink please, to make things more obvious.
To close, on my first Apple focused blog entry, let me pose a question:
Will it blend?
I read a great blog post recently on Written Impact that talked about how to create effective presentations. It's well worth reading and can be found here. They describe several different formats that will help you develop interesting presentations, ones that don't put your subjects to sleep.
Talking of presenting, I recently presented at the IBM Power and Storage Symposium in Manila. It was a great event and was very well attended. We even had cake to celebrate IBM's 100th birthday.
There are two IBM Symposiums coming up in Australia that I would love for you to attend:
The next IBM Power Systems Symposium will be held in Sydney running from August 16 to 19, 2011. We are currently finalizing the agenda on this one and while this symposium is dedicated mainly to IBM Power Systems... I will be attending and presenting on storage related topics. To check out the details and enroll, please head over to here.
An IBM Storage Symposium will be held in Melbourne running from November 15 to 17, 2011. The agenda is still being set, so if you have ideas about what you would like to see, please let me know. To check out the details and enroll, please head over to here. And yes! I will be attending and I will be presenting.
The Storwize V7000 and SVC release 6.1 introduced a new web GUI to assist with service issues, known as the Service Assistant. The Service Assistant is a browser-based GUI that is used to service your nodes. Much of what you traditionally did with the SVC front panel can now be done using the Service Assistant GUI. You can see a screen capture of the Service Assistant below:
While I would like to be optimistic and hope that you will never have to use the Service Assistant, you should always ensure your toolkit is equipped with every possible tool. I say this because one thing I have noted is that the majority of installs are not configuring the Service Assistant IP addresses. This is particularly apparent as clients upgrade their SVC clusters to release 6.1.
By default on Storwize V7000, the Service Assistant is accessible on IP addresses https://192.168.70.121 for node 1 and https://192.168.70.122 for node 2 (don't try to point your browser at them right now, as your network routing won't work - you would need to set your laptop IP address to the same subnet and be on the same switch; details to do that are here). For SVC there are no default IP addresses, although we traditionally asked the client to configure one service address per cluster. The best thing for you to do is approach your network admin and ask for two more IP addresses for each Storwize V7000 and/or SVC I/O group. Once you have these two extra IP addresses, record them somewhere and then set them using the normal GUI.
It's an easy five-step process as shown in the screen capture below. Go to the Configuration group and then choose Network (step 1). From there select Service IP Addresses (step 2) and the relevant node canister (step 3). Choose port one or port two (step 4) and then set the IP address, mask and gateway (step 5).
You can also set them using CLI (replace the word panelname with the panel name of each node, which you can get using the svcinfo lsnode command).
satask chserviceip -serviceip 10.10.10.10 -gw 10.10.10.1 -mask 255.255.255.0 panelname
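That command sets one node at a time. Since you can only run satask against the cluster itself, here is a small sketch that simply builds the commands you would issue for both node canisters; the panel names "01-1"/"01-2" and the 10.10.10.x addresses are illustrative placeholders only, so substitute your real panel names (from svcinfo lsnode) and the addresses your network admin allocated:

```shell
# Build the satask chserviceip command for each node canister.
# Panel names and IP addresses below are placeholders for illustration.
GATEWAY=10.10.10.1
MASK=255.255.255.0
CMDS=""
i=0
for panel in 01-1 01-2; do
  ip="10.10.10.1$i"
  cmd="satask chserviceip -serviceip $ip -gw $GATEWAY -mask $MASK $panel"
  echo "$cmd"
  CMDS="$CMDS$cmd
"
  i=$((i+1))
done
```

Paste the resulting commands into your SSH session with the cluster.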
If you forget these IP addresses, you can reset them using the same CLI commands or using the Initialization tool as documented here.
Finally, having set the IP addresses, visit the service assistant by pointing your browser at each address. This is just to confirm you can access it. You log on with your superuser password. With the process complete, ensure the IP addresses are clearly documented and filed away. So now, if requested, you will be able to perform recovery tasks (in the unlikely chance they are needed). If for some reason your browser keeps bringing you to the normal GUI rather than the Service Assistant GUI, just add /service to the URL, e.g. browse to https://10.10.10.10/service rather than https://10.10.10.10.
So what should you do now?
If you're an SVC customer on SVC code version v5 or below, please get two IP addresses allocated for each SVC I/O group so you can set them the moment you upgrade to V6; do this once the upgrade is complete.
If you're an existing Storwize V7000 client, or an SVC client already on V6.1 or V6.2 code, then hopefully you have already set the service IP addresses. If not, please do so and test them.
I thought I would write a quick post about an issue that's not new, but is certainly worth being aware of....
One of the interesting tricks with the change to 8 Gbps Fibre Channel is that it required a change to the way the switch handles its idle time... the quiet time when no one is speaking and nothing is said. In these periods of quiet contemplation, a fibre channel switch will send idles. When the speed of the link increased from 4 Gbps to 8 Gbps, the bit pattern used in these idles proved to not always be suitable, so a different fill pattern was adopted, known as an ARB. All of this came to intrude on our lives when it became apparent that some 8 Gbps storage devices were having trouble connecting to IBM branded 8 Gbps capable Brocade switches because of this change. This led to two things:
- IBM released several alerts regarding how to handle the connection of 8 Gbps capable devices to 8 Gbps capable fibre channel switches.
- Brocade changed their firmware to better handle this situation.
An example of what was said?
"Starting with FOS levels v6.2.0, v6.2.0a & v6.2.0b, Brocade introduced arbff-arbff as the new default fillword setting. This caused problems with any connected 8Gb SVC ports and these levels are unsupported for use with SVC or Storwize V7000.
In 6.2.0c Brocade reintroduced idle-idle as the default fillword and they also added the ability to change the fillword setting from the default of idle-idle to arbff-arbff using the portcfgfillword command. For levels between 6.2.0c and 6.3.1 the setting for SVC and Storwize V7000 should remain at default mode 0.
From FOS v6.3.1a onwards Brocade added two new fillword modes with mode 3 being the new preferred mode which works with all 8Gb devices. This is the recommended setting for SVC and Storwize V7000"
So there are several tips that I will point you to, depending on your product of interest:
Brocade Release Notes
For most environments, Brocade recommends using Mode 3, as it provides more flexibility and compatibility with a wide range of devices. In the event that the default setting or Mode 3 does not work with a particular device, contact your switch vendor for further assistance. IBM publishes all the release notes for Brocade Fabric OS here.
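For reference, setting and checking the fill word on a Brocade FOS switch looks something like the sketch below. Port 2/15 is just an example, and the exact mode semantics can vary between FOS releases, so verify against the Fabric OS command reference for your level:

```shell
# set fill word mode 3 on slot 2, port 15
# (mode 3: try ARBff/ARBff first, fall back to IDLE/ARBff if the device rejects it)
portcfgfillword 2/15, 3

# confirm the setting took effect
portcfgshow 2/15
```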
Check out this link if you're connecting an 8 Gbps capable DS3500, DS3950 or DS5000 to an 8 Gbps capable switch: http://www-947.ibm.com/support/entry/portal/docdisplay?brand=5000028&lndocid=MIGR-5083089
There is no tip for the DS8800 but the advice remains effectively the same as for the Storwize V7000. I can confirm that using a fill word setting of 3 works without issue.
SAN Volume Controller or Storwize V7000
Check out this link if you're connecting a Storwize V7000 or a CF8 or CG8 SVC node to an 8 Gbps capable switch: https://www-304.ibm.com/support/docview.wss?uid=ssg1S1003699&wv=1
The XIV Gen3 comes with 8 Gbps capable Fibre Channel connections. It does not support idle Fill Words meaning that the portCfgFillWord value should not be set to 0.
When an IBM System z server attaches an 8 Gbps capable FICON Express-8 CHPID to a Brocade switch with 8 Gbps capable SFPs, you should upgrade your switches or directors to Fabric OS (FOS) 6.4.0c or 6.4.2a and set the fill word to 3 (ARBff).
LTO5 and TS1140
IBM have two tape drives that are capable of 8 Gbps, the LTO-5 drive and the TS1140. Setting the fill word to 3 can actually cause issues with these drives. To avoid issues do one of the following (you only have to do one of these, not all three):
- Load the tape drive with firmware that has the access fairness algorithm fix for loop:
- LTO5 drives should be on BBN0 and beyond (you may need to contact IBM support to get this code).
- TS1140 drives should be on drive firmware 5CD or beyond.
- Change the Fibre Channel topology to point-to-point (N port) (as opposed to L or NL). This is my preferred option.
- Change the Fibre Channel speed to 4 Gbps. This sounds slightly retrograde, but it is very rare for an individual drive to sustain a speed above 400 MBps (unless your data is very, very compressible).
**** UPDATED 28 Feb 2012 - Added System z FICON and Tape info ****
I recently got a great email from an IBMer in the Netherlands by the name of Jack Tedjai. He sent me two screen shots, taken with the new performance monitor panel (that comes with the SVC and Storwize V7000 6.2 code). He wrote:
I am working on a project to migrate VMware/SRM/DS5100 to SVC Stretch Cluster and one of the goals is to prevent using ISL (4Gbps) and VMware Hypervisor/HBA load during the migration. For the migration we are using VMware Storage vMotion. To minimize the impact of the migration on production, we tested VAAI for Storage vMotion and template deployment and it worked perfectly.
So what's this all about? Well, one of the improvements provided with VAAI support is the ability to dramatically offload the I/O processing generated by performing a Storage vMotion. Normally a Storage vMotion requires an ESX server to issue lots of reads from the source datastore and lots of writes to the target datastore. So there is a lot of I/O flowing from ESX to the SVC, and then from the SVC to its backend disk. What you get is something that looks like the image below. In the top right graph we have traffic from SVC to ESX (host to volume traffic). In the bottom right graph we have traffic from the SVC to its backend disk controllers (a DS5100 in this case). This is SVC to MDisk traffic.
When we add VAAI support to the SVC, we suddenly change the picture. Suddenly VMware does not need to do any of the heavy lifting. There is almost no I/O between VMware and the SVC (no host to SVC volume traffic) related to the vMotion. The SVC is still doing the work, but it is happening in the background without burning VMware CPU cycles or HBA ports (in that there is still SVC to MDisk traffic).
This difference translates to: Faster vMotion times, far less SAN I/O and far less VMware CPU being used on this process.
So do VMware support this? They sure do! Check this link here. It currently shows something like this (taken on June 23, 2011):
So what are your next steps?
- Upgrade your Storwize V7000 or SVC to version 6.2 code. Download details are here.
- Download and install the VAAI driver onto your ESX servers. You can get it from here. If you're already using the XIV VAAI driver you need to upgrade from version 188.8.131.52 to version 1.2. There is an installation guide at the same link.
And the blog title? It means friendly greetings in Dutch. So to Jack (and to all of you), vriendelijke groeten and please keep sending me those screen captures.
If you're a user of XIV, or you're considering purchasing an XIV, then there is one tool that you will truly love. It's called XIVTop. The XIVTop application comes packaged with the XIV GUI and is one of the handiest add-ons I have ever seen. It lets you monitor your XIV in real time, seeing exactly how much I/O or throughput is being achieved and at what response time (in milliseconds). You can immediately answer questions like:
- Is poor application response time being caused by poor storage response time?
- What application is currently generating so much traffic on the SAN?
- What effect has performing file de-fragmentation had on performance?
- Are the backups running and how much traffic are they generating?
- What happens when I run multiple application batch jobs at the same time?
The ability to get this information in real time is what makes XIVTop so invaluable.
So in the tradition of always pushing my boundaries, I thought I would create a narrated video about XIVTop. What I discovered is just how terribly hard making narrated videos is: You need to write a script... you need to stick to the script... you need to not fluff any words... you need to speak slowly and clearly and not start talking in a strange accent. I had trouble with all of these, so I made take after take after take after take, until I was heartily sick of the process. I now have a much greater respect for newsreaders and film actors. This narration stuff is hard!
So please check out my final take. It's still far from perfect, but all feedback is very welcome. The only other thing that is quite strange is YouTube's choice of videos to watch after mine. It's worth watching just to see the list. I think the term performance confuses the algorithm.
I joined IBM on June 26, 1989, so this Sunday brings up my 22 year anniversary with the company. No small achievement, but I am still three years away from the mystical IBM Quarter Century Club. Of course for some, 22 years is nothing! I recently learned that Robert Neidig, who has been (and remains) a leading light in promoting IBM's mainframe products, joined IBM on June 21, 1961. So this year brings up his 50th anniversary with the company!
For those with long memories, Bob has worked with the following IBM systems: 1401, 1410, S/360, S/370, 3031, 3032, 3033, 3081, 3083, 3084, 3090, ES9000, S/390, eServer zSeries, and System z. They have all been enhanced by Bob's contributions.
If you want to check out the history of some of these world-changing products, visit The IBM Mainframe Room. I particularly loved the Photo Album. There are some truly classic images of IBM products of old. If you're forward-looking, feel free to also visit the System z homepage.
So thanks Bob for your commitment and leadership on your half-centennial, truly a remarkable achievement!
- This is a Severity 1 issue at 2am
I wrote a blog post recently about my favourite podcasts. One of those I listed was Background Briefing, a radio program broadcast by the Australian Broadcasting Corporation (Australia's ABC). A recent episode entitled Fatigue Factor really sparked my interest. It talked about the effects of fatigue on professions such as:
- Air Traffic Controllers
- Train drivers
- Truck drivers
It contained some alarming facts about the potential effects of fatigue and is well worth taking the time to listen to. However, in my opinion there was one major omission:
It did not mention workers in the IT industry.
For many years I worked as the Account Engineer for several of IBM's System z customers, mainly banks. Most weekends I skipped Saturday night as a sleep night. If I was lucky I might get to sleep from 10pm to 1am and would then head off to vast, noisy, dehydrating air conditioned computer rooms to perform various system changes. If I did my job well, had no hardware issues and the client confirmed everything was running as expected, I got to head home about 7am on Sunday. So that night I would have slept somewhere between zero and three hours. I would then spend the rest of the week recovering, before doing it all over again the following weekend at a different customer.
I mention all of this because fatigue was something I learnt to live with. Even when I moved to a support role, I still occasionally worked through the night on critical situations (something IBM calls Crit Sits). I also worked on a support roster which could involve 3am callouts to assist my fellow IBMers across the Asia Pacific region. So when I later moved to a Pre-Sales role, it certainly did wonders in helping me re-establish normal sleep patterns.
Listening to this podcast really brought home to me that the IT industry is just as guilty of failing to deal with fatigue as the other industries that the podcast discusses. Now if you're thinking this means it's an IBM problem, think again. Most weekends I was working alongside representatives from EMC, HDS, StorageTek, etc. Plus of course there were the clients themselves, many of whom were also missing a night's sleep to satisfy their change and business requirements.
One of the major issues raised in the podcast is that there is no accepted way to measure how fatigued an employee actually is. This is a major problem. There are established tests to confirm how affected someone is by alcohol or by drugs, but we cannot easily confirm how badly fatigued a worker is; plus many people are unwilling or unable to admit that they are suffering from fatigue.
If we think about many of the major IT related outages that have occurred recently, I ponder what role fatigue played in each one. Even if it didn't cause the initial issue, did making your employees work around the clock to resolve an issue actually extend the outage time? For example, have a read of Amazon's explanation of its recent Service Disruption. Just picking out some of the lines in the report:
At 12:47 AM PDT on April 21st, a network change was performed...
At 2:40 AM PDT on April 21st, the team deployed a change...
By 5:30 AM PDT, error rates and latencies again increased ....
At 11:30AM PDT, the team developed a way to prevent....
Was the person doing the change working out of their usual sleep pattern? Was the team working to resolve the issue working out of their normal sleep pattern? Did fatigue compound the outage? It's an interesting idea. Now it may well be that fatigue had NOTHING to do with this outage. It is pure speculation on my part. But I am certain that the root causes of many of the recent IT meltdowns and their extended after-effects (such as Sony's ongoing issues) MUST include the debilitating effects of fatigue.
Plus here is another rather disturbing fact. To quote from the podcast:
... if you're sleep deprived, you're more likely to crave chips over lettuce, and feel less like climbing the stairs. And that can become a vicious cycle, because many people who are overweight are even more prone to sleep disorders....
So please take the time to listen to the podcast. You will find it here and in places like iTunes.
If you're reading my blog, you're probably interested in IBM Storage hardware (since apart from Bow Ties, that's all I talk about). So I would hope you're already subscribed to IBM's notification service that you will find here. Rob Jackard from the ATS Group (an IBM Business Partner based in the USA) puts together a summary of these notifications which he sends to me on a regular basis. So I am bringing them to you here. Now hopefully none of these alerts are news to you... but please, have a read and if you have not done so already... SUBSCRIBE!
DS3000 / DS4000 / DS5000:
(2011.06.09) IBM Retain Tip# H202771 – Expanding Dynamic Capacity Expansion (DCE) large arrays may fail due to out of memory conditions.
NOTE: The 7.xx firmware for the DS Storage Controller is affected. This is a permanent restriction. Possible workarounds are available.
(2011.05.27) Documentation: Instructions for opening the IBM System Storage DS Storage Manager interface are incorrect.
(2011.05.25) DS3950 / DS4000 / DS5000 Recommended Firmware Levels.
(2011.05.18) IBM Retain Tip# H202849 – Dynamic Volume Expansion is not possible on a LUN which is in an active mirror relationship with write-mode of ‘Asynchronous not write-consistent’.
NOTE: The DS Storage Controller is affected. A workaround is available.
(2011.05.10) IBM Retain Tip# H202771- Expanding (DCE) large arrays may fail due to out of memory conditions.
DS8000 / DS6000:
(2011.06.07) DS8800 Code Bundle Information.
N series:
(2011.06.03) EXN3500 (2857-006) Storage Expansion Unit Publication Matrix.
(2011.05.19) Excessive drive spinning up (0x2 – 0x4 0x1) messages on healthy EXN3000.
(2011.05.05) NEWS: Recommended Releases for IBM System Storage N series Data ONTAP.
(2011.04.28) DataFabric Manager (DFM) 4.0.2 Publication Matrix.
SAN:
(2011.05.10) Cisco MDS Field Notice: FN-63416 – DS-C9124 & DC-C9148 have incorrect MAC Programming; UMPIRE Program in Place.
(2011.05.19) Intel has reported PAGE FAULT OR CORRUPTED DATA USING 64-BIT APP IN 64-BIT NOS (Fix is Now Available).
SVC / Storwize V7000:
(2011.06.10) IBM SAN Volume Controller Code V184.108.40.206.
(2011.06.10) IBM Storwize V7000 Code V220.127.116.11.
(2011.06.10) SAN Volume Controller and Storwize V7000 Software Upgrade Test Utility V6.5.
(2011.06.10) IBM Storwize V7000 Initialization Tool.
(2011.06.10) Storwize V7000 Concurrent Compatibility and Code Cross Reference.
(2011.06.10) IBM Storwize V7000 V6.2.0 – Installable Information Center and Guides.
(2011.06.10) IBM System Storage SAN Volume Controller and Storwize V7000 V6.2 – Command-Line Interface Guide.
(2011.06.10) IBM System Storage SAN Volume Controller and Storwize V7000 V6.2 – Troubleshooting Guide.
(2011.06.10) Incorrect Usage of Drive Upgrade Command May Cause Loss Of Access to Data.
NOTE: This issue was resolved by APAR IC74636 in the V18.104.22.168 release of the Storwize V7000 software.
(2011.06.10) Storwize V7000 and SAN Volume Controller FlashCopy Replication Operations Involving Volumes Greater Than 2 TB in Size Will Result in Incorrect Data Being Written to the FlashCopy Target Volume.
NOTE: This issue is fixed by APAR IC76806 in the 22.214.171.124 and 126.96.36.199 PTF releases.
(2011.06.08) IBM SAN Volume Controller Code V188.8.131.52.
(2011.06.08) IBM Storwize V7000 Code V184.108.40.206.
(2011.05.27) IBM SAN Volume Controller Code V220.127.116.11.
(2011.05.27) IBM Storwize V7000 Code V18.104.22.168.
(2011.05.27) Storwize V7000 Systems Running V22.214.171.124-V126.96.36.199 Code May Shut Down Unexpectedly During Normal Operation, Resulting in a Loss of Host Access and Potential Loss of Fast-Write Cache Data.
NOTE: If a single node shutdown event does occur when running V188.8.131.52, this node will automatically recover and resume normal operation without requiring any manual intervention. IBM Development is continuing to work on a complete fix for this issue, to be released in a future PTF, however customers should upgrade to V184.108.40.206 to avoid an outage.
(2011.05.09) SVC V4.3.x End of Service – April 30, 2012.
SSPC / TPC / TPC-R:
(2011.06.10) Administration of the TPC Environment: A Guide for TPC Administrators.
(2011.06.08) TPC 4.2.x – Supported Storage Products Matrix.
(2011.06.08) TPC 4.1.x – Supported Storage Products List.
(2011.05.23) Shutdown sequence for TPC for Replication.
(2011.05.19) Fabric probe causes instability in Brocade DCFM server.
NOTE: Brocade Defect 332161 has been identified and is resolved in DCFM version 10.4.5.
(2011.05.19) Configuring Oracle for TPC for Databases.
(2011.05.17) TPC web browser support – Firefox 4.x and Internet Explorer 9.
(2011.05.06) TPC- Resolving Issues with Cisco Switches.
(2011.05.05) How to resolve TSPC Server service start problems when TIVGUID is mistakenly uninstalled.
(2011.04.29) Tivoli Storage Productivity Center v4.1.1 Fix Pack 6 (April 2011).
(2011.04.29) TPC database size increases after upgrade to 4.2.1.
XIV:
(2011.06.09) IBM XIV Host Attachment Kit for AIX v1.6.
(2011.05.24) Potential Problem on XIV Storage System ranging microcode versions 10.2.2 thru 10.2.4.a that can be caused by changing system time via Network Time Protocol (NTP) or when changing the clock via XCLI.
(2011.05.10) IBM XIV Storage System Planning Guide.
As Barry Whyte pointed out in this blog post, the release 6.2 code is available for download and installation onto your SVC and Storwize V7000.
- The Storwize V7000 release of the 6.2 code is here.
- The SVC release of the 6.2 code is here.
I thought I would quickly check out two of the announced features of the 6.2 release: the new Performance Monitor panel and support for greater than 2 TiB MDisks. So on Sunday I got busy and upgraded my lab Storwize V7000 to version 220.127.116.11.
Remember that in nearly every aspect the firmware for the SVC and Storwize V7000 are functionally identical, so while I am showing you a Storwize V7000, it equally applies to an SVC.
Firstly I tried the performance monitor panel, and what better way to show you what I saw than on YouTube? This is my first YouTube video so please forgive me if it's not slick. I started the performance monitor and captured two minutes of performance data using Camtasia Recorder. Because it is fairly boring to stare at graphs slowly moving right to left, I then sped it up eight times, and this is the result:
The video is shot in HD, so if what you're seeing is grainy or hard to read, change the display to 720p or 1080p. Now if you want to see the performance monitor at its actual speed, here is the original normal-speed video. Remember this is the same video as above, just slower. It can also be viewed in 720p.
So what are you seeing?
- The top left-hand quadrant is CPU utilization.
- The top right-hand quadrant is volume throughput in MBps, as well as current volume latency and current IOPS.
- The bottom left-hand quadrant is interface throughput (FC, SAS and iSCSI).
- The bottom right-hand quadrant is MDisk throughput in MBps, as well as current MDisk latency and current IOPS.
You will note that each metric has a large number (which is the current metric in real time) and a historical graph showing the previous five minutes. You can also change the display to show either node in the I/O group.
I found the monitor to be genuinely real time: the moment I changed something in the SAN (such as starting or stopping IOMeter, or starting or stopping a Volume Mirror), I immediately saw a change.
Greater than 2 TB MDisk support
Next I logged onto my lab DS4800 and created two 3.3 TiB volumes to present to the Storwize V7000. I chose this size because I had exactly 6.6 TiB of available free space on the DS4800 and I wanted to demonstrate multiple large MDisks. On versions 6.1 and below, the reported size of the MDisks would have been 2 TiB (as I discussed here). Now that I am on release 6.2 with a supported backend controller, I can present larger MDisks. In the example below you can clearly see that the detected (and usable) size is 3.3 TiB per MDisk.
What controllers are supported for huge MDisks?
The supported controller list for large MDisks has been updated. The links for Storwize V7000 6.2 are here and for SVC here. If your backend controller is not on the list, then talk to your IBM Sales Representative about submitting a support request (known as an RPQ).
I recently created a post about the XIV Host Attachment Kit (amusingly called the HAK). IBM has released an update to the HAK, taking us from version 1.5 to version 1.6. The updated versions, along with release notes and installation instructions can be found at the following links:
IBM XIV Host Attachment Kit for AIX, Version 1.6
IBM XIV Host Attachment Kit for HP-UX, Version 1.6
IBM XIV Host Attachment Kit for RHEL, Version 1.6
IBM XIV Host Attachment Kit for SLES, Version 1.6
IBM XIV Host Attachment Kit for Windows, Version 1.6
What's changed, you ask? Great question! Checking the Release Notes for each Operating System (which can be found in the links above), I found some common improvements to the HAK for every OS:
- The xiv_diag command now provides the HAK version number when used with the --version argument. This is handy to confirm what version of the HAK you are currently running.
- More information is collected with the xiv_diag command.
- The xiv_devlist command can now display LUN sizes in different capacity units, by using the -u or --size-unit argument. I give an example below.
Usage: -u SIZE_UNIT, --size-unit=SIZE_UNIT
Valid SIZE_UNIT values: MB, GB, TB, MiB, GiB, TiB
- The xiv_devlist output can be saved to a file in CSV or XML format, by adding the -f or --file argument. I give an example below.
There are also several other fixes which are mainly common between Operating Systems. Given that a major part of the HAK is a set of Python scripts such as xiv_attach, xiv_devlist and xiv_diag, and given that the output and behavior of these scripts are very similar on each OS, this is not surprising.
I installed the new version 1.6 HAK onto my 64-bit Windows 2008 server and found another pleasant surprise: when I ran the xiv_attach command it detected that my QLogic driver was downlevel. In this example it detected I am running a QLogic QLE2462 on driver version 9.18.25 and suggested I should instead run driver version 9.19.25.
I then tried out the xiv_devlist command, displaying volume sizes in both decimal (GB) and binary (GiB). Note the syntax I used to get the GiB output: xiv_devlist -u GiB
Finally I offloaded the output of the xiv_devlist command to a CSV file. Again please note the syntax as you may find it useful:
xiv_devlist -t csv -f devlist.csv -u GiB
You could use -t xml instead, to get XML output rather than CSV. Clearly you could also change the file name devlist.csv to any filename you like.
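Once you have the CSV file, you can post-process it with standard tooling. Here is a minimal sketch: the header and sample row below are illustrative only (modelled on the xiv_devlist fields described earlier), not the HAK's exact CSV layout, so inspect your own devlist.csv for the real column names first.

```python
# Post-process an offloaded xiv_devlist CSV file.
# NOTE: the column names here are assumptions for illustration;
# check the header line of your actual devlist.csv.
import csv
import io

# Stand-in for open("devlist.csv") -- a sample row in the assumed layout
sample = io.StringIO(
    "Device,Size,Vol Name,XIV ID\n"
    "\\\\.\\PHYSICALDRIVE0,17.2GiB,W2K8X64-H02_BOOT,6000081\n"
)

# Map each OS device back to its XIV volume name and serial number
for row in csv.DictReader(sample):
    print(f"{row['Device']} -> {row['Vol Name']} on XIV {row['XIV ID']}")
```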
You do not need to worry about which version of firmware your XIV is running. The release notes confirm HAK version 1.6 will work with XIV firmware 10.1.0, 10.2.0, 10.2.2, 10.2.4 and 10.2.4a, which should cover pretty well every machine in the world.
One final note: Under Known Limitations the release notes state that you should not map a LUN0 volume. This simply means leaving LUN0 disabled (which is the default). In the example below I start mapping volumes from LUN1 and have NOT clicked to enable mapping of volumes to LUN0. This should be the norm.
Any confusion or questions? You know where to find me.
Many months ago I set up my WordPress blog (this is not the one you're reading now, but the mirror of this blog I maintain over at WordPress). One of the configuration choices I had was to enable a mobile version of the site. This setting changes the user experience when using a mobile device. It was a very easy thing to set up:
The difference between the mobile version and the non-mobile version is fairly stunning, as can be seen below; both views are from an iPhone 3GS. The mobile version is on the left and the non-mobile version is on the right. Note that there is no difference in the selected URL:
In March, WordPress added a new feature from Onswipe to allow Apple iPad users to have a more iPad-friendly user experience. You can read the announcement on Onswipe's blog. Again for the content creator (me), the work to set this up was practically non-existent; in fact I don't even recall having to turn it on.
And the result? If you visit my blog on an iPad, the look and feel is amazing. It grabs the first image from each blog post to build a really nice front page. It means I will have to take more care with my opening images!
Now the obvious question is: What about Android? If I check the WordPress FAQ found here, it says that support is coming.
So if you like the look of the mobile version, feel free to switch to using my WordPress blog. It contains all the same posts and is found here:
With Brocade's recent announcement of a 16 Gbps capable Fibre Channel Switch and Director, the question of which cable type to purchase becomes even more relevant. Do you buy OM3 or OM4 cable over OM2?
Now if you're saying... OM-what? Let me start at the beginning...
Back when fibre channel was fresh and new and ran at 1 Gbps, the common multi-mode fibre cable that we used had a glass core that was 62.5 microns in diameter. This became known as OM1 type fibre cable. We rapidly switched to 50 micron cores because you could get a reliable signal across a longer distance, say 500 meters maximum rather than 300 meters. The 50 micron cable became known as OM2 type cable.
What has happened since then is that fibre channel speeds have moved from 1 Gbps to 2 Gbps to 4 Gbps to 8 Gbps to 16 Gbps. This is exciting stuff, but with every increase in speed, we suffer a decrease in maximum distance. This means that something else needs to change... and that something is the quality of the cables, or more specifically, the modal bandwidth (the signalling rate per distance unit).
With the evolution of 10 Gbps Ethernet, the industry produced a new standard of fibre cable which the fibre channel world can happily use. It's called laser-optimized cable, or more correctly: OM3. Since then OM3 has been joined by an even higher standard known as OM4.
Let's look at the distances we can achieve with different cable types. You can see in the table below that the modal bandwidth (given in MHz times kilometres) improves as we move to higher quality glass. You can also see that single-mode fibre (with the 9 micron core) has not suffered the same issue with decreasing maximum distances as speeds have increased. These numbers come from Brocade's SFP specification sheets found here and here (so there may be slight variations if you view specs from other vendors).
I didn't fill in the table for 1 Gbps and 2 Gbps using OM4 cable, simply because I couldn't find it... but the distances would be very large indeed.
So how can you tell what sort of cable you have? The first hint is the colour, the second is the printing on the cable. Cables that are 50 micron and orange are almost certainly OM2. Cables that are aqua in colour (don't call them green!) are either OM3 or OM4. In the example below I can clearly tell which cable is OM3.
Pictured below is a roll of OM3 cable, all ready for deployment with standard LC connectors. Note you can also get OM3 cable with a smaller LC type connector used on the mSFPs in the high density 64 port blades in the Brocade DCX. You can find additional information on identifying cables here.
So should you be buying OM3 cable over OM2? Or even considering OM4?
The reality is that in many cases, server and storage hardware is often in the same or adjacent racks to the switch hardware. If this is true for your site, OM2 will satisfy the vast bulk of requirements, because the distances are quite short. The most common cable I add to configurations is either 5 or 25 meters long. This is why OM2 is still IBM's cable of choice, since either length would satisfy 16 Gbps connectivity. Checking with some local cable vendors, OM2 cable also remains the cheaper alternative.
Clearly if your computer room is large enough to need cable runs of over 35 meters, then serious consideration should be given to future-proofing parts of your cable infrastructure with OM3 (or even OM4). There is nothing wrong with having a mix of cable types - just don't join them together.
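To make that trade-off concrete, here is a small lookup sketch. The distances are approximate figures of the kind you find on vendor SFP datasheets (exact numbers vary slightly between vendors and optics), so treat them as indicative only and check your own vendor's specifications:

```python
# Approximate maximum multi-mode link distances in metres, per typical
# vendor SFP datasheets. Indicative only -- verify against your optics.
MAX_DISTANCE_M = {
    # speed (Gbps): {cable type: metres}
    4:  {"OM2": 150, "OM3": 380, "OM4": 400},
    8:  {"OM2": 50,  "OM3": 150, "OM4": 190},
    16: {"OM2": 35,  "OM3": 100, "OM4": 125},
}

def cable_ok(cable: str, speed_gbps: int, run_m: float) -> bool:
    """True if a run of run_m metres is within spec at the given speed."""
    return run_m <= MAX_DISTANCE_M[speed_gbps][cable]

print(cable_ok("OM2", 16, 25))   # a 25 m OM2 patch still works at 16 Gbps
print(cable_ok("OM2", 16, 50))   # a 50 m run needs OM3 or better
```

This is why short OM2 runs (5 or 25 metres) remain fine even at 16 Gbps, while anything beyond about 35 metres pushes you to OM3 or OM4.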
I would be curious to know how many sites are choosing to move to OM3 - feel free to comment either way. I think there will be more to come on this subject, and remember... OM3 and OM4 cables are aqua, not green or blue.
Would love to hear about your site's recent cabling purchases.
And if the word Aqua reminds you of a late 90s Scandinavian pop group, look no further:
I started my IBM career with very dirty hands.
Every day I would go to work and come home smeared with toner, ink, grease and oil.
No I didn't work for a newspaper or in a garage... I worked for IBM, fixing cheque sorters and printers. This was the late 1980s and early 1990s. The years I spent working on IBM's 3800 and 3825 printers and 3890 cheque sorters were great years. I loved working with my customers and I loved working on those big machines. It was lots of fun... but there were lots of ways to get dirty.
What were these machines? Well for one, the IBM 3800 was the world's first commercial fan-fold laser printer (released in 1975!). Here is a picture, but I would point out that this 3800 looks remarkably clean:
The 3890 Cheque Sorter was an enormous document processor that could move 2400 cheques per minute. For even better clothing destruction, the 3890 had an ink jet printer that used a special ink that you could easily remove from any garment - provided you used a pair of scissors. As for the IBM 3825 Page Printer, it used Charged Area Development, which without very regular maintenance could result in huge amounts of toner wandering around inside the machine. No wonder the acronym for that technology is CAD.
And yet in all of this... I wore a suit and tie to work... every day... and I always wore a white shirt. It was an IBM standard that had existed for a very long time. People who turned up for work in a non-white shirt had better be a top performer and only the most remarkable or safety conscious turned up for work wearing something that is now rare in the workplace: The Bow Tie.
The only other IT organization I knew that was just (if not more) obsessed with suit and tie? EDS.
As for the System/38 utopian image below... that's not me on the right! I never wore tan trousers or short sleeves to work. (Check out the size of those monitors!)
Things changed in the mid 1990s. Suddenly we didn't need to wear a tie. Some of us started wearing corporate branded polo shirts. Times had changed and we changed with them. One irony is that I now regularly wear black business shirts to work, something that I would never have gotten away with in 1990. Yet today the closest I come to toner is when I go and get a printout from the printer.
If you're interested in seeing some great photos of how IBMers used to dress, visit the IBM History exhibit here: "The way we wore: A century of IBM attire". You could also head over to IBM's 100 Icons of Progress and in particular visit The Making of IBM to see Thomas J Watson Snr looking very smooth indeed.
I was brought to reminiscing about this when visiting a client on a Friday. Friday has become casual clothes day at many organizations. And yet given how far we have come... I am pondering why we bother. In comparison to 20 years ago, every day is casual clothes day. Perhaps it's time to put aside the polo shirts and bring back the bow tie? As Dr Who says: "Bow Ties are Cool".
So are you with me? Bow Tie Friday?
Comments always welcome.
There was a time when 32 bits was considered a lot. A hell of a lot.
With 32 bits, you can create a hexadecimal number as big as 0xFFFFFFFF.
In decimal that's 4,294,967,295. Hey... imagine a bank account balance that big?
If you use 32 bits to count out 512 byte sectors on a disk, you could have a disk that's 4,294,967,295 times 512... or 2,199,023,255,040 bytes! That sounds huge, right?
Well... actually... no... that's 2 TiB, which most people would refer to as 2 Terabytes. Mmm... suddenly I am less impressed (still wouldn't mind that as a bank account though).
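The arithmetic above, spelled out as a quick sanity check:

```python
# The 2 TiB ceiling: a 32-bit block address can name at most
# 0xFFFFFFFF (4,294,967,295) sectors of 512 bytes each.
SECTOR_SIZE = 512
MAX_LBA = 0xFFFFFFFF            # largest 32-bit unsigned value

max_bytes = MAX_LBA * SECTOR_SIZE
print(max_bytes)                      # 2199023255040
print(round(max_bytes / 2**40, 2))    # 2.0 -- i.e. 2 TiB
```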
Now there are plenty of running systems that still cannot work with a disk that is larger than 2 TiB. One of the more common is ESX. I am presuming this limitation is going to disappear, so storage subsystems need to be ready to create volumes that are larger than 2 TiB.
The good news is that with the May 2011 announcements, IBM is removing the last 2 TiB sizing limitations from its current storage products. There appears to have been some confusion in the past, so I thought I would go through and be clear where each product is at:
DS3000
Firmware version 07.35.41.00 added support to create volumes larger than 2 TB. The maximum volume size is limited only by the size of the largest array you can create. This capability has been available for some time and hopefully you are already on a much higher release.
DS4000 and DS5000
Firmware version 07.10.22.00 added support to create volumes larger than 2 TB. The maximum volume size is limited only by the size of the largest array you can create. This capability has been available for some time and hopefully you are already on a much higher release.
DS8700 and DS8800
The DS8700 and DS8800 will support the creation of volumes larger than 2 TB once a code release in the 6.1 family has been installed. With this release you will be able to create a volume up to 16 TiB in size. The announcement letter for this capability is here.
XIV
The volume size on an XIV is limited only by the soft limit of the pool you are creating the volume in. This allows the possibility of a 161 TB volume.
SVC and Storwize V7000
These two products have two separate concepts:
- Volumes (or VDisks) that hosts can see.
- Managed disk (or MDisks) that are presented by external storage devices to be virtualized. Within this there are two further categories:
- Internal MDisks created using the Storwize V7000 SAS disks.
- External MDisks created by mapping volumes from external storage (such as from a DS4800).
SVC and Storwize V7000 Volumes (VDisks).
Prior to release 5.1 of the SVC firmware, the largest volume or VDisk that you could create using an SVC was 2 TiB in size. With the 5.1 release this was raised to 256 TiB, as announced here. When the Storwize V7000 was announced (with the 6.1 release) it also inherited the ability to create 256 TiB volumes.
Storwize V7000 Internal Managed Disks (Array MDisks).
Because the Storwize V7000 has its own internal disks, it can create RAID arrays. Each RAID array becomes one MDisk. This means the largest MDisk we can create is limited only by the size of the largest disk (currently 2 TB) times the size of the largest array (16 disks). This means we can make arrays of over 18 TiB in size (using a 12-disk RAID 6 array with 2 TB disks). Thus internally the Storwize V7000 supports giant MDisks. We can also present these giant MDisks to an SVC running 6.1 code and the SVC will be able to work with them.
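As a rough sanity check on that "over 18 TiB" figure (assuming decimal-TB drives and ignoring formatting and spare overheads):

```python
# Largest internal array MDisk: RAID 6 keeps two disks' worth of parity,
# so a 12-disk array of 2 TB (decimal) drives has 10 data disks.
def raid6_capacity_tib(num_disks: int, disk_tb: float) -> float:
    data_disks = num_disks - 2              # two parity disks in RAID 6
    return data_disks * disk_tb * 1e12 / 2**40

size_tib = raid6_capacity_tib(12, 2.0)
print(round(size_tib, 1))                    # 18.2 -- "over 18 TiB"
```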
SVC and Storwize V7000 External Managed Disks.
When presenting a volume to the SVC or Storwize V7000 to be virtualized into a pool (a managed disk group), we need to ensure two things. Firstly, you need to be on firmware version 6.2, as confirmed here for SVC and here for Storwize V7000. Secondly, the controller presenting the volume has to be approved to present a volume greater than 2 TiB. From an architectural point of view, MDisks can be up to 1 PB in size as confirmed here, where it says:
|Capacity for an individual external managed disk|
|Note: External managed disks larger than 2 TB are only supported for certain types of storage systems. Refer to the supported hardware matrix for further details.|
I recommend you go to the supported hardware matrix and confirm if your controller is approved. The links for Storwize V7000 6.2 are here and for SVC here. As of this writing, the list has still not been updated, but I am reliably informed it will include the DS3000, DS4000, DS5000, DS8700 and DS8800. It will not initially include XIV, which will come later. Please also note the following:
- Support for giant MDisks (greater than 2 TiB) is firmware controlled. If the controller (e.g. a DS5300) presenting a giant MDisk is not on the supported list for your SVC/Storwize V7000 firmware version, then only the first 2 TiB of that MDisk will be used.
- If you're already presenting a giant MDisk (and using just the first 2 TiB), then just upgrading your SVC/Storwize V7000 firmware won't make the extra space usable. You will need to remove the MDisk from the pool, then do an MDisk discovery and then add the MDisk back to the pool. All of this can of course be done without disruption, using the basic data migration features we have supported since 2003.
What to do in the meantime?
If you're currently using an SVC or external MDisks with a Storwize V7000, then you need to work within the 2 TiB MDisk limit (except for Storwize V7000 behind SVC). The recommendation is a single volume per array for performance reasons (so the disk heads don't have to keep jumping all over the disk to support consecutive extents on different parts of the disk). This can require careful planning. For instance, using 7+P RAID 5 arrays of 450 GB drives makes an array that is over 3 TB. What to do in this example?
- Divide it in half? (by creating two 1.5 TB volumes)
- Waste space? (a whole 1 TB)
- Use smaller arrays? (a 4+P array of 450 GB disks is 1.8 TB)
The answer is that where possible, create single volume arrays using 4+P or larger. If the disk size precludes that, then create multiple volumes per array and preferably split these volumes across different pools (MDisk groups).
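The numbers behind those options, as a quick sketch (assuming decimal-GB 450 GB drives; real usable capacity will be a little lower after formatting):

```python
# Compare RAID 5 array sizes against the 2 TiB external MDisk limit.
GB, TB = 1e9, 1e12
TWO_TIB_IN_TB = 2 * 2**40 / TB        # the 2 TiB limit expressed in decimal TB

def raid5_tb(data_disks: int, drive_gb: int) -> float:
    """Capacity in decimal TB of a D+P RAID 5 array."""
    return data_disks * drive_gb * GB / TB

print(raid5_tb(7, 450))               # 3.15 -- a 7+P array busts the limit
print(raid5_tb(4, 450))               # 1.8  -- a 4+P array fits under it
print(round(TWO_TIB_IN_TB, 2))        # 2.2
```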
Anything else to consider?
Well first up, will your Operating System support giant volumes? Googling produces so much old material that it becomes hard to nail down exact limits. For Microsoft, read this article here. For AIX check out this link. For ESX, check out this link.
Second of course is the consideration of size. File systems that utilize the space of giant volumes could potentially lead to giant timing issues. How long will it take to backup, defragment, index or restore a giant file system based on a giant volume (the restore part in particular)? Outside the scientific, video or geo-physics departments, are giant volumes becoming popular? Are they being held back by practical realities or plain fear? Would love to hear your experiences in the real world.
And a big thank you to Dennis Skinner, Chris Canto and Alexis Giral for their help with this post.
As you would expect, the IBM XIV supports a very wide range of Host Operating Systems. Even better, for most of these Operating Systems, IBM makes available (free-of-charge) a multipathing kit to install on these hosts. We call this the Host Attachment Kit, or HAK. You can find all of the available Host Attachment Kits at the IBM Support site found here. You will find HAKs for AIX, HP-UX, Linux, Solaris and Microsoft Windows.
What is important is that if the HAK is available for your Operating System, we need you to always install it on every host that attaches to IBM XIV. We ask this for the following reasons:
- By having the XIV HAK installed, your hosts are much easier for IBM to support. This is because installing the HAK ensures that your multipathing is set up correctly. When you install the HAK and then run the xiv_attach command, the HAK will adjust system parameters to optimal values. For example, on Windows hosts it ensures that the required MPIO service is running and that the recommended hot fixes are installed. For Linux hosts it ensures that the multipath.conf file is correct. Every time you map a new volume from your IBM XIV, you should run xiv_attach to ensure you continue to have the correct settings.
- If you have an issue that requires IBM support, the HAK supplies a command known as xiv_diag. This command creates a zipped host log file that will contain useful and relevant information for IBM to analyze.
- The HAK supplies a very valuable command known as xiv_devlist which lets you list all attached volumes and match the host ID to the XIV volume name. If your host is attached to multiple XIVs, you can also map each volume back to its relevant XIV. It's a command I cannot live without... I love it!
Here is an example of what xiv_devlist will tell you. In this example I have run it on a Windows 2008 machine, but the output is basically the same regardless of host operating system. You can see the operating system identifier (the Device as reported by the operating system, in my example PHYSICALDRIVE0), the name of the volume (as seen on the XIV, in my example W2K8X64-H02_BOOT - Exchange) and the serial number of the XIV providing the volume (in my example 6000081).
The operating system device identifier lets you map an XIV volume from XIV to host. So in this example, I know that the Windows C: drive (which is Windows Disk 0) maps to a volume on the XIV known as W2K8X64-H02_BOOT - Exchange.
And to finish, there are several other commands that are very helpful. For instance the xiv_fc_admin -P command will tell you your WWPNs.
C:\Windows\system32> xiv_fc_admin -P
21:00:00:0d:60:13:b0:8c: [QLogic IBM FCEC Fibre Channel Adapter]: IBM FCEC
21:00:00:0d:60:13:b0:8d: [QLogic IBM FCEC Fibre Channel Adapter]: IBM FCEC
Another useful command is xiv_fc_admin -R because it rescans your bus. In some operating systems it is not obvious how to do this (other than reboot of course).
The nice thing is that regardless of your host operating system, the commands are the same. This is possible because they use the Python programming language. You may notice Python being installed as xpyv when you install the HAK (it is so named to ensure it doesn't interfere with any other Python installs you have).
So please install the HAK on every host that attaches to XIV. You will be making everyone's life a lot easier (especially your own).
Oh and by the way, you can confirm whether your Host Operating System can be attached to the XIV by consulting the IBM System Storage Interoperation Center (or SSIC). If the HAK is not available for your Operating System, the SSIC will list other Vendor approved multipathing solutions (such as Veritas DMP).
Hi Team! Just wanted to let everyone know that VisioCafe has been updated with IBM's latest official stencils for use with Microsoft Visio. These include all models of the Storwize V7000, including the newest models: The 2076-312 and 2076-324 (which have the dual port 10 Gbps iSCSI card).
Here is the link to VisioCafe. The Storwize V7000 stencils are in both the IBM-Disk as well as the IBM-Full packages.
Here is a screen capture of the Node Canisters in the 2076-324. I have circled one of the shiny new 10 Gbps iSCSI cards.
So please stop using the stencils I previously supplied on my IBM developerWorks blog and switch to the official set.
And if you have some examples of Visio diagrams that include the Storwize V7000 I would love to see (and share) them.
I found a link to a great video on Jason Boche's Virtualization blog and I thought I would post it here as well.
What the video shows is 70 minutes worth of take-offs and landings at Logan International Airport in Boston, compressed into 150 seconds. It's an amazing piece of footage and very cleverly done. Seen anything equally clever? Would love to hear about it. Enjoy!
A quick blog post about XIV call home..... As with most IBM products, the XIV can call home to IBM using e-mail notifications. I still meet people who call this dial-home, which reflects the 20th century practice of using modems to provide a Remote Support Facility (RSF). The e-mail notifications sent by the XIV allow IBM to track any issues that may occur and respond where appropriate.
This is all good, provided IBM know how to get hold of you if there actually is an issue. I had a situation recently where our internal client records had an out-of-date phone number. This led to a delay in problem resolution, a delay which was avoidable.
One way to help prevent delays is by keeping the XIV up to date with your contact details and as usual, the XIV GUI makes this easy.
From the XIV GUI, head to the Support menu as per the screen capture below:
From there you will find several tabs, three of which are well worth filling in, these being:
- Customer Information: Where is the machine?
- Primary Contact: Who should IBM try contacting first?
- Secondary Contact: Who should IBM try contacting second?
Actually don't hesitate to fill in ALL the tabs, but the point of this exercise is to at least ensure IBM know where the machine is and who to call.
It's worth ensuring the XIV is updated if your support center phone numbers change, or if you relocate the machine to a different site. At some client sites, I find the primary contact is a single person (whose mobile number sadly ends up being the 24 hour storage help desk). If you are that person.... and you're leaving the company.... ensure your name and number get updated by your replacement. After all, it's one thing to have IBM calling you at 3am when you manage the machine... but to be rung after you have left the company? Mmmm... that's just plain annoying.
It's a story told many times.....
You order a new storage solution and the world is good.
It's lovely, it's new and it offers mountains of new disk space.... but then... you.... fill it up!
So it's off to order some new disks.
The order is in, the order is filled, the disks arrive.
What next? How about we just stick them in?
Just insert the new disks and they will be made available to configure into RAID arrays from the Internal tab of the Physical Storage group.
If the drives are showing as Unused, mark them to be Candidate. If they are already showing as Candidate (like most of the disks in my example below), then you are ready to hit the Configure Storage button and follow the guidance of the Wizard.
Of course maybe your enclosures are all full. In this case it's time to order another enclosure (remember we can have up to 10). Once you have racked up the new enclosure and cabled it to the correct SAS chain, use the Add Enclosure menu item shown below to kick off the configuration:
Just a very quick blog post to point you to another blog post that I found particularly pleasing. To be fair, the independent judges on the panel were the ones that selected IBM, rather than EMC itself.... but the recognition is well deserved. Enjoy!
Storage IT offers up many choices, some of which provoke argument so heated, you could almost describe the adherents as religious. I think you might know the sort of arguments I am talking about:
- File vs Block I/O
- iSCSI vs Fibre Channel
- CLI vs GUI
OK.... so maybe that last one isn't quite in the same league. But it is still fascinating to see the variation in usage patterns from sites where every command (of any description) is run via a command line interface (a CLI), to sites where the CLI is viewed with either great fear... or even greater distaste. There are those who view the CLI as... well... so 1970s.....
But the reality is that the CLI will always be with us for one principal reason: scripting. If you cannot script it, you cannot automate it (well actually that's not true, but stick with me here, I am on a roll). Every single major implementation I have ever done (whether it be SVC, XIV or DS8000), I have automated with scripting. I regularly use the concatenate command in Excel to build large numbers of commands that I can then run as a script.
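That Excel concatenate trick translates naturally to any scripting language. Here is a minimal sketch in Python: the volume names and pool are hypothetical, and mkvdisk is shown in SVC/Storwize style purely as an illustration.

```python
# Build a batch of CLI commands from a list of volume names, the same idea
# as using concatenate in Excel. Names and pool are hypothetical examples.
volumes = ["APP01_DATA", "APP01_LOG", "APP02_DATA"]
script = [
    f"mkvdisk -mdiskgrp Pool0 -size 100 -unit gb -name {name}"
    for name in volumes
]
print("\n".join(script))
```

You can then paste the generated lines into an SSH session, or save them to a file and feed them to the CLI in one go.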
So it's pleasing to see that all of our products are working towards making the scripter's life even easier. For example the XIV has offered a command log in the GUI for some time. I blogged about it here. You simply do a command once in the GUI and then consult the log to find the syntax, making scripting very easy:
With last year's release of SVC 6.1 and Storwize V7000, we added this level of smarts to those two products as well. Now every command you run in the GUI will offer you the exact CLI command that was used under the covers to do this work. Simply toggle the details tab on the completion panel to see the command (or toggle it back to hide it!).
This week's announcement of release 6.2 of the SVC and Storwize V7000 firmware has brought in two more important usability improvements:
- Now when logging onto the CLI using individual user-ids, you can logon using the actual user-id itself, rather than admin. This change has been a long time coming and removes the confusion generated by logging onto the GUI as, say, anthony, but then logging into a matching CLI session as admin. Now you would logon to either interface as anthony.
- Now when issuing CLI commands, you have the choice to drop the svctask and svcinfo headers. So instead of issuing the command svcinfo lsnode, you can issue the command lsnode. Both choices remain valid (so we don't break your existing scripts). Making this change is part of a bigger plan to move to a more common CLI.
And there are more improvements coming, so as always, watch this space....
.... and please... share with me... are you a GUI... or a CLI person? What's your reasoning behind your choice?
*** Updated 25/07/2011: The VAAI plugin can be downloaded from here: http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=ibm~Storage_Disk&product=ibm/Storage_Disk/IBM+Storwize+V7000+(2076)&release=6.2&platform=All&function=all ***
The May 9 announcement that SVC and Storwize V7000 will support VAAI is very welcome news. The fundamental point is that the SVC and Storwize V7000 virtualise external storage. This means that the mountains of DS3000, DS4000, DS5000, AMS1000s, CX3s, etc, that are currently being virtualized behind these products, will inherit VAAI as soon as the virtualization layer supports it. This is yet another feature to add to the list of functions that IBM Storage virtualization can provide, such as: EasyTier; Thin Provisioning; multiple consistency groups; snapshots; remote mirroring; dynamic data relocation... the list goes on.
In addition we are releasing a plug-in for vCenter that enables VMware administrators to manage their SVC or Storwize V7000 from within the VMware management environment.
Functions will include:
- Volume provisioning and resizing
- Displaying information about volumes
- Viewing general information about Storwize V7000 and SVC systems
- Receiving events and alerts for Storwize V7000 systems and SVC attached to vSphere
- The Storwize V7000 and SVC plug-in for vCenter will also support virtualized external disk systems
The plug-in will be available at no charge on June 30 (for Version 6.1 software) and July 31 (Version 6.2). Here is a sneak peek of what it will look like:
And to get an independent viewpoint have a read of Stephen Foskett's blog entry here:
With the announced release of DS8000 6.1 code, IBM has moved its three major storage systems to a common GUI platform. This makes me think of aircraft manufacturers who utilize a common cockpit design. For airlines, this is a major drawcard when choosing aircraft models. It cuts down on training costs for your pilots. Except in storage IT, there is a major difference in motivation....
First and foremost, the design of the XIV GUI (which has inspired such dramatic change in IBM's other GUIs) was made possible not by clever XIV GUI developers (don't get me wrong - they ARE clever), but by a remarkably user-friendly architecture. The XIV GUI is a miracle of ease-of-use for end users, made possible because, by design, the XIV made it almost impossible to make things hard.
The good news for Storage administrators, is that unlike a jet aircraft, where a pilot needs to spend hundreds of hours in the cockpit before they are considered potentially competent, the XIV GUI can be picked up in minutes and lends itself very well to casual contact. You don't need to keep using it to stay competent.
The challenge for IBM was to take more complex products, which require more user decisions, and make the usage experience just as easy. To add to this, the SVC and DS8000 GUIs were driven by WebSphere. Changing these GUIs would require a complete re-write to employ JavaScript.
First off the rank were the SVC and Storwize V7000. With the release last year of the SVC 6.1 update, the transformation was nothing less than remarkable. End user experience ruled every decision. The key again is that the user does not need to spend hundreds of hours learning this GUI or re-learning it every time they go to perform a configuration task. Everything is in its right place. It's much more than an XIV-like GUI. It's a GUI that took the ease-of-use experience of the XIV and used it to inspire something just as remarkable.
With the release of the 6.1 update for the DS8000, we complete another fundamental step towards a truly common GUI. The DS8000 GUI has undergone a complete re-write. Essentially it has been rebuilt from the ground up. This highlights something fundamental: It confirms the DS8000 has a very strong roadmap.
As you can see from the image below, the transformation from the old design (to the left) to an ease of use model is complete:
In short, it's a common flight deck that almost anyone can fly.
Here is a list of all the IBM Asia Pacific and Japan Announcement Letters that were released on May 9. They are in several sections:
New disk drive option for IBM System Storage DS3950 Express Disk Systems
IBM System Storage DS3500 Express Storage System supports next-generation, high-performance 10Gb iSCSI technology
IBM Scale Out Network Attached Storage 1.2.0 supports multiple petabytes of storage
IBM Information Archive offers a new Server (2231-S3M), Disk Controller (2231-D3A), and Disk Expansion Drawer (2231-D3B)
IBM System Storage DS8700 and DS8800 (M/T 239x) delivers DS8000 Function Authorization for I/O Priority Manager and other advanced features
New disk drive option for IBM System Storage DS5020 disk systems
IBM System Storage N series N6270 offers enterprise-class Fibre Channel, iSCSI, and NAS storage with gateway options
IBM System Storage N series function authorizations for IBM System Storage N6270
IBM System Storage DS8700 and DS8800 (M/T 242x) delivers DS8000 I/O Priority Manager and advanced features to enhance data protection for multi-tenant copy services
IBM System Storage EXN3500 SAS expansion unit provides storage for IBM System Storage N series PCIe systems
IBM System Storage DS5000 series supports next generation, high-performance 10Gb iSCSI technology
IBM System Storage Tape Cartridge 3599 models provide enhanced capacity for enterprise tape drives
IBM Virtualization Engine TS7700 is designed to bring efficiency to tape operation and offer versatile models that support attachment to tape libraries
New features for IBM System Storage TS7650 ProtecTIER Deduplication Appliance (3958 AP1) and IBM System Storage TS7650G Gateway Server (3958 DD4)
IBM System Storage TS1140 Tape Drive Model E07 delivers higher performance, reliability, and capacity
IBM System Storage TS3500 Tape Library Connector and TS1140 Tape Drive support for the IBM TS3500 Tape Library
New IBM System Storage SAN Volume Controller Storage Engine offers 10 Gigabit Ethernet connectivity
New IBM Storwize V7000 Disk System models 312 and 324 offer 10 Gigabit Ethernet connectivity
IBM Scale Out Network Attached Storage Software V1.2.0 for high availability environments
IBM announces many-to-many, bi-directional replication for IBM System Storage ProtecTIER Enterprise Edition V3.1 and ProtecTIER Appliance Edition V3.1
IBM System Storage ProtecTIER Entry Edition Version 3.1 supports many-to-many, bi-directional data replication
IBM System Storage Linear Tape File System Library Edition Version 2.1
IBM Storwize V7000 Version 6.2 delivers support for VMware VAAI, real-time performance monitoring, and 10 Gigabit iSCSI connectivity
IBM System Storage SAN Volume Controller Version 6.2 delivers support for VMware VAAI, real-time performance monitoring, and 10 Gigabit iSCSI connectivity
There are several withdrawals, but these are only because replacement products have been announced above.
Hardware withdrawal: IBM N series N6060 (2858 Model A22) and N6070 (2858 Model A21) -- Replacements available
Hardware withdrawal: IBM TS7740 (3957) Model V06 and IBM TS7720 (3957) Model VEA and associated features - Replacements available
Hardware Withdrawal: Select models and features for Information Archive (MT 2231) - Some replacements available
Hardware withdrawal: Feature number 3447 from IBM System Storage TS7650 and TS7650G ProtecTIER solutions - Replacement available
Hardware withdrawal: IBM Scale Out Network Attached Storage Models 2851-SI1 and 2851-SS1 - Replacements available
Hardware withdrawal: IBM System Storage SAN Volume Controller 2145 Model CF8 - Replacement available
It's that time of the year again - announcement time! And the May 9 set of storage announcements by IBM is one of the richest sets of announcements I have ever seen. Practically every storage product has received updates, with new features stretching from tape drives to tape libraries to disk system updates (from the smallest system to the largest system). We have NAS updates and we have storage virtualization updates. I struggled to decide which subject to start on, to do justice across the board. So let me first list just some of the products that have received updates:
TS1140 - Super fast, massive capacity, enterprise tape technology.
For many years IBM has been using its own technology (as an alternative to LTO) to offer clients a higher class of enterprise tape. The TS1140 is the fourth generation of this technology. Using the new JC media, which has 4 TB of native capacity, and presuming a compression ratio of 2.5 to 1, you could place 10 TB of compressed data onto a single cartridge. And you could do this at 250 MBps sustained, which according to Oracle makes the TS1140 the fastest tape drive in the world! The TS1140 will happily burst at up to 650 MBps - so we now have a tape drive that can truly utilize an 8 Gbps fibre channel port. It reinforces the green credentials of tape by using only 46W of power and supports LTFS, the Linear Tape File System, which leads me to....
LTFS - Linear Tape File System
Speaking of LTFS, we have enhanced the LTFS standard to now support tape libraries. So get this idea.... you attach a tape library to your server. All the tapes in the library appear to the operating system as directories. You can select any of these directories and the library will open it up (i.e. mount the tape). Now the contents of the tape itself appear as a directory structure, from which you can add or remove files. In other words, the library and the tapes can be manipulated without any form of backup software sitting between you and the operating system. After the initial tape mount, the directory is locally cached, so you don't need to mount the tape again to see what is on it (and to search the directory). This whole concept has the most amazing potential use cases.
IBM has a truly fantastic tape library with the TS3500. Now we add the ability to shuttle tapes between aisles to create a larger logical library. How do you like the idea of a logical tape library that can hold 300,000 cartridges totaling 2.7 exabytes?
The IBM TS7700 is our Mainframe virtual tape library solution. It gets a major performance boost with the introduction of Power 7 servers plus many other improvements.
In terms of disk we have enhancements to the following products:
DS8000 Family with release 6.1
When we released the DS8800 last year, we committed to deliver a merged code library which would support both DS8700 and DS8800. This would ensure that they both have the same feature set. We now deliver on that commitment, plus supply an enormous set of new features and functions for both products: so both products continue to get major enhancements and updates. These include:
Easy Tier enhancements: Any two disk technologies can now be placed in a pool
I/O Priority Manager: Which allows for quality of service management.
Multi-tenancy management: Allows for the creation of separate Copy Services domains.
Larger LUN sizes: Gives the ability to create LUNs up to 16 TiB in size.
Enhanced GUI: We will now have a common GUI for DS8700, DS8800, Storwize V7000, SVC and XIV.
8Gb/s host adapters for the DS8700
V7000 and SVC Family with release 6.2
The IBM SAN Volume Controller and Storwize V7000 share a common code library, so improvements are common. In the 6.2 release we deliver the following enhancements:
Flash Copy Improvements: Allow remote copies of flashcopy targets
SVC 2145-CG8 Node: New hardware model
10 Gb iSCSI: For both Storwize V7000 and SVC
SVC Solid State Drive Support: Allowing SVCs to use internal SSDs for EasyTier
VMware VAAI: All three VAAI primitives now implemented.
Real Time Performance Statistics: A new GUI panel giving performance info.
Storwize V7000 System Clustering: Allowing us to cluster two Storwize V7000s together.
The DS3500 is IBM's entry level disk rocket ship. I am a huge fan of this box for clients with smaller or point solution requirements. We have enhanced the product with the following:
Double the drives: We now support 192 drives.
Scheduled flashcopies: Gives the ability to have scheduled flashcopies run without external intervention.
Improved volume copy: Gives the ability to create a volume copy without stopping host access.
10 Gb iSCSI: Allows us to add 10 Gb iSCSI to the DS3500.
The DS5000 range consists of DS5020 (also sold as DS3950), the DS5100 and the DS5300. Improvements include:
Scheduled flashcopies: Gives the ability to have scheduled flashcopies run without external intervention.
Improved volume copy: Gives the ability to create a volume copy without stopping host access.
10 Gb iSCSI: Allows us to add 10 Gb iSCSI to the DS5100 and DS5300
T10-PI: Allows selected operating systems to add meta data to track write integrity.
SAS drives: We are adding a 600 GB SAS drive that has a SAS to FC interposer so it can be installed in an EXP5000
I have not listed all of the product announcements. There are improvements to SONAS, our N series products, Information Archive, the Real-time Compression device.... the list goes on.
I will write up another post with all the links....
I had some fun with my wife's computer this weekend.
She called me over because she was getting multiple messages telling her that the hard drive was failing, all being delivered by a very fancy GUI that looked like this:
I became suspicious immediately: Microsoft have never produced a GUI that looks so slick. Another big hint was that the Help & Support button tried to take me to a very strange URL. I say tried because her machine by this point was close to being a vegetable. The All Programs tab contained nothing, there were no desktop icons and the C: reported that it contained no files. We could not browse to the NET because all icons to start a browser were gone and even when I started a browser manually (from Start --> Run), the browser was set to use an unusual proxy.
Fortunately Doctor Google was very helpful and I rapidly found this URL:
I used the tools and instructions found there and was able to get her computer back into a working state. Many thanks to the authors of that page.
This experience brought home three lessons:
- Her employer's anti-virus is useless (her laptop runs a corporate load).
- Google Images searches can return poisoned URLs that contain malware. Have a read of this excellent article. My wife was doing a Google Images search, looking for pictures of Wheat Rust, when the infection occurred. I am loath to work out which URL it was, as I don't wish to risk a return to any of those poisoned sites.
- Using NoScript is a very good idea, and one that I will be implementing on her PC, especially until her employer comes up with a better anti-virus regime.
All of this excitement distracted me from the main event, preparing for the May 9 announcements. You will see a lot of blog posts over the next few days detailing what our developers have been up to. Prepare to hear about some very cool stuff.
In the meantime... feel free to share any other methods you have to avoid malware... and download and install MalwareBytes. It is a very nice piece of software that costs nothing to install and use.
I recently had a client ask me if I had seen this problem in Cisco Device manager: Device Manager was showing them 100% utilisation for CPU on one of their MDS9509s. I had a look at the show tech-support and curiously show process cpu showed practically no CPU usage at all. I suggested a display problem and sure enough, Cisco confirmed it:
Symptom: The show system resources command shows high CPU usage even when there is not
much activity on the switch. In one instance, the CPU utility (user and kernel)
was always 100 percent.
Conditions: You might see this symptom 248 days after the system came up
Curiously the Cisco tech support person stated that in fact a CP switchover every 497 days would prevent the issue reoccurring. This is curious because 248 days is close to half of 497 days. And 497 is IT's number of the beast.
The reason that 497 is a problem number is because of the use of a 32 bit counter to record uptime. If you record a tick for every 10 msec of uptime, then a 32-bit counter will overflow after approximately 497.1 days. This is because a 32 bit counter equates to 2^32, which can count 4,294,967,296 ticks. Because a tick is counted every 10 msec, we create 8,640,000 ticks per day (100*60*60*24). So after 497.102696 days, the counter will overflow. What happens next depends on good programming.
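The arithmetic above is easy to verify yourself; a quick sketch:

```python
# A 32-bit counter incremented every 10 ms (100 ticks per second)
# overflows after roughly 497.1 days.
ticks_capacity = 2 ** 32               # 4,294,967,296 ticks fit in 32 bits
ticks_per_day = 100 * 60 * 60 * 24     # 8,640,000 ticks per day
days_to_overflow = ticks_capacity / ticks_per_day
print(round(days_to_overflow, 6))      # 497.102696
```

What happens when the counter wraps back to zero is, as the post says, entirely down to how carefully the surrounding code was written.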
Some classic bugs can be found here, here, here and here. Most of these bugs are old and will almost certainly not affect anybody. But remain on notice: 497 day bugs are still possible. Just Google the search argument: 497.1 day bug.
Now let me be clear: I am not aware of any active disruptive, bring-down-your-business type 497 day bugs. The sky is not falling. But historically many vendors' products have had 497 day bugs, some of them nasty. I ponder whether we should schedule a switch reboot every 496 days just to avoid the possibility of a 497 day bug. It's an interesting idea. I certainly endorse staggering initial switch reboots by at least an hour, so that a simultaneous 497 day reboot bug (should one be lurking) would not reboot every switch in every fabric at the same time. And in case you think I am picking on Cisco, when I looked at the client switch in question, it was showing a kernel uptime of 562 days, 23 hours, 35 minutes, 24 seconds. That's some solid uptime.
Back from a short break (for Easter and the School Holidays) to three great pieces of news:
- A new series of Doctor Who is screening.
- Will and Kate's wedding went off without a hitch (it's not often I get to yell at the dog to stop barking at possums because there is a Royal Wedding on).
- The VAAI driver for XIV has an official download link.
Ok... maybe the Royal Wedding has no place in my blog, but the VAAI link is very appreciated.
Two ways to get to the driver:
- Get it directly from here
- Go to fix Central and select it from the download list: http://www-933.ibm.com/support/fixcentral/
Remember, your XIV needs to be on 10.2.4a firmware, so you need to be talking to your IBM Service Representative to schedule a concurrent firmware update before you turn the VAAI functions on.
Now if you're going, um... what is VAAI and how does it help? Check this blog post out:
If you're asking, hey what else will 10.2.4a code bring me?
- How about better write performance?
- How about QoS?
- 10.2.4a code also brings the ability to do 'truck' initialization of an async pair (which lets you pre-load an async secondary for faster initial mirroring, or to convert from sync to async without re-mirroring all your data).
- It also lets you format a snapshot, which means you can keep a snapshot in place and mapped to a host, but it will not consume any space.
Last week IBM released Version 2 of the management plug-in for VMware vCenter. The main benefit of Version 1 (the previous release) was that it allowed you to map your datastores to XIV volumes (i.e. which XIV volume equates to which VMware datastore). This was very handy (especially if you were not paying attention as you allocated volumes to your VMware farm), but you still needed the (very easy to use) XIV GUI as well as (obviously) vCenter to manage your landscape end to end.
With the release of Version 2 of the XIV plug-in, we suddenly have the tantalizing possibility that the VMware administrator will not need to talk to their storage administrator or turn to the XIV GUI for day to day operations.
Well Version 2 offers a new and improved graphical user interface (GUI), as well as brand new and powerful management features and capabilities, including:
- Full control over XIV‐based storage volumes (LUNs), including volume creation, resizing, renaming, migration to a different storage pool, mapping, unmapping, and deletion.
- Easy and integrated allocation of volumes to VMware datastores, used by virtual machines that run on ESX hosts or datacenters.
- The ability to monitor capacity, snapshots and replication.
So from vCenter you can now for instance map yourself some new volumes to create data stores, or re-size existing ones. You can also confirm that each of your datastores is being mirrored.
You can get the plug-in free of charge from here:
There is a users guide here. I urge you to download it and have a read. The Users Guide contains lots of really good examples of how the plug-in can be used with some great screen captures. The release notes are here and also make for very good reading.
I honestly think every VMware installation should be using this plug-in. But I am curious about how it will affect the responsibility divide. If you're a one-person shop, the chances are that you love your XIV quite simply because you don't need to administer it. The XIV leaves you free to focus on your VMware farm, rather than fret about hot spots or hot spares or RAID groups. For you, this plug-in just makes your life even easier.
But what about larger companies? Firstly, it's important to understand that to perform storage administration, the vCenter plug-in will need an XIV userid that has Storage Admin privileges. Why is this significant? Well what if the team who manage the XIV and the team who manage VMware are not the same people? What if they are different teams, who maybe have different managers, who may work in different buildings or different cities? What if they work for different companies? Do plug-ins like this one erode the lines and bring these teams together? Or are the functional divides still too strong?
I would love to hear your experiences, both in using the plug-in.... and tearing down the walls.
For someone who blogs so frequently about the IBM XIV, I will let you in on a little pet hate of mine: The XIV uses decimal volume sizes.
The XIV GUI and CLI has the user create volumes using decimal sizing, meaning 1 GB = 1,000,000,000 bytes (1000 to the power of three).
Nearly every host system out there (i.e. Windows, AIX, Linux, VMware, Solaris) display volume sizes in binary, meaning 1 GiB = 1,073,741,824 bytes (1024 to the power of three).
This disparity has a quirky consequence. If the XIV says a volume is 17 GB, the host that uses that volume says it is 16 GiB (which the host often then mis-states as GB). This doesn't mean there is a loss of space; this isn't headroom or formatting - it's just a different way of counting bytes. It's not a roadblock and it's easy to understand and work with. But it is a little annoying. (Then again, so is my 32 GB iPhone reporting it has 29.3 GB of space.)
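The arithmetic behind the disparity is simple enough to sketch:

```python
# The same volume counted two ways: decimal GB (10^9 bytes)
# versus binary GiB (2^30 bytes).
GB = 1000 ** 3
GiB = 1024 ** 3
size_bytes = 16 * GiB        # the 16 GiB the host reports
print(size_bytes)            # 17179869184 bytes
print(size_bytes / GB)       # 17.179869184 -> displayed by the XIV as 17 GB
```

Same bytes, two units: no space is lost anywhere, it is purely a difference in how the bytes are counted.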
The other point is that the IBM SVC, Storwize V7000, DS8000 and DS3000/DS4000/DS5000 families have always used binary sizing (even if their respective interfaces use the term GB as opposed to GiB - yet another pet hate of mine and the Storage Buddhist).
So what's the point of this rant?
The IBM XIV Storage System management GUI (version 3.0) will support the creation of volumes in gigabytes (GB), gibibytes (GiB) or blocks (where each block is 512 bytes).
So this is a really good change.
The new GUI has not hit the download site yet... but I will be sure to tell you as soon as it has!
*** Update 08/09/2011 - corrected GUI version from 2.5 to 3.0, removed some confusing terms ***
I have some great news regarding VAAI support for XIV.
Let me detail the current situation:
- VMware has approved the IBM driver for VAAI and we can now release it to the public. The IBM_VAAIP_MODULE plugin will be available shortly from the ibm.com website. When the release URL is available I will update this post. In the meantime you can get the driver from your XIV TA (Technical Assistant) or IBM Account Team. If they have somehow missed the news, get them to talk to their XIV Product Manager (or they can always talk to me!).
- The VMware Hardware Compatibility guide found here shows that VMware support the three VAAI primitives with XIV, if you are using ESX 4.1 or ESX 4.1 U1 and your XIV is on firmware release 10.2.4 or higher.
- XIV firmware release 10.2.4a is available for install. Installation of this firmware is non-disruptive (concurrent) and will be performed by IBM.
- The VAAI driver and installation of the 10.2.4a code are all supplied free of charge.
So what should your plan be?
- Ensure VAAI is disabled on your ESX hosts.
- Talk to your local XIV TA or IBM Service Representative (SSR) and arrange to have 10.2.4a firmware installed.
- When 10.2.4a code is installed, you can then begin installing the VAAI driver on each of your ESX 4.1 servers. You will need to reboot each server to install the driver.
A question I get routinely asked relates to Windows disk partition alignment with XIV. If you don't know what I am talking about, take some time to read these very useful pages from our friends at Microsoft. Once you have had a look, come on back and read my perspective.
Disk Partition Alignment (Sector Alignment): Make the Case: Save Hundreds of Thousands of Dollars
Back already? Hopefully you now know that disk partition alignment is all about starting an IO at a logical block address that best matches how the underlying hardware stores your data. So now you're wondering: what does this have to do with XIV? Well, XIV has two concepts that relate to this: cache and partitions.
XIV cache (the server memory used to speed up reads and writes) is organised into 4 KB blocks (which is nice and small).
So the XIV cache does not care about disk alignment.
But when it comes to writing to and reading from disk, the XIV writes data into chunks of consecutive logical block addresses (LBAs) that we call partitions. These partitions are 1 MiB in size. What does that mean? It means the magic number for XIV is 1024 KB or 1 MB (actually KiB and MiB, but for ease I will stick to the naming used by Microsoft). Given this number is fairly large (other hardware often aligns to 32 KB, 64 KB or 256 KB), the potential impact of misaligned partitions on XIV is reduced. Which is good.
Correct Windows disk alignment could give up to a 7% performance improvement when using an offset of 1024 KB (1 MiB). I need to be clear: that's not a guaranteed improvement of 7%. It's a maximum possible improvement. Your particular server will see an improvement somewhere between 0% and 7%, depending on your workload patterns. The smaller and more random your workload, the more useful setting the 1024 KB offset will be. The more sequential your workload, the less useful it will be, as only the first and last parts of an I/O could potentially be misaligned. This misalignment could equate to a tiny percentage of extra work for the XIV. Sadly there is no metric you can display to detect how much impact misalignment is actually having.
So should you do it? The good news is that new volumes created under Windows 2008 prefer the 1 MB boundary. So a fresh install should already be using the correct values. The bad news is that volumes created under earlier Windows operating systems (such as Windows 2000 and 2003) will almost certainly be misaligned, and correcting the alignment is destructive to the data in the partition.
How to check alignment at the host? Here is an example:
I start diskpart:
Microsoft DiskPart version 6.1.7600
Copyright (C) 1999-2008 Microsoft Corporation.
On computer: ANTHONYV-PC
I list my disks. In this example I have two disks installed in my laptop. I select disk 0:
DISKPART> list disk
Disk ### Status Size Free Dyn Gpt
-------- ------------- ------- ------- --- ---
Disk 0 Online 238 GB 5724 MB
Disk 1 Online 232 GB 1024 KB
DISKPART> select disk 0
Disk 0 is now the selected disk.
Now I list the partitions and see the offset for each one.
DISKPART> list partition
Partition ### Type Size Offset
------------- ---------------- ------- -------
Partition 1 Primary 100 MB 1024 KB
Partition 2 Primary 232 GB 101 MB
Partition 1 has an offset of 1024 KB, which is 1 MB, which is perfect for XIV. Partition 2 has an offset of 101 MB, which is still on the 1 MB boundary (it was pushed there by the combination of the first partition's size, 100 MB, and its offset, 1 MB). So this is perfect.
For an example of how to create a partition with the correct offset, check out this how-to document, which also provides some good follow-on reading:
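If you would rather check alignment programmatically than eyeball diskpart output, the arithmetic is simple. A minimal Python sketch, using the offsets from the listing above:

```python
MIB = 1024 * 1024  # XIV writes in 1 MiB partitions

def is_aligned(offset_bytes, boundary=MIB):
    """True if a partition offset sits exactly on the given boundary."""
    return offset_bytes % boundary == 0

# Offsets from the diskpart listing above
print(is_aligned(1024 * 1024))        # Partition 1: 1024 KB -> True
print(is_aligned(101 * 1024 * 1024))  # Partition 2: 101 MB -> True

# A classic older-Windows default: 63 sectors x 512 bytes = 32256 bytes
print(is_aligned(63 * 512))           # -> False (misaligned)
```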
What about other IBM products?
The IBM SVC and Storwize V7000 prefer 64 KB (or larger) offsets, as documented here:
Why? Because the SVC and Storwize V7000 use a concept of grains, where each grain is usually 64 KB or 256 KB in size.
The DS8000 (regardless of model) also prefers 64 KB offsets. The DS8000 uses the concept of logical tracks, where each logical track is 64 KB.
The DS3000/DS4000/DS5000 range allows the user to set the segment size of a logical volume on creation. The partition offset you define should match the segment size defined for the logical drive being used. In the example below, it is 64 KB.
What about VMware?
The answers are no different. Misalignment can indeed make a difference to client performance. Check this link from NetApp and this document from VMware:
For an EMC perspective, check out this link from someone I respect a great deal, Chad Sakac:
I searched around looking for an image to highlight the theme of alignment. I found this image in the IBM archives for the IBM Mass Storage Facility announced back in 1974. I am sure this product had some interesting alignment challenges.
(edited 24/5/2011 --> removed old Visio Stencils link).
VisioCafe has been updated with IBM's latest official stencils for use with Microsoft Visio. These include all models of the Storwize V7000, including the newest models: The 2076-312 and 2076-324 (which have the dual port 10 Gbps iSCSI card).
Here is the link to VisioCafe. The Storwize V7000 stencils are in both the IBM-Disk as well as the IBM-Full packages.
Remember you can also find my XIV stencils here:
Requests for Visio stencils are one of the most common comments I receive.
More are coming so your requests are being heard!
Over on my Wordpress blog, I have posted an entry on migrating a Linux RHEL host from EMC to XIV.
If that subject interests you, check out my article here:
The XIV 10.2.4 release notes report performance improvements that are worth investigating. Two of the listed improvements are:
- Improved write hit performance with small blocks
- Improved write caching performance
I visited a client running 10.2.4 to see if these could be detected in the XIV performance statistics. In this client's case, the upgrade occurred on Feb 14. First up I wanted to show that in the period I am examining there was no major variation in write IO. In other words, I wanted to confirm the client performed the same level of IO before and after the code load.
Having confirmed that the write IOPS did not vary over the period in question, did the latency change? Here we have some good news. Firstly the latency for Write Hits improved (slightly). A write hit is a write into a 1MB partition that already has some data in cache. It is faster than a write miss because some of the address allocation work has already been done. Write hits and misses both hit cache as I explained here. You can see a change on Feb 14 (when the code was updated):
I then looked at the latency for write misses. Again the latency dropped. This suggests that cache operations in general are being handled faster.
I then started thinking: are we getting more write cache hits? The answer was YES! This is curious because the client normally does not have much control over where they actually write data... Clearly the XIV firmware is managing the write cache more efficiently. This is good not only because write hits normally have lower latency than write misses, but also because a write hit can save us destaging a block of data to disk. This is because a write hit could involve over-writing data that had not yet been destaged to disk, so two writes to the same LBA would result in only one write to back-end disk.
So in conclusion, the upgrade to 10.2.4 code resulted in a measurable improvement in write IO performance at a real world client. Nice!
It's easy to make a fool of yourself.
It's not hard to do.
All you need is a moment of inattention combined with a massive assumption. In fact assumptions can bring you undone at any time. A former manager of mine introduced me to the saying: to assume is to make an ass of you and me.
So what was the assumption this time?
One of our business partners sold a client two new XIVs and four new IBM SAN40Bs (40-port fibre channel switches). So far so good. When you order the SAN switches you have a choice of ordering 4 Gbps capable SFPs (SFPs are the fibre optic sub-assemblies that you plug your cables into) or 8 Gbps capable SFPs. There was a time when the 8 Gbps SFPs were much more expensive than the 4 Gbps ones, but today they are about 75% of the price of the 4 Gbps. So it makes sense to buy the faster SFPs. But you need to ensure that all the HBAs at the client site are at least 2 Gbps capable, because 8 Gbps SFPs are tri-rate and can only run at 2, 4 or 8 Gbps. Sure enough an assumption was made that this was not an issue... but it was. The client has WDMs (wavelength division multiplexers) that run at 1 Gbps, and upgrading those WDMs would be a significant expense.
So I got to thinking... could I force the SFP to 1 Gbps?
If I display the 8 Gbps SFP it reports it is capable of 200, 400, 800 MBps which is code for 2, 4 or 8 Gbps.
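The MBps figures line up with the Gbps rates because Fibre Channel at these speeds uses 8b/10b encoding: ten line bits carry one data byte. A rough back-of-the-envelope sketch (my own illustration):

```python
def fc_mbps(gbps):
    """Approximate usable throughput of a Fibre Channel link.
    FC speeds up to 8 Gbps use 8b/10b encoding (10 line bits per
    data byte), so N Gbps delivers roughly N * 100 MBps of payload."""
    return gbps * 100

# Reproduces the 200 / 400 / 800 MBps capabilities the SFP reports
for speed in (2, 4, 8):
    print(f"{speed} Gbps is roughly {fc_mbps(speed)} MBps")
```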
But maybe I could force it to 1 Gbps?
Sadly all I did was break the port. A port in Mod_Inv status means the SFP is in an invalid state. This is not going to work.
So what to do? We could not just move the old SFPs into the new switch, as the new 8 Gbps capable Brocade switches only accept Brocade approved SFPs. The only solution was to make it right and swap four of the Brocade 8 Gbps SFPs with Brocade 4 Gbps SFPs. Fortunately as we needed only four, I was able to swap them with little expense or hassle (I contacted our local Brocade rep who happily helped us out).
The end point was a happy client and a lesson re-learnt..... 1 into 8 does not go.
I am curious though... is there much 1 Gbps gear still out there? Is this a common issue?
Over on Wordpress, I have just published an article on SNMP and XIV.
Given some funky formatting, I have decided not to paste it into this blog.
If you're interested in monitoring an XIV with SNMP, please head over here:
A friend of mine sent me a direct message on Twitter that pointed out something interesting.... A blog post I had written on SDDPCM had been copied word for word by another site. A little bit of googling revealed that in fact it had been picked up by two sites. Here is the original, and the copies are here and here.
What bothered me was not that the content was copied without any obvious (well, obvious to me) attempt to acknowledge the original author. In fact in both cases, the copied text included a link to another blog entry I had written, so an alert reader would pick up that the content had come from someone else (still... a little acknowledgement doesn't hurt). To begin with, I was also not concerned with the re-use of my work. After all, I am writing this to be helpful, so if you think something I have written is helpful... and you spread the word... that work is even more helpful (but hey, that's what Twitter is for... right?). But then it occurred to me... by copying the article without a link back to the original source (mine), if I find a mistake and update my blog post, those corrections will not flow to the clones. So this potentially undermines my efforts to be helpful.
I also noticed that in each case, the clones carried advertisements by Google. Does this mean Google and/or these other bloggers are actually making money from copying my content? Hmmm... acknowledgement is one thing... a cheque is even nicer.
Or am I reading too much into this?
Still... message to Anthony... if you push content into the public domain you have to be prepared for this.
After tweeting about this, I did learn it is possible to insert sentences into your content that you could then monitor for with Google Alerts. I don't plan to do this myself, but it's certainly worth being aware of. This of course also presumes the cloners don't detect these sentences and delete them.
I am very curious to know of similar experiences. Has this happened to you? Did you do anything about it? Were you happy with the result?
When IBM first released the Storwize V7000, we announced it was capable of supporting ten enclosures, but would on initial release support only five. We stated that this restriction would be lifted in Q1.
The good news is that this restriction is indeed now lifted by a new release of the Storwize V7000 6.1.0 software, which is available for download from here:
You should also check out this link:
Storwize V7000 6.1.0 Configuration Limits and Restrictions
This new level also contains an additional enhancement which I think users will really like, called Critical Fix Notification. The new Critical Fix Notification function enables IBM to warn Storwize V7000 and SVC users if we discover a critical issue in the level of code that they are using. The system will warn users when they log on to the GUI using an internet connected web browser. It works only if the browser being used to connect to the Storwize V7000 or SVC, also has access to the Internet. (The Storwize V7000 and SVC systems themselves do not need to be connected to the Internet.) The function cannot be disabled (which is a good thing) and each time we display a warning, it must be acknowledged (with the option to not warn the user again for that issue).
As I blogged previously, VAAI support for XIV has two dependencies:
- 10.2.4a code
- A VMware-certified driver
Both of these things are very close to release....
In the meantime I have had the chance to demonstrate the uncertified VAAI driver with XIV 10.2.4 code, just to see what effect it has.
And what is the effect?
VAAI dramatically reduces the amount of work that the vSphere 4.1 server needs to do to get things done.
The XIV implementation of VAAI provides the three fundamentals of VAAI:
- Full clone: copying data from one logical unit (LUN) to another without the data passing through the ESX server.
- Block Zeroing, assigning zeros to large storage areas without actually sending the zeros to the storage system.
- Hardware Assisted locking, locking a particular range of blocks in a shared logical unit (providing exclusive access to these blocks), instead of using SCSI reservation that locks the entire logical unit.
To test VAAI with XIV, I did two things: a VMDK migration (a Storage vMotion) and a VMDK clone. I used the vSphere client to time how long each operation took, and XIV Top to see how much IO was being generated by the vSphere server. Now please understand: these numbers and timings are based on a lab environment. The speeds and peaks will vary from client to client and install to install.
Firstly the migration: I performed a migration of a VMDK from one data store to another. The migration without VAAI took 42 seconds as can be seen from the screen capture below:
The migration generated a peak of 135 MBps of traffic being written to the target volume as can be seen from XIV Top:
I then turned on VAAI and did the same migration. I won't document the process to install the VAAI driver, as it will be different when the certified version is released. However, after the driver is installed, I could turn VAAI on and off by toggling these settings from 0 to 1 and back again:
We then did another VMDK migration with VAAI enabled. This time the migration took 19 seconds (as opposed to 42 seconds), so an immediate improvement occurred.
When I checked XIV Top, there was no IO at all! In other words the vMotion was done with no apparent load on the vSphere HBAs or the SAN. I feel silly showing this screen capture, but this is what I saw.... nothing.
I then did a VMDK clone. The Data store was on XIV, VAAI was not enabled. There was no other IO running on the ESX server. The clone took 40 seconds (as reported by vCenter):
The clone generated a peak of 230 MBps for around 50 seconds (as reported by XIV Top)
We then activated VAAI again and repeated the clone. Now the clone took 15 seconds (as reported by vCenter), so that's 25 seconds faster (more than 50%).
The clone generated a peak of 2 MBps for around 20 seconds (as reported by XIV Top). Almost no fibre channel IO was thus generated by the clone.
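Pulling the lab numbers together, the percentage improvements work out like this (simple arithmetic on the timings reported above):

```python
def pct_faster(before_s, after_s):
    """Percentage reduction in elapsed time."""
    return (before_s - after_s) / before_s * 100

# Timings from the lab tests above
print(f"Storage vMotion: {pct_faster(42, 19):.0f}% faster")  # 42s -> 19s, about 55%
print(f"VMDK clone:      {pct_faster(40, 15):.1f}% faster")  # 40s -> 15s, 62.5%
```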
As I have blogged before, I will be repeating this whole exercise once I have real live customers running this configuration, so expect further updates.
Things have been pretty revolting lately, and I am not talking about Tunisia or Egypt or Libya (though actually those could equally apply to my story).
What I am talking about is mother nature, and she is pretty angry with us right now.
In the last few months Australia and New Zealand have seen massive floods in Queensland, Victoria and Western Australia, destructive cyclones hitting Queensland and Western Australia, ferocious bush fires in Western Australia and most recently, a massive earthquake in New Zealand.
The personal loss of life and property has been shocking and tragic. Each of these events has reminded me how quickly everything we hold dear can be taken away in an instant... by an event over which you have no control.
Which leads me to storage clouds....
If something can be stored electronically, then it can be stored in a cloud. A cloud that is hopefully well backed up, and far away from your own personal location. And no, this is not an advertisement... it's a suggestion....
Given the events of the last few months, I have started using a storage cloud provider to protect my photos, my music and my insurance information.
I looked for cloud storage providers who:
- Offered a tool that when installed on my laptop/PC, automatically backs up the contents of selected folders. This means I don't have to remember to backup. It should happen automatically.
- Offered a way of accessing the backed up data from anywhere.
- Is reasonably priced.
I considered the following uses:
- Backup all my photos.
- Backup all my music.
- Digitize my insurance documents and back them up. Scan in all my receipts and some photos of the contents of each room of the house. That way if the house burns down... I have a base to work off.
- Scan in important documents that I could not easily replace.
Let me give you an example of a document I would never want to have to replace....
My son is practicing to get his driver's licence. In Victoria you need 120 hours of driving experience recorded in a log book. This log book needs to be filled in every time he drives the car. If the log book is lost... those 120 hours would need to be driven again. I cannot tell you how hard it is to find 120 hours of driving opportunities (and I heartily support the 120 hours scheme!). Even if you did feel inclined to create fake entries to recreate the book (which is illegal), frankly creating 120 hours of fake driving log entries would be very hard work. To make things worse... where am I storing this booklet? In the car of course (which is the most convenient place to store it). So what happens if the car is stolen? There goes the logbook.... So the plan I work on is that every time a page is filled up, I scan that page as an image stored on my laptop. The image goes into a folder that is automatically backed up to the cloud. Yes it does depend on my being diligent, but the actual process of copying the file somewhere else is automatic. Now I have 3 copies... the original, the scanned image on my laptop and a third (automatically created) copy way off in the cloud somewhere.
As for personal recommendations:
1) Get 2 GB free on Dropbox. This is a great point solution and a great way to dip your toes in.
2) Get 1GB free on Google Docs. This is a great tool to share files with others.
3) Try 15 days free on Carbonite. These guys look like good value for money.
Are there others? Yes there are... Mozy is one I have seen recommended. There is also Amazon S3. I am sure there are plenty more....
Have there been issues with storage cloud providers? A quick search reveals stories like: Flickr deleted a user's data and Carbonite lost data due to hardware failure. Still... I have no plans to store my ONLY copy of data in the cloud. For me it's a backup medium... not a primary storage location.
Are you convinced?
Are you already using the cloud?
Or are you thinking its too expensive or too insecure?
Better still, have you already been saved by the cloud?
Oh... and my son? He is on 89 driving hours... 31 to go....