IBM has today announced a whole swag of planned new features across the entire IBM Storage product line. You can read the announcement letter here and I have also dropped the text at the bottom of this blog post (to save you clicking on the link).
It's a very impressive list, but to home in on a few of the more exciting offerings:
IBM Easy Tier will be enhanced to cache hot data in SSD storage installed in a client server. It looks like it will initially be a combination of DS8700/DS8800 and AIX or Linux servers. I am sure there are plenty who will immediately think of EMC VFCache, so I am keen to get more details so I can see how the two compare. If you are curious in the meantime, check out this EMC fact sheet and then read this fascinating interview with the CMO of FusionIO.
A new high-density storage module will be made available, initially (I suspect) for the DS8800. This is a really important step, as we are seeing a lot of new technologies emerging in the SSD space. The technical requirements of SSD don't always line up with the architectures of existing storage controllers, so a custom-built enclosure designed just for SSD makes perfect sense.
The IBM XIV will be enhanced with the ability to cluster multiple XIVs together and migrate volumes non-disruptively between them. Non-disruptive volume migration is a great new feature which should definitely help with swapping XIVs out as new models become available.
There are plenty of other new features as well, so check out the announcement letter reproduced below:
IBM® intends to support a number of new enhancements to a variety of IBM storage systems in the future. These enhancements will leverage innovative research on intelligent algorithms, automation, and virtualization that is being incorporated into products in the IBM storage portfolio. The statements of direction highlighted here are intended to provide a glimpse into the IBM storage roadmap for selected product capabilities.
IBM intends to deliver:
Advanced Easy Tier™ capabilities on selected IBM storage systems, including the IBM System Storage® DS8000®, designed to leverage direct-attached solid-state storage on selected AIX® and Linux™ servers. Easy Tier will manage the solid-state storage as a large and low latency cache for the "hottest" data, while preserving advanced disk system functions, such as RAID protection and remote mirroring.
An application-aware storage application programming interface (API) to help deploy storage more efficiently by enabling applications and middleware to direct more optimal placement of data by communicating important information about current workload activity and application performance requirements.
A new high-density flash storage module for selected IBM disk systems, including the IBM System Storage DS8000. The new module will accelerate performance to another level with cost-effective, high-density solid-state drives (SSDs).
IBM intends to extend IBM Active Cloud Engine™ capabilities to:
Allow files on selected NAS devices to be virtualized by SONAS and Storwize® V7000 Unified. Virtualization capabilities provide access across a unified global namespace, while facilitating transparent file migrations in parallel with normal operations. This capability will help provide customer investment protection as clients continue to leverage their existing NAS assets while exploiting the capabilities of IBM Active Cloud Engine.
Enable file collaboration globally via IBM Active Cloud Engine. This capability will help enhance productivity where users at geographically dispersed locations can both share and modify the same file.
IBM intends to deliver Cloud features to SONAS and Storwize V7000 Unified to support:
Web Storage Services, a standards-based object store and API that implements the Cloud Data Management Interface (CDMI) standard from Storage Networking Industry Association (SNIA) to support the implementation of storage cloud services.
Self-service portal designed to speed storage provisioning, monitoring, and reporting.
IBM intends to support an increased scalability of capacity, performance, and host bandwidth by clustering IBM XIV® Gen3 systems together and providing the capability to migrate volumes across the cluster without disrupting applications. Management of the cluster will remain simple with consolidated views and shared configurations across the systems. These capabilities are intended to help clients address the scalability and management requirements for effective cloud computing.
IBM intends to extend NAS data retention enhancements for IBM Storwize V7000 Unified and IBM SONAS to provide file "immutability" to help support file integrity from the time the file is designated as immutable through its lifecycle. Immutability is intended to secure files from inadvertent or malicious change or deletion.
IBM intends to enable Real-time Compression for block and file workloads on Storwize V7000 Unified systems. This enhancement is designed to help clients experience the same high-performance compression for active primary block and file workloads on Storwize V7000 Unified that is being announced for block workloads on Storwize V7000. IBM Storwize V7000 Real-time Compression is designed to deliver enhanced storage efficiency with potential benefits including lower storage acquisition cost (because of the ability to purchase less hardware), reduced storage growth, and lower rack space, power, and cooling requirements.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. The information in the above paragraphs are intended to outline our general product direction and should not be relied on in making a purchasing decision. The information is for informational purposes only and may not be incorporated into any contract. This information is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion.
It is ironic that only days after I wrote that 497 is the IT number of the beast, I learn that Linux has another unfortunate number: 208.
The reason for this is a defect in the internal Linux kernel used in recent firmware levels of SVC, Storwize V7000 and Storwize V7000 Unified nodes. This defect will cause each node to reboot after 208 days of uptime. The issue exists in unfixed versions of the 6.2 and 6.3 levels of firmware, so a large number of users are going to need to take some action on this (except those who are still on a 4.x, 5.x, 6.0 or 6.1 release). If you have done a code update after June 2011, then you are probably affected. This means that if you are an IBM client you need to read this alert now and determine how far you are into that 208 day period. If you are an IBMer or an IBM Business Partner, you need to make sure your clients are aware of this issue, though hopefully they have signed up for IBM My Notifications and have already been notified by e-mail.
In short what needs to happen is that you must:
Determine your current firmware level.
Check the table in the alert to determine if you are affected at all, and if so, how far you are potentially into the 208 day period.
Prior to the 208 day period finishing, either reboot your nodes (one at a time, with a decent interval between them) or install a fixed level of software (as detailed in the alert).
To give you an example of the process, my lab machine is on software version 18.104.22.168 which you can see in the screen capture below. So when I check the table in the alert, I see that version 22.214.171.124 was made available on January 24, 2012, which means the 208 day period cannot possibly end before August 19, 2012.
SAN Volume Controller and Storwize V7000 Version 6.3
| Date the release was made available | Earliest possible date that a system running this release could hit the 208 day reboot |
| 30 November 2011                    | 25 June 2012   |
| 24 January 2012                     | 19 August 2012 |
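If you want to sanity-check that arithmetic yourself, GNU date on a Linux workstation (not the restricted shell on the SVC itself) will do the sum. Using the 24 January 2012 release date from the table:

date -d "2012-01-24 + 208 days" +%F

That prints 2012-08-19, which matches the 19 August 2012 date above.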
Regardless, I need to know the uptime of my nodes, so I download the Software Upgrade Test Utility (if you have an older copy, note that we need at least version 7.9) and run it using the Upgrade Wizard (NOTE: we are NOT updating anything here, just checking):
I launch the Upgrade Wizard, use it to upload the tool and follow the prompts to run it, so that I get to see the output of that tool. The output in this example shows the uptime of each node is 56 days, so I have a maximum of 152 days remaining before I have to take any action. At this point I select Cancel. You can run this tool as often as you like to keep checking uptime.
Note that if you are on 6.1 or 6.2 code you may see a timeout error when running the tool, especially the first time. If you do see an error, please follow the instructions in the section titled "When running the upgrade test utility v7.5 or later on Storwize V7000 v6.1 or v6.2" at the Test Utility download site.
As per the Alert:
If you are running a 6.0 or 6.1 level of firmware, you are not affected.
If you are running a 6.2 level of firmware, the fix level is v126.96.36.199 which is available here for Storwize V7000 and here for SVC.
If you are running a 6.3 level of firmware, the fix level is v188.8.131.52 which is available here for Storwize V7000 and here for SVC.
If you are using a Storwize V7000 Unified, the fix level is v184.108.40.206 which is available here.
You should keep checking the alert to find out any new details as they come to hand. If you are curious about Linux and 208 day bugs, try this Google search.
*** Updated April 4, 2012 with links to fix levels ***
If you have any questions or need help, please reach out to your IBM support team or leave me a comment or a tweet.
*** April 10: The IBM Web Alert has been updated with new information on what to do if your uptime has actually gone past 208 days without a reboot. In short you still need to take action. Please read the updated alert and follow the instructions given there. ***
I always laugh when people say to me: I wouldn't know what to blog about!
When you work in pre-sales support, you constantly get asked questions and each one of them could be the subject of a new blog post. Right now the most common question I am getting is:
I am implementing VMware Site Recovery Manager (SRM). One of the components I need are vendor specific Site Recovery Agents (SRA). I have searched IBM's website but cannot find them. Where are they?
So the short answer is: you get them from the VMware SRM download site. However before downloading, there is a key task that absolutely needs to be performed:
Visit the VMware vCenter Site Recovery Manager Storage Partner Compatibility Matrix. This site will confirm what products are supported by each version of SRM. You can find it here, but clearly you need to check back regularly to ensure you have the latest information.
Now find your storage device in the matrix and confirm what firmware levels are supported. This is really important. For example, the Feb 27, 2012 edition of the matrix tells me that the Storwize V7000 is supported for SRM version 5.0, but only when running Storwize V7000 firmware version 6.1 or 6.2. This is significant because if you upgrade to version 6.3 you are not supported. In fact that combination doesn't actually work yet, as detailed here. Clearly something you need to be aware of when planning firmware updates.
So where are the SRAs? On each of the pages below use the Show Details button to see what version SRAs are being shipped with that SRM (although sometimes the pages take a few days between an SRA being added and the page being updated):
There are a few more questions I routinely get asked:
Does IBM actually have an SRA download site?
The answer is yes, but it is an FTP site only for SRAs written by IBM. It is principally a repository for older SRAs and beta SRAs but you can also find the current SRAs on it. You can find the site here. Note however that it is NOT the official source. For that you need to use the VMware site.
What about the SRA for LSI/Engenio based products like the DS4800?
These also used to be found on the LSI site, but since LSI sold Engenio to NetApp, they are no longer available from the LSI or NetApp websites. You need to download the current version from the VMware sites listed above. There is a version for SRM 5 on the VMware download site.
What about nSeries SRAs?
If you need an nSeries SRA, again you should go to the VMware download pages. There are separate SRAs listed and available for IBM nSeries (as opposed to an SRA for NetApp branded filers).
What about an SRA for XIV with SRM version 5?
The answer: The SRA for XIV with SRM 5 (and 5.0.1) is now available from VMware. If you have access to download SRM, you will be able to download SRA version 2.1.0. It is the same SRA for both XIV Generation 2 and Gen3.
What about an SRA for Storwize V7000 and SVC version 6.3 code?
The answer: It is coming. We are working to make it available as soon as possible. I will update this post as soon as I have a date for you (we are talking weeks, not months).
*** Update March 23, 2012 - Added details on SRM 5.0.1 ***
Many years ago I picked up a book that blew my mind: The Cuckoo's Egg by Clifford Stoll. It's a genuine classic, a true tale of hackers and how one was tracked down in the very early days of the internet.
Now the story is about events in 1986, so it captures the state of technology at the time (which rather dates the book), but wow, what a great story.
So why mention the book? Well apart from the fact that it is well worth a read, the key issue that Clifford saw again and again was default passwords. The hacker would identify a target and then try to logon using default IDs and default passwords, usually with great success.
Now I have blogged in the past about the determined (but often ignored) way that Brocade switches berate you into changing default passwords. But pretty well all products need to do this, as they all have the same issue (and a truly problematic counter-point). You absolutely need to do two things with every product in your data center:
Change the default passwords on every device you deploy.
Record what those passwords got set to (preferably using a logical or physical password safe).
Now don't laugh, but forgotten or lost passwords on data center kit (like switches) are a VERY common problem. When I worked in the IBM Storage Support team I took calls EVERY WEEK from clients who had devices they could not logon to, for all manner of reasons. For some, supplying them with the default passwords saved them (and condemned their employer?), but others needed much more detailed assistance.
My preferred solution to this challenge is to use external authentication (like LDAP) but being able to reset passwords with an external tool is also a nice option to have available.
The reason I started thinking about this is a nice tool IBM offer for the Storwize V7000 called the Initialization Tool that you can download from here. Using this tool you can reset the password of the Superuser ID on a Storwize V7000 back to the default (passw0rd). The tool runs on a USB key. After requesting the tool to help you to reset the superuser password, you insert the USB key into the Storwize V7000, wait for the orange indicator light on the relevant node canister to stop blinking and the task is complete. Then put the USB key back into your laptop and run the init tool again to get a completion report that should look like this:
This is great to rescue customers who have lost their passwords, but the question then gets raised: Can I block this?
My first response is: if you are concerned about unauthorized people with malicious intent placing USB keys into your Storwize V7000, then don't let them into your computer room (presuming you can spot them by the colour of the hat they are wearing). If that is not an option, lock the rack that the Storwize V7000 resides in (change control does have its benefits). If that is not an option, there is one more alternative, but it is a tad extreme.
What we can do is prevent password reset via USB key (or in the case of the SVC, via the front panel). We do this by issuing the following CLI command: setpwdreset -disable
In the following example, I confirm that password reset is possible (value 1), I then disable it and confirm that password reset is no longer possible (value 0). If curious I could then get some help on that command:
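From my lab notes the session looks roughly like this (the exact output wording may vary slightly between firmware levels):

IBM_2076:STG_V7000:admin>svctask setpwdreset -show
Password status: [1]
IBM_2076:STG_V7000:admin>svctask setpwdreset -disable
IBM_2076:STG_V7000:admin>svctask setpwdreset -show
Password status: [0]
IBM_2076:STG_V7000:admin>svctask setpwdreset -h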
So should you do this? Only if your paranoia is matched by your attention to detail.
My reason to hesitate recommending it is simple: If you prevent password reset and then forget your password (and have no other local Security Administrator accounts), you have locked the door and thrown away the key. Far better to physically lock the rack.
In the end though, your company needs to set a policy that is actively enforced (with no exceptions). So get to it.
When IBM brought out the SAN Volume Controller (SVC) in 2003, the goal was clear: support as many storage vendors and products as possible. Since then IBM has put a huge ongoing effort into interoperation testing, which has allowed them to continue expanding the SVC support matrix, making it one of the most comprehensive in the industry. When the Storwize V7000 was released in 2010 it was able to leverage that testing heritage, allowing it to have an amazingly deep interoperation matrix on launch date. It almost felt like cheating.
However I recently got challenged on this with a simple question: Where is the VNX? If you check out the Supported Hardware list for SVC V6.3 or Storwize V7000 V6.3 you can find the Clariion up to a CX4-960, but no VNX.
The short answer is that while the VNX is not listed there yet, IBM are actively supporting customers using VNX virtualized behind SVC and Storwize V7000. If you have a VNX 5100, 5300, 5500, 5700 or 7500 then ask your IBM pre-sales Technical Support to open an Interoperation Support Request. The majority are being approved very quickly. The official support sites that I referenced above will be updated soon (but don't wait, if you need support, ask for it). IBM is working methodically with EMC to be certain that when a general publication of support is released for VNX (soon!), both companies will agree with the published details.
And for the wags who think that this is a ringing endorsement to buy VNX, you would be missing the point. You cannot be a serious storage virtualization vendor if you are not willing to support your clients' purchasing decisions, regardless of which vendor they buy their storage from. IBM have been staying that course and demonstrating that willingness since 2003. It's a pretty good track record and one that they are determined to maintain.
I have updated my IBM Storage WWPN Determination Guide to version 6.5. You can find the updated guide on IBM Techdocs here.
The main change is that new DS8800s are now presenting slightly different WWPNs, so I added three new pages to describe the changes.
If this guide is new to you, its purpose is to let you take a WWPN and decode it so you can work out not only which type of storage that WWPN came from, but also the actual port on that storage. People doing implementation services, problem determination, storage zoning and day-to-day configuration maintenance will get a lot of use out of this document. If you think there is an area that could be improved or products you would like added, please let me know.
It is also important to point out that IBM Storage uses persistent WWPN, which means if a host adapter in an IBM Storage device has to be replaced, it will always present the same WWPNs as the old adapter. This means no changes to zoning are needed after a hardware failure.
I also host the guide on SlideShare, so you can view and download it from there:
I love USB keys: free ones, the ones conferences give away, and ones shaped like Lego blocks. The exciting thing (for me) is that if you buy a Storwize V7000, you also get a USB key, a key which has two fundamental purposes:
It's used to make installation very quick and easy (which it does very well!).
It's used to reset the superuser password (in case you forget what it is) or to set the service IP addresses (in case you didn't set them like I suggested you do).
This is all well and good, but what happens when you lose it, someone borrows it, or you accidentally throw it out? (Oops.) If you are searching for it, yours may well have looked like this:
So what to do? The answer is: It's OK, there is nothing magic about this key. In fact the key contains just one piece of software, which you can get from here. Just download the initialization tool and copy it onto your own USB key. The original key also had an Autorun file, but you don't need that (actually I object to auto-running USB keys anyway).
BUT... and there is always a but... I cannot guarantee that EVERY USB key you try will work. Why not? Because some USB keys are formatted strangely or insist on running unique applications before they will work. There is some good, simple advice on the InfoCenter that you can find here. The main trick is to use a USB key that is formatted with the FAT32, EXT2, or EXT3 file system on its first partition and does not need to auto-run any applications before working.
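If you need to reformat a key to meet that requirement, something like this on a Linux machine will do the job (mkfs.vfat comes from the dosfstools package, and /dev/sdb1 is just an example device name, so triple-check which device is really your USB key before formatting anything):

mkfs.vfat -F 32 /dev/sdb1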
The IBM SVC has been setting records in SPC-1 (OLTP-like) benchmarks for many years. However, recently HP stole the crown with a 3PAR benchmark of 450,212.66 IOPS.
But in breaking news, the SVC is back on top with the very first SPC benchmark that exceeded 500,000 IOPS (520,043.99 to be precise!). You can see the executive summary here.
This benchmark used eight of the current SVC engines (model CG8s) with Storwize V7000 as the backend disk. It shows the awesome power of SVC, its ability to scale and to handle very large configurations with very large throughput requirements. It also shows the power of IBM pSeries which was used to drive these IOPS.
The first update for Storwize V7000 and SVC release 6.3 is now available. You will find it here for Storwize V7000 and here for SVC (note both links will require you to login to Fix Central with your IBM ID). As usual the new release contains a combination of new features and fixes. The new features are:
New features in SVC 220.127.116.11
* Support for multi-session iSCSI host attachment
* Language Support for Brazilian Portuguese, French, German, Italian, Japanese, Korean, Spanish, Turkish, Simplified Chinese and Traditional Chinese
There are also several fixes (with some variation between SVC and Storwize V7000, mainly around the platform hardware). The release notes (which you can find at the links above) detail them all. Two fixes I have been looking forward to are:
IC80253 Unable to log into the GUI if password contains special characters. This meant that a password with a comma in it could not be used in the GUI (you got a backspace instead). Passwords with commas could be used in the CLI. This bug was picked up by one of my clients when trying out LDAP and is now fixed in 18.104.22.168.
IC80501 Performance statistics collection fails to record read and write response times for internal drives. This issue meant that SVC internal SSD drives always showed 0 ms response times in TPC.
Note that the Drive firmware does not need to be updated with this release. The new upgrade test tool (version 7.3) will not ask you to update them. I will let you know when that situation changes.
The big question of course is which drive type to choose? The answer is that ideally you should possess three pieces of information:
How much usable space do you need in GB or TiB? Don't confuse binary and decimal!
What is your typical I/O profile? For instance, 70% reads / 30% writes with a 32 KB block size.
What are your IOPS and response time requirements?
Armed with this information, get your IBM Sales Rep or Business Partner to model your requirements using Capacity Magic and Disk Magic. These modelling tools will tell you how much usable capacity a particular configuration will give you and what performance you can expect to get from it (given a particular I/O profile). If you don't know your I/O profile or IOPS requirements, performance can still be modelled using industry-standard benchmark profiles.
I am getting this question on a very regular basis:
"We have just upgraded to ESXi 5.0 but we cannot find the VAAI driver on the IBM Website"
The answer? There is no vendor supplied driver because no driver is needed. ESXi 5.0 uses a SCSI T10 compliant set of commands that all vendors need to support for VAAI to work.
But of course in the tradition of all answered questions, it leads to another question:
"Once I have upgraded to ESXi 5.0 how can I tell if VAAI is really working?"
The good news is that it is very easy to spot if ESXi 5.0 has detected a VAAI capable LUN. The moment a new LUN is detected by ESXi 5.0 it tries out an Atomic Test and Set command. If that works, you will see that Hardware Acceleration shows as Supported in vCenter. In the screen capture below I have three datastores, two from XIV and one from Storwize V7000, all presented to an ESXi 5.0 server. I dragged the Hardware Acceleration column over from the right hand side to help with the screen capture (in case your vCenter looks different), but you can see the Hardware Acceleration column shows each DataStore as Supported (and did so the moment the volume was detected).
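You can also check this from the ESXi 5.0 command line rather than vCenter. Something like the following should work (the naa identifier below is a placeholder; list your devices first with esxcli storage core device list and substitute a real one):

esxcli storage core device vaai status get -d naa.600507680000000000000000000000ab

The output reports the ATS, Clone, Zero and Delete status for that device.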
Of course having seen the Hardware Acceleration Supported message only proves that Atomic Test and Set works. To confirm if XCopy (Hardware Accelerated Move) is working, on SVC or Storwize V7000 we can use the Performance monitoring panel. In the example below I first performed a storage vMotion, moving a virtual machine between two Datastores located on the same Storwize V7000 (running 22.214.171.124 firmware). I then performed a clone of the same virtual machine, where the source was on one datastore and the target was placed on another (but both located on the same Storwize V7000). What you can clearly see is that both operations (storage vMotion and cloning) generated no volume traffic, only MDisk traffic. This means that the ESXi server is doing none of the work and the storage is doing all of the work.
The Storwize V7000 and SVC have a command line interface that you access via SSH. Every time you logon, whether it is to transfer a file (using a tool like pscp), issue a single-shot command from a script (using a tool like plink) or logon to issue commands interactively (using a tool like PuTTY), you clearly need to authenticate yourself. Since June 2003, the way you did this was to use a public/private key pair, where the SVC or Storwize V7000 had the public key and the SSH client (such as PuTTY) authenticated using the private key (the PPK file).
However with release 6.3 of the SVC and Storwize V7000 firmware, the use of key files is now optional. A user can now authenticate purely by using a password. This includes using your domain ID. So if you defined LDAP to your machine, as I documented here, you could now SSH direct to your Storwize V7000 or SVC, use your Domain user id and password and not go through the key file setup task. Nice!
The choice to continue to authenticate just with an SSH key remains available. If a user has both a password and a configured key file, then either method will work (you only need to use one - not both). Existing scripts will be unaffected by this change, so nothing gets broken because of this.
I think this is a very positive change and one I openly welcome. Combined with LDAP, this really makes user account setup an easy and simple task.
IBM recently released a new version of firmware for the SAN Volume Controller and Storwize V7000. This is known as release 6.3 and continues the tradition of two major updates per year, each adding significant new functions.
So the 6.3 release notes for both Storwize V7000 and SVC listed the following new feature:
Support for 4096 host WWPNs
Since I blithely listed this feature in a recent post I have received lots of emails asking exactly what it means, so I thought I had better explain.
The IBM SVC and Storwize V7000 have always had very clearly published maximum capabilities such as the ones listed here for Storwize V7000 release 6.3 and here for SVC release 6.3.
Most of these numbers are very high and few customers actually approach these maximums. The main issue I am seeing for some of our larger AIX customers is this one:
Total Fibre Channel ports (WWPNs) per I/O group: 512
The reason this can become an issue is the combination of NPIV and AIX Live Partition Mobility. NPIV allows one physical HBA to be shared among multiple operating system instances, each one believing it has exclusive access to the HBA and each one allocated its own unique WWPNs. Suddenly a single HBA, which used to present just one WWPN through the SAN to the SVC, can now present vast numbers of them. In addition, AIX Live Partition Mobility (which lets you move AIX operating systems between LPARs on the fly) needs additional pre-configured WWPNs defined on the target LPAR to support the move. This further increases the quantity of WWPNs that need to be defined to the SVC (one easy way to spot NPIV-generated WWPNs is that they normally start with the letter C).
So the bottom line is that IBM needs to make this limit bigger, and the SVC and Storwize V7000 6.3 code contains the necessary architectural changes to allow this. The first phase is to potentially support up to 2048 WWPNs per I/O group, although based on the initial version of the release notes, the long-term plan is clearly to support 4096.
But there is a problem and it has nothing to do with the SVC or Storwize V7000. The problem is that there are certain SAN configurations which may have issues with these large numbers of WWPNs (mainly around older SAN switches not having the CPU power for the switch fibre channel name-server and login-server to handle vast numbers of WWPNs coming out of one HBA).
So what should you do if you need to push the limits?
Contact your IBM Pre-Sales support and ask for a SCORE request to be opened (also known as an RPQ). You will need to detail your current SAN configuration (especially switch models and firmware levels) so that SVC development can ensure you won't overwhelm your switches. It will also allow our development team to learn how many clients out there need this support. All approvals will include a requirement to upgrade to release 6.3, so you should include this in your planning.
Any questions? Feel free to leave a comment or send me a tweet or an email.
With the 6.3 release of the Storwize V7000 and SVC code (which I blogged about here), there are so many new features and functions that I have plenty more to blog about!
The first new feature I blogged about was LDAP support, but an existing feature that has been enhanced is the performance monitor (brought in with release 6.2). When this first came out I put a video on YouTube showing what metrics could be displayed in that release. This is a sped-up video with no voiceover:
Now with release 6.3, IBM has added separate graphs for reads and writes, the ability to display IOPS or MBps, and the ability to display graphs of read and write latency. Nice! I got so excited I made another YouTube video, this one with narration. So now you can compare the new to the old:
Let's imagine a new rack server or a new blade server has been added to your Fibre Channel SAN. The first job for the SAN administrator is to zone it to the storage it requires access to. The task normally runs something like this:
Identify the WWPNs for the new server HBA. We can do this using QLogic SANsurfer or Emulex HBAnyware, or by looking at the WWPNs reported by the Fibre Channel switch, or by using datapath query wwpn (with SDD and SDDDSM), or by using the xiv_fc_admin -P command with the XIV HAK. There are lots of different ways; you get the idea.
On fabric 1 create a new alias for the server HBA port cabled to that fabric.
For each storage device that the server needs access to on fabric 1 (or possibly just switch 1), create a new zone and include the new server alias and the alias for every relevant storage port on that device. Repeat if you have other storage devices (so two XIVs means two new zones).
Put the new zone (or zones) into the active zoneset (or a clone of it) and activate it.
Repeat on fabric 2 (after waiting a decent interval to ensure no mistakes were made in fabric 1... well, I hope you wait... you do, don't you?).
The main trap here is that when creating a zone, you need to ensure you select all of the correct storage aliases for your selected storage device. For instance we could have a simple layout like this:
Fabric 1 contains our new server (in this example an IBM x3850) and three XIV ports:
This means when creating the zone I need to identify and select four separate aliases. What I could do instead is create an alias with all my XIV target ports in it. Now I only have two aliases to select in that fabric:
In a second example, the fabric contains the same server plus two Storwize V7000 ports, so when creating the zone I need to identify and select three separate aliases. What I could do instead is create an alias with both my Storwize V7000 WWPNs in it. Again I only have two aliases to select in that fabric:
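If you script your zoning, the single-alias approach for the XIV example might look something like this on a Brocade b-type switch (the alias names and WWPNs here are invented placeholders, and Cisco device-alias syntax would differ):

alicreate "XIV01_Fab1_Targets", "50:01:73:80:00:00:01:40; 50:01:73:80:00:00:01:50; 50:01:73:80:00:00:01:60"
alicreate "x3850_HBA1", "21:00:00:24:ff:00:12:34"
zonecreate "x3850_HBA1__XIV01", "x3850_HBA1; XIV01_Fab1_Targets"
cfgadd "FABRIC1_CFG", "x3850_HBA1__XIV01"
cfgenable "FABRIC1_CFG"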
This method of amalgamating multiple storage port aliases works fine for devices like DS8000, SVC, Storwize V7000 and XIV. I use this method all the time to simplify zoning and I find it reduces both mistakes and the time required to complete zoning tasks.
The only exceptions are:
Don't do it for DS3000, DS4000, DS5000 or DCS3700 as the controllers on these devices do not like to see each other through the switch.
Don't combine ports from different storage devices, so if you have two XIVs in a fabric create one alias for the target ports of each XIV (although you could combine ports from different SVC I/O groups within the same SVC cluster into one alias). You should still use individual aliases for ports being used for migration or replication purposes.
Don't use the WWNN to create an alias. Always create multi-WWPN aliases so you have granular control of which ports go into the alias. If you use the WWNN from an XIV you will also implicitly include any ports that are being used for replication or migration and thus zone them to the host, which makes no sense.
I would love to hear any techniques you have to make your (and my) life easier.
Once your SVC or Storwize V7000 is upgraded to version 6.3 you can start using LDAP for authentication. This means that when you logon, you authenticate with your domain user-id and password rather than a locally created user-id and password.
So why is this important?
It saves you having to configure every user on every SVC or Storwize V7000. If you have multiple machines this makes it far more efficient to set up authentication.
It means that when commands are executed on the SVC or Storwize V7000, the audit log will show the domain username that issued that command, rather than a local username, or worse just superuser (i.e. who mapped that volume? The superuser did.... who? )
It gives you central control over access. If someone leaves the company you just need to remove access at the domain controller, meaning there won't be orphan user-ids left on your Storage equipment.
So as an exercise I added my lab Storwize V7000 to our domain to show how it is done. This example also applies to an SVC so don't be confused if I only refer to Storwize V7000 from now on.
The first task is to negotiate with your Domain administrator to get a new group setup on the domain. In this example I use a group called IBM_Storage_Admins which lets me use this group for various storage devices (such as an XIV or a SAN Switch).
To create this group we need to logon to the Domain Controller and configure Active Directory. An easy way to do this from the AD controller is to go to Start → Run and type dsa.msc and hit OK. The Active Directory Users and Computers Management Console should open.
Select the groups icon to create a new group.
Enter your group name, in my case: IBM_Storage_Admins and hit OK.
Now right-click the relevant users who need access to the storage and add them to the IBM_Storage_Admins group. In this example I have selected Anthony (whose username is anthonyv).
In this example we are adding anthony into the IBM_Storage_Admins group:
Now it is time to configure the Storwize V7000 so start the Web GUI and logon as Superuser.
Firstly we go to Settings → Directory Services:
We choose the button to Configure Remote Authentication:
We choose LDAP and hit next.
We choose Microsoft Active Directory with no Transport Layer Security. We then expand the Advanced Settings. My lab domain is ad.mel.stg.ibm, so I use the Administrator ID on the Domain Controller to authenticate access. You could use any user that has authority to query the LDAP directory. We then hit Next.
We then add the domain controller, which in this example is 10.1.60.50, and the base domain name chopped into pieces (so ad.mel.stg.ibm becomes dc=ad,dc=mel,dc=stg,dc=ibm) and hit Finish.
Provided the command completes successfully we have defined the domain controller to the Storwize V7000. Now we need to add a group. Go to Access → Users.
Select the option to add a New User Group.
In this example we want to add a group for users allowed full admin access to the Storwize V7000. This matches the group we created on the Domain Controller. So we call the group IBM_Storage_Admins and we use the Security Administrator role (which is the most powerful role) and tick the box to enable LDAP for this group.
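For those who prefer the CLI, the same configuration can be scripted. The sketch below is from memory rather than a tested runbook, so treat the exact parameter names as assumptions and verify them against the Infocenter before relying on them (the IP address and base DN are the ones from the example above):

svctask chldap -type ad -username Administrator -password <password>
svctask mkldapserver -ip 10.1.60.50 -basedn dc=ad,dc=mel,dc=stg,dc=ibm
svctask mkusergrp -name IBM_Storage_Admins -role SecurityAdmin -remote
svctask testldapserver -username anthonyv -password <password>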
Now to test, I logon to the Storwize V7000 using the domain user-id anthonyv with that user's domain password. Remember this user is not defined on the Storwize V7000 itself, and that if it all goes wrong, we can still logon as Superuser.
Now I create a volume and delete it. Then I check the audit log from Access → Audit log.
Sure enough, we see exactly who did that command.
This is a great outcome for security, auditing and easy access administration.
If you have issues, from the Settings → Directory Services menu, use the Global Actions dropdown on the right hand side to Test LDAP Connections and Authentication or re-configure LDAP.
If you already have existing users (what we call Local users), configuring remote authentication using LDAP does not disable or invalidate those local user-ids. This means you can either logon with a local user-id or logon with a Domain user-id. This is handy if the domain controller fails but can confuse you if your local user name and your domain user name are the same name (for example both anthonyv). The Storwize V7000 will look you up in the local user name list first. I suggest removing all local users (except superuser) as this will reduce confusion but still leave you a backdoor in case remote authentication stops working.
If you see any mistakes or have suggestions to improve the way I described this, please let me know.
The latest release of SVC and Storwize V7000 firmware is now available for download. The major new features that are added with this release are:
Global Mirror with Change Volumes
Native LDAP Authentication
Extended distance split clusters (for SVC)
Support for 4096 host WWPNs
These are some great new features. The ability to use Global Mirror with Change Volumes means clients can now mirror across far smaller pipes, while the increase in host WWPNs is very welcome news for NPIV installations that are suffering from WWPN sprawl.
If you plan to upgrade, firstly grab the new Upgrade Test Utility from here. The links to the Storwize V7000 and SVC versions are both on that page. Remember you can run this test as many times as you want whenever you want, to check the health of your device for upgrade. When you run the upgrade test utility on a Storwize V7000 you may get a message that your disks have down-level firmware. The process to update them is documented here.
If you're using a Storwize V7000 you can grab the 126.96.36.199 code from here. If you're using an SVC you can grab the 188.8.131.52 code from here. I am sending you to the compatibility matrix page because you should always check that your "from" level is OK for your "to" level.
To run the upgrade go to Configuration (the spanner icon) → Advanced → Upgrade Software → Launch Upgrade Wizard
I have not shown all the panels you will see because it is very much a follow-your-nose task, but in essence, first we feed it the Upgrade Test Utility file and run that test.
If you get warnings you may need to act on these. If you are unsure what to do to resolve a warning message, place a service call.
Once the test passes or you're happy you understand the warnings, we now point it at the code package and wait for it to copy across and keep hitting Next.
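As an aside, the same flow can be driven from the CLI if you prefer: upload the test utility and the code package to /home/admin/upgrade with pscp, then run commands along these lines (the version and package file name are examples only and will change with each release):

svcupgradetest -v 6.3.0.0
svctask applysoftware -file IBM2076_INSTALL_6.3.0.0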
The application of the code shuts down and reboots each controller, with a 30 minute gap in between. You will transition from this (both nodes down-level, node1 being upgraded):
To this (node1 upgraded, node2 still online but waiting for 30 minutes):
When node2 starts the upgrade, the GUI will fail over to node1 and be upgraded to the new version. You will notice the difference immediately: it has a different look and feel. Please don't be tempted to play with the new functions until both controllers are upgraded! Wait until you see this (note a slight change: the GUI flow is now Settings (the spanner icon) → General → Upgrade Software):
Now that the upgrade is complete, it is time to start checking out what is new... but that's a whole different blog post!
WordPress (where I mirror this blog) has some very nice features as a blogging platform. One of them is that you can find out what search terms led people to your blog. I have noticed that searches like these have become very common:
XIV GUI and XIV GUI V3 or version 3
IBM vcenter plugin
This suggests to me that people are having trouble finding these files, or more importantly: maybe Google is having trouble helping them to find them!
The reason is simple: They have been moved to IBM FixCentral rather than being posted on separate easily trawled web pages. The good news is that once you know about Fix Central, finding any file becomes very very easy.
First up you HAVE to bookmark this URL (Do it now! Yes NOW!):
************************************************************************** Update 17/12/2011: A flash reporting a possible issue that could occur if a drive fails during drive firmware update can be found here.
Until the flash is updated showing how to avoid this issue, only update drive firmware when installing a new machine or if all hosts are offline.
IBM recently released new drive firmware for the Storwize V7000, so I thought I would share the process of how I update that firmware. You can download it from here. The details for this new package can be found here. I recommend you perform the drive update before you next update your Storwize V7000 microcode.
I want to be clear that one of the central goals of the Storwize V7000 is to ensure that drive firmware updates can be done online without host disruption. This is possible because each drive can be updated in around 4 seconds. The scripts I share below leave a 10 second delay between drives just to be safe. I would still prefer that you did the update during a quiet period.
We need to perform this procedure using the command line as there is no way to do this procedure from the GUI (yet).
There are four steps:
Upload the Software Upgrade Test Utility to determine which drives need updating.
Upload the drive microcode package.
Apply the drive software.
Confirm all drives are updated.
Step 1: Upload and run the upgrade utility
You will need the upgrade test utility which you can get from here.
You will need the Putty utility PSCP which you can get from here (although most of you should already have it).
You will need to have created a public/private key pair and assigned it to a user. In all the examples the user name I use is anthonyv. You need to use your own user-id, although you could also use admin. The process to create and associate the key pair is described here. Place the PPK file into the putty folder along with the upgrade test utility.
From the PuTTY folder we need to upload the test utility. You will need to change the key file name, user-id and IP address to suit your installation.
NOTE: The following command is being run in a Windows command prompt. You need to be in the C:\Program Files\Putty or C:\Program Files (x86)\Putty folder.
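The command looks something like this (the key file, utility file name, user-id and cluster IP address are examples from my lab, so substitute your own values; note the upload target on the cluster is the /home/admin/upgrade directory):

pscp -i privatekey.ppk IBM2076_INSTALL_svcupgradetest_7.3 anthonyv@10.1.60.20:/home/admin/upgrade

Then open an SSH session to the cluster and run the utility with the -d flag:

svcupgradetest -d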
If you get a warning window like the one shown below, indicating we have down-level drives, we need to proceed to the next step (note that the enclosure and slot numbers are not the same as drive IDs). If you have a lot of drives, you can drop the -d from the svcupgradetest command to get a summary list.
******************* Warning found *******************
| Model | Latest FW | Current FW | Drive Info |
| HK230041S | 2920 | 291E | Drive in slot 24 in enclosure 1 |
| | | | Drive in slot 23 in enclosure 1 |
| ST9450404SS | B548 | B546 | Drive in slot 22 in enclosure 1 |
| | | | Drive in slot 21 in enclosure 1 |
| | | | Drive in slot 20 in enclosure 1 |
| | | | Drive in slot 19 in enclosure 1 |
| | | | Drive in slot 18 in enclosure 1 |
| | | | Drive in slot 17 in enclosure 1 |
| | | | Drive in slot 16 in enclosure 1 |
| | | | Drive in slot 15 in enclosure 1 |
| | | | Drive in slot 14 in enclosure 1 |
| | | | Drive in slot 13 in enclosure 1 |
| | | | Drive in slot 12 in enclosure 1 |
| | | | Drive in slot 11 in enclosure 1 |
| | | | Drive in slot 10 in enclosure 1 |
| | | | Drive in slot 9 in enclosure 1 |
| | | | Drive in slot 8 in enclosure 1 |
| | | | Drive in slot 5 in enclosure 1 |
| | | | Drive in slot 6 in enclosure 1 |
Step 2: Upload the drive microcode package
Download the drive update package from here. Put it into the PuTTY folder. From a Windows command prompt we need to upload the package using the following command. You will need to change the key file name, user-id and IP address to suit your installation. Note yet again that you are running this in a Windows command prompt from the PuTTY folder (not from inside an SSH session):
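The command looks something like this (again the key file, user-id and IP address are examples; the package file name matches what you just downloaded):

pscp -i privatekey.ppk IBM2076_DRIVE_20110928 anthonyv@10.1.60.20:/home/admin/upgrade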
Step 3: Apply the drive software
I have written some scripts to help you list the drive IDs that need to be updated and to perform the updates. You can upgrade the drives one at a time or in bulk, depending on how you want to do this. All the remaining commands are run in a PuTTY session.
Firstly run this script to list all the drive IDs and current firmware levels. We need the drive IDs if we want to update individual drives.
svcinfo lsdrive -nohdr |while read did error use;do svcinfo lsdrive $did |while read id value;do if [[ $id == "firmware_level" ]];then echo $did" "$value;fi;done;done
The output will look something like this, showing the drive ID and that drive's current firmware level. From step 1 we know what the latest firmware level is, so we can compare to the current firmware level:
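Here is an indicative sample of what that looks like (drive ID, then firmware level; your drive IDs and levels will differ):

19 B546
20 B546
24 291E

To update a single drive, run applydrivesoftware against its drive ID. For example, to update drive ID 19 using the package uploaded in Step 2:

svctask applydrivesoftware -file IBM2076_DRIVE_20110928 -type firmware -drive 19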
However you may have a lot of drives and want to upgrade them in bulk. So you could use this command, which updates drive IDs 19 and 20. You can change the IDs or add extra drives to the list as required:
for did in 19 20;do echo "Updating drive "$did;svctask applydrivesoftware -file IBM2076_DRIVE_20110928 -type firmware -drive $did;sleep 10s;done
If we just wanted to upgrade every single drive in the machine (regardless of their level), we could run this command:
svcinfo lsdrive -nohdr |while read did name IO_group_id;do echo "Updating drive "$did;svctask applydrivesoftware -file IBM2076_DRIVE_20110928 -type firmware -drive $did;sleep 10s;done
When updating multiple drives, I have inserted a 10 second sleep between updates, just to ensure the process runs smoothly. This means each drive takes about 13-15 seconds.
Once we have upgraded every drive, it is time for a final check.
Step 4: Confirm all drives are updated
You have two ways to confirm this. Firstly run the following command to list the firmware level of each drive. Is each drive reflecting the levels reported in Step 1?
svcinfo lsdrive -nohdr |while read did error use;do svcinfo lsdrive $did |while read id value;do if [[ $id == "firmware_level" ]];then echo $did" "$value;fi;done;done
Now run the software upgrade test utility again:
svcupgradetest -f -d
Provided you receive no warnings about drives not being at the recommended levels, you are now finished with the drive updates. Of course you could now proceed to install 184.108.40.206 firmware, but you can do that from the GUI.
The SVC and Storwize V7000 offer a command line interface that you access via SSH. You start your favorite SSH client (such as PuTTY or MindTerm) and then logon as admin or as your own user-id. Right now to do this you need to generate a private/public key pair, although with release 6.3 (which will be available November 2011), you will be able to logon via SSH using just a user-id and password.
Having logged on there are three categories of commands you can issue:
svcinfo: Informational commands that let you examine your configuration.
svctask: Task commands that let you change your configuration.
satask: Service commands that are only used in specific circumstances.
There are several CLI usability features that I routinely find users are not aware of, so I thought I would share some of them here:
1) Listing all possible commands
If you cannot remember a command, here is a simple trick to list them all. Issue one of the following commands:
svcinfo -h or svcinfo -?
svctask -h or svctask -?
You can also type either svcinfo or svctask and then hit the tab key twice to get a full listing. With svctask you will need to type y to list them all, as per the example shown below:
IBM_2076:STG_V7000:admin>svctask (HIT TAB twice!)
Display all 139 possibilities? (y or n) y
2) Getting help on a particular command
Having found the command you want, issue that command with either -? or -h to get help information. For instance:
svctask mkvdisk -?
svctask mkvdisk -h
You will be shown the same help information that you can find in the Infocenter, including examples of syntax.
3) Drop the svctask and svcinfo prefixes
In release 6.2 of the SVC and Storwize V7000 firmware, the requirement to prefix a command with svcinfo or svctask has been removed. However I tend to keep using them because I write a lot of example commands for clients and I cannot be sure which version of firmware they are running.
4) Use the shell
When we SSH to the SVC or Storwize V7000 we are connecting to a Linux operating system using a special restricted shell. Some of the common Unix commands don't work (such as ls or grep or awk), but any commands that are provided by the shell itself will work, such as while, if, read, pipe and echo.
We can use this to construct some really clever commands.
For instance creating volume copies is very popular, but the default copy rate is rather slow (50, which equals 2 MBps). It is not unusual for end users to speed up the background copy and then forget to slow it down when they are finished. So I wrote two commands to help me out. Firstly I run a command to display the copy rate of every volume. Ideally I should see 50 alongside each volume. However I often find that some volumes are set to higher numbers, such as the maximum value of 100 (which is 64 MBps).
svcinfo lsvdisk -nohdr |while read id name IO_group_id;do svcinfo lsvdisk $id |while read id value;do if [[ $id == "sync_rate" ]];then echo $value" "$name;fi;done;done
Let's break down this command. The structure looks like this:
We start with svcinfo lsvdisk -nohdr. This gives us a list of every VDisk in column format with no header information.
We pipe the output of that lsvdisk command to while read. This reads the output one line at a time and lets us work with that output. We read the first three columns of output and label the data in the first column id, the second column name and the third column IO_group_id. I find we need to label at least three columns. We could read extra columns if we wanted to, but all I want is the VDisk id and name.
For every line of data we issue an lsvdisk command against each listed VDisk using the VDisk id. This output is not in column format so we need to do something different here.
We now examine the output of the lsvdisk command for each VDisk by piping the output to while read. Since each line contains a descriptor and a value, we label them id and value. We use if to look for a line that starts with sync_rate.
When we find the sync_rate for a VDisk we print the value of the sync_rate and the VDisk name. We are done for this VDisk.
We now examine the next VDisk and again look for the sync_rate for that VDisk.
Once we have examined every VDisk, we are done.
I then run the following command which sets the copy rate for every volume to the default value of 50 (2 MBps):
svcinfo lsvdisk -nohdr |while read id name IO_group_id;do svctask chvdisk -syncrate 50 $id;done
Clearly you could edit this second command to change the copy rate to any value between 0 and 100. In each case you just paste the entire command in, and hit Enter.
Let's break down this command. The structure looks like this:
We start with svcinfo lsvdisk -nohdr. This gives us a list of every VDisk in column format with no header information.
We pipe the output of that lsvdisk command to while read. This reads the output one line at a time and lets us work with that output. We read the data in the first three columns of output and label the data in the first column id, the second column name and the third column IO_group_id. I find we need to label at least three columns. We could read extra columns if we wanted to, but all I want is the VDisk id.
For every line of data we read, we do the following command: svctask chvdisk -syncrate 50 $id. Since we labelled the first column of output from the lsvdisk command as id, and that column contains VDisk IDs, we are going to issue this command against every VDisk that got listed.
Once we have run the chvdisk command against every VDisk listed, we are done.
There are lots of possible clever combinations and I will list a few more in upcoming posts.
I have also been getting lots of requests to write a post about updating drive firmware, so expect something on that very soon.