AIXV53SANBoot
Added by skywalker, last edited by OneSkyWalker on Feb 02, 2010  (view change)

AIX V5.3 SAN boot considerations

There are significant advantages when booting from SAN (installing AIX rootvg on LUNs):

  1. advantages conferred by a disk storage subsystem:
    1. better I/O performance due to caching and striping across multiple spindles,
    2. ability to redeploy disk space when a server is retired from service, and
    3. option to use FlashCopy to capture a rootvg backup (but mind the caveats) and
  2. option to move an AIX image from one physical server/LPAR to another.

There are, however, some disadvantages. AIX sysadmins who are aware of the disadvantages, mitigate them, and find the remaining disadvantages acceptable can boot from SAN with confidence.

  1. If a SAN hardware problem or an AIX defect causes intermittent loss of access to the SAN, there is no way of capturing a dump to determine what went wrong. The AIX error log and the AIX dump logical volume are in rootvg. If rootvg is on LUNs and AIX cannot access those LUNs, there is no way to write a dump to the SAN.

    Option to mitigate:

    Configure dump space on a SCSI hdisk dedicated to the LPAR (or on a vSCSI disk which is mapped to an internal SCSI disk allocated to a VIO Server LPAR). Extend rootvg onto the SCSI (or vSCSI) hdisk and configure dump space on it. Because this hdisk is not on the SAN, AIX can write a dump even if access to the SAN is lost. And the /var/adm/ras directory (which contains the AIX errlog) can also be allocated on a SCSI (or vSCSI) hdisk, although this seems less important than configuring non-SAN dump space.

  2. It is difficult to update the Multipath Subsystem Device Driver (SDD or SDDPCM) or other Fibre Channel multipath I/O support (e.g., EMC PowerPath or HDS HDLM) when rootvg is on hdisks accessed via multiple Fibre Channel paths, assuming such multipath access is supported for rootvg hdisks.

    Option to mitigate:

    No mitigation is required when using SDDPCM with AIX MPIO. A new version of SDDPCM can be installed while the current version remains in use. An AIX reboot is required to enable use of the new SDDPCM version for rootvg hdisks. For more information, see the "Updating SDDPCM" section of the Multipath Subsystem Device Driver User's Guide (downloadable from the Latest Multipath Subsystem Device Driver User's Guide web page). As with any system update, it is very important to preserve the option to fall back to a working version should an update render the AIX image unusable. See the note on the AIX V5.3 backup and restore web page for methods of updating AIX while preserving an option to fall back.

    When using SDD, boot from a LUN accessed via a single path (hdisk) rather than multiple paths (vpath). Mirror AIX on two such LUNs (accessed via different Fibre adapters) so that AIX can survive the failure of one of the Fibre adapters. This mitigation is not optional. As stated in the Multipath Subsystem Device Driver User's Guide:

    SDD does not support:

    • Multipathing to a system boot device
    • Placing system primary paging devices (for example, /dev/hd6) on an SDD vpath device
    • Configuring SDD vpath devices as system primary or secondary dump devices

    When SDD is configured without multipathing to system boot devices, there is no difficulty updating SDD software to a new level. It seems almost certain that the mitigation appropriate for SDD will work equally well for EMC PowerPath, HDS HDLM, and HP AutoPath. (It is likely that, like SDD, third-party multipath device drivers do not support multipathing to rootvg hdisks.)

    Confirm that the version of SDD or SDDPCM being used supports SAN boot for the storage device and AIX level in question. The Multipath Subsystem Device Driver User's Guide says:

    Note: SDDPCM supports ESS devices as SAN boot devices, starting from AIX 5.2I and AIX 5.3A. SDDPCM supports DS8000, DS6000, and SAN Volume Controller devices as SAN boot devices, starting from AIX 5.2L and AIX 5.3D.

    The AIX release numbers specified above require translation:

    Release code    AIX level
    AIX 5.2I        AIX V5.2 ML05
    AIX 5.2L        AIX V5.2 ML07
    AIX 5.3A        AIX V5.3 ML01
    AIX 5.3D        AIX V5.3 ML03


  3. Running AIX on LUNs can be the source of some very mysterious AIX behavior if the SAN occasionally injects delays into I/O operations, particularly paging operations. (According to a comment below by Jim Carstensen, updating SAN zoning will inject delays in some SANs.) And it is certainly the case that AIX hangs lasting several minutes are a big concern in a cluster (HACMP, VCS, etc), where, if a node hangs for a long time and suddenly wakes up, there is the risk of data corruption. It seems imprudent to boot from SAN (or allocate paging space on SAN for) any cluster node unless the cluster's shared volume groups are protected by disk reservation locks. (In this context, booting from vSCSI disks mapped to LUNs is equivalent to booting from SAN.)

    Please note that vSCSI disks don't currently support SCSI-3 persistent reserves, so it is currently impossible to protect (with disk reservation locks) a cluster's shared volume group if that volume group resides on vSCSI disks. However, SCSI-3 persistent reserves are supported with N Port ID Virtualization (NPIV), which is available with 8 Gb Fibre Channel PCIe adapters and PowerVM Standard Edition, assuming the minimum VIOS level (V2.1) and other requirements are met. (NPIV for System p was announced in IBM US Software Announcement 208-341, dated October 7, 2008.)

    If a SAN delay persists long enough that a write I/O request times out and fails and AIX does not crash as a result, there should be concern regarding data and filesystem integrity. While AIX is designed to handle write I/O request failures properly, it is not possible to inject every possible write I/O error in a test environment. Because every write I/O failure scenario can not possibly be tested, there is the potential that an undiscovered AIX software defect will impact data and filesystem integrity when a write I/O failure occurs. Therefore, even if write I/O failures do not cause AIX crashes, such failures must be treated as very serious SAN problems which deserve the greatest possible effort to diagnose and resolve.

  4. It is possible to accidentally install AIX on the wrong LUNs or to boot a system from the wrong LUNs. These risks can generally be avoided with prudent SAN administration.
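
The mitigation for disadvantage 1 (dump space on a non-SAN hdisk) can be sketched as a short command sequence. The Python helper below only builds the AIX commands as strings for review and does not run anything. The disk name hdisk1, the logical volume name dumplv, and the size of 8 logical partitions are assumptions chosen for illustration. The real size should come from a dump size estimate (for example, the output of sysdumpdev -e) on the system in question.

```python
import re


def dump_space_commands(hdisk, lv_name="dumplv", num_lps=8):
    """Return AIX commands (as strings) that place dump space on a non-SAN hdisk.

    Nothing is executed here. hdisk, lv_name and num_lps are illustrative
    assumptions; review the commands before running them on a real system.
    """
    # Only plain hdisk names are accepted: the SDD guide quoted above says
    # vpath devices are not supported as dump devices.
    if not re.fullmatch(r"hdisk\d+", hdisk):
        raise ValueError(f"expected an hdisk name such as hdisk1, got {hdisk!r}")
    if num_lps < 1:
        raise ValueError("num_lps must be at least 1")
    return [
        # Extend rootvg onto the SCSI (or vSCSI) disk.
        f"extendvg rootvg {hdisk}",
        # Create a logical volume of type sysdump on that disk only.
        f"mklv -y {lv_name} -t sysdump rootvg {num_lps} {hdisk}",
        # Make it the permanent primary dump device.
        f"sysdumpdev -P -p /dev/{lv_name}",
        # List the resulting dump device settings.
        "sysdumpdev -l",
    ]


if __name__ == "__main__":
    for command in dump_space_commands("hdisk1"):
        print(command)
```

Keeping the sequence as reviewable strings is a deliberate choice: changing the primary dump device affects problem determination, so a sysadmin may prefer to inspect each command before running it.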

Please note that rootvg can be placed on a vSCSI disk mapped to a LUN without concern for disadvantage #2. That's because the VIO client (to which the rootvg belongs) does not use (nor need) SDD for multipathing. See Figure 4-29, "Configuration for multiple Virtual I/O Server and IBM ESS" in the Advanced POWER Virtualization on IBM System p5 Redbook (SG24-7940-02), which shows that VIO client rootvg hdisks are configured with "MPIO default PCM failover only" when accessing a LUN through dual VIO Servers.
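
The release-code translation and the SDDPCM SAN boot note quoted under disadvantage 2 can be sketched as a small lookup. This is a minimal sketch that assumes "starting from" means "at that maintenance level or later". The function names and the device key "SVC" (for SAN Volume Controller) are hypothetical, and AIX versions other than 5.2 and 5.3 are not covered because the quoted note does not mention them.

```python
# Release codes from the SDD User's Guide mapped to (AIX version, maintenance level).
RELEASE_CODES = {
    "5.2I": ("5.2", 5),
    "5.2L": ("5.2", 7),
    "5.3A": ("5.3", 1),
    "5.3D": ("5.3", 3),
}

# Minimum maintenance level at which SDDPCM supports SAN boot, per device family.
SAN_BOOT_MINIMUM = {
    "ESS": {"5.2": 5, "5.3": 1},
    "DS8000": {"5.2": 7, "5.3": 3},
    "DS6000": {"5.2": 7, "5.3": 3},
    "SVC": {"5.2": 7, "5.3": 3},
}


def translate_release_code(code):
    """Translate e.g. 'AIX 5.3D' or '5.3D' into 'AIX V5.3 ML03'."""
    key = code.upper().replace("AIX", "").strip()
    version, ml = RELEASE_CODES[key]
    return f"AIX V{version} ML{ml:02d}"


def sddpcm_san_boot_supported(device, version, ml):
    """Return True if the quoted note covers SAN boot for this device and level."""
    minimums = SAN_BOOT_MINIMUM[device]
    if version not in minimums:
        raise ValueError(f"AIX {version} is not covered by the quoted note")
    return ml >= minimums[version]


if __name__ == "__main__":
    print(translate_release_code("AIX 5.3D"))
    print(sddpcm_san_boot_supported("DS8000", "5.3", 1))
```

The current maintenance level of a running system still has to be read from the system itself (for example, with oslevel -r) before it is compared against this table.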

Moving an AIX V5.3 rootvg from one physical server/LPAR to another

Note

Moving an AIX rootvg from one server/LPAR to another isn't supported but might work provided the CPU architectures of the source and target are the same and the source and target have identical PCI adapter configurations.

However, according to Shannon Moore of IBM Austin, even after a LUN has booted successfully on a server/LPAR other than the one from which it was installed/built, there is no guarantee that the LUN will continue to boot successfully. According to Shannon Moore, the only methods supported by IBM for copying an AIX rootvg from one server/LPAR to another are to use the AIX mksysb, mkcd, or mkdvd command, to use the AIX alt_disk_copy command with the -O flag, or to use the bare metal restore option of IBM Tivoli Storage Manager for System Backup and Recovery (formerly known as SysBack). See the AIX V5.3 backup and restore web page for more information.

The "AIX higher availability using SAN services" article documents alternatives for moving an AIX rootvg from one server/LPAR to another when the rootvg resides on virtual disks. Options for cloning an AIX image are discussed, as well.

Please note that if AIX is shut down on one server/LPAR and booted up on another, AIX will come up using Fibre Channel adapters which have different WWPNs than those on which it ran earlier. LUN masking (and probably SAN zoning) must be changed to accommodate the new WWPNs. The Ethernet MAC addresses will change, too. And unless the source and target LPARs have identical PCI adapter configurations, AIX may have difficulty mapping existing IP addresses to Ethernet adapters in the target server/LPAR.
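
The LUN masking change described above amounts to a set difference between the WWPNs of the source and target adapters. The following is a minimal sketch; the function names and the WWPN values in the usage example are hypothetical, and how the changes are applied depends on the storage subsystem and SAN switch in use.

```python
def normalize_wwpn(wwpn):
    """Lower-case a WWPN and strip ':' separators so both formats compare equal."""
    digits = wwpn.replace(":", "").lower()
    if len(digits) != 16 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a 64-bit WWPN: {wwpn!r}")
    return digits


def masking_changes(source_wwpns, target_wwpns):
    """Return the WWPNs to add to and remove from LUN masking (and zoning)."""
    source = {normalize_wwpn(w) for w in source_wwpns}
    target = {normalize_wwpn(w) for w in target_wwpns}
    return {
        "add": sorted(target - source),     # adapters in the target server/LPAR
        "remove": sorted(source - target),  # adapters left behind in the source
    }


if __name__ == "__main__":
    # Hypothetical WWPNs, for illustration only.
    changes = masking_changes(
        ["10:00:00:00:C9:AA:AA:01", "10:00:00:00:C9:AA:AA:02"],
        ["10000000c9bbbb01", "10:00:00:00:c9:aa:aa:02"],
    )
    print(changes)
```

Whether the old WWPNs should actually be removed is a judgment call: leaving them in place preserves the option to fall back to the source server/LPAR.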

Please note that if an AIX rootvg is moved from an LPAR on one server to an LPAR on another server, difficulties with DLPAR might be seen in the new server, as described in the Redbooks Technote "Alternate disk install cloning, improper hostname resolution may cause HMC RSCT errors".

Please help!

Please use the Add Comment link at the bottom of the page to inform others of (1) issues not yet documented here which are encountered when moving an AIX rootvg and (2) ways not yet documented here of mitigating issues. Thanks!

(Note: Until you sign up and log in (using links in the upper right corner of this web page), you will not see the Add Comment link and you can not add a comment.)

Some SANs cause delays while updating zoning, and systems with SAN boot are more likely to show a momentary interruption of service in errpt.  With multipathing these are not real problems, but keeping lines of communication open between SAN administrators and Unix admins will reduce the headaches of trying to diagnose the errpt entries.

A point of mitigation during upgrades (either of the OS or the multipathing software) is to use AIX alternate disk install where available.  This ensures a workable rootvg to roll back to.

Cheers,

Jim

Posted by jcarstensen at Oct 08, 2007 11:39 | Permalink

Just wanted to point out that SAN boot can also be done with EMC disks and PowerPath.

The caveat is that the disks must be EMC.  PowerPath upgrades are easily done by taking the root disk out of PowerPath control before the upgrade and putting it back afterwards.  That does have the disadvantage of extra reboots.

Veritas Storage Foundation also has a SAN boot feature available, but it only allows for a single LVM disk, which means no root clones.  Early 2008, this restriction should be gone.

Posted by gcherry at Jan 15, 2008 14:11 | Permalink

Any advice on installing SAN-booted Virtual I/O Servers? I could not find any reference to that.
Customer's p570 configuration has 4 HBAs and the integrated SAS controller with six internal disks.
Customer wants one most important LPAR that will own 2 HBAs because it is a database server.
There will be some other LPARs that have disk requirements, with minor load. For these LPARs without physical HBAs we planned to use two VIOS, each one with one HBA.
We are discussing two alternatives: one is adding another SAS controller and splitting the SAS disks between the two VIOS, for booting purposes of both VIOS. The second alternative is SAN booting the VIOS and using the internal SAS controller for a NIM LPAR.
Thanks a lot.
Gerardo

Posted by gerardoq at Oct 17, 2008 15:35 | Permalink

oneskywalker, please elaborate a bit on this statement: "A new version of SDDPCM can be installed while the current version remains in use."
We've tried to simply over-install the SDDPCM driver on one of our lab machines but the installation failed. The installp process reported that the old driver has to be removed first.
Should we try a forced over-installation?
Thanks for any hints...

Posted by chk2xn at Mar 17, 2009 08:13 | Permalink


What if the SAN boot device is an EMC or Hitachi LUN, using MPIO with PCMs [Path Control Modules] from the respective vendors? Is it possible to upgrade the PCM to a new version online without having to export the volume groups? I understand a reboot would be required after the upgrade.

Posted by vashi at Dec 08, 2009 14:38 | Permalink

 