AIX V5.3 backup and restore
This web page discusses ways of backing up and restoring the AIX rootvg. Given that AIX will not boot if the rootvg is badly damaged, rootvg backups must be restored by a process which is able to boot independently of the rootvg (e.g., a boot image on tape).
The contents of this web page solely reflect the personal views of the authors and do not necessarily represent the views, positions, strategies or opinions of IBM or IBM management.
Bootable AIX rootvg backups are typically captured using the AIX mksysb command, or a bare metal restore option such as IBM Tivoli Storage Manager for System Backup and Recovery (formerly known as SysBack) or SBAdmin from Storix. When a mksysb backup is written to tape, a boot record is written at the beginning of the tape, allowing the tape drive to be booted on "bare metal" to restore the AIX rootvg if AIX is unable to boot from disk for any reason.
A mksysb backup can also be written to a file in a filesystem on disk. A Network Installation Management (NIM) server can be used to restore a file-based mksysb backup.
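As a minimal sketch of the two mksysb targets described above (tape and file), the following wrapper assumes a tape drive at /dev/rmt0 and a backup filesystem at /backup; both names are hypothetical. The -i flag asks mksysb to regenerate /image.data before the backup starts. The run helper is an illustration only and is not part of AIX: it prints each command unless DRY_RUN=0 is set, so the sketch can be reviewed before anything is written.

```shell
#!/bin/sh
# Illustrative helper (not part of AIX): print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Back up rootvg to a tape device or to a file.
# -i regenerates /image.data before the backup is taken.
# $1 is an assumed target such as /dev/rmt0 or /backup/myhost.mksysb.
backup_rootvg() {
    run mksysb -i "$1"
}

# Usage (hypothetical targets):
#   backup_rootvg /dev/rmt0                # bootable tape
#   backup_rootvg /backup/myhost.mksysb    # file for NIM, mkcd, or mkdvd
```

Running the functions with DRY_RUN=0 on the AIX server performs the real backup; without it, only the command line is shown.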
The mkcd command can be used to write a file-based mksysb backup to CD-Recordable (CD-R) or DVD-Recordable (DVD-R, DVD-RAM) media, which can then be booted on "bare metal" to restore the AIX rootvg.
The mkdvd command can be used to write a bootable AIX backup directly to DVD-Recordable (DVD-R, DVD-RAM) media or to write a file-based mksysb backup to DVD-Recordable (DVD-R, DVD-RAM) media, which can then be booted on "bare metal" to restore the AIX rootvg.
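The mkcd and mkdvd uses described above might look like the following sketch. The device name /dev/cd1 and the image path are assumptions, and only the -d (recordable device) and -m (existing mksysb image) flags are shown; other flags, such as those selecting the media format, are left out. As a safety measure that is purely illustrative, the run helper prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Illustrative helper (not part of AIX): print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Write an existing file-based mksysb image to CD-R.
#   $1 = recordable device (assumed, e.g. /dev/cd1)
#   $2 = mksysb image file (assumed, e.g. /backup/myhost.mksysb)
burn_mksysb_to_cd() {
    run mkcd -d "$1" -m "$2"
}

# Create a new rootvg backup and write it directly to DVD.
#   $1 = recordable device (assumed, e.g. /dev/cd1)
backup_rootvg_to_dvd() {
    run mkdvd -d "$1"
}

# Usage (hypothetical names):
#   burn_mksysb_to_cd /dev/cd1 /backup/myhost.mksysb
#   backup_rootvg_to_dvd /dev/cd1
```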
If spare disk drives are available, the AIX rootvg can be "cloned" to them using the alt_disk_copy command with the -O flag, naming the spare drives (for example, hdisk2 and hdisk3) as the targets on which the copy should be placed. Please note that the -O flag is required when "cloning" (that is, when planning to boot the rootvg copy on another LPAR or server), but can be detrimental when making a copy which will be booted on the same LPAR or server. Before taking the target disks away from the existing AIX image, run the alt_rootvg_op command to remove the definition of the copy from the running system.
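A sketch of the two steps, assuming hdisk2 and hdisk3 are the spare target disks and that the copy keeps its default name, altinst_rootvg. The -B flag (leave the boot list unchanged) is an assumption added here because the copy is meant to boot elsewhere; the text above only calls for -O. The run helper is illustrative only and prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Illustrative helper (not part of AIX): print the command unless DRY_RUN=0.
# Arguments containing spaces are shown in double quotes.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
        return
    fi
    line=""
    for arg in "$@"; do
        case $arg in
            *" "*) line="$line \"$arg\"" ;;
            *)     line="$line $arg" ;;
        esac
    done
    echo "${line# }"
}

# Clone rootvg to the given spare disks for use on another LPAR or server.
# -O resets device configuration on the copy; -B (assumption) keeps the
# current boot list; -d takes the target disks as one quoted list.
clone_rootvg() {
    run alt_disk_copy -O -B -d "$*"
}

# Remove the definition of the copy from the running system before the
# disks are taken away; altinst_rootvg is the default name of the copy.
release_clone() {
    run alt_rootvg_op -X "${1:-altinst_rootvg}"
}

# Usage (hypothetical disks):
#   clone_rootvg hdisk2 hdisk3
#   release_clone
```

For a copy that will be booted on the same LPAR or server, the text above suggests leaving out -O.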
If a rootvg copy has been made for use on the same LPAR/server as the original rootvg (presumably without the -O flag on alt_disk_copy), System Management Services (SMS) can be used to switch between the primary and backup AIX rootvgs by shutting AIX down, booting to SMS mode, and selecting the disks from which to boot.
All of the above options (mksysb, mkcd, mkdvd, and alt_disk_copy) capture a backup of an AIX server while it is running. Files are captured one at a time at different points in time, so any applications which have files stored in rootvg filesystems and depend upon all application data files being captured at a single point in time cannot be dependably restored from backups captured using the above options while the application is running (hot backup).
Note: According to the IBM Software on-line catalog, an IBM Tivoli Storage Manager for System Backup and Recovery Managed Processor License + SW Maintenance 12 Months (D521YLL) cost $295 per server as of 12/1/2006. Rumor has it that if one downloads the latest PTF, one will be able to install and use SysBack for 60 days before it fails due to lack of a license. To find the PTF, start at the TSMSBR web page, follow the Technical support link on the left, then the Download link (under IBM Tivoli Storage Manager for System Backup and Recovery support), and then the link to Latest PTF for ITSM System Backup and Recovery (SysBack) 5.6 Users.
It is especially important to capture a bootable backup periodically while building and testing a new server. Capturing a new rootvg backup is recommended each time the AIX configuration is changed and before making significant AIX changes. For example, the IBM AIX 5L Operating System Service Strategy Details and Best Practices document (downloadable from the Service and support best practices for UNIX servers web page) says:
"Technology Levels must be applied as a group, using the smitty update_all or install_all_updates commands. Installing a Technology Level is an "all or nothing" operation. ... Before applying a TL, you should always create a backup and plan on restoring that backup if you need to rollback to your previous level."
There are several ways to roll back to a previous AIX level. To roll back with a reboot, update AIX to the new Technology Level using alt_disk_install or the new multibos function shipped with AIX V5.3 Maintenance Level 3. To roll back with a restore, create a backup (mksysb) to a NIM server or to bootable media (CD, DVD, or tape) before updating AIX to the new Technology Level.
For more information on AIX backup options, see:
- the Backup methods chapter in the AIX V5.3 Operating system and device management manual,
- the Backing Up the Operating System in AIX 4 and 5 Technote,
- the Back up and restore your AIX system, Part 1: The when, why, and how of backing up IBM developerWorks tutorial,
- the Back up and restore your AIX system, Part 2: Implementing your backup strategy and restoration processes IBM developerWorks tutorial, and
- the Backing Up and Restoring Your AIX System white paper.
Note: Access to the IBM developerWorks tutorials requires an IBM ID. Registering for an IBM ID is quick and easy. The tutorials have multiple pages. Be sure to use the page navigation button (at the top and bottom of each page) to see pages other than the first.
Backup options which might not work as expected
- The mksysb, mkcd, mkdvd, and alt_disk_copy commands back up only those filesystems in the root volume group which are mounted when the commands run. The commands also all capture a backup of an AIX server while AIX continues to run. Files are captured one at a time at different points in time, so any applications which have files stored in rootvg filesystems and depend upon all application data files being captured at a single point in time (cold backup) cannot be dependably restored from backups captured using the above options while the application is running (hot backup). So don't store application data files in the AIX rootvg.
And when RECOVER_DEVICES=no in the /bosinst.data file (the default and the only viable option when using mksysb to clone an AIX image from one LPAR to another), the process of restoring AIX from backup removes all device definitions (e.g., hdisks) and rediscovers devices, thus resetting device attributes to default values. One such device is aio0, so restoring AIX does not restore AIX asynchronous I/O tailoring. Any AIO tuning which has been done must be redone after restoring.
- As stated in the Backing Up the Operating System in AIX 4 and 5 Technote cited above, "The IBM AIX UNIX differs from other UNIXs because of two main features: the Object Database Manager (ODM) and the Logical Volume Manager (LVM). Due to the ODM and the LVM, as well as the ability to have multiple volume groups, a complete system archive made with cpio or tar will not restore properly."
- Methods used to back up an AIX rootvg cannot necessarily be used to back up a VIO Server rootvg. See the Methods to Backup and Restore the Virtual I/O Server document (downloadable from the Virtual I/O Server Support for Power Systems web page) for supported methods of backing up and restoring a VIO Server.
- Mirroring - AIX LVM mirroring (RAID-1) is in the base operating system. The AIX rootvg can be mirrored, but please note that if a new logical volume is added to a mirrored volume group, the new logical volume is not mirrored! (Use the mklvcopy command to mirror a single logical volume. Use the migratepv command to move parts or all of a logical volume to a different physical disk in preparation for mirroring. An AIX logical volume can be mirrored or moved elsewhere while it remains in use.) Mirroring can protect against disk failures, but cannot protect against other sources of damage (e.g., accidentally erasing a crucial file, file data corruption due to a software failure, etc). If AIX won't boot due to corruption of the AIX rootvg, any attempt to recover (which may or may not succeed) will require bootable media at the same Technology Level as the operating system to be recovered. When mirroring a volume group to only two disks, disable the quorum mechanism for the volume group to create what is known as a nonquorum volume group. (The mirrorvg command disables quorum by default.) Please note that while a nonquorum volume group will stay varied on if a disk fails, a two-disk volume group cannot be varied on without the force option unless the disk with two VGDAs is available.
- FlashCopy and similar LUN copy services (e.g., EMC TimeFinder or HDS ShadowImage) from other data storage vendors - Care must be taken when capturing a FlashCopy image of a mounted AIX JFS or JFS2 filesystem. Consistency groups or the freeze/thaw options of the AIX chfs command must be used to capture a copy of the mounted filesystem and its JFSLOG/JFS2LOG at the same point in time across all LUNs on which the filesystem and log reside. (See the AIX V5.3 Release Notes for more information regarding freeze/thaw.) A backup/restore methodology which uses FlashCopy with neither consistency groups nor freeze/thaw against LUNs containing mounted filesystems may be tested and work repeatedly, yet may eventually fail no matter how many times it has been used successfully. The failure will be subtle. The FlashCopy itself will complete without errors. The failure will manifest itself as data corruption. Either the filesystem won't mount (metadata corruption), AIX will later crash or otherwise report errors with filesystem metadata (metadata corruption), or the application will fail when attempting to use the copy (file corruption). Please note that the FlashCopy has functioned correctly, as far as it is able. It is the way in which FlashCopy is used that is flawed. And while use of consistency groups can protect metadata integrity, file data integrity is not protected. File data integrity must be protected by other means (e.g., application recovery techniques following restore, AIX freeze/thaw, etc). Please see Clarification of Supported and Unsupported Methodology for Flashcopy Backups of Mounted AIX Filesystems for more information.
- dd of a logical volume containing a mounted filesystem - As discussed regarding FlashCopy above, metadata and file data integrity can suffer if a copy is not captured at a single point in time. Given the way in which it operates, dd cannot possibly capture a copy of a logical volume at a single point in time. A backup/restore methodology which uses dd without freeze/thaw against logical (or physical) volumes containing mounted filesystems may be tested and work repeatedly, yet may eventually fail no matter how many times it has been used successfully. The failure will be subtle. The dd itself will complete without errors. The failure will manifest itself as data corruption. Either the filesystem won't mount (metadata corruption), AIX will later crash or otherwise report errors with filesystem metadata (metadata corruption), or the application will fail when attempting to use the copy (file corruption). Please note that dd has functioned correctly, as far as it is able. It is the way in which dd is used that is flawed. This caveat applies when using dd to clone an AIX rootvg on a VIO Server unless the AIX client LPAR is shut down while the dd is done. This caveat applies when using dd to copy one hdisk to another when the source hdisk contains filesystems which are mounted and not frozen when the dd is run. (Please note that the alt_disk_copy command can be used on a running AIX system to capture a copy of rootvg to a spare disk outside of rootvg.)
- dd from and back to a raw logical volume - Using dd to restore a raw logical volume will overwrite the logical volume control block (LVCB) which, according to the LVCB warnings article, is in the first 512 bytes of the logical volume. In most circumstances, it is undesirable to overwrite the LVCB. The command dd if=/dev/r$lvname of=/nas/$lvname.dd bs=1m will work fine to back up a raw logical volume, provided the physical partition size is a multiple of 1m. To restore the entire raw logical volume except for the LVCB, the restore must skip the first 512-byte block of both the backup file and the target logical volume, which can be done with one dd using bs=512 for the rest of the first megabyte followed by a second dd using bs=1m for everything after it.
The target logical volume must, of course, be exactly the same size as the logical volume which was backed up. It is possible to restore using a single dd command, but then the entire logical volume must be restored using bs=512, which is very slow.
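A restore that leaves the LVCB alone can be sketched as follows. This is an illustration under stated assumptions, not necessarily the exact commands the authors used: the function name restore_except_lvcb is hypothetical, the backup was taken from the start of the logical volume, and the logical volume is at least 1 MB in size. The sketch writes the block size as bs=1024k, which is the same size as bs=1m, because that spelling is also accepted by other dd implementations.

```shell
#!/bin/sh
# Restore a dd backup onto a raw logical volume while leaving the first
# 512 bytes of the target (the LVCB) untouched.
#   $1 = backup file, e.g. /nas/$lvname.dd
#   $2 = target raw device, e.g. /dev/r$lvname
restore_except_lvcb() {
    backup=$1
    target=$2
    # Step 1: copy the rest of the first megabyte in 512-byte blocks,
    # skipping block 0 on both the input (skip) and the output (seek).
    # 2047 blocks of 512 bytes plus the skipped block make exactly 1 MB.
    dd if="$backup" of="$target" bs=512 skip=1 seek=1 count=2047 || return 1
    # Step 2: copy everything after the first megabyte in 1 MB blocks.
    dd if="$backup" of="$target" bs=1024k skip=1 seek=1
}

# Usage (names from the backup command above):
#   restore_except_lvcb /nas/$lvname.dd /dev/r$lvname
```

Only the first 2047 small blocks are copied at bs=512; the bulk of the data moves at the larger block size, which avoids the slowness of a single bs=512 pass.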
As with any new procedure, be sure to carefully test before using in production.
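The freeze/thaw approach mentioned in the FlashCopy and dd items above can be sketched as a wrapper that freezes a filesystem, runs the copy, and then thaws the filesystem again. This is a sketch under assumptions: the filesystem is JFS2, the 60-second timeout and the /data mount point are arbitrary examples, and start_flashcopy stands in for whatever hypothetical command triggers the copy at a given site. The run helper is illustrative only and prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Illustrative helper (not part of AIX): print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Freeze a filesystem, run a copy command, then thaw the filesystem.
#   $1 = mount point (assumed JFS2)
#   $2 = freeze timeout in seconds, after which AIX thaws on its own
#   remaining arguments = the copy command to run while frozen
copy_while_frozen() {
    fs=$1
    timeout=$2
    shift 2
    run chfs -a freeze="$timeout" "$fs" || return 1
    run "$@"
    status=$?
    # Always thaw, even if the copy command failed.
    run chfs -a freeze=off "$fs"
    return "$status"
}

# Usage (hypothetical copy command and names):
#   copy_while_frozen /data 60 start_flashcopy data_lun
```

The copy command should be one that completes quickly, such as the request that establishes the point-in-time copy, because writes to the filesystem are held while it is frozen.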