How To
Summary
We often use mksysb and alt_disk_copy for a variety of reasons, mainly system backup and recovery.
In certain situations, however, mksysb and alt_disk_copy can hang, which prevents us from creating the backup or clone.
In this document I will explain the most common causes of mksysb and alt_disk_copy hangs and what data you need to upload to IBM support for quicker resolution of the problem.
Steps
1. Mksysb explained in 3 steps.
When creating an mksysb the command does the following:
- It creates/updates the image.data file located in the root directory, which contains all the LVM information of the system.
- Next, it generates a list of the files that it will back up.
- Finally, it begins the backup process.
When mksysb backs up the files, it is important to note that it backs up each file individually, in exactly the order given in the generated list of files.
It is also important to note that mksysb does not back up NFS-mounted filesystems; however, it still tries to access them.
The most common hang is caused by an NFS mount. Although mksysb does not back up NFS-mounted filesystems, it still attempts to access them. If an NFS mount is stale, mksysb hangs while attempting to access it. In this case, fixing or simply unmounting the filesystem resolves the issue. Sometimes a bad file, or files being modified by an application, can also cause mksysb to hang.
In that case, excluding those files from the backup should resolve the issue. If you are uncertain how to proceed, follow the instructions below to obtain and upload the data IBM support needs to investigate and advise on a possible resolution.
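Before starting mksysb, it can help to check that every NFS mount responds. The following is a minimal sketch, not an IBM-provided tool: the function names nfs_mount_points and probe_dir are hypothetical, and the awk filter assumes the AIX mount output layout in which NFS rows read node, mounted, mounted over, vfs. Adjust the column numbers if your output differs.

```shell
# Print the mount points of NFS filesystems from `mount` output on stdin.
# Assumption: NFS rows have the columns  node  mounted  mounted-over  vfs ...
nfs_mount_points() {
    awk '$4 ~ /^nfs/ { print $3 }'
}

# Return 0 if the directory can be listed within the time limit (seconds),
# non-zero if the listing fails or does not finish in time.
probe_dir() {
    dir=$1
    limit=${2:-10}
    # Run the access in the background so a stale mount cannot block us.
    ls -d "$dir" >/dev/null 2>&1 &
    pid=$!
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            # Still blocked: try to stop the probe and report a failure.
            kill "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # The probe finished on its own; pass its exit status to the caller.
    wait "$pid"
}
```

For example, `mount | nfs_mount_points | while read -r mp; do probe_dir "$mp" 10 || echo "no response: $mp"; done` prints each NFS mount point that did not answer within 10 seconds. The probe runs in the background because a process blocked on a stale mount may not react to a signal; the check moves on either way.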
If mksysb starts hanging, gather and upload the following data:
Put mksysb in debug mode:
# export MKSYSB_DEBUG=yes
Start a script session to capture the log:
# script /tmp/mksysb_debug.out
Run your mksysb command in verbose mode (adding the -v flag):
# mksysb -iv <dir/name_of_mksysb>
Next, open a new session and make a copy of the archive list. This is the file mksysb generates, listing the files it backs up in the exact order it processes them.
# cd /tmp
# ls -lotr
Look for a directory named like mksysb.####### (the numbers are different each time):
# cd mksysb.#######
# cp .archive.list.####### /tmp/mksysb_archive_list
Once the mksysb command hangs, stop the process with Ctrl+C, then stop the script:
# exit
Take mksysb out of debug mode:
# export MKSYSB_DEBUG=no
Next, collect and upload a snap:
# snap -r <--- clears old snap data
# snap -aZ
# cp /tmp/mksysb_debug.out /tmp/ibmsupt/testcase
# cp /tmp/mksysb_archive_list /tmp/ibmsupt/testcase
# cp /etc/exclude.rootvg /tmp/ibmsupt/testcase
# snap -c
# mv /tmp/ibmsupt/snap.pax.Z <Case Number>.snap.pax.Z
Upload the snap to your IBM support case.
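Because mksysb works through the archive list in order, the entry that follows the last file shown in the verbose output is a reasonable first suspect for a hang. The sketch below is illustrative only: next_in_list and add_exclude are hypothetical helper names, and the anchored pattern form (^./path$) assumes the grep-style matching used by /etc/exclude.rootvg, without escaping any special characters in the file name.

```shell
# Print the archive-list entry that follows the given entry.
# $1 = last file seen in the verbose output, $2 = copy of the archive list.
next_in_list() {
    last=$1
    list=$2
    # Appending "" forces a string comparison, even for numeric-looking names.
    awk -v last="$last" '
        found { print; exit }
        ($0 "") == (last "") { found = 1 }
    ' "$list"
}

# Append an entry as an anchored pattern to the exclude file, only once.
# $1 = entry as it appears in the archive list (for example ./dir/file)
# $2 = exclude file (defaults to /etc/exclude.rootvg)
add_exclude() {
    entry=$1
    file=${2:-/etc/exclude.rootvg}
    pattern="^${entry}\$"
    # Skip the append if the exact pattern line is already present.
    grep -F -x -e "$pattern" "$file" >/dev/null 2>&1 ||
        printf '%s\n' "$pattern" >> "$file"
}
```

With the copy made earlier, `next_in_list ./var/log/app.log /tmp/mksysb_archive_list` prints the entry that comes next (the path here is a made-up example). Once add_exclude has recorded the suspect file, running mksysb with the -e flag makes it honor /etc/exclude.rootvg.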
2. Alt_disk_copy hangs.
As with mksysb, alt_disk_copy hangs are usually caused by a dead NFS mount, corrupt files, or applications modifying files while alt_disk_copy is running.
Like mksysb, alt_disk_copy creates a list of files and backs them up in the order given in that list; it then restores the files on the target disk.
Most hangs occur during the backup phase.
If you are uncertain how to proceed when a hang occurs, follow the instructions below and upload the data IBM support needs to analyze and advise on a possible resolution.
Start a script session to capture the command output in debug and verbose mode:
# script /tmp/alt_disk_copy_debug.out
# alt_disk_copy <your flags> -D -V
As soon as you initiate the command, quickly open a new session to the system:
# cd /tmp
Keep running the 'ls' command below until a file with a name similar to .include.list.10813894 appears:
# ls -a | grep include.list
This file (.include.list.10813894) appears 20 - 30 seconds after alt_disk_copy is initiated. It disappears quickly, so keep repeating the 'ls' command until the file is created, then copy it before it disappears:
# cp <.include.list.10813894> include.list_original
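Since the include list only exists for a short time, the manual 'ls' and 'cp' steps can also be put in a polling loop. This is a hedged sketch rather than part of the documented procedure: capture_include_list is a hypothetical helper name, and the one-second interval and default of 120 attempts are arbitrary choices.

```shell
# Poll a directory until a .include.list.* file appears, then copy it.
# $1 = directory to watch (default /tmp)
# $2 = destination of the copy (default /tmp/include.list_original)
# $3 = number of one-second attempts before giving up (default 120)
capture_include_list() {
    dir=${1:-/tmp}
    dest=${2:-/tmp/include.list_original}
    tries=${3:-120}
    while [ "$tries" -gt 0 ]; do
        # If nothing matches, the pattern stays literal and -f is false.
        for f in "$dir"/.include.list.*; do
            if [ -f "$f" ] && cp "$f" "$dest"; then
                echo "copied $f to $dest"
                return 0
            fi
        done
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

Running `capture_include_list` in the second session right after starting alt_disk_copy waits up to two minutes for the file. As with the manual copy, the result may be incomplete if the file is copied while it is still being written, so check that the copy is not empty.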
Next, go back to the first session, wait for alt_disk_copy to complete, error out, or hang, and then stop the script as follows:
# exit
Next, collect and upload a snap:
# snap -r <--- clears old snap data
# snap -aZ
# cp /tmp/alt_disk_copy_debug.out /tmp/ibmsupt/testcase
# cp /tmp/include.list_original /tmp/ibmsupt/testcase
# snap -c
# mv /tmp/ibmsupt/snap.pax.Z <Case Number>.snap.pax.Z
Upload the snap to your IBM support case.
Document Location
Worldwide
Document Information
Modified date:
26 April 2022
UID
ibm16572753