IBM Support

Troubleshooting VIOS Snap Command Issues

Troubleshooting


Problem

Problem with padmin snap command on PowerVM Virtual I/O Server (VIOS)

Symptom

padmin snap command hangs or fails

Cause

This technote discusses the most frequent causes of snap failures on a VIOS and the troubleshooting approach to take if the command is believed to be hung.

Environment

VIOS 3.1 and 4.1

Diagnosing The Problem

Determine your VIOS level. Log in as padmin and run:
$ ioslevel
Then, see Part 3 of this technote for known issues. If none of them apply to your VIOS level, see Resolving The Problem for further investigation.

Resolving The Problem

Part 1 - padmin snap command hangs

A. Refer to Troubleshooting a Hung Process or Command on PowerVM Virtual I/O Server.   
Note 1:
During "Gathering tcpip system information.....", the snap command also gathers some fcstat data. If the procedure in Part 1.A shows that snap is hanging in fcstat (for example, "/usr/sbin/fcstat -e fcs2"), there is a problem associated with that Fibre Channel port that needs to be investigated. In that case, run the errlog command on the VIOS to determine whether the Fibre Channel port has reported errors that need attention. If no adapter errors are reported, work with your SAN administrator to verify all other physical components, such as cables, GBICs, switch ports, and FC adapter ports.

Once data collection is completed in Part 1.A, go back to the snap session. If snap has not progressed, you can press Ctrl+C for options. You will receive a message similar to the following:
     You have chosen to interrupt the current process.
     Press 'enter' for no action, 's' to attempt to kill
     the current operation, or 'q' to quit out altogether:
You can press Enter to skip the current information and continue with the snap collection.

 

Part 2 - padmin snap fails

By default, the padmin snap command creates two compressed files, one in each of the /tmp and /home file systems:
    /tmp/ibmsupt/snap.pax.Z and
    /home/padmin/snap.pax.Z
Therefore, these file systems must have enough free space to hold the snap data.
Ensure /tmp and /home are at 50% Used or less. This percentage works for most environments. To check file system space, run:
$ df -g
For more details, see Diagnosing Full File Systems in PowerVM VIOS.
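The 50% check above can be scripted. The following is a minimal sketch, not an IBM-supplied tool: the function name over_threshold is hypothetical, and it assumes the AIX "df -g" column layout (%Used in the fourth column, mount point in the seventh) and a root shell reached with oem_setup_env where awk is available.

```shell
# over_threshold [LIMIT]
# Hypothetical helper: reads "df -g" style output on stdin and prints the
# mount point and %Used of every file system above LIMIT (default 50).
# Assumes the AIX column order:
#   Filesystem  GB blocks  Free  %Used  Iused  %Iused  Mounted on
over_threshold() {
    limit="${1:-50}"
    awk -v limit="$limit" '
        NR > 1 {                      # skip the header line
            used = $4
            sub(/%/, "", used)        # "62%" -> "62"
            if (used + 0 > limit + 0)
                print $7, $4          # mount point and %Used
        }'
}

# Example (root shell on the VIOS):
#   df -g /tmp /home | over_threshold 50
```

If the function prints nothing, both file systems are at or below the limit. Any line it prints names a file system that should be cleaned up or extended before snap is run again.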

If file system space is not an issue, reproduce the snap failure as outlined in Capturing Debug Output of padmin CLI and submit the following:

$ oem_setup_env
# cd /home/ios/logs
# tar cvf home_ios_logs_.tar `find . -type f`    ->upload the generated tarball
# ls -la *.tar
-rw-r--r--    1 root     staff     385392640 Jun 19 12:25 home_ios_logs_.tar

Part 3 - Known Issues (not exhaustive)

4.1.1.0 IJ53686 SNAP SVCOLLECT GENERATES PERMISSION DENIED IN SVCOLLECT.ERR (viosvc.err is not captured by snap)
4.1.1.0 IJ53672 SNAP SVCOLLECT DO NOT COLLECT VIOSVC.OUT & VIOSVC.ERR -  4.1.0_ IJ53591
The viosvc.* files in the snap are zero size. Workaround: SupportLine may ask you to manually capture the output and upload these files by running:
   $ oem_setup_env
   # alog -f /home/ios/logs/viosvc.log -o > /tmp/viosvc.out
   # alog -f /home/ios/logs/viosvc.log.err -o > /tmp/viosvc.err

 
4.1.1.0  IJ53687 SSPDB IS MISSING(SNAP SVCOLLECT IN PADMIN):PG_VERSION NOT EXIST -  4.1.0_IJ53590
              IJ50968 SNAP DO NOT COLLECT POSTGRES SSP DB IN VIOS 3.1.4.31 /4.1.0.X
3.1.4  IJ44463 VIOS SNAP ADD WRONG LINK (/HOME/PADMIN) THEN FAILS ON NXT RUN - 3.1.3_IJ46054, 3.1.2_IJ46092
3.1.4  IJ45174 VIOS SNAP TOO LARGE NOT HANDLED BY STAT() WELL - 3.1.3_IJ44606, 3.1.2_IJ46094 
3.1.4  IJ46870 SNAP OR SNAP SVCOLLECT FAILED TO COLLECT CMDB WHEN RUN IN PADMIN - 3.1.3_ IJ41409
Known Issue in FC adapter firmware
padmin snap is known to hang because fcstat commands enter a delay loop while querying the same (fscsi#) PCIe2 8Gb 4-Port FC Adapter with the following microcode level:
fcs2!7710322514101e04.0320080200
Recommendation:
If your adapter firmware is at 0320080200, update to level 0320080270:
Impact: Data  Severity: HIPER
Fix for a potential issue during error recovery processing resulting in a possible loss of data that would not be detected if an IO is aborted after partial transfer of the FCP_RSP payload.
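To find adapters at the affected level, microcode listings in the "device!identifier.level" form shown above can be filtered. This is a sketch under assumptions: the function name affected_adapters is hypothetical, and it assumes the listing comes from "lsmcode -A" run in the root shell (oem_setup_env), whose output appears to match the format of the fcs2 line in this technote. The filter matches on the level string only, so confirm that any device it reports is a PCIe2 8Gb 4-Port FC Adapter.

```shell
# affected_adapters
# Hypothetical helper: reads lines such as
#   fcs2!7710322514101e04.0320080200
# on stdin and prints the device name of each entry whose microcode
# level (the text after the last ".") is exactly 0320080200.
affected_adapters() {
    awk -F'[!.]' '$NF == "0320080200" { print $1 }'
}

# Example (root shell on the VIOS):
#   lsmcode -A | affected_adapters
```

No output means no device in the listing is at level 0320080200.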


Document Information

Modified date:
22 July 2025

UID

isg3T1023492