IBM Support

IT42865: VMWARE GUEST BACKUPS CTGGA3134 MESSAGES ABOUT UNRESPONSIVE VADP

APAR status

  • Closed as program error.

Error description

  • The IBM Spectrum Protect Plus vADP host can become unresponsive
    during a VMware guest backup job.
    
    Guest backups can stop with the following message in the job log:
    
    ERROR,..,CTGGA3134,Failed to receive status updates from the
                       backup process on VADP proxy (<ProxyAddress>)
                       stopped the backup of VM (<VMName>).
    
    Investigation on the vADP host reveals that the affected guest
    backup sessions are killed by the Linux out-of-memory (OOM)
    killer, as seen in the /var/log/messages file.
    
    Even when no backup is active, the
    'ps -eF|grep vmdkbackup' command displays many leftover
    guest sessions such as:
    
    UID    PID PPID C     SZ   RSS PSR STIME TTY TIME     CMD
    ...
    root 12345 1234 1 670845 76940   7 00:06 ?   00:06:09 /opt/IBM/SPP/bin/vmdkbackup -a /tmp/vmdkbackup-<JobId>-vm-<mob>-<xxx>.json
    
    that should have been cleaned up at backup completion.
    
    It is the accumulation of these sessions, each holding some
    memory, that leads to the memory shortage on the vADP host.
    
    The 'mount' and 'ls -l /sppvadp' commands also display many
    leftover mounted directories such as:
    
    /sppvadp/sppvadp__vsnap_vpool<x>_fs<yy>_<IPAddress>_<JobID>
    
    Further investigation of the VMware VDDK log for guests with
    such a leftover session shows that, at the end of processing,
    the unmount of the vSnap directory used to store the backup
    data fails:
    
    ...
    2023-01-02T20:07:42.746Z [I] The backup completed successfully.
                                 Total bytes transferred <xxx> MB in
                                 <y> second(s). Throughput: <zz>
                                 MB/s.
    ...
    2023-01-02T20:07:42.767Z [I] Unmounting /sppvadp/sppvadp__vsnap_
                                 vpool<x>_fs<yy>_<IP>_<JobId>...
    2023-01-02T20:07:42.769Z [I] Command finished with status:
                                 (exit status 1), desc (/bin/umount:
                                 invalid option -- ' '
    ...
    2023-01-02T20:07:42.769Z [I] Removing directory /sppvadp/sppvadp
                                 __vsnap_vpool<x>_fs<yy>_<IP>_
                                 <JobId>...
    2023-01-02T20:07:42.769Z [E] Command to remove directory
                                 (/sppvadp/sppvadp__vsnap_vpool<x>_
                                 fs<yy>_<IP>_<JobId>) failed:
                                 (remove /sppvadp/sppvadp__vsnap_
                                 vpool<x>_fs<yy>_<IP>_<JobId>:
                                 device or resource busy)
    
    This problem affects only vADP hosts running RHEL 7.x or
    CentOS 7, and RHEL 8.5 or earlier or CentOS 8.
    vADP services running on supported RHEL 8.6 and later versions
    are not affected.
    
    IBM Spectrum Protect Plus Versions Affected:
    IBM Spectrum Protect Plus 10.1.12 and later
    
    Additional Keywords: SPP, SPPLUS, TS011633854, hang, memory,
                         full
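
The checks described above can be scripted when several vADP hosts need
to be inspected. The following is a minimal sketch, not part of the
product. The function names are hypothetical, the OOM patterns are
typical kernel wording that can vary between kernel versions, and the
VDDK log path is not stated in this APAR, so the caller must supply it.

```shell
#!/bin/sh
# Diagnostic helpers for a vADP host (illustrative sketch; all function
# names are hypothetical and not part of IBM Spectrum Protect Plus).

# Print kernel OOM-killer lines from a syslog file.
# $1: log file (default: /var/log/messages). Patterns are assumptions
# based on common kernel message wording.
oom_events() {
    grep -iE 'oom-killer|out of memory' "${1:-/var/log/messages}"
}

# Read "PID RSS COMMAND..." lines on stdin (for example from
# `ps -eo pid=,rss=,args=`) and report the number of vmdkbackup
# processes and their combined resident memory in KiB.
vmdkbackup_summary() {
    awk '$3 ~ /\/vmdkbackup$/ { n++; kib += $2 }
         END { printf "%d vmdkbackup sessions, %d KiB resident\n", n, kib }'
}

# Read `mount` output on stdin and print leftover vSnap mount points
# under /sppvadp (the directory in "<source> on <directory> type ...").
sppvadp_mounts() {
    awk '$2 == "on" && $3 ~ /^\/sppvadp\/sppvadp__vsnap_/ { print $3 }'
}

# Print the lines of a log file that show the failed unmount or the
# failed directory removal. $1: log file chosen by the caller.
unmount_failures() {
    grep -E 'umount: invalid option|device or resource busy' "$1"
}
```

On a vADP host the helpers could be called as `oom_events`,
`ps -eo pid=,rss=,args= | vmdkbackup_summary` and
`mount | sppvadp_mounts`. A non-zero session count or any listed mount
point while no backup job is running matches the symptoms of this APAR.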
    

Local fix

  • Define new vADP hosts based on a supported RHEL 8.6 or later
    version and remove the ones still running the affected Linux
    versions.
    OR
    When there is no active backup session for the vADP host:
    1. Stop the vADP service with the command:
          sudo systemctl stop remote-vadp
    2. If leftover mounts in the format
       /sppvadp/sppvadp__vsnap_vpool<x>_fs<yy>_<IP>_<JobId>
       are seen, unmount them one by one:
       /bin/umount -l -f /sppvadp/sppvadp__vsnap_vpool<x>_fs<yy>_<IP>_<JobId>
    3. If leftover directories are seen, delete everything in
       the /sppvadp directory.
    4. Start the vADP service:
          sudo systemctl start remote-vadp
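
The manual cleanup steps can be collected into a plan that is reviewed
before anything is run. The sketch below only prints commands and
executes none of them. The function name vadp_cleanup_plan is
hypothetical. As a deliberately conservative assumption, it proposes
rmdir on the leftover sppvadp__vsnap_* directories instead of deleting
everything in /sppvadp, so that a directory that is still mounted or
not empty is left untouched. It assumes that no backup session is
active on the vADP host.

```shell
#!/bin/sh
# Print, without executing, the cleanup commands for leftover vSnap
# mounts and directories (illustrative sketch; hypothetical helper).
# $1:    directory to inspect (default: /sppvadp)
# stdin: output of the `mount` command
vadp_cleanup_plan() {
    root="${1:-/sppvadp}"

    # Step 1: stop the vADP service.
    echo "sudo systemctl stop remote-vadp"

    # Step 2: one lazy, forced unmount per leftover mount point.
    awk -v root="$root" '
        $2 == "on" && index($3, root "/sppvadp__vsnap_") == 1 {
            print "/bin/umount -l -f " $3
        }'

    # Step 3: remove leftover directories. rmdir fails on a directory
    # that is still busy or not empty instead of deleting its content.
    for dir in "$root"/sppvadp__vsnap_*; do
        if [ -d "$dir" ]; then
            echo "rmdir $dir"
        fi
    done

    # Step 4: start the vADP service again.
    echo "sudo systemctl start remote-vadp"
}
```

Running `mount | vadp_cleanup_plan` prints the plan for the default
/sppvadp directory. The printed commands can then be checked and run
as root one by one.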
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * IBM Spectrum Protect Plus levels 10.1.12 and 10.1.13         *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply the fixing level when available. This problem is       *
    * currently projected to be fixed in IBM Spectrum Protect Plus *
    * levels 10.1.13.1 and 10.1.14. Note that this is subject to   *
    * change at the discretion of IBM.                             *
    ****************************************************************
    

Problem conclusion

  • vADP was fixed so that it now works on CentOS, RHEL 7, and RHEL 8.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT42865

  • Reported component name

    SP PLUS

  • Reported component ID

    5737SPLUS

  • Reported release

    A1C

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2023-01-09

  • Closed date

    2023-01-30

  • Last modified date

    2023-03-01

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SP PLUS

  • Fixed component ID

    5737SPLUS

Applicable component levels

Document Information

Modified date:
01 February 2024