IBM Support

IT26212: BACKUP VM HANG

APAR status

  • Closed as program error.

Error description

  • In rare cases, the 'Backup VM' process hangs because a thread
    uses an invalid pointer. This was observed on SLES Linux.
    
    Products affected:
    IBM Spectrum Protect for Virtual Environments:
    Data Protection for VMware versions 7.1.8 and 8.1
    on Linux x86 and Windows x64 platforms
    IBM Spectrum Protect for Virtual Environments:
    Data Protection for Microsoft Hyper-V versions 7.1.8 and 8.1 on
    Windows x64 platforms
    
    If you are using Data Protection for VMware 8.1, refer to APAR
    IT26212.
    
    If you are using Data Protection for Microsoft Hyper-V
    8.1.4-8.1.6, refer to APAR IT26761.
    
    If you are using Data Protection for VMware 7.1.8 or
    Data Protection for Microsoft Hyper-V 7.1.8 or 8.1.0-8.1.2,
    refer to APAR IT26762.
    
    Note 1: The Backup-Archive Client is a prerequisite for using
    Data Protection for VMware version 7.1.
    In Data Protection for VMware environments,
    the Backup-Archive Client is also known as the data mover.
    
    Note 2: The Backup-Archive Client is a prerequisite for using
    Data Protection for Microsoft Hyper-V versions 7.1 through 8.1.2.
    In Data Protection for Microsoft Hyper-V environments,
    the Backup-Archive Client is also known as the data mover.
    
    Customer/L2 Diagnostics
    In a data mover client trace (service, VM), we can see that the
    guest 'guest_1' is put into the list of failed VMs, but at that
    point it has not yet been handled.
    
    <timestamp> [PID] [TID_1] : vmOverlappedIO.cpp  (2910):
    OverlappedIOMonitor::KillVM(): error happened on consumer
    thread, abandoning backup for vm 'guest_1'
    
    The Consumer Thread [TID_2] was working with vm 'guest_2', which
    had a failure and was therefore also put in the list that should
    be handled by the OverlappedIOMonitor Thread [TID_1].
    
    <timestamp> [PID] [TID_2] : vmbackvddk.cpp   (13360): EXIT
    <===== vmGetObjInfoDisk(), rc = 0
    <timestamp> [PID] [TID_2] : vmbackcommon.cpp (5684):
    VmVerifyIfSingleDisk(): Found disk: Hard Disk 2
    <timestamp> [PID] [TID_2] : vmbackvddk.cpp   (16952):
    VmGetDiskNumFromLabel: disk num '2' for label 'Hard Disk 2'.
    <timestamp> [PID] [TID_2] : vmbackcommon.cpp (5698):
    VmVerifyIfSingleDisk(): Verifying disk backup ctls: checking
    size on disk vs ctl size coverage: Hard Disk 2.
    <timestamp> [PID] [TID_2] : vmbackcommon.cpp (5300):
    VmVerifyIfDiskBackup():
         Num of CTLs = 0; type = IFINCR
    .. : No ctl files found!
    .. : VM / Disk         : guest_2 / Hard Disk 2
    .. : capacity          : 2147483648
    .. : size on disk      : 2147483648
    .. : ctl coverage size : 0
    .. : disk included     : Yes
    .. : prev backup ifincr: Yes
    .. : ctl matches size  : No
    .. : ctl found         : No
    .. : bitmap found      : No
    .. : disk used         : Yes
    .. : result            : FAIL; missing CTLs
    <timestamp> [PID] [TID_2] : vmbackcommon.cpp (5496): ANS9921E
    Virtual machine disk, guest_2 (Hard Disk 2), verification check
    failed (2147483648/0). If the disk is on an NFS datastore and
    the disk size was recently changed or the disk was moved from
    non-NFS datastore to NFS datastore, the verification failure is
    expected and a full backup is required.
    <timestamp> [PID] [TID_2] : vmbackcommon.cpp (5778):
    VmVerifyIfSingleDisk(): Exiting with rc 6560.
    <timestamp> [PID] [TID_2] : vmbackvddk.cpp   (8168): ANS9919E
    Failed to find the expected control files for guest_2.
    <timestamp> [PID] [TID_2] : vmbackvcm.cpp    ( 285): =========>
    Entering vcmFlushVolumeControlLibrary()
    <timestamp> [PID] [TID_2] : vmbackvcm.cpp    ( 206): =========>
    Entering vcmLogger::trace()
    <timestamp> [PID] [TID_2] : vmbackvcm.cpp    ( 217): ANS5250E An
    unexpected error was encountered.
    <timestamp> [PID] [TID_2] : vmbackvcm.cpp    ( 222): <=========
    Exiting vcmLogger::trace()
    <timestamp> [PID] [TID_2] : vmbackvcm.cpp    ( 295): <=========
    Exiting vcmFlushVolumeControlLibrary()
    <timestamp> [PID] [TID_2] : vmbackvddk.cpp   (8222):
    VmSendData(): VmVerifyIfSingleDisk() returned rc=4379
    <timestamp> [PID] [TID_2] : vmbackvddk.cpp   (9033):
    VmSendData(): vm guest_2 has 1 totaldisks entries to dispatch.
    <timestamp> [PID] [TID_2] : vmbackvddk.cpp   (9074):
    VmSendData(): we had an error, telling the IO Monitor to stop
    backing up this VM.
    
    In the last line, the client puts the VM into the list of failed
    VMs. After that, the Consumer Thread [TID_2] destroys the pointer
    that was used in the message to the OverlappedIOMonitor Thread
    [TID_1].
    
    <timestamp> [PID] [TID_2] : vmbackvddk.cpp      (14921):
    vmBackupVMCleanup(): free vmBackupDataPP
    
    However, in the meantime, the OverlappedIOMonitor Thread
    [TID_1] handled the "kill vm" message and used the invalid
    pointer, which had already been destroyed by the Consumer Thread
    [TID_2]:
    
    <timestamp> [PID] [TID_1] : vmOverlappedIO.cpp  (2910):
    OverlappedIOMonitor::KillVM(): error happened on consumer
    thread, abandoning backup for vm 'guest_1'
    
    At the moment the message was handled, the same pointer was
    still pointing to vm 'guest_1', which causes the processing to
    halt.
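    
    The race described above is a use-after-free between the thread
    that posts a message and the thread that handles it. The
    following is a minimal C++ sketch of one common way to avoid this
    class of problem, in which the message shares ownership of the
    per-VM state. It is not the actual product code or the actual
    fix. The names VmBackupData, KillVmMessage, FailedVmQueue,
    consumerThread and runDemo are hypothetical and chosen only for
    illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-VM backup state that the consumer
// thread frees during cleanup.
struct VmBackupData {
    std::string vmName;
};

// Hypothetical "kill vm" message. It holds a shared_ptr, so the state
// stays alive until the monitor thread has finished with the message.
struct KillVmMessage {
    std::shared_ptr<VmBackupData> data;
};

// Minimal thread-safe FIFO standing in for the list of failed VMs.
class FailedVmQueue {
public:
    void post(KillVmMessage msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(msg));
        }
        ready_.notify_one();
    }

    // Signals that no further messages will be posted.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a message is available; returns false once the queue
    // is closed and drained.
    bool take(KillVmMessage& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) {
            return false;
        }
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<KillVmMessage> queue_;
    bool closed_ = false;
};

// Consumer side: report the failure, then drop the local reference
// right away, mirroring the cleanup step seen in the trace.
void consumerThread(FailedVmQueue& failed, const std::string& vmName) {
    auto data = std::make_shared<VmBackupData>();
    data->vmName = vmName;
    failed.post(KillVmMessage{data});
    data.reset();  // the queued message still owns the object
}

// Runs two consumers and one monitor; returns the VM names the monitor
// saw, in the order the failures were posted.
std::vector<std::string> runDemo() {
    FailedVmQueue failed;
    std::vector<std::string> abandoned;

    std::thread monitor([&failed, &abandoned] {
        KillVmMessage msg;
        while (failed.take(msg)) {
            assert(msg.data != nullptr);  // every message carries live state
            abandoned.push_back(msg.data->vmName);
        }
    });

    std::thread first(consumerThread, std::ref(failed), std::string("guest_1"));
    first.join();
    std::thread second(consumerThread, std::ref(failed), std::string("guest_2"));
    second.join();

    failed.close();
    monitor.join();
    return abandoned;
}
```

    In this sketch, runDemo() returns 'guest_1' followed by 'guest_2'.
    Each message keeps its own VM state alive, so the monitor never
    reads state that a consumer has already released. An alternative
    design would have the consumer wait for an acknowledgement from
    the monitor before freeing the state.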
    
    Initial Impact: Medium
    
    Additional Keywords: TS000989064 halt hang backup
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * Data Protection for VMware versions 7.1.8 and 8.1 on Linux   *
    * x86 and Windows x64 platforms                                *
    * Data Protection for Microsoft Hyper-V versions 7.1.8 and 8.1 *
    * on Windows x64 platforms                                     *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * see ERROR DESCRIPTION                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * This issue is projected to be fixed in the Data Protection   *
    * for VMware versions 8.1.6.1 and 8.1.7 on Microsoft Windows   *
    * x64 and Linux x86_64 platforms.                              *
    * Note 1: This is subject to change at the discretion of IBM.  *
    ****************************************************************
    *
    

Problem conclusion

  • The code has been changed so that the data mover no longer hangs.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT26212

  • Reported component name

    TSM FOR VE DP V

  • Reported component ID

    5725TVEVM

  • Reported release

    81L

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-09-07

  • Closed date

    2018-11-02

  • Last modified date

    2018-11-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IT26761 IT26762

Modules/Macros

  • dsmc
    

Fix information

  • Fixed component name

    TSM FOR VE DP V

  • Fixed component ID

    5725TVEVM

Applicable component levels

  • R81L PSY

       UP

  • R81W PSY

       UP

Document Information

Modified date:
14 September 2021