A fix is available
APAR status
Closed as program error.
Error description
In rare cases, the 'Backup VM' process halts due to an invalid pointer situation. This was observed on SLES Linux.

Products affected:
IBM Spectrum Protect for Virtual Environments: Data Protection for VMware version 7.1.8 and 8.1 on Linux x86 and Windows x64 platform
IBM Spectrum Protect for Virtual Environments: Data Protection for Microsoft Hyper-V version 7.1.8 and 8.1 on Windows x64 platform

If you are using Data Protection for VMware 8.1, refer to APAR IT26212.
If you are using Data Protection for Microsoft Hyper-V 8.1.4-8.1.6, refer to APAR IT26761.
If you are using Data Protection for VMware 7.1.8 or Data Protection for Microsoft Hyper-V 7.1.8 or 8.1.0-8.1.2, refer to APAR IT26762.

Note 1: The Backup-Archive Client is a prerequisite to using Data Protection for VMware version 7.1. In Data Protection for VMware environments, the Backup-Archive Client is also known as the data mover.
Note 2: The Backup-Archive Client is a prerequisite to using Data Protection for Microsoft Hyper-V versions 7.1 through 8.1.2. In Data Protection for Microsoft Hyper-V environments, the Backup-Archive Client is also known as the data mover.

Customer/L2 Diagnostics
In a data mover client service,VM trace, we can see that the guest 'guest_1' is put into the list of failed VMs, but at that point it is not yet handled.

<timestamp> [PID] [TID_1] : vmOverlappedIO.cpp (2910): OverlappedIOMonitor::KillVM(): error happened on consumer thread, abandoning backup for vm 'guest_1'

The Consumer Thread [TID_2] was working with vm 'guest_2', which has a failure and is therefore also put in the list that should be handled by the OverlappedIOMonitor Thread [TID_1].

<timestamp> [PID] [TID_2] : vmbackvddk.cpp (13360): EXIT <===== vmGetObjInfoDisk(), rc = 0
<timestamp> [PID] [TID_2] : vmbackcommon.cpp (5684): VmVerifyIfSingleDisk(): Found disk: Hard Disk 2
<timestamp> [PID] [TID_2] : vmbackvddk.cpp (16952): VmGetDiskNumFromLabel: disk num '2' for label 'Hard Disk 2'.
<timestamp> [PID] [TID_2] : vmbackcommon.cpp (5698): VmVerifyIfSingleDisk(): Verifying disk backup ctls: checking size on disk vs ctl size coverage: Hard Disk 2.
<timestamp> [PID] [TID_2] : vmbackcommon.cpp (5300): VmVerifyIfDiskBackup(): Num of CTLs = 0; type = IFINCR
.. : No ctl files found!
.. : VM / Disk : guest_2 / Hard Disk 2
.. : capacity : 2147483648
.. : size on disk : 2147483648
.. : ctl coverage size : 0
.. : disk included : Yes
.. : prev backup ifincr: Yes
.. : ctl matches size : No
.. : ctl found : No
.. : bitmap found : No
.. : disk used : Yes
.. : result : FAIL; missing CTLs
<timestamp> [PID] [TID_2] : vmbackcommon.cpp (5496): ANS9921E Virtual machine disk, guest_2 (Hard Disk 2), verification check failed (2147483648/0). If the disk is on an NFS datastore and the disk size was recently changed or the disk was moved from non-NFS datastore to NFS datastore, the verification failure is expected and a full backup is required.
<timestamp> [PID] [TID_2] : vmbackcommon.cpp (5778): VmVerifyIfSingleDisk(): Exiting with rc 6560.
<timestamp> [PID] [TID_2] : vmbackvddk.cpp (8168): ANS9919E Failed to find the expected control files for guest_2.
<timestamp> [PID] [TID_2] : vmbackvcm.cpp ( 285): =========> Entering vcmFlushVolumeControlLibrary()
<timestamp> [PID] [TID_2] : vmbackvcm.cpp ( 206): =========> Entering vcmLogger::trace()
<timestamp> [PID] [TID_2] : vmbackvcm.cpp ( 217): ANS5250E An unexpected error was encountered.
<timestamp> [PID] [TID_2] : vmbackvcm.cpp ( 222): <========= Exiting vcmLogger::trace()
<timestamp> [PID] [TID_2] : vmbackvcm.cpp ( 295): <========= Exiting vcmFlushVolumeControlLibrary()
<timestamp> [PID] [TID_2] : vmbackvddk.cpp (8222): VmSendData(): VmVerifyIfSingleDisk() returned rc=4379
<timestamp> [PID] [TID_2] : vmbackvddk.cpp (9033): VmSendData(): vm guest_2 has 1 totaldisks entries to dispatch.
<timestamp> [PID] [TID_2] : vmbackvddk.cpp (9074): VmSendData(): we had an error, telling the IO Monitor to stop backing up this VM.

In the last line, the client puts the VM into the list of failed VMs. After that, the Consumer Thread [TID_2] destroys the pointer that was used in the message to the OverlappedIOMonitor Thread [TID_1].

<timestamp> [PID] [TID_2] : vmbackvddk.cpp (14921): vmBackupVMCleanup(): free vmBackupDataPP

However, in the meantime, the OverlappedIOMonitor Thread [TID_1] handled the "kill vm" message and used the invalid pointer, which had already been destroyed by the Consumer Thread [TID_2]:

<timestamp> [PID] [TID_1] : vmOverlappedIO.cpp (2910): OverlappedIOMonitor::KillVM(): error happened on consumer thread, abandoning backup for vm 'guest_1'

At the moment the message was handled, the same pointer was still pointing to vm 'guest_1', which causes the processing to halt.

Initial Impact: Medium
Additional Keywords: TS000989064 halt hang backup
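
The sequence described above (one thread posts a message that refers to per-VM state, then frees that state before the other thread handles the message) is a common lifetime hazard. The following is a minimal sketch of one general way to avoid it, in which the message co-owns the state through std::shared_ptr. It is not the product's code and not necessarily how this APAR was fixed; VmBackupData, KillVmMessage, and replay_trace_order are hypothetical names chosen for illustration. To keep the outcome deterministic, the sketch replays the post, cleanup, and handle steps in one thread; a real multithreaded version would also need to synchronize access to the queue.

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for the per-VM backup state (not the product's type).
struct VmBackupData {
    std::string vmName;
};

// Hypothetical "kill vm" message. It co-owns the VM state, so the state
// stays alive until the monitor has finished handling the message.
struct KillVmMessage {
    std::shared_ptr<const VmBackupData> vm;
};

// Replays the order seen in the trace: post message, clean up, then handle.
// Returns the VM name that the monitor step reads from the message.
std::string replay_trace_order() {
    std::queue<KillVmMessage> monitorInbox;

    // Consumer step 1: report the failed VM to the monitor.
    auto vm = std::make_shared<VmBackupData>(VmBackupData{"guest_2"});
    std::weak_ptr<const VmBackupData> observer = vm;
    monitorInbox.push(KillVmMessage{vm});

    // Consumer step 2: cleanup releases only the consumer's own reference.
    vm.reset();
    assert(!observer.expired());  // the queued message still owns the state

    // Monitor step: handle the message after the consumer has cleaned up.
    KillVmMessage msg = std::move(monitorInbox.front());
    monitorInbox.pop();
    std::string handled = msg.vm->vmName;

    // Once the message is dropped, the state is finally destroyed.
    msg.vm.reset();
    assert(observer.expired());
    return handled;
}
```

With a raw pointer in the message, the monitor step would read freed memory. With shared ownership, the consumer's cleanup cannot invalidate a message that is still queued, and the monitor always sees the VM that the message was created for.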
Local fix
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* Data Protection for VMware version 7.1.8 and 8.1 on Linux    *
* x86 and Windows x64 platform                                 *
* Data Protection for Microsoft Hyper-V version 7.1.8 and 8.1  *
* on Windows x64 platform                                      *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* see ERROR DESCRIPTION                                        *
****************************************************************
* RECOMMENDATION:                                              *
* This issue is projected to be fixed in the Data Protection   *
* for VMware version 8.1.6.1 and 8.1.7 on Microsoft Windows    *
* x64 and Linux x86_64 platforms.                              *
* Note 1: This is subject to change at the discretion of IBM.  *
****************************************************************
Problem conclusion
The code has been changed so that the data mover no longer hangs.
Temporary fix
Comments
APAR Information
APAR number
IT26212
Reported component name
TSM FOR VE DP V
Reported component ID
5725TVEVM
Reported release
81L
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2018-09-07
Closed date
2018-11-02
Last modified date
2018-11-02
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
dsmc
Fix information
Fixed component name
TSM FOR VE DP V
Fixed component ID
5725TVEVM
Applicable component levels
R81L PSY
UP
R81W PSY
UP
Document Information
Modified date:
14 September 2021