Question & Answer
Question
How does the VE data mover work with VMware's Changed Block Tracking (CBT)?
Answer
Overview
The Data Protection for VMware data mover uses the VMware vStorage APIs for Data Protection (VADP) to back up and restore virtual machines in a vCenter environment. VADP has two major components: the vSphere Management SDK and the Virtual Disk Development Kit (VDDK).
For more information about VADP, see the VMware FAQ at http://kb.vmware.com/kb/1021175.
Changed Block Tracking (CBT) is a VMware feature that helps the data mover perform incremental backups of virtual machines. The vSphere Management SDK returns the CBT extent data (the changed blocks) to the data mover. CBT identifies the initially allocated blocks and tracks the blocks that have changed since the last backup.
Limitations for CBT:
- The host must be ESX/ESXi 4.0 or later.
- The virtual machine must be hardware version 7 or later.
- I/O operations must go through the ESX/ESXi storage stack. All VMFS datastores are supported, whether backed by SAN, iSCSI, or local disk. NFS datastores are also supported, except that CBT cannot report the initially allocated blocks on NFS. Virtual RDMs are supported.
- Physical RDMs and disks that are accessed directly from the guest OS (iSCSI or NFS) are not supported. Also, CBT is not supported if the VMDK is attached to a shared virtual SCSI bus.
- The virtual machine's VMDK must not be an independent disk, that is, a disk that is unaffected by snapshots.
- The virtual machine cannot be a template virtual machine.
- The VMDKs must be on a VMFS volume backed by SAN, iSCSI, vSAN, or local disks.
- The virtual machine must not have pre-existing snapshots.
Table 1: Changed Block Tracking Flow
Backup type | New change ID | Old change ID | ID for CBT query | Result |
full | changeID 0 | none | * | All used blocks |
incremental | changeID 1 | changeID 0 | changeID 0 | All blocks since changeID 0 |
incremental | changeID 2 | changeID 1 | changeID 1 | All blocks since changeID 1 |
…. | …. | …. | …. | …. |
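The flow in Table 1 can be modeled in a few lines of Python. This is only an illustrative sketch of the bookkeeping, not the data mover's implementation: the `BackupStep` class and `plan_backups` function are hypothetical names, and no vSphere API is called. It shows that the first backup queries with `*` and that each later backup queries with the change ID saved by the previous one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackupStep:
    backup_type: str               # "full" or "incremental"
    new_change_id: str             # change ID recorded for this backup
    old_change_id: Optional[str]   # change ID saved by the previous backup
    query_change_id: str           # value used for the CBT query


def plan_backups(count: int) -> List[BackupStep]:
    """Model the change ID chain from Table 1 (hypothetical helper)."""
    steps: List[BackupStep] = []
    previous: Optional[str] = None
    for n in range(count):
        new_id = f"changeID {n}"
        if previous is None:
            # First backup: no saved ID, so "*" requests all used blocks.
            steps.append(BackupStep("full", new_id, None, "*"))
        else:
            # Later backups: query with the ID saved by the previous backup.
            steps.append(BackupStep("incremental", new_id, previous, previous))
        previous = new_id
    return steps


for step in plan_backups(3):
    print(step)
```

If a saved change ID becomes invalid, the chain in this model would simply restart from a full backup with `*`.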
VMware provides the following KB article that describes best practices to follow when using advanced transports and CBT: http://kb.vmware.com/kb/1035096. The following VMware KB article provides some help in dealing with CBT issues and possible failures: http://kb.vmware.com/kb/1020128.
The data mover automatically enables CBT when the requirements listed above are met. You can confirm that CBT is enabled in one of two ways. The first is to issue the “show vm all” command from the data mover. The command prints a detailed list of all the virtual machines in the inventory, including the CBT attribute “changeTracking”, which has a value of “On” or “Off”.
Example:
44.vmName: xp-32(0)
hostAddress: na-3912a122d1e6
tsmNodeName: na-3912a122d1e6
displayName: xp-32(0)
ipAddress: 192.168.0.160
datacenter: DC Lab
hostSystem: oneshot.home.lan
guestFolder:
guestFullName: Microsoft Windows XP Professional (32-bit)
altGuestName:
guestId: winXPProGuest
uuid: 422b19f7-0322-8c04-727f-68897688087a, moref: vm-1406
instance uuid: 502ba39f-df31-022e-e006-96d3c918e2bb
guestState: notRunning connectionState: disconnected
changeTracking: On vmHWversion: vmx-08
toolsRunningStatus: guestToolsNotRunning
toolsVersion: 9349 toolsVersionStatus: guestToolsSupportedNew
consolidationNeeded: No
vmFaultTolerant: No
domainKeyword:
domainSelected: No
cluster: Clouds.SRQ.VM
vApp: Clouds vApp
resourcePool:
VMDK[1]Label: 'Hard disk 1' (Hard Disk 1)
VMDK[1]Name: '[datastore1-4] xp-32(0)/xp-32(0).vmdk'
VMDK[1]Status: Included
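When the inventory is large, it can be easier to scan saved “show vm all” output with a script. The following Python sketch assumes the output has been captured to text and that it follows the layout of the example above; the `cbt_status` function and the shortened `SAMPLE` text are illustrative, not part of the product.

```python
import re

# A shortened copy of the example output above.
SAMPLE = """\
44.vmName: xp-32(0)
guestState: notRunning connectionState: disconnected
changeTracking: On vmHWversion: vmx-08
consolidationNeeded: No
"""


def cbt_status(show_vm_text):
    """Map each vmName to True (changeTracking On) or False (Off)."""
    status = {}
    current = None
    for line in show_vm_text.splitlines():
        name = re.search(r"vmName:\s*(.+?)\s*$", line)
        if name:
            # A vmName line starts the entry for a new virtual machine.
            current = name.group(1)
            continue
        flag = re.search(r"changeTracking:\s*(On|Off)\b", line)
        if flag and current is not None:
            status[current] = flag.group(1) == "On"
    return status


print(cbt_status(SAMPLE))
```

Virtual machines that map to False in the result are the ones to check against the CBT requirements.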
The second method uses the vSphere Client and is described in the following IBM Technote: http://www.ibm.com/support/docview.wss?uid=swg21516726.
Finally, a powered-on virtual machine must go through a stun-unstun cycle (power on, resume after suspend, migrate, or snapshot create/delete/revert) before CBT is actually enabled or disabled. The data mover uses snapshot create and delete operations to trigger this stun-unstun cycle.
Common CBT issues
One issue that is often reported is that the first full backup returns the entire VMDK. As discussed above, if the datastore is NFS backed, CBT reports that the entire VMDK is allocated. This is a limitation of NFS datastores: the allocated blocks cannot be obtained from the NFS hardware. More information can be found in the VMware KB: http://kb.vmware.com/kb/2077787.
A second issue is that an incremental CBT request fails because the CBT change ID is invalid, and the data mover is forced to take a new full backup. This issue can occur if CBT has been reset because of a power failure, hard shutdown, cold migration, or Storage vMotion. For more information, see the following VMware KBs: http://kb.vmware.com/kb/1020128 and http://kb.vmware.com/kb/2048201. VMware reports that the Storage vMotion issue is fixed in ESXi 5.5 Update 2. See http://kb.vmware.com/kb/2048201.
Another potential issue is that CBT overstates changes, so the data mover backs up too much data. There are several possible causes. The first, and often overlooked, explanation is that some in-guest applications, such as anti-virus software, run daily and modify the guest hard disk. There are also known defects in some VMware ESXi versions and in the VDDK. For more information, see the following IBM Technotes: http://www.ibm.com/support/docview.wss?uid=swg21635006 and http://www.ibm.com/support/docview.wss?uid=swg21628701.
Extending a virtual disk across a 128 GB boundary while CBT is already enabled can also cause CBT to return an incorrect size. VMware reports that resetting CBT corrects this problem. See VMware KB: http://kb.vmware.com/kb/2090639.
For example:
- disk grows from 20GB to 100GB : no impact
- disk grows from 20GB to > 128GB : impacted
- disk grows from 140GB to 200GB : no impact
- disk grows from 140GB to > 256GB : impacted
- disk grows from 400GB to 500GB : no impact
- disk grows from 400GB to > 512GB : impacted
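The examples above can be expressed as a small check. This Python sketch rests on an assumption: it treats every integer multiple of 128 GB as a boundary, whereas the examples only show 128, 256, and 512 GB, so consult the VMware KB for the exact rule. The function name `crosses_128gb_boundary` is hypothetical. Growth from a size that sits exactly on a boundary is not covered by the examples; this sketch counts it as a crossing.

```python
import math

BOUNDARY_GB = 128  # assumption: every multiple of 128 GB is a boundary


def crosses_128gb_boundary(old_gb, new_gb):
    """Return True if growing a disk from old_gb to new_gb exceeds a boundary."""
    if old_gb <= 0:
        raise ValueError("old_gb must be positive")
    if new_gb <= old_gb:
        return False  # the disk was not extended
    # Smallest multiple of 128 GB that is at or above the old size.
    next_boundary = math.ceil(old_gb / BOUNDARY_GB) * BOUNDARY_GB
    # Impacted only when the new size is strictly greater than that boundary.
    return new_gb > next_boundary


for old, new in [(20, 100), (20, 129), (140, 200), (140, 257), (400, 500), (400, 513)]:
    print(old, new, crosses_128gb_boundary(old, new))
```

A check like this could be run before extending a disk to decide whether a CBT reset should be scheduled afterward.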
Lastly, we have seen errors from the CBT function after improper ESXi reboots or shutdowns. In this case the saved CBT change ID has become invalid and CBT must be reset. An error message similar to the following example is written to dsmerror.log. The data mover can perform the reset automatically, but the process requires CBT to be disabled and then enabled again, which puts the virtual machine through two snapshot stun-unstun cycles.
Example:
ANS9365E VMware vStorage API error.
TSM function name : QueryChangedDiskAreas
TSM file : vmvisdk.cpp (3592)
API return code : 12
API error message : SOAP 1.1 fault: "":ServerFaultCode[no subcode]"A specified parameter was not correct."
ANS9365E VMware vStorage API error.
TSM function name : visdkPrintSOAPError
TSM file : vmvisdk.cpp (885)
API return code : 12
API error message : SOAP 1.1 fault: "":ServerFaultCode[no subcode]"Error caused by file /vmfs/volumes/4ade85fd-81f49624-57f5-000e0cdd0d21/winxp-32/winxp-32.vmdk"
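To spot this condition across many log files, the error signature can be searched for with a script. The following Python sketch assumes the log entries follow the layout of the example above, with ANS9365E followed within a few lines by the QueryChangedDiskAreas function name; the `find_cbt_query_failures` function and the four-line window are illustrative choices, not documented behavior.

```python
SAMPLE_LOG = '''\
ANS9365E VMware vStorage API error.
TSM function name : QueryChangedDiskAreas
TSM file : vmvisdk.cpp (3592)
API return code : 12
API error message : SOAP 1.1 fault: "":ServerFaultCode[no subcode]"A specified parameter was not correct."
ANS9365E VMware vStorage API error.
TSM function name : visdkPrintSOAPError
TSM file : vmvisdk.cpp (885)
API return code : 12
'''


def find_cbt_query_failures(log_text):
    """Return the API error messages of ANS9365E entries from QueryChangedDiskAreas."""
    lines = log_text.splitlines()
    messages = []
    for i, line in enumerate(lines):
        if "ANS9365E" not in line:
            continue
        # Look at the four detail lines that follow the ANS9365E header.
        detail = lines[i + 1:i + 5]
        if not any("QueryChangedDiskAreas" in d for d in detail):
            continue
        message = ""
        for d in detail:
            if d.strip().startswith("API error message"):
                message = d.split(":", 1)[1].strip()
        messages.append(message)
    return messages


for msg in find_cbt_query_failures(SAMPLE_LOG):
    print(msg)
```

A non-empty result suggests that the failing CBT query is worth investigating, and a CBT reset may be needed.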
Forcing a CBT reset
When a CBT change ID is invalid, a CBT reset is necessary. To force the reset, run a single TSM backup with the testflag vmbackup_cbt_reset.
Example:
dsmc backup vm 'myvm' -testflag=vmbackup_cbt_reset
Diagnosing problems
To collect a data mover trace, add the following lines to the dsm.opt file:
traceflag vm
tracefile vmbackup.trc
In addition, in the dsmvddk.opt file, change the following values to “6” to enable trivia-level tracing in the VDDK API:
# 0-quiet, 1-panic, 2-error, 3-warning, 4-info, 5-verbose, 6-trivia
vixDiskLib.transport.LogLevel = "6"
vixDiskLib.nfc.LogLevel = "6"
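If the change has to be made on several data movers, the edit can be scripted. The following Python sketch rewrites the two log level settings in the text of a dsmvddk.opt file. The `set_vddk_log_level` function is a hypothetical helper, and the starting values in `SAMPLE_OPT` are assumed for illustration; the sketch only works on text and leaves reading and writing the actual file to the caller.

```python
import re

# Assumed starting content; real default values may differ.
SAMPLE_OPT = '''\
# 0-quiet, 1-panic, 2-error, 3-warning, 4-info, 5-verbose, 6-trivia
vixDiskLib.transport.LogLevel = "2"
vixDiskLib.nfc.LogLevel = "1"
'''

# Matches the transport and nfc LogLevel lines and captures everything
# up to, but not including, the quoted level.
_LOG_LEVEL = re.compile(
    r'^(\s*vixDiskLib\.(?:transport|nfc)\.LogLevel\s*=\s*)"\d+"',
    re.MULTILINE,
)


def set_vddk_log_level(opt_text, level=6):
    """Return opt_text with both VDDK log levels set to the given level."""
    if not 0 <= level <= 6:
        raise ValueError("level must be between 0 and 6")
    return _LOG_LEVEL.sub(lambda m: f'{m.group(1)}"{level}"', opt_text)


print(set_vddk_log_level(SAMPLE_OPT))
```

After the trace is collected, the same function can be called with the previous level to restore the original settings.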
Document Information
Modified date: 17 June 2018
UID: swg21681916