APAR status
Closed as program error.
Error description
During an incremental copy to a cloud or repository server, the existing cloud pool is imported so that incremental changes can be written to it. Due to slow reads from the cloud or repository server, the import process can time out or fail. The job log shows one of the following error messages:

CTGGA0309,Copy failed for snapshot <snapshot details>. Error: Exception: Failed to create gateway device: Could not find device path for serial <serial>

CTGGA0309,Copy failed for snapshot <snapshot details>. Error: Exception: Failed to create/import offload pool: Command timed out: <zpool or zfs command>

Instead of the errors above, an alternative symptom may be seen, as described below.

Due to slow reads from the cloud or repository server, the import process can take a long time. During this step, the import process holds a global filesystem lock. Backup or copy operations running concurrently may issue a "zpool list" command to retrieve the list of pools. The list command has to wait for the import process to release the lock. If the import is slow, the list command can time out and cause the backup or copy job to fail. The job log shows one of the following error messages:

PoolInfoError: Failed to collect pool details for <pool ID>

Fail to get the volume with volume id <ID>, Unable to get storage volume

Further investigation of the vSnap logs shows a timeout of the "zpool list" command, and the stack of the timed-out process shows:

[<ffffffffc09cbe21>] spa_open_common+0x61/0x5d0 [zfs]
[<ffffffffc09cc40d>] spa_get_stats+0x4d/0x330 [zfs]
[<ffffffffc0a254a9>] zfs_ioc_pool_stats+0x39/0x90 [zfs]
[<ffffffffc0a2ee9d>] zfsdev_ioctl+0x65d/0x6c0 [zfs]
[<ffffffff8ce5fbc0>] do_vfs_ioctl+0x3a0/0x5a0
[<ffffffff8ce5fe61>] SyS_ioctl+0xa1/0xc0
[<ffffffff8d38dede>] system_call_fastpath+0x25/0x2a
[<ffffffffffffffff>] 0xffffffffffffffff

Versions Affected: 10.1.*
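When triaging a job log, it can help to sort matching lines into the two symptoms described above. The following is a minimal sketch, assuming the job log is available as plain text lines. The symptom names, the function names, and the regular expressions are illustrative assumptions built from the message text quoted in this APAR; they are not part of the product.

```python
import re

# Patterns built from the messages quoted in this APAR. The grouping into
# two symptom names is an assumption made for illustration only.
SYMPTOMS = {
    "import_failure": [
        re.compile(r"CTGGA0309.*Failed to create gateway device: "
                   r"Could not find device path for serial"),
        re.compile(r"CTGGA0309.*Failed to create/import offload pool: "
                   r"Command timed out"),
    ],
    "lock_contention": [
        re.compile(r"PoolInfoError: Failed to collect pool details for"),
        re.compile(r"Fail to get the volume with volume id .*"
                   r"Unable to get storage volume"),
    ],
}


def classify_line(line):
    """Return the symptom name matching this log line, or None."""
    for symptom, patterns in SYMPTOMS.items():
        if any(p.search(line) for p in patterns):
            return symptom
    return None


def scan_log(lines):
    """Count how many lines match each symptom."""
    counts = {name: 0 for name in SYMPTOMS}
    for line in lines:
        symptom = classify_line(line)
        if symptom is not None:
            counts[symptom] += 1
    return counts
```

A line that matches neither group returns None, so unrelated job failures are not counted against this APAR.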
Local fix
N/A
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* IBM Spectrum Protect Plus level 10.1.5 and 10.1.6.           *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See ERROR DESCRIPTION                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Apply the fixing level when available. This problem is       *
* currently projected to be fixed in IBM Spectrum Protect      *
* level 10.1.7. Note that this is subject to change at the     *
* discretion of IBM.                                           *
****************************************************************
Problem conclusion
IBM Spectrum Protect Plus uses an incremental-forever approach to store cloud copies. During an incremental copy operation, the previous copy of the cloud pool is mounted and the changed data is written to it. Mounting the cloud pool requires reading back some of the data and metadata previously written to the cloud.

The root cause of the problem described in this APAR is that reads from the cloud disk were slow. When a cloud pool was mounted for an incremental copy, the slow reads caused the pool import to take much longer. In some cases, an internal filesystem lock held during the import blocked other operations, which caused backup or copy jobs running at the same time to fail.

Read performance from the cloud has been improved by reading larger chunks and by better caching.
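The lock contention described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the vSnap or ZFS implementation: a Python threading.Lock stands in for the internal filesystem lock, a sleep stands in for the slow cloud reads, and the function names import_pool, list_pools, and run are hypothetical.

```python
import threading
import time

# Stand-in for the internal filesystem lock held during the pool import.
global_lock = threading.Lock()


def import_pool(read_seconds, started):
    """Hold the lock for as long as the simulated cloud reads take."""
    with global_lock:
        started.set()             # signal that the lock is now held
        time.sleep(read_seconds)  # simulated slow reads from the cloud


def list_pools(timeout_seconds):
    """Simulated list command: wait for the lock, give up on timeout."""
    if not global_lock.acquire(timeout=timeout_seconds):
        return "timed out"
    try:
        return "ok"
    finally:
        global_lock.release()


def run(read_seconds, timeout_seconds):
    """Start an import, then issue a list command while it is running."""
    started = threading.Event()
    worker = threading.Thread(target=import_pool,
                              args=(read_seconds, started))
    worker.start()
    started.wait()                # make sure the import holds the lock
    result = list_pools(timeout_seconds)
    worker.join()
    return result
```

With these stand-ins, run(0.3, 0.05) returns "timed out" because the simulated import outlasts the list timeout, while run(0.02, 1.0) returns "ok". Shortening the time the import holds the lock, which is what faster reads achieve, is what lets the concurrent list command succeed.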
Temporary fix
Comments
APAR Information
APAR number
IT33586
Reported component name
SP PLUS
Reported component ID
5737SPLUS
Reported release
A15
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-07-17
Closed date
2020-11-18
Last modified date
2020-11-18
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SP PLUS
Fixed component ID
5737SPLUS
Applicable component levels
Document Information
Modified date:
31 January 2024