APAR status
Closed as program error.
Error description
A copy to cloud job may appear to hang in IBM Spectrum Protect Plus. A cancel of the copy job by the administrator also has no effect. Only a reboot of the vSnap host will abort the job. The issue was seen for a copy to a repository IBM Spectrum Protect Server.

In the job log, the copy starts:

SUMMARY,<timestamp>,CTGGA2399,Starting job for policy SPPOFFLOAD with job name <SLAName> (ID:<SLAId>). id -> <JobId>. IBM Spectrum Protect Plus version 10.1.8-4082.
...
INFO,<timestamp>,CTGGA1913,Created sessionId <xxxxxx> for <vSnapVolumeName>
...
INFO,<timestamp>,CTGGA3118,Copying snapshot (ID: <SnapshotId>) from source [ server: <vSnapHost> volume: <vSnapVolumeName> snapshot: <SnapshotName>] to target [ server: https://<ObjectAgentHost>:9000 volume: <TargetVolumeName>].

Then no progress is seen; the transfer message is stuck at a certain value and is repeated indefinitely every 5 minutes:

INFO,<timestamp>,CTGGA0365,Snapshot <SnapshotName>(Id: <SnapshotId>) volume <vSnapHost>:<vSnapVolumeName> has transferred <aa> GB (Last status: Transferred <aa> of <bb> GB; <cc>% complete; Average throughput <dd> MB/s)

If the administrator attempts to cancel the job, the job log reports that it received the command, but nothing further happens:

INFO,<timestamp>,CTGGA0360,Aborting replication data transfer

If the administrator then reboots the vSnap host, the job completes reporting an error:

ERROR,<timestamp>,,Unable to monitor status. Errornull
ERROR,<timestamp>,,Failed to replicate <SnapshotId>. Error null e=java.lang.NullPointerException

On the vSnap server, running "ps aux | grep recv" shows one or more 'zfs recv' processes whose status is listed as "D", which means the process is hung waiting for I/O to complete.

When a copy job has multiple transfer sessions, the above may be seen for only some of the transfers while others complete successfully. In that case the copy job ends in status PARTIAL.
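The "ps aux | grep recv" check can be narrowed to show only processes in uninterruptible sleep. The following is a generic sketch using standard procps options, not an IBM-supplied tool; whether any lines are printed depends on the state of the host:

```shell
# List 'zfs recv' processes whose state code begins with "D"
# (uninterruptible sleep, i.e. hung waiting for I/O).
# -e: all processes; -o: choose the pid, state, and command columns.
ps -eo pid,stat,args | awk '$2 ~ /^D/ && /zfs recv/ {print $1, $2}'
```

An empty result means no hung receive processes; any PID printed here corresponds to a transfer that cannot be cancelled without a reboot, as described above.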
This issue occurs when the network connectivity between the vSnap and the cloud/repository has failed, or when the cloud/repository server cannot respond to vSnap requests.

IBM Spectrum Protect Plus Versions Affected: IBM Spectrum Protect Plus 10.1.5 and later
Initial Impact: Medium
Additional Keywords: SPP, SPPLUS, TS005700623, hang, offload, IT32806
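Since the trigger is lost connectivity to the cloud/repository target, a basic reachability check from the vSnap host can help confirm the condition. This is an illustration only, not part of the APAR: <ObjectAgentHost> is the placeholder host name from the job-log target URL, and port 9000 is taken from that same log line.

```shell
# Illustrative sketch: test TCP reachability of the object agent
# endpoint from the vSnap host. Replace <ObjectAgentHost> with the
# host name shown in the job log.
HOST="<ObjectAgentHost>"
# bash's /dev/tcp/<host>/<port> redirection attempts a TCP connect;
# timeout bounds the wait at 10 seconds.
if timeout 10 bash -c 'exec 3<>"/dev/tcp/$0/9000"' "$HOST" 2>/dev/null; then
    echo "port 9000 reachable"
else
    echo "port 9000 unreachable"
fi
```

If the port is unreachable, restoring network connectivity (or the repository server) is a prerequisite before retrying the copy job.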
Local fix
Reboot the vSnap host
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* IBM Spectrum Protect Plus 10.1.8                             *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Apply fixing level when available. This problem is currently *
* projected to be fixed in IBM Spectrum Protect Plus level     *
* 10.1.9. Note that this is subject to change at the           *
* discretion of IBM.                                           *
****************************************************************
Problem conclusion
Implemented code fixes to correctly discard in-flight I/O operations when forcibly exporting a cloud pool. This allows the cloud copy operation to abort gracefully, without hanging or crashing the vSnap server and without leaving behind a stuck cloud pool.
Temporary fix
Comments
APAR Information
APAR number
IT37227
Reported component name
SP PLUS
Reported component ID
5737SPLUS
Reported release
A18
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-06-11
Closed date
2021-11-09
Last modified date
2021-11-09
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
vSnap Offload
Fix information
Fixed component name
SP PLUS
Fixed component ID
5737SPLUS
Applicable component levels
[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSNQFQ","label":"IBM Spectrum Protect Plus"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"A18","Line of Business":{"code":"LOB26","label":"Storage"}}]
Document Information
Modified date:
31 January 2024