Direct links to fixes
8.1.10.300-IBM-SPSRV-WindowsX64
8.1.10.300-IBM-SPSRV-Linuxx86_64
8.1.10.300-IBM-SPSRV-Linuxs390x
8.1.10.300-IBM-SPSRV-Linuxppc64le
8.1.10.300-IBM-SPSRV-AIX
8.1.12.000-IBM-SPSRV-WindowsX64
8.1.12.000-IBM-SPSRV-Linuxx86_64
8.1.12.000-IBM-SPSRV-Linuxs390x
8.1.12.000-IBM-SPSRV-Linuxppc64le
8.1.12.000-IBM-SPSRV-AIX
IBM Spectrum Protect Server V8.1.11.X interim fix downloads
IBM Spectrum Protect Server V8.1 Fix Pack 12 (V8.1.12) Downloads
APAR status
Closed as program error.
Error description
[Problem Description]
A "BACKUP NODE" process may fail with varying ANR9999D errors if running backups to container storage pools. The backup operation then fails.

[Customer/L2 Diagnostics]
Example 1:
02/01/2021 13:35:59 ANR9999D_4237730896 SdAdjustBuf(sdbuf.c:1735) Thread<371>: The number of CQ slots for session 000000BD65823CE0 is being reduced to ZERO.(SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> issued message 9999 from: (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdab864504 OutDiagToCons()+b4 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdab85db72 outDiagfExt()+112 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdab5a31c8 SdAdjustBuf()+4b8 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdab595b9f SdStore()+bff (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdab593e67 sdCreate()+8a7 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdaaeeb0d2 CreateBitfile()+ba2 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdaaedf152 bfCreate()+1332 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdaae84969 bfNASCreate()+b9 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdb856f936 moverAcceptConnection()+206 ndserver.c:1865 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdb8567295 ndmpdSelect()+2a5 ndmpconn.c:1154 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdb856f377 connectionHandler()+227 ndserver.c:695 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdaac1c443 startThread()+153 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdb9eb4f7f beginthreadex()+107 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdb9eb5126 endthreadex()+192 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdcda513f2 BaseThreadInitThunk()+22 (SESSION: 28)
02/01/2021 13:35:59 ANR9999D Thread<371> 7ffdce6f54f4 RtlUserThreadStart()+34 (SESSION: 28)

The problem will only occur when running NDMP backups to container storage pools.
NDMP stream parsing produces a chunk that is too large which causes errors in the circular buffer queue. The problem originates during stream parsing which expects 1K read boundaries from the NAS filer. This assumption is violated and the read becomes mis-aligned.

Example 2:
01/27/21 15:01:52 ANR9999D_1525641611 SdWriteNonDedupDataX(sdcreate.c:3755) Thread<1414>: Unexpected large meta data chunk size: 13046784. (SESSION: 10)
01/27/21 15:01:52 ANR9999D Thread<1414> issued message 9999 from: (SESSION: 10)
01/27/21 15:01:52 ANR9999D Thread<1414> 0x0000000100086a30 StdPutText (SESSION: 10)
01/27/21 15:01:52 ANR9999D Thread<1414> 0x0000000100087364 OutDiagToCons (SESSION: 10)
01/27/21 15:01:52 ANR9999D Thread<1414> 0x00000001000633e4 outDiagfExt (SESSION: 10)
01/27/21 15:01:52 ANR9999D Thread<1414> 0x0000000100e2d0c0 SdWriteNonDedupDataX (SESSION: 10)
01/27/21 15:01:52 ANR9999D Thread<1414> 0x0000000100e34f48 SdWriteDedupData (SESSION: 10)
01/27/21 15:01:52 ANR9999D Thread<1414> 0x0000000101f67720 SdCQSinkThread (SESSION: 10)
01/27/21 15:01:52 ANR9999D Thread<1414> 0x000000010009654c StartThread (SESSION: 10)

Similar to the last example, if a large chunk is produced in the non-dedup chunk path, then there's an error indicating that the metadata chunk is too large to store.

In both cases, running with SPI SPID BF RABIN SD trace is helpful for diagnosis as it will show the failing read iteration where the server reads data from the filer and creates an unexpectedly large chunk that gets sent down to the container layer (SD).
The trace will look similar to below:
10:36:54.625 [368][bfdedup.c][14021][NdmpObjectSinkFunc]:dataAmount: 0, current: 0, bufLeft: 348, amountToCopy: 348
10:36:56.153 [368][bfdedup.c][14021][NdmpObjectSinkFunc]:dataAmount: 0, current: 348, bufLeft: 8388608, amountToCopy: 8388608
10:36:57.804 [368][bfdedup.c][14021][NdmpObjectSinkFunc]:dataAmount: 0, current: 8388956, bufLeft: 8388608, amountToCopy: 8388608
10:38:14.743 [368][sdbuf.c][1691][SdAdjustBuf]:Number 1 segment: length 8388260, bytesRecv 8388260, residual 25165824
10:38:14.743 [368][sdbuf.c][1691][SdAdjustBuf]:Number 0 segment: length 16777564, bytesRecv 16777564, residual 16777564
10:38:14.743 [368][sdbuf.c][1728][SdAdjustBuf]:Slot 000000BD6AEC76F0 is too small to hold one complete data chunk. Merging it into the next slot

Note that "amountToCopy" in the first three lines adds up to the large chunk in one of the buffer slots. The trace then indicates that the buffer will try to compensate by merging into the next slot which fails and causes the ANR9999D.

[IBM Spectrum Protect Versions Affected]
IBM Spectrum Protect Server 8.1.10.000 and higher on all supported platforms.

[Initial Impact]
High

[Additional Keywords]
TSM NAS NDMP backup ANR9999D "BACKUP NODE" "Spectrum Protect" container
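The note in the trace discussion above, that the "amountToCopy" values add up to the oversized chunk in one buffer slot, can be checked mechanically. The following is only a minimal sketch in Python and not an IBM-supplied tool. The function name summarize_trace and the regular expressions are hypothetical and are derived solely from the trace lines quoted in this APAR. The sketch also assumes that "1K" means 1024 bytes and that "amountToCopy" reflects the size of each read from the filer.

```python
import re

# Trace lines copied from the error description of this APAR.
TRACE = """\
10:36:54.625 [368][bfdedup.c][14021][NdmpObjectSinkFunc]:dataAmount: 0, current: 0, bufLeft: 348, amountToCopy: 348
10:36:56.153 [368][bfdedup.c][14021][NdmpObjectSinkFunc]:dataAmount: 0, current: 348, bufLeft: 8388608, amountToCopy: 8388608
10:36:57.804 [368][bfdedup.c][14021][NdmpObjectSinkFunc]:dataAmount: 0, current: 8388956, bufLeft: 8388608, amountToCopy: 8388608
10:38:14.743 [368][sdbuf.c][1691][SdAdjustBuf]:Number 1 segment: length 8388260, bytesRecv 8388260, residual 25165824
10:38:14.743 [368][sdbuf.c][1691][SdAdjustBuf]:Number 0 segment: length 16777564, bytesRecv 16777564, residual 16777564
"""

# Hypothetical patterns, written to match only the line layout shown above.
COPY_RE = re.compile(r"\[NdmpObjectSinkFunc\]:.*amountToCopy: (\d+)")
SEGMENT_RE = re.compile(r"\[SdAdjustBuf\]:Number (\d+) segment: length (\d+)")

ASSUMED_BOUNDARY = 1024  # assumption: "1K read boundaries" means 1024 bytes


def summarize_trace(text):
    """Sum the amountToCopy values and compare the total with segment lengths."""
    copies = [int(m.group(1)) for m in COPY_RE.finditer(text)]
    segments = {int(m.group(1)): int(m.group(2)) for m in SEGMENT_RE.finditer(text)}
    total = sum(copies)
    return {
        "copies": copies,
        "total_copied": total,
        "segments": segments,
        # Segment numbers whose length equals the summed copies.
        "matching_segments": [n for n, length in segments.items() if length == total],
        # Copies that do not fall on the assumed 1 KiB boundary.
        "misaligned_copies": [c for c in copies if c % ASSUMED_BOUNDARY != 0],
    }


if __name__ == "__main__":
    for key, value in summarize_trace(TRACE).items():
        print(f"{key}: {value}")
```

For the trace quoted in this APAR, the three copies (348, 8388608, and 8388608 bytes) total 16777564 bytes, which equals the length reported for segment 0. Under the stated assumption, the 348-byte copy is the only one that is not a multiple of 1024.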
Local fix
Redirect NDMP backups temporarily to sequential device class storage pools.
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* All IBM Spectrum Protect server users.                       *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See error description.                                       *
****************************************************************
* RECOMMENDATION:                                              *
* Apply fixing level when available. This problem is currently *
* projected to be fixed in levels 8.1.10.300, 8.1.11.100, and  *
* 8.1.12. Note that this is subject to change at the           *
* discretion of IBM.                                           *
****************************************************************
Problem conclusion
This problem was fixed. Affected platforms for reported release: AIX, Linux, and Windows. Platforms fixed: AIX, Linux, and Windows.
Temporary fix
Comments
APAR Information
APAR number
IT35893
Reported component name
TSM SERVER
Reported component ID
5698ISMSV
Reported release
81A
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-02-12
Closed date
2021-03-04
Last modified date
2021-03-04
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
TSM SERVER
Fixed component ID
5698ISMSV
Applicable component levels
R81A PSY
UP
R81L PSY
UP
R81W PSY
UP
Document Information
Modified date:
18 November 2021