Direct links to fixes
APAR status
Closed as program error.
Error description
IBM Spectrum Protect server can present a performance degradation due to a mutex lock contention when more than hundreds of sessions write data to file class storage pools residing on NFS file systems. Diagnostics: High "Unknown" time in servermon instrumentation psSessionThreads similar to: Thread 12345678 psSessionThread parent=123 04:37:35.562-->05:01:22.187 Operation Count Tottime Avgtime Mintime Maxtime InstTput Total KB --------------------------------------------------------------- ------------- Network Recv 993626 126.268 0.000 0.000 1.108 41900.7 5290731 Network Send 125 0.001 0.000 0.000 0.000 3813.9 5 SSL Receive 770 0.008 0.000 0.000 0.000 1893.5 16 SSL Send 250 0.007 0.000 0.000 0.000 134.9 1 DB2 Fetch Prep 252 0.055 0.000 0.000 0.004 DB2 MFtch Prep 45 0.016 0.000 0.000 0.002 DB2 Updat Prep 168 0.024 0.000 0.000 0.001 DB2 Fetch Exec 505 0.093 0.000 0.000 0.009 DB2 MFtch Exec 53 0.007 0.000 0.000 0.000 DB2 Inser Exec 267 0.063 0.000 0.000 0.002 DB2 Updat Exec 336 0.040 0.000 0.000 0.001 DB2 Fetch 505 0.003 0.000 0.000 0.000 DB2 MFetch 113414 9.233 0.000 0.000 0.008 DB2 Commit 88 0.291 0.003 0.000 0.198 DB2 Reg Prep 45 0.012 0.000 0.000 0.002 DB2 Reg Exec 60 0.006 0.000 0.000 0.000 DB2 Reg Fetch 50 0.000 0.000 0.000 0.000 Tm Lock Wait 122 234.558 1.923 0.000 24.458 Acquire Latch 68 3.273 0.048 0.000 0.774 Acquire XLatch 42 0.000 0.000 0.000 0.000 Thread Wait 20920 52.065 0.002 0.000 4.726 Unknown 1000.592 --------------------------------------------------------------- ------------- Total 1426.625 3708.6 5290754 If high unknown time is seen above and the stacks from the servermon "show" files show many stack entries in "pkAcquireMutexTracked" which are called by a function starting with "pvr" (the function just below the "pkAcqureMutexTracked") then this APAR may apply, examples: Thread 1291 (Thread 0x7f8d3af3c700 (LWP 6734)): #0 0x00007f94e24b254d in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f94e24ade9b in _L_lock_883 () from /lib64/libpthread.so.0 #2 0x00007f94e24add68 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x0000000001333455 in pkAcquireMutexTracked (mutexP=mutexP@entry=0x2fb6cf0, file=file@entry=0x18393f5 "pvr.c", line=line@entry=10811) at pkmon.c:593 #4 0x0000000000bff56f in pvrGetFSCap (devClass=devClass@entry=1, fsCapP=fsCapP@entry=0x7f8d3af31420) at pvr.c:10811 #5 0x0000000001133a83 in AsGetFilePoolSpace (poolId=poolId@entry=-1, amount=amount@entry=134234456, capacityP=capacityP@entry=0x7f8d3af32518) at asutil.c:1113 #6 0x0000000001184c2a in ssSWAlloc (sessHandle=0x7f903d0253c0, txnId=0x7f903cc02600, poolId=poolId@entry=-1, allocMode=allocMode@entry=ssNewAlloc, amount=134234456, amount@entry=134218008, inclVolList=inclVolList@entry=0x0, inclVolCount=0) at ssalloc.c:1457 #7 0x0000000000855c9f in AfAllocSpace (txnP=txnP@entry=0x7f903c77da10, destPoolId=destPoolId@entry=-1, srvId=srvId@entry=0, ck1=ck1@entry=47, ck2=ck2@entry=4011, allocMode=allocMode@entry=ssNewAlloc, amount=134218008, multiDest=True) at afcreate.c:2344 #8 0x0000000000859bc5 in AfSwAlloc (txnP=txnP@entry=0x7f903c77da10, poolId=-1, srvId=srvId@entry=0, bfId=bfId@entry=1878267322, ck1=ck1@entry=47, ck2=ck2@entry=4011, estSize=134218008, afTxnH=0x7f94c0e4d8a8, aggregated=False, copyFragInfo=0x0, pendingIdP=0x0) at afcreate.c:3133 Thread 1195 (Thread 0x7f8de514f700 (LWP 36039)): #0 0x00007f94e24b254d in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f94e24ade9b in _L_lock_883 () from /lib64/libpthread.so.0 #2 0x00007f94e24add68 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x0000000001333455 in pkAcquireMutexTracked (mutexP=mutexP@entry=0x2fb6cf0, file=file@entry=0x16542b0 "pvrclass.c", line=line@entry=6067) at pkmon.c:593 #4 0x0000000000c0b470 in pvrIsFileDevClass (className=className@entry=0x7f8de5144450 "CBRPFS02") at pvrclass.c:6067 #5 0x00000000011339ba in AsGetFilePoolSpace (poolId=poolId@entry=5, amount=amount@entry=134234454, capacityP=capacityP@entry=0x7f8de5145518) at asutil.c:1069 #6 0x000000000118483e in ssSWAlloc (sessHandle=0x7f9278b3cbf0, txnId=0x7f92787a4480, poolId=poolId@entry=5, allocMode=allocMode@entry=ssNewAlloc, amount=134234454, amount@entry=134218006, inclVolList=inclVolList@entry=0x0, inclVolCount=0) at ssalloc.c:1393 #7 0x0000000000855c9f in AfAllocSpace (txnP=txnP@entry=0x7f92786aeac0, destPoolId=destPoolId@entry=5, srvId=srvId@entry=0, ck1=ck1@entry=46, ck2=ck2@entry=1277, allocMode=allocMode@entry=ssNewAlloc, amount=134218006, multiDest=True) at afcreate.c:2344 #8 0x0000000000859bc5 in AfSwAlloc (txnP=txnP@entry=0x7f92786aeac0, poolId=poolId@entry=5, srvId=srvId@entry=0, bfId=bfId@entry=1878267280, ck1=ck1@entry=46, ck2=ck2@entry=1277, estSize=134218006, afTxnH=0x7f92786aec00, aggregated=False, copyFragInfo=0x0, pendingIdP=0x0) at afcreate.c:3133 #9 0x0000000000806d1f in StoreBitfile (txnP=txnP@entry=0x7f92786aeac0, poolName=poolName@entry=0x7f8de5145c00 "P_DATA_D", poolId=5, srvId=srvId@entry=0, bfId=bfId@entry=1878267280, estSize=134218006, ck1=46, ck2=1277, action=ssStoreEntire, mountWaitMode=bfWaitMount, sourceFunc=0x105fd70 <SmRecvNextData>, contextP=0x7f9278cc7de0, storeInfoP=0x7f9278a9ad20, sessP=<optimized out>) at bfswcrt.c:2560 #10 0x0000000000807f0d in BfSwCreate (sessHandle=sessHandle@entry=0x7f9278c47c80, txnId=0x7f92787a4480, bfId=1878267280, estSize=estSize@entry=134218006, estBitfileSize=estBitfileSize@entry=134218006, ck1=ck1@entry=46, ck2=1277, poolName=0x7f8de5145c00 "P_DATA_D", mountWaitMode=bfWaitMount, sourceFunc=0x105fd70 <SmRecvNextData>, contextP=0x7f9278cc7de0, aggregateState=admAggregate, forceAggregation=False) at bfswcrt.c:729 #11 0x0000000000773422 in bfCreate (sessHandle=sessHandle@entry=0x7f9278c47c80, txnId=txnId@entry=0x7f92787a4480, bfId=<optimized out>, bfId@entry=1878267280, estSize=estSize@entry=134218006, estBitfileSize=estBitfileSize@entry=134218006, ck1=ck1@entry=46, ck2=<optimized out>, poolName=<optimized out>, mountWaitMode=<optimized out>, sourceFunc=<optimized out>, contextP=<optimized out>, aggregateState=<optimized out>, forceAggregation=<optimized out>, isFragmentedP=<optimized out>, rptSizeP=<optimized out>) at bfcreate.c:2544 #12 0x0000000000fb247b in CreateBitfile (sessP=sessP@entry=0x7f9278cc7de0, bfHandle=bfHandle@entry=0x7f9278c47c80, txnId=txnId@entry=0x7f92787a4480, bfId=1878267280, ck1=46, ck2=1277, objInfoP=0x7f92784aefa0 "\f\f\026", objInfoLen=22, poolName=0x7f8de5145c00 "P_DATA_D", estSize=134218006, estBitfileSize=134218006, mountWaitMode=2, objType=1 '\001', isFragmentedP=0x7f8de5145bc8, rptSizeP=0x7f8de5145bf8) at smnode.c:29696 #13 0x0000000000fc861d in SmDoBackInsNormEnhanced (sessP=sessP@entry=0x7f9278cc7de0, bfHandle=0x7f9278c47c80, txnId=0x7f92787a4480, verbP=verbP@entry=0x7f92784aeea0, domainId=1, txnDate=..., updateList=0x7f9278b1d490, objectOpen=0x7f8de5145fac, setDeltaStoredP=0x7f8de5145fbc, objIdP=0x0) at smnode.c:18505 #14 0x0000000000fda9e3 in SmNodeSession (sessP=sessP@entry=0x7f9278cc7de0, logSummaryP=logSummaryP@entry=0x7f8de514be98) at smnode.c:6395 #15 0x0000000000f83055 in HandleNodeSession (sessP=sessP@entry=0x7f9278cc7de0, nodeInfoP=nodeInfoP@entry=0x7f8de514c450, bIsSchedSess=bIsSchedSess@entry=False) at smexec.c:7546 #16 0x0000000000f8a216 in DoNodeGeneral (nodeInfoP=0x7f8de514c450, sessP=0x7f9278cc7de0) at smexec.c:7599 #17 smExecuteSession (infoP=<optimized out>, beginFunc=beginFunc@entry=0x1346910 <tcpBegin>, sendFunc=sendFunc@entry=0x1346250 <tcpSend>, recvFunc=recvFunc@entry=0x13465d0 <tcpRecv>, flushFunc=flushFunc@entry=0x1345be0 <tcpFlush>, abortFunc=abortFunc@entry=0x1346a00 <tcpAbort>, abortForceFunc=<optimized out>, qmethodFunc=<optimized out>, qaddressFunc=<optimized out>, authFunc=<optimized out>, isNetworkMethod=<optimized out>, isLocalConnection=<optimized out>, qIsNodeThreadFunc=<optimized out>, isOobSession=<optimized out>) at smexec.c:4196 #18 0x0000000001347d0c in psSessionThread (argP=<optimized out>) at tcpcomm.c:1753 #19 0x00000000013317b0 in StartThread (startInfoP=<optimized out>) at pkthread.c:4039 #20 0x00007f94e24abea5 in start_thread () from /lib64/libpthread.so.0 #21 0x00007f94dddd096d in clone () from /lib64/libc.so.6 IBM Spectrum Protect versions affected: IBM Spectrum Protect Server 7.1 and 8.1 on all supported platforms Initial Impact: Medium Additional Keywords: TS005421570, performance, NFS
Local fix
Problem summary
**************************************************************** * USERS AFFECTED: * * All IBM Spectrum Protect server users using file device * * class over network shares (NFS/CIFS). * **************************************************************** * PROBLEM DESCRIPTION: * * See error description. * **************************************************************** * RECOMMENDATION: * * Apply fixing level when available. This problem is currently * * projected to be fixed in levels 8.1.12.200 and 8.1.13. Note * * that this is subject to change at the discretion of IBM. * ****************************************************************
Problem conclusion
This problem was fixed. Affected platforms for reported release: AIX, Linux, and Windows. Platforms fixed: AIX, Linux, and Windows.
Temporary fix
Comments
APAR Information
APAR number
IT37138
Reported component name
TSM SERVER
Reported component ID
5698ISMSV
Reported release
81L
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-06-07
Closed date
2021-10-01
Last modified date
2021-10-01
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
TSM SERVER
Fixed component ID
5698ISMSV
Applicable component levels
R81A PSY
UP
[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSGSG7","label":"Tivoli Storage Manager"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"81L","Line of Business":{"code":"LOB26","label":"Storage"}}]
Document Information
Modified date:
27 April 2022