A fix is available
APAR status
Closed as program error.
Error description
There is a small timing window in which the SHOW DBCONN command may result in a hang condition on the Tivoli Storage Manager Server.

Platforms affected: All Tivoli Storage Manager 6.3.x Servers

Customer/L2 Diagnostics:
The output from the pstack/procstack command issued against the dsmserv process id will show a call stack similar to the following for the thread associated with the SHOW DBCONN command:

  pth_cond._cond_wait_global(??, ??, ??) at 0x9000000006fd4d0
  pth_cond._cond_wait(??, ??, ??) at 0x9000000006fe058
  pth_cond.pthread_cond_wait(??, ??) at 0x9000000006fed48
  pkmon.pkWaitConditionTracked(??, ??, ??, ??, ??) at 0x100007870
  output.OutQueueData(??, ??, ??) at 0x10001f7c0
  output.StdPutText(??, ??, ??, ??) at 0x10001c994
  outvarg.outPrintf(0x710112023b220000, 0x100e95b70, 0x167d336b0, 0x167d336e0, 0x167d33700, 0x101402a70, 0x66db, 0x167d3277c) at 0x10000a6b4
  tmshow.ShowTxnDesc(??, ??, ??) at 0x1000c778c
  tmshow.tmShowTxnId(??, ??, ??) at 0x1000c7d08
  dbiconn.dbiShowConnPool@AF26_7(??) at 0x1000b77d8
  admshow.AdmShow(??) at 0x100cbc50c
  admcmd.AdmCommandLocal(??, ??, ??, ??, ??) at 0x1005a0348
  admcmd.admCommand(??, ??, ??, ??, ??) at 0x10059ecb8
  smadmin.SmAdminCommandThread(??) at 0x100a46530
  pkthread.StartThread(??) at 0x10000c0dc

The dbiShowConnPool() function is holding the DBV mutex. The amount of data generated by the SHOW DBCONN command exceeded the size of the output buffer and, thus, the thread had to signal the admin thread to send data back to the client. The call stack associated with the admin thread shows the following:

  pth_spinlock._global_lock_common(??, ??, ??) at 0x9000000006db7b4
  pth_mutex._mutex_lock(??, ??, ??) at 0x9000000006e8bd8
  pkmon.pkAcquireMutexTracked(??, ??, ??) at 0x100006648
  smtrans.SmSendVerbX(??, ??) at 0x100333158
  smadmin.SmAdminOutput(??) at 0x100a4687c
  smadmin.SmAdminSession(??) at 0x100a45124
  smexec.DoAdminGeneral(??) at 0x1001aa168
  smexec.smExecuteSession(??, ??, ??, ??, ??, ??, ??, ??) at 0x1001a3774
  tcpcomm.psSessionThread(??) at 0x100063858
  pkthread.StartThread(??) at 0x10000c0dc

The admin thread is attempting to acquire the SMV mutex, but the mutex is currently held by another thread (restore/retrieve operation), as seen below:

  pth_spinlock._global_lock_common(??, ??, ??) at 0x9000000006db7b4
  pth_mutex._mutex_lock(??, ??, ??) at 0x9000000006e8bd8
  pkmon.pkAcquireMutexTracked(??, ??, ??) at 0x100006648
  dbiinit.dbIsRestoreInFlight@AF28_20(??, ??) at 0x1002335ac
  smtrans.SmSendData(??, ??, ??, ??) at 0x10033931c
  sstrans.RtrvFramed(??, ??, ??, ??, ??, ??, ??, ??) at 0x100659dd0
  sstrans.ssRtrv(??, ??, ??, ??, ??, ??, ??, ??) at 0x1006589d0
  afrtrv.AfRtrv(??, ??, ??, ??, ??, ??, ??, ??) at 0x1008f5bdc
  bfrtrv.RtrvOne(??, ??, ??, ??, ??, ??, ??, ??) at 0x100864e84
  bfrtrv.bfRtrvExt(??, ??, ??, ??, ??, ??, ??, ??) at 0x100860c5c
  smnqr.SmRetrieveBitfile(??, ??, ??, ??, ??, ??, ??) at 0x10094c8c0
  smnode.SmDoObjRtrv(??, ??, ??, ??, ??, ??) at 0x100794834
  smnode.SmNodeSession(??, ??) at 0x10077ee54
  smexec.HandleNodeSession(??, ??, ??) at 0x1001b0db8
  smexec.DoNodeGeneral(??, ??) at 0x1001a8730
  smexec.smExecuteSession(??, ??, ??, ??, ??, ??, ??, ??) at 0x1001a374c
  tcpcomm.psSessionThread(??) at 0x100063858
  pkthread.StartThread(??) at 0x10000c0dc

The restore/retrieve thread is holding the SMV mutex while attempting to acquire the DBV mutex held by the SHOW DBCONN command. This results in the deadlock condition that ultimately causes the Tivoli Storage Manager Server to hang.

Initial Impact: Medium

Additional Keywords: wait lock unresponsive txnt dbtxnt deadlocks
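The three-way wait in the error description can be pictured as a cycle in a wait-for graph. The following is a minimal illustrative sketch in Python, not Tivoli Storage Manager code: the thread and mutex labels come from the description, while the dictionaries and the helper names blocker and find_cycle are hypothetical and exist only for this example.

```python
# Hypothetical model of the hang described in the error description.
# Thread and mutex names are labels from the APAR text, not server APIs.

# Which thread currently holds each mutex.
holders = {
    "DBV mutex": "SHOW DBCONN thread",
    "SMV mutex": "restore/retrieve thread",
}

# What each thread is blocked on: either a mutex or another thread.
waits_on = {
    # Output buffer is full, so it waits for the admin thread to send data.
    "SHOW DBCONN thread": "admin thread",
    "admin thread": "SMV mutex",
    "restore/retrieve thread": "DBV mutex",
}


def blocker(thread):
    """Return the thread that `thread` is waiting for, or None if not blocked."""
    target = waits_on.get(thread)
    # A mutex resolves to its holder; a thread name resolves to itself.
    return holders.get(target, target)


def find_cycle(start):
    """Follow the wait-for chain from `start`; return the cycle if one exists."""
    path = []
    current = start
    while current is not None and current not in path:
        path.append(current)
        current = blocker(current)
    if current is None:
        return None
    # Close the loop by repeating the thread where the cycle begins.
    return path[path.index(current):] + [current]


cycle = find_cycle("SHOW DBCONN thread")
print(" -> ".join(cycle) if cycle else "no deadlock")
```

In this model every thread in the chain waits on the next one, and the last waits on the first, which is why none of them can make progress. Removing any single edge (for example, not holding the DBV mutex while waiting for output to drain) would break the cycle in the model; whether the actual fix takes that approach is not stated in this APAR.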
Local fix
Do not issue the SHOW DBCONN command.
Problem summary
****************************************************************
* USERS AFFECTED: All Tivoli Storage Manager server users.     *
****************************************************************
* PROBLEM DESCRIPTION: See error description.                  *
****************************************************************
* RECOMMENDATION: Apply fixing level when available. This      *
*                 problem is currently projected to be fixed   *
*                 in levels 6.3.3.100 and 6.3.4. Note that     *
*                 this is subject to change at the             *
*                 discretion of IBM.                           *
****************************************************************
Problem conclusion
This problem was fixed. Affected platforms: AIX, HP-UX, Solaris, Linux, and Windows.
Temporary fix
Comments
APAR Information
APAR number
IC89767
Reported component name
TSM SERVER
Reported component ID
5698ISMSV
Reported release
63A
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2013-01-23
Closed date
2013-05-23
Last modified date
2013-05-23
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
TSM SERVER
Fixed component ID
5698ISMSV
Applicable component levels
R63A PSY UP
R63H PSY UP
R63L PSY UP
R63S PSY UP
R63W PSY UP
Document Information
Modified date:
23 May 2013