IBM Support

IC94985: DATABASE REORGANIZATION THREADS CAN CAUSE SERVER HANG

APAR status

  • Closed as program error.

Error description

  • In some cases, the RdbMonitorStatsThread and OnDemandThread can
    deadlock. This can occur when the monitor thread
    (RdbMonitorStatsThread) finds a paused table reorganization and
    tries to restart it while, at the same time, the on-demand thread
    (OnDemandThread) finds that the same table needs reorganization.
    The on-demand thread holds a lock on that table, and the monitor
    thread cannot obtain a lock escalation.
    
    
    This will cause other server processes to hang.
    
    
    To see if this deadlock between the reorg threads is occurring,
    check the call stacks for threads RdbMonitorStatsThread and
    OnDemandThread. If the call stacks are the same or equivalent
    to the example below, then this APAR applies:
    
    
    
    Thread 248, Parent 1: RdbMonitorStatsThread, Storage 1642726, AllocCnt 4139692 HighWaterAmt 1739599
     tid=38f8, ptid=1, det=1, zomb=0, join=0, result=0, sess=0
      Stack trace:
        0x09000000002597a0 semop
        0x090000000761158c sqloSSemP
        0x0900000007610f64 .sqlccrecv.fdpr.clone.739
        0xffffffff89000017 *UNKNOWN*
        0x090000000761082c sqljcReceive__FP10sqljCmnMgr
        0x09000000075e17f0 sqljrDrdaArCall__FP14db2UCinterfaceP9UCstpInfo
        0x09000000075e0344 sqleproc__FPcP7sqlcharP5sqldaT3P5sqlca
        0x09000000075c0890 sqlerInvokeKnownProcedure__FUiP5sqldaP5sqlca
        0x0900000007b3f804 db2Reorg
        0x000000010009c5a4 DoReorg
        0x000000010009b0e8 HandleInFlightReorgs
        0x0000000100099914 RdbReorg
        0x0000000100a92eb4 RdbMonitorStatsThread
        0x000000010000c0e0 StartThread
    
    
    Thread 317, Parent 316: OnDemandThread, Storage 8424, AllocCnt 2049 HighWaterAmt 52479
     tid=ac3d, ptid=b43c, det=1, zomb=0, join=0, result=0, sess=0
      Stack trace:
        0x09000000002597a0 semop
        0x090000000761158c sqloSSemP
        0x0900000007610f64 .sqlccrecv.fdpr.clone.739
        0x0000000000000000 *UNKNOWN*
        0x090000000761082c sqljcReceive__FP10sqljCmnMgr
        0x090000000761e440 sqljrDrdaArExecute__FP14db2UCinterfaceP9UCstpInfo
        0x09000000078b5520 CLI_sqlExecute__FP17CLI_STATEMENTINFOP19CLI_ERRORHEADERINFO
        0x0900000007924860 SQLExecute2__FP17CLI_STATEMENTINFOP19CLI_ERRORHEADERINFO
        0x0900000007936c64 SQLExecute
        0x000000010009fea8 CheckWithDB2
        0x000000010009f81c CheckForOnDemandReorg
        0x00000001000a711c OnDemandThread
        0x000000010000c0e0 StartThread
    
    
    
    
    Note that the thread call stack output above was obtained by
    running the Tivoli Storage Manager server command 'SHOW THREADS'
    on an AIX server. Linux and other UNIX servers do not show
    thread call stacks in the 'SHOW THREADS' output. For these other
    operating systems, run the 'pstack' command and search the
    output for the RdbMonitorStatsThread and OnDemandThread threads
    and their associated call stacks to confirm the problem.
    
    
    For example:
    
    
    pstack <dsmserv_PID>
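    
    The check described above can be partly automated. The following
    is a minimal Python sketch, not an IBM-supplied tool. It assumes
    that the saved 'SHOW THREADS' or 'pstack' output starts each
    thread section with a line beginning "Thread <number>" and that
    symbol names are resolved in the stack. The function names
    (split_threads, reorg_deadlock_suspected) and the choice of
    frames used as the signature are illustrative assumptions taken
    from the example call stacks in this APAR. A match only suggests
    that the APAR may apply; compare the full call stacks before
    drawing a conclusion.

```python
import re

# Frames copied from the example call stacks in this APAR. Treating these
# particular frames as the deadlock signature is an assumption of this sketch.
MONITOR_FRAMES = ("RdbMonitorStatsThread", "HandleInFlightReorgs",
                  "db2Reorg", "semop")
ONDEMAND_FRAMES = ("OnDemandThread", "CheckForOnDemandReorg",
                   "SQLExecute", "semop")

# Abridged from the AIX 'SHOW THREADS' example above, for demonstration only.
SAMPLE = """\
Thread 248, Parent 1: RdbMonitorStatsThread, Storage 1642726
  Stack trace:
    0x09000000002597a0 semop
    0x0900000007b3f804 db2Reorg
    0x000000010009b0e8 HandleInFlightReorgs
    0x0000000100a92eb4 RdbMonitorStatsThread
Thread 317, Parent 316: OnDemandThread, Storage 8424
  Stack trace:
    0x09000000002597a0 semop
    0x0900000007936c64 SQLExecute
    0x000000010009f81c CheckForOnDemandReorg
    0x00000001000a711c OnDemandThread
"""


def split_threads(dump):
    """Split a thread dump into one text block per thread.

    Assumes every thread section starts with a line beginning
    'Thread <number>'; lines before the first such header are ignored.
    """
    blocks, current = [], None
    for line in dump.splitlines():
        if re.match(r"\s*Thread \d+", line):
            current = []
            blocks.append(current)
        if current is not None:
            current.append(line)
    return ["\n".join(block) for block in blocks]


def has_frames(block, frames):
    """Return True if every expected frame name appears in the block."""
    return all(frame in block for frame in frames)


def reorg_deadlock_suspected(dump):
    """Return True if both reorg threads show the waiting call stacks."""
    blocks = split_threads(dump)
    monitor_waiting = any(has_frames(b, MONITOR_FRAMES) for b in blocks)
    ondemand_waiting = any(has_frames(b, ONDEMAND_FRAMES) for b in blocks)
    return monitor_waiting and ondemand_waiting
```

    To use the sketch, save the command output to a file (for
    example, by redirecting the 'pstack' output), read the file into
    a string, and pass it to reorg_deadlock_suspected(). Substring
    matching is used deliberately so that mangled names such as
    SQLExecute2__FP17... still match.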
    

Local fix

  • Restarting the Tivoli Storage Manager server will resolve the
    deadlock.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All Tivoli Storage Manager server users.     *
    ****************************************************************
    * PROBLEM DESCRIPTION: See ERROR DESCRIPTION.                  *
    ****************************************************************
    * RECOMMENDATION: Apply fixing level when available. This      *
    *                 problem is currently projected to be fixed   *
    *                 in level 6.3.5. Note that this is            *
    *                 subject to change at the discretion of IBM.  *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    
    Affected platforms:  AIX, HP-UX, Solaris, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC94985

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    63A

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-08-16

  • Closed date

    2013-08-22

  • Last modified date

    2013-08-22

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R63A PSY UP

  • R63H PSY UP

  • R63L PSY UP

  • R63S PSY UP

  • R63W PSY UP

Document Information

Modified date:
22 August 2013