IBM Support

IZ23943: Hang in DUOW (pipelined) channel process following forced termination of a channel XPPTHREADMUTEX

APAR status

  • Closed as program error.

Error description

  • If a Dual Unit Of Work (DUOW) (pipelined) channel is forcibly
    killed, e.g. through the runmqsc command STOP CHANNEL with
    MODE(TERMINATE), then there is a possibility that the whole
    channel process may hang.
    
    This can be difficult to diagnose, and the usual hang
    diagnostics, such as a pstack, may not show evidence of a hung
    thread. Sending SIGUSR2 to the process will not be effective.
    It may be possible to investigate a core file if one can be
    generated.
    
    If the core file's thread structures (XIHT) can be examined and
    formatted, they may show:
    
    A thread waiting for its DUOW secondary thread to end within
    the xcsWaitThread function, and having been vectored to the
    xppRunDestructors function (due to a thread cancel issued
    against this thread). For example, the traceback will contain
    the following sequence:
    
    ---} rriWaitTransfer rc=rrcI_CHANNEL_CLOSED
    --} rriReceiveData rc=rrcI_CHANNEL_CLOSED
    --{ rriFreeSess
    ---{ rriWaitSecondary
    ----{ xcsReleaseThreadMutexSem
    ----} xcsReleaseThreadMutexSem rc=OK
    ----{ xcsWaitThread
    -----{ xppRunDestructors
    
    The last line of the traceback will be:
    
    -----} xppRunDestructors rc=OK
    
    This is sufficient evidence that the thread is blocked within
    the thread destruction code, in fact holding a lock that may
    prevent the process from doing further significant work.
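
The check described above can be sketched as a small script. This is only an illustration: the function names come from the traceback quoted in this APAR, but the line format (dashes, then `{` for entry or `}` for exit, then the function name) is assumed from that excerpt, and the helper `looks_like_iz23943` is a hypothetical name, not an IBM tool.

```python
import re

# Function names taken from the traceback quoted in this APAR.
SIGNATURE = ["rriWaitSecondary", "xcsWaitThread", "xppRunDestructors"]

# Assumed line shape: dashes, "{" (entry) or "}" (exit), then the function name.
LINE_RE = re.compile(r"^\s*-+([{}])\s+(\w+)")

# The sequence shown in the error description, ending with the final exit line.
SAMPLE = """\
---} rriWaitTransfer rc=rrcI_CHANNEL_CLOSED
--} rriReceiveData rc=rrcI_CHANNEL_CLOSED
--{ rriFreeSess
---{ rriWaitSecondary
----{ xcsReleaseThreadMutexSem
----} xcsReleaseThreadMutexSem rc=OK
----{ xcsWaitThread
-----{ xppRunDestructors
-----} xppRunDestructors rc=OK
""".splitlines()


def looks_like_iz23943(traceback_lines):
    """Return True if the signature entries appear in order and the
    last traced line is the exit from xppRunDestructors."""
    parsed = []
    for line in traceback_lines:
        match = LINE_RE.match(line)
        if match:
            parsed.append((match.group(1), match.group(2)))
    if not parsed:
        return False
    # Walk the function entries once, looking for each signature name in turn.
    entries = (name for kind, name in parsed if kind == "{")
    in_order = all(any(name == wanted for name in entries) for wanted in SIGNATURE)
    return in_order and parsed[-1] == ("}", "xppRunDestructors")


if __name__ == "__main__":
    print(looks_like_iz23943(SAMPLE))
```

A match only suggests that the thread fits the pattern described here; it does not replace examining the formatted thread structures.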
    

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    Users who make use of DUOW (PipeLineLength=2 in the qm.ini
    CHANNELS stanza), and who issue the runmqsc command STOP
    CHANNEL with MODE(TERMINATE), may be impacted by this problem.
    The problem is unlikely to occur unless there are significant
    delays in channel termination.
    
    Platforms affected:
    All Unix
    
    ****************************************************************
    PROBLEM SUMMARY:
    The thread destruction code needs to take a lock, so that lock
    must not already be held by the thread when it is cancelled.
    Otherwise, the thread deadlocks while holding the lock.
    The code needed to ensure that this could not happen.
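
The locking rule in the problem summary can be illustrated with a minimal sketch. It is not the product code or the actual fix. Python has no thread cancellation, so cancellation is simulated by calling the destruction routine directly, and every name here (`thread_lock`, `run_destructors`, and the two scenario functions) is hypothetical. A timeout stands in for what would otherwise be a permanent block.

```python
import threading

# Hypothetical stand-in for the lock that the destruction code needs.
# threading.Lock is non-reentrant: a second acquire by the holder blocks.
thread_lock = threading.Lock()


def run_destructors(timeout=0.2):
    """Stand-in for thread destruction code, which must take thread_lock."""
    # Real code would block forever here; the timeout lets the sketch report it.
    if not thread_lock.acquire(timeout=timeout):
        return "deadlock"
    try:
        return "cleaned up"
    finally:
        thread_lock.release()


def cancelled_while_holding_lock():
    """Problem case: destruction starts while the thread still holds the lock."""
    with thread_lock:
        return run_destructors()


def cancelled_after_releasing_lock():
    """Safe case: the lock is released before destruction can run."""
    with thread_lock:
        pass  # work that needs the lock
    return run_destructors()


def run_in_thread(target):
    """Run target on a worker thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(target()))
    worker.start()
    worker.join()
    return result[0]


if __name__ == "__main__":
    print(run_in_thread(cancelled_while_holding_lock))
    print(run_in_thread(cancelled_after_releasing_lock))
```

In the first scenario the destruction routine cannot obtain the lock because its own thread holds it, which corresponds to the hang described in this APAR. In the second, the lock is free by the time destruction runs.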
    

Problem conclusion

  • Modified the coding to ensure that the deadlock cannot occur.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
                       v6.0
    Platform           Fix Pack 6.0.2.5
    --------           --------------------
    AIX                U815929
    HP-UX (PA-RISC)    U815636
    HP-UX (Itanium)    U815818
    Solaris (SPARC)    U815659
    Solaris (x86-64)   U815928
    Linux (x86)        U815767
    Linux (x86-64)     U815808
    Linux (zSeries)    U815805
    Linux (Power)      U815806
    Linux (s390x)      U815807
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
    

Temporary fix

Comments

APAR Information

  • APAR number

    IZ23943

  • Reported component name

    WMQ LIN X86 V6

  • Reported component ID

    5724H7204

  • Reported release

    600

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2008-06-06

  • Closed date

    2008-06-26

  • Last modified date

    2008-06-26

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WMQ LIN X86 V6

  • Fixed component ID

    5724H7204

Applicable component levels

  • R600 PSY

       UP


Document Information

Modified date:
26 June 2008