IBM Support

IT20246: DELETE QLOCAL command hangs for a cluster transmission queue when its Q file is missing or corrupted

APAR status

  • Closed as program error.

Error description

  • While running the DELETE QLOCAL command from runmqsc to delete a
    damaged cluster transmission queue, runmqsc hangs.
    
    The output to the screen is:
         1 : delete qlocal(SYSTEM.CLUSTER.TRANSMIT.chlname)
    AMQ8101: WebSphere MQ error (893) has occurred.
       [1012, 20]
    
    An error message is seen in the queue manager error logs:
    AMQ7472: Object SYSTEM.CLUSTER.TRANSMIT.chlname, type queue
    damaged.
    
    A Failure Data Capture (FDC) record is written, with these
    details:
    Probe Id :- KN772010
    Component :- kqiCloseShadowXmitQ
    Major Errorcode :- arcW_OBJECT_WOUNDED
    Probe Description :- AMQ7472: Object , type  damaged.
    MQM Function Stack
    zlaMainThread
    zlaProcessMessage
    zlaProcessSPIRequest
    zlaSPIDelete
    zsqSPIDelete
    zsqInqSetDef
    kpiSPIDelete
    kqiCloseShadowXmitQ
    xcsFFST
    
    The queue is not deleted.  Other applications that try to use
    the queue also hang.
    For example:
    - another instance of runmqsc hangs when the user types DISPLAY
    QLOCAL for that queue.
    - an instance of the amqsbcg program hangs when the user tries
    to browse the content of the queue.
    
    In addition, an attempt to end the affected queue manager hangs
    in the endmqm command.
    
    Further FDC records are seen after this point, mentioning
    xecL_W_LONG_LOCK_WAIT, with the following call stack:
    MQM Function Stack
    zlaMainThread
    zlaProcessMessage
    zlaProcessSPIRequest
    zlaSPIDelete
    zsqSPIDelete
    zsqInqSetDef
    kpiSPIDelete
    apiEnquireObject
    aocEnquireObject
    xlsRequestMutex
    xcsFFST
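
The FDC signature described above can be checked mechanically. The following is a minimal sketch, not an IBM-supplied tool: it assumes FDC header lines have the "Name :- value" shape shown in this APAR, and the function names (fdc_fields, matches_it20246, is_follow_on_lock_wait) are hypothetical. The routine names come from this APAR, including kqiSwitchCLUSSDR2, which is named under "Problem conclusion".

```python
import re

# Routine names taken from this APAR. The parsing approach is an assumption.
APAR_COMPONENTS = {"kqiCloseShadowXmitQ", "kqiSwitchCLUSSDR2"}
LOCK_WAIT_MARKER = "xecL_W_LONG_LOCK_WAIT"

# Matches header lines such as "Probe Id :- KN772010".
FIELD_RE = re.compile(
    r"^\s*(Probe Id|Component|Major Errorcode)\s*:-\s*(\S+)",
    re.MULTILINE,
)


def fdc_fields(fdc_text):
    """Return the first value seen for each header field of interest."""
    fields = {}
    for name, value in FIELD_RE.findall(fdc_text):
        fields.setdefault(name, value)
    return fields


def matches_it20246(fdc_text):
    """True if the FDC header names a routine corrected by this APAR."""
    return fdc_fields(fdc_text).get("Component") in APAR_COMPONENTS


def is_follow_on_lock_wait(fdc_text):
    """True if the FDC mentions the long lock wait seen after the hang."""
    return LOCK_WAIT_MARKER in fdc_text


if __name__ == "__main__":
    sample = (
        "Probe Id :- KN772010\n"
        "Component :- kqiCloseShadowXmitQ\n"
        "Major Errorcode :- arcW_OBJECT_WOUNDED\n"
    )
    print(fdc_fields(sample))
    print(matches_it20246(sample))
```

A match only suggests that this APAR is worth considering; it does not confirm the diagnosis.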
    

Local fix

  • The MQ administrator can replace the queue file with a copy
    taken from a backup or from a temporary queue manager.
    

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    Users whose cluster transmission queue Q file has been corrupted
    or deleted, for example because of a serious disk problem.
    
    
    Platforms affected:
    MultiPlatform
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    The MQ code noticed that the Q file was missing or damaged, and
    continued to try to delete the queue.  A point in the MQ code
    (routine name: kqiCloseShadowXmitQ) was reached where a lock was
    held and was not released.  A second routine that needed the
    same lock then attempted to obtain it.  This led to an immediate
    deadlock.
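
The pattern described above can be illustrated with a small analogy. This is a sketch only: the MQ source is not shown in this APAR, so the Python functions below are hypothetical stand-ins that borrow the routine names, and a threading.Lock stands in for the internal lock. A timeout is used on the second acquire so that the demonstration reports the problem instead of hanging.

```python
import threading


def close_shadow_xmitq_buggy(lock, queue_damaged):
    """Stand-in for the faulty path: the early return leaves the lock held."""
    lock.acquire()
    if queue_damaged:
        return "damaged"  # error exit without releasing the lock
    lock.release()
    return "ok"


def close_shadow_xmitq_fixed(lock, queue_damaged):
    """Stand-in for the corrected path: the lock is released on every exit."""
    with lock:
        return "damaged" if queue_damaged else "ok"


def enquire_object(lock, timeout=0.1):
    """Second routine that needs the same lock. Returns True if obtained."""
    obtained = lock.acquire(timeout=timeout)
    if obtained:
        lock.release()
    return obtained


def demo(close_routine):
    """Run the close routine on a damaged queue, then try the second routine."""
    lock = threading.Lock()  # non-reentrant, like a simple mutex
    close_routine(lock, queue_damaged=True)
    return enquire_object(lock)


if __name__ == "__main__":
    print("buggy path, lock obtained:", demo(close_shadow_xmitq_buggy))
    print("fixed path, lock obtained:", demo(close_shadow_xmitq_fixed))
```

Without the timeout, the second acquire on the buggy path would wait indefinitely, which is what the hang looks like from the outside.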
    

Problem conclusion

  • The routine is corrected to ensure it releases the lock.
    
    While fixing this problem, it was noticed that a similar issue
    exists within a separate routine, kqiSwitchCLUSSDR2.  The same
    fix was therefore made to that routine.
    
    If a similar "hang" condition is seen, together with FDCs that
    mention kqiSwitchCLUSSDR2, consider this APAR as a possible
    fix.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v7.5       7.5.0.9
    v8.0       8.0.0.7
    v9.0 CD    9.0.3
    v9.0 LTS   9.0.0.2
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
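
The maintenance levels in the table above can be compared against an installed level. The following is a minimal sketch: the function names are hypothetical, the levels are copied from the table (which lists where the fix was targeted, not where it was confirmed as shipped), and it assumes that a v9.0 level with modification 0 is Long Term Support and any other v9.0 level is Continuous Delivery. Releases not listed in the table return None instead of a guess.

```python
def parse_vrmf(vrmf):
    """Turn a level such as '9.0.3' into a 4-part tuple, padding with zeros."""
    parts = [int(p) for p in vrmf.split(".")]
    parts += [0] * (4 - len(parts))
    return tuple(parts[:4])


# Minimum levels copied from the table in this APAR.
FIX_LEVELS = {
    (7, 5): (7, 5, 0, 9),
    (8, 0): (8, 0, 0, 7),
}
V90_LTS_FIX = (9, 0, 0, 2)
V90_CD_FIX = (9, 0, 3, 0)


def has_it20246_fix(vrmf):
    """Return True or False for listed releases, None for unlisted ones."""
    level = parse_vrmf(vrmf)
    if level[:2] == (9, 0):
        # Assumption: modification 0 means LTS, anything else means CD.
        if level[2] == 0:
            return level >= V90_LTS_FIX
        return level >= V90_CD_FIX
    minimum = FIX_LEVELS.get(level[:2])
    if minimum is None:
        return None  # release not covered by the table
    return level >= minimum


if __name__ == "__main__":
    for installed in ("7.5.0.8", "8.0.0.7", "9.0.0.1", "9.0.3"):
        print(installed, has_it20246_fix(installed))
```

The installed level would typically be read from the output of the dspmqver command and passed in as a string.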
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT20246

  • Reported component name

    WMQ BASE MULTIP

  • Reported component ID

    5724H7241

  • Reported release

    750

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-04-18

  • Closed date

    2017-04-26

  • Last modified date

    2017-12-04

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WMQ BASE MULTIP

  • Fixed component ID

    5724H7241

Applicable component levels

Document Information

Modified date:
04 December 2017