IBM Support

IT21234: All channel processing hangs some weeks after user deletes cluster transmission queue

APAR status

  • Closed as program error.

Error description

  • Due to a missing cluster transmission queue, the MQ cluster
    repository process (amqrrmfa) hangs while holding a lock on the
    channel status table.
    
    While the repository process is in this hung state, the
    DISPLAY CHSTATUS command also hangs.
    
    FDCs are created:
    
    For example:
    
      Probe Id          :- RM419010
      Component         :- rrmRemoveNonReallocMsgs
      Program Name      :- amqrrmfa
      Major Errorcode   :- MQRC_UNKNOWN_OBJECT_NAME
      MQM Function Stack
      amqrrfma_main
      rrmMain
      rrmRepository
      rrmGetMsg
      rrmRunTimers
      rrmMaintenance
      rfxEnumCLQMGR
      rrmMaintainClqMgr
      rrmRemoveNonReallocMsgs
      xcsFFST
    
    Also:
      Probe Id          :- RM193001
      Component         :- rrmMaintenance
      Program Name      :- amqrrmfa
      Major Errorcode   :- rrcE_MQOPEN_FAILED
      Comment2          :- SYSTEM.CLUSTER.TRANSMIT.QUEUE
      MQM Function Stack
      amqrrfma_main
      rrmMain
      rrmRepository
      rrmGetMsg
      rrmRunTimers
      rrmMaintenance
      xcsFFST
    
    Also:
      Probe Id          :- XC307100
      Component         :- xlsRequestMutex
      Program Name      :- amqpcsea
      Major Errorcode   :- xecL_W_LONG_LOCK_WAIT
      MQM Function Stack
      pcmCommandServer
      pcmProcessMessage
      pcmEscape
      uscRunCommand
      uscRunIT
      pcmInquireChannelStatus
      rrxGetNextStatusEntry
      xlsRequestMutex
      xcsFFST
    
    Additional symptoms:
    
    AMQ9509: Program cannot open queue manager object.
    
    EXPLANATION:
    The attempt to open either the queue or queue manager object
    'SYSTEM.CLUSTER.TRANSMIT.QUEUE' on queue manager '<qmgr-name>'
    failed with reason code 0.
    ACTION:
    Ensure that the queue is available and retry the operation.
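
The three FDC signatures above can be looked for mechanically. What follows is a minimal Python sketch, assuming FDC files use the "Label :- value" header layout shown in the examples. The name `apar_probe_ids` and the sample text are illustrative only and are not part of any IBM tool.

```python
import re

# Probe IDs quoted in this APAR's error description.
APAR_PROBE_IDS = {"RM419010", "RM193001", "XC307100"}

# Matches header lines such as "Probe Id          :- RM419010".
_PROBE_RE = re.compile(r"^\s*Probe Id\s*:-\s*(\S+)", re.MULTILINE)


def apar_probe_ids(fdc_text):
    """Return the APAR-related probe IDs found in FDC text, sorted."""
    found = set(_PROBE_RE.findall(fdc_text))
    return sorted(found & APAR_PROBE_IDS)


# Hypothetical FDC fragment: one probe from this APAR, one unrelated probe.
sample = """\
  Probe Id          :- RM193001
  Component         :- rrmMaintenance
  Major Errorcode   :- rrcE_MQOPEN_FAILED
  Probe Id          :- XC130003
"""
print(apar_probe_ids(sample))
```

A match is only a hint. The same probes may be raised by unrelated problems, so the function stacks and the AMQ9509 message should still be compared with the ones listed here.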
    

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    Users who configured a cluster sender channel to use a
    specific transmission queue, whose channel later fell into
    disuse, and who deleted the transmission queue before MQ had
    automatically removed its own internal channel definition from
    its local cluster cache.
    
    
    Platforms affected:
    MultiPlatform
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    The user deleted a locally defined queue that was previously
    associated as the transmission queue for a local cluster sender
    channel.
    
    After the channel had been unused for a long time (over 90
    days), automated processing in the queue manager's cluster
    repository manager expired the local auto-defined channel
    definitions for it from the local cluster cache.
    
    As part of this expiry, the queue manager called MQOPEN on the
    cluster transmission queue that was configured for that channel.
    
    If the user had deleted the queue, then this failed, and in the
    error-handling logic that follows, the queue manager deadlocked
    waiting for a mutex that would not be released.
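
The shape of this hang can be illustrated with a toy model. The Python sketch below is not MQ code. The names `status_table_lock`, `never_released`, `expire_channel` and `display_chstatus` are hypothetical, the APAR does not say which mutex was involved, and timeouts are used only so that the demonstration terminates where the real process would wait indefinitely. The `fixed` flag models the corrected behaviour, in which the missing queue is not treated as an error.

```python
import threading

status_table_lock = threading.Lock()  # stands in for the channel status table lock
never_released = threading.Lock()     # hypothetical mutex; the APAR does not name it
never_released.acquire()              # simulate an owner that never lets go


def expire_channel(xmitq_exists, fixed, holding, wait=0.5):
    """Toy model of expiring one cached channel record."""
    with status_table_lock:
        holding.set()  # tell the caller that the status lock is now held
        if xmitq_exists or fixed:
            # Fixed behaviour: a missing queue is not an error at this point.
            return "expired"
        # Pre-fix error path: wait for a mutex that nobody releases.
        # The timeout keeps this demonstration finite.
        if never_released.acquire(timeout=wait):
            never_released.release()
        return "gave up (the real process would hang)"


def display_chstatus(wait=0.1):
    """Toy model of a command that needs the status table lock."""
    if status_table_lock.acquire(timeout=wait):
        status_table_lock.release()
        return "ok"
    return "long lock wait"


def run(fixed):
    """Expire a channel whose queue is missing, then query channel status."""
    holding = threading.Event()
    worker = threading.Thread(target=expire_channel, args=(False, fixed, holding))
    worker.start()
    holding.wait()               # make sure the worker owns the lock first
    result = display_chstatus()
    worker.join()
    return result
```

With `run(fixed=False)` the status query times out while the worker is stuck on its error path, which mirrors the xecL_W_LONG_LOCK_WAIT symptom. With `run(fixed=True)` the worker finishes at once and the query succeeds.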
    

Problem conclusion

  • The MQ product code has been altered so that the missing
    transmission queue is not treated as an error condition at this
    point in the processing.  The cluster repository manager
    completes its removal of the old records from its cache without
    any errors.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v8.0       8.0.0.8
    v9.0 CD    9.0.5
    v9.0 LTS   9.0.0.3
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
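
The table above can be turned into a simple check of an installed level. The following Python sketch is illustrative only: the function name `includes_fix` is hypothetical, and it assumes that 9.0.0.x levels belong to the LTS stream while 9.0.1 and later 9.0.x levels belong to the CD stream. Releases that the table does not mention return `None` rather than a guess.

```python
def includes_fix(vrmf):
    """Say whether a level contains the IT21234 fix, per the table above.

    vrmf is a version string such as "9.0.0.2". Returns True or False
    when the table decides the answer, and None for releases it omits.
    """
    parts = [int(p) for p in vrmf.split(".")]
    parts += [0] * (4 - len(parts))      # pad "9.0.5" to 9.0.5.0
    v, r, m, f = parts[:4]
    if (v, r) == (8, 0):
        return (m, f) >= (0, 8)          # v8.0: fixed in 8.0.0.8
    if (v, r) == (9, 0):
        if m == 0:
            return f >= 3                # assumed LTS stream: fixed in 9.0.0.3
        return m >= 5                    # assumed CD stream: fixed in 9.0.5
    return None                          # release not covered by this APAR text


for level in ("8.0.0.7", "9.0.0.3", "9.0.4", "7.5.0.9"):
    print(level, includes_fix(level))
```

The version string would typically come from the installed product's version report. Parsing that report is left out here because its exact layout is not given in this document.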
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT21234

  • Reported component name

    WMQ BASE MULTIP

  • Reported component ID

    5724H7251

  • Reported release

    800

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-06-28

  • Closed date

    2017-10-17

  • Last modified date

    2021-02-15

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    PI88958 IT33906

Fix information

  • Fixed component name

    WMQ BASE MULTIP

  • Fixed component ID

    5724H7251

Applicable component levels

  • Product: IBM MQ (SSYHRD)
  • Platform: Platform Independent
  • Version: 8.0.0.0
  • Line of business: IBM Automation
  • Business unit: Cloud & Data Platform

Document Information

Modified date:
27 February 2021