IBM Support

IT24269: Queue manager ceases to participate in cluster after REFRESH CLUSTER. rc=2189 and other errors are seen


APAR status

  • Closed as program error.

Error description

  • In this scenario the filesystem used by the local queue manager's
    logger had availability problems.  Because of this, there were
    thousands of application messages on the
    SYSTEM.CLUSTER.TRANSMIT.QUEUE, and one or more channels held
    indoubt batches of messages.  At this time the administrator ran
    REFRESH CLUSTER on the local queue manager.  The indoubt batches
    of messages were rolled back (probably automatically due to log
    space shortage, though the same effect could be seen if using
    RESOLVE CHANNEL to roll back the batch), so the messages
    appeared on the transmission queue again.
    
    After this the local queue manager's repository manager program
    tried to reallocate messages from the rolled-back indoubt
    batches.  The reallocation routine found that the cluster cache
    no longer knew anything about the queues for which the messages
    were destined.  The repository manager did not break from its
    reallocation routine to ask the full repositories for details of
    the queues, so it suffered repeated
    MQRC_CLUSTER_RESOLUTION_ERROR errors over a period of many
    minutes, which were visible in the MQ trace file for the
    amqrrmfa process.
    
    Application calls to MQOPEN for queues not known locally will
    fail with 2189 MQRC_CLUSTER_RESOLUTION_ERROR.
    Other symptoms include:
    -- a failure to recognize other queue managers in the
       cluster, including the queue manager hosting the cluster
       queue.
    -- message buildup on the SYSTEM.CLUSTER.COMMAND.QUEUE (SCCQ)
       and the SYSTEM.CLUSTER.TRANSMIT.QUEUE (SCTQ).
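
The MQOPEN symptom above can be handled defensively on the application side. The following is a minimal Python sketch, not taken from the APAR and not a workaround for the defect: `MQError` and the `open_queue` callable are hypothetical stand-ins for whatever MQ client library is in use, and only the reason code 2189 comes from the text. It retries a bounded number of times on 2189 and re-raises everything else.

```python
import time

MQRC_CLUSTER_RESOLUTION_ERROR = 2189  # reason code named in this APAR


class MQError(Exception):
    """Hypothetical stand-in for an MQ client library's error type."""

    def __init__(self, reason):
        super().__init__(f"MQ call failed with reason {reason}")
        self.reason = reason


def open_with_retry(open_queue, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call open_queue(), retrying only on reason code 2189.

    open_queue is a hypothetical zero-argument callable wrapping MQOPEN.
    """
    for attempt in range(1, attempts + 1):
        try:
            return open_queue()
        except MQError as err:
            # Any other reason code, or the final attempt, is not retried.
            if err.reason != MQRC_CLUSTER_RESOLUTION_ERROR or attempt == attempts:
                raise
            sleep(delay_seconds)
```

Bounding the retries means that a persistent 2189, as in this scenario, surfaces to the caller instead of being hidden by an endless loop.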
    

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    A system that suffers multiple problems including log space
    shortage, at a time when there are inflight batches and many
    other application messages sitting on the
    SYSTEM.CLUSTER.TRANSMIT.QUEUE.  This problem only occurs if the
    REFRESH CLUSTER command is issued while the queue manager is in
    this situation.
    
    
    Platforms affected:
    MultiPlatform
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    The reallocation routine within the repository manager was
    suffering an MQRC_CLUSTER_RESOLUTION_ERROR condition repeatedly,
    for the same message each time.  There were thousands of
    messages on the cluster transmission queue, and because of a
    flaw in the logic flow it would re-read the same message
    thousands of times, with a 1-second sleep between each attempt.
    It did not break from this loop to send the query to the full
    repositories that was necessary to relieve this situation.
    

Problem conclusion

  • The correct behaviour for the reallocation routine is to
    schedule a re-run of itself in 60 seconds' time, and break from
    its work to allow the repository manager to request information
    from the full repositories.  This is the behaviour that has now
    been coded in the MQ queue manager.
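
The corrected control flow described above can be illustrated with a short Python sketch. This is not IBM's implementation: `resolve` and `schedule_rerun` are hypothetical callables introduced for illustration, and only the 60-second re-run interval and the reason code 2189 come from the APAR text.

```python
RC_OK = 0
MQRC_CLUSTER_RESOLUTION_ERROR = 2189
RERUN_DELAY_SECONDS = 60  # re-run interval stated in the APAR


def reallocate(messages, resolve, schedule_rerun):
    """Reallocate messages until one cannot be resolved.

    resolve(msg) is a hypothetical callable returning RC_OK or 2189.
    schedule_rerun(seconds) is a hypothetical callable that queues another run.
    Returns the number of messages reallocated in this pass.
    """
    done = 0
    for msg in messages:
        if resolve(msg) == MQRC_CLUSTER_RESOLUTION_ERROR:
            # Do not sleep and re-read the same message. Stop here so the
            # caller can query the full repositories, and try again later.
            schedule_rerun(RERUN_DELAY_SECONDS)
            break
        done += 1
    return done
```

The point of the sketch is the `break`: once the routine returns, the surrounding repository manager is free to do other work before the scheduled re-run.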
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v8.0       8.0.0.10
    v9.0 LTS   9.0.0.4
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
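
A quick way to compare an installed level against the table above is sketched below in Python. It assumes the installed level is available as a four-part V.R.M.F string (for example, as reported by `dspmqver`); the function names are hypothetical, and streams not listed in this APAR are reported as unknown.

```python
# Fix levels listed in the APAR, keyed by (version, release, modification).
FIX_LEVELS = {
    (8, 0, 0): (8, 0, 0, 10),
    (9, 0, 0): (9, 0, 0, 4),
}


def parse_vrmf(text):
    """Turn a string such as '9.0.0.4' into a tuple of four integers."""
    parts = tuple(int(p) for p in text.strip().split("."))
    if len(parts) != 4:
        raise ValueError(f"expected V.R.M.F, got {text!r}")
    return parts


def has_fix(installed):
    """Return True or False for listed streams, None for streams not listed."""
    vrmf = parse_vrmf(installed)
    required = FIX_LEVELS.get(vrmf[:3])
    if required is None:
        return None  # the APAR text does not cover this stream
    return vrmf >= required
```

Comparing integer tuples avoids the string-comparison trap in which "8.0.0.9" sorts after "8.0.0.10".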
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT24269

  • Reported component name

    IBM MQ BASE MP

  • Reported component ID

    5724H7251

  • Reported release

    800

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-03-02

  • Closed date

    2018-03-19

  • Last modified date

    2018-03-19

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    PI95380 IT33907

Fix information

  • Fixed component name

    IBM MQ BASE MP

  • Fixed component ID

    5724H7251

Applicable component levels


Document Information

Modified date:
15 August 2020