IBM Support

IT34737: IBM MQ Appliance HA Queue manager configured for DR restarts/fails over unexpectedly

APAR status

  • Closed as program error.

Error description

  • An IBM MQ Appliance HA queue manager restarted unexpectedly.
    The messages log indicates that, at the time of the issue, the
    DR_ping_monitor process timed out, which caused the queue
    manager to restart or fail over:
    
    Oct 26 07:01:13 MQAPP01 lrmd[21715]:  warning:
    QM1_DR_ping_monitor_20000 process (PID 388342) timed out
    Oct 26 07:01:13 MQAPP01 lrmd[21715]:  warning:
    QM1_DR_ping_monitor_20000:388342 - timed out after 10000ms
    Oct 26 07:01:14 MQAPP01 pengine[21717]:  warning: Processing
    failed monitor of QM1_DR_ping:1 on MQAPP01: unknown error
    
    Oct 26 07:01:40 MQAPP01 lrmd[21715]:  notice:
    QM1_stop_0:394620:stderr [ IBM MQ Appliance queue manager 'QM1'
    ending. ]
    Oct 26 07:01:40 MQAPP01 lrmd[21715]:  notice:
    QM1_stop_0:394620:stderr [ IBM MQ Appliance queue manager 'QM1'
    ended. ]
    
    Oct 26 07:01:46 MQAPP01 lrmd[21715]:  notice:
    QM1_start_0:410673:stderr [ IBM MQ Appliance queue manager
    'QM1' starting. ]
    Oct 26 07:01:46 MQAPP01 lrmd[21715]:  notice:
    QM1_start_0:410673:stderr [ IBM MQ Appliance queue manager
    'QM1' started using V9.1.0.6. ]
    
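To check whether a node has hit the same symptom, its system log can be scanned for the lrmd timeout warning shown above. The following is a minimal Python sketch, not an IBM-supplied tool. It assumes the log has been exported from the appliance with one entry per line (the excerpt above is wrapped for display); the function name `find_dr_ping_timeouts` and the regular expression are illustrative assumptions derived only from the message format in this APAR.

```python
import re

# Pattern modelled on the lrmd warning quoted in this APAR, for example:
#   "... lrmd[21715]:  warning: QM1_DR_ping_monitor_20000:388342 - timed out after 10000ms"
# The exact message format on other firmware levels is an assumption.
TIMEOUT_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+lrmd\[\d+\]:\s+"
    r"warning:\s+(?P<qmgr>\S+?)_DR_ping_monitor_(?P<interval_ms>\d+):(?P<pid>\d+)"
    r" - timed out after (?P<timeout_ms>\d+)ms"
)


def find_dr_ping_timeouts(lines):
    """Return one dict per DR ping monitor timeout found in the given log lines."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            hits.append({
                "timestamp": match.group("timestamp"),
                "host": match.group("host"),
                "queue_manager": match.group("qmgr"),
                "timeout_ms": int(match.group("timeout_ms")),
            })
    return hits


# Entries taken from the excerpt in this APAR, joined onto single lines.
sample_log = [
    "Oct 26 07:01:13 MQAPP01 lrmd[21715]:  warning: "
    "QM1_DR_ping_monitor_20000 process (PID 388342) timed out",
    "Oct 26 07:01:13 MQAPP01 lrmd[21715]:  warning: "
    "QM1_DR_ping_monitor_20000:388342 - timed out after 10000ms",
    "Oct 26 07:01:40 MQAPP01 lrmd[21715]:  notice: "
    "QM1_stop_0:394620:stderr [ IBM MQ Appliance queue manager 'QM1' ended. ]",
]

timeouts = find_dr_ping_timeouts(sample_log)
for hit in timeouts:
    print(hit["timestamp"], hit["queue_manager"], hit["timeout_ms"])
```

Only the "timed out after" entry is matched, so each timeout event is counted once even though lrmd logs two warnings for it.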

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    Users of the IBM MQ Appliance with High Availability (HA) and
    Disaster Recovery (DR)
    
    
    Platforms affected:
    MultiPlatform
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    The DR ping monitor process, which periodically checks the
    health of the DR connection, could time out when the system
    was under high CPU load. The timeout caused the HA queue
    manager to be restarted or failed over.
    

Problem conclusion

  • The DR ping monitor code has been improved to prevent it from
    timing out in this scenario.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v9.1 LTS   9.1.0.8
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
    
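The version reported in the queue manager start message (V9.1.0.6 in the excerpt above) can be compared against the fix level listed here. The following is a small Python sketch of that comparison; the helper names are hypothetical, and it only answers the question for the v9.1 LTS (9.1.0.x) stream, because this APAR does not name fix levels for any other stream.

```python
import re

# Fix level named in this APAR for the v9.1 LTS stream.
FIX_LEVEL = (9, 1, 0, 8)


def parse_vrmf(text):
    """Extract the first V.R.M.F version found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no V.R.M.F version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def includes_fix(text):
    """Return True or False for 9.1.0.x levels, and None for any other stream.

    None means "not covered by this APAR's PTF list", not "unaffected".
    """
    version = parse_vrmf(text)
    if version[:3] != FIX_LEVEL[:3]:
        return None
    return version >= FIX_LEVEL


# Version string as it appears in the start message quoted in this APAR.
print(includes_fix("'QM1' started using V9.1.0.6."))
```

Comparing integer tuples rather than strings avoids ordering mistakes such as treating "9.1.0.10" as lower than "9.1.0.8".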

Temporary fix

Comments

APAR Information

  • APAR number

    IT34737

  • Reported component name

    MQ APPLIANCE M2

  • Reported component ID

    5737H4700

  • Reported release

    910

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-10-30

  • Closed date

    2021-04-27

  • Last modified date

    2021-07-06

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    MQ APPLIANCE M2

  • Fixed component ID

    5737H4700

Applicable component levels

Document Information

Modified date:
07 July 2021