IT34737: IBM MQ Appliance HA Queue manager configured for DR restarts/fails over unexpectedly

APAR status

Closed as program error.

Error description

An IBM MQ Appliance HA queue manager restarted unexpectedly. The
system logs indicate that, at the time of the issue, the
DR_ping_monitor process timed out, which caused the queue
manager to restart or fail over:

Oct 26 07:01:13 MQAPP01 lrmd[21715]:  warning: QM1_DR_ping_monitor_20000 process (PID 388342) timed out
Oct 26 07:01:13 MQAPP01 lrmd[21715]:  warning: QM1_DR_ping_monitor_20000:388342 - timed out after 10000ms
Oct 26 07:01:14 MQAPP01 pengine[21717]:  warning: Processing failed monitor of QM1_DR_ping:1 on MQAPP01: unknown error

Oct 26 07:01:40 MQAPP01 lrmd[21715]:  notice: QM1_stop_0:394620:stderr [ IBM MQ Appliance queue manager 'QM1' ending. ]
Oct 26 07:01:40 MQAPP01 lrmd[21715]:  notice: QM1_stop_0:394620:stderr [ IBM MQ Appliance queue manager 'QM1' ended. ]

Oct 26 07:01:46 MQAPP01 lrmd[21715]:  notice: QM1_start_0:410673:stderr [ IBM MQ Appliance queue manager 'QM1' starting. ]
Oct 26 07:01:46 MQAPP01 lrmd[21715]:  notice: QM1_start_0:410673:stderr [ IBM MQ Appliance queue manager 'QM1' started using V9.1.0.6. ]
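
To check whether an unexpected restart matches this APAR, the log
can be searched for the DR ping monitor timeout warning shown
above. The following Python sketch is illustrative only and is not
an IBM-supplied tool. It assumes each log entry has been collected
as a single unwrapped line in the syslog format of the excerpt; the
function name find_dr_ping_timeouts and the regular expression are
hypothetical and are derived solely from the sample messages.

```python
import re

# Pattern derived from the sample lrmd warning in this APAR, e.g.
#   "... lrmd[21715]:  warning: QM1_DR_ping_monitor_20000:388342 - timed out after 10000ms"
# The exact message layout on other firmware levels is an assumption.
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"lrmd\[\d+\]:\s+warning:\s+"
    r"(?P<qm>\S+?)_DR_ping_monitor_\d+:(?P<pid>\d+) - timed out after (?P<ms>\d+)ms"
)


def find_dr_ping_timeouts(lines):
    """Return one dict per DR ping monitor timeout found in the log lines."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            hits.append({
                "timestamp": match.group("ts"),
                "host": match.group("host"),
                "queue_manager": match.group("qm"),
                "pid": int(match.group("pid")),
                "timeout_ms": int(match.group("ms")),
            })
    return hits


if __name__ == "__main__":
    sample = [
        "Oct 26 07:01:13 MQAPP01 lrmd[21715]:  warning: "
        "QM1_DR_ping_monitor_20000:388342 - timed out after 10000ms",
    ]
    for hit in find_dr_ping_timeouts(sample):
        print(hit)
```

A timeout entry followed shortly afterwards by stop and start
messages for the same queue manager is consistent with the
behaviour described in this APAR.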

Local fix

Problem summary

****************************************************************
USERS AFFECTED:
Users of IBM MQ Appliance with High Availability (HA) and
Disaster Recovery (DR)


Platforms affected:
MultiPlatform

****************************************************************
PROBLEM DESCRIPTION:
The DR ping monitor process, which periodically checks the health
of the DR connection, could time out when the system was under
high CPU load. This caused the HA queue manager to be restarted
or failed over.

Problem conclusion

The DR ping monitor code has been improved to prevent it from
timing out in this scenario.

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version    Maintenance Level
v9.1 LTS   9.1.0.8

The latest available maintenance can be obtained from
'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on
its planned availability can be found in 'WebSphere MQ
Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
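
The log excerpt reports the running level as V9.1.0.6, which is
below the targeted fix level of 9.1.0.8. A minimal Python sketch
of that comparison follows. The function names parse_vrmf and
has_it34737_fix are hypothetical, and the check assumes that the
fix was delivered at the targeted level and that levels are plain
four-part V.R.M.F strings. Levels outside the 9.1.0 LTS stream are
reported as undetermined because this APAR does not list them.

```python
FIX_LEVEL = (9, 1, 0, 8)  # targeted PTF level for v9.1 LTS from this APAR


def parse_vrmf(version):
    """Split a V.R.M.F string such as '9.1.0.6' into a tuple of four ints."""
    parts = version.strip().lstrip("Vv").split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a four-part V.R.M.F level: {version!r}")
    return tuple(int(p) for p in parts)


def has_it34737_fix(version):
    """Return True/False for 9.1.0.x levels, None when this APAR cannot say."""
    level = parse_vrmf(version)
    if level[:3] != FIX_LEVEL[:3]:
        # Only the v9.1 LTS stream is listed in this APAR.
        return None
    return level >= FIX_LEVEL


if __name__ == "__main__":
    for candidate in ("V9.1.0.6", "9.1.0.8"):
        print(candidate, has_it34737_fix(candidate))
```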

Temporary fix

Comments

APAR Information

APAR number               IT34737
Reported component name   MQ APPLIANCE M2
Reported component ID     5737H4700
Reported release          910
Status                    CLOSED PER
PE                        NoPE
HIPER                     NoHIPER
Special Attention         NoSpecatt / Xsystem
Submitted date            2020-10-30
Closed date               2021-04-27
Last modified date        2021-07-06

APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:

Fix information

Fixed component name      MQ APPLIANCE M2
Fixed component ID        5737H4700

Applicable component levels

[{"Type":"MASTER","Line of Business":{"code":"LOB36","label":"IBM Automation"},"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Product":{"code":"SS5K6E","label":"IBM MQ Appliance"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions"}]

Document Information

Modified date:
07 July 2021
