APAR status
Closed as program error.
Error description
The cluster manager process terminates unexpectedly because of an internal error, which causes the queue managers to fail over to the secondary node. The queue managers continue to run there even after their preferred node is online again. The following errors, recorded in the system log, show the failure that triggers the failover:
2021-04-21 13:13:19.374056+00:00 MQAPP1 crmd[861101]: error: Could not recover from internal error
2021-04-21 13:13:19.376296+00:00 MQAPP1 pacemakerd[860937]: error: crmd[861101] exited with status 201 (Generic Pacemaker error)
2021-04-21 13:13:19.376375+00:00 MQAPP1 pacemakerd[860937]: notice: Respawning failed child process: crmd
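To check whether a system log contains the same failure signature, the pacemakerd line that reports the crmd exit can be matched with a regular expression. The following is a minimal sketch, not an IBM-supplied tool: the function name `find_crmd_exits` is hypothetical, and the pattern assumes the log line layout is exactly as in the excerpt above.

```python
import re

# Matches the pacemakerd line reporting that crmd exited.
# The layout (timestamp, host, process[pid], message) is assumed from the excerpt above.
CRMD_EXIT = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<host>\S+) pacemakerd\[\d+\]: error: "
    r"crmd\[(?P<pid>\d+)\] exited with status (?P<status>\d+)"
)


def find_crmd_exits(lines):
    """Return (timestamp, host, crmd pid, exit status) for each crmd exit found."""
    hits = []
    for line in lines:
        m = CRMD_EXIT.match(line.strip())
        if m:
            hits.append((m["ts"], m["host"], int(m["pid"]), int(m["status"])))
    return hits


# The three log lines quoted in this APAR.
sample = [
    "2021-04-21 13:13:19.374056+00:00 MQAPP1 crmd[861101]: error: Could not recover from internal error",
    "2021-04-21 13:13:19.376296+00:00 MQAPP1 pacemakerd[860937]: error: crmd[861101] exited with status 201 (Generic Pacemaker error)",
    "2021-04-21 13:13:19.376375+00:00 MQAPP1 pacemakerd[860937]: notice: Respawning failed child process: crmd",
]

for hit in find_crmd_exits(sample):
    print(hit)
```

Only the second line matches, because it is the one that carries the crmd exit status.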
Local fix
Suspend the node where the queue managers are now running from the HA group, and then resume it.
Problem summary
****************************************************************
USERS AFFECTED:
All MQ Appliance users with one or more queue managers configured in an HA group.

Platforms affected:
MultiPlatform
****************************************************************
PROBLEM DESCRIPTION:
When the cluster manager process terminated unexpectedly, some of the transient attributes that determine whether a node is eligible to run the cluster resources were lost. As a result, the queue manager was unable to move back to its preferred node even though the node was online again.
Problem conclusion
The code has been improved so that the missing transient attributes are restored in the event of an unexpected failure.
---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version    Maintenance Level
v9.2 LTS   9.2.0.5
v9.x CD    9.2.5

The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
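The maintenance levels listed above can be turned into a simple version check. The following is a sketch under stated assumptions, not an official rule: the function name `has_it36745_fix` is hypothetical, the version is assumed to be a dotted numeric string, a third component of 0 is assumed to identify an LTS level, and later levels are assumed to retain the fix.

```python
def has_it36745_fix(version: str) -> bool:
    """Return True if a dotted version string is at or above the targeted fix level.

    Assumptions (not confirmed by the APAR): a third component of 0 marks an
    LTS level, any other value marks a CD level, and later levels keep the fix.
    """
    parts = tuple(int(p) for p in version.split("."))
    # Pad short versions such as "9.2.5" to four components.
    parts = parts + (0,) * (4 - len(parts))
    if parts[2] == 0:
        # LTS stream: fix targeted for 9.2.0.5.
        return parts >= (9, 2, 0, 5)
    # CD stream: fix targeted for 9.2.5.
    return parts >= (9, 2, 5, 0)


for v in ("9.2.0.4", "9.2.0.5", "9.2.4", "9.2.5"):
    print(v, has_it36745_fix(v))
```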
Temporary fix
Comments
APAR Information
APAR number
IT36745
Reported component name
MQ APPL M2002 V
Reported component ID
5737H4701
Reported release
920
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-04-29
Closed date
2022-01-20
Last modified date
2022-01-25
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
MQ APPL M2002 V
Fixed component ID
5737H4701
Applicable component levels
Document Information
Modified date:
26 January 2022