APAR status
Closed as program error.
Error description
The drouter process terminates unexpectedly, which causes the appliance to reload. The queue managers running on their primary node (APP1) end and are then all started on their secondary node (APP2). However, if any queue manager takes a long time to end and exceeds the wait time that the cluster manager processes allow before restarting it on the secondary node, it ends up running on both nodes. This leads to a split-brain situation and hence a Partitioned state.

The supervisor log on APP1 records the drouter restart:

2021-04-07 11:59:22.546508-04:00 APP1 supervisor: Restart event for process drouter (PID 123317) - Reason: (2) Signal info: PID (123317) Signal (17) Status (9)(EXIT(-1) SIG(9) CORE(0))
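To check whether a system log contains the same kind of event, the supervisor entry quoted in the error description can be matched with a regular expression. The following is a minimal sketch in Python, not an IBM tool: the pattern is inferred only from the single log line in this APAR, and the function name `parse_restart_event` and the dictionary keys are hypothetical. Other firmware levels may format the entry differently.

```python
import re

# Pattern inferred from the one sample line in this APAR (an assumption,
# not a documented log format).
RESTART_EVENT = re.compile(
    r"^(?P<timestamp>\S+ \S+) (?P<host>\S+) supervisor: "
    r"Restart event for process (?P<process>\S+) \(PID (?P<pid>\d+)\)"
    r".*SIG\((?P<sig>-?\d+)\)"
)

SAMPLE = (
    "2021-04-07 11:59:22.546508-04:00 APP1 supervisor: Restart event for "
    "process drouter (PID 123317) - Reason: (2) Signal info: PID (123317) "
    "Signal (17) Status (9)(EXIT(-1) SIG(9) CORE(0))"
)


def parse_restart_event(line):
    """Return the fields of a supervisor restart event, or None if no match."""
    match = RESTART_EVENT.search(line)
    if match is None:
        return None
    return {
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "process": match.group("process"),
        "pid": int(match.group("pid")),
        "sig": int(match.group("sig")),
    }


event = parse_restart_event(SAMPLE)
if event is not None and event["process"] == "drouter":
    print("drouter restart on", event["host"], "at", event["timestamp"])
```

A match for process drouter only shows that the triggering event occurred. Whether a queue manager then became Partitioned still has to be confirmed from its HA status.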
Local fix
Resolve the partitioned state by running 'makehaprimary QMgrName' on the winner appliance.
Problem summary
****************************************************************
USERS AFFECTED:
Users using MQ Appliance for HA

Platforms affected:
MultiPlatform
****************************************************************
PROBLEM DESCRIPTION:
An MQ Appliance reload results in the restart of the cluster manager processes, which in turn triggers the failover of the HA queue managers from the primary node to the secondary node. If a queue manager had not terminated on the primary node before the cluster manager processes were restarted, the queue manager entered a Partitioned state.
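The timing condition described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the appliance's actual logic: the function `failover_outcome`, its parameters, and the example durations are all hypothetical. The `wait_for_termination` flag stands in for the behaviour described in the problem conclusion, where the queue manager is terminated before the cluster manager processes are restarted.

```python
def failover_outcome(stop_seconds, wait_seconds, wait_for_termination=False):
    """Toy model of the failover race (all values hypothetical).

    stop_seconds: time the queue manager takes to end on the primary node.
    wait_seconds: time allowed before it is started on the secondary node.
    wait_for_termination: model of the fixed behaviour, where the restart
        on the secondary node does not overtake the end on the primary.
    """
    if wait_for_termination or stop_seconds <= wait_seconds:
        # Only one instance is ever running.
        return "failed over"
    # The secondary instance starts while the primary is still ending.
    return "partitioned"


print(failover_outcome(stop_seconds=5, wait_seconds=30))
print(failover_outcome(stop_seconds=45, wait_seconds=30))
print(failover_outcome(stop_seconds=45, wait_seconds=30,
                       wait_for_termination=True))
```

The model only captures the ordering of the two events. It says nothing about how long the real wait time is, which this APAR does not state.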
Problem conclusion
The code is modified to ensure that the queue manager process is terminated before the cluster manager processes are restarted on the primary node in the event of a reload.

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version    Maintenance Level
v9.2 LTS   9.2.0.4
v9.x CD    9.2.4

The latest available maintenance can be obtained from
'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on its
planned availability can be found in 'WebSphere MQ Planned
Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
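The maintenance levels listed above can be turned into a simple check of whether a given firmware level is expected to contain the fix. The sketch below is illustrative only. It assumes that LTS levels have the form 9.2.0.x and that any other level belongs to the CD stream and is compared against 9.2.4. The names `parse_level` and `includes_fix` are hypothetical. Levels earlier than 9.2.0 return False simply because they are below both listed levels, and this APAR does not say whether they are affected.

```python
LTS_FIX = (9, 2, 0, 4)  # v9.2 LTS maintenance level from this APAR
CD_FIX = (9, 2, 4, 0)   # v9.x CD maintenance level from this APAR


def parse_level(level):
    """Convert a level such as '9.2.4' into a 4-part tuple (9, 2, 4, 0)."""
    parts = [int(p) for p in level.split(".")]
    parts += [0] * (4 - len(parts))
    return tuple(parts[:4])


def includes_fix(level):
    """Return True if the level is at or above the listed fix level.

    Assumption: 9.2.0.x levels are LTS; everything else is treated as CD.
    """
    version = parse_level(level)
    if version[:3] == (9, 2, 0):
        return version >= LTS_FIX
    return version >= CD_FIX


for level in ("9.2.0.3", "9.2.0.4", "9.2.3", "9.2.4"):
    print(level, includes_fix(level))
```

Tuples are compared element by element, so "9.2.0.10" is correctly treated as later than "9.2.0.4", which a plain string comparison would get wrong.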
Temporary fix
Comments
APAR Information
APAR number
IT36495
Reported component name
MQ APPL M2002 V
Reported component ID
5737H4701
Reported release
920
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-04-08
Closed date
2021-09-28
Last modified date
2021-09-28
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
MQ APPL M2002 V
Fixed component ID
5737H4701
Applicable component levels
Document Information
Modified date:
29 September 2021