APAR status
Closed as program error.
Error description
A replicated data queue manager (RDQM) instance fails over more often than expected during periods of network instability.
Local fix
Note:
* Before making the two changes noted below, end all queue managers running in the HA group.
* After making these changes, restart all nodes.
* Once all nodes have been restarted, start the queue manager on its preferred primary node.

Change #1
In the /etc/corosync/corosync.conf file, under the totem stanza, add 'token: 3000' to increase the corosync timeout. For example:

    totem {
        version: 2
        crypto_cipher: none
        crypto_hash: none
        clear_node_high_bit: yes
        token: 3000
        interface {

Change #2
In the /etc/drbd.d/global_common.conf file, under the net stanza, add 'ping-timeout 40;' as shown below:

    net {
        ping-timeout 40;
        max-buffers 40k;

IBM does not advise changing any other values in these configuration files, nor does it recommend the use of values other than those stated here.
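The two settings described above can be verified after editing with a short script. The following is a minimal sketch, not an IBM-supplied tool: the function names and the sample configuration text are hypothetical, and the parser only handles the simple brace-delimited layout shown in this APAR. It does not modify any file. For reference, corosync expresses the token value in milliseconds, and DRBD documents ping-timeout in tenths of a second.

```python
import re
from typing import Optional


def stanza_body(text: str, name: str) -> Optional[str]:
    """Return the text inside the first `name { ... }` stanza, or None."""
    match = re.search(r"^\s*" + re.escape(name) + r"\s*\{", text, re.MULTILINE)
    if match is None:
        return None
    depth = 1
    start = match.end()
    for pos in range(start, len(text)):
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
            if depth == 0:
                return text[start:pos]
    return None  # braces are unbalanced


def corosync_token(conf_text: str) -> Optional[int]:
    """Return the 'token:' value found in the totem stanza, if any."""
    body = stanza_body(conf_text, "totem")
    if body is None:
        return None
    found = re.search(r"^\s*token:\s*(\d+)\s*$", body, re.MULTILINE)
    return int(found.group(1)) if found else None


def drbd_ping_timeout(conf_text: str) -> Optional[int]:
    """Return the 'ping-timeout' value found in the net stanza, if any."""
    body = stanza_body(conf_text, "net")
    if body is None:
        return None
    found = re.search(r"^\s*ping-timeout\s+(\d+)\s*;", body, re.MULTILINE)
    return int(found.group(1)) if found else None


# Illustrative text only (assumption): real files contain further settings.
SAMPLE_COROSYNC = """
totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    clear_node_high_bit: yes
    token: 3000
}
"""

SAMPLE_DRBD = """
common {
    net {
        ping-timeout 40;
        max-buffers 40k;
    }
}
"""

corosync_ok = corosync_token(SAMPLE_COROSYNC) == 3000
drbd_ok = drbd_ping_timeout(SAMPLE_DRBD) == 40
```

To check a real node, read /etc/corosync/corosync.conf and /etc/drbd.d/global_common.conf into strings and pass them to the same two functions. A result of None means the setting was not found in the expected stanza.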
Problem summary
****************************************************************
USERS AFFECTED:
All RDQM users with network delays.

Platforms affected:
Linux on x86-64
****************************************************************
PROBLEM DESCRIPTION:
The default values of internal timeouts used by the corosync and drbd libraries underneath IBM MQ RDQM were found to be overly sensitive to network delays, and required adjustment.
Problem conclusion
The drbd ping-timeout and corosync timeout values have been increased to make RDQM more tolerant of network and VM delays.

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version    Maintenance Level
v9.2 LTS   9.2.0.10
v9.3 LTS   9.3.0.5
v9.x CD    9.3.3

The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
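The maintenance levels listed above can be compared against an installed level with a small helper. This is a rough sketch under stated assumptions: the function name is hypothetical, the version string is assumed to be in dotted V.R.M.F form, Long Term Support levels are assumed to have a modification number of 0, and releases later than 9.3 are assumed (not confirmed by this APAR) to carry the fix.

```python
def includes_it41652_fix(vrmf: str) -> bool:
    """Return True if the given V.R.M.F level is at or above a listed fix level."""
    parts = tuple(int(p) for p in vrmf.strip().split("."))
    parts = parts + (0,) * (4 - len(parts))  # pad short strings such as "9.3.3"
    version, release, modification, fixpack = parts[:4]

    if (version, release) == (9, 2):
        # v9.2 LTS: fix targeted for 9.2.0.10; 9.2 CD levels are not listed.
        return modification == 0 and fixpack >= 10
    if (version, release) == (9, 3):
        if modification == 0:
            # v9.3 LTS: fix targeted for 9.3.0.5.
            return fixpack >= 5
        # v9.x CD: fix targeted for 9.3.3.
        return modification >= 3
    # Assumption: any release after 9.3 already contains the fix.
    return (version, release) > (9, 3)
```

The installed level to pass in can be read from the Version line reported by the dspmqver command. If the helper returns False, the manual changes in the local fix remain the way to raise the timeouts.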
Temporary fix
Comments
APAR Information
APAR number
IT41652
Reported component name
MQ BASE V9.2
Reported component ID
5724H7281
Reported release
920
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-07-29
Closed date
2023-02-22
Last modified date
2023-02-22
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
MQ BASE V9.2
Fixed component ID
5724H7281
Applicable component levels
Business Unit: IBM Software w/o TPS (BU059)
Product: IBM MQ (SSYHRD)
Platform: Platform Independent (PF025)
Version: 920
Line of Business: Automation (LOB45)
Document Information
Modified date:
23 February 2023