A fix is available
APAR status
Closed as program error.
Error description
The supplied dumps and trace show that the processing delays for this queue are caused by contention between application threads that frequently open and close the queue. When an application opens a shared queue, especially when that is the only handle that has the queue open on a particular queue manager, MQ needs to update the shared state of the queue so that other queue managers know whether that queue manager has the queue open.
Every update to the queue state entry in the list uses the version number to prevent conflicting updates being made at the same time. If a write fails due to a version mismatch, the current entry is read and the write is retried. Depending on the type of open being performed, these updates may or may not require the list lock.
In this case, the contention occurs when multiple queue managers attempt to obtain the lock and update the entry at the same time. Making the lock available to one queue manager can take approximately 500 microseconds. While the lock is being obtained, another thread can update the queue state entry. When more than one queue manager is trying to obtain the lock at the same time, each one takes approximately 500 microseconds to obtain the lock, attempts the update, and then releases the lock to the next queue manager. Because each waiter for the lock is queued behind any others, the overall time that a thread waits for the lock can be a multiple of 500 microseconds, which creates a larger window for other version updates to be made.
In the latest trace, the lock contention has reached the point where all 12 queue managers are waiting on this queue lock, so it takes 5-6 milliseconds for a given queue manager to receive the lock. The logic for processing a version number mismatch exacerbates this lock contention.
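The wait-time figures above follow from simple arithmetic. The following Python sketch is a simplified model, not MQ code: it assumes a strict first-in, first-out lock queue and a fixed hand-off cost of 500 microseconds per holder, and the names HANDOFF_US and wait_for_lock_us are hypothetical.

```python
# Simplified model only: assumes strict FIFO ordering and a constant
# hand-off time. Names here are hypothetical, not MQ internals.
HANDOFF_US = 500  # approximate lock hand-off time quoted in the description


def wait_for_lock_us(position, handoff_us=HANDOFF_US):
    """Estimated wait for the waiter at 1-based `position` in the lock queue."""
    if position < 1:
        raise ValueError("position is 1-based")
    return position * handoff_us


# With 12 queue managers queued for the same lock, the waiters near the
# back of the queue wait roughly 5-6 milliseconds.
waits_ms = [wait_for_lock_us(p) / 1000 for p in range(1, 13)]
print(waits_ms[-2:])  # [5.5, 6.0]
```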
When the write fails because another update has happened while waiting for the lock, the queue manager currently releases the lock before looping back for another try (read the current entry, re-obtain the lock, try the write again). Releasing and re-obtaining the lock in this way means that a single MQOPEN may need to obtain the lock many times before it succeeds in updating the queue state.
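The effect of releasing the lock on every version mismatch can be illustrated with a small, deterministic Python simulation. This is a sketch under stated assumptions, not the MQ implementation: StateEntry, CountingLock, interference and both open functions are hypothetical names, competing updates are simulated as landing only while the caller waits for the lock, and retrying while still holding the lock is one plausible way to avoid re-obtaining it, not necessarily the approach taken in the product.

```python
class StateEntry:
    """Hypothetical stand-in for the queue state entry in the list."""

    def __init__(self):
        self.version = 0
        self.open_count = 0

    def read(self):
        return self.version, self.open_count

    def write_if_version(self, expected_version, open_count):
        """Versioned write: fails if the entry changed since it was read."""
        if self.version != expected_version:
            return False
        self.open_count = open_count
        self.version += 1
        return True


class CountingLock:
    """Counts acquisitions; on_wait simulates updates landing during the wait."""

    def __init__(self, on_wait):
        self.acquisitions = 0
        self._on_wait = on_wait

    def acquire(self):
        self._on_wait()
        self.acquisitions += 1

    def release(self):
        pass


def interference(entry, times):
    """Return a hook that bumps the entry version on its first `times` calls."""
    remaining = [times]

    def bump():
        if remaining[0] > 0:
            remaining[0] -= 1
            entry.write_if_version(entry.version, entry.open_count)

    return bump


def open_releasing_on_mismatch(entry, lock):
    """Behaviour described above: release the lock, then loop back and retry."""
    while True:
        version, count = entry.read()
        lock.acquire()  # other updates may land during this wait
        ok = entry.write_if_version(version, count + 1)
        lock.release()
        if ok:
            return lock.acquisitions


def open_retrying_under_lock(entry, lock):
    """Assumed alternative: re-read and retry while still holding the lock."""
    version, count = entry.read()
    lock.acquire()
    try:
        while not entry.write_if_version(version, count + 1):
            version, count = entry.read()
        return lock.acquisitions
    finally:
        lock.release()


old_entry = StateEntry()
old_lock = CountingLock(interference(old_entry, 3))
new_entry = StateEntry()
new_lock = CountingLock(interference(new_entry, 3))
print(open_releasing_on_mismatch(old_entry, old_lock))  # 4
print(open_retrying_under_lock(new_entry, new_lock))    # 1
```

In this model, three competing updates force the first variant to take the lock four times, while the second variant takes it once. Each extra acquisition also lengthens the lock queue for every other waiter.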
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
*                 Release 3 Modification 0 and                 *
*                 Release 4 Modification 0.                    *
****************************************************************
* PROBLEM DESCRIPTION: Slow MQOPEN performance is seen when    *
*                      multiple applications are opening the   *
*                      same Shared Queue on different Queue    *
*                      Managers in the Queue Sharing Group.    *
****************************************************************
When a Shared Queue is opened for input, the MQOPEN may require a lock on the list related to that Shared Queue for part of its processing. If multiple applications are opening the same Shared Queue on different Queue Managers and all require the lock, there will be lock contention while they wait for the lock to become available, which delays the MQOPEN processing. Some lock contention in these scenarios is unavoidable. However, MQ currently does not make best use of the time when one of the MQOPEN processes holds the lock. As a result, the process needs to reobtain the lock after it has released it, which leads to further lock contention.
Problem conclusion
The code has been updated to better utilise the time when the MQOPEN processing holds the lock on the list related to the Shared Queue, so that it should not need to reobtain the lock after releasing it. Over time, this will reduce the number of processes that are waiting to obtain the lock, thus reducing the performance degradation caused by the lock contention.
Temporary fix
Comments
APAR Information
APAR number
PH65507
Reported component name
IBM MQ Z/OS V9
Reported component ID
5655MQ900
Reported release
300
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2025-03-04
Closed date
2025-04-14
Last modified date
2025-05-02
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UO02805 UO02806
Modules/Macros
CSQE197M CSQEOSQ CSQEUCAT
Fix information
Fixed component name
IBM MQ Z/OS V9
Fixed component ID
5655MQ900
Applicable component levels
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
02 May 2025