APAR status
Closed as program error.
Error description
A number of AMQP clients connect to the IBM MQ 9.2.0.1 AMQP service over an AMQP channel called MY.AMQP.CHANNEL and subscribe to a topic. When messages are published on that topic, the AMQP service sends copies of those messages to the AMQP clients for processing.

After the system is upgraded to IBM MQ 9.2.0.4, the AMQP clients continue to receive messages that are published on the topic. However, after a while, the AMQP service stops sending any new messages to one or more of the clients.

Javacores taken at the time the issue occurs show that a ServerWorker thread and a consumer thread within the service have become blocked. The Java callstacks for these threads are shown below:

Thread 1 - ServerWorker thread:
-------------------------------------------
"MY.AMQP.CHANNEL ServerWorker0" J9VMThread:0x00000000011DF700, omrthread_t:0x00007FE11173F698, java/lang/Thread:0x00000000418F66E8, state:R, prio=5
...
Java callstack:
    at com/ibm/mq/jmqi/local/internal/base/Native.MQCTL(Native Method)
    at com/ibm/mq/jmqi/local/LocalMQ.MQCTL(LocalMQ.java:3006(Compiled Code))
    at com/ibm/mq/jmqi/monitoring/JmqiInterceptAdapter.MQCTL(JmqiInterceptAdapter.java:332(Compiled Code))
    at com/ibm/mq/MQXRService/MQConnection.ctl(MQConnection.java:1405(Compiled Code))
    at com/ibm/mq/MQXRService/AbstractMessageListener.suspend(AbstractMessageListener.java:232(Compiled Code))
    at com/ibm/mq/MQXRService/AMQPManagedSubscriptionListener.delete(AMQPManagedSubscriptionListener.java:806(Compiled Code))
    at com/ibm/mq/MQXRService/AMQPServerSessionV10.deleteOutboundMessage(AMQPServerSessionV10.java:1761(Compiled Code))
    at com/ibm/mq/MQXRService/AMQPServerSessionV10.processUpdatedDelivery(AMQPServerSessionV10.java:1651(Compiled Code))
    at com/ibm/mq/MQXRService/AMQPServerSessionV10.processProtonUpdates(AMQPServerSessionV10.java:1517(Compiled Code))
    at com/ibm/mq/MQXRService/AMQPServerSessionV10.handleReceive(AMQPServerSessionV10.java:517(Compiled Code))
    at com/ibm/mq/MQXRService/AMQPServerWireContext.receive(AMQPServerWireContext.java:176(Compiled Code))
    at com/ibm/mq/communications/NonBlockingConnection.receive(NonBlockingConnection.java:386(Compiled Code))
    at com/ibm/mq/communications/NonBlockingWorker.run(NonBlockingWorker.java:398(Compiled Code))
    at java/lang/Thread.run(Thread.java:825)

Thread 2 - Consumer thread:
-------------------------------------------
"Jmqi AsyncConsume Thread. tid: 1937" J9VMThread:0x00000000011F4B00, omrthread_t:0x00007FE0D00BBA78, java/lang/Thread:0x0000000042302B68, state:B, prio=5
...
Blocked on: java/lang/Object@0x00000000423114D8
Owned by: "MY.AMQP.CHANNEL ServerWorker0" (J9VMThread:0x00000000011DF700, java/lang/Thread:0x00000000418F66E8)
...
Java callstack:
    at com/ibm/mq/MQXRService/MQConnection.callback(MQConnection.java:1524(Compiled Code))
    at com/ibm/mq/MQXRService/AMQPManagedSubscriptionListener.suspend(AMQPManagedSubscriptionListener.java:888)
    at com/ibm/mq/MQXRService/AMQPSubscriptionWrapper.bufferMessage(AMQPSubscriptionWrapper.java:255(Compiled Code))
    at com/ibm/mq/MQXRService/AMQPServerSessionV10.sendTransmitQueueMessage(AMQPServerSessionV10.java:2874(Compiled Code))
    at com/ibm/mq/MQXRService/AMQPManagedSubscriptionListener.consumer(AMQPManagedSubscriptionListener.java:690(Compiled Code))
    at com/ibm/mq/jmqi/local/internal/LocalProxyConsumer.jmqiConsumerMethod(LocalProxyConsumer.java:207(Compiled Code))
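The "Blocked on" and "Owned by" entries in the consumer thread's javacore can also be read programmatically from a running JVM through the standard java.lang.management API. The following is a minimal sketch of that idea and is not part of IBM MQ. The class name BlockedThreadReport, the method names and the thread name "consumer-sketch" are assumptions made for illustration. It deliberately blocks one thread on a monitor held by the calling thread and then reports the lock and its owner.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; it is not an IBM MQ class.
public class BlockedThreadReport {

    // Returns a one-line description if the thread is BLOCKED on a monitor, else null.
    public static String describeIfBlocked(Thread thread) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(thread.getId());
        if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
            return null;
        }
        return info.getThreadName() + " is BLOCKED on " + info.getLockName()
                + " owned by " + info.getLockOwnerName();
    }

    // Blocks a stand-in "consumer" thread on a monitor held by the caller, then reports it.
    public static String demo() {
        final Object connectionMonitor = new Object();
        Thread consumer = new Thread(() -> {
            synchronized (connectionMonitor) {
                // Entered only after the calling thread releases the monitor.
            }
        }, "consumer-sketch");

        String report = null;
        try {
            synchronized (connectionMonitor) {
                consumer.start();
                // Poll for up to five seconds until the consumer is seen as BLOCKED.
                long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
                while (report == null && System.nanoTime() < deadline) {
                    report = describeIfBlocked(consumer);
                    if (report == null) {
                        Thread.sleep(10);
                    }
                }
            }
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return report;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

This kind of check only reveals Java monitor contention. In the scenario described in this APAR, the ServerWorker thread is in state R inside a native MQCTL call, so it is the combination of both callstacks that identifies the hang.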
Local fix
The customer did not see the problem in MQ 9.2.0.1, but noticed it after upgrading to MQ 9.2.0.4. After they reverted from MQ 9.2.0.4 to MQ 9.2.0.1, the problem did not reappear.
Problem summary
****************************************************************
USERS AFFECTED:
This issue affects users of the IBM MQ 9.2.0.4 (and later) AMQP service.

Platforms affected:
AIX, Linux on Power, Linux on x86-64, Linux on zSeries, Windows
****************************************************************
PROBLEM DESCRIPTION:
When an AMQP client connects to the MQ AMQP service and subscribes to a topic, the service creates an internal consumer thread for that client. The consumer thread maintains its own connection to the queue manager. When a message is published on the topic that the AMQP client has subscribed to, the queue manager uses this connection to pass a copy of that message to the consumer thread. The consumer thread then forwards the copy of the message to the AMQP client.

Once the AMQP client has received and processed the copy of the message, it sends a disposition frame containing an outcome to the MQ AMQP service. An internal ServerWorker thread within the service receives the frame, and looks at the outcome to see whether the message was processed successfully. If the outcome is set to accepted, the ServerWorker thread uses the connection associated with the consumer thread for that AMQP client to delete the message from the queue manager.

APAR IT35658 updated the MQ AMQP service to ensure that only one thread at a time could issue an MQ API call on a queue manager connection. The fix for that APAR was first included in MQ 9.2.0.4. As a result of this change, it was possible for the following sequence of events to occur:

- An AMQP client connected to the MQ AMQP service, and took out a subscription on a topic. As a result, the service created an internal consumer thread for that client, along with a dedicated connection to the queue manager.
- Some time later, the AMQP client successfully processed a message that it had received from the AMQP service and sent back a disposition frame containing the accepted outcome.
- This frame was received by a ServerWorker thread within the AMQP service.
- At the same time, a new message was published on the topic that the AMQP client had subscribed to. This caused the queue manager to invoke the consumer thread for the client.
- As part of the processing of the disposition frame, the ServerWorker thread determined that it needed to delete the message from the queue manager. In order to do this, it locked the connection associated with the consumer thread for that client, and used it to issue an MQCTL API call, with the MQOP_SUSPEND operation specified, to suspend that thread.
- The queue manager received the MQCTL API call, and detected that the consumer thread was still running. Because of this, it blocked waiting for the consumer thread to finish.
- Meanwhile, the consumer thread detected that the AMQP client had run out of "link credit" (this meant that the service had sent a number of messages to the AMQP client, and was waiting for it to acknowledge receipt of those messages via disposition frames before any more messages could be sent), and so added the message to a backlog so that it could be sent later.
- Next, the consumer thread tried to tell the queue manager to stop passing along messages for this client by issuing an MQCB API call.
- However, its connection to the queue manager was locked by the ServerWorker thread.

As a result, the ServerWorker thread and the consumer thread both became blocked:

- The ServerWorker thread had locked the queue manager connection associated with the consumer thread, issued an MQCTL API call to the queue manager to suspend that thread and was waiting for the queue manager to respond.
- The queue manager had received the MQCTL API call, detected that the consumer thread was currently running and so was waiting for it to complete.
- The consumer thread was blocked trying to lock its queue manager connection, which was currently being used by the ServerWorker thread.

Once the consumer thread had become blocked, the AMQP service was unable to forward any more messages to the AMQP client.
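
The three-way wait described above can be reproduced in miniature with plain Java concurrency primitives. The sketch below is a simplified model under stated assumptions and does not use any IBM MQ API: the class ConnectionLockSketch, the method suspendStalls and the 500 ms timeout are all hypothetical. A ReentrantLock stands in for the per-connection lock, a CountDownLatch stands in for the queue manager waiting for the consumer callback to finish, and the calling thread plays the ServerWorker thread. The timeout exists only so that the demonstration terminates; the real MQCTL call waited indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Simplified, hypothetical model of the hang; no IBM MQ classes are involved.
public class ConnectionLockSketch {

    // Returns true if the simulated "suspend" call timed out waiting for the consumer.
    public static boolean suspendStalls(boolean lockConnection) {
        ReentrantLock connectionLock = new ReentrantLock();   // stands in for the connection lock
        CountDownLatch workerReady = new CountDownLatch(1);   // worker has started its "MQCTL" call
        CountDownLatch callbackDone = new CountDownLatch(1);  // consumer callback has finished

        // Consumer thread: already inside its callback, then needs the connection for "MQCB".
        Thread consumer = new Thread(() -> {
            try {
                workerReady.await();
                if (lockConnection) {
                    connectionLock.lockInterruptibly();       // blocks while the worker holds the lock
                    connectionLock.unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                callbackDone.countDown();                     // callback returns to the "queue manager"
            }
        }, "consumer-sketch");
        consumer.start();

        boolean finished = false;
        if (lockConnection) {
            connectionLock.lock();                            // worker locks the consumer's connection
        }
        try {
            workerReady.countDown();
            // Simulated suspend: waits for the running callback to complete.
            finished = callbackDone.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            if (lockConnection) {
                connectionLock.unlock();                      // releasing lets the demo terminate
            }
        }
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !finished;
    }

    public static void main(String[] args) {
        System.out.println("with connection lock, stalled = " + suspendStalls(true));
        System.out.println("without connection lock, stalled = " + suspendStalls(false));
    }
}
```

Calling suspendStalls(true) models the behaviour introduced in 9.2.0.4, where the worker holds the lock while waiting on a callback that itself needs the lock. Calling suspendStalls(false) is included for comparison and shows the wait completing when no connection lock is taken.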
Problem conclusion
To resolve this issue, the IBM MQ AMQP service has been updated to not perform any locking on the queue manager connections associated with the internal consumer threads for AMQP clients. This means that both ServerWorker and consumer threads can issue MQCTL and MQCB API calls on the same connection at the same time, which prevents the issue reported in this APAR from occurring.

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version      Maintenance Level
v9.2 LTS     9.2.0.7
v9.3 LTS     9.3.0.1
v9.x CD      9.3.1

The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
Temporary fix
Comments
APAR Information
APAR number
IT40577
Reported component name
MQ BASE V9.2
Reported component ID
5724H7281
Reported release
920
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-04-08
Closed date
2022-07-12
Last modified date
2022-09-08
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
MQ BASE V9.2
Fixed component ID
5724H7281
Applicable component levels
Document Information
Modified date:
08 September 2022