APAR status
Closed as program error.
Error description
When using the MQ classes for JMS with the MQ Automatic Client Reconnect functionality, a hang can occur when a multi-instance queue manager is failed over to the standby instance. A Javacore of the JVM process would show a RemoteRcvThread blocked, waiting for an internal "RemoteConnectionSpecification" lock in:

  com.ibm.mq.jmqi.remote.impl.RemoteTCPConnection(RemoteConnection).asyncFailure(RemoteTls, Throwable, boolean)

Example output from a Javacore:

  RcvThread: com.ibm.mq.jmqi.remote.impl.RemoteTCPConnection@288628329[qmid=...]
  Blocked on: com/ibm/mq/jmqi/remote/impl/RemoteConnectionSpecification$ConnectionsLock@0x000000000239BE18
  Owned by: "JMSCCThreadPoolWorker-2" (J9VMThread:0x000000002255A100, java/lang/Thread:0x000000002204C030)
  Java callstack:
    at com/ibm/mq/jmqi/remote/impl/RemoteConnection.asyncFailureNotify
    at com/ibm/mq/jmqi/remote/impl/RemoteConnection.notifyReconnect
    at com/ibm/mq/jmqi/remote/impl/RemoteRcvThread.run
    at com/ibm/msg/client/commonservices/workqueue/WorkQueueItem.runTask
    at com/ibm/msg/client/commonservices/workqueue/SimpleWorkQueueItem.runItem
    at com/ibm/msg/client/commonservices/workqueue/WorkQueueItem.run
    at com/ibm/msg/client/commonservices/workqueue/WorkQueueManager.runWorkQueueItem

A RemoteReconnectThread thread (started to reconnect JMS Connections and JMS Sessions as part of the MQ Automatic Client Reconnect function) would typically be seen in a conditional wait state, but it would wake up every five seconds to check the TCP/IP connection it was using is still marked as connected.
Example thread state from a Javacore when the problem arises:

  "JMSCCThreadPoolWorker-2" J9VMThread:0x000000002255A100, j9thread_t:0x00007FA72000B1E0, java/lang/Thread:0x000000002204C030, state:CW, prio=5
  (java/lang/Thread getId:0x12, isDaemon:true)
  Waiting on: com/ibm/mq/jmqi/remote/impl/RemoteSession$AsyncTshLock@0x000000002204D788 Owned by: <unowned>
  Java callstack:
    at java/lang/Object.wait(Native Method)
    at java/lang/Object.wait
    at com/ibm/mq/jmqi/remote/impl/RemoteSession.receiveAsyncTsh
      (entered lock: com/ibm/mq/jmqi/remote/impl/RemoteSession$AsyncTshLock@0x000000002204D788, entry count: 1)
    at com/ibm/mq/jmqi/remote/impl/RemoteSession.receiveTSH
    at com/ibm/mq/jmqi/remote/impl/RemoteSession.startConversation
    at com/ibm/mq/jmqi/remote/impl/RemoteConnectionSpecification.sessionFromEligible
    at com/ibm/mq/jmqi/remote/impl/RemoteConnectionSpecification.getSessionFromEligibleConnection
      (entered lock: com/ibm/mq/jmqi/remote/impl/RemoteConnectionSpecification$ConnectionsLock@0x000000000239BE18, entry count: 1)
    at com/ibm/mq/jmqi/remote/impl/RemoteConnectionSpecification.getSession
    at com/ibm/mq/jmqi/remote/impl/RemoteConnectionPool.getSession
    at com/ibm/mq/jmqi/remote/api/RemoteFAP.jmqiConnect
    at com/ibm/mq/jmqi/remote/impl/RemoteReconnectThread.reconnect
    at com/ibm/mq/jmqi/remote/impl/RemoteReconnectThread.run

The classes for JMS application would hang and would not reconnect to the standby queue manager instance.
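The pattern in the Javacore extracts above (one thread blocked on a monitor whose owner is itself waiting) can also be looked for in a running JVM. The following is a minimal sketch using the standard java.lang.management API. The class BlockedThreadReport, its methods and the thread names "demo-owner" and "demo-blocked" are hypothetical names chosen for illustration; they are not part of the MQ classes, and the sketch does not replace a Javacore for diagnosis.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/** Hypothetical helper: reports threads that are BLOCKED on a monitor, with the owner's state. */
final class BlockedThreadReport {

    /** Returns one line per BLOCKED thread, naming the lock, its owner and the owner's state. */
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue; // thread already ended, or it is not waiting to enter a monitor
            }
            long ownerId = info.getLockOwnerId();
            ThreadInfo owner = ownerId < 0 ? null : mx.getThreadInfo(ownerId);
            String ownerState = owner == null ? "unknown" : owner.getThreadState().name();
            lines.add("\"" + info.getThreadName() + "\" blocked on " + info.getLockName()
                    + ", owned by \"" + info.getLockOwnerName() + "\" (state " + ownerState + ")");
        }
        return lines;
    }

    /** Demo only: one thread holds a monitor while waiting, a second thread blocks on it. */
    static List<String> demo() {
        Object lock = new Object();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread owner = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try {
                    release.await(); // waits while still owning the monitor
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "demo-owner");
        Thread blocked = new Thread(() -> {
            synchronized (lock) { /* enter and leave */ }
        }, "demo-blocked");
        owner.setDaemon(true);
        blocked.setDaemon(true);
        List<String> report = new ArrayList<>();
        try {
            owner.start();
            held.await();
            blocked.start();
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (blocked.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            report = blockedThreads();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown(); // let both demo threads finish
        }
        return report;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In an affected application, a call to blockedThreads() from a monitoring thread would be expected to list the receive thread as blocked, with an owner that is in a waiting state rather than running.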
Local fix
Set the server-connection channel attribute "SHARECNV" to the value 1. This ensures that only one conversation (hConn) to the queue manager can occur over a single TCP/IP socket. As such, new connection requests and those being reconnected by the RemoteReconnectThread each have a dedicated TCP/IP socket, and no conversations are multiplexed over a single socket.
Problem summary
****************************************************************
USERS AFFECTED:
This issue affects users of:

- The WebSphere MQ classes for JMS v7.1.0.7
- The WebSphere MQ classes for Java v7.1.0.7
- The WebSphere MQ Resource Adapter v7.1.0.7
- The WebSphere MQ classes for JMS v7.5.0.3, v7.5.0.4, v7.5.0.5 and v7.5.0.6
- The WebSphere MQ classes for Java v7.5.0.3, v7.5.0.4, v7.5.0.5 and v7.5.0.6
- The WebSphere MQ Resource Adapter v7.5.0.3, v7.5.0.4, v7.5.0.5 and v7.5.0.6

after APAR IC93973

http://www-01.ibm.com/support/docview.wss?uid=swg1IC93973

and all versions of:

- The IBM MQ classes for JMS v8
- The IBM MQ classes for Java v8
- The IBM MQ Resource Adapter v8
- The IBM MQ classes for JMS v9
- The IBM MQ classes for Java v9
- The IBM MQ Resource Adapter v9

Platforms affected:
MultiPlatform
****************************************************************
PROBLEM DESCRIPTION:
After failing over a multi-instance queue manager from an active to a standby instance, a hang could have occurred within the classes for JMS automatic client reconnection feature. The classes for JMS application would not reconnect and the JVM required terminating and restarting in order to recover.

The issue occurred when the internal "RemoteReconnectThread" (responsible for reconnecting JMS Connections and JMS Sessions as part of automatic client reconnection) attempted to establish a new conversation on an existing TCP/IP connection (also known as a channel instance) that was in the process of being closed. In this scenario, there was a race condition between this RemoteReconnectThread and an internal "RemoteRcvThread" (responsible for reading the data from the TCP/IP connection) for this connection whereby the RemoteRcvThread would not notify the RemoteReconnectThread that the TCP/IP connection was no longer valid. As such, the RemoteReconnectThread would wait for a response from the queue manager to its MQCONNX request that would not be received.
Furthermore, the RemoteReconnectThread held onto an internal "connections lock". The RemoteRcvThread required that lock to complete the closure of the failed connection, and so blocked indefinitely.

The same hang issue could also occur between a RemoteRcvThread and a standard application thread creating a new conversation on an existing connection. For classes for JMS applications, this occurred when recreating a JMS Connection, JMS Session or JMS Context object. For classes for Java applications, this occurred when instantiating a new MQQueueManager object.
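The lock dependency described above can be reproduced in a small model. This is a sketch only, under stated assumptions: the class HangModel and all variable names are hypothetical, a ReentrantLock stands in for the internal "connections lock" so that the receive-thread role can give up after a timeout, and a latch that is never released stands in for the MQCONNX response that never arrives. It does not use or reflect the actual MQ internals.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of the hang; every name here is hypothetical, not an MQ internal. */
final class HangModel {

    /** Runs the scenario; returns true if the receive role could not get the lock within waitMillis. */
    static boolean receiveThreadStarved(long waitMillis) {
        ReentrantLock connectionsLock = new ReentrantLock(); // stands in for the "connections lock"
        CountDownLatch response = new CountDownLatch(1);      // reply that is never delivered
        CountDownLatch lockHeld = new CountDownLatch(1);

        Thread reconnect = new Thread(() -> {
            connectionsLock.lock();
            try {
                lockHeld.countDown();
                response.await(); // waits for the reply while still holding the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // demo only: lets the model terminate
            } finally {
                connectionsLock.unlock();
            }
        }, "reconnect");
        reconnect.setDaemon(true);
        reconnect.start();

        boolean starved;
        try {
            lockHeld.await();
            // Receive-thread role: needs the lock to finish closing the failed connection.
            boolean acquired = connectionsLock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                connectionsLock.unlock();
            }
            starved = !acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            starved = false;
        }
        reconnect.interrupt(); // break the model's hang so the demo can end
        return starved;
    }

    public static void main(String[] args) {
        System.out.println("receive thread starved: " + receiveThreadStarved(500));
    }
}
```

The model uses a timed tryLock and an interrupt only so that it terminates. In the scenario described above, the blocked thread entered a plain monitor and had no timeout, which is why the JVM had to be restarted.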
Problem conclusion
The MQ classes for Java and MQ classes for JMS have been updated so that the RemoteRcvThread now notifies the RemoteReconnectThread or application thread that a TCP/IP connection is no longer valid if that thread attempts to allocate a new conversation on the connection at the same time as a failure is detected on it.

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version    Maintenance Level
v7.1       7.1.0.8
v7.5       7.5.0.8
v8.0       8.0.0.6
v9.0 CD    9.0.1
v9.0 LTS   9.0.0.1

The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
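The notification described in the problem conclusion can be illustrated with a small model. This is a sketch under assumptions, not the code delivered in the PTFs: the class FailureNotifyModel and its members are hypothetical names, plain Java monitors stand in for the internal locks, and the only behavior taken from this APAR is that the receive thread tells a waiting thread the connection is no longer valid so that the waiter stops waiting and releases the connections lock.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;

/** Toy model of the corrected hand-off; all names are hypothetical, not MQ internals. */
final class FailureNotifyModel {
    private final Object connectionsLock = new Object(); // stands in for the "connections lock"
    private final Object replyLock = new Object();       // guards the two flags below
    private final CountDownLatch waiting = new CountDownLatch(1);
    private boolean replyArrived;                         // never set in this demo
    private boolean connectionFailed;

    /** Reconnect or application role: start a conversation while holding the connections lock. */
    void startConversation() throws IOException, InterruptedException {
        synchronized (connectionsLock) {
            synchronized (replyLock) {
                waiting.countDown();
                while (!replyArrived && !connectionFailed) {
                    replyLock.wait(); // releases replyLock only; connectionsLock stays held
                }
                if (connectionFailed) {
                    throw new IOException("connection failed while starting a conversation");
                }
            }
        }
    }

    /** Receive role: report the failure first, then finish the close under the connections lock. */
    void onConnectionFailure() {
        synchronized (replyLock) {
            connectionFailed = true;
            replyLock.notifyAll(); // wake any waiter so that it can release connectionsLock
        }
        synchronized (connectionsLock) {
            // close bookkeeping for the failed connection would go here
        }
    }

    /** Runs both roles; returns true if the waiter saw the failure and both threads ended. */
    static boolean demo() {
        FailureNotifyModel m = new FailureNotifyModel();
        boolean[] sawFailure = {false};
        Thread reconnect = new Thread(() -> {
            try {
                m.startConversation();
            } catch (IOException e) {
                sawFailure[0] = true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reconnect");
        Thread receive = new Thread(m::onConnectionFailure, "receive");
        reconnect.setDaemon(true);
        receive.setDaemon(true);
        try {
            reconnect.start();
            m.waiting.await(); // the reconnect role now holds the connections lock
            receive.start();
            reconnect.join(5000);
            receive.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return sawFailure[0] && !reconnect.isAlive() && !receive.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("waiter released with error: " + demo());
    }
}
```

In this model the receive role never holds both locks at once: it sets the failure flag and notifies under the reply lock, releases it, and only then takes the connections lock. That ordering is a design choice of the sketch to avoid a lock-order inversion with the waiter.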
Temporary fix
Comments
APAR Information
APAR number
IT15408
Reported component name
WMQ BASE MULTIP
Reported component ID
5724H7241
Reported release
750
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2016-05-23
Closed date
2016-07-26
Last modified date
2017-06-01
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WMQ BASE MULTIP
Fixed component ID
5724H7241
Applicable component levels
Document Information
Modified date:
31 March 2023