Fixes are available
APAR status
Closed as program error.
Error description
Threads may intermittently hang in a wait state when the channel heartbeat interval (HBINT) is set to 0. A channel HBINT of 0 results in the Java Socket objects having a timeout value of 0, which is interpreted as an infinite timeout. The normal communication error methods are not invoked, leaving the threads in a wait state. A core dump of the hung JVM shows a thread blocked at java.net.SocketInputStream.socketRead0(Native Method).
Local fix
There are two options:
1. Set the heartbeat interval (HBINT property) on the channel to a positive, non-zero value. This will ensure that a timeout is set on all socket read operations performed on connections using this channel. Option 1 will cause the client to send heartbeat messages to the queue manager and retry the request if it does not receive a response within the timespan defined by HBINT. If this behavior is not desirable for the customer, then option 2 should be considered.
2. Set the environment variable "com.ibm.mq.tuning.socketGrainTimeout". This variable allows the customer to specify, in seconds, a timeout for socket read operations to override the value derived from HBINT. Using this property will not enable client heartbeating, but the socket timeouts will be applied system-wide, rather than to a specific channel.
Problem summary
****************************************************************
USERS AFFECTED:
This issue affects users of:
- The IBM WebSphere MQ v7 classes for Java
- The IBM WebSphere MQ v7 classes for JMS
who wish to use a TCP connection to a queue manager using a server-connection (SVRCONN) channel with a heartbeat interval (HBINT) of 0.
Platforms affected:
All Distributed (iSeries, all Unix and Windows) +Java
****************************************************************
PROBLEM SUMMARY:
At v7, the JMQI layer (the classes used by both the WebSphere MQ classes for Java and the classes for JMS to communicate with the queue manager) uses the channel's heartbeat interval property to derive the socket timeout used when waiting for data from the queue manager. If the heartbeat interval is 0, the socket timeout is set to 0. A Java socket timeout of 0 results in an indefinite socket read.
It was observed that in some cases, when the socket timeout was set to 0, the JVM failed to respond to errors on the socket connection and did not return control to the WMQ client classes. This caused applications to hang indefinitely.
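The difference between a zero and a non-zero socket timeout can be illustrated with plain java.net sockets. The following is a minimal sketch, not WebSphere MQ code: the class SoTimeoutDemo and the method readTimesOut are hypothetical names, and the silent loopback peer stands in for a queue manager that never sends data. With a positive timeout the blocked read returns control by throwing SocketTimeoutException; with a timeout of 0 the same read would wait indefinitely, so the sketch refuses that value.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutDemo {
    /**
     * Opens a loopback connection whose peer never writes, then reads with the
     * given SO_TIMEOUT. Returns true if the read gave up with a timeout.
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        if (timeoutMillis <= 0) {
            // setSoTimeout(0) means "wait forever"; the read below would never return.
            throw new IllegalArgumentException("timeout must be positive for this demo");
        }
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            client.setSoTimeout(timeoutMillis);
            try {
                client.getInputStream().read(); // blocks: the peer sends nothing
                return false;
            } catch (SocketTimeoutException e) {
                return true; // control came back to the caller after the timeout
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```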
Problem conclusion
JMQI code was modified to split infinite-wait socket receive operations into finite chunks. After each chunk elapses, the JVM times out the socket and returns control to the client classes. The client classes then restart the socket receive, detecting any socket errors in the process.
In addition, the code change associated with this APAR also corrects the implementation of the com.ibm.mq.tuning.socketGrainTimeout property. This property allows the user to override the HBINT value for all queue manager connections within the JVM. Previously, an integer value set on this property would be added to the existing HBINT value, rather than overwriting it as was intended.
---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:
                   v7.0
Platform           Fix Pack 7.0.1.6
--------           --------------------
Windows            U200328
AIX                U840698
HP-UX (PA-RISC)    U841555
HP-UX (Itanium)    U841560
Solaris (SPARC)    U841556
Solaris (x86-64)   U841562
iSeries            tbc_p700_0_1_6
Linux (x86)        U841557
Linux (x86-64)     U841561
Linux (zSeries)    U841558
Linux (Power)      U841559
The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
If the maintenance level is not yet available, information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
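The chunked-receive approach described above can be sketched with standard java.net calls. This is an illustration of the general technique only, not the actual JMQI implementation: the class ChunkedReceive, the method receive, and the parameters grainMillis and maxGrains are hypothetical names chosen for this example. Each read waits for at most one grain; when the grain elapses, SocketTimeoutException returns control to the caller, which restarts the read. A maxGrains of 0 or less models an unbounded overall wait that is still made of finite reads.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ChunkedReceive {
    /**
     * Reads into buffer, waiting in slices of grainMillis rather than one
     * infinite read. Returns the byte count, or -1 if the peer closed the
     * connection. Any other IOException from the socket propagates to the caller.
     */
    static int receive(Socket socket, byte[] buffer, int grainMillis, int maxGrains)
            throws IOException {
        if (grainMillis <= 0) {
            throw new IllegalArgumentException("grainMillis must be positive");
        }
        socket.setSoTimeout(grainMillis); // never 0, so each read is finite
        InputStream in = socket.getInputStream();
        int grainsUsed = 0;
        while (maxGrains <= 0 || grainsUsed < maxGrains) {
            try {
                return in.read(buffer);
            } catch (SocketTimeoutException e) {
                // One grain elapsed with no data. Control is back in the caller,
                // so the read is restarted on the next loop iteration.
                grainsUsed++;
            }
        }
        throw new SocketTimeoutException("no data after " + maxGrains + " grains");
    }
}
```

A caller that wants the old "wait until data arrives" behavior would pass a maxGrains of 0, while a caller with an overall deadline would pass a positive count and handle the final SocketTimeoutException.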
Temporary fix
Comments
APAR Information
APAR number
IC74652
Reported component name
WMQ WINDOWS V7
Reported component ID
5724H7220
Reported release
700
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2011-02-23
Closed date
2011-03-25
Last modified date
2011-03-25
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WMQ WINDOWS V7
Fixed component ID
5724H7220
Applicable component levels
R700 PSY
UP
Document Information
Modified date:
31 March 2023