APAR status
Closed as program error.
Error description
A leak of objects of type com/ibm/ws/tcp/channel/impl/ZAioTCPConnLink was observed in a zWAS control process. The scenario involved with the leak is thought to be:

1-CR reads initial data but doesn't have enough to constitute a chunk (can't queue to WLM yet)
2-CR issues another read for more data, which goes async
3-The async read returns with an error

Externally, you might expect to see:

(a) servant threads hung in

BBOO0221W: WSVR0605W: Thread "WebSphere WLM Dispatch Thread t=008 " (00000084) has been active for 615,645 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung.
at com.ibm.ws390.xmem.proxy.XMemProxySRCppUtilities.ntvXMemProxySrRead(Native Method)

(b) FFDCs cut for

com.ibm.wsspi.http.channel.exception.HttpErrorException: Request timed out while waiting for servant region dispatch to complete

java.io.IOException: A XMemProxyCRInboundConnLink cannot be found for ConnectionHandle=ConnectionHandle[0x531e29fdc0/0x4bc/IN:CR/remote]

java.io.IOException: Error occurred during communications processing. EDC5140I Broken pipe. (errno2=0xC25F0041): RV=-1, RC=140, RS=198662201
at com.ibm.ws.tcp.channel.impl.ZAioTCPChannelCPPUtilities.write1Buffer(Native Method)

It is possible that such a scenario could also be associated with a socket leak, due to references to Socket objects from the leaked ZAioTCPConnLink objects.

NB. In the reported scenario, the WAS application was used for uploads of reasonably large (circa 10 MB) PDF files. It's possible that applications like this might expose this issue, especially if there have been changes in the performance of the network(s) from the client to the server.
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All users of IBM WebSphere Application       *
*                 Server V9.0                                  *
****************************************************************
* PROBLEM DESCRIPTION: WebSphere Application Server for z/OS   *
*                      leak of ConnLink objects in a           *
*                      XMemProxyCRInboundConnLink channel      *
*                      chain.                                  *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
There exists a flow in which an asynchronous read error can cause a leak of an XMem connection in the Controller region (CR).

A typical Channel Framework channel chain for an XMem connection is TCP <> HTTP <> XMem CR. The most device-side channel, TCP, is represented by the ZAioTCPConnLink object.

The following is the scenario for the leak flow:

1-CR reads data but does not have enough to constitute a request (it cannot be queued to WLM yet).
2-CR issues another read for more data, which goes async.
3-The async read returns with an error (for example, a timeout).

The processing of the async read error, in the XMemProxyCRInboundConnLink class, sets an error flag:

    private boolean pendingReadError = true

The processing of the async read error expects the Servant code processing the request to react to "pendingReadError". Because no request has yet been queued to WLM for this connection, no Servant code ever runs for it, and the connection is leaked in this state.

Another indicator that a leaked XMemProxyCRInboundConnLink object is from this window is that the boolean indicating that a request has been queued to WLM is "false":

    private boolean isFirstChunkSent = false
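The two flag values above can be combined into a simple check when reviewing captured link state, for example from a heap dump. The following is a minimal illustrative model, not the WebSphere source: only the field names pendingReadError and isFirstChunkSent come from this APAR, and the LinkState class, its closed field, and the helper methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; this is NOT the WebSphere implementation.
// Only pendingReadError and isFirstChunkSent are names taken from the APAR.
final class LeakWindowSketch {

    /** Hypothetical stand-in for the state held by a CR inbound link. */
    static final class LinkState {
        boolean pendingReadError = false; // set when an async read fails
        boolean isFirstChunkSent = false; // true once a request is queued to WLM
        boolean closed = false;           // hypothetical: resources released

        /** Pre-fix behavior as described: record the error and wait for the servant. */
        void onAsyncReadErrorPreFix() {
            pendingReadError = true;
        }
    }

    /** True when the state matches the leak window described in the APAR. */
    static boolean inLeakWindow(LinkState s) {
        return s.pendingReadError && !s.isFirstChunkSent && !s.closed;
    }

    /** Returns the links whose state matches the leak window. */
    static List<LinkState> findLeakCandidates(List<LinkState> links) {
        List<LinkState> candidates = new ArrayList<>();
        for (LinkState s : links) {
            if (inLeakWindow(s)) {
                candidates.add(s);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        LinkState healthy = new LinkState();

        LinkState leaked = new LinkState();
        leaked.onAsyncReadErrorPreFix(); // error before anything was queued to WLM

        LinkState dispatched = new LinkState();
        dispatched.isFirstChunkSent = true;  // request already queued to WLM
        dispatched.onAsyncReadErrorPreFix(); // servant is expected to react

        List<LinkState> all = List.of(healthy, leaked, dispatched);
        System.out.println("leak candidates: " + findLeakCandidates(all).size());
    }
}
```

A link that has already queued its first chunk is excluded because, per the description above, Servant code is expected to react to the pending error in that case.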
Problem conclusion
Code has been modified in the async read error method in XMemProxyCRInboundConnLink to properly handle the connection if a read error occurs prior to queuing the request to WLM. It will now initiate a close on the connection, which will release the related resources.

The fix for this APAR is targeted for inclusion in fix pack 9.0.5.7. For more information, see 'Recommended Updates for WebSphere Application Server':
https://www.ibm.com/support/pages/node/715553
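The behavior change can be pictured with a small sketch. This is an assumption-laden model of the described fix, not the shipped code: the LinkState class, close(), queueFirstChunkToWlm(), and onAsyncReadError() are hypothetical names, and whether the real fix also sets pendingReadError on the early-error path is not stated in this APAR.

```java
// Illustrative model only; this is NOT the WebSphere implementation.
// Only pendingReadError and isFirstChunkSent are names taken from the APAR.
final class ReadErrorFixSketch {

    /** Hypothetical stand-in for the state held by a CR inbound link. */
    static final class LinkState {
        boolean pendingReadError = false; // error left for the servant to handle
        boolean isFirstChunkSent = false; // true once a request is queued to WLM
        boolean closed = false;           // hypothetical: resources released

        /** Hypothetical: marks that the first chunk was queued to WLM. */
        void queueFirstChunkToWlm() {
            isFirstChunkSent = true;
        }

        /** Hypothetical: releases the connection and its related resources. */
        void close() {
            closed = true;
        }

        /**
         * Post-fix handling as described: if nothing has been queued to WLM,
         * no servant will ever react, so the CR closes the connection itself.
         */
        void onAsyncReadError() {
            if (!isFirstChunkSent) {
                close();
            } else {
                pendingReadError = true; // servant-side processing reacts to this
            }
        }
    }

    public static void main(String[] args) {
        LinkState early = new LinkState();
        early.onAsyncReadError(); // error before any request was queued
        System.out.println("early error closed: " + early.closed);

        LinkState late = new LinkState();
        late.queueFirstChunkToWlm();
        late.onAsyncReadError(); // error after the request was queued
        System.out.println("late error pending: " + late.pendingReadError);
    }
}
```

The branch on isFirstChunkSent reflects the window identified in the problem summary: only the case with no request queued to WLM lacks a servant-side path that would otherwise clean up the connection.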
Temporary fix
Comments
APAR Information
APAR number
PH32909
Reported component name
WEBSPHERE FOR Z
Reported component ID
5655I3500
Reported release
900
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-12-22
Closed date
2021-03-02
Last modified date
2021-03-02
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WEBSPHERE FOR Z
Fixed component ID
5655I3500
Applicable component levels
Document Information
Modified date:
04 March 2021