Fixes are available
APAR status
Closed as program error.
Error description
The customer could not stop the server with the STOP command and had to cancel it. A dump taken while termination was hung showed deadlocked threads in the CR:
- com/ibm/ws390/xmem/proxy/channel/XMemProxyCRInboundConnLink.sendFirstChunkToSr() source: XMemProxyCRInboundConnLink.java:1163
- com/ibm/ws/tcp/channel/impl/ZAioTCPConnLink.destroyCommon(Exception) source: ZAioTCPConnLink.java:1072
The cause of the deadlock is a small timing window. On some paths through the inbound request path we get a "readlock" which is held across the queuing of the request to WLM for dispatch within a Servant.
Just after queuing to WLM we enter a synchronized block on the XMemProxyCRInboundConnLink object to determine whether there is more data to read for the new request. Before this thread can enter the block, the queued request has already been dispatched in the Servant and its final response is on another Controller thread, to be driven back to the client.
This second CR thread is in XMemProxyCRInboundConnLink.sendFinalResponse, where it obtains the lock on the XMemProxyCRInboundConnLink and sends the response. It then determines the disposition of the current connection. If it is a healthy persistent connection, it just cleans up the current request and issues another read for the next request. If it is not a healthy connection, it attempts to clean up the connection. In the deadlock scenario it attempted to clean up the connection. It made it into the ZAioTCPConnLink.destroyCommon method under the close() processing, where it attempts to get the "readlock" then the "writelock" to ensure that no outstanding I/O exists for this connection. This is where it became blocked. The inbound thread is holding the "readlock", and when it received control after queuing the inbound request to WLM it attempted to get the lock on the XMemProxyCRInboundConnLink object, but the sendFinalResponse thread had obtained it first. Now both threads are deadlocked.
This only applies to V9. This problem is a result of changes unique to V9.
Local fix
Cancel the server.
Problem summary
****************************************************************
* USERS AFFECTED: All users of IBM WebSphere Application       *
*                 Server V9.0                                  *
****************************************************************
* PROBLEM DESCRIPTION: WebSphere Application Server for z/OS   *
*                      may hang during stop processing.        *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
The server may not stop after receiving a stop command. The reason for the hang is that an HTTP request can still be in the controller native request registry. The ORB_Request stateflag information for this HTTP request shows that it has completed dispatch in a Servant and has started, but not completed, controller response processing. Two controller ACRW threads are deadlocked for this HTTP request.
The cause of the deadlock is a small timing window. On some paths through the inbound request path we get a "readlock" which is held across the queuing of the request to WLM for dispatch within a Servant. In XMemProxyCRInboundConnLink.sendFirstChunkToSr, just after queuing to WLM, we enter a synchronized block on the XMemProxyCRInboundConnLink object to determine whether there is more data to read for the new request. Before this thread can enter the block, the queued request has already been dispatched in the Servant and its final response is on another Controller thread, to be driven back to the client.
This second CR thread is in XMemProxyCRInboundConnLink.sendFinalResponse, where it obtains the lock on the XMemProxyCRInboundConnLink and sends the response. It then determines the disposition of the current connection. If it is a healthy persistent connection, it just cleans up the current request and issues another read for the next request. If it is not a healthy connection, it attempts to clean up the connection.
In the deadlock scenario it attempted to clean up the connection. It made it into the ZAioTCPConnLink.destroyCommon method under the close() processing, where it attempts to get the "readlock" then the "writelock" to ensure that no outstanding I/O exists for this connection. This is where it became blocked. The inbound thread is holding the "readlock", and when it received control after queuing the inbound request to WLM it attempted to get the lock on the XMemProxyCRInboundConnLink object, but the sendFinalResponse thread had obtained it first. Now both threads are deadlocked.
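The lock-order inversion described above can be illustrated with a small, self-contained Java sketch. It is not the WebSphere code: the class and method names are hypothetical, a java.util.concurrent ReentrantReadWriteLock stands in for the "readlock"/"writelock" pair, and a ReentrantLock stands in for the monitor of the XMemProxyCRInboundConnLink object so that the attempt can time out instead of hanging. The closing thread is simplified to request only the write lock. Latches force the timing window deterministically.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; none of these names exist in the product.
class LockOrderInversionSketch {

    // Holds 'first', then tries 'second' with a timeout.
    // Returns true only if 'second' could be acquired.
    static boolean holdThenTry(Lock first, Lock second,
                               CountDownLatch bothHoldFirst,
                               CountDownLatch bothTried) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();   // wait until the other thread holds its first lock too
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();       // keep 'first' held until both attempts are over
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    // Returns true when neither thread could get its second lock, i.e. the
    // situation that is a deadlock when the acquisitions are untimed.
    static boolean bothThreadsBlocked() {
        ReentrantReadWriteLock ioLock = new ReentrantReadWriteLock(); // stand-in for "readlock"/"writelock"
        Lock connLinkLock = new ReentrantLock();                      // stand-in for the conn link monitor
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] acquired = new boolean[2];

        // Inbound path: holds the read lock, then wants the conn link lock.
        Thread inbound = new Thread(() ->
            acquired[0] = holdThenTry(ioLock.readLock(), connLinkLock, bothHoldFirst, bothTried));
        // Final-response path: holds the conn link lock, then wants the write lock to close.
        Thread response = new Thread(() ->
            acquired[1] = holdThenTry(connLinkLock, ioLock.writeLock(), bothHoldFirst, bothTried));

        inbound.start();
        response.start();
        try {
            inbound.join();
            response.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !acquired[0] && !acquired[1];
    }

    public static void main(String[] args) {
        System.out.println("lock-order inversion observed: " + bothThreadsBlocked());
    }
}
```

With untimed acquisition (a synchronized block and a blocking lock() call), the same interleaving leaves both threads waiting forever, which matches the two stack frames reported in the dump.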
Problem conclusion
Code has been modified in the XMemProxyCRInboundConnLink.sendFirstChunkToSr method to obtain the XMemProxyCRInboundConnLink lock prior to queuing the request to WLM. The fix for this APAR is currently targeted for inclusion in fix pack 9.0.5.1. Please refer to the Recommended Updates page for delivery information: http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
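A minimal sketch of why taking the connection link lock before queuing closes the timing window, under the same assumptions as any illustration of this kind: the class name, the BlockingQueue standing in for the WLM queue, and the ReentrantReadWriteLock standing in for the "readlock"/"writelock" pair are all hypothetical and are not taken from the product source. Because the response thread can only exist once the request has been queued, and the inbound thread already owns the monitor at that point, the response thread cannot get ahead of it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; none of these names exist in the product.
class QueueUnderLockSketch {

    // Returns true when both threads finish, i.e. no deadlock occurred.
    static boolean run() {
        final Object connLink = new Object();                                // stand-in for the conn link object
        final ReentrantReadWriteLock ioLock = new ReentrantReadWriteLock();  // stand-in for "readlock"/"writelock"
        final BlockingQueue<String> wlmQueue = new LinkedBlockingQueue<>();  // stand-in for the WLM queue

        Thread inbound = new Thread(() -> {
            ioLock.readLock().lock();              // read lock held across the queuing
            try {
                synchronized (connLink) {          // monitor taken BEFORE queuing
                    wlmQueue.add("request-1");     // the request becomes visible to the servant side
                    // ... check whether more data is available to read ...
                }
            } finally {
                ioLock.readLock().unlock();
            }
        });

        Thread response = new Thread(() -> {
            try {
                wlmQueue.take();                   // response work starts only after queuing
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (connLink) {              // waits until the inbound thread leaves its block
                ioLock.writeLock().lock();         // granted once the reader releases the read lock
                try {
                    // ... close the connection ...
                } finally {
                    ioLock.writeLock().unlock();
                }
            }
        });

        inbound.setDaemon(true);
        response.setDaemon(true);
        response.start();
        inbound.start();
        try {
            inbound.join(2000);
            response.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !inbound.isAlive() && !response.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + run());
    }
}
```

In this arrangement the inbound thread never waits for the monitor while holding the read lock against a competing response thread, so the response thread's later wait for the write lock always ends when the inbound thread releases the read lock.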
Temporary fix
Comments
APAR Information
APAR number
PH13273
Reported component name
WEBSPHERE FOR Z
Reported component ID
5655I3500
Reported release
900
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2019-06-12
Closed date
2019-06-19
Last modified date
2019-06-19
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WEBSPHERE FOR Z
Fixed component ID
5655I3500
Applicable component levels
R900 PSY
UP
Document Information
Modified date:
17 October 2021