APAR status
Closed as program error.
Error description
A client sending an Object Request Broker (ORB) request over a newly-created Secure Sockets Layer (SSL) connection to a WebSphere Application Server fails with a COMM_FAILURE exception due to a server-side exception on the connection. Particularly long-running requests (10+ seconds) are more susceptible to the failure.

Client-side ORB trace for the failed request will log the following shortly after the request has been sent:

  [2/3/20 18:12:03:169 EET] 0000003a ORBRas 3 readMoreData:1810
  Reached the end of the stream -- ConnId:1234
  RT=5:P=42435:O=0:WSSSLTransportConnection[addr=1.2.3.4,port=9110,local=35082]
  com.ibm.rmi.iiop.Connection
  [2/3/20 18:12:03:206 EET] 00000001 ORBRas 3
  com.ibm.rmi.iiop.Connection getCallStream:2557 P=42435:O=0:CT
  The following exception was logged
  org.omg.CORBA.COMM_FAILURE: purge_calls:2177 Reason: CONN_ABORT (1), State: ABORT (5) vmcid: IBM minor code: 306 completed: Maybe
    at com.ibm.rmi.iiop.Connection.purge_calls
    at com.ibm.rmi.iiop.Connection.doReaderWorkOnce
    at com.ibm.rmi.transport.ReaderThread.run

The Server-side ORB trace will show the following SocketTimeoutException shortly after receiving the client's request message:

  [2/3/20 18:12:03:158 EET] 0000049f 3 ORBRas
  com.ibm.rmi.iiop.Connection doReaderWorkOnce:3447
  RT=29:P=794048:O=0:WSSSLTransportConnection[addr=1.2.3.4 port=41678,local=9426]
  The following exception was logged
  java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0
    ...
    at com.ibm.rmi.iiop.Connection.doReaderWorkOnce
    at com.ibm.rmi.transport.ReaderThread.run
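When checking whether collected traces match this APAR, it can help to search for the two signatures quoted above. The following is only a sketch: the class OrbTraceSignature and its method names are hypothetical, and the regular expressions are derived solely from the trace excerpts in this description, so other ORB levels may format these entries differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of WebSphere or the IBM ORB.
class OrbTraceSignature {
    // Client side: COMM_FAILURE with a reason name and an IBM minor code.
    private static final Pattern CLIENT = Pattern.compile(
        "org\\.omg\\.CORBA\\.COMM_FAILURE:.*?Reason: (\\w+) \\(\\d+\\).*?minor code: (\\w+)",
        Pattern.DOTALL);

    // Server side: a read timeout surfacing in the ORB reader's work loop.
    private static final Pattern SERVER = Pattern.compile(
        "java\\.net\\.SocketTimeoutException: Read timed out.*?doReaderWorkOnce",
        Pattern.DOTALL);

    /** Returns "REASON/minorCode" for the first COMM_FAILURE found, or null if none. */
    static String clientFailure(String trace) {
        Matcher m = CLIENT.matcher(trace);
        return m.find() ? m.group(1) + "/" + m.group(2) : null;
    }

    /** Returns true if the trace text contains the server-side read timeout. */
    static boolean serverReadTimeout(String trace) {
        return SERVER.matcher(trace).find();
    }

    public static void main(String[] args) {
        String client = "org.omg.CORBA.COMM_FAILURE: purge_calls:2177 "
            + "Reason: CONN_ABORT (1), State: ABORT (5) vmcid: IBM minor code: 306";
        System.out.println(clientFailure(client));
    }
}
```

A match on both sides at nearly the same timestamp, on a freshly created SSL connection, is consistent with the failure described here, but it does not rule out other causes of a read timeout.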
Local fix
Problem summary
****************************************************************
* USERS AFFECTED:  IBM WebSphere Application Server users of   *
*                  ORB SSL communications                      *
****************************************************************
* PROBLEM DESCRIPTION: A SocketTimeoutException on an          *
*                      Application Server's ORB connection     *
*                      can cause a client ORB request to       *
*                      fail with COMM_FAILURE exception.       *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
When a Server receives a new ORB request on an SSL connection, the Server's ORB Reader thread for that connection can attempt a new socket read before the read timeout has been set back to 0 (following the SSL Handshake). This can cause an errant SocketTimeoutException. This exception in turn causes the connection to be cleaned up on the Server, so the client's originating request fails with a COMM_FAILURE exception.

The problem is more likely to occur on a client request which takes longer than average (more than 10 seconds), and also on a server which is under more load and has more running threads.
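The effect of a leftover read timeout can be shown in miniature with plain java.net sockets. This is a minimal sketch and not WebSphere ORB code: the class ReadTimeoutDemo, the method readWithTimeout, and the 200 ms and 600 ms values are assumptions chosen for illustration. It only demonstrates the documented java.net.Socket behavior that a nonzero SO_TIMEOUT makes a read on an idle connection throw SocketTimeoutException, while a timeout of 0 blocks until data arrives.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; names and timings are illustrative only.
class ReadTimeoutDemo {

    /**
     * Reads one byte from a loopback connection whose peer stays silent for
     * peerDelayMillis before sending. Returns "DATA" if the byte arrives,
     * "TIMEOUT" if the read timeout fires first, or "ERROR: ..." otherwise.
     */
    static String readWithTimeout(int soTimeoutMillis, long peerDelayMillis) {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket accepted = listener.accept()) {

            // The peer sends its single byte only after the delay has passed.
            Thread sender = new Thread(() -> {
                try {
                    Thread.sleep(peerDelayMillis);
                    client.getOutputStream().write(42);
                    client.getOutputStream().flush();
                } catch (IOException | InterruptedException ignored) {
                    // Nothing to do in this sketch.
                }
            });
            sender.start();

            accepted.setSoTimeout(soTimeoutMillis); // 0 means block indefinitely
            InputStream in = accepted.getInputStream();
            String result;
            try {
                result = (in.read() == 42) ? "DATA" : "UNEXPECTED";
            } catch (SocketTimeoutException e) {
                result = "TIMEOUT"; // the "errant" exception when the timeout is stale
            }
            sender.join();
            return result;
        } catch (IOException | InterruptedException e) {
            return "ERROR: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println("timeout 200 ms: " + readWithTimeout(200, 600));
        System.out.println("timeout 0:      " + readWithTimeout(0, 600));
    }
}
```

In this sketch the idle period stands in for the time during which the server has nothing new to read on the connection. With the handshake-era timeout still in place the read gives up, whereas with the timeout restored to 0 the same read simply waits.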
Problem conclusion
When an Application Server gets a new incoming ORB SSL connection, a HandshakeCompletedNotifier Thread is created to detect when the SSL Handshake is completed on the new connection. This thread is supposed to reset the read timeout back to 0 to allow correct functioning of the ORB Reader thread using the connection. If this Notifier thread is not scheduled and run in a timely manner, the Server's ORB Reader thread for the connection can attempt a new socket read before the read timeout has been set back to 0, causing an errant SocketTimeoutException.

Instead of relying on the timely scheduling of the HandshakeCompletedNotifier Thread, the Server's ORB Listener Thread will perform the resetting of the socket read timeout to 0 upon completion of the SSL Handshake.

This fix is intended only for Application Servers on which com.ibm.ws.orb.transport.DeferSSLHandshake=false.

The fix for this APAR is targeted for inclusion in fix pack 9.0.5.4. For more information, see 'Recommended Updates for WebSphere Application Server':
http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
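The difference between the two approaches can be sketched with plain sockets. This is a simulation under stated assumptions, not the ORB's implementation: the class TimeoutResetDemo, the method serve, the one-byte "hello" that stands in for the SSL handshake, and all timing values are hypothetical. With resetOnListener set to false, a separate notifier thread resets the timeout after a delay, as the pre-fix design is described above. With it set to true, the accepting thread resets the timeout itself before the reader thread is started.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.net.SocketTimeoutException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical simulation; not IBM ORB code.
class TimeoutResetDemo {
    static final int HANDSHAKE_TIMEOUT_MILLIS = 200; // assumed value
    static final long REQUEST_DELAY_MILLIS = 800;    // when the next message arrives

    /** Returns what the reader thread observed: "DATA", "TIMEOUT" or "ERROR: ...". */
    static String serve(boolean resetOnListener, long notifierDelayMillis) {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket conn = listener.accept()) {

            // Stand-in for the handshake: a bounded read of one "hello" byte.
            client.getOutputStream().write(1);
            conn.setSoTimeout(HANDSHAKE_TIMEOUT_MILLIS);
            conn.getInputStream().read();

            Thread notifier = null;
            if (resetOnListener) {
                // Fixed ordering: reset on this thread, before the reader exists.
                conn.setSoTimeout(0);
            } else {
                // Original ordering: a separate thread resets it whenever it runs.
                notifier = new Thread(() -> {
                    try {
                        Thread.sleep(notifierDelayMillis); // simulated scheduling delay
                        conn.setSoTimeout(0);
                    } catch (InterruptedException | SocketException ignored) {
                        // Nothing to do in this sketch.
                    }
                });
                notifier.start();
            }

            AtomicReference<String> observed = new AtomicReference<>("UNSET");
            Thread reader = new Thread(() -> {
                try {
                    observed.set(conn.getInputStream().read() == 2 ? "DATA" : "UNEXPECTED");
                } catch (SocketTimeoutException e) {
                    observed.set("TIMEOUT");
                } catch (IOException e) {
                    observed.set("ERROR: " + e);
                }
            });
            reader.start();

            Thread.sleep(REQUEST_DELAY_MILLIS);
            client.getOutputStream().write(2);
            reader.join();
            if (notifier != null) {
                notifier.join();
            }
            return observed.get();
        } catch (IOException | InterruptedException e) {
            return "ERROR: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println("late notifier:     " + serve(false, 400));
        System.out.println("reset on listener: " + serve(true, 0));
    }
}
```

The design point the sketch tries to capture is ordering: when the reset happens on the same thread that hands the connection to the reader, the reader cannot begin a read while the handshake timeout is still in effect, regardless of how other threads are scheduled.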
Temporary fix
Comments
APAR Information
APAR number
PH22275
Reported component name
WEBS APP SERV N
Reported component ID
5724H8800
Reported release
900
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-02-14
Closed date
2020-04-28
Last modified date
2020-04-28
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WEBS APP SERV N
Fixed component ID
5724H8800
Applicable component levels
R900 PSY
UP
Document Information
Modified date:
01 November 2021