Technical Blog Post
Analogy for threads in socketRead
Lately I've been working on several PMRs (Problem Management Records) involving threads that either complete slowly or are reported as "may be hung" in WebSphere Application Server and many of the products that run on it. A very common scenario is a thread sitting in socketRead:
[9/15/14 9:32:13:492 PDT] 00000023 ThreadMonitor W WSVR0605W: Thread "WebContainer : 0" (00000021) has been active for 692481 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung.
at java.net.SocketInputStream.socketRead0(Native Method)
... many more lines in stack
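To make the pattern concrete, here is a minimal sketch of how the same check could be done from inside a JVM by scanning live thread stacks for a socket read frame. The class and method names (SocketReadScanner, isInSocketRead, threadsInSocketRead) are hypothetical, and the frame name it matches is taken from the trace above; newer JDKs may report a different class at the top of the stack, so treat the match as an assumption to adjust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SocketReadScanner {

    // Returns true if one of the top three frames is a blocking socket read.
    // The frame name is JDK-specific; socketRead0 matches the trace shown above.
    static boolean isInSocketRead(StackTraceElement[] stack) {
        int limit = Math.min(stack.length, 3);
        for (int i = 0; i < limit; i++) {
            StackTraceElement frame = stack[i];
            if (frame.getClassName().equals("java.net.SocketInputStream")
                    && frame.getMethodName().startsWith("socketRead")) {
                return true;
            }
        }
        return false;
    }

    // Collects the names of all live threads currently in a socket read.
    static List<String> threadsInSocketRead() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            if (isInSocketRead(e.getValue())) {
                names.add(e.getKey().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Threads in socketRead: " + threadsInSocketRead());
    }
}
```

In practice a javacore or thread dump gives the same information without any code; the sketch only shows what to look for at the top of each stack.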
The analogy I use to illustrate a thread in socketRead is a phone call. Suppose you call me (your backend system: a database, LDAP server, remote web service, web server, etc.) and ask for some information. You are now in "socketRead", waiting for me to respond. There is nothing you can do to speed that up; all you can do is wait.
The delay can have several causes. The request may be complex, so I have to consult several sources to answer it (a complex database query). I may not be able to get the file cabinet holding the answer open quickly (slow performance on the database or backend server). We may have a bad connection because of a satellite problem, so I am talking but my response reaches you late (a delay on the network). If you never get a response, the line may have gone dead (again, a problem on the network between the backend resource and the JVM, or the backend resource has crashed). No matter what, you need the answer, so all you can do is wait (still in "socketRead") until my response gets back to you.
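The phone call can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the "backend" is a local stand-in socket that deliberately waits before answering, and the names PhoneCallDemo and waitForAnswer are hypothetical. The point is that the caller's read does not return until the reply arrives, however long that takes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class PhoneCallDemo {

    // Calls a local stand-in "backend" that waits delayMs before answering,
    // and returns roughly how many milliseconds the caller spent waiting.
    static long waitForAnswer(long delayMs) throws IOException, InterruptedException {
        try (ServerSocket backend = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread responder = new Thread(() -> {
                try (Socket call = backend.accept()) {
                    Thread.sleep(delayMs);   // the backend is "looking up the answer"
                    OutputStream out = call.getOutputStream();
                    out.write("answer\n".getBytes(StandardCharsets.US_ASCII));
                    out.flush();
                } catch (IOException | InterruptedException e) {
                    // Demo only: a failure here ends the call without a reply.
                }
            });
            responder.start();

            long start = System.nanoTime();
            try (Socket caller = new Socket(backend.getInetAddress(), backend.getLocalPort());
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(caller.getInputStream(), StandardCharsets.US_ASCII))) {
                in.readLine();               // blocks in the socket read until the reply arrives
            }
            long waitedMs = (System.nanoTime() - start) / 1_000_000;
            responder.join();
            return waitedMs;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Waited about " + waitForAnswer(500) + " ms for the backend");
    }
}
```

The caller's code is identical whether the delay comes from a slow query, a slow server, or a slow network, which is why the thread stack alone cannot tell you which one it is.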
To resolve the issue, have the application team that owns the code in the thread stack (the code that made the request and put the thread into socketRead) find out:
1) What information was requested
2) What backend system was it requested from
If there is no apparent problem with the query itself, ask the administrator of the backend system to check:
a) If the remote server is running
b) If the application the data is requested from is running
c) If there are performance issues on the backend system
d) If the query that the application performed was valid and works directly on the database
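As a first pass at item a) above, a quick TCP connection test from the application server host can show whether the backend is accepting connections at all. This is a minimal sketch in plain Java; BackendCheck and canConnect are hypothetical names, and the default host and port in main are placeholders, not real values. A successful connection only proves that something is listening; it says nothing about query performance.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class BackendCheck {

    // Returns true if a TCP connection to host:port is accepted within timeoutMs.
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;   // refused, timed out, or host could not be resolved
        }
    }

    public static void main(String[] args) {
        // Placeholder backend address; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "db.example.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50000;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```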
If there is no apparent issue with the backend system, engage the network/firewall team to determine whether something is preventing or delaying the response from the backend system to the application server. This may require network traces, which the network team should collect.
If there is a network issue causing a delayed response, this delay is outside of the application server's control and would need to be resolved by the network administrators rather than WebSphere Application Server support.
** Note: If a JVM is encountering OutOfMemory issues, diagnose those first, because they can cause many other problems. They are easy to identify from the OutOfMemory notifications in the logs.
Some requests may need to take several minutes to complete. If long running requests are not expected, you can set timeouts in WebSphere Application Server to time out requests after a predetermined amount of time. For information on this, please see David Tiler's blog entitled "What to do when threads are hung in socketRead() waiting for a response from a backend."
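To show what a read timeout does at the socket level, here is a minimal sketch in plain java.net. It is not WebSphere Application Server configuration; the product's own timeout settings are covered in the blog referenced above. The names ReadTimeoutDemo and readTimesOut are hypothetical, and the "backend" is a local socket that never answers.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {

    // Returns true if the read gave up after timeoutMs because the backend never replied.
    static boolean readTimesOut(int timeoutMs) throws IOException {
        try (ServerSocket silentBackend = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket caller = new Socket(silentBackend.getInetAddress(), silentBackend.getLocalPort())) {
            caller.setSoTimeout(timeoutMs);       // cap how long a single read may block
            try {
                caller.getInputStream().read();   // nobody ever answers this call
                return false;
            } catch (SocketTimeoutException e) {
                return true;                      // the caller hung up instead of waiting forever
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Read timed out: " + readTimesOut(1000));
    }
}
```

In the phone analogy, the timeout is deciding in advance how long you will stay on the line before hanging up. It frees the thread, but the underlying backend or network problem still has to be found and fixed.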