WebSphere Performance - Alexandre Polozoff's Point of View
Someone takes a javacore during what looks like a hung app server and notices that it contains many threads in socketRead. This is symptomatic of a slow back end, whether that is a database, a Web service, or something else. An application is only as strong as its weakest link. If the back end the application depends on cannot respond in a timely manner, nothing can be tuned at the application layer except aggressive timeouts that protect the application from getting stuck. Hangs like these typically happen under high load. The group that maintains the back end needs to be made aware of the issue in their tier so that they can fix it.
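To make the "lots of threads in socketRead" check less of an eyeball exercise, here is a minimal sketch that lists the threads in a javacore whose stack mentions socketRead. It assumes the usual javacore layout, where each thread starts with a 3XMTHREADINFO line carrying the thread name in double quotes and each Java frame is a 4XESTACKTRACE line. The class and method names (SocketReadScan, threadsInSocketRead) are made up for this illustration, and this is no substitute for a proper javacore analysis tool.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class SocketReadScan {

    /**
     * Returns the names of threads whose stack contains a frame mentioning "socketRead".
     * Assumption: thread sections start with "3XMTHREADINFO " and frames with "4XESTACKTRACE".
     */
    static List<String> threadsInSocketRead(List<String> javacoreLines) {
        List<String> blocked = new ArrayList<>();
        String currentThread = null;
        boolean alreadyListed = false;
        for (String line : javacoreLines) {
            if (line.startsWith("3XMTHREADINFO ")) {
                // New thread section: pull the name from between the first pair of quotes.
                int open = line.indexOf('"');
                int close = open < 0 ? -1 : line.indexOf('"', open + 1);
                currentThread = close > open ? line.substring(open + 1, close) : "(unnamed)";
                alreadyListed = false;
            } else if (currentThread != null && !alreadyListed
                    && line.startsWith("4XESTACKTRACE") && line.contains("socketRead")) {
                // List each thread once, however many matching frames it has.
                blocked.add(currentThread);
                alreadyListed = true;
            }
        }
        return blocked;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SocketReadScan <path-to-javacore>
        // ISO-8859-1 decodes any byte sequence, so odd characters in the dump cannot abort the read.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        List<String> blocked = threadsInSocketRead(lines);
        System.out.println(blocked.size() + " thread(s) in socketRead");
        for (String name : blocked) {
            System.out.println("  " + name);
        }
    }
}
```

If most of the WebContainer threads show up in the output across several javacores taken a minute or so apart, that points at the back end rather than at the application server.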
Back in 2009 I wrote a blog post on socketRead issues causing hung threads and on using timeouts when communicating with a database. These problems don't happen very often anymore, as networks and databases tend to run in fairly robust environments. You can imagine my surprise when late last week I received an email from a colleague working with an application suffering from the same symptoms described in that post.
The reason for setting the timeouts is to be able to fail fast as opposed to the application appearing to be non-responsive.
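To illustrate what failing fast looks like at the socket level, here is a small self-contained sketch in plain Java. It is not WebSphere or JDBC configuration. It stands up a local listener that never answers (a stand-in for a slow back end) and shows that a read with a socket timeout gives up with a SocketTimeoutException instead of sitting in socketRead indefinitely. The class name, method name, and the 500 ms value are arbitrary choices for the example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class FailFastRead {

    /**
     * Connects to localhost:port and tries to read one byte.
     * Returns true if the read gave up after timeoutMillis,
     * false if the peer sent data or closed the connection first.
     */
    static boolean readTimesOut(int port, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            // Without this call the read below would block for as long as the peer stays silent.
            socket.setSoTimeout(timeoutMillis);
            try {
                socket.getInputStream().read();
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a slow back end: the OS completes the TCP handshake,
        // but nothing ever accepts the connection or writes a response.
        try (ServerSocket silentBackend = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            long start = System.nanoTime();
            boolean timedOut = readTimesOut(silentBackend.getLocalPort(), 500);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("timed out: " + timedOut + " after about " + elapsedMs + " ms");
        }
    }
}
```

In a real application the equivalent knob is usually set on the driver, data source, or client library rather than on a raw socket, but the effect is the same: the caller gets an error it can handle instead of a thread that never comes back.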
But what if the timeout doesn't seem to be kicking in? In that case the first thing to do is open a PMR (Problem Management Record) with IBM Support.
Data needs to be collected in at least two places: on the application server and on the database.
For the database, http://www.ibm.com/developerworks/data/library/techarticle/dm-0812wang/ explains how to use db2top to collect data while the problem is occurring.
For the application server (BPM in this case), http://www-01.ibm.com/support/docview.wss?uid=swg21611603 lists the various mustgather sets to collect so that IBM Support can run its analysis.
Networks can also have hiccups. Running tcpdump on both the application server and the database side captures the traffic between the two, and a protocol inspection tool like Wireshark can then be used to examine the underlying network communications and see what problems, if any, are occurring there.