Slow and hung request detection
The requestTiming-1.0
feature provides diagnostic information when the
duration of any request exceeds the configured threshold.
The latest documentation about slow and hung request detection is available on the Open Liberty website.
Slow request detection
When a request runs for longer than it was configured, a warning message is written in the messages log file. Details about the request and events that made up the request are captured.[12/1/14 11:58:09:629 IST] 0000001d com.ibm.ws.request.timing.SlowRequestTimer W TRAS0112W: Request AABjnS+lIn0_AAAAAAAAAAb has been running on thread 00000021 for at least 10003.571ms. The following stack trace shows what this thread is currently running. at java.util.HashMap.getEntry(HashMap.java:516) at java.util.HashMap.get(HashMap.java:504) at org.apache.derby.iapi.store.access.BackingStoreHashtable.get(Unknown Source) at org.apache.derby.impl.sql.execute.HashScanResultSet.getNextRowCore(Unknown Source) at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown Source) at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown Source) at org.apache.derby.impl.sql.execute.DMLWriteResultSet.getNextRowCore(Unknown Source) at org.apache.derby.impl.sql.execute.DeleteResultSet.setup(Unknown Source) at org.apache.derby.impl.sql.execute.DeleteResultSet.open(Unknown Source) at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source) at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source) at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.executeUpdate(WSJdbcPreparedStatement.java:626) at com.ibm.websphere.samples.trade.direct.TradeDirect.resetTrade(TradeDirect.java:1832) at com.ibm.websphere.samples.trade.web.TradeConfigServlet.doResetTrade(TradeConfigServlet.java:65) at com.ibm.websphere.samples.trade.web.TradeConfigServlet.service(TradeConfigServlet.java:348) at javax.servlet.http.HttpServlet.service(HttpServlet.java:668) at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1275) .... at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1121) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614) at java.lang.Thread.run(Thread.java:769) Duration Operation 10007.571ms + websphere.servlet.service | TradeWeb | TradeConfigServlet?action=resetTrade 3.923ms websphere.datasource.psExecuteUpdate | jdbc/TradeDataSource | delete from holdingejb where holdingejb.account_accountid is null 0.853ms websphere.datasource.psExecuteUpdate | jdbc/TradeDataSource | delete from accountprofileejb where userid like 'ru:%' 5271.341ms + websphere.datasource.psExecuteUpdate | jdbc/TradeDataSource | delete from orderejb where account_accountid in (select accountid from accountejb a where a.profile_useri like 'ru:%')
TRAS0112W: Request <(Request ID)> has been running on thread <THREADID> for at least <DURATION>. The following stack trace shows what this thread is currently running. <STACK TRACE> <DURATION AND OPERATIONS Table>
- REQUEST ID
- This same ID can be used to search for log and trace messages corresponding to the request. In
particular, if you use binary logging, you can search for log and trace entries with the same
requestID
extension by using the binary log command. - STACK TRACE
- Indicates the method that is running. In the previous sample, you can see a stack trace of the
current request after
TRAS0112W
line. - DURATION AND OPERATIONS Table
- After the stack trace, you can find the tabular format of the request that shows the duration
and operation (also referred to as the event). The
Duration
column indicates the time that is taken by the corresponding operation of the request. The plus sign (+) indicates events within the request that are still running. The next line shows the duration without +, which indicates that the corresponding operation completed in the specified duration. Operation shows theEVENT TYPE
andCONTEXT INFO
(optional) for that operation. For more information about event types and context information, see Event Logging.
By analyzing the messages, you can figure out why the request is slow. However, it might be difficult to determine whether the request is stuck at that point or is still running slowly. Hence you can see three messages that are logged for any slow request at the interval of specified <slowRequestThreshold>. Using the three different stack trace and request data, you get a better insight into the issue. After the third warning, no further warnings are logged about the request unless the duration of the request crosses the hung request detection threshold.
Hung request detection
If the request exceeds the defaulthungRequestThreshold
or the configured threshold value, a warning message is
written in the messages log file along with the details about the request. Details about the request
and events that made up the request are captured. For a hung request detection, a series of three
thread dumps (javacores) is taken with a 1-minute delay between them. The following log message
sample shows the log messages for a request that crossed the hung request detection threshold. The
default duration value is 10 min. The value that is configured in the following example is 4
min.[WARNING ] TRAS0114W: Request AAA7WlpP7l7_AAAAAAAAAAA
was running on thread
00000021 for at least 240001.015ms. The following table shows the events that have run during this
request.Duration Operation 240001.754ms + websphere.servlet.service | TestWebApp | TestServlet?sleepTime=480000 0.095ms websphere.session.setAttribute | mCzBMyzMvAEnjMJJx9zQYIw | userID 0.007ms websphere.session.setAttribute | mCzBMyzMvAEnjMJJx9zQYIw | visitCount
TRAS0115W: Request AAA7WlpP7l7_AAAAAAAAAAAIt was previously detected to be hung and completed after 479999.681 s.
Configuring the slow and hung request thresholds
The slow and hung request thresholds are configured by setting
slowRequestThreshold and hungRequestThreshold in the
<requestTiming/>
configuration in the server.xml file. This
setting sets a single slow and hung request threshold for all requests that are timed.
You can also set different slow and hung request thresholds for different types of requests. An
embedded <servletTiming/>
or <jdbcTiming/>
configuration in
the server.xml file overrides the configured slow and hung request threshold
that are defined in <requestTiming/>
. A <servletTiming/>
configuration can set the slow and hung request threshold by application name, servlet name, or
both. A <jdbcTiming/>
configuration can set the slow and hung request threshold
by datasource name, sql string, or both. In both cases, the names are taken from the context
information for the request, as reported by the eventLogging-1.0
feature.
The event log contains the type and context information for each request that was processed by
the server. With type and context information, you can create a <servletTiming/>
or <jdbcTiming/>
configuration in the server.xml file for
the requests that require different threshold values.
For more information about what context information is provided for each request, see Event Logging in the product documentation. A <servletTiming/>
or <jdbcTiming/>
configuration can only be used to time a root level request.
For example, if a servlet request makes a JDBC query, the JDBC query cannot be timed because the
servlet request is the root level request.
<requestTiming/>
configuration with embedded
<servletTiming/>
configuration for one specific servlet and one specific web
application: <requestTiming slowRequestThreshold="10s" hungRequestThreshold="10m">
<servletTiming servletName="Arm10Servlet" hungRequestThreshold="20s"/>
<servletTiming appName="SecurityExchangeApp" hungRequestThreshold="2h"/>
</requestTiming>
requestTiming-1.0 feature restrictions
requestTiming-1.0
feature, the following restriction applies: - The
requestTiming-1.0
feature, when activated, has been shown to have a 4% adverse effect on the maximum possible application throughput when measured with the DayTrader application. While the effect on your application might be more or less than that, you should be aware that some performance degradation might be noticeable.