IBM Support

MustGather: Short performance delays in WebSphere Application Server

Troubleshooting


Problem

If user requests are taking longer than normal to run, it can be difficult to find out what the cause of the delays are. Normal debugging techniques may not be sufficient to find the cause of the increased responce time, in which case we can enable some of WebSphere's request tracing.

Diagnosing The Problem

WebSphere has a trace called Request Metrics, which can report on the time each request spent in the major WebSphere components (Servlet, EJB, JDBC). Obviously this is going to add a fair bit of overhead, and if we enabled it on every request, the app server would become almost unusable. But, we can run an Access Log trace on the app server, which is very lightweight and will tell us the URI of the requests taking the longest time. We can then run the Request Metrics trace on those URIs only.

Resolving The Problem

Part I: Enable traces for access log:

1. From the Administrative Console, go to Servers > Server Types > WebSphere application servers > server_name > Web Container Settings > Web container transport chains > Click on transport chain application is using (by default WCInboundDefault for http requests and WCInboundDefaultSecure for HTTPs requests) > Click on HTTP inbound channel


image-20190422154545-17 

2. Under "Logging", check "Enable logging" check box

3. Expand NCSA Access logging and select "Use chain-specific logging"

4. Make sure "Enable access logging" is checked and add following values for these fields

  • Access log file path: ${SERVER_LOG_ROOT}/http_access.log
  • Access log maximum size: 500
  • Maximum number of historical files: 1
  • NCSA access log format : Common


image-20190422154545-18 
 

5. Click on Apply

6. Under "Additional Properties" on the same panel (#1 above), click on "Custom properties"

7. Click New and add accessLogFormat as Name and %t %r STATUS CODE: %s EXECUTION TIME: %D RESPONSE SIZE: %b as Value
** Note: This outputs the times in microseconds (not milliseconds) so you need to convert microseconds to whatever measurement you are monitoring

image-20190422154545-19 

8. Click on Apply and "Save" changes

9. Go to Servers > Server Types > WebSphere application servers > your_server

image-20190422154545-20 

10. Under "Troubleshooting" click on "NCSA access and HTTP error logging"

image-20190422154545-21 

11Make sure "Enable logging service at server start-up" check box is checked

image-20190422154545-22 

12 Recreate the issue and send IBM the http_access.log file for review so we can determine which URI took the longest time.

13.  Click "Apply" and "Save" changes

Part II: Enable Request metrics:

Once IBM support has analyzed the access log from Part I, they will tell you what URI or URIs to use to enable Request Metrics on. 

1. From the Administrative console on the left pane under "Monitoring and Tuning" click on "Request metrics"

image-20190422154545-23 

2. Enable request metrics by selecting "Prepare Servers for Request metrics collection"

image-20190422154545-24 

3. Select "All" radio button under "Components to be instrumented"

4. Select trace level to "Debug"

5. Check "Standard Logs" check box under Request Metrics Destination

6. Click "Apply" and "Save" changes

7. From that screen, click on "Filters" under Additional Properties on the right then select "URI"

 image-20190422154545-25

8. Select "Filter Values" under Additional Properties on the rightimage-20190422154545-26

9. Add a new filter and give the value of your URI (Click on “New”)
image-20190422154545-27

10. Paste the URI (all one string) into that field:
<your_filter_here>

On the same screen enable check box for "Enable filter"

image-20190422154545-28

11. Click “Apply” and “Save” changes

Configure the hang detection policy:
 

1. From the Administrative console, click Servers > Server Types > WebSphere application servers > server_name

2. Under Server Infrastructure, click Administration > Custom properties

image-20190422154545-29 

3. Click New and add the following name and value pairs

  • Name: com.ibm.websphere.threadmonitor.interval Value: 1
  • Name: com.ibm.websphere.threadmonitor.threshold Value: X where X = |(total delay)/2| e.g. If total delay is 28 seconds, value of this parameter is |28/2| = 14
  • Name: com.ibm.websphere.threadmonitor.false.alarm.threshold Value:0 (this will prevent automatic adjustment to threshold value)

image-20190422154545-30 

4. Click “Apply” and “Save” changes

For WebSphere 8.5.X and above only - Configure cross component tracing :

1. From the Administrative Console, go to Servers > Server Types > WebSphere application servers > server_name

2. Under "Troubleshooting" click on "Change log detail levels" (you can get to this panel by clicking "Logging and tracing > Diagnostic Trace > Change log detail levels under Additional properties)

image-20190422154545-31 

3. Under "Correlation", check the "Enable log and trace correlation" check box

4. Select third radio button "Include request IDs in log and trace records, create correlation log records and capture data snapshots"

image-20190422154545-32 

5. Click “Apply” and “Save” changes

Recreating the problem:

1. Stop the Application Server

2. Take a backup if needed and clear all Application SErver logs.  The log files by default are located in the following directory: install_root/profiles/profile_name/logs/server_name
(Note: If you have configured to write Application Server log files into a different location, then clear them accordingly)

  • native_stderr
  • native_stdout
  • SystemErr
  • SystemOut

3. Clear all WebSphere Application Server FFDC logs.  FFDC files are located in the following directory: install_root/profiles/profile_name/logs/
(Note: If you have configured to write FFDC log files into a different location, then clear them accordingly)

4. Start the server and recreate the problem

Submit following information to IBM:

1. Provide URL under question and time stamp when it's accessed

2. The Application Server logs, such as the following logs:

  • native_stderr
  • native_stdout
  • SystemErr
  • SystemOut
  • http_access.log

3. FFDC logs with the current date from the following directory:
install_root/profiles/profile_name/logs/ffdc


4. The server.xml file that is located in the directory:
install_root/profiles/profile_name/config/cells/cell_name/nodes/node_name/servers/server_name


5. Send the files/information to IBM Support

[{"Product":{"code":"SSEQTP","label":"WebSphere Application Server"},"Business Unit":{"code":"BU004","label":"Hybrid Cloud"},"Component":"Performance and High CPU Utilization","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF010","label":"HP-UX"},{"code":"PF016","label":"Linux"},{"code":"PF027","label":"Solaris"},{"code":"PF033","label":"Windows"}],"Version":"9.0;8.5.5;8.5;8.0.0.2;7.0.0.31","Edition":"Network Deployment;Single Server"}]

Document Information

Modified date:
22 April 2019

UID

swg21661868