IBM Support

How to interpret the general diagnostic data of a performance problem in IBM Business Process Manager

Technical Blog Post



Body

 

In the blog post "Collection of data for troubleshooting a performance issue involving IBM Business Process Manager", we discussed the MustGather data for troubleshooting performance issues in IBM Business Process Manager (BPM). After gathering the general diagnostic data, you can review it yourself to find the root cause.

 

First, check the operating system behavior to see whether the Java™ process consumes a lot of resources. The output of the linperf.sh script is a file named linperf_RESULTS.tar.gz. Extract it to get a list of files containing CPU, memory, disk I/O, and network statistics. You can use your own OS monitoring tools to get this data, but if you do not have a good one, you can use the linperf.sh script. In the top.out file, you can identify the process that uses the most CPU. The following is an example top.out:

image
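If top.out is long, the ranking can also be done with a short script. The following Python sketch assumes the file contains standard `top` batch output with a `PID ... %CPU ... COMMAND` header row; the function name `top_cpu_consumers` and the sample text are illustrative, not part of linperf.sh, and only the first snapshot in the file is read.

```python
# Illustrative sample in the usual `top -b` layout. The user name, memory
# figures, and second process are made up; only PID 3153 at 130% CPU
# mirrors the example discussed in this post.
SAMPLE_TOP = """\
top - 10:15:01 up 12 days,  3:02,  2 users,  load average: 2.10, 1.80, 1.50
Tasks: 180 total,   2 running, 178 sleeping,   0 stopped,   0 zombie

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1021 root      20   0  120000   5000   3000 S   1.3  0.1   0:10.02 sshd
 3153 bpmadmin  20   0 6543210   2.1g  30000 S 130.0 27.0  95:12.34 java
"""


def top_cpu_consumers(top_text, limit=3):
    """Return (pid, cpu_percent, command) tuples from the first snapshot,
    sorted by CPU usage, highest first."""
    columns = None
    rows = []
    for line in top_text.splitlines():
        fields = line.split()
        if not fields:
            if rows:
                break  # a blank line after the process table ends the snapshot
            continue
        if fields[0] == "PID" and "%CPU" in fields:
            if rows:
                break  # a second header means the next snapshot has started
            columns = fields
            continue
        # Skip the summary lines and anything that is not a process row.
        if columns is None or not fields[0].isdigit() or len(fields) < len(columns):
            continue
        cpu = float(fields[columns.index("%CPU")])
        command = " ".join(fields[columns.index("COMMAND"):])
        rows.append((int(fields[0]), cpu, command))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


for pid, cpu, command in top_cpu_consumers(SAMPLE_TOP):
    print(pid, cpu, command)
```

Because `top -H` output uses the same column layout, the same function should also rank threads in the per-thread output, where the PID column holds thread IDs.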

 

You can see that the BPM JVM consumes 130% of CPU, while the other processes do not use much CPU. In some cases, you might see other processes using high CPU instead. In this example, the next step is to find out which threads inside the BPM JVM consume the most CPU. To do that, check the per-thread output in topdashH3153.out:

image

 

Two WebContainer threads and one ThreadPool worker thread use a lot of CPU. To find out what each thread is doing, check the javacore files.

 

A javacore is a runtime snapshot of a JVM. A single javacore does not help much in diagnosing a performance problem. Collect a series of javacores at an interval of about 30 seconds, and adjust the interval according to how long the performance problem lasts. If it lasts 20 seconds, consider an interval of 15 seconds. With a series of javacores, you can see how the JVM moves from one state to the next. In the example above, the WebContainer threads consume high CPU, and thread 3417 consumes 34.5% of CPU. Convert the thread ID 3417 (shown in the PID column of the per-thread top output) from decimal to hexadecimal, which gives 0xD59. Search for 0xD59 in the javacore file to find the thread stack:
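The conversion and search can be scripted. The following Python sketch assumes the usual IBM javacore layout, in which each thread entry starts with a `3XMTHREADINFO` line and carries a `native thread ID:0x...` field. The function names and the sample javacore excerpt, including the thread name and stack frame, are made up for illustration.

```python
import re

# Made-up javacore excerpt; real entries contain more fields and frames.
SAMPLE_JAVACORE = """\
3XMTHREADINFO      "WebContainer : 3" J9VMThread:0x0000000001A2B300, state:R, prio=5
3XMTHREADINFO1            (native thread ID:0xD59, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at com/example/Worker.run(Worker.java:42)
3XMTHREADINFO      "Thread-7" J9VMThread:0x0000000001A2C400, state:CW, prio=5
3XMTHREADINFO1            (native thread ID:0xD5A, native priority:0x5, native policy:UNKNOWN)
"""


def to_hex_thread_id(decimal_tid):
    """Convert a decimal thread ID from top to the 0x... form used in javacores."""
    return "0x" + format(decimal_tid, "X")


def find_thread_block(javacore_text, decimal_tid):
    """Return the lines of the thread entry whose native thread ID matches,
    or an empty list if no entry matches."""
    pattern = re.compile(
        r"native thread ID:" + to_hex_thread_id(decimal_tid) + r"\b",
        re.IGNORECASE,
    )
    blocks = []
    current = None
    for line in javacore_text.splitlines():
        # The trailing space keeps 3XMTHREADINFO1/2/3 lines inside the block.
        if line.startswith("3XMTHREADINFO "):
            current = [line]
            blocks.append(current)
        elif current is not None:
            current.append(line)
    for block in blocks:
        if any(pattern.search(line) for line in block):
            return block
    return []


print(to_hex_thread_id(3417))
print("\n".join(find_thread_block(SAMPLE_JAVACORE, 3417)))
```

Running the same lookup against each javacore in the series shows whether the thread stays in the same stack from one snapshot to the next.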

image

 

The thread is probably writing logs. Check whether you have enabled heavy tracing on BPM. Besides log and trace writing, we have seen the following causes of high CPU:
1) GC slave threads
2) Infinite loops

image

 

3) Execution context serialization and deserialization, which can be CPU-intensive. It can consume a lot of CPU if the execution context is larger than 10 MB. The corresponding thread stack looks like this:

image

 

In addition, check the vmstat.out file, which contains the output of the vmstat command. Pay attention to the following columns:

Procs

r: The number of processes waiting for run time. When this value is larger than the number of CPUs, it indicates that the CPU is busy and there might be a CPU bottleneck.

b: The number of blocked processes, that is, processes in uninterruptible sleep (typically waiting for I/O).

Memory

swpd: The amount of virtual memory used. If it is larger than 0, it indicates that memory is insufficient. Consider adding more memory or moving to new hardware.

Swap

si: The amount of memory swapped in from disk per second. A value greater than 0 means that memory is insufficient.

so: The amount of memory swapped out to disk per second. As with si, a value greater than 0 means that memory is insufficient.

System

cs: The number of context switches per second. A context switch is expensive, so the smaller this value is, the better. A large value indicates that the CPU spends much of its time switching contexts rather than processing data, so its effective utilization is low. A value larger than 100,000 is a cause for concern.
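The checks above can be applied to every sample in vmstat.out with a short script. The following Python sketch assumes the standard Linux vmstat layout with a column header row beginning with `r  b`. The function name `check_vmstat` and the sample rows are illustrative (only the run queue of 29 echoes the example in this post), and the number of CPUs must be supplied by the caller.

```python
# Made-up vmstat samples in the standard Linux layout.
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 812340  10240 402100    0    0     5    12  300  900 20  5 75  0  0
29  1   2048 100200  10240 402100   10   25     8    40  900 150000 85 10  3  2  0
"""


def check_vmstat(vmstat_text, cpu_count, cs_limit=100_000):
    """Return (sample_number, message) pairs for samples that cross the
    thresholds described in this post."""
    findings = []
    columns = None
    sample = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        # The column header row starts with "r b"; it may repeat in long runs.
        if len(fields) >= 2 and fields[0] == "r" and fields[1] == "b":
            columns = fields
            continue
        if columns is None or len(fields) < len(columns):
            continue
        values = fields[:len(columns)]
        if not all(value.isdigit() for value in values):
            continue  # skips the "procs ---memory---" banner and other text
        row = dict(zip(columns, map(int, values)))
        sample += 1
        if row["r"] > cpu_count:
            findings.append((sample, f"run queue {row['r']} exceeds {cpu_count} CPUs"))
        if row["swpd"] > 0:
            findings.append((sample, f"swap in use: swpd={row['swpd']}"))
        if row["si"] > 0 or row["so"] > 0:
            findings.append((sample, f"swapping: si={row['si']} so={row['so']}"))
        if row["cs"] > cs_limit:
            findings.append((sample, f"context switches {row['cs']} exceed {cs_limit}"))
    return findings


for sample_number, message in check_vmstat(SAMPLE_VMSTAT, cpu_count=8):
    print(sample_number, message)
```

The thresholds are the rules of thumb from the column descriptions, so treat each finding as a prompt for further investigation rather than a verdict.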

 

The following is an example vmstat.out:

image

 

In the r column under procs, some processes are waiting for run time, and at one point 29 are queued. In this case, investigate whether the CPU capacity is sufficient, or whether some process or thread is consuming CPU that it should not be.
 

 

 

title image (modified) credit: (cc) Some rights reserved by ClkerFreeVectorImages

 

 


UID

ibm11080687