IBM Support

Mustgather: Gathering data for high processor usage on z/OS®

Troubleshooting


Problem

If your Java application experiences high processor usage on a z/OS® operating system, several diagnostic data files are useful for diagnosing the problem. This mustgather document describes the diagnostic data files to collect and the procedure for collecting them.

Resolving The Problem

Collect the following diagnostic data files manually while your Java application is experiencing high processor usage:
1. Javacores
2. Transaction dumps (TDumps)
3. CEEDumps
4. Verbose GC logs (native_stderr.log)

For these diagnostic data files to be created, check that the system is configured correctly, as described in the setup document.

Collect the diagnostic data files manually during the period of high processor usage, using the following steps:

1. Generate multiple javacores, transaction dumps and CEEDumps using the following command:
kill -SIGQUIT [PID_of_problem_JVM]

Leave an interval of 2 or 3 minutes between each set of dumps (javacore, TDump, and CEEDump). Alternatively, spread the collection across the duration of the problem and generate a set of diagnostic data files at each of the following points:

a) As soon as possible after the application starts.
b) When you notice a definite increase in processor usage, based on observation or monitoring tools.
c) When processor usage peaks and remains consistently high, based on observation or monitoring tools.
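The repeated collection described in this step can be scripted from a z/OS UNIX shell. The following is a minimal sketch, not an IBM-supplied tool: the function name `collect_dump_sets`, its arguments, and the default interval of 150 seconds are assumptions chosen for illustration. It uses `kill -s QUIT`, the POSIX spelling of the same SIGQUIT signal.

```shell
#!/bin/sh
# Hypothetical helper (not part of any IBM tooling): send SIGQUIT to a JVM
# several times, pausing between sends so that each set of dumps is spaced out.
# Usage: collect_dump_sets PID [COUNT] [INTERVAL_SECONDS]
collect_dump_sets() {
    pid=$1
    count=${2:-3}        # assumed default: three sets of dumps
    interval=${3:-150}   # assumed default: 2.5 minutes between sets

    i=1
    while [ "$i" -le "$count" ]; do
        # Stop early if the process no longer exists or cannot be signalled.
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "process $pid not found" >&2
            return 1
        fi
        echo "$(date '+%Y-%m-%d %H:%M:%S') sending SIGQUIT to $pid (set $i of $count)"
        kill -s QUIT "$pid" || return 1
        # Wait only between sets, not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

For example, `collect_dump_sets 12345 3 180` would request three sets of dumps from the hypothetical PID 12345 at 3-minute intervals. The timestamps it prints can help match each request to the files that are generated.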

High CPU situations are normally first observed on z/OS systems with tools such as SDSF, where the DA option can sort CPU usage by address space. You can confirm the observation with operator commands such as "D OMVS,A=ALL", where the CT_SECS field reports the CPU usage by address space. In a loop situation, you might expect the CT_SECS value for a specific address space to rise quickly between command invocations.

Once you have identified the address space of interest (and, for Java issues, its associated z/OS UNIX PID), you can use the command "D OMVS,PID=<pid value>" to show the CPU usage by thread in the process. Again, you might expect the ACC_TIME CPU time reported for a specific thread to rise quickly between command invocations. This information can later be cross-referenced with the thread information in any javacores or TDumps that you obtain.
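Comparing CT_SECS or ACC_TIME values between two invocations is simple arithmetic, and a small script can rank the entries by how much CPU time they gained. The sketch below does not parse real "D OMVS" output. It assumes that you have transcribed each snapshot into a plain file with one `<id> <cpu_seconds>` pair per line, where the id is an address space or thread identifier. The function name `cpu_delta` and that file layout are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: rank ids by the CPU seconds gained between two snapshots.
# Assumed input: two files, each with one "<id> <cpu_seconds>" pair per line,
# transcribed by hand from two invocations of the display command.
# Usage: cpu_delta BEFORE_FILE AFTER_FILE
cpu_delta() {
    LC_ALL=C awk '
        # First file: remember the earlier CPU time for each id.
        NR == FNR { before[$1] = $2; next }
        # Second file: print the increase for ids seen in both snapshots.
        ($1 in before) { printf "%s %.3f\n", $1, $2 - before[$1] }
    ' "$1" "$2" | LC_ALL=C sort -k2,2nr   # largest increase first
}
```

An id whose value grows by tens of seconds between snapshots taken a minute apart is a likely candidate for the looping address space or thread. Ids that appear in only one snapshot are ignored.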

Note that the javacores and CEEDumps are written to the current path. The transaction dumps are generated as standard MVS™ dump data sets. The names follow these formats:

Javacore : javacore.<time stamp>.<id>.txt
Transaction dump : %uid.JVM.TDUMP.%job.D%y%m%d.T%H%M%S
CEEDump : CEEDump.<date>.<id>
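To confirm that the javacores and CEEDumps were written, you can list the files that match the name formats above. This is a minimal sketch: the function name `list_jvm_dumps` is hypothetical, and the glob patterns are taken directly from the formats listed here, so adjust them if the names or letter case on your system differ. Transaction dumps are MVS data sets and are not covered by this file listing.

```shell
#!/bin/sh
# Hypothetical helper: list javacore and CEEDump files in a directory,
# using glob patterns derived from the name formats given in this document.
# Usage: list_jvm_dumps [DIRECTORY]   (defaults to the current directory)
list_jvm_dumps() {
    dir=${1:-.}
    for f in "$dir"/javacore.*.txt "$dir"/CEEDump.*; do
        # An unmatched pattern stays as a literal string; skip it.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```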

2. Sometimes the generation of diagnostic data files via the Java signal handler can fail because of resource constraints. In that case the operator can take a console dump of the looping process using the operator DUMP command.

For example, having identified the address space in which the loop occurs using SDSF or the "D OMVS,A=ALL" command, you can issue a command like:

DUMP COMM=(dumptitle)
Rxx,ASID=(1,aaaa),SDATA=(PSA,CSA,LPA,LSQA,RGN,SQA,SUM,SWA,TRT,ALLNUC,GRSQ),END

where xx is the reply number for the WTOR issued in response to the DUMP command, and aaaa is the address space ID in hexadecimal. Alternatively, you can specify the address space with JOBNAME=(xyz).
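As a worked illustration of the substitution, the following sketch fills the xx and aaaa placeholders into the reply text shown above. The function name `dump_reply` is hypothetical, and the script only prints the text for an operator to enter at the console. It does not issue any command itself.

```shell
#!/bin/sh
# Hypothetical helper: print the reply text for the DUMP command with the
# reply number and hexadecimal address space ID filled in.
# Usage: dump_reply REPLY_NUMBER ASID_HEX
dump_reply() {
    reply_id=$1
    asid=$2
    # Reject an empty ASID or one that contains non-hexadecimal characters.
    case $asid in
        ''|*[!0-9A-Fa-f]*)
            echo "ASID must be hexadecimal: $asid" >&2
            return 1
            ;;
    esac
    printf 'R%s,ASID=(1,%s),SDATA=(PSA,CSA,LPA,LSQA,RGN,SQA,SUM,SWA,TRT,ALLNUC,GRSQ),END\n' \
        "$reply_id" "$asid"
}
```

For example, `dump_reply 01 004A` prints the reply for a hypothetical reply number 01 and address space ID 004A.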

The resulting SVC dump can be post-processed by IBM internally to generate information similar to that produced by the JVM signal handler. Alternatively, you can process the data locally with the jextract and jdmpview tools provided in the IBM JDKs (consult the Java diagnostics guide for more information).

3. Collect Verbose Garbage Collector data. This data is in the location specified during setup. Alternatively, the data is sent to the stderr output.

After collecting all the above diagnostic data files, you can submit them for help with diagnosing the problem.


Document Information

Modified date:
15 June 2018

UID

swg21588548