Troubleshooting
Problem
This document helps you collect the data necessary to diagnose and resolve performance, hang, or high CPU issues on the application server or database server with the IBM Engineering Lifecycle Management (ELM) products, including IBM Jazz Team Server (JTS), IBM Engineering Workflow Management (EWM), and IBM Engineering Test Management (ETM).
Symptom
Performance, hang, or high CPU issues with the ELM products can occur when the products are installed in a distributed environment. Each issue can produce a variety of symptoms and behavioral deficiencies.
Cause
This MustGather helps you collect the data necessary to diagnose and resolve the issue. If you are unable to determine the root cause from the information collected, open a case with IBM Support for further investigation and provide the collected data.
Resolving The Problem
You can use the IBM Support Assistant Lite (ISA Lite) Data Collector tool, which is bundled with ELM, to quickly collect diagnostic files such as log files and configuration files, or to run traces. ISA Lite collects information about your Jazz Team Server environment and stores it in a .zip archive file. If you need to open a case with IBM Support for further assistance, you can send this archive file so that Support can help diagnose and fix the problem.
Gather the information below in addition to the standard information and logs collected by ISA Lite.
Note: Further information to help troubleshoot performance issues with your ELM products is also available in the Performance troubleshooting section of the Deployment wiki on jazz.net.
Business Impact
Includes:
- What effect is this having on the business?
- Did the issue happen once, or is it recurring?
- Is this a production or test environment?
- How many users are affected?
Unexpected Behavior
Details include:
- Problem description
- Steps to re-create
- Exact date and time of the issue. If the issue happened multiple times, provide all times when the issue happened
- Were there any recent changes made to the environment?
- Were new users able to log in to any ELM applications?
- What behavior did the logged-in users encounter? Provide a screen capture or record a video of the issue
- Any errors or exceptions logged at the time the incident occurred
- If you use a proxy server, check whether the issue also occurs when you bypass it
- What operations/applications/processes/users DO HAVE the issue. What operations/applications/processes/users DO NOT HAVE the issue
Topology
Description of the Topology includes:
- Is this a stand-alone or distributed environment?
- What applications are installed on the affected server?
- The version of the ELM applications including all applied interim fixes and hot fixes
- Are the ELM applications deployed on IBM WebSphere (traditional WebSphere or WebSphere Liberty)?
If traditional WebSphere, provide the output of:
<WebSphere_Install_Root>\bin\versionInfo -maintenancePackages
- The Operating System the ELM applications are installed on
- Include the number of CPUs
- How much memory is available
- Available disk space
- Size of indices on disk, for example, conf\<app>\indices
- The Database vendor and version being used.
If multiple database instances are being used, provide details on which ELM application is using which database instance.
- JVM parameters provided at server startup (heap sizes and config cache tuning). If the ISA tool is used, this information is already available.
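Several of the host facts listed above (number of CPUs, available disk space, size of the indices on disk) can be collected with a short script. The following is a minimal Python sketch and is not part of the IBM tooling. The function names are illustrative assumptions, the index directory is whatever path you pass in, and available memory is left out because the Python standard library has no portable call for it.

```python
import os
import shutil
from pathlib import Path


def directory_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under `root`, skipping symlinks."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and not path.is_symlink():
            total += path.stat().st_size
    return total


def host_summary(index_dir: str) -> dict:
    """Collect CPU count, disk usage for the index volume, and index size.

    `index_dir` is an assumption: point it at the indices directory of
    the application you are reporting on.
    """
    usage = shutil.disk_usage(index_dir)
    return {
        "cpu_count": os.cpu_count(),
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
        "index_size_bytes": directory_size_bytes(index_dir),
    }
```

Calling host_summary with the indices directory of one application returns a dictionary that can be pasted into the case alongside the other topology details.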
Javacore log files and Application Logs
Due to the distributed and interconnected nature of the ELM applications, we need, at a minimum, data from the JTS server and the affected ELM application, collected at the time the incident occurs. If possible, provide data from all JVMs.
- Enable verbose gc and javacore creation
- WebSphere instructions:
- Instructions to enable verbose gc are located here
- To add javacore creation, add the following parameters to Generic JVM arguments in the WebSphere Admin Console:
-Xdump:java:events=user
-Xdump:heap+java:events=excessivegc,range=1..1,request=exclusive+prepwalk+preempt
- Liberty instructions:
- Windows instructions:
- Open the <CLM_Install_Root>\server\server.startup.bat file.
Add the following lines in the set JAVA_OPTS section:
set JAVA_OPTS=%JAVA_OPTS% -Xdump:java:events=user
set JAVA_OPTS=%JAVA_OPTS% -Xdump:heap+java:events=excessivegc,range=1..1,request=exclusive+prepwalk+preempt
- Save the file and restart the server.
- Linux instructions:
- Open the <CLM_Install_Root>/server/server.startup file.
Add the following lines in the #Enable verbose GC logging for serviceability section:
JAVA_OPTS="$JAVA_OPTS -Xdump:java:events=user"
JAVA_OPTS="$JAVA_OPTS -Xdump:heap+java:events=excessivegc,range=1..1,request=exclusive+prepwalk+preempt"
- Save the file and restart the server.
- Gather 4-6 Java cores at 30-second intervals, as described in the document How to gather Java cores for different application servers in Engineering Lifecycle Management applications.
NOTE: Java cores must be gathered while the issue is taking place, and in particular BEFORE the server is restarted. Do not gather Java cores after the issue is gone.
- If an out-of-memory exception occurred, a javacore, heapdump, or core file was generated in your server installation directory. Collect all these files.
- Include all the ELM logs, including ETL logs, located in:
WebSphere location: <WebSphere_Install_Root>\profiles\(profile_name)\logs directory
Liberty Profile: <CLM_Install_Root>\server\logs directory
NOTE: These logs are gathered automatically by the ISA DC tool. If you use the ISA DC tool to gather log files, you do not have to attach them separately. These files can be gathered after restarting the server.
- If ELM is running on WebSphere, provide the WebSphere Performance, hang, or high CPU MustGathers for the operating system ELM is installed on (in addition to those listed above).
- Provide the access.log, error.log, and http_plugin.log files if you use IHS. These files are sometimes rotated quickly or cleaned up by customer tools when the server is restarted. Make sure that these logs contain the entries from the time when the issue happened, not only from after the server restart.
- Provide the report from a monitoring tool such as Splunk, Instana, or Grafana if you have configured one to gather data from the server.
The most significant usage parameters from the report are the following: processor, memory, Java heap, thread connections pool, garbage collection time, active services number, disk and network IO, number of expensive scenarios. Two charts are required: one for the last week and one for the last day.
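As a rough illustration of the Java core steps above on Linux, the sketch below first checks that a JVM argument string contains the two -Xdump options given in this document, then sends SIGQUIT (the equivalent of kill -3) to the server process several times at a fixed interval. It assumes an IBM J9 or OpenJ9 JVM started with -Xdump:java:events=user, that you already know the process ID, and that you run it as a user allowed to signal that process. The function names are hypothetical, and the Java core document referenced above remains the authoritative procedure.

```python
import os
import signal
import time
from typing import Callable, List

# The two options this document asks you to add to the JVM arguments.
REQUIRED_XDUMP_OPTIONS = (
    "-Xdump:java:events=user",
    "-Xdump:heap+java:events=excessivegc,range=1..1,"
    "request=exclusive+prepwalk+preempt",
)


def missing_xdump_options(jvm_args: str) -> List[str]:
    """Return the required -Xdump options absent from a JVM argument string."""
    present = set(jvm_args.split())
    return [opt for opt in REQUIRED_XDUMP_OPTIONS if opt not in present]


def trigger_javacores(
    pid: int,
    count: int = 5,
    interval_s: float = 30.0,
    send: Callable[[int, int], None] = os.kill,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Send SIGQUIT to `pid` `count` times, `interval_s` seconds apart.

    Returns the timestamps (seconds since the epoch) at which each
    signal was sent, which can be reported with the collected files.
    `send` and `sleep` are injectable so the logic can be dry-run.
    """
    if not 4 <= count <= 6:
        raise ValueError("this document asks for 4-6 Java cores")
    sent_at = []
    for i in range(count):
        send(pid, signal.SIGQUIT)  # same effect as `kill -3 <pid>`
        sent_at.append(time.time())
        if i < count - 1:  # no need to wait after the last signal
            sleep(interval_s)
    return sent_at
```

Passing a no-op send function gives a dry run that exercises the timing without signalling any process.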
Database usage logs
Additional Information
- Gather a screen capture of the following page from the web UI and save the content as Full HTML. Ensure you gather a screen capture for each application (example: jts, ccm, jazz, and others): https://<hostname>/<context root>/service/com.ibm.team.repository.service.internal.counters.ICounterContentService
- Gather a screen capture of the list of active services: https://<hostname>/<context root>/admin#action=com.ibm.team.repository.admin.activeServices
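Because both pages above must be captured once per application, it can help to generate the full list of URLs up front. The following Python sketch only builds the URL strings from the two patterns given above. The function name, the example hostname, and the list of context roots are assumptions to replace with your own values.

```python
from typing import Dict, List

# Path suffixes taken from the two URL patterns in this section.
COUNTERS_PATH = (
    "/service/com.ibm.team.repository.service.internal.counters."
    "ICounterContentService"
)
ACTIVE_SERVICES_PATH = (
    "/admin#action=com.ibm.team.repository.admin.activeServices"
)


def diagnostic_urls(hostname: str, context_roots: List[str]) -> Dict[str, Dict[str, str]]:
    """Map each context root to its counters and active-services URLs."""
    urls = {}
    for root in context_roots:
        base = f"https://{hostname}/{root.strip('/')}"
        urls[root] = {
            "counters": base + COUNTERS_PATH,
            "active_services": base + ACTIVE_SERVICES_PATH,
        }
    return urls


def print_urls(hostname: str, context_roots: List[str]) -> None:
    """Print one line per page to open and capture."""
    for root, pages in diagnostic_urls(hostname, context_roots).items():
        for name, url in pages.items():
            print(f"{root:8} {name:16} {url}")
```

For example, print_urls("elm.example.com", ["jts", "ccm"]) lists four pages to open and capture, where elm.example.com is a placeholder hostname.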
IBM DOORS Next issues
For IBM DOORS Next issues, use the following document: IBM DOORS Next V7.X Performance MustGather
Lifecycle Query Engine and Link Indexer issues
For Lifecycle Query Engine (LQE) and Link Indexer (LDX) issues, use the following document: Mustgather: Investigating LQE/LDX performance
Related Information
Document Information
Modified date:
09 September 2024
UID
swg21607533