Troubleshooting
Problem
Occasionally TADDM may appear unresponsive. You may be unable to log in or start a discovery, or a discovery may not complete even though TADDM appears to be running. When this occurs, thread dumps are often necessary to diagnose the problem. In conjunction with logs and database information, they can be very helpful in determining the root cause.
Symptom
Unable to log on, start a discovery, or complete a discovery, or any other perceived hang of the TADDM server environment.
Diagnosing The Problem
While the problem is occurring, gather thread dumps as noted below.
Resolving The Problem
To gather thread dumps, follow the appropriate procedure for your TADDM Operating System type in the 'Collecting Thread Dumps' section below. Typically you should gather thread dumps on the server where the perceived hang is. If you are running in streaming mode, collect the dumps on both the problematic server and any additional storage servers in the environment. For example, if the problem is that discovery is not completing, gather thread dumps on the discovery server, the primary storage server and any secondary storage servers.
Collect two sets of thread dumps, approximately 5 minutes apart. Once the thread dumps have been created, send the following information to Support:
1. all the javacore* files from dist/external/gigaspaces-4.1/bin and dist/bin dated since the last TADDM start. There may not be any in dist/bin depending on which services are dumped.
2. all the files from the dist/log/* and dist/log/services/* directories on each server. Note that if DEBUG was not set and the problem can be re-created, Support may request a re-create with DEBUG level logs.
3. screenshots indicating the problem
4. if requested by Support also include database documentation such as DB2 snapshots and db2diag.log or Oracle AWR report and alert log.
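Items 1 and 2 above can be gathered with a small script. The following is only a sketch, assuming a Linux server with GNU tar (the `-T` file-list option is a GNU tar feature; on AIX use your tar's equivalent or pax). The variables `DIST` and `OUT` and both function names are illustrative and are not part of TADDM. The sketch does not filter javacore files by date, so remove any that predate the last TADDM start before sending.

```shell
#!/bin/sh
# Sketch: bundle javacore files and logs for Support.
# DIST and OUT are hypothetical names; point DIST at your TADDM dist directory.
DIST="${DIST:-dist}"
OUT="${OUT:-taddm_diag.tar}"

list_diag_files() {
    # Item 1: javacore* files from the two locations named above.
    for d in "$DIST/external/gigaspaces-4.1/bin" "$DIST/bin"; do
        [ -d "$d" ] && find "$d" -name 'javacore*' -type f
    done
    # Item 2: every file under dist/log (this includes dist/log/services).
    [ -d "$DIST/log" ] && find "$DIST/log" -type f
    return 0
}

bundle_diag() {
    # Write the file list next to the archive, then archive exactly that list.
    list_diag_files | sort > "$OUT.list"
    tar -cf "$OUT" -T "$OUT.list"
}
```

After sourcing the script on each server, `bundle_diag` writes the archive named by `OUT`. Screenshots and any database documentation requested by Support still need to be added by hand.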
Collecting Thread Dumps
Linux or AIX:
The following command uses ps to find all the java processes on the TADDM server and takes a thread dump of each one:
ps -e | grep java | grep -v grep | sort | awk '{print $1}' | while read pid; do kill -3 "$pid"; done
Run this twice approximately 5 minutes apart on all applicable TADDM servers.
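The two runs can also be scripted so that the interval is not forgotten. This is a minimal sketch, not an IBM-supplied tool: the function names and the `INTERVAL` variable are made up for illustration, and the process selection is the same `ps -e | grep java` match used in the command above.

```shell
#!/bin/sh
# Sketch: send signal 3 (SIGQUIT) to every java process twice, INTERVAL seconds apart.
# INTERVAL and the function names are illustrative, not part of TADDM.
INTERVAL="${INTERVAL:-300}"   # about 5 minutes

list_java_pids() {
    # Same selection as the one-liner: first column of each ps -e line that mentions java.
    ps -e | grep java | grep -v grep | awk '{print $1}' | sort -n
}

dump_all() {
    # kill -3 asks each JVM for a thread dump; it does not stop the process.
    list_java_pids | while read -r pid; do
        kill -3 "$pid"
    done
}

take_two_sets() {
    dump_all
    sleep "$INTERVAL"
    dump_all
}
```

Source the script as the user that runs TADDM and call `take_two_sets` on each applicable server while the problem is occurring.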
Windows or Solaris:
1) Change directory (cd) to dist/log
2) Create a file called <jvmname>.dmp
- where <jvmname> is one of the following:
Discovery Servers in streaming mode -
- Discover
- DiscoveryService
- StorageService
Domain mode JVM names -
Typically only these three JVMs are needed for discovery or server hang type issues in domain mode:
- Discover
- Topology
- DiscoverAdmin
For example, create dist/log/StorageService.dmp to take a thread dump of StorageService on your Storage servers.
This will cause a thread dump within 10 seconds.
3) To get the second thread dump for the same JVM, rename the dmp file in dist/log after approximately 5 minutes, wait 10 seconds for the trigger to re-arm, then create the file again.
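On Solaris, steps 1 to 3 can be sketched as a shell function. Everything named here (`TADDM_LOG_DIR`, `JVM_NAME`, `INTERVAL`, `REARM`, the `.first` suffix and the function name) is an assumption made for illustration. On Windows the same steps can be performed by hand or with a scripting tool of your choice.

```shell
#!/bin/sh
# Sketch only: trigger two thread dumps of one JVM through the <jvmname>.dmp file.
# All variable names and the ".first" suffix are hypothetical.
TADDM_LOG_DIR="${TADDM_LOG_DIR:-dist/log}"
JVM_NAME="${JVM_NAME:-StorageService}"
INTERVAL="${INTERVAL:-300}"   # about 5 minutes between dumps
REARM="${REARM:-10}"          # pause between renaming and re-creating the file

trigger_two_dumps() {
    dmp="$TADDM_LOG_DIR/$JVM_NAME.dmp"
    : > "$dmp"                    # step 2: create the trigger file
    sleep "$INTERVAL"
    if [ -e "$dmp" ]; then        # step 3: rename the first trigger file
        mv "$dmp" "$dmp.first"
    fi
    sleep "$REARM"
    : > "$dmp"                    # create the file again for the second dump
}
```

For example, `JVM_NAME=Discover trigger_two_dumps` run from the directory that contains dist would request two dumps of the Discover JVM.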
Product Synonym
TADDM;CCMDB
Document Information
Modified date:
23 June 2018
UID
swg21598190