Over time, IBM has developed a host of monitoring and problem determination facilities for its implementations of the Java runtime. With these tools, the IBM support teams, Java application developers, and production operations personnel can diagnose and remedy issues arising in Java deployments.
This article discusses three of the major facilities as they are implemented in the most recent version of the IBM implementation of Java technology: the trace engine, the dump engine, and the DTFJ tooling API, each of which provides benefits to Java developers in determining the root cause of problem scenarios.
Trace information is a powerful tool in software problem determination: it can be used to effectively investigate problem scenarios such as functional errors, race conditions, and performance problems, and, from an educational point of view, is immensely useful in gaining an understanding of program flow.
IBM first introduced its trace engine into its implementation of the Java runtime in SDK 1.2.2 to help the IBM development team diagnose Java Virtual Machine (JVM) defects. The trace facility aimed to provide a low-overhead, high-performance, configurable tracing mechanism for the virtual machine itself. In subsequent releases, significant refinements have been made and enhancements added; the current version of the IBM SDK features a high-performance engine that is capable of capturing trace data for the JVM, the Java Class Libraries (JCL), and any Java application code deployed onto the runtime, without any additional instrumentation being required.
Activating and controlling trace
You can activate and control the trace engine through a variety of mechanisms:
- Through the -Xtrace command-line option
- Using a trace properties file
- Dynamically using Java code through the com.ibm.jvm.Trace API
- Using trace trigger events
- From an external agent using the C-based JVM RAS Interface (JVMRI)
The primary way to control trace is the -Xtrace option on the command line; if the option set is long or complex, you can use the optional trace properties file instead.
The -Xtrace option consists of a series of tokens or token-value pairs that are used to determine whether trace should be written to stderr, internal buffers, or a binary file; whether method trace, JVM trace, or both are enabled; which trace points should be traced; and whether any changes to the trace point selections or dumps are required on trigger events.
When you're using the IBM trace facilities, the first thing that you need to determine is the destination to which the trace output should be directed. Table 1 lists these destinations, along with a brief description of each, and how much of each tracepoint's data is sent to it. For example, print directs full trace data to stderr, while minimal directs a subset of data for each tracepoint to in-memory buffers that can be captured to a file using the output option.
Table 1. Trace destinations
| Keyword | Function |
|---|---|
| minimal | Trace selected tracepoints (identifier and timestamp only) to in-core buffer. Associated trace data is not recorded. |
| maximal | Trace selected tracepoints (identifier and timestamp and associated data) to in-core buffer. |
| count | Count the number of times selected tracepoints are called in the life of the JVM. |
| print | Trace selected tracepoints to stderr with no indentation. |
| iprint | Trace selected tracepoints to stderr with indentation. |
| external | Route selected tracepoints to a JVMRI listener. |
| exception | Trace selected tracepoints to an in-core buffer reserved for exceptions. |
You should set the value of each keyword to the trace points required. For instance:
- -Xtrace:maximal=all: Traces all of the information available from all JVM trace points to internal wrapping buffers.
- -Xtrace:iprint=awt: Traces all of the JVM internal AWT trace points to stderr, with indentations on entry and exit.
- -Xtrace:iprint=mt: Activates method trace and sets the output to stderr with indentations.
Activating method trace with mt does not on its own generate any output; you must supply the method names to trace separately, using the methods keyword. Note that all trace options are covered in detail in Chapter 32, "Tracing Java applications and the JVM," of the relevant IBM Diagnostics Guide (see Resources).
Placing trace data into internal buffers
Using in-storage buffers for trace is very efficient because no explicit I/O is performed until either a problem is detected or an API is used to snap the buffers to a file. Buffers are allocated on a per-thread principle; this precludes contention between threads and prevents trace data for individual threads from being swamped by other threads. For example, if one particular thread is not being dispatched, its trace information is still available when the buffers are dumped or snapped.
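The idea of one wrapping buffer per thread can be illustrated with a small sketch. This is not the IBM implementation: the class name ThreadLocalTraceBuffer, the capacity of four entries, and the use of strings as trace records are all assumptions made for the example. It only shows why a busy thread cannot overwrite the history of a quiet one.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative per-thread wrapping trace buffer; not the IBM implementation. */
public class ThreadLocalTraceBuffer {
    // Hypothetical capacity, kept tiny so that wrapping is easy to observe.
    private static final int CAPACITY = 4;

    // Each thread gets its own buffer, so recording needs no locking.
    private static final ThreadLocal<ThreadLocalTraceBuffer> CURRENT =
            ThreadLocal.withInitial(ThreadLocalTraceBuffer::new);

    private final String[] entries = new String[CAPACITY];
    private long written; // total number of entries recorded by this thread

    /** Records an entry in the calling thread's buffer, overwriting the oldest when full. */
    public static void record(String entry) {
        ThreadLocalTraceBuffer buffer = CURRENT.get();
        buffer.entries[(int) (buffer.written % CAPACITY)] = entry;
        buffer.written++;
    }

    /** Returns the calling thread's surviving entries, oldest first. */
    public static List<String> snapCurrentThread() {
        ThreadLocalTraceBuffer buffer = CURRENT.get();
        List<String> result = new ArrayList<>();
        long first = Math.max(0, buffer.written - CAPACITY);
        for (long i = first; i < buffer.written; i++) {
            result.add(buffer.entries[(int) (i % CAPACITY)]);
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        record("quiet-0");
        Thread noisy = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                record("noisy-" + i); // fills and wraps the noisy thread's own buffer only
            }
        });
        noisy.start();
        noisy.join();
        System.out.println(snapCurrentThread()); // prints [quiet-0]
    }
}
```

The main thread's single entry survives even though another thread recorded a thousand entries, which is the property the per-thread allocation is meant to provide.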
To examine the trace data, you must snap or dump and then format the buffers. The snapping of the buffers occurs automatically in any of the following scenarios:
- An uncaught Java exception occurs
- An operating system signal or exception occurs
- The com.ibm.jvm.Trace.snap() Java API is called
- The JVMRI TraceSnap function is called
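As a rough sketch of the Java API route, the helper below calls com.ibm.jvm.Trace.snap() through reflection, so that the same code still compiles and runs on a runtime that does not ship the IBM class. The helper class TraceSnapHelper and its method names are hypothetical; only the com.ibm.jvm.Trace.snap() call comes from the list above.

```java
import java.lang.reflect.Method;

/** Hypothetical helper that snaps the trace buffers when the IBM Trace API is present. */
public class TraceSnapHelper {

    /** Returns true if com.ibm.jvm.Trace can be loaded in this runtime. */
    public static boolean isTraceApiPresent() {
        try {
            Class.forName("com.ibm.jvm.Trace");
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Calls com.ibm.jvm.Trace.snap() reflectively; returns false if the call could not be made. */
    public static boolean trySnap() {
        try {
            Class<?> trace = Class.forName("com.ibm.jvm.Trace");
            Method snap = trace.getMethod("snap");
            snap.invoke(null); // static method, so no receiver object is needed
            return true;
        } catch (ReflectiveOperationException | LinkageError | RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Trace API present: " + isTraceApiPresent());
        System.out.println("Snap requested: " + trySnap());
    }
}
```

An application might call trySnap() from the code path that detects a suspicious condition, so that the trace history leading up to it is preserved.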
You can write trace data to a file continuously as an extension to the in-storage trace, but instead of one buffer per thread, at least two buffers per thread are allocated. This allocation allows the thread to continue to run while a full trace buffer is written to the filesystem. Depending on trace volume, buffer size, and the bandwidth of the output device, multiple buffers might be allocated to a given thread to keep pace with trace data that is being generated.
To specify that the output of the minimal or maximal trace options should be written to a file, the output keyword should be used, or the exception.output keyword for the exception option:
- -Xtrace:maximal=all,output=trace.out: Traces into a file called trace.out.
- -Xtrace:maximal=all,output={trace.out,5m}: Traces into a file called trace.out and wraps within the file once it has reached 5MB in size.
- -Xtrace:maximal=all,output={trace#.out,5m,5}: Traces sequentially into five files, each 5MB in size, with the # substituted for the file iteration number. In this instance, files named trace0.out through trace4.out are created, each containing the most recent 5MB of trace data. Once all five files are filled, the JVM overwrites trace0.out and once again works its way through to trace4.out. The maximum number of files that can be created with this option is 36, in which case the # character is replaced by 0 through 9 followed by A through Z.
It is also possible to put the following substitutions into the file name:
- %p: The ID for the Java process.
- %d: The current date, in yyyymmdd format.
- %t: The current time, in hhmmss format.
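The naming rules above can be sketched in a few lines of Java. The class TraceFileNames and its expand method are hypothetical and only mimic the behaviour described in the text; in particular, treating hhmmss as a 24-hour time is an assumption.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical illustration of the trace file name substitutions described above. */
public class TraceFileNames {
    // Generation characters in the order given in the text: 0-9, then A-Z (36 in total).
    private static final String GENERATION_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /**
     * Expands '#', %p, %d and %t in a trace file name pattern.
     * writeIndex is the number of files written so far (0 for the first file).
     */
    public static String expand(String pattern, long writeIndex, int generations,
                                long pid, LocalDateTime now) {
        if (generations < 1 || generations > GENERATION_CHARS.length()) {
            throw new IllegalArgumentException("generations must be between 1 and 36");
        }
        // Wrap back to the first generation once all files have been used.
        char generation = GENERATION_CHARS.charAt((int) (writeIndex % generations));
        return pattern
                .replace("#", String.valueOf(generation))
                .replace("%p", Long.toString(pid))
                .replace("%d", now.format(DateTimeFormatter.ofPattern("yyyyMMdd")))
                // Assumption: hhmmss is read as a 24-hour clock.
                .replace("%t", now.format(DateTimeFormatter.ofPattern("HHmmss")));
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        for (long write = 0; write < 7; write++) {
            // With five generations the sixth write reuses the first file name.
            System.out.println(expand("trace#.out", write, 5, 1234L, now));
        }
    }
}
```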
Formatting snap and trace files
The trace formatter is a Java program that runs on any platform and can format a trace file from any platform. The formatter, which is shipped with the IBM SDK in core.jar, also requires a file called TraceFormat.dat, which contains the formatting templates. This file is shipped in jre/lib. You can launch the trace formatter using the following command line:
java com.ibm.jvm.format.TraceFormat input_file [output_file]
Here, com.ibm.jvm.format.TraceFormat is the trace formatter class, input_file is the name of the binary trace file to be formatted, and output_file is the optional output filename. If the output file is not specified, the default output file name is the input file name appended with .fmt.
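If formatting needs to be scripted, the command line above can be assembled and launched from Java. The following is a minimal sketch: the class TraceFormatCommand is hypothetical, and it assumes that the java executable on the PATH belongs to the IBM SDK that provides the formatter class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical wrapper that builds and runs the trace formatter command line. */
public class TraceFormatCommand {

    /** Default output name described above: the input name with ".fmt" appended. */
    public static String defaultOutputName(String inputFile) {
        return inputFile + ".fmt";
    }

    /** Builds the formatter command; outputFile may be null to accept the default name. */
    public static List<String> build(String inputFile, String outputFile) {
        List<String> command = new ArrayList<>();
        command.add("java"); // assumption: resolves to the IBM SDK launcher
        command.add("com.ibm.jvm.format.TraceFormat");
        command.add(inputFile);
        if (outputFile != null) {
            command.add(outputFile);
        }
        return command;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: TraceFormatCommand input_file [output_file]");
            return;
        }
        List<String> command = build(args[0], args.length > 1 ? args[1] : null);
        int exitCode = new ProcessBuilder(command).inheritIO().start().waitFor();
        System.out.println("formatter exit code: " + exitCode);
    }
}
```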
The IBM VM tracing facility includes a flight recorder, which continuously captures data from a subset of key tracepoints to buffers in memory. These buffers are captured in the event of a runtime problem and can be used to aid in problem diagnosis and analysis of the VM's history. VM initialization starts trace with a small set of trace points that are captured to wrap around in-storage buffers. You can use these to carry out first-stage diagnosis of any problems in the Java runtime and also to ensure that a subset of the data provided with the -verbose:gc option is always available. That garbage collection data is also present in any requested Java dump file.
The internal flight recorder uses a command-line option like this one:
-Xtrace:maximal=all{level1},exception=j9mm{gclogger}
If you specify -Xtrace on the command line or bring it in from a properties file, the set of active trace points is cleared.
With Java method trace, you can trace the invocations of methods in terms of method entry and method exit, on a per-thread basis, for any code running on the IBM implementation of the Java runtime. This is done without the need for any manual instrumentation of the Java code and you can use it to trace the JCL, third-party packages, or application code.
The method trace functionality is of particular use in debugging scenarios where race conditions are occurring or when unexpected parameters are being passed from method to method, leading to an exception. It can also be of great use for debugging performance problems, thanks to the presence of microsecond precision in the trace timestamps.
Method trace is invoked on the command line by setting one of the destination keywords (maximal, minimal, print) to mt and adding the methods keyword to select the methods to trace. The methods keyword allows you to select method trace by class, method name, or both. You can use wildcards, along with the not operator, !, allowing for complex selection criteria. For example:
- -Xtrace:print=mt,methods={*.*,!java/lang/*.*}: Write method trace to stderr for all methods and for all classes except those in the java.lang package.
- -Xtrace:maximal=mt,output=trace.out,methods={tests/mytest/*.*}: Write method trace to file for all methods in the tests.mytest package. (Note that this option selects only the methods that are to be traced.)
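To make the selection logic concrete, here is one plausible reading of how a list of method specifications with wildcards and ! could be evaluated. It is only a sketch: the class MethodSpecSketch is hypothetical, and the rules it applies (an exclusion always wins, and * may also match package separators) are assumptions rather than documented behaviour of the trace engine.

```java
import java.util.regex.Pattern;

/** Hypothetical model of method selection with wildcards and the not operator. */
public class MethodSpecSketch {

    /** Returns true if className.methodName is selected by the comma-separated specs. */
    public static boolean selects(String specs, String className, String methodName) {
        String candidate = className + "." + methodName;
        boolean included = false;
        for (String raw : specs.split(",")) {
            String spec = raw.trim();
            boolean negated = spec.startsWith("!");
            if (negated) {
                spec = spec.substring(1);
            }
            if (toPattern(spec).matcher(candidate).matches()) {
                if (negated) {
                    return false; // assumption: an exclusion always wins
                }
                included = true;
            }
        }
        return included;
    }

    /** Translates '*' wildcards into a regular expression, quoting every other character. */
    private static Pattern toPattern(String spec) {
        StringBuilder regex = new StringBuilder();
        for (char c : spec.toCharArray()) {
            regex.append(c == '*' ? ".*" : Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        String specs = "*.*,!java/lang/*.*";
        System.out.println(selects(specs, "java/lang/String", "length")); // prints false
        System.out.println(selects(specs, "tests/mytest/Runner", "run")); // prints true
    }
}
```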
One of the most powerful features of the IBM trace engine is its ability to trigger on trace events, which is vital for creating targeted trace output and reducing the volume of trace data produced. This increases both the performance of the application that is being debugged (as overhead is greatly reduced) and the speed at which the data can be interpreted (as there is less superfluous information).
The trace engine is capable of triggering on any given trace point, either VM internal or Java method, and has a number of actions that it can carry out on the event, outlined in Table 2:
Table 2. Trace engine actions
| Keyword | Function |
|---|---|
| suspend | Suspend all tracing (except for special trace points). |
| resume | Resume all tracing (except for threads suspended by the action of the resumecount property and Trace.suspendThis() calls). |
| suspendthis | Increment the suspend count for this thread. A non-zero suspend count prevents all tracing for the thread. |
| resumethis | Decrement the suspend count for this thread if it is greater than zero. If the suspend count reaches zero, tracing for this thread will be resumed. |
| sysdump | Produce a non-destructive system dump. |
| javadump | Produce a Java dump. |
| heapdump | Produce a heap dump. |
| snap | Snap all active trace buffers to a file in the current working directory. |
You activate trigger trace with the trigger command-line keyword, which names an event and determines which of the actions in Table 2 are taken when that event occurs. Note that the trigger option only controls whether the trace selected by the other trace properties is produced as normal or is blocked.
The following format is used to specify triggers in method events:
-Xtrace:trigger=method{method spec, entry action, exit action, delay count, match count}
On entering any method that matches the method spec indicated, the entry action is executed. When exiting the method, the exit action is performed. If the delay count is specified, the entry and exit actions are only carried out when entry and exit have occurred more times than the delay count. If the match count is specified, the actions are only carried out a maximum of that many times. Consider this example:
-Xtrace:trigger=method{java/lang/StackOverflowError*, sysdump}
This creates a non-destructive system dump on the first (and only the first) instance of a StackOverflowError method being called -- which is the <clinit> method.
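The interaction of the delay count and the match count described above can be modelled with two counters. The sketch below is an interpretation of that description, not the trace engine's code; the class TriggerCounter is hypothetical, and using a negative match count to mean "no limit" is a convention of this example only.

```java
/** Hypothetical model of a trigger's delay count and match count. */
public class TriggerCounter {
    private final int delayCount;
    private final int matchCount; // negative means "no limit" in this sketch
    private int events;           // matching events seen so far
    private int fired;            // times the action has been carried out

    public TriggerCounter(int delayCount, int matchCount) {
        this.delayCount = delayCount;
        this.matchCount = matchCount;
    }

    /** Call once per matching event; returns true if the action should be carried out. */
    public boolean onEvent() {
        events++;
        if (events <= delayCount) {
            return false; // the event has not yet occurred more times than the delay count
        }
        if (matchCount >= 0 && fired >= matchCount) {
            return false; // the action has already run the maximum number of times
        }
        fired++;
        return true;
    }

    public static void main(String[] args) {
        TriggerCounter trigger = new TriggerCounter(2, 3);
        for (int i = 1; i <= 7; i++) {
            System.out.println("event " + i + " -> " + trigger.onEvent());
        }
    }
}
```

With a delay count of 2 and a match count of 3, the action runs on the third, fourth, and fifth matching events and never again.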
You can use the suspend and resume options in conjunction with the resumecount or suspendcount keywords to suspend and resume individual threads or all threads:
-Xtrace:resumecount=1
-Xtrace:trigger=method{HelloWorld.main,resume,suspend}
These options start tracing for all threads once HelloWorld.main() is called and stop tracing when HelloWorld.main() returns. This effectively means that tracing does not occur during Java runtime startup and only produces trace data whilst the HelloWorld application is running.
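For experimenting with these options, any small program with a main method will do. The class below is a stand-in for the HelloWorld application named in the example; its greet method is an invention of this sketch, added so that there is at least one application method call between entry to and exit from main.

```java
/** Minimal stand-in for the HelloWorld application used in the trigger example. */
public class HelloWorld {

    /** A trivial application method, so that method trace has something to record. */
    public static String greet(String name) {
        return "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "World";
        System.out.println(greet(name)); // prints "Hello, World!" when run without arguments
    }
}
```

When passing the trigger option from a shell, it is usually safest to quote it, because characters such as braces and commas can be interpreted by the shell before the JVM sees them.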
What can be achieved with the trace engine?
You can use the trace engine to produce data flow and historical data for any type of problem, either in the Java runtime itself or in the application code being run on it. This historical data, in combination with state data held in dumps generated by the dump engine, provides a powerful means of understanding and debugging many problem scenarios.
The dump engine built into the IBM implementation of the Java runtime provides the majority of the necessary data that IBM support teams use to diagnose problems in the Java runtime itself or in the JCLs that are supplied with the IBM SDKs. The dump engine has default settings to trigger a number of different dump types on specific events, which can be post-processed to determine the source of many problems.
You can also use a number of the dumps and available events to help diagnose problems in Java applications, because the process for diagnosing problems in the JCL is equally applicable to other Java classes.
The IBM dump engine can generate four different dump types (five on z/OS®) and also has the ability to execute a tool in a separate process if requested. Each of the dump types is in itself non-destructive but becomes destructive when caused by a failure event, such as SIGSEGV/GPF. Table 3 outlines the available dump types:
Table 3. Dump types
| Keyword | Dump type | Description |
|---|---|---|
| java | Java dump | A status report that includes environment, lock, thread stack, and class information. |
| heap | Heap dump | A dump containing size and reference details for each Object on the Java heap. |
| snap | Snap dump | The contents of the trace buffers, written to file. |
| system | System dump | A process image in the normal format of the operating system (core file, minidump, or transaction dump). |
| ceedump | CEEDUMP | A z/OS-specific thread stack and register summary file. |
| tool | Tool agent | Executes a predefined tool using the supplied command line. |
The IBM dump engine is capable of producing any or all of the available dump types on each of the event types outlined in Table 4:
Table 4. Event types
| Event | Description |
|---|---|
| gpf | An unexpected crash, such as a SIGSEGV or a SIGILL, has occurred. |
| user | A SIGQUIT signal (Control+Break on Windows, Control+\ on Linux, Control+V on z/OS) has occurred. |
| vmstart | The VM has finished initialization. |
| vmstop | The VM is about to shut down. |
| load | A new class has been loaded. |
| unload | A classloader has been unloaded. |
| throw | A Java exception has been thrown. |
| catch | A Java exception has been caught. |
| uncaught | A Java exception was not handled by the application. |
| thrstart | A new thread has started. |
| thrstop | An old thread has stopped. |
| blocked | A thread is blocked entering a monitor. |
| fullgc | Garbage collection has started. |
These events in themselves provide good flexibility as to when to generate each of the dumps, but that flexibility is also greatly increased by the addition of dump filters. You can add a filter to each of the events to provide more granularity as to when the dump is created. For instance, you can add an exception or error name to the throw, catch, and uncaught events.
Setting dump options and changing defaults
You set all of the dump options using the -Xdump command-line option in conjunction with a series of tokens to set the various options. You can view the default dump options using -Xdump:what, as shown in Listing 1:
Listing 1. -Xdump:what output
C:\home> java -Xdump:what
Registered dump agents
----------------------
dumpFn=doSystemDump                            // Generate a system dump
events=gpf+abort                               // on SIGSEGV and SIGABRT events
filter=
label=C:\home\core.%Y%m%d.%H%M%S.%pid.dmp      // location and name of file
range=1..0                                     // write on every event occurrence
priority=999                                   // write this dump first
request=serial                                 // write in serial
opts=
----------------------
dumpFn=doSnapDump                              // Generate trace snap file
events=gpf+abort                               // on SIGSEGV and SIGABRT events
filter=
label=C:\home\Snap%seq.%Y%m%d.%H%M%S.%pid.trc  // location and name of file
range=1..0                                     // write on every event occurrence
priority=500                                   // write after higher priority dumps
request=serial                                 // write in serial
opts=
----------------------
dumpFn=doSnapDump                              // Generate trace snap file
events=uncaught                                // on uncaught exceptions
filter=java/lang/OutOfMemoryError              // that match OutOfMemoryError
label=C:\home\Snap%seq.%Y%m%d.%H%M%S.%pid.trc  // location and name of file
range=1..4                                     // write only on the first four events
priority=500                                   // write after higher priority dumps
request=serial                                 // write in serial
opts=
----------------------
dumpFn=doHeapDump                              // Generate heap dump file
events=uncaught                                // on uncaught exceptions
filter=java/lang/OutOfMemoryError              // that match OutOfMemoryError
label=C:\home\heapdump.%Y%m%d.%H%M%S.%pid.phd  // location and name of file
range=1..4                                     // write only on the first four events
priority=40                                    // write after higher priority dumps
request=exclusive+prepwalk                     // make sure the heap is walkable
opts=PHD                                       // write in "PHD" format
----------------------
dumpFn=doJavaDump                              // Generate java dump file
events=gpf+user+abort                          // on SIGSEGV, SIGABRT and SIGQUIT events
filter=
label=C:\home\javacore.%Y%m%d.%H%M%S.%pid.txt  // location and name of file
range=1..0                                     // write on every event occurrence
priority=10                                    // write after higher priority dumps
request=exclusive                              // obtain exclusive access to walk the VM
opts=
----------------------
dumpFn=doJavaDump                              // Generate java dump file
events=uncaught                                // on uncaught exceptions
filter=java/lang/OutOfMemoryError              // that match OutOfMemoryError
label=C:\home\javacore.%Y%m%d.%H%M%S.%pid.txt  // location and name of file
range=1..4                                     // write only on the first four events
priority=10                                    // write after higher priority dumps
request=exclusive                              // obtain exclusive access to walk the VM
opts=
----------------------
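The range and priority fields in Listing 1 can be read as simple rules, sketched below in Java. The class DumpAgentSketch is hypothetical and models only what the comments in the listing say: a range of 1..4 writes on the first four matching events, a range ending in 0 is taken here to mean no upper limit, and agents with a higher priority write first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical model of the range and priority fields of a dump agent. */
public class DumpAgentSketch {
    public final String name;
    public final int priority;
    private final int rangeStart;
    private final int rangeEnd; // assumption: 0 means "no upper limit", as in range=1..0
    private int occurrences;

    /** The range argument uses the start..end form shown in Listing 1, for example "1..4". */
    public DumpAgentSketch(String name, int priority, String range) {
        String[] bounds = range.split("\\.\\.");
        this.name = name;
        this.priority = priority;
        this.rangeStart = Integer.parseInt(bounds[0]);
        this.rangeEnd = Integer.parseInt(bounds[1]);
    }

    /** Registers one more matching event and reports whether this agent should write a dump. */
    public boolean onEvent() {
        occurrences++;
        return occurrences >= rangeStart && (rangeEnd == 0 || occurrences <= rangeEnd);
    }

    /** Returns the agents ordered so that the highest priority writes first. */
    public static List<DumpAgentSketch> inWriteOrder(List<DumpAgentSketch> agents) {
        List<DumpAgentSketch> sorted = new ArrayList<>(agents);
        sorted.sort(Comparator.comparingInt((DumpAgentSketch a) -> a.priority).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        DumpAgentSketch heap = new DumpAgentSketch("heap", 40, "1..4");
        for (int i = 1; i <= 5; i++) {
            System.out.println("event " + i + " -> " + heap.onEvent());
        }
    }
}
```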
You can add further dump agents by passing additional -Xdump options. To generate a Java dump on an uncaught socket exception, use the following syntax:
-Xdump:java:events=uncaught,filter=java/net/SocketException
To remove all heap dumps, use this syntax:
-Xdump:heap:none
What can be achieved with the dump engine?
You can use the improved facilities of the dump engine to solve problems in the IBM SDK itself; more importantly, you can utilise them to help solve issues in Java applications. The ability to generate Java dump files and heap dumps on OutOfMemoryErrors makes it possible to diagnose memory leaks and to determine the allocating stack of any large objects. The ability to generate Java dump files on other exceptions makes it possible to use thread stack data in the dump to debug potential race conditions.
In addition, the ability to create non-destructive system dumps on various events means that the DTFJ API can be used to interrogate the state of any part of the Java application at the event point.
Diagnostic Toolkit and Framework for Java
The DTFJ API is a Java-based API with which tool writers can access information about a Java process from a snapshot of the process image (a system dump, for instance) without requiring any knowledge of the various system dump formats or of how Java objects and other Java structures are laid out in memory.
As we've already discussed, the IBM implementation of the Java runtime is capable of creating non-destructive system dumps using either the trace or the dump engines. In addition, you can create non-destructive system dumps using the com.ibm.jvm.Dump.SystemDump() static method. You can achieve the same results by using the available operating system tools -- gencore on AIX® or gcore on Linux, for instance.
The creation of non-destructive system dumps allows tools that use the DTFJ API to gain information from live systems, as well as to post-process those that have failed and shut down.
The DTFJ API is a layered interface that is independent of runtime implementation: the API itself is scoped to cover multiple operating system and hardware platforms, multiple virtual machine implementations, and multiple languages. The base set of extensions that are contained in the DTFJ API are targeted at a Java runtime, and therefore allow tools writers to understand and introspect JVM data structures, with the DTFJ implementation that ships with the IBM implementations of the Java runtime capable of providing the information for data structures held within those runtimes.
The API itself is heavily influenced by the Reflection API, combined with a hierarchical view of the Java process that uses Iterators to provide access from high-level objects to increasingly specific objects. This gives a range of available data objects, from the process Image down to the level of individual JavaField and JavaMethod objects, which can subsequently be introspected to obtain the data that they contained at the point at which the system dump was taken. Figure 1 gives an indication of some of the data objects that the DTFJ API understands and can inspect:
Figure 1. DTFJ data objects overview

Because various operating systems produce different system dump formats and because some necessary changes may occur to the internal Java runtime data structures over time, a utility called JExtract needs to be run against a system dump before it can be accessed using the DTFJ API. This needs to be done by the same Java runtime version that was running when the system dump was produced and should be done on the same system.
The JExtract utility understands both the format of the system dump and the Java runtime's internal data structures. It uses that knowledge to create an XML description file that provides indexes into the system dump file that indicate the location of various data structures. DTFJ then uses the combination of the system dump and the JExtract-produced XML file to provide the information requested by the tools that use the DTFJ API.
Although JExtract is a post-processor for the system dump, you can use the dump engine's tool option to invoke JExtract automatically after the system dump has been created. For instance, to run JExtract against system dumps when an uncaught OutOfMemoryError occurs, use the following syntax:
-Xdump:tool:events=uncaught,filter=OutOfMemoryError,exec="jextract .\core.%Y%m%d.*.%pid.dmp"
What can be achieved with the DTFJ API?
You can use the DTFJ API to access the huge array of information that is present in a system dump. This includes information about the platform on which the process is running: physical memory, CPU number and type, libraries, command line, thread stacks, and registers. It can also provide information about the state of the Java runtime and the Java application it is running, including class loaders, threads, monitors, heaps, objects, methods, compiled code, and fields and their values.
Because such a vast range of data artifacts is available, the DTFJ API provides the flexibility to create any number of tools. At the simpler end, it allows tools to be created that, for example, interrogate the size and contents of various caches, so that the amount of real Java heap memory required to hold those caches can be sized more effectively.
The first stage of creating a DTFJ-based tool is to obtain an Image object that relates to the system dump and then to obtain an ImageProcess object from an ImageAddressSpace. This is illustrated in Listing 2:
Listing 2. Using DTFJ to obtain the current process
Image theImage = new ImageFactory().getImage(new File(fileName));
ImageAddressSpace currentAddressSpace =
    (ImageAddressSpace) theImage.getAddressSpaces().next();
ImageProcess currentProcess = currentAddressSpace.getCurrentProcess();
On most platforms, there is only a single ImageAddressSpace and ImageProcess object in the image; however, mainframe operating systems may have multiple instances of each.
Once an ImageProcess object is obtained, it is possible to access the native threads, as shown in Listing 3:
Listing 3. Obtaining the threads and stack frames
Iterator vmThreads = currentProcess.getThreads();
ImageThread vmThread = (ImageThread) vmThreads.next();
Iterator vmStackFrames = vmThread.getStackFrames();
You can also access the various ImageModule (library) objects, as in Listing 4:
Listing 4. Obtaining the loaded libraries
Iterator loadedModules = currentProcess.getLibraries();
From the ImageProcess object, you can obtain the JavaRuntime, as in Listing 5:
Listing 5. Obtaining the Java runtime
JavaRuntime currentRuntime = (JavaRuntime) currentProcess.getRuntimes().next();
This opens up access to all of the Java structures.
With the JavaRuntime object, it becomes possible to begin writing tools to interrogate any Java application that is running. The simple example in Listing 6 shows how to iterate over all of the objects on the Java heap and count the number of each type of object:
Listing 6. Counting the objects of each type in the dump
...
Map<String, Long> objectCountMap = new HashMap<String, Long>();
Iterator allHeaps = currentRuntime.getHeaps();
/* Iterate over each of the Java heaps and call countObjects on them to */
/* populate the object type count HashMap */
while (allHeaps.hasNext()) {
    countObjects((JavaHeap) allHeaps.next(), objectCountMap);
}
/* Print out each of the entries in the HashMap of object types */
for (String objectClassName : objectCountMap.keySet()) {
    System.out.println(objectClassName +
            " occurs " + objectCountMap.get(objectClassName));
}
private static void countObjects(JavaHeap currentHeap,
        Map<String, Long> objectCountMap) throws Exception {
    /* Iterate over each of the Objects on the supplied Java heap */
    Iterator currentHeapObjects = currentHeap.getObjects();
    while (currentHeapObjects.hasNext()) {
        Object next = currentHeapObjects.next();
        /* DTFJ iterators can also return CorruptData entries; skip those */
        if (!(next instanceof JavaObject)) {
            continue;
        }
        JavaObject currentObject = (JavaObject) next;
        /* Get the name of the class from the object */
        String objectClassName = currentObject.getJavaClass().getName();
        long objectCount = 0;
        /* Add the class name to the HashMap, or increase the count if it */
        /* already exists */
        if (objectCountMap.containsKey(objectClassName)) {
            objectCount = objectCountMap.get(objectClassName);
        }
        objectCountMap.put(objectClassName, objectCount + 1);
    }
}
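A histogram like the one built in Listing 6 is usually easier to read when it is sorted. The following sketch uses only standard Java collections and works on any Map of class names to counts; the class HistogramReport and its topN method are hypothetical additions, not part of the DTFJ API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that reports the most frequent entries of an object count map. */
public class HistogramReport {

    /** Returns up to n entries, largest count first, with ties broken by class name. */
    public static List<Map.Entry<String, Long>> topN(Map<String, Long> objectCountMap, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(objectCountMap.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Long>comparingByKey()));
        int limit = Math.min(Math.max(n, 0), entries.size());
        return new ArrayList<>(entries.subList(0, limit));
    }

    public static void main(String[] args) {
        // Sample counts, invented for the demonstration.
        Map<String, Long> objectCountMap = new HashMap<>();
        objectCountMap.put("java/lang/String", 5L);
        objectCountMap.put("java/lang/Object", 3L);
        objectCountMap.put("java/util/HashMap", 3L);
        for (Map.Entry<String, Long> entry : topN(objectCountMap, 2)) {
            System.out.println(entry.getKey() + " occurs " + entry.getValue());
        }
    }
}
```

Passing the objectCountMap produced by Listing 6 to topN gives a quick view of which classes dominate the heap, which is often the first question when investigating a memory leak.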
All of the functionality we've discussed here will help you when you're trying to diagnose and solve development and production issues in a Java deployment. Using these three major facilities in conjunction to produce historical trace data and detailed state data, along with an easy-to-use API for accessing that state data, results in a powerful and flexible way of interrogating Java applications to solve problem scenarios.
This article concludes our tour of the major improvements and changes IBM has made to its implementation of the Java virtual machine. In particular, we've covered the areas of memory management, class sharing, and application monitoring, and have described how to utilise these functionalities to improve the performance and availability of Java applications. More information on these enhancements and a number of others is available in the IBM Diagnostics Guide, whilst feedback and discussion are available through the IBM Runtimes and SDKs forum (see Resources for links to both).
In the final article in this series, the Java security development team will examine IBM's security enhancements to the Java platform. The article will introduce each of the security providers and review the functionalities they provide.
Learn
- Java 5.0 feature list: The complete rundown of features from Sun.
- Clarifications and Amendments to the Java Virtual Machine Specification: Read about the changes to the JVM specification for the Java 5.0 platform.
- JVM user guide: Essential information on using the IBM SDK.
- IBM Diagnostics Guide: A reference book for everything related to the IBM Developer Kit and Runtime Environment, Java 2 Technology Edition, Version 5.0. (In PDF format.)
- The developerWorks Java zone: Browse all Java content.
Get products and technologies
- IBM implementations of Java technology: Download the SDKs for AIX, Linux, and z/OS, among other IBM developer kits for Java technology, from this page.
- IBM Development Package for Eclipse: Develop, test, and run your Java applications with this ready-to-run Java development environment.
- IBM Development Package for Apache Harmony: An execution environment designed to run code contributed to the Apache Harmony project.
Discuss
- IBM SDKs and Runtimes: Visit this discussion forum, moderated by series lead Chris Bailey, for questions related to the IBM Developer Kits for the Java Platform.

Chris Bailey joined the IBM Java Technology Centre as a graduate of Southampton University in 2000. He works extensively with users to solve issues raised against the IBM ports of Java technology and Java platform-based products. Chris moderates the developerWorks forum entitled "IBM Java Runtimes and SDKs" and is currently focused on improving the quality of information and tooling available to users of IBM ports of the Java platform.
Simon Rowland joined the IBM Java Technology Centre as a graduate from Leeds University in 2001. He majored in philosophy but decided that the more concrete life of a programmer would be more fulfilling than a life spent contemplating unanswerable questions! He has spent his time at IBM working on various projects and platforms and currently works in the development team focusing on the trace and dump functionality for the IBM implementations of Java technology. In his spare time, Simon enjoys running and cycling and has some of his best ideas while falling off a mountain bike.