The Java language has come to be predominant in software development, and thus the reliability of the Java virtual machine (VM) has become a very important issue. The VM is typically a reliable piece of software, but of course failures do occur during execution for a variety of reasons. A small number of these problems are caused by errors in the VM itself; however, in the majority of cases, they are caused by errors or misconfigurations in the software stack above the VM (in IBM® WebSphere® Application Server, for instance) or in the application itself.
The software stack for a typical project has increased in complexity as information technology has matured, which has led to increasing difficulties for developers trying to determine the causes of problems. In such a complex environment, you may be faced with an overwhelming excess of information with which to diagnose a fault. In a production environment, there may well be many gigabytes of heap, hundreds of threads, thousands of classloaders, tens of thousands of classes, and a huge number of objects.
The IBM Diagnostic and Monitoring Tools for Java - Dump Analyzer (referred to hereafter as the Dump Analyzer) is an extensible framework that seeks a way out of this dilemma. The Dump Analyzer is available to all internal IBM users and external customers to investigate problems using the IBM Developer Kit for the Java Platform (IBM SDK). It uses analyzers to interrogate a formatted system dump (each analyzer asking the dump a specific question) and links the results together with a script to produce a concise report of the analysis. In its first two releases, the Dump Analyzer reports one of the four following outcomes:
- Out of memory
- Deadlock detected
- VM terminated due to signal (a result of internal or middleware/Java application error)
- Further investigation is required
Each of the first three can be mapped to a VM problem type described in the next section of this article.
No background knowledge is required to read the rest of this article. You'll get step-by-step instructions for analyzing a system dump using the Dump Analyzer, along with a high-level overview of the background to the tool and its architecture. After reading this article, you should have a good understanding of the circumstances under which you may want to use the Dump Analyzer, as well as some understanding of its infrastructure.
A high-level view of VM problem types
There are many ways in which a VM can fail while executing, and each type of failure requires a different approach to diagnosis. Before you look at the workings of the Dump Analyzer in detail, it is worth examining these different types of problems and the analysis required to solve them.
Out of memory problems
A VM can fail because it has run out of memory — either the Java heap memory or the native memory that is used by the VM to hold thread stacks, class information, JIT'ed code, graphic elements, and other artifacts for interfacing with the underlying operating system.
It can be extremely difficult to diagnose such a problem because the memory allocation that caused the failure is unlikely to itself be the ultimate culprit; some large collection had probably been growing inexorably until the VM finally reached the limit of available heap space. It is normally necessary to examine the contents of the heap and to compare snapshots of the heap taken at various times so as to identify collections that have grown rapidly.
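The snapshot comparison described above can be sketched in a few lines of Java. This is only an illustration, assuming each heap snapshot has already been reduced to a histogram of class name to instance count; the `HeapGrowth` class and its `fastestGrowing` method are hypothetical names and are not part of the Dump Analyzer or DTFJ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares two heap histograms (class name -> instance count).
class HeapGrowth {

    /** Returns the names of classes whose instance count grew, largest growth first. */
    static List<String> fastestGrowing(Map<String, Long> earlier, Map<String, Long> later) {
        Map<String, Long> growth = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : later.entrySet()) {
            // A class absent from the earlier snapshot counts as zero instances.
            long before = earlier.getOrDefault(entry.getKey(), 0L);
            long delta = entry.getValue() - before;
            if (delta > 0) {
                growth.put(entry.getKey(), delta);
            }
        }
        List<String> names = new ArrayList<>(growth.keySet());
        names.sort(Comparator.comparing((String name) -> growth.get(name)).reversed());
        return names;
    }

    public static void main(String[] args) {
        // Made-up counts, purely for illustration.
        Map<String, Long> first = Map.of("java.util.HashMap$Node", 1_000L, "java.lang.String", 5_000L);
        Map<String, Long> second = Map.of("java.util.HashMap$Node", 900_000L, "java.lang.String", 5_200L);
        System.out.println(fastestGrowing(first, second));
    }
}
```

Classes near the front of the returned list are the first candidates to inspect, although rapid growth alone does not prove a leak.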
Deadlocks
A deadlock is a condition in which two or more threads are each waiting for another to release a resource. A thread owning a resource (such as a monitor) is unable to take ownership of another resource because a second thread owns it and is simultaneously attempting to gain ownership of the resource owned by the first thread. These faults often manifest as hangs or performance problems. They are relatively easy to diagnose by examining the states of the threads and the resources that they own.
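Examining thread states and resource ownership amounts to looking for a cycle in a "waits-for" relationship. The following minimal sketch assumes the ownership and waiting information has already been extracted into two maps; the `DeadlockCheck` class is a hypothetical illustration, not the analyzer shipped with the tool.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of cycle detection over threads and monitors.
class DeadlockCheck {

    /**
     * owners:  monitor name -> name of the thread that owns it
     * waiting: thread name  -> name of the monitor it is blocked on
     * Returns the threads that form a cycle, or an empty list if there is none.
     */
    static List<String> findCycle(Map<String, String> owners, Map<String, String> waiting) {
        for (String start : waiting.keySet()) {
            List<String> path = new ArrayList<>();
            Set<String> seen = new HashSet<>();
            String current = start;
            // Follow "waits for the owner of" links until the chain ends or repeats.
            while (current != null && seen.add(current)) {
                path.add(current);
                String wanted = waiting.get(current);
                current = (wanted == null) ? null : owners.get(wanted);
            }
            if (current != null && current.equals(start)) {
                return path; // the chain led back to where it started
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        Map<String, String> owners = new LinkedHashMap<>();
        owners.put("monitorA", "Thread-1");
        owners.put("monitorB", "Thread-2");
        Map<String, String> waiting = new LinkedHashMap<>();
        waiting.put("Thread-1", "monitorB");
        waiting.put("Thread-2", "monitorA");
        System.out.println("Deadlocked threads: " + findCycle(owners, waiting));
    }
}
```

Because each blocked thread waits on at most one monitor, following the chain from every waiting thread is enough to find any cycle.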
Internal errors
Internal errors can be caused by a variety of problems:
- Native code may be coded incorrectly or may attempt to access an object through an invalid reference (such as a stale local reference).
- The garbage collector may incorrectly reclaim some storage that, when referenced, appears to contain pointers to uninitialized memory.
- The JIT compiler may produce incorrect code that either references or attempts to branch to an invalid location.
Errors in the Java application or middleware
The Dump Analyzer currently deals with errors that occur or are detected at the level of the VM itself, but eventually the same set of tools will be able to diagnose various errors and incorrect behavior in the Java application or middleware that operates in the VM. There are various potential causes of these errors; they typically result from defects in the application or middleware code or misuse or misconfiguration of JVM options. They are generally diagnosed by examining the state of various data structures in the application or middleware to determine if some aspect of that state is incorrect.
Problem diagnosis today
Without a tool like the Dump Analyzer, you would usually begin the process of problem diagnosis by examining the artifacts produced by the VM at the point of failure. Typically, these are:
- A dump of the process space (a system dump or core file)
- A dump of the Java heap (a heapdump)
- A snapshot of the Java process (a Javacore file)
- A trace file showing some of the execution history
Generally, you would examine these artifacts individually, using specialized programs for each of the specific formats. The process of problem determination then essentially consists of a manual examination of the information available. As the volume of data increases, this process becomes more and more time consuming, and an increasingly specialized job. As a result, customers are often unwilling to undertake the analysis themselves and rely on their VM or middleware vendor to do it, but the majority of problems reported are eventually diagnosed as application, configuration, or environment problems that do not require any changes to the code in the VM or middleware itself. In an ideal world, the diagnostic capability available to the customer would ensure that only genuine defects that require such code changes are reported to the VM or middleware vendor. Other problems would be diagnosed by an automated process using the appropriate artifacts produced by the VM.
Overview of the Dump Analyzer
The Dump Analyzer is a tool based on the Diagnostic Tooling Framework for Java (DTFJ; you'll learn more about this later in this article) designed to analyze system dumps and look for various kinds of problems. The tool is made up of small analysis modules that look at specific dump data and determine if a particular problem (a deadlock, for instance) is present. This design can easily accommodate the addition of new capabilities and can be tailored to find specific problems.
The tool operates at two levels:
- Each specialized analysis module attempts to diagnose one particular type of problem and produces a concise description of the problem that has been found.
- When a diagnosis cannot be made, each analysis module can produce a more lengthy report about one particular aspect of the state of the system. This report can be used by a troubleshooting expert, possibly in conjunction with other information, to diagnose the problem.
To add extra flexibility, a simple scripting language is used to control the flow of analysis. Our team intends to develop this facility further over time by providing many different scripts.
Here's the tool's analysis flow:
- The tool loads the dump data selected by the user to create an image for further analysis.
- The user chooses one or more analysis modules to run against the image; if the user does not choose a specific analyzer, the default script is run.
- The analysis modules run.
- Each module either returns information that controls the flow of further analysis or generates information for a report.
- Once all of the modules have completed their runs, the report is formatted into an HTML or text document.
As noted, if the user does not request specific analysis modules, the tool runs the default script (general.sml), which runs a set of analyzers that check for several common types of problems. If none of these problems are detected, the script invokes a default report that outlines some general information about the state of the VM when the dump was produced.
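The flow described above can be pictured with a small Java sketch. Everything here is an assumption made for illustration: the `Analyzer` interface, the `run` method, and the use of a plain map to stand in for the loaded image are hypothetical and do not reflect the tool's real classes or the DTFJ API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of the analysis flow; not the Dump Analyzer's actual code.
class AnalysisFlow {

    /** Inspects the image, appends findings to the report, and says whether a problem was found. */
    interface Analyzer {
        boolean analyze(Map<String, Object> image, List<String> report);
    }

    /** Runs every check; falls back to the default report when no check finds a problem. */
    static List<String> run(Map<String, Object> image, List<Analyzer> checks, Analyzer defaultReport) {
        List<String> report = new ArrayList<>();
        boolean problemFound = false;
        for (Analyzer check : checks) {
            problemFound |= check.analyze(image, report);
        }
        if (!problemFound) {
            defaultReport.analyze(image, report);
        }
        return report;
    }

    public static void main(String[] args) {
        Analyzer deadlock = (image, report) -> {
            boolean found = Boolean.TRUE.equals(image.get("deadlocked"));
            if (found) {
                report.add("Deadlock detected");
            }
            return found;
        };
        Analyzer summary = (image, report) -> {
            report.add("Further investigation is required");
            return false;
        };
        System.out.println(run(Map.of("deadlocked", true), List.of(deadlock), summary));
    }
}
```

In the real tool the ordering and branching are controlled by a script such as general.sml rather than by a hard-coded loop.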
Later in this article, you'll see an example of the Dump Analyzer in use and get an overview of some of the nondefault analysis modules that could be selected.
Setup required to use the Dump Analyzer
All the Dump Analyzer needs to run is a formatted system dump. A system dump is generated by default when the VM crashes; however, the VM can be configured to create such a dump under other failing circumstances or at the user's request (see the Diagnostic Guides links in Resources for more information).
To format a system dump, you need to run the jextract tool against it. Using the same VM on the same machine that produced the dump, run jextract and pass it the name of the system dump file. With a 1.4.2-level VM, this command produces a .sdff file; on VMs at Version 5.0 or above, it produces a .dmp.zip file.
Note also that on different platforms, there may be various options that control the format of the dump produced at the operating system level. In particular, these can cause the system dump to be truncated, which may prevent the Dump Analyzer from producing a useful diagnosis. The most common error (on UNIX® systems) is to forget to set the core file size limit (ulimit) to unlimited, but there are other important options on other platforms. To avoid this type of problem, refer to the information in the IBM Diagnostics Guides or search the IBM Software Support Web site (see Resources for links to both) for platform-specific technotes with keywords such as "truncated core."
Using the Dump Analyzer within the IBM Support Assistant
The main release vehicle for the Dump Analyzer is the IBM Support Assistant (referred to hereafter as ISA). ISA is available to all internal IBM users and external customers (see the Resources section for a download link).
You can install the Dump Analyzer with ISA as follows:
- Ensure that Version 3 of ISA is installed.
- To install the Dump Analyzer, you must install a product plug-in for which it is relevant — the IBM Developer Kit for Java, for example. (See the Resources section for instructions.)
- Restart the ISA client. Now you can install a tool plug-in.
- Go to the Updater service. There are two ways to do this:
- Click the Updater icon on the Welcome page.
- Click the Updater link on the menu bar.
Once you've installed the Dump Analyzer, you can launch it from within ISA:
- Restart ISA.
- Select Tools.
- Select a product for which the Dump Analyzer is available — the IBM Developer Kit for Java, for example.
- Click IBM Diagnostic and Monitoring Tools for Java - Dump Analyzer to launch the tool. Your screen should look like Figure 1:
Figure 1. The Dump Analyzer within ISA
Here's how you would analyze a formatted system dump:
- Enter the fully qualified name of a formatted system dump to be analyzed.
- Click Estimate Time to receive a rough estimate of how long the analysis will take.
- Click Analyze. The results will appear in the window when complete.
Figure 2 shows an example of the kind of observation summary that the Dump Analyzer produces:
Figure 2. Example observation summary
Clicking Analyze Another returns to the screen shown in Figure 1, with the previously entered dump file name still present in the first text box.
Selecting an analyzer module
The field labeled Optional Parameters in the invocation screens in Figures 1 and 2 controls the set of analyzers that are to be executed, as well as other runtime options. Normally, you should leave that field blank; this will cause the default analysis script, general.sml, to be executed. This script checks for the most common types of problems. However, if you already know the particular type of problem that you are investigating or if you need to work on a problem that is not integrated into the default script, you can explicitly specify one or more analyzers to be invoked. These analyzers can be invoked with the name of specific script files or the class name of specific analyzer modules. Typing -help in the Optional Parameters field will list further runtime options.
In the first release of the tool, only a very small number of experimental analyzers have been provided beyond the default script. These include:
- DefaultDumpReport (class name: com.ibm.dtfj.analyzer.deal.basic.DefaultDumpReport): This analyzer produces a fairly detailed report on all the main aspects of the state of the VM, somewhat similar to what can be found in a Javacore file (but with some additional DTFJ-specific information).
- ListZipJars (class name: com.ibm.dtfj.analyzer.deal.extended.ListZipJars): This experimental analyzer attempts to discover all the zip and JAR files that are currently open within the VM, which may provide insight into any custom libraries being used by the application or the middleware.
- SystemProperties (class name: com.ibm.dtfj.analyzer.deal.extended.SystemProperties): This experimental analyzer scans the VM and prints the current value of every Java system property defined in that VM.
- WASBasicInfo (class name: com.ibm.dtfj.analyzer.deal.was.WASBasicInfo): This is a very preliminary and experimental version of an analyzer that demonstrates the use of this tool to examine the state of a WebSphere Application Server runtime executing inside the VM.
These other analyzers are currently provided mostly to illustrate the flexibility of the tool. Future releases will include many additional specialized analyzers along with documentation. In addition, in the fourth article in this series, you will learn how to write your own analyzers to supplement those shipped with the tool itself.
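To give a feel for the kind of output the SystemProperties analyzer is described as producing, the sketch below lists the system properties of a live VM using the standard `System.getProperties()` call. It is only an analogy: the analyzer recovers this information from a dump rather than from a running process, and the `LiveSystemProperties` class is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical illustration: lists system properties of the running VM, sorted by name.
class LiveSystemProperties {

    /** Returns one "name=value" line per property, in alphabetical order of name. */
    static List<String> describe(Properties props) {
        List<String> lines = new ArrayList<>();
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            lines.add(name + "=" + props.getProperty(name));
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(System.getProperties()).forEach(System.out::println);
    }
}
```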
Using the Dump Analyzer from the command line
In some circumstances, you may find it useful to be able to run the Dump Analyzer from the command line (if the analysis needs to be incorporated into some existing problem-handling workflow, for instance). The easiest way to obtain the Dump Analyzer is still through ISA, as described in the previous section.
To run the Dump Analyzer alone, you need four JAR files and a script file. These are:
- dumpAnalyzer.jar (found in installDir/plugins/com.ibm.java.diagnostics.dbda.isa_(version number)/WEB-INF/lib)
- dtfj-interface.jar (found in installDir/plugins/com.ibm.java.diagnostics.dbda.isa_(version number)/WEB-INF/lib/j9)
- dtfj.jar for Java 5.0 and above (found in installDir/plugins/com.ibm.java.diagnostics.dbda.isa_(version number)/WEB-INF/lib/j9)
- dtfj.jar for Java 1.4.2 (found in installDir/plugins/com.ibm.java.diagnostics.dbda.isa_(version number)/WEB-INF/lib/sov)
- general.sml (found in installDir/plugins/com.ibm.java.diagnostics.dbda.isa_(version number))
In all of these file paths, installDir denotes the ISA installation directory; by default, this is C:\Program Files\IBM\IBM Support Assistant v3 on Microsoft Windows or /opt/IBM/IBM Support Assistant v3 on Linux™. You can copy these files elsewhere, or you can run the Dump Analyzer directly from the installDir/plugins/com.ibm.java.diagnostics.dbda.isa_(version number) directory. Although ISA is only available on Windows and Linux, you can run the Dump Analyzer from the command line on any platform.
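Here is the command to run the Dump Analyzer from the default directory on Windows. CP and BCP are environment variables that must already hold the class path and the boot class path entries: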
java -cp %CP% -Xbootclasspath/p:%BCP% com.ibm.dtfj.analyzer.base.DumpAnalyzer (dumpName) (options)
And here is the equivalent command for Linux:
java -cp $CP -Xbootclasspath/p:$BCP com.ibm.dtfj.analyzer.base.DumpAnalyzer (dumpName) (options)
dumpName is the fully qualified name of the dump to be analyzed, and options are the runtime parameters that can be used to configure the Dump Analyzer. Running with the -help option prints a list of all the available options.
Figure 3 shows a snapshot of some output from the Dump Analyzer running on the command line:
Figure 3. Example Dump Analyzer command-line output
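If the command-line invocation needs to be driven from an existing Java-based workflow, the argument list can be assembled programmatically. The sketch below mirrors the command shown earlier; the `DumpAnalyzerLauncher` class is hypothetical, and the JAR names and dump path passed in `main` are placeholders, since which files belong on the class path and which on the boot class path is an assumption here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that assembles the Dump Analyzer command line.
class DumpAnalyzerLauncher {

    /** Builds: java -cp CP -Xbootclasspath/p:BCP com.ibm.dtfj.analyzer.base.DumpAnalyzer dumpName options... */
    static List<String> buildCommand(String classPath, String bootClassPath,
                                     String dumpName, String... options) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(classPath);
        command.add("-Xbootclasspath/p:" + bootClassPath);
        command.add("com.ibm.dtfj.analyzer.base.DumpAnalyzer");
        command.add(dumpName);
        command.addAll(Arrays.asList(options));
        return command;
    }

    public static void main(String[] args) {
        // Placeholder values; substitute the real JAR locations and dump name.
        List<String> command = buildCommand("dumpAnalyzer.jar", "dtfj-interface.jar",
                "/tmp/example.dmp.zip", "-help");
        System.out.println(String.join(" ", command));
        // To launch it, something like the following could be used:
        // new ProcessBuilder(command).inheritIO().start().waitFor();
    }
}
```

Keeping the arguments in a list rather than a single string avoids quoting problems when a path contains spaces, as the default Windows installation directory does.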
For more information about DTFJ, see the Resources section.
At the time of writing, the initial release of the Dump Analyzer is available. Our team intends to continue to make enhancements and updates on a regular basis. In particular, we will focus on two areas:
- We will continue to enhance the user interface of the tool itself, adding panels to control the dumps and analyzers that run, improving the output format, possibly adding an interactive mode, and more.
- We will increase the number of analyzers and scripts to cover a wider variety of problems.
The area of building new analyzers is particularly exciting. With this DTFJ dump analysis technology, you now have a fairly accessible mechanism for examining low-level VM entities such as threads and monitors so as to diagnose out-of-memory errors, crashes, deadlocks, and the like; in addition, you can also examine the content of any data structure present in the VM. In particular, you can examine the contents of various data structures that are part of the implementation of the application or middleware running inside the VM. We intend to start building a collection of analyzers that exploit this information to help diagnose a variety of problems in WebSphere Application Server and possibly other stack products.
Our aim is to make the tool as useful as possible, so feedback — both about the tool itself and about new analyzers you'd like to see added — is extremely welcome. Feedback can be provided through this article or through ISA.
Coming up in this series
The next article in this series introduces the IBM Diagnostic and Monitoring Tools for Java - Garbage Collection and Memory Visualizer. This tool can help you investigate memory-based Java performance problems by analyzing verbose garbage collection logs. You can use the tool to look at memory usage patterns, determine if there is a memory leak, or tune the garbage collection configuration to improve performance.
You'll revisit the Dump Analyzer in the fourth article in this series. In that article, you'll get a much more in-depth look at the tool's extensibility and learn how to build your own analysis modules for it.
Resources
- Java diagnostics, IBM style: Read each installment in this series.
- Diagnostics Guide 1.4.2: Diagnosis information on the features of IBM's implementation of Version 1.4.2 of the Java platform.
- Diagnostics Guide 5.0: Diagnosis information on the features of IBM's implementation of Version 5.0 of the Java platform.
- IBM Software Support Web site: Download Runtimes for Java Technology.
- "Java technology, IBM style: Monitoring and problem determination" (Chris Bailey and Simon Rowland, developerWorks, June 2006): Offers some good insight into the problem analysis process and introduces the DTFJ.
Get products and technologies
- IBM Support Assistant: Download the ISA and start tuning your Java application today. You can also find out about installing a product plug-in in the ISA.
- IBM Developer Kits for the Java Platform: Download the SDKs for AIX®, Linux, and z/OS®, among other IBM developer kits for Java technology, from this page.
- IBM Java Runtimes and SDKs: Visit this discussion forum for questions related to the IBM Developer Kits for the Java Platform.
- Java Technology Community: Interact with industry experts as J2EE architects, developers and programmers share their knowledge and experiences on the technology.