Java technology, IBM Style: Monitoring and problem determination

Use IBM's diagnostic tools to produce better apps

The IBM implementation of Java™ technology, Version 5.0 contains a number of useful tools that help you diagnose and solve Java development problems. In this installment in our Java technology, IBM Style series, you learn about the information available from the IBM trace and dump engines. You also get an introduction to the Diagnostic Toolkit and Framework for Java (DTFJ) API, with which you can write code that queries and analyzes diagnostic data.


Chris Bailey, Advisory Software Engineer, IBM

Chris Bailey joined the IBM Java Technology Centre as a graduate of Southampton University in 2000. He works extensively with users to solve issues raised against the IBM ports of Java technology and Java platform-based products. Chris moderates the developerWorks forum entitled "IBM Java Runtimes and SDKs" and is currently focused on improving the quality of information and tooling available to users of IBM ports of the Java platform.


Simon Rowland, Software Engineer, IBM Java Technology Centre

Simon Rowland joined the IBM Java Technology Centre as a graduate from Leeds University in 2001. He majored in philosophy but decided that the more concrete life of a programmer would be more fulfilling than a life spent contemplating unanswerable questions! He has spent his time at IBM working on various projects and platforms and currently works in the development team focusing on the trace and dump functionality for the IBM implementations of Java technology. In his spare time, Simon enjoys running and cycling and has some of his best ideas while falling off a mountain bike.

13 June 2006


Over time, IBM has developed a host of monitoring and problem determination facilities for its implementations of the Java runtime. With these tools, the IBM support teams, Java application developers, and production operations personnel can diagnose and remedy issues arising in Java deployments.

This article discusses three of the major facilities as they are implemented in the most recent version of the IBM implementation of Java technology: the trace engine, the dump engine, and the DTFJ tooling API, each of which provides benefits to Java developers in determining the root cause of problem scenarios.

The trace engine

Trace information is a powerful tool in software problem determination: it can be used to effectively investigate problem scenarios such as functional errors, race conditions, and performance problems, and, from an educational point of view, is immensely useful in gaining an understanding of program flow.

IBM first introduced its trace engine into its implementation of the Java runtime in SDK 1.2.2 to help the IBM development team diagnose Java Virtual Machine (JVM) defects. The trace facility aimed to provide a low-overhead, high-performance, configurable tracing mechanism for the virtual machine itself. In subsequent releases, significant refinements have been made and enhancements added; the current version of the IBM SDK features a high-performance engine that is capable of capturing trace data for the JVM, the Java Class Libraries (JCL), and any Java application code deployed onto the runtime, without any additional instrumentation being required.

About the series

The Java technology, IBM style series takes a look at the latest releases of the IBM implementations of the Java platform. You'll learn how IBM has implemented some of the advances built into version 5.0 of the Java platform, and find out how to use some of the value-added features built into IBM's new releases.

Please contact the authors individually with comments or questions about their articles. To comment on the series as a whole, you may contact series lead Chris Bailey. For more on the concepts discussed here and links where you can download the latest IBM releases, see the Resources section.

Activating and controlling trace

You can activate and control the trace engine through a variety of mechanisms:

  • Through the -Xtrace command-line option
  • Using a trace properties file
  • Dynamically using Java code through the API
  • Using trace trigger events
  • From an external agent using the C-based JVM RAS Interface (JVMRI)

The primary way to control trace is the -Xtrace option on the command line; if the option set is long or complex, you can use the optional trace properties file instead.

The -Xtrace option consists of a series of tokens or token-value pairs that are used to determine whether trace should be written to stderr, internal buffers, or a binary file; whether method trace, JVM trace, or both are enabled; which trace points should be traced; and whether any changes to the trace point selections or dumps are required on trigger events.

Basics of activating trace

When you're using the IBM trace facilities, the first thing that you need to determine is the destination to which the trace output should be directed. Table 1 lists these destinations, along with a brief description of each, and how much of each tracepoint's data is sent to it. For example, print directs full trace data to stderr, while minimal directs a subset of data for each tracepoint to in-memory buffers that can be captured to a file using the output option.

Table 1. Trace destinations
minimal: Trace selected tracepoints (identifier and timestamp only) to in-core buffer. Associated trace data is not recorded.
maximal: Trace selected tracepoints (identifier, timestamp, and associated data) to in-core buffer.
count: Count the number of times selected tracepoints are called in the life of the JVM.
print: Trace selected tracepoints to stderr with no indentation.
iprint: Trace selected tracepoints to stderr with indentation.
external: Route selected tracepoints to a JVMRI listener.
exception: Trace selected tracepoints to an in-core buffer reserved for exceptions.

You should set the value of each keyword to the trace points required. For instance:

  • -Xtrace:maximal=all traces all of the information available from all JVM trace points to internal wrapping buffers.
  • -Xtrace:iprint=awt traces all of the JVM internal AWT trace points to stderr, with indentations on entry and exit.
  • -Xtrace:iprint=mt activates method trace and sets the output to stderr with indentations.

Using an option from Table 1 does not on its own generate any output; you must supply the method names to trace separately. Note that all trace options are covered in detail in Chapter 32, "Tracing Java applications and the JVM," of the relevant IBM Diagnostics Guide (see Resources).

Placing trace data into internal buffers

Using in-storage buffers for trace is very efficient because no explicit I/O is performed until either a problem is detected or an API is used to snap the buffers to a file. Buffers are allocated on a per-thread basis; this precludes contention between threads and prevents trace data for individual threads from being swamped by other threads. For example, if one particular thread is not being dispatched, its trace information is still available when the buffers are dumped or snapped.
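The per-thread buffering idea can be sketched in plain Java. The following is a hypothetical illustration of the design, not the engine's actual implementation: each thread appends records to its own buffer, so the hot path needs no locking, and a snap can still collect the records of threads that have gone idle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of per-thread trace buffers: each thread writes to
// its own buffer (no contention), and a snap gathers all buffers at once.
public class PerThreadTrace {
    private final Map<Thread, List<String>> buffers =
        new ConcurrentHashMap<Thread, List<String>>();

    private final ThreadLocal<List<String>> local = new ThreadLocal<List<String>>() {
        @Override protected List<String> initialValue() {
            List<String> buf = new ArrayList<String>();
            buffers.put(Thread.currentThread(), buf);
            return buf;
        }
    };

    // Hot path: append to this thread's buffer only.
    public void trace(String record) {
        local.get().add(record);
    }

    // Snap: collect every thread's records, including idle threads'.
    public Map<Thread, List<String>> snap() {
        return Collections.unmodifiableMap(buffers);
    }
}
```

Because each thread owns its buffer, a busy thread never blocks a quiet one, and the quiet thread's records remain available when snap() is called.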

To examine the trace data, you must snap or dump and then format the buffers. The snapping of the buffers occurs automatically in any of the following scenarios:

  • An uncaught Java exception occurs
  • An operating system signal or exception occurs
  • The com.ibm.jvm.Trace.snap() Java API is called
  • The JVMRI TraceSnap function is called

Placing trace data into files

You can write trace data to a file continuously as an extension to the in-storage trace, but instead of one buffer per thread, at least two buffers per thread are allocated. This allocation allows the thread to continue to run while a full trace buffer is written to the filesystem. Depending on trace volume, buffer size, and the bandwidth of the output device, multiple buffers might be allocated to a given thread to keep pace with trace data that is being generated.

To specify that the output of the minimal or maximal trace options should be written to a file, the output keyword should be used, or the exception.output keyword for the exception option:

  • -Xtrace:maximal=all,output=trace.out traces into a file called trace.out.
  • -Xtrace:maximal=all,output={trace.out,5m} traces into a file called trace.out and wraps within the file once it has reached 5MB in size.
  • -Xtrace:maximal=all,output={trace#.out,5m,5} traces sequentially into five files, each 5MB in size, with the # substituted for the file generation number. In this instance, files named trace0.out through trace4.out are created, each containing the most recent 5MB of trace data. Once all five files are filled, the JVM overwrites trace0.out and once again works its way through to trace4.out. The maximum number of files that can be created with this option is 36, in which case the # character is replaced by 0 through 9 followed by A through Z.
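The generation numbering used by the wrapping file option can be sketched as follows; fileFor is a hypothetical helper written for illustration, not part of the SDK:

```java
// Hypothetical sketch of the generation numbering used with
// output={trace#.out,5m,5}: the # is replaced by 0-9 then A-Z, and the
// JVM wraps back to generation 0 after the last file fills.
public class TraceGenerations {
    static final String DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Returns the file name used for the given generation number.
    static String fileFor(String pattern, int generation, int maxFiles) {
        char c = DIGITS.charAt(generation % maxFiles);
        return pattern.replace('#', c);
    }
}
```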

It is also possible to put the following substitutions into the file name:

  • %p: The ID for the Java process.
  • %d: The current date, in yyyymmdd format.
  • %t: The current time, in hhmmss format.
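The substitutions behave as if expanded like this hypothetical helper (the real expansion happens inside the JVM when the trace file is opened):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical sketch of the %p/%d/%t file-name substitutions described
// above; expand is an illustration, not an SDK API.
public class TraceFileName {
    static String expand(String pattern, long pid, Date now, TimeZone tz) {
        SimpleDateFormat date = new SimpleDateFormat("yyyyMMdd"); // %d format
        SimpleDateFormat time = new SimpleDateFormat("HHmmss");   // %t format
        date.setTimeZone(tz);
        time.setTimeZone(tz);
        return pattern
            .replace("%p", Long.toString(pid))
            .replace("%d", date.format(now))
            .replace("%t", time.format(now));
    }
}
```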

Formatting snap and trace files

The trace formatter is a Java program that runs on any platform and can format a trace file from any platform. The formatter, which is shipped with the IBM SDK in core.jar, also requires a file called TraceFormat.dat, which contains the formatting templates. This file is shipped in jre/lib. You can launch the trace formatter using the following command line:

java com.ibm.jvm.format.TraceFormat input_file [output_file]

Here, com.ibm.jvm.format.TraceFormat is the trace formatter class, input_file is the name of the binary trace file to be formatted, and output_file is the optional output filename. If the output file is not specified, the default output file name is the input file name appended with .fmt.

The flight recorder

The IBM VM tracing facility includes a flight recorder, which continuously captures data from a subset of key tracepoints to buffers in memory. These buffers are captured in the event of a runtime problem and can be used to aid in problem diagnosis and analysis of the VM's history. VM initialization starts trace with a small set of trace points that are captured to wrap around in-storage buffers. You can use these to carry out first-stage diagnosis of any problems in the Java runtime and also to ensure that a subset of the data provided with the -verbose:gc option is always available. That garbage collection data is also present in any requested Java dump file.

The internal flight recorder is active by default, so no command-line option is required to enable it. Note that if you specify -Xtrace on the command line or bring it in from a properties file, the default set of active trace points is cleared.

Method trace

With Java method trace, you can trace the invocations of methods in terms of method entry and method exit, on a per-thread basis, for any code running on the IBM implementation of the Java runtime. This is done without the need for any manual instrumentation of the Java code, and you can use it to trace the JCL, third-party packages, or application code.

The method trace functionality is of particular use in debugging scenarios where race conditions are occurring or when unexpected parameters are being passed from method to method, leading to an exception. It can also be of great use for debugging performance problems, thanks to the presence of microsecond precision in the trace timestamps.

Method trace is invoked on the command line by adding the methods keyword and by setting one of the destination keywords (maximal, minimal, print) to the value mt. The methods keyword allows you to select method trace by class, method name, or both. You can use wildcards, along with the not operator, !, allowing for complex selection criteria. For example:

  • -Xtrace:print=mt,methods={*.*,!java/lang/*.*}: Write method trace to stderr for all methods and for all classes except those in the java.lang package.
  • -Xtrace:maximal=mt,output=trace.out,methods={tests/mytest/*.*}: Write method trace to file for all methods in the tests.mytest package. (Note that this option selects only the methods that are to be traced.)
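One way to picture how such a specification selects methods is the following sketch; the clause evaluation order and the translation to regular expressions are assumptions made for illustration, not the engine's documented algorithm:

```java
// Hypothetical sketch of how a methods={...} specification might be
// evaluated: each clause is a wildcard pattern over "class.method" names,
// and a clause with a leading ! excludes whatever it matches.
public class MethodSpec {
    static boolean selected(String qualifiedMethod, String[] clauses) {
        boolean match = false;
        for (String clause : clauses) {
            boolean exclude = clause.startsWith("!");
            String pattern = exclude ? clause.substring(1) : clause;
            // Turn the wildcard pattern into a regular expression:
            // escape literal dots, then let * match any run of characters.
            String regex = pattern.replace(".", "\\.").replace("*", ".*");
            if (qualifiedMethod.matches(regex)) {
                match = !exclude;   // later clauses override earlier ones
            }
        }
        return match;
    }
}
```

With the spec {*.*, !java/lang/*.*}, application methods are selected while anything in java.lang is filtered out.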

Triggering on trace events

One of the most powerful features of the IBM trace engine is its ability to trigger on trace events, which is vital for creating targeted trace output and reducing the volume of trace data produced. This increases both the performance of the application that is being debugged (as overhead is greatly reduced) and the speed at which the data can be interpreted (as there is less superfluous information).

The trace engine is capable of triggering on any given trace point, either VM internal or Java method, and has a number of actions that it can carry out on the event, outlined in Table 2:

Table 2. Trace engine actions
suspend: Suspend all tracing (except for special trace points).
resume: Resume all tracing (except for threads suspended by the action of the resumecount property and Trace.suspendThis() calls).
suspendthis: Increment the suspend count for this thread. A non-zero suspend count prevents all tracing for the thread.
resumethis: Decrement the suspend count for this thread if it is greater than zero. If the suspend count reaches zero, tracing for this thread will be resumed.
sysdump: Produce a non-destructive system dump.
javadump: Produce a Java dump.
heapdump: Produce a heap dump.
snap: Snap all active trace buffers to a file in the current working directory.

You can activate trigger trace by using the trigger command-line keyword, which determines which of the actions in Table 2 are taken when that event occurs. Note that the trigger option controls whether what has been selected by the other trace properties is produced as normal or whether it is blocked.

The following format is used to specify triggers in method events:

-Xtrace:trigger=method{method spec, entry action, exit action, delay count, match count}

On entering any method that matches the method spec indicated, the entry action is executed. When exiting the method, the exit action is performed. If the delay count is specified, the entry and exit actions are only carried out when entry and exit have occurred more times than the delay count. If the match count is specified, the actions are only carried out a maximum of that many times. Consider this example:

-Xtrace:trigger=method{java/lang/StackOverflowError*, sysdump}

This creates a non-destructive system dump on the first (and only the first) call to a StackOverflowError method -- in practice the <clinit> method, which runs only once.
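The delay count and match count semantics can be modelled with a few lines of bookkeeping; this MethodTrigger class is a hypothetical illustration of the rules just described, not the trace engine's code:

```java
// Hypothetical sketch of the delay-count and match-count bookkeeping for a
// trigger clause: no action fires until entries exceed the delay count, and
// actions fire at most match-count times thereafter.
public class MethodTrigger {
    private final int delayCount;
    private final int matchCount;
    private int entries;
    private int fired;

    public MethodTrigger(int delayCount, int matchCount) {
        this.delayCount = delayCount;
        this.matchCount = matchCount;
    }

    // Returns true if the entry action should be carried out for this call.
    public boolean onEntry() {
        entries++;
        if (entries <= delayCount) return false; // still inside the delay window
        if (fired >= matchCount) return false;   // match budget used up
        fired++;
        return true;
    }
}
```

With a delay count of 2 and a match count of 2, the action fires on the third and fourth entries only.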

You can use the suspend and resume options in conjunction with the resumecount or suspendcount keywords to suspend and resume individual threads or all threads:

-Xtrace:resumecount=1
-Xtrace:trigger=method{HelloWorld.main,resume,suspend}
These options start tracing for all threads once HelloWorld.main() is called and stop tracing when HelloWorld.main() returns. This effectively means that tracing does not occur during Java runtime startup and only produces trace data whilst the HelloWorld application is running.

What can be achieved with the trace engine?

You can use the trace engine to produce data flow and historical data for any type of problem, either in the Java runtime itself or in the application code being run on it. This historical data, in combination with state data held in dumps generated by the dump engine, provides a powerful means of understanding and debugging many problem scenarios.

The dump engine

The dump engine built into the IBM implementation of the Java runtime provides the majority of the necessary data that IBM support teams use to diagnose problems in the Java runtime itself or in the JCLs that are supplied with the IBM SDKs. The dump engine has default settings to trigger a number of different dump types on specific events, which can be post-processed to determine the source of many problems.

You can also use the available dump types and events to help diagnose problems in Java applications, because the process for diagnosing problems in the JCL is equally applicable to diagnosing problems in other Java classes.

Dump types

The IBM dump engine can generate four different dump types (five on z/OS®) and also has the ability to execute a tool in a separate process if requested. Each of the dump types is in itself non-destructive but becomes destructive when caused by a failure event, such as SIGSEGV/GPF. Table 3 outlines the available dump types:

Table 3. Dump types
java (Java dump): A status report that includes environment, lock, thread stack, and class information.
heap (heap dump): A dump containing size and reference details for each Object on the Java heap.
snap (snap dump): The contents of the trace buffers, written to file.
system (system dump): A process image in the normal format of the operating system (core file, minidump, or transaction dump).
ceedump (CEEDUMP): A z/OS-specific thread stack and register summary file.
tool (tool agent): Executes a predefined tool using the supplied command line.

Dump events

The IBM dump engine is capable of producing any or all of the available dump types on each of the event types outlined in Table 4:

Table 4. Event types
gpf: An unexpected crash, such as a SIGSEGV or a SIGILL, has occurred.
user: A SIGQUIT signal (Control+Break on Windows, Control+\ on Linux, Control+V on z/OS) has occurred.
vmstart: The VM has finished initialization.
vmstop: The VM is about to shut down.
load: A new class has been loaded.
unload: A classloader has been unloaded.
throw: A Java exception has been thrown.
catch: A Java exception has been caught.
uncaught: A Java exception was not handled by the application.
thrstart: A new thread has started.
thrstop: An old thread has stopped.
blocked: A thread is blocked entering a monitor.
fullgc: Garbage collection has started.

These events in themselves provide good flexibility as to when to generate each of the dumps, but that flexibility is also greatly increased by the addition of dump filters. You can add a filter to each of the events to provide more granularity as to when the dump is created. For instance, you can add an exception or error name to the throw, catch, and uncaught events.
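A toy program such as the following (the class name and exception are arbitrary choices for illustration) gives you something to experiment with; running it under -Xdump:java:events=uncaught,filter=java/lang/IllegalStateException should produce a Java dump when the exception reaches the top of the main thread:

```java
// A deliberately failing program for trying out filtered dump events.
// The class name and exception type are hypothetical examples.
public class ThrowUncaught {
    static void fail() {
        throw new IllegalStateException("trigger a filtered dump event");
    }

    public static void main(String[] args) {
        fail();   // never caught, so the uncaught event fires in the VM
    }
}
```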

Setting dump options and changing defaults

You set all of the dump options using the -Xdump command-line option in conjunction with a series of tokens to set the various options. You can view the default dump options using -Xdump:what, as shown in Listing 1:

Listing 1. -Xdump:what output
C:\home> java -Xdump:what

Registered dump agents
dumpFn=doSystemDump                             // Generate a system dump
events=gpf+abort                                // on SIGSEGV and SIGABRT events
label=C:\home\core.%Y%m%d.%H%M%S.%pid.dmp       // location and name of file
range=1..0                                      // write on every event occurrence
priority=999                                    // write this dump first
request=serial                                  // write in serial
dumpFn=doSnapDump                               // Generate trace snap file
events=gpf+abort                                // on SIGSEGV and SIGABRT events
label=C:\home\Snap%seq.%Y%m%d.%H%M%S.%pid.trc   // location and name of file
range=1..0                                      // write on every event occurrence
priority=500                                    // write after higher priority dumps
request=serial                                  // write in serial
dumpFn=doSnapDump                               // Generate trace snap file
events=uncaught                                 // on uncaught exceptions
filter=java/lang/OutOfMemoryError               // that match OutOfMemoryError
label=C:\home\Snap%seq.%Y%m%d.%H%M%S.%pid.trc   // location and name of file
range=1..4                                      // write only on the first four events
priority=500                                    // write after higher priority dumps
request=serial                                  // write in serial
dumpFn=doHeapDump                               // Generate heap dump file
events=uncaught                                 // on uncaught exceptions
filter=java/lang/OutOfMemoryError               // that match OutOfMemoryError
label=C:\home\   // location and name of file
range=1..4                                      // write only on the first four events
priority=40                                     // write after higher priority dumps
request=exclusive+prepwalk                      // make sure the heap is walkable
opts=PHD                                        // write in "PHD" format
dumpFn=doJavaDump                               // Generate java dump file
events=gpf+user+abort                           // on SIGSEGV, SIGABRT and SIGQUIT events
label=C:\home\javacore.%Y%m%d.%H%M%S.%pid.txt   // location and name of file
range=1..0                                      // write on every event occurrence
priority=10                                     // write after higher priority dumps
request=exclusive                               // obtain exclusive access to walk the VM
dumpFn=doJavaDump                               // Generate java dump file
events=uncaught                                 // on uncaught exceptions
filter=java/lang/OutOfMemoryError               // that match OutOfMemoryError
label=C:\home\javacore.%Y%m%d.%H%M%S.%pid.txt   // location and name of file
range=1..4                                      // write only on the first four events
priority=10                                     // write after higher priority dumps
request=exclusive                               // obtain exclusive access to walk the VM

You can add additional dumps by supplying further -Xdump agents on the command line. To generate a Java dump on an uncaught socket exception, use the following syntax:

-Xdump:java:events=uncaught,filter=java/net/SocketException

To remove all heap dumps, use this syntax:

-Xdump:heap:none
What can be achieved with the dump engine?

You can use the improved facilities of the dump engine to solve problems in the IBM SDK itself; more importantly, you can utilise them to help solve issues in Java applications. The ability to generate Java dump files and heap dumps on OutOfMemoryErrors makes it possible to diagnose memory leaks and to determine the allocating stack of any large objects. The ability to generate Java dump files on other exceptions makes it possible to use thread stack data in the dump to debug potential race conditions.

In addition, the ability to create non-destructive system dumps on various events means that the DTFJ API can be used to interrogate the state of any part of the Java application at the event point.

Diagnostic Toolkit and Framework for Java

The DTFJ API is a Java-based API with which tool writers can access information about a Java process from a snapshot of the process image (a system dump, for instance) without requiring any knowledge of the various system dump formats or of how Java objects and other Java structures are laid out in memory.

As we've already discussed, the IBM implementation of the Java runtime is capable of creating non-destructive system dumps using either the trace or the dump engines. In addition, you can create non-destructive system dumps from Java code using the com.ibm.jvm.Dump.SystemDump() static method. You can achieve the same results by using the available operating system tools -- gencore on AIX® or gcore on Linux, for instance.

The creation of non-destructive system dumps allows tools that use the DTFJ API to gain information from live systems, as well as to post-process those that have failed and shut down.


The DTFJ API is a layered interface that is independent of runtime implementation: the API itself is scoped to cover multiple operating system and hardware platforms, multiple virtual machine implementations, and multiple languages. The base set of extensions contained in the DTFJ API is targeted at a Java runtime, and therefore allows tools writers to understand and introspect JVM data structures. The DTFJ implementation that ships with the IBM implementations of the Java runtime is capable of providing the information for data structures held within those runtimes.

The API itself is heavily influenced by the Reflection API, combined with a hierarchical view of the Java process that uses Iterators to provide access from high-level objects to increasingly specific objects. This gives a range of available data objects, from the process Image down to the level of individual JavaField and JavaMethod objects, which can subsequently be introspected to obtain the data that they contained at the point at which the system dump was taken. Figure 1 gives an indication of some of the data objects that the DTFJ API understands and can inspect:

Figure 1. DTFJ data objects overview
DTFJ data objects overview

Running JExtract

Because various operating systems produce different system dump formats and because some necessary changes may occur to the internal Java runtime data structures over time, a utility called JExtract needs to be run against a system dump before it can be accessed using the DTFJ API. This needs to be done by the same Java runtime version that was running when the system dump was produced and should be done on the same system.

The JExtract utility understands both the format of the system dump and the Java runtime's internal data structures. It uses that knowledge to create an XML description file that provides indexes into the system dump file that indicate the location of various data structures. DTFJ then uses the combination of the system dump and the JExtract-produced XML file to provide the information requested by the tools that use the DTFJ API.

Although JExtract is a post-processor for the system dump, you can use the dump engine's tool option to invoke JExtract automatically after the system dump has been created. For instance, to get system dumps requested on OutOfMemoryErrors, use the following syntax:

  -Xdump:tool:events=uncaught,filter=java/lang/OutOfMemoryError,exec="jextract .\core.%Y%m%d.*.%pid.dmp"

What can be achieved with the DTFJ API?

You can use the DTFJ API to access the huge array of information that is present in a system dump. This includes information about the platform on which the process is running: physical memory, CPU number and type, libraries, command line, thread stacks, and registers. It can also provide information about the state of the Java runtime and the Java application it is running, including class loaders, threads, monitors, heaps, objects, Java threads, methods, compiled code, and fields and their values.

Because such a vast range of data artifacts is available, the DTFJ API provides the flexibility to create any number of tools. At the simplest level, it allows tools to be created to, for example, interrogate the size and contents of various caches and therefore more effectively size the amount of real Java heap memory required to hold those caches.

Getting started with DTFJ

The first stage of creating a DTFJ-based tool is to obtain an Image object that relates to the system dump and then to obtain an ImageProcess object from an ImageAddressSpace. This is illustrated in Listing 2:

Listing 2. Using DTFJ to obtain the current process
Image theImage =  new ImageFactory().getImage(new File(fileName));
ImageAddressSpace currentAddressSpace = 
  (ImageAddressSpace) theImage.getAddressSpaces().next();
ImageProcess currentProcess = currentAddressSpace.getCurrentProcess();

On most platforms, there is only a single ImageAddressSpace and ImageProcess object in the image; however, mainframe operating systems may have multiple instances of each.

Once an ImageProcess object is obtained, it is possible to access the native threads, as shown in Listing 3:

Listing 3. Obtaining the threads and stack frames
Iterator vmThreads = process.getThreads();
ImageThread vmThread = (ImageThread) vmThreads.next();
Iterator vmStackFrames = vmThread.getStackFrames();

You can also access the various ImageModule (library) objects, as in Listing 4:

Listing 4. Obtaining the loaded libraries
Iterator loadedModules = process.getLibraries();

From the ImageProcess object, you can obtain the JavaRuntime, as in Listing 5:

Listing 5. Obtaining the Java runtime
JavaRuntime runtime = (JavaRuntime) process.getRuntimes().next();

This opens up access to all of the Java structures.

With the JavaRuntime object, it becomes possible to begin writing tools to interrogate any Java application that is running. The simple example in Listing 6 shows how to iterate over all of the objects on the Java heap and count the number of each type of object:

Listing 6. Counting the objects of each type in the dump
    Map<String,Long> objectCountMap = new HashMap<String,Long>();

    Iterator allHeaps = currentRuntime.getHeaps();

    /* Iterate over each of the Java heaps and call countObjects on them to */
    /* populate the object type count HashMap                               */
    while (allHeaps.hasNext()) {
        countObjects((JavaHeap) allHeaps.next(), objectCountMap);
    }

    /* Print out each of the entries in the HashMap of object types         */
    for (String objectClassName : objectCountMap.keySet()) {
        System.out.println(objectClassName +
          " occurs " + objectCountMap.get(objectClassName));
    }

    private static void countObjects(JavaHeap currentHeap,
      Map<String, Long> objectCountMap) throws Exception {

        /* Iterate over each of the Objects on the supplied Java heap       */
        Iterator currentHeapObjects = currentHeap.getObjects();
        while (currentHeapObjects.hasNext()) {
            JavaObject currentObject = (JavaObject) currentHeapObjects.next();

            /* Get the name of the class from the object                    */
            String objectClassName = currentObject.getJavaClass().getName();
            long objectCount = 0;

            /* Add the class name to the HashMap, or increase the count if it */
            /* already exists                                                 */
            if (objectCountMap.containsKey(objectClassName)) {
                objectCount = objectCountMap.get(objectClassName);
            }
            objectCountMap.put(objectClassName, objectCount + 1);
        }
    }

All of the functionality we've discussed here will help you when you're trying to diagnose and solve development and production issues in a Java deployment. Using these three major facilities in conjunction to produce historical trace data and detailed state data, along with an easy-to-use API for accessing that state data, results in a powerful and flexible way of interrogating Java applications to solve problem scenarios.

This article concludes our tour of the major improvements and changes IBM has made to its implementation of the Java virtual machine. In particular, we've covered the areas of memory management, class sharing, and application monitoring, and described how to utilise these functionalities to improve the performance and availability of Java applications. More information on these enhancements and a number of others is available in the IBM Diagnostics Guide, whilst feedback and discussion is available through the IBM Runtimes and SDKs forum (see Resources for links to both).

In the final article in this series, the Java security development team will examine IBM's security enhancements to the Java platform. The article will introduce each of the security providers and review the functionality they provide.


