Java diagnostics, IBM style, Part 2: Garbage collection with the IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer

Improve application performance, optimize garbage collection, and unearth application problems

The IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer, new tooling from IBM, is designed to help diagnose and analyze memory-related Java performance problems. This article, second in a four-part series, explains how to obtain and use the toolkit and demonstrates how you can use it to quickly diagnose some common problems.



Dr. Holly Cummins, Software Engineer, Java Technology Center, IBM

Holly Cummins is a developer in the Java Technology Centre in IBM United Kingdom and the author of the EVTK. She has been with IBM for six years and holds a DPhil in quantum computation and an MSc in software engineering. She has a passion for blueberries, gruyere cheese, obscure typesetting languages, and impractical footwear. She has very poor houseplant-keeping skills and a growing mountain of post on her kitchen table.

18 July 2013 (First published 09 October 2007)


About this series

Java diagnostics, IBM style explores new tooling from IBM that can help resolve problems with Java applications and improve their performance. You can expect to come away from every article with new knowledge that you can put immediately to use.

Each of the authors contributing to the series is part of a new team that creates tools to help you resolve problems with your Java applications. The authors have a variety of backgrounds and bring different skills and areas of specialization to the team.

Contact the authors individually with comments or questions about their articles.

There are several reasons why you might wish to take a closer look at the garbage collection (GC) in an application. You may be concerned about the application's memory usage pattern: Is it using too much memory? Is it leaking memory? Is the memory usage sustainable over a long period? You may also be interested in getting the application to execute more quickly. Garbage collection can have a big effect on application performance. Most people know that poorly configured GC can use a lot of resources and slow an application down. However, the opposite is also true: A wise choice of garbage collection parameters can actually make an application run more quickly.

In short-lived Java applications or in applications where performance isn't really important, GC can happily be ignored. In other cases, tooling can make it much easier to get the information you need from the verbose GC logs. The tooling can visualize what's going on in the heap, making it easier to spot patterns, and it can even point out some patterns for you and make tuning recommendations.


The GC and Memory Visualizer is a part of a new tooling suite from IBM that analyzes verbose GC logs to help provide just this sort of insight into memory management issues. In this article, you'll learn about the GC and Memory Visualizer's capabilities and see some example scenarios where the GC and Memory Visualizer can help you diagnose memory problems.

The GC and Memory Visualizer can handle logs from all IBM JREs at Version 1.4.2 or higher. It can also visualize logs from IBM WebSphere® Real Time. With it, you can simultaneously compare multiple logs, zoom in on particular areas of a log, filter data, and display in a range of units. An example of the GC and Memory Visualizer display is shown in Figure 1:

Figure 1. Example GC and Memory Visualizer display
Example GC and Memory Visualizer display

Enabling verbose GC logging

You must enable verbose GC logging for your application if you want to produce logs for analysis. You can do this with the -verbose:gc virtual machine (VM) flag or, on IBM VMs at Version 5.0 and higher, with the -Xverbosegclog:file option. Here, file is the name of your chosen log file. The -Xverbosegclog option is preferable when available. Verbose GC usually has relatively little performance impact on an application.
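For example, assuming a hypothetical main class MyApp and a placeholder log file name gc.log, the two launch configurations look like this (these are illustrative command lines, not taken from the article):

```
java -Xverbosegclog:gc.log MyApp    # IBM VMs at 5.0 and higher: log straight to a file
java -verbose:gc MyApp              # portable alternative: log to stderr, redirect as needed
```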

Downloading and installing the GC and Memory Visualizer

About the IBM Support Assistant

The IBM Support Assistant is a free software serviceability workbench that helps you to resolve problems with IBM software products. ISA has a search facility that spans the bulk of IBM documentation and categorizes the results for review.

It also provides a product information feature that has links to product support and home pages, troubleshooting guides, and various forums and newsgroups. The Service feature of ISA can gather information from your desktop and can easily create a problem report for IBM.

ISA's Tool workbench provides problem determination tools that help resolve issues with IBM products. These tools are constantly updated and enable you to run troubleshooting and diagnostic tools on your desktop. See Resources for an ISA download link.

The GC and Memory Visualizer is a free download within the IBM Support Assistant. If you don't already have the IBM Support Assistant installed, you need to start by downloading it. (See Resources for a link.) Once the IBM Support Assistant is installed, you need to let the IBM Support Assistant know that you're using a product that includes a JVM. This is done by installing a product plug-in, as shown in Figure 2. Product plug-ins are downloaded on the IBM Support Assistant's Updater page. For example, you may want to select one of the developer kits for Java from the Others section or one of the WebSphere products from the WebSphere section.

Figure 2. Installing the product plug-in
Installing the product plug-in for the IBM JDK

At the same time, you can install the GC and Memory Visualizer plug-in. The GC and Memory Visualizer is found under the New Plug-ins tab, in the Common Component Tools section, as shown in Figure 3:

Figure 3. Installing the GC and Memory Visualizer
Installing the GC and Memory Visualizer

You need to restart the IBM Support Assistant after installing the product plug-in and the GC and Memory Visualizer. The GC and Memory Visualizer will be available for launch on the Tools page, as shown in Figure 4:

Figure 4. Launching the GC and Memory Visualizer
Launching the GC and Memory Visualizer

Common tasks

Now let's perform some basic log analysis tasks with the GC and Memory Visualizer.

Open a log for analysis

To analyze a verbose GC log with the GC and Memory Visualizer, start the GC and Memory Visualizer and then choose Open File from the File menu. The GC and Memory Visualizer opens the log in an editor with four tabs, as shown in Figure 5. You can open a log from a running application, although the GC and Memory Visualizer will not automatically update the display. To refresh the display, click the Reset Axes button.

Figure 5. The tabs in the GC and Memory Visualizer editor
The tabs in the GC and Memory Visualizer editor

The tabs are as follows:

  • The tab labeled with the file name shows the text of the log itself. If the log is large, the GC and Memory Visualizer won't show all of the text, but the whole log has still been parsed.
  • The Data tab shows a raw view of the data produced by the GC and Memory Visualizer. This data is suitable for cutting and pasting into a spreadsheet.
  • The Line Plot tab shows a visualization of the data.
  • The Report tab shows the GC and Memory Visualizer's report on the data, with a summary of each selected field, a tabular summary of the whole log, and a series of tuning recommendations.

The VGC Data menu, illustrated in Figure 6, shows all of the fields available to view. Grayed-out fields are those that the GC and Memory Visualizer looked for but could not find in the current log. If the Summary field is not selected, you can select it to enable the tabular summary. Similarly, you can enable Tuning Recommendation to get recommendations.

Figure 6. The VGC Data menu
The VGC Data menu

Compare several files

The GC and Memory Visualizer lets you analyze multiple files side by side. This can be handy for evaluating the effects of performance changes. Figure 7 shows an application performing a fixed workload with three GC policies. (As it happens, the application is the GC and Memory Visualizer itself.) The solid line is the gencon GC policy, the dotted line is the optavgpause policy, and the dashed line is the optthruput policy. The GC and Memory Visualizer works out line labels based on the log file names. (For more information on the different types of GC policies, see the article "Garbage collection policies, Part 1," available in Resources.)

Figure 7. Heap usage and pause times for three different garbage collection policies
Heap usage and pause times for three different garbage collection policies

In this case, it's clear that the gencon mode is best by all criteria: it completes the task most quickly, uses less heap in the process, and has far shorter GC pauses. But gencon is not the default policy; optthruput is, because in most circumstances it outperforms gencon. As this example shows, though, that isn't always the case, so it's worth seeing how your application's behavior changes when you use different GC policies. Often a very simple change, such as switching the GC policy that your app uses, can make a big improvement.

Zoom in on a problem period

The GC and Memory Visualizer allows you to focus in on a particular time period within a log. When you zoom in on a particular period, all the summary data and recommendations change to reflect only that period. For example, the log illustrated in Figure 8 shows heap usage for an application that is busy during the day but then idle at night:

Figure 8. Heap usage for an application that is busy during the day but idle at night
Heap usage for an application that is busy during the day but idle at night

For the log as a whole, the GC overhead (that is, the amount of time spent doing GC) is around 5 percent, which is pretty good. However, this includes long periods during which the application isn't doing any work and no GC is required. Zooming in on a particular period gives a more accurate reflection of the behavior of the system during the busy period, as shown in Figure 9:

Figure 9. Zooming in on the busy period
Zooming in on the busy period

The GC and Memory Visualizer also allows you to focus in on a particular range of data. For example, you may only be interested in the really long pauses or in periods when the heap is larger than 500MB. You can do this kind of filtering by changing the values on the Y-axis.

Change the units

The GC and Memory Visualizer allows the display units to be changed. Changing the units will change how things are plotted, and it will also change the units in the summary table and tuning recommendation. To change the units, right-click on the units in a plot, or bring up the Advanced Perspective (from the View menu).

By default, time (the units on the X-axis) is shown in seconds. This is convenient for short runs, but not so ideal for logs covering longer time periods. To change to a different unit, choose your preferred units from the drop-down menu on the right, as illustrated in Figure 10. Possibilities include hours, minutes, the date, and the GC number, which is just the sequence number of the collection. The Normalize check box determines whether the times are shown relative to the start of the log (normalized) or in absolute time (unnormalized).

Figure 10. Changing the units
Changing the units

You can also change the units on the Y-axis. For example, you can change quantities in the heap, which are shown in megabytes by default, to gigabytes or to a percentage of the total heap.

Use and export templates

Often, you will find yourself viewing the same combinations of fields repeatedly. In the GC and Memory Visualizer, templates let you save these combinations for later use. The Templates view is in the upper left corner of the window, as shown in Figure 11:

Figure 11. The Templates view
The Templates view

Double-clicking on a template applies it to the current data set. The GC and Memory Visualizer comes with some templates predefined. The Heap template is useful for evaluating the memory usage and requirements of your application. The Pauses template is a first step in diagnosing performance problems that you might suspect to be related to GC.

You can export templates by bringing up the View menu or by right-clicking anywhere in the Templates view and choosing Export current settings as template. Type a template name and the GC and Memory Visualizer saves the template in the Templates view for future use.

Change the colors

You can choose the colors the GC and Memory Visualizer uses for plotting. Click the Preferences item in the View menu, and then navigate to the Display colors page in the Displayers category, as shown in Figure 12:

Figure 12. The colors preference page
The colors preference page

Save output

You can save all the output from the GC and Memory Visualizer by right-clicking in the main panel and choosing Save from the resulting context menu, as shown in Figure 13. A line plot can be saved as a JPEG image, a report can be saved as HTML, and raw data can be saved as a CSV file. The graphs in this article were saved from the line plot view.

Figure 13. Saving

Using the recommendations

The GC and Memory Visualizer provides a summary of the interesting features of a verbose GC log along with its tuning recommendations. The summary and recommendations are available in the report on the Report tab.

Why should you ever find it necessary to intervene and do some manual tuning? The garbage collector already does a lot of autonomic tuning to try to optimize its performance. However, it can't know what your priorities are and what trade-offs you're willing to make without some guidance. There is no optimum configuration for all workloads and all circumstances. The simplest tuning you can do is to specify a policy and tell the garbage collector whether throughput or pause times are most important. If you're more adventurous or more eager to achieve optimum performance, you can try fixing the heap size, altering the nursery size, or trying a larger maximum nursery size.

Case study: Diagnose a memory leak

One of the main reasons for looking at verbose GC logs is to examine an application's memory usage and make sure it's not pathological in some way. For instance, an application may use more memory than expected, and the verbose GC output can give indications about application footprint. Leaking memory is a related but much more serious problem. The Java platform's GC facilities ensure that a Java application won't leak memory even if it doesn't free an object before losing all references to it. However, applications are still vulnerable to leaks if they incorrectly hold on to object references, as the garbage collector will not collect objects that are still referenced.

Soft and weak references

Diagnosing a memory leak is usually pretty straightforward. Enable verbose GC on the application, run it for some period of time, and then plot the Used heap (after collection) in the GC and Memory Visualizer. The memory usage of an application will naturally increase when the application is initializing and if the workload of the application increases. If the used heap line is creeping up when there's no obvious reason for the memory requirements of the application to be increasing, there may be a leak. The GC and Memory Visualizer looks for this pattern and adds a comment to the tuning recommendation if it detects something that is likely to be a leak.
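The kind of pattern the GC and Memory Visualizer looks for can be sketched as a simple trend check over the used-heap-after-collection samples. This is a hypothetical illustration of the idea, not the tool's actual algorithm; the class name, method name, and the 20 percent growth threshold are all assumptions:

```java
public class LeakHeuristic {
	/**
	 * Crude leak check: compare the average used-heap-after-GC in the
	 * first quarter of the samples against the last quarter. Sustained
	 * growth with no corresponding workload increase suggests a leak.
	 */
	static boolean looksLikeLeak(long[] usedAfterGc) {
		int q = Math.max(1, usedAfterGc.length / 4);
		double early = 0, late = 0;
		for (int i = 0; i < q; i++) {
			early += usedAfterGc[i];
			late += usedAfterGc[usedAfterGc.length - 1 - i];
		}
		return late / q > 1.2 * (early / q); // 20% growth threshold (assumed)
	}
}
```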

While verbose GC can show you that a leak is in progress, it cannot tell you which objects are causing that leak. In some cases, code inspection will be enough to find it. Consider hash maps and other collections. Are any of them static? Do they all have mechanisms for removing objects as well as adding them? Is the application being too generous in what it caches? Object pooling can also be a cause of memory leaks.

As the following example shows, weak references and soft references are powerful tools for fixing memory leaks. (See the accompanying sidebar for more information about them and Resources for links to a discussion of when and how to use them.) Consider the clearly leaky application shown in Listing 1. It adds to a map but never prunes it.

Listing 1. A Java class that leaks memory badly
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

public class Leaker {
	private Map things = new HashMap();

	public void leak() {
		while (true) {
			things.put(new Date(), new Leak());
		}
	}

	private class Leak {
		private Object data;

		public Leak() {
			data = new Object();
		}
	}
}
Figure 14 shows the heap usage for this application. The dips in heap usage mark the points where the heap is compacted. The log ends when the JVM runs out of memory.

Figure 14. Heap usage of an extremely leaky application
Heap usage of an extremely leaky application

Using weak references to avoid leaks

Switching to a WeakHashMap, as shown in Listing 2, immediately corrects the problem; the new, improved heap usage is shown in Figure 15. The heap usage never goes above 1MB, and the application can continue running indefinitely.

Listing 2. A simple correction to the Leaker class that prevents the memory leak
	private Map things = new WeakHashMap();
Figure 15. A potentially leaky application rescued by a WeakHashMap
Heap usage of a corrected leaky application
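The mechanism behind WeakHashMap can be seen directly with java.lang.ref.WeakReference. This standalone sketch (an illustration, not part of the article's example) shows that the referent survives as long as a strong reference exists; once only the weak reference remains, a collection will typically clear it, although the specification does not guarantee exactly when:

```java
import java.lang.ref.WeakReference;

public class WeakDemo {
	/** True while the weak reference has not been cleared. */
	static boolean stillReachable(WeakReference<?> ref) {
		return ref.get() != null;
	}

	public static void main(String[] args) {
		Object referent = new Object();
		WeakReference<Object> ref = new WeakReference<>(referent);
		System.out.println(stillReachable(ref)); // true: a strong reference still exists

		referent = null; // drop the only strong reference
		System.gc();     // a collection will typically clear the weak reference now
		System.out.println(stillReachable(ref));
	}
}
```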

However, even weak references may not be sufficient to correct some leaks. What if the values in the map from Figure 15 formed a linked list, as shown in Listing 3?

Listing 3. A further modification to the Leaker class that reintroduces the leak
import java.util.Date;
import java.util.Map;
import java.util.WeakHashMap;

public class Leaker {
	private Map things = new WeakHashMap();

	public void leak() {
		Object previousThing = null;
		while (true) {
			final Leak thing = new Leak(previousThing);
			things.put(new Date(), thing);
			previousThing = thing;
		}
	}

	private class Leak {
		private Object data;

		public Leak(Object thing) {
			/* Make a linked list */
			data = thing;
		}
	}
}

Weak references tell the garbage collector that it should collect an object if there are no references to it other than the weak reference. Because each object in the map holds a reference to the previous object, no weak references will be cleared and the application will run out of memory very quickly, as shown in Figure 16:

Figure 16. Heap usage in a leaky application that a WeakHashMap cannot help
Heap usage in a leaky application that a WeakHashMap cannot help

Making sure weak references are working as expected

The problem can be verified by plotting the weak references cleared in the GC and Memory Visualizer, as shown in Figure 17. After adding the linkage to the list, the number of weak references cleared changes from a large number to none at all. (The new version of the application is the extremely short line running along the X-axis at zero, while the previous version is the longer higher line.) Clearly, the weak references are no longer working.

Figure 17. Weak references cleared in two variations of a potentially leaky application
Weak references cleared in two variations of a potentially leaky application

The solution in this case is to make the links in the linked list weak references too. Once the code change shown in Listing 4 is implemented, the number of weak references cleared increases significantly, and the heap usage returns to being minimal:

Listing 4. The introduction of more weak references prevents references being held longer than required
private class Leak {
	private WeakReference reference;

	public Leak(Object thing) {
		this.reference = new WeakReference(thing);
		/*
		 * We can get back our object from the reference with
		 * reference.get(), but we should always check it for null.
		 */
	}
}

By using the GC and Memory Visualizer to see how many weak references are being cleared, you can easily verify that a redesign to use weak references is actually effective.

If code inspection does not quickly find the leak, you will probably need to take some application dumps and analyze them to find the objects whose references are growing in size. See Resources for articles on finding and correcting memory leaks.

The verbose GC logs can also help evaluate application scalability. For example, if an application is intended to handle large volumes of data but uses quite a lot of memory when handling small volumes of data during testing, the application will probably not scale as hoped.

Case study: Sizing the heap

Many developers use verbose GC data to help choose the best size for the heap. If the heap is so small that the data required by the application will not fit into it, then the application will run out of memory and terminate with an OutOfMemoryError. If the heap has room for the application data but not much room to spare, the garbage collector will have to spend a lot of time ensuring that there is room in the heap for new allocations, and this will hurt application performance. A heap that is too big usually won't have a negative effect on application performance, but it is wasteful, and GC pauses may be long. Usually, there are other applications running on the same machine, and it may make sense to redistribute the memory so that no single Java application has more than it requires.

The garbage collector will try to size the heap appropriately, but it will avoid using more than half the physical memory available on the machine. It may also take some time to increase the heap to the optimum size, and if the application occupancy drops, it may shrink the heap. These fluctuations in the heap size can slow down the application and are unnecessary if the physical memory is not needed for anything else running on the same system. Fixing the heap size is an easy performance optimization if the memory requirements of the application are well understood.

There is no single ideal size for a heap. Usually, the bigger the heap, the better the application will perform, so sizing the heap involves trading off the requirements of the application against other demands on the physical memory. A reasonable heuristic is that the heap should be at least twice as large as the amount of live data. If it's not possible or desirable to make the heap that big, the gencon policy is probably a good bet because it tends to outperform the optthruput policy in situations where heap size is constrained. If at all possible, the heap should never be sized so that the machine needs to use virtual memory to accommodate it. Using virtual memory degrades performance severely.
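The "at least twice the live data" rule of thumb translates directly into -Xms/-Xmx settings. This hypothetical helper (the class and method names are invented for illustration and are not part of any IBM tooling) just does the arithmetic:

```java
public class HeapSizing {
	/**
	 * Suggest fixed-heap JVM options from a measured live-data size,
	 * following the "heap at least twice the live data" heuristic.
	 */
	static String heapOptions(long liveDataMb) {
		long heapMb = 2 * liveDataMb;
		return "-Xms" + heapMb + "m -Xmx" + heapMb + "m";
	}
}
```

For example, heapOptions(250) yields "-Xms500m -Xmx500m".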

The advantage of a fixed heap size

Figure 18 shows the pause times that result when the same workload from Figure 7 is run in a JVM with the heap size fixed at 500MB using the command-line options -Xms500m -Xmx500m. Based only on the pause times, the fixed heap size seems to have made things worse. The mean pause has gone up for every policy, and the proportion of time spent in GC is unchanged. However, the total pause (the fourth column in the tables in the report view) has actually gone down significantly. The total and mean pauses seem to disagree because collections in a small heap can be completed very quickly. When the heap had a variable size, the JVM performed many very quick collections while the heap was small, and these contribute to a short mean pause. In this case, the ultimate performance metric is how long the JVM took to complete the work (the length of the lines); fixing the heap produced a 13 percent improvement for gencon, a 15 percent improvement for optavgpause, and a 30 percent improvement for the optthruput policy.

Figure 18. Pause times from an application running in a fixed-size heap
Pause times from an application running in a fixed-size heap

Of course, this example is for a Java program that executes over 30 seconds, where much of the initial time is spent by the JVM in finding the best heap size. Fixing the heap size won't usually deliver such dramatic improvements for longer-running programs. If the workload isn't well understood, it may not be wise to fix the heap size because there is a risk that the JVM will be forced to run in a heap that is too small. The verbose GC output can be used to assess how stable the workload is and how much risk there is that the JVM will require more memory than has been allowed.

Case study: Estimating application throughput from verbose GC logs

You tune GC to optimize application performance. But how do you decide how well an application is performing? Benchmarks have clear performance metrics, but it's extremely unwise to tune the garbage collector to optimize a benchmark and then assume that the same configuration will give optimum results for a different application. All applications are different, and there is no single best configuration for the garbage collector. (If there were, the garbage collector would ship with that configuration, and tuning would be unnecessary.) Unlike benchmarks, not all applications provide a report showing how well they're performing.

In these cases, the verbose GC logs themselves can give a pretty good clue about how well things are going. Although verbose GC logs are a good place to start assessing application performance, the reported pause times are definitely not the right place to begin. As the example of fixing the heap above showed, an application may sometimes run faster than it did before being tuned but still be marked by an unchanged GC overhead or even longer mean pauses. An application that spends an excessive amount of time doing GC can certainly see a performance hit, but spending more time in GC can sometimes make an application perform better because of the way in which objects are arranged.

How can GC accelerate an application?

Good GC can actually enhance application performance. If objects are compactly laid out, as they are with the gencon mode or after a compaction, allocating new objects will be much quicker because no free-list search is required. This isn't measured in verbose GC logs, but fast allocation can significantly help an application. If the objects are well laid out so that objects that are used at similar times are close to one another (this is called locality), then object access will also be much faster. Smart garbage collectors rearrange objects to try and maximize the speed of object access.

Instead of trying to work out the impact of GC pause times on an application, look at the amount of garbage generated. One of the best indicators of application performance is how much garbage the application is generating: the more garbage it generates, the more work it must be doing because garbage is a side-effect of application work. All generated garbage is collected, and so the amount generated is exactly the same as the amount the garbage collector is collecting.

You can plot the amount of garbage collected by selecting Amount freed from the GC and Memory Visualizer's VGC Data menu. The Report tab shows statistics about the mean and total amounts of garbage collected during the run. The mean amount freed is not a good performance indicator; if the occupancy is stable, the amount freed per collection will probably be pretty stable as well. However, if the application is performing well, the frequency of collections will probably increase as the application gets through more work in a shorter time. Therefore, the total amount freed is a better indicator of performance over a fixed time period. If your log was not collected over a fixed time period, zooming in on a set time period will ensure that the total is shown only for that period.

An even better indicator is the rate of GC, as it is still meaningful even if you're comparing logs that do not cover the same amount of time. The rate is shown in the table at the top of the Report tab. (If no table is shown, try enabling Summary in the VGC Data menu.) Having a higher rate of GC means that your application is getting through more work in a shorter period of time — which is a good thing!
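If you want a rough in-process view of the same kind of numbers without parsing logs, the standard java.lang.management API exposes cumulative collection counts and times. This is a generic JMX sketch, not part of the GC and Memory Visualizer; sampling it twice over a fixed interval gives a collection rate:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
	/** Returns {total collection count, total collection time in ms} across all collectors. */
	static long[] snapshot() {
		long count = 0, timeMs = 0;
		for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
			// Both methods may return -1 if the value is undefined for a collector.
			if (gc.getCollectionCount() > 0) count += gc.getCollectionCount();
			if (gc.getCollectionTime() > 0) timeMs += gc.getCollectionTime();
		}
		return new long[] { count, timeMs };
	}

	public static void main(String[] args) throws InterruptedException {
		long[] before = snapshot();
		Thread.sleep(1000); // fixed sampling interval
		long[] after = snapshot();
		System.out.println("collections in interval: " + (after[0] - before[0]));
	}
}
```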

Consider the earlier example of the fixed heap. The mean pause times gave a pretty deceptive impression of how well GC was working for the application. However, if you look at the rate of GC, you can see that it's higher for the fixed-heap runs. For example, in a comparison of the two optthruput runs shown in Figure 19, the rate is 12 percent higher when the heap size is set in advance:

Figure 19. The summary view of the rate of collection
The summary view of the rate of collection

You could also think of the rate of GC as the rate of garbage generation. At first glance, garbage generation might seem like a bad thing that should be minimized. It is true that an application that generates a lot of garbage is likely to perform poorly compared to one that generates less because it places more of a strain on the garbage collector — but that's not always so. For example, object pooling reduces the amount of garbage generated by an application but can seriously hurt garbage collection performance. (See the article on urban performance legends in Resources for more discussion of why this is the case.) More generally, holding on to object references that could be discarded reduces the amount of generated garbage but tends to hurt GC. If you scope variables appropriately and reduce the use of instance variables, you can reduce this kind of object retention.

If an application is under-loaded — that is, if it doesn't have enough work to do — the rate of GC is not a great performance indicator because the rate will drop if no work is coming in. For example, a server is not going to be generating much garbage if all of its clients are disconnected, but that doesn't mean that the server needs to be tuned. The good news is that if an application is under-loaded, it probably doesn't need much tuning anyway. If the aim is to speed up individual transactions, then zooming the GC log in on the period of a transaction will give suitable information.

Estimating application response times

What if you're more concerned about application response times than application throughput? It's tempting to assume that verbose GC pause times are a good indicator of the application response times. But this is only sometimes true, and even then only half true, so you need to be very cautious about what you infer from the pause times. If an application is under-loaded, the maximum pause time will be related to the maximum response time. However, the mean response time is usually proportional to the throughput instead. Thus, a policy with longer pause times (such as optthruput) may actually give lower average response times than one with short pause times (such as optavgpause). If an application is over-loaded, the pause times are even less important because work may have to queue up for service and the response time may be mostly determined by the length of the queue, which will be determined by the application throughput.


If you make the effort to look at verbose GC logs, you will often be rewarded with a better understanding of your application characteristics; you'll also be able to detect potentially serious problems with the application's memory usage and improve performance. The IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer is a powerful tool for getting the most out of the information available in the verbose GC.






