Elusive bandits: what are Java memory leaks?
If the term Java™ memory leak seems like a misnomer to you, rest assured that you are not alone; after all, you were promised that Garbage Collection (GC) in Java will relieve you from the mundane duties of allocating, tracking, and freeing the heap memory. Indeed, the promise has been delivered, so it is reasonable -- and to quite an extent correct -- for you to conclude that there will be no memory leaks in your Java program. However, the catch is that GC can take care of only typical heap management problems.
Think about your experience with C/C++ (or any other non-GC language for that
matter). What are typical heap management problems in C/C++? Two of the most
famous ones are: memory leaks and dangling pointers. Listing 1 shows a simple example with both errors. In this example, interestingly, by themselves methods foo and main seem to be error-free, but together they manifest both a memory leak and a dangling pointer. First, method foo overwrites pi
with a new allocation, and all pointers to the memory chunk allocated in main
are lost, leading to a memory leak. Then foo frees
up what it has allocated, but pi still holds the address (it is not nullified).
In main, after calling foo,
when pi is used, it is referring to memory that has
been freed, so pi is a dangling pointer.
Listing 1. An example of C/C++ memory leak and dangling pointer
#include <stdlib.h>

int *pi;

void foo() {
    pi = (int*) malloc(8*sizeof(int)); // oops, memory leak of 4 ints
    // use pi
    free(pi); // foo() is done with pi
}

int main() {
    pi = (int*) malloc(4*sizeof(int));
    foo();
    pi[0] = 10; // oops, pi is now a dangling pointer
    return 0;
}
In Java, you will never have a dangling pointer, because the GC will not reclaim a heap chunk if there are any references to it. Nor will you have a C/C++ type memory leak, because the GC reclaims a heap chunk once there are no references to it. Then, what are Java memory leaks? If a program holds a reference to a heap chunk that it never uses again during the rest of its life, that chunk is considered a memory leak because the memory could have been freed and reused. The GC won't reclaim it because of the reference held by the program. A Java program could run out of memory due to such leaks. Java memory leaks are mostly a result of non-obvious programming errors. A simple example is shown in Listing 2.
The LeakExample class has three methods: slowlyLeakingVector,
leakingRequestLog, and noLeak.
The slowlyLeakingVector method has a wrong condition in
the second for loop, due to which one String object
leaks in every iteration. If each element were a database record of
significant size instead of a String, this slow leak could fairly quickly
leave the application unresponsive and out of memory. The leakingRequestLog method
represents a very common case where an incoming request is kept in a hash table
until it is completed. In this example, however, the programmer has forgotten to
remove it from the hash table once the request is done. Over a period of time,
the hash table will accumulate many entries, which leads to hash collisions
as well as a large portion of the heap being occupied by useless hash entries. Both of
these are very common causes of Java memory leaks.
Instinctively, you may think that the locations of highest memory allocation
result in leaks. However, it may not be true. For example, the noLeak
method allocates a lot of memory, but all of it gets garbage collected. On the
other hand, the other two methods don't allocate that much memory, but are causing
memory leaks.
Listing 2. An example demonstrating Java Memory Leaks
import java.io.IOException;
import java.util.HashSet;
import java.util.Random;
import java.util.Vector;

public class LeakExample {
    static Vector myVector = new Vector();
    static HashSet pendingRequests = new HashSet();

    public void slowlyLeakingVector(int iter, int count) {
        for (int i=0; i<iter; i++) {
            for (int n=0; n<count; n++) {
                myVector.add(Integer.toString(n+i));
            }
            for (int n=count-1; n>0; n--) {
                // Oops, it should be n>=0
                myVector.removeElementAt(n);
            }
        }
    }

    public void leakingRequestLog(int iter) {
        Random requestQueue = new Random();
        for (int i=0; i<iter; i++) {
            int newRequest = requestQueue.nextInt();
            pendingRequests.add(Integer.valueOf(newRequest));
            // processed request, but forgot to remove it
            // from pending requests
        }
    }

    public void noLeak(int size) {
        HashSet tmpStore = new HashSet();
        for (int i=0; i<size; ++i) {
            String leakingUnit = new String("Object: " + i);
            tmpStore.add(leakingUnit);
        }
        // Although the highest memory allocation happens in this
        // method, all these objects become eligible for garbage
        // collection when the method returns, so there is no leak.
    }

    public static void main(String[] args) throws IOException {
        LeakExample javaLeaks = new LeakExample();
        for (int i=0; true; i++) {
            try { // sleep to slow down leaking process
                Thread.sleep(1000);
            } catch (InterruptedException e) { /* do nothing */ }
            System.out.println("Iteration: " + i);
            javaLeaks.slowlyLeakingVector(1000,10);
            javaLeaks.leakingRequestLog(5000);
            javaLeaks.noLeak(100000);
        }
    }
}
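As a minimal sketch, the two leaks in Listing 2 could be repaired along the following lines. The class name FixedLeakExample and the method names nonLeakingVector and nonLeakingRequestLog are hypothetical and do not appear in the original listing, and the request-log fix assumes that a request can be dropped from the pending set as soon as it has been processed.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.Vector;

// Hypothetical repaired counterpart of LeakExample (Listing 2).
class FixedLeakExample {
    static final Vector<String> myVector = new Vector<>();
    static final Set<Integer> pendingRequests = new HashSet<>();

    // Same add/remove pattern as slowlyLeakingVector, but the removal
    // loop now runs down to index 0, so no element is left behind.
    void nonLeakingVector(int iter, int count) {
        for (int i = 0; i < iter; i++) {
            for (int n = 0; n < count; n++) {
                myVector.add(Integer.toString(n + i));
            }
            for (int n = count - 1; n >= 0; n--) {
                myVector.removeElementAt(n);
            }
        }
    }

    // Same as leakingRequestLog, but each request is removed from the
    // pending set once it is done; finally also covers failed requests.
    void nonLeakingRequestLog(int iter) {
        Random requestQueue = new Random();
        for (int i = 0; i < iter; i++) {
            Integer newRequest = requestQueue.nextInt();
            pendingRequests.add(newRequest);
            try {
                // process the request here
            } finally {
                pendingRequests.remove(newRequest);
            }
        }
    }
}
```

With these changes both static collections are empty again after every call, so neither of them should accumulate objects between two heap dumps.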
IRAD, the super cop: Java leak detection using IRAD
Since you now know that the term Java memory leak is not a misnomer, and you have seen examples of such leaks, it is natural to ask how these leaks can be identified in Java applications. IBM® Rational® Application Developer (IRAD) can identify potential leaks by analyzing heap dumps. A heap dump is the state of heap memory at a given point in time, and IRAD has the ability to profile an application and collect heap dumps. Alternatively, you can import heap dumps in several formats (IBM® Text, IBM® PHD, IBM® SVC, HPROF) that were taken on various platforms (see Resource [1] for how to collect heap dumps using IBM JVM). IRAD can then analyze these heap dumps and find Java leaks. This section will explain how to collect or import heap dumps, do leak analysis, and understand the results. IRAD is an Eclipse-based tool, and most Java developers will find its GUI familiar and easy to use.
Collecting evidence: we call them heap dumps
Before collecting heap dumps, make sure that the IBM® Agent Controller is running. It is packaged and gets installed with IRAD. It usually runs as a service (see Resource [2] for more details on the IBM Agent Controller).
As mentioned previously, heap dumps can be collected by profiling a program, using these steps in the Profiling dialog.
- To start the Profiling dialog, from the Run menu, click Profile.
- Let's say you want to collect heap dumps for the LeakExample program (Listing 2). You will have to create a Configuration for it under the Java Application.
- Then go to the Profiling tab and select the Memory Leak Analysis - Manual heap dumps profiling set. Figure 1 shows the Profile dialog, and annotations highlight the Profiling tab and profiling set selection.
- Once you click the Profile button, IRAD will switch to the Profiling and Logging perspective.
- IRAD will start your application and begin profiling. It will be shown in the Profiling Monitor view in the perspective (see Figure 2).
- You can collect a heap dump by clicking the Capture Heap Dump button. Collect at least two heap dumps. Typically, you should take these heap dumps just before and after the sequence of operations suspected to cause a leak. Alternatively, you can take them at the beginning and at the end of the program.
- When you have collected the dumps, terminate the program.
Figure 1. Profile dialog
Figure 2. The Capture Heap Dump button in Profiling Monitor View
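Outside IRAD, a heap dump in HPROF format can also be triggered from code. The following is only a sketch under stated assumptions: it relies on the com.sun.management.HotSpotDiagnosticMXBean interface, which is available on HotSpot-based JVMs but not necessarily on the IBM JVM, and the helper class HprofDumper and its method dumpToTempDir are hypothetical names. Whether a particular IRAD version can import the resulting file is not verified here.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of IRAD; assumes a HotSpot-based JVM.
class HprofDumper {
    // Writes an HPROF dump named fileName into a fresh temporary directory
    // and returns its path. Only reachable (live) objects are dumped.
    static Path dumpToTempDir(String fileName) {
        try {
            Path file = Files.createTempDirectory("heapdumps").resolve(fileName);
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(file.toString(), true);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path before = dumpToTempDir("before.hprof");
        // ... run the operations suspected of leaking here ...
        Path after = dumpToTempDir("after.hprof");
        System.out.println("Wrote " + before + " and " + after);
    }
}
```

As with the manual procedure, the two dumps bracket the suspected operations so that they can be compared as a pair.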
Probing the past: importing heap dumps
Alternatively, you might want to analyze heap dumps taken in customer environments. You can import these heap dumps through the Import wizard using these steps.
- Go to the Profiling Monitor view and right-click.
- On the context menu, click Import, select the Heap Dump option, and click the Next button.
- It will then ask you to select the type of heap dump you are importing.
- Then it will show you a window similar to Figure 3. Browse your file systems and select the heap dumps that you want to import.
- Click the Finish button, and IRAD will import the heap dumps.
Figure 3. Import Heap dump
Getting on the trail: doing leak analysis
Once you have collected or imported the heap dumps, you can analyze them
for leaks by clicking the Analyze for Leak button shown in Figure
4. It will bring up the Leak Analysis dialog, in which you should
select two heap dumps, set the threshold value to 1, and click
the OK button. IRAD will perform leak analysis and bring up the
Leak Candidate view. Figure 5 shows the
Leak Candidate view with leaks in the LeakExample
program (Listing 2). IRAD found that there
are two potential leak candidates: the first is a Vector
in LeakExample accumulating String objects,
and the second is a HashMap (the HashSet named pendingRequests
stores its elements in an internal HashMap).
Figure 4. Start Leak Analysis
Figure 5. Leak Candidate view
The great chase: browsing the Object Reference Graph
IRAD allows you to browse the Object Reference Graph (ORG) in a heap
dump. If you click a leak candidate in the Leak Candidate view,
it will bring up the ORG with a highlighted path characterizing the
leak. Figure 6 shows a screen shot with leak
candidates and the ORG for a program named simpleleaker. For example,
if you click the second leak candidate (that is, simpleleaker
class having a Vector that is leaking String
objects), IRAD will bring up the ORG with the path from simpleleaker
to Vector to String
highlighted. This is very helpful for non-trivial programs having
complex object containment and ownership. By looking at this path,
you can start from the class that is the root of the leak (simpleleaker
or LeakExample) and trace down to the object
that is leaking. The following sections of this article will examine
various characteristics of a leak candidate.
Figure 6. Screen shot showing Leak Candidate and Object Reference Graph views
Modus operandi: understanding leak analysis results
Now that you know how to use IRAD to find memory leaks in your Java application, we can discuss how to interpret the results. IRAD's leak analysis identifies Leak Regions.
A Leak Region is a collection of objects that characterizes a potential
memory leak. A Leak Region consists of a Leak Root, a Container,
an Owner Chain, and Leaking Units (see Figure
7). A Leak Root is the object at the head of a data structure
that is leaking in one or more ways (in other words, a Leak Root
can be part of multiple Leak Regions). In our example, for the second
leak candidate, the leak root is an object of type LeakExample.
A container is the object that owns all leaked objects in the region
(in our example it is HashMap). A container
is typically one of the collection classes such as Vector,
List, Hash,
and so on. The Owner Chain is the path from Leak Root to Container.
The Leaking Unit is the type of the leaked objects (in our example
it is HashMap$Entry).
In the Leak Candidate view, each Leak Candidate represents a Leak Region. By understanding the constituents of a Leak Region, you can examine the relationships of classes and identify the code that is accumulating leaking units.
Figure 5 shows the Leak Candidate View with columns representing Root of leak, Container type, and What's leaking (unit type). The Owner Chain can be identified in the Object Reference Graph view (Figure 6). The Leak Candidate view also shows Number of leaks, Objects leaked, and Bytes leaked. The Number of leaks is the count of all objects of the Leaking Unit type that are held in the Container. These objects may in turn refer to other objects (leak content). The Objects leaked and Bytes leaked represent the number and size of objects in the leak content.
Figure 7. Leak Region
IRAD detects potential memory leaks by identifying a collection of objects that exhibits unbounded growth. Programming constructs such as object pools exhibit similar behavior. An object pool consists of a predefined number of objects. When an object is needed, it is borrowed from the pool, and when the need is over, it is returned to the pool. An object is allocated only when it is requested for the first time; after that it is reused and never garbage collected. This is a very powerful technique for avoiding the overhead of excessive object creation and garbage collection, and it is widely used for maintaining network or database connection pools. IRAD will identify these also as potential memory leaks. To avoid these false positives, we suggest that you take the first heap dump when caches and object pools are already active. Also, if you conclude that a leak candidate belongs to an object pool, ignore it and move on to fixing the next leak candidate.
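To make the object pool behavior concrete, here is a minimal sketch of a fixed-capacity pool. The class SimplePool and its methods borrow, giveBack, and createdCount are hypothetical names chosen for illustration; real connection pools are considerably more elaborate.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical fixed-capacity pool: objects are created lazily, then reused.
class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int capacity;
    private int created = 0;

    SimplePool(int capacity, Supplier<T> factory) {
        this.capacity = capacity;
        this.factory = factory;
    }

    // Hands out an idle object, or allocates one if the pool is not yet full.
    synchronized T borrow() {
        if (!idle.isEmpty()) {
            return idle.pop();
        }
        if (created >= capacity) {
            throw new IllegalStateException("pool exhausted");
        }
        created++;
        return factory.get();
    }

    // Takes an object back; the pool keeps a reference to it, so it is
    // not garbage collected for as long as the pool itself is reachable.
    synchronized void giveBack(T obj) {
        idle.push(obj);
    }

    // Number of objects allocated so far; never exceeds the capacity.
    synchronized int createdCount() {
        return created;
    }
}
```

If the first heap dump is taken before the pool has handed out any objects and the second one after it has filled up, the pool's internal container appears to grow, which is why it can show up as a leak candidate. Between two dumps taken after the pool is full, the container no longer grows.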
Testimonies: experiences of the IBM JVM L3 support group
The Java leak detection technology is based on Leakbot (see Resource [5] for more details), technology invented in IBM® Research Labs. The results are highly accurate, and it can detect even very slow leaks. In this section we share some of the experiences from the IBM JVM L3 support group, which receives several issues related to memory leaks in customer applications. L3 engineers were using HeapRoots (Resource [6]) to browse and query the heap dumps based on various parameters such as object id, sub-tree size, and heap occupancy. Depending on the skill of the engineer and the complexity of the case, an engineer would usually spend a few hours to a day manually analyzing heap dumps to identify leaks. IRAD has automated this process. Programmers can do other tasks while IRAD is doing leak analysis, thus ensuring a much quicker turnaround and significant savings in developer time.
We selected a set of cases that represented the whole spectrum of complexity (heap dumps having 100,000 to 22 million objects). The test cases encompass different kinds of heap dumps (IBM Text and IBM PHD) on various platforms (AIX, Linux, and Windows). Table 1 shows characteristics of these cases. Each test case consists of a pair of heap dumps. We noted object count, reference count (number of instances where an object contains reference to another object), and size of each heap dump. We performed leak analysis using a 2.52 GHz Intel Pentium machine with 2 GB RAM, running Windows XP.
The time taken by IRAD depends on the number of objects and references in the heap dumps. Typically in a development environment, a heap dump has between a hundred thousand and a million objects. IRAD takes half a minute to 5 minutes for these cases. The heap dumps from production environments contain a few million objects. One of the biggest heap dumps with the JVM L3 group has around 22 million objects. IRAD took 2.5 hours to do the leak analysis. Without IRAD, it would have taken an L3 engineer a couple of days to identify the leaks. Also, since this is machine time instead of developer time, L3 engineers found IRAD very useful: it freed them to do other tasks and reduced costs.
Table 1. Various customer cases handled by IBM JVM L3 Support team
| Application | # Objects (1st dump) | # Objects (2nd dump) | # References (1st dump) | # References (2nd dump) | GZipped file size in KB (1st dump) | GZipped file size in KB (2nd dump) | IRAD time (sec) |
| Investment Bank | 41,694 | 101,170 | 54,667 | 96,159 | 335 | 684 | 39 |
| Global Bank | 834,514 | 846,893 | 1,173,227 | 1,192,097 | 6,479 | 6,571 | 333 |
| CRM | 3,465,012 | 3,439,322 | 4,850,147 | 4,991,909 | 27,092 | 27,334 | 898 |
| Web Portal | 4,874,567 | 4,901,711 | 8,648,962 | 8,746,019 | 42,420 | 42,774 | 1,438 |
| IT Infrastructure | 4,644,259 | 4,668,908 | 13,440,098 | 13,506,208 | 41,035 | 41,364 | 2,216 |
| Movie Rental | 8,089,934 | 21,785,658 | 14,729,462 | 39,251,283 | 71,595 | 196,820 | 9,109 |
Prevent the crime: leak detection during the Development Phase
It is said that prevention is better than cure. You can surely use IRAD to fix Java leak problems in production systems, but dividends will be higher if you incorporate Java leak detection into your Development and Testing phase. For every operation, at logical break points, you should take heap dumps and analyze them for leaks. For example, a banking system will have an account opening operation, a deposit operation, and a withdrawal operation. For each such operation, you should take heap dumps at the beginning and at the end, and do leak analysis on the pair. Essentially, you should take heap dumps along all control paths that are exercised in your testing, and do leak analysis on all pairs of consecutive heap dumps. In this way, you can detect Java leaks during system development rather than post deployment.
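The practice described above can be sketched as a small test harness that brackets each operation with a "before" and an "after" snapshot and remembers which pairs to analyze. Everything here is an assumption made for illustration: the class LeakCheckHarness is hypothetical, and the snapshotter callback stands in for whatever actually captures a heap dump (for example, a person clicking Capture Heap Dump, or a script).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical harness: snapshots before and after each operation and
// records the pairs of snapshot labels that should be compared for leaks.
class LeakCheckHarness {
    private final Consumer<String> snapshotter; // captures a heap dump under the given label
    private final List<String[]> pairs = new ArrayList<>();

    LeakCheckHarness(Consumer<String> snapshotter) {
        this.snapshotter = snapshotter;
    }

    // Runs one operation between two labeled snapshots.
    void run(String name, Runnable operation) {
        String before = name + "-before";
        String after = name + "-after";
        snapshotter.accept(before);
        try {
            operation.run();
        } finally {
            // Take the second snapshot even if the operation fails.
            snapshotter.accept(after);
            pairs.add(new String[] {before, after});
        }
    }

    // Pairs of snapshot labels on which leak analysis should be run.
    List<String[]> pairsToAnalyze() {
        return pairs;
    }
}
```

For the banking example, a test could call run("openAccount", ...), run("deposit", ...), and run("withdraw", ...) in turn, and then perform leak analysis on each recorded pair.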
And then there were none...
- IRAD space at developerWorks
- Runtime problem determination with RAD
- [1] For information on how to take heap dumps using IBM JVM, see IBM JVM Diagnostic Guide.
- [2] To learn how to set up and start IBM Agent Controller, see IBM Redbook Rational Application Developer V6 Programming Guide.
- [3] For an overview of RAD profiling capabilities, read the article "Runtime problem determination with IBM Rational Application Developer" by Tanuj Vohra.
- [4] To understand profiling concepts in Eclipse, read An introduction to profiling Java applications.
- [5] For details of the Leakbot project at IBM Research and other related publications.
- [6] For information about the HeapRoots project on IBM alphaWorks.

Satish Chandra Gupta is a developer in the IBM Rational® PurifyPlus® group in Bangalore, India. His interests include compilers, programming languages, runtime analysis, Java memory leaks, type theory, software engineering, and software development environments. His research has been published in ACM/IEEE conferences. He received a B.Tech. from the Indian Institute of Technology in Kanpur (India), and an M.S. from the University of Wisconsin in Milwaukee (USA).

Rajeev Palanki is an engineer with the IBM JVM support group in Bangalore, India. He has extensive experience in debugging Java memory leaks. He has co-authored the Java Diagnostics guide, a number of developerWorks articles, and an IBM Redbook. He holds a B.E. from Government Engineering College in Bhopal, India.




