Level: Intermediate
Satish Chandra Gupta (satish.gupta@in.ibm.com), Programmer, RAD/PurifyPlus, Rational Software, IBM
Rajeev Palanki (rpalanki@in.ibm.com), Level 3 Java Support, IBM
16 Aug 2005
Garbage collection in the Java™ programming
language simplifies memory management and eliminates typical memory problems.
However, contrary to popular belief, garbage collection cannot take care of all
memory problems. One such problem is the Java memory leak, which is harder to
detect because it usually results from design and implementation errors (for
example, a reference to an object kept beyond its useful life). This article demystifies
Java memory leaks, provides an overview of the Java leak detection technology
in IBM® Rational® Application Developer (IRAD), and shares the IBM JVM
L3 support group's experiences using the technology. It proposes exploiting this
technology in the Development and Testing phases, and suggests how to design test
cases for detecting Java memory leaks.
Elusive bandits: what are Java memory leaks?
If the term Java™ memory leak seems like a misnomer to you, rest
assured that you are not alone; after all, you were promised that Garbage Collection
(GC) in Java will relieve you from the mundane duties of allocating, tracking,
and freeing the heap memory. Indeed, the promise has been delivered, so it is
reasonable -- and to quite an extent correct -- for you to conclude that there
will be no memory leaks in your Java program. However, the catch is that GC
can take care of only typical heap management problems.
Think about your experience with C/C++ (or any other non-GC language for that
matter). What are typical heap management problems in C/C++? Two of the most
famous ones are memory leaks and dangling pointers. Listing 1 shows a simple
example with both errors. Interestingly, the functions foo and main each seem
to be error-free by themselves, but together they produce both a memory leak
and a dangling pointer. First, foo overwrites pi
with a new allocation, so all pointers to the memory chunk allocated in main
are lost, leading to a memory leak. Then foo frees
what it has allocated, but pi still holds the address (it is not set to NULL).
In main, after the call to foo,
pi refers to memory that has
been freed, so pi is a dangling pointer when it is used.
Listing 1. An example of C/C++ memory leak and dangling pointer
#include <stdlib.h>

int *pi;

void foo(void) {
    pi = (int*) malloc(8*sizeof(int)); // oops, memory leak of 4 ints
    // use pi
    free(pi); // foo() is done with pi
}

int main(void) {
    pi = (int*) malloc(4*sizeof(int));
    foo();
    pi[0] = 10; // oops, pi is now a dangling pointer
    return 0;
}
In Java, you will never have a dangling pointer, because the GC will
not reclaim a heap chunk if there are any references to it. Nor will you have
a C/C++-style memory leak, because the GC will eventually reclaim any heap chunk
that has no references to it. Then, what are Java memory leaks? If
a program holds a reference to a heap chunk that is not used during the rest
of its life, it is considered a memory leak because the memory could have been
freed and reused. GC won't reclaim it due to the reference being held by the
program. A Java program could run out of memory due to such leaks. Java memory
leaks are mostly a result of non-obvious programming errors. A simple example
is shown in Listing 2.
The LeakExample class has three methods: slowlyLeakingVector,
leakingRequestLog, and noLeak.
The slowlyLeakingVector method has a wrong condition in
the second for loop, so one String object
leaks in every iteration of the outer loop. If each element were a
database record of significant size instead of a String, this slow leak could
leave the application unresponsive and out of memory fairly quickly. The leakingRequestLog
method represents a very common case in which an incoming request is kept in a hash table
(here, a HashSet) until it is completed. In this example, however, the programmer has forgotten to
remove it from the hash table once the request is done. Over time,
the hash table accumulates entries, which leads to hash collisions
and leaves a large portion of the heap occupied by useless entries. Both of
these are very common causes of Java memory leaks.
Instinctively, you may think that the locations with the highest memory allocation
are the ones that leak. However, that is not necessarily true. For example, the noLeak
method allocates a lot of memory, but all of it gets garbage collected. On the
other hand, the other two methods do not allocate that much memory, but they cause
memory leaks.
Listing 2. An example demonstrating Java memory leaks
import java.util.HashSet;
import java.util.Random;
import java.util.Vector;

public class LeakExample {
    static Vector myVector = new Vector();
    static HashSet pendingRequests = new HashSet();

    public void slowlyLeakingVector(int iter, int count) {
        for (int i=0; i<iter; i++) {
            for (int n=0; n<count; n++) {
                myVector.add(Integer.toString(n+i));
            }
            for (int n=count-1; n>0; n--) { // Oops, it should be n>=0
                myVector.removeElementAt(n);
            }
        }
    }

    public void leakingRequestLog(int iter) {
        Random requestQueue = new Random();
        for (int i=0; i<iter; i++) {
            int newRequest = requestQueue.nextInt();
            pendingRequests.add(Integer.valueOf(newRequest));
            // processed request, but forgot to remove it
            // from pending requests
        }
    }

    public void noLeak(int size) {
        HashSet tmpStore = new HashSet();
        for (int i=0; i<size; ++i) {
            String leakingUnit = new String("Object: " + i);
            tmpStore.add(leakingUnit);
        }
        // The highest memory allocation happens in this method,
        // but tmpStore and its contents become unreachable when
        // the method returns, so there is no leak.
    }

    public static void main(String[] args) {
        LeakExample javaLeaks = new LeakExample();
        for (int i=0; true; i++) {
            try { // sleep to slow down leaking process
                Thread.sleep(1000);
            } catch (InterruptedException e) { /* do nothing */ }
            System.out.println("Iteration: " + i);
            javaLeaks.slowlyLeakingVector(1000,10);
            javaLeaks.leakingRequestLog(5000);
            javaLeaks.noLeak(100000);
        }
    }
}
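For comparison, here is a minimal sketch of how the two leaking methods in Listing 2 might be corrected. The class name FixedLeakExample, the method names, and the use of the loop index as a stand-in request id are assumptions made for illustration; they are not part of the original listing.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.Vector;

// Hypothetical corrected counterpart of LeakExample (Listing 2).
class FixedLeakExample {
    static final Vector<String> myVector = new Vector<>();
    static final Set<Integer> pendingRequests = new HashSet<>();

    // Same work as slowlyLeakingVector, but the removal loop now reaches index 0,
    // so nothing is left behind after each outer iteration.
    static void nonLeakingVector(int iter, int count) {
        for (int i = 0; i < iter; i++) {
            for (int n = 0; n < count; n++) {
                myVector.add(Integer.toString(n + i));
            }
            for (int n = count - 1; n >= 0; n--) {
                myVector.removeElementAt(n);
            }
        }
    }

    // Same idea as leakingRequestLog, but every request is removed from the
    // pending set once it has been handled, even if handling throws.
    static void nonLeakingRequestLog(int iter) {
        for (int i = 0; i < iter; i++) {
            Integer request = i; // stand-in for an incoming request id (assumption)
            pendingRequests.add(request);
            try {
                // process the request here
            } finally {
                pendingRequests.remove(request);
            }
        }
    }

    public static void main(String[] args) {
        nonLeakingVector(1000, 10);
        nonLeakingRequestLog(5000);
        System.out.println("Vector size: " + myVector.size());             // prints 0
        System.out.println("Pending requests: " + pendingRequests.size()); // prints 0
    }
}
```

Putting the removal in a finally block is one design choice among several; the point is only that the entry's lifetime in the collection is tied to the lifetime of the request.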
IRAD, the super cop: Java leak detection using IRAD
Since you now know that Java memory leak is not a misnomer, and you
have seen examples of such leaks, it is natural to ask how these leaks can be
identified in Java applications. IBM® Rational® Application Developer
(IRAD) can identify potential leaks by analyzing heap dumps. A heap dump is
the state of heap memory at a given point in time, and IRAD has the ability
to profile an application and collect heap dumps. Alternatively, you can import
heap dumps in several formats (IBM® Text, IBM® PHD, IBM® SVC, HPROF) taken
on various platforms (see Resource [1] for how to collect
heap dumps using IBM JVM). IRAD can then analyze these heap dumps and find Java
leaks. This section will explain how to collect or import heap dumps, do leak
analysis, and understand the results. IRAD is an Eclipse-based tool, and most
Java developers will find its GUI familiar and easy to use.
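As an aside on what "a heap dump at a given point in time" looks like from code, the following is a minimal sketch that writes two HPROF dumps around a block of work. It assumes a HotSpot-based JVM (for example, OpenJDK), where the com.sun.management.HotSpotDiagnosticMXBean interface is available; it is not the IBM JVM mechanism described in Resource [1], and the class name HeapDumper and the file names are made up for illustration. Whether a given IRAD release can import dumps produced this way is not something this sketch claims.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; works only on JVMs that expose HotSpotDiagnosticMXBean.
class HeapDumper {
    // Writes an HPROF heap dump to the given file. The file must not exist yet,
    // and recent JDKs require the name to end in ".hprof".
    static void dumpHeap(Path file, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toString(), liveObjectsOnly);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdumps");

        Path before = dir.resolve("before.hprof");
        dumpHeap(before, true);

        // ... run the operations suspected of leaking here ...

        Path after = dir.resolve("after.hprof");
        dumpHeap(after, true);

        System.out.println("Wrote " + before + " and " + after);
    }
}
```

Passing true for liveObjectsOnly asks the JVM to dump only reachable objects, which is usually what you want when looking for objects that are being retained.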
Collecting evidence: we call them heap dumps
Before collecting heap dumps, make sure that the IBM® Agent Controller
is running. It is packaged and gets installed with IRAD. It usually runs as
a service (see Resource [2] for more details on the
IBM Agent Controller).
As mentioned previously, heap dumps can be collected by profiling a program,
using these steps in the Profiling dialog.
- To start the Profiling dialog, from the Run menu, click Profile.
- Let's say you want to collect heap dumps for the LeakExample program (Listing 2). You will have to create a Configuration for it under Java Application.
- Then go to the Profiling tab and select the Memory Leak Analysis - Manual heap dumps profiling set. Figure 1 shows the Profile dialog, and annotations highlight the Profiling tab and profiling set selection.
- Once you click the Profile button, IRAD will switch to the Profiling and Logging perspective.
- IRAD will start your application and begin profiling. The application will be shown in the Profiling Monitor view in the perspective (see Figure 2).
- You can collect a heap dump by clicking the Capture Heap Dump button. Collect at least two heap dumps. Typically, you should take them just before and after the sequence of operations suspected of causing the leak, or simply at the beginning and end of the program.
- When you have collected the dumps, terminate the program.
Figure 1. Profile dialog
Figure 2. The Capture Heap Dump button in Profiling Monitor View
Probing the past: importing heap dumps
Alternatively, you might want to analyze heap dumps taken in customer environments.
You can import these heap dumps through the Import wizard using these steps.
- Go to the Profiling Monitor view and right-click.
- On the context menu, click Import, select the Heap Dump
option, and click the Next button.
- It will then ask you to select the type of heap dump you are importing.
- Then it will show you a window similar to Figure 3. Browse your file systems and select the heap dumps that you want to import.
- Click the Finish button, and IRAD will import the heap dumps.
Figure 3. Import Heap dump
Getting on the trail: doing leak analysis
Once you have collected or imported the heap dumps, you can analyze them
for leaks by clicking the Analyze for Leak button shown in Figure 4.
It will bring up the Leak Analysis dialog, in which you should
select two heap dumps, set the threshold value to 1, and click
the OK button. IRAD will perform leak analysis and bring up the
Leak Candidate view. Figure 5 shows the
Leak Candidate view with leaks in the LeakExample
program (Listing 2). IRAD found two potential
leak candidates: the first is a Vector
in LeakExample accumulating String objects,
and the second is a HashMap (the HashSet used in Listing 2 stores its elements in an internal HashMap).
Figure 4. Start Leak Analysis
Figure 5. Leak Candidate view
The great chase: browsing the Object Reference Graph
IRAD allows you to browse the Object Reference Graph (ORG) in a heap
dump. If you click a leak candidate in the Leak Candidate view,
it will bring up the ORG with a highlighted path characterizing the
leak. Figure 6 shows a screen shot with leak
candidates and the ORG for a program named simpleleaker. For example,
if you click the second leak candidate (that is, simpleleaker
class having a Vector that is leaking String
objects), IRAD will bring up the ORG with the path from simpleleaker
to Vector to String
highlighted. This is very helpful for non-trivial programs having
complex object containment and ownership. By looking at this path,
you can start from the class that is the root of the leak (simpleleaker
or LeakExample) and trace down to the object
that is leaking. The following sections of this article will examine
various characteristics of a leak candidate.
Figure 6. Screen shot showing Leak Candidate and Object Reference Graph views
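To make the idea of a highlighted path concrete, here is a small sketch that finds a path from a root to a leaking object in a toy reference graph using breadth-first search. The graph is modeled as a map from an object's label to the labels it references; the class name ReferencePath, the "Logger" node, and the graph contents are invented for illustration, and this is not a description of how IRAD computes or draws the ORG.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper operating on a toy "object -> referenced objects" graph.
class ReferencePath {
    // Breadth-first search from root to target. Returns the path including
    // both ends, or an empty list if target is not reachable from root.
    static List<String> findPath(Map<String, List<String>> graph, String root, String target) {
        Map<String, String> parent = new HashMap<>(); // node -> node it was reached from
        Deque<String> queue = new ArrayDeque<>();
        parent.put(root, null);
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                List<String> path = new ArrayList<>();
                for (String node = current; node != null; node = parent.get(node)) {
                    path.add(node);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : graph.getOrDefault(current, Collections.<String>emptyList())) {
                if (!parent.containsKey(next)) { // visit each node once
                    parent.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("simpleleaker", Arrays.asList("Logger", "Vector")); // "Logger" is made up
        graph.put("Vector", Arrays.asList("String"));
        System.out.println(findPath(graph, "simpleleaker", "String"));
        // prints [simpleleaker, Vector, String]
    }
}
```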
Modus operandi: understanding leak analysis results
Now that you know how to use IRAD to find memory leaks in your
Java application, we can discuss how to interpret the results. IRAD's
leak analysis identifies Leak Regions.
A Leak Region is a collection of objects that characterizes a potential
memory leak. A Leak Region consists of a Leak Root, a Container,
an Owner Chain, and Leaking Units (see Figure
7). A Leak Root is the object at the head of a data structure
that is leaking in one or more ways (in other words, a Leak Root
can be part of multiple Leak Regions). In our example, for the second
leak candidate, the leak root is an object of type LeakExample.
A container is the object that owns all leaked objects in the region
(in our example it is HashMap). A container
is typically one of the collection classes such as Vector,
List, Hash,
and so on. The Owner Chain is the path from Leak Root to Container.
The Leaking Unit is the type of the leaked objects (in our example
it is HashMap$Entry).
In the Leak Candidate view, each Leak Candidate represents a Leak
Region. By understanding the constituents of a Leak Region, you
can examine the relationships of classes and identify the code that
is accumulating leaking units.
Figure 5 shows the Leak Candidate
View with columns representing Root of leak, Container
type, and What's leaking (unit type). The Owner Chain
can be identified in the Object Reference Graph view (Figure
6). The Leak Candidate view also shows Number of leaks,
Objects leaked, and Bytes leaked. The Number of
leaks is the count of all objects of the Leaking Unit type that
are held in the Container. These objects may in turn refer to other
objects (leak content). The Objects leaked and Bytes leaked
represent the number and size of objects in the leak content.
Figure 7. Leak Region
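One way to keep the parts of a Leak Region straight is to picture them as fields of a simple value class. The sketch below is only a mental model: the class LeakRegion, its field names, and the sample values (including the count of 5000 and the intermediate HashSet in the owner chain) are assumptions for illustration, not IRAD's internal representation or output.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the terms defined above; not an IRAD API.
final class LeakRegion {
    final String leakRoot;         // object at the head of the leaking data structure
    final List<String> ownerChain; // objects between the Leak Root and the Container
    final String container;        // object that owns all leaked objects in the region
    final String leakingUnitType;  // type of the leaked objects
    final long numberOfLeaks;      // Leaking Unit instances held in the Container

    LeakRegion(String leakRoot, List<String> ownerChain, String container,
               String leakingUnitType, long numberOfLeaks) {
        this.leakRoot = leakRoot;
        this.ownerChain = Collections.unmodifiableList(new ArrayList<>(ownerChain));
        this.container = container;
        this.leakingUnitType = leakingUnitType;
        this.numberOfLeaks = numberOfLeaks;
    }

    // Renders the region as "root -> ... -> container holds N x unitType".
    String describe() {
        StringBuilder sb = new StringBuilder(leakRoot);
        for (String owner : ownerChain) {
            sb.append(" -> ").append(owner);
        }
        sb.append(" -> ").append(container);
        sb.append(" holds ").append(numberOfLeaks).append(" x ").append(leakingUnitType);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample values loosely based on the second leak candidate; the count is made up.
        LeakRegion region = new LeakRegion("LeakExample",
                Collections.singletonList("HashSet"), "HashMap", "HashMap$Entry", 5000);
        System.out.println(region.describe());
        // prints LeakExample -> HashSet -> HashMap holds 5000 x HashMap$Entry
    }
}
```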
IRAD detects potential memory leaks by identifying a collection
of objects that exhibits unbounded growth. Programming constructs
such as object pools exhibit similar behavior. An object pool
consists of a predefined number of objects. When an object is needed,
it is borrowed from the pool, and when the need is over, it is returned
to the pool. An object is allocated only when it is requested
for the first time; after that it is reused and never garbage collected.
Pooling is a very powerful technique for avoiding the overhead of excessive object
creation and garbage collection, and it is widely used for maintaining
network or database connection pools. IRAD will also identify these
as potential memory leaks. To avoid such false positives, we suggest that you take the first heap
dump at a point where caches and object pools are already active. Also, if you conclude that a leak candidate belongs
to an object pool, ignore it and move on to fixing the next leak
candidate.
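The pooling behavior described above can be sketched in a few lines. The class SimplePool and its method names are hypothetical, and real connection pools are considerably more involved; the sketch only shows why pooled objects stay reachable on purpose, and how warming the pool up before the first heap dump keeps it from looking like growth between two dumps.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical minimal object pool, for illustration only.
final class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int capacity;
    private int created; // objects allocated so far; never exceeds capacity

    SimplePool(int capacity, Supplier<T> factory) {
        this.capacity = capacity;
        this.factory = factory;
    }

    // Reuses an idle object if one exists; allocates only on first demand.
    synchronized T borrow() {
        if (!idle.isEmpty()) {
            return idle.pop();
        }
        if (created >= capacity) {
            throw new IllegalStateException("pool exhausted");
        }
        created++;
        return factory.get();
    }

    // Returned objects stay referenced by the pool, so they are never garbage collected.
    synchronized void giveBack(T obj) {
        idle.push(obj);
    }

    // Allocates every remaining object up front, e.g. before taking the first heap dump.
    synchronized void warmUp() {
        while (created < capacity) {
            created++;
            idle.push(factory.get());
        }
    }

    synchronized int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(2, StringBuilder::new);
        StringBuilder first = pool.borrow();
        pool.giveBack(first);
        StringBuilder second = pool.borrow();
        System.out.println(first == second);     // prints true: the object was reused
        System.out.println(pool.createdCount()); // prints 1
    }
}
```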
Testimonies: experiences of the IBM JVM L3 support group
The Java leak detection technology is based on Leakbot (see Resource
[5] for more details), a technology invented at IBM® Research
Labs. The results are highly accurate, and the technology can detect even very
slow leaks. In this section we share some of the experiences from
the IBM JVM L3 support group, which receives several issues related
to memory leaks in customer applications. L3 engineers were using
HeapRoots (Resource [6]) to browse and
query the heap dumps based on various parameters such as object
id, sub-tree size, and heap occupancy. Based on the skill of the
engineer and the complexity of the case, the engineer would usually
spend a few hours to a day manually analyzing heap dumps to identify
leaks. IRAD has automated this process. Programmers can do other
tasks while IRAD is doing leak analysis, which means a much quicker
turnaround and significant savings in developer time.
We selected a set of cases that represented the whole spectrum
of complexity (heap dumps having 100,000 to 22 million objects).
The test cases encompass different kinds of heap dumps (IBM Text
and IBM PHD) on various platforms (AIX, Linux, and Windows). Table
1 shows characteristics of these cases. Each test case consists
of a pair of heap dumps. We noted object count, reference count
(number of instances where an object contains reference to another
object), and size of each heap dump. We performed leak analysis
using a 2.52 GHz Intel Pentium machine with 2 GB RAM, running Windows
XP.
The time taken by IRAD depends on the number of objects and references
in the heap dumps. Typically in a development environment, a heap
dump has between a hundred thousand and a million objects. IRAD
takes half a minute to 5 minutes for these cases. The heap dumps
from production environments contain a few million objects. One
of the biggest heap dumps with the JVM L3 group has around 22 million
objects. IRAD took 2.5 hours to do the leak analysis. Without IRAD,
it would have taken an L3 engineer a couple of days to identify
the leaks. Also, since the analysis consumes machine time instead of developer time,
L3 engineers found IRAD very useful: it freed them to do other tasks
and reduced costs.
Table 1. Various customer cases handled by IBM JVM L3 Support team
| Application | # Objects (1st dump) | # Objects (2nd dump) | # References (1st dump) | # References (2nd dump) | GZipped file size in KB (1st dump) | GZipped file size in KB (2nd dump) | RAD time (sec) |
|---|---|---|---|---|---|---|---|
| Investment Bank | 41,694 | 101,170 | 54,667 | 96,159 | 335 | 684 | 39 |
| Global Bank | 834,514 | 846,893 | 1,173,227 | 1,192,097 | 6,479 | 6,571 | 333 |
| CRM | 3,465,012 | 3,439,322 | 4,850,147 | 4,991,909 | 27,092 | 27,334 | 898 |
| Web Portal | 4,874,567 | 4,901,711 | 8,648,962 | 8,746,019 | 42,420 | 42,774 | 1,438 |
| IT Infrastructure | 4,644,259 | 4,668,908 | 13,440,098 | 13,506,208 | 41,035 | 41,364 | 2,216 |
| Movie Rental | 8,089,934 | 21,785,658 | 14,729,462 | 39,251,283 | 71,595 | 196,820 | 9,109 |
Prevent the crime: leak detection during the Development Phase
It is said that prevention is better than cure. You can certainly
use IRAD to fix Java leak problems in production systems, but the dividends
will be higher if you incorporate Java leak detection into your
Development and Testing phases. For every operation, you should take
heap dumps at logical breakpoints and analyze them for leaks.
For example, a banking system will have an account opening
operation, a deposit operation, and a withdrawal operation. For each such
operation, you should take heap dumps at the beginning and at the
end, and do leak analysis on the pair. Essentially, you should take
heap dumps along all control paths that are exercised in your testing,
and do leak analysis on all pairs of consecutive heap dumps. In
this way, you can detect Java leaks during system development rather
than after deployment.
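The test-design advice above can be sketched as a small driver that takes a dump before the first operation and after each subsequent one, then lists the consecutive pairs to analyze. Everything here is hypothetical: the class LeakTestPlan, the label format, and the takeDump callback, which stands in for whatever mechanism you use to capture a heap dump (for example, the Capture Heap Dump button or a JVM-specific API).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical test driver; takeDump is a placeholder for a real dump mechanism.
class LeakTestPlan {
    // Takes a dump before the first operation and after every operation.
    // Returns the dump labels in the order in which the dumps were taken.
    static List<String> runWithDumps(Map<String, Runnable> operations, Consumer<String> takeDump) {
        List<String> labels = new ArrayList<>();
        String label = "start";
        takeDump.accept(label);
        labels.add(label);
        for (Map.Entry<String, Runnable> op : operations.entrySet()) {
            op.getValue().run();
            label = "after-" + op.getKey();
            takeDump.accept(label);
            labels.add(label);
        }
        return labels;
    }

    // Each pair of consecutive dumps is one input to leak analysis.
    static List<String[]> consecutivePairs(List<String> labels) {
        List<String[]> pairs = new ArrayList<>();
        for (int i = 0; i + 1 < labels.size(); i++) {
            pairs.add(new String[] { labels.get(i), labels.get(i + 1) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the operations in insertion order.
        Map<String, Runnable> operations = new LinkedHashMap<>();
        operations.put("openAccount", () -> { /* exercise account opening */ });
        operations.put("deposit", () -> { /* exercise deposit */ });
        operations.put("withdrawal", () -> { /* exercise withdrawal */ });

        List<String> labels = runWithDumps(operations,
                name -> System.out.println("capture heap dump: " + name));
        for (String[] pair : consecutivePairs(labels)) {
            System.out.println("analyze " + pair[0] + " vs " + pair[1]);
        }
    }
}
```

With three operations this yields four dumps and three pairs, so each pair isolates the heap growth attributable to a single operation.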
And then there were none...
Resources
About the authors
Satish Chandra Gupta is a developer in the IBM Rational® PurifyPlus® group in Bangalore, India. His interests include compilers, programming languages, runtime analysis, Java memory leaks, type theory, software engineering, and software development environments. His research has been published in ACM/IEEE conferences. He received a B.Tech. from the Indian Institute of Technology in Kanpur (India), and an M.S. from the University of Wisconsin in Milwaukee (USA).
Rajeev Palanki is an engineer with the IBM JVM support group in Bangalore, India. He has extensive experience in debugging Java memory leaks. He has co-authored the Java Diagnostics guide, a number of developerWorks articles, and an IBM Redbook. He holds a B.E. from Government Engineering College in Bhopal, India.