Of all the exceptions a Java programmer might encounter, the null-pointer exception is among the most dreaded, and for good reason: it is one of the least informative exceptions that a program can signal. Unlike, for example, a class-cast exception, a null-pointer exception says nothing about what was expected instead of the null pointer. Furthermore, it says nothing about where in the code the null pointer was actually assigned. In many null-pointer exceptions, the true bug occurs where the variable is actually assigned to null. To find the bug, we have to trace back through the flow of control to find out where the variable was assigned and determine whether doing so was incorrect. This process can be particularly frustrating when the assignment occurs in a package other than the one in which the error was signaled.
Many Java developers have told me that the vast majority of program crashes they encounter are null-pointer exceptions, and that they long for a tool that would statically identify these bugs before the program is ever run. Unfortunately, automata theory tells us that no tool could ever statically determine which programs will throw null-pointer exceptions. But it is possible for a tool to rule out null-pointer exceptions for many expressions in a program, leaving us with just a few small sections of potential trouble spots that we must then examine manually. In fact, several research efforts are now underway to provide just such a tool for Java programs (see Resources). But a good tool can only take us so far. Null-pointer exceptions will never be completely eradicated. When they do occur, it helps to know the bug patterns associated with them so that we can diagnose them quickly. In addition, we can apply certain programming and design techniques to significantly minimize the occurrence of these patterns.
The Dangling Composite pattern
The first bug pattern relating to null-pointer exceptions that we'll explore is one that I call the Dangling Composite pattern. Bugs of this sort result from defining a recursive datatype in such a way that certain base cases of the definition are not given their own classes. Instead, null pointers are inserted into the various composite datatypes. Instances of the datatype are then used as if these null pointers were filled in properly. I call this the Dangling Composite pattern because the offending code is an imperfect application of the Composite design pattern, in which the composite datatypes contain dangling references (that is, null pointers).
Consider the following singly linked implementation of the
LinkedList class, with a dangling composite. To keep the example
simple, I implemented only a few of the methods defined in
java.util.LinkedList. Just to show how insidious bugs of this
pattern can be, I've already introduced a bug in the following code. See if
you can spot it.
Listing 1. Singly linked list
import java.util.NoSuchElementException;
public class LinkedList {
private Object first;
private LinkedList rest;
/**
* Constructs an empty LinkedList.
*/
public LinkedList() {
this.first = null;
this.rest = null;
}
/**
* Constructs a LinkedList containing only the given element.
*/
public LinkedList(Object _first) {
this.first = _first;
this.rest = null;
}
/**
* Constructs a LinkedList consisting of the given Object followed by
* all the elements in the given LinkedList.
*/
public LinkedList(Object _first, LinkedList _rest) {
this.first = _first;
this.rest = _rest;
}
}
|
This code is truly awful. Instead of defining a separate class for empty lists, it represents empty lists by putting a null pointer in both fields. It may at first appear that representing empty lists in this way makes the code simpler. After all, we've saved the effort of having to define an extra class just for empty lists. But, as I will demonstrate, this simplicity is an illusion. Let's define some getter and setter methods for this class:
Listing 2. Defining methods for LinkedList
public Object getFirst() {
if (! (this.isEmpty())) {
return this.first;
}
else {
throw new NoSuchElementException();
}
}
public LinkedList getRest() {
if (! (this.isEmpty())) {
return this.rest;
}
else {
throw new NoSuchElementException();
}
}
public void addFirst(Object o) {
LinkedList oldThis = (LinkedList)this.clone();
this.first = o;
this.rest = oldThis;
}
public boolean isEmpty() {
return this.first == null && this.rest == null;
}
private Object clone() {
return new LinkedList(this.first, this.rest);
}
|
Notice that the action taken by both of the getters depends on whether the list is empty. This is exactly the sort of if-then-else chain that a properly constructed class hierarchy prevents. Because of these chains, we are prevented from considering the behavior of these getters on a single type of list in isolation. Furthermore, if at some future date we needed a third type of list (for example, an immutable list), we would have to go into every one of these methods and change the code.
But the true measure of simplicity is how easily we can avoid introducing
bugs into the program. By this measure, the LinkedList
implementation in Listing 2 fails miserably. In fact, as I mentioned
earlier, our LinkedList class already contains a subtle but
devastating bug (did you spot it?). Exactly what is our representation of
empty lists? I said earlier that an empty list is a LinkedList
containing a null pointer in both fields. In fact, the zero-argument
constructor sets up just such an empty list. But notice that the
single-argument constructor does
not
place an empty list into the
rest field, as would be necessary to construct a list of one
argument. Instead, it places a null pointer into the field. Because a dangling
composite confuses null pointers with place holders for base cases, mistakes like
this are very easy to make. To see how this bug can manifest itself as a
null-pointer exception, let's write an equals method for our
lists:
Listing 3. Where it goes wrong
public boolean equals(Object that) {
// If the objects are not of the same class, then they are not equal.
// Reflection is used in case this method is called from an instance of a
// subclass.
if (this.getClass() == that.getClass()) {
LinkedList _that = (LinkedList)that;
if (this.isEmpty() || _that.isEmpty()) {
return this.isEmpty() && _that.isEmpty();
}
else {
boolean firstEltsMatch = this.getFirst().equals(_that.getFirst());
boolean restEltsMatch = this.getRest().equals(_that.getRest());
return firstEltsMatch && restEltsMatch;
}
}
else {
return false;
}
}
|
If neither this nor that is empty, the
equals method correctly expects that it can call
getFirst and getRest on both of them without an
error signal. But if either of these lists contains any part that was
constructed using the single-argument constructor, then the recursive call
will eventually uncover a null pointer where an empty list should have been
waiting. When getFirst or getRest is called on it, a
null-pointer exception is signaled.
One idea might be to simply represent empty lists directly as null pointers, but this idea would be wholly inadequate, because it would then be impossible to remove the last element of a list or to insert an element into an empty list.
On the other hand, the bug could be fixed by rewriting the single-argument constructor as follows:
Listing 4. Applying a fix
public LinkedList(Object _first) {
this.first = _first;
this.rest = new LinkedList();
}
|
But, like most bug patterns, it is much better to prevent bugs rather than to fix them. The fact that the code broke so easily, and that even simple getters, setters, and the equals method were all so bloated, suggests that there is a better design.
In fact, there is a simple way to avoid Dangling Composite bugs: give each
base case of the datatype its own class. Instead of implementing singly linked
lists as we did earlier, I recommend implementing such lists with a class
LinkedList that contains a field holding either an Empty or a
Cons class. These classes implement a common interface, as shown in Figure
1.
Figure 1. Empty and Cons UML diagram

To implement mutator methods, the new LinkedList
class serves as a container for an internal, immutable list, as shown in
Listing 5. This approach is necessary because truly empty lists have no fields
to mutate and are, therefore, immutable.
Listing 5. Each base case gets its own class
import java.util.NoSuchElementException;
public class LinkedList {
private List value;
/**
* Constructs an empty LinkedList.
*/
public LinkedList() { this.value = new Empty(); }
/**
* Constructs a LinkedList containing only the given element.
*/
public LinkedList(Object _first) { this.value = new Cons(_first); }
/**
* Constructs a LinkedList consisting of the given Object followed by
* all the elements in the given LinkedList.
*/
public LinkedList(Object _first, LinkedList _rest) {
this.value = new Cons(_first, _rest.value);
}
private LinkedList(List _value) { this.value = _value; }
public Object getFirst() { return this.value.getFirst(); }
public LinkedList getRest() { return new LinkedList(this.value.getRest()); }
public void addFirst(Object o) { this.value = new Cons(o, this.value); }
public boolean isEmpty() { return this.value instanceof Empty; }
public boolean equals(Object that) {
if (this.getClass() == that.getClass()) {
// The above test guarantees that the cast to LinkedList will always
// succeed.
return this.value.equals(((LinkedList)that).value);
}
else {
return false;
}
}
}
|
Implementation of the immutable lists is then straightforward, as shown in Listing 6.
Listing 6. Adding immutable lists
interface List {
public Object getFirst();
public List getRest();
}
class Empty implements List {
public Object getFirst() { throw new NoSuchElementException(); }
public List getRest() { throw new NoSuchElementException(); }
public boolean equals(Object that) {
return this.getClass() == that.getClass(); }
}
class Cons implements List {
Object first;
List rest;
Cons(Object _first) {
this.first = _first;
this.rest = new Empty();
}
Cons(Object _first, List _rest) {
this.first = _first;
this.rest = _rest;
}
public Object getFirst() { return this.first; }
public List getRest() { return this.rest; }
public boolean equals(Object that) {
if (this.getClass() == that.getClass()) {
// The above test guarantees that the cast to Cons will always succeed.
Cons _that = (Cons)that;
boolean firstEltsMatch = this.getFirst().equals(_that.getFirst());
boolean restEltsMatch = this.getRest().equals(_that.getRest());
return firstEltsMatch && restEltsMatch;
}
else {
return false;
}
}
}
|
The logic for each method is now considerably simpler. Also notice that, while it is still possible to introduce the same bug as before in the single-argument Cons constructor, the fact that we have made an explicit Empty class makes this much less likely. Additionally, any code that traverses our lists and forgets to check for the empty case will get a NoSuchElementException, as opposed to the much-less-helpful null-pointer exception.
A simple optimization to this code would be to apply the Singleton design pattern to class Empty, because every instance of Empty is identical. I have left out this optimization because it is not relevant to eliminating null-pointer exceptions, and it makes the code slightly more complex.
Here's the breakdown of this week's bug pattern:
- Pattern: Dangling Composite
- Symptoms: Code that uses a recursively defined datatype is signaling a null-pointer exception.
- Cause: The recursive datatype is defined in such a way that certain base cases of the definition are not given their own classes. Instead, null-pointers are inserted into the various composite datatypes. The base cases are then handled inconsistently by client code.
- Cures and preventions: Ensure that the base cases are represented and checked for consistently. Give each base case its own class.
We aren't finished with null pointers just yet. In the next installment, we'll take a look at yet another very common bug pattern that manifests itself as a null-pointer exception, how to recognize it, and how to avoid it.
- One approach to statically determining possible null-pointer exceptions is a technique known as set-based analysis. The Carnegie Mellon School of Computer Science Web site provides a short introduction to this approach, along with links to several technical publications on the topic.
- The Division of Software Engineering at DePaul University has done some work on automated theorem provers to detect null-pointer exceptions in Java code.
- Visit the Patterns Home Page for a good introduction to design patterns and how they are used.
- Check out JUnit and catch more errors by making your code "test-infested."
- Be sure to read Eric Allen's first article on bug patterns (developerWorks, February 2001).
- Interested in general debugging? Check out this free, dW-exclusive tutorial, Java debugging, to learn debugging techniques from concept to application.
Eric Allen has an A.B. in computer science and mathematics from Cornell University. He is currently the lead Java software developer at Cycorp, Inc., and a part-time graduate student in the programming languages team at Rice University. His research concerns the development of formal semantic models and extensions of the Java language, both at the source and bytecode levels. Currently, he is implementing a source-to-bytecode compiler for the NextGen programming language, an extension of the Java language with generic run-time types. Contact Eric at eallen@cyc.com.




