Skip to main content

By clicking Submit, you agree to the developerWorks terms of use.

The first time you sign into developerWorks, a profile is created for you. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

All information submitted is secure.

  • Close [x]

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerworks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

By clicking Submit, you agree to the developerWorks terms of use.

All information submitted is secure.

  • Close [x]

Diagnosing Java Code: The Dangling Composite bug pattern

Squash one of the most common causes of the null-pointer exception

Eric Allen (eallen@cyc.com), Software Engineer, Cycorp, Inc.
Eric Allen has an A.B. in computer science and mathematics from Cornell University. He is currently the lead Java software developer at Cycorp, Inc., and a part-time graduate student in the programming languages team at Rice University. His research concerns the development of formal semantic models and extensions of the Java language, both at the source and bytecode levels. Currently, he is implementing a source-to-bytecode compiler for the NextGen programming language, an extension of the Java language with generic run-time types. Contact Eric at eallen@cyc.com.

Summary:  One of the most commonly recurring (and most complained about) bugs in Java programming is the null-pointer exception. Tracking down the cause of one of these bugs can truly make you question your career decision. In this installment of Diagnosing Java Code, we'll continue with our examination of bug patterns by cataloging one of the most common patterns associated with null-pointer exceptions and step through an example of a class that contains it. We will then review several programming techniques that can help you minimize the pattern's occurrence.

View more content in this series

Date:  01 Mar 2001
Level:  Introductory
Also available in:   Japanese

Activity:  8171 views
Comments:  

Null pointers everywhere!

Of all the exceptions a Java programmer might encounter, the null-pointer exception is among the most dreaded, and for good reason: it is one of the least informative exceptions that a program can signal. Unlike, for example, a class-cast exception, a null-pointer exception says nothing about what was expected instead of the null pointer. Furthermore, it says nothing about where in the code the null pointer was actually assigned. In many null-pointer exceptions, the true bug occurs where the variable is actually assigned to null. To find the bug, we have to trace back through the flow of control to find out where the variable was assigned and determine whether doing so was incorrect. This process can be particularly frustrating when the assignment occurs in a package other than the one in which the error was signaled.

Many Java developers have told me that the vast majority of program crashes they encounter are null-pointer exceptions, and that they long for a tool that would statically identify these bugs before the program is ever run. Unfortunately, automata theory tells us that no tool could ever statically determine which programs will throw null-pointer exceptions. But it is possible for a tool to rule out null-pointer exceptions for many expressions in a program, leaving us with just a few small sections of potential trouble spots that we must then examine manually. In fact, several research efforts are now underway to provide just such a tool for Java programs (see Resources). But a good tool can only take us so far. Null-pointer exceptions will never be completely eradicated. When they do occur, it helps to know the bug patterns associated with them so that we can diagnose them quickly. In addition, we can apply certain programming and design techniques to significantly minimize the occurrence of these patterns.


The Dangling Composite pattern

The first bug pattern relating to null-pointer exceptions that we'll explore is one that I call the Dangling Composite pattern. Bugs of this sort result from defining a recursive datatype in such a way that certain base cases of the definition are not given their own classes. Instead, null pointers are inserted into the various composite datatypes. Instances of the datatype are then used as if these null pointers were filled in properly. I call this the Dangling Composite pattern because the offending code is an imperfect application of the Composite design pattern, in which the composite datatypes contain dangling references (that is, null pointers).


The cause

Consider the following singly linked implementation of the LinkedList class, with a dangling composite. To keep the example simple, I implemented only a few of the methods defined in java.util.LinkedList. Just to show how insidious bugs of this pattern can be, I've already introduced a bug in the following code. See if you can spot it.


Listing 1. Singly linked list

import java.util.NoSuchElementException;

public class LinkedList {

  private Object first;
  private LinkedList rest;

  /**
   * Constructs an empty LinkedList.
   */
  public LinkedList() {
    this.first = null;
    this.rest = null;
  }

  /**
   * Constructs a LinkedList containing only the given element.
   */
  public LinkedList(Object _first) {
    this.first = _first;
    this.rest = null;
  }

  /**
   * Constructs a LinkedList consisting of the given Object followed by
   * all the elements in the given LinkedList.
   */
  public LinkedList(Object _first, LinkedList _rest) {
    this.first = _first;
    this.rest = _rest;
  }
}
		

This code is truly awful. Instead of defining a separate class for empty lists, it represents empty lists by putting a null pointer in both fields. It may at first appear that representing empty lists in this way makes the code simpler. After all, we've saved the effort of having to define an extra class just for empty lists. But, as I will demonstrate, this simplicity is an illusion. Let's define some getter and setter methods for this class:


Listing 2. Defining methods for LinkedList


  public Object getFirst() {

    if (! (this.isEmpty())) {
      return this.first;
    }
    else {
      throw new NoSuchElementException();
    }
  }
  public LinkedList getRest() {
  
    if (! (this.isEmpty())) {
      return this.rest;
    }
    else {
      throw new NoSuchElementException();
    }
  }

  public void addFirst(Object o) {

    LinkedList oldThis = (LinkedList)this.clone();

    this.first = o;
    this.rest = oldThis;
  }

  public boolean isEmpty() {
    return this.first == null && this.rest == null;
  }

  private Object clone() {
    return new LinkedList(this.first, this.rest);
  }
		

Notice that the action taken by both of the getters depends on whether the list is empty. This is exactly the sort of if-then-else chain that a properly constructed class hierarchy prevents. Because of these chains, we are prevented from considering the behavior of these getters on a single type of list in isolation. Furthermore, if at some future date we needed a third type of list (for example, an immutable list), we would have to go into every one of these methods and change the code.

But the true measure of simplicity is how easily we can avoid introducing bugs into the program. By this measure, the LinkedList implementation in Listing 2 fails miserably. In fact, as I mentioned earlier, our LinkedList class already contains a subtle but devastating bug (did you spot it?). Exactly what is our representation of empty lists? I said earlier that an empty list is a LinkedList containing a null pointer in both fields. In fact, the zero-argument constructor sets up just such an empty list. But notice that the single-argument constructor does not place an empty list into the rest field, as would be necessary to construct a list of one argument. Instead, it places a null pointer into the field. Because a dangling composite confuses null pointers with place holders for base cases, mistakes like this are very easy to make. To see how this bug can manifest itself as a null-pointer exception, let's write an equals method for our lists:


Listing 3. Where it goes wrong

  public boolean equals(Object that) {

    // If the objects are not of the same class, then they are not equal.
    // Reflection is used in case this method is called from an instance of a
    // subclass.   
    if (this.getClass() == that.getClass()) {

	LinkedList _that = (LinkedList)that;

      if (this.isEmpty() || _that.isEmpty()) {
	return this.isEmpty() && _that.isEmpty();
      }
      else {

	boolean firstEltsMatch = this.getFirst().equals(_that.getFirst());
	boolean restEltsMatch = this.getRest().equals(_that.getRest());
	
	return firstEltsMatch && restEltsMatch;
      }
    }
    else {
      return false;
    }
  }
		

If neither this nor that is empty, the equals method correctly expects that it can call getFirst and getRest on both of them without an error signal. But if either of these lists contains any part that was constructed using the single-argument constructor, then the recursive call will eventually uncover a null pointer where an empty list should have been waiting. When getFirst or getRest is called on it, a null-pointer exception is signaled.

One idea might be to simply represent empty lists directly as null pointers, but this idea would be wholly inadequate, because it would then be impossible to remove the last element of a list or to insert an element into an empty list.

On the other hand, the bug could be fixed by rewriting the single-argument constructor as follows:


Listing 4. Applying a fix

  public LinkedList(Object _first) {
    this.first = _first;
    this.rest = new LinkedList();
  }
		

But, like most bug patterns, it is much better to prevent bugs rather than to fix them. The fact that the code broke so easily, and that even simple getters, setters, and the equals method were all so bloated, suggests that there is a better design.


Cures and prevention

In fact, there is a simple way to avoid Dangling Composite bugs: give each base case of the datatype its own class. Instead of implementing singly linked lists as we did earlier, I recommend implementing such lists with a class LinkedList that contains a field holding either an Empty or a Cons class. These classes implement a common interface, as shown in Figure 1.


Figure 1. Empty and Cons UML diagram
Empty and Cons UML diagram

To implement mutator methods, the new LinkedList class serves as a container for an internal, immutable list, as shown in Listing 5. This approach is necessary because truly empty lists have no fields to mutate and are, therefore, immutable.


Listing 5. Each base case gets its own class

import java.util.NoSuchElementException;

public class LinkedList {

  private List value;

  /**
   * Constructs an empty LinkedList.
   */
  public LinkedList() { this.value = new Empty(); }

  /**
   * Constructs a LinkedList containing only the given element.
   */
  public LinkedList(Object _first) { this.value = new Cons(_first); }

  /**
   * Constructs a LinkedList consisting of the given Object followed by
   * all the elements in the given LinkedList.
   */
  public LinkedList(Object _first, LinkedList _rest) {
   this.value = new Cons(_first, _rest.value);
  }

  private LinkedList(List _value) { this.value = _value; }

  public Object getFirst() { return this.value.getFirst();  }
  public LinkedList getRest() { return new LinkedList(this.value.getRest()); }
  public void addFirst(Object o) { this.value = new Cons(o, this.value); }
  public boolean isEmpty() { return this.value instanceof Empty; }

  public boolean equals(Object that) {
   
    if (this.getClass() == that.getClass()) {

      // The above test guarantees that the cast to LinkedList will always
      // succeed.
      return this.value.equals(((LinkedList)that).value);
    }
    else {
      return false;
    }
  }
}
		

Implementation of the immutable lists is then straightforward, as shown in Listing 6.


Listing 6. Adding immutable lists

interface List {
  public Object getFirst();
  public List getRest();
}

class Empty implements List {
  public Object getFirst() { throw new NoSuchElementException(); }
  public List getRest() { throw new NoSuchElementException(); }
  public boolean equals(Object that) { 
    return this.getClass() == that.getClass(); }
}

class Cons implements List {
  
  Object first;
  List rest;
  
  Cons(Object _first) {
    this.first = _first;
    this.rest = new Empty();
  }
  Cons(Object _first, List _rest) {
    this.first = _first;
    this.rest = _rest;
  }
  public Object getFirst() { return this.first; }
  public List getRest() { return this.rest; }

  public boolean equals(Object that) {
    if (this.getClass() == that.getClass()) {

      // The above test guarantees that the cast to Cons will always succeed.
      Cons _that = (Cons)that;

      boolean firstEltsMatch  =  this.getFirst().equals(_that.getFirst());
      boolean restEltsMatch = this.getRest().equals(_that.getRest());
      return firstEltsMatch && restEltsMatch;
    }
    else {
      return false;
    }
  }
}  
		

The logic for each method is now considerably simpler. Also notice that, while it is still possible to introduce the same bug as before in the single-argument Cons constructor, the fact that we have made an explicit Empty class makes this much less likely. Additionally, any code that traverses our lists and forgets to check for the empty case will get a NoSuchElementException, as opposed to the much-less-helpful null-pointer exception.

A simple optimization to this code would be to apply the Singleton design pattern to class Empty, because every instance of Empty is identical. I have left out this optimization because it is not relevant to eliminating null-pointer exceptions, and it makes the code slightly more complex.


Wrapup

Here's the breakdown of this week's bug pattern:

  • Pattern: Dangling Composite
  • Symptoms: Code that uses a recursively defined datatype is signaling a null-pointer exception.
  • Cause: The recursive datatype is defined in such a way that certain base cases of the definition are not given their own classes. Instead, null-pointers are inserted into the various composite datatypes. The base cases are then handled inconsistently by client code.
  • Cures and preventions: Ensure that the base cases are represented and checked for consistently. Give each base case its own class.

We aren't finished with null pointers just yet. In the next installment, we'll take a look at yet another very common bug pattern that manifests itself as a null-pointer exception, how to recognize it, and how to avoid it.


Resources

  • One approach to statically determining possible null-pointer exceptions is a technique known as set-based analysis. The Carnegie Mellon School of Computer Science Web site provides a short introduction to this approach, along with links to several technical publications on the topic.

  • The Division of Software Engineering at DePaul University has done some work on automated theorem provers to detect null-pointer exceptions in Java code.

  • Visit the Patterns Home Page for a good introduction to design patterns and how they are used.

  • Check out JUnit and catch more errors by making your code "test-infested."

  • Be sure to read Eric Allen's first article on bug patterns (developerWorks, February 2001).

  • Interested in general debugging? Check out this free, dW-exclusive tutorial, Java debugging, to learn debugging techniques from concept to application.

About the author

Eric Allen has an A.B. in computer science and mathematics from Cornell University. He is currently the lead Java software developer at Cycorp, Inc., and a part-time graduate student in the programming languages team at Rice University. His research concerns the development of formal semantic models and extensions of the Java language, both at the source and bytecode levels. Currently, he is implementing a source-to-bytecode compiler for the NextGen programming language, an extension of the Java language with generic run-time types. Contact Eric at eallen@cyc.com.

Report abuse help

Report abuse

Thank you. This entry has been flagged for moderator attention.


Report abuse help

Report abuse

Report abuse submission failed. Please try again later.


developerWorks: Sign in


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

Choose your display name

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


Rate this article

Comments

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Java technology
ArticleID=10515
ArticleTitle=Diagnosing Java Code: The Dangling Composite bug pattern
publish-date=03012001
author1-email=eallen@cyc.com
author1-email-cc=

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

For articles in technology zones (such as Java technology, Linux, Open source, XML), Popular tags shows the top tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), Popular tags shows the top tags for just that product zone.

For articles in technology zones (such as Java technology, Linux, Open source, XML), My tags shows your tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), My tags shows your tags for just that product zone.

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers