Diagnosing Java Code: The Run-on Initializer bug pattern

Avoiding under-argumented constructors can beat this bug

Eric Allen (eallen@cs.rice.edu), Ph.D. candidate, Java programming languages team, Rice University
Eric Allen has a bachelor's degree in computer science and mathematics from Cornell University and is a PhD candidate in the Java programming languages team at Rice University. Before returning to Rice to finish his degree, Eric was the lead Java software developer at Cycorp, Inc. He has also moderated the Java Beginner discussion forum at Javaworld. His research concerns the development of semantic models and static analysis tools for the Java language, both at the source and bytecode levels. Eric is the lead developer of Rice's experimental compiler for the NextGen programming language, an extension of the Java language with added language features, and is a project manager of DrJava, an open-source Java IDE designed for beginners. Contact Eric at eallen@cs.rice.edu.

Summary:  Often you see code that initializes a class not just by calling a constructor, but also through several follow-up actions intended to set various fields. Such follow-up actions, unfortunately, are hot spots for bugs, introducing a type of run-on initialization. In this installment of Diagnosing Java Code, Eric Allen explores the Run-on Initializer bug, discusses why and how it should be avoided, and demonstrates how to minimize the damage it may cause.

Date:  01 Apr 2002
Level:  Introductory

Run-on initialization

For various reasons (mostly bad), you will often see class definitions in which the class constructors don't take enough arguments to properly initialize all the fields of the class. Such constructors require client classes to initialize instances in several steps (setting the values of the uninitialized fields) rather than with a single constructor call. Initializing an instance in this way is an error-prone process that I refer to as run-on initialization. The types of bugs that result from this process have similar symptoms and remedies, so we can group them together into a pattern called the Run-on Initializer bug pattern.

For example, consider the following code:

class RestrictedInt {
  public Integer value;
  public boolean canTakeZero;
  
  public RestrictedInt(boolean _canTakeZero) {
    canTakeZero = _canTakeZero;
  }
  
  public void setValue(int _value) throws CantTakeZeroException {
    if (_value == 0) {
      if (canTakeZero) {
        value = new Integer(_value);
      }
      else {
        throw new CantTakeZeroException(this);
      }
    }
    else {
      value = new Integer(_value);
    }
  }
}

class CantTakeZeroException extends Exception {
  
  public RestrictedInt ri;
  
  public CantTakeZeroException(RestrictedInt _ri) {
    super("RestrictedInt can't take zero");
    ri = _ri;
  }
}

class Client {
  public static void initialize() throws CantTakeZeroException {
    RestrictedInt ri = new RestrictedInt(false);
    ri.setValue(0);
  }
}

Unfortunately, the initialization sequence for an instance of this class is prone to bugs. You may have noticed that in the above code, the second initialization step (the call to setValue) throws an exception. As a result, the value field that this step should have set is left null.

But a handler for the thrown exception may not know that the field was not set. If, in the process of recovering from the exception, it accesses the value field of the RestrictedInt in question, it may trip over a NullPointerException itself.

If that happens, it is worse than if the handler hadn't been there at all. At least the checked exception contained some clue about its cause. But NullPointerExceptions are notoriously difficult to diagnose because they (necessarily) contain very little information as to why a value was set to null in the first place. Furthermore, they occur only when the uninitialized field is accessed. That access will probably occur far away from the cause of the bug (that is, the failure to initialize the field in the first place).
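To make the failure concrete, here is a minimal sketch of such a handler. The RestrictedInt and CantTakeZeroException classes are trimmed copies of the ones above so that the sketch compiles on its own; CarelessHandler and its describe method are hypothetical names invented for illustration.

```java
// Trimmed copies of the classes above, so the sketch compiles on its own.
class RestrictedInt {
  public Integer value;            // stays null until setValue succeeds
  public boolean canTakeZero;

  public RestrictedInt(boolean _canTakeZero) {
    canTakeZero = _canTakeZero;
  }

  public void setValue(int _value) throws CantTakeZeroException {
    if (_value == 0 && !canTakeZero) {
      throw new CantTakeZeroException(this);
    }
    value = Integer.valueOf(_value);
  }
}

class CantTakeZeroException extends Exception {
  public RestrictedInt ri;

  public CantTakeZeroException(RestrictedInt _ri) {
    super("RestrictedInt can't take zero");
    ri = _ri;
  }
}

// Hypothetical client, not part of the original listing.
class CarelessHandler {
  static String describe(int requested) {
    RestrictedInt ri = new RestrictedInt(false);
    try {
      ri.setValue(requested);
      return "value is " + ri.value.intValue();
    }
    catch (CantTakeZeroException e) {
      // The handler assumes value was set. For a rejected zero it is
      // still null, so intValue() throws a NullPointerException here.
      return "rejected, keeping " + e.ri.value.intValue();
    }
  }
}
```

Calling CarelessHandler.describe(0) replaces the informative checked exception with a NullPointerException raised inside the handler, which is exactly the loss of diagnostic information described above.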

There are, of course, other errors that can occur from run-on initialization bugs.


More errors from run-ons

Other possible errors that might result are:

  • The programmer writing the initialization code may forget to put in one of the initialization steps.

  • There may be an order-based dependence in the initialization steps, unknown to the programmer, who therefore executes the statements out of order.

  • The class being initialized might change. New fields might be added, or old ones removed. As a result, all the initialization code in every client must be modified to set the fields appropriately. Much of the modified code will be similar, but if just one copy is missed, a bug is introduced. For this reason, run-on initializers can easily become rogue tiles (see my article on the Rogue Tile bug pattern for some background).

Because of all the problems involved with run-on initialization, it's much better to define constructors that initialize all fields. In the above example, the constructor for RestrictedInt should take an int to initialize its value field. There's never a good reason to include a constructor for a class that leaves any of the fields uninitialized. When writing classes from scratch, that's not a difficult principle to follow.
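As a rough sketch of that advice, the class below takes both pieces of information in a single constructor call. The choice of IllegalArgumentException (rather than the checked CantTakeZeroException), the private final fields, and the accessor names are assumptions made for this illustration, not part of the original listing.

```java
// Sketch: every field is initialized by the one constructor, so no
// instance can be observed in a half-initialized state.
class RestrictedInt {
  private final int value;
  private final boolean canTakeZero;

  public RestrictedInt(int _value, boolean _canTakeZero) {
    // Assumption: reject a bad value before any field is assigned, using
    // an unchecked exception that holds no reference to this object.
    if (_value == 0 && !_canTakeZero) {
      throw new IllegalArgumentException("RestrictedInt can't take zero");
    }
    value = _value;
    canTakeZero = _canTakeZero;
  }

  public int intValue() {
    return value;
  }

  public boolean canTakeZero() {
    return canTakeZero;
  }
}
```

With this shape, a client either gets a fully initialized RestrictedInt or gets no instance at all, so there is no second step to forget or to perform out of order.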

But what if you have to work with a large code base in which a class doesn't initialize all of its fields in the constructors, and there are run-on initializers throughout the code base? I've run into situations like that more than once.


When your hands are tied

Unfortunately, working with a legacy code base in which a class doesn't initialize all of its fields in the constructors is more common than most programmers would like. If the legacy code base is large and there are many clients of the offending class, you may not want to modify the constructor signatures, especially if the unit tests over the code are scant. Inevitably, you'll wind up breaking undocumented invariants.

Often, in such situations, the best thing to do is to throw out that legacy code and start fresh! That may sound like crazy talk, but the time you'll spend patching up bugs in code like that can easily dwarf the time it would take to rewrite it. Many times, I have struggled to work with large bases of legacy code with problems like this, and, ultimately, I come away wishing I had just started fresh.

But if throwing the code away is not an option, we can still attempt to control the potential for errors by incorporating the following simple practices:

  • Initialize the fields to (non-null) default values.
  • Include extra constructors.
  • Include an isInitialized method in the class.
  • Construct special classes to represent the default values.

Let's take a look at why we should follow these practices.

Initialize the fields to (non-null) default values

By filling in the fields with default values, you ensure that instances of your class will be in a well-defined state at all times. This practice is particularly important for reference types that will take on the null value unless you specify otherwise.

Why? Because gratuitous uses of null values inevitably result in NullPointerExceptions, and NullPointerExceptions are bad for two reasons. First, they provide very little information about the true cause of a bug. Second, they tend to be thrown far from the actual cause of the bug.

Avoid them at all costs. And if you decide you want to use null so that you can signal that the class is not yet completely initialized, see my article on the Null Flag bug pattern for assistance.
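A small sketch of the idea follows. The Customer class, its fields, and its summary method are hypothetical and stand in for any legacy class whose fields are filled in by setters after construction.

```java
// Hypothetical legacy-style class whose fields are set after construction.
class Customer {
  // Non-null defaults: an empty string and an empty array instead of null.
  private String name = "";
  private String[] phoneNumbers = new String[0];

  public void setName(String _name) {
    name = _name;
  }

  public void setPhoneNumbers(String[] _phoneNumbers) {
    phoneNumbers = _phoneNumbers;
  }

  // Safe to call even if no setter has run yet, because neither
  // field can be null at this point.
  public String summary() {
    return name.trim() + " (" + phoneNumbers.length + " phone numbers)";
  }
}
```

Without the two initializers, calling summary on a fresh Customer would throw a NullPointerException at name.trim(), far from the client code that forgot to call setName.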

Include extra constructors

When you include additional constructors, you can use them in new contexts, so that new code doesn't have to add more run-on initializations. Just because some existing contexts are stuck with the under-argumented constructor, other contexts shouldn't have to pay the price.
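One way this might look for the RestrictedInt example is sketched below. The two-argument constructor is an assumption of this sketch; the legacy constructor and setValue are kept (in condensed form) so that existing clients continue to compile.

```java
class CantTakeZeroException extends Exception {
  public RestrictedInt ri;

  public CantTakeZeroException(RestrictedInt _ri) {
    super("RestrictedInt can't take zero");
    ri = _ri;
  }
}

class RestrictedInt {
  public Integer value;
  public boolean canTakeZero;

  // Legacy constructor, kept so existing run-on clients still compile.
  public RestrictedInt(boolean _canTakeZero) {
    canTakeZero = _canTakeZero;
  }

  // Added constructor (hypothetical): new clients initialize the
  // instance in one call instead of two.
  public RestrictedInt(boolean _canTakeZero, int _value)
      throws CantTakeZeroException {
    this(_canTakeZero);
    setValue(_value);
  }

  public void setValue(int _value) throws CantTakeZeroException {
    if (_value == 0 && !canTakeZero) {
      throw new CantTakeZeroException(this);
    }
    value = Integer.valueOf(_value);
  }
}
```

A new client can then write new RestrictedInt(true, 0) and never see an instance whose value is null. One caveat of this particular sketch: when the new constructor rejects a zero, the exception still carries a reference to the half-built instance, so handlers should not read its value field.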

Place an isInitialized method in the class

You can include an isInitialized method in the class to allow for quick determination as to whether an instance has been initialized. Such a method is almost always a good idea when writing classes that require run-on initialization.

In cases in which you don't maintain these classes yourself, you can even put such isInitialized methods into your own utility class. After all, if there is a consequence of an instance not being initialized that is observable from the outside, you can write a method to check for this consequence (even if it entails using the usually ill-advised practice of catching a RuntimeException).
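Both variants can be sketched as follows. The RestrictedInt here is a simplified copy (the zero check in setValue is omitted for brevity), and InitChecks is a hypothetical utility class of the kind you might write for classes you cannot edit.

```java
// Simplified copy of the run-on class; the zero check is omitted here.
class RestrictedInt {
  public Integer value;
  public boolean canTakeZero;

  public RestrictedInt(boolean _canTakeZero) {
    canTakeZero = _canTakeZero;
  }

  public void setValue(int _value) {
    value = Integer.valueOf(_value);
  }

  // Variant 1: the class itself reports whether initialization finished.
  public boolean isInitialized() {
    return value != null;
  }
}

// Variant 2 (hypothetical utility): probes for the externally observable
// consequence of a missing initialization step.
class InitChecks {
  static boolean isInitialized(RestrictedInt ri) {
    try {
      ri.value.intValue();   // throws if value was never set
      return true;
    }
    catch (NullPointerException e) {
      // Usually ill-advised, but here the exception is the only signal.
      return false;
    }
  }
}
```

Clients can then guard their accesses with a call such as InitChecks.isInitialized(ri) and report a meaningful error at the point of the check instead of failing later with a NullPointerException.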

Construct special classes to represent the default values

Instead of allowing the fields to be filled in with null, construct special classes (most likely Singletons) to represent the default values. Then assign instances of these classes to your fields in the default constructor. Not only will you decrease the chances of a NullPointerException, but you will be able to control precisely what error does occur if these fields are accessed inappropriately.

For example, we could modify the RestrictedInt class as follows:

class RestrictedInt implements SimpleInteger {
  public SimpleInteger value;
  public boolean canTakeZero;
  
  public RestrictedInt(boolean _canTakeZero) {
    canTakeZero = _canTakeZero;
    value = NonValue.ONLY;
  }
  
  public void setValue(int _value) throws CantTakeZeroException {
    if (_value == 0) {
      if (canTakeZero) {
        value = new DefaultSimpleInteger(_value);
      }
      else {
        throw new CantTakeZeroException(this);
      }
    }
    else {
      value = new DefaultSimpleInteger(_value);
    }
  }
  
  public int intValue() {
    return ((DefaultSimpleInteger)value).intValue();
  }
}

interface SimpleInteger {
}

class NonValue implements SimpleInteger {
    
  public static final NonValue ONLY = new NonValue();
    
  private NonValue() {}
  
}


class DefaultSimpleInteger implements SimpleInteger {
  private int value;
  
  public DefaultSimpleInteger(int _value) {
    value = _value;
  }
  
  public int intValue() {
    return value;
  }
}

Now, if any of your client classes that access this field were to perform an intValue operation on the resulting element, they would first have to cast to a DefaultSimpleInteger since NonValues don't support that operation.

The advantage of the above approach is that you'll be constantly reminded (with compiler errors) that this method call doesn't work on the default value, at every point in the code where you forgot to cast. Also, if, at run time, you happen to access this field and it contains the default value, you'll get a ClassCastException, which will be much more informative than the NullPointerException you would have gotten -- the ClassCastException will tell you not only what was actually there, but what the program expected to be there.

The disadvantage is that you'll pay in performance. Every time the field is accessed, the program will also have to perform a cast.
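The sketch below shows what a client of the cast-based design might look like. The three value classes are condensed copies of the ones above; CastingClient and its two methods are hypothetical.

```java
interface SimpleInteger {
}

class NonValue implements SimpleInteger {
  public static final NonValue ONLY = new NonValue();

  private NonValue() {}
}

class DefaultSimpleInteger implements SimpleInteger {
  private final int value;

  public DefaultSimpleInteger(int _value) {
    value = _value;
  }

  public int intValue() {
    return value;
  }
}

// Hypothetical client code.
class CastingClient {
  // field.intValue() would not compile here, because SimpleInteger
  // declares no such method; the cast is mandatory.
  static int read(SimpleInteger field) {
    return ((DefaultSimpleInteger) field).intValue();
  }

  // A client that wants to avoid the ClassCastException can test first.
  static int readOrDefault(SimpleInteger field, int fallback) {
    if (field instanceof DefaultSimpleInteger) {
      return ((DefaultSimpleInteger) field).intValue();
    }
    return fallback;
  }
}
```

Passing NonValue.ONLY to read fails with a ClassCastException at the cast, while readOrDefault returns the fallback; in both cases the compiler has forced the author to acknowledge that the field might still hold the default value.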

If you're willing to forgo the compilation error messages, another solution is to include the intValue method in interface SimpleInteger. You can then implement this method in the default class with a method that throws whatever error you'd like (and you can include any information that you'd like). To illustrate this, look at the following example:

class RestrictedInt implements SimpleInteger {
  public SimpleInteger value;
  public boolean canTakeZero;
  
  public RestrictedInt(boolean _canTakeZero) {
    canTakeZero = _canTakeZero;
    value = NonValue.ONLY;
  }
  
  public void setValue(int _value) throws CantTakeZeroException {
    if (_value == 0) {
      if (canTakeZero) {
        value = new DefaultSimpleInteger(_value);
      }
      else {
        throw new CantTakeZeroException(this);
      }
    }
    else {
      value = new DefaultSimpleInteger(_value);
    }
  }
  
  public int intValue() {
    return value.intValue();
  }
}

interface SimpleInteger {
  public int intValue();
}

class NonValue implements SimpleInteger {
    
  public static final NonValue ONLY = new NonValue();
    
  private NonValue() {}
    
  public int intValue() {
    throw new 
      RuntimeException("Attempt to access an int from a NonValue");
  }
}


class DefaultSimpleInteger implements SimpleInteger {
  private int value;
  
  public DefaultSimpleInteger(int _value) {
    value = _value;
  }
  
  public int intValue() {
    return value;
  }
}

This solution can provide even better error diagnostics than the ClassCastException. It's also more efficient, because no cast is required at run time. But this solution won't require you to think about the possible values of the field at every access point.
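For comparison, here is a sketch of a client written against this second design. The value classes are condensed copies of the ones above, except that NonValue throws IllegalStateException (a RuntimeException subclass chosen for this sketch); DirectClient is a hypothetical name.

```java
interface SimpleInteger {
  int intValue();
}

class NonValue implements SimpleInteger {
  public static final NonValue ONLY = new NonValue();

  private NonValue() {}

  public int intValue() {
    // Assumption: a more specific unchecked exception than the
    // RuntimeException used in the listing above.
    throw new IllegalStateException(
      "Attempt to access an int from a NonValue");
  }
}

class DefaultSimpleInteger implements SimpleInteger {
  private final int value;

  public DefaultSimpleInteger(int _value) {
    value = _value;
  }

  public int intValue() {
    return value;
  }
}

// Hypothetical client: no cast is needed, and none is possible to forget.
class DirectClient {
  static int doubled(SimpleInteger field) {
    return 2 * field.intValue();
  }
}
```

The client code is shorter, but nothing at compile time reminds its author that field might be NonValue.ONLY; the tailored exception message is the only safety net.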

The solution you choose to use depends partly on your preferences and partly on the performance and robustness constraints of your project.

Now, let's look at a technique that, at first glance, seems completely wrong.


Including methods that only throw exceptions

At first, this practice may strike you as inherently wrong and counter-intuitive -- a class should only contain methods that actually make sense to perform on the data. Particularly when you are teaching programmers about object-oriented programming, including such methods can be confusing.

For example, consider the two ways of defining a class hierarchy for Lists shown in the listings below:

abstract class List {}

class Empty extends List {}

class Cons extends List {
  Object first;
  List rest;
  
  Cons(Object _first, List _rest) {
    first = _first;
    rest = _rest;
  }
  
  public Object getFirst() {
    return first;
  }
  
  public List getRest() {
    return rest;
  }
}

abstract class List {
  public abstract Object getFirst();
  public abstract List getRest();
}

class Empty extends List {
  public Object getFirst() {
    throw new RuntimeException("Attempt to take first of an empty list");
  }
  
  public List getRest() {
    throw new RuntimeException("Attempt to take rest of an empty list");
  }
}

class Cons extends List {
  Object first;
  List rest;
  
  Cons(Object _first, List _rest) {
    first = _first;
    rest = _rest;
  }
  
  public Object getFirst() {
    return first;
  }
  
  public List getRest() {
    return rest;
  }
}

For a programmer new to object-oriented languages, the motivations behind the first version of List (the one with no universal getters) will be less confusing. Intuitively, classes shouldn't contain a method unless that method does real work. But the above considerations for dealing with default classes apply equally well to this example too.

With the first version, clients must cast to Cons before every access to the first or rest of a list. It can be quite cumbersome to continually insert casts in your code, and the code can become quite wordy. Additionally, the class casts can have significant repercussions in terms of performance, especially for an often-called utility class like List.
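The difference in client code can be sketched with a method that fetches the second element of a list. The hierarchy below is the version with universal getters; ListOps and its two methods are hypothetical, and secondWithCasts is written in the style that the getter-free hierarchy would force on every client.

```java
abstract class List {
  public abstract Object getFirst();
  public abstract List getRest();
}

class Empty extends List {
  public Object getFirst() {
    throw new RuntimeException("Attempt to take first of an empty list");
  }

  public List getRest() {
    throw new RuntimeException("Attempt to take rest of an empty list");
  }
}

class Cons extends List {
  Object first;
  List rest;

  Cons(Object _first, List _rest) {
    first = _first;
    rest = _rest;
  }

  public Object getFirst() {
    return first;
  }

  public List getRest() {
    return rest;
  }
}

// Hypothetical client code.
class ListOps {
  // Style required when List declares no getters: one cast per step.
  static Object secondWithCasts(List l) {
    return ((Cons) ((Cons) l).rest).first;
  }

  // Style allowed by the universal getters: no casts at all.
  static Object second(List l) {
    return l.getRest().getFirst();
  }
}
```

On a list that is too short, secondWithCasts fails with a ClassCastException, while second fails with the message chosen in Empty; on well-formed input both return the same element.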

As with all design practices, this practice is best applied with a consideration for the underlying motivation of the practice. The motivation won't always be applicable, so, when it isn't, the practice shouldn't be used.


You're better off fixing them

You might have noticed (if you've read any of my other bug pattern articles) that the treatment of the Run-on Initializer bug pattern is a bit different. This time I've provided quite a few ideas on how to work around the root cause, rather than just fixing it outright. That's because, on many occasions, I've had to work around run-on initializers myself. Those weren't the best of times.

Still, as the considerations we've mentioned indicate, it is far better to avoid run-on initializations altogether. But when you have to deal with them, at least you can protect yourself. Here's this bug pattern in a nutshell:

  • Pattern: Run-on Initializer

  • Symptoms: A NullPointerException at the point that one of the uninitialized fields is accessed.

  • Cause: A class whose constructors don't initialize all fields directly.

  • Cures and preventions: Initialize all fields in a constructor. Use special classes for default values when better values can't be used. Include multiple constructors to cover cases where better values can be used. Include an isInitialized method.

For the next few months, we'll be returning to the topic of bug patterns. Next month, we'll cover some of the platform-dependent bugs that occur in the Java language. Contrary to popular belief, it is not immune to them.


Resources

  • In "A Taste of Bitter Java" (developerWorks, March 2002), Bruce Tate demonstrates how and why antipatterns (typical solutions to a problem that actually result in decidedly negative consequences) are a necessary and complementary companion to design patterns.

  • Check out this XP site for a summary of the ideas behind extreme programming.

  • The JUnit Web site provides links to many interesting articles from a multitude of sources that discuss program testing methods.

  • Read all of Eric Allen's Diagnosing Java Code columns, including a full complement of articles on bug patterns.

  • DrJava is Rice University's free, open-source Java IDE, with a read-eval-print loop.

  • Find other Java technology resources on the developerWorks Java technology zone.
