Bug patterns: An introduction

Diagnosing and correcting recurring bug types in your Java programs


Level: Introductory

Eric Allen, Software Engineer, Cycorp, Inc.

01 Feb 2001

Welcome to Diagnosing Java Code, a new bi-weekly column that focuses on Java solutions to keep you on track with your daily programming efforts. This premiere article introduces the notion of bug patterns, an extremely useful concept that will increase your ability to detect and remedy bugs in your code. You will learn about one of the most common bug patterns, which will lay the groundwork for you to begin recognizing and avoiding more advanced patterns.

Bug patterns and why they're useful

Just as good programming skill involves the knowledge of many design patterns, which you can combine and apply in various contexts, good debugging skill involves knowledge of bug patterns. Bug patterns are recurring correlations between signaled errors and underlying bugs in a program. This concept is not unique to programming. Medical doctors rely on similar correlations when diagnosing disease. They learn to do so by working closely with senior doctors during their internships, and their education focuses on learning to make such diagnoses.

In contrast, our education as software engineers focuses on design processes and algorithmic analysis. These skills are, of course, important, but little attention is paid to teaching the process of debugging. Instead, we are expected to "pick up" the skill on our own. With the advent of extreme programming and its emphasis on unit testing, this practice is starting to change. But frequent unit testing solves just part of the problem. Once bugs are found, they must be diagnosed and corrected. Fortunately, many bugs follow one of several patterns we can identify. Once you can recognize these patterns, you will be able to diagnose the cause of a bug and correct it more quickly.

Bug patterns are related to anti-patterns, which are patterns of common software designs that have been proven to fail time and again. But while anti-patterns are patterns of design, bug patterns are patterns of erroneous program behavior correlated with programming mistakes. The concern is not with design at all, but with the programming and debugging process.





Learning by example

To illustrate the idea behind bug patterns, let's consider a fundamental bug pattern that is frequently encountered by beginning (and, often, more advanced) programmers. In subsequent articles, we'll cover more advanced bug patterns. Along with each pattern, I'll discuss programming principles that will help minimize the occurrence of that particular pattern (not to imply that all bugs are the result of a failure to follow some programming principle; we all make mistakes, regardless of how many principles we follow).

For classification purposes, I'll summarize bug pattern descriptions using the following form (borrowing some terminology from the medical establishment):

  • Pattern name
  • Symptoms
  • Cause(s)
  • Cures and preventions

The Rogue Tile pattern
Perhaps the most common bug pattern among beginning programmers results from copying and pasting a block of code into some other part of the program. Sometimes, small parts of the copy are changed because of slightly different functional requirements. Inevitably, bugs are fixed in one copy but not the other, leaving you scratching your head when the symptoms of the error recur. Although most programmers quickly become familiar with this pattern of bug, few take appropriate measures to minimize its occurrence. It's very tempting to take a break from thinking and simply copy code that you believe to be working already. But the productivity lost from fixing bugs due to indiscriminate copy-and-paste actions quickly dwarfs any productivity gained from copying the code.

I call this the Rogue Tile pattern because the various copies of a code block can be thought of as "tiles" covering the program. As the code in the various copies diverges, the copies become "rogue tiles."

The symptom
The most common symptom of this pattern of bug is a program that continues to exhibit erroneous behavior after you believe you've fixed the problem.

The cause
To understand how this can happen, let's consider the following class hierarchy for binary trees:

public abstract class Tree {

}

public class Leaf extends Tree {

 public Object value;
 ...
}

public class Branch extends Tree {

 public Object value;
 public Tree left;
 public Tree right;
 ...
}

The first thing to notice about these classes is that both concrete classes contain a value field of type Object. If you decide later to make trees containing, say, Integers, you might forget to update one of these field declarations. If some other part of the program were to expect these fields to be Integers, the program likely would not compile. You'll probably remember that you changed the type of the value field in one of the classes, but you might overlook the fact that you did not make the change in the other.

An ounce of prevention
Of course, this example is one that a beginning programmer would quickly learn to avoid by factoring out the common code. In this case, the field declaration should be moved to class Tree. Both subclasses will then inherit this field, and any changes to the field declaration need only occur in one place.
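As a minimal sketch of that refactoring, the hierarchy might look like the following. The Integer element type and the constructors are assumptions added for illustration; the original classes elide their remaining members.

```java
// Sketch only: the shared field is declared once, in Tree.
abstract class Tree {
    // Assumption: the trees hold Integers, as in the scenario above.
    public Integer value;

    // Hypothetical constructor, added so the example is self-contained.
    protected Tree(Integer value) {
        this.value = value;
    }
}

class Leaf extends Tree {
    Leaf(Integer value) {
        super(value);
    }
}

class Branch extends Tree {
    public Tree left;
    public Tree right;

    Branch(Integer value, Tree left, Tree right) {
        super(value);
        this.left = left;
        this.right = right;
    }
}
```

With this layout, changing the element type means editing a single declaration, and the compiler flags every use that no longer fits.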

Continuing with this example (and assuming the value field, now declared once in class Tree, holds Integers), we might also write methods for adding and multiplying the values of all the nodes in a Tree. For the sake of simplicity, I'll write these methods recursively:

 // in class Tree:

 public abstract int add();
 public abstract int multiply();

 // in class Branch:

 public int add() {
  return this.value.intValue() + left.add() + right.add();
 }

 public int multiply() {
  return this.value.intValue() * left.multiply() + right.multiply();
 }

 // in class Leaf:

 public int add() { return this.value.intValue(); }
 public int multiply() { return this.value.intValue(); }

Notice the bug I've introduced in the multiply method for class Branch: instead of multiplying by the third term, I add it. The error occurred because I created the multiply method by copying the code from the add method and making slight (but incomplete) alterations. This bug is particularly insidious because calling the multiply method will never signal an error. In fact, in many cases, it will return what appears to be a perfectly reasonable result.
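To make the "perfectly reasonable result" concrete, here is a small self-contained sketch that runs the buggy method on a three-node tree. The constructors, the Integer field type, and the class BuggyMultiplyDemo are assumptions added for illustration.

```java
abstract class Tree {
    public Integer value; // assumption: trees hold Integers

    Tree(Integer value) {
        this.value = value;
    }

    public abstract int multiply();
}

class Leaf extends Tree {
    Leaf(Integer value) {
        super(value);
    }

    public int multiply() {
        return value.intValue();
    }
}

class Branch extends Tree {
    public Tree left;
    public Tree right;

    Branch(Integer value, Tree left, Tree right) {
        super(value);
        this.left = left;
        this.right = right;
    }

    public int multiply() {
        // Copied from add(): the second '+' was never changed to '*'.
        return value.intValue() * left.multiply() + right.multiply();
    }
}

class BuggyMultiplyDemo {
    public static void main(String[] args) {
        Tree tree = new Branch(2, new Leaf(3), new Leaf(4));
        // Computes 2 * 3 + 4; the intended product is 2 * 3 * 4 = 24.
        System.out.println(tree.multiply()); // prints 10
    }
}
```

No exception is thrown and the output is a plausible-looking integer. For some inputs, such as a branch holding 1 over two leaves holding 2, the buggy and correct results even coincide (both are 4), which makes the error harder to spot.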

Just as before, we can minimize bugs of this sort by factoring out the common code. In this case, we could write a single method that accumulates an operator (passed as an argument) over a Tree. We can use a design pattern (not a bug pattern!) known as the Command Pattern to encapsulate this operator in an object:

public abstract class Operator {
 public abstract int apply(int l, int r);
}

public class Adder extends Operator {
 public int apply(int l, int r) {
   return l + r;
 }
}

public class Multiplier extends Operator {
 public int apply(int l, int r) {
   return l * r;
 }
}

Then we could change the methods in our Tree class hierarchy as follows:

 // in class Tree:

 public abstract int accumulate(Operator o);

 public int add() {
   return this.accumulate(new Adder());
 }

 public int multiply() {
   return this.accumulate(new Multiplier());
 }

 // in class Leaf:

 public int accumulate(Operator o) {
   return value.intValue();
 }


 // in class Branch:

 public int accumulate(Operator o) {
   return o.apply(this.value.intValue(),
                  o.apply(left.accumulate(o),
                         right.accumulate(o)));
 }

By factoring out the common code, we eliminated the possibility of a copy-and-paste error occurring in the method bodies of add and multiply. Also, notice that we no longer need separate add and multiply methods for each subclass of Tree.
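To see the pieces working together, the fragments above can be assembled into one runnable sketch. The constructors, the Integer field in Tree, and the class AccumulateDemo are assumptions added for illustration; the Operator classes and accumulate methods follow the listings above.

```java
abstract class Operator {
    public abstract int apply(int l, int r);
}

class Adder extends Operator {
    public int apply(int l, int r) {
        return l + r;
    }
}

class Multiplier extends Operator {
    public int apply(int l, int r) {
        return l * r;
    }
}

abstract class Tree {
    public Integer value; // assumption: trees hold Integers

    Tree(Integer value) {
        this.value = value;
    }

    public abstract int accumulate(Operator o);

    // add and multiply are written once, here, for every subclass.
    public int add() {
        return this.accumulate(new Adder());
    }

    public int multiply() {
        return this.accumulate(new Multiplier());
    }
}

class Leaf extends Tree {
    Leaf(Integer value) {
        super(value);
    }

    public int accumulate(Operator o) {
        return value.intValue();
    }
}

class Branch extends Tree {
    public Tree left;
    public Tree right;

    Branch(Integer value, Tree left, Tree right) {
        super(value);
        this.left = left;
        this.right = right;
    }

    public int accumulate(Operator o) {
        return o.apply(this.value.intValue(),
                       o.apply(left.accumulate(o), right.accumulate(o)));
    }
}

class AccumulateDemo {
    public static void main(String[] args) {
        Tree tree = new Branch(2, new Leaf(3), new Leaf(4));
        System.out.println(tree.add());      // prints 9
        System.out.println(tree.multiply()); // prints 24
    }
}
```

Because the traversal lives only in accumulate, supporting another binary operation means adding one small Operator subclass rather than another copy of the recursion.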

Factoring out common code is a good practice, but it can't be applied in all cases. For example, the simplicity of the Java type system often forces us to choose between precise type checking and keeping a single point of control for each distinct functional element of a program (see Resources for my article on NextGen). Because of such cases, the Rogue Tile pattern is a bug pattern that all developers must constantly strive to minimize.





What's next?

Here's our first bug pattern in a nutshell. You might want to cut this out and pin it on your bulletin board as a reminder.

  • Pattern: Rogue Tile
  • Symptoms: The code seems to act as if a previously corrected bug is still there.
  • Cause: At least one copy of a copy-and-pasted code fragment still contains a bug fixed in the other copies.
  • Cures and preventions: Factor out the common code, if possible; otherwise, update every copy whenever one of them changes. Avoid copying and pasting code.

In my next article, I will explore some other common bug patterns that manifest themselves in Java code. In particular, we'll look into bug patterns that crop up as null pointer exceptions and discuss how to minimize their occurrence.



Resources

  • Visit the Patterns home page for a good introduction to design patterns and how they are used.

  • The Anti-Patterns home page provides information and links to books on the subject.

  • If you haven't done so already, check out Extreme Programming, a new and powerful way of developing clean, robust software quickly.

  • Then download JUnit and start unit testing right away.

  • Eric Allen's article on NextGen, an extension of Java with run-time generic types, explains how a more powerful type system can help alleviate some of the tension that exists between the goals of factoring out common code and using the type system to catch errors during compilation.


About the author

Eric Allen has an A.B. in computer science and mathematics from Cornell University. He is currently the lead Java software developer at Cycorp, Inc., and a part-time graduate student in the programming languages team at Rice University. His research concerns the development of formal semantic models and extensions of the Java language, both at the source and bytecode levels. Currently, he is implementing a source-to-bytecode compiler for the NextGen programming language, an extension of the Java language with generic run-time types.



