Level: Introductory
Eric Allen, Software Engineer, Cycorp, Inc.
01 Feb 2001
Welcome to Diagnosing Java Code, a new bi-weekly column that focuses on Java solutions to keep you on track with your daily programming efforts. This first article introduces the notion of bug patterns, a concept that will improve your ability to detect and remedy bugs in your code. You will learn about one of the most common bug patterns, which will lay the groundwork for you to begin recognizing and avoiding more advanced patterns.
Bug patterns and why they're useful
Just as good programming skill involves the knowledge of many design
patterns, which you can combine and apply in various contexts, good debugging
skill involves knowledge of bug patterns. Bug patterns are recurring
correlations between signaled errors and underlying bugs in a program. This
concept is not novel to programming. Medical doctors rely on similar types of
correlations when diagnosing disease. They learn to do so by working closely
with senior doctors during their internships. Their very education focuses on
learning to make such diagnoses. In contrast, our education as software
engineers focuses on design processes and algorithmic analysis. These skills
are, of course, important, but little attention is paid to teaching the
process of debugging. Instead, we are expected to "pick up" the skill on our
own. With the advent of extreme programming and its emphasis on unit testing,
this practice is starting to change. But frequent unit testing solves just
part of the problem. Once bugs are found, they must be diagnosed and
corrected. Fortunately, many bugs follow one of several patterns we can
identify. Once you can recognize these patterns, you will be able to diagnose
the cause of a bug and correct it more quickly.

Bug patterns are related to anti-patterns, which are patterns of common
software designs that have been proven to fail time and again. But while
anti-patterns are patterns of design, bug patterns are patterns of erroneous
program behavior correlated with programming mistakes. The concern is not with
design at all, but with the programming and debugging process.
Learning by example
To illustrate the idea behind bug patterns, let's consider a fundamental
bug pattern that is frequently encountered by beginning (and, often, more
advanced) programmers. In subsequent articles, we'll cover more advanced bug
patterns. Along with each pattern, I'll discuss programming principles that
will help minimize the occurrence of that particular pattern (not to imply
that all bugs are the result of a failure to follow some programming
principle; we all make mistakes, regardless of how many principles we follow).

For classification purposes, I'll summarize bug pattern descriptions using the
following form (borrowing some terminology from the medical establishment):
- Pattern name
- Symptoms
- Cause(s)
- Cures and preventions
The Rogue Tile pattern
Perhaps the most common bug pattern among beginning programmers results
from copying and pasting a block of code into some other part of the program.
Sometimes, small parts of the copy are changed because of slightly different
functional requirements. Inevitably, bugs are fixed in one copy but not the
other, leaving you scratching your head when the symptoms of the error recur.

Although most programmers quickly become familiar with this pattern of bug, few take appropriate measures to minimize its occurrence. It's very tempting to take a break from thinking and simply copy code that you believe to be working already. But the productivity lost from fixing bugs due to indiscriminate copy-and-paste actions quickly dwarfs any productivity gained from copying the code.

I call this the Rogue Tile pattern because the various copies of a code
block can be thought of as "tiles" covering the program. As the code in the
various copies diverges, the copies become "rogue tiles."
The symptom
The most common symptom of this pattern of bug is a program that continues
to exhibit erroneous behavior after you believe you've fixed the problem.
The cause
To understand how this can happen, let's consider the following class hierarchy for
binary trees:
    public abstract class Tree {
    }

    public class Leaf extends Tree {
        public Object value;
        ...
    }

    public class Branch extends Tree {
        public Object value;
        public Tree left;
        public Tree right;
        ...
    }
The first thing to notice about these classes is that both concrete classes
contain a value field of type Object. If you decide
later to make trees containing, say, Integers, you might forget
to update one of these field declarations. If some other part of the program
were to expect these fields to be Integers, the program likely
would not compile. You'll probably remember that you changed the type of the
value field in one of the classes, but you might overlook the
fact that you did not make the change in the other.
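To make the divergence concrete, here is a minimal sketch of the half-finished edit described above. It assumes the Leaf declaration was updated to Integer while the Branch declaration was left as Object. The constructors and the FieldDriftDemo class are hypothetical additions for illustration; they are not part of the original listing.

```java
// Hypothetical sketch: one copy of the field declaration was updated, the other was not.
abstract class Tree {
}

class Leaf extends Tree {
    public Integer value;            // updated to Integer

    Leaf(Integer value) {            // constructor assumed for this sketch
        this.value = value;
    }
}

class Branch extends Tree {
    public Object value;             // rogue tile: still declared as Object
    public Tree left;
    public Tree right;

    Branch(Object value, Tree left, Tree right) {   // constructor assumed for this sketch
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

class FieldDriftDemo {
    public static void main(String[] args) {
        Leaf leaf = new Leaf(3);
        Branch branch = new Branch(2, leaf, new Leaf(4));

        // Compiles: Leaf.value is declared as Integer.
        int fromLeaf = leaf.value.intValue();
        System.out.println(fromLeaf);

        // Would not compile: Branch.value is still declared as Object,
        // and Object has no intValue() method.
        // int fromBranch = branch.value.intValue();
    }
}
```

Uncommenting the last line produces a compile-time error that points at Branch, even though the edit you remember making was in Leaf.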
An ounce of prevention
Of course, this example is one that a beginning programmer would quickly
learn to avoid by factoring out the common code. In this case, the field
declaration should be moved to class Tree. Both subclasses will
then inherit this field, and any changes to the field declaration need only
occur in one place.

Continuing with this example, we might also write methods for adding and
multiplying all the nodes in a Tree. For the sake of simplicity,
I'll write these methods recursively:
    // in class Tree:
    public abstract int add();
    public abstract int multiply();

    // in class Branch:
    public int add() {
        return this.value.intValue() + left.add() + right.add();
    }
    public int multiply() {
        return this.value.intValue() * left.multiply() + right.multiply();
    }

    // in class Leaf:
    public int add() { return this.value.intValue(); }
    public int multiply() { return this.value.intValue(); }
Notice the bug I've introduced in the multiply method for
class Branch: instead of multiplying by the third term, I add
it. The error occurred because I created the multiply method by
copying the code from the add method and making slight (but
incomplete) alterations. This bug is particularly insidious because calling
the multiply method will never signal an error. In fact, in many
cases, it will return what appears to be a perfectly reasonable result.

Just as before, we can minimize bugs of this sort by factoring out the
common code. In this case, we could write a single method that accumulates an
operator (passed as an argument) over a Tree. We can use a design
pattern (not a bug pattern!) known as the Command Pattern to encapsulate this
operator in an object:
    public abstract class Operator {
        public abstract int apply(int l, int r);
    }

    public class Adder extends Operator {
        public int apply(int l, int r) {
            return l + r;
        }
    }

    public class Multiplier extends Operator {
        public int apply(int l, int r) {
            return l * r;
        }
    }
Then we could change the methods in our Tree class hierarchy as
follows:
    // in class Tree:
    public abstract int accumulate(Operator o);

    public int add() {
        return this.accumulate(new Adder());
    }
    public int multiply() {
        return this.accumulate(new Multiplier());
    }

    // in class Leaf:
    public int accumulate(Operator o) {
        return value.intValue();
    }

    // in class Branch:
    public int accumulate(Operator o) {
        return o.apply(this.value.intValue(),
                       o.apply(left.accumulate(o),
                               right.accumulate(o)));
    }
By factoring out the common code, we eliminated the possibility of a
copy-and-paste error occurring in the method bodies of add
and multiply. Also, notice that we no longer need
separate add and multiply methods for each
subclass of Tree.

Factoring out common code is a good practice, but it can't be applied
in all cases. For example, the simplicity of the Java type system
often forces us to choose between precise type checking and keeping a
single point of control for each distinct functional element of a
program (see Resources for my article on NextGen). Because of such cases, the Rogue
Tile pattern is a bug pattern that all developers must constantly strive to
minimize.
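Putting the pieces of this section together, the following is one possible self-contained sketch of the factored-out version. It assumes the value field has been moved to Tree and retyped as Integer, as suggested earlier. The constructors and the TreeDemo class are hypothetical additions so that the example can be compiled and run on its own.

```java
// Command pattern: each operator is an object with a single apply method.
abstract class Operator {
    public abstract int apply(int l, int r);
}

class Adder extends Operator {
    public int apply(int l, int r) {
        return l + r;
    }
}

class Multiplier extends Operator {
    public int apply(int l, int r) {
        return l * r;
    }
}

abstract class Tree {
    // Declared once, so a change of type happens in exactly one place.
    public Integer value;

    Tree(Integer value) {            // constructor assumed for this sketch
        this.value = value;
    }

    public abstract int accumulate(Operator o);

    public int add() {
        return this.accumulate(new Adder());
    }

    public int multiply() {
        return this.accumulate(new Multiplier());
    }
}

class Leaf extends Tree {
    Leaf(Integer value) {
        super(value);
    }

    public int accumulate(Operator o) {
        return value.intValue();
    }
}

class Branch extends Tree {
    public Tree left;
    public Tree right;

    Branch(Integer value, Tree left, Tree right) {
        super(value);
        this.left = left;
        this.right = right;
    }

    public int accumulate(Operator o) {
        return o.apply(this.value.intValue(),
                       o.apply(left.accumulate(o), right.accumulate(o)));
    }
}

class TreeDemo {
    public static void main(String[] args) {
        Tree tree = new Branch(2, new Leaf(3), new Leaf(4));
        System.out.println(tree.add());       // 2 + 3 + 4 = 9
        System.out.println(tree.multiply());  // 2 * 3 * 4 = 24
    }
}
```

The traversal logic now lives only in accumulate, so a fix made there applies to both add and multiply.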
What's next?
Here's our first bug pattern in a nutshell. You might want to cut this out and
pin it on your bulletin board as a reminder.
- Pattern: Rogue Tile
- Symptoms: The code seems to act as if a previously corrected bug is still there.
- Cause: At least one copy of a copy-and-pasted code fragment still contains a bug fixed in the other copies.
- Cures and preventions: Factor out the common code, if possible; otherwise, apply every fix to all of the copies. Avoid copying and pasting code.
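As a rough illustration of how quietly this pattern can hide, the sketch below reproduces the buggy multiply method from earlier and runs it on two small trees. The constructors, the choice of inputs, and the RogueTileCheck class are assumptions made for this example. The value field is declared as Integer in Tree so that the code compiles on its own.

```java
abstract class Tree {
    public Integer value;

    Tree(Integer value) {            // constructor assumed for this sketch
        this.value = value;
    }

    public abstract int add();
    public abstract int multiply();
}

class Leaf extends Tree {
    Leaf(Integer value) {
        super(value);
    }

    public int add() { return this.value.intValue(); }
    public int multiply() { return this.value.intValue(); }
}

class Branch extends Tree {
    public Tree left;
    public Tree right;

    Branch(Integer value, Tree left, Tree right) {
        super(value);
        this.left = left;
        this.right = right;
    }

    public int add() {
        return this.value.intValue() + left.add() + right.add();
    }

    // Copied from add(); the second operator was never changed (the rogue tile).
    public int multiply() {
        return this.value.intValue() * left.multiply() + right.multiply();
    }
}

class RogueTileCheck {
    public static void main(String[] args) {
        // Here the bug happens to produce the correct product: 2 * 1 + 2 = 4 = 2 * 1 * 2.
        Tree lucky = new Branch(2, new Leaf(1), new Leaf(2));
        System.out.println(lucky.multiply());     // 4

        // Here it does not: 2 * 3 + 4 = 10, but the true product is 24.
        Tree unlucky = new Branch(2, new Leaf(3), new Leaf(4));
        System.out.println(unlucky.multiply());   // 10
    }
}
```

A test that uses only the first tree would pass despite the bug, which is one reason to check a method like this against more than one input.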
In my next article, I will explore some other common
bug patterns that manifest themselves in Java code. In particular, we'll look into
bug patterns that crop up as null pointer exceptions and discuss how to
minimize their occurrence.
Resources
- Visit the Patterns home page for a good introduction to design patterns and how they are used.
- The Anti-Patterns home page provides information and links to books on the subject.
- If you haven't done so already, check out Extreme Programming, a new and powerful way of developing clean, robust software quickly.
- Then download JUnit and start unit testing right away.
- Eric Allen's article on NextGen, an extension of Java with run-time generic types, explains how a more powerful type system can help alleviate some of the tension that exists between the goals of factoring out common code, and using the type system to catch errors during compilation.
About the author
Eric Allen has an A.B. in computer science and mathematics from Cornell University. He is currently the lead Java software developer at Cycorp, Inc., and a part-time graduate student in the programming languages team at Rice University. His research concerns the development of formal semantic models and extensions of the Java language, both at the source and bytecode levels. Currently, he is implementing a source-to-bytecode compiler for the NextGen programming language, an extension of the Java language with generic run-time types.