Design patterns, at their best, can be a great help in quickly converging on a simple design for a project. But the simplest way to implement a design pattern in a particular context is not always obvious -- there's more than one way to roast a goose. This month, we'll discuss some ways to apply common design patterns to produce simple, short, and robust code.
First, let's look at two patterns with applicability in many different contexts. Of all the design patterns discussed in Design Patterns (Erich Gamma, et al., a.k.a. the "Gang of Four"), the seminal work on this topic (see Resources), I've found that the most widely applicable are the Composite and Visitor patterns.
Let's take a closer look at these two patterns.
Specifying a recursive datatype with Composite
It's easy to see why the Composite pattern is useful. The Composite pattern is simply the object-oriented way to specify a recursively defined datatype; these opportunities pop up all the time in software development.
The ability to work over recursively defined types of objects is one of the most salient characteristics of software engineering (as opposed to, say, digital design, where the systems are built from finite-state machines).
Extending class hierarchies with Visitor
As for the Visitor pattern, much of the reason it is so widely applicable is the fact that it complements the Composite pattern.
Visitors are often touted as a means of extending the functionality of an existing composite class hierarchy without actually modifying the classes in that hierarchy. But their abilities go much further than that.
Because visitors allow you to separate some of the functionality of a class hierarchy from the classes themselves, they can be used in all sorts of settings where it is conceptually awkward to consider the functionality to be an intrinsic part of those classes.
One common case in which this occurs frequently is in a recursive descent over a composite data structure. For example, consider a class hierarchy for binary trees and a visitor that determines if any node in a tree contains a zero:
abstract class Tree {
public abstract Boolean accept(TreeVisitor that);
}
class Leaf extends Tree {
public static final Leaf ONLY = new Leaf();
public Boolean accept(TreeVisitor that) {
return that.forLeaf(this);
}
}
class Branch extends Tree {
public int value;
public Tree left;
public Tree right;
public Branch(int _value, Tree _left, Tree _right) {
this.value = _value;
this.left = _left;
this.right = _right;
}
public Boolean accept(TreeVisitor that) {
return that.forBranch(this);
}
}
interface TreeVisitor {
public Boolean forLeaf(Leaf that);
public Boolean forBranch(Branch that);
}
class ZeroFinder implements TreeVisitor {
public Boolean forLeaf(Leaf that) {
return new Boolean(false);
}
public Boolean forBranch(Branch that) {
if (that.value == 0) { return new Boolean(true); }
boolean foundInLeft = that.left.accept(this).booleanValue();
boolean foundInRight = that.right.accept(this).booleanValue();
return new Boolean(foundInLeft || foundInRight);
}
}
|
By keeping this zero-finding functionality in for*() methods ("*" representing the wildcard such as "Leaf" or "Branch"), separate from the Tree classes themselves, we gain these advantages:
- We prevent code bloat of these classes. If we add in new methods to our classes every time there is a need for one more piece of functionality, they'll get big pretty quickly.
- We keep all of the zero-finding functionality in one place, making it easier to maintain.
Now, as the Gang of Four mentioned, the main disadvantage of using visitors is that it becomes a little harder to add new classes into the composite hierarchy. This said, visitors are best used in cases where the composite hierarchy is not expected to change (although there are ways around this, such as the "Extensible Visitor pattern").
However, there is another big disadvantage. In cases where many of the classes in the composite hierarchy are handled in the same way, the code will be a lot larger in visitors than it would be if put in the classes themselves (since we could simply put a default implementation into the parent class).
Fortunately, there is a variant on ordinary visitors that not only solves this issue, but actually allows us, in many cases, to make the code even simpler than it would be if put into the individual classes. We call this variant the Depth-First Visitor pattern.
The idea behind the Depth-First Visitor pattern is as follows: most visitors that recursively descend down a data structure do so in a depth-first manner. That means they visit the children of a given (non-leaf) node before visiting the node itself.
Therefore, we can implement the depth-first traversal in the for* methods of an abstract visitor class and then let the concrete implementations simply describe, for each datatype, how to combine the results of visiting the subnodes. These descriptions are placed in special for*Only() methods that take as arguments the results of visiting their subnodes.
For example, Listing 2 shows how we could rewrite a depth-first visitor over our binary trees:
abstract class DepthFirstTreeVisitor implements TreeVisitor {
public abstract Boolean forLeafOnly(Leaf that);
public abstract Boolean forBranchOnly(Branch that,
Boolean left_result,
Boolean right_result);
public Boolean forLeaf(Leaf that) {
return forLeafOnly(that);
}
public Boolean forBranch(Branch that) {
Boolean left_result = that.left.accept(this);
Boolean right_result = that.right.accept(this);
return forBranchOnly(that, left_result, right_result);
}
}
|
Now we can rewrite our ZeroFinder visitor as shown in Listing 3.
abstract class DepthFirstTreeVisitor implements TreeVisitor {
class DepthFirstZeroFinder extends DepthFirstTreeVisitor {
public Boolean forLeafOnly(Leaf that) {
return new Boolean(false);
}
public Boolean forBranchOnly(Branch that,
Boolean left_result,
Boolean right_result)
{
if (that.value == 0) { return new Boolean(true); }
return new Boolean(left_result.booleanValue() ||
right_result.booleanValue());
}
}
|
With this, we've managed to pull much of the complexity of the depth-first traversal out of the concrete visitors and into the (shared) abstract parent class. Also, the implementor of a depth-first visitor would have handy (in the signatures of the for*Only() methods) a list of all contained components for each type.
What about complex hierarchies?
This is about as good as we can do for this simple composite hierarchy. For more complex class hierarchies, however, we can do more.
Often, for composite hierarchies of many classes, a given visitor will handle many of the cases in a common (or trivial) way. This is especially true when the visitor substitutes for a couple instanceof checks. For these cases, we can provide default implementations of the for*Only() methods that simply call an abstract defaultCase() method:
abstract class DepthFirstTreeVisitor implements TreeVisitor {
public abstract Boolean defaultCase(Tree that);
public Boolean forLeafOnly(Leaf that) {
return defaultCase(that);
}
public Boolean forBranchOnly(Branch that,
Boolean left_result,
Boolean right_result)
{
return defaultCase(that);
}
...
}
|
Now, when implementing a depth-first visitor over such a composite, we need only specify code for the cases that aren't taken care of by the default.
When the composite hierarchy is deeper than one layer, we may also want to provide separate default methods for each subtree of the hierarchy. These default methods can be defined to simply call the default method of the parent class. But they can be overridden as necessary.
Once we've done that, we've gained the terseness of including functionality in the classes of the composite while keeping all the advantages of ordinary visitors.
For visitors that descend depth-first in all but a few cases, we can still use a depth-first visitor. We just have to override those for*() methods (as opposed to the for*Only() methods) that we don't want to traverse depth-first.
There is one caveat to this approach, however: it's really easy to encounter broken dispatches this way. Remember them? The Broken Dispatch bug pattern occurs when we accidentally overload a method rather than overriding its implementation in the parent class. With depth-first visitors, this is especially easy to do.
Suppose that we intend to override one of the for*() (but not the for*Only()) methods of the parent class. The signature of this for*() method will take the node it operates on, but, unlike a for*Only() method, it won't take the results of visiting the subnodes. Now, because we're writing both for*() and for*Only() methods, it's not hard to imagine that we'll slip and accidentally write for*Only() as the name of one of our for*() methods.
If we do so, the code will compile just fine, but we'll have failed to override the for*() method. Instead, we'll have overloaded the for*Only() method with a new method of a different signature.
Additionally, this new for*Only() method will never be called. Broken dispatches like these can be particularly insidious bugs to track down -- the type checker won't catch them and you never know when, or if, they'll ever actually signal an error. In the worst-case scenario, broken dispatches will simply modify the behavior of your program in such a way that the results appear to be perfectly reasonable, but are, in fact, dead wrong.
All I can say about preventing them is that they're the zillionth example of why it saves time to write unit tests before (and while) writing code, and why you should run those tests often!
So there you have it: depth-first visitors, why they're cool, and how they can bite you. I'd like to thank graduate student Brian Stoler of our research lab for first introducing me to this pattern. Depth-first visitors are constructed by the "Java Tree Builder," a free tool available in Resources for automatically generating a composite hierarchy (plus visitors) from a JavaCC grammar.
- If you're new to design patterns, you might want to start with "Java design patterns 101" (developerWorks, January 2002), which provides an introduction to an important and useful object-oriented design and development concept.
- Start generating depth-first visitors automatically with the Java Tree Builder, a syntax tree builder used with the Java Compiler Compiler (JavaCC) parser generator. It takes a plain JavaCC grammar file as input and automatically generates the following: a set of syntax tree classes based on the productions in the grammar (using the Visitor design pattern); two interfaces (Visitor and ObjectVisitor); two depth-first visitors (DepthFirstVisitor and ObjectDepthFirst); and a JavaCC grammar with the proper annotations to build the syntax tree during parsing.
- The JUnit Web site provides links to many interesting articles from a multitude of sources that discuss program testing methods.
- Read all of Eric Allen's
Diagnosing Java Code
columns.
- Find more Java resources on the developerWorks Java technology zone.
Eric Allen has a bachelor's degree in computer science and mathematics from Cornell University and is a PhD candidate in the Java programming languages team at Rice University. Before returning to Rice to finish his degree, Eric was the lead Java software developer at Cycorp, Inc. He has also moderated the Java Beginner discussion forum at JavaWorld. His research concerns the development of semantic models and static analysis tools for the Java language, both at the source and bytecode levels. Eric has also helped in the development of Rice's compiler for the NextGen programming language, an extension of the Java language with generic run-time types. Contact Eric at eallen@cs.rice.edu.