The growing diversity of information-processing tasks (and the costs associated with them) makes it obvious that increasing the degree of code reuse is an effective design (and business) strategy in this climate of technology budget tightening. If system extensibility is your goal, ask yourself, " How extensible should the system be and how extensible can I make it?" Then consider the following facts:
- The trade-offs for adding extensibility may be decreased performance or testing ability.
- The most testable systems are usually the simplest ones; adding extensibility often adds complexity.
- A key bit of knowledge to planning a successful extensible design is to know how you plan to extend the system at a later date.
In the first article in this series, I outlined the various forms of extensibility that a system may exhibit -- the black box design and two white box designs, the glass box and the open box. Next, we'll explore these forms in greater depth.
We'll begin our tour with glass box extensibility.
Earlier, I defined glass box extensibility as refering to the ways in which a software system may be extended when the source code is available for viewing but not modifying. There are primarily two directions along which a program can be glass box extensible:
- The extension of data
- The extension of functionality over this data
Choosing to make a program extensible along either of these dimensions will have repercussions on its resultant architecture. In most cases, such a decision will also carry repercussions with respect to performance.
Data as the extensible dimension
Let us first consider the dimension of extensibility most natural to include in an object-oriented language such as Java -- data. Some of the most essential constructs in an object-oriented language (namely, the class hierarchy, interfaces, and abstract methods) are included primarily to allow extensibility along this dimension.
A composite datastructure can be expanded to include new subtypes merely by defining a new class that is declared to extend the root of the original composite. For example, consider the following simple class hierarchy for binary trees:
abstract class Tree {
}
class Branch extends Tree {
private int value;
private Tree left;
private Tree right;
public Branch(int _value, Tree _left, Tree _right) {
this.value = _value;
this.left = _left;
this.right = _right;
}
public Tree getLeft() {
return this.left;
}
public Tree getRight() {
return this.right;
}
public int getValue() {
return this.value;
}
}
class Leaf extends Tree {
}
|
If we wanted to add a new form of binary tree, such as a non-empty leaf node, all we need to do is define a new class as follows:
class NonEmptyLeaf extends Tree {
private int value;
public NonEmptyLeaf(int _value) {
this.value = _value;
}
public int getValue() {
return value;
}
}
|
Using a "Visitor" to extend binary trees
Alternatively, if we wanted to allow for easy extensibility of the functionality provided on these trees, but we weren't so interested in allowing for the extension of new subtypes, we could provide support for visitors over our trees.
The Visitor Pattern is one of the design patterns presented in the original book on the subject (Design Patterns, Gamma et. al.; see Resources). The idea behind this pattern is to define an abstract "visitor" class for a composite datatype. This visitor class contains a distinct method for each concrete subtype. In the case of our Tree class, we could define an abstract TreeVisitor class as follows:
interface TreeVisitor {
public Object forBranch(Branch that);
public Object forLeaf(Leaf that);
}
|
Each subtype of Tree must contain an accept method that takes a TreeVisitor and calls the corresponding method in the TreeVisitor on itself:
// in class Tree:
public abstract Object accept(TreeVisitor that);
// in class Branch:
public Object accept(TreeVisitor that) {
return that.forBranch(this);
}
// in class Leaf:
public Object accept(TreeVisitor that) {
return that.forLeaf(this);
}
|
Now, when we want to add new functionality on trees, we can simply define a new concrete subclass of TreeVisitor with the for methods defined appropriately. For example, we could add functionality to produce a deep copy of the tree as follows:
class TreeCopier implements TreeVisitor {
public Object forBranch(Branch that) {
return new Branch(that.getValue(),
(Tree)that.getLeft().accept(this),
(Tree)that.getRight().accept(this));
}
public Object forLeaf(Leaf that) {
// There are no subcomponents to visit in a Leaf.
return new Leaf();
}
}
|
But we will run into trouble with this approach if we also want to extend the set of concrete subtypes of class Tree.
The first problem is that existing TreeVisitors will not include for methods for the new datatype. This problem can be solved by subtyping the existing TreeVisitors with new TreeVisitors that include the new methods, and implementing a new subinterface that includes the new methods.
In our deep copy example (Listing 5), if single-element leaves were added to our trees, we could extend our TreeCopier in the following manner (notice the need to add a cast in the accept method for NonEmptyLeaf):
interface TreeVisitor2 extends TreeVisitor {
public Object forNonEmptyLeaf(NonEmptyLeaf that);
}
...
// in class NonEmptyLeaf
public Object accept(TreeVisitor that) {
return ((TreeVisitor2)that).forNonEmptyLeaf(this);
}
class TreeCopier2 extends TreeCopier implements TreeVisitor2 {
public Object forNonEmptyLeaf(NonEmptyLeaf that) {
return new NonEmptyLeaf(that.getValue());
}
}
|
But simply extending the TreeVisitors in this way is not enough.
What if visitors bring more visitors?
The original TreeVisitors may construct new instances of TreeVisitors. Instances of the subclasses of a TreeVisitor will now construct instances of their superclasses!
This problem is common. Often, when it is natural to include extra arguments to a visitor, the extra arguments are passed to the constructor of the visitor, which then places these arguments into fields. If, during a recursive descent of a datastructure, it is necessary to use different values for these arguments in the recursive call, then a new visitor is constructed with the new arguments.
For example, suppose we wanted to create a TreeVisitor for pretty-printing the elements of a Tree. We could use a field in the TreeVisitor to keep track of the degree of indentation when printing subtrees, as is done in the following TreeVisitor:
import java.io.OutputStream;
import java.io.PrintStream;
class TreePrinter implements TreeVisitor {
private int amountOfIndentation;
// The stream to which we are printing.
PrintStream out;
public TreePrinter() {
this.amountOfIndentation = 0;
this.out = System.out;
}
public TreePrinter(OutputStream _out) {
this();
this.out = new PrintStream(_out);
}
TreePrinter(int _amountOfIndentation) {
this();
this.amountOfIndentation = _amountOfIndentation;
}
TreePrinter(int _amountOfIndentation, OutputStream _out) {
this();
this.amountOfIndentation = _amountOfIndentation;
this.out = new PrintStream(_out);
}
/**
* Prints an amount of whitespace proportional to the
* current degree of indentation.
*/
public void indent() {
for (int i = 0; i < this.amountOfIndentation; i++) {
this.out.print(" ");
}
}
public Object forLeaf(Leaf that) {
// Since leaves are empty, they are not printed.
// Returns a dummy object to satisfy TreeVisitor interface.
return new Object();
}
public Object forBranch(Branch that) {
TreePrinter innerPrinter =
new TreePrinter(this.amountOfIndentation + 1, this.out);
this.indent();
this.out.println(that.getValue());
that.getLeft().accept(innerPrinter);
that.getRight().accept(innerPrinter);
// Returns a dummy object to satisfy TreeVisitor interface.
return new Object();
}
}
|
But then, when we extend this TreeVisitor to include the case of single-element leaves, the original methods will construct instances of the wrong TreeVisitor type:
class TreePrinter2 extends TreePrinter implements TreeVisitor2 {
public TreePrinter2(int _amountOfIndentation) {
super(_amountOfIndentation);
}
public Object forNonEmptyLeaf(NonEmptyLeaf that) {
this.indent();
this.out.println(that.getValue());
// Returns a dummy object to satisfy TreeVisitor interface.
return new Object();
}
}
...
// But the inherited method forBranch will construct an
// instance of TreePrinter, not TreePrinter2!
public Object forBranch(Branch that) {
TreePrinter innerPrinter =
new TreePrinter(this.amountOfIndentation + 1, this.out);
this.indent();
this.out.println(that.getValue());
that.left.accept(innerPrinter);
that.right.accept(innerPrinter);
// Returns a dummy object to satisfy TreeVisitor interface.
return new Object();
}
|
If an instance of TreeCopier2 tries to visit a Tree with a Branch that is the parent of a NonEmptyLeaf, the cast to TreeVisitor2 in the accept method of NonEmptyLeaf will fail.
An answer: Turn to factory methods
One solution to this problem, originally proposed by Krishnamurthi, et.al., (see Resources) is to use factory methods rather than constructors to construct new instances of visitors. These factory methods may then be overridden in any subclasses of the original visitor.
In our example, this could be done by including the following factory method to class TreePrinter:
// in class TreePrinter:
TreePrinter newTree(int _amountOfIndentation, OutputStream _out) {
return new TreePrinter(_amountOfIndentation, _out);
}
|
Any methods inside class TreePrinter that construct new TreePrinters should call method newTree() to do so.
So, the forBranch() method of TreePrinter would be written as follows:
// in class TreePrinter:
public Object forBranch(Branch that) {
TreePrinter innerPrinter =
newTree(this.amountOfIndentation + 1, this.out);
this.indent();
this.out.println(that.getValue());
that.getLeft().accept(innerPrinter);
that.getRight().accept(innerPrinter);
// Returns a dummy object to satisfy TreeVisitor interface.
return new Object();
}
|
Then, if class TreePrinter ever needs to be extended to include methods for new datatypes, we can simply override newTree() in the new class to return an instance of the appropriate type.
For example, we could override method newTree() for class TreePrinter2 as follows:
// in class TreePrinter2:
TreePrinter newTree(int _amountOfIndentation, OutputStream _out) {
return new TreePrinter2(_amountOfIndentation, _out);
}
|
This solution is known as the Extensible Visitor Pattern.
A final word on performance versus extensibility
So, with the above design, we are now able to add both new functionality over Trees, as well as new subtypes of class Tree, quite easily. Of course, we will pay for this extensibility with performance trade-offs.
This kind of extensibility works best when algorithms over the data are defined recursively, but, unfortunately, recursive calls can be expensive in the Java language, and for large datastructures, the calls can easily overflow the stack.
The latest JIT compilers ease this problem by performing dynamic inlining of method calls when possible. Additionally, the latest IBM JIT compiler also performs tail-call elimination which helps prevent stack overflow at least on tail-recursive methods.
Fortunately, it turns out in practice that the parts of a program where we want the greatest extensibility often are not the most performance-critical parts. In those cases, the use of extensible design like that described in this article can be most advantageous. Next time, I will discuss some of the issues involved with black box extensibility.
- The JUnit Web site provides links to many interesting articles from a multitude of sources that discuss program testing methods.
- Read all of Eric's Diagnosing Java code articles.
- Find more Java technology resources on the developerWorks Java technology zone.
Eric Allen has a bachelor's degree in computer science and mathematics from Cornell University and is a PhD candidate in the Java programming languages team at Rice University. Before returning to Rice to finish his degree, Eric was the lead Java software developer at Cycorp, Inc. He has also moderated the Java Beginner discussion forum at Javaworld. His research concerns the development of semantic models and static analysis tools for the Java language, both at the source and bytecode levels. Eric has also helped in the development of Rice's compiler for the NextGen programming language, an extension of the Java language with generic run-time types. Contact Eric at eallen@cs.rice.edu.
Comments (Undergoing maintenance)





