Diagnosing Java Code: The Impostor Type bug pattern

Using tags to distinguish object types can lead to mislabeling

Eric Allen (eallen@cs.rice.edu), Ph.D. candidate, Java programming languages team, Rice University
Eric Allen has an A.B. in computer science and mathematics from Cornell University. He is a Ph.D. candidate in the Java programming languages team at Rice University. His research concerns the development of semantic models and static analysis tools for the Java language, both at the source and bytecode levels. Currently, he is implementing a source-to-bytecode compiler for the NextGen programming language, an extension of the Java language with generic run-time types. Contact Eric at eallen@cs.rice.edu.

Summary:  When special tags in fields are used to distinguish between types of objects, errors are possible in which a tag mislabels the associated data -- a bug pattern known as the Impostor Type. In this installment of Diagnosing Java Code, Eric Allen examines the symptoms and causes of this bug, defines ways to prevent this error from occurring, and discusses a tempting hybrid implementation that does not use impostor types but, in the end, turns out to have many of the same weaknesses.

Date:  01 Jul 2001
Level:  Introductory

All but the most trivial of programs manipulate some types of data. Static type systems provide a way to ensure that a program doesn't manipulate data of a given type inappropriately. One of the advantages of the Java language is that it is statically and strongly typed, so that many type errors are ruled out before the program is ever run. As developers, we can use this type system to produce more robust code with fewer bugs. Often, though, the type system is not used to its full potential.

The Impostor Type bug pattern

Many programs make less use of the static type system than they could, instead relying on special fields to contain tags that distinguish the types of data.

By relying on these special fields to distinguish types of data, such programs forgo the very protection that the type system was designed to give them. When one of these tags mislabels its data, it generates a bug that I call the Impostor Type.


The symptoms

One common symptom of an impostor type bug is that many conceptually distinct types of data are all treated in the same (and incorrect) manner. Another common symptom is that data doesn't match any of the designated types.

As a rule of thumb, suspect this bug pattern whenever there is a mismatch between the conceptual type of data and the way it is handled by your program.

To illustrate how easily bugs of this pattern can be introduced, let's consider a simple example. Suppose we want to manipulate various Euclidean forms, such as circles, squares, etc. These forms will have no position, but they will have a scale, so that it will be possible to compute their area.

public class Form {

     String shape;
     double scale;

     public Form(String _shape, double _scale) {
         this.shape = _shape;
         this.scale = _scale;
     }

     public double getArea() {
         if (shape.equals("square")) {
             return scale * scale;
         }
         else if (shape.equals("circle")) {
             return Math.PI * scale * scale;
         }
         else { // shape.equals("triangle"), an equilateral triangle
             return scale * (scale * Math.sqrt(3) / 4);
         }
     }
 }           


There are serious disadvantages to implementing forms in this way, even though you see it done often.

One of the most glaring disadvantages is that this method is not very extensible. If we wanted to introduce a new shape for our forms (such as "pentagon"), we'd have to go in and modify the source code for the getArea() method. But extensibility is a separate concern; in this article, we'll focus on the susceptibility to errors that the implementation causes. I'll come back to the issue of extensibility in a future article.

Consider what would happen if, in some other part of the program, we constructed a new Form object as follows:

  Form f = new Form("sqaure", 2);

Of course, "square" has been misspelled. But, as far as the compiler is concerned, this is perfectly valid code.

Now consider what will happen when we try to, say, call getArea() on our new Form object. Because the shape of the Form won't match any of the tests in the if-then-else block, its area will be computed in the else clause, as if it were a triangle!
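To make the failure concrete, here is a minimal sketch that copies the tag-based getArea() logic into a class renamed TaggedForm and compares a correctly spelled form with the misspelled one. The names TaggedForm and MisspelledTagDemo are hypothetical, chosen only so the example compiles on its own.

```java
// Sketch only: TaggedForm mirrors the tag-based Form above. The class and
// demo names are illustrative and not part of the original code.
class TaggedForm {
    String shape;
    double scale;

    TaggedForm(String shape, double scale) {
        this.shape = shape;
        this.scale = scale;
    }

    double getArea() {
        if (shape.equals("square")) {
            return scale * scale;
        }
        else if (shape.equals("circle")) {
            return Math.PI * scale * scale;
        }
        else { // assumed to be "triangle", but any unrecognized tag lands here
            return scale * (scale * Math.sqrt(3) / 4);
        }
    }
}

class MisspelledTagDemo {
    public static void main(String[] args) {
        TaggedForm intended = new TaggedForm("square", 2);
        TaggedForm mistyped = new TaggedForm("sqaure", 2);

        System.out.println(intended.getArea()); // prints 4.0
        // The misspelled tag falls through to the triangle formula, so the
        // result is the square root of 3 (about 1.73) rather than 4.
        System.out.println(mistyped.getArea());
    }
}
```

Both calls return without complaint; only the second number is wrong, and nothing in the output says so.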

There will be no error signaled. Indeed, in many circumstances, the return value will appear to be a perfectly reasonable number. Even if we put in some redundancy and check that the implied condition in the else clause holds (with, say, an assertion), the error won't be found until the code is run.
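One way to add that redundancy is to test the last tag explicitly and reject anything unrecognized. The sketch below uses a hypothetical CheckedForm class and throws an IllegalArgumentException, on the assumption that we want the check to be active in every run. Note that the misspelling is still detected only when getArea() is actually called.

```java
// Sketch only: CheckedForm is a hypothetical variant of the tag-based Form
// that refuses to guess when the tag is not one it recognizes.
class CheckedForm {
    String shape;
    double scale;

    CheckedForm(String shape, double scale) {
        this.shape = shape;
        this.scale = scale;
    }

    double getArea() {
        if (shape.equals("square")) {
            return scale * scale;
        }
        else if (shape.equals("circle")) {
            return Math.PI * scale * scale;
        }
        else if (shape.equals("triangle")) { // an equilateral triangle
            return scale * (scale * Math.sqrt(3) / 4);
        }
        else {
            // Fail loudly at run time instead of returning a plausible number.
            throw new IllegalArgumentException("Unknown shape tag: " + shape);
        }
    }
}
```

With this version, new CheckedForm("sqaure", 2) is still constructed without complaint; the exception appears only on the first call to getArea(), which may be far from the line that contains the typo.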

Many other similar bugs might occur with the above code. A clause might be accidentally left out of the if-then-else block, causing all Forms of the type corresponding to that clause to be handled improperly. Additionally, because the impostor type is just a String in a field, it might be modified, either accidentally or maliciously.

Either way, such modifications could wreak all sorts of havoc.
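As a small illustration of the accidental case, the sketch below shows other code in the same package reassigning the tag after construction. MutableTagForm and TagMutationDemo are hypothetical names, and the class keeps only two shape tests plus the fallback for brevity. Declaring the field final would rule out reassignment, though it would do nothing about misspelled tags.

```java
// Sketch only: a trimmed, hypothetical copy of the tag-based Form whose
// tag field is package-private and non-final, as in the original listing.
class MutableTagForm {
    String shape;
    double scale;

    MutableTagForm(String shape, double scale) {
        this.shape = shape;
        this.scale = scale;
    }

    double getArea() {
        if (shape.equals("square")) {
            return scale * scale;
        }
        else if (shape.equals("circle")) {
            return Math.PI * scale * scale;
        }
        else { // assumed to be "triangle"
            return scale * (scale * Math.sqrt(3) / 4);
        }
    }
}

class TagMutationDemo {
    public static void main(String[] args) {
        MutableTagForm f = new MutableTagForm("circle", 1);
        System.out.println(f.getArea()); // area of a circle of radius 1

        // Any code in the same package can relabel the object.
        f.shape = "square";
        System.out.println(f.getArea()); // prints 1.0: now treated as a square
    }
}
```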


Cures and preventions

As you might have guessed, I suggest avoiding bugs of this type by using the type system to weed them out during static checking. Consider this alternative implementation:

public abstract class Form {

     double scale;

     public Form(double _scale) {
         this.scale = _scale;
     }

     public abstract double getArea();
 }

 class Square extends Form {
     public Square(double _scale) {
         super(_scale);
     }

     public double getArea() {
         return scale * scale;
     }
 }

 class Circle extends Form {
     public Circle(double _scale) {
         super(_scale);
     }

     public double getArea() {
         return Math.PI * scale * scale;
     }   
 }

 class Triangle extends Form {
     public Triangle(double _scale) {
         super(_scale);
     }

     public double getArea() {
         return scale * (scale * Math.sqrt(3) / 4);
     }
 }


Now consider what would happen if we were to mistype "Sqaure" when creating a new Form. The compiler would signal an error, telling us that class Sqaure could not be found. The code would never even have a chance to run.

Similarly, the compiler would not allow us to forget to define getArea() for any of our concrete subclasses. And, of course, it would be impossible for any code to change the class of an existing Form object.
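To see the class-based version in use, here is a brief sketch. It repeats Form and two of the subclasses from the listing above (without the public modifier) so that it compiles on its own; the FormDemo class and its totalArea() helper are hypothetical additions.

```java
// Sketch only: Form, Square, and Circle repeat the listing above;
// FormDemo and totalArea() are illustrative additions.
abstract class Form {
    double scale;

    Form(double scale) {
        this.scale = scale;
    }

    abstract double getArea();
}

class Square extends Form {
    Square(double scale) {
        super(scale);
    }

    double getArea() {
        return scale * scale;
    }
}

class Circle extends Form {
    Circle(double scale) {
        super(scale);
    }

    double getArea() {
        return Math.PI * scale * scale;
    }
}

class FormDemo {
    // Each element supplies its own getArea(); no tag is inspected here.
    static double totalArea(Form[] forms) {
        double total = 0;
        for (int i = 0; i < forms.length; i++) {
            total += forms[i].getArea();
        }
        return total;
    }

    public static void main(String[] args) {
        Form[] forms = { new Square(2), new Circle(1) };
        System.out.println(totalArea(forms));

        // Form f = new Sqaure(2);  // would not compile: no class named Sqaure
    }
}
```

The caller never names a shape as a string, so there is no tag left to misspell or overwrite.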


One last caveat

Before leaving this topic, I'd like to discuss one more possible implementation, a kind of cross between the two implementations I've discussed.

In this case, no impostor types are used, but the code has many of the same susceptibilities as if they were. In fact, this implementation is worse than implementing getArea() separately for each type.

public abstract class Form {

     double scale;

     public Form(double _scale) {
         this.scale = _scale;
     }

     public double getArea() {
         if (this instanceof Square) {
             return scale * scale;
         }
         else if (this instanceof Circle) {
             return Math.PI * scale * scale;
         }
         else { // this instanceof Triangle
             return scale * (scale * Math.sqrt(3) / 4);
         }
     }
 }

 class Square extends Form {
     public Square(double _scale) {
         super(_scale);
     }
 }

 class Circle extends Form {
     public Circle(double _scale) {
         super(_scale);
     }
 }

 class Triangle extends Form {
     public Triangle(double _scale) {
         super(_scale);
     }
 }


Although the compiler would still catch type misspellings, and the types of objects could not be changed, we are again using an if-then-else block to dispatch on the appropriate type. Therefore, we are again susceptible to mismatches between the instanceof checks in the if-then-else block and the set of types we are operating on.
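A sketch of such a mismatch: suppose a new subclass is added to the hybrid design and nobody remembers to update getArea(). The Pentagon class below is a hypothetical addition, and everything is nested inside a HybridDemo class (also hypothetical) only to keep the example self-contained.

```java
// Sketch only: the hybrid design from the listing above, nested in a
// hypothetical HybridDemo class, plus a Pentagon subclass that the
// instanceof chain does not know about.
class HybridDemo {

    static abstract class Form {
        double scale;

        Form(double scale) {
            this.scale = scale;
        }

        double getArea() {
            if (this instanceof Square) {
                return scale * scale;
            }
            else if (this instanceof Circle) {
                return Math.PI * scale * scale;
            }
            else { // assumed to be a Triangle
                return scale * (scale * Math.sqrt(3) / 4);
            }
        }
    }

    static class Square extends Form {
        Square(double scale) { super(scale); }
    }

    static class Circle extends Form {
        Circle(double scale) { super(scale); }
    }

    static class Triangle extends Form {
        Triangle(double scale) { super(scale); }
    }

    // Added later; getArea() in Form was not updated to match.
    static class Pentagon extends Form {
        Pentagon(double scale) { super(scale); }
    }

    public static void main(String[] args) {
        Form pentagon = new Pentagon(2);
        Form triangle = new Triangle(2);

        // Both lines print the same value: the pentagon is silently
        // measured with the triangle formula.
        System.out.println(pentagon.getArea());
        System.out.println(triangle.getArea());
    }
}
```

The compiler accepts all of this. Had getArea() been abstract, as in the second implementation, Pentagon could not have been compiled without its own definition.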

I should also mention that, like the first implementation, this implementation is not as extensible as the second.


Wrapup

So, in a nutshell, here is our latest bug pattern:

  • Pattern: Impostor Type

  • Symptoms: A program that treats data of conceptually distinct types in the same way, or doesn't recognize certain types of data.

  • Cause: The program uses fields with tags in lieu of separate classes for the various types of data.

  • Cures and preventions: Divide conceptually distinct types of data into separate classes whenever possible.

The important point is that the language offers you the best resources for avoiding this type of error -- just remember to use them.


