Java theory and practice: The pseudo-typedef antipattern

Extension is not type definition

The addition of generics to the Java™ language complicated the type system and increased the verbosity of many variable and method declarations. Because no "typedef" facility was provided for defining short names for types, some developers have turned to extension as a "poor man's typedef," with less than good results. In this month's Java theory and practice, Java expert Brian Goetz explains the limitations of this "antipattern."

Brian Goetz (brian@quiotix.com), Principal Consultant, Quiotix

Brian Goetz has been a professional software developer for over 18 years. He is a Principal Consultant at Quiotix, a software development and consulting firm located in Los Altos, California, and he serves on several JCP Expert Groups. Brian's book, Java Concurrency In Practice, will be published in late 2005 by Addison-Wesley. See Brian's published and upcoming articles in popular industry publications.



21 February 2006


A common complaint about the new generics facility in Java 5.0 is that it renders the code too verbose. Variable declarations that used to fit entirely on one line no longer do, and the repetition associated with declaring variables of parameterized type can be annoying, especially without good IDE support for auto completion. For example, if you want to declare a Map whose keys are Sockets and whose values are Future<String>, the old way:

Map socketOwner = new HashMap();

is more compact than the new way:

Map<Socket, Future<String>> socketOwner 
  = new HashMap<Socket, Future<String>>();

Of course, the new way embeds more type information, reducing programming errors and improving program readability, but it does make for more up-front work in declaring variables and method signatures. The repetition of the type parameters in the declaration and the initialization seems particularly unnecessary; Socket and Future<String> need to be typed twice, forcing us to violate the "DRY" principle (don't repeat yourself).

Synthesizing typedef . . . sort of

The addition of generics adds some complexity to the type system. Where "type" and "class" were nearly synonymous prior to Java 5.0, parameterized types, especially those with bounded wildcard types, make the concepts of subtype and subclass very different. The types ArrayList<?>, ArrayList<? extends Number>, and ArrayList<Integer> are distinct types, even though they are all implemented by the same class, ArrayList. These types form a hierarchy; ArrayList<?> is a supertype of ArrayList<? extends Number>, and ArrayList<? extends Number> is a supertype of ArrayList<Integer>.
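The hierarchy described above can be checked with a few assignments. This is a minimal sketch; the class name WildcardHierarchyDemo and the helper methods sum and count are made up for illustration and do not come from the article.

```java
import java.util.ArrayList;

class WildcardHierarchyDemo {
    // Accepts any ArrayList whose element type is Number or a subtype of Number.
    static double sum(ArrayList<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    // Accepts an ArrayList with any element type at all.
    static int count(ArrayList<?> anything) {
        return anything.size();
    }

    public static void main(String[] args) {
        ArrayList<Integer> ints = new ArrayList<Integer>();
        ints.add(1);
        ints.add(2);

        // Moving up the hierarchy compiles without a cast.
        ArrayList<? extends Number> numbers = ints;
        ArrayList<?> anything = numbers;

        // Moving down does not compile:
        // ArrayList<Integer> back = numbers;

        System.out.println(sum(ints));        // prints "3.0"
        System.out.println(count(anything));  // prints "2"

        // Three distinct static types, one and the same class at run time.
        System.out.println(ints.getClass() == anything.getClass()); // prints "true"
    }
}
```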

With the original simple type system, a feature like C's typedef made no sense. But with a more complicated type system, a typedef facility might offer some benefits. For better or worse, typedef was not added to the language when generics were.

One (broken) idiom that some people are using as a "poor man's typedef" is trivial extension: creating a class that extends a generic type but adds no functionality, such as the SocketUserMap type, as shown in Listing 1:

Listing 1. The pseudo-typedef antipattern -- don't do this
public class SocketUserMap extends HashMap<Socket, Future<String>> { }
SocketUserMap socketOwner = new SocketUserMap();

This trick, which I'll call the pseudo-typedef antipattern, accomplishes the (questionable) goal of getting the socketOwner definition back on one line, but delivers little more and ultimately becomes an impediment to reuse and maintenance. (For classes that have constructors other than the no-arg constructor, the derived class also needs to declare each constructor, as constructors are not inherited.)
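To make the parenthetical about constructors concrete, here is a sketch of what the pseudotype has to carry if its clients want the other constructors that java.util.HashMap offers. The ConstructorDemo class is an assumption added for illustration; SocketUserMap is the type from Listing 1, shown here only to illustrate the cost, not as a recommendation.

```java
import java.net.Socket;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Future;

class ConstructorDemo {
    public static void main(String[] args) {
        // Each of these calls works only because SocketUserMap redeclares the constructor.
        SocketUserMap sized = new SocketUserMap(64);
        SocketUserMap copy = new SocketUserMap(sized);
        System.out.println(copy.isEmpty()); // prints "true"
    }
}

// Constructors are not inherited, so every HashMap constructor that clients
// need has to be repeated by hand and forwarded to the superclass.
class SocketUserMap extends HashMap<Socket, Future<String>> {
    SocketUserMap() { super(); }
    SocketUserMap(int initialCapacity) { super(initialCapacity); }
    SocketUserMap(int initialCapacity, float loadFactor) { super(initialCapacity, loadFactor); }
    SocketUserMap(Map<? extends Socket, ? extends Future<String>> m) { super(m); }
}
```

The one-line "typedef" has grown to several lines of forwarding code, none of which adds behavior.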


Problems with pseudotypes

In C, defining a new type with typedef is more like a macro than a type declaration: it introduces an alias for an existing type, not a distinct type. Typedefs that name equivalent types can be freely interchanged with each other as well as with the underlying type. Listing 2 shows an example in which a typedef names a function-pointer type used in a function signature, the caller supplies a function of an equivalent type, and the compiler and runtime are perfectly happy:

Listing 2. Typedef examples in C
// Define a type called "Callback" that is a pointer to a function taking an int
typedef void (*Callback)(int);

void doSomething(Callback callback) { }

// This function conforms to the type defined by Callback
void callbackFunction(int arg) { }

// So a caller can pass the address of callbackFunction to doSomething
void useCallback() {
  doSomething(&callbackFunction); 
}

Extension is not type definition

An equivalent program in the Java language that tried to use the pseudo-typedef antipattern would run into trouble. The StringList and UserList types in Listing 3 both extend a common superclass, but they are not equivalent types. This means that any code that wants to call lookupAll must pass a StringList, not a List<String> or a UserList.

Listing 3. How pseudotypes lock clients into using pseudotypes
class StringList extends ArrayList<String> { }
class UserList extends ArrayList<String> { }
...
class SomeClass {
    public void validateUsers(UserList users) { ... }
    public UserList lookupAll(StringList names) { ... }
}

This restriction is more severe than it might first appear. In a small program, it probably doesn't make much of a difference, but as the program gets larger, the requirement to use the pseudotype consistently could cause trouble. If a variable is of type StringList, you cannot assign an ordinary List<String> to it because List<String> is a supertype of StringList and therefore not a StringList. Just as you cannot assign an Object to a variable of type String, you cannot assign a List<String> to a variable of type StringList. (You can, however, go the other way around; for example, you can assign a StringList to a variable of type List<String> because List<String> is a supertype of StringList.)

The same is true for method parameters; if a method parameter is of type StringList, you cannot pass an ordinary List<String> to it. This means that you cannot use a pseudotype as a method parameter type without requiring every caller of that method to use the pseudotype, which in practice means that you cannot use pseudotypes at all in library APIs. And because most library APIs grow out of code that was never intended to be library code, "this code is just for me, no one else will be using it" is not a good excuse (assuming your code is any good; if it stinks, you're probably right).
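The assignment and parameter rules above can be seen in a small program. This is a sketch under assumptions: AssignmentDemo, countNames, and the copy constructor on StringList are made up for illustration and are not part of Listing 3.

```java
import java.util.ArrayList;
import java.util.List;

class AssignmentDemo {
    // The pseudotype from Listing 3, with a copy constructor added for this sketch.
    static class StringList extends ArrayList<String> {
        StringList() { super(); }
        StringList(List<String> source) { super(source); }
    }

    // A method whose signature locks callers into the pseudotype.
    static int countNames(StringList names) {
        return names.size();
    }

    public static void main(String[] args) {
        List<String> plain = new ArrayList<String>();
        plain.add("alice");
        plain.add("bob");

        // Neither of these compiles, because List<String> is not a StringList:
        // StringList names = plain;
        // countNames(plain);

        // A cast compiles but fails at run time, because the object is a plain ArrayList:
        // StringList names = (StringList) plain;   // throws ClassCastException

        // The only way in is to copy the elements into the pseudotype.
        StringList names = new StringList(plain);
        System.out.println(countNames(names));     // prints "2"

        // The other direction needs no cast and no copy.
        List<String> widened = names;
        System.out.println(widened.equals(plain)); // prints "true"
    }
}
```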

Pseudotypes are contagious

This "viral" nature is one of the factors that made reuse of C code problematic. Nearly every C package has header files that define utility macros and types like int32, boolean, true, false, and so on. If you try to use several packages within an application that do not use identical definitions for these common items, you may spend quite a while in "header file hell" before you can even compile an empty program that includes all the header files. Writing a C application that uses a dozen different packages from different authors almost certainly involves some of this type of pain. On the other hand, it is quite common for a Java application to use a dozen or more different packages without any such pain. If packages were to use pseudotypes in their APIs, we would be reinventing a problem that should remain only a painful memory.

As an example, say two different packages each define StringList using the pseudo-typedef antipattern, as shown in Listing 4, and each defines utility methods to operate on a StringList. The fact that both packages have defined the same identifier is already a minor source of inconvenience; client programs must choose one definition to import and use the fully qualified name for the other. But the bigger problem is that now clients of these packages cannot create an object that can be passed to both sortList and reverseList because the two different StringList types are distinct types and are not compatible with each other. Clients now must choose between using one package or the other, or they have to do a lot of work to convert between the different kinds of StringList. What was supposed to be a convenience for the package writer has become a significant impediment to using the package in all but the most limited contexts.

Listing 4. How the use of pseudotypes inhibits reuse
package a;

public class StringList extends ArrayList<String> { }
public class ListUtilities {
    public static void sortList(StringList list) { }
}

package b;

public class StringList extends ArrayList<String> { }
public class SomeOtherUtilityClass {
    public static void reverseList(StringList list) { }
}
 
...

class Client {
    public void someMethod() {
        StringList list = ...;   // either a.StringList or b.StringList, not both
        // Can't do this: whichever StringList was chosen, one of these calls won't compile
        ListUtilities.sortList(list);
        SomeOtherUtilityClass.reverseList(list);
    }
}
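The conversion work that clients are pushed into looks roughly like the following. This is only a sketch: the nested classes A and B stand in for packages a and b so that the example fits in one file, and the method bodies (Collections.sort and Collections.reverse) and the helper sortThenReverse are assumptions, since Listing 4 leaves the bodies empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ContagionDemo {
    // Stand-in for package a.
    static class A {
        static class StringList extends ArrayList<String> { }
        static void sortList(StringList list) { Collections.sort(list); }
    }

    // Stand-in for package b.
    static class B {
        static class StringList extends ArrayList<String> { }
        static void reverseList(StringList list) { Collections.reverse(list); }
    }

    // Sorts and then reverses, copying between the two incompatible pseudotypes.
    static List<String> sortThenReverse(List<String> input) {
        A.StringList forA = new A.StringList();
        forA.addAll(input);              // first copy, just to satisfy A's signature
        A.sortList(forA);

        // B.reverseList(forA);          // does not compile: A.StringList is not a B.StringList
        B.StringList forB = new B.StringList();
        forB.addAll(forA);               // second copy, just to satisfy B's signature
        B.reverseList(forB);
        return forB;
    }

    public static void main(String[] args) {
        System.out.println(sortThenReverse(Arrays.asList("b", "c", "a"))); // prints "[c, b, a]"
    }
}
```

Had both utilities accepted List<String>, a single list could have been passed to each of them with no copying at all.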

Pseudotypes are usually too concrete

A further problem with the pseudo-typedef antipattern is that it tends to ignore the benefit of using interfaces to define the types of variables and method arguments. While it is possible to define StringList as an interface that extends List<String> and a concrete type StringArrayList that extends ArrayList<String> and implements StringList, most users of the pseudo-typedef antipattern generally do not go to this length, as the purpose of this technique is primarily to simplify and shorten type names. As a result, APIs will be less useful and more brittle because they use concrete types like ArrayList rather than abstract types like List.
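For reference, the interface-based variant mentioned above would look something like the following sketch. StringList and StringArrayList are the names used in the paragraph; InterfacePseudotypeDemo and the methods first and firstOfAny are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InterfacePseudotypeDemo {
    // Declared against the pseudo-interface: still rejects an ordinary List<String>.
    static String first(StringList names) {
        return names.get(0);
    }

    // Declared against the standard interface: accepts every List<String>.
    static String firstOfAny(List<String> names) {
        return names.get(0);
    }

    public static void main(String[] args) {
        StringList names = new StringArrayList();
        names.add("alice");

        System.out.println(first(names));      // prints "alice"
        System.out.println(firstOfAny(names)); // prints "alice"

        List<String> plain = new ArrayList<String>(names);
        System.out.println(firstOfAny(plain)); // prints "alice"
        // first(plain);                       // does not compile
    }
}

// The abstract half of the pseudotype.
interface StringList extends List<String> { }

// The concrete half, reusing ArrayList's implementation.
class StringArrayList extends ArrayList<String> implements StringList { }
```

Note that this variant fixes only the "too concrete" problem. A method declared to take StringList still cannot be called with a plain List<String>, so the contagion described earlier remains.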

A safer trick

A safer trick for reducing the amount of typing required to declare a generic collection is to use type inference. The compiler is pretty smart about using type information embedded in the program to assign type arguments. If you define a utility method like this:

public static <K,V> Map<K,V> newHashMap() {
    return new HashMap<K,V>(); 
}

You can then use it to avoid entering the type arguments twice:

Map<Socket, Future<String>> socketOwner = Util.newHashMap();

This approach works because the compiler can infer the values of K and V from the context in which the generic method newHashMap() is called.
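Put together as a complete program, the technique might look like this. The article names the utility class Util but does not show it, so its body here is an assumption, as are InferenceDemo and sizeOf. The second call shows the explicit type-argument syntax, which can be used wherever the compiler cannot infer K and V from the context.

```java
import java.net.Socket;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Future;

class InferenceDemo {
    static int sizeOf(Map<Socket, Future<String>> map) {
        return map.size();
    }

    public static void main(String[] args) {
        // K and V are inferred from the declared type of the variable.
        Map<Socket, Future<String>> socketOwner = Util.newHashMap();
        System.out.println(socketOwner.isEmpty()); // prints "true"

        // The type arguments can also be written out explicitly.
        System.out.println(sizeOf(Util.<Socket, Future<String>>newHashMap())); // prints "0"
    }
}

class Util {
    // Generic factory method: the caller's context determines K and V.
    static <K, V> Map<K, V> newHashMap() {
        return new HashMap<K, V>();
    }
}
```

Unlike the pseudotype, the variable is still an ordinary Map<Socket, Future<String>>, so it can be passed to any API that expects one.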


Conclusion

The motivation for the pseudo-typedef antipattern is straightforward enough -- developers want a way to define more compact type identifiers, especially as generics make type identifiers more verbose. The problem is that this idiom creates tight coupling between code that employs it and that code's clients, inhibiting reuse. You may not like the verbosity of generic type identifiers, but the pseudo-typedef is not the way to reduce it.
