Level: Introductory
Brian Goetz (brian@quiotix.com), Principal Consultant, Quiotix
21 Feb 2006
The addition of generics to the Java™ language
complicated the type system and increased the verbosity of many
variable and method declarations. Because no "typedef" facility
was provided for defining short names for types, some developers
have turned to extension as a "poor man's typedef," with less than
good results. In this month's Java theory and practice, Java expert
Brian Goetz explains the limitations of this "antipattern."
A common complaint about the new generics facility in Java 5.0 is that
it renders the code too verbose. Variable declarations that used to
fit entirely on one line no longer do, and the repetition associated
with declaring variables of parameterized type can be annoying,
especially without good IDE support for auto-completion. For example,
if you want to declare a Map whose keys are
Sockets and whose values are Future<String>, the old way:
Map socketOwner = new HashMap();
is more compact than the new way:
Map<Socket, Future<String>> socketOwner
    = new HashMap<Socket, Future<String>>();
Of course, the new way embeds more type information, reducing
programming errors and improving program readability, but it does make
for more up-front work in declaring variables and method
signatures. The repetition of the type parameters in the declaration
and the initialization seems particularly unnecessary; Socket and Future<String> need to be typed twice, forcing us to violate the "DRY" principle (don't repeat yourself).
Synthesizing typedef . . . sort of
The addition of generics adds some complexity to the type
system. Where "type" and "class" were nearly synonymous prior to Java
5.0, parameterized types, especially those with bounded wildcard
types, make the concepts of subtype and subclass very different. The
types ArrayList<?>, ArrayList<? extends
Number>, and ArrayList<Integer> are distinct
types, even though they are all implemented by the same class,
ArrayList. These types form a hierarchy;
ArrayList<?> is a supertype of ArrayList<? extends
Number>, and ArrayList<? extends Number> is a
supertype of ArrayList<Integer>.
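A few assignments are enough to check this hierarchy. The following is a minimal sketch, not something from the original listings; the class name WildcardHierarchyDemo and the methods sizeOfAny and sum are made up for illustration.

```java
import java.util.ArrayList;

class WildcardHierarchyDemo {
    // Accepts any ArrayList, whatever its element type.
    static int sizeOfAny(ArrayList<?> list) {
        return list.size();
    }

    // Accepts an ArrayList of Number or of any subtype of Number.
    static double sum(ArrayList<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        ArrayList<Integer> ints = new ArrayList<Integer>();
        ints.add(1);
        ints.add(2);

        // Widening along the hierarchy compiles without casts.
        ArrayList<? extends Number> numbers = ints;
        ArrayList<?> anything = numbers;

        // The reverse direction does not compile:
        // ArrayList<Integer> back = numbers;   // incompatible types

        System.out.println(sum(ints));           // prints 3.0
        System.out.println(sizeOfAny(anything)); // prints 2
    }
}
```

All three variables refer to the same object of the same class; only the static types differ.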
With the original simple type system, a feature like C's typedef made
no sense. But with a more complicated type system, a typedef facility
might offer some benefits. For better or worse, typedef was not added
to the language when generics were.
One (broken) idiom that some people are using as a "poor man's
typedef" is trivial extension: creating a class that extends a generic
type but adds no functionality, such as the SocketUserMap
type, as shown in Listing 1:
Listing 1. The pseudo-typedef antipattern -- don't do this
public class SocketUserMap extends HashMap<Socket, Future<String>> { }
SocketUserMap socketOwner = new SocketUserMap();
This trick, which I'll call the pseudo-typedef antipattern,
accomplishes the (questionable) goal of getting the
socketOwner definition back on one line, but delivers
little more and ultimately becomes an impediment to reuse and
maintenance. (For classes that have constructors other than the no-arg
constructor, the derived class also needs to declare each constructor,
as constructors are not inherited.)
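To illustrate the constructor issue, here is a rough sketch of what a pseudo-typedef over HashMap has to declare before clients can use more than the no-arg constructor. NameCountMap and ConstructorDemo are hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

class ConstructorDemo {
    static NameCountMap copyOf(Map<String, Integer> source) {
        return new NameCountMap(source);
    }

    public static void main(String[] args) {
        Map<String, Integer> plain = new HashMap<String, Integer>();
        plain.put("alice", 1);
        NameCountMap copy = copyOf(plain);
        System.out.println(copy.get("alice")); // prints 1
    }
}

// Hypothetical pseudo-typedef, shown only to illustrate the cost.
class NameCountMap extends HashMap<String, Integer> {
    NameCountMap() {
        super();
    }

    // Without this declaration, new NameCountMap(64) would not compile.
    NameCountMap(int initialCapacity) {
        super(initialCapacity);
    }

    // Likewise for the copy constructor.
    NameCountMap(Map<? extends String, ? extends Integer> source) {
        super(source);
    }
}
```

The "empty" class is no longer empty, and it has to be kept in step with whatever constructors its clients come to rely on.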
Problems with pseudotypes
In C, defining a new type with typedef is more like a
macro than a type declaration. Typedefs that define equivalent types
can be freely interchanged with each other as well as with the
underlying type. Listing 2 shows an example of defining a callback function,
where a typedef is used in the signature, but the caller supplies a
callback of an equivalent type and the compiler and runtime are
perfectly happy:
Listing 2. Typedef examples in C
// Define a type called "Callback" that is a pointer to a function
// taking an int and returning void
typedef void (*Callback)(int);
void doSomething(Callback callback) { }
// This function conforms to the type defined by Callback
void callbackFunction(int arg) { }
// So a caller can pass the address of callbackFunction to doSomething
void useCallback() {
    doSomething(&callbackFunction);
}
Extension is not type definition
An equivalent program in the Java language that tried to use the
pseudo-typedef antipattern would run into trouble. The
StringList and UserList types in Listing 3
both extend a common superclass, but they are not equivalent
types. This means that any code that wants to call
lookupAll must pass a StringList, not a
List<String> or a UserList.
Listing 3. How pseudotypes lock clients into using pseudotypes
class StringList extends ArrayList<String> { }
class UserList extends ArrayList<String> { }
...
class SomeClass {
    public void validateUsers(UserList users) { ... }
    public UserList lookupAll(StringList names) { ... }
}
This restriction is more severe than it might first appear. In a small
program, it probably doesn't make much of a difference, but as the
program gets larger, the requirement to use the pseudotype
consistently could cause trouble. If a variable is of type
StringList, you cannot assign an ordinary
List<String> to it because List<String> is
a supertype of StringList and therefore not a
StringList. Just as you cannot assign an
Object to a variable of type String, you
cannot assign a List<String> to a variable of type
StringList. (You can, however, go the other way around; for
example, you can assign a StringList to a variable of
type List<String> because List<String> is a
supertype of StringList.)
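The two directions of assignment can be seen in a small sketch. The nested StringList class mirrors the one in Listing 3; AssignmentDemo and isStringList are names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class AssignmentDemo {
    // Hypothetical pseudotype, as in Listing 3.
    static class StringList extends ArrayList<String> { }

    // Returns true if the given list is a StringList at run time.
    static boolean isStringList(List<String> list) {
        return list instanceof StringList;
    }

    public static void main(String[] args) {
        StringList pseudo = new StringList();
        List<String> plain = new ArrayList<String>();

        // Subtype to supertype: always allowed.
        List<String> widened = pseudo;

        // Supertype to subtype: rejected by the compiler.
        // StringList narrowed = plain;   // incompatible types

        // A cast gets past the compiler but fails at run time,
        // because the object really is a plain ArrayList.
        try {
            StringList forced = (StringList) plain;
        } catch (ClassCastException e) {
            System.out.println("plain ArrayList is not a StringList");
        }

        System.out.println(isStringList(widened)); // prints true
        System.out.println(isStringList(plain));   // prints false
    }
}
```

A cast is therefore no escape hatch: the only way to turn an existing List<String> into a StringList is to copy its elements into a new object.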
The same is true for method parameters; if a method parameter is of
type StringList, you cannot pass an ordinary
List<String> to it. This means that you cannot use
pseudotypes at all as method arguments without requiring that every
use of that method use the pseudotype, which in practice means
that you cannot use pseudotypes at all in library APIs. And because most
library APIs grew out of code that was never intended to be library
code, the excuse of "this code is just for me, no one else will be
using it" is not a good excuse (assuming your code is any good; if it
stinks, you're probably right).
Pseudotypes are contagious
This "viral" nature is one of the factors that made reuse of C code
problematic. Nearly every C package has header files that define
utility macros and types like int32,
boolean, true, false, and so
on. If you try to use several packages within an application that do
not use identical definitions for these common items, you may spend
quite a while in "header file hell" before you can even compile an
empty program that includes all the header files. Writing a C
application that uses a dozen different packages from different
authors almost certainly involves some of this type of pain. On the
other hand, it is quite common for a Java application to use a dozen
or more different packages without any such pain. If packages were to
use pseudotypes in their APIs, we would be reinventing a problem that
should remain only a painful memory.
As an example, say two different packages each define StringList using
the pseudo-typedef antipattern, as shown in Listing 4, and each
defines utility methods to operate on a StringList. The
fact that both packages have defined the same identifier is already a
minor source of inconvenience; client programs must choose one
definition to import and use the fully qualified name for the
other. But the bigger problem is that now clients of these packages
cannot create an object that can be passed to both
sortList and reverseList because the two
different StringList types are distinct types and are not
compatible with each other. Clients now must choose between using one
package or the other, or they have to do a lot of work to convert between the
different kinds of StringList. What was supposed to be a convenience
for the package writer has become a significant impediment to using
the package in all but the most limited contexts.
Listing 4. How the use of pseudotypes inhibits reuse
// In package a (each public class in its own source file)
package a;
public class StringList extends ArrayList<String> { }
public class ListUtilities {
    public static void sortList(StringList list) { }
}
// In package b
package b;
public class StringList extends ArrayList<String> { }
public class SomeOtherUtilityClass {
    public static void reverseList(StringList list) { }
}
...
class Client {
    public void someMethod() {
        StringList list = ...;
        // Can't do this: whichever StringList was imported,
        // one of these two calls fails to compile
        ListUtilities.sortList(list);
        SomeOtherUtilityClass.reverseList(list);
    }
}
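To make the conversion cost concrete, here is a single-file sketch in which the nested classes A and B stand in for the two packages. The toB helper and the method bodies are assumptions added for illustration; the methods in Listing 4 are empty.

```java
import java.util.ArrayList;
import java.util.Collections;

class PseudotypeClashDemo {
    // Stand-in for package a.
    static class A {
        static class StringList extends ArrayList<String> { }
        static void sortList(StringList list) {
            Collections.sort(list);
        }
    }

    // Stand-in for package b.
    static class B {
        static class StringList extends ArrayList<String> { }
        static void reverseList(StringList list) {
            Collections.reverse(list);
        }
    }

    // The conversion a client is forced to write: copy every element
    // into the other package's pseudotype.
    static B.StringList toB(A.StringList source) {
        B.StringList copy = new B.StringList();
        copy.addAll(source);
        return copy;
    }

    public static void main(String[] args) {
        A.StringList names = new A.StringList();
        names.add("carol");
        names.add("alice");
        names.add("bob");

        A.sortList(names);
        // B.reverseList(names);   // does not compile: A.StringList is not a B.StringList

        B.StringList converted = toB(names);
        B.reverseList(converted);
        System.out.println(converted); // prints [carol, bob, alice]
    }
}
```

Had both utilities been declared to take a List<String>, the copy would be unnecessary and the same list object could be passed to both.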
Pseudotypes are usually too concrete
A further problem with the pseudo-typedef antipattern is that it tends
to ignore the benefit of using interfaces to define the types of
variables and method arguments. While it is possible to define
StringList as an interface that extends
List<String> and a concrete type
StringArrayList that extends
ArrayList<String> and implements StringList,
most users of the pseudo-typedef antipattern generally do not go to
this length, as the purpose of this technique is primarily to simplify
and shorten type names. As a result, APIs will be less useful and more
brittle because they use concrete types like ArrayList
rather than abstract types like List.
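For contrast, a method declared against the interface accepts whatever implementation the caller already has. This is only a sketch; InterfaceParameterDemo and countNonEmpty are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class InterfaceParameterDemo {
    // Declared against the interface, so callers may pass any List<String>.
    static int countNonEmpty(List<String> names) {
        int count = 0;
        for (String name : names) {
            if (name.length() > 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> arrayBacked = new ArrayList<String>(Arrays.asList("alice", "", "bob"));
        List<String> linked = new LinkedList<String>(arrayBacked);
        List<String> fixedSize = Arrays.asList("carol", "");

        System.out.println(countNonEmpty(arrayBacked)); // prints 2
        System.out.println(countNonEmpty(linked));      // prints 2
        System.out.println(countNonEmpty(fixedSize));   // prints 1
    }
}
```

A signature that took a pseudotype extending ArrayList<String> would reject the second and third calls.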
A safer trick
A safer trick for reducing the amount of typing required to declare a
generic collection is to use type inference. The compiler is pretty
smart about using type information embedded in the program to assign
type arguments. If you define a utility method like this:
public static <K,V> Map<K,V> newHashMap() {
    return new HashMap<K,V>();
}
you can use it to avoid entering the type parameters twice:
Map<Socket, Future<String>> socketOwner = Util.newHashMap();
This approach works because the compiler can infer the values of K
and V from the context in which the generic method
newHashMap() is called.
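A self-contained version of this idea might look like the sketch below. The nested Util class, the newArrayList method, and the variable names are assumptions added for illustration. The last statement shows the explicit type-argument syntax, which is available when there is no assignment target for the compiler to infer from.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InferenceDemo {
    // Hypothetical holder class for the factory methods.
    static class Util {
        static <K, V> Map<K, V> newHashMap() {
            return new HashMap<K, V>();
        }

        static <T> List<T> newArrayList() {
            return new ArrayList<T>();
        }
    }

    public static void main(String[] args) {
        // K and V are inferred from the declared type on the left.
        Map<String, List<Integer>> scores = Util.newHashMap();

        // T is inferred as Integer.
        List<Integer> aliceScores = Util.newArrayList();
        aliceScores.add(90);
        scores.put("alice", aliceScores);

        // With no assignment target to infer from, the type
        // arguments can be supplied explicitly.
        int emptySize = Util.<String, Integer>newHashMap().size();

        System.out.println(scores);    // prints {alice=[90]}
        System.out.println(emptySize); // prints 0
    }
}
```

Unlike the pseudo-typedef, the factory methods introduce no new types: the variables are still plain Map and List, so they remain compatible with every API that accepts those interfaces.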
Conclusion
The motivation for the pseudo-typedef antipattern is straightforward
enough -- developers want a way to define more compact type
identifiers, especially as generics make type identifiers more
verbose. The problem is that this idiom creates tight coupling between
code that employs it and that code's clients, inhibiting reuse. You
may not like the verbosity of generic type identifiers, but this is
not the way to solve it.
About the author
Brian Goetz has been a professional software developer for over 18 years. He is a Principal Consultant at Quiotix, a software
development and consulting firm located in Los Altos, California, and he serves on several JCP Expert Groups. Brian's book,
Java Concurrency In Practice, will be published in late 2005 by Addison-Wesley. See Brian's published and upcoming articles in popular industry publications.