Crossing borders: Typing strategies beyond the Java model

The Java™ community is split when it comes to the language's approach to typing. Some love the compile-time error checking, better security, and improved tools -- all features enabled by static typing. Others would prefer a more dynamically typed experience. This time in Crossing borders, you'll look at the dramatically different typing strategies used by two highly productive non-Java languages and at ways you can achieve some typing flexibility in your Java programming.

Bruce Tate (bruce.tate@j2life.com), President, RapidRed

Bruce Tate is a father, mountain biker, and kayaker in Austin, Texas. He's the author of three best-selling Java books, including the Jolt winner Better, Faster, Lighter Java, and recently released Beyond Java. He spent 13 years at IBM and is now the founder of the RapidRed consultancy, where he specializes in lightweight development strategies and architectures based on Java technology and Ruby on Rails.



23 May 2006

One of the more contentious issues in any programming-language discussion is the typing model, and with good reason. Typing determines the kinds of tools you can use and impacts application design. Many developers link typing to productivity (as I do) or maintainability. The typical Java developer is often especially protective of the Java language's typing model, citing better development tools, the ability to catch certain kinds of bugs -- such as type incompatibilities and misspellings -- at compile time, and performance benefits.

If you want to understand a new programming language, or even a family of languages, you should often start with the typing strategy. In this article, you'll see alternatives to the Java typing model. I'll start with a general description of the decisions any language designer must consider within a typing model, focusing on the controversial decision of static versus dynamic typing. I'll show examples on both ends of the spectrum -- in Objective Caml for static typing and in Ruby for dynamic typing. I'll also talk about the Java language's typing limitations and how it's possible to program productively both within and against them.

About this series

In the Crossing borders series, author Bruce Tate advances the notion that today's Java programmers are well served by learning other approaches and languages. The programming landscape has changed since Java technology was the obvious best choice for all development projects. Other frameworks are shaping the way Java frameworks are built, and the concepts you learn from other languages can inform your Java programming. The Python (or Ruby, or Smalltalk, or ... fill in the blank) code you write can change the way that you approach Java coding.

This series introduces you to programming concepts and techniques that are radically different from, but also directly applicable to, Java development. In some cases, you'll need to integrate the technology to take advantage of it. In others, you'll be able to apply the concepts directly. The individual tool isn't as important as the idea that other languages and frameworks can influence developers, frameworks, and even fundamental approaches in the Java community.

Typing strategies

You can look at typing along at least three axes:

  • Static versus dynamic typing depends on when a typing model is enforced. Statically typed languages enforce typing at compile time. Dynamically typed languages enforce typing at run time, usually based on an object's characteristics.
  • Strong versus weak typing depends on how typing models are enforced. Strong typing enforces the typing rigorously, throwing run-time or compilation errors if you break typing rules. Weak typing allows more leeway. At the extreme, a weakly typed language such as assembly language lets you assign any data type to any other (whether the assignment makes sense or not). Statically typed languages can have strong or weak typing; dynamically typed systems are usually, though not exclusively, strongly typed.
  • Explicit (also called manifest) typing versus inferred typing depends on how the language determines a given object's type. A manifestly typed language forces you to declare each variable and each function argument. A type-inferred language determines which type an object might be based on syntactic or structural clues in the language. Statically typed languages are usually, but not exclusively, type explicit; dynamically typed languages are almost always type-inferred.

Here are two examples that give you a good view of what happens along two of the three axes. Suppose you compile this Java code:

class Test {
   public static void test(int i) {
      String s = i;
   }
}

You get this error message:

Test.java:3: incompatible types
found   : int
required: java.lang.String
        String s = i;
                   ^
1 error

Executing this Ruby code:

1 + "hello"

Gives you this error message:

TypeError: String can't be coerced into Fixnum
        from (irb):3:in '+'
        from (irb):3

Both languages lean toward strong typing because both throw error messages when you try to use an object outside of its intended type structure. The Java typing strategy gives you the error at compile time because it performs static type checking. Ruby gives you an error at run time because Ruby supports dynamic typing. Said another way, Java binds the object to a type at compile time. Ruby binds the object to the type at run time, each time you change the object. Because I declared the variable in the Java code but didn't in Ruby, you see the difference between the Java language's explicit typing and Ruby's inferred typing in action.

Of the three axes, static versus dynamic typing tends to have the most impact on a language's character, so I'll focus now on these two strategies' respective strengths.

Strengths of static typing

In statically typed languages, the programmer (through a declaration or convention) or the compiler (through inference based on structural and syntactic clues) assigns a type to a variable or object, and that type doesn't change. Static typing usually comes with a cost because statically typed languages such as the Java language are usually type explicit. That means you must declare all of your variables and compile the code. That cost comes with a payoff: early error detection. At the most basic level, static typing exists to give the compiler much more information. One of the benefits of the additional information is the ability to catch certain kinds of errors that a dynamically typed language won't detect until run time. If you wait until run time to catch these kinds of bugs, some will make it into production. This is perhaps the strongest argument against dynamically typed languages.

The counterclaim is that modern software-development teams often run automated tests, and dynamic-language proponents claim that even the simplest of those tests catch the vast majority of type errors. Their best argument against compile-time error detection, then, is that the benefit of early detection isn't worth the cost: you've got to test anyway, whether or not you use dynamic typing.

One interesting trade-off would be to use inferred typing with statically typed languages, reducing the cost of typing. The open source Objective Caml (OCaml) language is a statically typed member of the ML family that enables excellent performance without sacrificing productivity. OCaml uses inferred typing, so you can have this statically typed example:

# let x = 4+7

OCaml returns:

val x : int = 11

OCaml, based on syntactic clues in the expression, infers the type of x. 4 is an int, and 7 is an int, so x must be an int too. Type-inferred languages can have all of the type safety of the Java language, and even more. The difference is the amount of information that you need to provide and the amount of information that you have available when you're reading the program. Many people who love static typing prefer inferred typing. They'd rather the compiler do the work than be forced to repeat themselves in their code.

One big advantage of type-inferred systems is that you don't need to declare the types of arguments to a function because the compiler infers them from the parameters you pass in. This lets you use the same method for multiple purposes.

Refactoring fallacy

It's a fallacy to believe that you must have static typing to enable good IDEs with refactoring support. Most modern IDEs take at least some ideas from the early Smalltalk IDEs. In fact, some of Eclipse's early roots are in VisualAge for Java, which initially shipped on a Smalltalk virtual machine! The Smalltalk Refactoring Browser is still one of the most full-featured refactoring tools available (see Resources). Still, the Java language has better tools than the most popular dynamic languages (with the exception of Smalltalk), and static typing is the biggest reason why.

The compiler isn't the only tool that can take advantage of the additional information that static typing provides. IDEs, through static typing, can provide much better support for refactoring. A few years ago, a revolutionary idea changed the way that development environments work. In IDEA and Eclipse, your code looks like a text view, but the development environment is actually editing the Abstract Syntax Tree (AST). So when you want to rename a method or class, it's easy for the environment to find every use of it by pinpointing the location in the AST. Today, it's hard to imagine programming in the Java language without excellent refactoring, made easier by static typing. In my exploration of Ruby, I miss IDEA more than any other tool or feature.

Static typing has some other advantages that I won't discuss in detail here. Static typing possibly enables better security and definitely improves code readability in places. Static typing can also provide more information that a compiler can use to make earlier optimizations, improving performance. But the biggest wins for static typing for most developers are earlier error detection and better tooling.

Strengths of dynamic typing

Ruby maven Dave Thomas has labeled dynamic typing with the term duck typing (see Resources), which has two meanings. The first is that the language doesn't really implement typing -- it ducks the issue. The second meaning is that if something walks like a duck and quacks like a duck, it's probably a duck. In the context of programming languages, duck typing means that if an object responds to the methods of a type, then for all practical purposes, you can treat it like that type. This behavior leads to some interesting optimizations.
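The idea is easy to see in a few lines of Ruby. The Duck and RobotDuck classes below are made up for illustration; they share no superclass, yet either works wherever a quack is expected:

```ruby
# Duck typing in action: two unrelated classes, one shared "shape."
class Duck
  def quack
    "Quack!"
  end
end

class RobotDuck
  def quack
    "Click... quack"
  end
end

# Any object that responds to quack can be treated as a duck.
def make_it_quack(bird)
  bird.quack
end

puts make_it_quack(Duck.new)       # prints Quack!
puts make_it_quack(RobotDuck.new)  # prints Click... quack
```

No interface declaration or common base class ties the two together; responding to the method is all the type the caller needs.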

In addition to arguing that tests make the cost of early error detection unnecessary, most developers who prefer dynamic typing cite the expressiveness and productivity of a dynamically typed language. Quite simply, you can usually express more ideas in fewer keywords. As a new Ruby convert, I strongly believe that dynamic languages are more productive, though I don't have any more hard evidence than a typical static-language proponent. But I've definitely experienced an improvement in my productivity since I started to write more Ruby code. Sure, I can still see the benefits of static typing, especially in my tool sets, but I'm also starting to recognize the disadvantages.

The big thing that changed for me when I started coding in Ruby is my ability to produce and consume metaprogramming constructs. If you've been following the Crossing borders series since its inception, you know that metaprogramming, or writing programs that write programs, is one of the driving forces behind Ruby on Rails in particular, and domain-specific languages more generally. In Ruby, I'm usually coding larger building blocks, or I'm building with the larger blocks. I find that I can extend my program with more kinds of reusable blocks than I can in Java coding. As with Java programming, you can extend your programs with new classes. You can also add methods and data to existing Ruby classes because classes are open. You can use mix-ins (more below, under Run-time binding) to add core capabilities to existing classes. You can also change the definition of an object at any time to suit your purposes. I don't need these capabilities very often and didn't use them as a novice Ruby programmer, but when I did start to use them, the results were quite amazing.

For example, to add an interceptor, you simply rename a method and create a new implementation of the original one. To intercept new, you'd write this code:

class Class
  alias_method :old_new, :new
  def new(*args)
    puts "Intercepted new" #do interception work here
    old_new(*args)
  end
end

You don't need an AspectJ library, bytecode enhancement, or stacks of libraries. You simply code your interceptor directly where you need it.
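Open classes work the same way. The shout method below is made up for illustration; any Ruby program can reopen a core class such as String and add behavior to it:

```ruby
# Reopen the core String class -- not a new class, just new behavior
# added to the existing one.
class String
  def shout
    upcase + "!"
  end
end

puts "crossing borders".shout   # prints CROSSING BORDERS!
```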

At another level, dynamic typing saves effort in terms of raw lines of code. Because dynamic languages are almost always type inferred, you don't need to work as hard to express basic ideas. Rather than declaring a variable, you're free just to start using it; rather than express all of the possible permutations of the types of arguments, you can just enter a list of names. Your code can be more polymorphic -- everything that responds to a method can be treated as one type -- so you can often express ideas much more concisely than you can in other languages. You also have much less coupling in your code. When you want to change the type of something, the change is usually localized, so you aren't forced to make changes in multiple places.
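A short sketch of that concision: the hypothetical total method below declares nothing about its argument, yet one definition covers any collection whose elements respond to +:

```ruby
# One untyped definition serves integers, floats, and strings alike.
def total(items)
  items.reduce { |sum, item| sum + item }
end

puts total([1, 2, 3])         # prints 6
puts total([1.5, 2.5])        # prints 4.0
puts total(["cross", "ing"])  # prints crossing
```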

Safety versus flexibility

In one sense, the crux of the debate over static versus dynamic languages is safety versus flexibility. Static-language proponents believe that the safer language is better. Dynamic-language proponents are unwilling to trade any power for safety. To them, the key measurement of a language is how quickly you can express an idea, with the aim of maximum programmer efficiency. At the other end of the spectrum, static-language experts say that if you can catch an early bug, you should, and that tools can make up for limitations in the language.

The final productivity booster is the lack of a compile step. Many dynamically typed languages are interpreted, so you can see changes immediately after making them. Exploring the behaviors of library and application code is much easier in Ruby, even without a conventional debugger, because you can open up an interpreter, often in the context of a debugging session (as I showed in the preceding article of this series), and simply explore.
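That style of exploration can be sketched with a few standard Object methods; typed at an irb prompt, each line would echo its result immediately:

```ruby
# Interrogate a live object instead of reading its declared type.
s = "crossing borders"
puts s.class                  # prints String
puts s.respond_to?(:upcase)   # prints true
puts s.methods.grep(/upcase/).sort.inspect   # lists the matching methods
```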

And yet...

Compilation does more than support static typing, though. Static-language proponents often claim better performance too. Many static languages, such as the Java language, C, and C++, are called systems languages because they're the most common languages used to build operating systems, device drivers, and other high-performance systems code. This often leads fans of dynamically typed languages to accuse statically typed languages of always being too low-level to be productive languages for applications -- but that's a narrow world view. The OCaml language is quite a high-level language, capable of object-oriented programming, functional programming (like Lisp or Erlang), or traditional structured programming. The typing model is static, and many say that the performance is even better than C++'s (see Resources). With OCaml, you pay minimal overhead for your static typing because the language is type inferred. For the trouble, you get rock-solid performance, compile-time type checking, and a very high-level language. Even the staunchest proponents of duck typing have to respect those trade-offs.


Typing limitations in the Java language

Java developers take full advantage of static typing. They have some of the best development tools in the world with capabilities, such as code completion and refactoring, that lean heavily on static typing. The many Java programmers who are only now starting to take full advantage of test-first development get added stability because the compiler can catch bugs related to type. New language features such as generics strengthen the typing model and provide still more information to the compiler. But Java developers are often blind to some of the benefits of dynamic typing.

Run-time binding

The flexibility that dynamic typing permits is more important than you might think. In some ways, Java developers seek to defeat static typing by using more XML (which can delay binding until run time) and strings (which can represent many different kinds of types). Configuration, which often benefits from run-time binding (and thus dynamic typing), normally takes the form of Ruby code in Ruby and of XML in Java programming. Take the Spring framework (see Resources): To configure a generic Spring bean, you use XML. You must provide a valid Java class name and set the properties for each variable. For example, persistence engines such as Hibernate need a session factory (see Resources). Configuring a data-access object directly in Java syntax would be quite pleasant:

  Dao myDataAccessObject = new Dao(sessionFactory);

The problem is that this line of code gets bound at compile time, and that's too static. You often need to replace the instance of the session factory or the data-access object with something else, such as a mock data-access object for testing. So instead of hard-coding the example as above, you'd use a framework like the Spring framework to configure the item with XML, which might look something like this (from the Spring Framework example called petclinic):

<bean id="myDao" class="org.springframework.samples.petclinic.hibernate.HibernateClinic">
   <property name="sessionFactory" ref="sessionFactory"/>
</bean>

The Spring framework is one of the more important and influential frameworks in the Java community today because it lets you delay binding and loosen coupling between major elements of your system. Further, you can decouple without blowing your one shot at inheritance. In Java programming, especially when you're writing an increasing number of POJOs (plain old Java objects), you must be very careful when you use inheritance because with the Java language, you get only one shot.

In a dynamic language such as Ruby, the solution would be strikingly different. First, I would be inclined to use a mix-in to implement persistence. All associations would happen once in the mix-in. Think of a mix-in as an interface with an implementation behind it. In other words, with a mix-in, you can add multiple capabilities to the same object without multiple inheritance. In fact, Active Record does exactly this through inheriting from a common base class, which has multiple capabilities mixed in:

class Pojo < ActiveRecord::Base

In Ruby, you're not concerned about blowing your single shot at inheritance because with open classes (letting you add capabilities on the fly) and modules (letting you mix in still other capabilities), you can add more functionality to your object at will. But what about the tight coupling? If you wanted to implement the class the Java way, you'd see something like this:

class MyClass 

  attr_accessor :myDao   #defines the getter and setter for myDao

  def initialize(session_factory)
    @myDao = Dao.new(session_factory)
  end

...

The code in the initialize() method looks like the initial Java version that's taboo because it binds the data-access object to the session factory at compile time. But this is a dynamically typed language, so you needn't be concerned about boxing yourself in. For testing purposes, you can always change the definition of the class on the fly. You can open the existing class later:

class MyClass   #not redefining the class; just opening the existing class

  def myDao    #redefine the getter for myDao
    #do some work to generate the mock object
    return myMockObject
  end

end
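In its generic form, the mix-in approach mentioned above looks like this. The Persistable module is made up for illustration, standing in for the kind of capability Active Record mixes into its base class:

```ruby
# A mix-in: an interface with an implementation behind it.
module Persistable
  def save
    "saving #{self.class.name}"   # real persistence work would go here
  end
end

class Pojo
  include Persistable   # adds the capability without using up inheritance
end

puts Pojo.new.save   # prints saving Pojo
```

Because include can appear any number of times, a class can absorb several such modules and still keep its single shot at inheritance free.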

Putting it together

At some point, as a user of a programming language, you're a slave to that language's typing strategy, for better or worse. As a Java programmer, you should seek to write Java code in a way that embraces the type system. Exploiting the type system to the fullest and depending on the community to get better metaprogramming support through frameworks instead of rolling your own metaprogramming are both good ways to leverage your advantages. A huge number of Java frameworks support metaprogramming for persistence (Hibernate and JDO), transactions (Spring and EJB), model-view-controller (WebFlow and RIFE), and programming models (AspectJ).

But sometimes you need to work against the type system in your chosen language, whether you're documenting code that needs additional description for better readability or you're trying to delay binding to a type. The Java language is strong enough that you can take full advantage of many projects that are built to help you do so:

  • The Spring framework lets you delay binding until run time and gives you many of the capabilities of dynamically typed languages. Spring is especially good at adding capabilities to POJOs, run-time configuration, and working around the Java language's type limitations.
  • AspectJ is an implementation of the aspect-oriented programming model on the Java platform (see Resources). AspectJ lets you introduce crosscutting concerns without introducing additional syntax, a technique that also lets you overcome the Java language's static nature.
  • The Hibernate project and the Java Persistence API (JPA) let you add persistence to a POJO, again without changing the underlying type.
  • XML lets you express both data and application configuration. Many frameworks use XML to work around the Java language's type limitations.

You have another choice as well. By understanding the typing strategies in other languages, you can recognize problems that don't fit the Java strategy. When you need access to the Java platform but not the Java language, you can then use one of the excellent JVM implementations of alternative languages (see Resources).

In the next article in this series, you'll look at testing in Ruby on Rails. You'll see some ideas that Rails shares with the Java language, some things that Rails does better, and some areas where the Java language has a clear advantage. Until then, keep staying in touch with the rest of the world by crossing borders.

Resources

Get products and technologies

  • Objective Caml: OCaml, a member of the ML family that combines static and inferred typing to deliver high performance without sacrificing productivity.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.
