Diagnosing Java code: Unit tests and automated code analysis working together

Your tests can help your tools to analyze your code

Eric Allen (eallen@cs.rice.edu), Ph.D. candidate, Java programming languages team, Rice University
Eric Allen has a bachelor's degree in computer science and mathematics from Cornell University and is a PhD candidate in the Java programming languages team at Rice University. Before returning to Rice to finish his degree, Eric was the lead Java software developer at Cycorp, Inc. He has also moderated the Java Beginner discussion forum at JavaWorld. His research concerns the development of semantic models and static analysis tools for the Java language, both at the source and bytecode levels. Eric is the lead developer of Rice's experimental compiler for the NextGen programming language, an extension of the Java language with added language features, and is a project manager of DrJava, an open-source Java IDE designed for beginners. Contact Eric at eallen@cs.rice.edu.

Summary:  Unit testing and static analysis are often seen as unrelated ways to help ensure the correctness of a program. This article examines the relationship between the two methods and shows how the tools behind each can be used to strengthen the other. Specifically, Eric Allen discusses some of the new applications that let you get more out of your unit tests.

Date:  01 Oct 2002
Level:  Introductory

It's an age-old debate -- which is more valuable for producing robust code, testing or static analysis and proofs? You hear programmers arguing the merits on a daily basis, especially on Extreme Programming discussion forums. (See our own XP discussion forum hosted by Roy Miller.)

The main argument in favor of static analysis (including type checking) is that the results hold for all possible runs of the program, whereas passing unit tests guarantee only that the tested components behave correctly for the inputs they were tested with (on the platform they were tested on).

The main argument in favor of unit testing is that it is much more tractable. You can test many constraints of a program that are far beyond the reach of contemporary static-analysis tools.

Let me take a risk here and say that I think it's a mistake to view these two kinds of tools as conflicting. Each tool helps to build more robust programs. And in fact, they can complement each other in very powerful ways.

Each tool has a major strength that can be particularly useful to complement the other tool:

  • Unit tests are able to show the common paths of execution, to show how a program behaves.
  • Analysis tools are able to check the coverage that unit tests provide.

Let's look at each of these attributes and discuss tools that can help you bring that strength to the other method.

Unit tests show common paths of execution

A suite of unit tests provides a solid base of example uses of the components of the program. By examining how the program behaves when the tests run, an analysis tool can form heuristic speculations concerning invariants that the developer expects to hold over a program (just as a programmer reading the unit tests does).

This is yet another way in which unit tests can be an executable form of documentation. After speculative invariants are inductively inferred from the runs of the unit tests, the analysis tool may attempt to deductively verify that the invariants hold, or it may annotate the code with assertions that can be checked at run time.

In either case, it's best for the tool to report back to the user with the inferred set of invariants before it does anything else, to ask which ones really were intended. Incidentally, if such a tool reports back a lot of invariants that the user didn't intend, it can be a signal that there are problems with the unit tests -- for instance, that they're not general enough.
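
To make the inductive step concrete, here is a minimal sketch of how a tool might narrow down candidate invariants from observed runs. It is a toy illustration only, not the algorithm of any real tool: the class InvariantSketch, its fixed list of candidates, and the variable names i and length are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Toy invariant detector (hypothetical): starts with a fixed set of candidate
// invariants over two observed values and keeps only those that no run falsifies.
class InvariantSketch {
    // Candidates over (i, length), keyed by a readable description.
    private final Map<String, BiPredicate<Integer, Integer>> candidates =
        new LinkedHashMap<>();

    InvariantSketch() {
        candidates.put("i >= 0", (i, length) -> i >= 0);
        candidates.put("i < length", (i, length) -> i < length);
        candidates.put("i == length - 1", (i, length) -> i == length - 1);
        candidates.put("i == 2", (i, length) -> i == 2);
    }

    // Record one observed call; drop every candidate that it falsifies.
    void observe(int i, int length) {
        candidates.values().removeIf(p -> !p.test(i, length));
    }

    // The candidates that held on every observed call so far.
    List<String> surviving() {
        return new ArrayList<>(candidates.keySet());
    }
}
```

If the only observed call is observe(2, 4), the unintended candidate "i == 2" survives alongside the plausible ones. A second, different observation such as observe(0, 1) eliminates it, which is the sense in which a more general test suite yields fewer spurious invariants.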

One tool that can be used with unit tests in this way is Daikon, a free, experimental tool from Mike Ernst's program analysis group at MIT. Daikon analyzes runs of a program (such as, say, the runs of unit tests) and attempts to infer invariants. It then queries the user with these invariants and inserts the intended invariants into the program as assertions.

For example, suppose we wrote an adapter for Vectors that implemented an interface Sequence containing a method lookup for retrieving elements and a method insert for putting them at the end. Method lookup takes an index i that it uses to access the Vector the adapter contains.

Say that the length of the sequence is stored in a field length. By maintaining the length in the adapter, we could potentially remove elements from the end without notifying the Vector itself.

Let's write an easy test case for this simple adapter we envision:

import junit.framework.TestCase;

public class VectorAdapterTest extends TestCase {
  public VectorAdapterTest(String name) {
    super(name);
  }
  
  public void testLookupAndInsert() {
    VectorAdapter v = new VectorAdapter();
    v.insert("this");
    v.insert("is");
    v.insert("a");
    v.insert("test");
    assertEquals("Retrieved and inserted elements don't match",
                 "a",
                 v.lookup(2));
  }
}

Then we could implement our adapter to pass this test as follows:

import java.util.Vector;

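// Adapter that exposes a Vector through the Sequence interface.
public class VectorAdapter implements Sequence {
  private Vector values = new Vector();
  // Logical length of the sequence, tracked separately from the Vector's size.
  private int length = 0;
  
  // Appends o to the end of the sequence.
  public void insert(Object o) {
    length += 1;
    values.addElement(o);
  }
  
  // Returns the element at index i.
  public Object lookup(int i) {
    return values.elementAt(i);
  }
}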

interface Sequence {
  public void insert(Object o);
  public Object lookup(int i);
}

When Daikon is run over this code and its unit tests, it may infer that, in every observed call to lookup, i is less than length. It would then report this as a pre-condition for the method: i < length.
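
As a rough illustration of what inserting such a pre-condition as a run-time check could look like, consider the sketch below. It is written by hand and is not Daikon's actual output; the class name CheckedVectorAdapter is hypothetical, and the Sequence interface is omitted to keep the example self-contained. The check is an explicit exception rather than an assert statement so that it runs whether or not assertions are enabled.

```java
import java.util.Vector;

// Hypothetical variant of the adapter with the inferred pre-condition
// (i < length) enforced at run time.
class CheckedVectorAdapter {
    private final Vector<Object> values = new Vector<Object>();
    private int length = 0;

    public void insert(Object o) {
        length += 1;
        values.addElement(o);
    }

    public Object lookup(int i) {
        // Inferred pre-condition: i < length.
        if (i >= length) {
            throw new IllegalArgumentException(
                "pre-condition violated: i < length (i=" + i
                + ", length=" + length + ")");
        }
        return values.elementAt(i);
    }
}
```

With the four insertions from the test case, lookup(2) still returns "a", while lookup(4) now fails with a message naming the violated pre-condition instead of an exception raised from inside Vector.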

The invariants that Daikon reports can then be examined by the programmer, who can get a better idea of how well the tests cover the program. For example, if Daikon starts inferring a lot of unintended invariants, it means that the unit tests are exercising the program with only an unrepresentative subset of the potential program inputs.

Although Daikon is written in the Java language, it requires a front end that's written in C++, making it less portable than it could be. Nevertheless, builds of the front end for many major platforms are available online. Also, the Daikon team offers to add requested builds for other platforms.

(You can find download information and more on Daikon in the Resources section.)


Analysis tools can check unit test coverage

Analysis tools can help the programmer build a strong unit test suite. There are primarily two ways in which this has been done so far:

  • Using static analysis to try to automatically generate a suite of unit tests
  • Using static analysis to determine how well a unit test suite covers the functionality of a program

There are currently several free tools that attempt to produce unit tests from code automatically, but most of the free tools tackling this task are still in alpha. Some of the more promising include JUnitDoclet and JUB (an acronym for "JUnit test case Builder"), available on SourceForge (links are provided in the Resources section).
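
To give a feel for what test generation involves, here is a minimal sketch that uses reflection to emit a skeleton JUnit test class with one failing stub per public method. It is an assumption-laden illustration, not a description of how JUnitDoclet or JUB work internally; the class TestStubGenerator and its generate method are hypothetical names.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical generator: produces source text for a skeleton test class.
class TestStubGenerator {
    // Returns JUnit 3-style source with one stub per public method name.
    static String generate(Class<?> target) {
        // Sorted set: deterministic output, and overloads collapse to one stub.
        Set<String> names = new TreeSet<String>();
        for (Method m : target.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        String testName = target.getSimpleName() + "Test";
        StringBuilder src = new StringBuilder();
        src.append("import junit.framework.TestCase;\n\n");
        src.append("public class ").append(testName)
           .append(" extends TestCase {\n");
        for (String name : names) {
            String capitalized =
                Character.toUpperCase(name.charAt(0)) + name.substring(1);
            src.append("  public void test").append(capitalized)
               .append("() {\n");
            // Each stub fails until a real test replaces it.
            src.append("    fail(\"TODO: write a test for ").append(name)
               .append("\");\n");
            src.append("  }\n");
        }
        src.append("}\n");
        return src.toString();
    }
}
```

Applied to the VectorAdapter class shown earlier, this would emit stubs named testInsert and testLookup. Note that the generator can only name the tests; deciding what each one should assert is still up to the programmer.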

The main thing to keep in mind about these types of tools is that they are best applied when retrofitting legacy code with tests. They are less useful when building new projects.

Why? Because new projects should be built in tandem with the unit tests over them. Developing unit tests is a powerful way to build a design; the APIs to the components are designed implicitly as the tests are written for them. What's more, designing in this fashion provides immediate feedback to the designer. Bad designs will be harder to write tests for! And any analysis tool would be hard-pressed to do as good a job as the designer in determining what tests to write for a program.

The second kind of analysis tool analyzes a program and its unit tests and determines how well these tests cover the program. In contrast to the first kind of tool mentioned, tools like these are useful for every project. In fact, an Extreme Programming team might consider integrating such a tool into their code-committing process. Then not only could they prevent code from being committed unless it passed all tests, they could prevent code from being committed unless there were tests over it! Test coverage can be scant not just because of laziness, but because of mistakes, so such enforcement can be useful to programmers at all levels of skill (and integrity).

One new and particularly promising tool that can perform such an analysis is Clover. Clover is a plug-in for Ant, the popular all-Java replacement for make. Clover is a commercial tool, but it is available for free for open source projects.

Clover works in a two-stage process. First, it instruments code at compile time. Then, at test time, information about the runs of the tests is written to a database that is then used to generate reports (through a GUI, Web page, or at the console).
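
Conceptually, instrumentation of this kind amounts to inserting a probe before each statement of interest and recording which probes fire while the tests run. The hand-instrumented sketch below shows the idea under simplifying assumptions: it is not what Clover actually generates, it keeps its data in memory rather than in a database, and the names CoverageSketch, hits, and PROBES are invented for this example.

```java
import java.util.BitSet;

// Hand-instrumented toy example (hypothetical): one probe per statement.
class CoverageSketch {
    // One bit per probe; a bit is set when its statement executes.
    static final BitSet hits = new BitSet();
    static final int PROBES = 3;

    // The original method body with a probe inserted before each statement.
    static int abs(int x) {
        hits.set(0);
        if (x < 0) {
            hits.set(1);
            return -x;
        }
        hits.set(2);
        return x;
    }

    // Fraction of probes that have fired so far.
    static double coverage() {
        return (double) hits.cardinality() / PROBES;
    }
}
```

A test suite that only ever calls abs with non-negative arguments leaves probe 1 unset, so coverage() reports two thirds; adding a test with a negative argument brings it to 1.0.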

Integrating Clover into an existing project that uses Ant is quite straightforward. It involves tweaking the build.xml file for the project to add a few targets for instrumenting code during compilation, logging test runs, and generating reports. For example, suppose we had a build.xml file with targets for compiling and testing. All we'd have to do is put the Clover JAR files in our Ant library directory and augment our build.xml file as follows (information on these and similar Ant targets is provided in the Clover user guide; I include them here for convenience):

<property name="clover.initstring" value="/tmp/mycoverage.db"/>

<target name="with.clover">
    <property name="build.compiler"
            value="org.apache.tools.ant.taskdefs.CloverCompilerAdapter"/>
</target>

<path id="clover.classpath">
  <pathelement path="<CLOVER_HOME>/lib/clover.jar"/>
  <pathelement path="<CLOVER_HOME>/lib/velocity.jar"/>
</path>

<target name="clover.report">
  <java classname="com.cortexeb.tools.clover.reporters.html.HtmlReporter">
    <arg line="--outputdir /tmp/clover_html --showSrc --initstring
      ${clover.initstring} --title 'My Project'"/>
    <classpath refid="clover.classpath"/>
  </java>
</target>

The property clover.initstring specifies a file to which to write Clover coverage data. The target with.clover is used to switch Clover on when executing other targets (such as compile and test). The clover.report target is used to take accumulated coverage data and generate a report.

In the code above, we're generating an HTML report. We can also generate text reports (useful for feeding to a script to determine if the test coverage is acceptable) and Swing-based reports.
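
A script that decides whether coverage is acceptable can be very small. The sketch below assumes a report containing a line of the form "Coverage: 87.5%"; that format is invented for illustration and is not Clover's actual text output, so the pattern would need adjusting to the real report. The class CoverageGate and its accept method are likewise hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical commit gate: accepts a report only if coverage is high enough.
class CoverageGate {
    // Assumed report line, e.g. "Coverage: 87.5%" (not Clover's real format).
    private static final Pattern LINE =
        Pattern.compile("Coverage:\\s*(\\d+(?:\\.\\d+)?)%");

    // Returns true if the report states coverage of at least minPercent.
    static boolean accept(String report, double minPercent) {
        Matcher m = LINE.matcher(report);
        if (!m.find()) {
            // A missing figure counts as a failure, not a pass.
            return false;
        }
        return Double.parseDouble(m.group(1)) >= minPercent;
    }
}
```

Treating an unreadable report as a failure is a deliberate choice here: a gate that passes when it cannot find the figure would silently stop enforcing anything if the report format changed.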

Setting clover.classpath is necessary so that the report generator target knows where to find all of its requisite classes. However, the second JAR file placed on the class path (velocity.jar) is necessary only for generating HTML reports. Once this is done, a Clover report can be generated with the following commands:

$ ant with.clover compile test
$ ant clover.report

It's that simple. For some sample outputs, check out the online Clover reports for the popular coding tools JBoss and Ant. (You can find download information and more on these in the Resources section.)


The meeting of the twain

The tools discussed in this article highlight some of the useful ways that program analysis and unit tests can be used together to detect invariants and check coverage more effectively than either could in isolation. These techniques represent just the tip of the iceberg of what is possible.

In the future, additional tools could leverage unit tests even further. For example, type-inference engines and optimizing compilers could infer hints from existing unit tests, UML-generation tools could construct various diagrams (and not just class diagrams) from the tests, and more. There is enormous room for creative development and experimentation in combining these methods for better code building and troubleshooting.

Remember the attributes that each method has as its strengths:

  • Unit tests have the ability to demonstrate how a program behaves during specific runs; they can also illuminate the common paths of execution.

  • Analysis tools have the ability to check certain properties of a program over all possible runs.

Each strength can be used to offset the potential weaknesses of the other method.

Next time, we'll go off on another path of enhanced unit testing and review some of the latest tools available to help develop unit tests over GUIs.


Resources
