Java theory and practice: Kill bugs dead

Inspection tools like FindBugs provide a second layer of defense against common coding errors

Most advice on programming style is aimed at creating high-quality, maintainable code, which makes sense because the easiest time to fix a bug is before the bug is created (an ounce of prevention . . .). Unfortunately, prevention is not always enough, and while some fine tools exist to help you create good code, fewer tools are available to help you analyze, maintain, or improve the quality of existing code. This month, columnist Brian Goetz builds on Chris Grindstaff's earlier Introduction to FindBugs and shows you how this static analysis tool can help you analyze your code for compliance with design principles that have been discussed in past issues of this column.

Share:

Brian Goetz (brian@quiotix.com), Principal Consultant, Quiotix Corp

Brian Goetz has been a professional software developer for over 15 years. He is a Principal Consultant at Quiotix, a software development and consulting firm located in Los Altos, California, and he serves on several JCP Expert Groups. See Brian's published and upcoming articles in popular industry publications.



29 June 2004

Also available in Russian Japanese

Writing a thread-safe class is hard, but analyzing an existing class for thread-safety is even harder, as is enhancing it so that it remains thread-safe. Much of the knowledge of how a class works (or is supposed to work) evaporates shortly after it is written, in the form of implicit assumptions, invariants, and expected use cases that are clear in the developer's head but never get written down in the form of design notes, comments, or documentation. Existing code is always harder to work with than new code.

Needed: better code auditing tools

Of course, the best time to ensure high-quality code is at the moment the code is being written, as this is the time you best understand how it is put together. Plenty of advice can be had (just read this column!) on how to write quality code, but you don't always have the luxury of writing everything from scratch or taking as much time as you'd like to write it. So what do you do in that case? Developers often lobby for rewriting (after all, writing new code is more fun than fixing someone else's, or even our own, buggy code), but that, too, is a luxury, and often simply trades today's known problems for tomorrow's unknown ones. What is needed are better tools for analyzing and auditing existing codebases to assist developers in code audits and bug hunting.

I'm happy to say that the state of automated code-inspection and auditing tools has gotten significantly better recently with the introduction of FindBugs. Until now, most inspection tools either tried to take on the very difficult problem of proving that a program is correct, or focused on superficial issues such as code formatting and naming conventions, or at best, simple bug patterns such as self-assignment, unused fields, or potential errors such as unused method parameters or public methods that could be declared private or protected. But FindBugs is different -- it uses bytecode analysis and a host of plug-in bug pattern detectors to find common bugs in code. It can help you spot where your code strays, intentionally or unintentionally, from good design principles. (For an introduction to FindBugs, see Chris Grindstaff's articles, "FindBugs, Part 1: Improve the quality of your code" and "FindBugs, Part 2: Writing custom detectors.")


Design advice and bug patterns

For every bug pattern, there exists a corresponding proscriptive element of design advice exhorting us to avoid the specific bug pattern. So if FindBugs is a bug pattern detector, it stands to reason that it can also serve as an auditing tool to measure the degree of compliance with a set of design principles. Many installments of Java theory and practice are dedicated to specific elements of design advice (or its corresponding bug pattern). In this installment, I'll explain how FindBugs can help ensure that this design advice is followed in an existing codebase. Let's rehash some past advice and see how FindBugs can help detect when you fail to follow it.

The exceptions debate

In "The exceptions debate," one of the arguments raised against checked exceptions was that it is too easy for exceptions to be "fumbled" -- to catch an exception and neither take corrective action nor throw another exception, as shown in Listing 1. This fumbling often happens when, in prototyping, an empty catch block is coded simply to make the program compile, with the intention of going back later and filling in some sort of error-handling strategy. While some offer the frequency of this scenario as an example of the unworkability of the approach to exception handling taken by the Java language design, I think it is merely a failure to use the right tools. FindBugs can detect and flag these empty catch blocks easily. If you intend to ignore the exception, it is easy to add a descriptive comment to that effect so readers know that you are deliberately ignoring it, and haven't just forgotten to handle it.

Listing 1. Fumbling an exception
try {
  mumbleFoo();
}
catch (MumbleFooException e) { 
}

Hashing it out

In "Hashing it out," I outlined the basic rules for correctly overriding Object.equals() and Object.hashCode(), in particular that equal objects (according to equals()) must have equal hashCode() values. While this rule is fairly easy to follow once you know it (and some IDEs have wizards for defining both for you in a consistent manner), if you override one of these methods and forget to override the other, this bug can be very difficult to detect through inspection -- because the error is not in the code that is present, but in the code that is absent.

FindBugs has a detector for detecting many instances of this problem, such as overriding equals() but not hashCode(), or overriding hashCode() but not equals(). These detectors are among the simplest in FindBugs as they only need to examine the set of method signatures in the class and determine whether or not both equals() and hashCode() have been overridden. It is also possible to mistakenly define equals() with an argument type other than Object; while this construct is valid, it won't do what you think it does. The Covariant Equals detector will detect such questionable overrides such as this one:

  public void boolean equals(Foo other) { ... }

Related to this detector is the Confusing Method Names detector, which triggers on methods with names like hashcode() and tostring(), on classes that have methods whose names differ only by capitalization, or whose methods are named the same as a superclass constructor. While these method names are all valid according to the language specification, it is likely that they are not what was intended. Similarly, the Serialization detector will trigger if the field serialVersionUID isn't final, isn't long, or isn't static.

Finalizers are not your friend

In "Garbage collection and performance," I tried my best to discourage the use of finalizers. Finalizers have a significant performance cost, and they are not guaranteed to run in a predictable amount of time (or even at all). Still, there are times when you need to use finalizers, and there are a number of errors you can make when doing so. If you must use a finalizer, it should generally be structured like in Listing 2:

Listing 2. Proper definition of a finalizer
  protected void finalize() { 
    try {
      doStuff();
    }
    finally { 
      super.finalize();
    }
  }

FindBugs detects a number of questionable finalizer constructs, such as:

  • An empty finalizer (which negates the effect of the superclass finalizer)
  • Do-nothing finalizers (which only call super.finalize(), but which can inhibit certain runtime optimizations)
  • Explicit invocation of finalizers (calling finalize() from user code)
  • Public finalizers (finalizers should be declared protected)
  • Finalizers that do not call super.finalize()

Examples of these bug patterns are shown in Listing 3:

Listing 3. Common finalizer mistakes
  // negates effect of superclass finalizer
  protected void finalize() { }

  // fails to call superclass finalize method
  protected void finalize() { doSomething(); }

  // useless (or worse) finalizer
  protected void finalize() { super.finalize(); }

  // public finalizer
  public void finalize() { try { doSomething(); } finally { super.finalize() } }

Also in "Garbage collection and performance," I talked about another garbage-collection hazard: explicit calls to System.gc(). Such explicit calls are almost always misguided attempts to "help" or "trick" the garbage collector, and they often end up hurting performance rather than helping it. FindBugs can detect explicit calls to System.gc() and flag them (on Sun JVMs, you can also disable explicit garbage collection with the -XX:+DisableExplicitGC startup option).

Safe construction techniques

In "Safe construction techniques," I showed how allowing an object's reference to escape its constructor can cause some serious problems. Since then, the risk of allowing the this reference to escape construction has gotten even more serious -- the new Java Memory Model (as specified by JSR 133 and implemented in JDK 1.5) negates all initialization safety guarantees if an object's reference is allowed to escape the constructor.

An object's reference can escape its constructor in several ways, both directly and indirectly. Storing the this reference in a static variable or data structure is clearly a no-no, but there are more subtle ways to allow a reference to escape construction, such as publishing a reference to a nonstatic inner class, or starting a thread from within a constructor (which almost invariably entails publishing the reference to the new thread). FindBugs has a detector for finding instances where a thread is started from a constructor; while it currently cannot detect all of these hazards, it is likely that a future version will include detectors for some of these other initialization safety patterns.

Being nice to the memory model

In "Fixing the Java Memory Model, Part 1," I reviewed the basic rule for synchronization -- whenever reading a variable that might have last been written by another thread, or writing a variable that might next be read by another thread, you must synchronize. It is easy to "forget" this rule, especially when reading -- but doing so creates numerous risks for the thread-safety of your program. This sort of bug is commonly introduced when maintaining a class that was originally properly synchronized, but the thread-safety requirements were not fully understood by the maintainer.

Fortunately, FindBugs has a number of detectors that can help identify incorrectly synchronized classes. The Inconsistent Synchronization detector is probably the most complicated detector used by FindBugs; it has to analyze the whole program, not just individual methods, use dataflow analysis to determine when a lock is held, and use heuristics to infer that a class intends to provide thread-safety guarantees. Basically, for each field, it looks at the pattern of accesses for that field, and if the majority of accesses are done with synchronization, accesses without synchronization are flagged as potential errors. Similarly, the Inconsistent Synchronization detector will issue a warning if the setter for a property is synchronized and the getter is not.

In addition to inconsistent synchronization, a number of other detectors for common threading errors are included, such as waiting on a monitor with two locks held (which, while not necessarily a bug, could cause a deadlock), using the double-checked locking idiom, incorrect lazy initialization of nonvolatile fields, invoking run() on a thread instead of starting it, invoking Thread.start() from a constructor, or calling wait() without wrapping it in a loop.

To mutate, or not to mutate

In "To mutate, or not to mutate" (and others), I extolled the virtues of immutability -- immutable objects cannot get into an inconsistent state. They are inherently thread-safe (assuming their immutability is ensured through the use of the final keyword), and you can freely share and cache references to immutable objects without having to copy or clone them.

The final keyword was included in the Java language to assist developers in creating immutable classes, and enable compilers and runtime environments to optimize on the basis of declared immutability. However, while fields can be final, array elements cannot. By the proper use of final and private fields, an object can be made immutable, but if the object's state includes arrays, it is important to prevent references to these internal arrays from escaping the class' methods. The class shown in Listing 4 tries to be immutable, but is not, because a caller could modify the states array after calling getStates(). (A related bug-in-waiting is when a mutable class might return a reference to a mutable array, and by the time the caller uses the array, its contents may have changed.) While generally considered a "malicious code" vulnerability (and many developers are not concerned about "malicious code" because their system does not load any "untrusted" classes), it is still possible for this idiom to cause serious problems in the absence of malicious code. It would be better either to return an unmodifiable List or clone the array before returning. FindBugs can detect errors like the one in getStates() (shown in Listing 4) -- while it doesn't necessarily know that the States class is supposed to be immutable, it knows that this getter is returning a handle to a mutable private array and flags it accordingly.

Listing 4. Erroneously returning a reference to a mutable array
  public class States {
    private final String[] states = { "AL", "AR", "AZ", ... };
    public boolean isState(String stateCandidate) { ... }
    public String[] getStates() { return states; }
  }

No bug too trivial

FindBugs is a truly impressive tool -- it manages to find real bugs, nearly all the time. You may think that some of the bug patterns it searches for, such as variable self-assignment, are too trivial to even bother looking for, but you'd be wrong -- every single one of the detectors in FindBugs has found bugs in tested, production, professionally developed code. Are there silly bugs lurking in your code? Get a copy of FindBugs and try it on your code. The results may enlighten -- and disturb -- you.

Resources

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Java technology on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Java technology
ArticleID=10960
ArticleTitle=Java theory and practice: Kill bugs dead
publish-date=06292004