Level: Intermediate
Brian Goetz (brian@quiotix.com), Principal Consultant, Quiotix
20 Jun 2006
Back in June 2004, veteran exterminator Brian Goetz introduced the FindBugs static code analysis tool, which can detect bugs even in well-tested software. This month, he revisits that topic and looks at how static analysis tools can change the way you manage software quality by aiming development resources at entire classes of bugs rather than specific instances.
Unit testing is now an integral part of the development process for
most teams; frameworks like JUnit have made testing painless enough
that even if we don't like it, we're willing to write some tests for
some of our code. But unit tests operate at a low level; they can
only test a single piece of code, and test code generally exhibits a
low level of reusability -- the tests we wrote yesterday for component
A are unlikely to be much use for testing component B, except maybe as
example code.
A typical unit test scenario
When you find a bug, what is the first thing you do? You could just
fix it, but that might not be the most efficient approach in the long
run. In most development shops, the process looks something like
the following:
- Write a test case for the bug
- Make sure the test case fails before touching the code
- Fix the bug
- Make sure the test case now passes
- Make sure the rest of the test suite still passes
- Check the fix and the test case into version control
- Document the fix in the bug tracking system
While this approach is a lot more work in the short term than just fixing the
bug, it offers greater value: you have more confidence that the bug is
fixed because you have tested it, and you have more confidence that the
bug will not come back because the test case is part of your regression
test suite. Between the version control system and the bug tracking
system, you also have a record that describes what the bug was and how it
was fixed -- useful information that may benefit others.
If you're feeling ambitious, you might think about how the bug came
about and look for other places where the same mistake was made. Then,
if you find the same mistake elsewhere, you might check in tests and fixes for
those bugs as well. But the fundamental weakness of unit tests as a
quality management tool is that each test case can only test one piece of
code. Because test cases must be specifically designed for each
component and for each potential failure mode, writing enough unit
tests to test a large product can be extremely time-consuming and
expensive.
The economics of QA
Tests are an essential quality management tool, but we all know that
having even a great set of test cases is not enough to find all the
bugs in a complex piece of software. In fact, "finding all the bugs"
for any nontrivial program is likely to be an impossible goal. It is
estimated that NASA employs 20 testers for each developer -- far
more than any commercial entity could afford to allocate to QA -- and
still there are software defects. The goal of the QA process, then,
should not be to find all the bugs, because that's impossible.
Instead, it should be to raise confidence that the code works, to the
greatest degree possible given the available resources.
Running an efficient QA operation involves budgeting available
resources among the available QA methodologies so as to maximize
confidence. A test suite with good coverage raises our confidence
that the code works, as does a thorough code review. Doing both is
better than doing just one or the other because each finds errors the other is likely to
miss. Both are also subject to diminishing returns, so a QA plan that
involves X dollars' worth of testing and Y dollars' worth of code review is
likely to be more effective than one that spends X+Y dollars on either alone.
Adding static analysis
Static analysis is the process of analyzing code without running it,
much like what we do in our heads during a code review or what IDEs do
when they flag questionable constructs. Static analysis is a sensible
technique to add to the QA mix because it is yet another technique
that is good at finding errors that other approaches (like testing and code
review) can miss. Static analysis is also relatively cheap; unlike
unit tests, which you must write anew for each class you want to test, you can run
static analysis tools on any body of code.
FindBugs is an open source static analysis tool that contains bug
pattern detectors for many common bug patterns. Perhaps surprisingly,
it often can find "dumb" bugs in even well-tested software -- bugs that unit
tests and expert code reviews are likely to miss. FindBugs also
allows you to write new bug pattern detectors and package them as
plugins so if the standard set of detectors doesn't do what you need,
you can easily write your own. It is this extensibility that
makes FindBugs such a powerful tool for quality management because
as you discover new types of mistakes, you can write detectors for
them and search for them in your entire code base.
The major cost of static analysis is analyzing the output and
determining if reported items are actual bugs or false alarms. Part
of writing a good analysis tool or bug pattern detector is managing
the false alarm rate; the detectors in the core FindBugs package have
been tuned with the goal of emitting no more than 50 percent false alarms so
that analyzing the output is not excessively painful. (Contrast this threshold
with lint-like tools for C, which often emit so many false alarms that
they are prohibitively time-consuming to use.)
Taking it up a level
The methodology described earlier for fixing a bug -- write the test
case first, then fix the bug, then check in both the fix and the test
case -- reflects the desire not only to fix the bug but to improve
confidence that it has been fixed and to retain the knowledge of how
and when it was fixed. This approach is more work than just fixing the bug,
but it gives us more confidence that our code will continue to work
even with ongoing modification by multiple developers. But writing
test cases only for bugs that have been discovered is a reactive
approach. We would like to analyze our code to the extent possible
for compliance with good practices before it fails.
Listing 1 illustrates a common bug in the use of the
BigDecimal class. BigDecimal is immutable,
so arithmetic methods such as add() return a new
BigDecimal as their result rather than modifying the object on which they are invoked. The
code in Listing 1 is clearly supposed to conditionally add the shipping cost to the
total order price, but in fact it never adds anything because the return
value of add() is discarded:
Listing 1. A typical bug pattern -- confusing a factory method with a mutator method
public class ShoppingCart {
    private BigDecimal totalCost;

    private boolean qualifiesForFreeShipping() { ... }
    private BigDecimal getShippingCost() { ... }

    public void checkout() {
        ...
        if (!qualifiesForFreeShipping())
            totalCost.add(getShippingCost()); //WRONG!
    }
}
The mistake in Listing 1 -- forgetting that an object
is immutable and thus mistaking a factory method for a mutator method -- is a common one.
If you were to find such a mistake in your code, you might realize
that there is a good chance that the same mistake is made more than
once, as it stems from a misunderstanding of how a particular library
class works. On discovering this bug, a responsible developer might
search the entire code base for calls to
BigDecimal.add(), subtract(), and so on, looking
for other instances of ignoring the return value.
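The fix for the mistake in Listing 1 is to keep the value that add() returns. The following is a minimal, self-contained sketch of the buggy and corrected forms side by side; the class name, method names, and amounts are hypothetical stand-ins for the elided parts of ShoppingCart, not code from the original example.

```java
import java.math.BigDecimal;

/** Hypothetical stand-in for ShoppingCart; names and amounts are illustrative only. */
final class ShippingTotalDemo {

    /** Repeats the mistake from Listing 1: the result of add() is discarded. */
    static BigDecimal buggyTotal(BigDecimal total, BigDecimal shipping) {
        total.add(shipping);   // creates a new BigDecimal that nobody keeps
        return total;          // still the original value
    }

    /** Keeps the result by assigning it back to the variable. */
    static BigDecimal fixedTotal(BigDecimal total, BigDecimal shipping) {
        total = total.add(shipping);
        return total;
    }

    public static void main(String[] args) {
        BigDecimal total = new BigDecimal("100.00");
        BigDecimal shipping = new BigDecimal("7.50");
        System.out.println(buggyTotal(total, shipping)); // prints 100.00
        System.out.println(fixedTotal(total, shipping)); // prints 107.50
    }
}
```

In the ShoppingCart class itself, the equivalent change would be to assign the result of add() back to the totalCost field.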
This strategy is a good first step, but we can do better. It is easy to
identify the bug pattern here -- ignoring the result of a
value-bearing method on an immutable object. Once you identify the pattern, it is a relatively simple matter to build a detector that
identifies this pattern. (FindBugs has such a detector in the core
detector set.) This technique applies not only to
BigDecimal but also to other immutable classes such as
BigInteger, String, and Color.
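To make the pattern concrete: when a method's return value is ignored, the compiler emits a pop instruction immediately after the call. The sketch below is a toy illustration of that idea only. It scans hand-written, javap-like instruction text rather than real class files, it does not use the FindBugs API, and it is not how the FindBugs core detector is implemented. The class name, the instruction format, and the list of immutable classes are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy scanner over simplified instruction text; not a real bytecode analyzer. */
final class DiscardedResultScan {

    // Classes treated as immutable for this sketch (the examples named in the text).
    private static final List<String> IMMUTABLE_CLASSES = Arrays.asList(
            "java/math/BigDecimal", "java/math/BigInteger",
            "java/lang/String", "java/awt/Color");

    /** Returns the owner class of an invoke instruction, or null if there is none. */
    static String ownerOf(String instruction) {
        String[] parts = instruction.trim().split("\\s+");
        if (parts.length < 2 || !parts[0].startsWith("invoke")) {
            return null;
        }
        int dot = parts[1].indexOf('.');
        return dot < 0 ? null : parts[1].substring(0, dot);
    }

    /** Returns the indexes of calls on immutable classes whose result is popped right away. */
    static List<Integer> findDiscardedResults(List<String> instructions) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + 1 < instructions.size(); i++) {
            String owner = ownerOf(instructions.get(i));
            boolean popped = instructions.get(i + 1).trim().equals("pop");
            if (owner != null && popped && IMMUTABLE_CLASSES.contains(owner)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hand-written approximation of the bytecode for totalCost.add(getShippingCost());
        List<String> code = Arrays.asList(
                "aload_0",
                "getfield ShoppingCart.totalCost:Ljava/math/BigDecimal;",
                "aload_0",
                "invokespecial ShoppingCart.getShippingCost:()Ljava/math/BigDecimal;",
                "invokevirtual java/math/BigDecimal.add:(Ljava/math/BigDecimal;)Ljava/math/BigDecimal;",
                "pop",
                "return");
        System.out.println(findDiscardedResults(code)); // prints [4]
    }
}
```

A real detector would work on the class file itself and would need to handle cases this sketch ignores, but the core test is the same: a value-bearing call on an immutable type followed directly by a pop.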
Taking the time to create a bug detector for a bug pattern such as
this one can pay big dividends. Not only can you audit your entire
project for this bug with less work and higher confidence than you can
by hand, but you can apply the same detector to other projects,
present and future. Rather than addressing bugs on an
instance-by-instance basis, you've created a defense against that type
of bug ever coming back, wherever it might pop up.
An example bug detector
To illustrate the process of writing a detector for FindBugs, let's
write a simple detector that finds calls to System.gc(). (The source code for this example detector is available in the Download section.)
While calling System.gc() is not necessarily a bug, in
practice it usually causes more problems than it solves. Having an
errant call to System.gc() buried in a library can
dramatically impair the performance of a program that uses that
library, and the developers may well be left scratching their heads,
wondering why performance is so bad.
The first step in writing a bug detector is to identify the bug
pattern being detected. In this case, the pattern is simple:
calling System.gc(). To write a detector that recognizes
this pattern in bytecode, we need to know what the bytecode
corresponding to the bug pattern looks like. The easiest way to learn
the answer is to write a small program that has the bug, compile it, and
disassemble the .class file with javap -c.
Listing 2 shows a class that exhibits the bug:
Listing 2. Code that exhibits the bug pattern we want to build a detector for
public class BadClass {
    public void doBadStuff() {
        System.gc();
    }
}
Listing 3 shows the output of javap -c when run on the
sample class:
Listing 3. Bytecode listing for code in Listing 2
public void doBadStuff();
  Code:
   0: invokestatic #2; //Method java/lang/System.gc:()V
   3: return
We can quickly see that calling a static method is
done with the invokestatic JVM instruction, and the
operand of invokestatic is the gc:()V method
of class java/lang/System. The method signatures and
type names in the bytecode look a little different from the way they do in the
source code, but it doesn't take long to get used to the encoding used
by the bytecode.
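As a small aid to reading that encoding: class names in bytecode use slashes where source code uses dots, and a method descriptor such as ()V lists the parameter types in parentheses followed by the return type (V for void). The sketch below uses the JDK's own java.lang.invoke.MethodType parser to decode two descriptors; the class and method names in the sketch are made up for illustration, and the BigDecimal descriptor is simply the standard encoding of BigDecimal.add().

```java
import java.lang.invoke.MethodType;

/** Illustrative helper for decoding bytecode-style names; not part of FindBugs. */
final class DescriptorDemo {

    /** Converts an internal (slash-separated) class name to the dotted source form. */
    static String toSourceName(String internalName) {
        return internalName.replace('/', '.');
    }

    /** Parses a method descriptor; a null loader means the system class loader is used. */
    static MethodType parse(String descriptor) {
        return MethodType.fromMethodDescriptorString(descriptor, null);
    }

    public static void main(String[] args) {
        System.out.println(toSourceName("java/lang/System")); // prints java.lang.System

        MethodType gc = parse("()V");
        System.out.println(gc.parameterCount()); // prints 0
        System.out.println(gc.returnType());     // prints void

        MethodType add = parse("(Ljava/math/BigDecimal;)Ljava/math/BigDecimal;");
        System.out.println(add.parameterType(0).getName()); // prints java.math.BigDecimal
        System.out.println(add.returnType().getName());     // prints java.math.BigDecimal
    }
}
```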
With this example of our bug pattern, writing a FindBugs detector is
fairly straightforward. Listing 4 shows a detector that extends the
BytecodeScanningDetector base class and overrides the
sawOpcode() method. When it sees an
invokestatic instruction, it checks the class and name of
the method being invoked and if it is System.gc(), it
reports a bug instance.
Listing 4. Bug detector to find calls to System.gc()
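package com.briangoetz.findbugs.plugin;

import edu.umd.cs.findbugs.BugInstance;
import edu.umd.cs.findbugs.BugReporter;
import edu.umd.cs.findbugs.BytecodeScanningDetector;

public class CallSystemGC extends BytecodeScanningDetector {
    private BugReporter bugReporter;

    public CallSystemGC(BugReporter bugReporter) {
        this.bugReporter = bugReporter;
    }

    // Called once for each bytecode instruction in the method being scanned
    public void sawOpcode(int seen) {
        if (seen == INVOKESTATIC) {
            // Match on the class and method name of the call target
            if (getClassConstantOperand().equals("java/lang/System")
                    && getNameConstantOperand().equals("gc")) {
                bugReporter.reportBug(new BugInstance("SYSTEM_GC", NORMAL_PRIORITY)
                        .addClassAndMethod(this)
                        .addSourceLine(this));
            }
        }
    }
}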
Packaging detectors into plugins
The last step required in creating a new bug detector is to package it
as a plugin. A FindBugs plugin contains one or more bug detectors, a
deployment descriptor, and a resource file, packaged into a JAR file
and placed in the plugins directory of your FindBugs installation.
The deployment descriptor, called findbugs.xml, defines
the known bug detectors and the errors they might report. The resource
file, called messages.xml (or
messages_xx.xml, for localized versions), defines
language-specific strings that will be used by the FindBugs GUI to
describe the bugs reported. The deployment descriptor and resource
file for our example bug detector are shown in Listing 5 and Listing 6.
Multiple localized versions of the resource file may be included in
the plugin JAR; the deployment descriptor and resource files are
placed in the top level directory of the plugin JAR.
Listing 5. Deployment descriptor for example bug detector
<FindbugsPlugin xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:noNamespaceSchemaLocation="findbugsplugin.xsd"
                pluginid="com.briangoetz.findbugs.plugin"
                defaultenabled="true"
                provider="Brian Goetz"
                website="http://www.briangoetz.com">
  <Detector class="com.briangoetz.findbugs.plugin.CallSystemGC"
            speed="fast"
            reports="SYSTEM_GC" />
  <BugPattern abbrev="GC" type="SYSTEM_GC" category="PERFORMANCE" />
</FindbugsPlugin>
Listing 6. Resources file for example bug detector
<MessageCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:noNamespaceSchemaLocation="messagecollection.xsd">
  <Plugin>
    <ShortDescription>Brian's plugin</ShortDescription>
    <Details></Details>
  </Plugin>
  <Detector class="com.briangoetz.findbugs.plugin.CallSystemGC">
    <Details>
<![CDATA[
Finds calls to System.gc().
]]>
    </Details>
  </Detector>
  <BugPattern type="SYSTEM_GC">
    <ShortDescription>Method calls System.gc()</ShortDescription>
    <LongDescription>Call to System.gc() method in {1}</LongDescription>
    <Details>
<![CDATA[
Library code should not call System.gc()
]]>
    </Details>
  </BugPattern>
  <BugCode abbrev="GC" >Garbage collection</BugCode>
</MessageCollection>
Building and packaging our plugin and running it against the JDK
1.4.2 class libraries provides us with a surprise: several classes in
com.sun.imageio -- including JPEGImageReader and
JPEGImageWriter -- call System.gc()! This result
is yet another benefit of the flexibility of static analysis: Once
you have created a bug detector, you may be surprised at where it finds bugs.
Summary
Static analysis and custom bug detectors can be a very cost-effective
way to improve software quality. By creating a detector for a known
bug pattern, we can search for that bug pattern not only in the
current code base for a specific project, but in any project, current
or future. The extra effort to create a bug detector is more than
made up for by the quality dividends it pays in the future.
Download
| Description | Name | Size | Download method |
|---|---|---|---|
| Source code | j-jtp06206fb_plugin_src.jar | 8KB | HTTP |
Resources
Learn
- Java theory and practice: "Kill bugs dead" (developerWorks, Brian Goetz, June 2004): This earlier installment of Java theory and practice introduced the FindBugs tool and described some of the bug patterns it can find.
- "FindBugs, Part I" (developerWorks, Chris Grindstaff, May 2004): Discusses how to integrate FindBugs into your development methodology.
- "FindBugs, Part II" (developerWorks, Chris Grindstaff, May 2004): The follow-on article discusses how to use custom bug detectors to enforce project-wide code standards.
- The Java technology zone: Hundreds of articles about every aspect of Java programming.
Get products and technologies
- FindBugs: Download FindBugs and try it out on your code.
About the author
Brian Goetz has been a professional software developer for 20 years. He is a Principal Consultant at Quiotix, a software development and consulting firm located in Los Altos, California, and he serves on several JCP Expert Groups. Brian's book, Java Concurrency In Practice, was published in May 2006 by Addison-Wesley. See Brian's published and upcoming articles in popular industry publications.