Java theory and practice: Testing with leverage, Part 2

Writing and tuning bug detectors

June's Java theory and practice column demonstrated how static analysis tools like FindBugs can bring greater leverage to bear on managing software quality by focusing on entire categories of bugs rather than on specific bug instances. In this month's installment, resident exterminator Brian Goetz details the process of constructing and tuning a nontrivial bug pattern detector.

Share:

Brian Goetz (brian@quiotix.com), Principal Consultant, Quiotix

Brian Goetz has been a professional software developer for 20 years. He is a Principal Consultant at Quiotix, a software development and consulting firm located in Los Altos, California, and he serves on several JCP Expert Groups. Brian's book, Java Concurrency In Practice, was published in May 2006 by Addison-Wesley. See Brian's published and upcoming articles in popular industry publications.



25 July 2006

Also available in Russian Japanese

Part 1 of this brief series on effective testing built a FindBugs plugin to find a trivial bug pattern, that of calling System.gc(). Bug patterns identify problematic coding practices that are frequently found in the neighborhoods where bugs live. Of course, not all occurrences of bug patterns are necessarily bugs, but this doesn't keep bug pattern detectors from being tremendously useful. All that is needed for a bug pattern detector to be effective is that it turn up a high enough percentage of questionable code to make it worth the effort of using it. Creating bug pattern detectors can have very high leverage; once you've created a detector, you can run it on any code you want, now or in the future, and you might be surprised what turns up. For example, the trivial detector in Part 1 showed that there were calls to System.gc() buried in the JPEG image I/O library in JDK 1.4.2.

Writing a detector to find calls to a particular static method isn't that hard, but most bug detectors involve a fair bit more analysis and implementation effort. In this installment, you'll develop a detector for a less trivial bug pattern, called RuntimeException capture. (This bug detector is now included in the FindBugs distribution.)

RuntimeException capture

One of the advantages of exception handling in the Java™ language is that exceptions are objects and the try-catch mechanism understands the hierarchy of exception types, offering substantial flexibility in how clients deal with error handling. For example, the FileInputStream constructor throws FileNotFoundException, a subclass of IOException, if the file cannot be found. This use of inheritance allows clients to handle file-not-found conditions separately from other file-related error conditions if they like by catching FileNotFoundException separately, but they can also handle all file-related error conditions the same way by catching only IOException.

On the other hand, a chief complaint of exception handling is that if you use exceptions properly, it is easy to end up with methods with three or four lines of business logic and 20 or 30 lines of exception handling. Because error recovery code is easy to get wrong and hard to test, having a substantial portion of code devoted to exception handling is both frustrating and error-prone. A typical example of this situation is shown in Listing 1, where a method with two lines of "real" code requires three separate catch blocks, each of which do the exact same thing -- log the exception:

Listing 1. Multiple identical catch blocks
public void addInstance(String className) {
    try {
        Class clazz = Class.forName(className);
        objectSet.add(clazz.newInstance());
    }
   catch (IllegalAccessException e) {
        logger.log("Exception in addInstance", e);
    }
    catch (InstantiationException e) {
        logger.log("Exception in addInstance", e);
    }
    catch (ClassNotFoundException e) {
        logger.log("Exception in addInstance", e);
    }
}

Looking at Listing 1, you might be tempted to collapse all three catch blocks into a single block that catches Exception because the recovery action is the same for each. At first glance, this strategy seems like a good idea -- code duplication is a bad thing, so consolidating these duplicated paths should be an improvement. However, this "improvement" has an often-unintended consequence. Because RuntimeException extends Exception, collapsing the three catch blocks into one as shown in Listing 2 also changes the semantics -- now unchecked exceptions will be logged instead of propagated. This bug pattern, where RuntimeException is inadvertently caught by an overly broad catch block, is called RuntimeException capture.

Listing 2. RuntimeException capture bug pattern -- don't do this
public void addInstance(String className) {
    try {
        Class clazz = Class.forName(className);
        objectSet.add(clazz.newInstance());
    }
    catch (Exception e) {
        logger.log("Exception in newInstance", e);
    }
}

Bug patterns often stem from confusing features of the language or the class library; this bug pattern stems from the somewhat counterintuitive fact that RuntimeException extends Exception. The fix for RuntimeException capture is easy -- if you know about the problem: catch RuntimeException first and rethrow it before catching Exception, as shown in Listing 3. However, even knowing about the bug pattern and its cure, it is easy to forget to do it right or not notice it during a code review, and the compiler won't warn you. That's where bug pattern detectors come in -- they save you from making mistakes about which "you should have known better."

Listing 3. Fixing RuntimeException capture by handling RuntimeException explicitly
public void addInstance(String className) {
    try {
        Class clazz = Class.forName(className);
        objectSet.add(clazz.newInstance());
    }
    catch (RuntimeException e) {
        throw e;
    }
    catch (Exception e) {
        logger.log("Exception in newInstance", e);
    }
}

Writing a RuntimeException capture detector

As you learned last month, the first step in writing a bug pattern detector is clearly identifying the bug pattern. In this case, the bug pattern is a catch block that catches Exception when there is no corresponding catch block for RuntimeException, and where no method call or throw statement in the try block throws Exception. To detect this bug pattern, you'll need to know where the try-catch blocks are, what might be thrown out of the try block, and what is caught in the catch blocks.

Identifying caught exceptions

Like last month, you'll start the bug detector by subclassing the BytecodeScanningDetector base class, which implements the Visitor pattern. There is a visit(Code) method in BytecodeScanningDetector, and the implementation calls visit(CodeException) every time it finds a catch block. If you override visit(Code) and call super.visit(Code) from it, when the superclass visit(Code) returns, it will have called visit(CodeException) for all the catch blocks in the method. Listing 4 shows a first cut at an implementation of visit(Code) and visit(CodeException) that accumulates information about all the catch blocks in a method. Each CodeException contains the starting and ending bytecode offset for the corresponding try block, so you can easily determine which CodeException objects correspond to a try-catch block.

Listing 4. First version of a RuntimeException capture detector that gathers information about exceptions thrown in a method
public class RuntimeExceptionCapture extends BytecodeScanningDetector {
  private BugReporter bugReporter;
  private Method method;
  private OpcodeStack stack = new OpcodeStack();
  private List<ExceptionCaught> catchList;
  private List<ExceptionThrown> throwList;

  public void visitMethod(Method method) {
    this.method = method;
    super.visitMethod(method)		 }

  public void visitCode(Code obj) {
    catchList = new ArrayList<ExceptionCaught>();
    throwList = new ArrayList<ExceptionThrown>();
    stack.resetForMethodEntry(this);

    super.visitCode(obj);
    // At this point, we've identified all the catch blocks
    // More to come...
  }

  public void visit(CodeException obj) {
    super.visit(obj);
    int type = obj.getCatchType();
    if (type == 0) return;
    String name = 
      getConstantPool().constantToString(getConstantPool().getConstant(type));

    ExceptionCaught caughtException =
      new ExceptionCaught(name, obj.getStartPC(), obj.getEndPC(), obj.getHandlerPC());
    catchList.add(caughtException);
  }
}

Identifying thrown exceptions

At this point, you have half the information you need: which exceptions are caught where. Now you have to find out which exceptions are thrown. To do that, you need to override the sawOpcode() method of BytecodeScanningDetector and handle the bytecodes corresponding to method invocation and exception throwing. Exceptions are thrown with the athrow JVM instruction. Three JVM instructions are used to invoke a method: invokestatic, invokevirtual, and invokespecial. Just as with visit(CodeException), sawOpcode is called during the call to the superclass visit(Code), so if you gather the appropriate information in sawOpcode(), when super.visit(Code) returns, you'll have all the information you need about caught and thrown exceptions.

Listing 5 shows the implementation of sawOpcode(), which handles these JVM instructions. For the athrow instruction, you use FindBugs' OpcodeStack helper class to learn the type of the operand of athrow. For the method call instructions, you use Bytecode Engineering Library (BCEL) classes to extract the types of checked exceptions the method is declared to throw. In either case, you accumulate information about which exceptions are thrown at what bytecode offset in the method, so you can match them up after you've have finished processing the whole method.

Listing 5. Identifying where exceptions are thrown in the visited code
public void sawOpcode(int seen) {
  stack.mergeJumps(this);
  try {
      switch (seen) {
      case ATHROW:
          if (stack.getStackDepth() > 0) {
              OpcodeStack.Item item = stack.getStackItem(0);
              String signature = item.getSignature();
              if (signature != null && signature.length() > 0) {
                  if (signature.startsWith("L"))
                      signature = SignatureConverter.convert(signature);
                  else
                      signature = signature.replace('/', '.');
                  throwList.add(new ExceptionThrown(signature, getPC()));
              }
          }
          break;

      case INVOKEVIRTUAL:
      case INVOKESPECIAL:
      case INVOKESTATIC:
          String className = getDottedClassConstantOperand();
          try {
              if (!className.startsWith("[")) {
                  JavaClass clazz = Repository.lookupClass(className);
                  Method[] methods = clazz.getMethods();
                  for (Method method : methods) {
                      if (method.getName().equals(getNameConstantOperand())
                              && method.getSignature().equals(getSigConstantOperand())) {
                          ExceptionTable et = method.getExceptionTable();
                          if (et != null) {
                              String[] names = et.getExceptionNames();
                              for (String name : names)
                                  throwList.add(new ExceptionThrown(name, getPC()));
                          }
                          break;
                      }
                  }
              }
          } catch (ClassNotFoundException e) {
              bugReporter.reportMissingClass(e);
          }
          break;
      default:
          break;
      }
  } finally {
      stack.sawOpcode(this, seen);
  }
}

Putting it all together

Now that you have the information you need about caught and thrown exceptions, the last step is to put it together. After the call to the superclass visit(Code) returns, the throwList and caughtList collections are fully populated. These contain information about all the try-catch blocks in the method, so you'll have to correlate the throw and catch information to identify the bug pattern.

Listing 6 shows the logic for identifying RuntimeException capture. It iterates the list of catch blocks, and if it finds one that catches Exception, it looks again for a catch block that catches RuntimeException for the same range of bytecodes. It also looks for instances where Exception is thrown in the corresponding range of bytecodes. If neither RuntimeException is caught nor Exception is thrown, you have a potential bug.

Listing 6. Combining catch and throw data to identify RuntimeException capture
for (ExceptionCaught caughtException : catchList) {
    Set<String> thrownSet = new HashSet<String>();
    for (ExceptionThrown thrownException : throwList) {
        if (thrownException.offset >= caughtException.startOffset
                && thrownException.offset < caughtException.endOffset) {
            thrownSet.add(thrownException.exceptionClass);
            if (thrownException.exceptionClass.equals(caughtException.exceptionClass))
                caughtException.seen = true;
        }
    }
    int catchClauses = 0;
    if (caughtException.exceptionClass.equals("java.lang.Exception") 
      && !caughtException.seen) {
        // Now we have a case where Exception is caught, but not thrown
        boolean rteCaught = false;
        for (ExceptionCaught otherException : catchList) {
            if (otherException.startOffset == caughtException.startOffset
                    && otherException.endOffset == caughtException.endOffset) {
                catchClauses++;
                if (otherException.exceptionClass.equals("java.lang.RuntimeException"))
                    rteCaught = true;
            }
        }
        int range = caughtException.endOffset - caughtException.startOffset;
        if (!rteCaught) {
            bugReporter.reportBug(new BugInstance(this, "REC_CATCH_EXCEPTION",
                    NORM_PRIORITY)
                    .addClassAndMethod(this)
                    .addSourceLine(this, caughtException.sourcePC));
        }
    }
}

To write bug detectors, you need some understanding of the structure of JVM bytecode and of class files. The BCEL and FindBugs libraries handle some of this task for you, extracting information from the bytecode and presenting it at a slightly higher level. Unfortunately, the documentation on both the BCEL and FindBugs support for taking apart a class is weaker than you might like. As with many open source projects, the best source of information about how to write a detector is to look at other detectors that do similar things.


Tuning the detector

The most expensive aspect of using static analysis is dealing with false alarms. Static analysis is necessarily imprecise; the goal is not to find bugs, but to find constructs that might be bugs, which means that sometimes correct code will be flagged. If a code auditing tool produces 95 percent false alarms, it is unlikely that anyone would use it twice; the memory of wading through the false alarms the first time to find the few bugs it reports would be too painful. So for a bug pattern detector to be effective, it must minimize false alarms, preferably no more than 50 percent.

The best way to tune a detector is to run it on a large code base, such as the JDK class libraries (rt.jar), Eclipse, or JBoss. So once you write a bug detector, try running it on a few projects and sample the reports to see if they are actual bugs or false alarms. For a nontrivial detector like the one developed here, the first experience is often somewhat disappointing -- more false alarms than expected.

The process of tuning a detector involves looking at the false alarms and refining the bug pattern so as to eliminate some false alarms without excluding too many actual bugs. One thing you can do to narrow the pattern is to eliminate cases where there are zero or one distinct checked exceptions thrown in the try block; in these cases, catching Exception is not likely to be a result of trying to collapse multiple catch blocks, but instead reflective of an actual desire to catch unchecked exceptions. This modification had a significant effect on the false alarm rate.

Tuning a detector often involves the use of "scoring" algorithms to determine whether a match should be reported as a bug. Additional tuning was done by using several factors to increase or decrease the "confidence score" for a given instance. Some aspects, such as having no checked exceptions, reduce a candidate match's score; others, such as the caught exception being "dead" (not used at all in the catch block), increase a candidate match's score. Listing 7 shows the scoring algorithm that was arrived at after the tuning process for this detector; it uses the priority as a score, as bugs with priority over a certain threshold are ignored by the bug-reporting framework (higher priority values indicate lower actual bug severity).

Listing 7. Scoring algorithm used by the RuntimeException capture detector after tuning
if (!rteCaught) {
    int priority = LOW_PRIORITY + 1;
    if (range > 300) priority--;
    else if (range < 30) priority++;
    if (catchClauses > 1) priority++;
    if (thrownSet.size() > 1) priority--;
    if (caughtException.dead) priority--;
    bugReporter.reportBug(new BugInstance(this, "REC_CATCH_EXCEPTION",
            priority)
            .addClassAndMethod(this)
            .addSourceLine(this, caughtException.sourcePC));
}

Summary

Writing custom bug detectors for a static code analysis tool such as FindBugs can offer tremendous leverage in improving code quality, and it can be a lot of fun. While writing and tuning bug detectors can be challenging (and tuning them is critical to ensuring they are used), capturing knowledge of a bug pattern in a detector pays handsome dividends as it becomes possible to scan for that bug pattern in any project with very little effort -- and you might be surprised where even the most silly-looking bugs turn up.

Resources

Learn

Get products and technologies

  • FindBugs: Download FindBugs and try it out on your code.

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Java technology on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Java technology
ArticleID=148372
ArticleTitle=Java theory and practice: Testing with leverage, Part 2
publish-date=07252006