Fuzz testing

Attack your programs before someone else does

For years, I've been astounded by the number of corrupt files that can crash Microsoft Word. A few bytes out of place and the whole application goes down in flames. On older, non-memory-protected operating systems, the whole computer usually went down with it. Why can't Word recognize when it's received bad data and simply put up an error message? Why does it corrupt its own stack and heap just because a few bits got twiddled? Of course, Word is hardly the only program that behaves atrociously in the face of malformed files.

This article introduces you to a technique that attempts to avert just this sort of disaster. In fuzz testing, you attack a program with random bad data (aka fuzz), then wait to see what breaks. The trick of fuzz testing is that it isn't logical: Rather than attempting to guess what data is likely to provoke a crash (as a human tester might do), an automated fuzz test simply throws as much random gibberish at a program as possible. The failure modes identified by such testing usually come as a complete shock to programmers because no logical person would ever conceive of them.

Fuzz testing is a simple technique, but it can nonetheless reveal important bugs in your programs. It can identify real-world failure modes and signal potential avenues of attack that should be plugged before your software ships.

How fuzz testing works

Fuzz testing is a very simple procedure to implement:

  1. Prepare a correct file to input to your program.
  2. Replace some part of the file with random data.
  3. Open the file with the program.
  4. See what breaks.

You can vary the random data in any number of ways. For example, you might randomize the entire file rather than replacing just a part of it. You could limit the file to ASCII text or non-zero bytes. Any way you slice it, the key is to throw a lot of random data at an application and see what fails.

While you can do initial tests manually, you should really automate fuzzing for maximum effect. In this case, you first need to define the proper error behavior for the application when faced with corrupt input. (If you discover the program hasn't bothered to define what happens when the input data is corrupt, well, there's your first bug.) Then you simply pass random data into the program until you find a file that doesn't trigger the proper error dialog, message, exception, etc. Store and log that file so you can reproduce the problem later. Repeat.

Although fuzz testing usually requires some manual coding, there are tools that can help. For example, Listing 1 shows a simple Java™ class that randomly modifies a certain length of a file. I usually like to start fuzzing somewhere after the first few bytes because programs seem more likely to notice an early mistake than a later one. (You want to find the errors the program doesn't check, not the ones it does.)

Listing 1. A class that replaces part of a file with random data
import java.util.Random;

public class Fuzzer {

     private Random random = new SecureRandom();
     private int count = 1;

     public File fuzz(File in, int start, int length) throws IOException  

         byte[] data = new byte[(int) in.length()];
         DataInputStream din = new DataInputStream(new FileInputStream(in));
         fuzz(data, start, length);
         String name = "fuzz_" + count + "_" + in.getName();
         File fout = new File(name);
         FileOutputStream  out = new FileOutputStream(fout);
         return fout;

     // Modifies byte array in place
     public void fuzz(byte[] in, int start, int length) {

         byte[] fuzz = new byte[length];
         System.arraycopy(fuzz, 0, in, start, fuzz.length);



Fuzzing the file is easy. Passing it to the application is normally not much harder. Scripting languages like AppleScript or Perl are often the best choice for writing this part of the fuzz test. For GUI programs, the hardest part can be recognizing whether the application indicated the right failure mode. Sometimes it's simplest to have a human sit in front of the program and mark each test as pass or fail. Be sure to individually name and save all the random test cases generated so you can reproduce any failures you detect with this procedure.

Defensive coding

Solid code follows this fundamental principle: Never accept external data into your program without verifying it for consistency and sanity.

If you read a number from a file and expect it to be positive, check that it is before further processing with that number. If you expect a string to contain only ASCII letters, be sure that it does. If you think a file contains an integral multiple of four bytes, then check that. Never assume that any characteristic of externally-supplied data is as you expect.

The most common mistake is to assume that because an instance of your program wrote the data out, it can read that data back in again without verifying it. This is dangerous! The data could have been overwritten on disk by another program. It could have been corrupted by a failing disk or a bad network transfer. It could have been modified by another program that had a bug. It could even have been deliberately modified in an effort to subvert your program's security. Assume nothing. Verify everything.

Of course, error handling and verification is ugly, annoying, inconvenient, and thoroughly despised by programmers the world over. Sixty years into the computer age, we still aren't checking basic things like the success of opening a file or whether memory allocation succeeds. Asking programmers to test each byte and every invariant when reading a file seems hopeless -- but failing to do so leaves your programs vulnerable to fuzz. Fortunately, help is available. Properly used, modern tools and technologies can significantly alleviate the pain of hardening your applications. In particular, three techniques stand out:

  • Checksums
  • Grammar based formats such as XML
  • Verified code such as Java

Fuzz proofing with checksums

The simplest thing you can do to protect against fuzzing is add a checksum to your data. For example, you can sum up all the bytes in the file and then take the remainder when dividing by 256. Store the resulting value in one extra byte at the end of the file. Then, before trusting the input data, verify that the checksum matches. This very simple scheme reduces the risk of undetected accidental failure to about 1 in 256.

Robust checksum algorithms like MD5 and SHA do much more than simply take the remainder when dividing by 256. In the Java language, the and classes provide convenient means for attaching a checksum to your data. Using one of these checksum algorithms reduces the chance of accidental corruption to less than one in a billion (though deliberate attacks are still a possibility, as you'll see).

XML storage and validation

Storing your data in XML is an excellent way to avoid problems with data corruption. While XML was originally intended for Web pages, books, poems, articles, and similar documents, it has found broad success in almost every field ranging from financial data to vector graphics to serialized objects.

The key characteristic that makes XML formats resistant to fuzz is that an XML parser assumes nothing about the input. This is precisely what you want in a robust file format. XML parsers are designed so that any input (well-formed or not, valid or not) is handled in a defined way. An XML parser can process any byte stream. If your data first passes through an XML parser, all you need to be ready for is whatever the parser can give you. For instance, you don't need to check whether the data contains null characters because an XML parser will never pass you a null. If the XML parser sees a null character in its input, it throws an exception and stops processing. You still need to handle this exception of course, but writing a catch block to handle a detected error is much simpler than writing code to detect all possible errors.

For further security you can validate your document with a DTD and/or a schema. This checks not only that the XML is well-formed, but that it's at least close to what you expect. Validation will rarely tell you everything you need to know about a document, but it makes it easy to write a lot of simple checks. With XML, it is very straightforward to strictly limit the documents you accept to the formats you know how to handle.

Still, there will be pieces of the code you can't validate with a DTD or schema. For example, you can't test whether the price of an item in an invoice is the same as the price for that item in your inventory database. When receiving an order document from a customer that contains the price, whether in XML or any other format, you should always check to make sure the customer hasn't modified the price before submitting it. However, you can implement these last checks with custom code.

Grammar-based formats

One characteristic that makes XML so fuzz-resistant is that the format is carefully and formally defined using a Backus-Naur Form (BNF) grammar. Many parsers are built directly from this grammar using parser-generator tools such as JavaCC or Bison. The nature of such a tool is to read an arbitrary input stream and determine whether or not it satisfies the grammar.

If XML is not appropriate for your file format, you can still get the robustness of a parser-based solution. You'll have to write your own grammar for the file format, however, and then develop your own parser to read it. Rolling your own is a lot more work than just using an off-the-shelf XML parser. Nonetheless, it is a far more robust solution than simply loading the data into memory without formally verifying it against a grammar.

Java code verification

Many of the crashes resulting from fuzz testing are direct results of memory allocation mistakes and buffer overflows. Writing your application in a safe, garbage-collected language that executes in a virtual machine such as Java or managed C# eliminates many potential problems. Even if you're writing your code in C or C++, you should use a reliable garbage-collection library. In 2006, no desktop or server programmers should be managing their own memory.

The Java runtime features an additional layer of protection for its own code. Before a .class file is loaded into the virtual machine, it is verified by a bytecode verifier and optionally a SecurityManager. Java does not assume that the compiler that created the .class file was non-buggy or behaving correctly. The Java language was designed from day one to allow the execution of untrusted, potentially malicious code in a secure sandbox. It doesn't even trust the code it, itself, has compiled. After all, someone could have changed the bytecode manually with a hex editor to attempt to trigger a buffer overflow. We should all have this level of paranoia about input to our programs.

Think like the enemy

Each of the previous techniques goes a long way toward preventing accidental damage. Taken together and implemented properly, they reduce the chance of undetected, unintentional damage to essentially zero. (Well, not quite zero, but of the same order of magnitude as the chance of a stray cosmic ray causing your CPU to add 1+1 and come up with 3.) But not all data corruption is unintentional. What if someone is deliberately introducing bad data in the hopes of breaching your program's security? Thinking like a cracker is the next step in defending your code.

Switching back to an attacker's mode of thinking, let's suppose the application you're attacking is written in the Java programming language, uses non-native code, and stores all external data in XML, which is thoroughly validated before acceptance. Can you still attack it successfully? Yes, you can. A naive approach that randomly changes bytes in the file is unlikely to succeed, however. You need a more sophisticated approach that accounts for the program's built-in error-detection mechanisms and routes around them.

When you're testing a fuzz-resistant application, you can't do pure blackbox testing, but with some obvious modifications, the basic ideas still apply. For example, consider checksums. If a file format contains a checksum, then you simply modify the checksum so it matches your random data before passing the file to the application.

For XML, try fuzzing individual element content and attribute values rather than selecting a random section of the bytes in the document to replace. Be careful to replace data with legal XML characters rather than random bytes, because even a hundred bytes of random data is almost certain to be malformed. You can change element and attribute names too, as long as you're careful to make sure the resulting document is still well formed. If the XML document is checked against a very restrictive schema, you'll need to figure out what the schema isn't checking to determine where you can productively fuzz.

A really restrictive schema combined with code-level verification of the remaining data may leave you with no room to maneuver. As a developer, this is what you should strive for. The application should be able to process meaningfully any stream of bytes you send it that it does not reject as de jure invalid.

In conclusion

Fuzz testing can demonstrate the presence of bugs in a program. It doesn't prove that no such bugs exist. Nonetheless, passing a fuzz test greatly improves your confidence that an application is robust and secure against unexpected input. If you've fuzzed your program for 24 hours and it's still standing, then it's unlikely that further attacks of the same sort will compromise it. (Not impossible, mind you, just less likely.) If fuzz testing does reveal bugs in your programs, you should fix them. Rather than plugging random bugs as they appear, it may be more productive to fundamentally harden the file format through the judicious use of checksums, XML, garbage collection, and/or grammar-based file formats.

Fuzz testing is a crucial tool for identifying real errors in programs, and one that all security-aware and robustness-oriented programmers should have in their toolboxes.

Downloadable resources

Related topics

Zone=Java development
ArticleTitle=Fuzz testing