Tip

Validation and the SAX ErrorHandler interface

How to turn on validation and error handling in a SAX-based parser

XML validation is the cornerstone of good document authoring. The key to giving meaning to an XML document -- and the crux of validation -- lies in the set of constraints that governs that document, and in ensuring that those constraints are followed. As an example, the element page takes on a different meaning when only one page element is allowed (as in representing a single page of content) than it does when many page elements are allowed (as in a lengthy novel with hundreds of pages). A DTD or an XML Schema, together with a validating parser, makes a document usable across applications. You can easily check a document against its constraints, and so give this meaning to one or more XML documents, by using SAX, the Simple API for XML (see Related topics).
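To make the page example concrete, here is a minimal sketch that checks the same content against two different sets of constraints. The book root element, the two inline DTDs, and the PageConstraintDemo class are hypothetical names I made up for illustration; the validation feature and the error handling it relies on are the ones explained in the rest of this tip. Note that XMLReaderFactory is deprecated in recent JDKs but still works there.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

class PageConstraintDemo {
    // Hypothetical DTD: a book holds exactly one page
    static final String ONE_PAGE_DTD =
        "<!DOCTYPE book [ <!ELEMENT book (page)> <!ELEMENT page (#PCDATA)> ]>";
    // Hypothetical DTD: a book holds one or more pages
    static final String MANY_PAGES_DTD =
        "<!DOCTYPE book [ <!ELEMENT book (page+)> <!ELEMENT page (#PCDATA)> ]>";
    // The same two-page content is checked against both DTDs
    static final String CONTENT = "<book><page>one</page><page>two</page></book>";

    // Returns true if the document parses and satisfies the constraints in its DTD
    static boolean isValid(String xml) {
        try {
            XMLReader reader = XMLReaderFactory.createXMLReader();
            reader.setFeature("http://xml.org/sax/features/validation", true);
            // DefaultHandler ignores validity errors unless error() is overridden
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // A validity error (or a well-formedness error) stopped the parse
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("one page allowed:   " + isValid(ONE_PAGE_DTD + CONTENT));
        System.out.println("many pages allowed: " + isValid(MANY_PAGES_DTD + CONTENT));
    }
}
```

The two-page content should be rejected under the first DTD and accepted under the second, even though the markup itself never changes. Only the constraints differ.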

In XML parsers, validation is usually turned off by default because many XML authors are not writing constraints; leaving it off helps to avoid lengthy processing in production environments. To turn on validation, you must request it explicitly. In this tip, I show you how to do that using the SAX API. Because SAX is event driven, you'll want to be notified of, and react to, any errors that occur during validation. You can do this by using the SAX ErrorHandler interface, and I'll show you how.

Setting SAX features

Setting a SAX feature is the key to validation in SAX. You do this through the SAX 2.0 method setFeature(), which takes two arguments: a URI that identifies the feature to set and a Boolean value (true to turn the feature on, false to turn it off). In Related topics I refer you to an online list of SAX-defined URIs. The feature that you and I are interested in is listed on that page. Its URI is http://xml.org/sax/features/validation and, as I mentioned earlier, it is usually turned off by default in parsers. To request validation in an XML parser, you simply set this feature to true, as shown in Listing 1.

Listing 1. Requesting validation
import java.io.IOException;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
public class ValidateXML {
    public static void main(String[] args) {
        try {
            // Create a new XML parser
            XMLReader reader = XMLReaderFactory.createXMLReader();
            // Request validation
            reader.setFeature("http://xml.org/sax/features/validation", true);
            // Parse the file named by the first command-line argument
            reader.parse(args[0]);
        } catch (SAXException e) {
            System.out.println("Error: " + e.getMessage());
            e.printStackTrace();
        } catch (IOException e) {
            // parse() also throws IOException when the file cannot be read
            System.out.println("I/O error: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
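
Not every parser can honor every feature request. setFeature() throws SAXNotRecognizedException when the parser doesn't know the feature URI and SAXNotSupportedException when it knows the feature but can't set the requested value; both are subclasses of SAXException. Here's a minimal sketch that tells those cases apart. The FeatureCheck class and trySetFeature() helper are hypothetical names of my own, and the example.org URI is deliberately bogus.

```java
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

class FeatureCheck {
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    // Returns true only if a newly created parser accepted the requested feature value
    static boolean trySetFeature(String uri, boolean value) {
        try {
            XMLReader reader = XMLReaderFactory.createXMLReader();
            reader.setFeature(uri, value);
            // Read the value back to confirm that the request took effect
            return reader.getFeature(uri) == value;
        } catch (SAXNotRecognizedException e) {
            System.out.println("Parser does not recognize feature: " + uri);
            return false;
        } catch (SAXNotSupportedException e) {
            System.out.println("Parser cannot set " + uri + " to " + value);
            return false;
        } catch (SAXException e) {
            // No parser could be created at all
            System.out.println("Could not create a parser: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        if (!trySetFeature(VALIDATION, true)) {
            System.out.println("This parser cannot validate; choose another one.");
        }
        // A made-up feature URI that no parser is expected to recognize
        trySetFeature("http://example.org/features/no-such-feature", true);
    }
}
```

Checking up front like this lets an application fail fast with a clear message, rather than quietly parsing without validation.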

Getting notification through ErrorHandler

After you make the changes in Listing 1, the parser validates documents, but you might never hear about the problems it finds because the code doesn't provide a way to report errors. When a validation error occurs -- for example, a disallowed element is found -- the parser makes a SAX callback. But if you don't write code to do something in that callback, nothing gets reported to your code or to the application client. To take care of that, implement the org.xml.sax.ErrorHandler interface. The interface has three methods, warning(), error(), and fatalError(), each of which receives a SAXParseException that describes the problem; validity errors are reported through error(). Listing 2 adds an error handler class to the source shown in Listing 1 and registers that error handler with the parser.

Listing 2. Creating and registering an ErrorHandler
import java.io.IOException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
public class ValidateXML {
    public static void main(String[] args) {
        try {
            // Create a new XML parser
            XMLReader reader = XMLReaderFactory.createXMLReader();
            // Request validation
            reader.setFeature("http://xml.org/sax/features/validation", true);
            // Register the error handler
            reader.setErrorHandler(new MyErrorHandler());
            // Parse the file named by the first command-line argument
            reader.parse(args[0]);
        } catch (SAXException e) {
            System.out.println("Error: " + e.getMessage());
            e.printStackTrace();
        } catch (IOException e) {
            // parse() also throws IOException when the file cannot be read
            System.out.println("I/O error: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
class MyErrorHandler implements ErrorHandler {
    public void warning(SAXParseException exception) throws SAXException {
        // Bring things to a crashing halt
        System.out.println("**Parsing Warning**\n" +
                           "  Line:    " +
                              exception.getLineNumber() + "\n" +
                           "  URI:     " +
                              exception.getSystemId() + "\n" +
                           "  Message: " +
                              exception.getMessage());
        throw new SAXException("Warning encountered");
    }
    public void error(SAXParseException exception) throws SAXException {
        // Bring things to a crashing halt
        System.out.println("**Parsing Error**\n" +
                           "  Line:    " +
                              exception.getLineNumber() + "\n" +
                           "  URI:     " +
                              exception.getSystemId() + "\n" +
                           "  Message: " +
                              exception.getMessage());
        throw new SAXException("Error encountered");
    }
    public void fatalError(SAXParseException exception) throws SAXException {
        // Bring things to a crashing halt
        System.out.println("**Parsing Fatal Error**\n" +
                           "  Line:    " +
                              exception.getLineNumber() + "\n" +
                           "  URI:     " +
                              exception.getSystemId() + "\n" +
                           "  Message: " +
                              exception.getMessage());
        throw new SAXException("Fatal Error encountered");
    }
}

This is a bit of an extremist's implementation of ErrorHandler as it brings things to a crashing halt when any problems arise. Instead of gracefully returning an error code to the parent application, I print the error to the screen and bail out of the code. You would probably want a more graceful solution in your production applications. But these three methods are very helpful in letting you know exactly what the problem is and where that problem occurred. And that's all there is to using ErrorHandler. Turn on the validation feature, register an error handler, and boom! You're validating XML with SAX.
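
If you'd rather not halt on the first problem, one possible shape for a gentler handler is sketched below. It records warnings and validity errors in a list and lets the parse continue, stopping only on fatal (well-formedness) errors. The CollectingErrorHandler and GracefulValidation classes, the check() method, and the message format are all hypothetical; treat this as a starting point under those assumptions, not a finished production design.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

class CollectingErrorHandler implements ErrorHandler {
    private final List<String> problems;

    CollectingErrorHandler(List<String> problems) {
        this.problems = problems;
    }

    // Warnings and validity errors are recorded; parsing continues
    public void warning(SAXParseException e) {
        problems.add(describe("warning", e));
    }
    public void error(SAXParseException e) {
        problems.add(describe("error", e));
    }
    // A fatal error means the document is not well formed, so stop here
    public void fatalError(SAXParseException e) throws SAXException {
        problems.add(describe("fatal", e));
        throw e;
    }
    private static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber() + ": " + e.getMessage();
    }
}

class GracefulValidation {
    // Returns every problem found; an empty list means the document is valid
    static List<String> check(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            XMLReader reader = XMLReaderFactory.createXMLReader();
            reader.setFeature("http://xml.org/sax/features/validation", true);
            reader.setErrorHandler(new CollectingErrorHandler(problems));
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // Already recorded by fatalError(); nothing more to add
        } catch (SAXException | IOException e) {
            problems.add("parse failed: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        // Hypothetical document: chapter is not allowed by the inline DTD
        String xml = "<!DOCTYPE book [ <!ELEMENT book (page)> "
                   + "<!ELEMENT page (#PCDATA)> ]>\n<book><chapter/></book>";
        for (String problem : check(xml)) {
            System.out.println(problem);
        }
    }
}
```

The calling application gets a list it can log, display, or turn into an error code, which is closer to the graceful behavior you'd want in production.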


Related topics


ArticleTitle=Tip: Validation and the SAX ErrorHandler interface
publish-date=06012001