
Tip: Validation and the SAX ErrorHandler interface

What to do to turn on validation and error handling in a SAX-based parser

Brett McLaughlin (brett@newinstance.com), Enhydra strategist, Lutris Technologies
Brett McLaughlin (brett@newinstance.com) works as Enhydra strategist at Lutris Technologies and specializes in distributed systems architecture. He is author of Java and XML (O'Reilly). He is involved in technologies such as Java servlets, Enterprise JavaBeans technology, XML, and business-to-business applications. Along with Jason Hunter, he founded the JDOM project, which provides a simple API for manipulating XML from Java applications. He is also an active developer on the Apache Cocoon project and the EJBoss EJB server as well as a co-founder of the Apache Turbine project.

Summary:  In this tip, Brett McLaughlin explores SAX's validation capabilities and explains how to turn XML document validation on and off. He also covers the ErrorHandler interface, which enables you to receive notification of errors in your applications and act on that notification. Code samples demonstrate how to request validation and how to create and register an error handler in SAX.


Date:  01 Jun 2001
Level:  Introductory

XML validation is the cornerstone of good document authoring. The key to giving meaning to an XML document -- and the crux of validation -- lies in the set of constraints that governs that document, and in ensuring that those constraints are followed. As an example, the element page takes on a different meaning when only one page element is allowed (as in representing a single page of content) than it does when many page elements are allowed (as in a lengthy novel with hundreds of pages). A DTD or an XML Schema plus a validating parser make a document usable across applications. Validating a document's constraints, and providing this meaning to one or more XML documents, can be achieved easily by using SAX, the Simple API for XML (see Resources).

In XML parsers, validation is usually turned off by default: many XML documents are written without constraints, and skipping validation avoids extra processing in production environments. To turn on validation, you must request it explicitly. In this tip, I show you how to do that using the SAX API. Because SAX is event driven, you'll want to be notified of, and react to, any errors that occur during validation. You can do this by using the SAX ErrorHandler interface, and I'll show you how.

Setting SAX features

Setting a SAX feature is the key to validation in SAX. This is done through the SAX 2.0 method setFeature(), which takes two arguments: a URI (passed as a String) that identifies the feature, and a boolean value that turns the feature on (true) or off (false). If the parser does not recognize the URI, or cannot set the feature to the requested value, setFeature() throws a SAXNotRecognizedException or a SAXNotSupportedException, respectively. In Resources I refer you to an online list of SAX-defined URIs. The feature that you and I are interested in is listed on that page. Its URI is http://xml.org/sax/features/validation and, as I mentioned earlier, it is usually turned off by default in parsers. To request validation in an XML parser, you simply need to set the value of this feature to true, as shown in Listing 1.


Listing 1. Requesting validation
import java.io.IOException;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class ValidateXML {
    public static void main(String[] args) {
        try {
            // Create a new XML parser
            XMLReader reader = XMLReaderFactory.createXMLReader();
            // Request validation
            reader.setFeature("http://xml.org/sax/features/validation", true);
            // Parse the file named by the first command-line argument
            reader.parse(args[0]);
        } catch (SAXException e) {
            System.out.println("Error: " + e.getMessage());
            e.printStackTrace();
        } catch (IOException e) {
            // parse() also throws IOException when the document cannot be read
            System.out.println("I/O error: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
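
Because a parser is free to reject a feature request, it can be useful to confirm that validation was really switched on. The following is a minimal sketch of that check, not part of the original listing: the class name FeatureCheck, the helper trySetFeature(), and the constant VALIDATION_FEATURE are hypothetical names chosen for illustration. It relies only on the standard setFeature() and getFeature() methods of XMLReader. Note that XMLReaderFactory has been deprecated since Java 9, so the sketch suppresses that warning.

```java
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class FeatureCheck {
    // The SAX-defined URI for the validation feature
    static final String VALIDATION_FEATURE = "http://xml.org/sax/features/validation";

    /**
     * Hypothetical helper: requests a feature and reads it back.
     * Returns true only if the parser accepted the request and
     * now reports the requested value.
     */
    static boolean trySetFeature(XMLReader reader, String featureUri, boolean value) {
        try {
            reader.setFeature(featureUri, value);
            return reader.getFeature(featureUri) == value;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // The parser does not know the feature, or cannot set it to this value
            System.err.println("Feature not available: " + featureUri);
            return false;
        }
    }

    @SuppressWarnings("deprecation") // XMLReaderFactory is deprecated since Java 9
    public static void main(String[] args) throws SAXException {
        XMLReader reader = XMLReaderFactory.createXMLReader();
        boolean enabled = trySetFeature(reader, VALIDATION_FEATURE, true);
        System.out.println("Validation enabled: " + enabled);
    }
}
```

Returning a boolean rather than letting the exception propagate is just one design choice; an application that cannot proceed without validation might prefer to let the SAXException surface instead.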


Getting notification through ErrorHandler

After making the changes per Listing 1, the parser will perform validation on documents, but you might not hear about any problems it encounters because this code doesn't provide a means to report errors. When a validation error occurs -- for example, a disallowed element is found -- the parser triggers a SAX callback. But if you don't write code to do something in that callback, nothing will get reported to your code or to the application client. To take care of that, implement the org.xml.sax.ErrorHandler interface. The interface has three methods, warning(), error(), and fatalError(), each of which receives a SAXParseException describing the problem. Validation errors are reported through error(); fatalError() is reserved for problems, such as a document that is not well formed, after which the parser cannot continue normally. Listing 2 adds a class that implements the interface to the source shown in Listing 1 and registers that error handler with the parser.


Listing 2. Creating and registering an ErrorHandler
import java.io.IOException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class ValidateXML {
    public static void main(String[] args) {
        try {
            // Create a new XML parser
            XMLReader reader = XMLReaderFactory.createXMLReader();
            // Request validation
            reader.setFeature("http://xml.org/sax/features/validation", true);
            // Register the error handler
            reader.setErrorHandler(new MyErrorHandler());
            // Parse the file named by the first command-line argument
            reader.parse(args[0]);
        } catch (SAXException e) {
            System.out.println("Error: " + e.getMessage());
            e.printStackTrace();
        } catch (IOException e) {
            // parse() also throws IOException when the document cannot be read
            System.out.println("I/O error: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

class MyErrorHandler implements ErrorHandler {
    public void warning(SAXParseException exception) throws SAXException {
        // Report the problem, then bring things to a crashing halt
        System.out.println("**Parsing Warning**\n" +
                           "  Line:    " +
                              exception.getLineNumber() + "\n" +
                           "  URI:     " +
                              exception.getSystemId() + "\n" +
                           "  Message: " +
                              exception.getMessage());
        throw new SAXException("Warning encountered");
    }

    public void error(SAXParseException exception) throws SAXException {
        // Report the problem, then bring things to a crashing halt
        System.out.println("**Parsing Error**\n" +
                           "  Line:    " +
                              exception.getLineNumber() + "\n" +
                           "  URI:     " +
                              exception.getSystemId() + "\n" +
                           "  Message: " +
                              exception.getMessage());
        throw new SAXException("Error encountered");
    }

    public void fatalError(SAXParseException exception) throws SAXException {
        // Report the problem, then bring things to a crashing halt
        System.out.println("**Parsing Fatal Error**\n" +
                           "  Line:    " +
                              exception.getLineNumber() + "\n" +
                           "  URI:     " +
                              exception.getSystemId() + "\n" +
                           "  Message: " +
                              exception.getMessage());
        throw new SAXException("Fatal Error encountered");
    }
}

This is a bit of an extremist's implementation of ErrorHandler as it brings things to a crashing halt when any problems arise. Instead of gracefully returning an error code to the parent application, I print the error to the screen and bail out of the code. You would probably want a more graceful solution in your production applications. But these three methods are very helpful in letting you know exactly what the problem is and where that problem occurred. And that's all there is to using ErrorHandler. Turn on the validation feature, register an error handler, and boom! You're validating XML with SAX.
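
For comparison, here is one possible shape for a gentler handler. This is a sketch under stated assumptions rather than a recommended implementation: the names CollectingErrorHandler, CollectErrors, validate(), and INVALID_DOC are hypothetical. The handler records warnings and validation errors in a list so the caller can decide what to do with them, and it stops parsing only on a fatal error. To keep the example self-contained, it parses an in-memory document whose internal DTD allows exactly one page element inside book, while the document contains two.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

/** Hypothetical handler: records warnings and errors, halts only on fatal errors. */
class CollectingErrorHandler implements ErrorHandler {
    private final List<String> problems = new ArrayList<>();

    public void warning(SAXParseException e) {
        problems.add(describe("warning", e));
    }

    public void error(SAXParseException e) {
        // Validation errors arrive here; record them and let parsing continue
        problems.add(describe("error", e));
    }

    public void fatalError(SAXParseException e) throws SAXException {
        problems.add(describe("fatal", e));
        throw e; // rethrow so that parsing stops
    }

    List<String> getProblems() {
        return problems;
    }

    private static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber() + ": " + e.getMessage();
    }
}

public class CollectErrors {
    // Illustrative document: the DTD permits one <page>, the content has two
    static final String INVALID_DOC =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE book [\n"
        + "  <!ELEMENT book (page)>\n"
        + "  <!ELEMENT page (#PCDATA)>\n"
        + "]>\n"
        + "<book><page>one</page><page>two</page></book>\n";

    /** Parses the given XML text with validation on and returns the recorded problems. */
    @SuppressWarnings("deprecation") // XMLReaderFactory is deprecated since Java 9
    static List<String> validate(String xml) throws SAXException, IOException {
        XMLReader reader = XMLReaderFactory.createXMLReader();
        reader.setFeature("http://xml.org/sax/features/validation", true);
        CollectingErrorHandler handler = new CollectingErrorHandler();
        reader.setErrorHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.getProblems();
    }

    public static void main(String[] args) throws Exception {
        for (String problem : validate(INVALID_DOC)) {
            System.out.println(problem);
        }
    }
}
```

Collecting problems instead of throwing lets a single parse report every validity error in the document, which tends to be more useful to the application client than stopping at the first one. The exact message text comes from the parser and will vary between implementations.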


Resources
