Note: This tip uses JAXP. The classes are also part of the Java 2 SDK 1.4, so if you have this version installed, you don't need any additional software. The tip briefly covers the basics of SAX, but it assumes that you already understand the basics of both Java and XML.
This tip looks at an application that determines which employees to notify of a particular emergency situation, and then acts accordingly. (The actual contact is left as an exercise for the reader.) The source document in Listing 1 simply lists employees, their department, and their status:
Listing 1. The source document
<?xml version="1.0"?>
<personnel>
   <employee empid="332" deptid="24" shift="night"
         status="contact">
      Jenny Berman
   </employee>
   <employee empid="994" deptid="24" shift="day"
         status="donotcontact">
      Andrew Fule
   </employee>
   <employee empid="948" deptid="3" shift="night"
         status="contact">
      Anna Bangle
   </employee>
   <employee empid="1032" deptid="3" shift="day"
         status="contact">
      David Baines
   </employee>
</personnel>
A SAX application consists of two parts. The main application creates an XMLReader that actually parses the document, sending events such as startElement and endDocument to a content handler. You can send errors to a separate error handler object. The handler objects receive these events and act on them.
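To see the event stream for yourself, a small logging handler can help. The following is only a sketch: the EventLogger class and its log() method are made up for illustration and are not part of the tip's source code. It also assumes a current JDK, so it obtains the XMLReader through JAXP's SAXParserFactory rather than naming a parser class.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration; it is not part of the tip's source code.
class EventLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("startElement:" + localName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement:" + localName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    // Parses the given XML text and returns the events in the order they arrived.
    static List<String> log(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is filled in
            XMLReader reader = factory.newSAXParser().getXMLReader();
            EventLogger logger = new EventLogger();
            reader.setContentHandler(logger);
            reader.parse(new InputSource(new StringReader(xml)));
            return logger.events;
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
    }

    public static void main(String[] args) {
        for (String event : log("<personnel><employee empid=\"332\"/></personnel>")) {
            System.out.println(event);
        }
    }
}
```

The handler never asks the parser for anything. It only reacts to whatever events arrive, which is what makes it possible to put something else between the two later on.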
The main application can also act as either the content or the error handler (or both), but in Listing 2 they are three separate classes:
Listing 2. The main application
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.XMLReader;
import org.xml.sax.SAXException;
import org.xml.sax.InputSource;
import java.io.IOException;

public class MainSaxApp {

   public static void main (String[] args){
      try {
         // Parser implementation that does the actual parsing
         String parserClass = "org.apache.crimson.parser.XMLReaderImpl";
         XMLReader reader = XMLReaderFactory.createXMLReader(parserClass);

         // Events go to the content handler, errors to the error handler
         reader.setContentHandler(new DataProcessor());
         reader.setErrorHandler(new ErrorProcessor());

         InputSource file = new InputSource("employees.xml");
         reader.parse(file);
      } catch (IOException ioe) {
         System.out.println("IO Exception: "+ioe.getMessage());
      } catch(SAXException se) {
         System.out.println("SAX Exception: "+se.getMessage());
      }
   }
}
By setting the content handler for the reader to be a DataProcessor object, the application tells the reader to send its events to that object. In Listing 3, the DataProcessor is simple, checking only for the name of the element and the status of employees before determining whether to contact them:
Listing 3. The content handler
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.Attributes;

public class DataProcessor extends DefaultHandler
{
   public void startElement (String namespaceUri, String localName,
                    String qualifiedName, Attributes attributes) {
      if (localName.equals("employee")){
         // Comparing from the literal avoids a NullPointerException
         // when an employee has no status attribute
         if ("contact".equals(attributes.getValue("status"))){
            System.out.println("Contacting employee "+
                              attributes.getValue("empid"));
            //Implement actual contact here
         }
      }
   }
}
The ErrorProcessor class is trivial, and is included in the source code for this tip. (See Resources to download the source code.)
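If you don't have the download at hand, a stand-in along the following lines should be enough to compile and run Listing 2. This is only a sketch, not the author's class: the describe() helper and the decision to rethrow fatal errors are assumptions made for illustration.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Stand-in for the ErrorProcessor in the download; not the author's code.
class ErrorProcessor implements ErrorHandler {

    // Hypothetical helper that builds a one-line description of a problem.
    static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber() + ": " + e.getMessage();
    }

    // Warnings and recoverable errors are reported, and parsing continues.
    public void warning(SAXParseException e) {
        System.err.println(describe("Warning", e));
    }

    public void error(SAXParseException e) {
        System.err.println(describe("Error", e));
    }

    // Fatal errors are reported and then rethrown so the caller can handle them.
    public void fatalError(SAXParseException e) throws SAXException {
        System.err.println(describe("Fatal error", e));
        throw e;
    }
}
```

Because fatalError() rethrows, the SAXException catch block in the main application still gets a chance to report the problem.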
When the application runs, the output includes all of the employees with a status attribute of contact, no matter which department they work in:
Contacting employee 332
Contacting employee 948
Contacting employee 1032
So far the application contacts all employees that are listed as on duty regardless of their department, and it works well (or at least, we can hope so!). When you receive a new requirement to contact only employees in a particular department, you have two options:
- Change the content handler and risk all sorts of new bugs.
- Change the data that comes to the content handler so that only the appropriate employees are seen as on duty.
Because other requirements are also likely to be added later, it makes more sense to take the second option and implement each requirement separately from the content handler.
A SAX filter sits between a parser and a content handler. It receives events from the parser and, unless instructed otherwise, passes them on to the content handler unchanged. For example, consider this filter in Listing 4:
Listing 4. A simple XML filter
import org.xml.sax.helpers.XMLFilterImpl;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
public class DataFilter extends XMLFilterImpl
{
}
The XMLFilterImpl class includes methods that simply pass the data on unchanged. Inserting the filter into the stream within the main application is all that's necessary (see Listing 5):
Listing 5. Inserting the filter into the main application
...
         XMLReader reader = XMLReaderFactory.createXMLReader(parserClass);

         DataFilter filter = new DataFilter();
         filter.setParent(reader);
         filter.setContentHandler(new DataProcessor());
         filter.setErrorHandler(new ErrorProcessor());

         filter.parse("employees.xml");
      } catch (IOException ioe) {
...
The application creates the XMLReader as usual, but it's actually the filter that initiates the parse of the file, receiving the events from its parent, the XMLReader. (The call to setParent() is what establishes that relationship.) The filter passes the events on to its content handler -- the same DataProcessor object used in the original version.
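XMLFilterImpl also provides a constructor that takes the parent reader, so a filter can be wired up in a single step. The sketch below uses that constructor and checks that an empty filter leaves the result untouched. The PassThroughDemo, PassThroughFilter, and IdCollector names are invented for this example, and the reader comes from JAXP's SAXParserFactory on the assumption that you are running a current JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class PassThroughDemo {

    // Hypothetical filter that overrides nothing, so every event passes through.
    static class PassThroughFilter extends XMLFilterImpl {
        PassThroughFilter(XMLReader parent) {
            super(parent); // same effect as calling setParent(parent) afterwards
        }
    }

    // Collects the empid of every employee whose status is "contact".
    static class IdCollector extends DefaultHandler {
        final List<String> ids = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("employee".equals(localName) && "contact".equals(atts.getValue("status"))) {
                ids.add(atts.getValue("empid"));
            }
        }
    }

    // Parses the XML text, optionally through the filter, and returns the ids.
    static List<String> contacts(String xml, boolean useFilter) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            if (useFilter) {
                reader = new PassThroughFilter(reader); // a filter is itself an XMLReader
            }
            IdCollector collector = new IdCollector();
            reader.setContentHandler(collector);
            reader.parse(new InputSource(new StringReader(xml)));
            return collector.ids;
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
    }
}
```

Because the filter implements XMLReader, the rest of the code cannot tell whether it is talking to the parser or to the filter.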
So far, the filter just passes the events on unchanged, so running the application still produces this:
Contacting employee 332
Contacting employee 948
Contacting employee 1032
With the filter in place, however, you can easily make changes without touching the main application. For example, in Listing 6, the filter can eliminate all employees that are not in department 24 by simply setting everyone else's status to donotcontact:
Listing 6. Filtering data
...
import org.xml.sax.helpers.AttributesImpl;

public class DataFilter extends XMLFilterImpl
{
   public void startElement (String namespaceUri, String localName,
                     String qualifiedName, Attributes attributes)
                                          throws SAXException
   {
      // Work on a copy so the original attributes stay untouched
      AttributesImpl attributesImpl = new AttributesImpl(attributes);
      if (localName.equals("employee")){
         if (!"24".equals(attributes.getValue("deptid"))){
            // Look the attribute up by name rather than relying on its position
            int statusIndex = attributesImpl.getIndex("status");
            if (statusIndex >= 0){
               attributesImpl.setValue(statusIndex, "donotcontact");
            }
         }
      }
      super.startElement(namespaceUri, localName, qualifiedName, attributesImpl);
   }
}
In this case, you're overriding the startElement() method defined in XMLFilterImpl. It still passes on the event, but if the employee is not in department 24, the filter passes it on with an altered Attributes object that lists the employee as do not contact.
The DataProcessor object has no idea that the data has been manipulated. It simply knows that some employees should be contacted and others shouldn't. Processing now produces a different result:
Contacting employee 332
This tip has demonstrated a simple way to alter the processing of a SAX application using an XML filter. In this case, the filter's behavior is fixed in advance, but you can build an application to accommodate different situations by choosing filter behavior at run-time. You might accomplish this by replacing the DataFilter class, by passing a parameter at run-time, or even by using a factory to create the filter class in the first place.
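As one illustration of the run-time parameter approach, the department could be handed to the filter's constructor instead of being hard-coded. This is a sketch under assumptions rather than part of the tip's source code: DepartmentFilter, DepartmentFilterDemo, and the SAMPLE string (a condensed copy of Listing 1) are invented here, and the reader is obtained through JAXP's SAXParserFactory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: employees outside the given department become "donotcontact".
class DepartmentFilter extends XMLFilterImpl {
    private final String deptId;

    DepartmentFilter(XMLReader parent, String deptId) {
        super(parent);
        this.deptId = deptId;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if ("employee".equals(localName) && !deptId.equals(atts.getValue("deptid"))) {
            AttributesImpl copy = new AttributesImpl(atts);
            int statusIndex = copy.getIndex("status");
            if (statusIndex >= 0) {
                copy.setValue(statusIndex, "donotcontact");
            }
            atts = copy;
        }
        super.startElement(uri, localName, qName, atts);
    }
}

class DepartmentFilterDemo {
    // Condensed copy of the data in Listing 1.
    static final String SAMPLE = "<personnel>"
        + "<employee empid=\"332\" deptid=\"24\" shift=\"night\" status=\"contact\"/>"
        + "<employee empid=\"994\" deptid=\"24\" shift=\"day\" status=\"donotcontact\"/>"
        + "<employee empid=\"948\" deptid=\"3\" shift=\"night\" status=\"contact\"/>"
        + "<employee empid=\"1032\" deptid=\"3\" shift=\"day\" status=\"contact\"/>"
        + "</personnel>";

    // Returns the ids of the employees that a content handler would contact.
    static List<String> contacts(String xml, String deptId) {
        final List<String> ids = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader filter = new DepartmentFilter(factory.newSAXParser().getXMLReader(), deptId);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes atts) {
                    if ("employee".equals(localName)
                            && "contact".equals(atts.getValue("status"))) {
                        ids.add(atts.getValue("empid"));
                    }
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String deptId = args.length > 0 ? args[0] : "24"; // department chosen at run-time
        for (String id : contacts(SAMPLE, deptId)) {
            System.out.println("Contacting employee " + id);
        }
    }
}
```

Run without arguments, the demo uses department 24. Passing 3 on the command line selects the other department without any change to the filter or the content handler.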
A SAX application can also chain filters together so that the output of one filter is used as the input for another, allowing for complex programming in modular chunks.
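A chain is built by giving each filter the previous one as its parent. The sketch below strings together two instances of a made-up RequireAttributeFilter, one for the department and one for the shift. All class names and the SAMPLE string (a condensed copy of Listing 1) are assumptions for illustration, and the parser is obtained through JAXP's SAXParserFactory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class FilterChainDemo {
    // Condensed copy of the data in Listing 1.
    static final String SAMPLE = "<personnel>"
        + "<employee empid=\"332\" deptid=\"24\" shift=\"night\" status=\"contact\"/>"
        + "<employee empid=\"994\" deptid=\"24\" shift=\"day\" status=\"donotcontact\"/>"
        + "<employee empid=\"948\" deptid=\"3\" shift=\"night\" status=\"contact\"/>"
        + "<employee empid=\"1032\" deptid=\"3\" shift=\"day\" status=\"contact\"/>"
        + "</personnel>";

    // Hypothetical filter: an employee whose named attribute does not have the
    // required value is passed on with status "donotcontact".
    static class RequireAttributeFilter extends XMLFilterImpl {
        private final String name;
        private final String required;

        RequireAttributeFilter(XMLReader parent, String name, String required) {
            super(parent);
            this.name = name;
            this.required = required;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            if ("employee".equals(localName) && !required.equals(atts.getValue(name))) {
                AttributesImpl copy = new AttributesImpl(atts);
                int statusIndex = copy.getIndex("status");
                if (statusIndex >= 0) {
                    copy.setValue(statusIndex, "donotcontact");
                }
                atts = copy;
            }
            super.startElement(uri, localName, qName, atts);
        }
    }

    // Collects the empid of every employee still marked "contact".
    static class ContactCollector extends DefaultHandler {
        final List<String> ids = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("employee".equals(localName) && "contact".equals(atts.getValue("status"))) {
                ids.add(atts.getValue("empid"));
            }
        }
    }

    static List<String> contacts(String xml, String deptId, String shift) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader parser = factory.newSAXParser().getXMLReader();

            // parser -> department filter -> shift filter -> content handler
            XMLReader byDept = new RequireAttributeFilter(parser, "deptid", deptId);
            XMLReader byShift = new RequireAttributeFilter(byDept, "shift", shift);

            ContactCollector collector = new ContactCollector();
            byShift.setContentHandler(collector);
            byShift.parse(new InputSource(new StringReader(xml)));
            return collector.ids;
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
    }
}
```

Each filter knows about one rule only, so adding or removing a requirement means adding or removing one link in the chain.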
| Name | Size | Download method |
|---|---|---|
| x-tipsaxfilter/saxfiltertipsourcecode.zip | | HTTP |
- Download saxfiltertipsourcecode.zip to get the source code for this tip.
- Get a good understanding of SAX with the Understanding SAX tutorial (developerWorks, September 2001).
- Check out the SAX specification.
- Download Java API for XML Processing (JAXP).
- You'll find plenty more XML resources on the developerWorks XML zone.
- IBM trial software: Build your next development project with trial software available for download directly from developerWorks.
- Find out how you can become an IBM Certified Developer in XML and related technologies.
- Want us to send you useful XML tips like this every week? Sign up for the developerWorks XML Tips newsletter.
Nicholas Chase has been involved in Web site development for companies such as Lucent Technologies, Sun Microsystems, Oracle, and the Tampa Bay Buccaneers. Nick has been a high school physics teacher, a low-level radioactive waste facility manager, an online science fiction magazine editor, a multimedia engineer, and an Oracle instructor. More recently, he was the Chief Technology Officer of Site Dynamics Interactive Communications in Clearwater, Florida, USA, and is the author of three books on Web development, including Java and XML From Scratch (Que) and the upcoming Primer Plus XML Programming (Sams). He loves to hear from readers and can be reached at nicholas@nicholaschase.com.




