The newer Java language APIs (JAXP, JAXB, JAX-WS, and more) parse XML so easily that XML parsing is now a fundamental aspect of Java programming. Potential problems arise when abstractions in the higher-level APIs cause a loss of fine-grained control over the interactions between parser and data. In this article, I'll show you how the Simple API for XML (SAX) delivers an easy-to-use vehicle to deal with those errors, one you can use even when you're not using SAX directly.
A crashing program is not error handling
Every application programmer is first and foremost an application user. Whether it's vi or emacs or DreamWeaver® or Adobe® Photoshop®, decisions about how to build an application are informed largely by experience with other applications. Given that, it's no surprise that most error handling in modern applications, especially in Web applications, is a useless screen with numbers, strange letters that don't spell out words in any dictionary, and if you're lucky, an apology with a little bit of font formatting. That's a pretty poor means to handle problems in applications.
It's even worse when paired with the truism that errors will occur in your applications. Just as you find that unique way to crash Eclipse by launching simultaneous builds with different classpaths, users of the applications you program will manage to spin a thread off that never gets cleaned up, hit a servlet without getting data in the right request variable, or tax your MySQL® database with a ton of open connections.
And, when it comes to XML, users are notorious for stuffing the wrong data into a field or managing to fly by your validation with still-invalid data. When your programs consume XML from another company, the potential for errors is compounded. Now you're trusting another company, with programmers as overworked as you are, to get all the details of a data format right. It's in these cases, which are wide and varied, that your XML parsing can grind to a halt as it throws a vague exception. Simply wrapping everything in a big block like this:
try {
    // some interesting and complex XML processing code
} catch (Exception e) {
    System.err.println(e.getMessage());
}
isn't acceptable error-handling! All you'll manage to do with code like this is annoy your users, irritate your boss, and potentially anger everyone around you who has to work late to track down the problem when the CEO gets an irate e-mail from her biggest client.
First and foremost, you've got to actually write error handling code. While a
catch block with a System.err.println() statement might be considered error-handling in
the loosest sense of the word, it's a pretty poor approach. But error handling code
should do more than just report an error—quality error handling is:
- User-friendly.
- Not disruptive (unless it needs to be).
- Informative.
Error handling is user-friendly
More than anything, error handling code is for the users of your application. In fact, everything in your program is ultimately for your users; even your own debugging statements help you know what's going on so you can fix functionality, and functionality is for the user. Error handling code is no different.
Of course, the term "user" can take on a lot of different meanings, especially if you don't write consumer-facing code. If your application is a back-end system for transferring financial data between your company and a bank, then your user is probably some internal group in your company or the bank. If your code is simply groundwork for another group to use, then that other group is your customer. So the first thing you must figure out is: who is your customer?
Once you know whether your customer is a computer user in New Jersey or the Web developers on the third floor or the chairman of the New York Stock Exchange, you can write code that is friendly to that user (or user class). For a consumer, you need to provide a readable error message that doesn't involve programming terms. For a Web developer, you might provide contact information for your department or the systems administration group. For the CEO of a bank...well, the error handling better not interrupt them. In fact, before you worry too much about an error message, it's better to consider that not all error handling code has to report an error.
Error handling isn't disruptive unless it has to be
If you're driving to work and come to a major construction project that blocks the highway you need to take, you don't pull over, turn off the car, hang your head, and mentally start polishing your resume. That would be stupid. Instead, you take the next exit and figure out an alternate route. It might take you a few extra minutes to get to work and you might even have to call in and let someone know you'll be late for that 8 AM meeting. Still you make it to work, despite the construction problem.
The best error handling operates exactly like you do when you encounter a problem: It tries to find a way around the problem. Crashing, spitting out a "sorry, you're out of luck" message, or printing out a stack trace is not error handling. That is error reporting. Your job as a programmer is to do everything in your power to get your users to work even if construction is going on.
In the XML world, that means taking data that might be erroneous and trying to work with it anyway. Sometimes you can actually ignore certain low-level errors or log a message to a file without bringing an entire program to a crashing halt. Other times you might need to ask the user of your program (which you might recall can be another program and not a real human at all) for more or alternate information. In these cases, you try to keep processing and moving forward.
Only in a true catastrophic situation do you completely interrupt a user's work. If a file is completely missing, data is garbled beyond recovery, or an essential element is absent in an XML document, that is when you interrupt your user's work. It's equivalent to four or five highways all being flooded and the back roads being completely underwater. In other words, don't halt except in a real disaster.
That's where SAX can really help: It allows you to receive possibly bad data or error conditions before the program halts processing and gives you a chance to make course corrections.
Error handling is informative
Whatever you do when an error occurs, you must provide useful information. In the best case, that information is simply what the user of your program originally wanted, asked for, or intended to produce. In cases where you can't make a graceful recovery and have to make a change in processing, you need to indicate what's going on. In those rare situations where you have to completely abort processing, you should provide information that's helpful even in that case.
But the key here is to know your user. If your program is a business-to-business component that is called by some other processing component, a stack trace to a log file and a technical error message might be perfect. It gives detailed information to the programmers who are using the programs to interact. However, if your program is part of a customer-facing Web application, you cannot throw out stack traces and error codes. Instead, you need to provide a human-readable (that's not the same as programmer-readable!) message along with information about who to contact for further assistance. If that's not possible or not something you know, then make sure you bubble up an exception with helpful information so that calling programs can (hopefully) make similar smart decisions.
SAX is underneath most XML processing
The key to good XML-based error handling is largely tied to the SAX API, the Simple API for XML. That's not because SAX is inherently better than any other API nor that it is particularly well-suited to error processing—SAX is key simply because almost all XML processing involves SAX at some level.
The reason for this is pretty simple: SAX is blazing fast and has been around for a while. XML is fairly easy to work with, but it's not an intuitive language in lots of ways. XML can have lots of constructs and syntactic quirks and parsing is hard. For most parser and processor vendors, the idea of building a custom parser API to deal with XML, from textual data to elements to namespaces to entity references, is daunting if not completely undesirable. Instead those vendors (and even API writers) use SAX since it already works pretty well—SAX isn't great at a lot of things, but it is great at parsing XML. And for that reason, if you know how to handle errors in SAX, then you know how to handle errors in almost any XML processing API.
Now look at some underpinnings of SAX (complete with code examples). The first step is to get from whatever processing API that you use to SAX.
SAX parsing (obviously) uses SAX
If you're already a SAX veteran, then you don't have to do anything to get going with SAX; you're already using the API. Specifically, you're probably writing code that looks something like Listing 1, a fragment from a program that parses an XML document using SAX.
Listing 1. Parsing an XML document using SAX
XMLReader reader = XMLReaderFactory.createXMLReader();
ContentHandler handler = new PrintingContentHandler();
reader.setContentHandler(handler);
reader.parse(new InputSource(new FileReader(xmlFilename)));
If you've got code like this, you're well on your way to handling errors in your XML
processing. In fact, if you already know—through SAX or another API—how
to get access to an XMLReader implementation, then you've
taken a big first step toward handling errors smoothly.
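Listing 1 is only a fragment, so here is one way a complete, runnable version might look. Treat it as a sketch under assumptions: PrintingContentHandler isn't defined anywhere in this article, so its body below is invented for illustration; Listing1Sketch and elementNames() are hypothetical names; the XMLReader is obtained through JAXP's SAXParserFactory rather than XMLReaderFactory; and the document is read from a string instead of a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the PrintingContentHandler named in Listing 1
class PrintingContentHandler extends DefaultHandler {
    // Element names, in the order their start tags were seen
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        seen.add(qName);
        System.out.println("start: " + qName);
    }
}

class Listing1Sketch {
    // Parses the given XML text and returns the element names encountered
    static List<String> elementNames(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        PrintingContentHandler handler = new PrintingContentHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        elementNames("<contact><email>a@example.com</email></contact>");
    }
}
```

The important part is the reader variable: once you hold an XMLReader, registering an error handler is one more method call.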
Most DOM parsers are actually SAX parsers that build up a DOM tree. The DOM API itself doesn't expose an underlying SAX parser, but that's because the DOM API is largely about working with a DOM tree rather than creating that tree structure in the first place. Most parsers that implement the DOM provide at least a vendor-specific means of access to an underlying SAX parser.
For example, in Xerces the class you use to build a DOM tree is called org.apache.xerces.parsers.DOMParser. You can call the parse() method of that object and then get a DOM tree in the form of a Document object from it, as in the code fragment in Listing 2.
Listing 2. Creating a DOM tree using SAX
DOMParser parser = new DOMParser();
parser.parse(new InputSource(xmlFilename));
Document doc = parser.getDocument();
At a glance, this resembles the parsing process in SAX. In fact, the InputSource class (which is just one of a few ways to get a document
to the DOMParser instance) is a SAX construct. But things are
even more similar than they appear. If you pull open the Xerces-J API documentation or
trace the source code, you'll note that DOMParser extends
another Xerces class, org.apache.xerces.framework.XMLParser.
That class turns out to be the basis for both DOMParser
and the SAX parsing class in Xerces, SAXParser.
That's a pretty confusing set of sentences, but here's the bottom line: The XML parsing implementation that serves SAX parsing in Xerces-J is the foundation of the DOM parsing classes too. So while you can't trace the DOMParser class back to SAX's XMLReader, the same code that underpins Xerces-J's XMLReader implementation underpins DOMParser. And because of that, you have several really valuable methods available on DOMParser:
- setEntityResolver(): This method takes a SAX construct, EntityResolver, for handling entities in an XML document.
- setFeature(): This is another SAX-originated method that allows you to set DOM- and SAX-related features of the parser.
- setErrorHandler(): This is the key for error handling. This method accepts a SAX ErrorHandler implementation, allowing you the ability to intercept and respond to errors.
Even though you don't have direct access to a SAX XMLReader implementation, you can get to the SAX-specific methods that are the core of error handling in this article.
Most developers don't fool much with SAX or DOM directly anymore. Instead they use the JAXP API, the Java API for XML Processing. Listing 3 shows a fragment of code that uses JAXP for some SAX parsing.
Listing 3. SAX parsing using JAXP
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(false);
SAXParser parser = factory.newSAXParser();
parser.parse(new File(args[0]), new MyHandler());
This should look familiar; although the class names are a bit different and several options are set, this is not much different from the SAX parsing in Listing 1. But you can take these JAXP constructs and get even closer to SAX. Specifically, the JAXP SAXParser class provides a method called getXMLReader() that returns the underlying SAX XMLReader implementation:
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(false);
SAXParser parser = factory.newSAXParser();
XMLReader reader = parser.getXMLReader();
You can then work with the XMLReader instance as you would in a straight SAX application, which of course includes setting an error handler.
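To make that last step concrete, the sketch below obtains the XMLReader through JAXP and registers an error handler on it. CountingErrorHandler and JaxpReaderSketch are hypothetical names introduced only for this illustration, and the handler does nothing more than count what the parser reports.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical handler that only counts what the parser reports
class CountingErrorHandler implements ErrorHandler {
    int warnings, errors, fatalErrors;

    public void warning(SAXParseException e) { warnings++; }

    public void error(SAXParseException e) { errors++; }

    public void fatalError(SAXParseException e) throws SAXException {
        fatalErrors++;
        throw e; // a document that isn't well-formed can't be salvaged here
    }
}

class JaxpReaderSketch {
    // Parses the XML text and returns the handler so the caller can inspect the counts
    static CountingErrorHandler parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        XMLReader reader = factory.newSAXParser().getXMLReader();
        CountingErrorHandler handler = new CountingErrorHandler();
        reader.setErrorHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Already counted by the handler; a real application would react here
        }
        return handler;
    }

    public static void main(String[] args) throws Exception {
        // The <email> element is never closed, so a fatal error is expected
        CountingErrorHandler result = parse("<contact><email>a@example.com</contact>");
        System.out.println("fatal errors reported: " + result.fatalErrors);
    }
}
```

A well-formed document should leave all three counters at zero, while a document with an unclosed element should reach fatalError().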
The same is true when you use JAXP for DOM parsing, which provides an even easier DOM-to-SAX layering of APIs. Listing 4 shows a code fragment demonstrating the normal DOM parsing procedure.
Listing 4. DOM parsing using JAXP
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new File(args[0]));
Even though this is clearly DOM-based parsing, it's pretty easy to return to SAX. In
fact, before worrying about code to move from DOM to SAX, consider the more basic fact:
If you use JAXP, you've got SAX classes already (see Listing 3). So if you use JAXP, don't even worry about converting from DOM to SAX (something you can do and is detailed in a developerWorks tip referenced in Resources). Just rewrite a few lines of code to use SAXParser instead of DocumentBuilder and you're ready to add SAX-specific error handling.
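If you would rather stay with DOM, it is worth knowing that the JAXP DocumentBuilder class has its own setErrorHandler() method, which accepts the same SAX ErrorHandler interface. The following is a minimal sketch of that route, assuming a document that carries its own DTD; DomParseResult and DomErrorSketch are hypothetical names, not part of JAXP.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical holder for the DOM tree plus the number of recoverable errors seen
class DomParseResult {
    Document document;
    int errorCount;
}

class DomErrorSketch {
    static DomParseResult parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // validate against the DTD in the document
        DocumentBuilder builder = factory.newDocumentBuilder();
        final DomParseResult result = new DomParseResult();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) { result.errorCount++; }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        result.document = builder.parse(new InputSource(new StringReader(xml)));
        return result;
    }

    public static void main(String[] args) throws Exception {
        // <phone> is not declared in the DTD, so validation should complain
        String xml = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE contact [<!ELEMENT contact (email)><!ELEMENT email (#PCDATA)>]>"
                + "<contact><phone>555-0100</phone></contact>";
        DomParseResult result = parse(xml);
        System.out.println("recoverable errors: " + result.errorCount);
        System.out.println("root element: " + result.document.getDocumentElement().getNodeName());
    }
}
```

Because validity problems arrive through error() rather than fatalError(), the DOM tree is still built, and the calling code can decide how much to trust it.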
Higher-level APIs like JAXB use SAX, too
You should start to draw some conclusions about the proliferation of SAX. SAX is used for DOM processes, as well as JAXP parsing; it stands to reason then, that higher-level APIs that use DOM or model JAXP will use SAX as well. You might have to do a little more work with these APIs, but you can usually get to a SAX basis in those cases as well.
Another popular API that fits this mold is JAXB, Sun's Java API (or architecture, depending on the version you're using) for XML Binding. JAXB allows you to convert from an XML document to a Java object model and then convert that object model back into XML. The process of going from XML to Java is called unmarshalling and is essentially two steps:
- Parsing an XML document, retaining both the data in the document and the names of elements and attributes, as well as the relationships between the elements, attributes, and data.
- Converting that data into member variables and instances of a Java object model.
With parsing involved, you have the possibility of error and the near certainty that SAX is involved. That's the case with JAXB, and you can get to a SAX XMLReader, albeit a bit indirectly. Listing 5 shows how to take an Unmarshaller, the core class used for unmarshalling, and get finer-grained control over the SAX-based parsing process.
Listing 5. Getting to a SAX XMLReader from JAXB
JAXBContext context = JAXBContext.newInstance("dw.ibm");
Unmarshaller unmarshaller = context.createUnmarshaller();
// Get the lower level handler from the Unmarshaller
UnmarshallerHandler unmarshallerHandler =
unmarshaller.getUnmarshallerHandler();
// Now use JAXP to get a SAX parser
SAXParserFactory factory = SAXParserFactory.newInstance();
// Set options on the factory, using standard JAXP calls
factory.setNamespaceAware(true);
XMLReader reader = factory.newSAXParser().getXMLReader();
// We can use the handler from the unmarshaller as the content handler
reader.setContentHandler(unmarshallerHandler);
// Now parse, using the unmarshalling handler from JAXB
reader.parse(new InputSource(new FileReader(xmlDocument)));
MyCustomObject topObject = (MyCustomObject)unmarshallerHandler.getResult();
This is a little more convoluted than the previous examples, but is still pretty
straightforward. The big difference here is that instead of using the JAXB framework
directly through the Unmarshaller.unmarshal() method, you
obtain the handler from JAXB that has all the unmarshalling code in it; that's what this call does:
// Get the lower level handler from the Unmarshaller
UnmarshallerHandler unmarshallerHandler =
    unmarshaller.getUnmarshallerHandler();
Then the SAX parsing is handled manually with the returned handler as the means of dealing with content. This is where you'll insert your custom error handling code—which the next section covers in detail.
Finally, once parsing is complete, you return to the JAXB framework with this last call:
MyCustomObject topObject = (MyCustomObject)unmarshallerHandler.getResult();
The result is a fine-grained control over exactly what happens as unmarshalling takes place. That means that you can get your hands dirty, avoid or smoothly report errors, and provide a much better experience for the users of your application.
It would obviously be impossible to cover every XML API and detail how to get from that
API's highest levels of use to its SAX underpinnings, but you should already have a
pretty good idea of the sort of patterns to look for. Check out the API documentation
and look for classes that extend or implement org.xml.sax.XMLReader or take as an argument a SAX ErrorHandler (in the org.xml.sax package).
You can even just Google "[your API name] SAX" or "[your API name] ErrorHandler." You'll
be surprised at how easy it is to connect the dots between the API that you use and SAX.
Use the ErrorHandler interface to deal with errors
The heart of error handling in SAX is the org.xml.sax.ErrorHandler interface. It's a simple, three-method interface that you can implement to handle all types of errors. It's also easy to register with SAX, so handling errors is just a few lines of code away.
Three methods for three error types
The code for ErrorHandler is unbelievably simple. Listing 6
shows its entirety (minus comments).
Listing 6. The SAX ErrorHandler interface
package org.xml.sax;
public interface ErrorHandler {
    public void warning(SAXParseException exception) throws SAXException;
    public void error(SAXParseException exception) throws SAXException;
    public void fatalError(SAXParseException exception) throws SAXException;
}
That's all there is to it. Any problems in parsing are reported to one of these three methods and if you implement your own version of ErrorHandler, you're well on your way to custom error handling.
A warning in SAX is defined as any problem that is not considered an error or fatal error as defined by the XML 1.0 recommendation. That's pretty vague, but there's a better way of stating this: warnings are problems that don't prevent the parser from continuing to parse. The usual default actions for warnings are to completely ignore the warning (since it doesn't prevent parsing and processing) or to issue an informational message and continue on.
Problems with comments, unexpected values that can still be processed, and most minor nits that you wouldn't even think about show up here. Before you worry too much about what a warning is, remember the principles of good error handling mentioned earlier:
- Error handling is user-friendly.
- Error handling isn't disruptive unless it has to be.
- Error handling is informative.
Apply any (or all) of these to warnings and you'll quickly see that the default action of ignoring warnings is probably for the best. If you want to log a message to a debugging application or log file, that's suitable. Even then though, you're taking valuable processing time with something that by definition is non-critical. The best implementation of the warning() method usually looks something like Listing 7.
Listing 7. An ErrorHandler implementation with a warning() method filled in
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
public class DefaultErrorHandler implements ErrorHandler {
    public void warning(SAXParseException exception)
            throws SAXException {
        // Do nothing
    }
    public void error(SAXParseException exception) throws SAXException {
        // to be filled in later
    }
    public void fatalError(SAXParseException exception)
            throws SAXException {
        // to be filled in later
    }
}
Errors are the hardest XML problem to deal with. Warnings can be ignored or just logged. And fatal errors (which you'll learn more about shortly) require parsing to stop and that you take a serious set of actions. Errors (not the same as fatal errors) are a sort of nebulous in-the-middle problem. An error in SAX corresponds to an error in the XML 1.0 specification which has this pretty vague description:
A violation of the rules of this specification; results are undefined. Unless otherwise specified, failure to observe a prescription of this specification indicated by one of the keywords must, required, must not, shall and shall not is an error. Conforming software may detect and report an error and may recover from it.
Unfortunately, this tells you very little about what does cause errors. The only thing that's particularly notable here is the word "recover." In other words, your program should be able to recover from this sort of error rather than having to stop parsing and processing.
In practice, an error is reported when something goes wrong that has to do with content, usually unexpected, but not with the structure or well-formedness of an XML document. So what you really get with an error is indication that you might get an incomplete document or that some data in that parsed document might be lost, garbled, or inaccurate.
In terms of recovery, you can do very little by the time an error is reported. SAX
doesn't let you read ahead or behind, so gathering context about what came before from a SAXParseException is tough (at best). The nice thing, though, is that
your processing doesn't have to stop. So without worrying about what you can do,
there's something important you don't need to do—halt processing. Your
users—whether they're customers or other developers—don't need to see
that a problem occurred and in most cases will get usable results from your program.
So what do you do? First you should probably log information somewhere. While
you don't want to let users know that they might get flawed data, you should record some information about what happened. Avoid things like System.out and System.err, but logging to a file or using an API like Log4J (see Resources for links) is a great way to keep up with what happened and allow you or someone else to guard against the problem occurring again.
The other step you might want to take as you deal with errors through the error() method is to use what you know about your application's
business logic to make smart decisions. For instance, suppose your application stores
e-mails in an element named email nested within a contact-info element. If the error()
method indicates that the problem is related to the contact-info or email element, you've
potentially lost that e-mail address text data. You might let your calling program know
(through a non-interruptive approach) that an e-mail address might be missing. If you
write a customer-facing application and the information came through user input, you can even ask the user to re-enter their e-mail address using a simple Web form.
The biggest challenge as I talk about dealing with errors is to give concrete
examples...those examples depend completely on your business logic and domain. So you'll
have to come up with your own ideas about how to implement error(). The key is to not crash your program or throw an exception that stops processing. Remember, by definition an error (as opposed to a fatal error) shouldn't cause processing to stop.
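As one illustration of the idea (and only that, since your own business logic will dictate the details), the sketch below keeps track of which elements are open so that error() can record roughly where a problem surfaced, logs it, and lets parsing continue. RecoveringHandler, RecoverySketch, and the suspectElements list are hypothetical names, and java.util.logging stands in for Log4J so that the example has no outside dependencies.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.logging.Logger;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers the open elements so error() can say where it was
class RecoveringHandler extends DefaultHandler {
    private static final Logger LOG = Logger.getLogger("xml.errors");
    private final Deque<String> open = new ArrayDeque<>();
    final List<String> suspectElements = new ArrayList<>();
    boolean finished;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        open.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    @Override
    public void endDocument() {
        finished = true;
    }

    @Override
    public void error(SAXParseException e) {
        // Log for the developers, remember the location for the caller, keep parsing
        String where = open.isEmpty() ? "(document)" : open.peek();
        LOG.warning("line " + e.getLineNumber() + " near <" + where + ">: " + e.getMessage());
        suspectElements.add(where);
    }
}

class RecoverySketch {
    static RecoveringHandler parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        RecoveringHandler handler = new RecoveringHandler();
        // DefaultHandler serves as both the content handler and the error handler
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE contact-info [<!ELEMENT contact-info (email)>"
                + "<!ELEMENT email (#PCDATA)>]>"
                + "<contact-info><phone>555-0100</phone></contact-info>";
        RecoveringHandler result = parse(xml);
        System.out.println("suspect elements: " + result.suspectElements);
    }
}
```

After parsing, the caller can inspect suspectElements and, for instance, ask for an e-mail address again if contact-info shows up in the list. Exactly when a parser reports an error relative to the start and end callbacks is up to the implementation, so treat the recorded element as a hint rather than a precise location.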
The final type of problem in SAX is a fatal error. Defined in the XML 1.0
specification, these are problems that absolutely interfere and prevent parsing from
continuing. The most common example is a document that lacks well-formedness. In other
words, an opening firstName element is never closed or an
opening angle bracket is missing. Parsers can't recover in these situations because the
entire structure of the document is in question. SAX API documentation even goes as far
as to say that once a fatal error is reported, the application must assume that the document is unusable. So fatal errors are a pretty big deal.
At first glance, it might seem that you can triage fatal errors. For example, if </firstName> is missing, maybe there's something that looks
close enough to make an educated guess about. Perhaps there's a closing element named,
oddly, "girstName." That might very well be a typo. So many fatal errors appear to be pretty trivial to fix to a human eye.
The problem is, however, that SAX is a read-only, sequential parser. It doesn't read ahead and it doesn't keep up with what it's read already. So it is a virtual impossibility to look ahead for a potential typo or even to look back at what was just read. You'd have to write a handler to build what amounts to an in-memory buffer to keep up with what's already been read; then you'd need some code to read ahead as well. The overhead would be immense and at best, you'd still be guessing at what the original document author intended. So fatal errors fall under that second principle of good error handling: Error handling isn't disruptive...unless it has to be.
Given that, you're left with the third principle of error handling: Error handling is informative. With a genuine disruption, your job is to be informative. Here's an example of what not to do (it's not even in a code listing so people won't mistake it for a good practice):
public void fatalError(SAXParseException exception)
        throws SAXException {
    // typical, but terrible, error handling
    // Bring things to a crashing halt
    System.out.println("**Parsing Fatal Error**" + "\n" +
            " Line: " +
            exception.getLineNumber() + "\n" +
            " URI: " +
            exception.getSystemId() + "\n" +
            " Message: " +
            exception.getMessage());
    throw new SAXException("Fatal Error encountered");
}
This might be informative to a programmer, but to anyone else it's gibberish. You'll
devise your own approach to deal with these exceptions because your application is
different; however, you must pass information back to your calling program in the form
of a SAXException so you do have some limitations. For
example, you probably don't want to take direct control of the user experience but
instead pass something to a calling class that you can use to provide feedback to the user.
Send useful errors through a custom exception class
The easiest way to provide meaningful error information to your program is to build
your own exception that you can pass to SAXException. SAXException is the exception that all three methods of ErrorHandler can throw, and it's basically your only means to communicate with calling programs. By default, SAXException only provides a few methods:
- getException(): This method returns an Exception, which allows you to nest exceptions for later retrieval. You can stuff your own custom exception in here and pass on information to calling programs.
- getMessage(): You can put in a human-readable message here, although that's a fairly raw approach to sending along good error information.
- toString(): This overrides the superclass's toString() to print out embedded exception information.
Suppose that you are putting together an XML processing component for use in Web programs. You might define a custom class like that in Listing 8.
Listing 8. Custom exception class for reporting to a Web app
public class WebException extends Exception {
    private int httpStatusCode = 400;
    private String redirectURL;
    public WebException(String message, int httpStatusCode, String redirectURL) {
        super(message);
        this.httpStatusCode = httpStatusCode;
        this.redirectURL = redirectURL;
    }
    public WebException(String message, Throwable cause,
            int httpStatusCode, String redirectURL) {
        super(message, cause);
        this.httpStatusCode = httpStatusCode;
        this.redirectURL = redirectURL;
    }
    public int getHttpStatusCode() {
        return httpStatusCode;
    }
    public String getRedirectURL() {
        return redirectURL;
    }
}
This is pretty basic but takes some Web-specific information: an HTTP status code to report (like 404 or 401) and a URL to redirect users to. So you might use this in your error handling like Listing 9 details.
Listing 9. Reporting an exception with a nested custom exception
public void fatalError(SAXParseException exception) throws SAXException {
    // Report through a Web-specific exception
    WebException webException = new WebException("There was a problem converting your " +
            "response into a format our server could read. Please contact our customer " +
            "service team at 1-800-555-0972, and we'd love to help you in person.",
            exception, 406, "/errorPages/xmlError.php?reason=" + exception.getMessage());
    throw new SAXException(webException);
}
This isn't very complicated code, but it does several really important things:
- The error message returned is specific, user-readable, and informative. It's also intended for real people, not programmers.
- Application-specific information, like status codes and error pages, is provided for the calling application. That way, there's no mystery about what to do, and code that calls the XML processing component can expect information when there's a problem, not a mysterious stack trace.
- Even the redirect page carries useful information: the underlying error message, which might be logged and even used to auto-generate a bug report for programmers to deal with later.
All of this required only a few lines of custom exception code and a little thought about how to implement fatalError().
Tailor your exceptions to your applications
Obviously WebException won't work for applications that aren't Web-based. In fact, it might not even be a good fit for your own Web-based applications. That's where your subject- and domain-level expertise come in. You've got to know what your application needs.
You should know your current users and the sort of information you need to provide
them. Even if you write a lower-level processing component, you know what the calling code might need if a problem crops up. Build your own application-specific exception and wrap that exception up through SAXException in your error method. Pass along any relevant or helpful information about the problem that occurred through that custom exception.
The only other step is to contact (by phone, or e-mail, or in some cases, paper
documentation) the developers who use your code if you don't build a customer-facing
component. So if the intranet Java servlet guys use your code, tell them that you
wrapped up a custom exception with lots of helpful information within SAXException. Send them the source code for your exception, some basic documentation, and give them suggestions for use. A little bit of communication will keep them from spitting out getMessage() and making error-handling mistakes of their own.
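The calling side might look something like the sketch below, which could serve as a starting point for that documentation. It assumes a cut-down WebException (status code only) and invents the CallerSketch class, the statusFor() method, and the fallback status values 200 and 500; none of those names come from SAX itself. The one SAX call that matters is SAXException.getException(), which returns the embedded exception, or null when there isn't one.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Trimmed-down, hypothetical version of the WebException from Listing 8
class WebException extends Exception {
    private final int httpStatusCode;

    WebException(String message, Throwable cause, int httpStatusCode) {
        super(message, cause);
        this.httpStatusCode = httpStatusCode;
    }

    int getHttpStatusCode() {
        return httpStatusCode;
    }
}

class CallerSketch {
    // Returns the HTTP status the calling code should send (200 when parsing succeeds)
    static int statusFor(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                // Embed the application-specific exception inside SAXException
                throw new SAXException(
                        new WebException("We could not read your document.", e, 406));
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return 200;
        } catch (SAXException e) {
            Exception nested = e.getException(); // null when nothing was embedded
            if (nested instanceof WebException) {
                return ((WebException) nested).getHttpStatusCode();
            }
            return 500; // no embedded details, so fall back to a generic failure
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(statusFor("<a><b></a>"));
    }
}
```

The design point is that the caller never parses a message string: everything it needs travels as typed fields on the embedded exception.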
One common mistake is to actually extend SAXException rather
than embedding a custom exception in that class. So in the previous example (Listings 8 and 9), WebException could extend SAXException and be thrown directly from fatalError(). That seems workable, but has some subtle problems you'll want to avoid.
First, extending SAXException ties your custom exception to
SAX. That makes it less usable in other components that might have no XML parsing,
processing, or SAX functionality at all. It's much better to leave your exception generic and usable across your app; that's why SAXException allows for another unrelated exception to be embedded in the first place. Second and just as important, many existing applications that work with SAX already automatically pull embedded exception information, so by using SAXException the way it was designed, you potentially make your code work better with other existing code components.
Register your ErrorHandler implementation
Once you implement the three methods in ErrorHandler, you're almost ready to use your error handling. The only step left is to register your error handler implementation with the parsing process. That's pretty simple, but a crucial step. It does you no good to build a great set of error-handling methods and then not tell your SAX parser about them.
Call setErrorHandler() on XMLReader
If you managed to get an XMLReader from your XML API or processing layer, you have an easy job. XMLReader provides a setErrorHandler() method that takes an implementation of ErrorHandler as an argument. You call that method as in Listing 10.
Listing 10. Setting an error handler on an XMLReader
JAXBContext context = JAXBContext.newInstance("dw.ibm");
Unmarshaller unmarshaller = context.createUnmarshaller();
// Get the lower level handler from the Unmarshaller
UnmarshallerHandler unmarshallerHandler =
unmarshaller.getUnmarshallerHandler();
// Now use JAXP to get a SAX parser
SAXParserFactory factory = SAXParserFactory.newInstance();
// Set options on the factory, using standard JAXP calls
factory.setNamespaceAware(true);
XMLReader reader = factory.newSAXParser().getXMLReader();
// We can use the handler from the unmarshaller as the content handler
reader.setContentHandler(unmarshallerHandler);
// Register a custom ErrorHandler implementation
reader.setErrorHandler(new MyJAXBErrorHandler("/logs/logfile.txt"));
// Now parse, using the unmarshalling handler from JAXB
reader.parse(new InputSource(new FileReader(xmlDocument)));
MyCustomObject topObject = (MyCustomObject)unmarshallerHandler.getResult();
In Listing 10, the higher-level API is JAXB (as first shown in Listing 5). Once an XMLReader instance is obtained, a new custom ErrorHandler implementation is registered through setErrorHandler(). In this example, the custom handler is called MyJAXBErrorHandler. The handler takes as an argument the path to a file where it logs errors. Once this is done, parsing is started (through parse()), and any problems are passed on to your methods in MyJAXBErrorHandler.
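For reference, a file-logging handler along those lines might look like the sketch below. This is an assumption for illustration, not the actual MyJAXBErrorHandler: the class name FileLoggingErrorHandler, the log line format, and the choice to continue after warnings and recoverable errors are all hypothetical.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical handler that appends every reported problem to a log file.
class FileLoggingErrorHandler implements ErrorHandler {
    private final String logPath;

    FileLoggingErrorHandler(String logPath) {
        this.logPath = logPath;
    }

    @Override
    public void warning(SAXParseException e) throws SAXException {
        log("WARNING", e);
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        // Assumption: recoverable errors are logged and parsing continues.
        log("ERROR", e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        log("FATAL", e);
        throw e; // Rethrow so the parse stops and the caller is notified.
    }

    // Append one line per problem; the file is opened in append mode.
    private void log(String level, SAXParseException e) throws SAXException {
        try (PrintWriter out = new PrintWriter(new FileWriter(logPath, true))) {
            out.println(level + " line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage());
        } catch (IOException io) {
            throw new SAXException("Could not write to " + logPath, io);
        }
    }
}
```

Opening the file on each call keeps the sketch short; a handler that expects many errors would more likely hold a logger or a single open writer.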
You can check the current ErrorHandler with getErrorHandler()
It's also possible to see what implementation of ErrorHandler your XMLReader uses. Just call getErrorHandler() and you'll get the currently registered ErrorHandler implementation, or null if none has been registered. You can then work with that class or just print out information about it; see Listing 11 for a simple example.
Listing 11. Checking the current error handler
JAXBContext context = JAXBContext.newInstance("dw.ibm");
Unmarshaller unmarshaller = context.createUnmarshaller();
// Get the lower level handler from the Unmarshaller
UnmarshallerHandler unmarshallerHandler =
unmarshaller.getUnmarshallerHandler();
// Now use JAXP to get a SAX parser
SAXParserFactory factory = SAXParserFactory.newInstance();
// Set options on the factory, using standard JAXP calls
factory.setNamespaceAware(true);
XMLReader reader = factory.newSAXParser().getXMLReader();
// We can use the handler from the unmarshaller as the content handler
reader.setContentHandler(unmarshallerHandler);
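// See what's handling errors right now; getErrorHandler() returns
// null when no handler has been registered, so guard against that
ErrorHandler handler = reader.getErrorHandler();
System.out.println("Error handler is currently: " +
    (handler == null ? "none registered" : handler.getClass().getName()));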
// Now parse, using the unmarshalling handler from JAXB
reader.parse(new InputSource(new FileReader(xmlDocument)));
MyCustomObject topObject = (MyCustomObject)unmarshallerHandler.getResult();
This is really only useful for informational value, since there's not much you can do with that class. Still, if you're curious about how some of your favorite XML APIs handle errors, this is a good way to find out.
Error handling must be considered a frontline part of your application development. In fact, most users report a stronger memory of a program's errors—and how those errors were handled—than any other component of an application or site. Great features are great; horrendous error handling is horrendous. That sounds overly simple, but taking even 10 percent more time to work on handling errors gracefully or even preventing errors from occurring at all will dramatically improve user experience.
When it comes to handling XML parsing and processing errors, the key is not really a particular SAX interface as much as it is understanding what drives XML processing. Once you realize that SAX underpins most XML processing, you know that SAX is then the key to good error handling. If five years from now, another XML parsing API has supplanted SAX, you should learn that API to the extent that you can work with it. Getting access to a SAX XMLReader is trivial, but knowing what to do with that interface is not. In fact, that's really the key to error handling: Understand the system that you work with and the lower levels of that system. You don't need to start pushing and popping in assembly language, but you do need to know that SAX is the key XML parsing API in use today.
Error handling then becomes an issue of implementation and execution. With SAX, you use the XMLReader and ErrorHandler interfaces to receive information about errors. You can deal with those errors immediately, pass them on to a calling program, wrap them into custom objects with additional information, or do anything else that makes sense to you and your application. There's no magical formula for great error handling, but there certainly is a key principle: don't annoy your users! If you can manage that, you're well ahead of your peers and competitors.
Tackle your errors head on, and use them to your advantage rather than your user's detriment. You'll get fewer complaints and a happier manager- and user-base. Let me know what interesting solutions you find to your errors and join the experts online at the developerWorks forums (check Resources for links) to tell how you averted disasters in your own programs.
Learn
- All About JAXP: Read up on JAXP in detail.
- XML 1.0 specification: Find out what's an error, fatal error, and what's neither (a warning).
- Tip: Converting from DOM (Brett McLaughlin, developerWorks, April 2001): In this useful tip, learn to convert DOM structures to SAX and JDOM to allow communication with applications that do not use DOM.
- Find out more about the APIs discussed in this article. Start with SAX 2 for Java on the SAX Web site, and then look at DOM on the W3C Web site.
- IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
- XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- The technology bookstore: Browse for books on these and other technical topics.
- developerWorks podcasts: Listen to interesting interviews and discussions for software developers.
Get products and technologies
- Java 5 or the more recent Java 6 software: If you're new to Java programming, get JAXP along with a complete JDK with these downloads.
- Log4J: Try this great IBM-based API that's now open source, and allows you to easily log to a variety of sources, from text files to network sources. Logging is a vital part of a good error handling application.
- Java and XML, Third Edition (Brett McLaughlin and Justin Edelson, O'Reilly Media, Inc.): Cover XML from start to finish, including extensive information on XML, XSL, and a number of related XML specifications.
- IBM trial software for product evaluation: Build your next project with trial software available for download directly from developerWorks, including application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- XML zone discussion forums: Participate in any of several XML-related discussions.
- developerWorks XML zone: Share your thoughts: After you read this article, post your comments and thoughts in this forum. The XML zone editors moderate the forum and welcome your input.
- developerWorks blogs: Check out these blogs and get involved in the developerWorks community.

Brett McLaughlin is a bestselling and award-winning non-fiction author. His books on computer programming, home theater, and analysis and design have sold in excess of 100,000 copies. He has been writing, editing, and producing technical books for nearly a decade, and is as comfortable in front of a word processor as he is behind a guitar, chasing his two sons around the house, or laughing at reruns of Arrested Development with his wife. His last book, Head First Object Oriented Analysis and Design, won the 2007 Jolt Technical Book award. His classic Java and XML remains one of the definitive works on using XML technologies in the Java language.





