In an earlier tutorial, I showed you the basics of XML parsing in the Java language. I covered the major APIs (DOM, SAX, and JDOM), and went through a number of examples that demonstrated the basic tasks common to most XML applications. The second tutorial in the series covered parser features, namespaces, and XML validation. This final tutorial looks at more advanced tasks that I didn't cover before, such as:
- Building XML structures without an XML document
- Converting between one API and another (SAX events to DOM trees, for example)
- Manipulating tree structures
As in the previous tutorials, I cover these APIs:
- The Document Object Model (DOM), Levels 1, 2, and 3
- The Simple API for XML (SAX), Version 2.0
Although many of the sample programs I discuss here use JAXP (the Java API for XML Processing), I won't discuss JAXP specifically in this tutorial.
Most of the examples here work with the Shakespearean sonnet that appeared in the previous tutorials. The structure of this sonnet is:
    <sonnet>
      <author>
        <lastName>
        <firstName>
        <nationality>
        <yearOfBirth>
        <yearOfDeath>
      </author>
      <lines>
        [14 <line> elements]
      </lines>
    </sonnet>
I'll use this sample document throughout this tutorial. Links to the complete set of sample files are shown below:
- sonnet.dtd (download to view in a text editor)
As an alternative, you can download x-java3_codefiles.zip to view these files in a text editor.
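To make the sonnet outline concrete, here is a minimal sketch that builds the same element skeleton in memory with the DOM, without reading any XML file. It is not one of the tutorial's sample programs; the class name `SonnetSkeleton` and its `build` method are made up for illustration, and the elements are left empty because the outline shows only their names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: builds the sonnet's element structure only.
class SonnetSkeleton {

    static Document build() throws Exception {
        // JAXP hands back an empty DOM Document to build on.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

        Element sonnet = doc.createElement("sonnet");
        doc.appendChild(sonnet);

        // <author> and its five children, in the order shown in the outline.
        Element author = doc.createElement("author");
        sonnet.appendChild(author);
        String[] authorFields = {
            "lastName", "firstName", "nationality", "yearOfBirth", "yearOfDeath"
        };
        for (String name : authorFields) {
            author.appendChild(doc.createElement(name));
        }

        // <lines> holds 14 empty <line> elements.
        Element lines = doc.createElement("lines");
        sonnet.appendChild(lines);
        for (int i = 0; i < 14; i++) {
            lines.appendChild(doc.createElement("line"));
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = build();
        System.out.println("line elements: "
                + doc.getElementsByTagName("line").getLength());
    }
}
```

The sketch uses only the standard `javax.xml.parsers` and `org.w3c.dom` packages, so it should run with whichever JAXP-compliant parser is on your classpath.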
In addition to the sonnet, you'll also learn how to parse files of comma-separated values and text strings, including several approaches to converting that information into XML or XML data structures.
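As a rough preview of the idea, the sketch below turns comma-separated text into a DOM tree. It is only one possible approach under simplifying assumptions: fields contain no quotes or embedded commas, and the element names `rows`, `row`, and `field`, like the class name `CsvToDom`, are invented here rather than taken from the tutorial's sample files.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical converter: one <row> per input line, one <field> per value.
class CsvToDom {

    // Assumes simple CSV: no quoted fields and no commas inside values.
    static Document convert(String csv) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = doc.createElement("rows");
        doc.appendChild(root);

        for (String record : csv.split("\\r?\\n")) {
            if (record.isEmpty()) {
                continue; // skip blank lines
            }
            Element row = doc.createElement("row");
            root.appendChild(row);
            // A limit of -1 keeps trailing empty values instead of dropping them.
            for (String value : record.split(",", -1)) {
                Element field = doc.createElement("field");
                field.appendChild(doc.createTextNode(value.trim()));
                row.appendChild(field);
            }
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = convert("Shakespeare,William,British\n1564,1616");
        System.out.println("rows: " + doc.getElementsByTagName("row").getLength());
        System.out.println("fields: " + doc.getElementsByTagName("field").getLength());
    }
}
```

Real-world CSV usually needs a proper parser for quoting and escaping; splitting on commas is used here only to keep the XML-building part visible.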
You'll need to set up a few things on your machine before you can run the examples. (I'm assuming that you know how to compile and run a Java program, and that you know how to set your
- First, visit the home page of the Xerces XML parser at the Apache XML Project (http://xml.apache.org/xerces2-j/). You can also go directly to the download page (http://xml.apache.org/xerces2-j/download.cgi).
- Unzip the file that you downloaded from Apache. This creates a directory named xerces-2_5_0 or something similar, depending on the release level of the parser. The JAR files you need (including xml-apis.jar) should be in the Xerces root directory.
- Visit the JDOM project's Web site and download the latest version of JDOM (http://jdom.org/).
- Unzip the file you downloaded from JDOM. This creates a directory named jdom-b9 or something similar. The JAR file you need (jdom.jar) should be in the
- Finally, download the zip file of examples for this tutorial, x-java3_codefiles.zip, and unzip the file.
- Add the current directory (