As a developer, you've probably heard of the 80-20 rule, known in other circles as Pareto's Law: a process or methodology will accommodate 80 percent of all possible situations, and the other 20 percent will need to be handled on a case-by-case basis. The corollary for software development is that it should be extremely easy for developers to accomplish 80 percent of the things they could possibly do with a given technology.
Of course, software products and standards don't always evolve according to the 80-20 rule. The fractured space of Java XML tools, in particular, illustrates an exception to that rule. The Java programming world is full of APIs -- some homegrown, some backed by the marketing might of a major corporation -- that provide sophisticated solutions to particular XML tasks. As testament to the universality of XML, for every new task, there is a new technology. But what's the glue, and how do you go about finding the right tool for the 80 percent of things that you have to do over and over -- basic XML tree manipulation with an intuitive mapping to the Java language? JDOM is an XML API built with exactly that question in mind.
In many ways, the Java language has become the programming language of choice for XML. With groundbreaking work from the Apache Software Foundation and IBM alphaWorks, there are now complete tool chains for creating, manipulating, transforming, and parsing XML documents.
But while many Java developers use XML daily, Sun has lagged behind the industry in incorporating XML into the Java platform. Because the Java 2 platform went gold before XML had made its mark as a key technology for everything from business-to-business integration to Web site content streamlining, Sun has been using the JSR process to grandfather in existing XML APIs that have gained wide acceptability. The most significant addition to date has been the inclusion of JAXP, the Java API for XML Parsing, which includes three packages:
- org.w3c.dom, the Java implementation of the W3C's recommendation for a standard programmatic Document Object Model for XML
- org.xml.sax, the event-driven Simple API for XML parsing
- javax.xml.parsers, a factory implementation that allows application developers to configure and obtain a particular parser implementation
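As a rough illustration of how the three packages fit together, the sketch below parses one small document twice: once into an org.w3c.dom tree and once through org.xml.sax callbacks, with javax.xml.parsers supplying both parsers. The class name JaxpTour and the inline sample XML are assumptions made for this example, and the syntax is more recent than the Java 2 platform discussed here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo class: one document, two JAXP entry points. */
final class JaxpTour {

    // Sample input invented for this sketch.
    static final String XML =
        "<car vin=\"123fhg5869705iop90\"><make>Toyota</make><model>Celica</model></car>";

    /** DOM route: build an in-memory tree and return the root element's name. */
    static String rootNameViaDom() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTagName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    /** SAX route: receive callbacks and collect element names in document order. */
    static List<String> elementNamesViaSax() {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(rootNameViaDom());
        System.out.println(elementNamesViaSax());
    }
}
```

Neither route hands back anything more Java-friendly than a generic tree of nodes or a stream of callbacks.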
Though the inclusion of these packages is a good thing for Java developers, it merely represents a formal nod to existing API standards, and not a giant leap forward in providing elegant Java-XML interoperability. What the core Java platform has lacked is an intuitive interface for manipulating XML documents as Java objects.
Enter JDOM. The brainchild of two well-known Java developers and authors, Brett McLaughlin and Jason Hunter, JDOM was inaugurated as an open-source project under an Apache-like license in early 2000. It has grown to include contributions and incorporate feedback and bug fixes from a wide base of Java developers, and aims to build a complete Java platform-based solution for accessing, manipulating, and outputting XML data from Java code.
It's the API, dummy: Where JDOM fits
JDOM can be used as an alternative to the org.w3c.dom package for programmatically manipulating XML documents. It's not a drop-in replacement, and, in fact, JDOM and DOM can happily coexist. In addition, JDOM is not concerned with parsing XML from text input, although it provides wrapper classes that take much of the work out of configuring and running a parser implementation. JDOM builds on the strengths of existing APIs to build, as the project home page states, "a better mousetrap."
To understand why there is a need for an alternative API, consider the design constraints of the W3C DOM:
- Language independent. The DOM wasn't designed with the Java language in mind. While this approach keeps a very similar API between different languages, it also makes the API more cumbersome for programmers who are used to the Java language's idioms. For example, while the Java language has a String class built into the language, the DOM specification defines its own Text class.
- Strict hierarchies. The DOM's API follows directly from the XML specification itself. In XML, everything's a node, so you find a Node-based interface in the DOM that almost everything extends and a host of methods that return Node. It's elegant from a polymorphism point of view, but as explained above, it's difficult and cumbersome to work with in the Java language, where explicit downcasts from Node to the leaf types lead to long-winded and less understandable code.
- Interface driven. The public DOM API consists of interfaces only (the one exception, appropriately enough, is an Exception class). The W3C isn't interested in providing implementations, just in defining interfaces, which makes sense. But it also means that using the API as a Java programmer imposes a degree of separation when creating XML objects, as the W3C standards make heavy use of generic factory classes and similar flexible but less direct patterns. For certain uses where XML documents are built only by a parser, and never by application-level code, this is irrelevant. But as XML use becomes more widespread, not all problems continue to look so parser-driven, and application developers need a convenient way to construct XML objects programmatically.
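To make these constraints concrete, here is a minimal sketch that builds a fragment of a car record using only JAXP and the W3C DOM interfaces, then reads a value back. The class name DomCarSketch and its method names are hypothetical, and the sample values are chosen purely for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

/** Hypothetical demo class showing the three DOM design constraints in code. */
final class DomCarSketch {

    /** Builds <car vin="..."><make>Toyota</make></car> through DOM interfaces. */
    static Document buildCar() {
        try {
            // Interface driven: there is no "new Document()"; a factory
            // supplies whichever implementation is configured.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();

            // Every node must be created by the owning document.
            Element car = doc.createElement("car");
            car.setAttribute("vin", "123fhg5869705iop90");
            doc.appendChild(car);

            Element make = doc.createElement("make");
            // Language independent: character data is a Text node, not a String.
            Text makeText = doc.createTextNode("Toyota");
            make.appendChild(makeText);
            car.appendChild(make);
            return doc;
        } catch (ParserConfigurationException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    /** Strict hierarchy: navigation returns Node, so each step needs a downcast. */
    static String readMake(Document doc) {
        Node first = doc.getDocumentElement().getFirstChild();
        Element make = (Element) first;
        Text text = (Text) make.getFirstChild();
        return text.getData();
    }

    public static void main(String[] args) {
        System.out.println(readMake(buildCar()));
    }
}
```

The casts in readMake are safe here only because the tree was built a few lines earlier; with parsed input, each one would also need a node-type check.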
To the programmer, these constraints mean a heavy (both in terms of memory use and interface size) and unwieldy API that can be hard to learn and frustrating to use. In contrast, JDOM was formulated as a lightweight API that, first and foremost, is Java-centric. It does away with the above awkwardness by turning the DOM's principles on their head:
- JDOM is Java platform specific. The API uses the Java language's built-in String support wherever possible, so that text values are always available as Strings. It also makes use of the Java 2 platform collection classes, like List and Iterator, providing a rich environment for programmers familiar with the Java language.
- No hierarchies. In JDOM, an XML element is an instance of Element, an XML attribute is an instance of Attribute, and an XML document itself is an instance of Document. Because all of these represent different concepts in XML, they are always referenced as their own types, never as an amorphous "node."
- Class driven. Because JDOM objects are direct instances of classes like Document, Element, and Attribute, creating one is as easy as using the new operator in the Java language. It also means that there are no factory interfaces to configure -- JDOM is ready to use straight out of the jar.
Look ma, no Nodes: Building and manipulating JDOM documents
JDOM makes use of standard Java coding patterns. Where possible, it uses the Java new operator instead of complex factory patterns, making object manipulation easy, even for the novice user. For example, let's look at how you would build a simple XML document from scratch using JDOM. The structure we are going to build is shown in Listing 1. (Download the complete code for this article from Resources.)
Listing 1. Sample XML document to build
<?xml version="1.0" encoding="UTF-8"?>
<car vin="123fhg5869705iop90">
  <!--Description of a car-->
  <make>Toyota</make>
  <model>Celica</model>
  <year>1997</year>
  <color>green</color>
  <license state="CA">1ABC234</license>
</car>
Note: We'll build the sample document detailed in Listings 2 through 7 below.
To begin, let's create the root element and add it to a document:
Listing 2. Creating a Document
Element carElement = new Element("car");
Document myDocument = new Document(carElement);
This step creates a new org.jdom.Element and makes it the root element of the org.jdom.Document
myDocument. (If you're using the sample code provided in Resources, make sure to import org.jdom.*.) Because an XML document must always have a single root element, Document takes the Element in its constructor.
Next, we'll add the vin attribute:
Listing 3. Adding an Attribute
carElement.addAttribute(new Attribute("vin", "123fhg5869705iop90"));
Adding elements is also very straightforward. Here we add the make element:
Listing 4. Elements and sub-elements
Element make = new Element("make");
make.addContent("Toyota");
carElement.addContent(make);
Because the addContent method of Element returns the Element, we could also write this as:
Listing 5. Adding elements in a terse form
carElement.addContent(new Element("make").addContent("Toyota"));
Both of these statements accomplish the same thing. Some would argue that the first example is more readable, but the second becomes more readable if you are constructing many elements at once. To finish constructing the document:
Listing 6. Adding the remaining elements
carElement.addContent(new Element("model").addContent("Celica"));
carElement.addContent(new Element("year").addContent("1997"));
carElement.addContent(new Element("color").addContent("green"));
carElement.addContent(new Element("license")
    .addContent("1ABC234").addAttribute("state", "CA"));
You will notice that for the license element, we not only added the
content of the element, we also added an attribute to it, specifying the
state in which the license was issued. This is possible because the
addContent methods on Element always return the Element itself, instead of having a void declaration.
Adding a comment section or other standard XML types is done the same way:
Listing 7. Adding a comment
carElement.addContent(new Comment("Description of a car"));
Manipulation of the document takes place in a similar fashion. For
example, to obtain a reference to the year element, we use the getChild method of Element:
Listing 8. Accessing child elements
Element yearElement = carElement.getChild("year");
This statement returns the first child Element with the element name year. If there is no year element, the call returns null. Note that we didn't need to downcast the return value from anything like the DOM Node interface -- children of Elements are simply Elements. In a similar fashion, we can remove the year element from the document:
Listing 9. Removing child elements
boolean removed = carElement.removeChild("year");
This call will remove the year element only; the rest of the document
remains unchanged.
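For comparison, a roughly equivalent lookup and removal written against the W3C DOM might look like the sketch below. The names DomChildAccess, parseRoot, firstChild, and removeFirstChild are invented for this illustration and are not part of any API; the helpers only imitate the return-null and return-boolean behavior described above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helpers imitating child lookup and removal on a DOM tree. */
final class DomChildAccess {

    /** Parses XML text and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    /** Returns the first direct child element with the given name, or null. */
    static Element firstChild(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            // Children arrive as Node: check the type, then downcast.
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Removes the first matching child; reports whether anything was removed. */
    static boolean removeFirstChild(Element parent, String name) {
        Element child = firstChild(parent, name);
        if (child == null) {
            return false;
        }
        parent.removeChild(child);
        return true;
    }

    public static void main(String[] args) {
        Element car = parseRoot("<car><make>Toyota</make><year>1997</year></car>");
        System.out.println(firstChild(car, "year").getTextContent());
        System.out.println(removeFirstChild(car, "year"));
    }
}
```

The type check and cast inside firstChild are exactly the bookkeeping that a typed getChild call makes unnecessary.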
So far, we have covered how documents can be created and manipulated.
To output our finished document to the console we can use JDOM's XMLOutputter class:
Listing 10. Turning JDOM into XML text
try {
    // Indent child elements by two spaces and put each element on its own line
    XMLOutputter outputter = new XMLOutputter("  ", true);
    outputter.output(myDocument, System.out);
} catch (java.io.IOException e) {
    e.printStackTrace();
}
XMLOutputter has a few formatting options. Here we've specified that we
want child elements indented two spaces from the parent element, and that
we want new lines between elements. XMLOutputter can output to either
a Writer or an OutputStream. To output to a file we could simply
change the output line to:
Listing 11. Using a FileWriter to output XML
FileWriter writer = new FileWriter("/some/directory/myFile.xml");
outputter.output(myDocument, writer);
writer.close();
Plays well with others: Interoperating with existing XML tools
One of the interesting features of JDOM is its interoperability with other APIs. Using JDOM, you can output a document not only to an OutputStream or a Writer, but also as a SAX Event Stream or as a DOM Document. This flexibility allows JDOM to be used in a heterogeneous environment or to be added to systems already using another method for handling XML. As we'll see in a later example, it also allows JDOM to use other XML tools that don't yet recognize the JDOM data structures natively.
Another use of JDOM is the ability to read and manipulate XML data that
already exists. Reading a well-formed XML file is done by using one of the
classes in org.jdom.input. In this example, we'll use SAXBuilder:
Listing 12. Parsing an XML file with SAXBuilder
try {
    SAXBuilder builder = new SAXBuilder();
    Document anotherDocument =
        builder.build(new File("/some/directory/sample.xml"));
} catch(JDOMException e) {
    e.printStackTrace();
} catch(NullPointerException e) {
    e.printStackTrace();
}
You can manipulate the document built through this process in the same ways shown back in Listings 2 through 9.
Another practical application of JDOM combines it with the Xalan product from Apache (see Resources). Using the car example above, we will construct a Web page for an online car dealer presenting the details of a particular car. First, we will assume that the document we built above represents the information about the car we are going to present
to the user. Next, we will combine this JDOM Document with an XSL stylesheet and output the HTML-formatted results to a servlet's OutputStream for display in the user's browser.
In this case, the XSL stylesheet we are going to use is called car.xsl:
Listing 13. XSL document for transforming car records to HTML
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/car">
    <html>
      <head>
        <title><xsl:value-of select="make"/> <xsl:value-of select="model"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="make"/></h1><br />
        <h2><xsl:value-of select="model"/></h2><br />
        <table border="0">
          <tr><td>VIN:</td><td><xsl:value-of select="@vin"/></td></tr>
          <tr><td>Year:</td><td><xsl:value-of select="year"/></td></tr>
          <tr><td>Color:</td><td><xsl:value-of select="color"/></td></tr>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
Now we will turn the org.jdom.Document into a DOM Document and feed it to Xalan, along with the file that represents our XSL and the OutputStream that
we obtained from our hypothetical application server, which happens to be using servlets (shown in Listing 14).
Listing 14. Creating an HTML document with JDOM and Xalan
TransformerFactory tFactory = TransformerFactory.newInstance();
// Make the input sources for the XML and XSLT documents
org.jdom.output.DOMOutputter outputter = new org.jdom.output.DOMOutputter();
org.w3c.dom.Document domDocument = outputter.output(myDocument);
javax.xml.transform.Source xmlSource =
new javax.xml.transform.dom.DOMSource(domDocument);
StreamSource xsltSource =
new StreamSource(new FileInputStream("/some/directory/car.xsl"));
// Make the output result for the finished document using
// the HTTPResponse OutputStream
StreamResult xmlResult = new StreamResult(response.getOutputStream());
// Get a XSLT transformer
Transformer transformer = tFactory.newTransformer(xsltSource);
// Do the transform
transformer.transform(xmlSource, xmlResult);
In this example, the output is streamed through the HTTPResponse OutputStream of a Java servlet. However, the stream could just as easily be a file stream, as in our earlier example with XMLOutputter. We used DOMOutputter to generate the XML source for Xalan. However, we could generate the same input by using XMLOutputter to output our XML document as a String and then wrapping that String in a StreamSource. Talk about flexibility: JDOM can output its structure as a String, a SAX Event Stream, or a DOM Document. This allows JDOM to interface with tools that can take any of these models as input. (For additional functionality, check the JDOM Web site for the contrib package, where you'll find a worthy library of JDOM-based utilities that provide tools like a JDBC ResultSet-based builder, XPath implementation, and more.)
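The String-based route can be sketched with JAXP alone. In the example below the XML text is hard-coded as a stand-in for what an outputter would produce, and the stylesheet is a cut-down variant of car.xsl; the class name StringSourceTransform and both constants are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo class: transforms XML held in a String, no DOM tree required. */
final class StringSourceTransform {

    // Stand-in for the text an outputter would produce from the car document.
    static final String CAR_XML =
        "<car vin=\"123fhg5869705iop90\"><make>Toyota</make><model>Celica</model></car>";

    // Cut-down stylesheet in the spirit of car.xsl (an assumption for this sketch).
    static final String CAR_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/car\">"
      + "<html><body>"
      + "<h1><xsl:value-of select=\"make\"/></h1>"
      + "<h2><xsl:value-of select=\"model\"/></h2>"
      + "</body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet text to the XML text and returns the result as a String. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            // A StreamSource over a Reader takes the place of the DOMSource.
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(CAR_XML, CAR_XSL));
    }
}
```

The trade-off is that the transformer must re-parse the text, whereas a DOMSource hands over a tree that already exists in memory.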
In just a few lines of code, JDOM enables a wide variety of functionality. We have parsed XML documents and created them programmatically, manipulated those documents, and used them to produce an XML-driven Web page.
JDOM grows up: A glimpse of the future
As of this writing, the JDOM project has released its Beta 6 version. Even with its beta status, JDOM has proved to be a stable choice for many real-world applications. While much of the API is solid, work continues in a number of areas that will potentially impact the existing interfaces. Therefore, any development projects undertaken at this point need not shirk JDOM for fear of a buggy implementation, but should consider the fact that certain method signatures or specific semantics are still likely to change before final release and potential adoption into the core Java API. (See What's in a name?)
The near-term to-do list for JDOM is focused on stabilizing the API and evaluating the performance of parts of the implementation. Other items still in progress, whose absence could hinder some application developers, include support for DTD entities and other less frequently used constructs. Further down the road is core support for XPath, the XML path language native to applications like XSLT, and more direct integration with XML data sources.
So, in conclusion, is JDOM better than existing XML APIs? If you dream in Java, the answer is probably yes. JDOM isn't meant to supersede your favorite parser or XML-aware database, but its design principles make for a particularly rapid learning curve for Java developers entering or well on their way to ruling the XML world.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample code | j-jdomExamples.jar | 7KB | HTTP |
- For the latest release of JDOM, online API documentation and more, check out the JDOM Web site.
- For more information about Xalan, Xerces, and other Java XML products, check out the Apache XML site.
- Visit the XML Web site for general XML information, tutorials, and resources.
- And don't forget the developerWorks XML zone for a wealth of XML-related content.
- JDOM's creators are both well-known authors. We recommend you read Jason Hunter's Java Servlet Programming, 2nd Edition (O'Reilly, April 2001) and Brett McLaughlin's Java and XML (O'Reilly, June 2000).
- Join the developerWorks XML tools and APIs discussion for Java developers.
- Here's more information on using DOM to incorporate XML documents into applications.
Wes Biggs has developed Internet applications for companies including the Los Angeles Times, USWeb and Elite Information Systems. He is a frequent contributor to open source Java projects and maintains the Free Software Foundation's gnu.regexp regular expression package. Contact Wes at wes@tralfamadore.com.
Harry Evans' experience in software design and application engineering includes the design of several Web-based and Internet-aware products, mostly in a start-up environment. He has worked in all stages of product life cycle, from Rapid Application Development to legacy product integration. Contact Harry at harry@tralfamadore.com.





