Simplify XML programming with JDOM

This open-source API makes XML document manipulation easy for Java developers

JDOM is a unique Java toolkit for working with XML, engineered to enable rapid development of XML applications. Its design embraces the Java language from syntax to semantics. But is it better than existing -- and more standard -- XML APIs? Judge for yourself as we run through some examples and illuminate the design goals of this popular open-source project, which recently was formally accepted as a Java Specification Request.

Wes Biggs (wes@tralfamadore.com), Senior Developer, T.H.I.

Wes Biggs has developed Internet applications for companies including the Los Angeles Times, USWeb and Elite Information Systems. He is a frequent contributor to open source Java projects and maintains the Free Software Foundation's gnu.regexp regular expression package. Contact Wes at wes@tralfamadore.com.



Harry Evans (harry@tralfamadore.com), Senior Developer, T.H.I.

Harry Evans' experience in software design and application engineering includes the design of several Web-based and Internet-aware products, mostly in a start-up environment. He has worked in all stages of product life cycle, from Rapid Application Development to legacy product integration. Contact Harry at harry@tralfamadore.com.



01 May 2001

As a developer, you've probably heard of the 80-20 rule, known in other circles as Pareto's Law: a process or methodology will accommodate 80 percent of all possible situations, and the other 20 percent will need to be handled on a case-by-case basis. The corollary for software development is that it should be extremely easy for developers to accomplish 80 percent of the things they could possibly do with a given technology.

Of course, software products and standards don't always evolve according to the 80-20 rule. The fractured space of Java XML tools, in particular, illustrates an exception to that rule. The Java programming world is full of APIs -- some homegrown, some backed by the marketing might of a major corporation -- that provide sophisticated solutions to particular XML tasks. As testament to the universality of XML, for every new task, there is a new technology. But what's the glue, and how do you go about finding the right tool for the 80 percent of things that you have to do over and over -- basic XML tree manipulation with an intuitive mapping to the Java language? JDOM is an XML API built with exactly that question in mind.

Tag, you're it: Java and XML

In many ways, the Java language has become the programming language of choice for XML. With groundbreaking work from the Apache Software Foundation and IBM alphaWorks, there are now complete tool chains for creating, manipulating, transforming, and parsing XML documents.

But while many Java developers use XML daily, Sun has lagged behind the industry in incorporating XML into the Java platform. Because the Java 2 platform went gold before XML had made its mark as a key technology for everything from business-to-business integration to Web site content streamlining, Sun has been using the JSR process to grandfather in existing XML APIs that have gained wide acceptability. The most significant addition to date has been the inclusion of JAXP, the Java API for XML Parsing, which includes three packages:

  • org.w3c.dom, the Java implementation of the W3C's recommendation for a standard programmatic Document Object Model for XML
  • org.xml.sax, the event-driven Simple API for XML parsing
  • javax.xml.parsers, a factory implementation that allows application developers to configure and obtain a particular parser implementation
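To make the division of labor among those three packages concrete, here is a minimal sketch, assuming a current JDK where JAXP is bundled. The class name JaxpTour and the tiny inline document are hypothetical, made up for illustration. The factories in javax.xml.parsers hand back a parser, org.w3c.dom models the parsed result as a tree, and org.xml.sax reports the same input as a series of callbacks.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class JaxpTour {
    // Hypothetical sample input, not taken from the article's download.
    static final String XML = "<car vin=\"123\"><make>Toyota</make></car>";

    private static InputStream input() {
        return new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8));
    }

    // javax.xml.parsers supplies the parser; org.w3c.dom holds the resulting tree.
    static String rootNameViaDom() {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(input());
            return doc.getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // org.xml.sax delivers the same document as start-element callbacks.
    static List<String> elementNamesViaSax() {
        final List<String> names = new ArrayList<>();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(input(), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    names.add(qName);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(rootNameViaDom());
        System.out.println(elementNamesViaSax());
    }
}
```

Neither method names a concrete parser class; which implementation you get is decided by the factory, which is the configuration role the third package plays.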

Though the inclusion of these packages is a good thing for Java developers, it merely represents a formal nod to existing API standards, and not a giant leap forward in providing elegant Java-XML interoperability. What the core Java platform has lacked is an intuitive interface for manipulating XML documents as Java objects.

Enter JDOM. The brainchild of two well-known Java developers and authors, Brett McLaughlin and Jason Hunter, JDOM was inaugurated as an open-source project under an Apache-like license in early 2000. It has grown to include contributions and incorporate feedback and bug fixes from a wide base of Java developers, and aims to build a complete Java platform-based solution for accessing, manipulating, and outputting XML data from Java code.


It's the API, dummy: Where JDOM fits

JDOM can be used as an alternative to the org.w3c.dom package for programmatically manipulating XML documents. It's not a drop-in replacement, and, in fact, JDOM and DOM can happily coexist. In addition, JDOM is not concerned with parsing XML from text input, although it provides wrapper classes that take much of the work out of configuring and running a parser implementation. JDOM builds on the strengths of existing APIs to build, as the project home page states, "a better mousetrap."

To understand why there is a need for an alternative API, consider the design constraints of the W3C DOM:

  • Language independent. The DOM wasn't designed with the Java language in mind. While this approach keeps a very similar API between different languages, it also makes the API more cumbersome for programmers who are used to the Java language's idioms. For example, while the Java language has a String class built into the language, the DOM specification defines its own Text class.
  • Strict hierarchies. The DOM's API follows directly from the XML specification itself. In XML, everything's a node, so you find a Node-based interface in the DOM that almost everything extends and a host of methods that return Node. It's elegant from a polymorphism point of view, but as explained above, it's difficult and cumbersome to work with in the Java language, where explicit downcasts from Node to the leaf types lead to long-winded and less understandable code.
  • Interface driven. The public DOM API consists of interfaces only (the one exception, appropriately enough, is an Exception class). The W3C isn't interested in providing implementations, just in defining interfaces, which makes sense. But it also means that using the API as a Java programmer imposes a degree of separation when creating XML objects, as the W3C standards make heavy use of generic factory classes and similar flexible but less direct patterns. For certain uses where XML documents are built only by a parser, and never by application-level code, this is irrelevant. But as XML use becomes more widespread, not all problems continue to look so parser-driven, and application developers need a convenient way to construct XML objects programmatically.
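A short sketch may help show how these three constraints surface in everyday code. It builds a fragment of a car record using only the W3C DOM interfaces that ship with the JDK, then reads a value back. The class name DomCarSketch and the fragment itself are assumptions chosen for illustration; only the DOM and JAXP calls are standard.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

class DomCarSketch {
    // Builds <car vin="..."><make>Toyota</make></car> through DOM interfaces only.
    static Document build() {
        try {
            // Interface driven: no constructors, so the Document comes from a
            // factory-made builder...
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            // ...and every node is created by the Document acting as a factory.
            Element car = doc.createElement("car");
            car.setAttribute("vin", "123fhg5869705iop90");
            doc.appendChild(car);

            Element make = doc.createElement("make");
            // Language independent: text is its own node type, not a String.
            Text makeText = doc.createTextNode("Toyota");
            make.appendChild(makeText);
            car.appendChild(make);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Strict hierarchy: traversal methods return Node, so reading a value
    // means downcasting to the leaf type at each step.
    static String readMake(Document doc) {
        Node first = doc.getDocumentElement().getFirstChild();
        Element make = (Element) first;
        Text text = (Text) make.getFirstChild();
        return text.getData();
    }

    public static void main(String[] args) {
        System.out.println(readMake(build()));
    }
}
```

The casts in readMake are safe here only because we built the tree ourselves; against a parsed document, whitespace text nodes between elements would make the first cast fail, which is part of what makes the Node-centric style error-prone.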

To the programmer, these constraints mean a heavy (both in terms of memory use and interface size) and unwieldy API that can be hard to learn and frustrating to use. In contrast, JDOM was formulated as a lightweight API that, first and foremost, is Java-centric. It does away with the above awkwardness by turning the DOM's principles on their head:

  • JDOM is Java platform specific. The API uses the Java language's built-in String support wherever possible, so that text values are always available as Strings. It also makes use of the Java 2 platform collection classes, like List and Iterator, providing a rich environment for programmers familiar with the Java language.
  • No hierarchies. In JDOM, an XML element is an instance of Element, an XML attribute is an instance of Attribute, and an XML document itself is an instance of Document. Because all of these represent different concepts in XML, they are always referenced as their own types, never as an amorphous "node."
  • Class driven. Because JDOM objects are direct instances of classes like Document, Element, and Attribute, creating one is as easy as using the new operator in the Java language. It also means that there are no factory interfaces to configure -- JDOM is ready to use straight out of the jar.

Look ma, no Nodes: Building and manipulating JDOM documents

JDOM makes use of standard Java coding patterns. Where possible, it uses the Java new operator instead of complex factory patterns, making object manipulation easy, even for the novice user. For example, let's look at how you would build a simple XML document from scratch using JDOM. The structure we are going to build is shown in Listing 1. (Download the complete code for this article from Resources.)

Listing 1. Sample XML document to build
<?xml version="1.0" encoding="UTF-8"?>
<car vin="123fhg5869705iop90">
  <!--Description of a car-->
  <make>Toyota</make>
  <model>Celica</model>
  <year>1997</year>
  <color>green</color>
  <license state="CA">1ABC234</license>
</car>

Note: Listings 2 through 7 below build this sample document step by step.

To begin, let's create the root element and add it to a document:

Listing 2. Creating a Document
Element carElement = new Element("car");
Document myDocument = new Document(carElement);

This step creates a new org.jdom.Element and makes it the root element of the org.jdom.Document named myDocument. (If you're using the sample code provided in Resources, make sure to import org.jdom.*.) Because an XML document must always have a single root element, Document takes the Element in its constructor.

Next, we'll add the vin attribute:

Listing 3. Adding an Attribute
carElement.addAttribute(new Attribute("vin", "123fhg5869705iop90"));

Adding elements is also very straightforward. Here we add the make element:

Listing 4. Elements and sub-elements
Element make = new Element("make");
make.addContent("Toyota");
carElement.addContent(make);

Because the addContent method of Element returns the Element, we could also write this as:

Listing 5. Adding elements in a terse form
carElement.addContent(new Element("make").addContent("Toyota"));

Both of these statements accomplish the same thing. Some would argue that the first example is more readable, but the second becomes more readable if you are constructing many elements at once. To finish constructing the document:

Listing 6. Adding the remaining elements
carElement.addContent(new Element("model").addContent("Celica"));
carElement.addContent(new Element("year").addContent("1997"));
carElement.addContent(new Element("color").addContent("green"));
carElement.addContent(new Element("license")
    .addContent("1ABC234").addAttribute("state", "CA"));

You will notice that for the license element, we not only added the content of the element, we also added an attribute to it, specifying the state in which the license was issued. This is possible because the addContent methods on Element always return the Element itself, instead of having a void declaration.
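The return-the-object pattern behind that chaining is easy to reproduce in isolation. The following is a minimal sketch using a hypothetical ChainableElement class; it is not JDOM's Element, and its serialization is deliberately naive (no escaping, text content only). It exists only to show why returning the object rather than void lets calls be strung together.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is NOT JDOM's Element.
class ChainableElement {
    private final String name;
    private final Map<String, String> attributes = new LinkedHashMap<>();
    private final List<String> text = new ArrayList<>();

    ChainableElement(String name) {
        this.name = name;
    }

    // Returning 'this' instead of void is what makes the next call possible.
    ChainableElement addContent(String value) {
        text.add(value);
        return this;
    }

    ChainableElement addAttribute(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    // Naive serialization for demonstration; performs no XML escaping.
    String toXml() {
        StringBuilder sb = new StringBuilder("<").append(name);
        for (Map.Entry<String, String> a : attributes.entrySet()) {
            sb.append(' ').append(a.getKey())
              .append("=\"").append(a.getValue()).append('"');
        }
        sb.append('>');
        for (String t : text) {
            sb.append(t);
        }
        return sb.append("</").append(name).append('>').toString();
    }

    public static void main(String[] args) {
        ChainableElement license = new ChainableElement("license")
                .addContent("1ABC234").addAttribute("state", "CA");
        System.out.println(license.toXml());
    }
}
```

Had addContent been declared void, the call to addAttribute would need its own statement and a named variable, which is the difference between Listings 4 and 5.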

Adding a comment section or other standard XML types is done the same way:

Listing 7. Adding a comment
carElement.addContent(new Comment("Description of a car"));

Manipulation of the document takes place in a similar fashion. For example, to obtain a reference to the year element, we use the getChild method of Element:

Listing 8. Accessing child elements
Element yearElement = carElement.getChild("year");

This statement returns the first child Element with the element name year. If there is no year element, the call returns null. Note that we didn't need to downcast the return value from anything like the DOM Node interface -- children of Elements are simply Elements. In a similar fashion, we can remove the year element from the document:

Listing 9. Removing child elements
boolean removed = carElement.removeChild("year");

This call will remove the year element only; the rest of the document remains unchanged.

So far, we have covered how documents can be created and manipulated. To output our finished document to the console we can use JDOM's XMLOutputter class:

Listing 10. Turning JDOM into XML text
try {
    XMLOutputter outputter = new XMLOutputter("  ", true);
    outputter.output(myDocument, System.out);
} catch (java.io.IOException e) {
    e.printStackTrace();
}

XMLOutputter has a few formatting options. Here we've specified that we want child elements indented two spaces from the parent element, and that we want new lines between elements. XMLOutputter can output to either a Writer or an OutputStream. To output to a file we could simply change the output line to:

Listing 11. Using a FileWriter to output XML
FileWriter writer = new FileWriter("/some/directory/myFile.xml");
outputter.output(myDocument, writer);
writer.close();

Plays well with others: Interoperating with existing XML tools

One of the interesting features of JDOM is its interoperability with other APIs. Using JDOM, you can output a document not only to an OutputStream or a Writer, but also as a SAX Event Stream or as a DOM Document. This flexibility allows JDOM to be used in a heterogeneous environment or to be added to systems already using another method for handling XML. As we'll see in a later example, it also allows JDOM to use other XML tools that don't yet recognize the JDOM data structures natively.

Another use of JDOM is the ability to read and manipulate XML data that already exists. Reading a well-formed XML file is done by using one of the classes in org.jdom.input. In this example, we'll use SAXBuilder:

Listing 12. Parsing an XML file with SAXBuilder
try {
  SAXBuilder builder = new SAXBuilder();
  Document anotherDocument = 
    builder.build(new File("/some/directory/sample.xml"));
} catch(JDOMException e) {
  e.printStackTrace();
} catch(NullPointerException e) {
  e.printStackTrace();
}

You can manipulate the document built through this process in the same ways shown in Listings 2 through 9.

Another practical application of JDOM combines it with the Xalan product from Apache (see Resources). Using the car example above, we will construct a Web page for an online car dealer presenting the details of a particular car. First, we will assume that the document we built above represents the information about the car we are going to present to the user. Next, we will combine this JDOM Document with an XSL stylesheet and output the HTML-formatted results to a servlet's OutputStream for display in the user's browser.

In this case, the XSL stylesheet we are going to use is called car.xsl:

Listing 13. XSL document for transforming car records to HTML
<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/car">
    <html>
        <head>
          <title><xsl:value-of select="make"/> <xsl:value-of select="model"/></title>
        </head>
        <body>
          <h1><xsl:value-of select="make"/></h1><br />
          <h2><xsl:value-of select="model"/></h2><br />
          <table border="0">
          <tr><td>VIN:</td><td><xsl:value-of select="@vin"/></td></tr>
          <tr><td>Year:</td><td><xsl:value-of select="year"/></td></tr>
          <tr><td>Color:</td><td><xsl:value-of select="color"/></td></tr>
          </table>
        </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Now we will turn the org.jdom.Document into a DOM Document and feed it to Xalan, along with the file that represents our XSL and the OutputStream that we obtained from our hypothetical application server, which happens to be using servlets (shown in Listing 14).

Listing 14. Creating an HTML document with JDOM and Xalan
TransformerFactory tFactory = TransformerFactory.newInstance();

// Make the input sources for the XML and XSLT documents
org.jdom.output.DOMOutputter outputter = new org.jdom.output.DOMOutputter();
org.w3c.dom.Document domDocument = outputter.output(myDocument);
javax.xml.transform.Source xmlSource = 
  new javax.xml.transform.dom.DOMSource(domDocument);
StreamSource xsltSource = 
  new StreamSource(new FileInputStream("/some/directory/car.xsl"));

// Make the output result for the finished document using 
// the HTTPResponse OutputStream
StreamResult xmlResult = new StreamResult(response.getOutputStream());

// Get a XSLT transformer
Transformer transformer = tFactory.newTransformer(xsltSource);

// Do the transform
transformer.transform(xmlSource, xmlResult);

In this example, the output is streamed through the HTTPResponse OutputStream of a Java servlet. However, the stream could just as easily be a file stream, as in our earlier example with XMLOutputter. We used DOMOutputter to generate the XML source for Xalan. However, we could generate the same input by using XMLOutputter to output our XML document as a String and then make it into a StreamSource. Talk about flexibility: JDOM can output its structure as a String, a SAX Event Stream, or a DOM Document. This allows JDOM to interface with tools that can take any of these models as input. (For additional functionality, check the JDOM Web site for the contrib package, where you'll find a worthy library of JDOM-based utilities that provide tools like a JDBC ResultSet-based builder, an XPath implementation, and more.)
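The String-based alternative mentioned above can be sketched with nothing but the JDK's javax.xml.transform classes. In this sketch the CAR_XML constant stands in for the text that XMLOutputter would produce, and CAR_XSL is a cut-down, hypothetical stylesheet that emits plain text rather than the HTML of Listing 13. The class name StringSourceTransform is likewise an assumption for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StringSourceTransform {
    // Stand-in for the text an XML outputter would hand us (assumption).
    static final String CAR_XML =
        "<car vin=\"123fhg5869705iop90\">"
      + "<make>Toyota</make><model>Celica</model></car>";

    // Hypothetical, cut-down stylesheet: writes "make model" as plain text.
    static final String CAR_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/car\">"
      + "<xsl:value-of select=\"make\"/><xsl:text> </xsl:text>"
      + "<xsl:value-of select=\"model\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Wraps both Strings in StreamSources and collects the result in memory.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(CAR_XML, CAR_XSL));
    }
}
```

The trade-off is that the String route serializes the document and then reparses it, whereas the DOMSource route hands the transformer an in-memory tree; which is cheaper depends on the document and the processor.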

In just a few lines of code, JDOM enables a wide variety of functionality. We have parsed and programmatically created XML documents in Java, manipulated those documents, and used them to produce an XML-driven Web page.

What's in a name?

The official 1.0 release of JDOM may coincide with its continuing evolution in the Java Community Process. Submitted as JSR-102, JDOM has been approved for eventual inclusion in the core Java platform, with this comment from Sun: "JDOM does appear to be significantly easier to use than the earlier APIs, so we believe it will be a useful addition to the platform." According to the JSR, a 1.0 release might see JDOM's packaging change from "org.jdom" to "javax.xml.tree". While the future certainly looks positive, developers may eventually have to retrofit their code to work with the new version.


JDOM grows up: A glimpse of the future

As of this writing, the JDOM project has released its Beta 6 version. Even with its beta status, JDOM has proved stable in many real-world applications. While much of the API is solid, work continues in a number of areas that could affect the existing interfaces. Development projects undertaken at this point therefore need not shun JDOM for fear of a buggy implementation, but should bear in mind that certain method signatures or specific semantics are still likely to change before final release and potential adoption into the core Java API. (See What's in a name?)

The near-term to-do list for JDOM is focused on stabilizing the API and evaluating the performance of parts of the implementation. Other items in progress, whose absence could hinder some application developers, include support for DTD entities and other less frequently used constructs. Further down the road is core support for XPath, the XML path language native to applications like XSLT, and more direct integration with XML data sources.

So, in conclusion, is JDOM better than existing XML APIs? If you dream in Java, the answer is probably yes. JDOM isn't meant to supersede your favorite parser or XML-aware database, but its design principles make for a particularly rapid learning curve for Java developers entering or well on their way to ruling the XML world.


Download

Description    Name                  Size
Sample code    j-jdomExamples.jar    7KB

Resources
