
Tip: Converting from DOM

When you need SAX or JDOM output from DOM

Brett McLaughlin (brett@newInstance.com), Enhydra Strategist, Lutris Technologies
Brett McLaughlin (brett@newInstance.com) works as Enhydra strategist at Lutris Technologies and specializes in distributed systems architecture. He is author of Java and XML (O'Reilly). He is involved in technologies such as Java servlets, Enterprise JavaBeans technology, XML, and business-to-business applications. Along with Jason Hunter, he founded the JDOM project, which provides a simple API for manipulating XML from Java applications. He is also an active developer on the Apache Cocoon project and the EJBoss EJB server as well as a co-founder of the Apache Turbine project.

Summary:  In this tip, you'll learn how to convert DOM structures to SAX and JDOM to allow communication with applications that do not use DOM. The code listings demonstrate how to convert from DOM to an output stream for use by SAX, and how to convert from DOM to JDOM.


Date:  01 Apr 2001
Level:  Introductory

If you are sold on the W3C's Document Object Model (DOM) and think SAX is silly, you will still need a way to move from DOM to the other formats that application developers use, namely SAX and JDOM. What do you do when you have to accept DOM as input and convert it to something else? It's a valid question: because DOM provides a complete document representation, converting it into another format makes a lot of sense. In this tip, you'll learn how to perform this conversion from DOM to either SAX or JDOM.

From DOM to SAX

Unfortunately, neither DOM Level 1 nor the newer Level 2 provides a means of outputting a DOM tree to SAX or any other format. As a result, each parser implementation provides its own custom APIs for output, and implementation independence is lost. In other words, your code works only with the parser you wrote it for (Crimson, Xerces, Oracle, and so on). DOM Level 3 is supposed to provide this functionality, so we'll have to wait and see what it offers for output. In the meantime, check your vendor's documentation on writing, or serializing, a DOM tree. With Apache Xerces, for example, you would use the org.apache.xml.serialize.XMLSerializer class, as shown in Listing 1. Whichever parser you use, you will probably have to output the DOM tree to a stream, then push that stream back into SAX for sequential processing. Note that Listing 1 shows only the first half of that round trip, outputting a DOM tree to a stream; you can then use that stream as input to a SAX processor.

// Listing 1: serializing a DOM tree to a stream with Apache Xerces
import org.apache.xerces.parsers.DOMParser;
import org.apache.xml.serialize.XMLSerializer;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PrintDOMTree {
    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java PrintDOMTree <XML document URI>");
            return;
        }
        try {
            // Parse the input document into a DOM tree
            InputSource source = new InputSource(args[0]);
            DOMParser parser = new DOMParser();
            parser.parse(source);
            Document doc = parser.getDocument();

            // Serialize the DOM tree with the Xerces-specific serializer
            XMLSerializer serializer = new XMLSerializer();
            // Insert your PipedOutputStream here instead of System.out!
            serializer.setOutputByteStream(System.out);
            serializer.serialize(doc);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
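
The round trip described above (DOM tree to a stream, then that stream into SAX) can be sketched roughly as follows. To keep the example free of vendor classes, it swaps in the standard JAXP identity Transformer in place of the Xerces XMLSerializer; that substitution is an assumption made for illustration, as are the names DomToSaxSketch, NameCollector, parse, and domToSax, none of which come from the listing above.

```java
import java.io.ByteArrayInputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class DomToSaxSketch {

    /** Hypothetical SAX handler: records element names in document order. */
    static class NameCollector extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            names.add(qName);
        }
    }

    /** Helper for the demo: builds a DOM tree from an XML string. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Serializes the DOM tree on a writer thread and feeds the bytes to SAX. */
    static void domToSax(Document doc, DefaultHandler handler) throws Exception {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);

        Thread writer = new Thread(() -> {
            // Closing the pipe when done lets the SAX parser see end of input
            try (PipedOutputStream o = out) {
                TransformerFactory.newInstance().newTransformer()
                        .transform(new DOMSource(doc), new StreamResult(o));
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
        writer.start();

        // Closing the read end first unblocks the writer if parsing fails early
        try (PipedInputStream i = in) {
            SAXParserFactory.newInstance().newSAXParser().parse(i, handler);
        } finally {
            writer.join();
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<order><item/><item/></order>");
        NameCollector collector = new NameCollector();
        domToSax(doc, collector);
        System.out.println(collector.names);
    }
}
```

The writer runs on its own thread because a piped stream has a small fixed buffer; serializing and parsing on the same thread would stall as soon as the document outgrew it.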


From DOM to JDOM

Moving from DOM to JDOM is quite a bit easier than moving from DOM to SAX. This makes sense: once you have a DOM tree, you've probably already had a chance to get at the data through SAX. In fact, a DOM tree is rarely best handled by SAX, because you've already spent the memory to hold the entire document as a DOM representation. A far more common task is to convert an XML document that arrives as a DOM tree into a JDOM tree. Both formats are document representations, but they differ substantially in behavior and functionality, so you may want to let someone else take your DOM tree and work with it as JDOM. While you might argue that the conversion should be their job, you do need to know (at least!) how to convert from your structure to theirs.

To convert from DOM to JDOM, the JDOM API provides a consumer for DOM nodes called org.jdom.input.DOMBuilder. This class takes a DOM Document (as well as some other DOM structures, such as Element and Attr) and converts the DOM tree to a JDOM Document. There really isn't much to this operation, so I'll simply show you the code in Listing 2 and let you see the process in action.

// Listing 2: converting a DOM tree to a JDOM Document
// Java imports
import java.io.IOException;
// JDOM imports
import org.jdom.JDOMException;
import org.jdom.input.DOMBuilder;
import org.jdom.output.XMLOutputter;
// SAX imports
import org.xml.sax.InputSource;
// Xerces
import org.apache.xerces.parsers.DOMParser;

public class DOMtoJDOM {
    // DOM tree of input document
    private final org.w3c.dom.Document domDoc;

    // Parse the document at the given system ID into a DOM tree
    public DOMtoJDOM(String systemID) throws Exception {
        DOMParser parser = new DOMParser();
        parser.parse(new InputSource(systemID));
        domDoc = parser.getDocument();
    }

    // Convert the stored DOM tree to a JDOM Document
    public org.jdom.Document convert()
        throws JDOMException, IOException {
        // Create new DOMBuilder, using default parser
        DOMBuilder builder = new DOMBuilder();
        org.jdom.Document jdomDoc = builder.build(domDoc);
        return jdomDoc;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java DOMtoJDOM <XML document URI>");
            return;
        }
        try {
            DOMtoJDOM tester = new DOMtoJDOM(args[0]);
            org.jdom.Document jdomDoc = tester.convert();
            // Output the document to System.out
            XMLOutputter outputter = new XMLOutputter();
            outputter.output(jdomDoc, System.out);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

That's all there is to it. Once you know how to move from DOM to SAX and JDOM, you can hand your data to pretty much any type of XML representation you'll come up against. Watch the DOM Level 3 specification for a standard, vendor-independent way to output DOM trees, and until then, enjoy using the DOM!
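
For readers on a platform that already ships the DOM Level 3 Load and Save module (the org.w3c.dom.ls package bundled with current JDKs), vendor-independent output might look something like the sketch below. It is only an illustration under that assumption: the class name DomLevel3Output and its parse and serialize helpers are hypothetical, and whether the "LS" feature is available depends on the DOM implementation in use.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

class DomLevel3Output {

    /** Helper for the demo: builds a DOM tree from an XML string. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Serializes a DOM tree to a string using only standard DOM Level 3 interfaces. */
    static String serialize(Document doc) {
        // Ask the implementation for Load and Save support; it may not offer it
        Object feature = doc.getImplementation().getFeature("LS", "3.0");
        if (!(feature instanceof DOMImplementationLS)) {
            throw new IllegalStateException("DOM Level 3 Load and Save is not supported");
        }
        LSSerializer serializer = ((DOMImplementationLS) feature).createLSSerializer();
        // Leave out the XML declaration so only the document content is returned
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<order><item/><item/></order>");
        System.out.println(serialize(doc));
    }
}
```

Because the code touches only org.w3c.dom interfaces, it does not name a parser vendor anywhere; the string it returns could then be handed to a SAX parser in the same way as the stream in Listing 1.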


