Tip: Converting from DOM

When you need SAX or JDOM output from DOM

In this tip, you'll learn how to convert DOM structures to SAX and JDOM to allow communication with applications that do not use DOM. The code listings demonstrate how to convert from DOM to an output stream for use by SAX, and how to convert from DOM to JDOM.


Brett McLaughlin (brett@newInstance.com), Enhydra Strategist, Lutris Technologies

Brett McLaughlin (brett@newInstance.com) works as Enhydra strategist at Lutris Technologies and specializes in distributed systems architecture. He is author of Java and XML (O'Reilly). He is involved in technologies such as Java servlets, Enterprise JavaBeans technology, XML, and business-to-business applications. Along with Jason Hunter, he founded the JDOM project, which provides a simple API for manipulating XML from Java applications. He is also an active developer on the Apache Cocoon project and the EJBoss EJB server as well as a co-founder of the Apache Turbine project.



01 April 2001


Even if you are sold on the W3C's Document Object Model (DOM) and think SAX is silly, you will eventually have to move from DOM to the other formats that application developers use, namely SAX and JDOM. What do you do when you have to accept DOM as input and convert it to something else? It's a valid question: because DOM provides a complete document representation, converting it into another format makes a lot of sense. In this tip, you'll learn how to perform this conversion from DOM to either SAX or JDOM.

From DOM to SAX

Unfortunately, neither DOM Level 1 nor the newer Level 2 provides a means of outputting a DOM tree to SAX or any other format. As a result, each parser implementation provides its own custom APIs for output, and implementation independence is lost. In other words, your code only works with the parser you wrote it for (Crimson, Xerces, Oracle, and so on). DOM Level 3 is supposed to provide this functionality, so we'll all have to wait and see what it offers in the way of output. In the meantime, check your vendor's documentation on writing, or serializing, a DOM tree. With Apache Xerces, for example, you would use the org.apache.xml.serialize.XMLSerializer class, as shown in Listing 1. Whichever parser you use, you will probably have to output the DOM tree to a stream, then push that stream back into SAX for sequential processing. Note that Listing 1 only shows the first half, outputting a DOM tree to a stream; you can then use that stream as input to a SAX processor.

// Listing 1: Output a DOM tree to a stream (Xerces-specific)
import org.apache.xerces.parsers.DOMParser;
import org.apache.xml.serialize.XMLSerializer;
import org.xml.sax.InputSource;
import org.w3c.dom.Document;

public class PrintDOMTree {
    public static void main(String[] args) {
        try {
            // Parse the document named on the command line into a DOM tree
            InputSource source = new InputSource(args[0]);
            DOMParser parser = new DOMParser();
            parser.parse(source);
            Document doc = parser.getDocument();

            // Serialize the DOM tree with the Xerces serializer
            XMLSerializer serializer = new XMLSerializer();
            // Insert your PipedOutputStream here instead of System.out!
            serializer.setOutputByteStream(System.out);
            serializer.serialize(doc);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
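
The listing above stops once the DOM tree has been written to a stream. The second half, pushing that stream back into SAX, might look something like the sketch below. It makes several assumptions that are not part of the original tip: it swaps the Xerces-specific XMLSerializer for the JAXP identity Transformer (javax.xml.transform) so that it runs without vendor classes, it buffers the output in a ByteArrayOutputStream rather than a PipedOutputStream, and the class name DOMToSAXRoundTrip, its methods, and the sample document are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: serializes a DOM tree to a stream, then replays it through SAX
public class DOMToSAXRoundTrip {

    // Returns the element names in the order SAX reports them
    public static List<String> elementNames(Document doc) throws Exception {
        // Step 1: write the DOM tree to an in-memory stream
        // (assumption: JAXP identity transform instead of Xerces' XMLSerializer)
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(doc), new StreamResult(out));

        // Step 2: use that stream as input to a SAX parser
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new ByteArrayInputStream(out.toByteArray()), handler);
        return names;
    }

    // Builds a small DOM tree in memory so the sketch needs no input file
    public static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
        Element root = doc.createElement("tip");
        Element title = doc.createElement("title");
        title.appendChild(doc.createTextNode("Converting from DOM"));
        root.appendChild(title);
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // prints [tip, title]
        System.out.println(elementNames(sampleDocument()));
    }
}
```

Buffering keeps the sketch single-threaded. With a PipedOutputStream, as the comment in the listing suggests, the serializing code and the SAX parse would need to run on separate threads.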

From DOM to JDOM

Moving from DOM to JDOM is quite a bit easier than moving from DOM to SAX. This makes sense: once you have a DOM tree, you've probably already had a chance to get at the data through SAX. In fact, a DOM tree is rarely best handled by SAX, because you've already spent the memory needed to hold the entire XML document in a DOM representation. A far more common task is to convert an XML document that arrives as a DOM tree into a JDOM tree. Both formats are document representations, but they differ substantially in behavior and functionality, so you may want to let someone else take your DOM tree and work with it as JDOM. While you might argue that the conversion should be their job, you do need to know (at least!) how to convert from your structure to theirs.

To convert from DOM to JDOM, the JDOM API provides a consumer for DOM nodes called org.jdom.input.DOMBuilder. This class takes in a DOM Document (as well as some other DOM structures, such as Element and Attr) and converts the DOM tree to a JDOM Document. There really isn't much to this operation, so I'll simply show you the code in Listing 2 and let you see the process in action.

// Listing 2: Convert a DOM tree to a JDOM Document
// Java imports
import java.io.IOException;
// JDOM imports
import org.jdom.JDOMException;
import org.jdom.input.DOMBuilder;
import org.jdom.output.XMLOutputter;
// SAX
import org.xml.sax.InputSource;
// Xerces
import org.apache.xerces.parsers.DOMParser;

public class DOMtoJDOM {
    // DOM tree of input document
    org.w3c.dom.Document domDoc;

    // Parse the document at the given system ID into a DOM tree
    public DOMtoJDOM(String systemID) throws Exception {
        DOMParser parser = new DOMParser();
        parser.parse(new InputSource(systemID));
        domDoc = parser.getDocument();
    }

    public org.jdom.Document convert()
        throws JDOMException, IOException {
        // Create new DOMBuilder, using default parser
        DOMBuilder builder = new DOMBuilder();
        org.jdom.Document jdomDoc = builder.build(domDoc);
        return jdomDoc;
    }

    public static void main(String[] args) {
        try {
            DOMtoJDOM tester = new DOMtoJDOM(args[0]);
            org.jdom.Document jdomDoc = tester.convert();
            // Output the document to System.out
            XMLOutputter outputter = new XMLOutputter();
            outputter.output(jdomDoc, System.out);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

There's nothing more to say. Once you know how to move from DOM to SAX and JDOM, you're all set for tackling any output format you need and interacting with pretty much any type of XML representation you'll come up against. Watch the DOM Level 3 specification for changes to outputting DOM trees in a standard, vendor-independent way, and until then, enjoy using the DOM!
