Practical data binding: Looking into JAXB, Part 2

A closer look at JAXB's round-tripping capabilities

Brett McLaughlin (brett@oreilly.com), Editor, O'Reilly and Associates
Brett McLaughlin has been working in computers since the Logo days (Remember the little triangle?). He currently specializes in building application infrastructure using Java-related technologies. He has spent the last several years implementing these infrastructures at Nextel Communications and Allegiance Telecom, Inc. Brett is one of the co-founders of the Java Apache project Turbine, which builds a reusable component architecture for Web application development using Java servlets. He is also a contributor of the EJBoss project, an open source EJB application server, and Cocoon, an open source XML Web-publishing engine.

Summary:  The last installment of this column dealt with generating classes using the JAXB API. This article focuses on using these classes, and examines JAXB's round-tripping capabilities. You'll learn where the problem areas are in converting from XML to Java code and then back again.


Date:  17 Jun 2004
Level:  Introductory


Data binding APIs are most useful in that they allow programmatic manipulation of XML. It's a lot easier to type someElement.addAttribute("name", "value"); than it is to parse a file, buffer the output, add the characters that make up an attribute declaration, close the stream, and flush the output stream. However, all the manipulation in the world is of little use if you can't properly write your changes back out to a file. This article focuses on that process -- known as marshalling in the data binding world -- and in particular JAXB's marshalling capabilities. Specifically, you'll learn how JAXB scores in the round-tripping arena.

The road so far

In the first installment of this column, you learned some important terms. Marshalling and unmarshalling are intrinsic to the data binding world, but you also learned two newer terms: round-tripping and semantic equivalence. Round-tripping is the process of converting from XML to Java code, and then back again. The quality of a data binding API's round-tripping capabilities is measured by how closely the input and output documents match up. Semantic equivalence makes that comparison possible -- it allows insignificant aspects of XML, like ignorable whitespace, to be discarded and a valid comparison to be made.

In the second installment, I introduced a simple XML document, shown again here in Listing 1.


Listing 1. Basic XML listing of guitars (guitars.xml)
<guitars>
  <guitar id="10021">
    <builder luthier="true">Ryan</builder>
    <model>Mission Grand Concert</model>
    <back-sides>Brazilian Rosewood</back-sides>
    <top>Adirondack Spruce</top>
    <notes>
      <![CDATA[
        Just unbelievable...   this guitar has all the tone & 
        resonance you could ever want. I mean, <<WOW!!!>> This 
        is a lifetime guitar.
      ]]>
    </notes>
  </guitar>
  <guitar id="0923">
    <builder smallShop="true">Bourgeois</builder>
    <model>OMC</model>
    <back-sides>Bubinga</back-sides>
    <top>Adirondack Spruce</top>
  </guitar>
  <guitar id="11091">
    <builder>Martin &amp; Company</builder>
    <model>OM-28VR</model>
    <back-sides>Indian Rosewood</back-sides>
    <top bearclaw="true">Sitka Spruce</top>
    <notes>It's certainly true that Martin isn't the only game in town anymore. 
           Still, the OM-28VR is one of their best models...     and this one 
           has some fabulous bearclaw to boot.              Nice specimen of a 
           still-important guitar manufacturer.
    </notes>
  </guitar>
</guitars>

I also supplied a schema for this document -- which isn't repeated here for brevity's sake -- and I showed you how to generate Java source files from this schema, as shown in Listing 2.


Listing 2. JAXB class generation output
		
C:\developerworks>xjc -p com.ibm.dw guitars.xsd -d src
parsing a schema...
compiling a schema...
com\ibm\dw\impl\runtime\MSVValidator.java
com\ibm\dw\impl\runtime\SAXUnmarshallerHandlerImpl.java
com\ibm\dw\impl\runtime\ErrorHandlerAdaptor.java
com\ibm\dw\impl\runtime\AbstractUnmarshallingEventHandlerImpl.java
com\ibm\dw\impl\runtime\UnmarshallableObject.java
com\ibm\dw\impl\runtime\SAXMarshaller.java
com\ibm\dw\impl\runtime\XMLSerializer.java
com\ibm\dw\impl\runtime\ContentHandlerAdaptor.java
com\ibm\dw\impl\runtime\UnmarshallingEventHandlerAdaptor.java
com\ibm\dw\impl\runtime\SAXUnmarshallerHandler.java
com\ibm\dw\impl\runtime\ValidatorImpl.java
com\ibm\dw\impl\runtime\ValidatableObject.java
com\ibm\dw\impl\runtime\UnmarshallerImpl.java
com\ibm\dw\impl\runtime\NamespaceContext2.java
com\ibm\dw\impl\runtime\Discarder.java
com\ibm\dw\impl\runtime\NamespaceContextImpl.java
com\ibm\dw\impl\runtime\ValidatingUnmarshaller.java
com\ibm\dw\impl\runtime\UnmarshallingContext.java
com\ibm\dw\impl\runtime\GrammarInfoImpl.java
com\ibm\dw\impl\runtime\ValidationContext.java

Ensure that you have these Java source files generated, compiled, and ready for use. For detailed steps, consult the previous article in the series.


XML to Java code

With your classes generated and ready for use, you're all set to unmarshal the XML document from Listing 1 into JAXB's in-memory model. This is the first step in testing out JAXB's round-tripping capabilities. Since this isn't an article on JAXB basics (for such articles, see Resources), I'll just let you see the code, shown in Listing 3.


Listing 3. Unmarshalling XML to Java code
		
import java.io.FileInputStream;
import javax.xml.bind.*;

// Import generated classes
import com.ibm.dw.*;

public class RoundTripper {

  private String inputFilename;
  private String outputFilename;
  private JAXBContext jc;

  private final String PACKAGE_NAME = "com.ibm.dw";

  public RoundTripper(String inputFilename, String outputFilename) throws Exception {
    this.inputFilename = inputFilename;
    this.outputFilename = outputFilename;
    jc = JAXBContext.newInstance(PACKAGE_NAME);
  }

  public Guitars unmarshal() throws Exception {
    Unmarshaller u = jc.createUnmarshaller();
    return (Guitars)u.unmarshal(new FileInputStream(inputFilename));
  }

  public static void main(String[] args) {
    if (args.length < 2) {
      System.err.println("Incorrect usage: java RoundTripper " +
                   "[input XML filename] [output XML filename]");
      return;
    }

    try {
      RoundTripper rt = new RoundTripper(args[0], args[1]);
      Guitars guitars = rt.unmarshal();
    } catch (Exception e) {
      e.printStackTrace();
      return;
    }
  }
}

Note: If you are having trouble getting these classes set up and running, consult the last section in this article, "Running example programs," for help.

Some might think that at this point you should print out the version in memory. However, the same APIs used to print something in memory are also used to write the data to an output stream, so that step is really unnecessary.


Java code to XML

Now you can instruct JAXB to spit the in-memory representation back out to XML. This will allow you to inspect the differences between your input file and your output file. I've added code to the RoundTripper class to take care of this, as seen in Listing 4.


Listing 4. Marshalling Java to XML
import java.io.FileInputStream;
import java.io.FileOutputStream;
import javax.xml.bind.*;

// Import generated classes
import com.ibm.dw.*;

public class RoundTripper {

  private String inputFilename;
  private String outputFilename;
  private JAXBContext jc;

  private final String PACKAGE_NAME = "com.ibm.dw";

  public RoundTripper(String inputFilename, String outputFilename) throws Exception {
    this.inputFilename = inputFilename;
    this.outputFilename = outputFilename;
    jc = JAXBContext.newInstance(PACKAGE_NAME);
  }

  public Guitars unmarshal() throws Exception {
    Unmarshaller u = jc.createUnmarshaller();
    return (Guitars)u.unmarshal(new FileInputStream(inputFilename));
  }

  public void marshal(Guitars guitars) throws Exception {
    Marshaller m = jc.createMarshaller();
    m.marshal(guitars, new FileOutputStream(outputFilename));
  }

  public static void main(String[] args) {
    if (args.length < 2) {
      System.err.println("Incorrect usage: java RoundTripper " +
                   "[input XML filename] [output XML filename]");
      return;
    }

    try {
      RoundTripper rt = new RoundTripper(args[0], args[1]);
      Guitars guitars = rt.unmarshal();
      rt.marshal(guitars);
    } catch (Exception e) {
      e.printStackTrace();
      return;
    }
  }
}			

This is, again, fairly simple and self-explanatory. I ran this program with guitars.xml as the input file, and supplied output.xml as the output filename. There isn't any output to speak of, in terms of text written to the terminal, but you should get a new file (output.xml) from running this process. Theoretically, this file should be a carbon copy of guitars.xml, since no changes were made to the file in memory.


Comparing apples with apples

Once you've generated output.xml, open it up. It should look very similar, if not identical, to Listing 5.


Listing 5. output.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<guitars>
	<guitar id="10021">
		<builder luthier="true">Ryan</builder>
		<model>Mission Grand Concert</model>
		<back-sides>Brazilian Rosewood</back-sides>
		<top>Adirondack Spruce</top>
		<notes>
      
        Just unbelievable...   this guitar has all the tone &amp; 
        resonance you could ever want. I mean, &lt;&lt;WOW!!!&gt;&gt; This 
        is a lifetime guitar.
      
    </notes>
	</guitar>
	<guitar id="0923">
		<builder smallShop="true">Bourgeois</builder>
		<model>OMC</model>
		<back-sides>Bubinga</back-sides>
		<top>Adirondack Spruce</top>
	</guitar>
	<guitar id="11091">
		<builder>Martin &amp; Company</builder>
		<model>OM-28VR</model>
		<back-sides>Indian Rosewood</back-sides>
		<top bearclaw="true">Sitka Spruce</top>
		<notes>It's certainly true that Martin isn't the only game in town anymore.
      Still, the OM-28VR is one of their best models...     and this one 
      has some fabulous bearclaw to boot.              Nice specimen of a 
      still-important guitar manufacturer.
    </notes>
	</guitar>
</guitars>


With an input and output file, it's now possible to see how JAXB does at round-tripping, by comparing the files to each other (remember that the original file was shown back in Listing 1).

Added XML declaration

First, note that the input file had no XML declaration (the line that starts with <?xml version=...). JAXB automatically inserted this line into the output. This may seem like a minor issue, but it is pretty important -- it's common to embed one XML document in another these days, particularly when working with SOAP or other transport technologies. The problem with the inserted XML declaration is that a document can have only one, and it must appear at the very start of the document. If you inserted guitars.xml into another XML document, you wouldn't violate this rule; if you did the same with output.xml, the declaration would land in the middle of the enclosing document and a parser would reject it. So right away, JAXB has a feature that you need to be careful about.
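To make the embedding problem concrete, here is a minimal sketch that uses only the JDK's built-in DOM parser, not JAXB. The class name DeclarationEmbedding, the envelope wrapper element, and the stripDeclaration helper are illustrative assumptions rather than part of JAXB or of this article's sample code, and the regular expression only handles a declaration at the very start of the string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class DeclarationEmbedding {

  // Returns true if the JDK's default parser accepts the string as well-formed XML.
  static boolean isWellFormed(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      builder.setErrorHandler(new DefaultHandler()); // keep parse errors off the console
      builder.parse(new InputSource(new StringReader(xml)));
      return true;
    } catch (Exception e) {
      return false;
    }
  }

  // Hypothetical helper: drops a leading XML declaration, if there is one.
  static String stripDeclaration(String xml) {
    return xml.replaceFirst("^<\\?xml\\s[^?]*\\?>\\s*", "");
  }

  // Hypothetical wrapper element standing in for a SOAP body or similar container.
  static String embed(String payload) {
    return "<envelope>" + payload + "</envelope>";
  }

  public static void main(String[] args) {
    String output = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
                  + "<guitars><guitar id=\"0923\"/></guitars>";

    // Embedding the marshalled document as is puts a declaration mid-document.
    System.out.println("As is:    " + isWellFormed(embed(output)));
    // Removing the declaration first leaves an ordinary element to embed.
    System.out.println("Stripped: " + isWellFormed(embed(stripDeclaration(output))));
  }
}
```

Stripping the declaration by hand like this is only a workaround for illustration; a string-level fix is fragile compared with controlling the marshaller's output directly.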

Removed CDATA sections

Also note that the CDATA sections in the original XML document have been removed. This technically doesn't violate the rules of semantic equivalence -- the content in both documents is semantically the same. In the first document, entity references are avoided through the use of CDATA; in the output document, CDATA is abandoned in favor of entity references. This is more an issue of literal equality between the documents than of semantic equality. It's something you should be aware of, although not something that's a major concern.
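One way to convince yourself that the two forms carry the same content is to parse both and compare the resulting trees. The following is a rough sketch using the JDK's DOM parser with coalescing turned on, so that CDATA sections are folded into ordinary text nodes. The class name CdataEquivalence and the shortened notes content are assumptions made for illustration, and this check is only one possible reading of semantic equivalence (it does not, for example, ignore whitespace differences).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataEquivalence {

  // Parses an XML string into a DOM tree, folding CDATA sections into plain text.
  static Document parse(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setCoalescing(true); // CDATA sections become ordinary text nodes
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document doc = builder.parse(new InputSource(new StringReader(xml)));
      doc.getDocumentElement().normalize(); // merge any adjacent text nodes
      return doc;
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse XML", e);
    }
  }

  // Compares the two documents node by node, starting at the root element.
  static boolean sameContent(String first, String second) {
    return parse(first).getDocumentElement()
        .isEqualNode(parse(second).getDocumentElement());
  }

  public static void main(String[] args) {
    String withCdata = "<notes><![CDATA[tone & resonance, <<WOW!!!>>]]></notes>";
    String withEntities = "<notes>tone &amp; resonance, &lt;&lt;WOW!!!&gt;&gt;</notes>";

    // Both strings describe the same character data, written two different ways.
    System.out.println(sameContent(withCdata, withEntities));
  }
}
```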

Whitespace handling

It's nice to see properly handled whitespace. Even though the CDATA section is removed, the whitespace inside it is preserved. In addition, the lengthy runs of spaces in the middle of the description of the Martin OM-28VR guitar are kept as is.


Retesting for assurance

One of the best and most telling ways to evaluate round-tripping is to actually retest the round-tripping process. Be careful, though -- I don't mean to simply redo the test. Instead, take the output file (output.xml), and feed it to the round-tripper as the input file. If anything has been introduced into the XML that shouldn't be there, you'll see each successive round trip create an output that is a little further away from the original file (guitars.xml). This is a great way to really isolate problem areas. A good data binding tool will always create the same file, over and over again, especially after the initial round trip.

Performing this step, I instruct my RoundTripper to produce retest.xml, based on output.xml as the source XML. The result is shown in Listing 6.


Listing 6. retest.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<guitars>
	<guitar id="10021">
		<builder luthier="true">Ryan</builder>
		<model>Mission Grand Concert</model>
		<back-sides>Brazilian Rosewood</back-sides>
		<top>Adirondack Spruce</top>
		<notes>
      
        Just unbelievable...   this guitar has all the tone &amp; 
        resonance you could ever want. I mean, &lt;&lt;WOW!!!&gt;&gt; This 
        is a lifetime guitar.
      
    </notes>
	</guitar>
	<guitar id="0923">
		<builder smallShop="true">Bourgeois</builder>
		<model>OMC</model>
		<back-sides>Bubinga</back-sides>
		<top>Adirondack Spruce</top>
	</guitar>
	<guitar id="11091">
		<builder>Martin &amp; Company</builder>
		<model>OM-28VR</model>
		<back-sides>Indian Rosewood</back-sides>
		<top bearclaw="true">Sitka Spruce</top>
		<notes>It's certainly true that Martin isn't the only game in town anymore.
      Still, the OM-28VR is one of their best models...     and this one 
      has some fabulous bearclaw to boot.              Nice specimen of a 
      still-important guitar manufacturer.
    </notes>
	</guitar>
</guitars>


The good news is that Listing 5 and Listing 6 are identical -- showing that JAXB does a pretty good job once that initial round-tripping step has been taken.
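The retest boils down to a simple property: running the round trip on its own output should change nothing. The sketch below expresses that check in plain Java. The class name RetestCheck and both toy round-trip functions are hypothetical stand-ins made up for illustration; in practice the function would wrap the unmarshal and marshal steps of RoundTripper and operate on the file contents.

```java
import java.util.function.UnaryOperator;

public class RetestCheck {

  // Runs the round trip twice and reports whether the second pass reproduces the first.
  static boolean isStable(UnaryOperator<String> roundTrip, String input) {
    String firstPass = roundTrip.apply(input);
    String secondPass = roundTrip.apply(firstPass);
    return firstPass.equals(secondPass);
  }

  // Toy stand-in for a well-behaved tool: adds an XML declaration only when none is present.
  static String addDeclarationOnce(String xml) {
    String declaration = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n";
    return xml.startsWith("<?xml") ? xml : declaration + xml;
  }

  // Toy stand-in for a misbehaving tool: appends a comment on every pass.
  static String appendCommentEveryPass(String xml) {
    return xml + "<!-- generated -->";
  }

  public static void main(String[] args) {
    String input = "<guitars/>";

    // The first tool changes the input once and then settles down.
    System.out.println(isStable(RetestCheck::addDeclarationOnce, input));
    // The second tool drifts further from the original with each pass.
    System.out.println(isStable(RetestCheck::appendCommentEveryPass, input));
  }
}
```

Note that the first pass is allowed to differ from the original input, just as output.xml differs from guitars.xml; the check only asks whether later passes keep changing the document.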

In general, then, I can say that JAXB shows pretty well. While I think the automatic addition of the XML declaration is a real issue, it's still not bad compared to APIs that affect the content. JAXB also handles CDATA sections a little differently than you might expect, but it does preserve semantic equivalence. In the next article, I'll show you the various options that you can tweak to further affect the output file, manually dealing with some of the issues that JAXB introduces. All in all, though, JAXB shows itself well in preserving the input document as it should.


Running example programs

In closing, let me share with you my cheat sheet, the Ant setup I use to make all my classpath and JAXB samples easy. Listing 7 shows the Ant build file I've used with this article. To use this file yourself, simply modify the paths to your own XML input files, as well as to your JAXB JAR files.


Listing 7. Ant build file
<?xml version="1.0"?>
<project basedir="." default="roundtrip">
	<property name="jwsdp.home" value="c:\jwsdp-1.3"/>
	<property name="xml.inputFile" value="guitars.xml"/>
	<property name="xml.outputFile" value="output.xml"/>
	<property name="xml.retestFile" value="retest.xml"/>
	<path id="classpath">
		<pathelement path="build"/>
		<fileset dir="${jwsdp.home}" includes="jaxb/lib/*.jar"/>
		<fileset dir="${jwsdp.home}" includes="jwsdp-shared/lib/*.jar"/>
		<fileset dir="${jwsdp.home}" includes="jaxp/lib/**/*.jar"/>
	</path>
	<taskdef name="xjc" classname="com.sun.tools.xjc.XJCTask">
		<classpath refid="classpath"/>
	</taskdef>
	<!-- compile Java source files -->
	<target name="compile">
		<!-- generate the Java content classes from the schema -->
		<echo message="Compiling the schema external binding file..."/>
		<xjc schema="guitars.xsd" package="com.ibm.dw" target="src"/>
		<!-- compile all of the java sources -->
		<echo message="Compiling the java source files..."/>
		<javac srcdir="src" destdir="build" debug="on">
			<classpath refid="classpath"/>
		</javac>
		
		<!-- Copy over the properties files -->
		<copy todir="build">
		  <fileset dir="src">
		    <exclude name="**/*.java"/>
		  </fileset>
		</copy>
	</target>
	
	<target name="roundtrip" depends="compile">
	  <echo message="Converting XML file to Java and back..."/>
	  <java classname="RoundTripper">
	    <arg value="${xml.inputFile}" />
	    <arg value="${xml.outputFile}" />
	    <classpath refid="classpath" />
	  </java>
	</target>
	
	<target name="roundtrip-retest" depends="roundtrip">
	  <echo message="Converting XML file to Java and back... (Second iteration)"/>
	  <java classname="RoundTripper">
	    <arg value="${xml.outputFile}" />
	    <arg value="${xml.retestFile}" />
	    <classpath refid="classpath" />
	  </java>
	</target>
</project>

By default, this file will generate source files from a schema, compile those files, copy over the required JAXB property files, and then compile and run the RoundTripper class. You can manually run the roundtrip-retest target, which handles the second pass of the round-tripping process, using output.xml as the input file. This file should make life easier -- enjoy!


Resources

