Programming With XMLBeans

Discover how XMLBeans is revolutionizing data binding

Abhinav Chopra (abchopra@in.ibm.com), Senior Software Engineer, IBM Global Services
Abhinav Chopra is a Senior Software Engineer with IBM Global Services India Ltd. He has extensive experience in Java technology, J2EE, and XML-related technologies. His areas of expertise include designing and developing n-tier enterprise applications. He has presented on Java-language and XML related topics in department-level talks and holds professional certifications from IBM and Sun Microsystems. You can contact him at abchopra@in.ibm.com.

Summary:  Get an in-depth look at the features and functionality of XMLBeans. This article introduces the technology with a simple example, takes you through the step-by-step process of compilation and binding, and discusses advanced features like XML cursors, tokens, and XQuery expressions. It also discusses how XMLBeans is more powerful than other XML-Java technology data binding techniques.

Date:  17 Sep 2004
Level:  Introductory

As the complexity of enterprise applications increases, constraints and rules on the structure of XML documents become more and more stringent. Also, with the rapid adoption of Web services across the industry, XML now plays an undeniably important role across many platforms. All of this means it is imperative that applications have an easy and powerful mechanism for handling XML.

XMLBeans provides such a mechanism. It is an XML data binding technology that supports nearly all of the W3C XML Schema specification, unlike other data-binding techniques that support only a subset of it. It is also easy to use for developers who are accustomed to object-oriented manipulations.

With XMLBeans, you can access and manipulate the data contained in an XML document using Java classes.

Binding is a two-step process:

  1. The XMLBeans compiler generates an object representation of an XML schema. This object representation is a set of generic Java classes and interfaces that represent the structure and constraints of the schema.
  2. An actual XML instance document that conforms to the above schema is bound to instances of the Java classes and interfaces generated in Step 1. The binding process involves using the XMLBeans API to access the data in the XML instance document in an object-oriented way.

Once the XMLBeans compiler generates the generic Java classes and interfaces that correspond to the schema, any XML instance document that conforms to the schema can be bound using these classes and interfaces. XMLBeans goes a step beyond traditional parsing in that you don't have to:

  • Navigate through each node of an in-memory data tree.
  • Write callback methods to fetch the information from an XML document. (See Advantages of XMLBeans later in this article for further comparison of XMLBeans and parsing.)
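For contrast, the node-by-node navigation that XMLBeans spares you can be sketched with the JDK's own DOM API. This is only an illustration and is not part of the article's download: the class name DomNavigationSketch, the helper firstChildElement(), and the inline document (a cut-down version of the sample used later in this article) are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomNavigationSketch {

    // Hypothetical helper: scans the children of parent one node at a time
    // and returns the first child element with the given name, or null.
    static Element firstChildElement(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Walks root -> insured-vehicle -> model by hand and returns the model text.
    // Assumes both elements are present in the document.
    static String modelOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element policy = doc.getDocumentElement();
            Element vehicle = firstChildElement(policy, "insured-vehicle");
            Element model = firstChildElement(vehicle, "model");
            return model.getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<automobile-policy><insured-vehicle>"
                   + "<make>Chevy</make><model>Optra</model>"
                   + "</insured-vehicle></automobile-policy>";
        System.out.println(modelOf(xml));
    }
}
```

Every level of the document needs its own loop or helper call here; with data binding, the same lookup becomes a chain of typed getters.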

A simple example

Consider a simple example in which the XMLBeans compiler takes a schema as input and compiles it into Java interfaces. I will then show you how an actual XML instance document that conforms to this schema is bound to these interfaces.


Listing 1. The input schema (automobile-policy.xsd)
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xs:element name="automobile-policy">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="insurance-date" type="xs:dateTime"/>
            <xs:element name="policyholder-information"
                        type="policyholder-information" minOccurs="1"/>
            <xs:element name="insured-vehicle"
                        type="insured-vehicle" minOccurs="1"/>
            <xs:element name="liability-coverage"
                        type="liability-coverage" minOccurs="1"/>
            <xs:element name="third-party-coverage"
                        type="third-party-coverage"/>
         </xs:sequence>
      </xs:complexType>
   </xs:element>
   <xs:complexType name="policyholder-information">
      <xs:sequence>
         <xs:element name="name" type="xs:string"/>
         <xs:element name="social-security-number" type="xs:string"/>
         <xs:element name="address" type="xs:string"/>
      </xs:sequence>
   </xs:complexType>
   <xs:complexType name="insured-vehicle">
      <xs:sequence>
         <xs:element name="year-of-manufacture" type="xs:string"/>
         <xs:element name="make" type="xs:string"/>
         <xs:element name="model" type="xs:string"/>
         <xs:element name="price" type="xs:double"/>
      </xs:sequence>
   </xs:complexType>
   <xs:complexType name="liability-coverage">
      <xs:sequence>
         <xs:element name="coverage-limit" type="xs:double"/>
         <xs:element name="coverage-premium" type="xs:double"/>
      </xs:sequence>
   </xs:complexType>
   <xs:complexType name="third-party-coverage">
      <xs:sequence>
         <xs:element name="coverage-limit" type="xs:double"/>
         <xs:element name="coverage-premium" type="xs:double"/>
      </xs:sequence>
   </xs:complexType>
</xs:schema>

The schema in Listing 1 describes an automobile insurance policy. It has:

  • A global root element named automobile-policy
  • Four named complex types: policyholder-information, insured-vehicle, liability-coverage, and third-party-coverage
  • An element named insurance-date, whose type is the built-in simple type xs:dateTime

Compilation process

Before you proceed with the compilation process for the schema in Listing 1, download and install Apache XMLBeans version 1.0.3 (see Resources). Extract the files from the archive, add the bin directory to the path, and add lib/xbean.jar to the classpath.

The bin directory contains scripts for performing a number of useful actions. For example (on the Windows platform):

  • scomp.cmd is the schema compiler. It compiles schemas into XMLBeans classes and interfaces.
  • validate.cmd validates the XML instance document against the schema.

On UNIX and Linux platforms, XMLBeans provides scomp.sh and validate.sh for performing the above operations.
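To make concrete what validating an instance against a schema means, here is a minimal sketch that uses the JDK's javax.xml.validation API rather than the XMLBeans validate script. The class name SchemaValidationSketch, the method isValid(), and the cut-down one-element schema are assumptions made for the example; the XMLBeans script may well report errors differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class SchemaValidationSketch {

    // Cut-down schema: a single element holding two xs:double children.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='liability-coverage'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='coverage-limit' type='xs:double'/>"
        + "<xs:element name='coverage-premium' type='xs:double'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element></xs:schema>";

    // Returns true when the instance document conforms to XSD.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown when the instance (or the schema itself) is rejected.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<liability-coverage><coverage-limit>1222</coverage-limit>"
                    + "<coverage-premium>12</coverage-premium></liability-coverage>";
        String bad = good.replace("1222", "abc"); // not a valid xs:double
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```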

xbean.jar contains the actual XMLBeans API classes.

Once you place the schema in an appropriate folder and set the path and classpath, use the following command to compile the schema:

scomp  -out  automobile-policy.jar automobile-policy.xsd

In this command, scomp is the schema compiler, the -out option is used for the name of the output jar, automobile-policy.jar is the output jar, and automobile-policy.xsd is the schema being compiled.

The above command compiles automobile-policy.xsd into XMLBeans interfaces and classes, and packages them into automobile-policy.jar. The following interfaces are generated as a result of compilation:

  • AutomobilePolicyDocument represents the document that contains the root element. A document interface like this is generated for each global element. In this case, it is generated for automobile-policy.
  • AutomobilePolicyDocument$AutomobilePolicy represents the global root element automobile-policy.
  • PolicyholderInformation represents the complex type policyholder-information.
  • InsuredVehicle represents the complex type insured-vehicle.
  • LiabilityCoverage represents the complex type liability-coverage.
  • ThirdPartyCoverage represents the complex type third-party-coverage.
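The pattern in these names (hyphenated schema names turned into capitalized Java names) can be sketched in a few lines. This is a simplified illustration of the pattern visible in the list above, not the name-mangling algorithm that XMLBeans actually implements; NameMappingSketch and toJavaName() are hypothetical names.

```java
public class NameMappingSketch {

    // Simplified illustration: upper-cases the first letter of each
    // hyphen-separated part and joins the parts together.
    static String toJavaName(String schemaName) {
        StringBuilder sb = new StringBuilder();
        for (String part : schemaName.split("-")) {
            if (part.isEmpty()) {
                continue;
            }
            sb.append(Character.toUpperCase(part.charAt(0)));
            sb.append(part.substring(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaName("policyholder-information")); // PolicyholderInformation
        System.out.println(toJavaName("automobile-policy") + "Document"); // AutomobilePolicyDocument
        System.out.println("get" + toJavaName("insured-vehicle")); // getInsuredVehicle
    }
}
```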

The package for the generated interfaces is derived from the target namespace of the schema. Since this schema does not declare a target namespace, these interfaces are generated in the package noNamespace.

Take a look at these interfaces. The interface AutomobilePolicyDocument includes the following methods:

  • getAutomobilePolicy() gets the automobile-policy element.
  • setAutomobilePolicy(AutomobilePolicy automobilePolicy) sets the automobile-policy element.
  • addNewAutomobilePolicy() appends and returns a new empty automobile-policy element.

Similarly, the AutomobilePolicyDocument$AutomobilePolicy interface includes the following methods:

  • getPolicyholderInformation() gets the policyholder-information element.
  • getInsuredVehicle() gets the insured-vehicle element.

The bottom line is that the XML Schema structure is replicated as Java interfaces, and all basic operations -- say, to add new elements or to get or set the existing elements -- are available as methods on these interfaces.

Also, each generated interface has a nested Factory class that contains static methods such as:

  • newInstance(), which is used to create instances of this type
  • parse(), which is used to parse the actual XML instance document

Binding process

After the schema has been compiled into XMLBeans interfaces and classes, you need to bind an XML instance to these classes. Here is the code from AutomobilePolicyHandler.java that uses the generated interfaces to handle an actual XML instance based on the schema compiled.


Listing 2. AutomobilePolicyHandler.java
 import noNamespace.*;
 import java.io.File;

 public class AutomobilePolicyHandler {
   public static void main(String[] args) {
     try {
       // Parse the XML instance into the generated document type
       File inputXMLFile = new File("automobile-policy.xml");
       AutomobilePolicyDocument autoPolicyDoc =
         AutomobilePolicyDocument.Factory.parse(inputXMLFile);

       // Get a handle to the root element, automobile-policy
       AutomobilePolicyDocument.AutomobilePolicy autoPolicyElement =
         autoPolicyDoc.getAutomobilePolicy();
       System.out.println("date is " + autoPolicyElement.getInsuranceDate());
     } catch (Exception e) {
       e.printStackTrace();
     }
   }
 }

The code in Listing 2 takes as input the actual XML instance and uses the parse() method of the Factory class of AutomobilePolicyDocument to get an instance of AutomobilePolicyDocument.

The getAutomobilePolicy() method on the above instance of AutomobilePolicyDocument then gives a handle to the root element, automobile-policy. You can then use simple getters and setters to retrieve the values of the child elements in the XML.

The following XML file, automobile-policy.xml, conforms to the schema automobile-policy.xsd, and can be used for the example in Listing 2.


Listing 3. automobile-policy.xml
   <automobile-policy>
      <insurance-date>2004-09-05T14:12:22-05:00</insurance-date>
      <policyholder-information>
         <name>Alan</name>
         <social-security-number>1GBL7D1G3GV100770
                     </social-security-number>
         <address>171 Dormonth Street, Fairfield, OH</address>
      </policyholder-information>
      <insured-vehicle>
         <year-of-manufacture>1999</year-of-manufacture>
         <make>Chevy</make>
         <model>Optra</model>
         <price>1234</price>
      </insured-vehicle>
      <liability-coverage>
         <coverage-limit>1222</coverage-limit>
         <coverage-premium>12</coverage-premium>
      </liability-coverage>
      <third-party-coverage>
            <coverage-limit>2343</coverage-limit>
            <coverage-premium>14</coverage-premium>
      </third-party-coverage>
   </automobile-policy> 


Hierarchy of XMLBeans

All of the XMLBeans types generated as a result of the compilation process extend org.apache.xmlbeans.XmlObject. This is the base interface for all XMLBeans types, and it includes a number of common facilities that all XMLBeans types provide:

  • It has methods to copy an XmlObject instance to or from a standard DOM tree or SAX stream.
  • It has a validate() method that validates the subtree of XML under this XmlObject.
  • It has a selectPath(java.lang.String) method that uses relative XPaths to find other XmlObjects in the subtree underneath this XmlObject.

Beneath the XmlObject level, you have user-derived schema types and built-in schema types. I have already shown you the semantics of user-derived schema types such as automobile-policy and policyholder-information. As I mentioned, each user-derived schema type is represented as an interface.

On the other hand, for built-in schema types such as xs:int and xs:string, XMLBeans provides 46 Java types that correspond to the 46 built-in types defined by the W3C XML Schema specification. For example, for xs:string in XML Schema, XMLBeans provides XmlString.

Returning to the schema, to fetch the social security number of type xs:string that's inside the policyholder-information complex type, XMLBeans provides the following method:

     org.apache.xmlbeans.XmlString xgetSocialSecurityNumber();

This method returns XmlString.

Of course, XMLBeans also provides a method that returns the natural Java type:

     java.lang.String getSocialSecurityNumber();

Note that the method name starts with xget in cases where it returns an XMLBeans type. The xget version of a method provides a performance benefit over the get version, as the get version has to convert the data to the most appropriate Java type.
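What converting to a natural Java type involves can be sketched with JDK classes alone. The example below is not XMLBeans code: it only mimics, under simplifying assumptions, the kind of conversion a get method has to perform, turning the lexical form of an xs:dateTime into a java.util.Calendar and that of an xs:double into a double. NaturalTypeSketch, toCalendar(), and toDouble() are hypothetical names.

```java
import java.util.Calendar;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;

public class NaturalTypeSketch {

    // Converts the lexical form of an xs:dateTime into a java.util.Calendar.
    static Calendar toCalendar(String xsDateTime) {
        try {
            return DatatypeFactory.newInstance()
                .newXMLGregorianCalendar(xsDateTime.trim())
                .toGregorianCalendar();
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException("no DatatypeFactory available", e);
        }
    }

    // Converts the lexical form of an xs:double into a Java double.
    // Simplification: the special value INF allowed by XML Schema is not handled.
    static double toDouble(String xsDouble) {
        return Double.parseDouble(xsDouble.trim());
    }

    public static void main(String[] args) {
        Calendar date = toCalendar("2004-09-05T14:12:22-05:00");
        System.out.println(date.get(Calendar.YEAR));
        System.out.println(toDouble("1234"));
    }
}
```

The xget methods skip this step and hand back the XMLBeans type directly, which is where the performance benefit mentioned above comes from.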


Advanced features

Now that you've seen a simple example of how easy it is to use XMLBeans, and become familiar with the hierarchy of XMLBeans, it's time to take a look at some of its advanced features. These features are truly representative of the power of XMLBeans.

XML cursors

An XML cursor defines a location in an XML document. It is ideal for working with XML documents when a user does not have a schema available. A cursor allows the user to navigate through the document by changing its own location. It also allows the user to remove and insert XML fragments, get and set XML values, and more.

Listing 4 shows a simple example of how XML cursors work. The code in CursorHandler.java retrieves the value of the model of the insured vehicle in automobile-policy.xml.


Listing 4. CursorHandler.java
  import noNamespace.*;
  import java.io.File;
  import org.apache.xmlbeans.XmlCursor;

  public class CursorHandler {
    public static void main(String[] args) {
      try {
        File inputXMLFile = new File("automobile-policy.xml");
        AutomobilePolicyDocument autoPolicyDoc =
           AutomobilePolicyDocument.Factory.parse(inputXMLFile);

        // The new cursor starts at the beginning of the document
        XmlCursor cursor = autoPolicyDoc.newCursor();
        cursor.toFirstContentToken();  // start of <automobile-policy>
        cursor.toChild(2);             // third child: <insured-vehicle>
        cursor.toChild(2);             // third child: <model>
        System.out.println(cursor.getTextValue());
        System.out.println("Type of Token is: " +
           cursor.currentTokenType() +
           "\nText of Token is: " + cursor.xmlText());
        cursor.dispose();
      } catch (Exception e) {
        e.printStackTrace();
      }
    }
  }

Here, the cursor is created at the beginning of the XML instance. Then the method toFirstContentToken() moves the cursor to the first token in the content of the current START or STARTDOC. What this essentially means is that the cursor is now positioned at the start of the root element, automobile-policy.

At this point, cursor.getTextValue() returns the text content of the entire XML document.

Since you are interested in finding the value of the model of the insured vehicle, the first call to cursor.toChild(2) takes the cursor to the third child element of automobile-policy (the index is zero-based), which is <insured-vehicle>. With the cursor now positioned at the <insured-vehicle> element, calling cursor.toChild(2) again brings the cursor to the third child element relative to the current position, which is the <model> element.

cursor.getTextValue() then retrieves the value of the model.

When you are finished with the cursor, don't forget to call its dispose() method.

XML tokens

An XML token represents a category of XML markup. Essentially, XML tokens stand for the different kinds of parts an XML document can have. These include the start and end of an XML document, the start and end of elements, attributes and their values, and so on.

As XML cursors are moved in the code, they move from one token to another. When you move a cursor, it moves to the token that fits the description. If the cursor finds no appropriate token to move to, it remains where it is, and a value of "false" is returned to indicate that the cursor didn't move.

Each token type is represented by a constant in the XmlCursor.TokenType class. These constants include:

  • INT_STARTDOC, which represents the start of the XML document (excluding the XML declaration)
  • INT_ENDDOC, which represents the end of the XML document
  • INT_TEXT, which represents contents of an element

The tokens themselves are not exposed as objects, but their type and properties are discoverable through methods on the cursor. For example, the following code snippet in CursorHandler.java prints the token type and the token text.


Listing 5. Code that prints token type and token text
  System.out.println("Type of Token is: " + cursor.currentTokenType()
                        + "\nText of Token is: " + cursor.xmlText());
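The idea of a document as a sequence of tokens can also be seen without XMLBeans. The sketch below uses the JDK's StAX reader as a stand-in and labels each StAX event with the XMLBeans token name it roughly corresponds to. This is an analogy only: StAX events and XMLBeans tokens are not the same thing (attributes, for instance, are tokens in XMLBeans but are not separate StAX events), and TokenStreamSketch, label(), and tokens() are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class TokenStreamSketch {

    // Maps a StAX event type to the XMLBeans token name it roughly resembles.
    static String label(int eventType) {
        switch (eventType) {
            case XMLStreamConstants.START_DOCUMENT: return "STARTDOC";
            case XMLStreamConstants.END_DOCUMENT:   return "ENDDOC";
            case XMLStreamConstants.START_ELEMENT:  return "START";
            case XMLStreamConstants.END_ELEMENT:    return "END";
            case XMLStreamConstants.CHARACTERS:     return "TEXT";
            default:                                return "OTHER";
        }
    }

    // Reads the document from start to end and records one label per event.
    static List<String> tokens(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Report adjacent character data as a single event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader =
                factory.createXMLStreamReader(new StringReader(xml));
            result.add(label(reader.getEventType()));
            while (reader.hasNext()) {
                result.add(label(reader.next()));
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not read XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("<model>Optra</model>"));
    }
}
```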

XQuery expressions

XMLBeans provides support for XQuery expressions. XQuery offers an SQL-like syntax for traversing an XML document to reach elements and attributes, and combining XQuery expressions with XML cursors makes for a very powerful tool. Taking the same example as above -- that of retrieving the value of the insured vehicle model in automobile-policy.xml -- the code snippet in Listing 6 should do the trick.


Listing 6. Code to retrieve the value of an XML element using an XQuery expression
  XmlCursor cursor = autoPolicyDoc.newCursor();
  String modelQuery = "$this/automobile-policy/insured-vehicle/model";

  //Note that execQuery creates a new cursor
  XmlCursor resultCursor = cursor.execQuery(modelQuery);
  System.out.println(resultCursor.getTextValue());

  //Dispose of both cursors when finished
  resultCursor.dispose();
  cursor.dispose();

The code in Listing 6 creates an XQuery expression that reaches out to the required element. The execQuery() method runs this query expression and a new resultCursor is returned. This resultCursor is then used to print the value of the model element. The variable $this denotes the current position of the XML cursor.

Note on XQuery support: Version 1.x of XMLBeans does not come with an XQuery engine. Version 2 will probably integrate an XQuery engine with XMLBeans.
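Because XMLBeans 1.x ships without an XQuery engine, it can help to try the path itself with something that runs out of the box. The sketch below evaluates the equivalent path with the JDK's javax.xml.xpath API. It is a stand-in, not the XMLBeans API: plain XPath has no $this variable, so the path is written as an absolute one, and XPathModelSketch and model() are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathModelSketch {

    // Returns the text of the first model element under insured-vehicle.
    static String model(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(
                "/automobile-policy/insured-vehicle/model",
                new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("could not evaluate path", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<automobile-policy><insured-vehicle>"
                   + "<make>Chevy</make><model>Optra</model>"
                   + "</insured-vehicle></automobile-policy>";
        System.out.println(model(xml));
    }
}
```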


Advantages of XMLBeans

XMLBeans competes with traditional parsing and binding techniques such as DOM, SAX, JAXB, and Castor. But XMLBeans has some distinct advantages. Here's how they compare:

  • DOM generates an in-memory tree for the entire document. In cases where the document is very large, DOM becomes quite memory intensive and performance degrades substantially. XMLBeans scores well on performance by doing incremental unmarshalling and providing xget methods to access built-in schema data types.
  • SAX is less memory intensive than DOM. However, SAX requires that developers write callback methods for event handlers. This is not required by XMLBeans.
  • JAXB and Castor are XML/Java binding technologies like XMLBeans, but neither provides 100 percent schema support. One of the biggest advantages of XMLBeans is its nearly 100 percent support for XML Schema. Also, XMLBeans allows access to the full XML Infoset. This is useful because the order of the elements or comments might be critical for an application.
  • XMLBeans also provides for on-time validation of the XML instance being parsed.
  • XMLBeans includes innovative features such as XML cursors and support for XQuery expressions.
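The callback style that the SAX comparison refers to looks roughly like this with the JDK's SAX parser. The sketch is only meant to show how much plumbing a single lookup needs. SaxCallbackSketch, ModelHandler, and modelOf() are hypothetical names, and the inline document is a cut-down version of Listing 3.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCallbackSketch {

    // Callback handler that collects the text found inside <model> elements.
    static class ModelHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean inModel;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("model".equals(qName)) {
                inModel = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inModel) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("model".equals(qName)) {
                inModel = false;
            }
        }

        String model() {
            return text.toString().trim();
        }
    }

    // Parses the document and returns whatever the handler collected.
    static String modelOf(String xml) {
        try {
            ModelHandler handler = new ModelHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.model();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<automobile-policy><insured-vehicle>"
                   + "<make>Chevy</make><model>Optra</model>"
                   + "</insured-vehicle></automobile-policy>";
        System.out.println(modelOf(xml));
    }
}
```

The handler has to track its own state (the inModel flag) across callbacks, which is exactly the bookkeeping that the generated getters in Listing 2 make unnecessary.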

Conclusion

On the XML and Java technology frontier, where numerous technologies jostle for space, XMLBeans is making a mark for itself in a very short time. In scenarios where developers need to work with complex XML schemas and need more native support (such as access to the full XML Infoset), XMLBeans has no equal.

The performance benefits and the on-time validation provision make XMLBeans extremely powerful for all kinds of XML and Java technology data binding scenarios. This easy-to-understand API reduces the learning curve for developers, which also makes it a very tempting option. This is one technology that is both powerful and exciting.



Download

Name                 Size   Download method
x-beans1_code.zip    3KB    HTTP


Resources
