As the complexity of enterprise applications increases, constraints and rules on the structure of XML documents become more and more stringent. Also, with the rapid adoption of Web services across the industry, XML now plays an undeniably important role across many platforms. All of this means it is imperative that applications have an easy and powerful mechanism for handling XML.
XMLBeans provides such a mechanism. XMLBeans is used for XML data binding. It is immensely powerful in that it supports the full W3C XML Schema specification, unlike other data-binding techniques that support only a subset of it. It is also surprisingly easy to use for developers who are accustomed to object-oriented manipulations.
With XMLBeans, you can access and manipulate the data contained in an XML document using Java classes.
Here's how it's done -- actually, it's a two-step process:
- The XMLBeans compiler generates an object representation of an XML schema. This object representation is a set of generic Java classes and interfaces that represent the structure and constraints of the schema.
- An actual XML instance document that conforms to the above schema is bound to the instances of the Java classes and interfaces generated in Step 1. The binding process involves using the XMLBeans API to access the data in the actual XML instance document in an object-oriented way.
Once the XMLBeans compiler generates the generic Java classes and interfaces that correspond to the schema, any XML instance document that conforms to the schema can be bound using these classes and interfaces. XMLBeans goes a step beyond traditional parsing in that you don't have to:
- Navigate through each node of an in-memory data tree.
- Write callback methods to fetch the information from an XML document. (See Advantages of XMLBeans later in this article for further comparison of XMLBeans and parsing.)
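For contrast, here is a minimal sketch of the manual tree navigation that the first bullet refers to. It uses only the JDK's DOM API (not XMLBeans), and the class name DomNavigationSketch and the trimmed-down inline document are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration class; not part of XMLBeans or the article's download.
final class DomNavigationSketch {

    // Trimmed-down policy document, inlined so the sketch is self-contained.
    static final String SAMPLE_XML =
        "<automobile-policy>"
        + "<insured-vehicle>"
        + "<make>Chevy</make>"
        + "<model>Optra</model>"
        + "</insured-vehicle>"
        + "</automobile-policy>";

    // Returns the first direct child element with the given name, or null if none exists.
    static Element childElement(Element parent, String name) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE && name.equals(node.getNodeName())) {
                return (Element) node;
            }
        }
        return null;
    }

    // Walks root -> insured-vehicle -> model by hand and returns the model text.
    static String findModel(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element vehicle = childElement(doc.getDocumentElement(), "insured-vehicle");
            Element model = (vehicle == null) ? null : childElement(vehicle, "model");
            return (model == null) ? null : model.getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse policy XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findModel(SAMPLE_XML));
    }
}
```

Every level of the document needs its own lookup and null check here; with generated XMLBeans interfaces, the same walk becomes a chain of typed getters.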
Consider a simple example in which a schema is taken as input and the XMLBeans compiler compiles this schema into generic interfaces. I will show you how an actual XML instance document that conforms to this schema is then bound to these interfaces.
Listing 1. The input schema (automobile-policy.xsd)
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="automobile-policy">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="insurance-date" type="xs:dateTime"/>
        <xs:element name="policyholder-information"
                    type="policyholder-information" minOccurs="1"/>
        <xs:element name="insured-vehicle"
                    type="insured-vehicle" minOccurs="1"/>
        <xs:element name="liability-coverage"
                    type="liability-coverage" minOccurs="1"/>
        <xs:element name="third-party-coverage"
                    type="third-party-coverage"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="policyholder-information">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="social-security-number" type="xs:string"/>
      <xs:element name="address" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="insured-vehicle">
    <xs:sequence>
      <xs:element name="year-of-manufacture" type="xs:string"/>
      <xs:element name="make" type="xs:string"/>
      <xs:element name="model" type="xs:string"/>
      <xs:element name="price" type="xs:double"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="liability-coverage">
    <xs:sequence>
      <xs:element name="coverage-limit" type="xs:double"/>
      <xs:element name="coverage-premium" type="xs:double"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="third-party-coverage">
    <xs:sequence>
      <xs:element name="coverage-limit" type="xs:double"/>
      <xs:element name="coverage-premium" type="xs:double"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
The schema in Listing 1 describes an automobile insurance policy. It has:
- A global root element named automobile-policy
- Four complex types: policyholder-information, insured-vehicle, liability-coverage, and third-party-coverage
- An element named insurance-date of the built-in simple type xs:dateTime
Before you proceed with the compilation process for the schema in Listing 1, download and install Apache XMLBeans version 1.03 (see Resources). Extract the files from the archive and place the bin directory in the path and place lib/xbean.jar in the classpath.
The bin directory contains scripts for performing a number of useful actions. For example (on the Windows platform):
- scomp.cmd is the schema compiler. It compiles schemas into XMLBeans classes and interfaces.
- validate.cmd validates the XML instance document against the schema.
On UNIX and Linux platforms, XMLBeans provides scomp.sh and validate.sh for performing the above operations.
xbean.jar contains the actual XMLBeans API classes.
Once you place the schema in an appropriate folder and set the path and classpath, use the following command to compile the schema:
scomp -out automobile-policy.jar automobile-policy.xsd
In this command, scomp is the schema compiler, the -out option specifies the name of the output JAR file, automobile-policy.jar is that output file, and automobile-policy.xsd is the schema being compiled.
The above command compiles automobile-policy.xsd to the XMLBeans interfaces and classes, and packages them into automobile-policy.jar. The following interfaces are generated as a result of compilation:
- AutomobilePolicyDocument represents the document element. This interface is generated for global root elements. In this case, it is generated for automobile-policy.
- AutomobilePolicyDocument$AutomobilePolicy represents the global root element automobile-policy.
- PolicyholderInformation represents the complex type policyholder-information.
- InsuredVehicle represents the complex type insured-vehicle.
- LiabilityCoverage represents the complex type liability-coverage.
- ThirdPartyCoverage represents the complex type third-party-coverage.
The package for the generated interfaces is derived from a namespace used in the schema. Since this schema does not include any namespaces, these interfaces will be generated in the package noNamespace.
Take a look at these interfaces. The interface AutomobilePolicyDocument includes the following methods:
- getAutomobilePolicy() gets the automobile-policy element.
- setAutomobilePolicy(AutomobilePolicy automobilePolicy) sets the automobile-policy element.
- addNewAutomobilePolicy() appends and returns a new empty automobile-policy element.
Similarly, the AutomobilePolicyDocument$AutomobilePolicy interface includes the following methods:
- getPolicyholderInformation() gets the policyholder-information element.
- getInsuredVehicle() gets the insured-vehicle element.
The bottom line is that the XML Schema structure was replicated as Java interfaces, and all basic operations -- say, to add new elements or to get or set the existing elements -- were implemented as methods in these interfaces.
Also, each generated interface has a nested Factory class that contains static methods such as:
- newInstance(), which is used to create instances of this type
- parse(), which is used to parse the actual XML instance document
After the schema has been compiled into XMLBeans interfaces and classes, you need to bind an XML instance to these classes.
Here is the code from AutomobilePolicyHandler.java that uses the generated interfaces to handle an actual XML instance based on the schema compiled.
Listing 2. AutomobilePolicyHandler.java
import noNamespace.*;
import java.io.File;

public class AutomobilePolicyHandler {
    public static void main(String[] args) {
        try {
            // The XML instance document to bind (see Listing 3)
            String filePath = "automobile-policy.xml";
            File inputXMLFile = new File(filePath);

            // Parse the instance into the generated document type
            AutomobilePolicyDocument autoPolicyDoc =
                AutomobilePolicyDocument.Factory.parse(inputXMLFile);

            // Get a handle to the root element, automobile-policy
            AutomobilePolicyDocument.AutomobilePolicy autoPolicyElement =
                autoPolicyDoc.getAutomobilePolicy();

            // Read a child element through its generated getter
            System.out.println("date is " + autoPolicyElement.getInsuranceDate());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The code in Listing 2 takes as input the actual XML instance and uses the parse() method of the Factory class of AutomobilePolicyDocument to get an instance of AutomobilePolicyDocument.
The getAutomobilePolicy() method on the above instance of AutomobilePolicyDocument then gives a handle to the root element, automobile-policy. You can then use simple getters and setters to retrieve the values of the child elements in the XML.
The following XML file, automobile-policy.xml, conforms to the schema automobile-policy.xsd, and can be used for the example in Listing 2.
Listing 3. automobile-policy.xml
<automobile-policy>
  <insurance-date>2004-09-05T14:12:22-05:00</insurance-date>
  <policyholder-information>
    <name>Alan</name>
    <social-security-number>1GBL7D1G3GV100770</social-security-number>
    <address>171 Dormonth Street, Fairfield, OH</address>
  </policyholder-information>
  <insured-vehicle>
    <year-of-manufacture>1999</year-of-manufacture>
    <make>Chevy</make>
    <model>Optra</model>
    <price>1234</price>
  </insured-vehicle>
  <liability-coverage>
    <coverage-limit>1222</coverage-limit>
    <coverage-premium>12</coverage-premium>
  </liability-coverage>
  <third-party-coverage>
    <coverage-limit>2343</coverage-limit>
    <coverage-premium>14</coverage-premium>
  </third-party-coverage>
</automobile-policy>
All of the XMLBeans classes generated as a result of the compilation process extend org.apache.xmlbeans.XmlObject. This is the base interface for all XMLBeans types, and it includes a number of common facilities that all XMLBeans classes provide:
- It has methods to copy an XmlObject instance to or from a standard DOM tree or SAX stream.
- It has a validate() method that validates the subtree of XML under this XmlObject.
- It has a selectPath(java.lang.String) method that uses relative XPaths to find other XmlObjects in the subtree underneath this XmlObject.
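To make concrete what validating an instance against a schema involves, here is a rough stand-in that runs on the JDK alone. It uses javax.xml.validation rather than the XMLBeans validate() method, and the class name SchemaValidationSketch and the one-element schema (cut down from liability-coverage in Listing 1) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical illustration class; uses the JDK validation API, not XMLBeans.
final class SchemaValidationSketch {

    // Assumed mini-schema: a single global element with two xs:double children.
    static final String SCHEMA_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='liability-coverage'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='coverage-limit' type='xs:double'/>"
        + "<xs:element name='coverage-premium' type='xs:double'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Returns true if the XML is well formed and conforms to SCHEMA_XSD.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA_XSD)));
            Validator validator = schema.newValidator();
            try {
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException invalid) {
                // Thrown for schema violations and for malformed XML
                return false;
            }
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("Schema could not be compiled or input not read", e);
        }
    }

    public static void main(String[] args) {
        String good = "<liability-coverage><coverage-limit>1222</coverage-limit>"
            + "<coverage-premium>12</coverage-premium></liability-coverage>";
        String bad = "<liability-coverage><coverage-limit>abc</coverage-limit>"
            + "<coverage-premium>12</coverage-premium></liability-coverage>";
        System.out.println("good document valid: " + isValid(good));
        System.out.println("bad document valid: " + isValid(bad));
    }
}
```

The second document fails because abc is not a legal xs:double, which is the kind of constraint a schema-aware validation step is meant to catch.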
Beneath the XmlObject level, you have user-derived schema types and built-in schema types. I have already shown you the semantics of user-derived schema types such as automobile-policy and policyholder-information. As I mentioned, each user-derived schema type is represented as an interface.
On the other hand, for built-in schema types such as xs:int and xs:string, XMLBeans provides 46 Java types that correspond to the 46 built-in types defined by the W3C XML Schema specification. For example, for xs:string in XML Schema, XMLBeans provides XmlString.
Returning to the schema, to fetch the social security number of type xs:string that's inside the policyholder-information complex type, XMLBeans provides the following method:
org.apache.xmlbeans.XmlString xgetSocialSecurityNumber();
This method returns XmlString.
Of course, XMLBeans also provides a method that returns the natural Java type:
java.lang.String getSocialSecurityNumber();
Note that the method name starts with xget in cases where it returns an XMLBeans type. The xget version of a method provides a performance benefit over the get version, as the get version has to convert the data to the most appropriate Java type.
Now that you've seen a simple example of how easy it is to use XMLBeans, and become familiar with the hierarchy of XMLBeans, it's time to take a look at some of the advanced features of XMLBeans. These features are truly representative of the power that XMLBeans possesses.
An XML cursor defines a location in an XML document. It is ideal for working with XML documents when a user does not have a schema available. A cursor allows the user to navigate through the document by changing its own location. It also allows the user to remove and insert XML fragments, get and set XML values, and more.
Listing 4 shows a simple example of how XML cursors work. The code in CursorHandler.java retrieves the value of the model of the insured vehicle in automobile-policy.xml.
Listing 4. CursorHandler.java
import noNamespace.*;
import java.io.File;
import org.apache.xmlbeans.XmlCursor;

public class CursorHandler {
    public static void main(String[] args) {
        try {
            String filePath = "automobile-policy.xml";
            File inputXMLFile = new File(filePath);
            AutomobilePolicyDocument autoPolicyDoc =
                AutomobilePolicyDocument.Factory.parse(inputXMLFile);

            // A new cursor starts at the beginning of the document
            XmlCursor cursor = autoPolicyDoc.newCursor();
            // Move to the start of the root element, automobile-policy
            cursor.toFirstContentToken();
            // Move to the third child element (index 2): insured-vehicle
            cursor.toChild(2);
            // Move to its third child element (index 2): model
            cursor.toChild(2);

            System.out.println(cursor.getTextValue());
            System.out.println("Type of Token is: " +
                cursor.currentTokenType() +
                "\nText of Token is: " + cursor.xmlText());

            // Release the cursor once it is no longer needed
            cursor.dispose();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Here, the cursor is defined at the beginning of the XML instance. Then the method toFirstContentToken() moves the cursor to the first token in the content of the current START or STARTDOC. What this essentially means is that the cursor is now positioned at the start of the root element, automobile-policy.
At this point, cursor.getTextValue() returns the text content of the entire XML document.
Since you are interested in finding the value of the model of the insured vehicle, the cursor.toChild(2) method takes the cursor to the third child element of automobile-policy (child indexes start at 0), which is <insured-vehicle>.
Now that the cursor is positioned at the <insured-vehicle> element, calling cursor.toChild(2) again brings the cursor to the third child element relative to the current position, which is the <model> element.
cursor.getTextValue() then retrieves the value of the model.
When you are finished with the cursor, don't forget to call its dispose() method.
An XML token represents a category of XML markup. Essentially, XML tokens are representative of the different kinds of parts an XML document can have. These include start and end of an XML document, start and end of elements in an XML document, the attributes and their values, and so on.
As XML cursors are moved in the code, they move from one token to another. When you move a cursor, it moves to the token that fits the description. If the cursor finds no appropriate token to move to, it remains where it is, and a value of "false" is returned to indicate that the cursor didn't move.
Each token type is represented by a constant in the TokenType class. These constants include:
- INT_STARTDOC, which represents the start of the XML document (excluding the XML declaration)
- INT_ENDDOC, which represents the end of the XML document
- INT_TEXT, which represents the text contents of an element
The tokens themselves are not exposed as objects, but their type and properties are discoverable through methods on the cursor. For example, the following code snippet in CursorHandler.java prints the token type and the token text.
Listing 5. Code that prints token type and token text
System.out.println("Type of Token is: " + cursor.currentTokenType()
    + "\nText of Token is: " + cursor.xmlText());
XMLBeans supports XQuery expressions. XQuery offers a SQL-like syntax for traversing an XML document to reach elements and attributes, and combining XQuery expressions with XML cursors is very powerful. Taking the same example as above -- that of retrieving the value of the insured vehicle model in automobile-policy.xml -- the code snippet in Listing 6 should do the trick.
Listing 6. Code to retrieve the value of an XML element using an XQuery expression
XmlCursor cursor = autoPolicyDoc.newCursor();
String modelQuery = "$this/automobile-policy/insured-vehicle/model";
// Note that execQuery creates a new cursor
XmlCursor resultCursor = cursor.execQuery(modelQuery);
System.out.println(resultCursor.getTextValue());
// Release both cursors once they are no longer needed
resultCursor.dispose();
cursor.dispose();
The code in Listing 6 creates an XQuery expression that reaches out to the required element. The execQuery() method runs this query expression and a new resultCursor is returned. This resultCursor is then used to print the value of the model element. The variable $this denotes the current position of the XML cursor.
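The path in Listing 6 can also be tried without XMLBeans. The sketch below is a stand-in that evaluates the equivalent absolute path with the JDK's javax.xml.xpath API. It is plain XPath rather than XQuery, so there is no $this variable. The class name PathQuerySketch and the trimmed-down inline document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration class; uses the JDK XPath API, not XMLBeans.
final class PathQuerySketch {

    // Trimmed-down policy document, inlined so the sketch is self-contained.
    static final String SAMPLE_XML =
        "<automobile-policy>"
        + "<insured-vehicle>"
        + "<make>Chevy</make>"
        + "<model>Optra</model>"
        + "</insured-vehicle>"
        + "</automobile-policy>";

    // Evaluates the path against the XML and returns the string value of the first match.
    static String evaluate(String path, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(path, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad path or XML: " + path, e);
        }
    }

    public static void main(String[] args) {
        // Same steps as Listing 6, written from the document root
        System.out.println(evaluate("/automobile-policy/insured-vehicle/model", SAMPLE_XML));
    }
}
```

In both cases a single path expression replaces the step-by-step cursor moves of Listing 4.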
XMLBeans competes with traditional parsing and binding techniques such as DOM, SAX, JAXB, and Castor. But XMLBeans has some distinct advantages. Here's how they compare:
- DOM generates an in-memory tree for the entire document. In cases where the document is very large, DOM becomes quite memory intensive and performance degrades substantially. XMLBeans scores well on performance by doing incremental unmarshalling and providing xget methods to access built-in schema data types.
- SAX is less memory intensive than DOM. However, SAX requires that developers write callback methods for event handlers. This is not required by XMLBeans.
- JAXB and Castor are XML-to-Java binding technologies like XMLBeans, but neither provides 100 percent schema support. One of the biggest advantages of XMLBeans is its nearly 100 percent support for XML Schema. XMLBeans also allows access to the full XML Infoset. This is useful because the order of the elements or comments might be critical for an application.
- XMLBeans also provides for on-time validation of the XML instance being parsed.
- XMLBeans includes innovative features such as XML cursors and support for XQuery expressions.
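To show what the SAX callback burden looks like in practice, here is a minimal sketch that extracts the vehicle model with the JDK's SAX parser. The class names SaxCallbackSketch and ModelHandler and the trimmed-down inline document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; uses the JDK SAX API, not XMLBeans.
final class SaxCallbackSketch {

    // Trimmed-down policy document, inlined so the sketch is self-contained.
    static final String SAMPLE_XML =
        "<automobile-policy>"
        + "<insured-vehicle>"
        + "<make>Chevy</make>"
        + "<model>Optra</model>"
        + "</insured-vehicle>"
        + "</automobile-policy>";

    // Event handler that tracks parser state by hand to capture the model text.
    static final class ModelHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean inModel;
        String model;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("model".equals(qName)) {
                inModel = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one element, so accumulate
            if (inModel) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("model".equals(qName)) {
                inModel = false;
                model = text.toString();
            }
        }
    }

    // Parses the XML and returns the model text, or null if no model element exists.
    static String findModel(String xml) {
        try {
            ModelHandler handler = new ModelHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.model;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse policy XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findModel(SAMPLE_XML));
    }
}
```

Three callbacks and a state flag are needed for one value, whereas Listing 2 reads a value with a single generated getter.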
On the XML and Java technology frontier, where numerous technologies jostle for space, XMLBeans is making a mark for itself in a very short time. In scenarios where developers need to work with complex XML schemas and need more native support (such as access to the full XML Infoset), XMLBeans has no equal.
The performance benefits and the on-time validation provision make XMLBeans extremely powerful for all kinds of XML and Java technology data binding scenarios. This easy-to-understand API reduces the learning curve for developers, which also makes it a very tempting option. This is one technology that is both powerful and exciting.
| Name | Size | Download method |
|---|---|---|
| x-beans1_code.zip | 3KB | HTTP |
- Download the source code used in this article.
- Download Apache XMLBeans version 1.03 source and binaries.
- Find out more about XMLBeans at the Apache XMLBeans Site.
- Get answers to your general XMLBeans questions at the XMLBeans FAQ.
- Read the debut installment of Brett McLaughlin's popular Practical data binding column, "Get your feet wet in the real world", here on developerWorks (May, 2004). The column forum can also provide additional information on how to work with these technologies.
- Find more XML resources on the developerWorks XML zone.
- Learn how you can become an IBM Certified Developer in XML and related technologies.

Abhinav Chopra is a Senior Software Engineer with IBM Global Services India Ltd. He has extensive experience in Java technology, J2EE, and XML-related technologies. His areas of expertise include designing and developing n-tier enterprise applications. He has presented on Java-language and XML related topics in department-level talks and holds professional certifications from IBM and Sun Microsystems. You can contact him at abchopra@in.ibm.com.



