Part 1 of this two-part series introduced you to the XPath language and its syntax. This was probably a bit of a stretch for Java programmers who are unfamiliar with XSLT. The XPath syntax might seem a little strange and feel more like a UNIX directory path -- with some odd additions -- than an XML-based syntax. But you should have quickly seen that XPath makes it pretty easy to select a particular portion of an XML document.
The reason that XPath shows up in the Practical data binding series is that these selections all use logical names (see It's eminently logical). Instead of selecting, say, the first attribute of the second child element of the root element, you use an XPath expression like /cds/cd[@title='August and Everything After']. This is data binding, in a certain sense, because you're using XML's markup -- rather than its structure -- to access data.
This article focuses on using XPath as a means to access data from XML using logical names such as cd or table or person, rather than firstElement or parent.getChildren(2). The end result is code that looks very much like data binding -- with logical names instead of structural ones -- but without the initial class generation and classpath issues of traditional data-binding solutions.
Remember from Part 1 that XPath selections always result in a set of nodes being returned. Like any other set, this result can have zero, one, or more members. The concept of a node is what's important here. A node can be a piece of text (such as an element's content), an element, or a processing instruction or comment. And, of course, through a set of nodes it can be any combination of these things.
If you're a DOM programmer you won't stumble over the node concept at all. DOM sees everything in terms of its Node interface, so you're accustomed to working with comments, elements, and even textual data through the Node interface. If you're not used to DOM, though, nodes will take some getting used to. Nodes are critical for working with XPath. When you execute an XPath expression, the code you write to operate upon the result of that expression must deal with nodes and not dive directly into XML semantics. Your document navigation involves XPath expressions, rather than moving from one child to another using a DOM tree-walking approach.
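To make the node concept concrete, here is a minimal sketch that asks which kind of DOM node several expressions select first. The class name NodeKinds, its helper methods, and the cut-down XML string are assumptions made for illustration; the XPath calls themselves are the JAXP ones explained later in this article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeKinds {
    // Hypothetical, cut-down document used only for this sketch.
    static final String XML =
        "<cds><!-- my collection -->"
        + "<cd title=\"Crash\"><tracks><track id=\"3\">Crash Into Me</track></tracks></cd>"
        + "</cds>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the DOM node-type code of the first node selected, or -1 if nothing matched.
    static short firstNodeType(Document doc, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength() == 0 ? -1 : nodes.item(0).getNodeType();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // An element, a text node, an attribute, and a comment, in that order.
        String[] exprs = { "//track", "//track/text()", "//cd/@title", "//comment()" };
        for (String expr : exprs) {
            System.out.println(expr + " -> node type " + firstNodeType(doc, expr));
        }
    }
}
```

The printed codes correspond to the constants on org.w3c.dom.Node (ELEMENT_NODE, TEXT_NODE, ATTRIBUTE_NODE, COMMENT_NODE), which is why code that handles XPath results tends to be written against Node rather than against a specific XML construct.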
The newest version of the Java programming language -- Java 5.0 -- comes standard with support for the Java API for XML Processing (JAXP) and XPath. You're probably already familiar with JAXP and with the javax.xml.parsers and javax.xml.transform packages used in SAX, DOM, and XSL programming. (If you're not, check out some of the Resources listed at the end of this article.) XPath support comes largely in the form of a single new package called javax.xml.xpath.
Verifying your JAXP and Java versions
JAXP 1.3 is included in all Java 5.0 downloads. To ensure you've got JAXP 1.3, enter this command at your command prompt:
java -version
You should get something similar to the output shown in Listing 1.
Listing 1. Java 5.0 on the command prompt
java version "1.5.0_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_02-56)
Java HotSpot(TM) Client VM (build 1.5.0_02-36, mixed mode, sharing)
If your version is 1.5 or greater, then you have JAXP 1.3, which ensures that you have the XPath APIs you'll use with this article.
You also need some XML to operate upon. This article will use one simple XML document, shown in Listing 2, for all its examples. It's not a particularly complex document, but it serves nicely as a demonstration for the various XPath functions you need in 90 percent of your XPath programming.
Listing 2. Sample XML for XPath code examples
<?xml version="1.0"?>
<cds>
  <cd title="August and Everything After" artistId="23">
    <style>rock</style>
    <style>alternative</style>
    <tracks>
      <track id="1">Round Here</track>
      <track id="2">Omaha</track>
    </tracks>
  </cd>
  <cd title="This Desert Life" artistId="23">
    <style>rock</style>
    <style>alternative</style>
    <tracks>
      <track id="1">Hangin' Around</track>
      <track id="2">Mrs. Potter's Lullaby</track>
    </tracks>
  </cd>
  <cd title="Crash" artistId="46">
    <style>alternative</style>
    <style>jazz</style>
    <style>rock</style>
    <tracks>
      <track id="5">#41</track>
      <track id="3">Crash Into Me</track>
    </tracks>
  </cd>
  <artists>
    <artist id="23" type="group">
      <group-name>Counting Crows</group-name>
      <website>http://www.countingcrows.com</website>
      <member type="lead-singer">
        <firstName>Adam</firstName>
        <lastName>Duritz</lastName>
        <website>http://adam.countingcrows.com</website>
      </member>
    </artist>
    <artist id="46" type="group">
      <group-name>Dave Matthews Band</group-name>
      <website>http://www.dmband.com</website>
      <member type="lead-singer">
        <firstName>Dave</firstName>
        <lastName>Matthews</lastName>
      </member>
      <member type="instrumentalist">
        <firstName>Boyd</firstName>
        <lastName>Tinsley</lastName>
        <instrument>violin</instrument>
        <instrument>viola</instrument>
      </member>
    </artist>
    <artist id="101" type="solo">
      <website>http://www.patdonohue.com</website>
      <member>
        <firstName>Pat</firstName>
        <lastName>Donohue</lastName>
        <instrument>acoustic guitar</instrument>
      </member>
    </artist>
  </artists>
</cds>
Save this document somewhere on your local machine where you can access it easily in your Java code (see Download). If you want to use your own XML documents, feel free; you just need to adjust the sample code to match the structure and logical names in your own document.
DOM programmers are already familiar with the concept of nodes, which is integral to understanding XPath expressions and the values returned from those expressions. Another advantage that DOM programmers have when working with XPath is that they actually use the DOM to get access to an XML document, before applying any XPath expressions. This shouldn't be too much of a surprise: XPath is a core part of JAXP now, and JAXP provides integrated DOM support as well.
The first step in working with XPath is to read the target XML document in as a DOM tree, as shown in Listing 3.
Listing 3. Load XML into a DOM tree
package com.ibm.dw.xpath;

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

public class SimpleXPath {

    File xmlInput;

    public SimpleXPath(String xmlFilename) {
        xmlInput = new File(xmlFilename);
        if (!xmlInput.exists())
            throw new RuntimeException("Specified file does not exist!");
    }

    public void test() throws Exception {
        Document doc = buildDocument();
    }

    // Parse the input file into a DOM tree using the JAXP DOM APIs
    private Document buildDocument() throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance()
                                  .newDocumentBuilder();
        return builder.parse(xmlInput);
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("Usage: java com.ibm.dw.xpath.SimpleXPath " +
                               "[XML filename]");
            return;
        }
        try {
            SimpleXPath tester = new SimpleXPath(args[0]);
            tester.test();
        } catch (Exception e) {
            System.err.println("Exception occurred: " + e.getMessage());
            e.printStackTrace(System.err);
            return;
        }
    }
}
The code in Listing 3 doesn't contain much of note. It simply takes in XML on the command line and then loads that XML into a DOM tree using the JAXP DOM APIs. You should be pretty familiar with this sort of operation if you've worked with JAXP -- or even with DOM without the JAXP helper classes. (If this code seems intimidating, you might want to check out the DOM articles in Resources and then come back to this article.)
Remember from Part 1 that before you evaluate any XPath expression, you must know what context you're starting from. For example, your XPath expression to select the artist node with an ID of 23 is different depending on whether your context is the root of the XML document or the second track element of the third cd.
Obviously, then, you want to be able to move around within an XML document to set that context, and to tailor your XPath expressions accordingly. Here's where the line between a true data-binding API and XPath starts to become a little clearer. In XPath, even though your expressions are purely logical (using names like cd or artist), you still need to use the DOM to get around in the XML. This means that you still need to use XML structural constructs like Element or Text, and you still need to do a decent bit of tree walking. Of course, if you read in an XML document -- as in Listing 3 -- your initial context is the document's root, and you needn't worry about much tree walking if you write your expressions from that root.
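As a rough illustration of how the context changes the expression, the sketch below selects the same artist element twice: once with the document as the context and once with a track element as the context. The class ContextDemo, its method names, and the shortened XML are hypothetical; the evaluate() call it relies on is covered later in this article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ContextDemo {
    // Hypothetical, shortened version of the sample document.
    static final String XML =
        "<cds>"
        + "<cd title=\"This Desert Life\" artistId=\"23\"><tracks>"
        + "<track id=\"1\">Hangin' Around</track>"
        + "<track id=\"2\">Mrs. Potter's Lullaby</track>"
        + "</tracks></cd>"
        + "<artists><artist id=\"23\" type=\"group\">"
        + "<group-name>Counting Crows</group-name></artist></artists>"
        + "</cds>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Context is the document, so the path starts at the root element.
    static Node artistFromDocument(XPath xpath, Document doc) throws Exception {
        return (Node) xpath.evaluate(
            "/cds/artists/artist[@id='23']", doc, XPathConstants.NODE);
    }

    // Context is a track element, so the path first climbs track -> tracks -> cd -> cds.
    static Node artistFromTrack(XPath xpath, Node track) throws Exception {
        return (Node) xpath.evaluate(
            "../../../artists/artist[@id='23']", track, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Document doc = parse(XML);
        Node track = (Node) xpath.evaluate(
            "/cds/cd/tracks/track[2]", doc, XPathConstants.NODE);
        Node fromDoc = artistFromDocument(xpath, doc);
        Node fromTrack = artistFromTrack(xpath, track);
        System.out.println("Same node from both contexts: " + fromDoc.isSameNode(fromTrack));
    }
}
```

Both expressions reach the same node; only the starting point, and therefore the path you have to write, differs.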
I thought this was data binding!
Some of you are probably already scratching your heads. This article is supposed to be about how XPath is used for data binding, but so far you've heard mostly about DOM and structure. XPath is a great stand-in for data binding, but a certain degree of structure is inherent in the API. In fact, even the location paths use some degree of structure: /cds/cd[@title='This Desert Life'] presumes that the cds element is the root element in the target document, and that the cd element is a child of that root. Of course, that's not much different from using code like getCDs().getCD("This Desert Life"). These two approaches to data binding are not significantly different.
Where things become a bit more disparate, though, is in dealing with DOM, as you must do in XPath code. However, even this can be largely tucked away. Refer again to Listing 3; all the DOM-specific work of loading a document is abstracted away through the buildDocument() method. The return value -- a DOM Document object -- is DOM-specific, but you don't need to operate upon that object directly very much. So, without requiring some of the class generation that you need to work through with traditional data-binding alternatives, XPath gets you most of the benefits of data binding:
- Abstraction from APIs like DOM and SAX
- Logical naming instead of structural naming
- A loose mapping between the XML document and your Java objects
That makes it a great lightweight data-binding option in your XML programming.
You've already seen how a little knowledge of DOM can put you well ahead of the learning curve for XPath. If you add familiarity with the JAXP API to your DOM knowledge, then you're way ahead of the game. Take, for instance, the basic sequence you follow when working with either SAX or DOM in JAXP:
- Create a new parser factory.
- Create a new parser or document builder.
- Parse/build the target XML.
It turns out that the same sequence applies to XPath programming:
- Create a new XPath factory.
- Create a new XPath evaluator.
- Create and evaluate XPath expressions.
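Placed side by side, the two sequences look like the following sketch. It is only a preview under simple assumptions (the class name ThreeSteps and the inline XML passed to it are made up for this example); the sections that follow walk through each XPath call in detail.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ThreeSteps {
    // Returns the title attribute of the first cd element in the given XML text.
    static String firstCdTitle(String xml) throws Exception {
        // DOM: factory -> document builder -> parse
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // XPath: factory -> evaluator -> evaluate an expression
        XPathFactory xpathFactory = XPathFactory.newInstance();
        XPath xpath = xpathFactory.newXPath();
        return xpath.evaluate("/cds/cd[1]/@title", doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<cds><cd title=\"Crash\"/><cd title=\"This Desert Life\"/></cds>";
        System.out.println(firstCdTitle(xml));
    }
}
```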
If you do have that SAX or DOM background, you're probably ready simply to glance at the XPath API docs and start programming. Hold on for just a second, though. You still need to know about a few XPath-specific wrinkles, even though you'll find working through this section both familiar and fairly simple.
If you're new to JAXP or haven't spent much time working with it, you might not understand the importance and value of the factory pattern in this API. JAXP is really not an API so much as an abstraction; it simply lets you use specific vendors' implementations of SAX, DOM, and now XPath processors, without writing vendor-specific code. To accomplish this, you load an instance of a factory -- such as SAXParserFactory or DocumentBuilderFactory -- and then JAXP hooks that class into the vendor's implementation of those constructs. So while you work directly with SAXParserFactory in your code, JAXP might use Apache Xerces behind the scenes to handle the actual parsing of a document.
This same abstraction is a more forward-looking feature in XPath. A wealth of XPath processors simply doesn't exist today, in contrast to the abundance of tools for XML parsing and transformation. However, by using this factory pattern, you'll find it easier to change out an XPath implementation later if you like. Specifically, the XPath factory lets you work with different object models. By default, you'll get node sets back from your XPath expressions as DOM Node and NodeList objects (something you'll learn a lot more about shortly, under Working with XPath expressions). But, you might want to work with JDOM constructs, or perhaps the object model that dom4j provides. Although XPath engines that work with JAXP aren't currently available for these alternate Java/XML APIs, they'll probably show up soon. The JAXP factory model means that you'll be able to plug these models in easily. So don't get annoyed with the extra steps required to use factories; they're in place for your benefit and flexibility.
You've already learned that the package you'll work with most in using XPath is javax.xml.xpath. It should be no surprise that the factory you need is called XPathFactory and is in that package. As JAXP veterans probably already realize, you use newInstance() to create a new factory, as shown in Listing 4 (which adds to the code you've already seen in Listing 3).
Listing 4. Create a new XPath factory
// Requires: import javax.xml.xpath.XPathFactory;
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance();
}
Nothing particularly special occurs in Listing 4. You just create a new instance of the XPath factory, using the default object model -- which, as you now know, is the DOM (at least as of version 1.3, and that's unlikely to change anytime soon).
As other object models for XML begin to work with JAXP, you might want to move away from JAXP's default DOM model. In those cases, you can specify a string Uniform Resource Identifier (URI) to newInstance(), as in Listing 5.
Listing 5. Specify a string URI
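public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory with an alternate object model.
    // This call fails with an XPathFactoryConfigurationException unless an
    // XPath implementation for that object model is available to JAXP.
    XPathFactory factory = XPathFactory.newInstance(
        "http://jdom.org/jaxp/xpath/jdom");
}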
By default, this URI is http://java.sun.com/jaxp/xpath/dom, which is defined as the constant javax.xml.xpath.XPathConstants.DOM_OBJECT_MODEL. So the code in Listing 4 is equivalent to the code shown in Listing 6.
Listing 6. Create a new XPath factory, the long way
// Requires: import javax.xml.xpath.XPathConstants;
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance(
        XPathConstants.DOM_OBJECT_MODEL);
}
No matter how you create it, the factory is always the first step in working with XPath. So make sure you have factory-creation code in your test class, and you're ready to work with the XPath class.
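Because asking for an object model that nothing on your classpath supports fails at run time, it can be worth probing for the model first. The following is a small sketch of one way to do that; the class ObjectModelCheck and its isAvailable() method are names invented for this example, while newInstance(String) and isObjectModelSupported(String) are part of XPathFactory.

```java
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFactoryConfigurationException;

class ObjectModelCheck {
    // Returns true when an XPath factory can be created for the given object-model URI.
    static boolean isAvailable(String objectModelUri) {
        try {
            XPathFactory factory = XPathFactory.newInstance(objectModelUri);
            return factory.isObjectModelSupported(objectModelUri);
        } catch (XPathFactoryConfigurationException e) {
            // No factory for this object model could be found or loaded.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("DOM model available:  "
            + isAvailable(XPathConstants.DOM_OBJECT_MODEL));
        // Depends entirely on what is on your classpath.
        System.out.println("JDOM model available: "
            + isAvailable("http://jdom.org/jaxp/xpath/jdom"));
    }
}
```

A check like this lets your code fall back to the default DOM model instead of failing when an alternate model isn't installed.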
Once you've created your factory, you need to move from it to an object that can actually evaluate expressions and interact with the XPath environment. The XPathFactory does not let you evaluate expressions (at least not directly); it's merely a means of getting from your code to a vendor's XPath engine and object model, without requiring lots of vendor-specific code. So with that link established, you need to get an object that can handle XPath expression evaluation. That's where another object, called simply XPath, comes into play.
You can create a new XPath with the newXPath() method, available on your factory. Listing 7 shows how simple an operation this is.
Listing 7. Create an instance of the XPath class
// Requires: import javax.xml.xpath.XPath;
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance();

    // Create a new XPath instance
    XPath xpath = factory.newXPath();
}
If this looks almost painfully simple, then you've got exactly the right idea. All the rest of your XPath work will proceed from this class, though, so as simple as this looks, get used to typing it.
Configuring and resetting the XPath object
Once you have an instance of XPath, you can start to configure it by changing the context, working with namespaces, and evaluating expressions. Some of this topic goes beyond the scope of this article, but if you ever end up with XPath code that isn't behaving, you might have a misconfigured XPath object on your hands. If that's the case -- or you suspect that it might be -- you can always call xpath.reset() to reset the object to its original configuration. This will often remove any problems you see, and it's a great help in debugging.
Working with XPath expressions
By now, you're probably more than ready to get on to using XPath expressions. Fortunately, all the preparation you've done to this point makes this easy. You can simply create a new expression (remember that these are just references, often called location paths, to locations within the XML document) as a string.
Listing 8 assigns an XPath expression to a string variable -- no trickery or unusual syntax here. You just use a normal Java String and give it your XPath expression to evaluate.
Listing 8. Create an XPath expression
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance();

    // Create a new XPath instance
    XPath xpath = factory.newXPath();

    // Create a new XPath expression
    String expr = "//cds/cd[@title='August and Everything After']";
}
Because the expression in Listing 8 is just a Java String, nothing checks it when you compile your code, so you must be careful. You might leave the @ symbol off of the title attribute (something I actually did when I first wrote this code!), and you'll get no errors -- only an empty result when you evaluate the expression. So always double-check your expressions before you evaluate and use them.
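The sketch below shows the difference between a typo that is still a legal expression and one that is not. It leans on the evaluate() and compile() methods described in the next sections; the class ExpressionTypos, its helper methods, and the two-element XML string are assumptions for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ExpressionTypos {
    // Hypothetical two-CD document for this sketch.
    static final String XML =
        "<cds><cd title=\"August and Everything After\"/><cd title=\"Crash\"/></cds>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Number of nodes the expression selects in the document.
    static int matches(Document doc, String expr) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    // True if the XPath engine accepts the expression's syntax.
    static boolean compiles(String expr) {
        try {
            XPathFactory.newInstance().newXPath().compile(expr);
            return true;
        } catch (XPathExpressionException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // Correct: title is an attribute.
        System.out.println(matches(doc, "//cds/cd[@title='Crash']"));
        // Missing @: legal XPath that looks for a child element named title, so nothing matches.
        System.out.println(matches(doc, "//cds/cd[title='Crash']"));
        // Missing closing bracket: rejected as a syntax error.
        System.out.println(compiles("//cds/cd[@title='Crash'"));
    }
}
```

Only the syntax error surfaces as an XPathExpressionException; the missing @ simply matches nothing, which is why an unexpectedly empty result deserves a second look at the expression.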
Evaluating an XPath expression
Once you have the expression, you just need to evaluate it. This is where the XPath object comes back into play. You use the evaluate() method to evaluate your expression. This method is a little odd, so take a look at an example shown in Listing 9, and you'll see what each parameter does.
Listing 9. Evaluate an XPath expression
// Requires: import javax.xml.xpath.XPathConstants; import org.w3c.dom.Node;
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance();

    // Create a new XPath instance
    XPath xpath = factory.newXPath();

    // Create a new XPath expression
    String expr = "//cds/cd[@title='August and Everything After']";

    // Evaluate it against the document, expecting a single node back
    Node cd = (Node)xpath.evaluate(expr, doc, XPathConstants.NODE);
}
The first argument to evaluate() is the expression itself, and the second is the context to evaluate the expression from. In Listing 9, the DOM document obtained by calling buildDocument() is passed in. If you wanted a different context, you could pass in a Node from somewhere else in the document. More often than not, though, it's easiest to pass in the DOM Document object and write your expressions from there.
The last argument will look a little odd to you unless you're familiar with the Java Naming and Directory Interface (JNDI). This argument indicates to the XPath engine what type to expect as the expression's return value. The XPathConstants class defines several possible values:
- XPathConstants.BOOLEAN for expressions that return boolean values (maps to the Java Boolean data type)
- XPathConstants.NODE for expressions that return a single node (maps to the DOM org.w3c.dom.Node data type)
- XPathConstants.NODESET for expressions that return a set of nodes (maps to the DOM org.w3c.dom.NodeList data type)
- XPathConstants.NUMBER for expressions that return numeric values (maps to the Java Double data type)
- XPathConstants.STRING for expressions that return string values (maps to the Java String data type)
These types will cover all your expressions. In the case of the expression in Listing 9 (//cds/cd[@title='August and Everything After']), one and only one node should be returned -- so the type is specified as XPathConstants.NODE. If the expression were to return multiple nodes -- for instance, //cds/cd, which should return all cd elements -- you would use XPathConstants.NODESET.
The return value of evaluate() is a Java Object, so you need to cast that to the type you indicated. You can then operate upon the Node, or NodeList, or whatever else you're getting back in return.
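Here is a short sketch that exercises each of the five return types against one small document. The class ReturnTypes, its eval() helper, and the trimmed XML are hypothetical; the casts follow the mappings listed above.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ReturnTypes {
    // Hypothetical, trimmed version of the sample document.
    static final String XML =
        "<cds>"
        + "<cd title=\"This Desert Life\"><style>rock</style><style>alternative</style></cd>"
        + "<cd title=\"Crash\"><style>alternative</style><style>jazz</style></cd>"
        + "</cds>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates expr against doc and returns the raw Object for the requested type.
    static Object eval(Document doc, String expr, QName returnType)
            throws XPathExpressionException {
        return XPathFactory.newInstance().newXPath().evaluate(expr, doc, returnType);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        Boolean anyJazz = (Boolean) eval(doc, "/cds/cd/style = 'jazz'", XPathConstants.BOOLEAN);
        Double cdCount = (Double) eval(doc, "count(/cds/cd)", XPathConstants.NUMBER);
        String firstTitle = (String) eval(doc, "/cds/cd[1]/@title", XPathConstants.STRING);
        Node crash = (Node) eval(doc, "/cds/cd[@title='Crash']", XPathConstants.NODE);
        NodeList all = (NodeList) eval(doc, "/cds/cd", XPathConstants.NODESET);

        System.out.println("Any jazz CDs? " + anyJazz);
        System.out.println("Number of CDs: " + cdCount);
        System.out.println("First title: " + firstTitle);
        System.out.println("Crash element name: " + crash.getNodeName());
        System.out.println("Node list length: " + all.getLength());
    }
}
```

Note that a NODE request that matches nothing comes back as null rather than raising an error, so a null check before the cast result is used is a sensible habit.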
Precompiling an XPath expression
Before I move on to working with different types of XPath expressions, one more feature of the XPath object is worth mentioning. If you plan to use the same XPath expression over and over -- either on a single XML document, or perhaps across several documents that share the same structure -- you really don't want to pay the cost of parsing and compiling that expression repeatedly.
To avoid duplicating that process, XPath provides a method called compile(), which simply takes your string XPath expression as its only argument. It compiles that expression and returns it as an instance of the XPathExpression class. This class then stands in for XPath; you call evaluate() on your XPathExpression instance, instead of on XPath directly. The version of evaluate() on XPathExpression takes a context (your DOM document object in most cases) and the return type, and operates just as the version on the XPath object did. Check out Listing 10 for a simple example.
Listing 10. Precompile an XPath expression
// Requires: import javax.xml.xpath.XPathExpression;
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance();

    // Create a new XPath instance
    XPath xpath = factory.newXPath();

    // Create a new XPath expression and compile it once
    String expr = "//cds/cd[@title='August and Everything After']";
    XPathExpression xexpr = xpath.compile(expr);

    // Evaluate the compiled expression against the document
    Node cd = (Node)xexpr.evaluate(doc, XPathConstants.NODE);
}
This is a pretty simple change that gives good return on your effort. You're going to pay the cost of compiling an expression anyway, so the only real additional cost associated with this change is an extra object to manage, and a slightly less obvious program flow. (The emphasis is on slightly; this is still pretty clear code for anyone with minimal Java and JAXP experience.) Unless you're sure you'll need an expression only once, you'll almost always benefit from precompiling your expressions.
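As a sketch of the reuse case, the following code compiles one expression and then evaluates it against several documents that share a structure. The class CompiledReuse and its firstTitles() method are invented for this example, and the documents are inline strings rather than files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CompiledReuse {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Compiles the expression once, then evaluates it against every document.
    static List<String> firstTitles(String... xmlDocs) throws Exception {
        XPathExpression firstTitle =
            XPathFactory.newInstance().newXPath().compile("/cds/cd[1]/@title");
        List<String> titles = new ArrayList<String>();
        for (String xml : xmlDocs) {
            titles.add(firstTitle.evaluate(parse(xml)));
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstTitles(
            "<cds><cd title=\"Crash\"/></cds>",
            "<cds><cd title=\"This Desert Life\"/></cds>"));
    }
}
```

One caveat worth keeping in mind: the JAXP documentation describes XPathExpression as not thread-safe, so a compiled expression shared this way should stay within a single thread.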
Dealing with expression results
Your choices in dealing with the result of an XPath expression are limited only by your programming knowledge. You can do pretty much anything you want once you've got a result set. For instance, to verify that the code in Listing 9 and Listing 10 does indeed retrieve the correct CD, you might add code like that shown in Listing 11.
Listing 11. Verify results
// Requires: import org.w3c.dom.Element;
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance();

    // Create a new XPath instance
    XPath xpath = factory.newXPath();

    // Create a new XPath expression
    String expr = "//cds/cd[@title='August and Everything After']";
    XPathExpression xexpr = xpath.compile(expr);
    Node cd = (Node)xexpr.evaluate(doc, XPathConstants.NODE);

    // A null result means the expression matched nothing
    if (cd != null) {
        Element el = (Element)cd;
        System.out.println(el.getAttribute("title"));
    } else {
        System.err.println("Error! Node is null!");
    }
}
In itself, this isn't very useful, but what's important is that you can use DOM manipulation code on the Node returned from the XPath expression evaluation. You can also transform this into another Java/XML format (such as SAX) and feed it to an event handler, or serialize the node out to disk. Or, you can extract its data for processing in another part of your application, or iterate through its tracks, or anything else you might think of. The XPath-specific processing is done, so you no longer need to worry about it.
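For instance, serializing a selected node can be done with the identity transformer from javax.xml.transform. The sketch below is one possible approach rather than the only one; the class NodeSerializer, its method names, and the inline XML are assumptions for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeSerializer {
    // Hypothetical, shortened version of the sample document.
    static final String XML =
        "<cds><cd title=\"August and Everything After\"><tracks>"
        + "<track id=\"1\">Round Here</track><track id=\"2\">Omaha</track>"
        + "</tracks></cd></cds>";

    // Parses the XML text and returns the single node the expression selects (or null).
    static Node select(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return (Node) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODE);
    }

    // Writes the node and its subtree out as XML text using an identity transform.
    static String serialize(Node node) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Node track = select(XML, "/cds/cd/tracks/track[@id='2']");
        System.out.println(serialize(track));
    }
}
```

Swapping the StringWriter for a FileWriter or an OutputStream would send the same output to disk instead.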
Of course, you could use this returned Node as the context for a new XPath expression, as shown in Listing 12.
Listing 12. Multiple expression evaluation
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance();

    // Create a new XPath instance
    XPath xpath = factory.newXPath();

    // Create a new XPath expression
    String expr = "//cds/cd[@title='August and Everything After']";
    XPathExpression xexpr = xpath.compile(expr);
    Node cd = (Node)xexpr.evaluate(doc, XPathConstants.NODE);

    // Use the cd node as the context for a second, relative expression
    Node track2 = (Node)xpath.evaluate(
        "tracks/track[@id='2']", cd, XPathConstants.NODE);
    Element e = (Element)track2;

    // Test the returned value
    System.out.println(e.getFirstChild().getNodeValue());
}
This is where using the context can become quite valuable; you can start an XPath expression with the result of a previous expression, allowing for incredibly easy document navigation without needing to write lots of DOM code.
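Chaining contexts this way is also what makes the loose mapping between XML and Java objects practical. The following sketch fills a small, hand-written Cd class from the document using only relative expressions; the classes CdBinder and Cd are hypothetical and stand in for whatever domain objects your application already has.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CdBinder {
    // Hypothetical domain class; nothing is generated from a schema.
    static final class Cd {
        final String title;
        final List<String> tracks;
        Cd(String title, List<String> tracks) {
            this.title = title;
            this.tracks = tracks;
        }
    }

    // Hypothetical, shortened version of the sample document.
    static final String XML =
        "<cds>"
        + "<cd title=\"August and Everything After\"><tracks>"
        + "<track id=\"1\">Round Here</track><track id=\"2\">Omaha</track></tracks></cd>"
        + "<cd title=\"Crash\"><tracks>"
        + "<track id=\"5\">#41</track><track id=\"3\">Crash Into Me</track></tracks></cd>"
        + "</cds>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Builds one Cd object per cd element, using each cd node as the context.
    static List<Cd> bind(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList cdNodes = (NodeList) xpath.evaluate("/cds/cd", doc, XPathConstants.NODESET);
        List<Cd> cds = new ArrayList<Cd>();
        for (int i = 0; i < cdNodes.getLength(); i++) {
            Node cdNode = cdNodes.item(i);
            String title = xpath.evaluate("@title", cdNode);
            NodeList trackNodes =
                (NodeList) xpath.evaluate("tracks/track", cdNode, XPathConstants.NODESET);
            List<String> tracks = new ArrayList<String>();
            for (int j = 0; j < trackNodes.getLength(); j++) {
                tracks.add(trackNodes.item(j).getTextContent());
            }
            cds.add(new Cd(title, tracks));
        }
        return cds;
    }

    public static void main(String[] args) throws Exception {
        for (Cd cd : bind(parse(XML))) {
            System.out.println(cd.title + " " + cd.tracks);
        }
    }
}
```

The only DOM-specific calls left are the loop over the NodeList and getTextContent(); everything else is expressed in terms of logical names like cd and track.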
Things get more interesting when you understand XPath and can use different types of expressions. You've seen most of the things XPath has to offer in Part 1. Now it's time to put them into play in your Java code.
One of the more common operations is to work with a set of nodes returned from an expression. Look back at the sample XML in Listing 2 and suppose you want to get all the CDs by the Counting Crows. You could use a couple of XPath expressions to make this pretty easy, as shown in Listing 13.
Listing 13. Get fancy with node sets
// Requires: import org.w3c.dom.NodeList;
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance();

    // Create a new XPath instance
    XPath xpath = factory.newXPath();

    // Create a new XPath expression
    String expr = "//cds/cd[@title='August and Everything After']";
    XPathExpression xexpr = xpath.compile(expr);
    Node cd = (Node)xexpr.evaluate(doc, XPathConstants.NODE);

    // Look up the ID of the Counting Crows as a string
    expr = "//cds/artists/artist[group-name='Counting Crows']/@id";
    String crowsID =
        (String)xpath.evaluate(expr, doc, XPathConstants.STRING);

    // Select every cd element that refers to that artist ID
    expr = "//cds/cd[@artistId='" + crowsID + "']";
    NodeList crowsCDs =
        (NodeList)xpath.evaluate(expr, doc, XPathConstants.NODESET);

    // Print each CD's title, using the cd node as the context
    expr = "@title";
    XPathExpression title = xpath.compile(expr);
    for (int i = 0; i < crowsCDs.getLength(); i++) {
        Node current = crowsCDs.item(i);
        System.out.println("Found CD titled '" +
            title.evaluate(current) + "'");
    }
}
The code in Listing 13 uses several XPath expressions, with varying return types. First, the ID for the Counting Crows is retrieved using an XPath expression that returns a string value (it's actually a number, but there's no real benefit in converting from text to a number since it's used as a string value later in the same code). The result of that expression is then used to create a new expression and select all the cd elements with that ID.
The result from this second expression is a node set, which maps to the DOM org.w3c.dom.NodeList type. It's pretty trivial to iterate through the org.w3c.dom.NodeList; you could easily cast each Node in the list to an org.w3c.dom.Element and then call getAttribute("title") to get the title's value. However, just to show XPath one more time -- and how easy XPath makes it to sidestep almost all DOM tree-walking code -- another XPath expression is used. This one grabs the current node's title attribute (remember, the context is important here), and returns its value, which is then output. By using the current Node in the NodeList as the context, you have a great alternative to DOM for getting each CD's title.
Attentive readers may have noticed this bit of code in Listing 13:
title.evaluate(current)
This version of evaluate() takes a context item, as is normal, but does not require a return type, and it seems to return a string. That's exactly what this version of evaluate() does: It returns a string in all cases. Sometimes -- such as when you're evaluating an expression that returns a set of nodes, or even a single node -- this can create a big mess. Other times, though, when you really just want a textual expression, it's a nifty shortcut.
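To see where the string shortcut helps and where it surprises, consider this sketch. The class StringShortcut and its asString() helper are made-up names, and the results described in the comments follow from the XPath 1.0 rules for converting values to strings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StringShortcut {
    // Hypothetical two-CD document for this sketch.
    static final String XML =
        "<cds><cd title=\"August and Everything After\"/><cd title=\"Crash\"/></cds>";

    // Evaluates expr against the sample document with the string-only evaluate().
    static String asString(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
    }

    public static void main(String[] args) throws Exception {
        // A node set with two attributes: only the first one's value survives.
        System.out.println("[" + asString("/cds/cd/@title") + "]");
        // A number: converted to its string form.
        System.out.println("[" + asString("count(/cds/cd)") + "]");
        // An empty node set: an empty string, not null and not an error.
        System.out.println("[" + asString("/cds/cd[@title='Nope']/@title") + "]");
    }
}
```

The first and third cases are the "big mess" scenarios: extra matches are silently dropped and no match at all looks like an empty value, so reserve this form for expressions you know select a single textual value.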
Another nice feature of XPath is its ability to evaluate boolean expressions. For instance, you could use the code in Listing 14 to achieve an effect similar to that shown in Listing 13; this time, all CDs are iterated over, and a boolean XPath expression is applied.
Listing 14. Return booleans with XPath expressions
public void test() throws Exception {
    Document doc = buildDocument();

    // Create a new XPath factory
    XPathFactory factory = XPathFactory.newInstance();

    // Create a new XPath instance
    XPath xpath = factory.newXPath();

    // Create a new XPath expression
    String expr = "//cds/cd[@title='August and Everything After']";
    XPathExpression xexpr = xpath.compile(expr);
    Node cd = (Node)xexpr.evaluate(doc, XPathConstants.NODE);

    // Look up the ID of the Counting Crows as a string
    expr = "//cds/artists/artist[group-name='Counting Crows']/@id";
    String crowsID = (String)xpath.evaluate(expr, doc, XPathConstants.STRING);

    // Select all cd elements
    expr = "//cds/cd";
    NodeList allCDs =
        (NodeList)xpath.evaluate(expr, doc, XPathConstants.NODESET);

    // Compile a boolean test and a title lookup, both relative to a cd node
    String expr1 = "@artistId='" + crowsID + "'";
    String expr2 = "@title";
    XPathExpression artist = xpath.compile(expr1);
    XPathExpression title = xpath.compile(expr2);

    for (int i = 0; i < allCDs.getLength(); i++) {
        Node current = allCDs.item(i);
        Boolean crowsCD =
            (Boolean)artist.evaluate(current, XPathConstants.BOOLEAN);
        if (crowsCD) {
            System.out.println("Found CD titled '" +
                title.evaluate(current) + "'");
        }
    }
}
Neither Listing 13 nor Listing 14 is the right or wrong approach; they are simply two different ways to get at the same information. However, they both demonstrate XPath's flexibility. These examples, along with others you've seen in this article, show you how XPath works with various data types and give you ideas for your own programming tasks.
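A third variation, sketched below, pushes the whole lookup into a single expression by comparing each CD's artistId attribute against the ID of the artist selected in a nested predicate. The class SingleExpression and its crowsTitles() method are hypothetical names, and the XML is a shortened stand-in for Listing 2.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SingleExpression {
    // Hypothetical, shortened version of the sample document.
    static final String XML =
        "<cds>"
        + "<cd title=\"August and Everything After\" artistId=\"23\"/>"
        + "<cd title=\"This Desert Life\" artistId=\"23\"/>"
        + "<cd title=\"Crash\" artistId=\"46\"/>"
        + "<artists>"
        + "<artist id=\"23\"><group-name>Counting Crows</group-name></artist>"
        + "<artist id=\"46\"><group-name>Dave Matthews Band</group-name></artist>"
        + "</artists>"
        + "</cds>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Selects the title attributes of all CDs whose artistId matches the Counting Crows' id.
    static List<String> crowsTitles(Document doc) throws Exception {
        String expr =
            "/cds/cd[@artistId = /cds/artists/artist[group-name='Counting Crows']/@id]/@title";
        NodeList titleAttrs = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
        List<String> titles = new ArrayList<String>();
        for (int i = 0; i < titleAttrs.getLength(); i++) {
            titles.add(titleAttrs.item(i).getNodeValue());
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(crowsTitles(parse(XML)));
    }
}
```

Whether this reads better than Listings 13 and 14 is a matter of taste: it trades intermediate Java variables for a denser expression that is harder to debug piece by piece.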
Java developers have more APIs, technologies, and toolkits available to them than ever before. However, with this wealth of programming capability comes a danger -- a sort of API tunnel vision. It's easy to assume that only APIs that are expressly labeled as data binding can be used to get at XML data in a logical, rather than structural, way. An API labeled XPath must only be useful for applications where XPath is traditionally used, the tunnel-vision thinking goes. The same could be said about any other API, whether it's intended for parsing XML, reading property files, or creating tree-based structures in memory.
However, the Java language's power is often most evident when you use a little creativity in your API and programming choices. Instead of being limited to an arbitrary label (such as data binding), look at what an API actually provides. If you find an API that seems to offer useful functionality, then add it into your code, regardless of what it is intended for. A working application that (mis)uses APIs this way will be appreciated a lot more than a clunky, slow, barely functional one that followed all the rules.
In this two-article series, you've seen that you can use the XPath API -- generally considered a bit player in the XSLT and XPointer worlds -- as a data binding API relatively easily. Even better, the Java language comes with an XPath API built into JAXP. How can any Java-and-XML programmer resist an offering like this? So break out of labeling limitations; you can use a lot more than just JAXB for data binding, and XPath is a great place to start.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample code for this article | x-pracdb9_code.zip | 52KB | HTTP |
Learn
- "XPath as data binding tool, Part 1" (developerWorks, November 2005): Read the first article in this two-part series and find out how to select XML content with XPath.
- XML Path Language (XPath) Version 1.0 and XML Path Language (XPath) 2.0: The XPath 1.0 specification is a W3C Recommendation. XPath 2.0 became a W3C Candidate Recommendation in early November 2005.
- "Get Started with XPath" (developerWorks, May 2004): Take Bertrand Portier's tutorial for newcomers to XPath.
- The Extensible Stylesheet Language Family (XSL) and XSL Transformations (XSLT) Version 1.0: XPath is a critical part of XSL. Find out more about the XSL family of specifications at the W3C, including the XSLT transformations language.
- XQuery 1.0 and XPath 2.0 Data Model (XDM): Along with XPath 2.0, XQuery attempts to define a data model that will work with both specifications.
- "Toward an XPath API": This article by Leigh Dodds is great for understanding what led to the original 1.0 XPath API.
- "SAX, the power API" (developerWorks, August 2001): This article is a solid introduction to SAX, the event-based API for processing XML that has become a de facto standard.
- "Understanding DOM" (developerWorks, July 2003): This tutorial by Nicholas Chase explains the structure of a DOM document and shows how to use Java technology to create a DOM document from an XML file, make changes to it, and retrieve the output.
- "Effective XML processing with DOM and XPath in Java" (developerWorks, May 2002): Tony Darugar examines how to make effective and efficient use of DOM in Java programming.
- Java and XML (O'Reilly Media, Inc.): The upcoming third edition of Brett McLaughlin's book will devote an entire chapter to XPath.
- XML in a Nutshell (O'Reilly Media, Inc.): This book by Elliotte Rusty Harold and W. Scott Means is a great all-in-one XML resource with a chapter devoted to XPath.
- Java API for XML Processing (JAXP): Find out how JAXP provides a native Java-language solution that allows you to create XPath requests and use the results in your applications.
- "All about JAXP, Part 1" and "All about JAXP, Part 2" (developerWorks, May 2005): This two-part series by Brett McLaughlin introduces JAXP, showing how to take advantage of the API's parsing and validation features and its support for XSL transformations.
- "JAXP validation" (developerWorks, October 2005): This article details the new JAXP API, from its basics to the more advanced features.
- Java Naming and Directory Interface (JNDI): This industry standard provides applications based on Java technology with a unified interface to multiple naming and directory services.
- developerWorks XML zone: Find more XML resources here, including articles, tutorials, tips, and standards.
- developerWorks Java technology zone: Access hundreds of articles, tutorials, and tips to help you make the most of the Java-language technology and related applications.
- IBM Certified Solution Developer -- XML and related technologies: Learn how to get certified.
Get products and technologies
- Build your next development project with IBM trial software, available for download directly from developerWorks.
Brett McLaughlin has worked in computers since the Logo days. (Remember the little triangle?) In recent years, he's become one of the most well-known authors and programmers in the Java technology and XML communities. He's worked for Nextel Communications, implementing complex enterprise systems; at Lutris Technologies, actually writing application servers; and most recently at O'Reilly Media, Inc., where he continues to write and edit books that matter. His most recent book, Java 5.0 Tiger: A Developer's Notebook, is the first book available on the newest version of Java technology, and his classic Java and XML remains one of the definitive works on using XML technologies in the Java language.




