Some ten years ago, when XML was peaking as one of the hottest technologies around, it was inconceivable that a decade later XML would still be so important, or that so many XML-related technologies would turn out to be just as interesting. In fact, you can approach the various XML technologies along several paths.
You can group XML-related technologies into a few basic groups, or focus areas:
- Document authoring: This group is for the folks that spend most of their time actually authoring XML. Whether they create original XML data, or represent existing data in an XML format, the focus here is pure XML, and takes little notice of the programming tasks that might use those documents. Here's where the core of XML sits, along with specific XML vocabularies, like MathML or some of the scientific XML vocabularies.
- Processing XML: These are the technologies like XSL that allow XML to be transformed, massaged, or migrated from one format to another. Again, the focus is the XML documents and data within them, although programming languages are sometimes used to accomplish these transformations.
- Reading/writing XML (and persisting data): This is the more programming-centric grouping of technologies, ranging from low-level APIs like SAX and DOM to data binding technologies like JAXB and Castor. This is where XML is seen as a data storage mechanism, and in a lot of ways, a means to an end.
Until recently, these were the big categories...with most new technologies and specifications adding to one of these three groupings.
Moving XML to a first-class data citizen
One of the big problems with XML, though, and a limitation of the three groups above, was the lack of good search support. If you wanted to search through your data, and the data was in XML, you had a problem. In fact, the general solution was to force together a few of the groupings above. Document authors might struggle through using a command-line tool like grep, which is a lousy way to perform searches. Programmers might read in the XML (another grouping), and then use their programming language (like Java or C#) to search through the data in a non-XML format. That's workable, but still reveals a limitation of XML.
Fortunately, the introduction and now popularity of XPath (and XQuery, mentioned later in this article) have introduced a new group:
- Searching XML: This is where XPath and XQuery come in. These specifications/technologies allow you to search XML documents in what amounts to an XML manner. In other words, the searches are capable of working with XML semantics, and can search through not only the data in an XML document, but the structure of those documents, as well.
With XPath and XQuery, you're not stuck pulling data from XML into a programming language, and then using that language's tools to search the data. In addition to the constraint of your programming language with that approach, you typically lose most of the XML semantics and structure, such as what element was a child of what other element, and so on. XPath and XQuery allow you to search XML without needing a programming language.
All that said, though, there's still a need for programming languages, and for interacting with XPath and XQuery from the Java language (and others). While XPath and XQuery give you great XML-aware searching capabilities, you still need a way to use these technologies if you're a programmer. Simply launching a command-line process with a system-aware command like exec() is a pain, and prone to all sorts of errors you can't handle. Worse yet, it probably makes working with the results of a search nearly impossible (if not actually impossible). That's where XPath meets the Java (or C# or Perl) language. This is a Java-specific article. (If you want to see articles about those other languages, tell us in the feedback section of this article!)
If you're reading this article, you should at least be familiar with XPath. Check out Resources for links to a good two-part introductory tutorial on XPath, if you've never used the technology before, and then come back to this article.
It's at this point, the intersection of XPath and Java technology, that Sun has done Java programmers a big favor. They've integrated XPath support into the Java 5 environment. Even better, you don't need to download the enterprise edition, or a supplemental package (like Sun used to do with parts of JDBC). If you've got Java 5 software on your machine, you've got XPath support, in a very Java-centric way. In fact, it's part of JAXP, the Java API for XML Processing, something you're probably already familiar with.
Make sure you've got the Java 5 release
If you're not sure what version of Java technology you have, or what version of Java
technology runs on the machine (perhaps on a remote server) that you write code on, you can find out easily. Just run java with the -version flag. Listing 1 shows what that should look like.
Listing 1. Making sure you've got Java 5 or later
[bdm0509:~/Documents/developerworks/java_xpath] java -version
java version "1.5.0_13"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05-237)
Java HotSpot(TM) Client VM (build 1.5.0_13-119, mixed mode, sharing)
As long as the version is 1.5 or greater, you're ready to work through this article.
Note that 1.5 is equivalent to 5.0, and it's not completely clear (to me or most Java
developers) why all the public literature says "Java 5", but the java command still returns 1.5. Still, if you've got 1.5.x or even
1.6.x, you're in great shape. If not, check Resources for links
to download Java 5 technology.
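When the machine in question is a remote server where you can run code but not shell commands, a program can report the same version information. The sketch below is one way to do that, assuming only the standard java.version and java.specification.version system properties; the class name VersionCheck is made up for illustration.

```java
// Hypothetical helper class; the name VersionCheck is illustrative only.
class VersionCheck {

    // Returns the version string of the running JVM, in the same form
    // that "java -version" reports (for example, a 1.5.x value on Java 5).
    static String runtimeVersion() {
        return System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + runtimeVersion());
        System.out.println("java.specification.version = "
            + System.getProperty("java.specification.version"));
    }
}
```

Run it with the same Java installation your application uses, since that is the version that matters.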
Make sure you've got XPath support
Next, you need to make sure that you have XPath support. That probably sounds like a
redundancy; you just checked to see if you've got at least Java 5 software, right? Still, there
are tons of developers who have a different Java version on their system path than
their development environment is using. Or Eclipse, your IDE, is running something other
than your Web application server. And on and on...the best way to avoid problems like this sneaking up on you is to build a tiny program to test things out. Listing 2 shows a program that does nothing more than create a new instance of the XPath factory, XPathFactory. This also ensures things like parsers and an implementation are set up and running.
Listing 2. A very simple XPath test class
import javax.xml.xpath.XPathFactory;

public class XPathTester {
    public static void main(String args[]) {
        try {
            // Creating the factory fails if no XPath implementation is available
            XPathFactory factory = XPathFactory.newInstance();
            // Report success only if the factory was actually created
            System.out.println("Successfully loaded XPath factory. Things look good.");
        } catch (Exception e) {
            System.err.println("Uh oh...looks like you don't have the version " +
                "of JAXP with XPath support. Better upgrade to Java 5 or greater.");
        }
    }
}
Compile this class and run it. You should get the very basic output in Listing 3.
Listing 3. Successful output of the test class from Listing 2
[bdm0509:~/Documents/developerworks/java_xpath] java XPathTester
Successfully loaded XPath factory. Things look good.
This is pretty trivial, but you can take this class and try it on your Web server, your application server, your four mirrored production servers...and anywhere else you want your XPath code to run. If the class runs on those machines, then you're safe to develop more complex XPath apps. If the test class doesn't run, spend your time getting XPath support working before spending hours on writing code that might not work when it counts.
Understanding the XPath part of the JAXP API is really dependent upon understanding how JAXP handles all XML parsing, processing, and transformations.
You'll recall that the basic steps to work with XML are:
- Get a factory class to provide instances of a vendor-specific JAXP implementation.
- Get a parser or transformer instance from the factory.
- Set configuration options on or around that parser or transformer (validation, namespace-awareness, stylesheet to use, and so on).
- Create an object to hold, store, or reference the XML to be operated on (usually through some type of InputSource).
- Parse or transform the XML.
This usually resembles Listing 4 in code, which shows a simple XML parse, using a command-line argument as the filename of the XML document to parse.
Listing 4. Using the SAXParserFactory
import java.io.File;
import java.io.OutputStreamWriter;
import java.io.Writer;
// JAXP
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.SAXParser;
// SAX
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class TestSAXParsing {
public static void main(String[] args) {
try {
if (args.length != 1) {
System.err.println ("Usage: java TestSAXParsing [filename]");
System.exit (1);
}
// Get SAX Parser Factory
SAXParserFactory factory = SAXParserFactory.newInstance();
// Turn on validation, and turn off namespaces
factory.setValidating(true);
factory.setNamespaceAware(false);
SAXParser parser = factory.newSAXParser();
parser.parse(new File(args[0]), new MyHandler());
} catch (ParserConfigurationException e) {
System.out.println("The underlying parser does not support " +
" the requested features.");
} catch (FactoryConfigurationError e) {
System.out.println("Error occurred obtaining SAX Parser Factory.");
} catch (Exception e) {
e.printStackTrace();
}
}
}
class MyHandler extends DefaultHandler {
// SAX callback implementations from ContentHandler, ErrorHandler, etc.
}
If you're building a DOM tree, the process still follows the same model. Listing 5 shows code to create a DOM tree of an XML document, and the steps are very similar, even though the class and method names change.
Listing 5. Using the document builder factory
import java.io.File;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
// JAXP
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
// DOM
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class TestDOMParsing {
public static void main(String[] args) {
try {
if (args.length != 1) {
System.err.println ("Usage: java TestDOMParsing " +
"[filename]");
System.exit (1);
}
// Get Document Builder Factory
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
// Turn on validation, and turn off namespaces
factory.setValidating(true);
factory.setNamespaceAware(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new File(args[0]));
// Print the document from the DOM tree and
// feed it an initial indentation of nothing
printNode(doc, "");
} catch (ParserConfigurationException e) {
System.out.println("The underlying parser does not " +
"support the requested features.");
} catch (FactoryConfigurationError e) {
System.out.println("Error occurred obtaining Document " +
"Builder Factory.");
} catch (Exception e) {
e.printStackTrace();
}
}
private static void printNode(Node node, String indent) {
// print the DOM tree
}
}
In both cases, you get a factory, use that to create a new parser/processor instance, and then operate on that instance.
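The same factory pattern covers the transformation side of JAXP. As a rough sketch (the class name and sample input are made up for illustration), the following uses TransformerFactory with no stylesheet, which yields an identity transform that simply copies its input:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class name, used only for this illustration.
class IdentityTransformSketch {

    static String copy(String xml) throws TransformerException {
        // 1. Get a factory
        TransformerFactory factory = TransformerFactory.newInstance();
        // 2. Get a transformer; with no stylesheet it performs an identity copy
        Transformer transformer = factory.newTransformer();
        // 3. Set configuration options
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // 4. Wrap the input and output
        StreamSource source = new StreamSource(new StringReader(xml));
        StringWriter out = new StringWriter();
        // 5. Transform
        transformer.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(copy("<greeting>hello</greeting>"));
    }
}
```

To apply a real stylesheet, you would pass a Source for the XSL document to newTransformer(); the other steps stay the same.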
The workflow is very similar when you write XPath code:
- Get an XPath factory class to provide instances of a vendor-specific XPath implementation.
- Get an XPath evaluator instance from the factory.
- Create a new XPath expression. (This step is different from the parsing model, although it still aligns with assigning a stylesheet in the XML transformations model.)
- Build a DOM tree of the XML document to evaluate the XPath expression against.
- Evaluate the XPath expression.
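Before building the program piece by piece, here is how the five steps above might look end to end. This is only a sketch: the class name and the tiny sample document are invented for illustration, and it uses the compile() method of XPath to create an explicit XPathExpression object for step three.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name and sample data, for illustration only.
class XPathWorkflowSketch {

    static final String SAMPLE =
        "<project><target name='init'/><target name='compile'/></project>";

    static int countTargets() throws Exception {
        // 1. Get an XPath factory
        XPathFactory factory = XPathFactory.newInstance();
        // 2. Get an XPath evaluator from the factory
        XPath xpath = factory.newXPath();
        // 3. Create (compile) an XPath expression
        XPathExpression expression = xpath.compile("/project/target");
        // 4. Build a DOM tree of the document to search
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(SAMPLE)));
        // 5. Evaluate the expression against the tree
        NodeList targets =
            (NodeList) expression.evaluate(doc, XPathConstants.NODESET);
        return targets.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Matched targets: " + countTargets());
    }
}
```

A compiled XPathExpression can be reused against many documents, which is handy when you evaluate the same path repeatedly.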
Let's walk through this process step-by-step, build up a basic program for parsing XPath expressions, and then you can evaluate any of your own XPaths, or any of the XPaths you wrote when you worked through the XPath tutorial (those links are in Resources).
You need to keep in mind a few suppositions that apply to the program you'll build in this article:
- You have an XML document that you can easily convert into a DOM tree. This article's example reads in an XML document from the command line, and converts it to a DOM tree, but you can just as easily build a DOM tree from a network URI, a set of SAX events, or any other source. If you're rusty on how to get a DOM tree from various sources using JAXP, check out Resources for some helpful links.
- An XPath you want to evaluate. This article assumes you already have an XPath, or at least know how to construct one. There's no substantive discussion about how to build XPaths, but more on how to evaluate them.
Once you take care of these things, you're ready to write code.
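If your XML does not live in a local file, DocumentBuilder has parse() overloads for other sources. The sketch below shows three of them under assumed names (DomSourcesSketch and its methods are hypothetical): an in-memory string wrapped in an InputSource, an InputStream, and a URI string.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class DomSourcesSketch {

    private static DocumentBuilder newBuilder()
            throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder();
    }

    // XML held in memory as a String
    static Document fromString(String xml) throws Exception {
        return newBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // XML arriving on any stream (socket, servlet request, and so on)
    static Document fromStream(InputStream in) throws Exception {
        return newBuilder().parse(in);
    }

    // XML identified by a URI; the parser fetches it
    static Document fromUri(String uri) throws Exception {
        return newBuilder().parse(uri);
    }

    public static void main(String[] args) throws Exception {
        Document a = fromString("<a><b/></a>");
        Document c = fromStream(
            new ByteArrayInputStream("<c/>".getBytes("UTF-8")));
        System.out.println(a.getDocumentElement().getNodeName());
        System.out.println(c.getDocumentElement().getNodeName());
    }
}
```

Whichever source you use, the result is the same kind of Document object, so the XPath code that follows does not need to change.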
Get an XML document to evaluate your XPath against
Begin with a simple program that reads in a filename from the command-line. You'll use that name to build a DOM tree from the XML document the filename references. There's nothing XPath- or even JAXP-specific here; just some simple I/O and program plumbing. Listing 6 is the beginning of your program; save this as XPathEvaluator.java.
Listing 6. Initial version of program to evaluate XPaths
package ibm.dw.xpath;
public class XPathEvaluator {
public XPathEvaluator(String xmlFilename) {
// Convert filename into a DOM tree
}
public void evaluateXPath(String xpathString) {
}
public static void main(String[] args) {
try {
if (args.length != 1) {
System.err.println("Usage: java ibm.dw.xpath.XPathEvaluator " +
"[XML filename]");
System.exit(1);
}
XPathEvaluator evaluator = new XPathEvaluator(args[0]);
} catch (Exception e) {
e.printStackTrace();
}
}
}
Convert your XML into a DOM tree
The XPath API—at least in its current form in JAXP—requires a DOM tree to operate upon. All XPaths require some sort of in-memory model to operate upon, because XPaths are fundamentally about the hierarchy of an XML document. DOM provides this, in the form of a navigable tree of elements, attributes, and text nodes.
Since you're already using JAXP for XPath support, you get DOM support as well, for
free. Use the DocumentBuilder class (and its associated
factory, DocumentBuilderFactory) to convert the string reference to an XML document into an in-memory DOM tree. Listing 7 shows the additions to XPathEvaluator to take care of this.
Listing 7. Creating a DOM tree from the input XML document
package ibm.dw.xpath;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
public class XPathEvaluator {
private Document domTree = null;
public XPathEvaluator(String xmlFilename) throws IOException {
try {
// Convert filename into a DOM tree
DocumentBuilderFactory domFactory =
DocumentBuilderFactory.newInstance();
DocumentBuilder builder = domFactory.newDocumentBuilder();
this.domTree = builder.parse(xmlFilename);
} catch (SAXException e) {
throw new IOException("Error in document parsing: " + e.getMessage());
} catch (ParserConfigurationException e) {
throw new IOException("Error in configuring parser: " + e.getMessage());
}
}
public void evaluateXPath(String xpathString) {
}
public static void main(String[] args) {
try {
if (args.length != 1) {
System.err.println("Usage: java ibm.dw.xpath.XPathEvaluator " +
"[XML filename]");
System.exit(1);
}
XPathEvaluator evaluator = new XPathEvaluator(args[0]);
} catch (Exception e) {
e.printStackTrace();
}
}
}
Most of the code is just DOM-based JAXP parsing; if you're unclear on what's going on here, check Resources specifically for the links on general JAXP parsing and transformation articles.
XPath, as is the case with most XML specifications that are fairly modern and current, is namespace aware. That means that namespace prefixes on elements (like iTunes:artist) can be part of your XPaths. Even if you're not using namespaced documents, though, you should ensure that you have this capability for the future.
To do that, though, you must ensure that your DOM tree is namespace aware. In other words, you ensure that the input to your XPath evaluations is namespace-aware, so your evaluations can be. To ensure that, always turn on namespace awareness when you build your DOM tree. Listing 8 shows a single-line addition to accomplish that.
Listing 8. Adding namespace awareness to building the DOM tree
package ibm.dw.xpath;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
public class XPathEvaluator {
private Document domTree = null;
public XPathEvaluator(String xmlFilename) throws IOException {
try {
// Convert filename into a DOM tree
DocumentBuilderFactory domFactory =
DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
this.domTree = builder.parse(xmlFilename);
} catch (SAXException e) {
throw new IOException("Error in document parsing: " + e.getMessage());
} catch (ParserConfigurationException e) {
throw new IOException("Error in configuring parser: " + e.getMessage());
}
}
public void evaluateXPath(String xpathString) {
}
public static void main(String[] args) {
try {
if (args.length != 1) {
System.err.println("Usage: java ibm.dw.xpath.XPathEvaluator " +
"[XML filename]");
System.exit(1);
}
XPathEvaluator evaluator = new XPathEvaluator(args[0]);
} catch (Exception e) {
e.printStackTrace();
}
}
}
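A namespace-aware DOM tree is only half of the picture when your XPath itself contains prefixes: the XPath object also needs to know which namespace URI each prefix stands for, which you supply through setNamespaceContext(). The following is a minimal sketch of that idea. The namespace URI, the music prefix, and the sample document are all invented for illustration; note that the prefix in the XPath does not have to match the prefix in the document, because matching is done on the URI.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class, namespace URI, and sample data; illustration only.
class NamespaceXPathSketch {

    static final String MUSIC_NS = "http://example.com/music";
    static final String SAMPLE =
        "<m:library xmlns:m='" + MUSIC_NS + "'>"
        + "<m:artist>Miles Davis</m:artist></m:library>";

    static String firstArtist() throws Exception {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);   // required for prefixed XPaths
        Document doc = domFactory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(SAMPLE)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Map the "music" prefix used in the XPath to the namespace URI
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "music".equals(prefix) ? MUSIC_NS
                                              : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String namespaceURI) {
                return MUSIC_NS.equals(namespaceURI) ? "music" : null;
            }
            public Iterator<String> getPrefixes(String namespaceURI) {
                String prefix = getPrefix(namespaceURI);
                return prefix == null
                    ? Collections.<String>emptyList().iterator()
                    : Collections.singletonList(prefix).iterator();
            }
        });
        // The two-argument evaluate() returns the result as a String
        return xpath.evaluate("/music:library/music:artist", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstArtist());
    }
}
```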
Represent an XPath in the Java environment
Once you've got a DOM tree to evaluate, you need to take your XPath—which is
just a textual string—and create a Java representation of that. Of course, that
doesn't just mean that you create a String variable and stuff the XPath into it. You need an actual Java object that can either evaluate itself, or be evaluated by some other XPath-aware component, against the DOM tree you've now got. That's where JAXP's new API additions come into play.
Here's where that sequence of events from earlier comes into play. You begin all your XPath work—outside of getting a DOM tree ready, which technically can be done anytime before actual XPath evaluation—with a new class, javax.xml.xpath.XPathFactory.
Specifically, XPathFactory is an abstract class, and you need a concrete implementation of it. That implementation will be vendor-specific; Sun provides a default implementation, Apache might have an implementation, Oracle might have an implementation...but none of that code belongs in a nice, vendor-neutral piece of code. Instead, you can abstract vendor specifics away with XPathFactory, and its newInstance() method, which handles getting an implementation of XPathFactory for you.
Listing 9 takes care of that. Note that this listing shows only the evaluateXPath() method. You'll need to add a few import statements to your code to make this work, all in the javax.xml.xpath package.
Listing 9. Getting an instance of XPathFactory
public void evaluateXPath(String xpathString) {
XPathFactory factory = XPathFactory.newInstance();
}
Next up, you need an XPath object. This object is capable of
evaluating XPaths, and is the cornerstone of your XPath-aware Java programs. Just as you
get a DocumentBuilder from a DocumentBuilderFactory, you get an XPath
from an XPathFactory. Listing 10 shows this minimal code.
Listing 10. Getting an XPath object
public void evaluateXPath(String xpathString) {
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
}
With this object, you're ready to evaluate your XPath, and work with the results.
Once you have an XPath instance, you can evaluate XPaths, get a resulting node set, and do some work with those results.
You evaluate an XPath (not the Java object, but a string path referring to an XML document) with the evaluate method on the XPath Java object. That's a bit confusing: you use XPath to evaluate an XPath. So in a truer sense, the XPath object is an XPath evaluator.
The evaluate() method takes three arguments: a string XPath, a DOM tree to evaluate that XPath against, and an XPathConstants constant indicating the return type. XPathConstants also defines NODE, STRING, NUMBER, and BOOLEAN, but throughout this article you'll use XPathConstants.NODESET, which returns your results as a DOM NodeList structure.
See the code to evaluate an XPath in Listing 11, added to the evaluateXPath method.
Listing 11. Evaluating an XPath
public NodeList evaluateXPath(String xpathString) throws IOException {
    try {
        XPathFactory factory = XPathFactory.newInstance();
        XPath xpath = factory.newXPath();
        return (NodeList)xpath.evaluate(
            xpathString, domTree, XPathConstants.NODESET);
    } catch (XPathExpressionException e) {
        throw new IOException("Error evaluating XPath: " + e.getMessage());
    }
}
Listing 11 includes lots of new additions, all important:
- The method now returns an org.w3c.dom.NodeList. Be sure to add an import org.w3c.dom.NodeList; statement to your code to make this work. NodeList is the structure used to return the list of nodes from the evaluation of your XPath.
- The entire code block is wrapped in a try/catch block, and the exception that can result from XPath evaluation (javax.xml.xpath.XPathExpressionException) is caught and rethrown as an IOException. You'll come back to the reasoning behind this shortly.
- evaluate() is called with the XPath string passed into the method, the DOM tree you built in the class's constructor, and the constant indicating to return results as a list of nodes.
- The result of evaluate, which is an Object, is cast to the DOM NodeList type, and returned.
Despite several things happening, they're all pretty straightforward, and nothing that should trip you up.
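The return-type constant passed to evaluate() decides which Java type comes back. NODESET is what this article uses, but javax.xml.xpath.XPathConstants also defines STRING, NUMBER, BOOLEAN, and NODE. The sketch below tries three of them against a small made-up document; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class and sample data, for illustration only.
class ReturnTypeSketch {

    static final String SAMPLE =
        "<project name='demo'>"
        + "<target name='init'/><target name='compile'/></project>";

    private static Object eval(String expression, QName returnType)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(SAMPLE)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc, returnType);
    }

    // STRING yields a java.lang.String
    static String projectName() throws Exception {
        return (String) eval("/project/@name", XPathConstants.STRING);
    }

    // NUMBER yields a java.lang.Double
    static double targetCount() throws Exception {
        return ((Double) eval("count(//target)",
                              XPathConstants.NUMBER)).doubleValue();
    }

    // BOOLEAN yields a java.lang.Boolean (true if the node set is non-empty)
    static boolean hasInitTarget() throws Exception {
        return ((Boolean) eval("//target[@name='init']",
                               XPathConstants.BOOLEAN)).booleanValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("name:     " + projectName());
        System.out.println("targets:  " + targetCount());
        System.out.println("has init: " + hasInitTarget());
    }
}
```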
XPath-specific, DOM-specific, JAXP-specific?
One interesting point is the decision to return any exceptions from this method, as well as any that arise in the constructor, as IOExceptions. That's a design decision, and not really XPath-specific, but it's important. With that decision, you can insulate users of this class—through the command line or another program—from having to know, import, or directly use any XPath classes or interfaces.
In fact, you abstracted away all JAXP classes, DOM classes, SAX classes, and XPath classes...except the NodeList class from the DOM. That's pretty powerful, as other programmers don't need to be familiar with the JAXP or XPath API to get XPath evaluation. It takes your program from an interesting programming exercise to a reusable tool, and that's a pretty important distinction.
If you take this principle and want to go even a bit further, you could take the returned NodeList and iterate through it, and dump the results into a Java List. That would abstract away the details about DOM completely, and remove even the current small dependency on org.w3c.dom.NodeList.
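One possible shape for that extra step is sketched below. It copies the text content of each result node into a plain java.util.List, so callers never see a DOM type. The class name, method names, and sample document are assumptions made for this illustration, and copying only text content is a simplification; you lose the ability to navigate from the results.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical adapter class and sample data, for illustration only.
class NodeListAdapterSketch {

    static final String SAMPLE =
        "<project><target name='init'/><target name='compile'/></project>";

    // Copy the text content of each node into a DOM-free list
    static List<String> toTextList(NodeList nodes) {
        List<String> values = new ArrayList<String>(nodes.getLength());
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    static List<String> targetNames() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(SAMPLE)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate("//target/@name", doc, XPathConstants.NODESET);
        return toTextList(nodes);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(targetNames());
    }
}
```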
Work with the results of evaluation
Once you get the results of an XPath evaluation, you're ready to work with those results...in whatever format you like. For the sake of example, you'll look at just iterating through the results and printing them out. Of course, you can expand on this as much as you like.
A very simple iteration through result nodes
Each member of a NodeList is in fact a DOM Node (org.w3c.dom.Node), and you can then find out the name of the node, its type, and pretty much anything else about that node you want. Listing 12 shows a very basic addition to the XPathEvaluator class that passes in an XPath to evaluate, gets the results, and prints them out.
Listing 12. Completing the XPathEvaluator program (take one)
package ibm.dw.xpath;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class XPathEvaluator {
private Document domTree = null;
public XPathEvaluator(String xmlFilename) throws IOException {
try {
// Convert filename into a DOM tree
DocumentBuilderFactory domFactory =
DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
this.domTree = builder.parse(xmlFilename);
} catch (SAXException e) {
throw new IOException("Error in document parsing: " + e.getMessage());
} catch (ParserConfigurationException e) {
throw new IOException("Error in configuring parser: " + e.getMessage());
}
}
public NodeList evaluateXPath(String xpathString) throws IOException {
try {
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
return (NodeList)xpath.evaluate(
xpathString, domTree, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
throw new IOException("Error evaluating XPath: " + e.getMessage());
}
}
public static void main(String[] args) {
try {
if (args.length != 1) {
System.err.println("Usage: java ibm.dw.xpath.XPathEvaluator " +
"[XML filename]");
System.exit(1);
}
XPathEvaluator evaluator = new XPathEvaluator(args[0]);
String xpathString = "//target[@name='init']/property[" +
"starts-with(@name, 'parser')]";
NodeList results = evaluator.evaluateXPath(xpathString);
for (int i=0; i<results.getLength(); i++) {
Node node = results.item(i);
System.out.print("Result: ");
switch (node.getNodeType()) {
case Node.ELEMENT_NODE: System.out.println("Element node named " +
node.getNodeName());
break;
case Node.ATTRIBUTE_NODE: System.out.println(
"Attribute node named " +
node.getNodeName() + " with value '" +
node.getNodeValue() + "'");
break;
case Node.TEXT_NODE: System.out.println("Text: '" +
node.getNodeValue() + "'");
break;
default: System.out.println(node);
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
If you read the two-part tutorial on XPath, you'll recognize this XPath as selecting several properties. You should download the example file xerces-build.xml (available in Resources if you don't already have it) to run this example, as shown in Listing 13.
Listing 13. Running XPathEvaluator program (take one)
[bdm0509:~/java_xpath] java ibm.dw.xpath.XPathEvaluator xerces-build.xml
Result: Element node named property
Result: Element node named property
Result: Element node named property
Result: Element node named property
Result: Element node named property
Result: Element node named property
These results look pretty bland, especially if you compare them to Figure 1, a screen capture from the tutorial where this same expression was evaluated graphically using a tool for evaluating XPaths.
Figure 1. Evaluating an XPath expression in a graphical tool
However, that printed out view of elements is deceptively simple.
A node has more than just a name
Remember that while all the sample program did was print out a node's name, type, and possibly its value (depending on that type), you've still got a complete Node object. Further, that node isn't in isolation; it's a reference to a node in an in-memory DOM tree (even if you don't see that DOM tree from a usage perspective; it's hidden internally in the XPathEvaluator code).
What that means is that for each Node, you've really got a location pointer within the complete XML document you handed off to XPathEvaluator. That means you can navigate to a node's children, see what attributes exist on an element node, find out the name of a text node's parent element, and perform any other DOM operation that's allowed on a Node. You don't just have a node, you have a reference to that node in its full DOM context. It's up to you to determine what you do with that node, and the context within which it's positioned.
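To make that concrete, here is a small sketch that starts from a single result node and reads both its own attributes and an attribute of its parent element. The class name and the sample document are made up for illustration (the document only loosely imitates an Ant build file), and the code assumes the XPath selects an element whose parent is also an element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class and sample data, for illustration only.
class NodeContextSketch {

    static final String SAMPLE =
        "<project><target name='init'>"
        + "<property name='parser.jar' value='lib/parser.jar'/>"
        + "</target></project>";

    // Describe an element result using its own and its parent's attributes.
    // Assumes both the node and its parent are elements.
    static String describe(Node node) {
        Element property = (Element) node;
        Element parent = (Element) property.getParentNode();
        return property.getNodeName() + " '" + property.getAttribute("name")
            + "' inside " + parent.getNodeName()
            + " '" + parent.getAttribute("name") + "'";
    }

    static Node firstProperty() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(SAMPLE)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Node) xpath.evaluate("//property", doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        Node node = firstProperty();
        System.out.println(describe(node));
        // List every attribute on the matched element
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            System.out.println("  @" + attribute.getNodeName()
                + " = " + attribute.getNodeValue());
        }
    }
}
```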
About those earlier JAXP, DOM, and XPath abstractions...
You might have noticed that all the work intimated above to avoid DOM-specific references now goes out the window. In fact, that's why XPathEvaluator abstracts XPath details away from users of the class, but still returns a DOM NodeList. You can safely insulate your users from JAXP and XPath, but to do much with the results of an XPath evaluation, you'll need to work with the DOM.
For that reason, it's best to return DOM structures, but avoid requiring XPath-specific input or providing XPath-specific output. Let your users work with the DOM, and nothing else, at least in terms of your requirements for your class functioning.
Developers like you and me are a short-tempered, anxious lot. As you begin to get the feel and command of using XPath from the Java environment, you're probably already thinking about what you can't do with XPath. Particularly complex relationships between data aren't easy to deal with (using SQL-like joins is at the outer extremes of what XPath is built to do), you must do ordering of nodes and further filtering in the Java environment, and readability of XPath is pretty difficult if you're not already familiar with the specification.
Thankfully, you can take a very natural step from XPath to the next thing, which addresses all of these limitations, and does it in a way that is reminiscent of what you've already done. XQuery adds more of an XML-ized version of SQL, allowing you to build queries, sort and order results, and use actual WHERE clauses in your queries. XQuery also builds on XPath, meaning everything you've learned about nodes, predicate matching, and how elements and attributes relate to each other applies to XQuery.
And, just as XPath does, XQuery has an API for its inclusion in Java programs: the XQuery for Java (XQJ) API. For a lot more on XQuery, check out Resources, which has links to articles and tutorials on XQuery and XQJ. And once you feel you've gotten your head firmly around XPath, take a look at XQuery to add even more power to your XML-related application code.
Much of using XPath from Java technology is simply a matter of learning new syntax, getting an API and a few tools configured, and then applying what you already know about XPath. That shouldn't make you think that using XPath in the Java environment is trivial, though. Without demanding much complexity, XPath offers a tremendous amount of flexibility when you work with XML from Java programming. It certainly moves you far beyond what most basic SAX, DOM, JAXP, JDOM, or other implementations provide (although some vendors and projects provide XPath-capable extensions to the basics that those specs and APIs offer).
And, XPath offers a wonderful gateway to the more complex XQuery language, and Java and XQuery combinations (using the XQJ API). Rather than immediately move on to XQuery, you'll do well to polish your XPath skills, and learn to select complex node sets from within your Java applications, and manipulate those as needed. You'll find lots of cases where you don't need anything beyond XPath. On top of that, XQuery builds upon XPath—both from a lexical perspective and in terms of the XQJ API, which can actually evaluate XPaths as well as execute XQueries—so you're improving your XQuery skills implicitly. Most of all, have fun with the increased flexibility that XPath offers, especially when evaluated from the Java environment.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample compiled code for article | compiledCode.zip | 3KB | HTTP |
| Sample source code for article | sourceCode.zip | 2KB | HTTP |
| Sample XML for article | xerces-build-xml.zip | 11KB | HTTP |
Learn
- If you're unfamiliar with XPath, take this two-part tutorial:
  - Locate specific sections of your XML documents with XPath, Part 1 (Brett McLaughlin, Sr., developerWorks, June 2008): Explore XPath basics. Easily locate and refer to specific data in a document and its various selectors and semantics, in an example-driven, hands-on manner.
  - Locate specific sections of your XML documents with XPath, Part 2 (Brett McLaughlin, Sr., developerWorks, June 2008): Add predicates to your XPath skills to find exact nodes as you evaluate attribute values plus the parent and child nodes of a target element.
- XPath 1.0: Read the formal definition of XPath in the original specification.
- XPath 2.0: Read the online specification for the most current version of XPath.
- Tutorial on XPath: Understand how XPath is fundamental to much advanced XML usage with this useful but brief tutorial from the W3C.
- Online function reference for XPath, XQuery, and XSLT: Once you understand how predicates and functions work, visit this great resource to look up syntax and find functions that aren't commonly discussed.
- Sun's XQuery for Java API: Move from XPath to XQuery when you check out this page, which details the complete XQuery for Java API.
- DataDirect resource page: From DataDirect, the hosts of xquery.com, explore implementations of both an XPath evaluator (Stylus Studio) and an XQuery for Java (XQJ) engine, plus related work on XQuery and XPath.
- DataDirect online help system: Visit this indexed, searchable resource that's great for finding out about particular DataDirect objects and methods.
- Understanding DOM (Nicholas Chase, developerWorks, March 2007): Dig deeper into manipulating XML from a node-based API in an excellent tutorial.
- IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
- XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- The technology bookstore: Browse for books on these and other technical topics.
- developerWorks podcasts: Listen to interesting interviews and discussions for software developers.
Get products and technologies
- Java 5 SE: Download for integrated XPath support on your system.
- Java 6 software: If you're considering upgrading to Java 5 technology, just skip version 5 and go straight to the very latest version, if at all possible.
- Stylus Studio 2008 XML: Download to start with XPath and XML documents on the Windows platform.
- AquaPath: Download to enable easy XPath location evaluation on Mac OS X.
- DataDirect's XQuery for Java implementation: Download and get started with XQuery and Java searches.
- Java & XML, Third Edition (Brett McLaughlin and Justin Edelson, O'Reilly Media, 2006): Cover XML from start to finish, including extensive information on XML, XSL, and a number of related XML specifications.
- IBM trial software for product evaluation: Build your next project with trial software available for download directly from developerWorks, including application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- XML zone discussion forums: Participate in any of several XML-related discussions.
- developerWorks XML zone: Share your thoughts: After you read this article, post your comments and thoughts in this forum. The XML zone editors moderate the forum and welcome your input.
- developerWorks blogs: Check out these blogs and get involved in the developerWorks community.

Brett McLaughlin is a bestselling and award-winning non-fiction author. His books on computer programming, home theater, and analysis and design have sold in excess of 100,000 copies. He has been writing, editing, and producing technical books for nearly a decade, and is as comfortable in front of a word processor as he is behind a guitar, chasing his two sons around the house, or laughing at reruns of Arrested Development with his wife. His last book, Head First Object Oriented Analysis and Design, won the 2007 Jolt Technical Book award. His classic Java and XML remains one of the definitive works on using XML technologies in the Java language.





