Manage XML collections with XAPI

Use the Application Programming Interface for XML Databases with your favorite programming language

Uche Ogbuji (uche@ogbuji.net), Principal Consultant, Fourthought, Inc.
Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is also a lead developer of the Versa RDF query language. He is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can contact Mr. Ogbuji at uche@ogbuji.net.

Summary:  XML repositories are a simple extension of the idea of XML documents, and they call for a simple API for access and manipulation. The likes of DOM and XPath are too granular, while XQuery may be too elaborate for some needs. A group of XML repository implementers (named XML:DB) have come together to develop such an API specification, and the result is the Application Programming Interface for XML Databases (XAPI). In this article, Uche Ogbuji introduces XAPI.

Date:  11 Jan 2005
Level:  Introductory

When people first thought about how to use XML, one of the early questions was, "If I have a lot of related XML files, how can I best manage them in a system that allows individual control over each file where necessary, plus broader access and management across files where necessary?" Systems that provide some practical answer to this question are called XML repositories. XML repositories support collections of XML documents, and provide persistent storage and services (whether as a mainstream Web service or in some other form) for access and manipulation of the contents. Once your XML repository is in place, you'll probably want to perform ad-hoc as well as programmatic queries on the data, and so developers of early XML databases, server frameworks, and programming language interfaces (commercial as well as open source) came up with proprietary interfaces for data access. Soon, as is the custom in most areas of XML development, an informal group got together to write common specifications for XML repository APIs. In this case it was the XML:DB group (see Resources).

The XML:DB group is even less formal than most, and has developed its specifications in very sketchy fashion. Even so, it has come up with several specifications that are influential as well as widely implemented. XUpdate (see Resources) is probably the best known example, but another important entry is the Application Programming Interface for XML Databases (XAPI). In the XML:DB's own words (from the XAPI specification page):

The XML:DB API [XAPI] is designed to enable a common access mechanism to XML databases. The API enables the construction of applications to store, retrieve, modify and query data that is stored in an XML database. These facilities are intended to enable the construction of applications for any XML database that claims conformance with the XML:DB API. The API can be considered generally equivalent to technologies such as ODBC, JDBC or Perl [or Python] DBI.

In this article I introduce XAPI and contrast it with similar specifications.

Whose territory is this?

With the words "XML", "API", and "query" featuring so prominently in the description so far, you might wonder, "But what about DOM, XQuery, and the like?" XAPI falls neatly into a niche between some of the better-known specifications. DOM focuses on node-level access to individual documents. XPath is also typically used with individual documents, although technically it can work on any grove of XML nodes. A grove is a collection of trees (in computer science terms, a Directed Acyclic Graph), and XSLT is an example of a host language for XPath that takes advantage of this distinction to let XPath process the contents of multiple source documents (courtesy of the XSLT document() function). XAPI also takes advantage of XPath's flexibility, as I shall show.

XQuery is designed from the ground up to support aggregations that span multiple documents, but its basic method is to define a very rigorous data model and semantics for the abstract data being queried. XQuery is designed to work just as well accessing legacy data stores such as relational and object databases, and the resulting abstraction and complexity give it a very different feel from that of popular XML repositories, which are often just collections of XML as plain files or in simple hash databases. So where XQuery essentially provides a calculus for all conceptualizations of how one might access XML data, there is still room for a simple arithmetic of basic XML collections, focusing on the idea of file-system-like hierarchies of XML documents without too much conceptual load. To offer another analogy, XQuery feels like a high-end enterprise DBMS, with all the associated power and cost, whereas XAPI is more like the GNU utilities for UNIX file processing -- focusing on pipelines of input and output text, with very simple operations in each processing stage. The two need not be mutually exclusive, and indeed some repositories that implement XAPI also implement XQuery.


Nuts and bolts

XAPI is designed for modular understanding and implementation. As with DOM, it is broken into modules, each of which is defined in Interface Definition Language (IDL) for language independence, although this imparts a strongly typed, object-oriented bias that may not suit all languages very well. Again like DOM, the XAPI modules are organized into levels of conformance: Minimum Conformance Level, Core Level 0, and Core Level 1.

Minimum Conformance Level defines interfaces for basic repository features such as Resource -- the basic unit of data (typically an individual document) -- and Collection -- representing a collection of resources (typically some sort of folder or container). In addition, Service provides extensions to collections for query and management tasks, Database abstracts a connection to a particular repository, and ResourceIterator and ResourceSet generally represent result sets from queries. This conformance level deals strictly with abstract objects as the content of resources, but it does support the idea of extensible types and identifiers for resources.
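To make the resource-and-collection idea concrete, here is a minimal in-memory sketch of a file-system-like hierarchy. The class names (SketchResource, SketchCollection, CollectionSketchDemo) and their methods are hypothetical stand-ins invented for this illustration; they are not the org.xmldb.api interfaces, and a real driver would back them with persistent storage.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the XAPI notion of a Resource (not org.xmldb.api.base.Resource).
class SketchResource {
    final String id;      // identifier, unique within the parent collection
    final String content; // treated as opaque content at this level

    SketchResource(String id, String content) {
        this.id = id;
        this.content = content;
    }
}

// Hypothetical stand-in for a Collection: holds resources and child collections.
class SketchCollection {
    final String name;
    private final Map<String, SketchResource> resources = new LinkedHashMap<>();
    private final Map<String, SketchCollection> children = new LinkedHashMap<>();

    SketchCollection(String name) {
        this.name = name;
    }

    void store(SketchResource r) {
        resources.put(r.id, r);
    }

    SketchResource get(String id) {
        return resources.get(id);
    }

    // Returns the named child collection, creating it on first use.
    SketchCollection child(String childName) {
        return children.computeIfAbsent(childName, SketchCollection::new);
    }

    // Lists this collection's resources first, then recurses into child collections.
    List<String> listPaths(String prefix) {
        List<String> out = new ArrayList<>();
        String here = prefix + "/" + name;
        for (String id : resources.keySet()) {
            out.add(here + "/" + id);
        }
        for (SketchCollection c : children.values()) {
            out.addAll(c.listPaths(here));
        }
        return out;
    }
}

class CollectionSketchDemo {
    static SketchCollection build() {
        SketchCollection db = new SketchCollection("db");
        db.child("movies").store(new SketchResource("m1.xml", "<movie title='Music Man'/>"));
        db.child("movies").store(new SketchResource("m2.xml", "<movie title='Oklahoma!'/>"));
        db.child("books").store(new SketchResource("b1.xml", "<book/>"));
        return db;
    }

    public static void main(String[] args) {
        // Prints /db/movies/m1.xml, /db/movies/m2.xml, /db/books/b1.xml (one per line)
        for (String path : build().listPaths("")) {
            System.out.println(path);
        }
    }
}
```

The point of the sketch is only the shape of the model: resources are addressed by identifier inside a collection, and collections nest like directories.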

Core Level 0 refines the abstract idea of resources to add XML particulars. It allows you to get the contents of an XMLResource as a DOM node (method getContentAsDOM()) or as a series of SAX events (method getContentAsSAX()). Similarly, Core Level 0 includes methods for modification of XML content, as well as an interface (BinaryResource) for content defined as (non-XML) byte streams.
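The difference between the two content views is easier to see in code. The sketch below does not call XAPI at all; it uses the JDK's own JAXP parsers on a small sample string to contrast a DOM tree with a stream of SAX events. The class name DomVersusSax and the sample XML are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DomVersusSax {
    // Sample content invented for this illustration.
    static final String XML =
        "<movies><movie title=\"Music Man\"/><movie title=\"Oklahoma!\"/></movies>";

    // Tree view: the whole document is built in memory, then navigated.
    static String rootNameViaDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Event view: the parser calls back once per start tag, in document order.
    static List<String> elementNamesViaSax(String xml) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(rootNameViaDom(XML));     // movies
        System.out.println(elementNamesViaSax(XML)); // [movies, movie, movie]
    }
}
```

A DOM view suits random access and modification of a resource; a SAX view suits one-pass processing of large content without holding the whole tree in memory.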

Core Level 1 builds upon the other levels, and adds the following interfaces for common query and manipulation services:

  • XPathQueryService: Allows you to use XPath to query a collection or resource, including methods for namespace mapping and for query execution.
  • XUpdateQueryService: Allows you to use XUpdate to modify a collection or resource.
  • CollectionManagementService: Allows you to add and remove collections (think "make directory" and "remove directory").
  • TransactionService: Defines transaction context within services, allowing for clean data operations when multiple tasks are operating at the same time.
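As a rough illustration of what an XPath query service with namespace mapping does, the following sketch runs one XPath expression across a small in-memory set of documents using the JDK's javax.xml.xpath package. It is not the XAPI XPathQueryService: the class CollectionXPathSketch, its methods, the sample documents, and the namespace URI are all assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CollectionXPathSketch {
    // Hypothetical namespace URI, used only for this illustration.
    static final String NS = "http://example.org/movies";

    // A stand-in "collection": resource id mapped to XML text.
    static Map<String, String> sampleCollection() {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("m1.xml", "<m:movie xmlns:m='" + NS + "' title='Music Man'/>");
        docs.put("m2.xml", "<m:movie xmlns:m='" + NS + "' title='Oklahoma!'/>");
        docs.put("m3.xml", "<movie title='Music Man'/>"); // same title, no namespace
        return docs;
    }

    // Maps a single prefix to a single URI for use inside XPath expressions.
    static NamespaceContext mapping(final String prefix, final String uri) {
        return new NamespaceContext() {
            @Override public String getNamespaceURI(String p) {
                return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String u) {
                return uri.equals(u) ? prefix : null;
            }
            @Override public Iterator<String> getPrefixes(String u) {
                return uri.equals(u)
                    ? Collections.singletonList(prefix).iterator()
                    : Collections.<String>emptyIterator();
            }
        };
    }

    // Returns the ids of all documents in which the expression selects at least one node.
    static List<String> query(Map<String, String> docs, NamespaceContext ns, String expr) {
        List<String> hits = new ArrayList<>();
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed for prefixed name tests to match
            DocumentBuilder builder = dbf.newDocumentBuilder();
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(ns);
            for (Map.Entry<String, String> e : docs.entrySet()) {
                Document doc = builder.parse(new InputSource(new StringReader(e.getValue())));
                NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
                if (nodes.getLength() > 0) {
                    hits.add(e.getKey());
                }
            }
        } catch (Exception ex) {
            throw new IllegalStateException("query failed", ex);
        }
        return hits;
    }

    public static void main(String[] args) {
        // The query prefix "mv" differs from the documents' prefix "m"; only the URI matters.
        System.out.println(
            query(sampleCollection(), mapping("mv", NS), "//mv:movie[@title='Music Man']"));
        // prints [m1.xml]
    }
}
```

Note that m3.xml is not returned even though its title matches, because its movie element is in no namespace. This is why a query service needs a way to bind prefixes before a query runs.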

A simple use case

Listing 1 is a simple Java program excerpt from the XAPI use cases that exercises all three XAPI levels. It queries a movie database for movies with the title "Music Man".


Listing 1. Simple query of a movie database
import org.xmldb.api.base.*;
import org.xmldb.api.modules.*;
import org.xmldb.api.*;

/**
 * Simple XML:DB API example to query the database.
 */
public class Example1 {
   public static void main(String[] args) throws Exception {
      Collection col = null;
      try {
         /* Section A */
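         String driver = "org.vendorx.xmldb.DatabaseImpl";
         Class<?> c = Class.forName(driver);

         // Class.newInstance() is deprecated in current Java; use the no-arg constructor.
         Database database = (Database) c.getDeclaredConstructor().newInstance();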
         DatabaseManager.registerDatabase(database);
         col =
            DatabaseManager.getCollection("xmldb:vendorx://db.xmlmovies.com:2030/movies");
   
         /* Section B */
         String xpath = "//movie[@title='Music Man']";
         XPathQueryService service =
            (XPathQueryService) col.getService("XPathQueryService", "1.0");
         ResourceSet resultSet = service.query(xpath);
         
         /* Section C */
         ResourceIterator results = resultSet.getIterator();
         while (results.hasMoreResources()) {
            Resource res = results.nextResource();
            System.out.println((String) res.getContent());
         }
      }
      catch (XMLDBException e) {
         System.err.println("XML:DB Exception occurred " + e.errorCode);
      }
      finally {
         if (col != null) {
            col.close();
         }
      }
   }
}

The driver is the module that ties the XAPI interfaces to the actual database or library implementation. The code section labeled "A" essentially creates a database connection and selects a collection within the database. The section labeled "B" sets up and executes an XPath query (for simplicity, no namespaces are used). The section labeled "C" iterates over and prints the results of the query.
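The collection URI in Section A packs several pieces of information into one string. The sketch below pulls them apart, assuming the URI has the shape xmldb:driver-name://host:port/collection-path as it does in Listing 1. The class XmldbUriSketch and its split() method are hypothetical helpers written for this illustration, not part of XAPI, and real drivers may accept other URI shapes.

```java
import java.net.URI;

class XmldbUriSketch {
    // Splits an XML:DB-style collection URI into
    // { driver name, host, port, collection path }.
    // Assumes the form xmldb:<driver>://<host>:<port>/<path>, as in Listing 1.
    static String[] split(String xmldbUri) {
        String prefix = "xmldb:";
        if (!xmldbUri.startsWith(prefix)) {
            throw new IllegalArgumentException("not an xmldb: URI: " + xmldbUri);
        }
        // What remains after "xmldb:" parses as an ordinary hierarchical URI
        // whose scheme is the driver name.
        URI inner = URI.create(xmldbUri.substring(prefix.length()));
        return new String[] {
            inner.getScheme(),
            inner.getHost(),
            String.valueOf(inner.getPort()),
            inner.getPath()
        };
    }

    public static void main(String[] args) {
        String[] parts = split("xmldb:vendorx://db.xmlmovies.com:2030/movies");
        // Prints: vendorx db.xmlmovies.com 2030 /movies
        System.out.println(String.join(" ", parts));
    }
}
```

Read this way, the driver name ("vendorx") identifies which registered Database should handle the request, and the rest locates the collection on that database.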


State of the art

The current XAPI working drafts are three years old, which is cause for some caution but should not put you off entirely. For one thing, remember that the much-touted XQuery has been winding its way along for at least as many years. For another, the XML:DB group is (in-)famous for developing specifications that hit a bit of a rut in mid-development, and yet are simple and clean enough to become widely implemented. (XUpdate is a good example; that spec is also in need of repair, yet it is widely implemented.) You can download a good number of XAPI implementations right away (see Resources), including a reference implementation hosted on SourceForge. I have seen more and more lightweight XML repository systems emerging, and if you are developing one you should certainly consider providing an XAPI-like interface. It's simple enough to understand, and probably just as easy to implement.


Resources

  • Check out the XAPI home page and the overall XML:DB page. The XAPI Use Cases are a good way to get a quick idea of the API, which is set forth in more detail (IDL and Javadocs) in the Working Draft.

  • Try out some XAPI implementations, such as the reference implementation, Apache Xindice, or eXist. Be aware that one of the earliest XAPI implementations, dbXML, no longer exists, having merged with the Sleepycat Berkeley DB XML project. XAPI support was dropped in the process.

  • Learn more about XUpdate, which defines update facilities for modifying data in XML documents. XUpdate is designed to work on regular XML documents as well as XML in database collections, and even virtual XML data models. It is an XML vocabulary similar to XSLT, but is much simpler and is a very accessible vocabulary overall. Like XSLT, it uses XPath for accessing the document to be modified, and has specialized elements that define output operations. XUpdate is also widely implemented, mostly among open-source tools such as XML DBMS and XML difference and patching tools. The XUpdate Use Cases draft also serves as an excellent introduction to XUpdate.

  • Find more XML resources on the developerWorks XML zone, including Uche Ogbuji's Thinking XML column. See also Part 2 and Part 4 of the XML standards survey, which mention XUpdate and XAPI.
