
Tip: Namespaces and versioning

Using XML namespaces to mark the version of XML formats

Uche Ogbuji (uche@ogbuji.net), Principal Consultant, Fourthought, Inc.
Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is a Computer Engineer and writer born in Nigeria, living and working in Boulder, Colo., USA. You can contact Mr. Ogbuji at uche@ogbuji.net.

Summary:  You can use several techniques for versioning XML schemas, such as defining special root attributes or using the DTD. This tip discusses how to use XML namespaces to version formats.


Date:  01 Jun 2002
Level:  Intermediate

One of the core features of XML is its ability to deal with changes in the rules for data (hence the extensible in its name -- Extensible Markup Language). As XML vocabularies change, multiple versions inevitably come into existence, so each version must be marked clearly for both human readers and software. Clear version markers can drive validation, or let processing branch according to the requirements of each version.

You can mark the version of an XML vocabulary in many ways. This discussion focuses on the use of XML namespaces for marking versions.

Versioning with special attributes or document types

Let's start with an XML vocabulary for a mailing label format:


Listing 1. Mailing label format
<?xml version="1.0"?>
<labels>
  <label>
    <name>Thomas Eliot</name>
    <address>
      <street>3 Prufrock Lane</street>
      <city>Stamford</city>
      <state>CT</state>
    </address>
  </label>
</labels>

If I think this format may change, it makes sense to mark its version when I first deploy it. One way of doing this is through the document type declaration (the DOCTYPE line). The declaration refers to a public identifier, which can be made specific to the document version. A good example of this is the W3C's XHTML public identifier, as used in the following declaration:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">

You can see not only the version (1.0), but also the variation within the version (XHTML has three variations: strict, transitional, and frameset).

Naturally, the DTDs for the various versions themselves reflect the changes being made. This approach requires that I define the format in a DTD, which might not always be desired. Also, though DOM and SAX provide access to the public identifier used in the source's declaration, XSLT does not.
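As a rough illustration of the DOM side of that point, the following Python sketch reads the public identifier with the standard library's xml.dom.minidom. The DOCTYPE is the XHTML declaration quoted above; the document body and the helper name public_id are made up for the example.

```python
from xml.dom import minidom

# The DOCTYPE is the XHTML declaration quoted above; the body is a stand-in.
XHTML_DOC = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"DTD/xhtml1-strict.dtd">\n'
    '<html><head><title>t</title></head><body/></html>'
)


def public_id(xml_text):
    """Return the public identifier from the DOCTYPE, or None if absent."""
    doc = minidom.parseString(xml_text)
    doctype = doc.doctype  # None when the document has no DOCTYPE
    return doctype.publicId if doctype is not None else None


if __name__ == "__main__":
    print(public_id(XHTML_DOC))
```

Code that branches on the version would compare this string against the public identifiers it knows about.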

Another approach often used is the top-level version attribute. For instance:


Listing 2. Mailing label format with version attribute
<?xml version="1.0"?>
<labels version="1.0">
  <label>
    <name>Thomas Eliot</name>
    <address>
      <street>3 Prufrock Lane</street>
      <city>Stamford</city>
      <state>CT</state>
    </address>
  </label>
</labels>

The top-level version attribute works whether I use an XML schema system or not. And the version information is available from DOM, SAX, XSLT, or any other normal XML processing technology.

The version attribute approach is taken by the XSLT language itself. The main problem with this approach is conceptual: The connection between the version identifier and each XML information item is somewhat tenuous (typically through an attribute of an ancestor, possibly a distant one). This can also lead to some awkwardness in code that dispatches according to version.
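To make the dispatch issue concrete, here is a minimal Python sketch using xml.etree.ElementTree on the Listing 2 document. The function names and the FORMATTERS table are hypothetical, invented for this example. Because the version lives only on the root element, the formatter chosen there has to be handed down to the code that handles each nested address.

```python
import xml.etree.ElementTree as ET

# The Listing 2 document.
LABELS_DOC = """<?xml version="1.0"?>
<labels version="1.0">
  <label>
    <name>Thomas Eliot</name>
    <address>
      <street>3 Prufrock Lane</street>
      <city>Stamford</city>
      <state>CT</state>
    </address>
  </label>
</labels>"""


def format_address_v1_0(address):
    """Format an <address> element according to the 1.0 rules."""
    parts = [address.findtext(tag) for tag in ("street", "city", "state")]
    return ", ".join(part for part in parts if part)


# Hypothetical dispatch table: version string -> address formatter.
FORMATTERS = {"1.0": format_address_v1_0}


def format_labels(xml_text):
    root = ET.fromstring(xml_text)
    version = root.get("version")  # the version is stored only on the root
    if version not in FORMATTERS:
        raise ValueError(f"unsupported labels version: {version!r}")
    format_address = FORMATTERS[version]
    # The <address> elements carry no version of their own, so the choice
    # made at the root is passed along explicitly.
    return [
        f"{label.findtext('name')}: {format_address(label.find('address'))}"
        for label in root.iter("label")
    ]


if __name__ == "__main__":
    print(format_labels(LABELS_DOC))
```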


Versioning with namespaces

To make the version information a more immediate property of the XML information items, you can place them in XML namespaces that reflect the version. For example:


Listing 3. Mailing label format with namespace version
<?xml version="1.0"?>
<labels xmlns="http://uche.ogbuji.net/eg/labels/1.0">
  <label>
    <name>Thomas Eliot</name>
    <address>
      <street>3 Prufrock Lane</street>
      <city>Stamford</city>
      <state>CT</state>
    </address>
  </label>
</labels>

Thus the version information comes through with the namespace -- on each SAX event, on each DOM node, or in the expanded name of each XPath node (available through the namespace-uri() function). This system is common, and most W3C vocabularies use it. In fact, XSLT uses it in addition to the version attribute.
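The same holds for other tree APIs. As a small sketch, Python's xml.etree.ElementTree stores each element's tag as {namespace-URI}local-name, so the version-bearing URI can be read from any element without looking at its ancestors. The helper names split_tag and namespaces_by_element are hypothetical, and the document is a shortened form of Listing 3.

```python
import xml.etree.ElementTree as ET

# A shortened form of Listing 3.
NS_DOC = """<?xml version="1.0"?>
<labels xmlns="http://uche.ogbuji.net/eg/labels/1.0">
  <label>
    <name>Thomas Eliot</name>
  </label>
</labels>"""


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return None, tag  # element is in no namespace


def namespaces_by_element(xml_text):
    """Return (namespace URI, local name) for every element, in document order."""
    root = ET.fromstring(xml_text)
    return [split_tag(element.tag) for element in root.iter()]


if __name__ == "__main__":
    for uri, local in namespaces_by_element(NS_DOC):
        print(local, uri)
```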

To be precise, the W3C usually uses date stamps in the namespace URIs rather than version numbers. I might emulate this with the following:


Listing 4. Mailing label format with date-stamped namespace version
<?xml version="1.0"?>
<labels xmlns="http://uche.ogbuji.net/eg/labels/2002/05">
  <label>
    <name>Thomas Eliot</name>
    <address>
      <street>3 Prufrock Lane</street>
      <city>Stamford</city>
      <state>CT</state>
    </address>
  </label>
</labels>

When you use namespaces for versioning as shown in Listing 4, the biggest problem is that even a small change in the format becomes a big issue, because the namespace propagates to every element. If I tweak the format to allow an optional country element in the address, users end up supporting the original namespace as well as the updated one (say http://uche.ogbuji.net/eg/labels/2002/06) in all their processing code, even though the change might have little actual effect on that code.
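A sketch of what that burden looks like in processing code follows. It assumes a second, hypothetical namespace URI for the revised format (the 2002/06 URI below is only an example) and shows that every element lookup has to be qualified with whichever of the two namespaces the document uses, even though the only real difference is the optional country element.

```python
import xml.etree.ElementTree as ET

ORIGINAL_NS = "http://uche.ogbuji.net/eg/labels/2002/05"  # from Listing 4
UPDATED_NS = "http://uche.ogbuji.net/eg/labels/2002/06"   # hypothetical revision
SUPPORTED = (ORIGINAL_NS, UPDATED_NS)


def root_namespace(root):
    """Return the namespace URI of the root element, or None."""
    if root.tag.startswith("{"):
        return root.tag[1:].partition("}")[0]
    return None


def read_addresses(xml_text):
    root = ET.fromstring(xml_text)
    ns = root_namespace(root)
    if ns not in SUPPORTED:
        raise ValueError(f"unsupported labels namespace: {ns!r}")

    def q(local):
        # Qualify a local name with the namespace this document uses.
        return f"{{{ns}}}{local}"

    return [
        {
            "city": address.findtext(q("city")),
            # Absent (None) in the original format, optional in the revision.
            "country": address.findtext(q("country")),
        }
        for address in root.iter(q("address"))
    ]
```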

If a change to the format across a version is minor enough that it is not expected to affect processing much, one solution is not to change the namespace URI with every single change in format. This solution works in most cases, but does break down when the maintainer of a namespace uses a retrievable URI that points to an actual document that describes the format. In this instance, the document will likely change with any format change, regardless of how minor; hence it makes sense to change that document's URI, which also happens to be the namespace URI.

Eric van der Vlist proposed a system for minimizing this problem on the XML-DEV mailing list in March 2001 (see Resources).

In this system, version numbers are divided into major and minor parts based on the magnitude of the format changes represented. Only the major parts of version numbers are used in the namespace. For instance, the original version of my mailing label format is 1.0 (major 1, minor 0). After I add the optional country element, the new version is 1.1 (major 1, minor 1). The namespace I use in both cases is:

http://uche.ogbuji.net/eg/labels/1
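The mapping from a full version number to its namespace can be sketched in a few lines of Python. The function name namespace_for and the strict major.minor input format are assumptions made for the example.

```python
BASE = "http://uche.ogbuji.net/eg/labels/"


def namespace_for(version):
    """Map a 'major.minor' version string to the major-only namespace URI."""
    major, _, minor = version.partition(".")
    if not (major.isdigit() and minor.isdigit()):
        raise ValueError(f"expected 'major.minor', got {version!r}")
    return BASE + major  # the minor part is deliberately left out


if __name__ == "__main__":
    # Both versions of the format share one namespace.
    print(namespace_for("1.0"))
    print(namespace_for("1.1"))
```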

Then I set up the HTTP server (that provides the documentation pointed to by each namespace URI) to redirect the user from a URL with only the major version number to a URL that gives the precise version. So when the server gets a request for http://uche.ogbuji.net/eg/labels/1, it redirects to the document at http://uche.ogbuji.net/eg/labels/1.1, as that is the latest version. A user is still free to retrieve the 1.0 document by making an explicit request for that URI.
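One way the redirect might be set up is sketched below with Python's standard http.server module. The LATEST table, the handler name, and the choice of a 302 status code are all assumptions for illustration; a production server would more likely do this in its own configuration.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical table: major-version path -> path of the latest minor version.
LATEST = {"/eg/labels/1": "/eg/labels/1.1"}


def redirect_target(path):
    """Return the precise-version path for a major-only path, else None."""
    return LATEST.get(path)


class NamespaceDocHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is not None:
            # 302 is an assumed choice: the target changes whenever a new
            # minor version is published.
            self.send_response(302)
            self.send_header("Location", target)
            self.end_headers()
            return
        # Precise versions such as /eg/labels/1.0 would be served here; this
        # sketch stores no documents, so it reports 404 instead.
        self.send_error(404)
```

To try it locally, the handler can be passed to http.server.HTTPServer, for example HTTPServer(("localhost", 8000), NamespaceDocHandler).serve_forever().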


Conclusion

This tip glosses over several controversial points by assuming common practice. Marking versions using namespaces is more common than doing so using version attributes, though which approach is better is a matter of debate. It is also controversial whether a namespace URI should point to anything at all, either directly to a document defining the format or to a general information document about the vocabulary, as defined by the Resource Directory Description Language (RDDL). Again, common practice uses HTTP URLs for the namespaces. Even with the subtleties explored in this discussion, placing the version in the namespace is, in practice, well proven, and makes dealing with changes in XML format just a bit less hairy.


Resources

