Plan to use XML namespaces, Part 1

The best ways to use XML namespaces to your advantage

David Marston, Engineer, IBM Research
David Marston has worked with XML technologies since late 1998. Over his 25+ years in the computing business, he has been involved with all aspects of software development. He is a graduate of Dartmouth College and a member of the ACM. He is on the Next-Generation Web team at IBM Research. You can contact him at David_Marston@us.ibm.com.

Summary:  This article introduces XML namespaces, explores their practical benefits, and shows you how they are used in the standard XML formats and tools defined by the W3C. Several W3C specifications are mentioned, notably XML Schema and XSLT, which offer useful ideas for using namespaces to your advantage. Best practices range from terminology usage up through system-wide design.

Date:  29 Apr 2004 (Published 01 Nov 2002)
Level:  Intermediate

Note: This document mentions changes proposed up through the September 2002 "Last Call Working Draft" of version 1.1 of Namespaces in XML.

Most business and communications problems that XML can solve require a combination of several XML vocabularies. (You may read "tag and attribute sets" in place of the term "XML vocabularies" if you wish.) XML has a mechanism for qualifying names so that they fall into different namespaces, such as namespaces that apply to different industries. A company or (better yet) an industry consortium can assign names to elements and use common words like "title" or "state" without worrying that those names will clash with the same names used in another vocabulary.

XML namespaces also allow names to evolve over time. After using the first version of a vocabulary, your real-world experience may lead you to devise an enhanced vocabulary. The new version can be assigned to a different namespace, and you can use XSLT to transform data from one vocabulary to the other.

Speaking of XSLT, that stylesheet standard provides for the importation of subsidiary stylesheets, which can contain generic templates written by others. The name of a template can be qualified to a namespace, again avoiding clashes. In other words, my stylesheet can call a named template that has a distinctive name qualified by a namespace (which has been chosen by the template's author). I could even use more than one library of templates imported into my stylesheet, and different namespaces for each library would avoid duplicate names of the templates. Many recommended standards of the World Wide Web Consortium (W3C) promote namespaces for modularity.

XML namespaces also allow various tools that process XML, such as a stylesheet-driven XSLT processor, to pick out the instructions they should obey and treat instructions for other processors as just more data. The processor is set up to consider elements from a particular namespace (or two) to be the instructions. Elements that have no namespace are data, as are all elements that have a namespace other than those recognized as instructions.
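As a rough illustration of that split, the following Python sketch uses the standard library's xml.etree.ElementTree to sort the elements of a small stylesheet into instructions and data. The stylesheet content and the function name split_by_namespace are invented for this illustration; only the XSLT namespace URI comes from the W3C. A real XSLT processor does far more than this, of course.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# A hypothetical stylesheet fragment mixing XSLT instructions with literal output.
STYLESHEET = """
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body><xsl:value-of select="title"/></body>
    </html>
  </xsl:template>
</xsl:stylesheet>
"""

def split_by_namespace(root, instruction_ns):
    """Return (instructions, data): local names in instruction_ns, and all other tags."""
    instructions, data = [], []
    marker = "{" + instruction_ns + "}"
    for element in root.iter():
        # ElementTree stores a namespace-qualified name as "{uri}local".
        if element.tag.startswith(marker):
            instructions.append(element.tag[len(marker):])
        else:
            data.append(element.tag)
    return instructions, data

instructions, data = split_by_namespace(ET.fromstring(STYLESHEET), XSLT_NS)
print(instructions)  # elements the processor would obey
print(data)          # elements it would treat as just more data
```

Here html and body have no namespace, so they land in the data list, while the three xsl-prefixed elements are recognized as instructions.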

Brief introduction to XML namespaces

The formal designation of a namespace is a URI. Generally, you'll see URLs (one form of URI) as the identifier. Because URIs use a wide range of characters, there would be a severe impact on the XML syntax if we had to attach the full URI directly to every qualified name. Therefore, the XML Namespaces Recommendation also defines prefixes that are directly attached to names. Syntactically, you use quotes (single or double) around the URI string, and a colon to set off the prefix; other characters present no interference. The prefix is a standard XML name. You can avoid using a prefix by assigning one URI to all unprefixed names or by laboriously (and dangerously) reassigning the default namespace wherever needed in the document. For practical purposes, prefixes are required when you intermix vocabularies.

Like other specifications for XML, the XML Namespaces Recommendation is published by the W3C. The W3C is developing version 1.1 of the Namespaces Recommendation, where the formal designation will be an Internationalized Resource Identifier, or IRI. The differences between URIs and IRIs lie in how certain characters are escaped to make them benign.

Let's look at some real namespace syntax:

<mddl:custodian 
   xmlns:mddl="http://www.mddl.org/mddl/2001/1.0-final">Merrill Lynch</mddl:custodian>

This is not a mere custodian element; it is a custodian element in the vocabulary identified by the URI http://www.mddl.org/mddl/2001/1.0-final. The prefix mddl is used to associate the element name with that URI. The URI is in the mddl.org domain; mddl.org is the organization that maintains the Market Data Definition Language, an XML vocabulary in which custodian is one of many elements. (This vocabulary defines elements pertaining to investments and portfolio management.) Notice that mddl.org has made provisions to define other vocabularies and to issue later versions of the MDDL vocabulary by having several fields in their URI.

The local part of the name is the name within a particular vocabulary. For names that are not qualified by a namespace, the local part is the only part that exists. For a prefixed name, the local part is what comes after the colon. For example, elements named book:title and book:isbn are in the same namespace but have different local parts. Elements named book:title and person:title have the same local part but are entirely unrelated because they belong to different namespaces.
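To see the two parts of a name separately, here is a minimal Python sketch that parses the custodian example with xml.etree.ElementTree. That library reports a qualified name as "{uri}local", so a small helper (split_expanded_name, a name made up for this sketch) can pull the namespace URI and the local part apart.

```python
import xml.etree.ElementTree as ET

doc = ('<mddl:custodian xmlns:mddl="http://www.mddl.org/mddl/2001/1.0-final">'
       'Merrill Lynch</mddl:custodian>')

def split_expanded_name(tag):
    """Split ElementTree's "{uri}local" form into (uri, local part)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag  # no namespace: the local part is the whole name

element = ET.fromstring(doc)
uri, local = split_expanded_name(element.tag)
print(uri)    # the namespace URI that the mddl prefix stood for
print(local)  # the local part, without any prefix
```

Notice that the prefix mddl does not appear in the parsed result at all; only the URI and the local part survive.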

The prefix, used in qualified names

Prefixes simplify discussion of your work. You can discuss book:title and xsl:apply-templates and the like while you develop an XML-based system, and only occasionally approach the details of their respective namespaces. In some technical sense, the prefix doesn't matter because it's a transient abbreviation that associates names with a namespace URI.

However, it's a best practice to establish logical and consistent prefix names to boost developer productivity.

The prefix qualifies and associates names of elements and attributes, and also applies to keyword-type text strings in some situations. For example, book:title is equivalent to "title as a characteristic of a book" when read in an XML document, which is convenient when a person has to scan some XML. By referring to the place where the prefix book is tied to a URI, one can find a more formal specification that states, for example, "title as defined in the book vocabulary issued by abaa.org in 1999."

Several W3C recommendations use the term QName to refer to an XML name that may (or may not) be qualified to a namespace. If you read specifications regularly, you will even occasionally see "QName-but-not-NCName" to indicate an XML name that must be qualified to a namespace. (The term NCName refers to an XML name without a colon; NC means "no colon.") For example, named templates in XSLT can be named with QNames rather than with simple XML names, facilitating the publication of a library of templates that are all named in a particular namespace. A QName uses the colon (:) as a special character to separate the prefix from the local part. Naturally, the prefix and local part cannot contain a colon, but they otherwise follow the prescribed syntax for XML names.
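The way a QName is taken apart can be sketched in a few lines of Python. The function resolve_qname and the prefix_map argument are hypothetical names for this illustration, and the sketch checks only the colon rules, not the full syntax of XML names. An unprefixed name is returned with no namespace, which is how XSLT treats unprefixed template names.

```python
def resolve_qname(qname, prefix_map):
    """Resolve "prefix:local" or "local" to a (namespace URI, local part) pair.

    prefix_map is a hypothetical dict of the prefix bindings currently in scope.
    """
    if qname.count(":") > 1:
        raise ValueError("a QName may contain at most one colon: %r" % qname)
    prefix, colon, local = qname.rpartition(":")
    if not colon:
        return None, local  # an NCName: no prefix, so no namespace here
    if not prefix or not local:
        raise ValueError("empty prefix or local part: %r" % qname)
    if prefix not in prefix_map:
        raise KeyError("undeclared prefix: %r" % prefix)
    return prefix_map[prefix], local

in_scope = {"book": "http://example.com/ns/book"}  # assumed binding
print(resolve_qname("book:title", in_scope))
print(resolve_qname("title", in_scope))
```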

More than one prefix can be associated with a particular URI. XML standards will generally force resolution of prefixes to their associated URIs, so that names are the same if their local parts and URIs match, even if the prefixes differ. A prefix can only be associated with one URI at a time.
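A quick way to convince yourself of this is to parse two fragments that differ only in prefix. In this Python sketch (the example.com URI is a placeholder), xml.etree.ElementTree resolves both prefixes to the same URI, so the resulting names compare equal.

```python
import xml.etree.ElementTree as ET

first = ET.fromstring(
    '<book:title xmlns:book="http://example.com/ns/book">XML</book:title>')
second = ET.fromstring(
    '<bk:title xmlns:bk="http://example.com/ns/book">XML</bk:title>')

# Both tags resolve to "{http://example.com/ns/book}title".
same_name = first.tag == second.tag
print(same_name)
```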

The namespace URI

Every article about XML namespaces has to point out that the URI goes nowhere, meaning there is no need to fetch any material that the URI appears to identify. Indeed, there is no requirement to set up a server for the identified location or to have fetchable material at the location. The XML Namespaces Recommendation only requires string-matching to establish that two URIs are the same, though it does briefly mention that the namespace value is a URI reference and implies that this value should follow the syntax of RFC 2396 of the Internet Engineering Task Force.
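The string-matching rule has a practical consequence: two URIs that a Web server might treat as equivalent are still different namespaces. The following Python sketch, using placeholder example.com URIs, parses three elements whose namespace URIs differ only in letter case or a trailing slash. No network access takes place.

```python
import xml.etree.ElementTree as ET

lower = ET.fromstring('<p:item xmlns:p="http://example.com/ns"/>')
upper = ET.fromstring('<p:item xmlns:p="http://EXAMPLE.com/ns"/>')
slash = ET.fromstring('<p:item xmlns:p="http://example.com/ns/"/>')

# The URIs are compared as plain strings, so all three names are distinct.
distinct_names = {lower.tag, upper.tag, slash.tag}
print(len(distinct_names))
```

This is one more reason to copy a namespace URI exactly from its specification rather than retyping it.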

The URIs issued by the W3C always use the http: protocol and the w3.org domain name, so use of HTTP URLs can be considered the safe approach, and thus a best practice.

The domain name is the key to avoiding clashing names. By using the worldwide Domain Name System, the namespace URI provides an answer to the "Says who?" question. If you have a domain name, you have a piece of the world where you control the names, and this applies to your XML namespaces as well as your servers. For example, mddl.org is the domain name belonging to an organization that defines XML vocabularies pertaining to investments, and nobody else can assign names and URLs under the mddl.org domain.

In the future, the W3C may establish a guiding principle for the namespace URI to point to a fetchable resource. Various W3C committees are discussing alternatives. Most likely, the material identified by the URI will itself be an indirect pointer to an actual schema or description, allowing the syntax of the real description to evolve over time. For now, the URIs used for W3C namespaces point to simple text pages stating that the URI is a namespace. Try this as an example: http://www.w3.org/1999/XSL/Transform.

The W3C document that defines the namespace syntax and function uses the term namespace name to refer to the URI of the namespace. The XML Information Set Recommendation, which defines the meaningful parts of an XML document, also uses the term in the same way. However, the XPath functions name() and local-name() return the prefix when applied to a namespace node.

Therefore, it is a best practice to either avoid the term namespace name or only use it in a context where it's clear what you mean. XQuery uses the terms namespace prefix and namespace URI when discussing its syntax. The latter can safely be used to refer to the URI in a namespace declaration.


Namespaces already in use

Namespaces are designated for the various XML vocabularies, whether issued by the W3C itself like MathML or by an industry consortium like DSML, which came from an OASIS Technical Committee. (See Resources.) You can do the same within your organization.

Furthermore, other W3C recommendations in the XML family use namespaces to distinguish what they define. The XML Recommendation defines no element names, but describes two attributes that can be used in XML documents, xml:space and xml:lang. The XML Base Recommendation adds xml:base to the list. In each case, use of the xml prefix means that they are in a namespace that is defined by default for every XML document. In a recent Erratum, the W3C declared that you cannot use any prefix besides xml on the names built into XML.

The W3C uses a unique prefix for each vocabulary it defines. Each recommendation takes pains to point out that these prefixes are not functionally special, just used consistently. Returning to the MathML example, the MathML 2.0 Recommendation suggests the outer element <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">, where mml is their favored prefix. Again, the mml string is not special, other than for humans reading the documentation. The URI (http://www.w3.org/1998/Math/MathML) is the string that actually identifies the MathML vocabulary. (In that vocabulary, an element with the local part name math is the outer element.)

The XML Inclusions Recommendation presented a design dilemma: The inclusion construct couldn't be reduced to a single string value, as could xml:base and xml:lang. An inclusion declaration may need as many as three parts:

  • the href of the included resource,
  • its presumed encoding,
  • and its parsing method.

These could be joined as attributes on an element, but naming that element include or xml:include would impinge on the set of available element names in XML, causing messy exceptions for humans and machines alike. The solution was to define a namespace just for this one element. A typical include element looks like this:

<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" 
   href="http://example.com/std/defs" parse="xml" />

In this example, notice that the element carries the declaration of the xi prefix inside its own start tag. If one file has several XML inclusions, you may want to declare xi at the top, in which case each element is still named xi:include, but doesn't carry the namespace declaration inside its start tag:

<document xmlns:xi="http://www.w3.org/2001/XInclude" ...>
<!-- ...other content... -->
<xi:include href="http://example.com/system/common" parse="xml" />
<!-- ...other content... -->
<xi:include href="http://example.com/system/style" parse="xml" />
<!-- ...other content, and perhaps more include elements... -->
</document>

Having a namespace for the generalized XML include also keeps it separate from the application-specific includes in XSLT and XML Schema.
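If you want to experiment with inclusions, Python's standard library offers xml.etree.ElementInclude, which recognizes elements in the XInclude namespace. The sketch below is only an approximation of the example above: the RESOURCES table and the memory_loader function are invented stand-ins so that nothing is fetched over the network, and this module supports only a limited part of the XInclude specification.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory stand-ins for the two remote resources.
RESOURCES = {
    "http://example.com/system/common": "<common>shared definitions</common>",
    "http://example.com/system/style": "<style>shared styles</style>",
}

def memory_loader(href, parse, encoding=None):
    """Loader for ElementInclude that never touches the network."""
    text = RESOURCES[href]
    # parse="xml" must yield an element; parse="text" must yield a string.
    return ET.fromstring(text) if parse == "xml" else text

document = ET.fromstring("""
<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="http://example.com/system/common" parse="xml" />
  <xi:include href="http://example.com/system/style" parse="xml" />
</document>
""")

# Replace each xi:include element in place with the loaded content.
ElementInclude.include(document, loader=memory_loader)
child_tags = [child.tag for child in document]
print(child_tags)
```

The processor finds the include elements by their namespace URI, not by the xi prefix, so any other prefix bound to the same URI would work equally well.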

XSLT and XML Schema are two cases in which an elaborate recommendation requires a full document to describe the transformation or data design, respectively. These documents are known as XSLT stylesheets and schema definitions, respectively. Following one of the basic design principles of XML, these are XML documents that use specially-namespaced vocabularies. In fact, XML Schema defines one vocabulary for the schema definition document, and another vocabulary for schema items that occur in the instance documents or data defined by the schema. Schema definitions and XSLT stylesheets may intermix the elements and attributes of their language with elements and attributes from other namespaces, so prefixes are needed.

It is a best practice to use the prefixes that the W3C uses. For example, the XSLT Recommendation and most books about XSLT use the prefix xsl to identify the elements of the XSLT vocabulary. If you stick with the xsl prefix for your stylesheets, you can then discuss your deployment plans and consult XSLT books without the mental overhead of translating prefixes.


Establishing qualified names in your XML

XML documents have a tree structure, descending from the document element or outermost element. A namespace can be declared on any element, allowing it to be recognized within the sub-tree defined by that element and all its descendants. The declaration resembles an attribute, but most W3C recommendations consider it to be a separate type of node. When you look at the XML, you'll see the namespace declarations inside the start tags of the relevant elements, right alongside the attributes. There are two syntax variations:

  • xmlns:prefix="URI"
  • xmlns="URI"

The first one is commonly used; it associates a prefix with a URI. The second one declares that there is a default namespace for those elements that lack a prefix. Within the overall design of XML, both of these syntaxes fit under the reservation of names beginning with the characters xml for XML purposes. The default namespace is initialized to be no namespace-URI at all, so there is a syntax for undefining a previously-defined default namespace by assigning it to the null string. (Null strings are technically valid as URIs, but disallowed as namespace URIs.) Prefixes can be set to different URIs, but cannot be undefined, at least for XML 1.0 documents.

Variations of namespace declarations

Namespace URI       xmlns=            xmlns:prefix=
"http://URI"        Sets default      Associates prefix
"" (null string)    Unsets default    ILLEGAL! (may change)

In April of 2002, the XML Working Group of the W3C announced it was considering a revision of XML namespaces that would permit the assignment of a namespace prefix to the null string. The September edition of the proposal restricted this usage: such a declaration could only be used to undefine a prefix, for the purposes of avoiding conflicts and eliminating unwanted namespace nodes, and a qualified name could not use the prefix at any place in the document where it was assigned to null. For now, note that the exclude-result-prefixes feature of XSLT can be used to remove unwanted namespace nodes if they aren't in use, should you need to do so.
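The difference between unsetting the default namespace and trying to unset a prefix can be observed with a namespace-aware parser. This Python sketch uses xml.etree.ElementTree (which relies on the Expat parser and follows Namespaces in XML 1.0); the example.com URI and the element names are placeholders.

```python
import xml.etree.ElementTree as ET

# xmlns="" on <b> removes the default namespace for <b> and its descendants.
root = ET.fromstring(
    '<a xmlns="http://example.com/ns">'
    '<b xmlns=""><c/></b>'
    '</a>'
)
tags = [element.tag for element in root.iter()]
print(tags)

# Assigning a prefix to the null string is rejected under Namespaces in XML 1.0.
try:
    ET.fromstring('<a xmlns:p="http://example.com/ns"><b xmlns:p=""/></a>')
    prefix_undeclared = True
except ET.ParseError:
    prefix_undeclared = False
print(prefix_undeclared)
```

The first document parses, with only the outer element left in the namespace; the second is refused by the parser.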

A prefix can be associated with one URI at the top of a tree, but associated with a different URI within a sub-tree by having an xmlns:prefix="new-uri" declaration in the start-tag of the element atop the sub-tree, then associated with another URI (or the original URI) in a sub-sub-tree inside the sub-tree, and so on. Doing this can cause confusion for those who have to read the raw XML document.

This example is compact, but imagine how hard it would be to find all the xmlns declarations in a large document:

<data:document xmlns:data="http://example.com/namespace/fields">
  <!-- ...other content... -->
  <data:legacy xmlns:data="http://example.com/namespace/legacy-data">
    <!-- ...data of an older style... -->
    <data:item xmlns:data="http://example.com/namespace/fields">
      <!-- ...this one item within legacy uses the standard namespace... -->
    </data:item>
  </data:legacy>
  <!-- ...other content... -->
</data:document>
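A parser has no trouble with this, even if human readers do. The following Python sketch parses a trimmed-down copy of the example (comments and other content removed) with xml.etree.ElementTree and lists the resolved names, showing that the same data prefix yields two different namespaces depending on where it appears.

```python
import xml.etree.ElementTree as ET

doc = """
<data:document xmlns:data="http://example.com/namespace/fields">
  <data:legacy xmlns:data="http://example.com/namespace/legacy-data">
    <data:item xmlns:data="http://example.com/namespace/fields"/>
  </data:legacy>
</data:document>
"""

# Each tag is reported as "{uri}local", using the binding in effect at that element.
tags = [element.tag for element in ET.fromstring(doc).iter()]
for tag in tags:
    print(tag)
```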

You can apply the following preferred practices:

  • Use a given prefix for only one namespace throughout all XML documents in a system. If this is impractical, at least try to associate the prefix with only one URI within a single document.
  • Make all the necessary associations in the start tag of the document element, so that they apply throughout the whole document. This makes it easier to find all the declarations. The number of namespace declarations that can appear in a single start tag is unlimited.
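One way to check a document against the first of these practices is to collect every namespace declaration as the parser reports it. The sketch below is a possible approach in Python, not an established tool: find_rebound_prefixes is a made-up function name, and it relies on the "start-ns" event of xml.etree.ElementTree.iterparse, which reports each declaration as a (prefix, URI) pair, with an empty prefix standing for the default namespace.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

def find_rebound_prefixes(xml_bytes):
    """Return {prefix: set of URIs} for prefixes bound to more than one URI."""
    bindings = defaultdict(set)
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns",))
    for _event, (prefix, uri) in events:
        bindings[prefix].add(uri)
    return {p: uris for p, uris in bindings.items() if len(uris) > 1}

confusing = b"""
<data:document xmlns:data="http://example.com/namespace/fields">
  <data:legacy xmlns:data="http://example.com/namespace/legacy-data"/>
</data:document>
"""
tidy = b'<data:document xmlns:data="http://example.com/namespace/fields"/>'

print(find_rebound_prefixes(confusing))
print(find_rebound_prefixes(tidy))
```

An empty result means that every prefix in the document stands for exactly one URI.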

When a software tool generates XML, it has to place namespace nodes (xmlns declarations) within the tree so that they are in effect where needed to qualify names. If a namespace has an associated prefix, the namespace can be declared higher up than the element where it's needed. This can have the desirable effect of reducing redundant declarations. The Xalan XSLT processor is one example of a tool that does this.
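Python's xml.etree.ElementTree is another tool that hoists declarations in this way: when it serializes a tree, it gathers the namespaces in use and declares them once, on the outermost element written. The sketch below assumes a placeholder namespace URI and the prefix book; note that register_namespace changes a process-wide registry of preferred prefixes.

```python
import xml.etree.ElementTree as ET

BOOK_NS = "http://example.com/ns/book"  # hypothetical namespace URI

# Ask the serializer to use "book" rather than a generated prefix such as "ns0".
ET.register_namespace("book", BOOK_NS)

# Elements are created with "{uri}local" names; no declarations are written yet.
catalog = ET.Element("{%s}catalog" % BOOK_NS)
for text in ("First title", "Second title"):
    title = ET.SubElement(catalog, "{%s}title" % BOOK_NS)
    title.text = text

serialized = ET.tostring(catalog, encoding="unicode")
print(serialized)
```

The output carries a single xmlns:book declaration in the start tag of book:catalog, and both book:title elements rely on it.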

You must declare all prefixes before using them, except xml and xmlns, which can be assumed to be in effect and unchanging throughout all XML documents. You may be tempted to exploit the attribute-like syntax to have some of your declarations set up as default attributes in an external entity.

The best practice here is to have the declarations contained within the document, thereby reducing assumptions and dependencies.

Use of the default namespace (the one applicable to unprefixed element names) is a judgment call. If you can get accustomed to prefixing all element names everywhere, you avoid some pitfalls. However, some people may experience prefix fatigue or feel that one namespace applies to the real content of the document and that making it the default is a way to make that distinction. If you follow that latter path, you will need to establish some design principles for determining the namespace that can be the default in a given document. Of course, the rules will benefit only those people who actually have to read (and possibly create) XML documents.

The best practice regarding use of prefixes is to either use them everywhere or to use them on all items except those that are the real content being delivered to the end user. Use prefixes for all process control elements that are modified only by system developers, including XSLT stylesheets, schema definitions, and so forth. Use prefixes on all items coming from XML vocabularies that are external to your organization, with the possible exception of real content being delivered to the end user.

Attributes are a little different

An attribute can appear in a different namespace than the element that contains it. For example, <movie:title xml:lang="fr"> has an attribute that is not from the movie namespace. If an attribute name has a prefix, its name is in the namespace indicated by the prefix. However, if an attribute name has no prefix, it has no namespace. This is true even when the default namespace has been assigned. The W3C Namespaces in XML Recommendation makes that point with this example:

<x xmlns="http://www.w3.org" xmlns:n1="http://www.w3.org">
  <good a="1" n1:a="2" />
</x>

The elements are affected by the declaration of a URI for the default namespace. That is, both x and good are associated with the URI "http://www.w3.org" because it's the default namespace. The attribute n1:a is also associated with that namespace, due to its use of the n1 prefix, which is associated with the same URI. The a attribute appears to be specified twice, but there is no conflict: n1:a is in the http://www.w3.org namespace, while the unprefixed a is not in any namespace.
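You can confirm this reading of the W3C example with a namespace-aware parser. In this Python sketch, xml.etree.ElementTree reports the element good in the default namespace, while its two attributes come back under different names: one with no namespace and one qualified by the URI.

```python
import xml.etree.ElementTree as ET

doc = """
<x xmlns="http://www.w3.org" xmlns:n1="http://www.w3.org">
  <good a="1" n1:a="2" />
</x>
"""

root = ET.fromstring(doc)
good = root[0]

print(good.tag)     # the element picks up the default namespace
print(good.attrib)  # the unprefixed attribute does not
```

The attribute dictionary holds two distinct keys, "a" and "{http://www.w3.org}a", which is why the parser sees no duplicate.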

Since xml:lang is illustrated above, let's note that it is a best practice to use the xml:lang attribute as the way to declare that the content of the element is in a particular natural language.

When a W3C vocabulary specifies both elements and attributes, it typically will not require that the attributes be qualified to the namespace as long as they occur on elements that are qualified. Returning to the XML Include example, in <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="http://example.com/std/defs" parse="xml" />, the href and parse attributes are specified as meaningful attributes of the xi:include element, so an XML parser that is able to act upon the xi:include element must interpret those attributes as details of the include operation.

In the full universe of possible attribute names, all names beginning with the letters "xml" -- in that order, but in any upper-/lower-case combination -- are reserved to be defined by the W3C. That way, a namespace declaration like xmlns="http://foo.com" can use the syntax of an attribute rather than a distinct syntax.

Most W3C specifications call it a namespace declaration rather than an attribute, and it's a best practice to observe the difference in conversation. (The Namespaces Recommendation document itself refers to these declarations as reserved attributes long enough to introduce them. DOM Level 3 also treats these declarations as attributes from the xmlns namespace.)

A namespace declaration like xmlns:fooname="http://foo.com" has the same syntax as an attribute with a qualified name, but the initial letters "xml" signal its special role, and it too is a namespace declaration in conversation. However, an attribute like xml:space="preserve" is still an attribute in the proper terminology, but it is in the reserved namespace. If your XML documents get processed by an application that recognizes XML but is not namespace-aware, the QNames will probably survive and the namespace declarations will be treated as attributes.
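The last point can be demonstrated with Python's binding to the Expat parser, which can run with or without namespace processing. In this sketch, first_start_tag is a helper invented for the illustration; it parses the custodian example both ways and records what the parser reports for the start tag.

```python
from xml.parsers import expat

DOC = ('<mddl:custodian xmlns:mddl="http://www.mddl.org/mddl/2001/1.0-final">'
       'Merrill Lynch</mddl:custodian>')

def first_start_tag(namespace_aware):
    """Return (name, attributes) as reported for the first start tag."""
    seen = []
    # Passing namespace_separator switches on Expat's namespace processing.
    parser = (expat.ParserCreate(namespace_separator=" ")
              if namespace_aware else expat.ParserCreate())
    parser.StartElementHandler = lambda name, attrs: seen.append((name, dict(attrs)))
    parser.Parse(DOC, True)
    return seen[0]

plain = first_start_tag(namespace_aware=False)
aware = first_start_tag(namespace_aware=True)
print(plain)  # raw QName, with the declaration reported as an attribute
print(aware)  # URI and local part, with no xmlns attribute
```

Without namespace processing, the name mddl:custodian is passed through as an opaque string and xmlns:mddl shows up among the attributes; with it, the prefix is replaced by the URI and the declaration is consumed by the parser.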

The xmlns prefix has been specified by the first Namespaces in XML Recommendation to not have an associated URI. The W3C may opt to change this in the future. This may not make much of a difference in the real world, since most XML tools and processes manage namespace declarations automatically. Where they don't, you usually have a method to create or avoid creating a namespace node in the tree-like representation of the XML. When the XML resides in a file, the namespace declaration has the standard xmlns sequence in the start tag of an element, but the XML parser that reads the file will know to recognize xmlns whether or not it's associated with a namespace. (Since XML launched without namespaces, you could potentially encounter an early XML parser that is not namespace-aware; avoid such parsers if the best practices presented here are at all relevant to you.)

Validation and namespaces

The XML Schema Recommendation has complete provisions for defining a document structure with namespaced elements and attributes. Furthermore, it defines a special QName data type for strings that must be valid as qualified names. A schema definition document can specify the target namespace for the document structure.

The older document type definition (DTD) syntax for specifying document structure is not namespace-aware. However, DTDs tolerate element and attribute names that contain colons. If you want to use DTDs and namespaces together, you can do so by designating specific prefixes and treating them as fixed parts of the element and attribute names. The technique is explained in detail in C. M. Sperberg-McQueen's memo in The Cover Pages (see Resources). Expect substantial discomfort if you must do this. (DTDs allow the assignment of values to attributes not explicitly present in the XML document. Setting an attribute named xmlns through this DTD mechanism is a bad idea.)

Looking ahead

To this point, I have covered the foundation established by the W3C. Part 2 provides more depth on the best way to establish your own XML vocabularies. In Part 2, you'll also see renaming techniques that are namespace-aware.


Resources

