With all the hundreds of XML processing tools out there, Web browsers are still where the action is—luckily for XML developers, the action never seems to slow down. Over the past few years I've written a series of articles (see Resources) about XML-related features in the developer favorite, the Firefox browser; I've covered Firefox 1.5 through 2.0. Recently Firefox moved up to version 3.0 with numerous overall improvements and a lot of great new developments for XML processing. Many of the improvements come from the upgrade of the core Web processing engine, Gecko, from version 1.8.1 to 1.9.
XML fundamentals in version 3.0
The XML space includes a huge stack of technologies, but it still all begins with the parser, and Firefox 3 introduces one huge improvement to basic XML parsing. In the past on Mozilla browsers, parsing an XML document was synchronous, blocking all operations on the document until it was fully loaded. Contrast this with HTML parsing, which has always been asynchronous, so that parts of the document become available as they're parsed. For users, this meant they started to see how a Web page was shaping up before the browser had completely processed it; with XML documents, on the other hand, they saw nothing at all until the document was completely parsed. This usability problem was an unfortunate deterrent to processing large XML documents in the browser.
In Firefox 3.0, construction of the XML content model is incremental, much as it is for HTML. This will make a big difference for practical use of XML on the Web. There are some exceptions, most notably that XSLT is not processed incrementally. In theory, you might apply a subset of XSLT incrementally, using a restricted subset of XPath, but doing so would be a significant effort in itself and lies beyond the scope of Firefox 3.0.
One improvement I had hoped for in Firefox 3.0 is xml:id support. There was some controversy as to whether to support it, but a patch is available, and there is a good chance that it will land in a future release. As a general note, the only way to make getElementById work on XML documents from Firefox JavaScript is to declare ID attributes in the internal DTD subset (the external subset is not read, and xml:id is not recognized). If you really need xml:id, use XPath from JavaScript to query for attributes with the XML namespace and the "id" local name.
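The XPath workaround just described might look something like the following minimal sketch. The helper names xmlIdExpression and getElementByXmlId are hypothetical, introduced only for this illustration; document.evaluate is the standard DOM Level 3 XPath entry point. The expression tests local-name() and namespace-uri() directly, so no prefix resolver is needed.

```javascript
// The XML namespace URI is fixed by the Namespaces in XML specification.
const XML_NS = "http://www.w3.org/XML/1998/namespace";

// Hypothetical helper: build an XPath 1.0 expression that matches any element
// carrying an attribute with local name "id" in the XML namespace and the
// requested value.
function xmlIdExpression(id) {
  // XPath 1.0 has no escape syntax inside string literals, so choose a quote
  // character that the value does not contain.
  if (id.includes("'") && id.includes('"')) {
    throw new Error("id contains both quote characters");
  }
  const quote = id.includes("'") ? '"' : "'";
  return "//*[@*[local-name()='id' and namespace-uri()='" + XML_NS + "'] = " +
    quote + id + quote + "]";
}

// Hypothetical helper: return the first matching element, or null if none.
// doc is expected to be a DOM Document that supports evaluate().
function getElementByXmlId(doc, id) {
  const FIRST_ORDERED_NODE_TYPE = 9; // value of XPathResult.FIRST_ORDERED_NODE_TYPE
  const result = doc.evaluate(xmlIdExpression(id), doc, null,
                              FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}
```

In a page script you would call it as getElementByXmlId(document, "intro"), where "intro" stands for whatever xml:id value you are looking for. Unlike getElementById, this walks the whole tree on every call, so it is best kept for occasional lookups.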
Another hoped-for core improvement that didn't make it is the ability for the user to request that the browser load the external DTD subset. Again, it looks as if a patch is ready, but there just weren't enough developer resources available to work it through the QA process in time for the Firefox 3.0 release.
The biggest win for those looking to use XSLT in Firefox is support for EXSLT, a set of XSLT extensions developed and sanctioned by the XSLT community and supported in many other XSLT processors. Firefox 3.0 adds support for a large subset of EXSLT, starting with the node-set function, an important workaround for XSLT 1.0's most severe limitation. EXSLT is organized into modules, each of which defines several extension functions and elements. Firefox 3.0 implements a selection of extensions within a selection of modules, as listed:
- Common: Firefox 3.0 implements the basic set of general-purpose functions:
  - exsl:node-set allows you to turn result tree fragments into node-sets so that you can apply XPath to them.
  - exsl:object-type is an introspection tool that reports the type of an object, such as string, node-set, number, or boolean.
- Sets: Firefox 3.0 implements some useful extensions for working with node-sets:
  - set:difference computes the difference between two sets, returning a node-set whose nodes are in the first argument but not in the second.
  - set:distinct examines a node-set for nodes with the same string value and removes all but one instance of each.
  - set:intersection computes the intersection of two sets, returning a node-set whose nodes are in both.
  - set:has-same-node determines whether two node-sets have any nodes in common (that is, whether they share the very same node and not just different nodes with the same string value, as with the XPath = operator).
  - set:leading returns the nodes in one node-set that come before the first node in the other node-set, in document order.
  - set:trailing returns the nodes in one node-set that come after the first node in the other node-set, in document order.
- Strings: Firefox 3.0 implements some useful extensions for working with strings:
  - str:concat returns a string concatenating the string values of each node in a set (compare to the built-in concat function, which concatenates a fixed sequence of expressions).
  - str:split uses a pattern to split a string into a sequence of substrings (represented in a node-set constructed at run time).
  - str:tokenize uses a set of single-character delimiters to split a string into a sequence of substrings (represented in a node-set constructed at run time).
- Math: Firefox 3.0 implements some functions that make it easier to grab the smallest and largest numerical quantities from the content of node-sets:
  - math:max returns the highest numerical value of content within a given node-set.
  - math:min returns the lowest numerical value of content within a given node-set.
  - math:highest returns the nodes whose content has the highest numerical value.
  - math:lowest returns the nodes whose content has the lowest numerical value.
- Regular expressions: Firefox 3.0 brings the power of regular expressions to XSLT:
  - regexp:match matches a regular expression pattern against a string and returns the matching substrings as a node-set constructed at run time.
  - regexp:test checks whether a string matches a regular expression pattern.
  - regexp:replace replaces substrings that match a regular expression pattern.
To get you started using EXSLT in your own transforms, I've constructed an example that exercises a good number of the functions implemented in Firefox 3.0. One of the best uses I've found for XSLT in the browser is to deliver reports against semi-structured data. You point the user to the XML file which contains a processing instruction to apply an XSLT transform. In such situations you can often dictate the required browser version so you don't have to worry as much about cross-browser compatibility. In addition you offload a lot of work from the server to each user's computer. Listing 1 (employees.xml) is an employee information file against which I'll present a report in Firefox 3.0.
Listing 1. Employee information file employees.xml
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xml" href="employees.xsl"?>
<employees>
  <department id="res">
    <title>Research</title>
    <info>http://example.com/ar-and-dee for more info</info>
    <employee id="111">
      <title>Coordinator</title>
      <name>
        <given>Rene</given>
        <family>Descartes</family>
      </name>
      <location building="PAR1">France</location>
    </employee>
    <employee id="112">
      <title>Project Manager</title>
      <name>
        <given>Abu Ja'far</given>
        <family>Al Kwarizmi</family>
      </name>
      <location building="BAG2">Iraq</location>
    </employee>
  </department>
  <department id="exec">
    <title>Executive</title>
    <info>Home of the head honchos</info>
    <employee id="101">
      <title>Chief Executive Officer</title>
      <name>
        <given>Genghis</given>
        <family>Khan</family>
        <honorific>The Great</honorific>
      </name>
      <location building="MON1">China</location>
    </employee>
  </department>
  <department id="hr">
    <title>Human Resources</title>
    <info>We're happy to serve you at http://example.com/hr</info>
    <employee id="102">
      <title>Manager of Wellness</title>
      <name>
        <given>Ching-Yuen</given>
        <family>Li</family>
      </name>
      <location building="SZE1">China</location>
    </employee>
  </department>
</employees>
Notice the xml-stylesheet processing instruction at the top which instructs the browser to use XSLT. Listing 2 (employees.xsl) is a transform to generate a report from Listing 1.
Listing 2. Transform to generate a report from the employee information file (employees.xsl)
<?xml version="1.0" encoding="utf-8"?>
<!-- A -->
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:math="http://exslt.org/math"
    xmlns:regex="http://exslt.org/regular-expressions"
    xmlns:set="http://exslt.org/sets"
    xmlns:str="http://exslt.org/strings"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="set math regex str">
  <!-- Notice the namespace declarations for EXSLT.
       Notice also exclude-result-prefixes, since you don't want those
       namespace declarations in the result XHTML
  -->
  <!-- Use XML mode to approximate XHTML output
       (notice the doc type declaration info) -->
  <xsl:output method="xml" encoding="utf-8"
      doctype-public="-//W3C//DTD XHTML 1.1//EN"
      doctype-system="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"/>
  <xsl:template match="employees">
    <!-- Put the presentation style into a separate file,
         specified using a processing instruction in the output -->
    <xsl:processing-instruction name="xml-stylesheet">
      <xsl:text>type="text/css" href="employees.css"</xsl:text>
    </xsl:processing-instruction>
    <html xml:lang="en">
      <head>
        <title>Employee report</title>
      </head>
      <body>
        <h1>Employee report</h1>
        <table>
          <xsl:apply-templates/>
        </table>
        <hr/>
        <xsl:call-template name="stats"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template name="stats">
    <xsl:variable name="execs"
        select="department[title='Executive']/employee"/>
    <xsl:variable name="employees-in-china"
        select="department/employee[location='China']"/>
    <!-- Use set:has-same-node to check whether the results of the two
         separate XPath queries have any nodes in common -->
    <xsl:if test="set:has-same-node($execs, $employees-in-china)">
      <p>Note: At least one executive presently works in China</p>
    </xsl:if>
    <dl>
      <dt>Countries where employees presently work</dt>
      <dd>
        <!-- Use set:distinct to eliminate duplicate country names
             from the query result -->
        <xsl:for-each select="set:distinct(department/employee/location)">
          <xsl:value-of select="."/>
          <xsl:if test="not(position()=last())">
            <xsl:text>, </xsl:text>
          </xsl:if>
        </xsl:for-each>
      </dd>
      <dt>Newest employee</dt>
      <!-- Use math:highest to select the employee ID with the highest numerical value -->
      <dd><xsl:value-of select="math:highest(department/employee/@id)"/></dd>
    </dl>
  </xsl:template>
  <xsl:template match="department">
    <tr>
      <td colspan="4">
        <!-- Use regular expressions to sniff out URLs from unstructured content -->
        <!-- [1] ensures that if multiple URLs are detected, only the first is used -->
        <a href="{regex:match(info, 'http://[a-zA-Z0-9-_/\.]*', '')[1]}">
          <xsl:value-of select="title"/>
        </a>
      </td>
    </tr>
    <xsl:apply-templates select="employee"/>
  </xsl:template>
  <xsl:template match="employee">
    <tr>
      <td>
        <xsl:value-of select="name/given"/>
        <xsl:text> </xsl:text>
        <xsl:value-of select="name/family"/>
      </td>
      <td>
        <!-- Use str:concat to construct a composite ID from the id
             attributes of the element and its ancestors -->
        <!-- If, for example, you added an id attribute to the root element,
             its value would be prepended for each employee -->
        <xsl:value-of select="str:concat(ancestor-or-self::*/@id)"/>
      </td>
      <td>
        <xsl:value-of select="title"/>
      </td>
      <td>
        <!-- Standard concat assembles a string from a fixed sequence of expressions -->
        <xsl:value-of select="concat(location, '(', location/@building, ')')"/>
      </td>
    </tr>
  </xsl:template>
</xsl:transform>
I've heavily commented the code, highlighting where I use EXSLT, as well as other useful notes. The generated output references a CSS stylesheet, mostly as a demonstration of this pattern. Listing 3 (employees.css) is the CSS.
Listing 3. Presentation stylesheet for the report generated from the employee information file (employees.css)
body { background-color: lightblue; }
td { padding-left: 1em; }
Load Listing 1 into Firefox 3.0 to get the display in Figure 1.
Figure 1. Firefox 3.0 display of the report generated using Listings 1-3
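To make the expected contents of the report concrete, here is a rough sketch that mirrors, in plain JavaScript, what set:distinct, math:highest, set:has-same-node, and str:concat compute for the data in Listing 1. It is an illustration of the semantics only, not how Firefox implements EXSLT; the helper names and the flattened employees array are assumptions introduced for this example.

```javascript
// Data copied from Listing 1 (only the fields the report logic needs).
const employees = [
  { dept: "res",  id: "111", location: "France" },
  { dept: "res",  id: "112", location: "Iraq" },
  { dept: "exec", id: "101", location: "China" },
  { dept: "hr",   id: "102", location: "China" },
];

// Like set:distinct: keep one item per distinct string value, in original order.
function distinct(values) {
  return [...new Set(values)];
}

// Like math:highest followed by xsl:value-of: the value that is numerically largest.
function highest(values) {
  return values.reduce((best, v) => (Number(v) > Number(best) ? v : best));
}

// Like set:has-same-node: true only when both lists contain the very same
// object, not merely objects with equal content.
function hasSameNode(a, b) {
  return a.some((item) => b.includes(item));
}

const execs = employees.filter((e) => e.dept === "exec");
const inChina = employees.filter((e) => e.location === "China");

const report = {
  countries: distinct(employees.map((e) => e.location)).join(", "),
  newest: highest(employees.map((e) => e.id)),
  execInChina: hasSameNode(execs, inChina),
  // Like str:concat(ancestor-or-self::*/@id): department id, then employee id.
  compositeIds: employees.map((e) => e.dept + e.id),
};

console.log(report.countries);    // France, Iraq, China
console.log(report.newest);       // 112
console.log(report.execInChina);  // true
console.log(report.compositeIds.join(" ")); // res111 res112 exec101 hr102
```

These values correspond to the list of countries, the "Newest employee" entry, the note about an executive working in China, and the composite ID column that the transform in Listing 2 produces.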
In addition to the core-parsing improvements and EXSLT, Firefox 3.0 includes some compliance fixes for working with XML documents that use namespaces. The DOMAttrModified event now properly handles attributes in a namespace, and the JavaScript DOM method getElementsByTagName() now works correctly on sub-trees that have elements with namespace prefixes in their tag names. There are also many CSS and JavaScript fixes that will make life easier for XML developers.
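When prefixes are in play, one way to avoid depending on tag-name matching at all is the namespace-aware lookup from DOM Level 2, getElementsByTagNameNS. The sketch below is only an illustration of that alternative; the helper name elementsInNamespace is hypothetical.

```javascript
// Hypothetical helper: collect elements by namespace URI and local name,
// independent of whatever prefix the document happens to use.
function elementsInNamespace(root, namespaceURI, localName) {
  // getElementsByTagNameNS is the standard namespace-aware DOM lookup;
  // "*" as the local name matches every element in the namespace.
  const found = root.getElementsByTagNameNS(namespaceURI, localName || "*");
  return Array.from(found);
}

// Usage in a browser, for example against the XHTML report from Listing 2:
//   const XHTML_NS = "http://www.w3.org/1999/xhtml";
//   const cells = elementsInNamespace(document, XHTML_NS, "td");
```

The root argument can be a document or any element, so the same helper works on sub-trees.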
For users of Scalable Vector Graphics (SVG), everyone's favorite XML showpiece, Firefox 3.0 offers even more goodies. It now supports patterns and masks, which give you more options for rich effects, and all SVG 1.1 filters are supported. You can now apply SVG transforms to any Web browser object, so that, for example, you might decide to rotate an IFRAME by 45 degrees, a trick that would usually require the Canvas facility. The Mozilla team has also filled out SVG DOM support and, along the way, squashed a lot of bugs.
Some will comment that XML has not had the expected success on the Web, but there is certainly a lot you can already accomplish with XML in browsers and thanks to continuing development in Web browsers, more and more becomes possible each year. Firefox 3.0 is an important milestone with its core performance improvements for XML processing, as well as the enhancements to XSLT, DOM, and SVG. You can't go wrong trying out these new capabilities. Even if you can't put all the bits to use right away because of cross-browser requirements, you'll be prepared for the future as the state of the art continues to advance in Web applications.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample XML, XSL, CSS files | x-think41-samples.zip | 3KB | HTTP |
Learn
- Firefox 3 for developers: Review the updated developer features for Firefox 3.0 and learn about the major improvements.
- For more details on the key fixes see the relevant bug entries:
  - 18333 – XML Content Sink should be incremental
  - 365801 – expose EXSLT functions to DOM Level 3 XPath API (documentInstance.evaluate)
  - 362391 – DOMAttrModified doesn't handle namespaced attributes properly
  - 206053 – document.getElementsByTagName('tagname') with XML document wrongly includes elements with namespace prefix in the tag name
  - 22942 (entities) – Load external DTDs (entity/entities) (local and remote) if a pref is set
  - 275196 – xml:id support
- xml:id standards page (developerWorks, April 2007): Find out more about xml:id and how to give unique identifiers to elements in XML documents.
- Extensible Stylesheet Language Transformations (XSLT) standards page (developerWorks, April 2007): Find out more about XSLT and EXSLT and how to transform XML documents to different forms.
- xml:id support: Keep track of progress, and if need be, use XPath through JavaScript to query for attributes in the XML namespace and "id" local name.
- SVG improvements in Firefox 3: Check out the improved support for SVG in this convenient list of newly added features.
- Check out earlier articles on Firefox and XML (developerWorks, Uche Ogbuji):
  - XML in Firefox 1.5, Part 1: Overview of XML features (March 2006)
  - XML in Firefox 1.5, Part 2: Basic XML processing (March 2006)
  - XML in Firefox 1.5, Part 3: JavaScript meets XML in Firefox (August 2006)
  - Firefox 2.0 and XML (October 2007)
- Multi-pass XSLT (developerWorks, Uche Ogbuji, September 2002) and Counting words in XML documents (developerWorks, Uche Ogbuji, September 2005): Discover some XSLT techniques that become available with the addition of an EXSLT subset in Mozilla. For more on EXSLT overall, see "EXSLT by example" (developerWorks, Uche Ogbuji, February 2003).
- IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
- XML technical library: See the developerWorks XML library for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
- The developerWorks Web Architecture zone: Expand your Web development skills with articles and tutorials that specialize in Web technologies.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- The technology bookstore: Browse for books on these and other technical topics.
- developerWorks podcasts: Listen to interesting interviews and discussions for software developers.
Get products and technologies
- Firefox: Get the Mozilla-based Web browser that offers standards compliance, performance, security, and solid XML features.
- IBM trial software for product evaluation: Build your next project with trial software available for download directly from developerWorks, including application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- Participate in the discussion forum.
- XML zone discussion forums: Participate in any of several XML-related discussions.
- developerWorks XML zone: Share your thoughts: After you read this article, post your comments and thoughts in this forum. The XML zone editors moderate the forum and welcome your input.
- developerWorks blogs: Check out these blogs and get involved in the developerWorks community.

Uche Ogbuji is Partner at Zepheira, LLC, a solutions firm specializing in the next generation of Web technologies. Mr. Ogbuji is lead developer of 4Suite, an open source platform for XML, RDF, and knowledge-management applications and its successor Amara 2; the Jacqard agile methodology for team Web development; and the Versa RDF query language. He is a Computer Engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can find more about Mr. Ogbuji at his Weblog Copia.




