As with all programming languages, getting to know XSLT's built-in data types and structures is essential to mastering the language. Node sets are the most interesting creatures among XPath's data types (which form the basis of XSLT's data types). In this discussion, I'll show a couple of nonobvious ways that you can use node sets to simplify XSLT processing.
Counted loops in traditional XSLT
XSLT provides a primitive operation for iterating over all the items in a node set: xsl:for-each. If you have used XSLT seriously, you are also familiar with the standard approach for iterating a given number of times, rather than over a given node set. As an illustration, the following XSLT template takes a number and prints that many asterisks:
Listing 1. Template for printing a specified number of asterisks
<xsl:template name="print-asterisks">
  <xsl:param name="count"/>
  <!-- The termination condition (infinite recursion is no fun) -->
  <xsl:if test="$count">
    <!-- print the asterisk for this iteration -->
    <xsl:text>*</xsl:text>
    <!-- recursive call to print remaining asterisks;
         the spaces around the minus sign matter, because
         $count-1 would be read as a variable named "count-1" -->
    <xsl:call-template name="print-asterisks">
      <xsl:with-param name="count" select="$count - 1"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
If you are unfamiliar with this technique, find a good XSLT tutorial or book right away and learn how such recursive templates work. This is one of the most fundamental techniques in XSLT. Even though this tip offers an occasional way around it, you won't get very far in XSLT without being able to rattle off such code in your sleep.
The template takes a single parameter, count, which is the number of asterisks to print. When you first call the template, you pass in the total number of asterisks you want. The if test avoids infinite recursion in normal cases by doing nothing when the count falls to zero. Otherwise, the template prints a single asterisk and calls itself recursively (with one subtracted from the count) to print the remaining asterisks. I go easy on the error-checking in this script: if you were to pass in a negative or fractional value for count, the count would never reach exactly zero, and the result would be infinite recursion.
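As a minimal sketch of how such a template might be invoked, the following complete stylesheet calls print-asterisks with a count of 5. The match="/" entry point, the text output method, and the default value on the parameter are assumptions added for illustration. The guard is also written as a stricter "greater than zero" test rather than the plain truth test described above, so that negative or fractional counts stop the recursion.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Illustrative entry point: fires on the root of any source document -->
  <xsl:template match="/">
    <xsl:call-template name="print-asterisks">
      <xsl:with-param name="count" select="5"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="print-asterisks">
    <!-- Default of 0 (an assumption) means a call without a count prints nothing -->
    <xsl:param name="count" select="0"/>
    <!-- Stricter guard than a plain truth test: stops on zero,
         negative, fractional-below-one, and NaN counts -->
    <xsl:if test="$count &gt; 0">
      <xsl:text>*</xsl:text>
      <xsl:call-template name="print-asterisks">
        <xsl:with-param name="count" select="$count - 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Run against any well-formed source document, this should produce the five characters `*****`.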
Performance is the biggest problem with this approach. Recursion in its raw form can take up a lot of resources. Most XSLT processors recognize this as an example of tail recursion, which can be optimized into a regular iteration. This helps, but even iteration can be slow if it goes through the machinery of template dispatch each time. Perhaps by now some XSLT processors have even more sophisticated optimizers that eliminate even this overhead, but I wouldn't count on such advancement yet. In general, when each step in the recursion is a trivial operation (such as printing a single asterisk) the overhead can be a problem.
You could use xsl:for-each for such loops if you were able to contrive a node set of exactly the length you want. One way to do this is to take a node set that is at least as long as the length you want, and select the subset with the right length. The following XPath expression does this, if count is the desired number and nodeset is a node set you know to contain at least count nodes:
$nodeset[position() <= $count]
From the source node set, the predicate creates another node set with exactly count nodes. The main question is where to get nodeset from. Any means of obtaining a node set is fine for the job, as long as the resulting node set is large enough. You could then just grab a bunch of nodes from the source document -- or better yet, all of them -- using the XPath //node(). The problem is that you can't always rely on the length of the source document. The stylesheet itself is probably a better source, since you can vouch for its size when you write the transform, and you can even pad it with dummy nodes if necessary. The expression document("") gets the entire stylesheet as a secondary source document.
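The idea of padding the stylesheet with dummy nodes can be sketched as follows. The pad prefix, its urn:example:padding namespace, and the element names are hypothetical, chosen only for illustration. The sketch also assumes the processor can resolve document('') back to the stylesheet, which generally requires the stylesheet to have been loaded from a file or URI. Selecting only the padding elements, rather than every node, makes the size of the pool known exactly.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:pad="urn:example:padding"
    exclude-result-prefixes="pad">
  <xsl:output method="text"/>

  <!-- Hypothetical padding: eight dummy elements in a non-XSLT namespace.
       Top-level elements outside the XSLT namespace are ignored
       by the processor, so they serve purely as loop fodder. -->
  <pad:nodes>
    <pad:n/><pad:n/><pad:n/><pad:n/>
    <pad:n/><pad:n/><pad:n/><pad:n/>
  </pad:nodes>

  <!-- The pool holds exactly the eight padding elements -->
  <xsl:variable name="nodeset" select="document('')/*/pad:nodes/pad:n"/>
  <xsl:variable name="count" select="3"/>

  <xsl:template match="/">
    <!-- The predicate keeps the first $count nodes of the pool -->
    <xsl:for-each select="$nodeset[position() &lt;= $count]">
      <xsl:text>*</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the count set to 3, the loop body runs three times and the output is `***`. Inside the stylesheet the comparison operator has to be written as `&lt;=`, because a literal `<` is not allowed in an XML attribute value.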
Using these tricks, you can rewrite the template for printing asterisks to the following:
Listing 2. Using a tailored node set for looping
<!-- use all nodes in the current stylesheet as a source -->
<xsl:variable name="nodeset" select="document('')//node()"/>
<xsl:template name="print-asterisks">
  <xsl:param name="count"/>
  <xsl:if test="$count &gt; count($nodeset)">
    <!-- Basic safety measure: better to crash and burn
         than to fail in a non-obvious way -->
    <xsl:message terminate="yes">
      Not enough nodes for iteration
    </xsl:message>
  </xsl:if>
  <!-- Execute the loop, using the node set we want.
       The operator is written &lt;= because a literal less-than
       sign is not allowed in an XML attribute value. -->
  <xsl:for-each select="$nodeset[position() &lt;= $count]">
    <xsl:text>*</xsl:text>
  </xsl:for-each>
</xsl:template>
The node set of all nodes in the stylesheet is constructed once, at top-level, and can be reused for any such loop in the transform. The template first checks that there are sufficient nodes for the iteration, and aborts all processing if there aren't. While you may choose more elegant error handling than this, do not omit the check or you may request a certain number of iterations, and end up with fewer without any warning. Such errors can be very hard to spot.
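One possible form of gentler error handling, offered as a sketch rather than as a recommendation, is to loop over as many nodes as are available and recurse only for the remainder. The available variable, the match="/" entry point, and the count of 200 are assumptions made for this example. The template still aborts if the pool is empty, since recursing on an empty pool would never terminate.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Pool of loop nodes: every node in this stylesheet -->
  <xsl:variable name="nodeset" select="document('')//node()"/>

  <!-- Illustrative entry point; 200 is likely more nodes than this
       small stylesheet contains, so the fallback path is exercised -->
  <xsl:template match="/">
    <xsl:call-template name="print-asterisks">
      <xsl:with-param name="count" select="200"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="print-asterisks">
    <xsl:param name="count" select="0"/>
    <xsl:variable name="available" select="count($nodeset)"/>
    <!-- An empty pool would make the fallback recurse forever -->
    <xsl:if test="$available = 0">
      <xsl:message terminate="yes">No nodes available for iteration</xsl:message>
    </xsl:if>
    <!-- Loop over at most $count nodes from the pool -->
    <xsl:for-each select="$nodeset[position() &lt;= $count]">
      <xsl:text>*</xsl:text>
    </xsl:for-each>
    <!-- If the pool was too small, handle the shortfall recursively -->
    <xsl:if test="$count &gt; $available">
      <xsl:call-template name="print-asterisks">
        <xsl:with-param name="count" select="$count - $available"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Each recursive step consumes a full pool's worth of iterations, so the recursion depth stays small even for large counts, and the total number of asterisks should equal the requested count regardless of how many nodes the stylesheet happens to contain.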
Another potential disadvantage is that for some XSLT implementations, the document("")//node() operation can be expensive in terms of time and space. The stylesheet could be reparsed, and then plumbed for every node. This is a one-time penalty for the stylesheet execution. If you use the trick several times, you'll probably still get an appreciable speed boost. If you only need iterations of smaller length, you could use the variation document("")/*/node(), which restricts the mining of nodes to the top level, that is, the children of the stylesheet element. (Note that document("")/node() would select only the children of the document root, which usually means just the stylesheet element itself.) There are a handful of other tricks along these lines that you can use to suit your purposes. For instance, you can decrease the chances of running out of nodes by creating a node set from both the stylesheet and the source document: //node()|document("")//node().
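To get a feel for how large each of these pools is in a given setup, a small diagnostic stylesheet along the following lines may help. The variable names are hypothetical, and "top level" is taken here to mean the children of the stylesheet element, selected with document('')/*/node(). The actual counts depend on the processor, on whitespace in the stylesheet, and on the source document, so no particular output is claimed.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Every node in the stylesheet: largest pool, costliest to build -->
  <xsl:variable name="all-nodes" select="document('')//node()"/>

  <!-- Only the children of the stylesheet element: smaller and cheaper -->
  <xsl:variable name="top-nodes" select="document('')/*/node()"/>

  <!-- Stylesheet nodes plus source document nodes; for a top-level
       variable, // starts from the root of the source document -->
  <xsl:variable name="union-nodes"
                select="//node() | document('')//node()"/>

  <!-- Report the size of each pool, one per line -->
  <xsl:template match="/">
    <xsl:text>all stylesheet nodes: </xsl:text>
    <xsl:value-of select="count($all-nodes)"/>
    <xsl:text>&#10;top-level nodes: </xsl:text>
    <xsl:value-of select="count($top-nodes)"/>
    <xsl:text>&#10;stylesheet plus source: </xsl:text>
    <xsl:value-of select="count($union-nodes)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Whichever pool you pick, the largest count reported is the upper bound on loop length before the safety check has to step in.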
Someone lacking charity might call this technique a hack, but as long as you understand the standard iteration tricks for XSLT, you can use this short-cut when you really need it. It looks as if this trick will become redundant with XPath and XSLT 2.0, which have far more sophisticated looping primitives built in, but it could be years before these are finalized and compliant implementations emerge.
- Read Jeni Tennison's XSLT pages which are an excellent resource for handy XSLT techniques.
- Clear up many of the dark corners of XSLT, including the ways of node sets, with the XSLT FAQ.
- Want to grasp all the nuances of XSLT and XPath in fine detail? I recommend Mike Kay's XSLT Programmer's Reference.
- Get the latest news about XSLT at the W3C site.

Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is a computer engineer and writer born in Nigeria, living and working in Boulder, Colo. USA. You can contact Mr. Ogbuji at uche@ogbuji.net.