UNIX users are very familiar with the idea of pipes: mechanisms that direct the output of one program so that it becomes the input for another. Pipes are perhaps the first major example of modular, loosely coupled code. Each UNIX command is very simple and targeted; complex actions are produced by stringing them together. The processing of XML using XSLT has much to gain from this same sort of modularization.
You can improve code simplicity and reuse by breaking a transform into a set of separate phases, or passes. Unfortunately, in pure XSLT 1.0, the output of a transform cannot be processed with most of the instructions that handle its input. This restriction has been removed in XSLT 2.0, and even in XSLT 1.0 (which has many more years of life) you can work around it with an extension function that XSLT vendors usually provide.
To follow this tip, you should be familiar with XSLT.
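The restriction and its workaround can be seen in a minimal sketch. The element names items and item and the variable name pass1 are invented for illustration, and the sketch assumes a processor that supports the EXSLT common module.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exslt="http://exslt.org/common"
  version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- First pass: build a small tree as output. In XSLT 1.0 this
         variable holds a result tree fragment, not a node-set. -->
    <xsl:variable name="pass1">
      <items>
        <item>a</item>
        <item>b</item>
      </items>
    </xsl:variable>

    <!-- Not allowed in pure XSLT 1.0, because path steps cannot be
         applied to a result tree fragment:
           count($pass1/items/item)                                  -->

    <!-- Second pass: convert the fragment to a node-set, then query
         it like any input document. -->
    <xsl:value-of select="count(exslt:node-set($pass1)/items/item)"/>
  </xsl:template>

</xsl:transform>
```

Run against any well-formed input document, this should output the count of item elements built in the first pass.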
I have a little XSLT template for taking a document table and displaying only the first item in each row. It is designed to work with the sort of tables used in DocBook (which are based on a model of tables popular in SGML). A sample table is shown in Listing 1 (db-table.xml).
Listing 1. Simple table in DocBook form (db-table.xml)
<table frame="all">
<title>Numbers and tongues</title>
<tgroup cols="3" align="left" colsep="1" rowsep="1">
<thead>
<row>
<entry>1</entry>
<entry>2</entry>
<entry>3</entry>
</row>
</thead>
<tfoot>
<row>
<entry>I</entry>
<entry>II</entry>
<entry>III</entry>
</row>
</tfoot>
<tbody>
<row>
<entry>one</entry>
<entry>two</entry>
<entry>three</entry>
</row>
<row>
<entry>uno</entry>
<entry>dos</entry>
<entry>tres</entry>
</row>
<row>
<entry>otu</entry>
<entry>abuo</entry>
<entry>ato</entry>
</row>
</tbody>
</tgroup>
</table>
Listing 2 (db-onecol.xslt) is a transform that renders only the first column of the table.
Listing 2. XSLT transform for rendering the first column of a DocBook-style table (db-onecol.xslt)
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exslt="http://exslt.org/common"
version="1.0"
>
<xsl:output method="text"/>
<xsl:template match="table">
<xsl:value-of select="title"/><xsl:text>&#10;</xsl:text>
<xsl:for-each select="tgroup/thead/row">
<xsl:value-of select="entry[1]"/><xsl:text>&#10;</xsl:text>
</xsl:for-each>
<xsl:for-each select="tgroup/tbody/row">
<xsl:value-of select="entry[1]"/><xsl:text>&#10;</xsl:text>
</xsl:for-each>
<xsl:for-each select="tgroup/tfoot/row">
<xsl:value-of select="entry[1]"/><xsl:text>&#10;</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:transform>
This transform outputs simple text. The line feeds are placed in xsl:text so that they are not stripped from the style sheet as white space. The rest is simple: when a table element is encountered, the title is output, followed by the first entry in each row of the table head, body, and foot. I did not simplify to one xsl:for-each loop using tgroup/*/row, or the like, because the thead, tbody, and tfoot elements can come in any order in the document, and I wanted them processed in a specific order. The following session demonstrates how this transform is run:
$ 4xslt db-table.xml db-onecol.xslt
Numbers and tongues
1
one
uno
otu
I
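The fixed head, body, foot ordering can also be kept without three near-identical loops. The following is only a sketch of one alternative, not the listing used in this tip: it applies a single row template three times, in the desired order.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="table">
    <xsl:value-of select="title"/><xsl:text>&#10;</xsl:text>
    <!-- The order of these instructions, not the order of the elements
         in the source document, decides the order of the output. -->
    <xsl:apply-templates select="tgroup/thead/row"/>
    <xsl:apply-templates select="tgroup/tbody/row"/>
    <xsl:apply-templates select="tgroup/tfoot/row"/>
  </xsl:template>

  <!-- One rule for every row, wherever it appears in the table. -->
  <xsl:template match="row">
    <xsl:value-of select="entry[1]"/><xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:transform>
```

The per-row logic then lives in one place, so a change to how a row is rendered does not have to be repeated for the head, body, and foot.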
Now I have an XHTML-style table in Listing 3 (xhtml-table.xml) that I'd like to process in the same way.
Listing 3. An XHTML-style table (xhtml-table.xml)
<table border="1" frame="box">
<caption>Numbers and tongues</caption>
<thead>
<tr>
<th>1</th>
<th>2</th>
<th>3</th>
</tr>
</thead>
<tbody>
<tr>
<td>one</td>
<td>two</td>
<td>three</td>
</tr>
<tr>
<td>uno</td>
<td>dos</td>
<td>tres</td>
</tr>
<tr>
<td>otu</td>
<td>abuo</td>
<td>ato</td>
</tr>
</tbody>
<tfoot>
<tr>
<td>I</td>
<td>II</td>
<td>III</td>
</tr>
</tfoot>
</table>
Because this table has different element names and a slightly different organization, I cannot simply reuse the DocBook table template. I could copy this template over with some modifications to create a special version for XHTML elements, but this is a less modular approach. Another approach is to convert the XHTML to DocBook form and then pass that through the DocBook template; the advantage here is that I can also re-use other facilities for DocBook tables once the conversion has been made.
Listing 4 (xhtml-onecol.xslt) is a transform that uses the DocBook table module to operate on XHTML tables.
Listing 4. XSLT transform for rendering the first column of an XHTML-style table (xhtml-onecol.xslt)
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exslt="http://exslt.org/common"
version="1.0"
>
<xsl:import href="db-onecol.xslt"/>
<xsl:template match="/">
<xsl:apply-templates mode="xhtml"/>
</xsl:template>
<xsl:template match="table" mode="xhtml">
<xsl:variable name="db-table">
<xsl:call-template name="xhtml-table-to-db"/>
</xsl:variable>
<xsl:apply-templates
select="exslt:node-set($db-table)/table"/>
</xsl:template>
<xsl:template name="xhtml-table-to-db">
<xsl:copy>
<title><xsl:value-of select="caption"/></title>
<tgroup cols="{count(thead/tr/th)}">
<thead>
<row>
<xsl:for-each select="thead/tr/th">
<entry><xsl:apply-templates/></entry>
</xsl:for-each>
</row>
</thead>
<tfoot>
<row>
<xsl:for-each select="tfoot/tr/td">
<entry><xsl:apply-templates/></entry>
</xsl:for-each>
</row>
</tfoot>
<tbody>
<xsl:for-each select="tbody/tr">
<row>
<xsl:for-each select="td">
<entry><xsl:apply-templates/></entry>
</xsl:for-each>
</row>
</xsl:for-each>
</tbody>
</tgroup>
</xsl:copy>
</xsl:template>
</xsl:transform>
One important point: I have intentionally simplified these examples to focus on the main point. The style sheets use the pull style of XSLT (which means frequent use of xsl:for-each and xsl:value-of) rather than the push style (which uses a lot of templates and modes). I have done this because the pull style is more widely familiar, although the push style is superior in many ways. For example, in a real project I would write the template for converting XHTML tables to DocBook as a variation of the identity transform. Also, the templates would need much more logic to handle general cases of XHTML and DocBook tables.
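As a rough sketch of what such a push-style conversion might look like, the stylesheet below starts from the identity transform and overrides only the elements whose names differ between the two table models. The mode name to-db is invented for illustration, and the sketch makes simplifying assumptions: attributes on the XHTML table element are dropped, the column count is taken from the first header row, and spanning cells are ignored.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="xml"/>

  <!-- Entry point: send the whole document through the conversion mode. -->
  <xsl:template match="/">
    <xsl:apply-templates mode="to-db"/>
  </xsl:template>

  <!-- Identity rule: copy anything that has no more specific rule.
       thead, tbody, and tfoot keep their names, so they land here. -->
  <xsl:template match="@*|node()" mode="to-db">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="to-db"/>
    </xsl:copy>
  </xsl:template>

  <!-- Wrap the table sections in a tgroup, in the order DocBook expects. -->
  <xsl:template match="table" mode="to-db">
    <table>
      <xsl:apply-templates select="caption" mode="to-db"/>
      <tgroup cols="{count(thead/tr[1]/th)}">
        <xsl:apply-templates select="thead" mode="to-db"/>
        <xsl:apply-templates select="tfoot" mode="to-db"/>
        <xsl:apply-templates select="tbody" mode="to-db"/>
      </tgroup>
    </table>
  </xsl:template>

  <!-- Simple renames. -->
  <xsl:template match="caption" mode="to-db">
    <title><xsl:apply-templates mode="to-db"/></title>
  </xsl:template>

  <xsl:template match="tr" mode="to-db">
    <row><xsl:apply-templates mode="to-db"/></row>
  </xsl:template>

  <xsl:template match="th|td" mode="to-db">
    <entry><xsl:apply-templates mode="to-db"/></entry>
  </xsl:template>

</xsl:transform>
```

Because each rule handles one element, supporting another XHTML construct usually means adding one more small template rather than reworking nested loops.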
The crux of the multi-pass technique occurs in the line:
<xsl:apply-templates select="exslt:node-set($db-table)/table"/>
This is the hand-off from one phase to the next. In the first pass, the XHTML table is converted to DocBook form within the variable db-table. This creates a result tree fragment of output very similar to that in Listing 1. To treat this as input on the second pass, I have to convert this from a result tree fragment to a node-set, which is what the exslt:node-set function does. This extension function is supported by several processors, and even processors that do not support the EXSLT extensions almost invariably provide their own proprietary node-set extension which works the same way.
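If a style sheet has to run on more than one processor, the choice of node-set function can be made at run time with the standard function-available function. The sketch below is an assumption about how this might be arranged, not part of the listings in this tip: it tries exslt:node-set first and falls back to msxsl:node-set, the Microsoft extension in the urn:schemas-microsoft-com:xslt namespace. Other vendor functions could be added as further branches. The small table built in the variable is a stand-in for the converted DocBook table.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exslt="http://exslt.org/common"
  xmlns:msxsl="urn:schemas-microsoft-com:xslt"
  version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Stand-in for the first pass: a tiny DocBook-style table. -->
    <xsl:variable name="db-table">
      <table><title>Numbers and tongues</title></table>
    </xsl:variable>

    <!-- Hand-off: an unavailable extension function only causes an
         error if it is actually called, so guarding each call with
         function-available keeps the style sheet portable. -->
    <xsl:choose>
      <xsl:when test="function-available('exslt:node-set')">
        <xsl:apply-templates select="exslt:node-set($db-table)/table"/>
      </xsl:when>
      <xsl:when test="function-available('msxsl:node-set')">
        <xsl:apply-templates select="msxsl:node-set($db-table)/table"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">No node-set extension function is available.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Second pass: a simplified stand-in for the DocBook table template. -->
  <xsl:template match="table">
    <xsl:value-of select="title"/><xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:transform>
```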
I select the table element from this new node-set to kick off the second pass, in which the table template from the imported db-onecol.xslt module does its work. I use a mode (xhtml) on the template that selects the XHTML table so that it does not interfere with the DocBook template, which has the same match pattern but lower import precedence. Without the mode, the XHTML template would override the imported one and match the converted table again in the second pass.
The output of this transform is the same as that of the transform on a pure DocBook table. I was able to reuse the DocBook code just as I intended.
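For comparison, here is a minimal sketch of the same kind of hand-off under XSLT 2.0, where a variable with content holds a temporary tree that can be navigated directly. It assumes an XSLT 2.0 processor, and the table built in the variable is a stand-in for the converted DocBook table.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- First pass: in XSLT 2.0 this variable is a temporary tree. -->
    <xsl:variable name="db-table">
      <table><title>Numbers and tongues</title></table>
    </xsl:variable>

    <!-- Second pass: no extension function is needed. -->
    <xsl:apply-templates select="$db-table/table"/>
  </xsl:template>

  <xsl:template match="table">
    <xsl:value-of select="title"/><xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:transform>
```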
This example is an extreme simplification of a situation I encountered in a real project. I needed to reuse many DocBook processing templates on an XHTML source. By transforming XHTML content to DocBook in the first pass, and then reusing standard DocBook templates in subsequent passes, I saved a huge amount of work and debugging. The idea of multi-pass XSLT is even more general than this. In addition to promoting code reuse, it can also break complex transforms into chunks that are easy to understand. The next time you are faced with a complex problem in XSLT, determine whether it could be simplified or modularized as a series of piped operations.
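A series of piped operations can be sketched as a chain of variables, each holding the output of one pass and feeding the next. Everything in the following example is invented for illustration: the mode names pass1, pass2, and pass3, the items and item elements, and the assumption that the input is a document element with a flat list of child elements. It also assumes a processor that supports exslt:node-set.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exslt="http://exslt.org/common"
  version="1.0">

  <xsl:output method="text"/>

  <!-- The pipeline: each variable captures the output of one pass. -->
  <xsl:template match="/">
    <xsl:variable name="pass1">
      <xsl:apply-templates select="*" mode="pass1"/>
    </xsl:variable>
    <xsl:variable name="pass2">
      <xsl:apply-templates select="exslt:node-set($pass1)/*" mode="pass2"/>
    </xsl:variable>
    <xsl:apply-templates select="exslt:node-set($pass2)/*" mode="pass3"/>
  </xsl:template>

  <!-- Pass 1: normalize the children of the document element into
       a uniform list of item elements. -->
  <xsl:template match="/*" mode="pass1">
    <items>
      <xsl:for-each select="*">
        <item><xsl:value-of select="normalize-space(.)"/></item>
      </xsl:for-each>
    </items>
  </xsl:template>

  <!-- Pass 2: sort the normalized items. -->
  <xsl:template match="items" mode="pass2">
    <items>
      <xsl:for-each select="item">
        <xsl:sort select="."/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </items>
  </xsl:template>

  <!-- Pass 3: render one item per line. -->
  <xsl:template match="items" mode="pass3">
    <xsl:for-each select="item">
      <xsl:value-of select="."/><xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:transform>
```

Giving each pass its own mode keeps the stages from matching one another's elements, much as the xhtml mode does in Listing 4.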
- Take a look at the W3C's XSL page, which has many useful links to XSLT-related resources, including the specifications themselves, tutorials, articles, and implementations.
- Be sure to bookmark XSL Frequently Asked Questions, which features many notes that are handy for debugging.
- For a good introduction to XSLT, read "Investigating XSLT: The XML transformation language," by LindaMay Patterson (developerWorks, August 2001).
- See Uche Ogbuji's other XSLT-related tips: "Counting with node sets" (developerWorks, May 2002), "Generating internal HTML links with XSLT" (developerWorks, February 2001), and "XSLT lookup tables" (developerWorks, February 2001).
- The style sheet processor used in the examples is 4XSLT, part of 4Suite, which is co-developed by Uche Ogbuji.
- Look to EXSLT for useful and widely supported extension functions for XSLT. The common module has the node-set function, and others.
- All XSLT users should be familiar with the identity transform. Read Jason Diamond's article, "Template Languages in XSLT," on customizing the identity transform.
- Learn about DocBook at DocBook.org and from "A gentle guide to DocBook" by Joe Brockmeier (developerWorks, September 2000).
- Learn about XHTML from articles such as "XHTML: The power of two languages" by Sathyan Munirathinam (developerWorks, July 2002), "The Web's future: XHTML 2.0" by Nicholas Chase (developerWorks, September 2002), and "Introduction to XHTML, with eXamples" by Alan Richmond.
- Find more XML resources on the developerWorks XML zone. For a complete list of XML tips to date, check out the tips summary page.

Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can contact Mr. Ogbuji at uche@ogbuji.net.