As you may remember, my previous article demonstrated the concept of using a stylesheet to compile new features into another stylesheet. Specifically, I showed you how to write a simple execution tracing tool, which automatically modifies a stylesheet so it will generate comments in the output document as it runs, showing which parts of the latter were produced by each template.
However, I ended the article by pointing out that the basic version I'd developed was quite limited, and suggesting a number of ways in which it could be improved. In this installment I'll add some of those missing features, and turn this proof-of-concept into a much more useful tool.
Note: I'm going to assume you're already familiar with the code we developed last time, and will focus on the changes we're making to it. If you haven't read Part 1, you really should glance through that before proceeding.
One of the limitations in my original solution was that it reported only the match pattern when each template started and ended execution. In fact, not every template execution is the result of a match. Some templates have a name attribute instead (or in addition), and are (at least sometimes) invoked through that name, using <xsl:call-template>. And sometimes there are several templates with overlapping match patterns, and which one runs depends on the mode the stylesheet is currently running in.
So I'd like to modify the stylesheet we wrote last time, tracexsl.xsl, to include these details in its reports.
First off, I need to have the template tell me more about itself, which means making the comment-generator a bit fancier. I'll check whether each of the attributes I'm interested in -- match, name, and mode -- is present, and display it if so. If one of them is missing, I'll suppress printing its name as well, so I can tell the difference between it being absent and being set to an empty string. Here's the sequence I'll want to execute to insert a start-of-template comment generator into the template being styled:
Listing 1. Inserting a start-of-template comment generator
<tracexsl:text xml:space="preserve"> </tracexsl:text>
<tracexsl:comment>
  <xsl:text>[TraceXSL Begin]</xsl:text>
  <xsl:if test="@match">
    <xsl:text> match="</xsl:text>
    <xsl:value-of select="@match"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
  <xsl:if test="@name">
    <xsl:text> name="</xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
  <xsl:if test="@mode">
    <xsl:text> mode="</xsl:text>
    <xsl:value-of select="@mode"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
</tracexsl:comment>
<tracexsl:text xml:space="preserve"> </tracexsl:text>
(The tracexsl: prefix is bound to a dummy namespace, and will be rebound to the official XSLT namespace when these literal result elements are copied into the generated stylesheet, thanks to an <xsl:namespace-alias> directive in my stylesheet-for-stylesheets. If I had used the xsl: prefix directly, these elements would be executed immediately rather than being copied.)
Obviously, the end-template comment can, and probably should, be modified similarly.
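As a minimal sketch of how the start and end comments might share one piece of reporting logic, the stylesheet below factors the attribute checks into a named template. The describe-template helper, the [TraceXSL End] marker text, the placeholder namespace URI, and the way <xsl:param> children are kept first are all assumptions made for illustration; the template-for-templates from Part 1 may well be organized differently.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tracexsl="http://example.org/tracexsl-alias">
  <!-- The tracexsl namespace URI above is a placeholder. -->

  <!-- Literal result elements written with the tracexsl: prefix are
       emitted in the XSLT namespace in the generated stylesheet. -->
  <xsl:namespace-alias stylesheet-prefix="tracexsl" result-prefix="xsl"/>

  <!-- Identity rule: copy anything that has no more specific template. -->
  <xsl:template match="@*|node()" priority="-1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical helper: writes the match, name, and mode of the
       xsl:template being styled, skipping any that are absent. -->
  <xsl:template name="describe-template">
    <xsl:if test="@match">
      <xsl:text> match="</xsl:text>
      <xsl:value-of select="@match"/>
      <xsl:text>"</xsl:text>
    </xsl:if>
    <xsl:if test="@name">
      <xsl:text> name="</xsl:text>
      <xsl:value-of select="@name"/>
      <xsl:text>"</xsl:text>
    </xsl:if>
    <xsl:if test="@mode">
      <xsl:text> mode="</xsl:text>
      <xsl:value-of select="@mode"/>
      <xsl:text>"</xsl:text>
    </xsl:if>
  </xsl:template>

  <!-- Wrap each template body in a begin comment and an end comment. -->
  <xsl:template match="xsl:template">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Parameters must remain the first children of a template. -->
      <xsl:apply-templates select="xsl:param"/>
      <tracexsl:comment>
        <xsl:text>[TraceXSL Begin]</xsl:text>
        <xsl:call-template name="describe-template"/>
      </tracexsl:comment>
      <xsl:apply-templates select="node()[not(self::xsl:param)]"/>
      <tracexsl:comment>
        <xsl:text>[TraceXSL End]</xsl:text>
        <xsl:call-template name="describe-template"/>
      </tracexsl:comment>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because <xsl:call-template> does not change the current node, the helper still sees the <xsl:template> element being styled, so both comments report the same attributes.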
Now I have more data about which template
ran. But I still don't know why it ran. Unfortunately XSLT
doesn't have any way to ask how a template was invoked, or what the
mode is, so I'll have to do a bit
more work to display that information.
Most template invocations occur due to an <xsl:apply-templates> with no mode change, so I'll make that my assumed case. The other two possibilities are <xsl:apply-templates> with a mode change, or <xsl:call-template>. I can report these operations by generating additional comments in the output document, using the same techniques I applied to the <xsl:template> elements... but this time wrapping the comments around the outside of the element I want to trace rather than inserting them inside it.
Here's a template that adds a comment when
<xsl:call-template> occurs. Since that applies only to
the very next template to be invoked, I'm just generating one comment
before the call, rather than producing a before-and-after pair.
Listing 2. Generating a comment when <xsl:call-template> occurs
<xsl:template match="xsl:call-template">
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <tracexsl:comment>
    <xsl:text>[TraceXSL] CALL NAME="</xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>"</xsl:text>
  </tracexsl:comment>
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
And here's the template that does this for mode changes. I really
wish I could report only when the
mode is actually
changing to a new value -- and report what it's restored to when I
pop out of the
<xsl:apply-templates> call -- but I
haven't yet found a good way to do so. If you discover one, please let
me know so we can update this article!
Listing 3. Generating a comment for mode changes
<xsl:template match="xsl:apply-templates[@mode]">
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <tracexsl:comment>
    <xsl:text>[TraceXSL] APPLY MODE="</xsl:text>
    <xsl:value-of select="@mode"/>
    <xsl:text>"</xsl:text>
  </tracexsl:comment>
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <tracexsl:comment>
    <xsl:text>[TraceXSL] END MODE="</xsl:text>
    <xsl:value-of select="@mode"/>
    <xsl:text>"</xsl:text>
  </tracexsl:comment>
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
</xsl:template>
To demonstrate this new code, I'll need to run it against a stylesheet that uses mode changes and <xsl:call-template>. Rather than write one for this purpose, I'm going to swipe one of the conformance test cases that come with Xalan. It generates text output rather than HTML or XML, but it'll suffice to demonstrate that my trace is doing the right thing.
- tracexsl2.xsl -- Revised stylesheet for tracing stylesheets.
- tracexsl-sample2.xsl -- A copy of modes10.xsl from the Xalan test suite.
- When I use Xalan to run tracexsl2.xsl over that sample, I get tracexsl-sample2.xsl.withTrace.
- tracexsl-sample2.xml -- A copy of modes10.xml, the input document for that test case.
- And when I use Xalan to run tracexsl-sample2.xsl.withTrace over that source document, it produces the trace output.
That trace output is getting a bit wordy, isn't it? That's OK -- I'll discuss how to trim it down in a later part of this article. Right now, though, I'm going to make it even more detailed, because I'm still missing something I consider important.
So far, I've generated a lot of information about what templates are being invoked, and how they're being invoked. But of course the stylesheet is only half of the process; the other half is the document being styled. If I don't know where the input is coming from, it's hard to understand why I'm generating a particular set of output.
I could take the approach Xalan did in its own messages, and display the line and column number within the source document. But standard XSLT doesn't make that information available; I'd have to resort to XSLT Extensions. Since I've gotten this far without giving up portability, I'd rather not do that.
Besides, line and column really aren't the most meaningful way to describe a location within an XML document. It would be more useful to display an XPath to the node being processed: "/purchaseOrder/customer/shippingAddress" is much more informative than "line 152, column 17".
Automatically generating XPaths turns out to be more difficult than you might expect. The actual path syntax is quite simple, of course. But when the W3C Working Group designed the XPath syntax, they decided to use XML Namespace prefixes rather than spelling out namespace URIs in full. Unfortunately, they didn't provide any syntax for declaring those prefixes, so an XPath by itself is not as meaningful as it should be. We've sent a gripe to the XML Core Working Group asking that they address this, since it means there's no way to write a context-independent XPath.
Another little issue is that a prefix may be redefined at any point in a document...so anyone creating an XPath has to be prepared to create new prefixes to disambiguate these collisions, and make sure those prefixes don't collide with any others already in use. The simplest, and ugliest, answer would be to always produce new prefixes in the generated XPath rather than using the ones from the source document -- which would work fine for programs, but which would be harder for humans to read.
I've decided to dodge these issues and settle for a namespace-insensitive approximation of XPaths. The trace comments are primarily intended to be read by humans rather than by a real XPath processor, and redefined prefixes are relatively uncommon, so this is probably good enough for now. In many cases, it will in fact be an entirely acceptable XPath to the node. I've put code to generate this in its own template, so it can be easily replaced with a better solution when one becomes available.
That simplification makes generating a Pseudo XPath in XSLT much easier. It boils down to:
- Examining all of a node's ancestors, and the node itself, in turn
- Checking each node's type
- Outputting a slash, the node's type and name, and a position predicate.
That position number is needed in case the node has siblings of the same type and name, to indicate which one we're referring to. <xsl:number level="single"/> returns the right value for this purpose -- 1 for the first such instance, 2 for the second, and so on.
The following template, when invoked with
<xsl:call-template>, should do the job:
Listing 4. Template that generates a Pseudo XPath in XSLT
<xsl:template name="pseudo-xpath-to-current-node">
  <!-- Special-case for the root node, which otherwise wouldn't
       generate any path at all. A bit of a kluge, but it's simple
       and efficient. -->
  <xsl:if test="not(parent::node())">
    <xsl:text>/</xsl:text>
  </xsl:if>
  <xsl:for-each select="ancestor-or-self::node()">
    <xsl:choose>
      <xsl:when test="not(parent::node())">
        <!-- This clause recognizes the root node, which doesn't
             need to be explicitly represented in the XPath. -->
      </xsl:when>
      <xsl:when test="self::text()">
        <xsl:text>/text()[</xsl:text>
        <xsl:number level="single"/>
        <xsl:text>]</xsl:text>
      </xsl:when>
      <xsl:when test="self::comment()">
        <xsl:text>/comment()[</xsl:text>
        <xsl:number level="single"/>
        <xsl:text>]</xsl:text>
      </xsl:when>
      <xsl:when test="self::processing-instruction()">
        <xsl:text>/processing-instruction()[</xsl:text>
        <xsl:number level="single"/>
        <xsl:text>]</xsl:text>
      </xsl:when>
      <xsl:when test="self::*">
        <!-- This test for Elements works because the Principal Node
             Type of the self:: axis happens to be Element. -->
        <xsl:text>/</xsl:text>
        <xsl:value-of select="name(.)"/>
        <xsl:text>[</xsl:text>
        <xsl:number level="single"/>
        <xsl:text>]</xsl:text>
      </xsl:when>
      <xsl:when test="count(. | ../namespace::*) = count(../namespace::*)">
        <!-- This recognizes namespace nodes by checking whether the
             node is one of its parent's namespace nodes. It's a bit
             ugly, but XSLT 1.0 doesn't seem to have a more elegant
             test. (The name of a namespace node is its prefix, not
             "xmlns:prefix", so a name-based test would not work.)
             XSLT 2.0 is expected to deprecate the whole concept of
             namespace nodes, so it may become a moot point.
             NS nodes are unique; a count isn't required. -->
        <xsl:text>/namespace::</xsl:text>
        <xsl:value-of select="local-name(.)"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- If I've reached this clause, the node must be an
             attribute. Attributes are unique; a count is not
             required. -->
        <xsl:text>/@</xsl:text>
        <xsl:value-of select="name(.)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
</xsl:template>
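To get a feel for the paths this produces, here is a cut-down sketch that handles only element nodes and prints one pseudo-XPath per element as plain text. It is a standalone illustration rather than part of the trace tool, and the sample input shown in the comment is made up.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample input:
         <order><item/><note/><item><sku/></item></order>
       Output for that input:
         /order[1]
         /order[1]/item[1]
         /order[1]/note[1]
         /order[1]/item[2]
         /order[1]/item[2]/sku[1]
  -->
  <xsl:template match="*">
    <!-- Walk from the document element down to the current element. -->
    <xsl:for-each select="ancestor-or-self::*">
      <xsl:text>/</xsl:text>
      <xsl:value-of select="name()"/>
      <xsl:text>[</xsl:text>
      <!-- Counts this element among same-named preceding siblings. -->
      <xsl:number level="single"/>
      <xsl:text>]</xsl:text>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>
    <!-- Recurse into child elements only, ignoring text nodes. -->
    <xsl:apply-templates select="*"/>
  </xsl:template>
</xsl:stylesheet>
```

Note how the second item element is numbered 2 even though a note element sits between the two: the count considers only siblings with the same name.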
I'm going to want to call that template from my comment generator code. That's fairly straightforward -- within the <tracexsl:comment> element, I can just add something like:
<xsl:text> source=</xsl:text>
<tracexsl:call-template name="pseudo-xpath-to-current-node"/>
Now all I have to do is put the
pseudo-xpath-to-current-node template someplace where the
annotated stylesheet can find it. I could put it in its own
stylesheet file, and use
<xsl:import> to bring it in. But that would mean the
altered stylesheet couldn't run without this supporting stylesheet,
and I'd prefer a solution that lets me ship a single self-contained
file to a customer who's having trouble.
So I'm going to take advantage of the fact that I'm already
modifying the stylesheet, and I'll copy this template into the
annotated version. Just to keep everything in one convenient pile, I've
chosen to store the master copy as part of my
stylesheet-for-stylesheets, and copy it from there. This is quite
convenient in XSLT, since
document('') lets me read from the currently executing stylesheet.
In addition to adding the pseudo-xpath-to-current-node template to my revised tracexsl3.xsl, I've written a template that recognizes the <xsl:stylesheet> element in the source stylesheet and copies the pseudo-xpath-to-current-node template into it after all its other children. For safety, I've added a test that first checks whether the pseudo-XPath template already exists -- because someday I may want to use tracexsl.xsl to debug itself, and two copies of the same template would be an error.
Listing 5. Template that copies pseudo-xpath-to-current-node into <xsl:stylesheet>
<xsl:template match="xsl:stylesheet">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:apply-templates select="node()"/>
    <xsl:if test="not(xsl:template[@name='pseudo-xpath-to-current-node'])">
      <xsl:text> </xsl:text>
      <xsl:copy-of select="document('')/xsl:stylesheet/xsl:template[@name='pseudo-xpath-to-current-node']"/>
      <xsl:text> </xsl:text>
    </xsl:if>
  </xsl:copy>
</xsl:template>
Here's what I get when I put it together:
- tracexsl3.xsl -- Revised stylesheet for tracing stylesheets.
- tracexsl-sample3.xsl -- A copy of modes10.xsl from the Xalan test suite.
- When I use Xalan to run tracexsl3.xsl over that sample, I get tracexsl-sample3.xsl.withTrace.
- tracexsl-sample3.xml -- A copy of modes10.xml, the input document for that test case.
- And when I use Xalan to run tracexsl-sample3.xsl.withTrace over that source document, it produces the trace output.
In the last section, I mentioned the possibility of using
tracexsl.xsl to debug itself.
Actually, as things stand right now that wouldn't work very well. I'm inserting comments into the output of every template, and that would include the pseudo-xpath-to-current-node template...which means I'd wind up trying to generate comments inside the generated comment. Similarly, I'd generate a comment for the default template, which would mean I'd be outputting comments before attributes, which is forbidden. As it stands, tracing would break my trace stylesheet!
(At the end of the previous installment in this series, I pointed out that inserting comments into the output was not always going to be safe. This is a perfect example.)
I can certainly deal with these specific templates as a
special case. But what if I'm tracing another stylesheet that has
the same concerns? What I need is a more general solution, one that
would let users tell
tracexsl.xsl which of
their templates are and are not safe to trace.
How could stylesheet developers give me that information? One simple solution would be to ask them to add a mark to every template, indicating whether it should or shouldn't have trace code added. (Actually, I should probably pick a default so they only have to mark the exceptions.) I could then change my template-for-templates to only process templates with, or without, that mark.
What kind of mark? I'd suggest an attribute. XSLT elements are allowed to carry additional attributes, as long as those attributes are in a different namespace so the stylesheet processor knows it can ignore them. I've already defined a namespace for the tracexsl: prefix, so I'll use that; this will also help me remember that these attributes are commands to my trace tool.
For example, I might decide that <xsl:template> elements with the attribute tracexsl:trace="no" should not have trace comment generators added to them. To implement this, I would modify the match condition of my template-for-templates to check this attribute before proceeding, then add this flag to the templates I'm worried about.
Listing 6. Modifying the match condition, and marking non-traced templates
<xsl:template match="xsl:template[not(@tracexsl:trace='no')]">
  ...
</xsl:template>

<xsl:template match="@*|node()" priority="-1" tracexsl:trace="no">
  ...
</xsl:template>

<xsl:template name="pseudo-xpath-to-current-node" tracexsl:trace="no">
  ...
</xsl:template>
That works... but I think I'd like to take it a bit further.
As I've pointed out, producing comments every time the processor enters and leaves a template can produce an overwhelming amount of information, most of which probably isn't relevant to the problem you're trying to debug. You could use tracexsl:trace="no" to trim that down, but then you'd have to modify the stylesheet to add and remove these attributes every time you want to change what's being traced.
A better approach would be to find a way to say that a template belongs to a group I'm interested in, and specify which groups should be traced at the time I run tracexsl.xsl. I can do that by extending the tracexsl:trace behavior -- "no" will now be a special group that is never traced, but other keywords will be traced when that value is a substring of the current value of the $tracegroups variable, or if the variable is set to the empty string. Note that I have to check that the attribute is actually present, since when it's absent the contains() test would return true (the empty string is contained in all strings).
Listing 7. Improved match to support trace groups
<xsl:template match="xsl:template[not(@tracexsl:trace='no') and ($tracegroups='' or @tracexsl:trace and contains($tracegroups,@tracexsl:trace))]">
This isn't a perfect test -- if the variable contains the comma-separated list of groups "fred,ginger", it will match a tracexsl:trace value of "red". But it's good enough for my purposes, since I'm free to pick group names that aren't substrings of each other. (If this really bothers you, feel free to improve this stylesheet!)
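If the substring problem does bother you, one possible improvement is to wrap both the list and the candidate name in delimiters before comparing them. The standalone sketch below demonstrates the idea; the in-tracegroups helper and its group parameter are invented for this example, and it assumes group names are separated by commas with no surrounding spaces.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:param name="tracegroups" select="'fred,ginger'"/>

  <!-- Hypothetical helper: reports whether $group is a whole entry
       in the comma-separated $tracegroups list. Padding both strings
       with commas means "red" no longer matches inside "fred". -->
  <xsl:template name="in-tracegroups">
    <xsl:param name="group"/>
    <xsl:value-of select="$group"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="contains(concat(',', $tracegroups, ','),
                                   concat(',', $group, ','))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Runs against any input document and prints:
         fred: true
         red: false
         ginger: true
  -->
  <xsl:template match="/">
    <xsl:call-template name="in-tracegroups">
      <xsl:with-param name="group" select="'fred'"/>
    </xsl:call-template>
    <xsl:call-template name="in-tracegroups">
      <xsl:with-param name="group" select="'red'"/>
    </xsl:call-template>
    <xsl:call-template name="in-tracegroups">
      <xsl:with-param name="group" select="'ginger'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

The same padded contains() expression could stand in for the plain contains() call in the Listing 7 predicate, with @tracexsl:trace in place of $group.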
Now all I need to do is set $tracegroups. Rather than forcing users to alter tracexsl.xsl to change groups, I'll take advantage of stylesheet parameters, which are variables that can be set by the XSLT processor before the styling process starts. To accept this parameter, I need to add another directive within my <xsl:stylesheet> element:
Listing 8. Accepting the tracegroups list as a stylesheet parameter
<xsl:param name="tracegroups" select="''"/>
This both says I'm going to accept a parameter called
tracegroups, and makes its default value the empty string.
(Two layers of quoting are needed because
select expects to evaluate
an XPath expression to get its value -- the inner '' is the expression for the
empty string, and the "" around it quotes that expression.)
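To make the quoting rule concrete, here is a small standalone sketch that declares three parameters and prints their default values. The parameter names are invented for this example, and the expected output assumes the source document's root element is not named fred.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Inner quotes present: the expression is a string literal. -->
  <xsl:param name="quoted" select="'fred,ginger'"/>

  <!-- Inner quotes missing: the expression is a location path that
       selects child elements named fred of the source document's
       root node. Usually there are none, so the value is an empty
       node-set, which prints as nothing. -->
  <xsl:param name="unquoted" select="fred"/>

  <!-- No select attribute and no content: the empty string. -->
  <xsl:param name="bare"/>

  <!-- Prints: quoted=[fred,ginger] unquoted=[] bare=[] -->
  <xsl:template match="/">
    <xsl:text>quoted=[</xsl:text>
    <xsl:value-of select="$quoted"/>
    <xsl:text>] unquoted=[</xsl:text>
    <xsl:value-of select="$unquoted"/>
    <xsl:text>] bare=[</xsl:text>
    <xsl:value-of select="$bare"/>
    <xsl:text>]&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Forgetting the inner quotes is an easy mistake to make, and as the second parameter shows, it typically fails silently rather than raising an error.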
How would you actually set this parameter? The details of passing a parameter to a stylesheet vary, depending on which XSLT processor you're using and how you invoke it. If you're using Xalan's command-line tools, you could add the option:
For the Xalan-J command line:
-PARAM tracegroups "fred,ginger"
For the Xalan-C command line:
-p tracegroups "fred,ginger"
to request that you add tracing only to templates marked as belonging to the "fred" or "ginger" groups. Note that the
quotes are required only because some operating systems interpret the
comma prematurely as a command-line token delimiter. An alternative
solution is to separate your group names with a character that
doesn't have that effect.
If you're invoking an XSLT processor through the JAXP (javax.xml.transform) APIs, you could accomplish the same thing by adding a setParameter("tracegroups","fred,ginger") call to your Transformer object. If you're using Xalan-C rather than Xalan-J, the equivalent call is made on the XalanTransformer object (see Resources).
If you're using another processor, of course, check its documentation for the details of how it accepts stylesheet parameters.
Putting it all together, here's a complete selective trace
stylesheet. I've added more
tracexsl:trace attributes to
it as an example. Try running it against itself!
- tracexsl4.xsl -- TRACEXSL stylesheet with tracegroups support added. This version includes comments about how it works, and is the one I'd suggest you actually save for reuse.
- I'll use an identical copy of tracexsl4.xsl as the sample stylesheet to be traced.
- tracexsl-sample4.xsl.withTrace -- Stylesheet with trace code added. Note that the pseudo-xpath-to-current-node template does not have trace code added, since it has tracexsl:trace="no".
- I'll use another identical copy of tracexsl4.xsl as input to the annotated stylesheet.
- tracexsl-sample4.traceResult shows the (very verbose!) results.
- I'll try that again, but this time set the tracegroups parameter to "stylesheet,template". This will trace only the templates belonging to those groups.
- tracexsl-sample4a.xsl.withTrace -- Stylesheet with trace code added only to templates matching that tracegroups setting. Note that the pseudo-XPath template is still suppressed as well.
- Again, an identical copy of tracexsl4.xsl is used as input to this selectively annotated stylesheet. I'm going to run this pass without setting tracegroups.
- tracexsl-sample4a.traceResult shows the results -- now far less verbose, since the identity template is not being traced. As you can see, this limited trace makes understanding what actually happened much easier.
Homework assignment: Can you guess what would
happen if I specified
tracegroups for this second pass
as well? Try it!
Now that you know how to style stylesheets, you could apply these same
techniques to other automatic enhancements. For example, you could write
a stylesheet that generates
<xsl:message> calls rather
than comments -- perhaps taking advantage of the
techniques Uche Ogbuji has previously described here on developerWorks (see Resources).
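A minimal sketch of the <xsl:message> variant might look like the following. It assumes the same <xsl:namespace-alias> arrangement used for the comment generator; the placeholder namespace URI and the handling of <xsl:param> children are illustrative assumptions, and only the start-of-template report is shown.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tracexsl="http://example.org/tracexsl-alias">
  <!-- The tracexsl namespace URI above is a placeholder. -->
  <xsl:namespace-alias stylesheet-prefix="tracexsl" result-prefix="xsl"/>

  <!-- Identity rule: copy anything that has no more specific template. -->
  <xsl:template match="@*|node()" priority="-1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Insert a message generator at the start of each template body. -->
  <xsl:template match="xsl:template">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Parameters must remain the first children of a template. -->
      <xsl:apply-templates select="xsl:param"/>
      <tracexsl:message>
        <xsl:text>[TraceXSL Begin]</xsl:text>
        <xsl:if test="@match">
          <xsl:text> match="</xsl:text>
          <xsl:value-of select="@match"/>
          <xsl:text>"</xsl:text>
        </xsl:if>
        <xsl:if test="@name">
          <xsl:text> name="</xsl:text>
          <xsl:value-of select="@name"/>
          <xsl:text>"</xsl:text>
        </xsl:if>
      </tracexsl:message>
      <xsl:apply-templates select="node()[not(self::xsl:param)]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Since messages are reported by the processor rather than written into the result tree, this variant should not run into the comments-before-attributes restriction, at the cost of separating the trace from the output it describes.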
You could also replace the simple tracegroups test with something more sophisticated. One interesting possibility might be to pass in an XPath to be used to test whether a template should be traced, and call the EXSLT Dynamic XPath Evaluation extension (dyn:evaluate()) to execute it. The EXSLT library isn't supported in all XSLT processors, but at least it's standardized enough that if it is present its behavior should be consistent.
You could also do a bit more cleanup. One of the things that makes the <xsl:namespace-alias> approach a bit ugly is that namespace declarations for the aliased prefix tend to appear where that prefix is used -- which means they're scattered all over the document. It would be really nice to have that prefix declared once at the top of the file, on the generated <xsl:stylesheet> directive. In XSLT 1.0, that's rather ugly; the only way to explicitly create a namespace node is to create a temporary element that uses it and extract it from there:
Listing 9. Creating a temporary element in order to create a namespace node
<xsl:variable name="dummy">
  <xsl:element name="tracexsl:x" namespace="http://www.my.net/my markup"/>
</xsl:variable>
<xsl:copy-of select="xx:node-set($dummy)//namespace::*"/>
(XSLT 2.0 is expected to introduce a new directive, <xsl:namespace>, which would be the elegant solution.)
As I noted back when I started this project, you
could also switch to using
<xsl:element> to explicitly
generate the new stylesheet elements with the standard
xsl: prefix rather than using
<xsl:namespace-alias>. Doing so might make your trace
tool a bit harder to write and maintain, and would mean you could no
longer see which directives were generated by the trace tool, but
getting cleaner output might be worth those costs.
Some users may want to analyze the trace information, perhaps
counting how often each template was invoked. Obviously, you could
write a stylesheet that scanned through the traced output, found the
comments, and sorted and counted them. If that's your main interest,
you might want to change the contents of the comments -- or perhaps
even change them to some other kind of annotation entirely -- to make
them easier for your analyzer to process. But if you don't need a
portable solution, this sort of measurement might be easier to write
as a Xalan
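As a rough sketch of the portable approach, the stylesheet below counts start-of-template comments in a traced output document using keyed (Muenchian) grouping. It assumes the traced output is well-formed XML, that the comments begin with the [TraceXSL Begin] marker from Listing 1, and that any source= pseudo-XPath suffix should be ignored when grouping; the key name is made up for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index start-of-template comments by their text up to (but not
       including) any " source=" suffix, so that invocations of one
       template on different nodes fall into the same group. -->
  <xsl:key name="trace-begin"
      match="comment()[starts-with(., '[TraceXSL Begin]')]"
      use="substring-before(concat(., ' source='), ' source=')"/>

  <xsl:template match="/">
    <xsl:for-each select="//comment()[starts-with(., '[TraceXSL Begin]')]">
      <!-- Most frequently invoked templates first. -->
      <xsl:sort data-type="number" order="descending"
          select="count(key('trace-begin',
                  substring-before(concat(., ' source='), ' source=')))"/>
      <xsl:variable name="label"
          select="substring-before(concat(., ' source='), ' source=')"/>
      <!-- Report each group once, at its first comment. -->
      <xsl:if test="generate-id() =
                    generate-id(key('trace-begin', $label)[1])">
        <xsl:value-of select="count(key('trace-begin', $label))"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="$label"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each output line holds an invocation count, a tab, and the comment text identifying the template. A stylesheet that produces text output, like the modes10 sample, would need a different kind of annotation before this could work.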
As I noted, producing a completely correct XPath (rather than a pseudo-XPath approximation) would take a bit more work. In addition to the namespace issue, note that the version I've shown here doesn't indicate which source document it came from -- which could be important if your stylesheet uses the document() function or other secondary source trees. I'm sure there are ways to close these gaps, but I've left them as an exercise for the reader.
Of course, tracing is only one example of styling stylesheets. More
generally, this technique gives you yet another way to implement
extensions to the XSLT language. As you saw when I created the
tracexsl:trace attribute, you can invent new stylesheet
modifiers -- or entirely new directives -- that your
stylesheet-for-stylesheets will convert into the detailed XSLT
operations needed to achieve the desired effect. If you're careful
to avoid processor-specific features, you can make these solutions
work with any XSLT processor.
I'm sure you'll think of applications I haven't dreamed of. All it takes is the realization that XSLT isn't just a tool for producing pretty displays and printouts -- it's a very general document processor. Since stylesheets are themselves documents they can be preprocessed...and a compiler is, in some sense, just a very fancy preprocessor.
And it's all made possible because XSLT is itself an XML application.
Aren't general-purpose, standards-based tools wonderful?
- Download the source files for Part 2 of this article.
- For general advice on using the XSLT stylesheet language, one of
the best places to look is the XSL User's mailing List, at http://www.mulberrytech.com/xsl/xsl-list/index.html. The mailing list's home page also has a link to Dave Pawson's XSLT Frequently Asked Questions (FAQ) Web site, which collects many of the most useful answers.
- For information about the open-source Xalan XSLT processor, which
I used to develop and test the examples in this article, see Apache's
Web site at http://xml.apache.org. The best
places to ask questions about using this specific processor would be
the Xalan-J and Xalan-C users' mailing lists; you can find out about
them at http://xml.apache.org/mail.html. If
you want to get involved in Xalan's development, try the Xalan-Dev
mailing list, found at the same place.
- For the official definition of XSLT, including <xsl:namespace-alias>, check out the XSLT Recommendation on the W3C's Web site.
- The W3C also has an XSL Home Page, which has links not only to the Recommendation but to related information such as the XPath and XSL-FO specifications, lists of software that supports these standards, pointers to interesting articles about XSL and other resources (including most of the ones I've mentioned here), as well as the Working Drafts (WDs) which describe proposed future versions of these tools. Yes, W3C specifications are often a bit hard to read -- both because they were written by experts for experts, and because many
expert programmers really don't write very well -- but if you need the
Official Word on exactly what XSLT should be doing in any particular
case, this is where you'll find it.
- Find out more about the setParameter("tracegroups","fred,ginger") call at the Apache XML Project site.
- For an interesting example of using XSLT as a code compiler, take a look at the DOM Test Suite now being developed. Because
the DOM is available in multiple languages and bindings, they chose to
write the test suite using an abstract XML-based meta-language, and
to use XSLT stylesheets to turn that into executable code. A test case
written once in the meta-language can be compiled and executed in any
language they have a stylesheet for, ensuring that the same tests are
applied everywhere. (I've experimented with some similar code
generation myself, but in my case I had the stylesheet produce BML --
IBM's Bean Markup Language -- and then let the BML tools do the work
of turning it into executing Java code.)
- And of course don't forget to check right here on IBM's developerWorks XML
Zone for a wide variety of articles, tutorials, tips and
tools. Two articles by Uche Ogbuji cover concepts mentioned in this article: "Debug XSLT on the fly" (November 2002) and "EXSLT by example" (February 2003).
- You'll also find a number of interesting XML tools on alphaWorks, where you can
download experimental versions of some of IBM's very latest ideas.
Joe Kesselman has been with IBM for over two decades, working on projects ranging from mainframe circuit design, to CAD tools, to research in software development, to Internet standards (he's one of the authors of the W3C's DOM Level 2 Recommendation). Most recently he's been working on XSLT processors, including Apache's Xalan. You can contact Joe at firstname.lastname@example.org.