As you may remember, my previous article demonstrated the concept of using a stylesheet to compile new features into another stylesheet. Specifically, I showed you how to write a simple execution tracing tool, which automatically modifies a stylesheet so it will generate comments in the output document as it runs, showing which parts of the latter were produced by each template.
However, I ended the article by pointing out that the basic version I'd developed was quite limited, and suggesting a number of ways in which it could be improved. In this installment I'll add some of those missing features, and turn this proof-of-concept into a much more useful tool.
Note: I'm going to assume you're already familiar with the code we developed last time, and will focus on the changes we're making to it. If you haven't read Part 1, you really should glance through that before proceeding.
What's in a name?
One of the limitations in my original solution was that it reported only the
match pattern when each template started and ended
execution. In fact, not every template execution is the result of a
match. Some have a
name instead (or in
addition), and are (at least sometimes) invoked through that name, using
<xsl:call-template>. And sometimes there are several
templates with overlapping
match patterns, and which one runs
depends on the
mode the stylesheet is currently running in.
So I'd like to modify the stylesheet we wrote last time,
tracexsl.xsl, to include these details in its reports. First
off, I need to have the template tell me more about itself, which means
making the comment-generator a bit fancier. I'll check whether each of the
attributes I'm interested in -- match, name, and
mode -- is present, and display it if so. If one of them is
missing, I'll suppress printing its name as well, so I can tell the
difference between it being absent and being set to an empty string.
Here's the sequence I'll want to execute to insert a start-of-template
comment generator into the template being styled:
Listing 1. Inserting a start-of-template comment generator
<tracexsl:text xml:space="preserve"> </tracexsl:text>
<tracexsl:comment>
  <xsl:text>[TraceXSL Begin]</xsl:text>
  <xsl:if test="@match">
    <xsl:text> match="</xsl:text>
    <xsl:value-of select="@match"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
  <xsl:if test="@name">
    <xsl:text> name="</xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
  <xsl:if test="@mode">
    <xsl:text> mode="</xsl:text>
    <xsl:value-of select="@mode"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
</tracexsl:comment>
<tracexsl:text xml:space="preserve"> </tracexsl:text>
Remember that the tracexsl: prefix is bound to a dummy
namespace, and will be rebound to the official XSLT namespace when these
literal result elements are copied into the generated stylesheet, thanks
to the <xsl:namespace-alias> directive in my
stylesheet-for-stylesheets. If I had used the xsl: prefix
directly, these elements would be executed immediately rather than being
copied into the generated stylesheet.
Obviously, the end-template comment can, and probably should, be modified similarly.
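As a rough sketch of that parallel change, the end-of-template generator could simply mirror Listing 1. The "[TraceXSL End]" label is my assumption about the wording; substitute whatever text your end comment already uses. This is a fragment meant to sit inside the template-for-templates, where the xsl: and tracexsl: prefixes are already declared.

```xml
<!-- Sketch only: end-of-template counterpart to Listing 1.
     The "[TraceXSL End]" label is an assumed wording. -->
<tracexsl:text xml:space="preserve"> </tracexsl:text>
<tracexsl:comment>
  <xsl:text>[TraceXSL End]</xsl:text>
  <!-- Report each identifying attribute only when it is present. -->
  <xsl:if test="@match">
    <xsl:text> match="</xsl:text>
    <xsl:value-of select="@match"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
  <xsl:if test="@name">
    <xsl:text> name="</xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
  <xsl:if test="@mode">
    <xsl:text> mode="</xsl:text>
    <xsl:value-of select="@mode"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
</tracexsl:comment>
<tracexsl:text xml:space="preserve"> </tracexsl:text>
```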
Now I have more data about which template ran. But I still don't
know why it ran. Unfortunately XSLT doesn't have any way to ask
how a template was invoked, or what the current
mode is, so
I'll have to do a bit more work to display that information.
Most template invocations occur due to an
<xsl:apply-templates> with no mode attribute,
so I'll make that my assumed case. The other two possibilities are
<xsl:apply-templates> with a mode, and
<xsl:call-template>. I can report these operations
by generating additional comments in the output document, using the same
techniques I applied to the <xsl:template> element,
but this time wrapping the comments around the outside of the element I
want to trace rather than inserting them inside it.
Here's a template that adds a comment when
<xsl:call-template> occurs. Since that applies only to
the very next template to be invoked, I'm just generating one comment
before the call, rather than producing a before-and-after pair.
Listing 2. Generating a comment when <xsl:call-template> occurs
<xsl:template match="xsl:call-template">
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <tracexsl:comment>
    <xsl:text>[TraceXSL] CALL NAME="</xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>"</xsl:text>
  </tracexsl:comment>
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
And here's the template that does this for mode changes. I really wish I
could report only when the
mode is actually changing to a new
value -- and report what it's restored to when I pop out of the
<xsl:apply-templates> call -- but I haven't yet found a
good way to do so. If you discover one, please let me know so we can
update this article!
Listing 3. Generating a comment for mode changes
<xsl:template match="xsl:apply-templates[@mode]">
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <tracexsl:comment>
    <xsl:text>[TraceXSL] APPLY MODE="</xsl:text>
    <xsl:value-of select="@mode"/>
    <xsl:text>"</xsl:text>
  </tracexsl:comment>
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
  <tracexsl:comment>
    <xsl:text>[TraceXSL] END MODE="</xsl:text>
    <xsl:value-of select="@mode"/>
    <xsl:text>"</xsl:text>
  </tracexsl:comment>
  <tracexsl:text xml:space="preserve"> </tracexsl:text>
</xsl:template>
To demonstrate this new code, I'll need to run it against a stylesheet that uses
mode changes and <xsl:call-template>.
Rather than write one for this purpose, I'm going to swipe one of the
conformance test cases that comes with Xalan. It generates text output
rather than HTML or XML, but it'll suffice to demonstrate that my trace is
doing the right thing.
- tracexsl2.xsl -- Revised stylesheet for tracing stylesheets.
- tracexsl-sample2.xsl -- A copy of modes10.xsl from the Xalan test suite.
- When I use Xalan to run tracexsl2.xsl over that sample, I get tracexsl-sample2.xsl.withTrace.
- tracexsl-sample2.xml -- A copy of modes10.xml, the input document for that test case.
- And when I use Xalan to run the generated tracexsl-sample2.xsl.withTrace over that source document, it produces the trace output.
That trace output is getting a bit wordy, isn't it? That's OK -- I'll discuss how to trim it down in a later part of this article. Right now, though, I'm going to make it even more detailed, because I'm still missing something I consider important.
But what was the input?
So far, I've generated a lot of information about what templates are being invoked, and how they're being invoked. But of course the stylesheet is only half of the process; the other half is the document being styled. If I don't know where the input is coming from, it's hard to understand why I'm generating a particular set of output.
I could take the approach Xalan did in its own messages, and display the line and column number within the source document. But standard XSLT doesn't make that information available; I'd have to resort to XSLT Extensions. Since I've gotten this far without giving up portability, I'd rather not do that.
Besides, line and column really aren't the most meaningful way to describe a location within an XML document. It would be more useful to display an XPath to the node being processed: "/purchaseOrder/customer/shippingAddress" is much more informative than "line 152, column 17".
Automatically generating XPaths turns out to be more difficult than you might expect. The actual path syntax is quite simple, of course. But when the W3C Working Group designed the XPath syntax, they decided to use XML Namespace prefixes rather than spelling out namespace URIs in full. Unfortunately, they didn't provide any syntax for declaring those prefixes, so an XPath by itself is not as meaningful as it should be. We've sent a gripe to the XML Core Working Group asking that they address this, since it means there's no way to write a context-independent XPath.
Another little issue is that a prefix may be redefined at any point in a document...so anyone creating an XPath has to be prepared to create new prefixes to disambiguate these collisions, and make sure those prefixes don't collide with any others already in use. The simplest, and ugliest, answer would be to always produce new prefixes in the generated XPath rather than using the ones from the source document -- which would work fine for programs, but which would be harder for humans to read.
I've decided to dodge these issues and settle for a namespace-insensitive approximation of XPaths. The trace comments are primarily intended to be read by humans rather than by a real XPath processor, and redefined prefixes are relatively uncommon, so this is probably good enough for now. In many cases, it will in fact be an entirely acceptable XPath to the node. I've put code to generate this in its own template, so it can be easily replaced with a better solution when one becomes available.
That simplification makes generating a Pseudo XPath in XSLT much easier. It boils down to:
- Examining all of a node's ancestors, and the node itself, in turn
- Checking each node's type
- Outputting a slash, the node's type and name, and a position predicate.
That position number is needed in case the node has siblings of the same
type and name, to indicate which one we're referring to.
<xsl:number> returns the right value for this purpose
-- 1 for the first such instance, 2 for the second, and so on.
The following template, when invoked with
<xsl:call-template>, should do the job:
Listing 4. Template that generates a Pseudo XPath in XSLT
<xsl:template name="pseudo-xpath-to-current-node">
  <!-- Special-case for the root node, which otherwise wouldn't
       generate any path at all. A bit of a kluge, but it's simple
       and efficient. -->
  <xsl:if test="not(parent::node())">
    <xsl:text>/</xsl:text>
  </xsl:if>
  <xsl:for-each select="ancestor-or-self::node()">
    <xsl:choose>
      <xsl:when test="not(parent::node())">
        <!-- This clause recognizes the root node, which doesn't need
             to be explicitly represented in the XPath. -->
      </xsl:when>
      <xsl:when test="self::text()">
        <xsl:text>/text()[</xsl:text>
        <xsl:number level="single"/>
        <xsl:text>]</xsl:text>
      </xsl:when>
      <xsl:when test="self::comment()">
        <xsl:text>/comment()[</xsl:text>
        <xsl:number level="single"/>
        <xsl:text>]</xsl:text>
      </xsl:when>
      <xsl:when test="self::processing-instruction()">
        <xsl:text>/processing-instruction()[</xsl:text>
        <xsl:number level="single"/>
        <xsl:text>]</xsl:text>
      </xsl:when>
      <xsl:when test="self::*">
        <!-- This test for Elements works because the Principal Node
             Type of the self:: axis happens to be Element. -->
        <xsl:text>/</xsl:text>
        <xsl:value-of select="name(.)"/>
        <xsl:text>[</xsl:text>
        <xsl:number level="single"/>
        <xsl:text>]</xsl:text>
      </xsl:when>
      <xsl:when test="count(. | ../namespace::*) = count(../namespace::*)">
        <!-- This recognizes namespace nodes by checking whether the
             node is on its parent's namespace axis. It's a bit ugly,
             but XSLT 1.0 doesn't seem to have a more elegant test.
             XSLT 2.0 is expected to deprecate the whole concept of
             namespace nodes, so it may become a moot point.
             NS nodes are unique; a count isn't required. -->
        <xsl:text>/namespace::</xsl:text>
        <xsl:value-of select="local-name(.)"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- If I've reached this clause, the node must be an
             attribute. Attributes are unique; a count is not
             required. -->
        <xsl:text>/@</xsl:text>
        <xsl:value-of select="name(.)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
</xsl:template>
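To get a feel for what this template produces, a throwaway driver stylesheet along the following lines may help. It is only a sketch: it assumes the Listing 4 template has been saved, wrapped in its own <xsl:stylesheet> element, in a file I've hypothetically called pseudo-xpath.xsl.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical file name: it should hold the Listing 4 template
       inside an xsl:stylesheet wrapper. Adjust to suit. -->
  <xsl:include href="pseudo-xpath.xsl"/>

  <xsl:output method="text"/>

  <!-- Print one pseudo-XPath per element in the input document. -->
  <xsl:template match="/">
    <xsl:for-each select="//*">
      <xsl:call-template name="pseudo-xpath-to-current-node"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run against a small document such as <order><item/><item><note>hi</note></item></order>, this should print /order[1], /order[1]/item[1], /order[1]/item[2], and /order[1]/item[2]/note[1], one per line.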
I'm going to want to call that template from my comment generator code.
That's fairly straightforward -- within the
<tracexsl:comment>, I can just add something like:
<xsl:text> source=</xsl:text> <tracexsl:call-template name="pseudo-xpath-to-current-node"/>
Now all I have to do is put the pseudo-xpath-to-current-node
template someplace where the annotated stylesheet can find it. I could put
it in its own stylesheet file, and use
<xsl:import> to bring it in. But that would mean the
altered stylesheet couldn't run without this supporting stylesheet, and
I'd prefer a solution that lets me ship a single self-contained file to a
customer who's having trouble.
So I'm going to take advantage of the fact that I'm already modifying the
stylesheet, and I'll copy this template into the annotated version. Just
to keep everything in one convenient pile, I've chosen to store the master
copy as part of my stylesheet-for-stylesheets, and copy it from there.
This is quite convenient in XSLT, since
document('') lets me
read from the currently executing stylesheet.
In addition to adding the
pseudo-xpath-to-current-node
template to my revised
tracexsl3.xsl, I've written a template
that recognizes the top-level
<xsl:stylesheet> element
in the source stylesheet and copies the
pseudo-xpath-to-current-node template into it after all its
other children. For safety, I've added a test that first checks whether
the pseudo-XPath template already exists -- because someday I may want to
use tracexsl to debug itself, and two copies of the same
template would be an error.
Listing 5. Template that copies pseudo-xpath-to-current-node into <xsl:stylesheet>
<xsl:template match="xsl:stylesheet">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:apply-templates select="node()"/>
    <xsl:if test="not(xsl:template[@name='pseudo-xpath-to-current-node'])">
      <xsl:text> </xsl:text>
      <xsl:copy-of select="document('')/xsl:stylesheet/xsl:template[
          @name='pseudo-xpath-to-current-node']"/>
      <xsl:text> </xsl:text>
    </xsl:if>
  </xsl:copy>
</xsl:template>
Here's what I get when I put it together:
- tracexsl3.xsl -- Revised stylesheet for tracing stylesheets.
- tracexsl-sample3.xsl -- A copy of modes10.xsl from the Xalan test suite.
- When I use Xalan to run tracexsl3.xsl over that sample, I get tracexsl-sample3.xsl.withTrace.
- tracexsl-sample3.xml -- A copy of modes10.xml, the input document for that test case.
- And when I use Xalan to run the generated tracexsl-sample3.xsl.withTrace over that source document, it produces the trace output.
Too much! Too many!
In the last section, I mentioned the possibility of using
tracexsl.xsl to debug itself.
Actually, as things stand right now that wouldn't work very well. I'm
inserting comments into the output of every template, and that would
mean I'd wind up trying to generate comments inside the
generated comment. Similarly, I'd generate a comment for the default
template, which would mean I'd be outputting comments before attributes,
which is forbidden. As it stands, tracing would break my trace tool.
(At the end of the previous installment in this series, I pointed out that inserting comments into the output was not always going to be safe. This is a perfect example.)
I can certainly deal with these specific templates as a special case. But
what if I'm tracing another stylesheet that has the same concerns? What I
need is a more general solution, one that would let users tell
tracexsl.xsl which of their templates are and are
not safe to trace.
How could stylesheet developers give me that information? One simple solution would be to ask them to add a mark to every template, indicating whether it should or shouldn't have trace code added. (Actually, I should probably pick a default so they only have to mark the exceptions.) I could then change my template-for-templates to only process templates with, or without, that mark.
What kind of mark? I'd suggest an attribute. XSLT elements are allowed to
carry additional attributes, as long as those attributes are in a
different namespace so the stylesheet processor knows it can ignore them.
I've already defined a namespace for the
tracexsl: prefix, so
I'll use that; this will also help me remember that these attributes are
commands to my trace system.
For example, I might say that templates
with the attribute
tracexsl:trace="no" should not have trace
comment generators added to them. To implement this, I would modify the
match condition of my template-for-templates to check this
attribute before proceeding, then add this flag to the templates I'm
excluding.
Listing 6. Modifying the match condition, and marking non-traced templates
<xsl:template match="xsl:template[not(@tracexsl:trace='no')]">
  ...
</xsl:template>

<xsl:template match="@*|node()" priority="-1" tracexsl:trace="no">
  ...
</xsl:template>

<xsl:template name="pseudo-xpath-to-current-node" tracexsl:trace="no">
  ...
</xsl:template>
That works... but I think I'd like to take it a bit further.
As I've pointed out, producing comments every time the processor enters and
leaves a template can produce an overwhelming amount of information, most
of which probably isn't relevant to the problem you're trying to debug.
You could use
tracexsl:trace="no" to trim that down, but then
you'd have to modify the stylesheet to add and remove these attributes
every time you want to change what's being traced.
A better approach would be to find a way to say that a template belongs to
a group I'm interested in, and specify which groups should be traced at
the time I run
tracexsl.xsl. I'll do that by extending the
tracexsl:trace behavior -- "no" will now be a special group
that is never traced, but other keywords will be traced when that value is
a substring of the current value of the $tracegroups
variable, or if the variable is set to the empty string. Note that I have
to check that the attribute is actually present, since when it's absent the
contains() test would return true (the empty string is
contained in all strings).
Listing 7. Improved match to support trace groups
<xsl:template match="xsl:template[not(@tracexsl:trace='no') and ($tracegroups='' or @tracexsl:trace and contains($tracegroups,@tracexsl:trace))]">
This isn't a perfect test -- if the variable contains the comma-separated
list of groups "fred,ginger", it will also match a template marked
"red". But it's good enough for my purposes, since
I'm free to pick group names that aren't substrings of each other. (If
this really bothers you, feel free to improve this stylesheet!)
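One possible improvement, offered as a sketch rather than as part of the downloadable code, is to pad both the list and the group name with commas before comparing, so that "red" can no longer match inside "fred". The demonstration below only lists the templates that would be traced instead of instrumenting them. The namespace URI bound to tracexsl: is a placeholder of mine and must match whatever tracexsl.xsl actually uses.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tracexsl="http://example.org/tracexsl-placeholder">

  <xsl:output method="text"/>
  <xsl:param name="tracegroups" select="''"/>

  <!-- Pad the list so every group name has a comma on both sides:
       "fred,ginger" becomes ",fred,ginger,". -->
  <xsl:variable name="padded" select="concat(',', $tracegroups, ',')"/>

  <!-- Input is the stylesheet to be traced. -->
  <xsl:template match="/">
    <xsl:for-each select="//xsl:template[not(@tracexsl:trace='no')]">
      <!-- Same logic as Listing 7, but comparing ",name," against
           the padded list so only whole names match. -->
      <xsl:if test="$tracegroups='' or
                    (@tracexsl:trace and
                     contains($padded,
                              concat(',', @tracexsl:trace, ',')))">
        <xsl:text>would trace: </xsl:text>
        <xsl:value-of select="concat(@match, ' ', @name)"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The padded comparison assumes the list contains no spaces around its commas. Putting the test in an <xsl:if> rather than in a match pattern also sidesteps the XSLT 1.0 rule that match patterns may not contain variable references, which some processors enforce.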
Now all I need to do is set
$tracegroups. Rather than forcing
users to alter
tracexsl.xsl to change groups, I'll use
stylesheet parameters, which are variables that
can be set by the XSLT processor before the styling process starts. To
accept this parameter, I need to add another directive within my
stylesheet:
Listing 8. Accepting the tracegroups list as a stylesheet parameter
<xsl:param name="tracegroups" select="''"/>
This both says I'm going to accept a parameter called
tracegroups, and makes its default value the empty string.
(Two layers of quoting are needed because
select expects to
evaluate an XPath expression to get its value -- the inner '' is the
expression for the empty string, and the "" around it quotes that
expression as the attribute's value.)
How would you actually set this parameter? The details of passing a parameter to a stylesheet vary, depending on which XSLT processor you're using and how you invoke it. If you're using Xalan's command-line tools, you could add the option:
For the Xalan-J
Process or Xalan-C
-PARAM tracegroups "fred,ginger"
For the Xalan-C
-p tracegroups "fred,ginger"
to request that you add tracing only to templates marked as belonging to
the "fred" or "ginger" groups. Note that the
quotes are required only because some operating systems interpret the
comma prematurely as a command-line token delimiter. An alternative
solution is to separate your group names with a character that doesn't
have that effect.
If you're invoking an XSLT processor through the JAXP
(javax.xml.transform) APIs, you could accomplish the
same thing by issuing the
setParameter("tracegroups","fred,ginger") call to your
Transformer object. If you're using Xalan-C rather than
Xalan-J, the equivalent call is
setStylesheetParam("tracegroups","fred,ginger") on the
XalanTransformer object (see Resources).
If you're using another processor, of course, check its documentation for the details of how it accepts stylesheet parameters.
Putting it all together, here's a complete selective trace stylesheet. I've
added tracexsl:trace attributes to it as an example. Try
running it against itself!
- tracexsl4.xsl -- TRACEXSL stylesheet with tracegroups support added. This version includes comments about how it works, and is the one I'd suggest you actually save for reuse.
- I'll use an identical copy of tracexsl4.xsl as the sample stylesheet to be traced.
- tracexsl-sample4.xsl.withTrace -- Stylesheet with trace code added. Note that the pseudo-xpath-to-current-node template does not have trace code added, since it has tracexsl:trace="no".
- I'll use another identical copy of tracexsl4.xsl as input to the annotated stylesheet.
- tracexsl-sample4.traceResult shows the (very verbose!) results.
- I'll try that again, but this time set the tracegroups parameter to "stylesheet,template". This will trace only the templates belonging to those groups.
- tracexsl-sample4a.xsl.withTrace -- Stylesheet with trace code added only to templates matching that tracegroups setting. Note that the pseudo-XPath template is still suppressed as well.
- Again, an identical copy of tracexsl4.xsl is used as input to this selectively annotated stylesheet. I'm going to run this pass without setting tracegroups.
- tracexsl-sample4a.traceResult shows the results -- now far less verbose, since the identity template is not being traced. As you can see, this limited trace makes understanding what actually happened much easier.
Homework assignment: Can you guess what would happen if I set
tracegroups for this second pass as well? Try
it and see.
Now that you know how to style stylesheets, you could apply these same
techniques to other automatic enhancements. For example, you could write a
stylesheet that generates
<xsl:message> calls rather
than comments -- perhaps taking advantage of the techniques Uche Ogbuji has
previously described here on developerWorks (see Resources).
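A minimal sketch of that variation follows, assuming the same dummy tracexsl: namespace and <xsl:namespace-alias> setup used throughout this article. It is a fragment for the template-for-templates, not a complete stylesheet, and where the messages end up depends on your processor.

```xml
<!-- Sketch only: report through xsl:message instead of xsl:comment.
     After namespace aliasing, tracexsl:message becomes xsl:message
     in the generated stylesheet. -->
<tracexsl:message>
  <xsl:text>[TraceXSL Begin]</xsl:text>
  <xsl:if test="@match">
    <xsl:text> match="</xsl:text>
    <xsl:value-of select="@match"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
  <xsl:if test="@name">
    <xsl:text> name="</xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
  <xsl:if test="@mode">
    <xsl:text> mode="</xsl:text>
    <xsl:value-of select="@mode"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
</tracexsl:message>
```

One attraction of this form is that messages never become part of the result tree, so they can't land in places where a comment would be illegal.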
You could also replace the simple
tracegroups mechanism with
something more sophisticated. One interesting possibility might be to pass
in an XPath to be used to test whether a template should be traced, and
call the EXSLT Dynamic XPath Evaluation extension
(dyn:evaluate()) to execute it. The EXSLT library isn't
supported in all XSLT processors, but at least it's standardized enough
that if it is present its behavior should be predictable.
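Here is a rough sketch of how that might look. The tracetest parameter name is my own invention, and, as in the earlier demonstration, this version only lists the templates that would be selected. It guards the call with function-available() so that processors without EXSLT support fail with a clear message.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dyn="http://exslt.org/dynamic"
    exclude-result-prefixes="dyn">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: an XPath expression, passed as a string,
       evaluated with each xsl:template as the context node.
       The default, true(), selects every template. -->
  <xsl:param name="tracetest" select="'true()'"/>

  <!-- Input is the stylesheet to be traced. -->
  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="function-available('dyn:evaluate')">
        <xsl:for-each select="//xsl:template">
          <xsl:if test="dyn:evaluate($tracetest)">
            <xsl:text>would trace: </xsl:text>
            <xsl:value-of select="concat(@match, ' ', @name)"/>
            <xsl:text>&#10;</xsl:text>
          </xsl:if>
        </xsl:for-each>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">dyn:evaluate() is not available in this processor.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Setting tracetest to @mode, for example, would select only templates that declare a mode.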
You could also do a bit more cleanup. One of the things that makes the
<xsl:namespace-alias> approach a bit ugly is that
namespace declarations for the aliased prefix tend to appear where that
prefix is used -- which means they're scattered all over the document. It
would be really nice to have that prefix declared once at the top of the
file, on the generated
<xsl:stylesheet> directive. In
XSLT 1.0, that's rather ugly; the only way to explicitly
create a namespace node is to create a temporary element that uses it and
extract it from there.
Listing 9. Creating a temporary element in order to create a namespace node
<xsl:variable name="dummy">
  <xsl:element name="tracexsl:x"
               namespace="http://www.my.net/my markup"/>
</xsl:variable>
<xsl:copy-of select="xx:node-set($dummy)//namespace::*"/>
(XSLT 2.0 is expected to introduce a new directive,
<xsl:namespace>, which would be the elegant
solution.)
As I noted back when I started this project, you could also switch to using
<xsl:element> to explicitly generate the new stylesheet
elements with the standard
xsl: prefix rather than using
<xsl:namespace-alias>. Doing so might make your trace
tool a bit harder to write and maintain, and would mean you could no
longer see which directives were generated by the trace tool, but getting
cleaner output might be worth those costs.
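For comparison, here is a sketch of the begin-comment generator written that way. It is a fragment for the template-for-templates and handles only the match attribute; name and mode would follow the same pattern.

```xml
<!-- Sketch only: build the generated xsl:comment with xsl:element
     instead of an aliased literal result element. -->
<xsl:element name="xsl:comment"
             namespace="http://www.w3.org/1999/XSL/Transform">
  <xsl:text>[TraceXSL Begin]</xsl:text>
  <xsl:if test="@match">
    <xsl:text> match="</xsl:text>
    <xsl:value-of select="@match"/>
    <xsl:text>"</xsl:text>
  </xsl:if>
</xsl:element>
```

No <xsl:namespace-alias> is needed for elements produced this way, since the element is constructed at run time rather than written as a literal result element.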
Some users may want to analyze the trace information, perhaps counting how
often each template was invoked. Obviously, you could write a stylesheet
that scanned through the traced output, found the comments, and sorted and
counted them. If that's your main interest, you might want to change the
contents of the comments -- or perhaps even change them to some other kind
of annotation entirely -- to make them easier for your analyzer to
process. But if you don't need a portable solution, this sort of
measurement might be easier to write as a Xalan
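As a sketch of the comment-counting stylesheet described above, the following groups the begin-trace comments by their text and prints a count for each. It assumes the traced output is well-formed XML, that the comments start with "[TraceXSL Begin]" as in Listing 1, and that any source= path comes last in the comment so it can be trimmed off before grouping.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Index each begin-trace comment by its text, minus any trailing
       " source=..." part, so repeated invocations share one key.
       Appending ' source=' first guarantees substring-before()
       finds a match even when no source path is present. -->
  <xsl:key name="begin"
           match="comment()[starts-with(., '[TraceXSL Begin]')]"
           use="substring-before(concat(., ' source='), ' source=')"/>

  <xsl:template match="/">
    <xsl:for-each
        select="//comment()[starts-with(., '[TraceXSL Begin]')]">
      <xsl:sort
          select="substring-before(concat(., ' source='), ' source=')"/>
      <xsl:variable name="label"
          select="substring-before(concat(., ' source='), ' source=')"/>
      <!-- Muenchian grouping: report only the first comment
           in each group. -->
      <xsl:if test="generate-id() =
                    generate-id(key('begin', $label)[1])">
        <xsl:value-of select="count(key('begin', $label))"/>
        <xsl:text> x </xsl:text>
        <xsl:value-of select="$label"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```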
As I noted, producing a completely correct XPath (rather than a
pseudo-XPath approximation) would take a bit more work. In addition to the
namespace issue, note that the version I've shown here doesn't indicate
which source document it came from -- which could be important if your
stylesheet uses the
document() function or other secondary
source trees. I'm sure there are ways to close these gaps, but I've left
them as an exercise for the reader.
Of course, tracing is only one example of styling stylesheets. More
generally, this technique gives you yet another way to implement
extensions to the XSLT language. As you saw when I created the
tracexsl:trace attribute, you can invent new stylesheet
modifiers -- or entirely new directives -- that your
stylesheet-for-stylesheets will convert into the detailed XSLT operations
needed to achieve the desired effect. If you're careful to avoid
processor-specific features, you can make these solutions work with any
XSLT processor.
I'm sure you'll think of applications I haven't dreamed of. All it takes is the realization that XSLT isn't just a tool for producing pretty displays and printouts -- it's a very general document processor. Since stylesheets are themselves documents they can be preprocessed...and a compiler is, in some sense, just a very fancy preprocessor.
And it's all made possible because XSLT is itself an XML application.
Aren't general-purpose, standards-based tools wonderful?
- Download the source files for Part 2 of this article.
- For general advice on using the XSLT stylesheet language, one of the best places to look is the XSL User's mailing List, at http://www.mulberrytech.com/xsl/xsl-list/index.html. The mailing list's home page also has a link to Dave Pawson's XSLT Frequently Asked Questions (FAQ) Web site, which collects many of the most useful answers.
- For information about the open-source Xalan XSLT processor, which I used to develop and test the examples in this article, see Apache's Web site at http://xml.apache.org. The best places to ask questions about using this specific processor would be the Xalan-J and Xalan-C users' mailing lists; you can find out about them at http://xml.apache.org/mail.html. If you want to get involved in Xalan's development, try the Xalan-Dev mailing list, found at the same place.
- For the official definition of XSLT, including
<xsl:namespace-alias>, check out the XSLT Recommendation on the W3C's Web site.
- The W3C also has an XSL Home Page, which has links not only to the Recommendation but to related information such as the XPath and XSL-FO specifications, lists of software that supports these standards, pointers to interesting articles about XSL and other resources (including most of the ones I've mentioned here), as well as the Working Drafts (WDs) which describe proposed future versions of these tools. Yes, W3C specifications are often a bit hard to read -- both because they were written by experts for experts, and because many expert programmers really don't write very well -- but if you need the Official Word on exactly what XSLT should be doing in any particular case, this is where you'll find it.
- Find out more about the
setParameter("tracegroups","fred,ginger") call at the Apache XML Project site.
- For an interesting example of using XSLT as a code compiler, take a look at the DOM Test Suite now being developed, http://www.w3.org/DOM/Test/. Because the DOM is available in multiple languages and bindings, they chose to write the test suite using an abstract XML-based meta-language, and to use XSLT stylesheets to turn that into executable code. A test case written once in the meta-language can be compiled and executed in any language they have a stylesheet for, ensuring that the same tests are applied everywhere. (I've experimented with some similar code generation myself, but in my case I had the stylesheet produce BML -- IBM's Bean Markup Language -- and then let the BML tools do the work of turning it into executing Java code.)
- And of course don't forget to check right here on IBM's developerWorks XML Zone for a wide variety of articles, tutorials, tips and tools. Two articles by Uche Ogbuji cover concepts mentioned in this article: "Debug XSLT on the fly" (November 2002).