Expand XSL with extensions

Technique helps expand the capabilities of XSL's core features

Jared Jackson, Research Associate, IBM, Software Group
Jared Jackson is a Researcher at IBM's Almaden Research Center. He works in the area of Web-based technologies.

Summary:  The combined power of XML and XSL for representing, manipulating, and presenting data over the Web and sharing data across differing applications has been clearly demonstrated through the fast acceptance and broad usage of these technologies. Still, most developers familiar with the basics of XML and XSL are not utilizing this power fully. This article shows developers how to use extensions, a technique that allows you to expand the capabilities of XSL.

Date:  01 Apr 2002
Level:  Advanced
Also available in:   Japanese


In terms of both power and simplicity, the combination of XML and XSL has revolutionized data storage and manipulation in a way not seen since the early days of the SQL database language. XML provides a clear and independent way of recording data that is easily shared and understood. Similarly, many people feel that XSL is also easy to read, write, and understand. Clearly, this powerful duo is essential knowledge for everyone involved in the technology industry.

The broad scope and small learning curve associated with the basic elements of XSL transformation sometimes act as a double-edged sword -- yielding broad usage of the core technology but dissuading the majority of developers learning XSL from investigating and using its more advanced and powerful features.

This article is written for developers who already have a basic understanding of XML and XSL, and are ready to build on this knowledge. If you are unfamiliar with these technologies, you can find several good introductory articles and tutorials on developerWorks and other Web sites. The article shows you how to use extensions -- a technique present in most XSL processors -- which allows virtually unlimited expansion of the existing capabilities of XSL's core features. This article includes a general description of how to write extensions with code, followed by three specific and widely applicable examples.

What are XSL extensions?

It must first be understood that XSL, like all other programming languages, is merely a grammar specification in need of an implementation. Fortunately, XSL has become very popular and there are several implementations to choose from. Extensions are not a required feature of the grammar and, thus, their syntax is not as well defined as the other constructs of the language. They are, however, now included in the W3C's XSLT Recommendation. The examples in this article will follow the format of that recommendation.

Simply put, extensions are a way of calling a method written in some other programming language from within an XSL document. Usually, the extension methods are written in the same language as that of the XSL processor. There are exceptions to this rule: Java, for example, can be made to run programs in other languages such as JavaScript or Perl. Thus it is possible to write XSL extensions in JavaScript, Perl, or some other language and make use of them through a Java-based XSL processor.

What makes these extensions so significant when XSL can already do so much? What XSL gains in simplicity and broad ability for transformation is often lost in efficiency and in the ability to do anything unrelated to transformation. For instance, suppose you have an XML document that lists 5,000 users of your system. The user name, real name, and e-mail address of each of these users are given under a Users node within the XML. You later append to the XML document an Interests node, in a separate subtree of the XML, with user names grouped by particular interests such as acrobatics, bicycling, and computers. You hope eventually to transform the data into an HTML page that groups users by interests and presents e-mail contacts for people of similar interests. XSL can do this handily with the following code:


Listing 1. User interest XSL transformation without extensions

<xsl:for-each select="Interests/Interest">
  <b><xsl:value-of select="@name"/></b>
  <ul>
    <xsl:for-each select="User">
      <xsl:variable name="userName" select="@userName"/>
      <xsl:variable name="userNode"
                    select="/Root/Users/User[@userName = $userName]"/>
      <li>
        <xsl:value-of select="$userNode/@realName"/>
        <xsl:value-of select="concat(' ', $userNode/@email)"/>
      </li>
    </xsl:for-each>
  </ul>
</xsl:for-each>

Unfortunately, the way the transform executes, the entire list of 5,000 users will be examined for each user in each interest category. This is far more work than you want your server to do for each request to this Web page.

Extensions provide a convenient way around this and several other possible hang-ups that you may encounter when using XSL on nontrivial data sets. In the above example, a simple hashmap or binary search tree could have easily solved the problem, but implementing one of these data structures in XSL would be inconvenient and unnecessary. Extending into a language that has more appropriate data types fixes the problem more easily. (Incidentally, the code for this fix is given in the first example below.)


Technologies used in this article

It would be a daunting task to list all of the XSL processors and their methods for implementing extensions. This article uses the Java version of Xalan -- a popular and freely available XSL processor from the Apache Project -- to describe the specifics of writing extensions. All of the examples are targeted to that platform. (Xerces, another Apache product, is used as the XML parser. You can download Xalan and Xerces from links in Resources.) Most other popular XSL implementations also provide a mechanism for extensions, but you'll need to consult their documentation to find any differences in approach.

To simplify working with XML and XSL, I have also provided Java code for some of the more common XML manipulations. This code, along with the code and data necessary to run all of the examples, is provided in a zip file in Resources. This file does not, however, include external libraries such as Xalan and Xerces. After you obtain those libraries by following links in Resources (versions: Xalan - Java 2.3.1; Xerces 1.4.4), place their jar files in the lib directory extracted from the zip file. For those readers who wish to jump directly to the examples, all Java code is in the src directory, XML data in the XML directory, XSL transforms in the XSL directory, batch files in the bin directory, and compiled code in the lib directory.


Creating an extension

In order to call a method from XSL, that method must first be written and its compiled form placed in the classpath of the application that is performing the XSL transformation. Methods may be of your own design, supplied by the standard libraries of Java, or taken from other Java libraries. In some XSL processors, like Xalan, there are even extension methods written directly into the processor.
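As a rough illustration of "the application that is performing the XSL transformation," the following sketch runs a stylesheet through the standard JAXP API (javax.xml.transform), which Xalan implements. It is not the helper code from the article's zip file; the class name TransformRunner and its method are hypothetical, and which processor JAXP actually selects depends on what is on your classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper (not from the article's zip file): applies an XSL
// stylesheet to an XML document, both supplied as strings.
class TransformRunner {

    public static String transform(String xml, String xsl) {
        try {
            // JAXP picks whichever XSL processor is configured or on the classpath.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(xsl)));

            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrap the checked exception so callers can stay simple.
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

Any extension classes that the stylesheet refers to would need to be on the classpath of the program that calls a method like this one.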

The first thing to be aware of when you write or use these methods is the mapping of data types from XSL to Java and back again. The following table provides a reference to these mappings in Xalan.

Tables 1 and 2. Data type mappings

Table 1. Parameter mapping (XSLT to Java)

XSLT type               Java type
Node Set                org.w3c.dom.traversal.NodeIterator
String                  java.lang.String
Boolean                 java.lang.Boolean
Number                  java.lang.Double
Result Tree Fragment    org.w3c.dom.DocumentFragment

Table 2. Return type mapping (Java to XSLT)

Java type                             XSLT type
org.w3c.dom.traversal.NodeIterator    Node Set
org.apache.xml.dtm.DTM                Node Set
org.apache.xml.dtm.DTMAxisIterator    Node Set
org.apache.xml.dtm.DTMIterator        Node Set
org.w3c.dom.Node                      Node Set
java.lang.String                      String
java.lang.Boolean                     Boolean
java.lang.Number                      Number
org.w3c.dom.DocumentFragment          Result Tree Fragment
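To make the mappings more concrete, here is a minimal sketch of two methods whose signatures follow the tables: a Node Set parameter arrives as a NodeIterator, a Double result maps back to an XSLT Number, and a String maps to a String. The class NodeSetSketch and all of its method names are hypothetical. The textNodesOf helper exists only so the sketch can be exercised without an XSL processor.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical extension class. For use from a stylesheet it would
// likely need to be declared public in its own source file.
class NodeSetSketch {

    // Node Set in, Number out: counts the nodes in the node set.
    public static Double countNodes(NodeIterator nodes) {
        int count = 0;
        while (nodes.nextNode() != null) {
            count++;
        }
        return Double.valueOf(count);
    }

    // Node Set and String in, String out: joins the text of each node.
    public static String joinText(NodeIterator nodes, String separator) {
        StringBuilder joined = new StringBuilder();
        Node node;
        while ((node = nodes.nextNode()) != null) {
            if (joined.length() > 0) {
                joined.append(separator);
            }
            joined.append(node.getTextContent());
        }
        return joined.toString();
    }

    // Demo helper, not an extension method: iterates over the text nodes
    // of an XML string. Assumes the DOM implementation supports the
    // traversal module, as the JDK's bundled parser does.
    static NodeIterator textNodesOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return ((DocumentTraversal) doc).createNodeIterator(
                    doc.getDocumentElement(), NodeFilter.SHOW_TEXT, null, false);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }
}
```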

Once your methods are written, incorporating them into XSL is fairly simple. The first step is to declare a namespace for your methods in the <xsl:stylesheet> element. For example, if you want to run methods from a class called foo in package com.myCompany.XSLExtensions, the root of your XSL file would contain the following line:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:extension="xalan://com.myCompany.XSLExtensions.foo">

If you later want to call a method from the class you have declared, use the namespace declared in the <xsl:stylesheet> element. Continuing the example, in order to run a method called bar() that takes a String as a parameter and returns a String, you might use code like the following:

<xsl:variable name="myParam" select="'theParameter'"/>
<xsl:variable name="myResult" select="extension:bar($myParam)"/>

It's that simple. The myResult variable now contains the result of calling bar from your Java class. To obtain a better grasp on the technique, work through the following three examples.
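For reference, the Java side of the foo and bar() example above might look something like the following sketch. The method body is invented purely for illustration, and the method is assumed to be static so that no instance of the class is needed. In the article's scenario the class would live in the package com.myCompany.XSLExtensions; the package line is omitted here so that the sketch compiles on its own.

```java
// Sketch of the extension class from the example above.
// Assumptions: the method is static, and its behavior is made up.
// For real use the class would be declared public, in its own file,
// under the package com.myCompany.XSLExtensions.
class foo {

    // Takes an XSLT String and returns an XSLT String.
    public static String bar(String param) {
        return "bar received: " + param;
    }
}
```

With this class on the classpath, the myResult variable in the XSL fragment above would hold the string "bar received: theParameter".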


Example 1: Lookup tables

The beginning of this article presented a scenario in which the use of standard XSL techniques for looking up data in distinct subtrees of an XML document used excessive amounts of compute time. A simple way around this is to create a general purpose hashtable that provides a mechanism for storing and retrieving strings. Since hashtables are built directly into the standard Java libraries, writing an extension that uses them should be painless.

The hashtable Java code is found in the src/StringHash.java file contained in the zip file in Resources. It has two methods of note:

  1. addString(String tableName, String key, String value)
  2. getString(String tableName, String key)

The first method allows the creation of hashtables associated with a table name plus the insertion of string values mapped to a key. The second method provides a means for retrieving the stored values.
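The actual source is in src/StringHash.java and is not reproduced in this article. As a rough idea of how two methods with these signatures could be implemented, here is a minimal sketch under a few assumptions: the methods are static, addString returns an empty string (so that calling it from xsl:value-of adds nothing to the output), and getString returns an empty string for an unknown table or key. The class name StringHashSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// A sketch only, not the author's StringHash class. For real use the
// class would likely need to be public and in its own source file.
class StringHashSketch {

    // Table name -> (key -> value). Shared across calls in one JVM.
    private static final Map<String, Map<String, String>> TABLES = new HashMap<>();

    // Creates the named table on first use and stores value under key.
    // Returns "" (an assumption) so xsl:value-of emits nothing.
    public static synchronized String addString(String tableName, String key, String value) {
        TABLES.computeIfAbsent(tableName, name -> new HashMap<>()).put(key, value);
        return "";
    }

    // Looks up key in the named table; returns "" if either is missing.
    public static synchronized String getString(String tableName, String key) {
        Map<String, String> table = TABLES.get(tableName);
        if (table == null) {
            return "";
        }
        String value = table.get(key);
        return value == null ? "" : value;
    }
}
```

Each lookup is a constant-time hash access on average, which is what removes the repeated scan of the full user list described at the beginning of the article. Because the tables are static, they persist between transformations in the same JVM; a longer-lived application might want a way to clear them.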

An XML data source is found in the XML/user_interests.xml file (see the zip file in Resources). It follows the form:


Listing 2. User interest XML fragment

<Users>
  <User userName="aragon" realName="Aragon" 
	email="aragon@middleEarth.fict"/>
  <User userName="boromir" realName="Boromir" 
	email="boromir@middleEarth.fict"/>
  ...
</Users>
<Interests>
  <Interest name="archery">
    <User userName="legolas"/>
    ...
  </Interest>
  ...
</Interests>

Two XSL files are given in the zip file in Resources for producing the Web page result. The first is found in the XSL/user_interests_xsl_only.xsl file and follows the code shown in Listing 1. The second is found in the XSL/user_interests_extensions.xsl file which modifies the former XSL file to the code shown in Listing 3. To easily run the XSL conversion on Windows, use the bin/Example_1*.bat batch files. Unix and Mac developers should have little trouble running the examples after examining these batch files.


Listing 3. User interest XSL transformation with extensions

<xsl:stylesheet xmlns:lookup="xalan://StringHash">
...
<xsl:for-each select="Users/User">
  <xsl:value-of select="lookup:addString('realName', string(@userName), 
	string(@realName))"/>
  <xsl:value-of select="lookup:addString('email', string(@userName), 
	string(@email))"/>
</xsl:for-each>
...
<li>
  <xsl:value-of select="lookup:getString('realName',$userName)"/>
  <xsl:value-of select="concat(' - ',lookup:getString('email',
	$userName))"/>
</li>


Example 2: Regular expressions

The current XSL standard uses the XPath technology to perform all of its pattern matching. While XPath provides a compact and elegant way of traversing an XML tree, its pattern matching functions have a rather limited capability. (The only string functions in XPath 1.0 that perform boolean matching are starts-with() and contains(). You can also automatically parse strings into numbers.) Regular expressions provide much richer pattern matching across strings of text, but are not as easy to use as XPath when traversing a data structure such as an XML tree. For more detailed information on regular expressions, see Resources.

The optimum solution is to combine the two technologies. The next version of the XSL transformation language, which is still under development and review, includes a proposal to add regular expressions to the language. For developers who want to use the technology now, extensions provide the mechanism for doing so.

The source code for the Java methods accessed as extensions can be found in the src/PatternMatcher.java file contained in the zip file accompanying this article. These methods make use of external code that is not contained within the standard Java libraries, so this example also shows what steps are necessary to link external jar files for use in extensions. You will need to obtain the regular expression jar file provided by GNU (see Resources) and place it in the extracted lib directory in order to get the examples to work. Feel free to find another regular expression package and modify the code to fit it.

For the second example, suppose you wish to generate, from the original source, a list of only those users whose first and last names are both known. While this is a fairly trivial example, it is not difficult to imagine more complicated examples working on groups of users, product catalogs, or reference databases. A simple way to do this is to look through the real names of the users and match those names that consist of one name followed by a space followed by another name. The regular expression used for this is \w* \w*. The XSL now contains the lines in Listing 4.


Listing 4. Regular expressions in XSL

<xsl:stylesheet xmlns:regexp="xalan://PatternMatcher">
...
<ul>
  <xsl:for-each select="Users/User[regexp:containsMatch('\w* \w*',
	string(@realName))]">
    <li>
      <xsl:value-of select="@realName"/>
    </li>
  </xsl:for-each>
</ul>

Similar to Example 1, this example can be executed through the bin/Example_2.bat file. You can find the XSL file used at XSL/user_last_names.xsl. The possibilities for extending this technique are virtually unlimited.
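The article's PatternMatcher class relies on the GNU regular expression package and is not shown here. As a stand-in, the following sketch implements a method with the same name and argument order as the containsMatch call in Listing 4, but it uses java.util.regex from the standard Java library instead. That is a swapped-in technique, not the author's code, and the class name PatternMatcherSketch is hypothetical.

```java
import java.util.regex.Pattern;

// A sketch only: uses java.util.regex rather than the GNU package
// that the article's PatternMatcher.java depends on.
class PatternMatcherSketch {

    // Returns true if any part of text matches the regular expression.
    // Boolean maps back to an XSLT Boolean, so the call can be used
    // directly inside an XPath predicate.
    public static Boolean containsMatch(String regex, String text) {
        return Boolean.valueOf(Pattern.compile(regex).matcher(text).find());
    }
}
```

Note that because \w* can match zero characters, the pattern \w* \w* matches any string that contains a space. A stricter pattern such as \w+ \w+ would require at least one word character on each side.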


Example 3: Internationalization

Internationalization, sometimes referred to as localization or natural language support, is the method by which developers make their products readable across languages and cultures. It is particularly important in the context of XML transformation if the product of the transformation is a set of Web pages that targets a broad audience. While the topic of internationalization is too broad to introduce in a comprehensive way in the context of this example, you can find good treatment of it in other developerWorks articles referenced below.

This example makes use of Java's built-in technique of handling internationalization through the use of resource bundles. If you are unfamiliar with the topic, I encourage you to read the referenced articles. Suffice it to say for now that resource bundles consist of a collection of files that contain translations for different regions or, more precisely, locales. Web servers can read the preferred locale of a user when that user requests a Web page and, using these resource bundles, can respond appropriately. XML-based applications can also target results to a specific locale.

The potential uses of the code in this example are just as wide and varied as the previous one. In order to demonstrate the technology, the code executed by the bin/Example_3.bat file creates three Web pages from the sample XML users data. The three resulting pages represent the same view of the data, but are presented in three different languages. The translations used can be found in properties files in the lib directory extracted from the zip file.
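The extension code for this example is in the zip file and is not described here in detail. As a rough idea of what a resource-bundle lookup callable from XSL could look like, consider the following sketch. Everything in it is an assumption made for illustration: the class name TranslatorSketch, the method name and parameters, and the choice to fall back to the key itself when no translation is found.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// A sketch only, not the code from the article's zip file.
class TranslatorSketch {

    // Looks up key in the resource bundle named bundleName for the locale
    // given as a language tag such as "fr" or "de-DE". Bundles are found
    // on the classpath, for example Messages_fr.properties.
    // Falls back to the key itself (an assumption) if nothing is found.
    public static String translate(String bundleName, String languageTag, String key) {
        try {
            Locale locale = Locale.forLanguageTag(languageTag);
            return ResourceBundle.getBundle(bundleName, locale).getString(key);
        } catch (MissingResourceException e) {
            return key;
        }
    }
}
```

A stylesheet could then pass the target language in as a parameter and call such a method wherever it would otherwise emit fixed text.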


Conclusion

Even when considering only the most basic components of XSL transformations, their capabilities are remarkable. When this core is extended to encompass the power of modern programming languages, the possibilities become virtually limitless. The ideas and examples presented above are but the tip of the iceberg, and I leave it to you, after gaining an understanding of what is presented here, to explore the many remaining possibilities.



Download

Description                     Name                               Size       Download method
Source code for this article    x-callbk/XSL_Callbacks_Code.zip    1588 KB    HTTP


Resources
