Extensible Markup Language (XML) has become a popular means to represent data. One of the fastest growing uses of XML is within various business environments. Business applications use XML to represent data shared within the bounds of a business application, between business applications, and between businesses. A necessity for making use of the data housed in XML documents is the ability to access and manipulate the data to fit the needs of the business application or end user of the data. Extensible Stylesheet Language (XSL) provides facilities to access and manipulate the data in XML documents.
XSL is itself an XML dialect and provides two distinct and useful mechanisms for handling and manipulating XML documents. Many of the same constructs are shared between the two mechanisms, but each plays a distinct role. One is concerned with formatting data, and the other is concerned with data transformation. When XSL is used as a formatting language, the stylesheets consist of formatting objects that prepare an XML document for presentation, usually in a browser.
When XSL is used for transformation, XSL takes the form of Extensible Stylesheet Language Transformation (XSLT). An XSLT stylesheet is composed of template rules that match specific portions of an XML document and allow the transformation of the XML document content. Not only can XSLT transform an XML document from one dialect to another (often HTML), but it provides many other capabilities for extracting data from an XML document and manipulating that data. This article focuses on XSLT and demonstrates extraction and manipulation capabilities through the use of example stylesheets.
XSL is of interest to any developer who needs to access and manipulate XML documents. XSL is robust, yet easy to learn. Developers can focus on the problem they are solving and the XML document content.
Let's say your company receives an XML document (an order) from a supplier that your company wants to do business with. Your company already has its own XML document order format, which is different from your supplier's. Your XML document format is used by your company's business application as an input to the order entry process. To keep life simple, your company can create an XSL stylesheet to transform the incoming order into your own XML document format. Now, your company can process the order and do business with your supplier without manually processing the order.
Before we launch into the examples, let's get a better understanding of XSLT. An XSLT document, referred to as a stylesheet, consists of a series of template rules. Each template rule matches against elements, attributes, or both within the target XML document. The basic construct for a template rule is shown below:
<xsl:template match="pattern">
... rule body...
</xsl:template>
The template rule has a start tag (<xsl:template>) and an end tag (</xsl:template>). Normally, each template start tag has a match attribute that specifies the portion of the input XML document that the template rule is intended to match against.
A template rule body can consist of:
- More detailed selection or match conditions and other logic
- A specific type of action or actions to be performed
- Text that becomes part of the results along with the selected target XML document's content
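As a minimal sketch of these three kinds of content, the hypothetical rule below (it is not one of this article's examples) matches the Name element of the AddressBook.xml document introduced later in this article. It assumes that Name carries an optional title attribute and has a LastName child.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Hypothetical rule, for illustration only -->
  <xsl:template match="Name">
    <!-- Literal markup and text that become part of the results -->
    <p>Contact:
      <!-- Further selection logic: output the title only if it is present -->
      <xsl:if test="@title">
        <xsl:value-of select="@title"/>
        <xsl:text> </xsl:text>
      </xsl:if>
      <!-- Action: copy the content of a child element into the results -->
      <xsl:value-of select="LastName"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```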
This article contains examples that show stylesheets containing template rules demonstrating different instructions and different types of data for the output stream.
XSLT does not work alone. An XSL processor performs the matching between the XML document and the stylesheet. The processor used in these examples is the Lotus XSL processor from IBM alphaWorks. It matches the various portions of the XML document against the template rules in the XSLT stylesheet. The steps in Figure 1 give a simple example of how the matching process is performed.
Figure 1. XSL processing flow

An XML document and an XSL stylesheet are input to the XSL processor. Let's look at the steps in Figure 1 in more detail:
- Match template patterns. As the XML document content is accessed (element by element), the match attribute for each template rule is compared against that portion (element and its children) of the XML document. The XSL processor accesses the XML document from top to bottom, so the matching process is sequential.
- Determine correct template. The XSL processor selects a template rule whose pattern matches the current portion of the XML document. The pattern for a match attribute can be very specific, giving the path to a particular element or attribute, or more general, allowing matches to any occurrence of an element or attribute within the XML document regardless of its parentage. The determination process takes these factors into account.
- Create results for output. The XSL processor deals with the template rule. Depending upon the rule body data, the literal data, XML document content, or both may be put into the output stream or other actions can take place. A result tree is created containing the results of the rule processing. As each rule is processed, information may be added to the results tree.
- Any more templates? After checking other templates that need to be processed, the XSL processor continues processing or outputs the results tree and ends the execution.
Much more is going on within the processor to handle the XML document, the XSL stylesheet, and the resulting document than is discussed here. However, these steps provide a conceptual understanding of the major activities performed. When the processing completes, the processor creates the output, such as an XML, HTML, or some other file type.
Probably the hardest and most interesting aspect of creating stylesheets is defining the patterns associated with the match attribute of the xsl:template start tag. The patterns may be difficult to define, because some XML documents have very complex element relationships and element hierarchies. The elements within an XML document are hierarchically associated with each other. The first element within the XML document is the root element; all elements beneath the root are its descendants. Listing 1 shows the skeleton of AddressBook.xml, an XML document used in some of the examples, and the parentage path for its elements. The parentage makes up the pattern for the template rule's match attribute. XPath, a language for addressing parts of an XML document, lets the processor trace the parentage of each node (element or attribute) within the XML document. The specification for XPath is provided by the World Wide Web Consortium (W3C).
Listing 1. Skeleton AddressBook.xml
<AddressBook>..root
<AddressEntry>..AddressBook AddressEntry
//** every entry has AddressBook and AddressEntry as their parent
//** so they are not included
<Name title= >.. Name
.. Name title
<FirstName></FirstName>.. Name FirstName
<MiddleInitial></MiddleInitial>.. Name MiddleInitial
<LastName></LastName>.. Name LastName
</Name>
<Address>.. Address
<PostalAddress>.. Address PostalAddress
<Street></Street>.. Address PostalAddress Street
<City></City>.. Address PostalAddress City
<State></State>.. Address PostalAddress State
<PostalCode></PostalCode>.. Address PostalAddress PostalCode
<Country></Country>.. Address PostalAddress Country
</PostalAddress>
<eMail></eMail>.. Address eMail
<Phone></Phone>.. Address Phone
</Address>
</AddressEntry>
</AddressBook>
Listing 1 shows the parentage for each element in the AddressBook.xml document; one attribute (title) is included. AddressBook is the root, and AddressEntry groups the address information. AddressBook and AddressEntry are ancestors of all the other elements within this XML document. To simplify the listing, the AddressBook and AddressEntry elements are not repeated for each element, but be aware that they are the first two ancestors of all the other elements. A rule's match attribute can explicitly state the complete parentage, or it can bypass levels by using double slashes (//) in place of the intermediate levels. Alternatively, the rule can match a parent element and access the sub-elements within the body of the template rule.
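The three approaches can be sketched as follows. This stylesheet is only an illustration against the element names in Listing 1; the choice of which element uses which approach is arbitrary and is not taken from the article's examples.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- 1. Complete parentage: every ancestor is named in the pattern -->
  <xsl:template match="AddressBook/AddressEntry/Address/PostalAddress/Street">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- 2. Double slash: any City below an Address, at any depth -->
  <xsl:template match="Address//City">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- 3. Match the parent, then reach the sub-elements from the rule body -->
  <xsl:template match="Name">
    <xsl:value-of select="FirstName"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="LastName"/>
  </xsl:template>

</xsl:stylesheet>
```

The first form is the most restrictive and the easiest to reason about; the second keeps working if intermediate levels are added to the document later.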
Although they were not used in building these example stylesheets, the IBM alphaWorks Web site provides previews of potential tools to aid in building and testing the template rule patterns.
Listing 2 contains one entry from the AddressBook.xml document used in Examples 1A and 1B. At the end of Listing 2 is a summary of the other AddressBook.xml entries.
Listing 2. AddressBook.xml
<?xml version="1.0" encoding="UTF-8"?>
<AddressBook>
<AddressEntry>
<Name title="Mr.">
<FirstName>Jim</FirstName>
<MiddleInitial>E</MiddleInitial>
<LastName>Waton</LastName>
</Name>
<Address>
<PostalAddress>
<Street>123 Main St.</Street>
<City>MyTown</City>
<State>MN</State>
<PostalCode>55489</PostalCode>
<Country>US</Country>
</PostalAddress>
<Phone>334-6565</Phone>
<eMail>jewat@xyz.com</eMail>
</Address>
</AddressEntry>
</AddressBook>
-------Other Address Book Entries Summary -------
AddressEntry 2 Miss Betsy A Ross (complete name)
433 Flag St. (street)
HayTown, MN 56321 US (city, state zipcode country)
356-4377 (phone number)
baross@xyz.com (email)
AddressEntry 3 Mr. Bob T. Brown
210 B Ave A
Town, MN 58431 US
343-8812
bbrown@xyz.com
AddressEntry 4 Mr. Ben B. King
814 2nd St.
MyTown, MN 55489 US
334-8430
bbking@xyz.com
The XSL stylesheet in Example 1A creates a table containing a subset of the information in the AddressBook.xml document. The example has template rules that extract the desired information and format it into an HTML document for display in a browser. Example 1B is a variation of Example 1A that produces a similar result, but it embeds XSL elements directly in an HTML document instead of using template rules in an XSLT stylesheet (as Example 1A does).
For Example 1A, the input to the XSL processor is the AddressBook.xml document from Listing 2 and the XSLT stylesheet shown in Listing 3. An XSLT stylesheet normally starts with an XML declaration identifying the stylesheet as an XML document. The second statement, the xsl:stylesheet start tag, designates the document as an XSLT stylesheet, declares the XSLT namespace, and gives the version information. The URI in the xmlns:xsl attribute is a namespace name rather than the location of a Document Type Definition (DTD); it identifies the elements prefixed with xsl: as XSLT instructions. Details of the XSLT specification can be found at the W3C Web site. The stylesheet contains 11 template rules. Each template rule contains a match attribute identifying specific elements within the XML document to be processed.
Listing 3. Example 1A XSL stylesheet
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Rule 1 -->
  <xsl:template match="/">
    <HTML>
      <HEAD><TITLE>Address Book</TITLE>
      </HEAD>
      <BODY>
        <TABLE border="1">
          <tr> <TH>Name</TH> <TH>Street</TH>
               <TH>City</TH> <TH>PCode</TH>
               <TH>Phone</TH> <TH>eMail</TH> </tr>
          <xsl:apply-templates/>
        </TABLE>
      </BODY>
    </HTML>
  </xsl:template>
  <!-- Rule 2 -->
  <xsl:template match="AddressEntry">
    <tr><xsl:apply-templates/> </tr>
  </xsl:template>
  <!-- Rule 3 -->
  <xsl:template match="Name">
    <td>
      <xsl:apply-templates select="LastName"/>,
      <xsl:apply-templates select="FirstName"/>
    </td>
  </xsl:template>
  <!-- Rule 4 -->
  <xsl:template match="Address/PostalAddress">
    <xsl:apply-templates select="Street"/>
    <xsl:apply-templates select="City"/>
    <xsl:apply-templates select="PostalCode"/>
  </xsl:template>
  <!-- Rule 5 -->
  <xsl:template match="FirstName">
    <xsl:value-of select="."/>
  </xsl:template>
  <!-- Rule 6 -->
  <xsl:template match="LastName">
    <xsl:value-of select="."/>
  </xsl:template>
  <!-- Rule 7 -->
  <xsl:template match="Street">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
  <!-- Rule 8 -->
  <xsl:template match="City">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
  <!-- Rule 9 -->
  <xsl:template match="PostalCode">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
  <!-- Rule 10 -->
  <xsl:template match="Address/Phone">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
  <!-- Rule 11 -->
  <xsl:template match="Address/eMail">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
</xsl:stylesheet>
Before looking at each rule, let's spend a moment on filtering. Filtering is a by-product of the matching process, because the portions of the XML document that do not match against template rules or are not processed within a rule body are not output to the results tree. This is a convenient way of eliminating portions of an input XML document without being required to include specific template rules in the stylesheet.
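Filtering can also be made explicit. The sketch below is an assumption about how one might do this and is not part of the article's examples: a template rule with an empty body matches an element and writes nothing for it. This is useful because the built-in template rules of XSLT 1.0 copy the text of elements that are reached by <xsl:apply-templates/> but have no rule of their own. The Country element is used here only as an illustration; in practice a rule like this would sit alongside the other rules of a stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Empty rule body: Country is matched, but nothing is written for it -->
  <xsl:template match="Country"/>
</xsl:stylesheet>
```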
In the example, each template rule is preceded by a comment that identifies the rule by number. Rule 1 matches the root of the XML document, as designated by match="/"; the root is the node that contains the AddressBook element. Rule 1 outputs various HTML tags, including the beginning HTML, heading, body, table definition, and table heading tags. The <xsl:apply-templates/> statement within the rule's body tells the processor to look for other rules to process before outputting the various end tags, including the end table, end body, and end HTML tags.
Rule 2 matches an AddressEntry. For each AddressEntry element, a table row is created, delimited by the <tr> and </tr> tags. Between the table row tags is another <xsl:apply-templates/> statement telling the processor to look for more template rules before outputting the </tr>. Rule 3 matches the Name element, creates a table detail entry, and relies on other template rules (Rules 5 and 6) to output the FirstName and LastName elements. Rule 3 builds this relationship between itself and Rules 5 and 6 by including <xsl:apply-templates select="elementName"/> elements. In each case, the select attribute names the child element to process, and the processor applies the rule that matches that element.
Rules 5 and 6 (and many other rules within this stylesheet) contain the <xsl:value-of select="."/> element, which places the content of the matched element into the results. Rule 4 matches a PostalAddress element whose parent is Address and, like Rule 3, requests the application of specific template rules (Rules 7, 8, and 9) to its Street, City, and PostalCode sub-elements. This approach allows an added degree of control over the processing. Note that Rules 7 through 11 all output table detail tags and the content of a specific element. The results of the processing are shown in Figure 2.
Figure 2. Example 1A results
| Name | Street | City | PCode | Phone | eMail |
| Waton, Jim | 123 Main St. | MyTown | 55489 | 334-6565 | jewat@xyz.com |
| Ross, Betsy | 433 Flag St. | HayTown | 56321 | 356-4377 | baross@xyz.com |
| Brown, Bob | 210 B Ave A | Town | 58431 | 343-8812 | bbrown@xyz.com |
| King, Ben | 814 2nd St. | MyTown | 55489 | 334-8430 | bbking@xyz.com |
Example 1B - XSL Stylesheet wrapped in HTML
As shown in Listing 4, the stylesheet for Example 1B accomplishes the same thing as Example 1A (minus the e-mail address), but wraps the XSL in HTML. The xsl:version attribute and the XSLT namespace declaration on the html element let the processor treat the HTML document as a stylesheet. By doing this, the stylesheet shrinks in size and complexity. The selection is performed inside an <xsl:for-each> element. An xsl:for-each processes each element specified in its select attribute. For each AddressEntry, an HTML table row is created in the results. All general HTML and table statements are specified before and after the xsl:for-each element. The xsl:template element is not needed in this example. This stylesheet leaves the e-mail address out of the generated output to show a variation between the stylesheets.
Listing 4. Example 1B HTML wrapped XSL
<html xsl:version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
lang="en">
<head>
<title>Address Book for Home</title>
</head>
<body> <table border="1">
<tr><TH>Name</TH><TH>Street</TH>
<TH>City</TH><TH>PCode</TH><TH>Phone</TH>
</tr>
<!-- ***** the xsl:for-each statement ***** -->
<xsl:for-each select="AddressBook/AddressEntry">
<tr> <td>
<xsl:value-of select="Name/LastName"/>,
<xsl:value-of select="Name/FirstName"/>
</td> <td>
<xsl:value-of select="Address//Street"/>
</td> <td>
<xsl:value-of select="Address//City"/>
</td> <td>
<xsl:value-of
select="Address/PostalAddress/PostalCode"/>
</td> <td>
<xsl:value-of select="Address/Phone"/>
</td> </tr>
</xsl:for-each>
</table>
</body>
</html>
The xsl:for-each element contains several xsl:value-of elements. The select attribute of each xsl:value-of explicitly identifies the element whose content will be put into the results. Note that the double slash is used in two of the select attributes (Address//Street and Address//City). This avoids naming the intermediate parent, PostalAddress. Other xsl:value-of elements have select attributes containing the complete path from AddressEntry, such as Address/PostalAddress/PostalCode. Either approach can be used in this example. Figure 3 shows the results of the processing.
Figure 3. Example 1B results
| Name | Street | City | PCode | Phone |
| Waton, Jim | 123 Main St. | MyTown | 55489 | 334-6565 |
| Ross, Betsy | 433 Flag St. | HayTown | 56321 | 356-4377 |
| Brown, Bob | 210 B Ave A | Town | 58431 | 343-8812 |
| King, Ben | 814 2nd St. | MyTown | 55489 | 334-8430 |
Example 2 - Converting to another XML
In Example 2, the XSLT stylesheet is used to convert the AddressBook.xml into a PhoneBook.xml document. Often, examples show XML transformed into HTML. However, many developers also need to transform XML to other XML dialects. PhoneBook.xml contains the data found in a conventional telephone book. The XML tags for the PhoneBook replace the HTML tags normally seen.
Listing 5 includes a representative subset of the complete stylesheet. This stylesheet has template rules similar to those defined in Example 1A, except that XML tags instead of HTML tags are put into the results. Listing 6 shows one of the address entries converted to the new PhoneBook.xml document format.
Listing 5. XSL for PhoneBook
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <PhoneBook>
      <xsl:apply-templates/>
    </PhoneBook>
  </xsl:template>
  <xsl:template match="AddressEntry">
    <Entry>
      <xsl:apply-templates select="Name"/>
      <xsl:apply-templates select="Address"/>
    </Entry>
  </xsl:template>
  <xsl:template match="Name">
    <Name>
      <xsl:apply-templates select="LastName"/>, <xsl:apply-templates select="FirstName"/>
    </Name>
  </xsl:template>
  <xsl:template match="Address">
    <LocatorInfo>
      <xsl:apply-templates select="PostalAddress/Street"/>
      <xsl:apply-templates select="PostalAddress/City"/>
      <xsl:apply-templates select="PostalAddress/PostalCode"/>
      <xsl:apply-templates select="Phone"/>
    </LocatorInfo>
  </xsl:template>
  <xsl:template match="FirstName">
    <xsl:value-of select="."/>
  </xsl:template>
  <xsl:template match="LastName">
    <xsl:value-of select="."/>
  </xsl:template>
  ...
</xsl:stylesheet>
Listing 6. PhoneBook.xml
<?xml version="1.0" encoding="UTF-8"?>
<PhoneBook>
<Entry>
<Name>Waton, Jim</Name>
<LocatorInfo>
<Street>123 Main St.</Street>
<City>MyTown</City>
<ZipCode>55489</ZipCode>
<Phone>334-6565</Phone>
</LocatorInfo>
</Entry>
</PhoneBook>
The XSL processor takes the AddressBook.xml document and the XSL stylesheet (shown in Listing 5) as input to the processing and creates the PhoneBook.xml document as output.
XSL has two elements that allow further analysis of the matched XML document content. The xsl:if element and the xsl:choose element provide these capabilities. Both of these elements allow more discrete control over the generation of the output. This extends the capability of template rules by allowing analysis of the XML document content. Example 3 shows the xsl:if element, and Example 4 shows the xsl:choose element.
Example 3 shows the xsl:if capability within XSLT. An xsl:if has a test attribute that contains an expression that evaluates to a boolean. If the expression is true, the associated actions occur. Otherwise, the actions do not occur. The xsl:if test in this example is: if the City does NOT equal the literal MyTown, the City element's content is placed in the results. The stylesheet creates a table that contains the name, street, city, zip code, and phone number. Listing 7 shows the template rule with the xsl:if. The input XML document is PhoneBook.xml, which was shown in Listing 6. The output from the processor is the HTML table shown in Figure 4. Both the first and last table rows have blank City column entries, because Jim and Ben are from MyTown.
Figure 4. xsl:if test results
| Name | Street | City | Zip Code | Phone # |
| Waton, Jim | 123 Main St. | | 55489 | 334-6565 |
| Ross, Betsy | 433 Flag St. | HayTown | 56321 | 356-4377 |
| Brown, Bob | 210 B Ave A | Town | 58431 | 343-8812 |
| King, Ben | 814 2nd St. | | 55489 | 334-8430 |
Listing 7. xsl:if
<xsl:template match="LocatorInfo">
  <td><xsl:value-of select="Street"/></td>
  <td>
    <xsl:if test="not(City='MyTown')">
      <xsl:value-of select="City"/>
    </xsl:if>
  </td>
  <td><xsl:value-of select="ZipCode"/></td>
  <td><xsl:value-of select="Phone"/></td>
</xsl:template>
Example 4 shows the xsl:choose element of XSLT, which is similar to a case or select statement found in various programming languages. Each xsl:choose element contains one or more xsl:when elements and an optional xsl:otherwise element. Each xsl:when element has a test attribute, which in this example checks the City by its name. The test attribute's value is an expression that evaluates to a boolean result. The xsl:when elements are evaluated sequentially. The content of the first xsl:when whose test evaluates to true is processed, and the remaining xsl:when elements and the xsl:otherwise element are bypassed. If none of the xsl:when tests is true and an xsl:otherwise element exists, its content is processed. However, if no xsl:otherwise exists, no action is taken.
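The fallback behavior can be sketched with a hypothetical rule that includes an xsl:otherwise element. The labels local, nearby, and other are invented for illustration and do not appear in the article's stylesheets.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Hypothetical rule: the labels are illustrative only -->
  <xsl:template match="City">
    <xsl:choose>
      <!-- Tests are tried in order; the first one that is true wins -->
      <xsl:when test=". = 'MyTown'">local</xsl:when>
      <xsl:when test=". = 'HayTown'">nearby</xsl:when>
      <!-- Used only when no xsl:when test is true -->
      <xsl:otherwise>other</xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Including an xsl:otherwise is a reasonable default when the set of possible values is open-ended, since it makes the unmatched case visible in the output.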
This example uses the PhoneBook.xml shown in Listing 6 as input. The stylesheet creates a table that contains the name, city, and area code. The template rule containing the xsl:choose is shown in Listing 8. In this example, the template rule's xsl:choose determines the area code for each City within the PhoneBook.xml document. Figure 5 shows the processor output as an HTML table.
Figure 5. xsl:choose results
| Name | City | Area Code |
| Waton, Jim | MyTown | 599 |
| Ross, Betsy | HayTown | 502 |
| Brown, Bob | Town | 572 |
| King, Ben | MyTown | 599 |
Listing 8. xsl:choose
<xsl:template match="Entry/LocatorInfo">
  <td><xsl:value-of select="City"/></td>
  <td>
    <xsl:choose>
      <xsl:when test="City='Town'">572</xsl:when>
      <xsl:when test="City='HayTown'">502</xsl:when>
      <xsl:when test="City='MyTown'">599</xsl:when>
    </xsl:choose>
  </td>
</xsl:template>
Sorting the contents of an existing XML document allows the data to be reorganized. Example 5 shows the use of the xsl:sort element. An xsl:sort is a child of either an xsl:apply-templates element (as in our example) or an xsl:for-each element. The select attribute of the xsl:sort element defines the sort key used to order the elements in the output.
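For the xsl:for-each case, a minimal sketch might look like the following. It assumes the PhoneBook.xml structure from Listing 6 and uses descending order purely to show the order attribute; neither choice comes from the article's own example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:template match="PhoneBook">
    <UL>
      <xsl:for-each select="Entry">
        <!-- xsl:sort must come first inside xsl:for-each -->
        <xsl:sort select="Name" order="descending"/>
        <LI><xsl:value-of select="Name"/></LI>
      </xsl:for-each>
    </UL>
  </xsl:template>
</xsl:stylesheet>
```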
In this example, the input XML document is the PhoneBook.xml document. Listing 9 shows the template rule with the xsl:sort element. The sort criterion for this example is the Name element content, and the default ordering (ascending) is used. The result of the sort is an HTML unordered list.
Listing 9. xsl:sort
<xsl:template match="PhoneBook">
<HTML>
<HEAD><TITLE>Sort of Names by Last Name
</TITLE></HEAD>
<BODY> <UL>
<xsl:apply-templates select="Entry">
<xsl:sort select="Name"/>
</xsl:apply-templates>
</UL> </BODY>
</HTML>
</xsl:template>
This article explored some of the capabilities of XSLT, through the use of various stylesheet examples. XSLT provides many of the basic capabilities needed to manipulate an XML document. XSLT provides the means to state the template rules for manipulating an XML document, and the Lotus XSL processor provides a robust engine to perform the pattern matching and transformations.
- Article sample code
- IBM alphaWorks
- World Wide Web Consortium
- developerWorks XML zone
- Apache XML Project
- Category: XML Tools
- IBM XML and Web Services Development
- XSL Editor
- XML Master
- XML Parser for Java
LindaMay Patterson is a software engineer with the International Support Organization at IBM Rochester. She has written various papers on WebSphere Everyplace Access components and capabilities. One of her recent assignments was working with the IBM Software Group's Application and Integration Middleware Software Enterprise Access Product team.

