Note: For this tip, you can use any XSLT processor, such as Xalan or Saxon, or a browser-based solution, such as Microsoft Internet Explorer or Mozilla.
Before processing a transformation, an XSLT processor analyzes the style sheet and the source document, and removes any applicable white space nodes. It then processes the document, building the result tree from the remaining nodes.
Let's look at a basic transformation. The source document contains raw FAQ information that will be translated into a different XML structure for processing by a second application:
Listing 1. The source document
<?xml version="1.0"?>
<?xml-stylesheet href="style.xsl" type="text/xsl"?>
<faqs>
<question>
<questiontitle>Output DOM object?</questiontitle>
<questiontext>
Is there an easy way to send a DOM Document to the screen?
<address>confused@wisconsin.com</address>
</questiontext>
<answer>
<answertext>
<line>Yes, as a matter of fact, there is. </line>
<line>All you have to do is transform the </line>
<line>Document, but don't add a style sheet:</line>
</answertext>
<codesection><codeline>...</codeline>
<codeline> DOMSource source = new DOMSource(myDocObject);</codeline>
<codeline> StreamResult result = new StreamResult(System.out);</codeline>
<codeline> TransformerFactory transFactory = </codeline>
<codeline> TransformerFactory.newInstance();</codeline>
<codeline> Transformer transformer = transFactory.newTransformer();</codeline>
<codeline> transformer.transform(source, result);</codeline>
<codeline>...</codeline></codesection>
</answer>
</question>
</faqs>
The style sheet for this transformation is intended to combine individual lines, but will ultimately have to preserve the white space in the code section:
Listing 2. The basic style sheet
<?xml version="1.0"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<faqoutput>
<info>
<title>
<xsl:value-of select="faqs/question/questiontitle"/>
</title>
</info>
<xsl:apply-templates/>
</faqoutput>
</xsl:template>
<xsl:template match="question">
<issue><xsl:value-of select="questiontext"/></issue>
<solution><xsl:apply-templates select="answer"/></solution>
</xsl:template>
<xsl:template match="answertext">
<answered><xsl:apply-templates/></answered>
</xsl:template>
<xsl:template match="codesection">
<code>
<xsl:apply-templates/>
</code>
</xsl:template>
</xsl:stylesheet>
Transforming the document shows the results of the default rules for white space stripping:
Listing 3. The result set
<?xml version="1.0" encoding="UTF-8"?>
<faqoutput><info><title>Output DOM object?</title></info>
<issue>
Is there an easy way to send a DOM Document to the screen?
confused@wisconsin.com
</issue><solution>
<answered>
Yes, as a matter of fact, there is.
All you have to do is transform the
Document, but don't add a style sheet:
</answered>
<code>...
DOMSource source = new DOMSource(myDocObject);
StreamResult result = new StreamResult(System.out);
TransformerFactory transFactory =
TransformerFactory.newInstance();
Transformer transformer = transFactory.newTransformer();
transformer.transform(source, result);
...</code>
</solution>
</faqoutput>
Notice that the white space nodes within individual templates, such as those in the main template, have been removed, but that the nodes within the source document, such as those between the line elements in answertext, have been preserved. There are several options for dealing with these issues.
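To see where the white space nodes in a source document actually are, a small diagnostic style sheet can help. The following is only a sketch and not part of the original transformation: it assumes text output and reports, for each element that has child text nodes consisting only of white space, the element name and the number of such nodes.

```xml
<?xml version="1.0"?>
<!-- Diagnostic sketch (an illustration, not one of the article's listings). -->
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Select every element with at least one white-space-only text child.
         normalize-space() returns an empty string for such a node. -->
    <xsl:for-each select="//*[text()[normalize-space(.) = '']]">
      <xsl:value-of select="name()"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="count(text()[normalize-space(.) = ''])"/>
      <!-- One report line per element -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Running it once as shown, and again after adding an xsl:strip-space declaration, shows which white space nodes the processor removed before the transformation started.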
Controlling white space in the source
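When the processor strips the white space nodes from the source document, it first checks whether the parent element of each white space node is on a list of white space preserving elements. By default, every element in the source document is on this list, but you can remove one (or more) by naming it in the elements attribute of an xsl:strip-space element. The setting applies only to white space nodes that are direct children of a named element; it is not inherited by descendants, so each element to be stripped has to be named. For example, you can strip the white space nodes out of the question element and the elements within it in order to compress any responses within the question or answer texts:
Listing 4. Stripping the question element and its descendants
<?xml version="1.0"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="question questiontext answer answertext codesection"/>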
<xsl:template match="/">
<faqoutput>
<info>
<title>
...
The elements attribute takes a space-delimited list of element names, or * to match every element. The result of this rather brute-force method is that the white space nodes are stripped out of every element named:
Listing 5. The stripped results
<?xml version="1.0" encoding="UTF-8"?>
<faqoutput><info><title>Output DOM object?</title></info>
<issue>
Is there an easy way to send a DOM Document to the screen?
confused@wisconsin.com</issue><solution><answered>Yes, as a matter of fact, there is. All you have to do is transform the Document, but don't add a style sheet:</answered><code>... DOMSource source = new DOMSource(myDocObject); StreamResult result = new StreamResult(System.out); TransformerFactory transFactory = TransformerFactory.newInstance(); Transformer transformer = transFactory.newTransformer(); transformer.transform(source, result);...</code></solution>
</faqoutput>
You may have noticed that there is still white space between the question and the text of the address element, and within the code section. This may seem confusing, but remember we're stripping out white space nodes, not the white space itself. If a text node has anything other than white space in it, the processor will never strip it out, no matter what other settings are in place.
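If the white space inside a text node does need to be collapsed, the stripping rules cannot do it, but the XPath normalize-space() function can. As a sketch (a variation on the question template from Listing 2, not part of the original style sheet), the following trims leading and trailing white space from the string value of questiontext and collapses each internal run of white space to a single space:

```xml
<?xml version="1.0"?>
<!-- Sketch only: a cut-down variation on Listing 2. -->
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <faqoutput><xsl:apply-templates/></faqoutput>
  </xsl:template>
  <xsl:template match="question">
    <!-- normalize-space() works on the string value of questiontext,
         which includes the text of the nested address element. -->
    <issue><xsl:value-of select="normalize-space(questiontext)"/></issue>
    <!-- The answer is left to the built-in template rules here. -->
    <solution><xsl:apply-templates select="answer"/></solution>
  </xsl:template>
</xsl:stylesheet>
```

This changes the text itself rather than the set of nodes, so it is independent of any xsl:strip-space or xsl:preserve-space settings. It would not be appropriate for the code section, where the line breaks matter.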
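On the other hand, all of the line breaks in the code are now gone, which was not desired. To keep them, the codesection element has to stay on the list of white space preserving elements. A convenient way to arrange this is to strip broadly with the * wildcard in xsl:strip-space and name the exceptions in an xsl:preserve-space element; when an element matches both, the more specific name takes precedence over the wildcard. Here faqs is preserved along with codesection so that the line breaks between the top-level output elements are unchanged:
Listing 6. Preserving white space
<?xml version="1.0"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:preserve-space elements="faqs codesection"/>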
<xsl:template match="/">
<faqoutput>
<info>
...
As a result of this change, the codesection element is added back in to the list of white space preserving elements, so the line breaks are retained:
Listing 7. The preserved results
<?xml version="1.0" encoding="UTF-8"?>
<faqoutput><info><title>Output DOM object?</title></info>
<issue>
Is there an easy way to send a DOM Document to the screen?
confused@wisconsin.com</issue><solution><answered>Yes, as a matter of fact, there is. All you have to do is transform the Document, but don't add a style sheet:</answered><code>...
DOMSource source = new DOMSource(myDocObject);
StreamResult result = new StreamResult(System.out);
TransformerFactory transFactory =
TransformerFactory.newInstance();
Transformer transformer = transFactory.newTransformer();
transformer.transform(source, result);
...</code></solution>
</faqoutput>
The result could still be a little more readable, however, which is a matter of managing the white space within the style sheet.
Adding white space to the style sheet
When the processor strips the white space nodes from the style sheet, only one element is, by default, on the list of white space preserving elements: xsl:text. The content of an xsl:text element is always preserved, even when it consists entirely of white space, so it can be useful for adding line breaks or spaces within a document:
Listing 8. Adding line breaks with xsl:text
...
<xsl:template match="question">
<issue><xsl:value-of select="questiontext"/></issue><xsl:text>
</xsl:text><solution><xsl:apply-templates select="answer"/></solution>
</xsl:template>
...
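A literal line break inside xsl:text is easy to miss when reading the style sheet. An equivalent sketch, assuming the same question template, writes the line break as the character reference &#10;, which the XML parser turns into the same white space node before the XSLT processor sees it:

```xml
<?xml version="1.0"?>
<!-- Sketch only: the question template with an explicit character reference. -->
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="question">
    <issue><xsl:value-of select="questiontext"/></issue>
    <!-- &#10; is a line feed. Because it sits inside xsl:text, it is kept. -->
    <xsl:text>&#10;</xsl:text>
    <solution><xsl:apply-templates select="answer"/></solution>
  </xsl:template>
</xsl:stylesheet>
```

With this form the template can be indented freely, because the white space nodes around the xsl:text element are direct children of xsl:template and are stripped as usual.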
You can also use the xml:space attribute to control white space stripping for individual elements in the style sheet. A value of preserve keeps the white space nodes within that element and its descendants, and a value of default on a descendant switches back to the normal stripping rules:
Listing 9. Controlling white space with xml:space
...
<xsl:template match="/">
<faqoutput><xsl:text>
</xsl:text><info xml:space="preserve">
<title xml:space="default">
<xsl:value-of select="faqs/question/questiontitle"/>
</title>
</info>
<xsl:apply-templates/>
</faqoutput>
</xsl:template>
...
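The result in this case is that the white space nodes around the title element are preserved, while those within the title element are discarded. Note that the xml:space attributes themselves are copied to the output along with the info and title elements.
Listing 10. Results of controlling white space in a style sheet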
<?xml version="1.0" encoding="UTF-8"?>
<faqoutput>
<info xml:space="preserve">
<title xml:space="default">Output DOM object?</title>
</info>
<issue>
Is there an easy way to send a DOM Document to the screen?
confused@wisconsin.com</issue>
<solution><answered>Yes, as a matter of fact, there is. All you have to do is
transform the Document, but don't add a style sheet:</answered><code>...
DOMSource source = new DOMSource(myDocObject);
StreamResult result = new StreamResult(System.out);
TransformerFactory transFactory =
TransformerFactory.newInstance();
Transformer transformer = transFactory.newTransformer();
transformer.transform(source, result);
...</code></solution>
</faqoutput>
By understanding the rules for white space preservation as they apply to a style sheet or a source document, you can closely control the appearance of the final document. Keep in mind, however, that white space is never stripped from within a text node that contains other characters.
Nicholas Chase has been involved in Web site development for companies such as Lucent Technologies, Sun Microsystems, Oracle, and the Tampa Bay Buccaneers. Nick has been a high school physics teacher, a low-level radioactive waste facility manager, an online science fiction magazine editor, a multimedia engineer, and an Oracle instructor. More recently, he was the Chief Technology Officer of Site Dynamics Interactive Communications in Clearwater, Florida, USA, and is the author of three books on Web development, including Java and XML From Scratch (Que) and the upcoming Primer Plus XML Programming (Sams). He loves to hear from readers and can be reached at nicholas@nicholaschase.com.





