Tip

Control white space in an XSLT style sheet

Create the document you want by understanding white space stripping

Note: For this tip, you can use any XSLT processor, such as Xalan or Saxon, or a browser-based solution, such as Microsoft Internet Explorer or Mozilla.

White space stripping rules

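Before processing a transformation, an XSLT processor analyzes the style sheet and the source document, and removes any applicable white space nodes, that is, text nodes that contain nothing but white space characters (spaces, tabs, carriage returns, and line feeds). It then processes the document, building the result tree from the remaining nodes.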

Let's look at a basic transformation. The source document contains raw FAQ information that will be translated into a different XML structure for processing by a second application:

Listing 1. The source document
<?xml version="1.0"?>
<?xml-stylesheet href="style.xsl" type="text/xsl"?>
<faqs>
   <question>
      <questiontitle>Output DOM object?</questiontitle>
      <questiontext>
        Is there an easy way to send a DOM Document to the screen?
        <address>confused@wisconsin.com</address>
      </questiontext>
      <answer>
          <answertext>
              <line>Yes, as a matter of fact, there is. </line>
              <line>All you have to do is transform the </line>
              <line>Document, but don't add a style sheet:</line>
          </answertext>
          <codesection><codeline>...</codeline>
          <codeline>  DOMSource source = new DOMSource(myDocObject);</codeline>
          <codeline>  StreamResult result = new StreamResult(System.out);</codeline>
          <codeline>  TransformerFactory transFactory = </codeline>
          <codeline>                         TransformerFactory.newInstance();</codeline>
          <codeline>  Transformer transformer = transFactory.newTransformer();</codeline>
          <codeline>  transformer.transform(source, result);</codeline>
          <codeline>...</codeline></codesection>       
      </answer>
   </question>
</faqs>

The style sheet for this transformation is intended to combine individual lines, but will ultimately have to preserve the white space in the code section:

Listing 2. The basic style sheet
<?xml version="1.0"?>
<xsl:stylesheet 
  version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
<faqoutput>
    <info>
       <title>
         <xsl:value-of select="faqs/question/questiontitle"/>
       </title>
    </info>
    <xsl:apply-templates/>
</faqoutput>
</xsl:template>

<xsl:template match="question">
<issue><xsl:value-of select="questiontext"/></issue>

<solution><xsl:apply-templates select="answer"/></solution>
</xsl:template>

<xsl:template match="answertext">
<answered><xsl:apply-templates/></answered>
</xsl:template>

<xsl:template match="codesection">
<code>
<xsl:apply-templates/>
</code>
</xsl:template>
</xsl:stylesheet>

Transforming the document shows the results of the default rules for white space stripping:

Listing 3. The result set
<?xml version="1.0" encoding="UTF-8"?>
<faqoutput><info><title>Output DOM object?</title></info>
   <issue>
        Is there an easy way to send a DOM Document to the screen?
        confused@wisconsin.com
      </issue><solution>
          <answered>
              Yes, as a matter of fact, there is.
              All you have to do is transform the
              Document, but don't add a style sheet:
          </answered>
          <code>...
            DOMSource source = new DOMSource(myDocObject);
            StreamResult result = new StreamResult(System.out);
            TransformerFactory transFactory = 
                                   TransformerFactory.newInstance();
            Transformer transformer = transFactory.newTransformer();
            transformer.transform(source, result);
          ...</code>
      </solution>
</faqoutput>

Notice that the white space nodes within individual templates, such as those in the main template, have been removed, but that the nodes within the source document, such as those between the line elements in answertext, have been preserved. There are several options for dealing with these issues.
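To see which white space nodes a source document actually contains, a small diagnostic style sheet can help. The following is a minimal sketch, not part of the original transformation; it assumes only standard XSLT 1.0 and XPath 1.0 functions, and the wording of the report lines is made up for illustration:

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>

<!-- Hypothetical diagnostic: report every text node that contains
     nothing but white space, along with the name of its parent. -->
<xsl:template match="/">
  <xsl:for-each select="//text()[normalize-space(.) = '']">
    <xsl:value-of select="name(..)"/>
    <xsl:text>: white space node, length </xsl:text>
    <xsl:value-of select="string-length(.)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Run against the source document in Listing 1, this should print one line per white space node. Adding an xsl:strip-space declaration to the sketch and running it again shows which nodes survive stripping.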

Controlling white space in the source

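When the processor decides whether to strip a white space node from the source document, it first checks whether the node's parent element is on a list of white space preserving elements. By default, every element in the source document is on this list, but you can remove one or more by naming them in the elements attribute of an xsl:strip-space element. XSLT 1.0 applies the test to the immediate parent of each white space node, so a conforming processor strips only the white space nodes that are direct children of a named element; to reach nodes deeper in the tree, name those elements as well. For example, you can strip the white space nodes out of the question element as a first step toward compressing the question and answer texts: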

Listing 4. Stripping the question element
<?xml version="1.0"?>
<xsl:stylesheet 
  version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:strip-space elements="question"/>

<xsl:template match="/">
<faqoutput>
    <info>
       <title>
...

To strip additional elements, add their names to the elements attribute as a space-delimited list. Listing 5 shows the result of this rather brute-force method when every white space node inside question, including those in its descendants, is stripped out:

Listing 5. The stripped results
<?xml version="1.0" encoding="UTF-8"?>
<faqoutput><info><title>Output DOM object?</title></info>
   <issue>
        Is there an easy way to send a DOM Document to the screen?
        confused@wisconsin.com</issue><solution><answered>Yes, as a matter of fact, there is. All you have to do is transform the Document, but don't add a style sheet:</answered><code>...  DOMSource source = new DOMSource(myDocObject);  StreamResult result = new StreamResult(System.out);  TransformerFactory transFactory =                          TransformerFactory.newInstance();  Transformer transformer = transFactory.newTransformer();  transformer.transform(source, result);...</code></solution>
</faqoutput>

You may have noticed that there is still white space between the question and the text of the address element, and within the code section. This may seem confusing, but remember we're stripping out white space nodes, not the white space itself. If a text node has anything other than white space in it, the processor will never strip it out, no matter what other settings are in place.
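If you do want to tidy the white space inside a text node that also holds other characters, stripping cannot do it, but the XPath normalize-space() function can. The following is a minimal sketch rather than part of the original style sheet; it assumes that collapsing the question text to a single line is acceptable:

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:strip-space elements="question"/>

<!-- normalize-space() trims leading and trailing white space from the
     string value of questiontext and collapses each internal run of
     white space to a single space. -->
<xsl:template match="question">
<issue><xsl:value-of select="normalize-space(questiontext)"/></issue>
<solution><xsl:apply-templates select="answer"/></solution>
</xsl:template>

</xsl:stylesheet>
```

With this change the issue element should contain the question and the address on one line, separated by a single space. The same function would flatten the code section, so it is best reserved for text whose layout does not matter.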

Stripping has also removed all of the line breaks in the code, which was not desired. To keep the white space nodes inside a particular element, such as codesection, name it in an xsl:preserve-space element, which puts the element back on the list of white space preserving elements:

Listing 6. Preserving white space
<?xml version="1.0"?>
<xsl:stylesheet 
  version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:strip-space elements="question"/>
<xsl:preserve-space elements="codesection"/>

<xsl:template match="/">
<faqoutput>
    <info>
...

As a result of this change, the codesection element is on the list of white space preserving elements, so the line breaks between its codeline children are retained:

Listing 7. The preserved results
<?xml version="1.0" encoding="UTF-8"?>
<faqoutput><info><title>Output DOM object?</title></info>
   <issue>
        Is there an easy way to send a DOM Document to the screen?
        confused@wisconsin.com</issue><solution><answered>Yes, as a matter of fact, there is. All you have to do is transform the Document, but don't add a style sheet:</answered><code>...
            DOMSource source = new DOMSource(myDocObject);
            StreamResult result = new StreamResult(System.out);
            TransformerFactory transFactory = 
                                   TransformerFactory.newInstance();
            Transformer transformer = transFactory.newTransformer();
            transformer.transform(source, result);
          ...</code></solution>
</faqoutput>
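A common variation is to strip every element with the wildcard and then name only the exceptions. The following sketch shows that pattern under the XSLT 1.0 conflict rules, where a specific element name has a higher default priority than the wildcard; it is an illustration, not the style sheet that produced the listings above:

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Take every element off the list of white space preserving elements. -->
<xsl:strip-space elements="*"/>

<!-- Put codesection back. A specific name has a higher default priority
     than the wildcard, so this declaration wins for codesection. -->
<xsl:preserve-space elements="codesection"/>

<xsl:template match="answertext">
<answered><xsl:apply-templates/></answered>
</xsl:template>

<xsl:template match="codesection">
<code>
<xsl:apply-templates/>
</code>
</xsl:template>

</xsl:stylesheet>
```

Because the wildcard applies to the whole document, this version also strips the white space nodes directly under faqs, so the result will not have the line breaks that appear before the issue element and the closing faqoutput tag in the listings above.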

The result could still be more readable, however, and you can improve it by managing the white space within the style sheet.

Adding white space to the style sheet

When the processor strips the white space nodes from the style sheet, only one element is, by default, on the list of white space preserving elements: xsl:text. The white space inside an xsl:text element is always preserved, so it is useful for adding line breaks or spaces to the result:

Listing 8. Adding line breaks with xsl:text
...
<xsl:template match="question">
<issue><xsl:value-of select="questiontext"/></issue><xsl:text>

   </xsl:text><solution><xsl:apply-templates select="answer"/></solution>
</xsl:template>
...
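Literal line breaks inside xsl:text can be hard to spot when the style sheet itself is reformatted. One alternative, sketched here on the assumption that the same two line breaks and three-space indent are wanted, is to write the line feeds as character references:

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="question">
<issue><xsl:value-of select="questiontext"/></issue>
<!-- Each character reference below is a line feed; the three spaces
     after them indent the solution element in the result. -->
<xsl:text>&#10;&#10;   </xsl:text>
<solution><xsl:apply-templates select="answer"/></solution>
</xsl:template>

</xsl:stylesheet>
```

The white space nodes around the xsl:text element are still stripped, so the template can be laid out on separate lines without changing the result.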

You can also use the xml:space attribute on elements in the style sheet. A value of preserve keeps the white space nodes inside that element and its descendants, and a value of default on a descendant turns stripping back on from that point down:

Listing 9. Controlling white space with xml:space
...
<xsl:template match="/">
<faqoutput><xsl:text>
    </xsl:text><info xml:space="preserve">
       <title xml:space="default">
         <xsl:value-of select="faqs/question/questiontitle"/>
       </title>
    </info>
    <xsl:apply-templates/>
</faqoutput>
</xsl:template>
...

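The result in this case is that the white space nodes around the title element are preserved, while those within the title element are discarded. Because info and title are literal result elements, their xml:space attributes are also copied to the output.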

Listing 10. Results of controlling white space in a style sheet
<?xml version="1.0" encoding="UTF-8"?>
<faqoutput>
    <info xml:space="preserve">
       <title xml:space="default">Output DOM object?</title>
    </info>
   <issue>
        Is there an easy way to send a DOM Document to the screen?
        confused@wisconsin.com</issue>

   <solution><answered>Yes, as a matter of fact, there is. All you have to do is transform the Document, but don't add a style sheet:</answered><code>...
            DOMSource source = new DOMSource(myDocObject);
            StreamResult result = new StreamResult(System.out);
            TransformerFactory transFactory = 
                                   TransformerFactory.newInstance();
            Transformer transformer = transFactory.newTransformer();
            transformer.transform(source, result);
          ...</code></solution>
</faqoutput>
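If the xml:space attributes are not wanted in the result, the same layout can be built with xsl:text alone. The following is a sketch of that alternative, assuming the indentation shown in Listing 10 is the goal:

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Every line break and indent that should reach the result is wrapped
     in xsl:text; all other white space nodes in the template are
     stripped as usual. -->
<xsl:template match="/">
<faqoutput><xsl:text>
    </xsl:text><info><xsl:text>
       </xsl:text><title>
         <xsl:value-of select="faqs/question/questiontitle"/>
       </title><xsl:text>
    </xsl:text></info>
    <xsl:apply-templates/>
</faqoutput>
</xsl:template>

</xsl:stylesheet>
```

This trades a more cluttered template for a result that carries no extra attributes, which may matter if the second application validates its input.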

Summary

By understanding the rules for white space preservation as they apply to a style sheet and to a source document, you can closely control the appearance of the final document. Keep in mind that stripping removes only white space nodes; white space is never stripped from a text node that contains other characters.


Published: 01 November 2002