XML transformation with WebSphere Message Broker V6

This article shows you how to use the XMLTransformation node in WebSphere Message Broker V6 and supplements the information in the product information center. It describes the context of various XMLTransformation node tasks, and includes tips on stylesheet handling and troubleshooting. Readers should have basic knowledge of XML, XSLT programming, WebSphere Message Broker, and its XMLTransformation node.

Xiaoming Zhang (zhang@uk.ibm.com), Staff Software Engineer, IBM

Xiaoming Zhang is a Staff Software Engineer on the WebSphere Message Broker Development team. Before joining IBM in 1998, he was a lecturer in Computer Science at Brunel University in the UK. Xiaoming graduated from Fudan University in China, and received a Ph.D. from the University of Wales in Swansea in the UK.



06 December 2006


What does the XMLTransformation node do?

XSL transformation (XSLT) is a standards-based technology that transforms XML documents. Transformation rules for the input XML documents are described in the programming language known as Extensible Stylesheet Language (XSL), which is itself in XML. The XSL transformation rules reside in a document called a stylesheet. For information on the W3C specifications for XSLT and XSL, see Resources below.

In addition to providing other data transformation technologies -- ESQL using a Compute node and Java® programming using a JavaCompute node -- IBM® WebSphere® Message Broker V6 provides XSLT capability in its XMLTransformation node (hereafter called the XMLT node). The XMLT node is basically a wrapper around an XSL transformation engine. The XMLT node also facilitates XSL transformation of WebSphere Message Broker messages. The following diagram outlines the transformation process applied to a WebSphere Message Broker message by the XMLT node:

Figure 1. WebSphere Message Broker XSL transformation
  • The XMLT node takes the entire body of a WebSphere Message Broker message as its input XML document. It is your responsibility to ensure that the input message body can be parsed into an XML document.
  • The node passes the environment information of the input message through to the output, but it generates a new output message whose body is the result of the XSL transformation. If you need to make the original input message body available on the output side, you must do so explicitly by writing code in the stylesheet to copy the input message body to the output.
  • The node requires another input -- a stylesheet to perform a transformation. WebSphere Message Broker provides flexible run-time stylesheet selection and caching, as explained below.
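
The second point above can be illustrated outside the broker. The following is a minimal sketch that uses the standard Java JAXP API (javax.xml.transform) from the JDK, not the XMLT node itself. The class name, the sample message, and the <out> wrapper element are assumptions made for illustration. The stylesheet uses xsl:copy-of to carry the whole input document into the output, which is the kind of explicit copy you need to write if the original message body must survive the transformation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration; not part of WebSphere Message Broker.
final class CopyInputSketch {

    // Stylesheet that wraps a full copy of the input document in <out>.
    static final String COPY_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"/\">"
      + "<out><xsl:copy-of select=\"/\"/></out>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xsl)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<order><id>42</id></order>", COPY_XSL));
    }
}
```

Without the xsl:copy-of instruction, the output would contain only what the stylesheet generates, and the original body would be lost.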

Selecting and managing stylesheets

The XMLT node supports run-time dynamic stylesheet selection from different sources. It also lets you cache stylesheets in memory to improve processing speed. This section gives an overview of how to use these two features to increase the effectiveness and flexibility of the XMLT node.

Stylesheet selection and caching are configured in the XMLT node's Stylesheet property user interface:

Figure 2. XMLT node Stylesheet property user interface

Conceptually, you need to perform four stylesheet configuration tasks:

Defining the stylesheet name search order

The XMLT node must know the name of the stylesheet to use. You can specify stylesheet names in three different locations, as described below, so you need to define the order in which the node searches them. To do so, set the XML Embedded Selection Priority for searching the input message body, the Message Environment Selection Priority for searching the local environment, and the Broker Node Attribute Selection Priority for searching the Stylesheet property user interface (shown above in Figure 2). To enable a location, set its priority property to a non-zero value, with 1 indicating the highest priority. To disable the search of a location, set its priority to 0. For every input message, the node searches the enabled locations in priority order and stops as soon as it finds a name.
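
The search rules can be modeled in a few lines of Java. This is only a sketch of the behavior described above, not broker code: the class, the Location type, and the labels are hypothetical, and it assumes that larger priority numbers mean lower priority and that locations with equal priority are tried in the order given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical model of the search-order rules; not part of the broker.
final class StylesheetNameSearch {

    // One candidate location with its configured priority and the name found there.
    static final class Location {
        final String label;
        final int priority;   // 0 disables the location, 1 is the highest priority
        final String name;    // null when no name is specified at this location

        Location(String label, int priority, String name) {
            this.label = label;
            this.priority = priority;
            this.name = name;
        }
    }

    // Returns the first name found when enabled locations are visited in priority order.
    static Optional<String> select(List<Location> locations) {
        List<Location> enabled = new ArrayList<>();
        for (Location location : locations) {
            if (location.priority > 0) {
                enabled.add(location);
            }
        }
        // Assumption: ascending numeric order, stable for equal priorities.
        enabled.sort(Comparator.comparingInt((Location l) -> l.priority));
        for (Location location : enabled) {
            if (location.name != null && !location.name.isEmpty()) {
                return Optional.of(location.name);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Location> locations = Arrays.asList(
            new Location("input message body", 0, "embedded.xsl"),
            new Location("local environment", 2, "env.xsl"),
            new Location("node property", 1, null));
        System.out.println(select(locations).orElse("(no stylesheet name found)"));
    }
}
```

In the main method, the input message body is disabled and the node property holds no name, so the search falls through to the local environment.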

Defining stylesheet names

  1. Embedded in the input message: If you are specifying a name in the input message body, include an xml-stylesheet processing instruction, which is specified by the XML standard, in your input message body. Here is an example:

    Listing 1. Embedded stylesheet name declaration
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xml" href="<style sheet name>" ?>

    The value of href is extracted by the node when the input message body is searched. The advantage of this approach is that it gives you an easy way to directly associate a stylesheet name with individual input messages. It implies that whoever constructs the input message knows the name of the stylesheet it needs to use. If you modify the content of a stylesheet specified in this way, you must ensure that the change is valid for all input messages explicitly referencing it, which may complicate maintenance.

  2. Specified in the local environment message tree: If you are specifying a stylesheet name in the local environment, you need to use a preceding Compute node to define the local environment variable ComIbmXslXmltStylesheetname:

    Listing 2. Declaring style sheet name in the local environment
    SET LocalEnvironment.ComIbmXslXmltStylesheetname = '<style sheet name>';

    The advantage of this technique is that you can change stylesheets dynamically for different input messages (you create your selection logic in the Compute node). The disadvantage is that you need an extra node.

  3. Specified in the Stylesheet property user interface: If you want to use the node's Stylesheet property user interface shown above in Figure 2 to declare a stylesheet name, fill in the Stylesheet Directory and Stylesheet Name properties. The node concatenates the values of these two properties and treats the result as a reference to a file in the file system. (This is the only place where the node uses the Stylesheet Directory value; in other words, the value is not concatenated with, say, a name specified in the local environment.) You can fill in these two properties manually, or you can click Browse beside the Stylesheet Name property to fill in the Stylesheet Name.

    The advantage of this approach is its simplicity, while the disadvantage is that the name cannot be changed dynamically.
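
To check what name an input message declares in the first technique, you can extract the href value yourself. The following sketch uses the DOM parser in the JDK to find the xml-stylesheet processing instruction in the document prolog. It approximates what the node is described as doing and is not its implementation; the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of WebSphere Message Broker.
final class EmbeddedNameSketch {

    // Matches the href pseudo-attribute with either double or single quotes.
    private static final Pattern HREF =
        Pattern.compile("(?:^|\\s)href\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    // Returns the href of the first xml-stylesheet processing instruction, if any.
    static Optional<String> stylesheetHref(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Message body is not well-formed XML", e);
        }
        // Processing instructions in the prolog are children of the document node.
        for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.PROCESSING_INSTRUCTION_NODE) {
                continue;
            }
            ProcessingInstruction pi = (ProcessingInstruction) n;
            if ("xml-stylesheet".equals(pi.getTarget())) {
                Matcher m = HREF.matcher(pi.getData());
                if (m.find()) {
                    return Optional.of(m.group(1) != null ? m.group(1) : m.group(2));
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String message = "<?xml version=\"1.0\"?>"
            + "<?xml-stylesheet type=\"text/xml\" href=\"orders.xsl\" ?>"
            + "<docRoot/>";
        System.out.println(stylesheetHref(message).orElse("(none)"));
    }
}
```

A message without the processing instruction yields an empty result, which corresponds to the node moving on to the next location in its search order.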

Indicating where your stylesheets can be found

The XMLT node is quite flexible regarding the source of the named stylesheet. It can access four different sources as described below. The node determines the location of your named stylesheet by examining the declared name.

  1. Stylesheets on the Internet: Indicated by a name starting with an Internet transfer protocol, such as http://. It is your responsibility to ensure that the stylesheet is available at the specified location.
  2. Stylesheets in the local file system: Indicated by a name starting with a file protocol (file://) or without any leading protocol. If your stylesheet name specification resolves to a relative file path, the node will treat the path as relative to directory <MQSI_WORKPATH>/XSL/external, where <MQSI_WORKPATH> is the directory defined by the MQSI_WORKPATH system environment variable. Again, it is your responsibility to ensure that the stylesheet is available at the specified location.
  3. Stylesheets in the execution group's deployed stylesheet storage: WebSphere Message Broker V6 supports deploying stylesheets to execution groups, and each execution group has its own independent stylesheet storage. In order to put a stylesheet into an execution group's stylesheet storage, you must add the stylesheet to a BAR file and deploy the BAR file to the execution group. You can do this manually, but a simpler way is to click Browse beside the Stylesheet Name property to identify a stylesheet in your tooling workspace, which will cause the stylesheet to be automatically added to your BAR file. Do not put anything into the Stylesheet Directory property. For more information, see the WebSphere Message Broker information center.

    A name referencing a deployed stylesheet must have a relative file path and cannot have any leading protocol in its name. When it encounters a relative file path name specification, the XMLT node first searches the execution group's stylesheet storage. If the stylesheet is not found there, it then searches the <MQSI_WORKPATH>/XSL/external directory.

    The advantage of using deployed stylesheets is that the node will manage them for you (for example, by backing out a deployment if things go wrong). The disadvantage is that you need to deploy a copy of the stylesheets to each execution group that needs them.

  4. Stylesheets embedded in an input message: You can embed either the name or the entire contents of a stylesheet in an input message. To do so, make your stylesheet a sub-element of your input XML message body and assign an identity to it using an XML id attribute. You can then use the identity as the name of your stylesheet in an embedded stylesheet name declaration. However, the stylesheet name must be preceded with a '#', as shown in the sample XML document below:

    Listing 3. XML document with an embedded style sheet
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xml" href="#styleSheetId" ?>
    <docRoot>
        <docData>
            ...
        </docData>
        <xsl:stylesheet id="styleSheetId" version="1.0"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
            <xsl:template match="xsl:stylesheet"/>
            <xsl:template match="/">
                <xsl:copy-of select="/"/>
            </xsl:template>
        </xsl:stylesheet>
    </docRoot>

    In the XML document above:

    • The embedded stylesheet's id is "styleSheetId".
    • The embedded stylesheet is referenced as "#styleSheetId".
    • The line <xsl:template match="xsl:stylesheet"/> prevents the embedded stylesheet from being copied into the output, which is normally what you want to do.

    You can only reference the content of an embedded stylesheet using a <?xml-stylesheet href="..."?> processing instruction in an input message. In other words, you cannot use other naming mechanisms, such as local environment variables, to reference the content of an embedded stylesheet.
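
The naming rules for the four sources can be summarized as a small classification routine. This is a sketch of the rules as described above, under stated assumptions: the class and enum names are hypothetical, any name with a protocol other than file is treated as an Internet location, and a leading '#' is meaningful only when the name comes from an xml-stylesheet processing instruction.

```java
// Hypothetical summary of the naming rules; not part of the broker.
final class StylesheetLocationSketch {

    enum Kind {
        EMBEDDED_IN_MESSAGE,        // "#id", valid only in an xml-stylesheet instruction
        INTERNET,                   // for example http://...
        LOCAL_FILE,                 // file:// name or an absolute path
        DEPLOYED_THEN_EXTERNAL_DIR  // relative path: deployed storage first,
                                    // then <MQSI_WORKPATH>/XSL/external
    }

    static Kind classify(String name) {
        if (name.startsWith("#")) {
            return Kind.EMBEDDED_IN_MESSAGE;
        }
        if (name.startsWith("file:")) {
            return Kind.LOCAL_FILE;
        }
        // Assumption: any other "scheme://" prefix denotes an Internet location.
        if (name.matches("[A-Za-z][A-Za-z0-9+.-]+://.*")) {
            return Kind.INTERNET;
        }
        if (isAbsolutePath(name)) {
            return Kind.LOCAL_FILE;
        }
        return Kind.DEPLOYED_THEN_EXTERNAL_DIR;
    }

    // Recognizes UNIX-style roots, backslash roots, and Windows drive letters.
    static boolean isAbsolutePath(String name) {
        return name.startsWith("/")
            || name.startsWith("\\")
            || name.matches("[A-Za-z]:[\\\\/].*");
    }

    public static void main(String[] args) {
        String[] names = {
            "http://example.com/orders.xsl", "file:///var/xsl/orders.xsl",
            "/var/xsl/orders.xsl", "xsl/orders.xsl", "#styleSheetId"
        };
        for (String name : names) {
            System.out.println(name + " -> " + classify(name));
        }
    }
}
```

The example names and URLs are placeholders. The routine only classifies a name; it does not check that a stylesheet actually exists at the location.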

Stylesheet locations other than the four described above are not supported, and therefore you must migrate such stylesheets into one of the four supported locations. For example, if you have stylesheets stored in a database, a workaround is to embed a stylesheet in an input message. You can use a Compute node to read your stylesheets from your database, and then insert them into your input messages. Your modified message body must conform to the format in Listing 3 above.

Assume that your stylesheet has been extracted from your database and stored in an ESQL variable called styleSheet, that it has an id attribute defined, and that your message is in the XMLNSC domain. The following ESQL statement then gives you a reference to the id:

Listing 4. Finding a style sheet's id
DECLARE styleSheetID REFERENCE TO styleSheet.{'xsl:stylesheet'}.(XMLNSC.Attribute)id;

You can then create an embedded stylesheet name declaration for the embedded stylesheet and insert the stylesheet into the message using the following ESQL statements (assuming the name of your message body XML root is docRoot):

Listing 5. Creating an embedded style sheet name declaration and inserting a style sheet
IF LASTMOVE(styleSheetID) = TRUE THEN
    -- Create embedded style sheet name declaration
    SET OutputRoot.XMLNSC.(XMLNSC.ProcessingInstruction)"xml-stylesheet" = 
        'type="text/xsl" href="#' || styleSheetID || '"';
    -- Insert style sheet
    SET OutputRoot.XMLNSC.docRoot.{'xsl:stylesheet'} = styleSheet.{'xsl:stylesheet'};
END IF;

Deciding whether to cache stylesheets

Stylesheets are normally preprocessed into another format before being used by an XSL transformation engine. There are two preprocessed formats: interpretive and compiled. The interpretive format usually takes less time to prepare, but is less efficient when used for transformation.

To minimize preprocessing time, each XMLT node owns an in-memory stylesheet cache (one cache per node). Preprocessed stylesheets can be stored in the cache so that they need no further preprocessing when they are used again. The number of cache entries in your XMLT node is determined by the Stylesheet Cache Level property on the XMLT node Stylesheet property user interface shown above in Figure 2. If the Stylesheet Cache Level property is set to 0, nothing is cached. When the node cache is full, a new stylesheet entry replaces the oldest stylesheet in the cache.
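
The caching behavior described above can be modeled as a bounded map. The sketch below is an illustration and not the node's implementation: the class name is hypothetical, the preprocessing step is a caller-supplied function, and "oldest" is assumed to mean the entry that was cached first (first in, first out) and not the least recently used one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of a per-node stylesheet cache; not part of the broker.
final class StylesheetCacheSketch<V> {

    private final int cacheLevel;
    private final Function<String, V> preprocess;
    private final Map<String, V> entries;
    private int preprocessCount = 0;

    StylesheetCacheSketch(int cacheLevel, Function<String, V> preprocess) {
        this.cacheLevel = cacheLevel;
        this.preprocess = preprocess;
        final int limit = cacheLevel;
        // Insertion-ordered map: the eldest entry is the one that was cached first.
        this.entries = new LinkedHashMap<String, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                return size() > limit;
            }
        };
    }

    // Returns the cached form if present; otherwise preprocesses and caches it.
    V get(String name) {
        V cached = entries.get(name);
        if (cached != null) {
            return cached;
        }
        V fresh = preprocess.apply(name);
        preprocessCount++;
        if (cacheLevel > 0) {
            entries.put(name, fresh);   // may evict the entry that was cached first
        }
        return fresh;
    }

    int preprocessCount() {
        return preprocessCount;
    }

    boolean isCached(String name) {
        return entries.containsKey(name);
    }

    public static void main(String[] args) {
        StylesheetCacheSketch<String> cache =
            new StylesheetCacheSketch<>(2, name -> "preprocessed:" + name);
        cache.get("a.xsl");
        cache.get("b.xsl");
        cache.get("a.xsl");   // served from the cache
        cache.get("c.xsl");   // cache is full, so a.xsl is replaced
        System.out.println("preprocessing runs: " + cache.preprocessCount());
    }
}
```

With a cache level of 0, every call preprocesses the stylesheet again, which matches the statement that nothing is cached at that setting.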

Whether or not a stylesheet is cached does not depend on where its name is specified or where its content is stored. However, when a stylesheet is embedded in an input message, it is implied that the stylesheet is valid only for this particular input message. Therefore, the node will not put such a stylesheet into its stylesheet cache, which may reduce node speed and efficiency.

The XMLT node automatically preprocesses a stylesheet into the compiled format before putting it into the cache. Stylesheets that are not being cached are preprocessed into the interpretive format. A few words of caution:

  • In rare cases, a compilation may take a significant amount of time and become counterproductive.
  • A stylesheet compiler may not support certain parts of the XSLT standard.

In such cases, you may need to modify your stylesheets or set the Stylesheet Cache Level property to 0 to switch off the compilation.


Troubleshooting

Here are some troubleshooting steps for possible problems during the development of a WebSphere Message Broker message flow containing XMLT nodes.

Setting up your run-time environment properly: Many problems are caused by faulty environment settings. For example, if your stylesheet calls a Java extension function, ensure that the Java class is placed under the shared-classes directory.

Debugging your stylesheets: Stylesheets are programs and require debugging. The WebSphere Message Broker V6 Toolkit comes with an XSLT debugger -- for usage information, see Rational Application Developer XSLT debugger. As of December 2006, WebSphere Message Broker does not yet support debugging stylesheets directly from its Flow Debugger, so you need to import your stylesheets and extract your input message XML body into your debugger's working environment. For information on other available XSLT debuggers, see Resources below.

Performing transformations using Xalan-Java directly: If you determine that there is nothing wrong with your stylesheets, you can perform an XSL transformation by directly using the Xalan-Java transformation engine through its command-line interface. Doing so may help you identify certain problems, such as your stylesheet using features that are not supported by the engine. If your stylesheet is not cached, use the Xalan-Java Interpretive Processor; otherwise, use the Compiling Processor. For more information, see Resources below.
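
If the Xalan-Java command line is not at hand, a similar first check can be made through the standard JAXP API. The sketch below asks the default TransformerFactory to compile a stylesheet and returns the engine's error message if compilation fails. It rests on an important assumption: the XSLT engine built into the JDK is derived from Xalan's compiling processor, but it is not necessarily the same engine or version that the broker uses, so a clean result here does not guarantee a clean result in the XMLT node. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration; not part of WebSphere Message Broker.
final class StylesheetCheckSketch {

    // Returns null if the stylesheet compiles, otherwise the engine's error message.
    static String compileError(String xsl) {
        try {
            TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xsl)));
            return null;
        } catch (TransformerConfigurationException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        String xsl =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"/\"><out/></xsl:template>"
          + "</xsl:stylesheet>";
        String error = compileError(xsl);
        System.out.println(error == null ? "Stylesheet compiled" : "Compile error: " + error);
    }
}
```

The engine may also print diagnostic details to the standard error stream when compilation fails.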


Conclusion

This article has described how to configure the XMLT node and some advantages and disadvantages of various node configurations. It has also provided some problem determination steps. For more information on XMLT node tasks, see the WebSphere Message Broker information center.

Resources
