XML transformation with WebSphere Message Broker Version V6

What does the XMLTransformation node do?

XSL transformation (XSLT) is a standards-based technology that transforms XML documents. Transformation rules for the input XML documents are described in the programming language known as Extensible Stylesheet Language (XSL), which is itself in XML. The XSL transformation rules reside in a document called a stylesheet. For information on the W3C specifications for XSLT and XSL, see Related topics below.

In addition to providing other data transformation technologies -- ESQL using a Compute node and Java® programming using a JavaCompute node -- IBM® WebSphere® Message Broker V6 provides XSLT capability in its XMLTransformation node (hereafter called the XMLT node). The XMLT node is basically a wrapper around an XSL transformation engine. The XMLT node also facilitates XSL transformation of WebSphere Message Broker messages. The following diagram outlines the transformation process applied to a WebSphere Message Broker message by the XMLT node:

Figure 1. WebSphere Message Broker XSL transformation
  • The XMLT node takes the entire body of a WebSphere Message Broker message as its input XML document. It is your responsibility to ensure that the input message body can be parsed into an XML document.
  • Although the environment information in an input message is passed to the output, a new output message, which is the output of the XSLT, is generated by the node. If you need to make the original input message body available on the output side, you must do so explicitly by writing code in the stylesheet to copy the input message body to the output.
  • The node requires another input -- a stylesheet to perform a transformation. WebSphere Message Broker provides flexible run-time stylesheet selection and caching, as explained below.

Selecting and managing stylesheets

The XMLT node supports run-time dynamic stylesheet selection from different sources. It also lets you cache stylesheets in memory to improve processing speed. Here is an overview of how to use these two features to increase the effectiveness and flexibility of the XMLT node:

Stylesheet selection and caching can be configured from the XMLT node's Stylesheet property user interface:

Figure 2. XMLT node Stylesheet property user interface

Conceptually, you need to perform four stylesheet configuration tasks:

Defining the stylesheet name search order

The XMLT node must know the name of the stylesheet to use. You can specify a stylesheet name in three different locations, as described below, so you need to define the order in which the node searches them. To do so, set the XML Embedded Selection Priority for searching the input message body, the Message Environment Selection Priority for searching the local environment, and the Broker Node Attribute Selection Priority for searching the Stylesheet property user interface (shown above in Figure 2). To assign a priority to a location, set the corresponding priority property to a non-zero value, with 1 indicating the highest priority. To disable the search of a location, set its priority to 0. For every input message, the node searches the enabled locations in priority order and stops as soon as it finds a name.
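The search order described above can be modeled in a few lines of Python. This is only a sketch of the documented behavior, not the node's actual implementation; the function name and the location keys (xml_embedded, message_environment, broker_node_attribute) are hypothetical labels chosen for this illustration.

```python
from typing import Dict, Optional


def select_stylesheet_name(candidates: Dict[str, Optional[str]],
                           priorities: Dict[str, int]) -> Optional[str]:
    """Return the first stylesheet name found, searching enabled locations in priority order."""
    # A priority of 0 disables a location; 1 is the highest priority.
    enabled = sorted((prio, loc) for loc, prio in priorities.items() if prio > 0)
    for _, location in enabled:
        name = candidates.get(location)
        if name:
            return name  # stop at the first location that supplies a name
    return None


# Hypothetical configuration: local environment first, message body second,
# node properties disabled.
priorities = {"xml_embedded": 2, "message_environment": 1, "broker_node_attribute": 0}
candidates = {"xml_embedded": "a.xsl", "message_environment": None,
              "broker_node_attribute": "c.xsl"}
print(select_stylesheet_name(candidates, priorities))  # a.xsl
```

In this example the local environment is searched first but supplies no name, so the name embedded in the message body is used; the name in the node properties is never considered because its priority is 0.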

Defining stylesheet names

  1. Embedded in the input message: If you are specifying a name in the input message body, include an xml-stylesheet processing instruction, which is defined by the W3C recommendation Associating Style Sheets with XML Documents, in your input message body. Here is an example:

    Listing 1. Embedded stylesheet name declaration
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xml" href="<style sheet name>" ?>

    The value of href is extracted by the node when the input message body is searched. The advantage of this approach is that it gives you an easy way to directly associate a stylesheet name with individual input messages. It implies that whoever constructs the input message must know the name of the stylesheet to use. If you modify the content of a stylesheet specified in this way, you must ensure that the change is valid for all input messages explicitly referencing it, which may complicate maintenance.

  2. Specified in the local environment message tree: If you are specifying a stylesheet name in the local environment, you need to use a preceding Compute node to define the local environment variable ComIbmXslXmltStylesheetname:

    Listing 2. Declaring style sheet name in the local environment
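    -- In a Compute node, write to OutputLocalEnvironment and set Compute Mode to include LocalEnvironment
    SET OutputLocalEnvironment.ComIbmXslXmltStylesheetname = '<style sheet name>';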

    The advantage of this technique is that you can change stylesheets dynamically for different input messages (you create your selection logic in the Compute node). The disadvantage is that you need an extra node.

  3. Specified in the Stylesheet property user interface: If you want to use the node's Stylesheet property user interface shown above in Figure 2 to declare a stylesheet name, fill in the Stylesheet Directory / Stylesheet Name properties in the interface. The node will concatenate the values of these two properties and treat the result as a reference to a file in the file system. (This is the only place where the node uses the Stylesheet Directory value; in other words, the value will not be concatenated with, say, a name specified in the local environment.) You can fill in these two properties manually, or you can click Browse beside the Stylesheet Name property to fill in the Stylesheet Name.

    The advantage of this approach is its simplicity, while the disadvantage is that the name cannot be changed dynamically.
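To check what name an embedded declaration such as the one in Listing 1 carries, you can extract the href value outside the broker. The following Python sketch uses only the standard library; the function name and the sample stylesheet name orders.xsl are hypothetical, and the parsing shown here is an approximation, not the node's own logic.

```python
import re
from typing import Optional
from xml.dom import Node, minidom


def embedded_stylesheet_href(xml_text: str) -> Optional[str]:
    """Return the href of the first top-level xml-stylesheet processing instruction, if any."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The PI data holds pseudo-attributes such as type="..." href="..."
            match = re.search(r'href\s*=\s*(["\'])(.*?)\1', node.data)
            if match:
                return match.group(2)
    return None


message = ('<?xml version="1.0"?>'
           '<?xml-stylesheet type="text/xml" href="orders.xsl" ?>'
           '<docRoot/>')
print(embedded_stylesheet_href(message))  # orders.xsl
```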

Indicating where your stylesheets can be found

The XMLT node is quite flexible regarding the source of the named stylesheet. It can access four different sources as described below. The node determines the location of your named stylesheet by examining the declared name.

  1. Stylesheets on the Internet: Indicated by a name starting with an Internet transfer protocol, such as http://. It is your responsibility to ensure that the stylesheet is available at the specified location.
  2. Stylesheets in the local file system: Indicated by a name starting with a file protocol (file://) or without any leading protocol. If your stylesheet name specification resolves to a relative file path, the node will treat the path as relative to the directory <MQSI_WORKPATH>/XSL/external, where <MQSI_WORKPATH> is the directory defined by the MQSI_WORKPATH system environment variable. Again, it is your responsibility to ensure that the stylesheet is available at the specified location.
  3. Stylesheets in the execution group's deployed stylesheet storage: WebSphere Message Broker V6 supports deploying stylesheets to execution groups, and each execution group has its own independent stylesheet storage. In order to put a stylesheet into an execution group's stylesheet storage, you must add the stylesheet to a BAR file and deploy the BAR file to the execution group. You can do this manually, but a simpler way is to click Browse beside the Stylesheet Name property to identify a stylesheet in your tooling workspace, which will cause the stylesheet to be automatically added to your BAR file. Do not put anything into the Stylesheet Directory property. For more information, see the WebSphere Message Broker information center.

    A name referencing a deployed stylesheet must have a relative file path and cannot have any leading protocol in its name. When it encounters a relative file path name specification, the XMLT node first searches the execution group's stylesheet storage. If the stylesheet is not found there, it then searches the <MQSI_WORKPATH>/XSL/external directory.

    The advantage of using deployed stylesheets is that the node will manage them for you (for example, by backing out a deployment if things go wrong). The disadvantage is that you need to deploy a copy of the stylesheets to each execution group that needs them.

  4. Stylesheets embedded in an input message: Besides embedding a stylesheet name, you can embed the entire contents of a stylesheet in an input message. To do so, make your stylesheet a sub-element of your input XML message body and assign an identity to it using an XML id attribute. You can then use the identity as the name of your stylesheet in an embedded stylesheet name declaration. However, the stylesheet name must be preceded by a '#', as shown in the sample XML document below:

    Listing 3. XML document with an embedded style sheet
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xml" href="#styleSheetId" ?>
    <docRoot>
        <docData>
            ...
        </docData>
        <xsl:stylesheet id="styleSheetId" version="1.0"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
            <xsl:template match="xsl:stylesheet"/>
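            <!-- Identity template: copies everything not matched by a more specific template -->
            <xsl:template match="@*|node()">
                <xsl:copy>
                    <xsl:apply-templates select="@*|node()"/>
                </xsl:copy>
            </xsl:template>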
        </xsl:stylesheet>
    </docRoot>

    In the XML document above:

    • The embedded stylesheet's id is "styleSheetId".
    • The embedded stylesheet is referenced as "#styleSheetId".
    • The line <xsl:template match="xsl:stylesheet"/> prevents the embedded stylesheet from being copied into the output, which is normally the behavior you want.

    You can only reference the content of an embedded stylesheet using a <?xml-stylesheet href="..."?> processing instruction in an input message. In other words, you cannot use other naming mechanisms, such as local environment variables, to reference the content of an embedded stylesheet.
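The way the node infers a stylesheet's source from its name can be summarized in a small Python model. It is a sketch of the rules listed above under stated assumptions: POSIX-style paths, every non-file protocol treated as an Internet location, and a made-up work path (/var/mqsi). The function and label names are hypothetical.

```python
import posixpath
from typing import List, Tuple


def candidate_sources(name: str, mqsi_workpath: str) -> List[Tuple[str, str]]:
    """Model where a stylesheet name would be looked up, in search order."""
    if name.startswith("#"):
        # Only meaningful when the name came from an xml-stylesheet declaration.
        return [("embedded", name[1:])]

    had_file_protocol = False
    if "://" in name:
        scheme, rest = name.split("://", 1)
        if scheme.lower() != "file":
            return [("internet", name)]
        had_file_protocol = True
        name = rest

    if posixpath.isabs(name):
        return [("file system", name)]

    external = posixpath.join(mqsi_workpath, "XSL", "external", name)
    if had_file_protocol:
        # Deployed stylesheets cannot be referenced with a leading protocol.
        return [("file system", external)]
    # Relative name without protocol: deployed storage first, then the external directory.
    return [("deployed", name), ("file system", external)]


print(candidate_sources("xsl/orders.xsl", "/var/mqsi"))
```

For a relative name without a protocol, the result lists the execution group's deployed storage first and the <MQSI_WORKPATH>/XSL/external directory second, matching the search order described for deployed stylesheets.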

Stylesheet locations other than the four described above are not supported, and therefore you must migrate such stylesheets into one of the four supported locations. For example, if you have stylesheets stored in a database, a workaround is to embed a stylesheet in an input message. You can use a Compute node to read your stylesheets from your database, and then insert them into your input messages. Your modified message body must conform to the format in Listing 3 above.

Assume that your stylesheet has been extracted from your database and stored in an ESQL variable called styleSheet, that it has an id attribute defined, and that your message is in the XMLNSC domain. The following ESQL statement then gives you a reference to that id attribute:

Listing 4. Finding a style sheet's id
DECLARE styleSheetID REFERENCE TO styleSheet.{'xsl:stylesheet'}.(XMLNSC.Attribute)id;

You can then create an embedded stylesheet name declaration for the embedded stylesheet and insert the stylesheet into the message using the following ESQL statements (assuming the name of your message body XML root is docRoot):

Listing 5. Creating an embedded style sheet name declaration and inserting a style sheet
IF LASTMOVE(styleSheetID) = TRUE THEN
    -- Create embedded style sheet name declaration
    SET OutputRoot.XMLNSC.(XMLNSC.ProcessingInstruction)"xml-stylesheet" = 
        'type="text/xsl" href="#' || styleSheetID || '"';
    -- Insert style sheet
    SET OutputRoot.XMLNSC.docRoot.{'xsl:stylesheet'} = styleSheet.{'xsl:stylesheet'};
END IF;
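If you want to check the shape of the resulting message outside the broker, the same two steps (add the declaration, then insert the stylesheet) can be mimicked with the Python standard library. This is a rough offline analogue of Listing 5, not a replacement for the ESQL; the function name is hypothetical, and the message root is taken from the message itself instead of being assumed to be docRoot.

```python
from xml.dom import minidom


def embed_stylesheet(message_xml: str, stylesheet_xml: str) -> str:
    """Return message_xml with the stylesheet embedded and referenced by its id."""
    message = minidom.parseString(message_xml)
    stylesheet = minidom.parseString(stylesheet_xml).documentElement

    sheet_id = stylesheet.getAttribute("id")
    if not sheet_id:
        return message_xml  # no id attribute: leave the message unchanged

    # Create the embedded stylesheet name declaration ahead of the root element.
    declaration = message.createProcessingInstruction(
        "xml-stylesheet", 'type="text/xsl" href="#%s"' % sheet_id)
    message.insertBefore(declaration, message.documentElement)

    # Insert a deep copy of the stylesheet as the last child of the message root.
    message.documentElement.appendChild(message.importNode(stylesheet, True))
    return message.toxml()


sheet = ('<xsl:stylesheet id="styleSheetId" version="1.0" '
         'xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>')
print(embed_stylesheet("<docRoot><docData/></docRoot>", sheet))
```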

Deciding whether to cache stylesheets

Stylesheets are normally preprocessed into another format before being used by an XSL transformation engine. There are two preprocessed formats: interpretive and compiled. The interpretive format usually takes less time to prepare, but is less efficient when used for transformation.

To minimize preprocessing time, each XMLT node owns an in-memory stylesheet cache (one cache per node). Preprocessed stylesheets can be stored in the cache so that no more preprocessing is required when they are needed again. The number of cache entries in your XMLT node is determined by the Stylesheet Cache Level property on the XMLT node Stylesheet property user interface shown above in Figure 2. If the Stylesheet Cache Level property is set to 0, nothing is cached. When the node cache is full, a new stylesheet entry replaces the oldest stylesheet in the cache.
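The caching behavior just described can be pictured with a toy Python model. It assumes that "oldest" means the entry that was added first; the class name, the preprocess callback, and the sample stylesheet names are all hypothetical and say nothing about how the broker implements its cache.

```python
from collections import OrderedDict
from typing import Any, Callable


class StylesheetCache:
    """Toy model of a per-node stylesheet cache (not the broker's implementation)."""

    def __init__(self, cache_level: int) -> None:
        self.cache_level = cache_level  # number of entries; 0 disables caching
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, name: str, preprocess: Callable[[str], Any]) -> Any:
        if name in self._entries:
            return self._entries[name]  # cache hit: no preprocessing needed
        prepared = preprocess(name)
        if self.cache_level > 0:
            if len(self._entries) >= self.cache_level:
                self._entries.popitem(last=False)  # drop the oldest entry
            self._entries[name] = prepared
        return prepared


cache = StylesheetCache(cache_level=2)
for sheet in ["a.xsl", "b.xsl", "a.xsl", "c.xsl"]:
    cache.get(sheet, preprocess=lambda n: "prepared:" + n)
```

After this loop the cache holds b.xsl and c.xsl: a.xsl was preprocessed once, reused once, and then replaced when c.xsl arrived and the cache was full.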

Whether or not a stylesheet is cached does not depend on where its name is specified or where its content is stored. However, when a stylesheet is embedded in an input message, it is implied that the stylesheet is valid only for this particular input message. Therefore, the node will not put such a stylesheet into its stylesheet cache, which may reduce node speed and efficiency.

The XMLT node automatically preprocesses a stylesheet into the compiled format before putting it into the cache. Stylesheets that are not being cached are preprocessed into the interpretive format. A few words of caution:

  • In rare cases, a compilation may take a significant amount of time and become counterproductive.
  • A stylesheet compiler may not support certain parts of the XSLT standard.

In such cases, you may need to modify your stylesheets or set the Stylesheet Cache Level property to 0 to switch off the compilation.

Troubleshooting

Here are some troubleshooting steps for possible problems during the development of a WebSphere Message Broker message flow containing XMLT nodes.

Setting up your run-time environment properly: Many problems are caused by faulty environment settings. For example, if your stylesheet calls a Java extension function, ensure that the Java class is placed under the shared-classes directory.

Debugging your stylesheets: Stylesheets are programs and require debugging. The WebSphere Message Broker V6 Toolkit comes with an XSLT debugger -- for usage information, see Rational Application Developer XSLT debugger. As of December 2006, WebSphere Message Broker does not yet support debugging stylesheets directly from its Flow Debugger, so you need to import your stylesheets and extract your input message XML body into your debugger's working environment. For information on other available XSLT debuggers, see Related topics below.

Performing transformations using Xalan-Java directly: If you determine that there is nothing wrong with your stylesheets, you can perform an XSL transformation by directly using the Xalan-Java transformation engine through its command-line interface. Doing so may help you identify certain problems, such as your stylesheet using features that are not supported by the engine. If your stylesheet is not cached, use the Xalan-Java Interpretive Processor; otherwise, use the Compiling Processor. For more information, see Related topics below.

Conclusion

This article has described how to configure the XMLT node and some advantages and disadvantages of various node configurations. It has also provided some problem determination steps. For more information on XMLT node tasks, see the WebSphere Message Broker information center.


Related topics


Published: December 6, 2006