About transforming tabular data (DataStage)
XML Output requires XPath expressions to transform tabular data to XML. A table definition stores the XPath expressions. You record and maintain the XPath expressions in the Description property on the Columns pages of the stage.
About National Language Support (NLS) in DataStage
XML Output supports different character encodings for output documents, depending on the NLS mode.
DataStage in NLS mode
When DataStage® runs in NLS mode, all Internet Assigned Numbers Authority (IANA) character sets are supported. For a complete list of character sets, visit the following IANA web page:
http://www.iana.org/assignments/character-sets
For information about selecting the output encoding, see the Options page.
Preserving the encoding of output documents
An integral part of any XML document is its encoding. If you apply a DataStage map to the document, the document may become corrupted.
To prevent corrupting an XML document, perform one of the following steps:
- Set the stage map to NONE in each downstream stage.
- Set the map for the column that contains the XML input to NONE in each downstream stage.
- Set the SQL type for the column that contains the XML input to VarBinary in each downstream stage and on the output link of the XML Input stage.
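To see why a character map can corrupt a document, the following Python sketch simulates a downstream step that reinterprets the bytes of an XML document in the wrong character set. It is not DataStage code; the sample document and the choice of ISO-8859-1 as the wrong map are assumptions made for illustration.

```python
# Illustration only: simulates a character map being applied to XML bytes.
xml_text = '<?xml version="1.0" encoding="UTF-8"?><city>Zürich</city>'

# The document as written, in the encoding named in its XML declaration.
original_bytes = xml_text.encode("utf-8")

# A map that treats the bytes as ISO-8859-1 and re-encodes them as UTF-8
# changes the byte sequence, so the declaration no longer matches the content.
remapped_bytes = original_bytes.decode("iso-8859-1").encode("utf-8")

# Passing the bytes through untouched (the effect of NONE or VarBinary)
# leaves the document exactly as it was written.
passthrough_bytes = bytes(original_bytes)

print(remapped_bytes == original_bytes)
print(passthrough_bytes == original_bytes)
```

The remapped copy still parses as text, but the non-ASCII character is no longer the one that was written, which is the kind of silent corruption that the steps above avoid.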
DataStage in non-NLS mode
When DataStage runs in non-NLS mode, note the following information:
- The document is written in UTF-8.
- Input columns are encoded using the local codepage of the machine hosting the engine tier. Therefore, assume that input data to the XML Output stage has been encoded with this codepage.
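The non-NLS behavior can be pictured as a decode with the local codepage followed by an encode to UTF-8. The Python sketch below is an illustration, not DataStage code; the codepage cp1252 and the sample value are assumptions.

```python
# Assumption: the engine tier host uses the Windows-1252 codepage.
local_codepage = "cp1252"

# Bytes as they would arrive in an input column, encoded with the local codepage.
input_bytes = "Müller".encode(local_codepage)

# Interpret the column with the local codepage, then write it as UTF-8.
text = input_bytes.decode(local_codepage)
utf8_bytes = text.encode("utf-8")

print(input_bytes)
print(utf8_bytes)
```

If the input data was actually encoded with some other codepage, the decode step would produce the wrong characters, which is why the assumption about the local codepage matters.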
Supported XPath expressions
The following Backus Naur Form (BNF) diagram describes the subset of XPath expressions that you can use in XML Output.
path ::= ['/'] (element_spec '/')* end_segment
end_segment ::= element_spec['/text()'] | '@' attribute
element_spec ::= element '[ 'attr_value ( 'and' attr_value )*' ]'
attr_value ::= '@' attribute '=' '"'value'"'
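As a rough way to check expressions against this grammar, the following Python sketch translates the productions into a regular expression. It is not part of DataStage. It assumes that the attribute predicate on an element is optional (expressions such as /a/b are used elsewhere in this topic), that names follow a simplified XML name pattern, and that whitespace is allowed around "and" and "="; the function name is hypothetical.

```python
import re

# Simplified XML name with an optional namespace prefix (assumption).
NAME = r"[A-Za-z_][\w.\-]*(?::[A-Za-z_][\w.\-]*)?"

# attr_value ::= '@' attribute '=' '"' value '"'
ATTR_VALUE = rf'@{NAME}\s*=\s*"[^"]*"'

# '[' attr_value ( 'and' attr_value )* ']'
PREDICATE = rf"\[\s*{ATTR_VALUE}(?:\s+and\s+{ATTR_VALUE})*\s*\]"

# element_spec ::= element, with the predicate treated as optional (assumption).
ELEMENT_SPEC = rf"{NAME}(?:{PREDICATE})?"

# end_segment ::= element_spec['/text()'] | '@' attribute
END_SEGMENT = rf"(?:{ELEMENT_SPEC}(?:/text\(\))?|@{NAME})"

# path ::= ['/'] (element_spec '/')* end_segment
PATH = re.compile(rf"/?(?:{ELEMENT_SPEC}/)*{END_SEGMENT}")


def is_supported_xpath(expr: str) -> bool:
    """Return True if expr matches the subset described by the grammar."""
    return PATH.fullmatch(expr) is not None


for candidate in ["/a/b", "/a/b/text()", "/a/@id", "//a", "/a/b[1]"]:
    print(candidate, is_supported_xpath(candidate))
```

Expressions that use features outside the grammar, such as the descendant axis or positional predicates, are rejected by this check.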
Equivalent XPath expressions
For an XML Output operation, two types of XPath expressions are equivalent. Both expressions result in the text node being included:
- An expression that ends with an element name: /a/b
- An expression that ends with a text node: /a/b/text()
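One way to treat the two forms as the same expression is to strip a trailing text node before comparing. The helper below is a minimal Python sketch with a hypothetical name, not a DataStage function.

```python
def normalize_xpath(expr: str) -> str:
    """Drop a trailing /text() so both equivalent forms compare equal."""
    suffix = "/text()"
    return expr[: -len(suffix)] if expr.endswith(suffix) else expr


print(normalize_xpath("/a/b/text()") == normalize_xpath("/a/b"))
```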
Using XPath expressions
If a stage has both an input and an output link, XPath expressions are required on both links.
XPaths on input link
On the input link, the XPath expressions drive the generation of XML. Each XPath expression maps the values of an input column to a node in an XML hierarchy.
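The mapping from input columns to XML nodes can be sketched in Python as follows. This is a simplified illustration for a single row, not the stage's implementation: the column names, the XPath expressions, and the function `build_xml` are hypothetical, and attribute predicates are not handled.

```python
import xml.etree.ElementTree as ET


def build_xml(row: dict, column_xpaths: dict) -> str:
    """Build one XML document from one row, using one XPath per column."""
    root = None
    for column, xpath in column_xpaths.items():
        # Split into steps; a trailing text() is equivalent to the element itself.
        steps = [s for s in xpath.split("/") if s and s != "text()"]
        if root is None:
            root = ET.Element(steps[0])
        elif root.tag != steps[0]:
            raise ValueError("all XPath expressions must share one root element")
        node = root
        for step in steps[1:]:
            if step.startswith("@"):
                # The path ends in an attribute: store the value on the element.
                node.set(step[1:], str(row[column]))
                break
            child = node.find(step)
            if child is None:
                child = ET.SubElement(node, step)
            node = child
        else:
            # The path ends in an element: store the value as its text node.
            node.text = str(row[column])
    return ET.tostring(root, encoding="unicode")


row = {"CUST_NAME": "Ann", "ORDER_ID": 7}
column_xpaths = {
    "CUST_NAME": "/orders/cust/name/text()",
    "ORDER_ID": "/orders/order/@id",
}
print(build_xml(row, column_xpaths))
```

Columns whose expressions share leading nodes end up under the same parent element, which is how the hierarchy emerges from flat columns.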
XPaths on output link
Each output column that has an XPath expression is a candidate for receiving XML. The source of the XML for an output column is the set of input columns whose XPath expressions start with the same nodes as the XPath expression of the output column.
To make the entire XML available to an output column, use one forward slash as the XPath expression. The forward slash identifies the root node.
The following table demonstrates the relationship between XPath expressions on the input and output links. Two output columns use XPath expressions that form the first part of one or more XPath expressions used by input columns. For example, the output column that uses the XPath expression /orders receives XML generated using the XPath expressions /orders/cust and /orders/items. The column that uses the forward slash receives all the XML.
| Input column XPaths | Output column XPaths | | |
|---|---|---|---|
| | | | |
| | Yes | No | Yes |
| | No | No | Yes |
| | No | No | Yes |
| | Yes | Yes | Yes |
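The matching rule between output and input XPath expressions can be sketched in Python as a comparison of leading steps. This is an illustration of the rule as described, not the stage's code; the function names are hypothetical, and /addresses/address is an assumed input expression added to show a non-matching case.

```python
def steps(xpath: str) -> list:
    """Split an XPath expression into its steps; "/" yields an empty list."""
    return [s for s in xpath.split("/") if s]


def feeds(output_xpath: str, input_xpath: str) -> bool:
    """True if the input expression starts with the nodes of the output expression."""
    out, inp = steps(output_xpath), steps(input_xpath)
    return inp[: len(out)] == out


input_xpaths = ["/orders/cust", "/orders/items", "/addresses/address"]

for output_xpath in ["/orders", "/"]:
    sources = [x for x in input_xpaths if feeds(output_xpath, x)]
    print(output_xpath, "receives XML from", sources)
```

Comparing whole steps rather than raw string prefixes keeps an output expression such as /order from matching input expressions under /orders.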
Mapping related data to different root elements
You can easily segregate related data in the XML by varying the root element. This feature is available when your XML Output stage has both input and output links. In a stage with only an input link, all XPath expressions must specify the same root element.
Example of XPath expressions
The input consists of addresses and orders for customers. The address data is
grouped using the root element /addresses. The order data is grouped using the root
element /orders.

The ADDRESSES column receives the following XML structures:
<addresses>
<address street=" " city=" ">
...
</addresses>
The ORDERS column receives the following XML structures:
<orders>
<order id=" ">
<order item=" ">
...
</orders>
Parsing XML reserved and special characters
You can avoid parsing reserved and special XML characters that are already represented by entity references (&entity;) by setting the Data element property on the input link to XML.
If you use a different data element value or omit it, XML Output parses the input to make it XML-safe.
For example, the value &lt; replaces the less-than symbol (<).
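The effect of the Data element setting can be illustrated with a short Python sketch. It uses the standard library function `xml.sax.saxutils.escape`, which replaces &, < and >; the exact set of characters that XML Output replaces is not stated here, and the function `prepare_value` and its parameter are hypothetical.

```python
from xml.sax.saxutils import escape


def prepare_value(value: str, data_element: str = "") -> str:
    """Return the value as it would be written into the XML document."""
    if data_element == "XML":
        # Value is treated as already XML-safe and is passed through unchanged.
        return value
    # Any other setting: replace reserved characters with entity references.
    return escape(value)


print(prepare_value("a < b"))
print(prepare_value("a &lt; b", data_element="XML"))
print(prepare_value("a &lt; b"))
```

The last call shows why the setting matters: a value that already contains entity references is escaped a second time when it is not marked as XML.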