IBM Support

WDI with XML Mixed Content Model

Question & Answer


Question

Sometimes XML DTDs and schemas define the data ambiguously: rather than specifying the document structure precisely, they leave much of it open. Examples include the ANY content specification, mixed content, substitution groups, and late binding. WebSphere Data Interchange (WDI), however, builds the map and the translation from the document definition in the DTD or schema, not from a particular instance of the document. If the DTD or schema does not describe the document structure in enough detail, mapping it without modification can be difficult or impossible. In these cases, the DTD or schema may be fine for validation but may need to be modified before it can be used for mapping.

Answer

The following is an example of a mixed content definition and XML input:

<!ELEMENT Comments ( #PCDATA | Attachment )* >

XML input:

<Comments>Some stuff

<Attachment><URL>URL data 1</URL></Attachment>

Maybe more stuff here

<Attachment><URL>URL data 2</URL></Attachment>

</Comments>

For mapping purposes, WDI Client displays mixed content as a Choice node (CHnnnn); in this example, Comments.CHnnnn.

Although the Choice node seems to identify repeating content to allow multiple or single occurrence mapping, this is not the case. WDI does not have enough information about the Choice node to determine repeating content. The Choice node does not have an element identification or Tag in the input data. Therefore WDI cannot identify occurrences of the Choice node and any qualification mapping (Multi-Occurrence or Occurrence) on this node will be ignored.
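The point about the missing tag can be seen with any generic XML parser. The following is a minimal Python sketch, not WDI code, that uses the standard library's xml.etree.ElementTree to list the children of Comments in document order. The helper name describe_children is hypothetical. The text runs come back with no tag at all (shown as None), which is why nothing in the data marks where one occurrence of the choice ends and the next begins.

```python
import xml.etree.ElementTree as ET

# The mixed-content input from the example above, written on one line.
MIXED = (
    "<Comments>Some stuff"
    "<Attachment><URL>URL data 1</URL></Attachment>"
    "Maybe more stuff here"
    "<Attachment><URL>URL data 2</URL></Attachment>"
    "</Comments>"
)

def describe_children(xml_text):
    """Return (tag, text) pairs in document order; tag is None for bare text."""
    root = ET.fromstring(xml_text)
    items = []
    # Text before the first child element is stored on the parent.
    if root.text and root.text.strip():
        items.append((None, root.text.strip()))
    for child in root:
        items.append((child.tag, child.findtext("URL")))
        # Text after a child element is stored as that child's "tail".
        if child.tail and child.tail.strip():
            items.append((None, child.tail.strip()))
    return items

ITEMS = describe_children(MIXED)
for tag, text in ITEMS:
    print(tag, text)
```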

To map the mixed content, you can map Comments.PCDATA and URL.PCDATA to the output, but these nodes do not repeat, so only the first occurrence of each is processed. The output is:

<Messages>

<Message>Some stuff</Message>

<Message>URL data 1</Message>

</Messages>


With changes to the content definition and the input XML, the repeating content can be mapped so that all occurrences of the PCDATA and Attachment are processed. The following example redefines the Comments element and shows the corresponding XML input:

<!ELEMENT Comments ( CommentText* | Attachment*)>

<!ELEMENT CommentText (#PCDATA)>

XML input:

<Comments>

<CommentText>Some stuff</CommentText>

<Attachment><URL>URL data 1</URL></Attachment>

<CommentText>Maybe more stuff here</CommentText>

<Attachment><URL>URL data 2</URL></Attachment>

</Comments>

This example removes the mixed content definition by defining an element for the PCDATA and defines Comments as a choice of repeating elements.
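If the input data cannot be changed at its source, one possible approach is to rewrite it before translation. The following Python sketch is an assumption about how such a pre-processing step might look; it is not a WDI feature, and the function name wrap_mixed_text is hypothetical. It wraps each bare text run inside Comments in a CommentText element and leaves the Attachment elements and their order untouched.

```python
import xml.etree.ElementTree as ET

def wrap_mixed_text(xml_text, wrapper="CommentText"):
    """Rewrite mixed content so that every text run gets its own element."""
    old = ET.fromstring(xml_text)
    new = ET.Element(old.tag, old.attrib)

    def add_text(text):
        # Skip runs that are empty or whitespace only.
        if text and text.strip():
            ET.SubElement(new, wrapper).text = text.strip()

    add_text(old.text)            # text before the first child element
    for child in list(old):
        tail, child.tail = child.tail, None
        new.append(child)         # keep the child element as it is
        add_text(tail)            # text that followed the child element
    return ET.tostring(new, encoding="unicode")

MIXED = (
    "<Comments>Some stuff"
    "<Attachment><URL>URL data 1</URL></Attachment>"
    "Maybe more stuff here"
    "<Attachment><URL>URL data 2</URL></Attachment>"
    "</Comments>"
)

RESULT = wrap_mixed_text(MIXED)
print(RESULT)
```

The sketch does not validate its output. Under standard DTD semantics, the content model ( CommentText* | Attachment* ) accepts a run of only one of the two element types, so a validating parser may reject interleaved input such as this.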

With this definition, WDI processes all of the CommentText elements together and then all of the Attachment elements together, with the following results:

<Messages>

<Message>Some stuff</Message>

<Message>Maybe more stuff here</Message>

<Message>URL data 1</Message>

<Message>URL data 2</Message>

</Messages>

Inside the Choice node, both CommentText and Attachment show repeating content, which allows multiple-occurrence or single-occurrence mapping. Because these nodes have an element identification (tag) in the input data, WDI can identify their occurrences, and qualification mapping (Multi-Occurrence or Occurrence) on these nodes is executed.

Results of Qualify(Occurrence() EQ 1):

<Messages>

<Message>Some stuff</Message>

<Message>URL data 1</Message>

</Messages>
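The grouping and qualification behavior described above can be modeled outside WDI. The following Python sketch only imitates the output order shown in the two result listings; it does not run or reproduce the WDI translator. The function name model_messages and its occurrence parameter are hypothetical stand-ins for the map and for Qualify(Occurrence() EQ n).

```python
import xml.etree.ElementTree as ET

RESTRUCTURED = (
    "<Comments>"
    "<CommentText>Some stuff</CommentText>"
    "<Attachment><URL>URL data 1</URL></Attachment>"
    "<CommentText>Maybe more stuff here</CommentText>"
    "<Attachment><URL>URL data 2</URL></Attachment>"
    "</Comments>"
)

def model_messages(xml_text, occurrence=None):
    """Collect message texts grouped by element type, as in the results above.

    occurrence=None keeps every occurrence; occurrence=n keeps only the
    n-th occurrence (1-based) of each element type.
    """
    root = ET.fromstring(xml_text)
    groups = [
        [e.text for e in root.findall("CommentText")],
        [a.findtext("URL") for a in root.findall("Attachment")],
    ]
    messages = []
    for group in groups:
        if occurrence is None:
            messages.extend(group)
        elif 1 <= occurrence <= len(group):
            messages.append(group[occurrence - 1])
    return messages

ALL_MESSAGES = model_messages(RESTRUCTURED)
FIRST_ONLY = model_messages(RESTRUCTURED, occurrence=1)
print(ALL_MESSAGES)
print(FIRST_ONLY)
```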

When dealing with mixed content, you may know what the structure of the data is, but WDI does not! Mapping logic, changes to the DTD or schema, and changes to the input data may be necessary to achieve your expected results.

Product: WebSphere Data Interchange
Component: WDI 3.3 z/OS
Platform: AIX, Windows, z/OS
Version: 3.3
Edition: All Editions

Document Information

Modified date:
01 August 2018

UID

swg21981240