Working with XML Document Types

An XML document type is an asset in the IS namespace created from an XML Schema definition. When you create an XML document type from an XML schema definition, Integration Server creates a collection of XML document types to represent the structure, content, and constraints defined in an XML schema definition. Each XML document type corresponds to a global element declaration, global attribute declaration, or global complex type definition in an XML Schema definition.

What Is an XML Document Type?

An XML document type defines the structure and types of data in a document. When you create an XML document type from an XML schema definition, Integration Server creates a collection of assets to represent the XML schema definition, which can include:

  • XML document types, each of which corresponds to a global element declaration, global attribute declaration, or global complex type definition in an XML schema definition.
  • XML fields, each of which corresponds to a global element declaration with simple content.
  • IS schemas, which contain the global simple type definitions for a particular namespace in the XML schema definition.
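
To see which parts of a schema are "global" in the sense used above, the following minimal sketch lists the named top-level components of an XML schema definition using only the JDK's DOM parser. It does not call any Integration Server API and does not reproduce how Integration Server generates assets; the class name GlobalComponentLister, the sample schema, and the http://example.com/orders namespace are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: not part of Integration Server or Designer.
final class GlobalComponentLister {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical sample schema with two global elements, one global
    // complex type, one global simple type, and one local element ("id").
    static final String SAMPLE_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " xmlns:o='http://example.com/orders'"
        + " targetNamespace='http://example.com/orders'>"
        + "<xs:element name='order' type='o:OrderType'/>"
        + "<xs:element name='note' type='xs:string'/>"
        + "<xs:complexType name='OrderType'><xs:sequence>"
        + "<xs:element name='id' type='xs:int'/>"
        + "</xs:sequence></xs:complexType>"
        + "<xs:simpleType name='Sku'><xs:restriction base='xs:string'/></xs:simpleType>"
        + "</xs:schema>";

    /** Returns "kind:name" for each named direct child of xs:schema. */
    static List<String> listGlobals(String xsd) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xsd)));
            List<String> globals = new ArrayList<>();
            // Only direct children of the root are global; nested
            // declarations such as "id" are local and are skipped.
            for (Node n = doc.getDocumentElement().getFirstChild();
                    n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE
                        || !XSD_NS.equals(n.getNamespaceURI())) {
                    continue;
                }
                Element e = (Element) n;
                if (e.hasAttribute("name")) {
                    globals.add(e.getLocalName() + ":" + e.getAttribute("name"));
                }
            }
            return globals;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse schema", ex);
        }
    }

    public static void main(String[] args) {
        // Prints: [element:order, element:note, complexType:OrderType, simpleType:Sku]
        System.out.println(listGlobals(SAMPLE_XSD));
    }
}
```

In this sample, the global elements and the global complex type are the kinds of components that the list above associates with XML document types or XML fields, and the global simple type is the kind that is collected into an IS schema. The local element id does not appear in the output because it is not a global declaration.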

Like IS document types, XML document types can be used to define the structure of a document being created from an XML node or a document being converted to XML, define service signatures, build a document or document list field, or perform data validation. However, XML document types more accurately represent XML Schema definitions and provide support for XML schema constructs and features that are not supported through IS document types.

What Is XMLData?

At run time, instances of XML document types and XML fields are contained in an XMLData object. XMLData is an IData object that uses a specific encoding format to represent the XML Information Set (XML Infoset). The format facilitates all the features of XML Infoset and XML Schema, including support for capabilities such as nested model groups and substitution groups. The format also eliminates the need to specify an association between prefixes and namespace URIs.

Unlike the traditional encoding used for XML representation with raw IData, the encoding format for XMLData is not public and is subject to change at any time. The built-in services that support XMLData, including those in the pub.xmldata folder in the WmPublic package, and flow MAP step operations are the only supported means of accessing and modifying XMLData. Directly manipulating an XMLData object as you would a traditional IData object will lead to unexpected results.

Why Use XML Document Types Instead of IS Document Types?

While XML document types and IS document types have similar uses and, in some cases, similar sources as both can be created from an XML schema definition, XML document types offer the following distinct benefits:

  • Improved XML namespace handling. XML document types do not use prefixes from the XML schema definition in the names of document types or fields (variables). Instead, IS uses the following format for XML document type names, XML field names, and names of fields within XML document types: NCName#NamespaceURI. This naming convention ensures that XML document types, XML fields, and fields within a document type have a unique name, preventing the conflicts that arise when IS document types are generated using one set of prefixes and the instance XML documents use a different set of prefixes.
  • Nested and repeating model group support. IS can create XML document types from XML schema definitions that contain nested model groups or repeating model groups. An IS document type cannot correctly represent nested and repeating model groups.
  • Any attribute support. IS can create XML document types from XML schema definitions that contain the anyAttribute element. An IS document type cannot correctly represent the anyAttribute element.
  • Improved support for substitution groups.
  • Improved support for any element.
  • Support for xsi:nil and xsi:type on simple types (String fields). IS document types support xsi:nil and xsi:type on complex types (Document fields) only.
  • Improved handling of identically named fields at the same level.

If you want your solutions to incorporate or leverage any of the above items, consider using XML document types instead of IS document types.
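
The NCName#NamespaceURI naming format described in the first benefit can be illustrated with a minimal sketch. The code below only shows why a name built from the local name and the namespace URI is independent of whatever prefix a schema or instance document happens to use. The class XmlFieldNames, the method toFieldName, and the example namespace are hypothetical and are not part of any Integration Server API. How names without a namespace are formatted is not covered here.

```java
import javax.xml.namespace.QName;

// Hypothetical helper: illustrates the naming format only.
final class XmlFieldNames {

    /** Formats a qualified name as NCName#NamespaceURI, ignoring the prefix. */
    static String toFieldName(QName qname) {
        return qname.getLocalPart() + "#" + qname.getNamespaceURI();
    }

    public static void main(String[] args) {
        // Same element, declared with two different prefixes.
        QName fromSchema = new QName("http://example.com/orders", "order", "ord");
        QName fromInstance = new QName("http://example.com/orders", "order", "o");

        // Prints: order#http://example.com/orders
        System.out.println(toFieldName(fromSchema));
        // Prints: true (the prefix plays no part in the name)
        System.out.println(toFieldName(fromSchema).equals(toFieldName(fromInstance)));
    }
}
```

A prefix-based name such as ord:order or o:order would differ between the two documents, which is the conflict that the prefix-free format avoids.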

Note: XML document types and instance documents based on XML document types are intended to implement XML Schema and XML as closely as possible. Behavior that is inconsistent with XML Schema and XML will be treated as known issues that need resolution. Implementations should not exploit behavior that is inconsistent with XML and XML schema as it may have unpredictable results.

Differences Between XML Document Types and IS Document Types

In addition to improved namespace handling and support for XML schema constructs such as nested model groups, repeating model groups, and any attribute, XML document types and IS document types have the following differences:

  • XML document types can be created from XML schema definitions only.
  • Integration Server uses the names in an XML Schema definition to name the generated IS assets, including the XML document types. This is unlike IS document types where you can specify the name of the first IS document type generated from an XML schema definition but Integration Server creates the names of any subsequent IS document types.
  • XML document types do not use prefixes from the XML schema definition in the names of XML document types, XML fields, or fields within XML document types. Instead, IS uses the following format for XML document type names, XML field names, and names of fields within XML document types: NCName#NamespaceURI. IS document types use prefixes from the XML schema definition in field (variable) names.
  • In XML document types, attribute names do not include the @ symbol to indicate that the field is an attribute.
  • XML document types do not contain *body, *doctype, or *any fields.
  • XML document types provide improved handling of identically named fields at the same level. In XML document types, Integration Server maintains a particle ID for each field. To view the particle ID, hover the cursor over the name of the field. Designer displays properties for the field including the following for the name: {ID}NCName#NamespaceURI where ID is a number representing the occurrence of the field in the document type or the pipeline. For example, {2}myLocalName#myNamespaceName indicates the second occurrence of a field named myLocalName#myNamespaceName.

    In IS document types, all occurrences of identically named fields at the same level are collected into a single array. This approach may not preserve order during runtime.

    Note: Integration Server creates arrays for XML document types when an individual element has a maxOccurs greater than 1. If there are two fields named myLocalName#myNamespaceName and each has a maxOccurs greater than 1, Integration Server creates {1}myLocalName#myNamespaceName as an array and {2}myLocalName#myNamespaceName as an array.
  • The contents of XML document types and XML fields cannot be edited.
  • A Document Reference or Document Reference List variable contained in an IS document type or in a service signature can reference an XML document type that corresponds to a complex type definition or a root XML document type only.
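
The {ID}NCName#NamespaceURI labels described above for identically named fields can be pictured with a short sketch. It assumes, based only on the example given for {2}myLocalName#myNamespaceName, that the ID counts occurrences of the same name at one level, in document order. Integration Server assigns particle IDs internally, so this is an illustration of the displayed format rather than the actual algorithm. The class ParticleLabels and the field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: mimics the displayed label format only.
final class ParticleLabels {

    /** Labels each field as {ID}name, where ID is the running count for that name. */
    static List<String> label(List<String> fieldNamesInOrder) {
        Map<String, Integer> occurrences = new HashMap<>();
        List<String> labels = new ArrayList<>();
        for (String name : fieldNamesInOrder) {
            // merge returns the updated count: 1 for the first occurrence, 2 for the second, ...
            int id = occurrences.merge(name, 1, Integer::sum);
            labels.add("{" + id + "}" + name);
        }
        return labels;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList(
            "myLocalName#myNamespaceName",
            "other#myNamespaceName",
            "myLocalName#myNamespaceName");
        // Prints: [{1}myLocalName#myNamespaceName, {1}other#myNamespaceName, {2}myLocalName#myNamespaceName]
        System.out.println(label(fields));
    }
}
```

Because each occurrence keeps its own label and position, the order of the fields is preserved, in contrast to collecting all same-named fields into a single array.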

Limitations of XML Document Type Usage

Although XML document types and IS document types can be used in nearly identical ways, there are some limitations in the usage of XML document types:

  • XML document types cannot be made publishable. That is, an XML document type cannot become a publishable document type.
  • XML document types cannot be used in web services, which includes the signatures of services used as operations, headers, faults, and the pub.soap.handler* services.
  • XML document types cannot be used as the top-level element in a service signature. That is, on the Input/Output tab, you cannot specify an XML document type for the Input field or Output field.
  • XML document types should not be created from an XML schema definition in an Event Type Store.
  • XML document types should not be created from e-forms.

Creating an XML Document Type

About this task

When you create an XML document type you specify the following:
  • The destination folder in which you want Designer to place the generated XML document types, XML fields, IS schemas, and folders.
  • The source XML schema definition.
  • Whether or not Integration Server uses the Xerces Java parser to validate the XML Schema definition before creating XML document types.

There are no additional options, making the process of creating an XML document type less complex than that of creating an IS document type.

When you create an XML document type, keep the following information in mind:

  • You can create only one set of XML document types per folder. If you used folderA as the destination for the XML document types and other assets created for mySchema.xsd, you cannot use folderA as the destination for the XML document types and other assets generated from another XML schema definition. However, you could use a subfolder in folderA as the destination for the XML document type and other assets created for another XML schema definition.
  • Do not use a folder created by Integration Server to store assets generated for an XML schema definition as the destination folder for new XML document types.
  • To create an XML document type from an XML Schema definition in CentraSite, Designer must be configured to connect to CentraSite.

To create an XML document type

Procedure

  1. In the Service Development perspective of Designer, click File > New > XML Document Type.
  2. In the Create a New XML Document Type dialog box, select the folder in which you want to save the XML document types, XML fields, IS schemas, and folders generated from the XML schema definition.
  3. Click Next.
  4. On the Select a Source Location panel, under Source location, do one of the following to specify the source XML schema definition for the XML document type:
    • To use an XML schema definition in CentraSite as the source, select CentraSite.
    • To use an XML schema definition that resides on the Internet as the source, select File/URL. Then, type the URL of the resource. (The URL you specify must begin with http: or https:.)
    • To use an XML Schema definition that resides on your local file system as the source, select File/URL. Then, type in the path and file name, or click the Browse button to navigate to and select the file.
  5. If you want Integration Server to use the Xerces Java parser to validate the XML Schema definition, select the Validate schema using Xerces check box.
    Note: Integration Server automatically uses an internal schema parser to validate the XML Schema definition. However, the Xerces Java parser provides stricter validation than the Integration Server internal schema parser. As a result, some schemas that the internal schema parser considers to be valid might be considered invalid by the Xerces Java parser.
  6. If you selected CentraSite as the source, click Next. Then, under Select a Schema, select the XML schema definition that you want to use as the source and click Finish.

    If Designer is not configured to connect to CentraSite, Designer displays the CentraSite > Connections preference page and prompts you to configure a connection to CentraSite.

  7. Click Finish.

Results

Notes:

  • Integration Server uses the internal schema parser to validate an XML schema definition. If you selected the Validate schema using Xerces check box, Integration Server also uses the Xerces Java parser to validate the XML Schema definition. With either parser, if the XML Schema does not conform syntactically to the schema for XML Schemas defined in XML Schema Part 1: Structures (which is located at http://www.w3.org/TR/xmlschema-1), Integration Server does not create an XML document type. Instead, Designer displays an error message that lists the number, title, location, and description of the validation errors within the XML Schema definition. If only warnings occur, Designer generates the XML document type and the other assets.
    Note: Integration Server uses Xerces Java parser version J-2.11.0. Limitations for this version are listed at http://xerces.apache.org/xerces2-j/xml-schema.html.
  • When validating XML schema definitions, Integration Server uses the Perl5 regular expression compiler instead of the XML regular expression syntax defined by the World Wide Web Consortium for the XML Schema standard. As a result, in XML schema definitions consumed by Integration Server, the pattern constraining facet must use valid Perl regular expression syntax. If the supplied pattern does not use proper Perl regular expression syntax, Integration Server considers the pattern to be invalid.
    Note: If the watt.core.datatype.usejavaregex configuration parameter is set to true, Integration Server uses the Java regular expression compiler instead of the Perl5 regular expression compiler. When the parameter is true, the pattern constraining facet in XML schema definitions must use valid syntax as defined by the Java regular expression.
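
Because the pattern constraining facet is compiled with a Perl5 or Java regular expression compiler rather than an XML Schema one, a pattern that is valid in XML Schema can still be rejected. The following minimal sketch checks whether a pattern compiles with java.util.regex, which is relevant only to the case where watt.core.datatype.usejavaregex is true. It is a standalone pre-check, not the validation that Integration Server performs, and it says nothing about Perl5 syntax. The class PatternFacetCheck and the sample patterns are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper: a standalone pre-check, not Integration Server's validator.
final class PatternFacetCheck {

    /** Returns true if the pattern compiles with the Java regular expression compiler. */
    static boolean compilesAsJavaRegex(String pattern) {
        try {
            Pattern.compile(pattern);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Ordinary character classes and quantifiers are accepted by Java.
        System.out.println(compilesAsJavaRegex("[A-Z]{2}\\d{4}")); // prints true

        // \i is the XML Schema escape for a name start character.
        // java.util.regex has no such escape and rejects it.
        System.out.println(compilesAsJavaRegex("\\i\\c*")); // prints false
    }
}
```

A pattern that compiles is not guaranteed to match the same strings as it would under XML Schema regular expression rules, so a successful check is a necessary condition rather than proof of equivalent behavior.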