An introduction to XML Schema 1.1
Co-occurrence constraints using XPath 2.0
This content is part of the series: XML Schema 1.1, Part 2
Complex and simple type definitions in XML Schema 1.0 allow schema authors to specify and restrict the content of elements and the values of attributes. According to the XML Schema 1.0 specification, complex type definitions constrain elements in two ways: by providing attribute declarations that govern the appearance and content of attributes, and by restricting elements to be empty or to conform to a specific content model, such as element-only, mixed, or simple content determined by a simple type definition.
Complex type definitions also take part in a type definition hierarchy, which determines how complex types can be derived from other complex types or simple types by extension or restriction. Substitution groups on complex types control the substitution of elements with elements of a derived type. Simple types, on the other hand, constrain the character values of the contents of elements and attributes.
In this article we discuss co-occurrence constraints, a new feature introduced in XML Schema 1.1 to constrain not only the content of elements and attributes but their existence as well.
A bit of history
As we mentioned in the first article of the series, XML Schema 1.0 has certain limitations. Beyond the constraints mentioned above, XML schema authors often needed to enforce more complex rules that determine and restrict the content of elements and attributes, such as the ability to restrict the appearance of certain child elements based on the value of an attribute, having the total sum of child elements not exceed a certain value, or allowing the value of a child element to be valid based on the scope in which it is found.
Unfortunately, XML Schema 1.0 did not provide a way to enforce these rules. To implement such constraints, you had to do one of the following:
- Write code at the application level (after XML schema validation)
- Use stylesheet checking (also a post-validation process)
- Use a different XML schema language such as RELAX NG or Schematron
With the constant requests for co-occurrence constraint checking support from the XML Schema 1.0 user community, the XML Schema 1.1 working group introduced the concept of assertions and type alternatives in XML Schema 1.1 to allow XML schema authors to express such constraints.
Assertions provide XML schema authors with a flexible way to control the occurrence and values of elements and attributes.
Before you delve into how assertions are defined in XML Schema 1.1, first take a look at some usage scenarios.
- Specify a constraining rule based on the values of two or more attributes. Given the XML fragment in Listing 1, you can specify a rule between the attributes width and height so that the height is never greater than the width.
Listing 1. XML fragment - element with two attributes
<dimension width="10" height="5"/>
- Specify a constraining rule between attributes and elements. In Listing 2, we have an element that has an attribute, children, and two child elements. You can specify a rule between the attribute and the child elements such that the value of the attribute equals the number of child elements.
Listing 2. XML fragment - element with one attribute and two child elements
<parent children="2">
  <child name="one"/>
  <child name="two"/>
</parent>
- Specify a constraining rule that determines the order and choice between attributes. For the element defined in Listing 3, you can specify a rule so that timer has either a time attribute or an iterations attribute but not both.
Listing 3. XML fragment - timer element
<timer time="30" iterations="2000"/>
- Specify a grouping of elements and attributes into a model group. For example, you can restrict the content of element parent (defined in Listing 4) by specifying a rule that forces the content to be either child or grandchild, with both elements having the attributes name and dob.
Listing 4. XML fragment - A parent element
<parent>
  <child name="abc" dob="1/1/1997"/>
  <grandchild name="xyz" dob="1/1/2007"/>
</parent>
- Specify a constraining rule on the text in an element with mixed content. Listing 5 shows an element, parent, that has mixed content. You can specify a rule that limits the mixed content text to a maximum of 10 characters.
Listing 5. XML fragment - A parent element with mixed content
<parent>2 children <child>abc</child> <child>xyz</child> </parent>
To address these and other usage scenarios, XML Schema 1.1 provides more expressive constraints through XML Schema 1.1 assertions. Assertions in XML Schema 1.1 are similar to those available in other schema languages such as Schematron and RELAX NG.
At the time of writing this article, you can specify assertions on simple and complex types. The predicate is specified using an XPath 2.0 expression which is part of the assertion specified on the type.
Assertions on complex types
In XML Schema 1.1, complex type definitions can contain an assertions schema component, which is a sequence of <xs:assert> child elements of the complex type definition. The order of this sequence is insignificant. Assertions constrain the existence of elements and attributes and their values. The <xs:assert> schema component contains a test property, which is an XPath expression property record, and an annotations property.
The value of the test attribute of the xs:assert element information item is an XPath expression that evaluates to either true or false. You can use a special variable, $value, to refer to the simple content value of the element or attribute being checked. Evaluation is done in the context of the element being validated, that is, the parent of the attributes and child elements that the expression refers to. The XPath expression must be a valid XPath 2.0 expression or at least conform to the minimal XPath subset defined in the XML Schema 1.1 specification.
If the XPath expression specified is invalid, an xpath-valid error is reported. If xs:assert is incorrectly specified, the schema processor reports an as-props-correct error. If the evaluation of the test expression is true and does not result in a dynamic or type error, the element is considered locally valid. If it evaluates to false, a generic cvc-assertion error is reported.
Listing 6 shows an example of a complex type with an assertion that constrains the values of two attributes. The assertion expression evaluates to true if the value of height is less than the value of width; otherwise it evaluates to false.
Listing 6. Assertion on complex type - @height < @width
<xs:element name="dimension">
  <xs:complexType>
    <xs:attribute name="height" type="xs:int"/>
    <xs:attribute name="width" type="xs:int"/>
    <!-- "<" must be escaped as &lt; inside an attribute value -->
    <xs:assert test="@height &lt; @width"/>
  </xs:complexType>
</xs:element>
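The same pattern can cover the scenarios from Listing 2 and Listing 3. The following is a minimal sketch, not taken from the original schema: it assumes a schema without a target namespace (so child can be named unprefixed in the XPath expression), it assumes the default function namespace is the XPath functions namespace (so count and not need no prefix), and the type choices for the attributes are illustrative.

```xml
<!-- Scenario from Listing 2: the "children" attribute must equal
     the number of child elements -->
<xs:element name="parent">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="child" minOccurs="0" maxOccurs="unbounded">
        <xs:complexType>
          <xs:attribute name="name" type="xs:string"/>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
    <xs:attribute name="children" type="xs:int"/>
    <xs:assert test="@children = count(child)"/>
  </xs:complexType>
</xs:element>

<!-- Scenario from Listing 3: exactly one of "time" and "iterations" -->
<xs:element name="timer">
  <xs:complexType>
    <xs:attribute name="time" type="xs:int"/>
    <xs:attribute name="iterations" type="xs:int"/>
    <xs:assert test="(@time or @iterations) and not(@time and @iterations)"/>
  </xs:complexType>
</xs:element>
```

Under these assumptions, the fragment in Listing 2 satisfies the first assertion, while the fragment in Listing 3, which carries both attributes, fails the second.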
In the example above, we defined an xs:assert element information item as a direct child of xs:complexType. We can also specify xs:assert as a child of xs:restriction or xs:extension when defining a complex type with complex content (xs:complexContent) or simple content (xs:simpleContent). For an element to be valid, each assertion in its sequence of assertions needs to evaluate to true. This sequence comprises all the assertions defined on the complex type as well as all assertions of the complex type's base type.
In Listing 7, we have two complex types, baseType and derivedType, each with its own assertion. The assertion on baseType checks if the attribute mustUnderstand is present on the element. The assertion on derivedType checks if the mustUnderstand attribute has a value of YES and at least one body element is present; otherwise it expects mustUnderstand to have a value of NO. derivedType has a sequence of two assertions, the one from baseType and its own. For the element message to be valid, its content must be valid as defined by its complexType definition and all assertions must evaluate to true.
Listing 7. Assertion on complex type with complex content
<xs:complexType name="baseType">
  <xs:sequence>
    <xs:element name="body" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attribute name="mustUnderstand" type="xs:string"/>
  <xs:assert test="@mustUnderstand"/>
</xs:complexType>

<xs:complexType name="derivedType">
  <xs:complexContent>
    <xs:restriction base="baseType">
      <xs:sequence>
        <xs:element name="body" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="mustUnderstand" type="xs:string"/>
      <xs:assert test="( @mustUnderstand eq 'YES' and fn:count(./body) > 0 ) or ( @mustUnderstand eq 'NO' )"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

<xs:element name="message" type="derivedType"/>
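To make the combined effect of the two assertions in Listing 7 concrete, here is a small sketch of message instances and the outcome each one should produce. It assumes that the schema has no target namespace, that the fn prefix is bound to the XPath functions namespace in the schema document, and that the body text is made up for illustration. Each element is a separate fragment, not one document.

```xml
<!-- Valid: mustUnderstand is present (baseType assertion) and is YES
     with at least one body element (derivedType assertion) -->
<message mustUnderstand="YES">
  <body>Hello</body>
</message>

<!-- Valid: mustUnderstand is NO, so no body element is required -->
<message mustUnderstand="NO"/>

<!-- Invalid: mustUnderstand is YES but there is no body element,
     so the derivedType assertion evaluates to false -->
<message mustUnderstand="YES"/>

<!-- Invalid: the attribute is missing, so the baseType assertion
     evaluates to false -->
<message/>
```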
When defining a complex type with simple content, you can specify two types of
assertions. The first one acts as a facet and restricts the simple content type
(for example, restricting the simple value to be a multiple of 7), while the second one
appears as an assertion on the element as a whole, including its attributes. Since
the syntax of the content model of
xs:simpleContent/xs:restriction does not
distinguish between the two types of assertions, a new element information item,
xs:assertion, was introduced to indicate an assertion
facet. We will cover
xs:assertion in the next section when we discuss assertions
on simple type definitions.
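The two kinds of assertion can appear side by side in one restriction. The following is a minimal sketch under stated assumptions: the type names quantityBase and weeklyQuantity and the unit attribute are hypothetical and exist only for illustration. The xs:assertion facet restricts the simple value to multiples of 7, and the xs:assert checks the element as a whole by requiring the attribute to be present.

```xml
<!-- Hypothetical base type: an integer value with an optional attribute -->
<xs:complexType name="quantityBase">
  <xs:simpleContent>
    <xs:extension base="xs:int">
      <xs:attribute name="unit" type="xs:string"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<xs:complexType name="weeklyQuantity">
  <xs:simpleContent>
    <xs:restriction base="quantityBase">
      <!-- Facet: constrains the simple content value only -->
      <xs:assertion test="$value mod 7 = 0"/>
      <!-- Assertion on the element as a whole, including its attributes -->
      <xs:assert test="@unit"/>
    </xs:restriction>
  </xs:simpleContent>
</xs:complexType>
```

In this sketch the facet comes first and the xs:assert comes last, after any attribute declarations, which is the order the content model of xs:restriction expects.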
Assertions on simple types
In XML Schema 1.1, simple types, like complex types, can carry assertions. They are specified as xs:assertion elements among the children of an xs:restriction element. Assertions in simple types are similar to other simple type constraining facets. The assertions simple type component represents a set of constraining facets that restrict the value space of a simple type by requiring values to satisfy conditions specified by the XPath expression in the value of the test attribute.
As with complex type definitions, the assertions are an ordered sequence of xs:assertion elements specified as facets in the simple type definition. The specified order of the sequence of assertions is insignificant, as all assertions in this sequence need to evaluate to true for an element or attribute of this type to be valid. The assertions schema component contains a value property, which is a sequence of assertions from the base type, if any, and assertions defined in the derived type.
The value of the test attribute of the xs:assertion facet is an XPath 2.0 expression or XPath 2.0 subset as defined by the XML Schema 1.1
specification that evaluates to either true or false. Evaluation is done in the context
of the parent element.
An element or attribute with simple content is valid if it is valid with respect to all assertion facets (that is, the test property of each xs:assertion evaluates to true, without any dynamic or type errors).
In Listing 8, we show an example of an element with simple content that has an assertion facet that evaluates to true if the element's value is a multiple of 10.
Listing 8. An element with simple content that allows values that are multiples of 10
<xs:element name="message">
  <xs:simpleType>
    <xs:restriction base="xs:int">
      <xs:assertion test="($value mod 10) = 0"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
A value is valid with respect to a derived simple type that restricts another simple type, provided that it satisfies the derived type (and its restricting facets) and all assertions belonging to both the base and the derived type. In Listing 9, a string value is valid only if it is more than 3 and at most 25 characters long and ends with the string "xyz".
Listing 9. Assertions on derived simple type definitions
<xs:simpleType name="base">
  <xs:restriction base="xs:string">
    <xs:maxLength value="25"/>
    <xs:assertion test="fn:ends-with($value, 'xyz')"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="derived">
  <xs:restriction base="base">
    <xs:assertion test="fn:string-length($value) > 3 "/>
  </xs:restriction>
</xs:simpleType>
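As a quick illustration of how the base and derived assertions combine, consider a hypothetical element declared as <xs:element name="code" type="derived"/>. The element name is an assumption made for this sketch and does not come from the listing above. Each element below is a separate fragment.

```xml
<!-- Valid: 6 characters long (more than 3, at most 25) and ends with "xyz" -->
<code>abcxyz</code>

<!-- Invalid: ends with "xyz" but is only 3 characters long,
     so the assertion on the derived type evaluates to false -->
<code>xyz</code>

<!-- Invalid: long enough, but does not end with "xyz",
     so the assertion inherited from the base type evaluates to false -->
<code>abcdef</code>
```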
Error message customization
As demonstrated in the previous sections, XML Schema 1.1 assertions can use any XPath 2.0 expressions, and these expressions can be very complex. When the assertions fail, it becomes very important to provide error messages that are easy to understand.
Schema error codes
When a schema constraint is violated, the schema specification requires that the
corresponding error code be reported. For example, when you see the error code
cvc-attribute.3, you know clause 3 of the constraint
Attribute Locally Valid is violated, indicating that the value of an attribute is
not valid with respect to its type.
With a little more information about the context (for example, the element or attribute name,
line and column numbers, or values involved), this error code approach is often
sufficient for problem diagnosis. Applying this to assertions, the error code
cvc-assertion will be reported when an assertion is not
satisfied. Even with all the context information, you still do not know what really went
wrong and how to fix it, unless you look at the schema and try to understand the
(potentially very complex) XPath expressions.
Listing 10. A Schematron rule
<sch:report test="@min > @max">
  On element "<sch:value-of select="local-name(.)"/>", value of the "min" attribute "<sch:value-of select="@min"/>" can not be greater than that of the "max" attribute "<sch:value-of select="@max"/>".
</sch:report>
The following XML fragment (Listing 11) violates this rule.
Listing 11. A fragment that violates a Schematron rule
<range min="30" max="10"/>
This fragment will produce a message:
On element "range", value of the "min" attribute "30" can
not be greater than that of the "max" attribute "10".
This approach has two significant benefits:
- Human readable error messages can be associated with validation rules, making it easy to diagnose validation failures.
- The error message can also use XPaths to refer to values in the instance
being validated to provide more information about what is causing the violation.
In the above example,
range, 30, and 10 are all information that can vary from instance to instance.
Validation rules can be deployed in systems with different locales, and users will expect to see error messages in different human languages. To make it possible to use a localized message, Schematron suggests using the diagnostics attribute in association with the sch:diagnostic element, as in Listing 12.
Listing 12. Example of localized message in Schematron
<sch:pattern>
  <sch:rule context="person">
    <sch:assert test="name" diagnostics="d1 d2">
      A person must have a name.
    </sch:assert>
  </sch:rule>
</sch:pattern>
<sch:diagnostics>
  <sch:diagnostic id="d1" xml:lang="en">
    A person must have a name.
  </sch:diagnostic>
  <sch:diagnostic id="d2" xml:lang="fr">
    Une personne doit avoir un nom.
  </sch:diagnostic>
</sch:diagnostics>
Schematron implementations can now select the right diagnostic based on the language expected.
The Schematron approach is still not perfect for the localization issue. When new languages are supported, the Schematron rule has to be updated, both to add the new diagnostic entry and to add the new diagnostic ID to the diagnostics attribute.
The Java™ programming language handles this by using property bundles. When a new language is added, a new property bundle is introduced, and as long as it follows a certain naming convention, it can be discovered automatically, without the need to change the places where the messages are used.
The Service Modeling Language (SML) uses Schematron as one of its validation mechanisms. It introduces the "location ID" concept (Listing 13) to allow resource management strategies like the one used by a Java environment.
Listing 13. SML with a location ID concept
<sch:assert test="name" sml:locid="person:nameRequired">
  A person must have a name.
</sch:assert>
The locid attribute is of type xs:QName. Its namespace name can be used to locate the bundle (which might contain, for example, all error messages related to a person), and the local part to identify the error message to show for the corresponding rule. In
Listing 14 and Listing 15, we show
some examples of message properties in English and French.
Listing 14. A fragment of a message property in English
nameRequired = A person must have a name.
Listing 15. A fragment of a message property in French
nameRequired = Une personne doit avoir un nom.
Error message customization for assertions
XML Schema 1.1 does not prescribe any way to customize error messages for assertions, but it allows application-specific information to be embedded in annotations. For example, Listing 16 shows how to include a customized error message in the "appinfo" element inside an annotation and use "documentation" to provide additional information about the message. Users would benefit if XML Schema 1.1 processors adopted a common practice for using annotations to customize assertion errors. Such a practice might also include mechanisms to enable localization of error messages.
Listing 16. Customize error messages using annotations
<xs:complexType name="rangeType">
  <xs:attribute name="min" type="xs:int"/>
  <xs:attribute name="max" type="xs:int"/>
  <xs:assert test="@min &lt;= @max">
    <xs:annotation>
      <xs:appinfo>
        Value of the "min" attribute can not be greater than that of the "max" attribute.
      </xs:appinfo>
      <xs:documentation>
        When this assertion fails, the content of the above "appinfo" is used to produce the error message.
      </xs:documentation>
    </xs:annotation>
  </xs:assert>
</xs:complexType>
XML Schema 1.1 introduces a new mechanism called type alternatives that allows the schema author to specify type substitutions on an element declaration.
A look at conditional type assignment
In XML Schema 1.0,
xsi:type was introduced as a mechanism
for type substitution.
xsi:type is specified on an element in the instance document to replace the declared type with a derived one. This
mechanism works well if you design an XML vocabulary specifically for use
with XML Schema and you require that instances of your vocabulary use
xsi:type for type substitution. If, however, you write
an XML schema for a vocabulary which already has its own notion of type substitution,
xsi:type will not work. Instances of this vocabulary
select types using some other mechanism. One such example is the Atom Syndication
Format, an XML language used for Web feeds.
Atom allows instances to specify a type attribute on elements containing text constructs. If present, the value of this attribute must be one of text, html, or xhtml. The content allowed is determined by the value of this attribute. Because this attribute is not xsi:type, it is impossible to write a schema which models Atom using the XML Schema 1.0 language. If the condition for selecting the type is more complex, for example @height < @width (a comparison between two attribute values), you cannot simply substitute it in the instance document.
To address the shortcomings of
xsi:type, you can use the
type alternative mechanism. This allows the schema author to specify type substitutions on an element declaration which are selected based on the evaluation of XPath expressions. In the next section we will show how this works
using Atom as an example.
How type alternatives work
In XML Schema 1.1, element declarations can have a type table which contains a
sequence of type alternative components and a default type definition (which is also
a type alternative). In an XML schema document these are specified as a sequence of
xs:alternative child elements of the element declaration.
The type alternative schema component contains a test property which is an XPath expression property record, a type definition, and an annotations property. The value of the test attribute on xs:alternative corresponds to the test property, an XPath expression which evaluates to true or false. The
expressions allowed are limited to a subset of XPath 2.0, specifically those which
only select the attribute axis. This means that only attributes on the current element
are accessible by XPath evaluation. It is worth noting that the XDM data model which is
constructed for the evaluation does not include any type information. This was done to
avoid a situation where a schema processor would need to somehow guess the types of the
attributes in order to determine the type of the element. One cannot know the actual
types of the attributes until the element's type is determined.
The last xs:alternative child of the element declaration is allowed to omit the test attribute. If present, this type alternative is the default type definition. If no such xs:alternative is specified, the default type definition is the one which was declared for the element.
The value of the type attribute on xs:alternative corresponds to the type definition property of the type alternative schema component. If the XPath expression on the test attribute has evaluated to true, then the specified type is selected as a substitution for the one declared on the element. The type specified must be derived from the declared type or be a special simple type definition called xs:error (which has no valid instances). The xs:error type can be used to cause elements to be invalid if they satisfy the condition for the type alternative.
If an element declaration has type alternatives, they are evaluated in the order that
they were specified in the schema. The first type alternative whose XPath expression
evaluates to true is the type that is selected. If none of the XPath expressions
evaluate to true then the default type definition is selected as the
type for the element.
Now that we have described how the type alternatives mechanism works, look
at an example (Listing 17) of how you can use it to write
a schema for Atom. As mentioned in the previous section, elements containing textual
constructs may have a type attribute which specifies the allowed content for the element. The snippet below shows how to write a declaration for a title element in Atom.
Listing 17. Type alternative xsd example
<xs:element name="title" type="xs:anyType">
  <xs:alternative test="@type='text'" type="xs:string"/>
  <xs:alternative test="@type='html'" type="htmlContentType"/>
  <xs:alternative test="@type='xhtml'" type="xhtmlContentType"/>
  <xs:alternative test="@type" type="xs:error"/>
  <xs:alternative type="xs:string"/>
</xs:element>
The element declaration for title has a base type of xs:anyType and specifies five type alternatives. The type alternatives are evaluated in order until one of the XPath expressions evaluates to true (or, if none evaluate to true, the default type definition is chosen). The first three type alternatives select a type based on the value of the type attribute being text, html, or xhtml. If the type attribute has none of these values, the XPath expressions for the first three type alternatives will evaluate to false. The fourth type alternative checks whether the type attribute exists. If the schema processor has reached this point, the value of type (if this attribute exists) is not one which Atom allows. We assign the type for this alternative to be xs:error to signal that if this condition is satisfied, the element is invalid. If none of the XPath expressions evaluate to true, the default type definition (xs:string) is selected.
The instances of the
title element, Listing 18, illustrate
how the different type alternatives are selected.
Listing 18. Type alternative xml instance example
<!-- 1st type alternative selected: xs:string -->
<title type="text">My News</title>

<!-- 3rd type alternative selected: xhtmlContentType -->
<title type="xhtml" xmlns:xhtml="http://www.w3.org/1999/xhtml">My <xhtml:em>News</xhtml:em>!</title>

<!-- default type alternative selected: xs:string -->
<title>My News</title>

<!-- 4th type alternative selected: xs:error. Invalid. -->
<title type="unknown">Oops! Error.</title>
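Type alternatives combined with xs:error can also express some attribute co-occurrence rules. The following sketch revisits the timer element from Listing 3 and is an illustration only: the type name timerType is hypothetical, and the tests use nothing beyond attribute references and negation, in keeping with the attribute-only restriction described earlier.

```xml
<xs:complexType name="timerType">
  <xs:attribute name="time" type="xs:int"/>
  <xs:attribute name="iterations" type="xs:int"/>
</xs:complexType>

<xs:element name="timer" type="timerType">
  <!-- Both attributes present: assign xs:error, so the element is invalid -->
  <xs:alternative test="@time and @iterations" type="xs:error"/>
  <!-- Neither attribute present: also invalid -->
  <xs:alternative test="not(@time) and not(@iterations)" type="xs:error"/>
  <!-- No default alternative is given, so any other timer element
       keeps the declared type, timerType -->
</xs:element>
```

With this declaration, a timer that has exactly one of the two attributes is validated against timerType, while the fragment in Listing 3, which has both, selects the first alternative and is rejected.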
In this article, we gave an overview of co-occurrence constraint support in XML Schema 1.1, highlighting the addition of assertions and type alternatives to further restrict the existence and values of elements and attributes. In Part 3 of the series, we will explore wildcard support and how it allows you to evolve your XML schema.
- XML Schema 1.1, Part 1: An introduction to XML Schema 1.1: An overview of the key improvements over XML Schema 1.0 and an in-depth look at datatypes (Neil Delima, Sandy Gao, Michael Glavassevich, Khaled Noaman; developerWorks; December 2008): Start your exploration with an overview of the key improvements over XML Schema 1.0 and an in-depth look at datatypes.
- XML Schema 1.1, Part 3: An introduction to XML Schema 1.1: Evolve your schema with powerful wildcard support (Neil Delima, Sandy Gao, Michael Glavassevich, Khaled Noaman; developerWorks; November 2009): Take an in-depth look at versioning features introduced by XML Schema 1.1, specifically the new powerful wildcard mechanisms and open content.
- XML 1.0 specification: Read about XML and how it enables generic SGML to be served, received, and processed on the Web.
- XML Schema Part 1: Structures Second Edition: Learn more about the W3C XML Schema language and how it describes the structure and constrains the contents of XML 1.0 documents, including those which exploit the XML Namespace facility. This specification depends on XML Schema Part 2: Datatypes.
- XML Schema Part 2: Datatypes Second Edition: Find information on the datatypes used in the W3C XML Schema language.
- W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures: Check out the latest specification of the W3C XML Schema language.
- W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes: Find more information on the new datatypes added to the W3C XML Schema language.
- XQuery 1.0: Learn more about XML Query language, which uses the structure of XML to express queries across all kinds of data.
- XML Path Language 2.0: Learn more about the XPath language.
- The Service Modeling Language (SML): Learn more about how to model complex systems and services.
- XSL Transformations (XSLT) Version 2.0: Review this specification that defines the syntax and semantics of the XSLT 2.0 language.
- XQuery 1.0 and XPath 2.0 Data Model (XDM): Read about this W3C specification which is the data model of XPath 2.0, XSLT 2.0, and XQuery languages.
- Atom Syndication Format: Find more about an XML-based Web content and metadata syndication format.
- Schematron: Check out this language for making assertions about the presence or absence of patterns in XML documents.
- RELAX NG: Explore a schema language for XML.
- The XML Parser for Java (Xerces2-J): Try this parser distributed by Apache.