Representations of null in XML Schema

This content is part of the series: Web services tip

A Java bean has properties, or fields. Unless they are primitive types, these fields can be null. When you map a Java bean to XML, these fields become elements or attributes. Unadorned elements and attributes cannot have null values (you can think of them as the equivalent of Java primitive types, which cannot be null either). There are a number of ways to adorn XML attributes and elements so that instances of them can be null, or at least logically equivalent to null.

  • For attributes:
    • use the attribute use="optional"
  • For elements:
    • use the attribute nillable="true"
    • or use the attribute minOccurs="0"

This tip describes the details and ramifications of each of these choices.

Element versus attribute

Before looking into the null representations, first consider why you might use an element rather than an attribute for an object field. The first consideration is the type of the field. Attributes can only hold simple types, so if you have a complex type, you have no choice; you must use an element. If you have a simple type, however, which should you use: an attribute or an element? For example, given the structure in Listing 1, which is better, attrField or elemField?

Listing 1. AttributeOrElement schema
<complexType name="AttributeOrElement">
  <sequence>
    <element name="elemField" type="xsd:int"/>
  </sequence>
  <attribute name="attrField" type="xsd:int"/>
</complexType>

Take a look at Listing 2 for an instance of the schema from Listing 1.

Listing 2. Instance of AttributeOrElement
<attributeOrElement attrField="5">
  <elemField>5</elemField>
</attributeOrElement>

The attribute field clearly takes up less space in the instance than the element field, so SOAP messages whose XML uses attributes would be smaller and, therefore, take less time to transport. So it seems reasonable to assume that attributes are better than elements. But if you've dealt with schemas for a while, you may have noticed that attributes are not used very often. Why is that? I actually couldn't tell you, though I have some ideas:

  • Since you must use elements for complex types, for consistency, you should use elements for simple types as well. You end up with simpler looking schemas.
  • The document/literal wrapped pattern dictates that the wrapper element's complexType does not contain attributes (see the "Which style of WSDL should I use?" article in Related topics). Since this list of parameters within a wrapper element can only be elements, for consistency all complexType lists should be elements.
  • Since there is no ordering of attributes, and there is ordering of elements (unless you use the <all> tag), there is the potential that parsing of attributes might be somewhat more costly than the parsing of elements. (However, I suspect that this is not really an issue with sophisticated XML parsers.)

So for simplicity I recommend sticking with elements unless throughput performance is a concern, at which point you can test whether or not attributes improve your performance.

Now let's discuss null values.

Null attribute

As I've already mentioned, you can make an attribute logically nullable by making it optional. See Listing 3 for a schema with a nullable attribute and Listing 4 for instances -- one with a value in the field and one without.

Listing 3. TypeWithNullAttribute schema
<complexType name="TypeWithNullAttribute">
  <attribute name="attrField" type="xsd:int" use="optional"/>
</complexType>

Listing 4. Instances of TypeWithNullAttribute
Attribute with a value:

<typeWithNullAttribute attrField="5"/>

Attribute passed as null:

<typeWithNullAttribute/>

You can see that the schema declaration of a null attribute is very straightforward. The instance of a null attribute is also very straightforward -- it simply isn't there.
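
On the receiving side, an absent optional attribute maps naturally back to a null field. The following is a minimal sketch of that idea in Python, using the standard library's xml.etree.ElementTree rather than any particular web services toolkit. The helper name read_attr_field is hypothetical and is not part of the original listings.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_attr_field(xml_text: str) -> Optional[int]:
    """Return attrField as an int, or None when the optional attribute is absent."""
    root = ET.fromstring(xml_text)
    raw = root.get("attrField")  # Element.get returns None for a missing attribute
    return None if raw is None else int(raw)


# The two instances from Listing 4.
with_value = read_attr_field('<typeWithNullAttribute attrField="5"/>')
without_value = read_attr_field("<typeWithNullAttribute/>")
print(with_value, without_value)
```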

Null elements

Since attributes are rarely used, and elements are much more prevalent, let's move on to null elements. There are two ways to represent a null value with elements: with either the attribute nillable="true", or with minOccurs="0". Listing 5 shows a schema for TypeWithNullElements, which contains one element for each of the styles of nullable fields.

Listing 5. Schema for TypeWithNullElements
<complexType name="TypeWithNullElements">
  <sequence>
    <element name="nillableElem" nillable="true" type="xsd:int"/>
    <element name="minOccursElem" minOccurs="0" type="xsd:int"/>
  </sequence>
</complexType>

Listing 6 shows instances of TypeWithNullElements, first with normal values, then with null values.

Listing 6. Instances of TypeWithNullElements
Elements with values:

<typeWithNullElements>
  <nillableElem>5</nillableElem>
  <minOccursElem>5</minOccursElem>
</typeWithNullElements>

Elements with null values:

<typeWithNullElements xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <nillableElem xsi:nil="true"/>
</typeWithNullElements>

Like an optional attribute, an element declared with minOccurs="0" whose value is null simply does not appear in the XML instance. This clearly costs less in message size than an element declared with nillable="true": nillableElem, even when its value is null, still appears in the instance as a placeholder, marked with xsi:nil="true" to indicate that it is, indeed, null.
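
A consumer of such a message has to treat both forms as null. The sketch below shows one way to do that in Python with xml.etree.ElementTree, under the assumption that the elements are unqualified as in Listing 6. The helper read_int_element is a hypothetical name; the xsi namespace URI is the standard XML Schema instance namespace.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# ElementTree spells a namespaced attribute as "{namespace-uri}local-name".
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"


def read_int_element(parent: ET.Element, name: str) -> Optional[int]:
    """Return the child's int value, or None if it is absent or marked xsi:nil."""
    child = parent.find(name)
    if child is None:
        # minOccurs="0" style: the element was simply omitted.
        return None
    if child.get(XSI_NIL) in ("true", "1"):
        # nillable="true" style: a placeholder element flagged as nil.
        return None
    return int(child.text)


doc = ET.fromstring(
    '<typeWithNullElements xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<nillableElem xsi:nil="true"/>'
    "</typeWithNullElements>"
)
nillable_value = read_int_element(doc, "nillableElem")
min_occurs_value = read_int_element(doc, "minOccursElem")
print(nillable_value, min_occurs_value)
```

Both calls yield None, even though only nillableElem is physically present in the instance.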

When nillable="true" is useful

If minOccursElem is so obviously better than nillableElem, why would you ever want to use nillableElem? I already gave you a hint: a null value for nillableElem has a placeholder for the value. Where might you need a placeholder? One example is an array in which each entry could potentially be null. Imagine an array of four elements, for example, whose values are {0, null, 1, null}. How would you represent that array using instances of a minOccursElem element? Answer: you cannot. There would be no way to distinguish between the four-element array described above and a two-element array whose values are {0, 1}. With minOccurs="0" elements, there are no placeholders for the null elements, so you must use nillable="true" elements for this scenario. Listing 7 shows a schema for such an array and Listing 8 shows the XML instance for {0, null, 1, null}.

(For a subtle twist on arrays and null values, see the "Array Gotcha" article in Related topics.)

Listing 7. Schema for nillable array elements
<complexType name="nullableElementArray">
  <sequence>
    <element name="elem" type="xsd:int" maxOccurs="4" nillable="true"/>
  </sequence>
</complexType>

Listing 8. XML instance of nillable array elements
<nullableElementArray xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <elem>0</elem>
  <elem xsi:nil="true"/>
  <elem>1</elem>
  <elem xsi:nil="true"/>
</nullableElementArray>
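
To make the placeholder argument concrete, here is a small Python sketch (again using xml.etree.ElementTree, and assuming the instance declares the xsi prefix on its root element) that reads the array for {0, null, 1, null} back into a list. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

instance = """
<nullableElementArray xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <elem>0</elem>
  <elem xsi:nil="true"/>
  <elem>1</elem>
  <elem xsi:nil="true"/>
</nullableElementArray>
"""

root = ET.fromstring(instance)

# Each <elem> keeps its slot, so nil entries come back as None in position.
values = [
    None if e.get(XSI_NIL) in ("true", "1") else int(e.text)
    for e in root.findall("elem")
]

# Dropping the nulls, as omitting the elements would, loses the positions.
without_placeholders = [v for v in values if v is not None]

print(values)
print(without_placeholders)
```

The second list is indistinguishable from a genuine two-element array, which is exactly the information that omitted elements cannot carry.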

Summary

There are three ways of representing null fields in XML Schema: optional attributes, minOccurs="0" elements, and nillable="true" elements. There are cases when you would use each one of these: an optional attribute if you have a simple, nullable type; a minOccurs="0" element if you have a complex nullable type, or a simple one that you prefer to map to an element, and you want a null to take up the least amount of space; a nillable="true" element if a null value must have a placeholder (for instance, when it appears in an array).
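
As a rough illustration of the three choices from the sending side, the following Python sketch serializes one nullable integer field in each style. Everything here is an assumption made for the example: the element name bean, the field name field, the style labels, and the function to_xml are hypothetical, and a real toolkit would drive this from the schema rather than from a string argument.

```python
import xml.etree.ElementTree as ET
from typing import Optional

XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)  # serialize the namespace with the usual prefix


def to_xml(value: Optional[int], style: str) -> str:
    """Serialize one nullable int field as "attribute", "minOccurs", or "nillable"."""
    root = ET.Element("bean")
    if style == "attribute":
        # Optional attribute: null means the attribute is left out.
        if value is not None:
            root.set("field", str(value))
    elif style == "minOccurs":
        # minOccurs="0" element: null means the element is left out.
        if value is not None:
            ET.SubElement(root, "field").text = str(value)
    elif style == "nillable":
        # nillable="true" element: null keeps a placeholder flagged xsi:nil.
        child = ET.SubElement(root, "field")
        if value is None:
            child.set(f"{{{XSI}}}nil", "true")
        else:
            child.text = str(value)
    else:
        raise ValueError(f"unknown style: {style}")
    return ET.tostring(root, encoding="unicode")


for style in ("attribute", "minOccurs", "nillable"):
    print(style, to_xml(5, style), to_xml(None, style))
```

Only the nillable style produces a child element for a null value; the other two styles produce an empty bean element.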


Related topics


ArticleTitle=Web services tip: Representations of null in XML Schema
publish-date=08092005