Schema scope: Primer and best practices

Understand a crucial aspect of schema design

XML Schema Definition Language is an excellent tool for modeling and validating data. Schema scope is a critical but often overlooked aspect of schema design. The elements, types, and attributes of a schema can have either local or global scope, a choice that affects the reusability, interoperability, and life cycle of the schema. This article illustrates the definition and use of locally and globally scoped schema constructs, and provides tips and best practices for choosing the optimal scope design for your schemas.


Casey D Jordan (casey.jordan@jorsek.com), Owner, Jorsek

Casey Jordan is a structured/smart data evangelist and the co-founder of Jorsek, a company providing software and services to improve the quality of organizations' data deployment, storage, and human collaboration. With more than 10 years of experience in content management and web services, Casey helps companies align their content strategies with XML technologies such as native XML databases, XQuery, XSLT, XML Schema, and DITA.



Dale Waldt, Senior Consultant, aXtive Minds

Dale Waldt has more than 25 years of experience leading the design and development of XML applications, composition and publishing solutions, and complex web sites for government, commercial, and nonprofit organizations. For the past 10 years he has been a consultant, instructor, and industry analyst focusing on web and content technology and open standards adoption. Dale frequently works with development teams optimizing processes, designing schemas, leading data and application design and development, evaluating software and services, and training developers in XML, XSLT, and related technologies.



17 September 2010 (First published 14 September 2010)


17 Sep 2010 - As a follow-up to reader comments, the author updated the code in Listings 3, 11, 12, and 13.

A schema is a well-formed XML document that uses the powerful XML Schema Definition Language (XSD, also sometimes called W3C Schema) to model and validate other XML data. Depending on how you define schema particles (elements, types, attributes, and other constructs), they have an associated scope that is either global/exposed or local/hidden. The scope design of your schema significantly affects how the schema can evolve, be reused, and interoperate with other technologies.

Whether you are just starting to use schemas or want to get more out of your current solution, understanding schema scope can play a key role in success. In this article, we'll first show how global or local scope is defined for various schema particles and explain how scope affects their behavior. Then we'll describe basic schema design patterns and explore considerations and best practices for creating scope designs that fit the needs of your projects.

Frequently used acronyms

  • W3C: World Wide Web Consortium
  • XML: Extensible Markup Language
  • XSD: XML Schema Definition Language

Defining elements with global scope

A schema's highest-level container element is schema. Direct child elements of the schema element are defined globally (that is, they have global scope). You can use global elements as root nodes and can reference them from other parts of the schema. You define the element once and can reuse it throughout the schema.

The schema example in Listing 1 shows a simple data model with one global element, named postalCode:

Listing 1. Schema with a single global element
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name='postalCode' type='xs:string'/>
</xs:schema>

You can use the schema in Listing 1 to validate the following data instance successfully:

<postalCode>14534</postalCode>

In this data instance, postalCode is the root element—the highest-level container within the data instance. Only elements defined at the highest level in the associated schema can serve as root elements in a data instance. The schema in Listing 1 defines only one element, so it's easy to understand that only postalCode can serve as the root element in an instance.

The example schema in Listing 2 defines two elements at the root level:

Listing 2. A schema with two possible root elements
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name='postalCode' type='xs:string'/>
  <xs:element name='zipCode' type='xs:string'/>
</xs:schema>

Either postalCode or zipCode can serve as a root element in an instance modeled by the schema in Listing 2.


Defining elements with local scope

Defining elements locally prevents them from being exposed to other parts of the schema. A local element's context is limited to its current location, so it cannot be referenced from other parts of the schema. In the example in Listing 3, the zipCode element is not defined globally. Instead, it is defined inside the complexType of an element definition, as a subelement of the address element.

Listing 3. A single global element with local child elements
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name='address'>
            <xs:complexType>
                <xs:sequence>
                    <xs:element name='street' type='xs:string'/>
                    <xs:element name='city' type='xs:string'/>
                    <xs:element name='state' type='xs:string'/>
                    <xs:element name='zipCode' type='xs:string'/>
                </xs:sequence>
            </xs:complexType>
        </xs:element>    
</xs:schema>

Because the definition of the zipCode element is within the declaration of the address element, it is a local definition and has scope only inside an address element. For a document instance to be valid, the zipCode element must appear inside an address element, as in Listing 4:

Listing 4. Valid data instance for the schema in Listing 3
<address>
 <!-- street, city and state hidden for example purposes -->

 <zipCode>14534</zipCode>
</address>

In Listing 4, the address element is the root element. The zipCode element cannot serve as the root element in an instance, because it is not defined globally at the root level within the schema module. Locally defined elements can only appear in the context of the element definition in which they are defined.
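As a counter-example to Listing 4, the following instance would be rejected by a validator that uses the schema in Listing 3. It is a hypothetical illustration, not one of the original listings:

```xml
<!-- Invalid against Listing 3: zipCode is declared only locally,
     inside address, so it cannot be the root element -->
<zipCode>14534</zipCode>
```

The same instance is valid against the schema in Listing 2, where zipCode is declared globally.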


Referencing global elements and attributes in a local scope

In addition to serving as a root element, any globally defined element can be referenced and appear in any local scope where it might be needed. In the example in Listing 5, the globally defined zipCode element is used inside the definition of the address element in a locally scoped context:

Listing 5. Global element referenced in local scope
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="address">
  <xs:complexType>
   <xs:sequence>
    <xs:element name='street' type='xs:string'/>
    <xs:element name='city' type='xs:string'/>
    <xs:element ref='zipCode'/> <!-- reference to globally defined element -->
   </xs:sequence>
  </xs:complexType>
 </xs:element>
 
<!-- Globally defined element that is referenced in element above -->
 <xs:element name='zipCode' type='xs:string'/>
</xs:schema>

You can see that exposing element declarations globally supports modularization and reuse. You can reference the zipCode element in other parts of this schema and in parent schemas that might import this schema.
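To make the reuse concrete, here is a minimal sketch in which a second element references the same global zipCode declaration. The deliveryZone and zoneName elements are hypothetical names introduced only for illustration:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Declared once, globally -->
  <xs:element name="zipCode" type="xs:string"/>

  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="street" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
        <xs:element ref="zipCode"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Hypothetical second consumer of the same global declaration -->
  <xs:element name="deliveryZone">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="zoneName" type="xs:string"/>
        <xs:element ref="zipCode" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because both content models point at one declaration, a change to the type of zipCode takes effect in both places.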

Attribute definitions behave in the same manner. In Listing 6, for example, the globally defined state attribute is referenced within the address element in a locally scoped context:

Listing 6. Global attribute referenced in local scope
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name='address'>
  <xs:complexType>
   <!--[.. elements removed for readability..]-->
   <xs:attribute ref="state"/> <!-- referencing globally defined attribute -->
  </xs:complexType>
 </xs:element>

 <xs:attribute name="state" type="xs:string"/> <!--globally defined attribute -->

</xs:schema>

Type definitions

Just as you can define elements and attributes globally and locally, you also can define types. The preceding examples use locally defined types for the address element definition. To make this type definition global, remove it from the local definition, give it a unique name, and place it under the root schema node, as in Listing 7:

Listing 7. Global type referenced in local scope
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name='address' type="address.type"/>
 
  <xs:complexType name="address.type">
   <xs:sequence>
    <xs:element name='street' type='xs:string'/>
    <xs:element name='city' type='xs:string'/>
    <xs:element name='state' type='xs:string'/>
    <xs:element name='zipCode' type='xs:string'/>
   </xs:sequence>
 </xs:complexType>
</xs:schema>

The type definition is now global and has the unique name address.type. To associate this type with an element, set the element's type attribute to the global type name (type="address.type"). You can extend a global type definition by using the xs:extension element or restrict it by using the xs:restriction element.
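The following sketch shows one way extension might look for the global type in Listing 7. The internationalAddress.type type, the internationalAddress element, and the country element are hypothetical names, not part of the original example:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="address.type">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="state" type="xs:string"/>
      <xs:element name="zipCode" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical derived type: inherits the four elements above
       and appends country after them -->
  <xs:complexType name="internationalAddress.type">
    <xs:complexContent>
      <xs:extension base="address.type">
        <xs:sequence>
          <xs:element name="country" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="internationalAddress" type="internationalAddress.type"/>
</xs:schema>
```

A locally defined (anonymous) type has no name to put in the base attribute, so this kind of derivation is available only for global types.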


Basic design patterns

It's not always easy to determine whether you should define schema particles with local or global scope. Depending on the use case, namespacing requirements, and schema evolution, the best choices can vary. Generally, a schema design falls into one of four basic patterns:

  • Russian doll
  • Salami slice
  • Venetian blinds
  • Garden of Eden

It is important to understand these patterns to determine the best solution for your project.

The Russian doll pattern

This pattern is named after the famous Russian Matryoshka dolls: wooden dolls of decreasing size placed one inside another. The Russian doll pattern defines all subelements locally; thus, each element and its type are encapsulated by their parent, much like the Russian dolls.

The example in Listing 8—a simplified representation of a help document for a home appliance—demonstrates this pattern:

Listing 8. Russian doll style schema
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="HelpDoc">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="Section">
     <xs:complexType>
      <xs:sequence>
       <xs:element name="Title" type="xs:string"/>
       <xs:element name="Body" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="name" type="xs:string"/>
     </xs:complexType>
    </xs:element>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>

An associated instance might look like Listing 9.

Listing 9. Instance matching Russian doll schema model
<HelpDoc>
 <Section name="operation_instructions">
  <Title>Operating your appliance.</Title>
  <Body>First, open the packaging and check to see...</Body>
 </Section>
</HelpDoc>

You can see that every subelement, attribute, and type in Listing 8 is defined locally. The only global element is the root, HelpDoc. This syntax is compact and some may consider it easily readable. Russian doll style schemas don't expose their components to other types, elements, or schemas, so they are also considered highly decoupled (that is, elements are not globally dependent on other elements) and cohesive (related elements are grouped within a single self-contained parent).

This pattern epitomizes a schema intended to have little interaction with other systems and no reuse of its components. By defining schemas in this manner, you can keep structures self-contained, hide namespaces, and prevent influence from other systems.

Quick tip: Namespaces

When a schema is namespaced (that is, has a targetNamespace), all global particles must be referenced by a qualified name (for example, prefix:name). Such namespaces are said to be exposed. For instance:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:xyz="http://xyzcompany.com"
  targetNamespace="http://xyzcompany.com">

  <xs:element name="HelpDocs">
   <xs:complexType>
    <xs:sequence>
     <!--   Global element is referenced,      -->
     <!--   must contain namespace prefix.     -->
     <xs:element ref="xyz:Section"/>
    </xs:sequence>
   </xs:complexType>
  </xs:element>

  <xs:element name="Section">
   <!--        -->
  </xs:element>

</xs:schema>

Because the Section element is global and a targetNamespace is declared, the xyz namespace prefix is required in the reference to the Section element. <xs:element ref="Section"/> is not a valid reference.

Salami slice pattern

With the Salami slice pattern, you take the next step toward exposing content models. In this pattern, you move all your locally defined elements into global definitions. Listing 10 shows the Russian doll style example in Listing 8 modified to fit the Salami slice pattern:

Listing 10. Salami slice pattern
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="Body" type="xs:string"/>
 <xs:element name="Title" type="xs:string"/>

 <xs:element name="Section">
  <xs:complexType>
   <xs:sequence>
    <xs:element ref="Title"/>
    <xs:element ref="Body"/>
   </xs:sequence>
   <xs:attribute name="name" type="xs:string"/>
  </xs:complexType>
 </xs:element>

 <xs:element name="HelpDocs">
  <xs:complexType>
   <xs:sequence>
    <xs:element ref="Section"/>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>

The Salami slice pattern exposes all elements so you can reference and reuse them in other parts of the schema, and it makes them transparent to other schemas. A major advantage in this approach is that elements are highly reusable. However, this also means that all namespaces are globally exposed, and coupling between elements increases. In Listing 10, the Section element is globally coupled to the Title and Body elements. Any modification to the Title and Body elements would subsequently affect the Section definition.

Venetian blinds pattern

In the Venetian blinds pattern, instead of defining all elements globally, you start by defining all types globally, as in the example in Listing 11:

Listing 11. Venetian blinds pattern
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

        <xs:complexType name="section.type">
            <xs:sequence>    
                <xs:element name="Title" type="xs:string"/>    
                <xs:element name="Body" type="xs:string"/>
            </xs:sequence>

            <xs:attribute name="name" type="xs:string"/>
        </xs:complexType>

        <xs:complexType name="helpdocs.type">
            <xs:sequence>    
                <xs:element name="Section" type="section.type"/>
            </xs:sequence>
        </xs:complexType>

        <xs:element name="HelpDocs" type="helpdocs.type"/>

</xs:schema>

The Venetian blinds style uses global type definitions to increase reuse capabilities. Because all subelements are localized, it comes with the added benefit of being able to hide namespaces. This approach allows you to expose your structure definitions for reuse while using the elementFormDefault attribute as a switch to hide or expose namespaces. You get the best of both worlds!
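The effect of the elementFormDefault switch can be sketched as follows. This is a trimmed Venetian blinds schema under stated assumptions: it borrows the http://xyzcompany.com namespace and xyz prefix from the namespaces tip, and the instance fragments in the comment are illustrative only:

```xml
<!--
  With elementFormDefault="unqualified" (the default), only the global
  root carries the namespace in an instance; local children do not:

    <xyz:Section xmlns:xyz="http://xyzcompany.com">
      <Title>Operating your appliance.</Title>
      <Body>First, open the packaging and check to see...</Body>
    </xyz:Section>

  With elementFormDefault="qualified", the local children must also be
  in the target namespace (for example, xyz:Title and xyz:Body).
-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xyz="http://xyzcompany.com"
           targetNamespace="http://xyzcompany.com"
           elementFormDefault="unqualified">

  <xs:complexType name="section.type">
    <xs:sequence>
      <!-- Local declarations: their namespace form follows elementFormDefault -->
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Body" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="name" type="xs:string"/>
  </xs:complexType>

  <!-- Global declaration: always in the target namespace -->
  <xs:element name="Section" type="xyz:section.type"/>
</xs:schema>
```

The switch works here because Title and Body are local. Global elements, as in the Salami slice pattern, are always namespace-qualified regardless of this setting.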

Garden of Eden pattern

In the Garden of Eden design pattern, you make both element declarations and type declarations global, taking globalization to the extreme. Listing 12 shows a Garden of Eden style schema:

Listing 12. Garden of Eden pattern
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

        <xs:attribute name="name" type="xs:string"/>
        <xs:element name="Title" type="xs:string"/>    
        <xs:element name="Body" type="xs:string"/>
        <xs:element name="Section" type="section.type"/>
        <xs:element name="HelpDocs" type="helpdocs.type"/>

        <xs:complexType name="section.type">
            <xs:sequence>    
                <xs:element ref="Title"/>    
                <xs:element ref="Body"/>
            </xs:sequence>
            <xs:attribute ref="name"/>
        </xs:complexType>

        <xs:complexType name="helpdocs.type">
            <xs:sequence>    
                <xs:element ref="Section"/>
            </xs:sequence>
        </xs:complexType>
</xs:schema>

By making every possible element, attribute, and type global, you create a scenario that maximizes reuse, both internally and between schemas—albeit by forcing namespaces to be exposed. By completely exposing your structure, you make the schema highly coupled but agile. Because elements are interdependent, sweeping changes to the schema can be applied quickly.
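Reuse between schemas might look like the following sketch. It assumes, hypothetically, that the schema in Listing 12 has been saved as helpdocs.xsd, given targetNamespace="http://xyzcompany.com", and had its internal references qualified accordingly. The Manual element, the edition attribute, and the http://example.com/manual namespace are invented for illustration:

```xml
<!-- Hypothetical consuming schema; file name and namespaces are assumptions -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xyz="http://xyzcompany.com"
           targetNamespace="http://example.com/manual"
           elementFormDefault="qualified">

  <!-- Makes the global components of the other namespace available here -->
  <xs:import namespace="http://xyzcompany.com" schemaLocation="helpdocs.xsd"/>

  <xs:element name="Manual">
    <xs:complexType>
      <xs:sequence>
        <!-- Only global declarations can be referenced across schemas -->
        <xs:element ref="xyz:Title"/>
        <xs:element ref="xyz:Section" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="edition" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A Russian doll schema offers nothing to reference in this way except its root element.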


Best practices: Schema management and evolution

When designing schemas, you often perform a balancing act among exposing reusable components, hiding namespaces, and decreasing coupling (the interdependence of multiple global elements and types). Figure 1 summarizes the reuse potential of each of the four schema patterns, indicating their relative rank in both coupling and exposure:

Figure 1. Exposure versus coupling with different patterns
A chart illustrating exposure versus coupling with the four schema patterns

Providing high potential for reusing schema components can reduce future development time and make sweeping changes easy. However, it can also create scenarios in which multiple elements and types are unnecessarily coupled. When schemas become highly coupled, elements and types become interdependent, making it difficult to manage future changes and additions. Coupling run amok prevents schema evolution, because other systems depend on your interfaces remaining consistent. It's important to be careful with what and how much you expose. Once you make a choice, it can be difficult to undo.

Nevertheless, there are ways to ensure future success. Start with an appropriate scope design:

  • If schema reuse is not imperative and minimizing size is, use the Russian doll style, because it is compact and can be used to maximize namespace hiding.
  • If element substitution is imperative to your design or you need to make elements transparent to other schemas, use the Salami slice style or Garden of Eden.
  • Use the Venetian blinds style if you want high levels of reuse as well as to maximize the potential for namespace hiding. You can then use elementFormDefault as a toggle to require exposed namespaces or hide them.
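Element substitution, mentioned in the second guideline, depends on global declarations because both the head of a substitution group and its members must be global elements. A minimal sketch, in which the Warning element is a hypothetical addition to the Salami slice example:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Title" type="xs:string"/>
  <xs:element name="Body" type="xs:string"/>

  <!-- Hypothetical element that may appear wherever Body is referenced -->
  <xs:element name="Warning" type="xs:string" substitutionGroup="Body"/>

  <xs:element name="Section">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Title"/>
        <!-- An instance may supply either Body or Warning here -->
        <xs:element ref="Body"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

In a Russian doll or Venetian blinds schema, Body would be a local declaration and could not serve as the head of a substitution group.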

Also, don't be afraid to mix it up. Using multiple schema design patterns within a particular schema can be highly beneficial. For portions of your structure that you'd like to keep private and hidden, use the Russian doll style. At the same time, you might want to expose some elements globally using a Salami slice or Garden of Eden design. For example, Listing 13 uses a highly exposed Garden of Eden style with a hidden Russian doll style section:

Listing 13. Mixing patterns
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

        <!-- Garden of Eden style component    -->
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Body" type="xs:string"/>
        <xs:element name="Section" type="section.type"/>
        <xs:element name="HelpDocs" type="helpdocs.type"/>

        <xs:complexType name="section.type">
            <xs:sequence>    
                <xs:element ref="Title"/>    
                <xs:element ref="Body"/>
            </xs:sequence>

            <xs:attribute name="name" type="xs:string"/>
        </xs:complexType>

        <xs:complexType name="helpdocs.type">
            <xs:sequence>    
                <xs:element ref="Section"/>
                <!--    Russian doll style component    -->
                <xs:element name="Credits">
                    <xs:complexType>
                        <xs:sequence>    
                            <xs:element name="Author" type="xs:string"/> 
                            <xs:element name="Year" type="xs:string"/>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>       
        
</xs:schema>

In this scenario, perhaps you need to expose the HelpDocs, Section, Title, and Body elements for other schemas to use. However, you want to hide the Credits element to prevent it from being coupled to other schema definitions.
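For reference, an instance along these lines should validate against the schema in Listing 13. The Section content is reused from Listing 9, and the Author and Year values are placeholders:

```xml
<HelpDocs>
  <Section name="operation_instructions">
    <Title>Operating your appliance.</Title>
    <Body>First, open the packaging and check to see...</Body>
  </Section>
  <!-- Credits, Author, and Year are local declarations: they are valid
       only here, inside HelpDocs, and cannot be referenced elsewhere -->
  <Credits>
    <Author>Author name</Author>
    <Year>2010</Year>
  </Credits>
</HelpDocs>
```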

If the need for reuse is not great, it's completely logical to write schemas this way and start with less exposure. Making an element global later is easy, whereas removing exposure once other schemas depend on it can be a huge pain.


Conclusion

Before you begin any schema project, it's imperative to align your design choices with your goals. By understanding the uses of schema scope, you can streamline the process of managing schemas and content. Ultimately, this will increase your ability to manage schema life cycle and allow your schemas to interact efficiently with other systems.
