Real-world XML Schema

Good naming conventions extend beyond retail

Paul Golick (golick@us.ibm.com), Programmer, IBM, Software Group
During the period that IXRetail developed its "XML Best Practices" document, Paul Golick was editor of that document, Secretary of IXRetail, and the representative to IXRetail and to the ARTS Data Model Committee for IBM Retail Store Solutions. You can contact Paul at golick@us.ibm.com.
Richard Mader (Maderr@nrf.com), Executive Director, ARTS
Richard Mader is Executive Director of ARTS and Administrator of IXRetail. You can contact Richard at Maderr@nrf.com.

Summary:  This article presents a set of 17 broadly applicable practices for using XML. These practices were published by the Association for Retail Technology Standards to aid its development of standardized XML messages for exchange between information technology systems that support retail stores.

Date:  01 Jan 2002
Level:  Introductory


Does your industry provide a set of best practices for XML Schema to streamline industrywide data integration? If not, perhaps it should follow retail's lead. Since 1993, the Association for Retail Technology Standards (ARTS) of the National Retail Federation (NRF) has been developing a standard data model to help retailers integrate applications and interface point-of-sale (POS) data more easily.

The International XML Retail Cooperative (IXRetail) is the ARTS committee that is standardizing XML messages for exchange between IT systems that support retail stores. IXRetail has adapted names and definitions from the ARTS Data Model standard for use in XML messages. IXRetail has also worked on standardizing other aspects of XML technology across the retail industry and among its vendors.

This article was initially prepared for publication in two installments in NRF's STORES Magazine. Links to the online versions of those article installments are in the Resources section. The material from those installments has been collected here and revised for developerWorks readers.

XML Schema language

XML provides the format for identifying information that applications need, but does not assure that the information needed by the recipient is provided. However, XML provides formatting structures that help obtain this assurance. The XML Schema language elaborates on XML and related specifications to provide a flexible way to describe a shared vocabulary of names that can be used to mark up XML documents. By using a shared schema, applications can use validating parsers to assure that appropriate information is sent or received.
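The gap between a well-formed message and a complete one can be illustrated with a minimal Python sketch. It uses only the standard library's xml.etree.ElementTree, which checks well-formedness and does not validate against an XML schema. The message layout, the child element names, and the REQUIRED_CHILDREN list are assumptions made for this illustration; the hand-written list stands in for what a shared schema and a validating parser would check declaratively.

```python
import xml.etree.ElementTree as ET

# Hypothetical messages: both are well-formed XML, but the first one
# omits the department identifier that the recipient needs.
INCOMPLETE = (
    "<InventoryControlDocument>"
    "<Quantity>4</Quantity>"
    "</InventoryControlDocument>"
)
COMPLETE = (
    "<InventoryControlDocument>"
    "<POSDepartmentID>Grocery</POSDepartmentID>"
    "<Quantity>4</Quantity>"
    "</InventoryControlDocument>"
)

# Hypothetical list of child elements that the recipient requires.
REQUIRED_CHILDREN = ("POSDepartmentID", "Quantity")


def missing_children(xml_text):
    """Parse a message and return the required child names it lacks."""
    # fromstring raises ParseError only when the text is not well-formed;
    # it says nothing about whether the needed information is present.
    root = ET.fromstring(xml_text)
    return [name for name in REQUIRED_CHILDREN if root.find(name) is None]


print(missing_children(INCOMPLETE))
print(missing_children(COMPLETE))
```

Both messages parse without error; only the explicit check reveals that the first one lacks information the recipient needs.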

IXRetail has chosen XML because of the universal applicability of XML to structured document and data exchange on the Web. XML and the XML Schema language were designed as general solutions that include facilities for almost every need. This has made XML Schema too general for use without additional constraints. For example, there are multiple ways to describe values from a set that may vary: xs:choice, element substitution groups, abstract elements, abstract types, and many more. Each of these alternatives has different characteristics. In some cases, one alternative may be clearly the most appropriate. However, in many cases, several different alternatives could be chosen. This is not desirable: Arbitrary selection among suitable features of XML Schema can hide similarities in related messages or among similar types and can cause people who maintain or extend the message system to waste effort preserving what was arbitrarily chosen. In many cases, IXRetail adopted a "Best Practice" guideline to state a preference for using a specific feature of XML or the XML Schema language for specified situations.

The level of assurance that a schema can provide to an application depends on how well the schema translates application requirements into requirements on XML messages. The XML Schema language can describe most common constraints on components of XML messages. In some cases, you can have an optimal schema that validates every "good" message and rejects every "bad" message.

However, even when such an optimal schema is possible, you may want to let the application deal with some bad messages (such as when valid values change too dynamically for enumeration in a standardized schema). Further, some constraints cannot be described in the XML Schema language. As a result, validation of messages with XML schemas does not eliminate the need for applications to verify that input data is acceptable.
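As a rough sketch of the kind of check that remains with the application, the Python fragment below compares department identifiers in a message against a set of values that is assumed to change at run time. The SalesSummary and Line element names, the department values, and the unknown_departments helper are all hypothetical; only POSDepartmentID is a name taken from this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; a schema could confirm its structure, but it
# cannot know which departments are active today.
MESSAGE = """\
<SalesSummary>
  <Line><POSDepartmentID>Grocery</POSDepartmentID></Line>
  <Line><POSDepartmentID>Seasonal</POSDepartmentID></Line>
</SalesSummary>
"""


def unknown_departments(xml_text, active_departments):
    """Return department IDs in the message that are not currently active."""
    root = ET.fromstring(xml_text)
    # iter() visits every descendant element with the given tag name.
    found = [el.text.strip() for el in root.iter("POSDepartmentID") if el.text]
    return [dept for dept in found if dept not in active_departments]


# Assumed to be loaded from a store system at run time.
active = {"Grocery", "Bakery"}
print(unknown_departments(MESSAGE, active))
```

Keeping such values out of a standardized enumeration means the schema does not need to be reissued each time the set changes, at the cost of a check like this one in the application.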

Just as messages may be good or bad depending on whether the application finds them acceptable, schemas may be considered good if acceptable by some criteria. The question is, which criteria?

Some criteria seem obvious. Tools that conform to the XML Schema language can process schemas written in that language, which suggests that using the W3C specification for XML schemas should be a guideline. Things were not always so obvious, however, because the W3C formally adopted the XML Schema language only recently. Many people predicted that the language would fail because it was complex from the start and was growing more complex. This complexity, however, was necessary to deal with the wide range of different uses.

Criteria for "good" XML and XML Schema specifications may vary widely depending on how those specifications are used. Even when there is agreement to use the W3C specification for XML Schema, there may be no single set of criteria for "good" application of the standards to specifications; determining the "best" criteria can be even more uncertain. However, when you can restrict the particular environment of application, you can hope to determine -- by well-informed consensus -- the set of criteria that will lead to the best practice of using XML and related standards in that environment. In this case, the particular environment is data interchange between and among information technology applications that either support the operation of retail stores or integrate retail stores with the retail enterprise.


Best practices

The best practices guidelines listed below have been approved by the IXRetail Technical Committee. The guidelines are listed in the order approved by the Best Practices Subcommittee of IXRetail; the order has no other significance. Each guideline is shown in bold and is followed by comments describing rationale for the guideline or related remarks from the authors of this article. However, only the guidelines were approved; IXRetail has not approved the commentary. The guidelines may evolve during use and additional guidelines are anticipated. ARTS developed these practices to guide its development of standardized XML schemas. ARTS published these practices to guide developers of retail applications until they are able to use the forthcoming ARTS standards. If you develop other kinds of applications, many of these guidelines can improve the consistency of your XML schema (just substitute the name of your specification approver for IXRetail).

  1. Use "UCC camel case" with no spaces or hyphens between words for all XML names assigned. This kind of camel case capitalizes the first letter of each word of a compound name, including the initial letter of the name. This ensures that names are both legal for XML (no spaces) and more readable than single-case text. An example of a UCC camel case name is InventoryControlDocument. (Some organizations have adopted naming standards that use LCC camel case, such as initialLowerCase, for some kinds of names, typically attribute names. IXRetail decided against this. The decision to use UCC or LCC or both is not driven by XML Schema, which has distinct symbol spaces for elements, attributes, and types.)
  2. Readability is more important than tag length. Although IXRetail remains concerned that long tags will make XML documents impracticably long, it is important to help users choose the correct tag. For example, POSDepartmentID is preferred over ID_DPT_POS. (It is also anticipated that "messaging infrastructure" will provide message compression.)
  3. With a few exceptions, abbreviations and acronyms should not be used in element, attribute, and type names. The exceptions are: GTIN, ID, and POS for, respectively, Global Trade Item Number, Identifier, and Point of Sale. The logical view of the ARTS Data Model avoids abbreviations. Only the exceptions listed could be justified. This guideline is also considered desirable for names used for components of XML messages. (Clearly, the exceptions listed here are specific to the retail industry; other industries are likely to permit other abbreviations.)
  4. Remove entity names from attribute names where possible. In the ARTS Data Model, the entity name is often used as a prefix for attribute names. This makes importing foreign keys easier in relational database models. However, the hierarchical structure of an XML message eliminates ambiguity and makes repeating the entity name unnecessary. Thus, repetition of entity names needlessly increases tag lengths. Although this guideline uses the entity and attribute terminology of the ARTS Data Model, it applies to both element and attribute names of XML; Data Model attributes can correspond to either elements or attributes in XML messages. (Guideline 8 is related and is a generalized version of this guideline.)
  5. Use the W3C specification for XML schemas instead of DTDs or alternative schema languages. XML Schema allows local element names, but DTDs require that all element names be globally unique. (The potential for automated parsing with open-source validating parsers also influenced adoption of this guideline.)
  6. Enumeration values should use names only (not numbers) and the names used for enumeration values must conform to the guidelines for element or attribute names. If suitable names already exist, they should be used (instead of IXRetail creating new names). Prefer ISO standards to national standards or consortium specifications. Names composed of natural language words can suggest the meaning of the value. Numbered enumerations invite nonstandard extensions that do not interoperate. (A criticism of this guideline is that the requirement to use names forces a choice of natural language. The language chosen for these names should be the one most helpful to those who maintain and extend the messaging system. However, these names should be limited to differentiating information handled differently by the information technology system; users should always be presented with messages in each user's chosen language. This guideline is not an excuse to avoid good user interfaces.)
  7. Enumeration values should use names consisting of English words. Names based on a natural language can suggest meanings by appropriate selection of words, but numeric values do not. It is helpful to be consistent with names derived from the ARTS Data Model, which uses English words. The words displayed to end users for elements, attributes, and enumerated values need to be chosen for usability by end users and will probably need to be translated even for English speakers. Only programmers who do system debugging should be expected to deal directly with XML messages. (Some industries may not need a guideline such as this or may choose a different language. However, not making a choice may lead to use of cryptic "words" that are no more helpful than arbitrary numbers.)
  8. Names should not include a repetition of the names of containing structures. The container provides adequate context; using its name in component names is redundant and needlessly lengthens component tags. For example, a <Customer> element could contain a <Name> element, but should not contain a <CustomerName> element, which would repeat the containing structure name (Customer). (Guideline 4 is related but uses data modeling terminology. Using repetition consistently was also considered, but led to an obviously undesirable practice.)
  9. All schemas specified by IXRetail shall put the global names they declare in one namespace; this shall be the IXRetail namespace, which is http://www.nrf-arts.org/IXRetail/namespace/. Putting IXRetail names in a namespace avoids name collisions with schema specifications from other sources that our users may need. By avoiding multiple namespaces, IXRetail can better limit occurrences of unintended equivalent declarations or definitions. IXRetail can check that each global name within this namespace has a unique declaration or definition. This guideline does not restrict importing schema documents that use other namespaces. (By limiting itself to one namespace, IXRetail has committed itself to carefully reviewing each addition it makes to its namespace. It is anticipated that this namespace will grow as IXRetail standardizes additional message schemas. Other guidelines limit the use of global names and reduce the difficulty in following this guideline. Because only IXRetail can approve specifications using its namespace URI, each other specification approver must adopt its own namespace URI. The slash or solidus character ["/"] that terminates this URI has also been discussed. Many standard namespace names do not end with this character. The registrar that provided this URI was asked to provide an identifier that would not be used for a specific file because the identified resource would change over time; the registrar provided a URI for a directory. However, a namespace is a conceptual resource and its URI is used to name it and not to locate it. The namespace is neither a file nor a directory; whether its URI ends in a solidus is not significant.)
  10. Each XML instance document produced by IXRetail should specify a default namespace, which should be the IXRetail namespace. The use of a default namespace avoids the need to explicitly prefix names from that namespace. This shortens the tags that use names from the IXRetail namespace. Specifying a default namespace also provides the appropriate example to users of the XML schema documents specified by IXRetail. (It is important to note that this guideline applies to "XML instance documents" and not to "XML schema documents." It is intended that documents that specify particular messages, such as example interaction scenarios, be distinct from the documents that specify schemas for standardized message types. This distinction is made to clearly differentiate what is being standardized and what are examples of application of the standard. XML messages that only reference a schema and do not add new declarations are XML instance documents in this sense.)
  11. Each XML schema document produced by IXRetail should specify a default namespace and a target namespace, both of which should be the IXRetail namespace. This provides consistent references to names from the IXRetail namespace. Although this requires that names from W3C's XML Schema specification be explicitly prefixed, it only increases the length of the schema, not the length of instance documents. It also makes the handling of all names defined in XML Schema and related XML standards consistent with each other: W3C standardized names are always prefixed. (As with the preceding guideline, schema documents and instance documents are distinguished. For this distinction, an XML schema document is an XML document that adds new attribute or element declarations.)
  12. Where domain experts believe a type is likely to be reused, either a simpleType or a complexType should be defined globally in the namespace instead of being defined anonymously in the Element declaration. Because they are not usually used in tags, type names can be concatenated from sufficient roots and modifiers to identify the appropriate domain without necessarily causing long tags. This differs from element names, which are always used in tags. As a result, the alternative of global element names would lengthen tags. (Sometimes a type name is used in an instance document, such as when the type of a concrete element is specified with xsi:type. In such cases, the length of type names does affect message length.)
  13. Schemas should use nested elements that use the type attribute or an inline type definition (simpleType or complexType) instead of the ref attribute that references a global element. Whenever possible, local element naming should be used so that names can be kept short. The global part of the IXRetail namespace should be reserved for names with well-defined meanings. These global names should be constructed with sufficient roots and modifiers to identify their domain of use. (Guideline 12 applies when reuse of declarations or definitions is suggested. Guideline 12 states a preference for global types over global elements. The outermost element of a message will appropriately have a global name, which will distinguish that message from all other messages. Elements contained within a message always have the context of the containing message.)
  14. Each version of a schema produced by IXRetail must have its own URI value for the schemaLocation attribute that is different from the URI value of every other version of every other schema; the URI must be in a hierarchy agreed with ARTS-NRF (each schemaLocation will be the URI for a UTF-8 file subordinate to http://www.nrf-arts.org/IXRetail/schemaLocation/). The version attribute of the <xs:schema> opening tag should be specified and should have a value that is the same string as the schemaLocation URI value. The assignment of values for schemaLocation and for version should be tied to schema approval, establishment, and release. These values should also include release date, following the pattern used by W3C. Even the initial versions of IXRetail schemas should use some version-control mechanism. It is desirable to use a version mechanism that parallels the schema-discovery mechanism standardized for validating parsers. An example of the W3C pattern for including release date in a name is: http://www.w3.org/TR/2001/REC-xmlschema-2-20010502. (XML requires that all conforming XML processors support UTF-8. UTF-8 can also be browsed or read by almost all text-processing tools, many of which would have problems with UTF-16. The development procedures assumed by this guideline may not be appropriate for all organizations, and some organizations may already have established conventions for version identification that do not permit the schemaLocation hint provided by using version as suggested here.)
  15. Use names from the ARTS XML Dictionary when appropriate, instead of inventing new names. The ARTS XML Dictionary is a list of names initially derived from entity and attribute names of the logical view of the ARTS Data Model. The context of the ARTS Data Model provides significant semantics to these names. The names must still be selected and used consistently with all the other guidelines. Names from IXRetail schemas will also be added to the ARTS XML Dictionary. (This guideline is intended to make use of XML technology to extend the database efforts that preceded it. IXRetail and ARTS staff have expended much effort on converting data dictionaries and related Data Model specifications. Although these conversions were a significant effort, they ensured that the XML specifications were closely related to information flows and processes already widely deployed in the retail industry. Without these conversions, much more requirements validation would have been required. Further, requirements gathering and validation is perhaps the slowest part of the standardization process, since the industry's leaders are unwilling to tell their competitors about high-priority requirements.)
  16. When choosing a name that is global within a namespace, use compound names that describe the specific meaning of the thing being declared or defined. The purpose of this guideline is to avoid inappropriate use of a general term with a specific meaning. If global names are simple, users will tend to think of them as having a general utility, even when the type was chosen to meet the requirement of only a limited domain, industry segment, or geographic region. For example, a LineItem global concrete type should not be defined because the information components differ significantly between sales line items and tender line items. (This guideline does not apply to local names, which also have the context of use to describe their meaning and which do not prevent other uses of the same name with different context and different meaning.)
  17. Use consistent prefixes for names from namespaces that differ from the IXRetail namespace. Use no prefix for the IXRetail namespace. Use only these prefixes and definitions:
    • xml (defined in XML standard)
    • xmlns (defined in Namespaces in XML standard)
    • xs http://www.w3.org/2001/XMLSchema
    • xsi http://www.w3.org/2001/XMLSchema-instance

    Keeping default namespaces and prefixes consistent helps make included schemas have the same meaning as inline textual inclusion, which ensures that people come to the same conclusions as to meaning that validating parsers do. (Guidelines 10 and 11 specify that the IXRetail namespace be specified as the default namespace; as a result, no prefix is needed for its global names. Additional prefixes would be added to this guideline as their use is approved.)
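Some of the naming and namespace guidelines lend themselves to mechanical checking. The Python sketch below walks an instance document and reports names that are not in the expected default namespace (guidelines 9 and 10), are not in UCC camel case (guideline 1), or appear to repeat the name of their container (guideline 8). It is a rough heuristic, not an IXRetail tool: the message, its element names, and the check_names and split_tag helpers are hypothetical, the camel case test only looks at the characters used, and the prefix test can flag a legitimate name that merely happens to begin with its container's name.

```python
import re
import xml.etree.ElementTree as ET

IXRETAIL_NS = "http://www.nrf-arts.org/IXRetail/namespace/"

# Heuristic for UCC camel case: an initial capital, then letters and digits.
UCC_NAME = re.compile(r"^[A-Z][A-Za-z0-9]*$")

# Hypothetical message; the element names are illustrative only.
MESSAGE = """\
<CustomerUpdate xmlns="http://www.nrf-arts.org/IXRetail/namespace/">
  <Customer>
    <Name>Pat Example</Name>
    <CustomerAddress>1 Main Street</CustomerAddress>
    <loyalty_id>12345</loyalty_id>
  </Customer>
</CustomerUpdate>
"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def check_names(xml_text):
    """Return a list of (element name, problem) pairs in document order."""
    problems = []

    def visit(element, parent_local):
        namespace, local = split_tag(element.tag)
        if namespace != IXRETAIL_NS:
            problems.append((local, "not in the expected namespace"))
        if not UCC_NAME.match(local):
            problems.append((local, "not upper camel case"))
        if parent_local and local.startswith(parent_local):
            problems.append((local, "repeats container name " + parent_local))
        for child in element:
            visit(child, local)

    visit(ET.fromstring(xml_text), None)
    return problems


for name, problem in check_names(MESSAGE):
    print(name, "-", problem)
```

Because the message declares a default namespace, none of its tags needs a prefix, yet the parser still reports every element as belonging to that namespace.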

The goal of these guidelines is to assist development of standardized XML schemas. Fundamental features include choosing names for descriptive value and continuity with prior industry standards, using local naming to keep message sizes reasonable, and planning for change. We hope that you find some of our results applicable to your needs.


Resources

