Tip: Use data dictionary links for XML and Web services schemata

Leaving anchors for the meanings in XML nodes

When designing XML and Web services schemata you will often (and ideally) reuse data elements defined in pre-existing standards. When you do, it is extremely useful to include links to such standards, providing precise data dictionary references. In so doing, you make processing and maintenance easier to automate. This tip illustrates this practice.

Uche Ogbuji, Principal Consultant, Fourthought, Inc.

Uche Ogbuji is a consultant and co-founder of Fourthought, Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is also a lead developer of the Versa RDF query language. He is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can contact Mr. Ogbuji at uche@ogbuji.net.



20 May 2004

In my Thinking XML column, I frequently focus on how various industries are working toward semantic transparency, which is the shared meaning of at least the framework of what is communicated in an XML document. Industries pursue this in one of two ways: either they create complete document formats along with the semantics of all the elements, attributes, and content, or they define terms and concepts discretely and individually, independently of the documents in which they would appear. I call these approaches top-down and bottom-up, respectively, and very active communities provide useful material on each.

If you develop XML schemas for your own use or for public use, the usual sage advice is to be sure that you aren't carelessly duplicating existing work. But even if you're truly in new territory, or have good reason not to simply reuse existing languages, try as much as possible to lean on existing initiatives towards semantic transparency. This holds whether the format is private or a shared or public resource. You can borrow naming conventions and perhaps even schema snippets from existing vocabularies, but a less common technique for building on the work of others is to incorporate what I call semantic links into your own schemata: special links to existing standards that define the meaning of the constructs you declare in your schema. Such links provide a particularly rich form of data dictionary for XML schemata. In this tip, I show how to work such links into your schemata.

What exactly is a service anyway?

Imagine that you are a developer working on information systems to manage your organization's Web services. Your first task is to put together some information for detailing the proposed Web services in the context of the budget requests that you must make to be sure the products get off the ground. You'll use XML so that you can easily generate reports and views on the information, and so that you can mix together information from several domains effortlessly. Listing 1 is a snippet from a RELAX NG schema that you might construct for this purpose.

Listing 1. Portion of RELAX NG schema for budget information for Web services development
<element name='service'>
  <!-- Semantic link: a service here means the OWL-S concept of a service -->
  <owl:sameClassAs
    resource="http://www.daml.org/services/owl-s/1.0/Service.owl#Service"/>
  <attribute name='id'/>
  <element name='synopsis'>
    <!-- Semantic link: the synopsis is equivalent to rdfs:comment -->
    <owl:samePropertyAs
      resource='http://www.w3.org/2000/01/rdf-schema#comment'/>
    <text/>
  </element>
  <element name='budget-request'>
    <attribute name='currency'/>
    <text/>
  </element>
  <element name='justification'>
    <text/>
  </element>
</element>

This snippet does not include the namespace declarations, but the owl prefix is bound to the namespace for OWL Web Ontology Language, http://www.w3.org/2002/07/owl#. OWL is the W3C standard for ontologies, which are documents that provide enough information to share the meanings of a group of concepts. As such, OWL is an excellent way to express semantic links in schemata. Each OWL element is an annotation that expresses a link from the containing RELAX NG definition.
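For readers who want to try the snippet, the omitted declarations might look like the following. This is a minimal sketch, not the full schema: the grammar and start wrappers are an assumption about how the snippet is embedded, and the only fixed values are the RELAX NG structure namespace and the OWL namespace named above.

```xml
<!-- Sketch only: the grammar/start wrapper is assumed, not taken from the original schema -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <start>
    <element name="service">
      <!-- Foreign-namespace annotation; RELAX NG validators ignore it -->
      <owl:sameClassAs
        resource="http://www.daml.org/services/owl-s/1.0/Service.owl#Service"/>
      <attribute name="id"/>
      <!-- The remaining definitions from Listing 1 would follow here -->
    </element>
  </start>
</grammar>
```

Because the owl prefix is bound to a namespace other than the RELAX NG one, the annotation does not affect validation; it is visible only to tools that look for it.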

The OWL expression owl:sameClassAs is used to declare that an information item in the XML schema represents some class of thing. In the example, it's important to be clear that what you mean by the element type named service is a Web service of the SOAP or REST or similar variety, so you anchor it to the concept definition from OWL-S, which is a standard ontology of Web services and the service-oriented architecture (SOA). Sometimes an element feels more like an attribute or property of another element than a class in its own right. Here, the owl:samePropertyAs expression is used to identify the synopsis element as equivalent to a comment in RDF Schema (RDFS), which is prosaically defined as "used to provide a human-readable description of a resource." In this case you're formally asserting that the synopsis is a human-readable description of the service.

RELAX NG makes it easy to add such annotations because any element in a foreign namespace can be placed anywhere in a definition. If you're using W3C XML Schema (WXS), things are more complicated: You have to place such foreign elements within xsd:appinfo, which must in turn be placed in xsd:annotation elements.
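As a rough illustration of the extra nesting WXS requires, the synopsis definition from Listing 1 might be written as follows. This is a sketch under assumptions: the xsd:string type and the bare xsd:schema wrapper are chosen only to make the fragment complete, and the annotation element is the same one used in Listing 1.

```xml
<!-- Sketch only: type and wrapper are illustrative assumptions -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:owl="http://www.w3.org/2002/07/owl#">
  <xsd:element name="synopsis" type="xsd:string">
    <!-- xsd:annotation must come first among the children of xsd:element -->
    <xsd:annotation>
      <!-- xsd:appinfo holds machine-readable content such as the semantic link -->
      <xsd:appinfo>
        <owl:samePropertyAs
          resource="http://www.w3.org/2000/01/rdf-schema#comment"/>
      </xsd:appinfo>
    </xsd:annotation>
  </xsd:element>
</xsd:schema>
```

The link itself is unchanged; only the two wrapper elements are added, with xsd:appinfo signaling that the content is meant for applications rather than human readers.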


Semantic links in WS descriptions

Because you've taken so much care in organizing and presenting your Web services budget requests, you've won approval, and now it's time for implementation. One of the newly funded services is a calendar and appointment service, and you now need to write the WSDL for it. Of course, you want to be sure that the schema snippets in the WSDL contain semantic links, but you also want to sprinkle semantic links into other parts of the description. Listing 2 is such a snippet of WSDL, conforming to WSDL 1.1.

Listing 2. Portion of WSDL that includes a semantic link for a message part
  <wsdl:message name="get-upcoming-appointments">
    <wsdl:part name="requested-duration" element="schema:duration">
      <wsdl:documentation>
        <owl:sameClassAs
          resource="http://www.w3.org/2002/12/cal/ical#duration"/>
      </wsdl:documentation>
    </wsdl:part>
  </wsdl:message>

In defining a message that requests the upcoming appointments from now through a given duration, you establish precision about what you mean by duration when defining that parameter in the request. You do this by referencing the W3C's suggested expression of the iCalendar standard (RFC 2445) as a formal ontology. You place this link in a wsdl:documentation element, which is not ideal since this element is usually reserved for human-readable documentation. You have to do this because of constraints in the WSDL specification, which, like WXS, restricts the places where foreign elements (called "extensibility elements") can be placed. The model for extensibility is still being developed for WSDL 2.0. I hope the working group decides to loosen unnecessary restrictions before they're done (never mind for now the fact that it seems the idea of message part is being overhauled for 2.0).
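The schema embedded in the WSDL can carry the same link, using the xsd:annotation and xsd:appinfo nesting that WXS requires. The following is a sketch under assumptions: both example.com namespace URIs are hypothetical placeholders, and declaring the element with the built-in xsd:duration type is an illustrative choice rather than something the original service description specifies.

```xml
<!-- Sketch only: the example.com namespaces are hypothetical placeholders -->
<wsdl:definitions
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:schema="http://example.com/calendar/schema"
    targetNamespace="http://example.com/calendar/service">
  <wsdl:types>
    <xsd:schema targetNamespace="http://example.com/calendar/schema">
      <!-- The element referenced by the message part -->
      <xsd:element name="duration" type="xsd:duration">
        <xsd:annotation>
          <xsd:appinfo>
            <owl:sameClassAs
              resource="http://www.w3.org/2002/12/cal/ical#duration"/>
          </xsd:appinfo>
        </xsd:annotation>
      </xsd:element>
    </xsd:schema>
  </wsdl:types>
  <wsdl:message name="get-upcoming-appointments">
    <wsdl:part name="requested-duration" element="schema:duration"/>
  </wsdl:message>
</wsdl:definitions>
```

Placing the link on the schema element keeps wsdl:documentation free for human-readable text, at the cost of attaching the meaning to the element declaration rather than to the individual message part.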


Wrap-up

By adding semantic links, you have taken simple terms used for XML and Web services schemata (service, synopsis, duration), and placed them in a specific context. This makes it much easier for software to infer their meaning when they are deployed within systems, and increases the information value of the corresponding XML documents. For example, by following such links, a tool can determine that the synopsis element has the same meaning as an element named description in another vocabulary. Professional DBAs emphasize the importance of data dictionaries to frame the terms they use in their schemata. XML developers should be no less diligent.
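To make the synopsis and description example concrete, here is a sketch of how a second vocabulary might declare its description element in RELAX NG. The vocabulary is hypothetical; the only detail carried over from this tip is the link target, which is the same RDFS URI used for synopsis in Listing 1.

```xml
<!-- Sketch only: a definition from a hypothetical second vocabulary -->
<element name="description"
         xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <!-- Same link target as the synopsis element in Listing 1 -->
  <owl:samePropertyAs
    resource="http://www.w3.org/2000/01/rdf-schema#comment"/>
  <text/>
</element>
```

Because both definitions point at the same URI, a tool that collects these annotations could treat synopsis and description as equivalent without any rules specific to either vocabulary.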

On a more general note, the value of supporting semantic links is also a good reason to be sure that materials in schemata, dictionaries, ontologies, and such can be referenced using simple URLs. If you are working in an initiative for semantic transparency, please be sure that one of your goals is easy access through simple linking.
