Thinking XML: Basic XML and RDF techniques for knowledge management, Part 5

Defining RDF and DAML+OIL

Uche Ogbuji (uche@ogbuji.net), Principal consultant, Fourthought, Inc.
Uche Ogbuji
Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can contact him at uche@ogbuji.net.

Summary:  Uche Ogbuji moves on to define RDF and DAML+OIL schemata for the issue tracker application, continuing the discussion of modeling as he goes along.

Date:  01 Mar 2002
Level:  Advanced


In my last installment of this column, I discussed how XML knowledge management systems such as RDF shed a different light on age-old problems of data design and modeling. This was done toward the goal of nailing down a schema for the issue tracker package that I have been using to illustrate the use of RDF in association with XML applications. Now I'll complete the definition of the issue tracker schema, in RDFS and DAML+OIL form.

Again, familiarity with RDF, RDFS, and DAML+OIL is required. Since the last installment, I have published an introduction to DAML+OIL (see Resources) with my colleague Roxane Ouellet, so you no longer have to slog through the dense specifications to get a handle on it.

Just getting on with it

Without further ado, I present Listing 1, the complete RDFS schema for the issue tracker.


Listing 1. RDFS schema for the issue tracker
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
<!ENTITY it "http://rdfinference.org/schemata/issue-tracker/">
<!ENTITY dc "http://purl.org/dc/elements/1.1/">
]>
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:rdfs="&rdfs;"
xmlns:it="&it;"
>

<rdfs:Class rdf:about="&it;Catalog">
<rdfs:label>Issue catalog</rdfs:label>
<rdfs:comment>
An optional collection of resources for which issues have or can
be defined.  Use dc:relation to associate the catalog with its
resources.
</rdfs:comment>
</rdfs:Class>

<rdfs:Class rdf:about="&it;Issue">
<rdfs:label>Issue</rdfs:label>
<rdfs:comment>
A problem, suggestion or other matter for action or discussion
relevant to a resource.  Use Dublin Core properties for base
description.
</rdfs:comment>
</rdfs:Class>

<rdf:Property rdf:about="&it;issue">
<rdfs:label>issue</rdfs:label>
<rdfs:comment>Associate an issue to its resources</rdfs:comment>
<rdfs:range rdf:resource="&it;Issue"/>
</rdf:Property>

<rdf:Property rdf:about="&it;action">
<rdfs:label>action</rdfs:label>
<rdfs:comment>Associate an action with an issue</rdfs:comment>
<rdfs:domain rdf:resource="&it;Issue"/>
<rdfs:range rdf:resource="&it;Action"/>
</rdf:Property>

<rdfs:Class rdf:about="&it;Action">
<rdfs:label>Action</rdfs:label>
<rdfs:comment>
An action to be taken with regard to an issue
</rdfs:comment>
</rdfs:Class>

<rdf:Property rdf:about="&it;assignee">
<rdfs:label>Assign to</rdfs:label>
<rdfs:comment>
Specify the party to whom the action is assigned
</rdfs:comment>
<rdfs:domain rdf:resource="&it;Action"/>
</rdf:Property>

<rdf:Property rdf:about="&it;status">
<rdfs:label>status</rdfs:label>
<rdfs:comment>For instance, "not done" or "done"</rdfs:comment>
<rdfs:domain rdf:resource="&it;Action"/>
</rdf:Property>

<rdf:Property rdf:about="&it;comment">
<rdfs:label>comment</rdfs:label>
<rdfs:comment>Associate a comment with an issue</rdfs:comment>
<rdfs:domain rdf:resource="&it;Issue"/>
<rdfs:range rdf:resource="&it;Comment"/>
</rdf:Property>

<rdfs:Class rdf:about="&it;Comment">
<rdfs:label>Comment</rdfs:label>
<rdfs:comment>A comment made with regard to an issue</rdfs:comment>
</rdfs:Class>

</rdf:RDF>


You will note some changes, including changes to the namespaces used. These, unfortunately, are not as blithely accounted for as the fact that our earlier RDF examples did not use any defined classes. This schema represents what is currently being used for the issue tracker for RDFInference.org, including changes that have been made for various reasons. I'll present corresponding updates to the instance RDF below.

I also adopt some lexical conventions. First, I define all the namespace URIs as entities in the DTD internal subset (a convention I learned from Ms. Ouellet), which reduces errors and improves readability. Second, I use rdf:about in preference to rdf:ID, a convention I recently adopted after hard experience with the pitfalls of resolving IDs against the supposed URI of the containing document. I use rdf:ID only when I can ensure that there is an explicit xml:base declaration, and that all RDF processors for which interoperability is needed support XML Base.
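To make the rdf:ID convention concrete, here is a minimal sketch of the two forms side by side. The base URI http://example.org/schemata/tracker is a placeholder chosen for illustration and is not part of the issue tracker; the point is only that rdf:ID depends on the base in effect, while rdf:about with an absolute URI does not.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
]>
<!-- The xml:base value below is a hypothetical URI used for illustration -->
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:rdfs="&rdfs;"
xml:base="http://example.org/schemata/tracker"
>

<!-- With the explicit base, a processor that supports XML Base
resolves this to http://example.org/schemata/tracker#Issue -->
<rdfs:Class rdf:ID="Issue"/>

<!-- rdf:about with an absolute URI names the resource directly,
with no dependence on the base -->
<rdfs:Class rdf:about="http://example.org/schemata/tracker#Action"/>

</rdf:RDF>
```

Without the xml:base attribute, the first class would be named relative to wherever the processor believes the document lives, which is exactly the ambiguity the convention avoids.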

The Catalog class provides a way to aggregate all resources that have an issue, or for which users are allowed to create issues. This is mostly an application convenience. Imagine a Web-based form for the tracker. It would probably have a drop-down selection box for the resources of interest. One way to populate that list is to check for all the objects of dc:relation statements from a given catalog. The DAML+OIL schema I'm about to present illustrates another approach.
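As a rough sketch of what such a catalog might look like in instance data, the fragment below declares one catalog and links it to a tracked resource with dc:relation. The catalog URI and its title are assumptions made up for this example; only the it:Catalog class, the dc:relation convention, and the resource URI come from the article.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!ENTITY it "http://rdfinference.org/schemata/issue-tracker/">
<!ENTITY dc "http://purl.org/dc/elements/1.1/">
]>
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:it="&it;"
xmlns:dc="&dc;"
>

<!-- The catalog URI and title are hypothetical -->
<it:Catalog rdf:about="http://example.org/catalogs/ril">
<dc:title>RIL drafts open for issues</dc:title>
<!-- Each dc:relation object is a candidate for the drop-down list -->
<dc:relation rdf:resource="http://rdfinference.org/ril/ril-20010502"/>
</it:Catalog>

</rdf:RDF>
```

An application would then list the objects of every dc:relation statement whose subject is the chosen catalog.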

There are a few other small changes, such as the renaming from "assigned-to" to "assignee" for more consistent use of parts of speech. Otherwise, there are no surprises in this schema, so let's move on to a look at the DAML+OIL version.


DAML Yankees

DAML+OIL is a schema system that provides key improvements over RDFS, including a built-in data typing system, support for enumerations, specializations on properties, and classification and typing by inference. It also goes beyond mere schematics to allow us to define ontologies, which are meant to be approximations of how we hold concepts, but for now we shall be mostly using the basic schematic features. Listing 2 is a DAML+OIL schema for the issue tracker similar to Listing 1.


Listing 2. DAML+OIL schema for the issue tracker
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!ENTITY xsd "http://www.w3.org/2000/10/XMLSchema#">
<!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
<!ENTITY daml "http://www.daml.org/2001/03/daml+oil#">
<!ENTITY dc "http://purl.org/dc/elements/1.1/">
<!ENTITY it "http://rdfinference.org/schemata/issue-tracker/">
]>
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:rdfs="&rdfs;"
xmlns:daml="&daml;"
xmlns:xsd="&xsd;"
xmlns:it="&it;"
>

<daml:Ontology rdf:about="">
<daml:versionInfo>
<!-- Note: requires expansion by RCS, CVS or the like -->
$Revision$
</daml:versionInfo>
<rdfs:comment>
Ontology for an issue tracking system for arbitrary resources
</rdfs:comment>
<daml:imports rdf:resource="http://www.w3.org/2001/10/daml+oil"/>
</daml:Ontology>

<daml:Class rdf:about="&it;RelevantResource">
<rdfs:label>Relevant resource</rdfs:label>
<rdfs:comment>
An implied classification of resources that have related issues
</rdfs:comment>
<rdfs:subClassOf>
<daml:Restriction>
  <daml:onProperty rdf:resource="&it;issue"/>
  <daml:toClass rdf:resource="&it;Issue"/>
</daml:Restriction>
</rdfs:subClassOf>
</daml:Class>

<daml:Class rdf:about="&it;Issue">
<rdfs:label>Issue</rdfs:label>
<rdfs:comment>
A problem, suggestion or other matter for action or discussion
relevant to a resource.  Use Dublin Core properties for base
description.
</rdfs:comment>
</daml:Class>

<daml:ObjectProperty rdf:about="&it;issue">
<rdfs:label>issue</rdfs:label>
<rdfs:comment>Associate an issue to its resources</rdfs:comment>
<rdfs:range rdf:resource="&it;Issue"/>
</daml:ObjectProperty>

<daml:ObjectProperty rdf:about="&it;action">
<rdfs:label>action</rdfs:label>
<rdfs:comment>Associate an action with an issue</rdfs:comment>
<rdfs:domain rdf:resource="&it;Issue"/>
<rdfs:range rdf:resource="&it;Action"/>
</daml:ObjectProperty>

<daml:Class rdf:about="&it;Action">
<rdfs:label>Action</rdfs:label>
<rdfs:comment>An action to be taken with regard to an
issue</rdfs:comment>
</daml:Class>

<daml:ObjectProperty rdf:about="&it;assignee">
<rdfs:label>Assign to</rdfs:label>
<rdfs:comment>
Specify the party to whom the action is assigned
</rdfs:comment>
<rdfs:domain rdf:resource="&it;Action"/>
</daml:ObjectProperty>

<daml:ObjectProperty rdf:about="&it;status">
<rdfs:label>status</rdfs:label>
<rdfs:comment>For instance, "not done" or "done"</rdfs:comment>
<rdfs:domain rdf:resource="&it;Action"/>
</daml:ObjectProperty>

<daml:ObjectProperty rdf:about="&it;comment">
<rdfs:label>comment</rdfs:label>
<rdfs:comment>Associate a comment with an issue</rdfs:comment>
<rdfs:domain rdf:resource="&it;Issue"/>
<rdfs:range rdf:resource="&it;Comment"/>
</daml:ObjectProperty>

<daml:Class rdf:about="&it;Comment">
<rdfs:label>Comment</rdfs:label>
<rdfs:comment>A comment made with regard to an issue</rdfs:comment>
</daml:Class>

</rdf:RDF>

Before any definitions comes the ontology header. This is a DAML convention that describes the document and specifies the schema (hence the empty rdf:about, which sets the document itself as the subject). It features a revision statement, which I define using a keyword to be expanded by the revision-control system. It also features an import, which is an explicit mechanism added by DAML+OIL for incorporating definitions from other files into the current one (before DAML, you either had to merge multiple sources into a model, or use a lower-level mechanism such as XInclude). As standard practice, I import the core DAML+OIL schema, adding definitions for DAML+OIL-specific resources.

Next comes a special class, RelevantResource, whose instances are not stated explicitly, but which is defined by inference on the properties of instances. A closer look at the RelevantResource class should make this clear. It is defined as a subclass of an anonymous in-line resource, which in turn is of type daml:Restriction. This is a special DAML+OIL mechanism that allows you to define rules according to the properties instances have, and the values of those properties. In this case, the restriction selects all resources that have an issue property where the value of that property is of class Issue. By its subclassing from this restriction, the RelevantResource class is a sort of virtual class that includes a set of all resources that meet the restriction. If at any time a resource acquires the right property with a value of the right class, it automatically becomes a member of this virtual class, without needing to be explicitly stated as such.

Please note that part of this restriction is strictly unnecessary. The range of the issue property has already been constrained to class Issue by an rdfs:range statement. I left the daml:toClass in the DAML restriction purely for illustration.
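For comparison, DAML+OIL also offers daml:hasClass, which requires at least one value of the property to belong to the given class, whereas daml:toClass constrains every value the property takes. The fragment below is a sketch of the restriction in that form; it is offered as a variant to experiment with, not as the schema actually in use at RDFInference.org.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
<!ENTITY daml "http://www.daml.org/2001/03/daml+oil#">
<!ENTITY it "http://rdfinference.org/schemata/issue-tracker/">
]>
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:rdfs="&rdfs;"
xmlns:daml="&daml;"
>

<!-- Variant for illustration: hasClass in place of toClass -->
<daml:Class rdf:about="&it;RelevantResource">
<rdfs:subClassOf>
<daml:Restriction>
  <daml:onProperty rdf:resource="&it;issue"/>
  <!-- At least one it:issue value must be an it:Issue -->
  <daml:hasClass rdf:resource="&it;Issue"/>
</daml:Restriction>
</rdfs:subClassOf>
</daml:Class>

</rdf:RDF>
```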

This is a very important facility to have when you may not have control over all of the information space over which you are operating, and this is why DAML+OIL is put forth as a big step forward in the sort of technology that would be needed to underpin the Semantic Web. In our more modest case, this facility allows us to not have all resources explicitly registered for issue tracking, as we do in the RDFS form of the schema (using the Catalog class).
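To see what is being saved, here is the typing statement that would otherwise have to be written by hand for each tracked resource. This is only a sketch of the explicit form; with the DAML+OIL schema, the article's intent is that a DAML-aware processor arrives at the equivalent statement on its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!ENTITY it "http://rdfinference.org/schemata/issue-tracker/">
]>
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:it="&it;"
>

<!-- Explicit registration, which the inferred class makes unnecessary -->
<it:RelevantResource rdf:about="http://rdfinference.org/ril/ril-20010502"/>

</rdf:RDF>
```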

I define all classes using daml:Class, which is a subclass of rdfs:Class that provides all the additional facilities introduced by DAML. Similarly, I use daml:ObjectProperty to define properties. The issue tracker schema does not use particular data types (string, integer, etc.) to define the value of any property, but as a note, such properties are defined in DAML+OIL as instances of daml:DatatypeProperty.
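For readers who want to see the data-typed form, here is a minimal sketch of a daml:DatatypeProperty. The priority property and its example.org namespace are hypothetical and are not part of the issue tracker schema; the XML Schema namespace is the one declared in Listing 2.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
<!ENTITY daml "http://www.daml.org/2001/03/daml+oil#">
<!ENTITY xsd "http://www.w3.org/2000/10/XMLSchema#">
<!ENTITY it "http://rdfinference.org/schemata/issue-tracker/">
<!ENTITY ex "http://example.org/schemata/tracker-ext/">
]>
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:rdfs="&rdfs;"
xmlns:daml="&daml;"
>

<!-- Hypothetical property: a numeric priority for an issue -->
<daml:DatatypeProperty rdf:about="&ex;priority">
<rdfs:label>priority</rdfs:label>
<rdfs:domain rdf:resource="&it;Issue"/>
<!-- The range is an XML Schema data type rather than a class -->
<rdfs:range rdf:resource="&xsd;nonNegativeInteger"/>
</daml:DatatypeProperty>

</rdf:RDF>
```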

The DAML+OIL schema is actually what is being used in RDFInference.org applications, and is what we'll use as the basis of continuing work in this column.


Updating the instances

Because of the changes I've noted, I have revisited and updated the sample instances of issues that we've been looking at so far in this column -- see Listing 3.


Listing 3. Updated instance data
<?xml version='1.0'?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
<!ENTITY daml "http://www.daml.org/2001/03/daml+oil#">
<!ENTITY dc "http://purl.org/dc/elements/1.1/">
<!ENTITY foaf "http://xmlns.com/foaf/0.1/">
<!ENTITY it "http://rdfinference.org/schemata/issue-tracker/">
<!ENTITY rit "http://rdfinference.org/ril/issue-tracker/">
]>
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:rdfs="&rdfs;"
xmlns:daml="&daml;"
xmlns:rit="&rit;"
xmlns:it="&it;"
xmlns:dc="&dc;"
xmlns:foaf="&foaf;"
xmlns="&it;"
>

<rdf:Description rdf:about='http://rdfinference.org/ril/ril-20010502'>
<issue rdf:resource='&rit;i2001030423'/>
<issue rdf:resource='&rit;i2001042003'/>
</rdf:Description>

<Issue rdf:about='&rit;i2001030423'>
<dc:title>Unnecessary abbreviation</dc:title>
<dc:creator rdf:resource='mailto:Alexandre.Fayolle@logilab.fr'/>
<dc:description>
Is the abbreviation of rdf:type predicates in queries necessary?
</dc:description>
<dc:date>2001-03-04</dc:date>
<comment rdf:parseType="Resource">
<dc:creator rdf:resource='mailto:Alexandre.Fayolle@logilab.fr'/>
<dc:description>
The abbreviation in listing 8 doesn't seem necessary to Nico
Chauvat or me.
</dc:description>
</comment>
<action rdf:parseType="Resource">
<dc:description>Organize a vote on this topic</dc:description>
<it:assignee rdf:resource='mailto:uche.ogbuji@fourthought.com'/>
</action>
</Issue>

<Issue rdf:about='&rit;i2001042003'>
<dc:title>Inconsistent versioning</dc:title>
<dc:creator rdf:resource='mailto:Nicolas.Chauvat@logilab.fr'/>
<dc:description>
The RIL versioning is not clear (there's a mix of 0.1, 0/1, 0.2
and 0/2)
</dc:description>
<dc:date>2001-04-20</dc:date>
<action rdf:parseType="Resource">
<dc:description>
Correct all to use the "0/1" form in the next draft.
</dc:description>
<it:assignee rdf:resource='mailto:uche.ogbuji@fourthought.com'/>
</action>
</Issue>

<rdf:Description rdf:about='mailto:Alexandre.Fayolle@logilab.fr'>
<foaf:name>Alexandre Fayolle</foaf:name>
</rdf:Description>

<rdf:Description rdf:about='mailto:uche.ogbuji@fourthought.com'>
<foaf:name>Uche Ogbuji</foaf:name>
</rdf:Description>

<rdf:Description rdf:about='mailto:Nicolas.Chauvat@logilab.fr'>
<foaf:name>Nicolas Chauvat</foaf:name>
</rdf:Description>

</rdf:RDF>

We define a resource against which the sample issues are raised. According to the DAML+OIL schema, http://rdfinference.org/ril/ril-20010502 is automatically a member of the RelevantResource class. The other significant change is that we refer to people through mailto URLs, which are then linked to their regular names using "friend of a friend" (FOAF), a well-known DAML+OIL schema for specifying information about individual contacts, suitable for describing who might be attached to an electronic mailbox. Note that there is another well-known choice for modeling contact information in RDF, based on the common vCard format for embedding contact information as e-mail attachments. The vCard RDF schema is more general in coverage than the FOAF schema, but we don't need its additional properties. And if we did, there is also a FOAF-based option: FOAFCorp, which adds elements related to corporate structure to the core personal profile information in FOAF.
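For comparison, the same name could be attached using the vCard vocabulary. The sketch below assumes the namespace and the FN (formatted name) property from the W3C note on representing vCard objects in RDF/XML; treat it as an illustration of the alternative rather than something the issue tracker uses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!ENTITY vCard "http://www.w3.org/2001/vcard-rdf/3.0#">
]>
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:vCard="&vCard;"
>

<!-- vCard:FN plays the role that foaf:name plays in Listing 3 -->
<rdf:Description rdf:about="mailto:uche.ogbuji@fourthought.com">
<vCard:FN>Uche Ogbuji</vCard:FN>
</rdf:Description>

</rdf:RDF>
```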

The changes to the XSLT that generates this form rather than the original are minor -- mostly the changing of literal result element names and namespace URIs.


Conclusion

Generally, even if you wish to apply constraints in the loose way discussed in the last installment of this column, you should have a schema of some sort, for documentation if nothing else. RDFS is still the simplest and most pervasive choice, but DAML+OIL has many things to recommend it: not just the additional features, but the cleaner core semantics as well. Now that we have a schema for the issue tracker, we'll move on to improving the way we construct our queries: We'll look at Versa, an open query language for RDF that will make all the query code we've presented simpler and faster.


Resources

