Linked Data Interfaces

Define REST API contracts for RDF resource representations

Linked Data is a compelling approach to integrating data at web scale. IBM® Rational® software is adopting Linked Data for enabling tool collaboration across the development lifecycle and has established Open Services for Lifecycle Collaboration (OSLC) to foster the creation of specifications based on this approach. Linked Data is based on a fusion of Resource Description Framework (RDF) and REST; however, the precise specification of REST APIs based on RDF is hampered by the absence of a standard type definition language. Arthur Ryman, an IBM Distinguished Engineer, explains the nature of RDF Schema (RDFS) and Web Ontology Language (OWL) and shows why they are not traditional type definition languages. He then introduces the OSLC Resource Shape specification as a candidate type definition language for RDF.


Arthur Ryman, IBM Distinguished Engineer, IBM

Arthur Ryman is the Chief Architect for Rational Portfolio and Strategy Management and Reporting, which includes Rational Focal Point, Rational Publishing Engine, and Rational Insight. Previously, he was a development manager and architect for Rational Application Developer, WebSphere Studio Application Developer, and VisualAge for Java. Arthur joined the IBM Toronto Laboratory in 1982 and has been developing software products there ever since. He is a member of the Open Services for Lifecycle Collaboration (OSLC) Core Working Group, where he is developing specifications for Linked Lifecycle Data. Previously, he founded and led the Eclipse Web Tools Platform and Apache Woden projects and developed web services specifications at W3C.

19 March 2013


Linked Data (see Resources) provides a very appealing approach for integrating information from multiple sources, which is a key challenge for software and systems development projects. Linked Data builds on the well-established web architecture concepts of resources, URLs, and HTTP by adding the less well-understood semantic web technologies, RDF and SPARQL (see Resources).

Most new Linked Data application implementers have extensive experience in developing systems using object-oriented (OO), relational, and XML technologies. When they come to RDF, they look for the analog of a type definition system. Object-oriented programming languages have class definitions, relational databases have SQL data definition language (DDL), and XML has document type definitions (DTD), schemas (XSD), and others (RELAX NG, for example). What, then, is the type definition language for RDF?

In reality, there is no type definition system for RDF. There is an RDF Schema language (RDFS) (see Resources), but it is not a type definition language. The use of the term schema in RDFS is an unfortunate choice of words, because it leads to much confusion. The Web Ontology Language (OWL) (see Resources) bears an even stronger superficial resemblance to a type definition language, but it is not one either.

So what are RDFS and OWL, then? They are RDF inference languages. You can use them to define how to infer a new set of RDF data from a given set of RDF data. One of the most useful applications of these technologies is to classify resources, that is, to infer class membership based on the properties of a resource. This capability of OWL is widely used in the life sciences domain, where it is used to classify drugs, diseases, and so on.

Both RDFS and OWL are undeniably useful technologies, but they do not fill the role played by a type definition language in software development. This article explains why RDFS and OWL are not type definition languages and makes the case that we need some new technology to play this role for Linked Data. It also discusses OSLC Resource Shapes (see Resources), a proposed type definition language for Linked Data.


Linked Data: a fusion of REST and RDF

As previously stated, Linked Data is a fusion of two web technologies: REST and RDF. Both REST and RDF are based on the central concept of a URI, but they use URIs in different ways. In REST, a URI is a thing that can be dereferenced by an HTTP client, primarily to get the content of an information resource, such as a web page, XML document, or image. In RDF, a URI is simply an opaque linguistic term that identifies some resource. A resource is any conceptual entity and is not necessarily identified with an HTTP URI.

Linked Data fuses REST and RDF by requiring that resources be identified with dereferenceable HTTP URIs and that HTTP clients are able to get RDF representations of resources.

From the REST perspective, RDF serialization formats are like other formats, such as XML, JSON, or CSV. They are just data formats. Therefore, it is natural for developers to expect to be able to define the content of an RDF payload (HTTP request or response), because that is part of the REST service interface. It is sound engineering practice to define interfaces between components in a system. The interface definition defines the contract between the provider and consumer of a component. For software systems, the main part of the interface definition is a precise specification of the inputs and outputs. Type definition languages are used for this purpose. So what is the type definition language for RDF?

This summary illustrates another common point of confusion for developers: RDF does not play exactly the same role as XML, JSON, CSV, and so forth. Those are data formats. In contrast, RDF is a data model that can be serialized in general purpose data formats, such as XML (RDF/XML), XHTML (RDFa), and JSON (JSON-LD), as well as RDF-specific formats, such as N-Triples, Turtle, and Notation3.

Clearly, a true type definition language for RDF must be completely independent of any particular data format used to serialize RDF.

RDF vocabularies and ontologies

The proper way to think about RDF is that it is a language for expressing information about resources by using very simple declarative statements. An RDF statement is a sequence of exactly three terms, which are referred to as the subject, predicate, and object. For example, the statement bug 42 has high severity can be rendered as the sequence of terms (bug-42, has-severity, high). When encoding this statement in the syntax of RDF, each term is a URI, a literal value, or a so-called blank node. A URI may be used as the subject, predicate, or object. A blank node can be used as the subject or object, but not the predicate. A literal can be used only as the object. Examples of real RDF syntax appear in the following sections of this article.

Because it contains three terms, a statement is also called a triple. A set of triples is called a graph. We can visualize the graph by drawing nodes for the subjects and objects, and then drawing arcs that connect these nodes for the predicates. Thus, with Linked Data, when we dereference a URI and ask for an RDF representation, we get an RDF graph. A type definition language for RDF would let us describe RDF graphs. Such a description would help consumers and providers determine whether a given graph satisfied the REST interface contract.
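The triple-and-graph model described above can be sketched in a few lines of plain Python (the tuple encoding and names here are illustrative, not any standard library's API): a statement is a 3-tuple, and a graph is simply a set of them.

```python
# A graph is a set of (subject, predicate, object) triples.
graph = {
    ("bug-42", "has-severity", "high"),
    ("bug-42", "assigned-to", "alice"),
}

# Dereferencing a Linked Data URI conceptually returns such a graph;
# asking "what do we know about bug-42?" is just filtering by subject.
about_bug_42 = {(s, p, o) for (s, p, o) in graph if s == "bug-42"}
print(len(about_bug_42))  # 2
```

A type definition language for RDF would constrain which triples such a set may or must contain.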

The first step toward a type definition language is a way to define the meaning of any newly introduced URIs that might appear in a graph. These URIs are typically predicates that represent properties of resources or relationships between resources. They might also identify classes of resources or special individual resources. These newly introduced URIs are said to form the vocabulary for some domain of knowledge. By the principles of Linked Data, a client should be able to dereference such a vocabulary term and get an RDF graph that describes the term. The vocabulary used to describe RDF terms is defined by the RDFS specification, so RDFS is a vocabulary for describing vocabularies.


In the following sections and examples, an expression such as rdfs:label denotes a compact URI, or CURIE, in which the prefix rdfs: is an abbreviation for the namespace URI http://www.w3.org/2000/01/rdf-schema#, so the expression rdfs:label is shorthand for the full URI http://www.w3.org/2000/01/rdf-schema#label.

RDFS contains annotation terms, such as rdfs:label, rdfs:comment, rdfs:isDefinedBy, and rdfs:seeAlso, that are used purely for documentation. RDFS also defines the rdfs:Class and rdf:Property classes, which are used to classify terms as either classes or predicates. This limited subset of RDFS constitutes a very simple type definition language.

In addition, RDFS contains other terms, such as rdfs:domain, rdfs:range, rdfs:subClassOf, and rdfs:subPropertyOf, that go beyond mere vocabulary definition and enter into the world of ontologies. The primary difference between a vocabulary and an ontology is that an ontology includes inference rules that let you infer new information from given information. This is where RDFS diverges from traditional type definition languages. Technically, the inferences are computed by a software component called a reasoner.

To follow along with this article, you'll use Pellet (see Resources), the popular open source, Java-based OWL reasoner. The Pellet distribution comes with a command-line program, pellet, that wraps its Java class libraries. Download Pellet and unzip it to a convenient directory, and then download and unzip this article's examples.zip file into the same directory. See the Download section for links.

To see the inference semantics of rdfs:domain in action, consider the simple ontology that follows in Listing 1 (domainHuman.ttl), represented in Turtle syntax (see Resources).

Listing 1. OWL ontology domainHuman.ttl
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.com/ns#> .

ex: a owl:Ontology . 

ex:Droid a owl:Class ;
    rdfs:isDefinedBy ex: . 

ex:Human a owl:Class ; 
    rdfs:isDefinedBy ex: . 

ex:hasFather a owl:ObjectProperty ; 
    rdfs:domain ex:Human ; 
    rdfs:isDefinedBy ex: . 

ex:Luke a ex:Human .

ex:R2D2 ex:hasFather ex:Luke ;
    a ex:Droid .

This ontology defines two classes of resources, ex:Human and ex:Droid. It defines two individuals, ex:Luke in the class ex:Human, and ex:R2D2 in the class ex:Droid. It also defines a property, ex:hasFather, and asserts its domain to be ex:Human. Finally, it asserts that the father of ex:R2D2 is ex:Luke. An OO programmer would naturally regard this last assertion as an error since the domain of ex:hasFather is ex:Human, but ex:R2D2 is in ex:Droid. An OO compiler would flag this as an error. The closest analogue of a compiler for ontologies is a consistency checker. We can check the consistency of this ontology using the following Pellet command:

pellet consistency -l jena domainHuman.ttl

Pellet produces the following output:

Consistent: Yes

As an ontology, domainHuman.ttl is consistent. The assertion that the domain of ex:hasFather is ex:Human does not mean that ex:hasFather can only be used in statements where the subject is in ex:Human. Instead, it means that if a statement uses ex:hasFather, we can infer that the subject is in ex:Human. Conceptually, this means that a statement asserting that the subject is in ex:Human is added to the ontology. That is what reasoners like Pellet do. We can demonstrate this behavior by running a SPARQL query on the ontology. The SPARQL query in Listing 2 (domainHuman.rq) finds all resources that are in ex:Human:

Listing 2. SPARQL query domainHuman.rq
PREFIX ex: <http://example.com/ns#>

SELECT ?human
WHERE {
    ?human a ex:Human
}

We can execute this query by using the following Pellet query command:

pellet query --input-format Turtle --query-file domainHuman.rq domainHuman.ttl

Pellet produces the following result:

Query Results (2 answers): 

As expected, Pellet has inferred that ex:R2D2 is in ex:Human. This type of inference can be very useful in certain applications, but it effectively prevents us from using RDFS and OWL as a traditional type definition language.
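The inference Pellet performed can be sketched in a few lines of plain Python (the names are hypothetical, not Pellet's API): the rdfs:domain rule adds a type triple for every subject that uses the property.

```python
# Sketch of the rdfs:domain inference rule.
# Domain assertions from the ontology: property -> class
domains = {"ex:hasFather": "ex:Human"}

graph = {
    ("ex:Luke", "rdf:type", "ex:Human"),
    ("ex:R2D2", "rdf:type", "ex:Droid"),
    ("ex:R2D2", "ex:hasFather", "ex:Luke"),
}

def apply_domain_rule(graph, domains):
    """For every triple (s, p, o) where p has a declared domain C,
    infer (s, rdf:type, C) and add it to the graph."""
    inferred = {(s, "rdf:type", domains[p])
                for (s, p, o) in graph if p in domains}
    return graph | inferred

closed = apply_domain_rule(graph, domains)
# ex:R2D2 is now inferred to be an ex:Human, exactly as Pellet reported:
print(("ex:R2D2", "rdf:type", "ex:Human") in closed)  # True
```

Note that the rule adds a triple rather than rejecting one: nothing here resembles a type error.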

A person knowledgeable about OWL might counter that there is an easy fix for this example. We can extend the ontology by asserting that ex:Human and ex:Droid are disjoint classes, so that no resource can be in both. This rule is asserted by using the following statement:

ex:Human owl:disjointWith ex:Droid .

Unfortunately, adding more and more statements to the ontology in an attempt to detect constraint violations is a lost cause, because a reasoner will go to great lengths to repair apparent inconsistencies. A reasoner will infer new triples in an attempt to create a consistent world. This behavior will be illustrated later. Now let's examine how inference affects REST APIs.

Consider a simple web application that hosts resources about change requests. We'll use the OSLC oslc_cm:ChangeRequest class to define the class of change requests. Assume that there is a REST service where we can POST HTTP requests to create new oslc_cm:ChangeRequest resources. The REST service looks at the HTTP request, and if it contains an oslc_cm:ChangeRequest resource, it will create a new resource and copy the properties from the HTTP request to it. The following HTTP POST request body should succeed:

Listing 3. HTTP POST changeRequest.ttl
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix oslc_cm: <http://open-services.net/ns/cm#> .

<> a oslc_cm:ChangeRequest ; 
    dcterms:title "Null pointer exception in web ui" ;
    oslc_cm:status "Submitted" .

Now suppose that the author of the OSLC Change Management specification had declared the domain of the oslc_cm:status property to be oslc_cm:ChangeRequest, using the following RDFS statement:

oslc_cm:status rdfs:domain oslc_cm:ChangeRequest .

As we saw above, with RDFS, the semantics of the rdfs:domain assertion is not a constraint that says you can use only oslc_cm:status on oslc_cm:ChangeRequest resources. Rather, it is an inference rule that says if you use oslc_cm:status as a property on any resource, then that resource is classified as an oslc_cm:ChangeRequest. More precisely, the meaning of this statement is that if any statement uses the predicate oslc_cm:status, we can infer that the subject of the statement is a member of the oslc_cm:ChangeRequest class. Similar to rdfs:domain, RDFS also defines the predicate rdfs:range, which lets us infer the class membership of the object of any statement that uses a given predicate.

Consider the following HTTP POST request, where the explicit triple stating that the resource is an oslc_cm:ChangeRequest has been omitted:

Listing 4. HTTP POST changeRequest-implicit.ttl
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix oslc_cm: <http://open-services.net/ns/cm#> .

<> dcterms:title "Null pointer exception in web ui" ;
    oslc_cm:status "Submitted" .

From the traditional viewpoint, this HTTP POST request should fail, because the server can't find an oslc_cm:ChangeRequest resource. However, from the ontology viewpoint, it should succeed because of the semantics of RDFS. An RDFS reasoner would infer from the explicit triples in the HTTP POST request and the OSLC Change Management ontology that the HTTP POST request implied a triple stating that the resource was an oslc_cm:ChangeRequest.

RDFS contains several other terms, such as rdfs:subClassOf and rdfs:subPropertyOf, that look like common type definition language constraints, but are inference rules, instead. OWL also looks like a type definition language but, in fact, greatly expands on the set of inference rules.

Expressing constraints in OWL

Although the semantics of OWL are defined in terms of inference rules, it is so much more expressive than RDFS that it is possible for an OWL reasoner to infer mutually contradictory triples from a given graph, in which case the graph is said to be inconsistent. This sounds like a promising line of attack. Perhaps we can construct ontologies in which constraint violations result in graph inconsistencies. To illustrate how OWL deals with inconsistency, consider the ontology in Listing 5, differentFrom.ttl:

Listing 5. OWL ontology differentFrom.ttl
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.com/ns#> .

ex: a owl:Ontology . 

ex:Human a owl:Class ;
    rdfs:isDefinedBy ex: . 

ex:Anakin a ex:Human ; 
    owl:differentFrom ex:Anakin .

This ontology describes the ex:Human class and asserts that ex:Anakin is in ex:Human. But then it also asserts that ex:Anakin is different from ex:Anakin, which is a contradiction. The differentFrom.ttl ontology is inconsistent because the meaning of the OWL predicate owl:differentFrom is that the subject and object identify different resources. However, because the subject and object of this statement are the same URI, ex:Anakin, they must identify the same resource. This means that there is no possible world in which the above statement could be true. Therefore, any RDF graph that contains this statement is inconsistent.

Perhaps this contradiction crept into the ontology as the result of a typo or programming error. It would be handy if we had a checker that could find this type of error. Fortunately, Pellet can do just that. To check the consistency, run the following Pellet command:

pellet consistency -l jena differentFrom.ttl

Pellet will read the differentFrom.ttl ontology, analyze it, and report the following message:

Consistent: No 
Reason: Individual is sameAs and differentFrom at the same time

The ability of an OWL reasoner to detect inconsistent graphs looks, at first glance, like a potentially useful constraint checking mechanism. However, on closer examination, an OWL reasoner will go to great lengths to make some superficially inconsistent-looking graphs consistent. For example, consider the following ontology, hasFather.ttl:

Listing 6. OWL ontology hasFather.ttl
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.com/ns#> .

ex: a owl:Ontology . 

ex:Human a owl:Class ;
    rdfs:isDefinedBy ex: . 

ex:hasFather a owl:ObjectProperty, owl:FunctionalProperty ;
    rdfs:isDefinedBy ex: . 

ex:Anakin a ex:Human . 
ex:Darth a ex:Human . 
ex:Luke a ex:Human ; 
    ex:hasFather ex:Anakin, ex:Darth .

This ontology defines the class ex:Human and the property ex:hasFather. This property is asserted to be a functional property, which means that it is single-valued: for any given subject there must be at most one object. The ontology also describes three humans: ex:Anakin, ex:Darth, and ex:Luke. It asserts that ex:Luke has two fathers, ex:Anakin and ex:Darth. This looks like a contradiction. It would be nice if a type checker could flag this.

An OWL reasoner will not say that this ontology is inconsistent because OWL does not make the Unique Name Assumption (UNA). This is a fundamental aspect of web architecture, because there is no requirement that every resource have a unique URI. In fact, it is common for synonyms to be defined in different vocabularies. Given the above ontology, an OWL reasoner will find no inconsistency. Run the following command:

pellet consistency -l jena hasFather.ttl

Pellet reports the following:

Consistent: Yes

What's going on here? An OWL reasoner will judge an ontology to be consistent if there is some world in which the ontology makes sense. In this case, the ontology makes sense when ex:Anakin and ex:Darth identify the same resource. The ontology is said to entail this implication. OWL has the owl:sameAs property, which asserts that its subject and object identify the same resource. Thus the triple ex:Anakin owl:sameAs ex:Darth . is entailed by the ontology. We can verify this entailment by running the SPARQL query in Listing 7, sameAsAnakin.rq, which finds all resources that are the same as ex:Anakin.

Listing 7. SPARQL query sameAsAnakin.rq
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX ex: <http://example.com/ns#>

SELECT ?human
WHERE {
    ex:Anakin owl:sameAs ?human
}

Execute this query using the following Pellet command:

pellet query --input-format Turtle --query-file sameAsAnakin.rq hasFather.ttl

Pellet returns the following result:

Query Results (2 answers): 
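The entailment Pellet verified here can be sketched in plain Python (hypothetical names, not Pellet's algorithm): for a functional property, any two distinct objects that share a subject are entailed to be owl:sameAs each other.

```python
# Sketch of functional-property entailment.
functional = {"ex:hasFather"}  # properties declared owl:FunctionalProperty

graph = {
    ("ex:Luke", "ex:hasFather", "ex:Anakin"),
    ("ex:Luke", "ex:hasFather", "ex:Darth"),
}

def entailed_same_as(graph, functional):
    """Pairs (a, b) entailed owl:sameAs because some subject has both
    a and b as values of a single-valued (functional) property."""
    return {(o1, o2)
            for (s1, p1, o1) in graph
            for (s2, p2, o2) in graph
            if p1 == p2 and p1 in functional and s1 == s2 and o1 != o2}

pairs = entailed_same_as(graph, functional)
print(("ex:Anakin", "ex:Darth") in pairs)  # True
```

Because OWL lacks the Unique Name Assumption, the reasoner merges the two "fathers" rather than reporting a cardinality violation.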

OSLC Resource Shapes

RDF programmers have a legitimate need to be able to specify constraints on data, for example, as preconditions in REST APIs. OO programmers are used to specifying constraints on data with a variety of traditional type definition languages, such as Java, UML, and XML Schema. The preceding discussion should convince you that RDFS and OWL are very different from traditional type definition languages. Therefore, RDFS and OWL are not the solution. Open Services for Lifecycle Collaboration (OSLC) has proposed the Resource Shape specification for specifying constraints on RDF data.

RDF is a language for asserting information. Its assertions are simple declarative statements of the linguistic form subject predicate object. Continuing in the vein of linguistics, a vocabulary document is like a dictionary in that it simply lists terms and defines their meanings by using natural language. A graph is like a document written using the terms of the vocabulary. An ontology is a set of inference rules that lets you derive new statements from a given set of statements or, failing that, tells you that the statements are contradictory. In this context, a shape is a set of grammar rules that tell you whether the document is constructed correctly.

A shape specification lists the properties that are expected or required in a graph, their occurrence, range, allowed values, and so forth. A shape specification lets you determine whether a given graph is valid or invalid. A shape checker could be implemented as a set of SPARQL ASK queries on the graph. A SPARQL ASK query is a query with a result of either true or false. If all of the SPARQL ASK queries return true, the graph is valid; otherwise, it is invalid.

To illustrate shapes briefly, suppose that in our oslc_cm:ChangeRequest example, we require that when a new resource is created, it must have exactly one dcterms:title property and zero or one oslc_cm:status property. These constraints are expressed in the simplified shape resource shown in Listing 8, changeRequest-shape.ttl:

Listing 8. OSLC Resource Shape changeRequest-shape.ttl
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix oslc: <http://open-services.net/ns/core#> .
@prefix oslc_cm: <http://open-services.net/ns/cm#> .

@base <> .

<> a oslc:ResourceShape ;
    dcterms:title "Creation shape of OSLC Change Request" ; 
    oslc:describes oslc_cm:ChangeRequest ; 
    oslc:property <#dcterms-title>, <#oslc_cm-status> . 

<#dcterms-title> a oslc:Property ;
    oslc:propertyDefinition dcterms:title ; 
    oslc:occurs oslc:Exactly-one .

<#oslc_cm-status> a oslc:Property ; 
    oslc:propertyDefinition oslc_cm:status ;
    oslc:occurs oslc:Zero-or-one .

This shape document specifies constraints to be applied as preconditions to creating an oslc_cm:ChangeRequest resource through HTTP POST. It uses the oslc:occurs property to specify the occurrence constraints of the dcterms:title and oslc_cm:status properties. Specifying the occurrence of a property as either oslc:Exactly-one or oslc:Zero-or-one constrains the property to be functional, which is what we were trying to achieve through the use of owl:FunctionalProperty in the hasFather.ttl ontology.

As mentioned above, each constraint can be expressed as a SPARQL ASK query. For example, the following query, ask-oslc_cm-status-occurs.rq, checks the occurrence of the oslc_cm:status property:

Listing 9. SPARQL query ask-oslc_cm-status-occurs.rq
prefix oslc_cm: <http://open-services.net/ns/cm#>

ask {
    {
        select ?resource
        where {
            ?resource a oslc_cm:ChangeRequest .
            ?resource oslc_cm:status ?status
        }
        group by ?resource
        having (count(?status) <= 1)
    }
}

This query uses SPARQL aggregation to count the occurrences of the oslc_cm:status property and compare them to the constraint specified in the shape document.

Run this query on the HTTP POST body in Listing 3 by using the following Pellet command:

pellet query --input-format Turtle --query-file ask-oslc_cm-status-occurs.rq changeRequest.ttl

Pellet replies with:

ASK query result: yes

This result confirms that the request body is valid with respect to this occurrence constraint. For a counterexample, consider the following HTTP POST body, changeRequest-2.ttl, which has two values for the oslc_cm:status property:

Listing 10. HTTP POST changeRequest-2.ttl
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix oslc_cm: <http://open-services.net/ns/cm#> .

<> a oslc_cm:ChangeRequest ; 
    dcterms:title "Null pointer exception in web ui" ; 
    oslc_cm:status "Submitted", "Working" .

Run the following Pellet command to check the occurrence constraint:

pellet query --input-format Turtle --query-file ask-oslc_cm-status-occurs.rq changeRequest-2.ttl

Pellet replies:

ASK query result: no

As expected, the query fails, because oslc_cm:status occurs twice.

The OSLC Resource Shape specification lets you express many other common constraints in addition to occurrence constraints. See the OSLC reference in Resources for more information. Although the meaning of each constraint can be expressed in terms of a suitable SPARQL ASK query, implementations of the specification are not required to use SPARQL to check constraints.
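For instance, an implementation that skips SPARQL entirely could check occurrence constraints directly over the triples. Here is a minimal sketch in plain Python (hypothetical names; a real checker would parse the shape and data as RDF):

```python
# Closed-world occurrence checker for a shape like Listing 8.
# Occurrence constraints from the shape: property -> (min, max)
shape = {
    "dcterms:title": (1, 1),   # oslc:Exactly-one
    "oslc_cm:status": (0, 1),  # oslc:Zero-or-one
}

def check_occurrences(graph, resource, shape):
    """Return the properties of `resource` whose cardinality
    violates the shape's occurrence constraints."""
    violations = []
    for prop, (lo, hi) in shape.items():
        n = sum(1 for (s, p, o) in graph if s == resource and p == prop)
        if not (lo <= n <= hi):
            violations.append(prop)
    return violations

ok = {
    ("req", "dcterms:title", "Null pointer exception in web ui"),
    ("req", "oslc_cm:status", "Submitted"),
}
bad = ok | {("req", "oslc_cm:status", "Working")}  # two status values

print(check_occurrences(ok, "req", shape))   # []
print(check_occurrences(bad, "req", shape))  # ['oslc_cm:status']
```

Unlike a reasoner, the checker counts the triples that are actually present and rejects the graph; it infers nothing.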

OWL revisited

The main reason we could not use OWL to check constraints is that the meaning of OWL is based on the Open World Assumption (OWA). Simply put, an OWL reasoner will not conclude that a statement is false just because it is not explicitly included in an RDF graph. A reasoner will infer new statements and add them to the RDF graph. As previously stated, this behavior can be extremely useful in certain applications, such as classifying resources based on their properties, but it prevents us from using OWL to check constraints.

The need for a way to check constraints on Linked Data, and the unsuitability of standard OWL for this purpose, has been convincingly articulated by Evren Sirin. To repurpose OWL for constraint checking, Sirin proposed the use of alternative semantics for OWL, based on the Closed World Assumption (CWA) (see Resources). In CWA, a reasoner will conclude that a statement is false if it is not present in the RDF graph. CWA is also referred to as Negation as Failure because of its use in logic programming languages, such as Prolog. The Pellet reasoner includes support for integrity constraint validation using the CWA semantics for OWL. Just as for the semantics of OSLC Resource Shapes, the semantics of CWA OWL can be expressed in terms of SPARQL queries.
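The contrast between the two assumptions can be sketched in plain Python (hypothetical names): under CWA a statement absent from the graph is false, whereas under OWA it is merely unknown.

```python
# Open World vs. Closed World, sketched.
graph = {("ex:Luke", "rdf:type", "ex:Human")}

def holds_cwa(graph, triple):
    # Closed World Assumption: absence from the graph means false
    # (negation as failure).
    return triple in graph

def holds_owa(graph, triple):
    # Open World Assumption: absence means unknown, not false; a
    # reasoner is free to infer the triple later.
    return True if triple in graph else "unknown"

q = ("ex:R2D2", "rdf:type", "ex:Human")
print(holds_cwa(graph, q))  # False
print(holds_owa(graph, q))  # unknown
```

Constraint checking needs the CWA behavior: a required triple that is missing must count as a violation, not as an opportunity for inference.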

The CWA OWL proposal was not submitted to W3C, so we are left without a W3C standard for constraint checking of Linked Data, even though there is a clear need for one.


Conclusion

Linked Data fuses REST with RDF. Sound software engineering practice dictates that we specify REST interfaces clearly. Traditional approaches, such as XML Schema, don't apply to RDF, and RDF ontology languages, such as RDFS and OWL, are not type definition languages. Therefore, we need an RDF-friendly way to describe Linked Data REST interfaces. The OSLC Resource Shapes specification is a proposed solution to this need.


Acknowledgments

I'd like to thank Steve Speicher, Martin Nally, Achille Fokoue, Arnaud Le Hors, Nils Kronqvist, Jim Amsden, Vadim Eisenberg, and Maged Elaasar for their careful review and helpful comments. I am especially grateful to Evren Sirin for his assistance with Pellet.


Download

Example RDF and SPARQL files used in this article [1]: examples.zip (3KB)


  1. The Pellet reasoner from Clark & Parsia is required to run these examples.




