
Thinking XML: Introducing N-Triples

A simpler serialization for RDF

Level: Intermediate

Uche Ogbuji (uche@ogbuji.net), Principal Consultant, Fourthought, Inc.

08 Apr 2003

RDF/XML isn't the only representation of an RDF model. The W3C developed N-Triples, a plain-text serialization of RDF that is especially suited for test suites. Here, Uche Ogbuji introduces N-Triples using examples converted from RDF/XML.

In an earlier article, I used the heading "Repeat after me: There is no syntax". RDF's traditional XML syntax is often maligned, but luckily it is not what makes RDF tick, and the emergence of alternative serializations has always been inevitable. One problem with XML as a serialization syntax is that it is so flexible that it can be difficult to compare desired versus actual results in the process of automated testing. Whether in regression testing or conformance testing, it is often useful to try to normalize XML to some form so that simple text comparisons give meaningful results. The XML community developed XML canonical form for such purposes, and the W3C RDF working group required the same sort of form for RDF while it was developing RDF conformance test suites.

One option would have been to define a canonical form of RDF/XML for any graph, and then canonicalize the resulting XML according to the relevant W3C recommendation. Instead, the RDF working group developed a simple and strictly defined textual format for RDF graphs, which I think was the right course. This format is named N-Triples, and is incorporated into the RDF Test Cases working draft (see Resources). In this article I introduce N-Triples, using examples converted from RDF/XML. You should be familiar with XML and RDF.

Three is the lucky number

I'll start with a simple example of N-Triples. Listing 1 is RDF/XML taken from my earlier article on PRISM.


Listing 1. Thinking XML column 12 described formally in RDF/XML (basic PRISM vocabulary)
<?xml version="1.0" encoding="UTF-8"?> 
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xml:lang="en"
>
 <rdf:Description
rdf:about="http://www.ibm.com/developerworks/xml/library/x-think12.html">
  <dc:description>
A discussion of the broader context and relevance of XML/RDF techniques.
  </dc:description> 
  <dc:title>
Basic XML and RDF techniques for knowledge management, Part 7
  </dc:title>
  <dc:publisher>IBM developerWorks</dc:publisher>
  <dc:creator>Uche Ogbuji</dc:creator>
  <dc:subject>XML</dc:subject>
  <dc:subject>RDF</dc:subject>
  <dc:format>text/html</dc:format>
 </rdf:Description>
</rdf:RDF>

Listing 2 shows an N-Triples equivalent to Listing 1.

I would describe N-Triples as "verbose but explicit." As you can see, there are no abbreviations -- not even namespaces. All the URIs are fully spelled out. This is ideal for testing and the like because it introduces no confusion over what the corresponding RDF model is.

N-Triples is a line-oriented format. Each triple must be written on a separate line, and consists of a subject specifier, a predicate specifier, then an object specifier, followed by a period. One or more spaces or tabs separate subject from predicate, and predicate from object. Resources are specified in one of two forms. If they have a URI, they must be presented in the form you see in Listing 2: the absolute URI reference enclosed in angle brackets. Relative URI references such as <local/file.ext> are not allowed.

In RDF, predicates are always URIs, and subjects are either URIs or blank nodes (covered in the next section), while objects can be URIs, blank nodes, or literals. All literals are presented as strings in quotes, although N-Triples does support language specifiers and data typing, as I discuss later on in The details, literally.
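The line format described above can be sketched in a few lines of Python. This is a minimal illustration, not a conforming serializer: the helper names (uri_term, literal_term, triple_line) are hypothetical, character escaping is ignored, and the xml:lang setting from Listing 1 is left out for simplicity. It emits three of the statements from Listing 1, one triple per line.

```python
from urllib.parse import urlparse


def uri_term(uri: str) -> str:
    """Wrap an absolute URI reference in angle brackets; reject relative ones."""
    if not urlparse(uri).scheme:
        raise ValueError(f"relative URI reference not allowed: {uri!r}")
    return f"<{uri}>"


def literal_term(text: str) -> str:
    """Quote a plain literal (this sketch assumes no characters need escaping)."""
    return f'"{text}"'


def triple_line(subject: str, predicate: str, obj: str, obj_is_uri: bool = False) -> str:
    """Format one statement as subject, predicate, object, then a period."""
    obj_term = uri_term(obj) if obj_is_uri else literal_term(obj)
    return f"{uri_term(subject)} {uri_term(predicate)} {obj_term} ."


DC = "http://purl.org/dc/elements/1.1/"
ARTICLE = "http://www.ibm.com/developerworks/xml/library/x-think12.html"

lines = [
    triple_line(ARTICLE, DC + "publisher", "IBM developerWorks"),
    triple_line(ARTICLE, DC + "creator", "Uche Ogbuji"),
    triple_line(ARTICLE, DC + "format", "text/html"),
]
print("\n".join(lines))
```

Note that every URI is written out in full on every line, which is exactly the "verbose but explicit" quality that makes simple text comparison workable.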





Terra incognita

As I mentioned, there are two forms for expressing resources in N-Triples. I've already discussed the form for resources with URIs. N-Triples also has a convention for expressing anonymous nodes (also known as blank nodes). Listing 3 is a simple RDF/XML example containing a couple of blank nodes:


Listing 3. Simple RDF/XML example with blank nodes
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description>
    <dc:title>Unwritten work</dc:title>
    <dc:creator rdf:parseType="Resource">
      <dc:title>The League of Procrastinators</dc:title>
    </dc:creator>
    <dc:contributor rdf:resource="http://put-off.org"/>
  </rdf:Description>
</rdf:RDF>

Figure 1 shows Listing 3 in graph form.


Figure 1. A model diagram of Listing 3

As you can see, two of the ovals have no labels. These are blank nodes. They do have identity, but that identity is not given by a URI. Blank nodes are often used when there is really no URI appropriate to associate with the resource, as in the example in Listing 3 and Figure 1, where a work is being described that has not yet been written.

Listing 4 shows an N-Triples equivalent to Listing 3, which also corresponds to the graph in Figure 1.


Listing 4. N-Triples equivalent to Listing 3
_:blank1  <http://purl.org/dc/elements/1.1/title> "Unwritten work" .
_:blank2  <http://purl.org/dc/elements/1.1/title> "The League of Procrastinators" .
_:blank1  <http://purl.org/dc/elements/1.1/creator>       _:blank2 .
_:blank1  <http://purl.org/dc/elements/1.1/contributor>   <http://put-off.org> .

Blank nodes are in the form _:name, where name is an identifier for that blank node within a given set of N-Triples. The _:name identifiers maintain the identity of the nodes, even though they don't have any corresponding identifiers in the RDF model. RDF/XML recently added a similar facility: the rdf:nodeID attribute, which you can use in an rdf:Description or typed node start tag. Listing 5 is equivalent to Listing 3, but the same local node IDs are used as in Listing 4.


Listing 5. The RDF/XML example from Listing 3 with explicit node IDs
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:nodeID="blank1">
    <dc:title>Unwritten work</dc:title>
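    <!-- rdf:nodeID cannot be combined with rdf:parseType="Resource",
         so the creator node is written as a nested description -->
    <dc:creator>
      <rdf:Description rdf:nodeID="blank2">
        <dc:title>The League of Procrastinators</dc:title>
      </rdf:Description>
    </dc:creator>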
    <dc:contributor rdf:resource="http://put-off.org"/>
  </rdf:Description>
</rdf:RDF>

Again, it is very important to note that these local IDs for blank nodes are purely a convention within a particular RDF/XML or N-Triples file. Just because Listings 4 and 5 both use the node ID "blank1" does not mean the corresponding blank nodes have the same identity. This can be a bit confusing, but is an essential property of blank nodes.
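One practical consequence shows up when two N-Triples files are combined: blank node labels have to be renamed per file, or unrelated nodes that happen to share a label would be fused. The following Python sketch illustrates the idea under simplifying assumptions. The function names and the suffix scheme are hypothetical, and the line splitting is naive (it assumes well-formed input with no comments after a triple).

```python
LISTING_4 = """
_:blank1  <http://purl.org/dc/elements/1.1/title> "Unwritten work" .
_:blank2  <http://purl.org/dc/elements/1.1/title> "The League of Procrastinators" .
_:blank1  <http://purl.org/dc/elements/1.1/creator>       _:blank2 .
_:blank1  <http://purl.org/dc/elements/1.1/contributor>   <http://put-off.org> .
"""

LISTING_7 = """
_:blank1        <http://purl.org/dc/elements/1.1/title> "A lo cubano"@es .
_:blank1        <http://purl.org/dc/elements/1.1/title> "Cuban style"@en .
_:blank1        <http://purl.org/dc/elements/1.1/creator>       "Orishas" .
"""


def relabel_line(line: str, suffix: str) -> str:
    """Append a per-file suffix to blank node labels in subject and object position."""
    subject, predicate, rest = line.split(None, 2)
    rest = rest.rstrip()
    if not rest.endswith("."):
        raise ValueError(f"missing terminating period: {line!r}")
    obj = rest[:-1].rstrip()
    if subject.startswith("_:"):
        subject += suffix
    if obj.startswith("_:"):
        obj += suffix
    return f"{subject} {predicate} {obj} ."


def merge(*documents: str) -> list:
    """Combine N-Triples documents, keeping each file's blank nodes distinct."""
    merged = []
    for index, document in enumerate(documents):
        suffix = "abcdefghijklmnopqrstuvwxyz"[index]  # sketch: at most 26 files
        for line in document.splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                merged.append(relabel_line(line, suffix))
    return merged


merged = merge(LISTING_4, LISTING_7)
print("\n".join(merged))
```

After merging, the unwritten work is _:blank1a and the Orishas album is _:blank1b, so the two descriptions stay separate even though both files called their node "blank1".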





The details, literally

RDF has always allowed users to specify the language used to express the values of properties. Listing 6 shows an example in RDF/XML of an anonymous resource with a property given in both English and Spanish.


Listing 6. An RDF description using language meta-properties.
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description>
    <dc:title xml:lang="es">A lo cubano</dc:title>
    <dc:title xml:lang="en">Cuban style</dc:title>
    <dc:creator>Orishas</dc:creator>
  </rdf:Description>
</rdf:RDF>

Here the dc:title property is given in two different languages. The language specifier is not a property of the statement as a whole (which is why internationalization does not turn RDF into a system of quads rather than triples). The language is instead a fundamental property of the literal itself. N-Triples shows this in its notation for languages, as you can see in Listing 7, which is a conversion of Listing 6 to N-Triples.


Listing 7. N-Triples equivalent to Listing 6
_:blank1        <http://purl.org/dc/elements/1.1/title> "A lo cubano"@es .
_:blank1        <http://purl.org/dc/elements/1.1/title> "Cuban style"@en .
_:blank1        <http://purl.org/dc/elements/1.1/creator>       "Orishas" .

The @ is tacked on to the representation of the literal value. It is followed by a language code as defined in RFC 3066; this is the primary designation for a language ("en" for English, "es" for Spanish, and so forth). It can also designate a language variant; for example, "en-US" for American English, "en-GB" for British English, or "es-MX" for Mexican Spanish.
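To show how an application might use these tags, here is a small Python sketch that picks a title for a preferred language from the two literals in Listing 7. The function names and the fallback rule (exact tag first, then primary language only) are assumptions made for illustration; RDF itself does not prescribe any selection policy.

```python
# (value, language tag) pairs taken from Listing 7
titles = [("A lo cubano", "es"), ("Cuban style", "en")]


def primary(tag: str) -> str:
    """Return the primary language part of a tag, e.g. 'en' for 'en-US'."""
    return tag.split("-", 1)[0].lower()


def pick_title(literals, preferred: str):
    """Prefer an exact tag match, then fall back to the primary language."""
    for value, tag in literals:
        if tag.lower() == preferred.lower():
            return value
    for value, tag in literals:
        if primary(tag) == primary(preferred):
            return value
    return None


print(pick_title(titles, "en-US"))
print(pick_title(titles, "es"))
```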

Another property literals can have -- and one introduced more recently in RDF -- is a data type. RDF literals can be given data types such as "integer", "string", "date", or even "morse code". The data type is designated as a URI, and you can use the common data types from the W3C XML Schema (WXS) language using URLs based on the WXS namespace, which is commonly mapped to the prefix xsd. One of the N-Triples in Listing 8 includes a data type designation.


Listing 8. A triple whose object includes a data type designation
#This is a comment in N-Triples
#It must appear by itself on a separate line
#The object of the following triple is of type xsd:int
<http://example.com/employees/jdoe> <http://example.com/employee-id> "23"^^<http://www.w3.org/2001/XMLSchema#int> .

The ^^ marker is followed by a URI specifying the data type, which may be based on a standard (as in this case) or could be a local convention. The important thing to remember is that even though the object here is expressed in quotes, it is actually interpreted as a WXS integer by any data-types-aware system. Listing 8 also shows how you can embed comments in N-Triples. Be careful: I have seen many N-Triples examples with comments on the same line as a triple, after the closing period. The current N-Triples grammar does not support this usage.
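The rules covered so far (one triple per line, comments only on their own lines, optional @ language tag or ^^ data type on a literal) can be sketched as a small line parser in Python. This is an approximation for illustration rather than a conforming implementation: the regular expression is a simplification of the grammar, and the names TRIPLE, parse_line, and native_value are hypothetical.

```python
import re

XSD_INT = "http://www.w3.org/2001/XMLSchema#int"

# Simplified pattern: subject, predicate, object, terminating period.
TRIPLE = re.compile(
    r'''^
    (?P<subject><[^>]*>|_:[A-Za-z][A-Za-z0-9]*)\s+
    (?P<predicate><[^>]*>)\s+
    (?P<object>
        <[^>]*>
      | _:[A-Za-z][A-Za-z0-9]*
      | "(?P<value>(?:[^"\\]|\\.)*)"
        (?:@(?P<lang>[a-z]+(?:-[A-Za-z0-9]+)*)|\^\^<(?P<datatype>[^>]*)>)?
    )\s*\.\s*$''',
    re.VERBOSE,
)


def parse_line(line: str):
    """Return a dict of terms for a triple line, or None for blanks and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    match = TRIPLE.match(stripped)
    if match is None:
        raise ValueError(f"not a valid triple line: {line!r}")
    return match.groupdict()


def native_value(term: dict):
    """Interpret the quoted lexical form according to its data type, if known."""
    if term["datatype"] == XSD_INT:
        return int(term["value"])
    return term["value"]


tagged = parse_line('_:blank1 <http://purl.org/dc/elements/1.1/title> "A lo cubano"@es .')
typed = parse_line(
    '<http://example.com/employees/jdoe> <http://example.com/employee-id> '
    '"23"^^<http://www.w3.org/2001/XMLSchema#int> .'
)
print(tagged["value"], tagged["lang"])
print(native_value(typed))
```

Because the pattern requires the period to end the line, a comment placed after a triple is rejected with a ValueError, which mirrors the restriction in the grammar.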





Information in triplicate

That's all there is to the structure of N-Triples. I did not cover a few nuances; for example, a very strict set of characters is allowed in the syntax, and you must be careful to escape any characters outside these ranges. Some characters (in URI references) must be escaped using URI conventions, and others use an N-Triples convention with a leading backslash. If you are writing code to read or write N-Triples, be sure to see the specification for these details.
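As a rough illustration of the backslash convention for literal strings, the Python sketch below escapes the backslash, the double quote, and the common control characters, and writes anything outside printable ASCII as a \u or \U sequence. The function name escape_literal is hypothetical, URI escaping is not covered, and the specification remains the authority on the exact character ranges.

```python
def escape_literal(text: str) -> str:
    """Escape a string for use between the quotes of an N-Triples literal."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)  # printable ASCII passes through unchanged
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")  # four hex digits
        else:
            out.append(f"\\U{cp:08X}")  # eight hex digits
    return "".join(out)


print('"' + escape_literal('A lo "cubano"\n') + '"')
print('"' + escape_literal("café") + '"')
```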

N-Triples is one of several efforts aimed at a simple triples-based representation for RDF. Another is N3 (see Resources), which is pretty popular and is the source of some of the ideas in N-Triples. But N-Triples has the advantage of being written into a formal specification, and because of its use in the standard RDF test cases, it will probably be implemented by all RDF processors.



Resources



About the author


Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can contact Mr. Ogbuji at uche@ogbuji.net.



