
Translate Atom to RDF using Java technology

From syndication to semantics with ease

Brian M. Carey (careyb@triangleinformationsolutions.com), Information Systems Consultant, Triangle Information Solutions
Brian Carey is an information systems consultant who specializes in the architecture, design, and implementation of Java enterprise applications. You can follow Brian on Twitter at http://twitter.com/brianmcarey, and his tweets are publicly available.

Summary:  Given that Resource Description Framework (RDF) query languages do not recognize documents that follow the Atom specification, how can you translate an Atom document into a distinct document that follows the RDF specification? The answer: Java™ technology. Learn how to make it happen.

Date:  23 Jun 2009
Level:  Intermediate

RDF comprises a set of specifications published by the W3C. It is a metadata modeling framework that makes information software-readable as it is distributed throughout the Web. It does this by describing that information with subject-predicate-object expressions known as triples.

Frequently used acronyms

  • API: Application programming interface
  • RDF: Resource Description Framework
  • DOM: Document Object Model
  • IETF: Internet Engineering Task Force
  • RSS: Really Simple Syndication
  • URI: Uniform Resource Identifier
  • URL: Uniform Resource Locator
  • W3C: World Wide Web Consortium
  • XML: Extensible Markup Language

For example, take the English expression, "Perry the Platypus's archenemy is Dr. Doofenshmirtz." In this case, the subject is Perry the Platypus, the predicate is archenemy, and the object is Dr. Doofenshmirtz. In RDF, this triple would be encoded using a vocabulary that identifies cartoon characters and their archenemies.

RDF represents "tomorrow," because it is part of the Semantic Web movement. In fact, it is a significant part of that movement.

The Semantic Web movement is the next evolution of the World Wide Web in which information is identified by semantics. The idea is to present data that can clearly be identified by both software and human beings based on a predefined format. And, guess what: That predefined format is accomplished using RDF. (An exhaustive analysis of RDF is beyond the scope of this article. However, see Resources for links to more information.)

Atom: Welcome to yesterday

The heading for this section might seem like a pejorative statement, but it's not intended that way. It is instead meant to contrast a technology that is emerging (RDF) with a technology that has been around for a while (Atom).

Atom is a syndication format for a series of Web-based documents, developed in response to limitations in RSS. The syndication format is expressed as an XML language, so Atom documents are XML documents.

Atom documents are routinely read by software known as feed readers, which give their users the ability to view a synopsis of related documents from a particular Web site. Users can decide which documents they want to read, then click only those links. The Atom syndication format also enables webmasters to display feeds on their sites.

However, Atom does not define semantics as they are understood by the emerging Semantic Web activity. To do that, you need RDF.


The best of both worlds

So, these questions come to mind: "Is there an RDF specification that facilitates syndication? Can you enjoy the benefit of semantics with the advantages of broader exposure?"

Yes.

Enter the "other" RSS. Not the RSS that you're thinking about: this RSS stands for RDF Site Summary, and it defines a syndication format in a semantic manner. It enables webmasters to broadcast their documents in RDF format so that the information contained in those documents is understood by the Semantic Web.

The advantage of providing feeds in RDF format is that resources that support Semantic Web activity can read, cache, and include content from those feeds in their search results. As the Semantic Web continues to emerge, webmasters who adopt RDF Site Summary will find themselves already on the cutting edge of the latest and greatest technology. They'll have broader exposure, and that means more traffic. More traffic means more impressions for their advertisers. More impressions for their advertisers means more money in their pockets. It's definitely a worthwhile investment in development effort.


Translating Atom into RDF

Now that the cost-benefit analysis is out of the way, it's time to start making this happen. This article explains how to translate an existing Atom document into an RDF document using the Java programming language.

Fortunately, both the Atom input and the RDF output in this article are XML documents. That means that the same API that you use to read one can be used to write the other.

The Java programming language

You'll write the code that makes the translation happen with version 1.6 of the Java programming language. I chose this language because of its famous "write once, run anywhere" capabilities. You can compile and run the code provided with this article on any platform that has a 1.6-compliant version of the Java software development kit (JDK).

The API for parsing and creating the XML documents is the Streaming API for XML (StAX), an outstanding interface that transcends the traditional DOM and Simple API for XML (SAX) parsing schemes. With StAX, parsing an XML document becomes cursor-based, and the application only uses what it needs from the XML document as it progresses. StAX also enables developers to create XML documents.
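To see what cursor-based parsing looks like in practice, consider the following minimal sketch. It is not part of this article's download: the class name CursorSketch and the sample XML are assumptions made purely for illustration. It uses the StAX XMLStreamReader interface to walk a small document held in a string and picks out only the element names, ignoring everything else.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; not one of the article's source files.
class CursorSketch {

    // Collects the local names of all start elements, in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<String>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // next() advances the cursor and reports what it now points at
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Rethrown unchecked to keep the calling code short
            throw new IllegalStateException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<feed><title>Demo</title><entry/></feed>"));
    }
}
```

For the sample document in main(), the method should return feed, title, and entry, in that order. The application never builds a tree of the whole document; it simply asks the cursor where it is and takes what it needs.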

The metadata

Metadata is basically data about data, and it's absolutely essential in the Semantic Web. It's how the triples that I mentioned earlier are identified and interpreted.

As I mentioned, you'll use the RDF Site Summary specification for the end product. It suits this purpose because it is a syndication format that conforms to Semantic Web standards.

It's important to note here that RDF Site Summary is a stand-alone specification, but it lacks certain definitions, such as dates. To fill the gaps, another RDF-compatible vocabulary is usually used: the Dublin Core metadata element set, which is maintained by the Dublin Core Metadata Initiative (DCMI). Dublin Core is one of the most popular vocabularies used with RDF.

Coding it

The basic idea is that you will read an existing Atom feed, then translate that feed into RDF. In this case, the requirement is to translate the Twitter public timeline in Atom format to RDF Site Summary format. To accomplish this, you use standard JavaBeans™ to store the information read in from the Atom feed. JavaBeans, as you probably know, are Java classes that contain a series of private properties and publicly available accessors and mutators. You then use the content of these classes to produce the RDF document.

Download the source code

Download the source code for this article. The .zip file contains the three files you need to follow along: AtomToRdf.java, Channel.java, and Item.java.

There are two significant stanzas in an RDF Site Summary document. One is the <channel> stanza; the other is the <item> stanza, which can occur more than once. The <channel> stanza describes the overall feed. Each <item> stanza describes a document within the feed. See Listing 1 for a sample RDF Site Summary document.


Listing 1. Sample RDF document (abbreviated)
	
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
	xmlns="http://purl.org/rss/1.0/" 
	xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel rdf:about="http://www.twitter.com">
         <title>Twitter public timeline</title>
         <description>Twitter updates from everyone!</description>
         <link>http://twitter.com/public_timeline</link>
         <dc:date>2009-04-05T13:11:01+00:00</dc:date>
         <items>
              <rdf:Seq>
                   <rdf:li>http://twitter.com/TJalexander/statuses/1456808203</rdf:li>
                   <rdf:li>http://twitter.com/xElsiex/statuses/1456808201</rdf:li>
                   <rdf:li>http://twitter.com/mmama1215/statuses/1456808197</rdf:li>
                   <rdf:li>http://twitter.com/kennethmaxey/statuses/1456808196</rdf:li>
                   <rdf:li>http://twitter.com/katiestars/statuses/1456808195</rdf:li>
                   <rdf:li>http://twitter.com/Zweeal/statuses/1456808194</rdf:li>
                   <rdf:li>http://twitter.com/lilvicofficial/statuses/1456808193</rdf:li>
                   <rdf:li>http://twitter.com/PrettyNitti/statuses/1456808192</rdf:li>
                   <rdf:li>http://twitter.com/mrrobbo/statuses/1456808190</rdf:li>
                   <rdf:li>http://twitter.com/smd75jr/statuses/1456808189</rdf:li>
                   <rdf:li>http://twitter.com/BirdDiva/statuses/1456808188</rdf:li>
                   <rdf:li>http://twitter.com/nouwen/statuses/1456808185</rdf:li>
                   <rdf:li>http://twitter.com/gustavopereira/statuses/1456808184</rdf:li>
                   <rdf:li>http://twitter.com/sky_7/statuses/1456808183</rdf:li>
                   <rdf:li>http://twitter.com/fauzty/statuses/1456808182</rdf:li>
                   <rdf:li>http://twitter.com/Cheriefaery/statuses/1456808181</rdf:li>
                   <rdf:li>http://twitter.com/CarolineAttia/statuses/1456808180</rdf:li>
                   <rdf:li>http://twitter.com/ukyo_rst/statuses/1456808179</rdf:li>
                   <rdf:li>http://twitter.com/Len0r/statuses/1456808177</rdf:li>
                   <rdf:li>http://twitter.com/jhill444faceboo/statuses/1456808175</rdf:li>
              </rdf:Seq>
         </items>
    </channel>    
    <item rdf:about="http://twitter.com/TJalexander/statuses/1456808203">
         <dc:format>text/html</dc:format>
         <dc:date>2009-04-05T13:11:01+00:00</dc:date>
         <dc:source>http://www.twitter.com</dc:source>
         <dc:creator>t.j. alexander</dc:creator>
         <dc:date>2009-04-05T13:11:01+00:00</dc:date>
         <title>TJalexander: Photo: somethingtobelievein: i don</title>
         <link>http://twitter.com/TJalexander/statuses/1456808203</link>
         <description>TJalexander: Photo: somethingtobelievein: i don</description>
    </item>
    <item rdf:about="http://twitter.com/xElsiex/statuses/1456808201">
         <dc:format>text/html</dc:format>
         <dc:date>2009-04-05T13:11:01+00:00</dc:date>
         <dc:source>http://www.twitter.com</dc:source>
         <dc:creator>Elsie Constantinides</dc:creator>
         <dc:date>2009-04-05T13:11:01+00:00</dc:date>
         <title>xElsiex: my hairs gone all fluffy like :O !! nooooooooooooo !!!</title>
         <link>http://twitter.com/xElsiex/statuses/1456808201</link>
         <description>xElsiex: my hairs gone all</description>
    </item>
...	

Notice that the document in Listing 1 looks strikingly similar to the original Really Simple Syndication format. This is not a coincidence, as the idea behind the RDF Site Summary specification is to produce a syndication format that is RDF compliant.

Most of the elements are self-explanatory. One significant difference between RDF Site Summary and Really Simple Syndication is the <items> element, a child of the <channel> element. This element provides a list of all document links contained in the RDF file. Think of it as a synopsis of a synopsis.
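As a rough sketch of how such a list could be produced with StAX, the following fragment writes an <items> element in the same shape as Listing 1, with one <rdf:li> per link. The class name ItemsSketch, the method name, and the example.org links are assumptions for illustration only; the code is not taken from the article's download.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration class; not one of the article's source files.
class ItemsSketch {
    private static final String RDF_PREFIX = "rdf";
    private static final String RDF_URI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Writes <items><rdf:Seq><rdf:li>link</rdf:li>...</rdf:Seq></items>.
    static String itemsXml(List<String> links) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xmlw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xmlw.writeStartElement("items");
            // Declared here only so this fragment is well-formed on its own;
            // in a full document the root element carries the declaration.
            xmlw.writeNamespace(RDF_PREFIX, RDF_URI);
            xmlw.writeStartElement(RDF_PREFIX, "Seq", RDF_URI);
            for (String link : links) {
                xmlw.writeStartElement(RDF_PREFIX, "li", RDF_URI);
                xmlw.writeCharacters(link);
                xmlw.writeEndElement(); // </rdf:li>
            }
            xmlw.writeEndElement(); // </rdf:Seq>
            xmlw.writeEndElement(); // </items>
            xmlw.flush();
            xmlw.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write the items list", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(itemsXml(Arrays.asList(
            "http://example.org/statuses/1", "http://example.org/statuses/2")));
    }
}
```

In the translator itself, the list of links would come from the items property of the Channel bean rather than from a hard-coded list.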

With all this in mind, it seems appropriate to create two JavaBeans: one for each major stanza. Listing 2 shows the Channel class.


Listing 2. The Channel class
	
import java.util.ArrayList;
import java.util.List;

public class Channel {

	private String about;
	private String title;
	private String description;
	private String link;
	private String date;
	private List<String> items = new ArrayList<String>();
	
	
	
	public String getAbout() {
		return about;
	}
	public void setAbout(String about) {
		this.about = about;
	}
	public String getTitle() {
		return title;
	}
	public void setTitle(String title) {
		this.title = title;
	}
	public String getDescription() {
		return description;
	}
	public void setDescription(String description) {
		this.description = description;
	}
	public String getLink() {
		return link;
	}
	public void setLink(String link) {
		this.link = link;
	}
	public String getDate() {
		return date;
	}
	public void setDate(String date) {
		this.date = date;
	}
	public List<String> getItems() {
		return items;
	}
	public void setItems(List<String> items) {
		this.items = items;
	}
}
	

As you can see, the Channel class is nothing more than a straightforward JavaBean that describes information contained in the <channel> stanza. There is a direct correlation between each property in the class and each element that is a child of <channel>. There is even a list of String objects (List<String>) for the links that are children of the <items> element.

Listing 3 is another simple JavaBeans class. This class represents a document within the feed.


Listing 3. The Item class
	
public class Item {
	
	private String format;
	private String date;
	private String link;
	private String creator;
	private String title;
	private String description;
	private String source;
	
	
	public String getSource() {
		return source;
	}
	public void setSource(String source) {
		this.source = source;
	}
	public String getFormat() {
		return format;
	}
	public void setFormat(String format) {
		this.format = format;
	}
	public String getDate() {
		return date;
	}
	public void setDate(String date) {
		this.date = date;
	}
	public String getLink() {
		return link;
	}
	public void setLink(String link) {
		this.link = link;
	}
	public String getCreator() {
		return creator;
	}
	public void setCreator(String creator) {
		this.creator = creator;
	}
	public String getTitle() {
		return title;
	}
	public void setTitle(String title) {
		this.title = title;
	}
	public String getDescription() {
		return description;
	}
	public void setDescription(String description) {
		this.description = description;
	}	
}

As you can see, this class contains pertinent information about an item including title, creator (or author), description (the synopsis), and the link.

Before delving further into the code, it's important to look at a sample Atom document. See Listing 4.


Listing 4. A sample Atom document (Twitter public timeline)
	
<?xml version="1.0" encoding="UTF-8"?>
<feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
  <title>Twitter public timeline</title>
  <id>tag:twitter.com,2007:Status</id>
  <link type="text/html" rel="alternate" href="http://twitter.com/public_timeline"/>
  <updated>2009-04-06T12:20:02+00:00</updated>
  <subtitle>Twitter updates from everyone!</subtitle>
    <entry>
      <title>UMaineExtension: Backyard Poultry course</title>
      <content type="html">UMaineExtension: Backyard Poultry course</content>
      <id>tag:twitter.com,2007:http://twitter.com/UMaineExtension/statuses/1462447470</id>
      <published>2009-04-06T12:20:00+00:00</published>
      <updated>2009-04-06T12:20:00+00:00</updated>
      <link type="text/html" rel="alternate" href="http://twitter.com//1462447470"/>
      <link type="image/jpeg" rel="image" href="http://UM-crest_normal.jpg"/>
      <author>
        <name>UMaine Extension</name>
        <uri>http://www.extension.umaine.edu</uri>
      </author>
    </entry>
    <entry>
      <title>tmj_mem_adv: Ecommerce Marketing Manager http://tinyurl.com/cthahs</title>
      <content type="html">tmj_mem_adv: Ecommerce Marketing Manager</content>
      <id>tag:twitter.com,2007:http://twitter.com/1462447468</id>
      <published>2009-04-06T12:19:59+00:00</published>
      <updated>2009-04-06T12:19:59+00:00</updated>
      <link type="text/html" rel="alternate" 
          href="http://twitter.com/statuses/1462447468"/>
      <link type="image/png" rel="image" href="http://83603474/twitter_normal.png"/>
      <author>
        <name>TMJ-MEM Advert Jobs</name>
        <uri>http://www.tweetmyjobs.com</uri>
      </author>
    </entry>  
...

Note the <title> element that is a direct child of <feed>. Yet another one is a direct child of <entry>. You will need to handle this in the code later.

Now that the model is complete, it's time to actually code the work to perform the parsing of the Atom feed and the creation of the RDF file. The AtomToRdf class makes that happen. Listing 5 shows you the essence of that class.


Listing 5. The essence of AtomToRdf
	
public class AtomToRdf {
	
	. . .
	private Channel channel = new Channel();
	private List<Item> itemList = new ArrayList<Item>();

	public static void main(String[] args) {
		AtomToRdf atomToRdf = new AtomToRdf();
		atomToRdf.go();
	}
	
	private void go() {
		parseAtom();
		createRdf();
	}
	. . .
}

Wouldn't it be nice if everything were that simple? The reality is that the main() method simply executes a private method, go(), on a new AtomToRdf instance. This is done as a means of getting out of the static context. The go() method in turn executes two fairly self-explanatory methods: parseAtom() and createRdf(). The first method is the reader. The second is the writer.

To ensure that information read in from the Atom feed is available to all methods within the AtomToRdf object, two privately available object variables are declared, as in Listing 5. One is an instance of the Channel class (called channel). The other is a List object containing one or more Item objects (called itemList).

Listing 6 shows the beginning of the parseAtom() method. You use StAX to parse the Atom feed. The code begins by instantiating a new XMLInputFactory object. Then, an InputStream object containing the Twitter public timeline in Atom format is opened. The XMLInputFactory creates an XMLEventReader object from that InputStream. This is what StAX uses to identify events in the process of pull-parsing the Atom feed. Some examples of events include the start of the document, the start of an element, and the end of an element.


Listing 6. Starting to parse Atom
	
	private void parseAtom() {
        try {
            XMLInputFactory inputFactory = XMLInputFactory.newInstance();
            InputStream in = new URL("http://twitter.com/statuses/public_timeline.atom")
			.openStream();
            
            XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
            
            boolean inEntry = false;
            Item currentItem = null;
            
            while (eventReader.hasNext()) {   
...

To handle the two <title> elements, the inEntry Boolean is used to distinguish between them. If the Boolean is true, the parser is examining a <title> element that is a child of <entry>.

The currentItem variable is used to store the information that will be contained in each <item> stanza in the output file. A new currentItem object is instantiated every time the parser encounters another <entry> element in the input file. The existing currentItem object is added to the list of Item objects (itemList) every time the parser encounters the end of an <entry> element.

Finally, Listing 6 begins the parser loop. Basically that while statement says, "As long as the parser encounters any event, keep executing the code in the braces ({})."

Which raises the question: what kinds of events are encountered, and how should they be handled? Look at Listing 7.


Listing 7. Parsing the Title element
	
if (event.isStartElement()) {
    StartElement startElement = event.asStartElement();

    if (startElement.getName().getLocalPart().equals("title")) {
        // Advance to the text content of the <title> element
        event = eventReader.nextEvent();
        String title = event.asCharacters().getData();

        if (!inEntry) {
            // <title> is a direct child of <feed>
            channel.setTitle(title);
        } else {
            // <title> belongs to the current <entry>
            currentItem.setTitle(title);
        }

        continue;
    }
...

The first thing the code checks for when it encounters an event is whether the event is the start of a new element. If it is, a StartElement object is obtained from the event. Then, the name of the element is examined. If the name is title, the code puts the actual contents of the element into the String variable title.

Remember the inEntry variable? That's going to be used here because, as you might recall, elements named title occur in two different places in the Atom feed. If inEntry is set to true, the code knows to look at the title of a document, not the title of the whole feed. In the former case, the title property of the currentItem object is set. In the latter case, the title property of the channel object is set.

Finally, the continue statement is a standard Java statement that reads, in English, "Just continue looping once you're done here." In other words, when the code is done with this event, start to look for more events.

If you look at the entire code, you'll see that there are many blocks similar to the one in Listing 7. The difference is that each block checks for a different element within the Atom feed, then sets the appropriate instance variables in the right objects.

When the loop is finished, the code will have a fully populated Channel object and a list of fully populated Item objects. These objects will then be read, and the information contained in them will be used to produce the RDF file.
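To tie the pieces of the parse loop together, here is a compact, self-contained sketch that runs against an Atom document held in a string. It is a simplified stand-in, not the AtomToRdf class from the download: the class AtomParseSketch and its nested Entry holder are hypothetical, and only the title and the alternate link are extracted. It also swaps in getElementText() for the nextEvent()/asCharacters() pair in Listing 7, on the assumption that titles contain text only; that call returns an empty string for an empty element instead of risking a ClassCastException.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical, trimmed-down stand-in for the parse loop described in the text.
class AtomParseSketch {

    // Minimal holder for one <entry>; the article's Item bean has more properties.
    static class Entry {
        String title;
        String link;
    }

    String feedTitle;
    final List<Entry> entries = new ArrayList<Entry>();

    static AtomParseSketch parse(String atomXml) {
        AtomParseSketch result = new AtomParseSketch();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(atomXml));
            boolean inEntry = false;
            Entry current = null;

            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();

                if (event.isStartElement()) {
                    StartElement start = event.asStartElement();
                    String name = start.getName().getLocalPart();

                    if (name.equals("entry")) {
                        // A new <entry> begins: start collecting into a fresh holder
                        inEntry = true;
                        current = new Entry();
                    } else if (name.equals("title")) {
                        // Reads the text up to the matching </title>
                        String title = reader.getElementText();
                        if (inEntry) {
                            current.title = title;
                        } else {
                            result.feedTitle = title;
                        }
                    } else if (name.equals("link") && inEntry) {
                        // Atom attributes are unqualified, so a local-name QName matches
                        Attribute rel = start.getAttributeByName(new QName("rel"));
                        Attribute href = start.getAttributeByName(new QName("href"));
                        if (rel != null && "alternate".equals(rel.getValue()) && href != null) {
                            current.link = href.getValue();
                        }
                    }
                } else if (event.isEndElement()
                        && event.asEndElement().getName().getLocalPart().equals("entry")) {
                    // The <entry> is complete: keep it and leave entry context
                    result.entries.add(current);
                    inEntry = false;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse the Atom document", e);
        }
        return result;
    }
}
```

The inEntry flag plays the same role here as in the listings above: it is switched on at the start of an <entry>, switched off at its end, and consulted whenever a <title> is found.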

Before examining the code to produce the RDF document, it's important first to understand some constants that have been defined in AtomToRdf (see Listing 8).


Listing 8. Constants defined in AtomToRdf
	
private static final String DUBLIN_CORE_PREFIX	= "dc";
private static final String DUBLIN_CORE_URI	= "http://purl.org/dc/elements/1.1/";
private static final String RDF_PREFIX		= "rdf";
private static final String RDF_URI	= "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
private static final String RSS_URI			= "http://purl.org/rss/1.0/";

Why are these constants necessary? Take a look all the way back to the sample RDF document in Listing 1. You'll see that the RDF output requires namespaces, and in many cases, those namespaces are repeated. These constants make it easier to reference those namespaces and their respective URIs. The code in Listing 9 starts the output.


Listing 9. Starting the output
	
    private void createRdf() {
        try {
            XMLOutputFactory xmlof = XMLOutputFactory.newInstance();
            XMLStreamWriter xmlw = xmlof.createXMLStreamWriter(
                new FileOutputStream("c:/twitter.rdf"));
            xmlw.writeStartElement(RDF_PREFIX, "RDF", RDF_URI);
            xmlw.writeNamespace(RDF_PREFIX, RDF_URI);
            xmlw.writeNamespace("", RSS_URI);
            xmlw.writeNamespace(DUBLIN_CORE_PREFIX, DUBLIN_CORE_URI);
            xmlw.writeCharacters("\n");
            xmlw.writeCharacters("    ");

            writeChannel(xmlw);
            writeItems(xmlw);

            xmlw.writeCharacters("\n");
            xmlw.writeEndElement();
            xmlw.writeEndDocument();
            xmlw.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

Once again, the StAX API is used. The difference is that in this case, it's used to produce output rather than to read input. The code begins by instantiating a new XMLOutputFactory object. Then, an XMLStreamWriter is created from a FileOutputStream object that points to c:/twitter.rdf, the name and location of the output file. You might need to change the location of the file for your own environment.

The code begins writing the elements. It starts with the root element and uses the prefix rdf with its respective URI for that element. The next three lines define the various namespaces associated with this element and their respective prefixes. Note that the RSS_URI constant represents the default namespace, so an empty string is used as the prefix.

The next two lines are for formatting purposes only. They make the output easier for a human being to read. You'll see quite a lot of that throughout the output code.

The next two lines invoke separate methods that act as "the guts" of the output routine. The first method writes out the <channel> stanza. The second one is used to write each <item> stanza.

The next few lines close out the root element and the document itself. Finally, the XMLStreamWriter object is closed.
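The skeleton of the output routine can be tried in isolation. The sketch below is an illustration under stated assumptions, not the article's code: the class RootElementSketch and its methods are hypothetical, the output goes to an in-memory buffer instead of a file, and an XML declaration is written first, which the article's listing does not do. It uses writeDefaultNamespace(), which is what writeNamespace() delegates to when it is given an empty prefix.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration class; not one of the article's source files.
class RootElementSketch {
    private static final String DUBLIN_CORE_PREFIX = "dc";
    private static final String DUBLIN_CORE_URI = "http://purl.org/dc/elements/1.1/";
    private static final String RDF_PREFIX = "rdf";
    private static final String RDF_URI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    private static final String RSS_URI = "http://purl.org/rss/1.0/";

    // Writes an empty <rdf:RDF> root element with its three namespace declarations.
    static void writeEmptyRdf(OutputStream out) throws XMLStreamException {
        XMLStreamWriter xmlw =
            XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        xmlw.writeStartDocument("UTF-8", "1.0");
        xmlw.writeStartElement(RDF_PREFIX, "RDF", RDF_URI);
        xmlw.writeNamespace(RDF_PREFIX, RDF_URI);
        xmlw.writeDefaultNamespace(RSS_URI);
        xmlw.writeNamespace(DUBLIN_CORE_PREFIX, DUBLIN_CORE_URI);
        // The <channel> and <item> stanzas would be written here
        xmlw.writeEndElement();
        xmlw.writeEndDocument();
        xmlw.flush();
        // close() releases the writer but leaves the underlying stream open
        xmlw.close();
    }

    // Convenience wrapper that returns the document as a string.
    static String emptyRdfAsString() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeEmptyRdf(buffer);
            return buffer.toString("UTF-8");
        } catch (Exception e) {
            throw new IllegalStateException("Could not write the RDF skeleton", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(emptyRdfAsString());
    }
}
```

Because XMLStreamWriter.close() does not close the stream it writes to, code that targets a file should close the FileOutputStream itself, for example in a finally block on Java 1.6 or with a try-with-resources statement on Java 7 and later.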

You might notice a pattern within Listing 10. First, the parent element (appropriately named channel) is created with an appropriate about attribute. Note that the about attribute requires the rdf namespace. The value of the about attribute is simply a URL pointing to the information described by the RDF Site Summary document. For this purpose, I used the URL for Twitter.


Listing 10. The writeChannel() method
	
    private void writeChannel(XMLStreamWriter xmlw) throws Exception {
		xmlw.writeStartElement("channel");
		xmlw.writeAttribute(RDF_PREFIX, RDF_URI, "about", 
			"http://www.twitter.com");
		xmlw.writeCharacters("\n");    
		xmlw.writeCharacters("         ");  
		xmlw.writeStartElement("title");
		xmlw.writeCharacters(channel.getTitle());
		xmlw.writeEndElement();
		xmlw.writeCharacters("\n");    
		xmlw.writeCharacters("         "); 
		xmlw.writeStartElement("description");
		xmlw.writeCharacters(channel.getDescription());
		xmlw.writeEndElement();
		xmlw.writeCharacters("\n");    
		xmlw.writeCharacters("         "); 
...

This is the second time you've seen the about attribute in the output. It's important to know why it's there: The about attribute defines the subject in that subject-predicate-object triple concept I mentioned previously. In this case (and not infrequently), the subject is a URL. Each child element (such as <title>) represents a predicate. The content of each element is the object.
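The mapping from a stanza to triples can be spelled out in a few lines of Java. The following sketch is only an illustration: the class StanzaTriplesSketch is hypothetical, it assumes that every child element belongs to the RSS 1.0 namespace and holds a plain text value, and it prints the statements in a simplified N-Triples-like notation without escaping special characters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not one of the article's source files.
class StanzaTriplesSketch {
    private static final String RSS_URI = "http://purl.org/rss/1.0/";

    // Subject: the rdf:about value. Predicate: namespace URI plus element name.
    // Object: the element content, treated here as a plain literal.
    static List<String> toTriples(String about, Map<String, String> children) {
        List<String> lines = new ArrayList<String>();
        for (Map.Entry<String, String> child : children.entrySet()) {
            lines.add("<" + about + "> <" + RSS_URI + child.getKey() + "> \""
                + child.getValue() + "\" .");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Values taken from the <channel> stanza in Listing 1
        Map<String, String> channel = new LinkedHashMap<String, String>();
        channel.put("title", "Twitter public timeline");
        channel.put("link", "http://twitter.com/public_timeline");
        for (String line : toTriples("http://www.twitter.com", channel)) {
            System.out.println(line);
        }
    }
}
```

Each printed line reads as one statement about the channel: the subject in the first pair of angle brackets, the predicate in the second, and the object in quotation marks.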

After a couple of formatting lines, the <title> element is created and populated with the title gleaned from the Atom feed. The <description> element is created, and so on.

Even though only two elements are provided in Listing 10, you probably notice a pattern emerging. An element is created, then the element is populated with contents from the Channel object. Then, the element is ended. And this process is repeated for all data contained in the Channel object.

To make things intuitive, the names of the properties in the Channel object are the same as the names of the elements in the <channel> stanza, which makes it easier to properly map between the code and the output.
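Because the same create, populate, and end sequence repeats for every property, it could be factored into a small helper. The sketch below shows one way to do that; the helper writeTextElement(), the class ElementHelperSketch, and the two-space indentation are assumptions for illustration and do not appear in the article's source files.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration class; not one of the article's source files.
class ElementHelperSketch {

    // Writes a newline, the indentation, and then <name>text</name>.
    static void writeTextElement(XMLStreamWriter xmlw, String indent,
            String name, String text) throws XMLStreamException {
        xmlw.writeCharacters("\n" + indent);
        xmlw.writeStartElement(name);
        xmlw.writeCharacters(text); // special characters such as & are escaped
        xmlw.writeEndElement();
    }

    // Builds a tiny <channel> fragment to show the helper in use.
    static String channelFragment(String title, String description) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xmlw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xmlw.writeStartElement("channel");
            writeTextElement(xmlw, "  ", "title", title);
            writeTextElement(xmlw, "  ", "description", description);
            xmlw.writeCharacters("\n");
            xmlw.writeEndElement();
            xmlw.flush();
            xmlw.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write the channel fragment", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(channelFragment("Twitter public timeline",
            "Twitter updates from everyone!"));
    }
}
```

With a helper like this, each property of the Channel object maps to a single line of output code, which keeps the correspondence between bean properties and elements easy to see.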

In Listing 11, the writeItems() method is a bit different in that it doesn't just write one stanza, but many. It writes one stanza for each <entry> element that was in the Atom feed. The for loop toward the beginning of the method makes sure of that.


Listing 11. The writeItems() method
	
    private void writeItems(XMLStreamWriter xmlw) throws Exception {
    	xmlw.writeCharacters("\n"); 
    	xmlw.writeCharacters("    ");
    	
	for (Item item : itemList) {
		xmlw.writeStartElement("item");
		xmlw.writeAttribute(RDF_PREFIX, RDF_URI, 
			"about", item.getLink());
		xmlw.writeCharacters("\n");    
		xmlw.writeCharacters("         "); 
   		
		xmlw.writeStartElement(DUBLIN_CORE_PREFIX,"format",
			DUBLIN_CORE_URI);
		xmlw.writeCharacters(item.getFormat());  
		xmlw.writeEndElement();
		xmlw.writeCharacters("\n");    
		xmlw.writeCharacters("         ");
...

For each Item object in itemList, a new <item> element is created. Again, a namespace-specific about attribute points to the link for the document. In this case, the link points to a tweet from a particular user on Twitter.

After a bit more formatting, a child element called format is created. The format element describes the format of the output in the document; for your purposes, this element will be text/html. Note that here, the Dublin Core metadata is used rather than the metadata specified by the RDF Site Summary standard or the RDF standard. That is because neither of the other specifications enables you to define the document format.

As with the previous code block, a pattern emerges. For each property in the Item class, a new element is created whose name corresponds to the property name.

That's the code in a nutshell. Now, it's time to see if it actually works as advertised.

Testing the code

Extract the file included with this article (AtomToRdf.zip) into the test directory of your choice. You should see the three files that have been described in some detail already: Item.java, Channel.java, and AtomToRdf.java.

Use your favorite integrated development environment (IDE) or go to a command prompt and compile those classes using a Java compiler that is compliant with version 1.6. Then, simply run the AtomToRdf class with no command-line parameters.

If you kept everything as it is in the code provided, you should have a file in the root of your C drive called twitter.rdf. Open this file and examine it. It should follow the same format as that in Listing 1.

How do you know this is a valid RDF file? Take it to a validator. Fortunately, the good folks at the W3C have created one for you: it's located at http://www.w3.org/RDF/Validator. Access that URL, then paste the contents of twitter.rdf into the text area. Click Parse RDF. You should see a window that reads at the top, "Your RDF document validated successfully." Congratulations!
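Before pasting the output into the validator, a quick local sanity check can catch obvious problems. The sketch below is only that: it confirms that a document is well-formed XML and counts the <item> elements in the RSS 1.0 namespace. It does not validate RDF and is no substitute for the W3C service. The class RdfOutputCheckSketch is hypothetical, and it takes the document as a string; reading twitter.rdf from disk is left out.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; not one of the article's source files.
class RdfOutputCheckSketch {
    private static final String RSS_URI = "http://purl.org/rss/1.0/";

    // Returns the number of RSS 1.0 <item> elements.
    // Throws IllegalStateException if the document is not well-formed XML.
    static int countItems(String rdfXml) {
        int count = 0;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(rdfXml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())
                        && RSS_URI.equals(reader.getNamespaceURI())) {
                    count++;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Output is not well-formed XML", e);
        }
        return count;
    }

    public static void main(String[] args) {
        String sample =
            "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
            + " xmlns='http://purl.org/rss/1.0/'>"
            + "<channel rdf:about='http://example.org/'/>"
            + "<item rdf:about='http://example.org/1'/>"
            + "</rdf:RDF>";
        System.out.println("Items found: " + countItems(sample));
    }
}
```

If the count matches the number of entries in the Atom feed, the translation has at least kept every document; the validator then confirms that the result parses as RDF.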


Conclusion

As the Semantic Web emerges to become the leading edge of the technological revolution, it's important that Web sites issue documents that are compliant with Semantic Web standards. One of those standards is RDF.

Using RDF Site Summary, webmasters can produce RDF-compliant documents that are similar to Atom feeds. This provides the dual benefit of syndication with semantics.

Using the Java programming language with the StAX API, it is easy to parse an Atom feed and translate it into an RDF document that you can then use to provide semantic-specific feeds.



Download

Description                                 Name          Size   Download method
Source files for translating Atom to RDF    Javasrc.zip   4KB    HTTP



Resources

Learn

Get products and technologies

Discuss
