Thinking XML: Use the Atom format for syndicating news and more

Originally an RSS replacement, Atom spins into the nucleus of the conversational Web

Uche Ogbuji (uche@ogbuji.net), Principal Consultant, Fourthought, Inc.
Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is also a lead developer of the Versa RDF query language. He is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can contact Mr. Ogbuji at uche@ogbuji.net.

Summary:  The Web has always included sites that present series of articles, events, and other postings which are meant to be shared and cross-referenced. With large parts of the Web becoming conversational communities, many in these communities have come together to work on an XML-based standard for such interchange and cross-reference. Atom is the product of this effort -- a format and API for exchanging Web metadata. In this article, Uche Ogbuji introduces Atom. Share your thoughts on this article with the author and other readers in the accompanying discussion forum.

Date:  25 May 2004
Level:  Introductory

The RSS wars are well known in the XML community. Netscape scraped together this lightweight format for syndication -- the transmission of information from Web sites to be aggregated into a portal. Ever since that low-key beginning, not even the meaning of the acronym has been safe from controversy. Weblogs and a next generation of portals have made the exchange of descriptions of Web resources a widespread and important phenomenon. The various flavors of RSS rule the world of metadata exchange, which has raised the stakes even more in the endless RSS debate.

One well-known technologist long involved in this fray is Sam Ruby of the IBM Emerging Technology Group. In mid-2003, Ruby proposed that the various experts and users of RSS and related syndication formats work together to develop a next-generation format. Part of the aim was to establish a standard that all the factions had a stake in, and thus suppress the RSS wars. Another goal was a design more technically sound than many of the flavors of RSS, one that drew on the experience of the many RSS users to make the design compromises needed for the new format to work in harmony with both the architecture and the culture of the Web. A legion of developers and writers immediately scrambled to join the project, apparently driven by frustration with the endless combativeness and obstructionism in RSS, as well as a desire for a fresh approach to the technological problems in question.

The project was at first called Echo, but then trademark concerns arose and it was renamed Atom. As the Atom Wiki states, the project creates "specifications for syndicating, archiving and editing episodic web sites." I think that the defining characteristic of the domain Atom addresses is not just Web sites that are naturally broken into episodes, but also Web sites that have a conversational nature through their interchange with other sites; the episodes are often fueled by cross-reference to similar entries on other sites, and Atom intends to be the glue for this interchange.

Atom is remarkable for many reasons, but especially in how it has remained simple despite being the product of one of the largest committees that ever assembled itself for a community specification. Atom comprises a Syndication Format Specification (currently at version 0.3, draft 2), which is the XML format for representing information about a Web resource, and an API Specification (currently at version 0.9), which is a set of conventions based on HTTP for retrieving and modifying information in Web resources. Both specifications are written as Internet Drafts with the goal of standardization as RFCs, although currently only the API spec has been formally submitted to the Internet Engineering Task Force (IETF). In addition to the XML syntax, Atom is also being developed in RDF form using Web Ontology Language (OWL). In this article, I introduce Atom, focusing on the XML format specification but also touching on the API where appropriate. All the Atom specifications are still under heavy development and likely to change before they are standardized, although the essential flavor of Atom will likely remain the same.

Creating Web resources

Imagine that you're launching a Web magazine of formal poetry called Stanza Web. You can use Atom for updating the site, to list new verse, articles, and other features, and to gather information from other sites of interest. The first step is to post an article welcoming readers. In order to do so, the Atom API specifies that you wrap the content in an XML document comprising an Atom entry element, and send this document as an HTTP POST message to the Web server at a special URI called the PostURI. Listing 1 is an example of this XML document.


Listing 1. Example of an Atom entry that is sent to the PostURI to create an entry
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://purl.org/atom/ns#">
  <title>Welcome to Stanza Web</title>
  <author>
    <name>Uche Ogbuji</name>
    <!-- May also contain a URL and e-mail address -->
  </author>
  <link rel="alternate" type="text/html" 
        href="http://stanzaweb.art/2004-06-01/welcome"/>
  <!-- Time in UTC -->
  <modified>2004-06-01T10:11:12Z</modified>
  <content type="application/xhtml+xml" xml:lang="en">
    <div class="article" xmlns="http://www.w3.org/1999/xhtml">
      <p>Welcome to
        <a href="http://stanzaweb.art/">Stanza Web</a>.
        Come back often to keep track of the best in modern poetry.
      </p>
      <p>This site is powered by
         <a href="http://atomenabled.org">Atom</a>
      </p>
    </div>
  </content>
</entry>

All elements in the Atom format must be in the namespace http://purl.org/atom/ns#, or in a foreign namespace according to rules for extensibility which are not yet set. The entry element must have the following child elements:

  • title
  • link with an alternate relationship
  • modified

An entry may also have other child elements such as id, contributor, and content -- the latter is usually present when addressing a PostURI, as in this case.
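
The required-children rule above can be checked mechanically. The following is a minimal sketch in Python, assuming the draft 0.3 namespace shown in Listing 1. The function name missing_required_children and the sample entry are my own inventions for illustration and are not part of any Atom specification or library.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Atom 0.3 draft, as in Listing 1
ATOM_NS = "http://purl.org/atom/ns#"


def missing_required_children(entry_xml):
    """Return the names of required entry children that are absent.

    Hypothetical helper: it checks only for title, a link with
    rel="alternate", and modified, as listed in the article.
    """
    entry = ET.fromstring(entry_xml)
    missing = []
    if entry.find(f"{{{ATOM_NS}}}title") is None:
        missing.append("title")
    links = entry.findall(f"{{{ATOM_NS}}}link")
    if not any(link.get("rel") == "alternate" for link in links):
        missing.append("link[@rel='alternate']")
    if entry.find(f"{{{ATOM_NS}}}modified") is None:
        missing.append("modified")
    return missing


# A trimmed-down entry modeled on Listing 1 (illustrative only)
SAMPLE_ENTRY = """<entry xmlns="http://purl.org/atom/ns#">
  <title>Welcome to Stanza Web</title>
  <link rel="alternate" type="text/html"
        href="http://stanzaweb.art/2004-06-01/welcome"/>
  <modified>2004-06-01T10:11:12Z</modified>
</entry>"""

print(missing_required_children(SAMPLE_ENTRY))
```

An empty list means all three required children are present. Because the specifications are still in flux, treat the list of required elements as a snapshot of the current draft rather than a settled rule.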

All this entry information makes sense when the server is sending information on an existing resource to a client, but the role of each element when addressed to a PostURI is still not clear in the specifications. Presumably the title and author elements are used to set the relevant metadata in the created resource. The modified element provides a hint to the server as to how to give a time stamp to the entry, though it's unclear how a server should react in cases where time stamps for resources are set by server policy rather than submitter's instructions. The link presumably tells the server how to construct the URI for the created resource and what Internet media type to assign to at least one representation of that resource, although this raises even sharper questions of policy.

The content element presumably establishes the body of the created resource. It can even be binary, using all the capabilities of MIME. In this example, the content is set as an XHTML div. This is fine when the resource being posted will eventually be incorporated into a larger Web page for display -- a common situation in Weblogs. In other cases, it might make sense to send an entire XHTML document (with an html root element) as the content. Of course it is perfectly valid to send other content formats, such as HTML, plain text, or even images or audio files. If you do so, be sure to set the various media type attributes correctly.
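
To make the posting step in this section concrete, here is a rough sketch in Python of how a client might assemble an entry and prepare the HTTP POST. It rests on several assumptions: the PostURI is the fictional address from Listing 2, the Content-Type header reuses the media type shown in that listing's link elements, and the content is sent as plain text for brevity. The helper names build_entry and build_post_request are hypothetical. The request is only constructed, not sent, since stanzaweb.art is an invented host.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://purl.org/atom/ns#"
# Assumption: the PostURI advertised by the fictional site in Listing 2
POST_URI = "http://stanzaweb.art/atom-post"

# Serialize Atom elements with a default namespace rather than an ns0: prefix
ET.register_namespace("", ATOM_NS)


def build_entry(title, author, href, modified, body_text):
    """Build an Atom entry document (bytes) with a plain-text content body."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    author_el = ET.SubElement(entry, f"{{{ATOM_NS}}}author")
    ET.SubElement(author_el, f"{{{ATOM_NS}}}name").text = author
    ET.SubElement(entry, f"{{{ATOM_NS}}}link",
                  rel="alternate", type="text/html", href=href)
    ET.SubElement(entry, f"{{{ATOM_NS}}}modified").text = modified
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", type="text/plain")
    content.text = body_text
    declaration = b'<?xml version="1.0" encoding="utf-8"?>\n'
    return declaration + ET.tostring(entry, encoding="utf-8")


def build_post_request(post_uri, entry_bytes):
    """Prepare (but do not send) the HTTP POST carrying the entry."""
    return urllib.request.Request(
        post_uri,
        data=entry_bytes,
        method="POST",
        # Assumption: media type borrowed from the links in Listing 2
        headers={"Content-Type": "application/x.atom+xml"},
    )


entry_bytes = build_entry(
    "Welcome to Stanza Web",
    "Uche Ogbuji",
    "http://stanzaweb.art/2004-06-01/welcome",
    "2004-06-01T10:11:12Z",
    "Welcome to Stanza Web. Come back often.",
)
request = build_post_request(POST_URI, entry_bytes)
print(request.get_method(), request.full_url)
# Against a real server, urllib.request.urlopen(request) would send it.
```

How the server uses each element of the posted entry is, as noted above, still open in the specifications, so a real client would need to follow whatever conventions its target server documents.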


Atom discovery

Now that you have a welcome message up on your new Web site, you hope that others with the same interests will find it. Some of these people will want to use your Atom feed to syndicate information to their own Web site. Some will want to comment on your message. To enable this through Atom, update your Web site to add some special Atom-specific links to the HTML header. Listing 2 is a snippet of HTML that illustrates these links.


Listing 2. Illustration of links in HTML headers for discovering URIs related to Atom
<html>
<head>
  <title>Stanza Web</title>
  <link rel="service.post" type="application/x.atom+xml"
        href="http://stanzaweb.art/atom-post"
        title="Stanza Web"/>
  <link rel="service.feed" type="application/x.atom+xml"
        href="http://stanzaweb.art/atom-feed"
        title="Stanza Web"/>
</head>
<body>
...
</body>
</html>

In Listing 2, the first link element has a relationship value of service.post and provides the PostURI for the whole site, which is generally used to create new articles. The Web page for an individual article would likely have a different link to a PostURI for creating comments on that article. It might also have a link with rel="service.edit", which would provide an EditURI. Sending an HTTP GET to this URI would retrieve a representation of the content suitable for editing, and updates could be sent using an HTTP POST. The second link element has a relationship value of service.feed and provides the FeedURI. An Atom client can send an HTTP GET request to this URI to retrieve a complete Atom feed.
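
A client could discover these URIs by scanning the head of a page for link elements whose rel value begins with service. The following Python sketch shows one way to do that with the standard library's HTML parser. The class name AtomLinkFinder is hypothetical, and the sample markup is the page from Listing 2. The sketch keeps only the first link for each relationship, which is a simplification of my own.

```python
from html.parser import HTMLParser


class AtomLinkFinder(HTMLParser):
    """Collect href values of <link> elements with rel="service.*"."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = attributes.get("rel") or ""
        href = attributes.get("href")
        if rel.startswith("service.") and href:
            # Simplification: remember only the first link per relationship
            self.links.setdefault(rel, href)


SAMPLE_PAGE = """<html>
<head>
  <title>Stanza Web</title>
  <link rel="service.post" type="application/x.atom+xml"
        href="http://stanzaweb.art/atom-post"
        title="Stanza Web"/>
  <link rel="service.feed" type="application/x.atom+xml"
        href="http://stanzaweb.art/atom-feed"
        title="Stanza Web"/>
</head>
<body> ... </body>
</html>"""

finder = AtomLinkFinder()
finder.feed(SAMPLE_PAGE)
print(finder.links.get("service.post"))
print(finder.links.get("service.feed"))
```

A page for an individual article might yield a service.edit entry as well, which the same lookup would return.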

Security is not directly addressed in the Atom API specification, but one would expect an implementation to support HTTP-based authentication for the various Atom API operations.


The Atom feed

Ideally, readers will come to Stanza Web, read the articles, and decide they'd like to come back to see more. They can then point their Atom tools at the site, reading the Atom FeedURI. At any time, you can access the FeedURI to retrieve a feed for the site, usually representing the metadata of recent or changed items. Listing 3 is an example of such a feed in Atom format.


Listing 3. Example of a full Atom feed retrieved from the FeedURI
<?xml version="1.0" encoding="utf-8"?>
<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <title>Stanza Web</title>
  <link rel="alternate" type="text/html" 
        href="http://stanzaweb.art/"/>
  <modified>2004-06-01T10:11:12Z</modified>
  <author>
    <name>Uche Ogbuji</name>
  </author>
  <entry xmlns="http://purl.org/atom/ns#">
    <title>Welcome to Stanza Web</title>
    <author>
      <name>Uche Ogbuji</name>
    </author>
    <link rel="alternate" type="text/html" 
          href="http://stanzaweb.art/2004-06-01/welcome"/>
    <modified>2004-06-01T10:11:12Z</modified>
    <content type="application/xhtml+xml" xml:lang="en">
      <div class="article" xmlns="http://www.w3.org/1999/xhtml">
        <p>Welcome to
          <a href="http://stanzaweb.art/">Stanza Web</a>.
          Come back often to keep track of the best in modern poetry.
        </p>
        <p>This site is powered by
           <a href="http://atomenabled.org">Atom</a>
        </p>
      </div>
    </content>
  </entry>
</feed>

The top-level element is now feed. It contains the entry element for the article that's been added. The other child elements of feed convey metadata for the entire site. The link with rel="alternate" type="text/html" gives the readable version of the site as an alternative representation. The modified element can be used by Atom tools to determine when new content has been added to the site.
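
As a rough illustration of how a tool might use the modified element, the Python sketch below parses a feed shaped like Listing 3 and reports entries changed since the client last looked. The function names and the trimmed sample feed are my own, and the timestamp parsing assumes the UTC form with a trailing Z used in the listings. A real feed may use other time zone notations that this sketch does not handle.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Prefix mapping for the Atom 0.3 draft namespace used in Listing 3
ATOM = {"atom": "http://purl.org/atom/ns#"}


def parse_modified(text):
    """Parse a UTC timestamp such as 2004-06-01T10:11:12Z (assumed form)."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def entries_modified_since(feed_xml, last_seen):
    """Return (title, href) pairs for entries modified after last_seen."""
    feed = ET.fromstring(feed_xml)
    fresh = []
    for entry in feed.findall("atom:entry", ATOM):
        modified = parse_modified(
            entry.findtext("atom:modified", namespaces=ATOM))
        if modified > last_seen:
            title = entry.findtext("atom:title", namespaces=ATOM)
            link = entry.find("atom:link[@rel='alternate']", ATOM)
            fresh.append((title, link.get("href")))
    return fresh


# A trimmed version of Listing 3 (content element omitted for brevity)
SAMPLE_FEED = """<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <title>Stanza Web</title>
  <link rel="alternate" type="text/html" href="http://stanzaweb.art/"/>
  <modified>2004-06-01T10:11:12Z</modified>
  <entry>
    <title>Welcome to Stanza Web</title>
    <link rel="alternate" type="text/html"
          href="http://stanzaweb.art/2004-06-01/welcome"/>
    <modified>2004-06-01T10:11:12Z</modified>
  </entry>
</feed>"""

last_visit = datetime(2004, 5, 31, tzinfo=timezone.utc)
for title, href in entries_modified_since(SAMPLE_FEED, last_visit):
    print(title, href)
```

A tool could also compare the feed-level modified value first and skip the per-entry scan when nothing has changed since its last visit.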


Wrap-up

The Atom specifications need a good deal of work and some annoyances remain, such as the fact that the URL relating to a person is expressed in element content rather than in an XML attribute. There is plenty of time to fix all this as Atom works its way to full standardization. Several Atom implementations are already available for experimentation and the Atom community is quite open, in case you're inspired to contribute to the effort. And don't forget, if this article inspires you to comment, please do post your thoughts on the Thinking XML discussion forum.


