Atom 1.0 Extensions, Part 2: Copyright licenses, automated processing of links, and syndicating threads

An overview of proposed extensions to the Atom 1.0 Syndication Format

Level: Intermediate

James Snell (jasnell@us.ibm.com), Software Engineer, IBM

31 Oct 2005

Get a technical overview of a number of proposed extensions to the Atom 1.0 Syndication Format. This second of two articles discusses three proposed extensions that enable you to associate copyright licenses with feed content, control automated processing of links, and syndicate thread discussions.

In this, the second of a two-part series exploring a number of work-in-progress extensions to the Atom 1.0 Syndication format, I'll focus on the topics of copyright licensing, controlling automated downloads, and syndicating threaded discussions.

This article assumes that you have at least a basic understanding of the Atom 1.0 syndication format and of content syndication in general. As you read through this discussion, I recommend that you keep a copy of the Atom 1.0 specification handy as a cross-reference for the various elements discussed (see Resources).

Note that all of the extensions discussed here are works in progress that will continue to evolve as they move through the IETF Internet Standards process. Most are fairly stable, but if you implement any of them today, expect some changes as the remaining details are settled.

Licensing feeds

It has become common practice in the world of syndication to associate copyright licenses with content, most often Creative Commons licenses.

The popular syndication formats RSS 1.0 and RSS 2.0 each have their own ways of associating such licenses with feeds. Listing 1 illustrates an abbreviated RSS 1.0 feed with a Creative Commons license.


Listing 1. RSS 1.0 with a Creative Commons license

<rdf:RDF xmlns:rdf="..." 
         xmlns:cc="http://web.resource.org/cc/"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="...">
    ...
    <cc:license 
     rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
    ...
  </channel>
</rdf:RDF>

To facilitate the association of such licenses with Atom 1.0 feeds and elements, an IETF Internet Draft entitled "Feed License Link Relation" (see Resources) proposes a new Atom link relation extension. This mechanism leverages Atom's expressive and extensible link element. Listing 2 shows an Atom 1.0 feed with a Creative Commons license.


Listing 2. Atom 1.0 feed with a Creative Commons license

<feed xmlns="http://www.w3.org/2005/Atom">
  ...
  <link rel="license" 
        type="application/rdf+xml"
        href="http://creativecommons.org/licenses/by/2.5/rdf" />
  ...
  <entry>
    ...
    <link rel="license" 
          type="application/rdf+xml"
          href="http://creativecommons.org/licenses/by/2.5/rdf" />
    ...
  </entry>
</feed>

On the surface, the license link relation looks and acts much like its RSS 1.0 and RSS 2.0 cousins, but Atom differs in one important way: how licenses are inherited by items in a feed.

In both the RSS 1.0 and RSS 2.0 license modules, a license association specified on the feed level automatically applies to all contained items. In Atom, feed and entry elements are individually licensed -- that is, a license link relation contained within an atom:feed element does not automatically apply to the contained collection of atom:entry elements. Publishers who wish to associate licenses with individual atom:entry elements must include a license link relation as a child of each of those entries.

The reason for this change is simple: Unlike in RSS, feeds and entries in Atom are each considered first-class entities, each with its own collection of metadata. The copyright owner of a feed may be different from the copyright owner of an entry. Consider, for instance, a feed produced by an aggregation service. While the aggregation service produces the feed, the entries themselves are pulled from many other sources, each of which maintains its original copyright. By requiring that feeds and entries specify their own licenses independently of one another, Atom preserves the ability for a feed produced by one party to contain entries produced by others.

Incidentally, this requirement raises important questions about producing aggregate feeds whose licenses contradict the licenses specified by individual entries. For instance, should an entry that was originally published with a restriction forbidding derivative works ever be included in an aggregate feed? Must an entry published under a share-alike license that does not grant commercial use be aggregated only within a feed published under the same terms?

As with the RSS 1.0 and RSS 2.0 modules, you can specify multiple license link relations on a feed or entry indicating that the content has been published under multiple copyright licenses.
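To make the inheritance rule concrete, here is a minimal Python sketch of how a consumer might collect license links. It assumes the standard library's xml.etree.ElementTree; the function name license_links, the sample feed, and the example.com license URL are illustrative and do not come from the draft. The sketch looks only at the direct children of each element, so a feed-level license is never copied down to an entry.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed: the first entry carries two licenses,
# the second entry carries none of its own.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="license"
        href="http://creativecommons.org/licenses/by/2.5/rdf"/>
  <entry>
    <id>tag:example.com,2005:entries/1</id>
    <link rel="license"
          href="http://creativecommons.org/licenses/by/2.5/rdf"/>
    <link rel="license"
          href="http://www.example.com/licenses/custom"/>
  </entry>
  <entry>
    <id>tag:example.com,2005:entries/2</id>
  </entry>
</feed>"""


def license_links(element):
    """Return the hrefs of license links that are direct children of element."""
    return [
        link.get("href")
        for link in element.findall(ATOM + "link")  # direct children only
        if link.get("rel") == "license"
    ]


feed = ET.fromstring(SAMPLE)
feed_licenses = license_links(feed)

# Each entry is examined on its own; nothing is inherited from the feed.
entry_licenses = {
    entry.findtext(ATOM + "id"): license_links(entry)
    for entry in feed.findall(ATOM + "entry")
}
```

Here the second entry ends up with an empty list even though the feed itself is licensed, which is the behavior described above. The first entry yields two hrefs, one per license link relation.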




Are you following me?

By associating copyright licenses with a feed or entry, feed publishers can control how their content is distributed, reused, modified, displayed, and so on. Additional controls are necessary to specify how to process that content. For example, podcast feeds distribute audio content by associating the URL of the audio content with an RSS item using an enclosure element. Most applications that are designed to read podcast feeds (known informally as podcatchers) automatically download such audio files whenever they access and process a podcast feed. Such automated downloading can cause significant strain on a server that's hosting the content and eat away at the host's bandwidth.

To give content publishers the ability to specify their preference that user agents not attempt to automatically download links associated with an Atom feed or entry, an extension has been proposed as an IETF Internet Draft entitled "Atom Link No Follow" (see Resources).

The Atom no follow extension introduces three new attributes that you can specify for the atom:link and atom:content elements in order to control what forms of automated processing feed readers should perform:

  • follow: This attribute specifies whether or not readers should attempt to automatically follow links contained in the feed document.
  • index: This attribute specifies whether or not readers should attempt to index the resource specified by a link. "Index" is used here in the sense of a search engine indexing content for searching or profiling purposes.
  • archive: This attribute specifies whether or not readers should attempt to archive the resource specified by a link.

One example of how you might use this mechanism is a podcast feed in which the most recent podcast is automatically downloaded while older shows are not. Listing 3 illustrates how to specify this choice.


Listing 3. Specifying which enclosures should not be automatically downloaded

<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:nf="http://purl.org/atompub/nofollow/1.0">
  ...
  <entry>
    ...
    <link rel="enclosure" 
          href="http://www.example.com/todayspodcast.mp3"
          nf:follow="yes" />
    ...
  </entry>
  <entry>
    ...
    <link rel="enclosure" 
          href="http://www.example.com/yesterdayspodcast.mp3"
          nf:follow="no" />
    ...
  </entry>
  <entry>
    ...
    <link rel="enclosure" 
          href="http://www.example.com/oldpodcast.mp3"
          nf:follow="no" />
    ...
  </entry>
</feed>
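The following Python sketch shows how a podcatcher might honor these attributes. It rests on several assumptions of mine that are not confirmed by the draft: that a missing attribute means the action is permitted, that any value other than "yes" is treated as a refusal, and that the helper names allows and enclosures_to_download and the untagged.mp3 URL are purely illustrative.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
NF = "{http://purl.org/atompub/nofollow/1.0}"

# Hypothetical sample based on Listing 3, plus one link with no nf:follow.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:nf="http://purl.org/atompub/nofollow/1.0">
  <entry>
    <link rel="enclosure"
          href="http://www.example.com/todayspodcast.mp3"
          nf:follow="yes"/>
  </entry>
  <entry>
    <link rel="enclosure"
          href="http://www.example.com/yesterdayspodcast.mp3"
          nf:follow="no"/>
  </entry>
  <entry>
    <link rel="enclosure"
          href="http://www.example.com/untagged.mp3"/>
  </entry>
</feed>"""


def allows(link, attribute, default=True):
    """Check one of the follow, index, or archive attributes on a link.

    Assumption: an absent attribute falls back to `default`, and any
    value other than "yes" is treated conservatively as a refusal.
    """
    value = link.get(NF + attribute)
    if value is None:
        return default
    return value.strip().lower() == "yes"


def enclosures_to_download(feed):
    """Return enclosure URLs the publisher has not asked readers to skip."""
    return [
        link.get("href")
        for link in feed.iter(ATOM + "link")
        if link.get("rel") == "enclosure" and allows(link, "follow")
    ]


feed = ET.fromstring(SAMPLE)
downloads = enclosures_to_download(feed)
```

The same allows helper can be called with "index" or "archive" before a reader indexes or stores a linked resource. A more cautious reader could pass default=False so that only explicitly permitted links are processed.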




Would you like to comment?

The vast majority of the syndicated content on the Web comes from weblogs and similar conversational media in which one party creates a post and others comment on it. In the RSS syndication world, a number of feed readers now support a rudimentary mechanism for linking an item to a feed containing comments that respond to that item (see Listing 4). Those readers can use that extension to display the comments on an entry as a threaded discussion.


Listing 4. The RSS Comments extension

<rss version="2.0"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/">
  <channel>
    ...
    <item>
      ...
      <wfw:commentRss>http://www.example.com/comments.rss</wfw:commentRss>
    </item>
  </channel>
</rss>

This approach works for simple cases involving individual weblogs that need to give readers the ability to subscribe to both the blog's main feed and comments made on the various entries on that blog. However, it begins to break down in scenarios that involve complex conversation threads and distributed conversation scenarios (feeds from an unknown number of content publishers).

For example, if a colleague of mine posts an entry on his personal blog and I want to respond to it by posting an entry on my own personal blog, the wfw:commentRss extension provides no means of associating those entries. Feed readers would need to resort to a rather inefficient process of either passing around TrackBacks (see Resources) or examining the content of the entries to determine whether or not my entry really does relate to my colleague's. While it's good to simply have the ability to link an entry to a feed that might contain responses to that entry, what is really needed is the ability to explicitly mark one entry as a response to another in much the same way that message headers build threaded discussions in e-mail.

The IETF Internet Draft entitled "Feed Thread: Enabling Threaded Entries in Atom" (see Resources) introduces such a mechanism for use with Atom 1.0 -- see Listing 5.


Listing 5. The Atom 1.0 Comments extension

<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ft="http://purl.org/syndication/thread/1.0">
  ...
  <link rel="replies" 
        type="application/atom+xml"
        href="http://www.example.com/commentsfeed.xml" />
  <entry>
    <id>tag:example.com,2005:entries/1</id>
    ...
  </entry>
  <entry>
    <id>tag:example.com,2005:entries/1/1</id>
    <ft:in-reply-to idref="tag:example.com,2005:entries/1" />
    ...
  </entry>
</feed>

The replies link relation serves the same basic purpose as the wfw:commentRss element shown in Listing 4 -- that is, it specifies an external location where one can find comments to the entries in this feed.
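Because the two mechanisms play the same role, a reader can treat them uniformly. The Python sketch below, which assumes the standard library's xml.etree.ElementTree and a helper name (comment_feed_urls) of my own choosing, gathers comment feed locations from either an RSS document that uses wfw:commentRss or an Atom document that uses the replies link relation.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
WFW = "{http://wellformedweb.org/CommentAPI/}"

# Trimmed-down versions of Listing 4 and Listing 5.
RSS_SAMPLE = """<rss version="2.0"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/">
  <channel>
    <item>
      <wfw:commentRss>http://www.example.com/comments.rss</wfw:commentRss>
    </item>
  </channel>
</rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="replies"
        type="application/atom+xml"
        href="http://www.example.com/commentsfeed.xml"/>
</feed>"""


def comment_feed_urls(root):
    """Collect comment feed URLs from an RSS or Atom document."""
    # RSS: the URL is the text content of wfw:commentRss.
    urls = [
        el.text.strip()
        for el in root.iter(WFW + "commentRss")
        if el.text and el.text.strip()
    ]
    # Atom: the URL is the href of a link whose rel is "replies".
    urls += [
        link.get("href")
        for link in root.iter(ATOM + "link")
        if link.get("rel") == "replies" and link.get("href")
    ]
    return urls


rss_urls = comment_feed_urls(ET.fromstring(RSS_SAMPLE))
atom_urls = comment_feed_urls(ET.fromstring(ATOM_SAMPLE))
```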

The in-reply-to element explicitly indicates that the containing entry is a response to the identified resource. This element can take two forms: one that uses the idref attribute to specify a non-dereferenceable URI that can identify the resource being responded to; and a second that uses an href attribute to specify a dereferenceable URI that can locate the resource being responded to.

Note that the href attributes of the replies link relation and the in-reply-to element are not required to point to Atom documents -- meaning it is possible for an Atom feed to contain responses to resources such as Web pages, documents, and e-mail and newsgroup messages.

Another advantage of the Feed Thread extension is that responses to an item can be contained within the same feed as the original entry, in a separate feed that's associated using the replies link relation, or in an entirely unassociated feed, making it possible to have truly decentralized threading of conversations.
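As a rough illustration of decentralized threading, the Python sketch below merges in-reply-to references from two unrelated feeds into a single reply index and then renders one thread as indented lines. The feed contents, the example.org identifiers, and the function names index_replies and render_thread are hypothetical. The attribute names idref and href follow the draft as described in this article and may change in later revisions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "{http://www.w3.org/2005/Atom}"
FT = "{http://purl.org/syndication/thread/1.0}"

# Hypothetical feed holding an original entry and one reply to it.
MAIN_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ft="http://purl.org/syndication/thread/1.0">
  <entry><id>tag:example.com,2005:entries/1</id></entry>
  <entry>
    <id>tag:example.com,2005:entries/1/1</id>
    <ft:in-reply-to idref="tag:example.com,2005:entries/1"/>
  </entry>
</feed>"""

# Hypothetical, unassociated feed from another publisher.
OTHER_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ft="http://purl.org/syndication/thread/1.0">
  <entry>
    <id>tag:example.org,2005:blog/7</id>
    <ft:in-reply-to idref="tag:example.com,2005:entries/1/1"/>
  </entry>
  <entry>
    <id>tag:example.org,2005:blog/8</id>
    <ft:in-reply-to href="http://www.example.com/page.html"/>
  </entry>
</feed>"""


def index_replies(*feeds):
    """Map each parent identifier to the ids of entries that reply to it."""
    replies = defaultdict(list)
    for feed in feeds:
        for entry in feed.iter(ATOM + "entry"):
            entry_id = entry.findtext(ATOM + "id")
            for ref in entry.findall(FT + "in-reply-to"):
                # Prefer the identifier form, else fall back to the locator.
                parent = ref.get("idref") or ref.get("href")
                if parent:
                    replies[parent].append(entry_id)
    return dict(replies)


def render_thread(root_id, replies, depth=0, seen=None):
    """Return the thread under root_id as a list of indented lines."""
    seen = set() if seen is None else seen
    if root_id in seen:  # guard against circular reply chains
        return []
    seen.add(root_id)
    lines = ["  " * depth + root_id]
    for child in replies.get(root_id, []):
        lines.extend(render_thread(child, replies, depth + 1, seen))
    return lines


replies = index_replies(ET.fromstring(MAIN_FEED), ET.fromstring(OTHER_FEED))
thread = render_thread("tag:example.com,2005:entries/1", replies)
```

The entry from example.org appears in the thread even though the two feeds never reference each other, because the association is carried entirely by the in-reply-to element. The entry that responds to a Web page is indexed under that page's URL in the same way.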




Putting it together

Listing 6 illustrates a single feed that uses each of the extensions discussed in this two-part series. Taken as a whole, these extensions enable a number of powerful features that extend the capabilities and value of the base syndication format standard.


Listing 6. A combined example

<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:fh="http://purl.org/syndication/history/1.0"
      xmlns:fr="http://purl.org/syndication/index/1.0"
      xmlns:fa="http://purl.org/atompub/age/1.0"
      xmlns:ft="http://purl.org/syndication/thread/1.0"
      xmlns:nf="http://purl.org/atompub/nofollow/1.0">
  <title>My Movie Queue</title>
  <link href="http://www.example.com/movies"/>
  <link rel="self" href="http://www.example.com/movies/feed" />
  <link rel="license" 
        href="http://creativecommons.org/licenses/by/2.5/rdf" />
  <updated>2005-12-12T12:00:00Z</updated>
  <author><name>James Snell</name></author>
  <id>tag:example.com,2005:movies</id>
  <fh:incremental>false</fh:incremental>
  <fr:ranking-scheme
    domain="http://www.example.com/movies/queue"
    label="Queue"
    significance="descending"
    precision="0"
    min-value="1" />
  <fr:ranking-scheme
    domain="http://www.example.com/movies/ratings"
    label="Ratings"
    significance="ascending"
    precision="1"
    min-value="0"
    max-value="5" />
  <fa:expires>2005-12-22T12:00:00Z</fa:expires>
  <entry>
  <title>Hitchhikers Guide to the Galaxy</title>
  <link href="..." />
  <link rel="replies" 
        href="http://www.example.com/movies/comments/hhgg.xml" />
  <link rel="enclosure"
        title="Preview"
        href="http://www.example.com/movies/hhgg.mpeg" 
        nf:follow="no" />
  <fr:rank 
  domain="http://www.example.com/movies/queue">1</fr:rank>
  <fr:rank 
  domain="http://www.example.com/movies/ratings">5.0</fr:rank>
  ...
  </entry>
  <entry>
  <title>Charlie Chaplin - City Lights</title>
  <link href="..." />
  <link rel="replies" 
        href="http://www.example.com/movies/comments/cccl.xml" />
  <link rel="enclosure"
        title="Preview"
        href="http://www.example.com/movies/cccl.mpeg" 
        nf:follow="no" />
  <fr:rank 
  domain="http://www.example.com/movies/queue">3</fr:rank>
  <fr:rank 
  domain="http://www.example.com/movies/ratings">4.5</fr:rank>
  ...
  </entry>
  <entry>
  <title>Buster Keaton - College</title>
  <link href="..." />
  <link rel="replies" 
        href="http://www.example.com/movies/comments/bkc.xml" />
  <link rel="enclosure"
        title="Preview"
        href="http://www.example.com/movies/bkc.mpeg" 
        nf:follow="no" />
  <fr:rank 
  domain="http://www.example.com/movies/queue">2</fr:rank>
  <fr:rank 
  domain="http://www.example.com/movies/ratings">3.5</fr:rank>
    ...
  </entry>
</feed>

All of the Atom 1.0 extensions discussed here were designed to provide specific, focused functions that go beyond the core abilities of the syndication format. By combining these extensions creatively, feed producers and consumers can build a broad range of applications that address many different needs and use cases.



Resources

Learn
  • "Atom 1.0 extensions, Part 1" (developerWorks, October 2005): Read the previous article in this series, which covers three proposed extensions that enable the reconstruction of feed history, the ability to order entries within a feed according to numeric rankings, and the expression of expiration timestamps for syndicated content.

  • "An overview of the Atom 1.0 Syndication Format" (developerWorks, August 2005): Review James Snell's earlier article which discusses Atom's technical strengths relative to other syndication formats and offers several compelling use case examples that illustrate those strengths.

  • Atom 1.0 specification: Read it on the IETF site.

  • Feed History: Enabling Incremental Syndication: Mark Nottingham's spec defines the means for reconstructing the historical content of Atom and RSS feeds. Mark has implemented Feed History support in his personal weblog's RSS 1.0 feed.

  • Feed Rank specification: Explore how to specify a numeric ranking order for feed entries.

  • Feed Thread: Enabling Threaded Entries in Atom: This spec introduces an Atom 1.0 mechanism that can explicitly mark one entry as being a response to another in much the same way that message headers are used to build threaded discussions in e-mail.

  • Feed License Link Relation: This spec proposes a link relation extension that facilitates the association of copyright licenses with Atom 1.0 feeds and elements.

  • Atom Metadata Expiration specification: Learn how to express specific expirations for feeds and entries.

  • Atom Link No Follow: This proposed extension is designed to give content publishers the ability to specify that user agents should not attempt to automatically download links associated with an Atom feed or entry.

  • TrackBack: This is a framework for peer-to-peer communication and notifications between Web sites.

  • Creative Commons: Find out more about this nonprofit organization that offers a flexible copyright for creative work. Also, check out Uche Ogbuji's developerWorks article on the topic, "The commons of creativity" (May 2003).

  • developerWorks XML zone: Find hundreds more XML resources, including tutorials, articles, tips, and standards.

  • IBM Certified Developer in XML and related technologies: Find out how you can get certified.


About the author

James Snell is a member of IBM's Emerging Technologies Toolkit team. He has spent the past few years focusing on emerging Web services technologies and standards, and has been a contributor to the Atom 1.0 specification. He maintains a weblog focused on emerging technologies at http://www.ibm.com/developerworks/blogs/dw_blog.jspa?blog=351.



