In this, the second of a two-part series exploring a number of work-in-progress extensions to the Atom 1.0 Syndication format, I'll focus on the topics of copyright licensing, controlling automated downloads, and syndicating threaded discussions.
This article assumes that you have at least a basic understanding of the Atom 1.0 syndication format and of content syndication in general. As you read through this discussion, I recommend that you keep a copy of the Atom 1.0 specification handy as a cross-reference for the various elements discussed (see Resources).
Please note, you should consider all of the extensions discussed here to be works-in-progress that will continue to evolve as they navigate through the IETF Internet Standards process. Most are fairly stable, but if you choose to implement any of them today, you should expect some changes in the future as the final details are discussed and finalized.
It has become common practice in the world of syndication to associate copyright licenses with content; the most popular use case is the association of Creative Commons licenses.
The popular syndication formats RSS 1.0 and RSS 2.0 each have their own ways of associating such licenses with feeds. Listing 1 illustrates an abbreviated RSS 1.0 feed with a Creative Commons license.
Listing 1. RSS 1.0 with a Creative Commons license
<rdf:RDF xmlns:rdf="..."
         xmlns:cc="http://web.resource.org/cc/"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="...">
    ...
    <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
    ...
  </channel>
</rdf:RDF>
To facilitate the association of such licenses with Atom 1.0 feeds and elements, an IETF Internet Draft entitled "Feed License Link Relation" (see Resources) proposes a new Atom link relation extension. This mechanism leverages Atom's expressive and extensible link element. Listing 2 shows an Atom 1.0 feed with a Creative Commons license.
Listing 2. Atom 1.0 feed with a Creative Commons license
<feed xmlns="http://www.w3.org/2005/Atom">
  ...
  <link rel="license" type="application/rdf+xml"
        href="http://creativecommons.org/licenses/by/2.5/rdf" />
  ...
  <entry>
    ...
    <link rel="license" type="application/rdf+xml"
          href="http://creativecommons.org/licenses/by/2.5/rdf" />
    ...
  </entry>
</feed>
While on the surface, the license link relation looks and acts in very similar fashion to its RSS 1.0- and RSS 2.0-based cousins, Atom has one very distinct difference in the way licenses are inherited by items in a feed.
In both the RSS 1.0 and RSS 2.0 license modules, a license association
specified at the feed level automatically applies to all contained items. In
Atom, feed and entry elements are individually licensed -- that is, a license
link relation contained within an atom:feed element
does not automatically apply to the contained collection of
atom:entry elements. Publishers who wish to
associate licenses with individual atom:entry
elements must include a license link relation as a child of each of those entries.
The reason for this change is simple: Unlike RSS, feeds and entries in Atom are each considered to be first-class entities, each with its own collection of metadata. The copyright owner of a feed may be different from the copyright owner of an entry. Consider, for instance, a feed produced by an aggregation service: while the service produces the feed, the entries themselves are actually pulled from many other sources, each of which maintains its original copyright. By requiring that feeds and entries specify their own licenses independently of one another, Atom preserves the ability for a feed produced by one party to contain entries produced by others.
Incidentally, this requirement raises important issues about producing aggregate feeds with licenses that contradict the licenses specified by individual entries. For instance, should an entry that was originally published with a restriction forbidding derivative works ever be included in an aggregate feed? Must an entry published under a share-alike license that does not grant commercial use be aggregated only within a feed published under the same terms?
As with the RSS 1.0 and RSS 2.0 modules, you can specify multiple license link relations on a feed or entry indicating that the content has been published under multiple copyright licenses.
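The no-inheritance rule and the possibility of multiple licenses can be made concrete with a short sketch. The following Python example uses only the standard library to collect the license links of a feed and of each entry separately. The sample feed, the entry ids, and the helper names `license_links` and `licenses_by_scope` are illustrative assumptions, not part of the Feed License draft.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed: one feed-level license, one entry with two
# licenses, and one entry with no license link at all.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="license" href="http://creativecommons.org/licenses/by/2.5/rdf" />
  <entry>
    <id>tag:example.com,2005:licensed</id>
    <link rel="license" href="http://creativecommons.org/licenses/by-nc/2.5/rdf" />
    <link rel="license" href="http://creativecommons.org/licenses/by-sa/2.5/rdf" />
  </entry>
  <entry>
    <id>tag:example.com,2005:unlicensed</id>
  </entry>
</feed>"""


def license_links(element):
    """Return the href of every rel="license" link that is a direct child."""
    return [
        link.get("href")
        for link in element.findall(ATOM + "link")
        if link.get("rel") == "license"
    ]


def licenses_by_scope(feed_xml):
    """Map "feed" and each entry id to its own licenses, with no inheritance."""
    feed = ET.fromstring(feed_xml)
    result = {"feed": license_links(feed)}
    for entry in feed.findall(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        # Only the entry's own links count; the feed-level license is
        # deliberately not copied down to the entry.
        result[entry_id] = license_links(entry)
    return result


licenses = licenses_by_scope(SAMPLE)
```

In this sketch the second entry ends up with an empty list rather than the feed's license, which mirrors the rule that a feed-level license link does not apply to the contained entries.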
By associating copyright licenses with a feed or entry, feed publishers can
control how their content is distributed, reused, modified, displayed, and so on.
Additional controls are necessary to specify how to process that
content. For example, podcast feeds distribute audio content by
associating the URL of the audio content with an RSS item using an
enclosure element. Most applications that are
designed to read podcast feeds (known informally as podcatchers)
automatically download such audio files whenever they access and process a podcast
feed. Such automated downloading can cause significant strain on a server that's
hosting the content and eat away at the host's bandwidth.
To give content publishers the ability to specify their preference that user agents not attempt to automatically download links associated with an Atom feed or entry, an extension has been proposed as an IETF Internet Draft entitled "Atom Link No Follow" (see Resources).
The Atom no follow extension introduces three new attributes that you can specify for the
atom:link and atom:content elements in order to control what forms
of automated processing feed readers should perform:
follow: This attribute specifies whether or not readers should attempt to automatically follow links contained in the feed document.
index: This attribute specifies whether or not readers should attempt to index the resource specified by a link. "Index" in this context is used in the same sense as search engines indexing content for searching or profiling purposes.
archive: This attribute specifies whether or not readers should attempt to archive the resource specified by a link.
One example of how you might use this mechanism is a podcast feed in which the most recent podcast is automatically downloaded while older shows are not. Listing 3 illustrates how to specify this choice.
Listing 3. Specifying which enclosures should not be automatically downloaded
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:nf="http://purl.org/atompub/nofollow/1.0">
  ...
  <entry>
    ...
    <link rel="enclosure" href="http://www.example.com/todayspodcast.mp3" nf:follow="yes" />
    ...
  </entry>
  <entry>
    ...
    <link rel="enclosure" href="http://www.example.com/yesterdayspodcast.mp3" nf:follow="no" />
    ...
  </entry>
  <entry>
    ...
    <link rel="enclosure" href="http://www.example.com/oldpodcast.mp3" nf:follow="no" />
    ...
  </entry>
</feed>
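On the consuming side, a podcatcher might honor this preference roughly as follows. This is a minimal Python sketch, not a reference implementation: the function name `enclosures_to_download` and the `default_follow` parameter are hypothetical, and treating a missing nf:follow attribute as "yes" is an assumption made for the example rather than something the draft is quoted as saying here.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Qualified attribute name, using the namespace URI shown in Listing 3.
NF_FOLLOW = "{http://purl.org/atompub/nofollow/1.0}follow"

SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:nf="http://purl.org/atompub/nofollow/1.0">
  <entry>
    <link rel="enclosure" href="http://www.example.com/todayspodcast.mp3" nf:follow="yes" />
  </entry>
  <entry>
    <link rel="enclosure" href="http://www.example.com/yesterdayspodcast.mp3" nf:follow="no" />
  </entry>
  <entry>
    <link rel="enclosure" href="http://www.example.com/oldpodcast.mp3" nf:follow="no" />
  </entry>
</feed>"""


def enclosures_to_download(feed_xml, default_follow=True):
    """Return enclosure URLs the publisher allows to be fetched automatically."""
    feed = ET.fromstring(feed_xml)
    urls = []
    for link in feed.iter(ATOM + "link"):
        if link.get("rel") != "enclosure":
            continue
        follow = link.get(NF_FOLLOW)
        if follow is None:
            # Assumption: with no stated preference, fall back to the default.
            allowed = default_follow
        else:
            allowed = follow.strip().lower() == "yes"
        if allowed:
            urls.append(link.get("href"))
    return urls


auto_downloads = enclosures_to_download(SAMPLE)
```

With the sample above, only the most recent show is selected for automatic download; the older enclosures remain available for the user to fetch on request.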
The vast majority of the syndicated content on the Web comes from weblogs and similar conversational media in which one party creates a post and others comment on it. In the RSS syndication world, a number of feed readers now support a rudimentary mechanism for linking an item to a feed containing comments that respond to that item (see Listing 4). Those readers can use that extension to display comments on an entry as a threaded discussion.
Listing 4. The RSS Comments extension
<rss version="2.0" xmlns:wfw="http://wellformedweb.org/CommentAPI/">
  <channel>
    ...
    <item>
      ...
      <wfw:commentRss>http://www.example.com/comments.rss</wfw:commentRss>
    </item>
  </channel>
</rss>
This approach works for simple cases involving individual weblogs that need to give readers the ability to subscribe to both the blog's main feed and comments made on the various entries on that blog. However, it begins to break down in scenarios that involve complex conversation threads and distributed conversation scenarios (feeds from an unknown number of content publishers).
For example, if a colleague of mine posts an entry on his personal blog and
I want to respond to it by posting an entry on my own personal blog, the
commentRss extension provides no means of associating those entries. Feed
readers would need to resort to a rather inefficient process of either passing
around TrackBacks (see Resources) or examining the content of the entries
to determine whether or not my entry really does relate to my colleague's.
While it's good to simply have the ability to link an entry to a feed that might
contain responses to that entry, what is really needed is the ability to
explicitly mark one entry as a response to another in much the same way
that message headers build threaded discussions in e-mail.
Listing 5. The Atom 1.0 Comments extension
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ft="http://purl.org/syndication/thread/1.0">
  ...
  <link rel="replies" type="application/atom+xml"
        href="http://www.example.com/commentsfeed.xml" />
  <entry>
    <id>tag:example.com,2005/entries/1</id>
    ...
  </entry>
  <entry>
    <id>tag:example.com,2005/entries/1/1</id>
    <ft:in-reply-to idref="tag:example.com,2005/entries/1" />
    ...
  </entry>
</feed>
In Listing 5, the replies link relation serves the same basic
purpose as the wfw:commentRss element
shown in Listing 4 -- that is, it specifies an external location where one can find comments to the entries in this feed.
The in-reply-to element explicitly indicates that
the containing entry is a response to the identified resource. This element
can take two forms: one that uses the idref attribute
to specify a non-dereferenceable URI that can identify the resource
being responded to; and a second that uses an href
attribute to specify a dereferenceable URI that can locate the resource
being responded to.
Note that the
href attribute of the
replies link relation and in-reply-to
element is not required to point to Atom documents -- meaning it is possible for
an Atom feed to contain responses to resources such as Web pages, documents, and
e-mail and newsgroup messages.
Another advantage of the Feed Thread extension is that responses to an item
can be contained within the same feed as the original entry, in a separate feed
that's associated using the
replies link relation,
or in an entirely unassociated feed, making it possible to have truly decentralized
threading of conversations.
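To illustrate how a reader might reassemble a conversation from such markup, here is a small Python sketch that groups entries by the identifier they reply to. It assumes the in-reply-to element and idref attribute exactly as they appear in Listing 5 (the draft may rename them as it evolves); the function name `build_threads` and the sample entry ids are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "{http://www.w3.org/2005/Atom}"
# Element name and namespace as used in Listing 5.
IN_REPLY_TO = "{http://purl.org/syndication/thread/1.0}in-reply-to"

SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ft="http://purl.org/syndication/thread/1.0">
  <entry>
    <id>tag:example.com,2005/entries/1</id>
  </entry>
  <entry>
    <id>tag:example.com,2005/entries/1/1</id>
    <ft:in-reply-to idref="tag:example.com,2005/entries/1" />
  </entry>
  <entry>
    <id>tag:example.com,2005/entries/1/2</id>
    <ft:in-reply-to idref="tag:example.com,2005/entries/1" />
  </entry>
</feed>"""


def build_threads(feed_xml):
    """Return (root entry ids, mapping of parent id -> list of reply ids)."""
    feed = ET.fromstring(feed_xml)
    roots = []
    replies = defaultdict(list)
    for entry in feed.findall(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        parent = entry.find(IN_REPLY_TO)
        if parent is not None and parent.get("idref"):
            # The parent need not be in this feed; only its id is recorded.
            replies[parent.get("idref")].append(entry_id)
        else:
            roots.append(entry_id)
    return roots, dict(replies)


roots, replies = build_threads(SAMPLE)
```

Because the mapping is keyed on the parent's identifier rather than on its position in a document, entries gathered from several feeds could be merged into the same structure, which is the decentralized case described above.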
Listing 6 illustrates a single feed that uses each of the extensions discussed in this two-part series. Taken as a whole, these extensions enable a number of powerful features that extend the capabilities and value of the base syndication format standard.
Listing 6. A combined example
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:fh="http://purl.org/syndication/history/1.0"
      xmlns:fr="http://purl.org/syndication/index/1.0"
      xmlns:fa="http://purl.org/atompub/age/1.0"
      xmlns:ft="http://purl.org/syndication/thread/1.0"
      xmlns:nf="http://purl.org/atompub/nofollow/1.0">
  <title>My Movie Queue</title>
  <link href="http://www.example.com/movies"/>
  <link rel="self" href="http://www.example.com/movies/feed" />
  <link rel="license" href="http://creativecommons.org/licenses/by/2.5/rdf" />
  <updated>2005-12-12T12:00:00Z</updated>
  <author><name>James Snell</name></author>
  <id>tag:example.com,2005:movies</id>
  <fh:incremental>false</fh:incremental>
  <fr:ranking-scheme domain="http://www.example.com/movies/queue" label="Queue" significance="descending" precision="0" min-value="1" />
  <fr:ranking-scheme domain="http://www.example.com/movies/ratings" label="Ratings" significance="ascending" precision="1" min-value="0" max-value="5" />
  <fa:expires>2005-12-22T12:00:00Z</fa:expires>
  <entry>
    <title>Hitchhikers Guide to the Galaxy</title>
    <link href="..." />
    <link rel="replies" href="http://www.example.com/movies/comments/hhgg.xml" />
    <link rel="enclosure" title="Preview" href="http://www.example.com/movies/hhgg.mpeg" nf:follow="no" />
    <fr:rank domain="http://www.example.com/movies/queue">1</fr:rank>
    <fr:rank domain="http://www.example.com/movies/ratings">5.0</fr:rank>
    ...
  </entry>
  <entry>
    <title>Charlie Chaplin - City Lights</title>
    <link href="..." />
    <link rel="replies" href="http://www.example.com/movies/comments/cccl.xml" />
    <link rel="enclosure" title="Preview" href="http://www.example.com/movies/cccl.mpeg" nf:follow="no" />
    <fr:rank domain="http://www.example.com/movies/queue">3</fr:rank>
    <fr:rank domain="http://www.example.com/movies/ratings">4.5</fr:rank>
    ...
  </entry>
  <entry>
    <title>Buster Keaton - College</title>
    <link href="..." />
    <link rel="replies" href="http://www.example.com/movies/comments/bkc.xml" />
    <link rel="enclosure" title="Preview" href="http://www.example.com/movies/bkc.mpeg" nf:follow="no" />
    <fr:rank domain="http://www.example.com/movies/queue">2</fr:rank>
    <fr:rank domain="http://www.example.com/movies/ratings">3.5</fr:rank>
    ...
  </entry>
</feed>
All of the Atom 1.0 extensions discussed here were designed to provide very specific and focused functions that go beyond the core abilities of the syndication format. Through the creative combination of these extensions, feed consumers and producers can build a wide range of applications that support many different needs and use cases.
"Atom 1.0 extensions, Part 1" (developerWorks, October 2005): Read the previous article in this series, which covers three proposed extensions that enable the reconstruction of feed history, the ability to order entries within a feed according to numeric rankings, and the expression of expiration timestamps for syndicated content.
"An overview of the Atom 1.0 Syndication Format" (developerWorks, August 2005): Review James Snell's earlier article, which discusses Atom's technical strengths relative to other syndication formats and offers several compelling use case examples that illustrate those strengths.
Atom 1.0 specification: Read it on the IETF site.
Feed History: Enabling Incremental Syndication: Mark Nottingham's spec defines
the means for reconstructing the historical content of Atom and RSS feeds. Mark has implemented Feed History
support in his personal weblog's RSS 1.0 feed.
Feed Rank specification: Explore how to specify a numeric ranking order for feed entries.
Feed Thread: Enabling Threaded Entries in Atom: This spec introduces an Atom 1.0 mechanism that can explicitly mark one entry as being a response to another in much the same way that message headers are used to build threaded discussions in e-mail.
Feed License Link Relation: This spec proposes a link relation extension that facilitates the association of copyright licenses with Atom 1.0 feeds and elements.
Atom Metadata Expiration specification: Learn how to express specific expirations for feeds and entries.
Atom Link No Follow:
This proposed extension is designed to give content publishers the ability to specify that user agents should not attempt to automatically download links associated with an Atom feed or entry.
TrackBack: This is a framework for peer-to-peer communication and notifications between Web sites.
Creative Commons: Find out more about this nonprofit organization that offers a flexible copyright for creative work. Also, check out Uche Ogbuji's developerWorks article on the topic, "The commons of creativity" (May 2003).
developerWorks XML zone: Find hundreds more XML resources, including tutorials, articles, tips, and standards.
Certified Developer in XML and related technologies: Find out how you can get certified.
Get products and technologies
developerWorks content feeds: Get RSS or Atom feeds of our content.
James Snell is a member of IBM's Emerging Technologies Toolkit team. He has spent the past few years focusing on emerging web services technologies and standards, and has been a contributor to the Atom 1.0 specification. He maintains a weblog focused on emerging technologies at http://www.ibm.com/developerworks/blogs/page/jasnell.