Level: Intermediate
James Snell (jasnell@us.ibm.com), Software Engineer, IBM
31 Oct 2005
Get a technical overview of a number of proposed extensions to the Atom 1.0 Syndication Format. This second of two articles discusses three proposed extensions that enable you to associate copyright licenses with feed content, control automated processing of links, and syndicate threaded discussions.
In this, the second of a two-part series exploring a number of work-in-progress
extensions to the Atom 1.0 Syndication Format, I'll focus on the topics of copyright
licensing, controlling automated downloads, and syndicating threaded discussions.
This article assumes that you have at least a basic understanding of the
Atom 1.0 syndication format and of content syndication in general. As you read
through this discussion, I recommend that you keep a copy of the Atom 1.0
specification handy as a cross-reference for the various elements discussed
(see Resources).
Please note, you should consider all of the extensions discussed here to
be works-in-progress that will continue to evolve as they navigate through
the IETF Internet Standards process. Most are fairly stable, but if you choose
to implement any of them today, you should expect some changes in the future as
the remaining details are settled.
Licensing feeds
It has become common practice in the world of syndication to associate
copyright licenses with content, the most popular use case being the association
of Creative Commons licenses.
The popular syndication formats RSS 1.0 and RSS 2.0 each have their own ways
of associating such licenses with feeds. Listing 1 illustrates an abbreviated
RSS 1.0 feed with a Creative Commons license.
Listing 1. RSS 1.0 with a Creative Commons license
<rdf:RDF xmlns:rdf="..."
xmlns:cc="http://web.resource.org/cc/"
xmlns="http://purl.org/rss/1.0/">
<channel rdf:about="...">
...
<cc:license
rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
...
</channel>
</rdf:RDF>
To facilitate the association of such licenses with Atom 1.0 feeds and
elements, an IETF Internet Draft entitled "Feed License Link Relation" (see Resources) proposes
a new Atom link relation extension. This mechanism leverages Atom's expressive
and extensible link element. Listing 2 shows an Atom 1.0 feed with a Creative
Commons license.
Listing 2. Atom 1.0 feed with a Creative Commons license
<feed xmlns="http://www.w3.org/2005/Atom">
...
<link rel="license"
type="application/rdf+xml"
href="http://creativecommons.org/licenses/by/2.5/rdf" />
...
<entry>
...
<link rel="license"
type="application/rdf+xml"
href="http://creativecommons.org/licenses/by/2.5/rdf" />
...
</entry>
</feed>
While on the surface the license link relation looks and acts in a very
similar fashion to its RSS 1.0- and RSS 2.0-based cousins, Atom has one very
distinct difference in the way licenses are inherited by items in a feed.
In both the RSS 1.0 and RSS 2.0 license modules, a license association
specified on the feed level automatically applies to all contained items. In
Atom, feed and entry elements are individually licensed -- that is, a license
link relation contained within an atom:feed element
does not automatically apply to the contained collection of
atom:entry elements. Publishers who wish to
associate licenses with individual atom:entry
elements must include a license link relation as a child of each of those entries.
The reason for this change is simple: Unlike in RSS, feeds and entries in Atom
are each considered to be first-class entities, each with its own collection
of metadata. The copyright owner of a feed may be different from the copyright
owner of an entry. Consider, for instance, a feed produced by an aggregation service.
While the aggregation service produces the feed, the entries themselves are
actually pulled from many other sources, each of which maintains its original
copyright. By requiring that feeds and entries specify their own licenses
independently of one another, Atom preserves the ability for a feed produced
by one party to contain entries produced by others.
Incidentally, this requirement raises important issues about producing
aggregate feeds with licenses that contradict the licenses specified by
individual entries. For instance, should an entry that was originally published
with a restriction forbidding derivative works ever be included in an
aggregate feed? Must an entry published with a share-alike license that does
not grant commercial use be aggregated only within a feed published under the
same terms?
As with the RSS 1.0 and RSS 2.0 modules, you can specify multiple license
link relations on a feed or entry indicating that the content has been published
under multiple copyright licenses.
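As a rough illustration of the non-inheritance rule and of multiple licenses, a feed consumer might collect license links separately for the feed and for each entry. The following Python sketch uses only the standard library; the sample feed, the entry IDs, and the function names are hypothetical and are not defined by the draft.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample: the feed and the first entry carry a license,
# the second entry carries none.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="license"
        href="http://creativecommons.org/licenses/by/2.5/rdf" />
  <entry>
    <id>tag:example.com,2005:entries/1</id>
    <link rel="license"
          href="http://creativecommons.org/licenses/by/2.5/rdf" />
  </entry>
  <entry>
    <id>tag:example.com,2005:entries/2</id>
  </entry>
</feed>"""


def license_links(element):
    """Return the href of every license link that is a direct child of element."""
    return [
        link.get("href")
        for link in element.findall(ATOM + "link")
        if link.get("rel") == "license"
    ]


def read_licenses(feed_xml):
    """Return (feed licenses, {entry id: entry licenses}).

    Feed-level licenses are deliberately NOT copied down to entries,
    mirroring the rule that feeds and entries are licensed individually.
    """
    feed = ET.fromstring(feed_xml)
    per_entry = {}
    for entry in feed.findall(ATOM + "entry"):
        per_entry[entry.findtext(ATOM + "id")] = license_links(entry)
    return license_links(feed), per_entry


if __name__ == "__main__":
    feed_licenses, per_entry = read_licenses(SAMPLE_FEED)
    print("feed:", feed_licenses)
    for entry_id, licenses in per_entry.items():
        print(entry_id, "->", licenses or "no license stated")
```

Because each result is a list, an element with several license links simply yields several entries in that list.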
Are you following me?
By associating copyright licenses with a feed or entry, feed publishers can
control how their content is distributed, reused, modified, displayed, and so on.
Additional controls are necessary to specify how to process that
content. For example, podcast feeds distribute audio content by
associating the URL of the audio content with an RSS item using an
enclosure element. Most applications that are
designed to read podcast feeds (known informally as podcatchers)
automatically download such audio files whenever they access and process a podcast
feed. Such automated downloading can cause significant strain on a server that's
hosting the content and eat away at the host's bandwidth.
To give content publishers the ability to specify their preference
that user agents not attempt to automatically download links associated with an
Atom feed or entry, an extension has been proposed as an IETF Internet Draft
entitled "Atom Link No Follow" (see Resources).
The Atom no follow extension introduces three new attributes that you can specify for the atom:link and
atom:content elements in order to control what forms
of automated processing feed readers should perform:
- follow: This attribute specifies whether or not readers should attempt to automatically follow links contained in the feed document.
- index: This attribute specifies whether or not readers should attempt to index the resource specified by a link. "Index" in this context is used in the same sense as search engines indexing the content for searching or profiling purposes.
- archive: This attribute specifies whether or not readers should attempt to archive the resource specified by a link.
One example of how you might use this mechanism is a podcast feed
in which the most recent podcast is automatically downloaded while older
shows are not. Listing 3 illustrates how to specify this choice.
Listing 3. Specifying which enclosures should not be automatically downloaded
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:nf="http://purl.org/atompub/nofollow/1.0">
...
<entry>
...
<link rel="enclosure"
href="http://www.example.com/todayspodcast.mp3"
nf:follow="yes" />
...
</entry>
<entry>
...
<link rel="enclosure"
href="http://www.example.com/yesterdayspodcast.mp3"
nf:follow="no" />
...
</entry>
<entry>
...
<link rel="enclosure"
href="http://www.example.com/oldpodcast.mp3"
nf:follow="no" />
...
</entry>
</feed>
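On the consuming side, a podcatcher could honor these preferences with a check along the following lines. This is a minimal Python sketch, not behavior mandated by the draft: the function name is hypothetical, and treating a missing nf:follow attribute as "yes" is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Namespace-qualified attribute name, using the namespace from Listing 3.
NF_FOLLOW = "{http://purl.org/atompub/nofollow/1.0}follow"

SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:nf="http://purl.org/atompub/nofollow/1.0">
  <entry>
    <link rel="enclosure"
          href="http://www.example.com/todayspodcast.mp3"
          nf:follow="yes" />
  </entry>
  <entry>
    <link rel="enclosure"
          href="http://www.example.com/yesterdayspodcast.mp3"
          nf:follow="no" />
  </entry>
</feed>"""


def enclosures_to_download(feed_xml):
    """Return enclosure URLs the publisher has not asked readers to skip."""
    feed = ET.fromstring(feed_xml)
    urls = []
    for link in feed.iter(ATOM + "link"):
        if link.get("rel") != "enclosure":
            continue
        # Assumption: an absent nf:follow attribute is treated as "yes".
        if link.get(NF_FOLLOW, "yes") == "no":
            continue
        urls.append(link.get("href"))
    return urls


if __name__ == "__main__":
    for url in enclosures_to_download(SAMPLE_FEED):
        print("would download:", url)
```

A reader that still lets the user fetch a skipped enclosure on request would remain consistent with the intent described above, since the attribute concerns automated processing only.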
Would you like to comment?
A vast majority of the syndicated content on the Web comes from weblogs and
similar conversational media in which one party creates a post and others
comment on it. In the RSS syndication world, a number of feed readers now
include a rudimentary mechanism for linking an item to a feed containing
comments that respond to that item (see Listing 4). Those readers can use that extension to
display comments on an entry as a threaded discussion.
Listing 4. The RSS Comments extension
<rss version="2.0"
xmlns:wfw="http://wellformedweb.org/CommentAPI/">
<channel>
...
<item>
...
<wfw:commentRss>http://www.example.com/comments.rss</wfw:commentRss>
</item>
</channel>
</rss>
This approach works for simple cases involving individual weblogs that need
to give readers the ability to subscribe to both the blog's main feed and
comments made on the various entries on that blog. However, it begins to break
down in scenarios that involve complex conversation threads and distributed
conversation scenarios (feeds from an unknown number of content publishers).
For example, if a colleague of mine posts an entry on his personal blog and
I want to respond to it by posting an entry on my own personal blog, the
wfw:commentRss extension provides no means of associating those entries. Feed
readers would need to resort to a rather inefficient process of either passing
around TrackBacks (see Resources) or examining the content of the entries
to determine whether or not my entry really does relate to my colleague's.
While it's good to simply have the ability to link an entry to a feed that might
contain responses to that entry, what is really needed is the ability to
explicitly mark one entry as a response to another in much the same way
that message headers build threaded discussions in e-mail.
The IETF Internet Draft entitled "Feed Thread: Enabling Threaded Entries in
Atom" (see Resources) introduces such a mechanism for use with Atom 1.0 -- see Listing 5.
Listing 5. The Atom 1.0 Comments extension
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:ft="http://purl.org/syndication/thread/1.0">
...
<link rel="replies"
type="application/atom+xml"
href="http://www.example.com/commentsfeed.xml" />
<entry>
<id>tag:example.com,2005/entries/1</id>
...
</entry>
<entry>
<id>tag:example.com,2005/entries/1/1</id>
<ft:in-reply-to idref="tag:example.com,2005/entries/1" />
...
</entry>
</feed>
The replies link relation serves the same basic
purpose as the wfw:commentRss element
shown in Listing 4 -- that is, it specifies an external location where one can find comments on the entries in this feed.
The in-reply-to element explicitly indicates that
the containing entry is a response to the identified resource. This element
can take two forms: one that uses the idref attribute
to specify a non-dereferenceable URI that can identify the resource
being responded to; and a second that uses an href
attribute to specify a dereferenceable URI that can locate the resource
being responded to.
Note that the href attributes of the
replies link relation and the in-reply-to
element are not required to point to Atom documents -- meaning it is possible for
an Atom feed to contain responses to resources such as Web pages, documents, and
e-mail and newsgroup messages.
Another advantage of the Feed Thread extension is that responses to an item
can be contained within the same feed as the original entry, in a separate feed
that's associated using the replies link relation,
or in an entirely unassociated feed, making it possible to have truly decentralized
threading of conversations.
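To make the threading idea concrete, the following Python sketch groups entries by the resource they reply to, across any number of feeds. It is only an illustration under stated assumptions: the function name and the returned data structure are hypothetical, and a real reader would also need to fetch feeds and handle entries without IDs.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "{http://www.w3.org/2005/Atom}"
# Element name in the Feed Thread namespace used in Listing 5.
IN_REPLY_TO = "{http://purl.org/syndication/thread/1.0}in-reply-to"

SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ft="http://purl.org/syndication/thread/1.0">
  <entry>
    <id>tag:example.com,2005/entries/1</id>
  </entry>
  <entry>
    <id>tag:example.com,2005/entries/1/1</id>
    <ft:in-reply-to idref="tag:example.com,2005/entries/1" />
  </entry>
</feed>"""


def replies_by_parent(feed_documents):
    """Map each replied-to URI to the IDs of the entries that respond to it.

    feed_documents is an iterable of Atom feed documents (strings), so
    replies found in unrelated feeds end up in the same thread.
    """
    replies = defaultdict(list)
    for feed_xml in feed_documents:
        feed = ET.fromstring(feed_xml)
        for entry in feed.findall(ATOM + "entry"):
            entry_id = entry.findtext(ATOM + "id")
            for ref in entry.findall(IN_REPLY_TO):
                # Either form of the element identifies the parent resource.
                parent = ref.get("idref") or ref.get("href")
                if parent:
                    replies[parent].append(entry_id)
    return dict(replies)


if __name__ == "__main__":
    for parent, children in replies_by_parent([SAMPLE_FEED]).items():
        print(parent, "<-", children)
```

Passing several feed documents to the same call is what allows replies published by different parties to be collected into one conversation.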
Putting it together
Listing 6 illustrates a single feed that uses each of the extensions
discussed in this two-part series. Taken as a whole, these extensions enable a
number of powerful features that extend the capabilities and value of the base
syndication format standard.
Listing 6. A combined example
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:fh="http://purl.org/syndication/history/1.0"
xmlns:fr="http://purl.org/syndication/index/1.0"
xmlns:fa="http://purl.org/atompub/age/1.0"
xmlns:ft="http://purl.org/syndication/thread/1.0"
xmlns:nf="http://purl.org/atompub/nofollow/1.0">
<title>My Movie Queue</title>
<link href="http://www.example.com/movies"/>
<link rel="self" href="http://www.example.com/movies/feed" />
<link rel="license"
href="http://creativecommons.org/licenses/by/2.5/rdf" />
<updated>2005-12-12T12:00:00Z</updated>
<author><name>James Snell</name></author>
<id>tag:example.com,2005:movies</id>
<fh:incremental>false</fh:incremental>
<fr:ranking-scheme
domain="http://www.example.com/movies/queue"
label="Queue"
significance="descending"
precision="0"
min-value="1" />
<fr:ranking-scheme
domain="http://www.example.com/movies/ratings"
label="Ratings"
significance="ascending"
precision="1"
min-value="0"
max-value="5" />
<fa:expires>2005-12-22T12:00:00Z</fa:expires>
<entry>
<title>Hitchhikers Guide to the Galaxy</title>
<link href="..." />
<link rel="replies"
href="http://www.example.com/movies/comments/hhgg.xml" />
<link rel="enclosure"
title="Preview"
href="http://www.example.com/movies/hhgg.mpeg"
nf:follow="no" />
<fr:rank
domain="http://www.example.com/movies/queue">1</fr:rank>
<fr:rank
domain="http://www.example.com/movies/ratings">5.0</fr:rank>
...
</entry>
<entry>
<title>Charlie Chaplin - City Lights</title>
<link href="..." />
<link rel="replies"
href="http://www.example.com/movies/comments/cccl.xml" />
<link rel="enclosure"
title="Preview"
href="http://www.example.com/movies/cccl.mpeg"
nf:follow="no" />
<fr:rank
domain="http://www.example.com/movies/queue">3</fr:rank>
<fr:rank
domain="http://www.example.com/movies/ratings">4.5</fr:rank>
...
</entry>
<entry>
<title>Buster Keaton - College</title>
<link href="..." />
<link rel="replies"
href="http://www.example.com/movies/comments/bkc.xml" />
<link rel="enclosure"
title="Preview"
href="http://www.example.com/movies/bkc.mpeg"
nf:follow="no" />
<fr:rank
domain="http://www.example.com/movies/queue">2</fr:rank>
<fr:rank
domain="http://www.example.com/movies/ratings">3.5</fr:rank>
...
</entry>
</feed>
All of the Atom 1.0 extensions discussed here were designed to provide
very specific and focused functions that go beyond the core abilities of the
syndication format. Through the creative combination of these extensions, feed
consumers and producers can create a broad range of applications that support
a variety of needs and use cases.
Resources
Learn
- "Atom 1.0 extensions, Part 1" (developerWorks, October 2005): Read the previous article in this series, which covers three proposed extensions that enable the reconstruction of feed history, the ability to order entries within a feed according to numeric rankings, and the expression of expiration timestamps for syndicated content.
- "An overview of the Atom 1.0 Syndication Format" (developerWorks, August 2005): Review James Snell's earlier article, which discusses Atom's technical strengths relative to other syndication formats and offers several compelling use case examples that illustrate those strengths.
- Atom 1.0 specification: Read it on the IETF site.
- Feed History: Enabling Incremental Syndication: Mark Nottingham's spec defines the means for reconstructing the historical content of Atom and RSS feeds. Mark has implemented Feed History support in his personal weblog's RSS 1.0 feed.
- Feed Rank specification: Explore how to specify a numeric ranking order for feed entries.
- Feed Thread: Enabling Threaded Entries in Atom: This spec introduces an Atom 1.0 mechanism that can explicitly mark one entry as being a response to another in much the same way that message headers are used to build threaded discussions in e-mail.
- Feed License Link Relation: This spec proposes a link relation extension that facilitates the association of copyright licenses with Atom 1.0 feeds and elements.
- Atom Metadata Expiration specification: Learn how to express specific expirations for feeds and entries.
- Atom Link No Follow: This proposed extension is designed to give content publishers the ability to specify that user agents should not attempt to automatically download links associated with an Atom feed or entry.
- TrackBack: This is a framework for peer-to-peer communication and notifications between Web sites.
- Creative Commons: Find out more about this nonprofit organization that offers a flexible copyright for creative work. Also, check out Uche Ogbuji's developerWorks article on the topic, "The commons of creativity" (May 2003).
- developerWorks XML zone: Find hundreds more XML resources, including tutorials, articles, tips, and standards.
- IBM Certified Developer in XML and related technologies: Find out how you can get certified.
About the author
James Snell is a member of IBM's Emerging Technologies Toolkit team. He has spent the past few years focusing on emerging Web services technologies and standards, and has been a contributor to the Atom 1.0 specification. He maintains a weblog focused on emerging technologies at http://www.ibm.com/developerworks/blogs/dw_blog.jspa?blog=351.