Level: Introductory
Uche Ogbuji (uche@ogbuji.net), Partner, Zepheira
02 Oct 2007
Firefox 2.0 brought several important changes in its XML support. It's currently reaching its peak in user deployment. Learn about updated XML features in Firefox 2.0, including a controversial change to the handling of RSS Web feeds.
Web browsers are perhaps the hottest sort of software right now, given their
emerging role as the new application platform. These are particularly exciting
times for software development, what with the re-emergence of dynamic HTML
technologies as Asynchronous JavaScript + XML (Ajax), the revival of Microsoft® Internet Explorer® development, and more. Over the past couple of years the
developerWorks series on XML in Firefox (see Resources) has
covered version 1.5, which builds on version 1.8 of the core Mozilla Gecko browser
engine. The relentless pace of development in the Mozilla project has since led to
the release of Firefox 2.0, building on the Gecko 1.8.1 Web rendering engine. Some
of the developments in Firefox 2.0 touch on XML processing. In this article you'll
learn what's new in Firefox XML processing, including one major potential snag that developers should keep in mind.
Less control over Web feeds
One change in Firefox 2.0 has caused a bit of consternation in the user community. If you host a Web feed such as RSS or Atom, you might reference an XSLT stylesheet in order to turn that feed into some other representation for the user. Listing 1 is a sample of an Atom feed that references such a transform.
Listing 1. Atom feed with stylesheet reference
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xml" href="atom2html.xslt"?>
<feed xmlns="http://www.w3.org/2005/Atom"
xml:lang="en"
xml:base="http://www.example.org">
<id>http://www.example.org/myfeed</id>
<title>My Simple Feed</title>
<updated>2005-07-15T12:00:00Z</updated>
<link href="/blog" />
<link rel="self" href="/myfeed" />
<author><name>Uche Ogbuji</name></author>
<entry>
<id>http://www.example.org/entries/1</id>
<title>A simple blog entry</title>
<link href="/blog/2005/07/1" />
<updated>2005-07-14T12:00:00Z</updated>
<summary>This is a simple blog entry</summary>
</entry>
<entry>
<id>http://www.example.org/entries/2</id>
<title />
<link href="/blog/2005/07/2" />
<updated>2005-07-15T12:00:00Z</updated>
<summary>This is simple blog entry without a title</summary>
</entry>
</feed>
The key is the second line, the stylesheet processing instruction (PI). If you view
this in Firefox 1.5, the browser dutifully loads atom2html.xslt and displays the results. You have to view source to
see the actual XML, as discussed in Part 2 of this article series (see Resources). In Firefox 2.0, the browser ignores the stylesheet PI
and uses a custom Firefox view, shown in Figure 1 (a screen capture of Firefox 2.0.0.6 on Mac OS X).
Figure 1. Built-in Web feed view for Firefox 2.0
The only way to work around this and force the use of your chosen stylesheet is to fool the simple heuristic Firefox uses to detect Web feeds, which sniffs the first 512 bytes of the file for the words "rss" or "feed". Listing 2 uses the well-known workaround of inserting a comment designed to pad out those 512 bytes.
Listing 2. Atom feed with workaround for Firefox 2.0 and Internet Explorer 7 stylesheet default handling
<?xml version="1.0" encoding="utf-8"?>
<!-- Firefox 2.0 and Internet Explorer 7 use simplistic feed sniffing to override desired
presentation behavior for this feed, and thus we are obliged to insert this comment, a
bit of a waste of bandwidth, unfortunately. This should ensure that the following
stylesheet processing instruction is honored by these new browser versions. For some more
background you might want to visit the following bug report:
https://bugzilla.mozilla.org/show_bug.cgi?id=338621
-->
<?xml-stylesheet type="text/xml" href="atom2html.xslt"?>
<feed xmlns="http://www.w3.org/2005/Atom"
xml:lang="en"
xml:base="http://www.example.org">
<!-- content of the feed identical to listing 1, so trimmed -->
</feed>
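The effect of the padding comment can be checked with a short script. The following is a minimal Python sketch, not Firefox's actual code. It assumes the sniffer looks for the opening tags `<rss` or `<feed` rather than the bare words (the comment in Listing 2 itself contains the word "feed"), and the function name and sample documents are hypothetical.

```python
SNIFF_WINDOW = 512  # number of leading bytes examined, per the article

# Assumption: the sniffer matches opening tags, not bare words.
FEED_MARKERS = (b"<rss", b"<feed")


def looks_like_feed(document: bytes) -> bool:
    """Return True if a feed marker appears in the first SNIFF_WINDOW bytes."""
    head = document[:SNIFF_WINDOW]
    return any(marker in head for marker in FEED_MARKERS)


XML_DECL = b'<?xml version="1.0" encoding="utf-8"?>\n'
STYLESHEET_PI = b'<?xml-stylesheet type="text/xml" href="atom2html.xslt"?>\n'
FEED_START = b'<feed xmlns="http://www.w3.org/2005/Atom">\n</feed>\n'

# Listing 1 style: the root element starts well inside the window.
plain_feed = XML_DECL + STYLESHEET_PI + FEED_START

# Listing 2 style: a comment pushes the root element past the window.
padding = b"<!-- " + b"x" * SNIFF_WINDOW + b" -->\n"
padded_feed = XML_DECL + padding + STYLESHEET_PI + FEED_START

print(looks_like_feed(plain_feed))
print(looks_like_feed(padded_feed))
```

Running a check like this against your own feed before publishing is a cheap way to confirm that the comment is long enough, since the XML declaration also counts toward the 512 bytes.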
After considerable debate in the user community, the Firefox developers decided to
stand their ground, and for now the behavior is expected to remain the same in future Firefox versions. I personally dislike this behavior, but you can read the debate and decide for yourself whether it's appropriate. One thing worth mentioning is that the new behavior is similar to that of Internet Explorer and Apple Safari.
Microsummaries
Microsummaries, also called Live Titles, are a neat new feature in Firefox 2.0 where
you instruct the browser to substitute some useful content from a Web site in place
of its title, particularly in bookmarks. For example, a microsummary of IBM developerWorks might show the title of the latest article on the site rather than the static text "developerWorks : IBM's resource for developers". A Web site can offer a microsummary, or the user can create one. The latter case is known as a "microsummary generator", and is a more interesting matter for this article because it requires XML and XSLT processing on the part of the user (those who are not XML-savvy can reuse generators produced by others). Listing 3 is a microsummary generator that extracts the title of the main feature article on developerWorks.
Listing 3. Microsummary generator using the title of the main feature article on IBM developerWorks
<?xml version="1.0" encoding="UTF-8"?>
<generator xmlns="http://www.mozilla.org/microsummaries/0.1"
name="IBM developerWorks featured article">
<template>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:html="http://www.w3.org/1999/xhtml">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:text>Featured article: </xsl:text>
<!-- On sites that make wider use of element IDs
you can use more direct and efficient XPaths -->
<xsl:value-of select="//html:a[@class='feature'][1]"/>
</xsl:template>
</xsl:transform>
</template>
<pages>
<include>http://www.ibm.com/developerworks/[a-zA-Z0-9]*/?</include>
</pages>
</generator>
The generator has two sections: the template and the page information.
The former contains XSLT to be applied to the Web page to extract the
microsummary text. The latter specifies on which pages the browser should
use the microsummary. Microsummaries are simple text, so the output instruction is given accordingly. The crux of the microsummary is the XPath //html:a[@class='feature'][1], which looks for the element that contains the title of the featured article.
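To see what such an XPath selects, here is a minimal Python sketch that uses the standard library's `xml.etree.ElementTree` rather than the browser's XSLT engine. The sample XHTML page is invented for illustration and is not the real developerWorks markup. ElementTree supports only a subset of XPath, so the sketch relies on `find()`, which returns the first match in document order, in place of the `[1]` predicate.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: invented markup, not the real developerWorks home page.
SAMPLE_PAGE = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <a class="nav" href="/zones">Zones</a>
    <div>
      <a class="feature" href="/a1">First featured article</a>
      <a class="feature" href="/a2">Second featured article</a>
    </div>
  </body>
</html>
"""

# Same prefix-to-namespace binding as the xmlns:html declaration in Listing 3.
NAMESPACES = {"html": "http://www.w3.org/1999/xhtml"}


def featured_title(page_source: str) -> str:
    """Return the text of the first link whose class attribute is 'feature'."""
    root = ET.fromstring(page_source)
    link = root.find(".//html:a[@class='feature']", NAMESPACES)
    if link is None or link.text is None:
        return ""
    return link.text


print(featured_title(SAMPLE_PAGE))
```

The namespace binding matters: without the `html` prefix mapped to the XHTML namespace, the expression would not match elements in an XHTML document.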
In the pages section, a regular expression makes the microsummary applicable to the main site page and to the main page of each developerWorks zone.
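The include pattern can be tried out against a few URLs before installing the generator. This is a minimal Python sketch under one loud assumption: that the pattern is meant to cover the whole URL. The article does not say whether Firefox anchors the expression, so `fullmatch` here is a stand-in for the intended behavior, and the test URLs are hypothetical.

```python
import re

# Pattern copied from the <include> element of Listing 3.
INCLUDE_PATTERN = re.compile(r"http://www.ibm.com/developerworks/[a-zA-Z0-9]*/?")


def applies_to(url: str) -> bool:
    """Return True if the whole URL matches the include pattern (assumed anchoring)."""
    return INCLUDE_PATTERN.fullmatch(url) is not None


for url in (
    "http://www.ibm.com/developerworks/",
    "http://www.ibm.com/developerworks/xml/",
    "http://www.ibm.com/developerworks/xml/library/",
):
    print(url, applies_to(url))
```

Under that assumption the first two URLs (the site page and a zone page) match, while the deeper library path does not. If the expression turns out to be applied unanchored, adding `^` and `$` to the pattern would make the intent explicit.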
See Resources for a tutorial that includes instructions
for installing microsummary generators such as Listing 3.
For now, microsummaries are a Mozilla-only feature.
SAX and more
In a development that is mostly of interest to those who develop Mozilla
extensions, there is now a SAX parser framework for the XPCOM component system of
Mozilla. This should allow people to develop extensions that process XML
efficiently when none of the other higher-level processing technologies is suitable.
XPCOM integration means you can handle SAX events with C++ or JavaScript code, or with any other language with XPCOM bindings.
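For readers who have not used SAX, the event-driven style looks roughly like the following. This is a sketch using Python's standard `xml.sax` module, not the Mozilla XPCOM interfaces, whose names and signatures are not covered here. The handler class and the sample feed are hypothetical.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect the text of every <title> element as parse events arrive."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # list of text chunks while inside <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate rather than assign.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


FEED = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>My Simple Feed</title>
  <entry><title>A simple blog entry</title></entry>
  <entry><title /></entry>
</feed>
"""

handler = TitleCollector()
xml.sax.parseString(FEED, handler)
print(handler.titles)
```

The parser never builds a tree; the handler keeps only the state it needs, which is why SAX suits large documents or extensions that care about a few elements.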
OpenSearch
OpenSearch is an XML standard developed at the Amazon A9 incubator. It provides
several XML formats and other conventions to describe and use search engines. Firefox has always had strong support for extensible search engine plug-ins, and version 2.0 introduces OpenSearch support so that search features can be extended using facilities that are also compatible with Internet Explorer and other browsers.
Firefox supports OpenSearch 1.1, which is presently in beta, so it's possible that
updates will be required to maintain compatibility between Firefox and OpenSearch. Listing 4 is an example of an OpenSearch description document for IBM developerWorks.
Listing 4. OpenSearch description document for IBM developerWorks
<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
<ShortName>IBM developerWorks search</ShortName>
<Description>Search IBM developerWorks zones</Description>
<Tags>xml java architecture</Tags>
<InputEncoding>utf-8</InputEncoding>
<Contact>https://www.ibm.com/developerworks/secure/feedback.jsp</Contact>
<!-- The template attribute is split at the "?" for formatting purposes -->
<Url type="text/html"
template="http://www.ibm.com/developerworks/views/xml/libraryview.jsp?
search_by={searchTerms}"/>
<Attribution>All content Copyright 2007, IBM developerWorks</Attribution>
</OpenSearchDescription>
This document simply says that the IBM developerWorks site offers a search URL at
http://www.ibm.com/developerworks/views/xml/libraryview.jsp?search_by={searchTerms}
where {searchTerms} is a template parameter that the search tools
will replace with the search terms. So to search for "Firefox XML" the URL
becomes
http://www.ibm.com/developerworks/views/xml/libraryview.jsp?search_by=Firefox+XML
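The substitution can be sketched in a few lines of Python. This is only an illustration of the idea, not how Firefox implements it: it handles just the {searchTerms} parameter and ignores the other parameters and optional markers that the OpenSearch template syntax allows. The function name is hypothetical.

```python
from urllib.parse import quote_plus

# Template from Listing 4, rejoined into a single string.
TEMPLATE = (
    "http://www.ibm.com/developerworks/views/xml/libraryview.jsp"
    "?search_by={searchTerms}"
)


def fill_template(template: str, search_terms: str) -> str:
    """Replace the {searchTerms} parameter with the URL-encoded query."""
    return template.replace("{searchTerms}", quote_plus(search_terms))


print(fill_template(TEMPLATE, "Firefox XML"))
```

`quote_plus` encodes the space as `+`, which yields the same URL shown above for the "Firefox XML" query.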
The OpenSearch specification defines this URL template system. OpenSearch also defines a convention of returning search results as RSS 2.0 or Atom 1.0 feeds, with a few special extensions. Firefox does not yet support such Web feed search results, and any description that does not have a Url element with type="text/html" (representing the returned content type from the URL) will result in an error. This limitation is unfortunate, but probably a realistic consequence of the fact that most people search through classic HTML forms and result pages, rather than through Web 2.0 mechanisms.
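Given that restriction, it may be worth checking a description document for a text/html Url element before publishing it. The following is a minimal Python sketch using `xml.etree.ElementTree`. The helper name is hypothetical, the first document is a trimmed copy of Listing 4, and the second is an invented feed-only description with a placeholder URL.

```python
import xml.etree.ElementTree as ET

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"

# Trimmed copy of Listing 4.
HTML_DESCRIPTION = """\
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>IBM developerWorks search</ShortName>
  <Description>Search IBM developerWorks zones</Description>
  <Url type="text/html"
       template="http://www.ibm.com/developerworks/views/xml/libraryview.jsp?search_by={searchTerms}"/>
</OpenSearchDescription>
"""

# Invented example that offers only Atom results.
FEED_ONLY_DESCRIPTION = """\
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Feed-only search</ShortName>
  <Description>Hypothetical description without an HTML results URL</Description>
  <Url type="application/atom+xml"
       template="http://www.example.org/search?q={searchTerms}"/>
</OpenSearchDescription>
"""


def html_templates(description_xml: str) -> list:
    """Return the template of every Url element whose type is text/html."""
    root = ET.fromstring(description_xml)
    urls = root.findall("{%s}Url" % OPENSEARCH_NS)
    return [url.get("template") for url in urls if url.get("type") == "text/html"]


print(html_templates(HTML_DESCRIPTION))
print(html_templates(FEED_ONLY_DESCRIPTION))
```

An empty list for a description signals that, according to the limitation described above, Firefox 2.0 would reject it.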
In Firefox 2.0, an OpenSearch description such as in Listing 4 serves as a complete
search engine plug-in. A Web site can specify this description by using a link from page headers such as:
<link rel="search" type="application/opensearchdescription+xml"
title="IBM developerWorks"
href="/path/to/opensearch/description/document.xml"/>
Note: The three preceding short code examples normally appear as single lines.
For display and print purposes only, they are broken into multiple lines.
Wrap up
Even more significant XML features will come in Firefox 3.0, which is in alpha
testing, with a full release expected in the first half of 2008. Firefox 3.0 includes some very significant bug fixes and new features for XML processing, and I'll continue to cover these once it becomes the main Firefox version for general use. Mozilla's core toolkit for XML continues to improve, and this shows clear benefit for developers and users who deal with XML technologies. The Web browser is the face of XML processing for most users and developers, so this series will continue to track and detail features within the latest Firefox versions.
Resources
Learn
- Updated developer features for Firefox 2.0: Check out features new to Firefox 2.0, some of which touch on XML.
- Updated developer features for Firefox 3.0: Keep up with the changes and functions, including those that relate to XML, in the next version of Firefox.
- XML in Firefox 1.5: Review the three articles in this recent developerWorks series:
- Controversial new Firefox 2.0 change: Track and discuss the change that makes Firefox ignore stylesheet links in many XML Web feeds.
- Microsummaries: Learn more about these regularly updated short summaries of Web pages that are new in Firefox 2.0.
- SAX support in Firefox 2.0: Explore how extension developers can use the Simple API for XML (SAX) parsing API with XUL applications and extensions.
- SVG in Firefox 2.0: Check out the level of SVG support now in Firefox 2.0.
- Creating OpenSearch plugins for Firefox: Learn to create OpenSearch-compatible search plugins that support additional Firefox-specific features, such as search suggestions and the SearchForm element.
- Introducing OpenSearch (Uche Ogbuji, O'Reilly xml.com, July 2007): Learn more about OpenSearch, a collection of simple formats for the sharing of search results.
- developerWorks search help: Learn about effective search strategies, how to enter search queries, and use of operators on IBM developerWorks.
- IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
- XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- The technology bookstore: Browse for books on these and other technical topics.
Get products and technologies
- Firefox: Get the Mozilla-based Web browser that offers standards compliance, performance, security, and solid XML features. The current version is 2.0.0.6.
- IBM trial software: Build your next development project with trial software available for download directly from developerWorks.
About the author
Uche Ogbuji is a partner at Zepheira, LLC, a solutions firm specializing in the next generation of Web technologies. Mr. Ogbuji is lead developer of 4Suite, an open source platform for XML, RDF, and knowledge-management applications, and lead developer of the Versa RDF query language. He is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado. You can find more about Mr. Ogbuji at his Weblog, Copia.