Thinking XML: Firefox 2.0 and XML

Learn how the latest Firefox release updates XML processing

Level: Introductory

Uche Ogbuji (uche@ogbuji.net), Partner, Zepheira

02 Oct 2007

Firefox 2.0 brought several important changes to its XML support, and the release is now reaching its peak in user deployment. Learn about updated XML features in Firefox 2.0, including a controversial change to the handling of RSS Web feeds.

Web browsers are perhaps the hottest sort of software right now, given their emerging role as the new application platform. These are particularly exciting times for software development, what with the re-emergence of dynamic HTML technologies as Asynchronous JavaScript + XML (Ajax), the revival of Microsoft® Internet Explorer® development, and more. Over the past couple of years the developerWorks series on XML in Firefox (see Resources) has covered version 1.5, which builds on version 1.8 of the core Mozilla Gecko browser engine. The relentless pace of development in the Mozilla project has since led to the release of Firefox 2.0, building on the Gecko 1.8.1 Web rendering engine. Some of the developments in Firefox 2.0 touch on XML processing. In this article you'll learn what's new in Firefox XML processing, including one major potential snag that developers should keep in mind.

Less control over Web feeds

One change in Firefox 2.0 has caused a bit of consternation in the user community. If you host a Web feed such as RSS or Atom, you might reference an XSLT stylesheet in order to turn that feed into some other representation for the user. Listing 1 is a sample of an Atom feed that references such a transform.


Listing 1. Atom feed with stylesheet reference
                
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xml" href="atom2html.xslt"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xml:lang="en"
      xml:base="http://www.example.org">
  <id>http://www.example.org/myfeed</id>
  <title>My Simple Feed</title>
  <updated>2005-07-15T12:00:00Z</updated>
  <link href="/blog" />
  <link rel="self" href="/myfeed" />
  <author><name>Uche Ogbuji</name></author>
  <entry>
    <id>http://www.example.org/entries/1</id>
    <title>A simple blog entry</title>
    <link href="/blog/2005/07/1" />
    <updated>2005-07-14T12:00:00Z</updated>
    <summary>This is a simple blog entry</summary>
  </entry>
  <entry>
    <id>http://www.example.org/entries/2</id>
    <title />
    <link href="/blog/2005/07/2" />
    <updated>2005-07-15T12:00:00Z</updated>
    <summary>This is simple blog entry without a title</summary>
  </entry>
</feed>

The second line, the stylesheet processing instruction (PI), is key. If you view this in Firefox 1.5, the browser dutifully loads atom2html.xslt and displays the results. You have to view source to see the actual XML, as discussed in Part 2 of this article series (see Resources). In Firefox 2.0, the browser ignores the stylesheet PI and uses a custom Firefox view, shown in Figure 1 (a screen capture of Firefox 2.0.0.6 on Mac OS X).


Figure 1. Built-in Web feed view for Firefox 2.0

The only way to work around this and force the use of your chosen stylesheet is to fool the simple heuristic Firefox uses to check for Web feeds, which involves sniffing the first 512 bytes of the file for the words "rss" or "feed". Listing 2 uses the well-known workaround of inserting a comment designed to pad out these 512 bytes, pushing the feed markup beyond the sniffed range.


Listing 2. Atom feed with workaround for Firefox 2.0 and Internet Explorer 7 stylesheet default handling
                
<?xml version="1.0" encoding="utf-8"?>
<!-- Firefox 2.0 and Internet Explorer 7 use simplistic feed sniffing to override desired
presentation behavior for this feed, and thus we are obliged to insert this comment, a
bit of a waste of bandwidth, unfortunately. This should ensure that the following
stylesheet processing instruction is honored by these new browser versions. For some more
background you might want to visit the following bug report:
https://bugzilla.mozilla.org/show_bug.cgi?id=338621
-->
<?xml-stylesheet type="text/xml" href="atom2html.xslt"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xml:lang="en"
      xml:base="http://www.example.org">
<!-- content of the feed identical to listing 1, so trimmed -->
</feed>
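To make the padding trick concrete, here is a minimal Python sketch of a sniffer of this kind. It is not Firefox's actual code. It assumes the check looks for the opening of an rss or feed element (the markers `<rss` and `<feed`) rather than the bare words, since the padding comment in Listing 2 itself contains the word "feed". The names `looks_like_feed` and `pad_with_comment` are hypothetical.

```python
SNIFF_WINDOW = 512  # number of leading bytes inspected, per the article

# Assumption: the sniffer looks for the start of a feed root element.
FEED_MARKERS = (b"<rss", b"<feed")


def looks_like_feed(document: bytes) -> bool:
    """Return True if a feed marker appears in the first SNIFF_WINDOW bytes."""
    head = document[:SNIFF_WINDOW]
    return any(marker in head for marker in FEED_MARKERS)


def pad_with_comment(document: bytes, filler: bytes = b"x") -> bytes:
    """Insert a comment after the XML declaration that is long enough to
    push everything that follows past the sniffed window."""
    decl_end = document.index(b"?>") + 2
    comment = b"\n<!-- " + filler * SNIFF_WINDOW + b" -->"
    return document[:decl_end] + comment + document[decl_end:]


PLAIN = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b'<?xml-stylesheet type="text/xml" href="atom2html.xslt"?>\n'
    b'<feed xmlns="http://www.w3.org/2005/Atom"></feed>\n'
)
PADDED = pad_with_comment(PLAIN)

print(looks_like_feed(PLAIN))   # the root element is inside the window
print(looks_like_feed(PADDED))  # the root element is pushed past it
```

The sketch pads with filler characters only to keep the arithmetic obvious; a real feed would carry a readable explanation, as Listing 2 does.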

After considerable debate in the user community, the Firefox developers decided to stand their ground, and as things stand, the behavior will be the same in future Firefox versions. I personally dislike this behavior, but you can read the debate and decide for yourself whether or not it's appropriate. One thing worth mentioning is that the new behavior is similar to that of Internet Explorer and Apple Safari.

Microsummaries

Microsummaries, also called Live Titles, are a neat new feature in Firefox 2.0 that lets you instruct the browser to substitute some useful content from a Web site in place of its title, particularly in bookmarks. For example, a microsummary of IBM developerWorks might show the title of the latest article on the site rather than the static text "developerWorks : IBM's resource for developers". A Web site can offer a microsummary, or the user can create one. The latter case uses a "microsummary generator", and is the more interesting matter for this article because it requires XML and XSLT processing on the part of the user (those who are not XML-savvy can reuse generators produced by others). Listing 3 is a microsummary generator that extracts the title of the main feature article on developerWorks.


Listing 3. Microsummary generator using the title of the main feature article on IBM developerWorks
                
<?xml version="1.0" encoding="UTF-8"?>
<generator xmlns="http://www.mozilla.org/microsummaries/0.1" 
  name="IBM developerWorks featured article">
  <template>
    <xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
        xmlns:html="http://www.w3.org/1999/xhtml">
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:text>Featured article: </xsl:text>
        <!-- On sites that make wider use of element IDs
        you can use more direct and efficient XPaths -->
        <xsl:value-of select="//html:a[@class='feature'][1]"/>
      </xsl:template>
    </xsl:transform>
  </template>
  <pages>
    <include>http://www.ibm.com/developerworks/[a-zA-Z0-9]*/?</include>
  </pages>
</generator>

The generator has two sections: the template and the page information. The former contains XSLT to be applied to the Web page to extract the microsummary text. The latter specifies the pages on which the browser should use the microsummary. Microsummaries are simple text, so the output method is set to text. The crux of the microsummary is the XPath //html:a[@class='feature'][1], which looks for the element that contains the title of the featured article. In the pages section, a regular expression makes the microsummary applicable to the main site page and to the main page of each developerWorks zone.
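The two halves of the generator can be approximated outside the browser with a short Python sketch. It swaps XSLT for the limited XPath support in Python's xml.etree.ElementTree, and it assumes both that the page is well-formed XHTML and that the include pattern is matched against the whole URL; how Firefox itself anchors the pattern is not covered here. The sample page and the function names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Pattern copied from the <include> element of Listing 3.
INCLUDE = r"http://www.ibm.com/developerworks/[a-zA-Z0-9]*/?"


def applies_to(url: str) -> bool:
    """Assumption: the pattern must match the entire URL."""
    return re.fullmatch(INCLUDE, url) is not None


def microsummary(xhtml: str) -> str:
    """Mimic the template: label plus the text of the first a.feature link."""
    root = ET.fromstring(xhtml)
    # find() returns the first match in document order, like value-of does.
    link = root.find(f".//{{{XHTML_NS}}}a[@class='feature']")
    title = "".join(link.itertext()).strip() if link is not None else ""
    # A space follows the label here purely for readability.
    return "Featured article: " + title


# Hypothetical page, not actual developerWorks markup.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <a class="nav" href="/developerworks/xml/">XML zone</a>
    <a class="feature" href="/first">First <em>featured</em> title</a>
    <a class="feature" href="/second">Second featured title</a>
  </body>
</html>"""

print(microsummary(SAMPLE_PAGE))
print(applies_to("http://www.ibm.com/developerworks/xml/"))
```

Under the whole-URL assumption the pattern accepts the site root and a single zone segment but rejects deeper article pages, which is the behavior the text describes.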

See Resources for a tutorial that includes instructions for installing microsummary generators such as Listing 3. For now, microsummaries are a Mozilla-only feature.

SAX and more

In a development that is mostly of interest to those who develop Mozilla extensions, there is now a SAX parser framework for the XPCOM component system of Mozilla. This should allow people to develop extensions that process XML efficiently when none of the other, higher-level processing technologies is suitable. XPCOM integration means you can handle SAX events with C++ or JavaScript code, or with any other language with XPCOM bindings.
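For readers who have not used SAX, the event-driven style looks roughly like the following sketch. It uses Python's standard xml.sax module, not the Mozilla XPCOM interfaces, so the class and method names here say nothing about the Mozilla API; only the general pattern of start, character, and end callbacks carries over. The handler name and the trimmed feed are illustrative.

```python
import xml.sax


class EntryTitleHandler(xml.sax.ContentHandler):
    """Collect the text of each <title> that sits inside an <entry>."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_entry = False
        self._buffer = None  # text chunks while inside an entry title

    def startElement(self, name, attrs):
        if name == "entry":
            self._in_entry = True
        elif name == "title" and self._in_entry:
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None
        elif name == "entry":
            self._in_entry = False


# A trimmed version of the feed in Listing 1.
FEED = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>My Simple Feed</title>
  <entry><title>A simple blog entry</title></entry>
  <entry><title /></entry>
</feed>"""

handler = EntryTitleHandler()
xml.sax.parseString(FEED, handler)
print(handler.titles)
```

Because the handler keeps only a small amount of state and never builds a tree, memory use stays flat regardless of document size, which is the efficiency argument for SAX.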



OpenSearch

OpenSearch is an XML standard developed at the Amazon A9 incubator. It provides several XML formats and other conventions to describe and use search engines. Firefox has always had strong support for extensible search engine plug-ins, and version 2.0 introduces OpenSearch support so that search features can be extended using facilities that are also compatible with Internet Explorer and other browsers.

Firefox supports OpenSearch 1.1, which is presently in beta, so description documents might need updates to stay compatible as the specification and Firefox's support for it evolve. Listing 4 is an example of an OpenSearch description document for IBM developerWorks.


Listing 4. OpenSearch description document for IBM developerWorks
                
<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
      <ShortName>IBM developerWorks search</ShortName>
      <Description>Search IBM developerWorks zones</Description>
      <Tags>xml java architecture</Tags>
      <InputEncoding>utf-8</InputEncoding>
      <Contact>https://www.ibm.com/developerworks/secure/feedback.jsp</Contact>
      <Url type="text/html"
       template="http://www.ibm.com/developerworks/views/xml/libraryview.jsp?search_by={searchTerms}"/>
      <Attribution>All content Copyright 2007, IBM developerWorks</Attribution>
</OpenSearchDescription>

This document simply says that the IBM developerWorks site offers a search URL at

http://www.ibm.com/developerworks/views/xml/libraryview.jsp?search_by={searchTerms}

where {searchTerms} is a template parameter that the search tools will replace with the search terms. So to search for "Firefox XML" the URL becomes

http://www.ibm.com/developerworks/views/xml/libraryview.jsp?search_by=Firefox+XML

The OpenSearch specification defines this URL template system. OpenSearch also defines a convention of returning search results as RSS 2.0 or Atom 1.0 feeds, with a few special extensions. Firefox does not yet support such Web feed search results, and any description that does not have a Url element with type="text/html" (representing the returned content type from the URL) will result in an error. This limitation is unfortunate, but probably a realistic consequence of the fact that most people search through classic HTML forms and result pages, rather than through Web 2.0 mechanisms.
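The template substitution and the text/html requirement described above can be sketched in a few lines of Python. This is an illustration of the idea, not how Firefox implements it: the function name `html_search_url` is hypothetical, the description is a trimmed copy of Listing 4, and form-style encoding (spaces as "+") is assumed because it reproduces the example URL given earlier.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote_plus

OS_NS = "http://a9.com/-/spec/opensearch/1.1/"

# Trimmed copy of Listing 4, with the template on a single line.
DESCRIPTION = """<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>IBM developerWorks search</ShortName>
  <Url type="text/html"
       template="http://www.ibm.com/developerworks/views/xml/libraryview.jsp?search_by={searchTerms}"/>
</OpenSearchDescription>"""


def html_search_url(description: str, terms: str) -> str:
    """Fill {searchTerms} in the first Url element of type text/html."""
    root = ET.fromstring(description)
    for url in root.findall(f"{{{OS_NS}}}Url"):
        if url.get("type") == "text/html":
            # Assumption: spaces are encoded as "+", as in the example URL.
            return url.get("template").replace("{searchTerms}", quote_plus(terms))
    # Mirrors the limitation described above: no HTML result URL, no plug-in.
    raise ValueError('no Url element with type="text/html"')


print(html_search_url(DESCRIPTION, "Firefox XML"))
```

A description that offers only feed-typed Url elements falls through to the error branch, which corresponds to the limitation noted in the text.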

In Firefox 2.0, an OpenSearch description such as in Listing 4 serves as a complete search engine plug-in. A Web site can specify this description by using a link from page headers such as:

<link rel="search" type="application/opensearchdescription+xml" 
title="IBM developerWorks" 
href="/path/to/opensearch/description/document.xml"/>

Note: The three preceding short code examples normally appear as single lines. For display and print purposes only, they are broken into multiple lines.
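As a rough illustration of how a client could discover such a link, the following Python sketch scans a page for link elements with rel="search" and the OpenSearch description media type. It uses the standard html.parser module, and the class name and sample page are hypothetical; it makes no claim about how Firefox performs the discovery internally.

```python
from html.parser import HTMLParser

OSD_TYPE = "application/opensearchdescription+xml"


class SearchLinkFinder(HTMLParser):
    """Collect (title, href) pairs for OpenSearch description links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and rel == "search" and attributes.get("type") == OSD_TYPE:
            self.links.append((attributes.get("title"), attributes.get("href")))


# Hypothetical page header using the link element shown above.
PAGE = """<html><head>
<title>developerWorks</title>
<link rel="stylesheet" type="text/css" href="/style.css"/>
<link rel="search" type="application/opensearchdescription+xml"
      title="IBM developerWorks"
      href="/path/to/opensearch/description/document.xml"/>
</head><body></body></html>"""

finder = SearchLinkFinder()
finder.feed(PAGE)
print(finder.links)
```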



Wrap up


Even more significant XML features will come in Firefox 3.0, which is in alpha testing; expect a full release in the first half of 2008. That release includes some very significant bug fixes and new features for XML processing, and I'll continue to cover these once it becomes the main Firefox version for general use. Mozilla's core toolkit for XML continues to improve, and this shows clear benefit for developers and users who deal with XML technologies. The Web browser is the face of XML processing for most users and developers, so this series will continue to track and detail features within the latest Firefox versions.



Resources

Learn

Get products and technologies
  • Firefox: Get the Mozilla-based Web browser that offers standards compliance, performance, security, and solid XML features. The current version is 2.0.0.6.

  • IBM trial software: Build your next development project with trial software available for download directly from developerWorks.


About the author

Uche Ogbuji is a partner at Zepheira, LLC, a solutions firm specializing in the next generation of Web technologies. Mr. Ogbuji is lead developer of 4Suite, an open source platform for XML, RDF, and knowledge-management applications, and lead developer of the Versa RDF query language. He is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado. You can find more about Mr. Ogbuji at his Weblog, Copia.





IBM is a trademark of IBM Corporation in the United States, other countries, or both. Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.