XML in Firefox 1.5, Part 2: Basic XML processing

Do a lot with XML in Firefox, but watch out for some basic limitations

This second article in the series, "XML in Firefox 1.5," focuses on basic XML processing. Firefox supports XML parsing, Cascading Stylesheets (CSS), and XSLT stylesheets. You also want to be aware of some limitations. In the first article of this series, "XML in Firefox 1.5, Part 1: Overview of XML features," Uche Ogbuji looked briefly at the different XML-related facilities in Firefox.


Uche Ogbuji (uche@ogbuji.net), Principal Consultant, Fourthought Inc.

Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is also a lead developer of the Versa RDF query language. He is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can find more about Mr. Ogbuji at his Weblog, Copia, or contact him at uche@ogbuji.net.



21 March 2006


A few important things have happened in the Firefox world since my article "XML in Firefox 1.5, Part 1: Overview of XML features" was first published in September 2005 (updated March 2006). Chief among these are the release of Firefox 1.5 and, after it, the release of the current version, 1.5.0.1, a minor bug-fix update. Firefox 1.5.0.2 is due for release in early April. In this article, I offer a detailed look at the basics of XML processing in Firefox. I include Firefox screenshots, all of which were created using Firefox 1.5.0.1 on Ubuntu Linux with a fresh profile (that is, with no extensions and with default options as installed).

Parsing 101

The most basic thing you can do with Firefox and XML is to load an XML file in an unknown vocabulary with no associated stylesheet. Listing 1 is such a file.

Listing 1 (listing1.xml). Simple XML file example
<memo>
<date form="iso-8601">2002-08-14</date>
This is just to <strong>say</strong>:
I ate the eggs you left in the fridge
And were probably saving for breakfast.
Do you know?  They were <emph>quite</emph> rotten.
</memo>

Viewing this file with Firefox yields the display in Figure 1.

Figure 1. View Listing 1 in Firefox
Default XML view in Firefox

Pay close attention to the message at the top of the browser area, in particular the sentence "The document tree is shown below." This underscores that you should not consider this a source view of the XML. It is simply a logical layout of the parts of the document Firefox cares about. It omits details that might matter to you, though not to Firefox, and it introduces some distortion of the document. As an example of distortion, notice that Firefox puts each element on a new line, even though the source document does not. For a document such as this one, which uses mixed content, this is a significant rearrangement of the content. As an example of Firefox's omitting details, try to view Listing 2 in the browser.

Listing 2 (listing2.xml). Simple XML file example with namespaces and more
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE memo>
<memo xmlns='http://example.com'>
<date form="iso-8601">2002-08-14</date>
This is just to <strong>say</strong>:
I ate the eggs you left in the fridge
And were probably saving for breakfast.
Do you know?  They were <emph>quite</emph> rotten.
</memo>

The display of Listing 2 is precisely the same as that of Listing 1, so the XML declaration, document type declaration and namespace declarations are all omitted from the display. If you do want to see the original XML in all its glory, use the view-source function. From the menu bar, select View, then Page Source. The usual shortcut is Ctrl+U. You can also use the context menu (right click) from the main browser pane. The view-source display is shown in Figure 2, a perfect match to the source listing.

Figure 2. View-source display of Listing 2 in Firefox
View-source XML view in Firefox

Go back to the display in Figure 1 and notice the minus sign next to the memo element's opening tag. Each container element has such a marker, and you can click on it to collapse or fold that element. This can be useful when debugging, if you want to pack away parts of the XML file that do not interest you at the moment.

Parsing errors

To demonstrate Firefox's treatment of ill-formed documents, I added a bogus character reference, &#1; (some characters are illegal in XML 1.0, even when expressed as character references), just before the date element of Listing 2 and viewed the result in Firefox. Figure 3 shows the Firefox output, reporting the error and the location at which it was detected.

Figure 3. Firefox display of ill-formed XML.
XML parse error display in Firefox

Notice how just enough of the source file is displayed to pinpoint the error. You can always see the full source document again using the view-source feature.
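
As a rough illustration, the ill-formed variant of Listing 2 could look like the following sketch. The exact position of the reference is an assumption based on the description above ("just before the date element"). The reference &#1; names the control character U+0001, which XML 1.0 does not allow anywhere in a document, so a conforming parser has to stop with a fatal error when it reaches that line.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE memo>
<memo xmlns='http://example.com'>
<!-- Assumed placement: the next line is deliberately ill-formed -->
&#1;<date form="iso-8601">2002-08-14</date>
This is just to <strong>say</strong>:
I ate the eggs you left in the fridge
And were probably saving for breakfast.
Do you know?  They were <emph>quite</emph> rotten.
</memo>
```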

Types of XML

XML is just a base format with which you can build more specific formats, and Firefox uses special processing and rendering for prominent XML formats it happens to support. I touched on some of these in the previous article, including XHTML, Scalable Vector Graphics (SVG), and XSLT. The primary means for Firefox to determine whether a browsed resource is XML, and if so whether it's some special form of XML, is the internet media type (commonly known as the MIME type). A Web server sends MIME type information for every resource delivered to the browser. For files opened from your local file system, the browser guesses the MIME type based on the file's extension. Table 1 summarizes the XML-related MIME types that are recognized by a clean Firefox install.

Table 1. XML MIME types handled by Firefox

| MIME type | Handling by Firefox | Notes |
| --- | --- | --- |
| text/xml | Apply CSS, if available, otherwise default XML handler | Avoid this media type, if possible |
| application/xml | Apply CSS, if available, otherwise default XML handler | |
| application/*+xml | Apply CSS, if available, otherwise default XML handler | Applies to any media type using the convention for XML |
| application/xhtml+xml | Render as XHTML, according to doctype declaration | This includes vocabularies (such as MathML, XLink, and even SVG) that Firefox recognizes when embedded within XHTML |
| application/vnd.mozilla.xul+xml | Mozilla chrome handler (for custom Mozilla UIs) | |
| application/rdf+xml | Default RDF renderer (displays all text literal objects) | Firefox does use RDF in its configuration registries |
| image/svg+xml | SVG renderer | |

Some Firefox extensions allow Firefox to recognize additional media types. You can also add an application handler for some XML format. For example, if you want to handle VoiceXML with a voice browser, you register that application with the MIME type application/voicexml+xml.

You can always check the MIME type that Firefox associates with a page once it loads the page. Right click on the page, and select View page info from the context menu. Firefox uses text/xml for files with a .xml extension, although according to current best practice it should use application/xml.
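
To make the application/xhtml+xml row of Table 1 more concrete, here is a minimal sketch of an XHTML document with an SVG fragment embedded by namespace. The drawing and the title are hypothetical, and the sketch assumes the document reaches Firefox as application/xhtml+xml (for a local file, that probably means an extension such as .xhtml, which is an assumption about the local mapping rather than something stated above).

```xml
<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Inline SVG in XHTML</title>
  </head>
  <body>
    <p>A circle drawn with SVG embedded in XHTML:</p>
    <!-- The SVG namespace is what lets the browser recognize this vocabulary -->
    <svg xmlns="http://www.w3.org/2000/svg" width="120" height="120">
      <circle cx="60" cy="60" r="50" fill="orange" stroke="black"/>
    </svg>
  </body>
</html>
```

You can then check the media type Firefox actually assigned to the page through the page info dialog described above.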


A series of unfortunate limitations

Firefox does not support a few XML facilities you might use. When you design pages with Firefox in mind, you want to know its limitations in basic XML processing. Many of these limitations are acknowledged in bug reports and enhancement requests in Mozilla's bug tracker ("bugzilla"). You'll find links to many of these in Resources. You can vote for a bug or enhancement request to be given higher priority by Mozilla developers, so if these limitations affect you, please consider getting a Mozilla bug tracker account (very simple to obtain) and voting for resolution.

The first limitation to mention is that whenever Firefox parses an XML file, it pretty much hangs up that thread of processing until the parse is complete. This means that if you send Firefox a very large XML file, your users might have a long wait before they see anything happen. If you send Firefox a large HTML file, it uses incremental rendering to display the HTML bit by bit as it reads it. It would be nice to have the equivalent capability for XML, but for now, just consider the size of XML files that you send the browser.

Firefox does not support DTD validation. It doesn't read DTDs in external files, and it doesn't use declarations within the document itself (called the internal subset) for validation either. As for reading external files, Firefox does not read any external entities at all, whether parameter entities (such as DTDs and DTD fragments) or general entities (external, well-formed XML fragments). This means that Listing 3 is logically processed by Firefox the same way as Listing 4, regardless of the contents of extFile.ent.

Listing 3. XML file that uses an external parsed entity
    <!DOCTYPE myXML[
    <!ENTITY extFile SYSTEM "extFile.ent">
    ]>
    <myXML>&extFile;</myXML>

Listing 4. XML without entities that is logically treated by Firefox identically to Listing 3
    <myXML></myXML>

Support of such external entities does have possible security implications, and possible performance implications, but both have workarounds, and I hope Firefox addresses these limitations soon.
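
The lack of validation can be seen with a document that contradicts its own internal subset. In the sketch below (the element names are hypothetical), the declarations allow only a date child inside memo, so a validating parser would reject the document. Going by the description above, Firefox should check only well-formedness and display the document without complaint.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE memo [
  <!-- The internal subset says memo contains exactly one date element -->
  <!ELEMENT memo (date)>
  <!ELEMENT date (#PCDATA)>
]>
<memo>
  <!-- Well-formed, but invalid: unexpected is neither declared nor allowed here -->
  <unexpected>surprise</unexpected>
</memo>
```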

If you happen to use RDF/XML, be aware that Firefox does not employ well-formedness checks when parsing RDF. As a consequence, Firefox processes RSS 1.0 Web feeds (which are RDF) without regard to well-formedness. This is unfortunate, because the Web community is trying to increase the enforcement of well-formed Web feeds.


Elements of style

The easiest way to get Firefox to render arbitrary XML in a non-generic way is to use a stylesheet. Firefox supports cascading stylesheets and XSLT. I won't dwell too much on the use of these technologies in Firefox because IBM developerWorks already offers a set of in-depth tutorials on the topic. See Resources for more details. One thing I shall mention here is that you must ensure that any stylesheets are loaded from the same Internet domain as the source XML document, otherwise Firefox will not load and apply the stylesheets. This security restriction is to avoid cross-site scripting attacks (XSS).
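
A stylesheet is usually attached with an xml-stylesheet processing instruction placed before the root element. The sketch below attaches a CSS stylesheet to the memo from Listing 1. The file name memo.css is hypothetical, and it is referenced by a relative URL so that it loads from the same location as the XML document.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical stylesheet name; a relative href keeps it on the same domain -->
<?xml-stylesheet type="text/css" href="memo.css"?>
<memo>
<date form="iso-8601">2002-08-14</date>
This is just to <strong>say</strong>:
I ate the eggs you left in the fridge
And were probably saving for breakfast.
Do you know?  They were <emph>quite</emph> rotten.
</memo>
```

To apply an XSLT transform instead, the same instruction would point at the transform, for example with type="text/xsl" and an href such as memo.xsl (again a hypothetical name). For CSS, keep in mind that elements in an unknown vocabulary have no built-in styling, so the stylesheet needs to say which elements are blocks, for instance by giving memo and date a display: block rule.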

For Firefox-specific XSLT

Even in an established standard such as XSLT, precise behavior can differ across engines. If you need to specify a section of XSLT to execute only under Mozilla and thus Firefox, or indeed under any form of Transformiix, which is the XSLT engine bundled with Mozilla, use a conditional block such as in Listing 5.

Listing 5. Example of a Mozilla-specific code block in XSLT
<!-- xsl:vendor identifies the XSLT engine; Mozilla's engine reports Transformiix -->
<xsl:if test="system-property('xsl:vendor')='Transformiix'">
  <xsl:text>This will only be output by Firefox/Mozilla/Transformiix</xsl:text>
</xsl:if>

Each XSLT engine reports its own value for the xsl:vendor system property, and if necessary you can use xsl:choose instead of xsl:if to provide sections specific to each one.
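
A minimal sketch of the xsl:choose approach follows, written as a complete stylesheet. The Transformiix value comes from Listing 5. The libxslt value is an assumption about what that engine reports, so check the value your own processor returns before relying on it. The output messages are purely illustrative.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Look up the vendor string once and reuse it in each branch -->
    <xsl:variable name="vendor" select="system-property('xsl:vendor')"/>
    <xsl:choose>
      <xsl:when test="$vendor = 'Transformiix'">
        <xsl:text>Running under Firefox/Mozilla/Transformiix</xsl:text>
      </xsl:when>
      <!-- Assumed vendor string for libxslt; verify before use -->
      <xsl:when test="$vendor = 'libxslt'">
        <xsl:text>Running under libxslt</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <!-- Fall back to reporting whatever the engine calls itself -->
        <xsl:text>Running under another engine: </xsl:text>
        <xsl:value-of select="$vendor"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The xsl:otherwise branch is a convenient way to discover the vendor string of an engine you have not seen before.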


Wrap up

As you can see, Firefox has a lot of capabilities. You can view XML in a simplified logical view or in original source form. You are notified of any well-formedness errors. You can tailor the display using CSS or XSLT. Firefox recognizes several important XML vocabularies based on MIME type, and handles them accordingly. Firefox does have some limitations when you process XML with it. All the major browsers could use some work in their XML support, and so understanding such abilities and limitations is important. As XML-based technologies such as Web feeds, SVG and XSLT become more important, one can expect better XML support in browsers. Meanwhile you can do a lot with XML in Firefox today. Look for future articles in this series on IBM developerWorks to learn more.

Entities all over the map

XML has many types of entities, and from that comes a lot of confusion over the term "entities." Unfortunately, some of this confusion affects documentation and discussion of Firefox. I've found such confusion in the Mozilla FAQ and in several bugzilla entries.

Here are some of the important types of entities, and my informal summary of the support in Mozilla and thus Firefox.

  • character entities: fully supported
  • internal general entities (declared as literal, well-formed XML fragments): fully supported, as long as you define them in the internal DTD subset
  • external parsed general entities (declared as references to an external resource): not supported (ignored)
  • parameter entities (DTD fragments): not supported (ignored)
  • unparsed entities: not supported (ignored)

So to summarize, the internal subset of a DTD is checked for internal general entities, which are correctly processed. All other non-character entity types are ignored, and no validation is performed, even for declarations within the internal subset.
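
The sketch below puts three of these cases side by side. The entity names signoff and legal and the file name legal.ent are hypothetical. If the summary above holds, Firefox should expand &signoff; and the character reference &#169;, and should silently drop the content of &legal;.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE memo [
  <!-- Internal general entity: defined as a well-formed fragment in the internal subset -->
  <!ENTITY signoff "Yours, <emph>hungrily</emph>">
  <!-- External parsed general entity: per the summary above, ignored by Firefox -->
  <!ENTITY legal SYSTEM "legal.ent">
]>
<memo>
  <date form="iso-8601">2002-08-14</date>
  I ate the eggs you left in the fridge. &signoff;
  &#169; 2002. &legal;
</memo>
```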

Resources

Learn

Get products and technologies

  • Firefox: Get the Mozilla-based Web browser that offers standards compliance, performance, security, and solid XML features. The current version is 1.5.0.1.
