XML Matters: Transcending the limits of DOM, SAX, and XSLT

The HaXml functional programming model for XML

Consider Haskell in lieu of DOM, SAX, or XSLT for processing XML data. The library HaXml creates representations of XML documents as native recursive data structures in the functional language Haskell. HaXml brings with it a set of powerful higher order functions for operating on these "datafied" XML documents. Many of the HaXml techniques are far more elegant, compact, and powerful than the ones found in familiar techniques like DOM, SAX, or XSLT. Code samples demonstrate the techniques.


David Mertz (mertz@gnosis.cx), Prestidigitator, Gnosis Software, Inc.

David Mertz David Mertz' tattoos provide a surprisingly deep insight into his programming and theoretical interests. David may be reached at mertz@gnosis.cx; his life pored over at http://gnosis.cx/dW/. Suggestions and recommendations on this, past, or future, columns are welcomed.

01 October 2001

Also available in Japanese

The most common techniques for manipulating XML documents are DOM, SAX, and XSLT. These techniques have a distressing lack of unifying principles among them. Everything you might want to do with XML is available in one of the major approaches, but when what you want to do crosses the boundaries of what each technique does best, it is far from clear how to approach a problem. You are likely to wind up with a hodge-podge application in which various smaller transformations are chained together with heterogeneous techniques and tools.

DOM melds everything XML into an object-oriented programming (OOP) framework, at a level of abstraction higher than that of any particular programming language. DOM is language independent, and its "document object model" is provided by libraries in many general-purpose programming languages. In a sense, DOM is the language -- it specifies all the methods and arguments -- and the underlying general-purpose language is just a little glue. My utility xml_objectify, discussed in previous installments, provided a way of transforming XML documents into more native-feeling "OOPified" objects (in Python), largely in response to the somewhat artificial feel of DOM.

SAX is similar to DOM in its language neutrality, but it substitutes an event-driven and procedural model for the OOP framework of DOM. SAX has the very nice feature in its ability to process XML documents as streams, acting upon each element and content as it is encountered. But with its event-driven philosophy comes SAX's disadvantage of having no real concept of the data structures represented by XML documents. One can build such structures in a concrete application, but even something so basic as a parent/child relationship must be represented in the vocabulary of the programming language that underlies a SAX library; SAX itself is blind to most of what is interesting about XML.

Finally, XSLT is the familiar technique that, in a sense, best matches the structure of XML. Perhaps reflecting this match, XSL documents are themselves XML document instances. XSLT is a special-purpose functional programming language that allows you to specify transformations of XML documents into other things (especially, but not only, into other XML documents). Aside from the somewhat annoying verboseness of XSLT, it is limited in its expressiveness -- the things you can say are expressed rather clearly (and functionally, not procedurally), but you quickly bump up against all the things that you simply cannot say in XSLT.

A simple and typical task

Let me describe a quite realistic scenario that shows weaknesses in the common techniques. XSLT is typically the most direct way to describe a transformation of an XML document into an output. For example, you might want to create an HTML representation of an XML document. Let's say you have an XML version of the I Ching that looks something like Listing 1.

Listing 1. XML version of the I Ching
<?xml version="1.0"?>
    <title>Some Hexagrams from the I Ching</title>
        <name>Ch'ien / The Creative</name>
            The Creative works sublime success,
            Furthering through perseverance.
        <name>K'un / The Receptive</name>
            The Receptive brings about sublime success,
            Furthering through the perseverance of a mare.
        <name>Chun / Difficulty at the Beginning</name>
            Difficulty at the Beginning works supreme success,
            Furthering through perseverance.

To present this information in an HTML table, you might use something like the XSLT instructions in Listing 2.

Listing 2. XSLT instructions for an I Ching HTML table
<xsl:stylesheet version="1.0"
    <xsl:output method="html" indent="yes" encoding="UTF-8"/>
    <xsl:template match="IChing">
        <head><title><xsl:value-of select="title"/></title></head>
        <body><table border="1"><xsl:apply-templates/></table></body>
    <xsl:template match="hexagram">
    <xsl:template match="number">
    <xsl:template match="name">
    <xsl:template match="judgement">
    <xsl:template match="*"></xsl:template>

The XSLT in Listing 2 seems simple and direct: you just create a template to describe how you would like each element formatted. What could be easier? The problem comes as soon as you want to filter or compute something for the output -- something that is not included in the few comparisons available to XSLT. For example, maybe you want (in a numerological spirit) to display only the even-numbered hexagrams, or only the prime ones. With XSLT, you are out of luck for something this simple.

At this point, you might turn to DOM or SAX for the task. This will work, but you first have to simply throw away the work that went into earlier XSLT transformations. DOM or SAX are completely different models; they share no significant code or concepts with XSLT. For something as simple as my toy style sheet, it hardly matters; for large production-quality transformations, you might lose a lot.

Moreover, neither SAX nor DOM have particularly elegant or maintainable models for output. In order to output the simple (filtered) HTML table described, you just have to litter the application code with print statements or printf() functions (or whatever our general language uses). These output statements are themselves buried in conditional blocks that test for element types and other conditions. In such code, there is no way to tell at a glance how the output flows or make sure that a print "</tr>" is reached to correspond with every print "<tr>" that occurred earlier. There is certainly no guarantee given by imperative or OOP code that the output HTML or XML will be well formed (let alone valid).

One can build "writer" methods on top of DOM. Some popular examples of this are Java's org.w3c.dom.html package and Python's xml.dom node method .writexml(). These methods take a DOM tree and output it in sensible HTML (with an assurance of well formedness). However, my feeling (from what I have seen) is that "sensible" is limited to a few configuration options -- not enough to output arbitrary HTML (the way XSLT or HaXml can do). In any case, you are still left with a procedural manipulation of DOM trees prior to the output, which is less clear and more bug-prone than functional styles.

A uniform solution

A system that both lets you express output declaratively (as XSLT does) and lets you include arbitrary filters and computations (as the implementation languages underlying DOM and SAX do) would be ideal for XML transformations. While you're at it, it couldn't hurt to be guaranteed the well formedness, or even validity, of the output. And a compact and direct syntax would be nice, too.

HaXml meets all those requirements. Actually, the power of HaXml goes beyond meeting the requirements. Taking advantage of the higher-order combinatorial functions that a functional programming language like Haskell allows, you can arbitrarily combine multiple filters in specifying output. In XSLT, each <xsl:template> acts as a sort of filter between the input and output. But the only real combination of filters created by an XSL file is a union on all the filters. In contrast, HaXml lets us specify much more fine-grained chains of filters for each element of the output. Actually, much of the same can be achieved using XPath with XSLT, but HaXml is much more concise, and strictly more powerful.

As well as providing numerous combinators, HaXml allows the inclusion of arbitrary computations in the Haskell language as part of the filtering. An extra bonus is that you can specify output in a much more coherent block that allows readable intermixing of output terms with filtering conditions. The result, shown in Listing 3, is much more concise than XSLT, and has fewer punctuation symbols to make visual scanning difficult.

Listing 3. HaXml program to output an I Ching HTML table
module Main where
import XmlLib
-- Concise XSLT-like specification of output
main = processXmlWith (hexagrams `o` tag "IChing")
hexagrams =
    html [
      hhead [htitle [keep /> tag "title" /> txt] ],
      hbody [htableBorder [rows `o` children `with` tag "hexagram"] ]
htableBorder = mkElemAttr "TABLE" [("BORDER",("1"!))]
rows f =
      num = keep /> tag "number" /> txt
      nam = keep /> tag "name" /> txt
      jdg = keep /> tag "judgement" /> txt
      if (condition (num f) (nam f) (jdg f))
      then hrow [hcol [num], hcol [nam], hcol [jdg]] f
      else []
condition num nam jdg = isPrime (makeInt num)

-- Supporting computations for rows condition
makeInt = stringToInt . unwrap      -- Turn a Content into an Integer
unwrap [(CString b c)] = c          -- Turn a Content into a String
stringToInt = revToInteger.reverse  -- Turn a String into an Integer
    revToInteger = toInteger . revToInt
    revToInt []  = 0
    revToInt (d:ds) = digitToInt d + (10*(revToInt ds))
-- ordered search of Sieve of Eratosthenes
isPrime = ordSearch (sieve [2..])  
    ordSearch (x:xs) n
        | x < n     = ordSearch xs n
        | x == n    = True
        | otherwise = False
    sieve (x:xs) = x : sieve [y | y <- xs, y `mod` x > 0]

As with XSLT, definitions may occur in any order. The first 20 lines of the code specify the output format, with some definitions broken out into supporting functions (simply for readability). In Haskell syntax, a function is whatever thing occurs to the left of an equal sign, and the definition is to the right of the equal sign. The where and let clauses specify what you might call "inner functions" in other languages. The first lines are conceptually very much like the XSLT approach. But as an improvement, the HaXml version lets us define ad hoc filters anywhere they are needed. Below is an example filter.

rows `o` children `with` tag "hexagram"

The `o` is somewhat fancifully called "Irish composition," and it can also be pronounced of. rows is the filter that I have defined out of convenience; you are just as free to reuse rows as to use any of the standard filters, and in unlimited combination.

The latter half of the program contains a few computational functions. One nice example is a six-line primality test that does a lot to illustrate the power and elegance of Haskell as a language. As the program is structured, the function condition could equally well perform any tests it wanted on the Contentof the <number>, <name> or <judgement> elements. Content is a special data type, so the first thing you need to do is unwrap the String contained in a Content. After that, you can convert it to an Integer and test it (or do whatever else you want with it).

There are a lot more details to the HaXml library -- and learning Haskell itself requires a learning curve for programmers accustomed to imperative and OOP styles of programming. But if you limit your interest to just the capabilities you would find in XSLT, the top half of the example program is quite easy to expand upon (I would argue, with an easier learning curve than XSLT, while using a similar functional style). Not only is the syntax more readable than XSLT, but you bypass the need need to immediately learn how to do the sort of thing in the bottom half of the program (and much more).

Enforcing validity

In the above example, an XML document was treated as a generic tree structure. For most purposes, doing this is the easiest and quickest approach. But HaXml provides something else that is totally unique. If you have a DTD for a document type you plan to work with, you can generate a set of native Haskell data structures from that DTD. From that point forward, you can write applications that utilize the native DTD in your manipulations. Generating the HaXml data structures from a DTD involves several steps. The first thing is to create the data structure as a module, something like:

% DtdToHaskell MyFile.DTD MyFileDTD.hs

Once a data structure module is available, you can write an application that handles XML documents that must conform to the DTD. Such applications will generally contain at least the lines in the brief Listing 6.

Listing 4. Custom HaXml app for MyFile.DTD XML documents
import Xml2Haskell (readXml)
import MyFileDTD

From there, you can use all the higher order techniques Haskell provides for dealing with recursive data structures. The first thing, naturally, will probably be to readXml in order to work with a particular input XML document.

A reader can be forgiven for thinking at this point, "so what." It would appear that this is no different from a DOM approach -- or even SAX -- where one can perfectly well work with structured data, or even validate against a DTD.

There is much more here than meets the eye. Haskell, unlike almost all other programming languages, is thoroughly type safe about data structures. It is quite simply impossible in Haskell to perform any computation on the generated data structure that would result in an invalid XML document. In contrast, the best you can do in languages like C/C++, Java, Perl, Smalltalk, or Python is do a sanity check (validation) on the way in, and another one on the way out, and hope everything goes well. It might be possible in something like Eiffel to add enough contractual constraints on every "adder" and "deleter" to make sure DTD validity is maintained (or, with enough work, in all the mentioned languages), but doing so involves custom programming within every .addSpamTag() method. Moreover, all the DTD can do in the mentioned languages is provide a cheat sheet for an application programmer to look at.

With HaXml, the data structure generated programmatically from a DTD automatically includes every validity constraint. Mind you, just enforcing the constraints doesn't make an application programmer write correct code; but at least any bad code that would result in invalid documents is caught at compile time. The other caveat, of course, is that programming a custom application that enforces validity is just plain going to be more work than programming one that does not need to do this. But for "mission critical" requirements, HaXml could well provide the quickest and safest path to the rigorous goal.


  • Bijan Parsia wrote a very interesting related essay called, Functional Programming and XML at XML.com. Parsia makes the argument that functional programming styles are generally better suited to XML manipulation than are more familiar OOP techniques. He discusses HaXml and several other tools.
  • A detailed discussion of HaXml was written by its original authors, Malcolm Wallace and Colin Runciman, Haskell and XML: Generic Combinators or Type-Based Translation. Its tone and level presume a greater familiarity with Haskell and functional programming than is requisite for this column, but Wallace and Runciman's paper thereby contains many details not addressed herein.
  • Information about Haskell, including several tutorials, numerous papers, and various compilers and interpreters can be found at the Haskell language Web site.
  • I have myself written an introductory tutorial to Haskell, aimed at beginners, that's hosted on developerWorks.
  • You can download a zip archive of the files mentioned in this article from my archive for this column.
  • Some sample outputs of the transformations discussed in the article can be found at:
  • Find other articles in David Mertz's XML Matters column.


developerWorks: Sign in

Required fields are indicated with an asterisk (*).

Need an IBM ID?
Forgot your IBM ID?

Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.


All information submitted is secure.

Dig deeper into XML on developerWorks

ArticleTitle=XML Matters: Transcending the limits of DOM, SAX, and XSLT