
XML processing in Ajax, Part 1: Four approaches

Parsing and transforming XML in Ajax programs

Mark Pruett (mark.l.pruett@dom.com), System Architect, Dominion
Mark Pruett is a Systems Architect at Dominion. He's a contributing author to O'Reilly Media's book Ajax Hacks and author of the O'Reilly Shortcuts Ajax and Web Services and Yahoo! Pipes.

Summary:  Any programming problem can be solved in multiple right ways. This series looks at four approaches for creating an Asynchronous JavaScript + XML (Ajax) weather badge, a small reusable widget that's easily embedded on any Web page. This first article lays the foundation and examines the first approach—walking the DOM tree.


Date:  04 Mar 2008
Level:  Intermediate

Aristotle, the Greek philosopher, once wrote "it is possible to fail in many ways..., while to succeed is possible only in one way." But then, Aristotle wasn't a computer programmer.

Frequently used acronyms

  • DOM: Document Object Model
  • HTML: Hypertext Markup Language
  • HTTP: Hypertext Transfer Protocol
  • RSS: Rich Site Summary
  • XML: Extensible Markup Language
  • XSLT: Extensible Stylesheet Language Transformation

While Aristotle's first conjecture certainly applies to programming—"it is possible to fail in many ways"—his second is nowhere near as certain.

This series of articles looks at four different approaches to the same problem. None of them is demonstrably wrong—each has its own set of strengths and weaknesses. The problem they each solve is not complex, and neither are the solutions. Even so, the approaches illustrate a wide range of trade-offs that can be embodied in even simple solutions.

The problem: Creating a reusable Ajax weather badge

Here is the specification for the problem I want to solve:

Build an Ajax library that reads current observation data from the National Weather Service, then extracts and transforms parts of that data into HTML to create a weather badge.

What's a badge?

A badge, or widget, is a small, self-contained chunk of JavaScript code you can include on a Web site to display special (often third-party) content. Badges usually display this content in a small area of the page. Badges can display the content of news feeds, calendars, clocks, or any other content the badge's authors want.

Many Web sites like to include local weather on their Web pages. To do this, they need access to up-to-date weather information. How do they get that data?

Within the United States, the National Weather Service (NWS) provides huge amounts of weather information. This data includes current weather observations for hundreds of U.S. cities. The data is available in either RSS or XML format.

The X in Ajax stands for XML, so this NWS data seems to be a good fit for an Ajax approach.

Four possible solutions

This series of articles looks at four different approaches for building an Ajax weather badge—a little box with weather information for any city or town the NWS monitors. The design goals are

  • Simplicity
  • Ease of reuse

I'll use these four implementations to explore the trade-offs inherent in each. None of the approaches is right or wrong.

Internally, each implementation is quite different, as described in Table 1.


Table 1. Four versions of the weather badge library
Approach | Description
1: Walking the DOM tree | A simple Web proxy on a server pulls data from the NWS server and sends it to the browser. Within the browser, the JavaScript interpreter extracts parts of the returned responseXML DOM tree, adds some HTML formatting, and inserts this into a DIV tag within the page.
2: XSLT on the server | A server-side script pulls the data from the NWS server, uses XSLT to transform that XML to HTML, then sends the HTML snippet back to the browser. The browser then just inserts the snippet into a DIV tag.
3: Client-side XSLT | This approach uses a simple Web proxy (identical to Approach 1) to send the XML data back to the browser. Unlike Approach 1, client-side XSLT converts the XML to HTML and inserts this into a DIV tag.
4: JSON and dynamic script tags | An external service (Yahoo! Pipes) converts the NWS data from XML to JavaScript Object Notation (JSON). The weather badge library exploits the special qualities of JSON and the JavaScript language to pull the converted data back to the browser, side-stepping the need for proxying.

Shared elements

All four approaches that I cover for building a reusable Ajax weather badge share the following elements:

  • The pipeline approach
  • A simple Ajax library
  • The weather_badge() JavaScript function
  • The National Weather Service data

The pipeline approach

The concept of data pipelines dates back at least to the early days of UNIX®. In this model, data enters the pipeline and moves through a series of filters. Each filter transforms the data in some way. The transformed data is sent back out into the pipeline, and possibly through more transformations, until all transformations are complete. At the end of the pipeline might be a user's terminal, redirection to a file, or another program.

This approach works well when dealing with XML-based data and services available on the Web. A program can grab XML data from the Web, send it into a pipeline, and chain together a series of transformations to extract data and reformat it.

Unlike pipes and filters on a UNIX command line, this approach applied to Ajax applications requires a pipeline that spans multiple computers across a network. The XML data can originate on one Web server, be passed to another server within another domain, and eventually arrive at its final destination: the user's Web browser.
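The filter-chain idea can be sketched in a few lines of JavaScript. The names run_pipeline, extract_temp, and to_html below are hypothetical, chosen only for illustration; they are not part of the sample code for this series.

```javascript
// Run a value through a list of filter functions, left to right,
// the way data flows through a UNIX pipeline.
function run_pipeline (input, filters) {
    var data = input;
    for (var i = 0; i < filters.length; i++) {
        data = filters[i] (data);
    }
    return data;
}

// Two toy filters (hypothetical): pull one value out of a record,
// then wrap it in HTML.
function extract_temp (obs) {
    return obs.temperature_string;
}

function to_html (text) {
    return "<b>" + text + "</b>";
}

var badge_html = run_pipeline (
    { temperature_string: "54 F (12 C)" },
    [extract_temp, to_html]
);
```

Each filter knows nothing about its neighbors, which is what lets the stages of the weather badge pipeline live on different machines.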

A simple Ajax library

An introduction to Ajax is beyond the scope of this article, but excellent primers are available (see Resources).

To make this series accessible to the widest possible audience, the examples presented here use a tiny Ajax library, shown in Listing 1. This library provides the thinnest veneer around the XMLHttpRequest object—just enough to smooth over the XMLHttpRequest differences found in each of the major browsers.


Listing 1. ajax-simple.js—A minimal Ajax/XMLHttpRequest library used in these examples
function Ajax (url, parms, method, callback) {

    this.url = url;
    this.parms = parms;
    this.method = method;
    this.callback = callback;
    this.async = true;

    this.create ();

    this.req.onreadystatechange = this.dispatch (this);
}

Ajax.prototype.dispatch = function (ajax) {

    function funcRef()
    {
        if (ajax.req.readyState == 4) {
            if (ajax.callback) {
                ajax.callback (ajax.req);
            }
        }
    }

    return funcRef;
}

Ajax.prototype.request = function () {

    if (this.method == "POST") {
        this.req.open("POST", this.url, this.async);
        this.req.send (this.parms);
    }
    else if (this.method == "GET") {
        this.req.open("GET", this.url + this.parms, this.async);
        this.req.send (null);
    }

}

Ajax.prototype.setAsync = function (async) {

    this.async = async;
}

Ajax.prototype.create = function () {

    var xmlhttp;
    /*@cc_on
    @if (@_jscript_version >= 5)

    try {
        xmlhttp = new ActiveXObject("Msxml2.XMLHTTP");
    }
    catch (e) {
        try {
            xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
        }
        catch (E) {
            xmlhttp = false;
        }
    }

    @else

    xmlhttp = false;

    @end @*/

    if (!xmlhttp && typeof XMLHttpRequest != 'undefined') {
        try {
            xmlhttp = new XMLHttpRequest();
        } catch (e) {
            xmlhttp = false;
        }
    }

    this.req = xmlhttp;
}
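
The dispatch() method in Listing 1 returns a closure rather than handling the event directly, so that the handler still has a reference to the Ajax object when the browser invokes it later. The sketch below isolates that pattern using a stand-in request object (fake_req, an assumption for illustration only), so the behavior can be observed without a browser or a network call.

```javascript
// Same shape as Ajax.prototype.dispatch in Listing 1: return a handler
// that remembers which wrapper object it belongs to.
function make_dispatcher (ajax) {
    return function () {
        // readyState 4 means the request is complete.
        if (ajax.req.readyState == 4 && ajax.callback) {
            ajax.callback (ajax.req);
        }
    };
}

// Stand-in for an XMLHttpRequest object (hypothetical, for illustration).
var fake_req = { readyState: 1, responseText: "" };
var received = [];

var wrapper = {
    req: fake_req,
    callback: function (req) { received.push (req.responseText); }
};
fake_req.onreadystatechange = make_dispatcher (wrapper);

// Simulate the event firing while the request is still in progress...
fake_req.onreadystatechange ();   // ignored: readyState is not yet 4

// ...and again once the response has arrived.
fake_req.readyState = 4;
fake_req.responseText = "<current_observation/>";
fake_req.onreadystatechange ();   // callback runs this time
```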

The weather_badge() JavaScript function

Common to all four approaches is the interface that adds a weather badge to a Web page. The interface is a single JavaScript function: weather_badge(). This function expects two parameters: an NWS station ID that identifies the city or town of interest and the element ID of an HTML DIV tag. This DIV tag is the target into which the weather badge is rendered. Figure 1 is an example of an Ajax weather badge.


Figure 1. An Ajax weather badge
An Ajax Weather Badge

The weather badge is rendered using HTML, but you can control many elements of its appearance, including fonts, background colors, and borders using Cascading Style Sheets (CSS).

Listing 2 illustrates how you can embed the weather badge in a Web page. Here the weather_badge() function is called from within a JavaScript onLoad event handler.


Listing 2. Embedding the weather badge in a Web page
<html>
  <head>
    <title>Apache Proxy Example</title>
    <link rel="stylesheet" type="text/css" href="weather.css" />
    <script type="text/javascript" src="ajax-simple.js"></script>
    <script type="text/javascript" src="weather_badge_apache_proxy.js">
    </script>
    <script>
      function init () {
        weather_badge ("KAKQ", "target1");
      }
    </script>
  </head>

  <body onload="init();">
    <h3>Apache Proxy Example</h3>

    <div class="wbadge" id="target1">
      Loading...
    </div>
  </body>
</html>

The National Weather Service data

The National Weather Service site uses a station ID to identify the city, town, or other location where weather readings were taken. A station ID is a unique four-character code.

The base URL for all NWS current observation data is:

http://www.nws.noaa.gov/data/current_obs/

The base URL, combined with the four-character station ID, provides a URL to the weather data. For example, the station ID for Richmond, Virginia is KRIC. The URL for Richmond's weather data is:

http://www.nws.noaa.gov/data/current_obs/KRIC.xml
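
Building the URL from a station ID is simple string concatenation. Here is a minimal sketch, assuming a hypothetical helper named nws_obs_url (not part of the sample code) and assuming that station IDs consist of four letters or digits:

```javascript
var NWS_BASE_URL = "http://www.nws.noaa.gov/data/current_obs/";

function nws_obs_url (station_id) {
    // Reject anything that is not four letters or digits so that a
    // stray value cannot alter the path. (The character set is an
    // assumption; the article only says the code has four characters.)
    if (!/^[A-Za-z0-9]{4}$/.test (station_id)) {
        throw new Error ("Invalid station ID: " + station_id);
    }
    return NWS_BASE_URL + station_id.toUpperCase () + ".xml";
}
```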

A simple XML format defines the current observation data, as shown in Listing 3.


Listing 3. National Weather Service XML data for Richmond, Virginia
<current_observation version="1.0" 
 xsi:noNamespaceSchemaLocation=
 "http://www.weather.gov/data/current_obs/current_observation.xsd">
  <credit>NOAA's National Weather Service</credit>
  <credit_URL>http://weather.gov/</credit_URL>
  <image>
    <url>http://weather.gov/images/xml_logo.gif</url>
    <title>NOAA's National Weather Service</title>
    <link>http://weather.gov</link>
  </image>
  <suggested_pickup>15 minutes after the hour</suggested_pickup>
  <suggested_pickup_period>60</suggested_pickup_period>
  <location>Richmond International Airport, VA</location>
  <station_id>KRIC</station_id>
  <latitude>37.51</latitude>
  <longitude>-77.31</longitude>
  <observation_time>
    Last Updated on Dec 11, 12:54 pm EST
  </observation_time>
  <observation_time_rfc822>
    Tue, 11 Dec 2007 12:54:00 -0500 EST
  </observation_time_rfc822>
  <weather>Overcast</weather>
  <temperature_string>54 F (12 C)</temperature_string>
  <temp_f>54</temp_f>
  <temp_c>12</temp_c>
  <relative_humidity>80</relative_humidity>
  <wind_string>From the South at 5 MPH</wind_string>
  <wind_dir>South</wind_dir>
  <wind_degrees>180</wind_degrees>
  <wind_mph>4.6</wind_mph>
  <wind_gust_mph>NA</wind_gust_mph>
  <pressure_string>30.31" (1026.7 mb)</pressure_string>
  <pressure_mb>1026.7</pressure_mb>
  <pressure_in>30.31</pressure_in>
  <dewpoint_string>48 F (9 C)</dewpoint_string>
  <dewpoint_f>48</dewpoint_f>
  <dewpoint_c>9</dewpoint_c>
  <heat_index_string>NA</heat_index_string>
  <heat_index_f>NA</heat_index_f>
  <heat_index_c>NA</heat_index_c>
  <windchill_string>53 F (12 C)</windchill_string>
  <windchill_f>53</windchill_f>
  <windchill_c>12</windchill_c>
  <visibility_mi>7.00</visibility_mi>
  <icon_url_base>
    http://weather.gov/weather/images/fcicons/
  </icon_url_base>
  <icon_url_name>ovc.jpg</icon_url_name>
  <two_day_history_url>
    http://www.weather.gov/data/obhistory/KRIC.html
  </two_day_history_url>
  <ob_url>http://www.nws.noaa.gov/data/METAR/KRIC.1.txt</ob_url>
  <disclaimer_url>http://weather.gov/disclaimer.html</disclaimer_url>
  <copyright_url>http://weather.gov/disclaimer.html</copyright_url>
  <privacy_policy_url>http://weather.gov/notice.html</privacy_policy_url>
</current_observation>

The weather badge only needs a small subset of this data. I'll use values within the location, weather, icon_url_base, icon_url_name, temperature_string, wind_string, relative_humidity, visibility_mi, and observation_time elements.

Approach 1: Walking the DOM tree

This approach must first tackle a basic limitation of the XMLHttpRequest object used by Ajax programs: the same domain problem.

For security reasons, an XMLHttpRequest call can only initiate requests to the same server that delivered the original Web page. Unless I work for the National Weather Service, my server is outside of their domain (www.nws.noaa.gov). Figure 2 shows the data pipeline for this first approach to the weather badge.


Figure 2. The data pipeline for weather badge Approach 1
The pipeline for Approach 1

If I have access to my Web server's configuration, there's a simple solution to this problem: a Web proxy.

With Web proxying, a server forwards a request it receives to a second server and relays the response back. I need my Ajax program to request a resource on my server and have the server translate that request into a request on the NWS server. This circumvents the same domain problem: the Ajax program speaks only to its own server, which quietly forwards the request to the NWS server.

On Apache Web servers, proxies are implemented using a ProxyPass rule. The syntax of the rule is simple:

ProxyPass our_directory their_url

The first option refers to a non-existent location on my server, and the second option is a URL on a remote server. Any time a request comes in for our_directory, Apache forwards the request to their_url. The requestor (my Ajax program) is never aware of this.

Here's the proxy rule I'll implement to access the National Weather Service data:

ProxyPass /nws_currobs/ http://www.nws.noaa.gov/data/current_obs/

To get to the data for Richmond, Virginia, I request this URL:

/nws_currobs/KRIC.xml

Apache converts this to a request to NWS:

http://www.nws.noaa.gov/data/current_obs/KRIC.xml
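
The rewriting that Apache performs for this rule amounts to a prefix substitution. The following sketch mimics that mapping in JavaScript purely to make it concrete; the function name proxy_target is hypothetical, and real ProxyPass handling involves more than this.

```javascript
// Mimic the prefix substitution of:
//   ProxyPass /nws_currobs/ http://www.nws.noaa.gov/data/current_obs/
function proxy_target (request_path) {
    var local_prefix = "/nws_currobs/";
    var remote_prefix = "http://www.nws.noaa.gov/data/current_obs/";

    // Paths outside the proxied directory are not forwarded.
    if (request_path.indexOf (local_prefix) !== 0) {
        return null;
    }
    return remote_prefix + request_path.substring (local_prefix.length);
}
```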

Parsing XML in the browser

In an Ajax application, if a server request for XML data is successful, the responseXML property of the XMLHttpRequest object is populated. This property contains the retrieved XML, parsed into a DOM tree of type XMLDocument. (If the server data is not well-formed XML or, on some browsers, if the returned data isn't accompanied by a text/xml or application/xml HTTP Content-type header, responseXML is not populated. In these cases, the responseText property still contains the unprocessed text returned by the server.)
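
Because responseXML can be missing, a callback may want to check for it before walking the tree. Here is a minimal sketch, assuming a hypothetical helper named get_doc_element. Note that the Ajax library in Listing 1 invokes the callback whenever the request completes, whatever the HTTP status, so the status is checked here too.

```javascript
// Return the root element of the parsed response, or null if the
// request failed or the body was not parsed as XML.
function get_doc_element (req) {
    if (req.status != 200) {
        return null;
    }
    if (!req.responseXML || !req.responseXML.documentElement) {
        return null;
    }
    return req.responseXML.documentElement;
}
```

A callback could then display an error message in the badge when get_doc_element() returns null, instead of failing on a null reference.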

With responseXML, I can traverse the DOM to extract values from the returned XML. Listing 4 provides an abridged version of the returned XML:


Listing 4. Abridged XML returned from the NWS server

<current_observation version="1.0"
 xsi:noNamespaceSchemaLocation=
 "http://www.weather.gov/data/current_obs/current_observation.xsd">
  <location>Richmond International Airport, VA</location>
  <observation_time>
    Last Updated on Dec 11, 12:54 pm EST
  </observation_time>
  <weather>Overcast</weather>
  <temperature_string>54 F (12 C)</temperature_string>
  <relative_humidity>80</relative_humidity>
  <wind_string>From the South at 5 MPH</wind_string>
  <visibility_mi>7.00</visibility_mi>
  <icon_url_base>
    http://weather.gov/weather/images/fcicons/
  </icon_url_base>
  <icon_url_name>ovc.jpg</icon_url_name>
</current_observation>

Now I can extract the wind_string element from responseXML. The responseXML property is an XMLDocument type. The documentElement property of XMLDocument returns the top-level element of my XML DOM tree. To verify this in a program, I insert an alert() function in the code:

alert ("tagName: " + req.responseXML.documentElement.tagName);

When executed, the alert() pops up a window containing:

tagName: current_observation

To access individual elements below current_observation, use getElementsByTagName(). This Element method takes a tag name parameter and returns an array-like list of all descendant Element nodes with that tag name. Within a JavaScript program, I can write:

var elements = req.responseXML.documentElement.getElementsByTagName("wind_string");

The NWS XML data includes only one wind_string element, so it's safe to assume the data I want is in the first element. The actual text within the wind_string element tags is accessed like this:

elements[0].firstChild.data

This is a lot of work to extract a single value from an XML document, especially when that document has a simple structure. It's easy to see how this method to extract data from XML can quickly become unwieldy. If I combine all the steps above into a single DOM reference, it looks like this:

req.responseXML.documentElement.getElementsByTagName("wind_string")[0].firstChild.data
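
An alternative to one lookup per tag name is to walk the children of the root element once and collect each child's text into a plain object. The sketch below assumes the flat structure shown in Listing 4, where every value of interest is a direct child of current_observation; collect_fields is a hypothetical name.

```javascript
function collect_fields (doc_el) {
    var fields = {};
    for (var node = doc_el.firstChild; node; node = node.nextSibling) {
        // nodeType 1 is an element; skip whitespace text nodes and comments.
        if (node.nodeType != 1) {
            continue;
        }
        // An empty element such as <weather/> has no firstChild.
        fields[node.tagName] = node.firstChild ? node.firstChild.data : "";
    }
    return fields;
}
```

With this in place, collect_fields(req.responseXML.documentElement).wind_string yields the same text as the long expression above.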

For this application, I can make the code a bit more readable by defining a JavaScript helper function to extract the values, as shown in Listing 5.


Listing 5. A function to make DOM access less cumbersome
function get_element (doc_el, name, idx) {
    var element = doc_el.getElementsByTagName (name);
    return element[idx].firstChild.data;
}
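
Listing 5 throws an error if the requested element is absent or empty, because element[idx] or its firstChild is then missing. A slightly more defensive variant is sketched below; the name get_element_or and its fallback parameter are assumptions, not part of the sample code. It also trims surrounding whitespace, which matters for values such as observation_time that span several lines in the listings.

```javascript
function get_element_or (doc_el, name, idx, fallback) {
    var elements = doc_el.getElementsByTagName (name);
    if (idx >= elements.length || !elements[idx].firstChild) {
        return fallback;
    }
    // Strip leading and trailing whitespace, including newlines.
    return elements[idx].firstChild.data.replace (/^\s+|\s+$/g, "");
}
```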

With that function in place, the weather_badge() function becomes a bit more manageable, as you can see in Listing 6.


Listing 6. The weather_badge() function using an Apache proxy to retrieve the XML
function weather_badge (nws_id, div_name) {
    var ajax = new Ajax
        ("/nws_currobs/" + nws_id + ".xml",
         "",
         "GET",
         function (req) {
            var doc_el = req.responseXML.documentElement;

            // Extract values from the XML structure returned
            // by the Ajax (XMLHttpRequest) call.

            var location = get_element (doc_el, "location", 0);
            var temperature_string = get_element (doc_el, "temperature_string", 0);
            var weather = get_element (doc_el, "weather", 0);
            var icon_url_base = get_element (doc_el, "icon_url_base", 0);
            var icon_url_name = get_element (doc_el, "icon_url_name", 0);
            var wind_string = get_element (doc_el, "wind_string", 0);
            var relative_humidity = get_element (doc_el, "relative_humidity", 0);
            var visibility_mi = get_element (doc_el, "visibility_mi", 0);
            var observation_time = get_element (doc_el, "observation_time", 0);

            var div = document.getElementById (div_name);
            div.innerHTML =
                "<center>\n"
                + "<b>" + location + "</b><br>\n"
                + weather + "<br>"
                + "<img border='0' src='"
                + icon_url_base + icon_url_name + "'/><br>\n"
                + temperature_string + "<br>\n"
                + "Wind: " + wind_string + "<br>\n"
                + "Humidity: " + relative_humidity + "<br>\n"
                + "Visibility: " + visibility_mi + "<br>\n"
                + "<br><span style='font-size: 0.8em; font-weight: bold;'>"
                + observation_time + "</span><br>\n"
                + "</center>\n";
          }
         );
    ajax.request ();
}

This code creates an Ajax object (recall that this is just a thin wrapper around the XMLHttpRequest object). The Ajax constructor takes four parameters, which are described in Table 2.


Table 2. The four parameters taken by the Ajax constructor
Parameter | Description
url | The URL for the remote resource; in this case, the National Weather Service XML file proxied through my server.
parms | A string containing any URL parameters. I request a static XML document, not a server-side script, so no parameters are required.
method | This parameter tells Ajax to make an HTTP GET request.
callback | This parameter defines the callback function that Ajax invokes when the XML document is returned to the browser. Here, I extract values from the XML, then splice them together with a few HTML formatting tags to create a snippet of HTML. That snippet defines my weather badge.

I use the innerHTML property to insert the HTML snippet into the Web page. The ID of the target DIV tag was passed into weather_badge() as the div_name parameter, so inserting my HTML snippet becomes trivial:

var div = document.getElementById (div_name);
div.innerHTML = html_snippet;
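
The html_snippet variable here stands for the string assembled in Listing 6. One way to keep the callback short is to move that string-building into its own function that takes the extracted values as an object. This is a sketch under that assumption; build_badge_html and the shape of its argument are hypothetical.

```javascript
// Assemble the badge markup from an object whose property names match
// the NWS element names used in Listing 6.
function build_badge_html (obs) {
    return "<center>\n"
        + "<b>" + obs.location + "</b><br>\n"
        + obs.weather + "<br>"
        + "<img border='0' src='"
        + obs.icon_url_base + obs.icon_url_name + "'/><br>\n"
        + obs.temperature_string + "<br>\n"
        + "Wind: " + obs.wind_string + "<br>\n"
        + "Humidity: " + obs.relative_humidity + "<br>\n"
        + "Visibility: " + obs.visibility_mi + "<br>\n"
        + "<br><span style='font-size: 0.8em; font-weight: bold;'>"
        + obs.observation_time + "</span><br>\n"
        + "</center>\n";
}
```

Because the function touches neither the DOM nor the network, its output can be checked on its own before it is assigned to innerHTML.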

Pros and cons

Any approach to a programming problem has its trade-offs. Table 3 lists some pros and cons to this approach.


Table 3. Pros and cons of Approach 1
Pros:

  • Works identically across all major browsers.
  • Requires no add-on libraries or third-party tools.

Cons:

  • The syntactically baroque method to access XML elements is unwieldy, even for simple XML documents.

Coming up

In Part 2 of this series, I look at alternatives to direct DOM access. I'll use XSLT to transform XML into HTML. The second and third implementations both use XSLT, differing in where in the data pipeline—the browser or the server—the transformation takes place.



Download

Description | Name | Size | Download method
Sample code for this series | x-xmlajax.zip | 194KB | HTTP



Resources

Learn

Get products and technologies

  • IBM trial software: Build your next development project with trial software available for download directly from developerWorks.
