A couple of years ago, many developers bet their futures on XML, XSLT, Extensible HTML (XHTML), and a host of tag-based "X" languages. Now, the new rage is Asynchronous JavaScript and XML (AJAX), and investors' eyes are turning toward data-driven Rich Internet Applications that use JavaScript code. But have developers bridged the gap between XML and this new technology?
Sure, you could use the XML parser in a Web client to read the data, but two problems arise with that approach. First, for security reasons, XML data can only be read from the same domain as the page. That's not a huge limiting factor, but it does cause some headaches in deployment and impedes the creation of DHTML widgets. Second, reading and parsing XML is slow.
Another option is to let the server do the work of parsing the XML by configuring it to send the data to the browser encoded as JavaScript code or, in the more trendy parlance, JavaScript Object Notation (JSON). In this article, I demonstrate three techniques for you to generate JSON from XML data using the XSLT V2 language and the Saxon XSLT V2 processor:
- Simple encoding
- Loading data through function calls
- Encoding objects
To learn how to encode data as JSON (which is really just a JavaScript subset), start with some data. Listing 1 shows an example XML data set with a list of books.
Listing 1. The sample book data set
<?xml version="1.0" encoding="UTF-8"?>
<books>
<book id="1">
<title>Code Generation in Action</title>
<author><first>Jack</first><last>Herrington</last></author>
<publisher>Manning</publisher>
</book>
<book id="2">
<title>PHP Hacks</title>
<author><first>Jack</first><last>Herrington</last></author>
<publisher>O'Reilly</publisher>
</book>
<book id="3">
<title>Podcasting Hacks</title>
<author><first>Jack</first><last>Herrington</last></author>
<publisher>O'Reilly</publisher>
</book>
</books>
The data set is simple and contains three books, each with a unique ID, a title, the author's first and last name, and the publisher. (Okay, I'm doing some shameless plugging by selecting just my own books as the data set. Can you blame me? They make great holiday and birthday gifts.)
Listing 2 shows how this data would look in JSON.
Listing 2. The sample data set in JSON
[ { id: 1,
title: 'Code Generation in Action',
first: 'Jack',
last: 'Herrington',
publisher: 'Manning' },
... ]
The brackets ([]) indicate an array. The curly braces ({}) indicate a hash table, which is a set of name and value pairs. In this case, I create an array of hash tables -- a common method to store structured data such as this.
It's worth noting that JavaScript strings can be delimited with either single or double quotation marks. So, if I want to encode O'Reilly in a single-quoted string, I must escape the apostrophe with a backslash: 'O\'Reilly'. That makes the XSLT stylesheets that I write a little more interesting.
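Strict JSON is narrower than a JavaScript literal: it allows only double-quoted strings and property names, so an apostrophe needs no escaping there. The sketch below uses the built-in JSON object, which is standard in current browsers and Node.js but is not used by this article's sample code; it is shown only to illustrate the difference.

```javascript
// Build a strict JSON string from a JavaScript object.
// JSON uses double quotes only, so the apostrophe is left as-is.
var text = JSON.stringify( { id: 2, title: 'PHP Hacks', publisher: 'O\'Reilly' } );
console.log( text ); // prints {"id":2,"title":"PHP Hacks","publisher":"O'Reilly"}

// Parse the string back into an object.
var book = JSON.parse( text );
console.log( book.publisher ); // prints O'Reilly
```

The single-quoted, unquoted-key style in the listings works because the browser evaluates the data as ordinary JavaScript code.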
I haven't put any dates in the example, but you can encode a date in either of two ways. The first method is as a string that must be parsed later. The second method is as an object, for example:
publishdate: new Date( 2006, 6, 16, 17, 45, 0 )
This code sets the publishdate value to 7/16/2006 at 5:45:00 p.m. (JavaScript numbers months from zero, so 6 means July.)
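As a rough sketch, the two date encodings might look like this on the page. The ISO 8601 string format is an assumption made for illustration; the article does not prescribe a string format, and how a string is interpreted with respect to time zones can vary between JavaScript engines.

```javascript
// Option 1: the date arrives as a string and is parsed on the page.
// The format shown is an assumption, not something the article specifies.
var fromString = new Date( '2006-07-16T17:45:00' );

// Option 2: the date arrives as a Date object, built from numeric parts.
// The month argument is zero-based, so 6 is July.
var fromObject = new Date( 2006, 6, 16, 17, 45, 0 );

console.log( fromObject.getFullYear() ); // prints 2006
console.log( fromObject.getMonth() );    // prints 6
console.log( fromObject.getHours() );    // prints 17
```

The object form avoids any parsing step on the page, at the cost of tying the data to JavaScript.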
I will offer several different techniques for JSON encoding. The first is the simplest. The stylesheet is shown in Listing 3.
Listing 3. The simple.xsl stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
xmlns:js="http://muttmansion.com">
<xsl:output method="text" />
<xsl:function name="js:escape">
<xsl:param name="text" />
<xsl:value-of select='replace( $text, "'", "\\'" )' />
</xsl:function>
<xsl:template match="/">
var g_books = [
<xsl:for-each select="books/book">
<xsl:if test="position() > 1">,</xsl:if> {
id: <xsl:value-of select="@id" />,
name: '<xsl:value-of select="js:escape(title)" />',
first: '<xsl:value-of select="js:escape(author/first)" />',
last: '<xsl:value-of select="js:escape(author/last)" />',
publisher: '<xsl:value-of select="js:escape( publisher )" />'
}</xsl:for-each>
];
</xsl:template>
</xsl:stylesheet>
To understand the stylesheet, it's easier to view the output as shown in Listing 4 first.
Listing 4. The output of simple.xsl
var g_books = [
{
id: 1,
name: 'Code Generation in Action',
first: 'Jack',
last: 'Herrington',
publisher: 'Manning'
}, {
id: 2,
name: 'PHP Hacks',
first: 'Jack',
last: 'Herrington',
publisher: 'O\'Reilly'
}, {
id: 3,
name: 'Podcasting Hacks',
first: 'Jack',
last: 'Herrington',
publisher: 'O\'Reilly'
}
];
Here, I set a variable called g_books to an array
that contains three hash tables, where each hash table contains information about
the book. Looking back at Listing 3, you'll see that its template matches the "/" path, so it is the first template applied to the input data set. This template uses a for-each
loop to walk through each book. Then, it uses the <xsl:value-of>
tag to output the text from the data into the output JavaScript code.
In the case of the strings, I use a custom function called
js:escape(), which is defined just before the template.
That function uses a regular expression to change each single quotation mark into
a backslash-escaped single quotation mark.
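For comparison, here is a rough JavaScript sketch of the same escaping idea. The name escapeJsString is hypothetical and is not part of the sample code. It also goes a step further than js:escape() by escaping backslashes and line breaks, which the stylesheet function does not handle.

```javascript
// Hypothetical helper, not part of the article's sample code.
// Escapes a value so that it can sit inside a single-quoted JavaScript string literal.
function escapeJsString( text )
{
  return String( text )
    .replace( /\\/g, '\\\\' )  // backslashes first, so later escapes are not doubled
    .replace( /'/g, "\\'" )    // single quote becomes backslash + single quote
    .replace( /\r/g, '\\r' )   // carriage return becomes the two characters \r
    .replace( /\n/g, '\\n' );  // line feed becomes the two characters \n
}

var literal = "'" + escapeJsString( "O'Reilly" ) + "'";
console.log( literal ); // prints 'O\'Reilly'
```

If the source data might contain backslashes or line breaks, the XSLT function would need similar extra replace() steps.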
The last important element is the <xsl:output> tag,
which tells the processor that I want to output text and not XML. To see whether
this process works, I put together a simple .html file that references the output
of the XSL stylesheet that I saved in a file called simple.js. The HTML is shown
in Listing 5.
Listing 5. The simple.html file
<html>
<head>
<title>Simple JS loader</title>
<script src="simple.js"></script>
</head>
<body>
<script>
document.write( "Found "+g_books.length+" books" );
</script>
</body>
</html>
The .html file simply loads the encoded JavaScript code using the first
<script> tag. Then, the second
<script> tag writes out the length of the array
into the browser page, as shown in Figure 1.
Figure 1. The output of simple.html
Great! The data file contains three books, and the corresponding JavaScript file contains three books. It works!
The first example is quite simple and might do the trick in many situations, but
there are some problems with it. The first problem is that there's no indication
of when the data has been loaded. That's not a problem when the data is loaded
in a static way, as this page is. But when the page creates a
<script> tag on the fly to load data on request,
it's important to know when that <script>
tag is complete. One of the best ways to do that is to have the encoded data
invoke a JavaScript function instead of just setting data.
This concept is important, so I'll step back and spend some time on why you must
be able to load data through dynamically generated <script>
tags. Getting data from the server after the page is loaded is at the core of
Web 2.0. One way to do that is to use the AJAX mechanism to load the XML through
a call to the server. However, for security reasons, that AJAX mechanism is
limited to getting data from a single domain. That works well in most situations,
but sometimes, you want your JavaScript code to run on other people's pages (for
example, Google Maps).
The only way to get data from the server in that case is through dynamically
loading <script> tags. The best way to know
when a <script> tag is loaded is to have the
script that the <script> tag returns call a
function instead of simply loading data. Listing 6 shows
data encoded in a function call.
Listing 6. Function1.js
AddBooks( [
{
id: 1,
name: 'Code Generation in Action',
first: 'Jack',
last: 'Herrington',
publisher: 'Manning'
}, {
id: 2,
name: 'PHP Hacks',
first: 'Jack',
last: 'Herrington',
publisher: 'O\'Reilly'
}, {
id: 3,
name: 'Podcasting Hacks',
first: 'Jack',
last: 'Herrington',
publisher: 'O\'Reilly'
}
] );
Listing 7 shows the corresponding .html file.
Listing 7. Function1.html
<html>
<head>
<title>Function 1 JS loader</title>
<script>
var g_books = [];
function AddBooks( books ) { g_books = books; }
</script>
<script src="function1.js"></script>
<script src="drawbooks.js"></script>
</head>
<body>
<script>drawbooks( g_books );</script>
</body>
</html>
I'll get into the drawbooks function in a second. But
what's important here is to see how the page defines the
AddBooks function, which is then called by the script
in the function1.js file. That AddBooks function
figures out what to do with the data. And the AddBooks
function, having been called, indicates to the page that the
<script> tag has loaded properly and has completed
loading.
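The page in Listing 7 loads function1.js with a static <script> tag. A dynamic version might look like the sketch below. The loadBooks function, the g_booksLoaded flag, and the doc parameter are hypothetical names introduced for illustration; they are not part of the sample code.

```javascript
// Hypothetical sketch: request book data by adding a <script> tag at run time.
var g_books = [];
var g_booksLoaded = false;

// Called by the generated script (for example, function1.js) once it has been fetched.
function AddBooks( books )
{
  g_books = books;
  g_booksLoaded = true; // the call itself is the "load complete" signal
}

// `doc` is the page's document object; passing it in keeps the function easy to test.
function loadBooks( doc, url )
{
  var elScript = doc.createElement( 'script' );
  elScript.src = url;
  doc.getElementsByTagName( 'head' )[0].appendChild( elScript );
  return elScript;
}

// In a browser: loadBooks( document, 'function1.js' );
```

Because the browser runs the fetched script as soon as it arrives, AddBooks is invoked at exactly the moment the data becomes available.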
To create the function1.js file, I made only minor modifications to the stylesheet, as shown in Listing 8.
Listing 8. The function1.xsl stylesheet
<xsl:template match="/">
AddBooks( [
<xsl:for-each select="books/book">
<xsl:if test="position() > 1">,</xsl:if> {
id: <xsl:value-of select="@id" />,
name: '<xsl:value-of select="js:escape(title)" />',
first: '<xsl:value-of select="js:escape(author/first)" />',
last: '<xsl:value-of select="js:escape(author/last)" />',
publisher: '<xsl:value-of select="js:escape( publisher )" />'
}</xsl:for-each>
] );
</xsl:template>
So, instead of simply setting a variable, I now invoke a function. That's the only change.
Back to the page, I used the drawbooks function to build
a table of the books so that I can make sure that the data is encoded properly and
appears correctly. The function is defined in drawbooks.js, which is shown in
Listing 9.
Listing 9. Drawbooks.js
function drawbooks( books )
{
var elTable = document.createElement( 'table' );
for( var b = 0; b < books.length; b++ )
{
var elTR = elTable.insertRow( -1 );
var elTD1 = elTR.insertCell( -1 );
elTD1.appendChild( document.createTextNode( books[b].id ) );
var elTD2 = elTR.insertCell( -1 );
elTD2.appendChild( document.createTextNode( books[b].name ) );
var elTD3 = elTR.insertCell( -1 );
elTD3.appendChild( document.createTextNode( books[b].first ) );
var elTD4 = elTR.insertCell( -1 );
elTD4.appendChild( document.createTextNode( books[b].last ) );
var elTD5 = elTR.insertCell( -1 );
elTD5.appendChild( document.createTextNode( books[b].publisher ) );
}
document.body.appendChild( elTable );
}
This simple function creates a table node, then iterates through the book list and creates a row for each book, with a cell for each data element. The result of the code on this page is shown in Figure 2.
Figure 2. The function1.html result
Now I can look at the output of the page and see that everything from the original
.xml file was translated into JavaScript code correctly and that the data was sent to
the AddBooks function and added to the page properly.
Refining the function call technique
I like the function call technique, but I'm not sure I like that all the book data goes in as one block. Another option is to have one call per record, as shown in Listing 10.
Listing 10. Function2.js
AddBook( {
id: 1,
name: 'Code Generation in Action',
first: 'Jack',
last: 'Herrington',
publisher: 'Manning'
} );
AddBook( {
id: 2,
name: 'PHP Hacks',
first: 'Jack',
last: 'Herrington',
publisher: 'O\'Reilly'
} );
...
Only a minor modification is required to the .html page, as shown in Listing 11.
Listing 11. Function2.html
...
<script>
var g_books = [];
function AddBook( book ) { g_books.push( book ); }
</script>
...
The XSLT is changed so that the function invocation now resides within the body
of the for-each loop. Listing 12
shows the updated stylesheet.
Listing 12. function2.xsl
...
<xsl:template match="/">
<xsl:for-each select="books/book">
AddBook( {
id: <xsl:value-of select="@id" />,
name: '<xsl:value-of select="js:escape(title)" />',
first: '<xsl:value-of select="js:escape(author/first)" />',
last: '<xsl:value-of select="js:escape(author/last)" />',
publisher: '<xsl:value-of select="js:escape( publisher )" />'
} );</xsl:for-each>
</xsl:template>
...
This change might seem arbitrary given the example. But if your original XML dataset has a variety of data types, delivering each with a separate function call makes both the XSL and the JavaScript code on the page simpler and easier to maintain.
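To make that concrete, here is a sketch of a page that accepts two record types. AddAuthor, g_authors, and the authorId field are invented for illustration; the sample data set contains only books.

```javascript
// Hypothetical page-side handlers: one function per record type.
var g_books = [];
var g_authors = [];

function AddBook( book ) { g_books.push( book ); }
function AddAuthor( author ) { g_authors.push( author ); }

// What a generated script for mixed data might contain:
AddAuthor( { id: 1, first: 'Jack', last: 'Herrington' } );
AddBook( { id: 1, name: 'Code Generation in Action', authorId: 1 } );
AddBook( { id: 2, name: 'PHP Hacks', authorId: 1 } );

console.log( g_books.length + ' books, ' + g_authors.length + ' authors' ); // prints 2 books, 1 authors
```

Each handler stays small, and the stylesheet needs only one template or loop per record type.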
For small pages, using JavaScript functions is fine. But for larger projects, you definitely want to use some of the object-oriented features of the JavaScript language. Yes, the JavaScript language does objects, and does them well.
Listing 13 shows how creating objects with the data looks.
Listing 13. Object1.js
g_books.push( new Book( {
id: 1,
name: 'Code Generation in Action',
first: 'Jack',
last: 'Herrington',
publisher: 'Manning'
} ) );
g_books.push( new Book( {
id: 2,
name: 'PHP Hacks',
first: 'Jack',
last: 'Herrington',
publisher: 'O\'Reilly'
} ) );
In this case, I simply add Book objects to an array
called g_books. JavaScript object creation is similar
to object creation in the Java™, C#, or C++
programming languages: the new operator is followed by the class name, and
the arguments follow in parentheses. In this case, I send in a single hash table with
the values. I could just as easily break them out as separate parameters.
The code to create this object is shown in Listing 14.
Listing 14. Object1.xsl
<xsl:template match="/">
<xsl:for-each select="books/book">
g_books.push( new Book( {
id: <xsl:value-of select="@id" />,
name: '<xsl:value-of select="js:escape(title)" />',
first: '<xsl:value-of select="js:escape(author/first)" />',
last: '<xsl:value-of select="js:escape(author/last)" />',
publisher: '<xsl:value-of select="js:escape( publisher )" />'
} ) );</xsl:for-each>
</xsl:template>
The code that's more interesting is in the page where the
Book class is defined. Listing 15
shows this page.
Listing 15. object1.html
...
<script>
var g_books = [];
function Book( data )
{
for( var d in data ) { this[d] = data[d]; }
}
</script>
...
The constructor for the Book class iterates through the
data in the hash table. With each key, an instance variable is created on the
object with that name and data. No modification is required to the
drawbooks function, because the objects all have the
same keys and values as the original hash tables. And the JavaScript language
doesn't distinguish between accessing named values in a hash table or accessing
the named values on an object.
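A short sketch shows the equivalence. The constructor is the one from Listing 15; the variable names plain and book are only for illustration.

```javascript
// Constructor from Listing 15: copy every key of the hash table onto the new object.
function Book( data )
{
  for( var d in data ) { this[d] = data[d]; }
}

var plain = { id: 2, name: 'PHP Hacks' };
var book = new Book( plain );

console.log( plain.name === book.name );   // prints true: same value either way
console.log( book['name'] === book.name ); // prints true: bracket and dot access are equivalent
console.log( book instanceof Book );       // prints true
console.log( plain instanceof Book );      // prints false: only the type differs
```

Code such as drawbooks that reads only the named values cannot tell the two apart.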
Of course, a Book class should really have accessors
like set and get.
Listing 16 shows how I like to encode the JavaScript data.
Listing 16. Object2.js
var b1 = new Book();
b1.setId ( 1 );
b1.setTitle ( 'Code Generation in Action' );
b1.setFirst ( 'Jack' );
b1.setLast ( 'Herrington' );
b1.setPublisher ( 'Manning' );
g_books.push( b1 );
var b2 = new Book();
b2.setId ( 2 );
b2.setTitle ( 'PHP Hacks' );
...
Right, that's more like it. I'll create an object, set its values, then add it to the array, and so on. To begin, I make some bigger modifications to the stylesheet, as shown in Listing 17.
Listing 17. Object2.xsl
...
<xsl:function name="js:createbook">
<xsl:param name="book" />
<xsl:variable name="b" select="concat( 'b', $book/@id )" />
var <xsl:value-of select="$b" /> = new Book();
<xsl:value-of select="concat( $b, '.setId' )" />
( <xsl:value-of select="$book/@id" /> );
<xsl:value-of select="concat( $b, '.setTitle' )" />
( '<xsl:value-of select="js:escape( $book/title )" />' );
<xsl:value-of select="concat( $b, '.setFirst' )" />
( '<xsl:value-of select="js:escape( $book/author/first )" />' );
<xsl:value-of select="concat( $b, '.setLast' )" />
( '<xsl:value-of select="js:escape( $book/author/last )" />' );
<xsl:value-of select="concat( $b, '.setPublisher' )" />
( '<xsl:value-of select="js:escape( $book/publisher )" />' );
</xsl:function>
<xsl:template match="/">
<xsl:for-each select="books/book">
<xsl:value-of select="js:createbook(.)" />
g_books.push( b<xsl:value-of select="@id" /> );
</xsl:for-each>
</xsl:template>
...
I defined a new function called createbook that
builds the book object and is invoked by the template with each book. The
createbook function still calls an
escape function to make sure the strings are encoded
properly.
On the HTML side of things, I must add more methods to the
Book class so that the encoded JavaScript code can
call them. These new methods are shown in Listing 18.
Listing 18. Object2.html
...
<script>
var g_books = [];
function Book() { }
Book.prototype.setId = function( val ) { this.id = val; }
Book.prototype.setTitle = function( val ) { this.name = val; }
Book.prototype.setFirst = function( val ) { this.first = val; }
Book.prototype.setLast = function( val ) { this.last = val; }
Book.prototype.setPublisher = function( val ) { this.publisher = val; }
</script>
...
The prototype mechanism is a distinctive feature of the JavaScript language. Each object in this language is its own individual entity, with its own data and functions that you can set independently. All objects created from the same constructor share the same prototype. So, to create methods shared by all objects of a class, I set the function on the prototype, not just on a single object.
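The difference between a prototype method and a per-object function can be seen in a small sketch. The describe function is a hypothetical addition used only to show the contrast.

```javascript
function Book() { }
Book.prototype.setTitle = function( val ) { this.name = val; };

var b1 = new Book();
var b2 = new Book();
b1.setTitle( 'PHP Hacks' );

// A function set directly on one object belongs to that object only.
b1.describe = function() { return 'Book: ' + this.name; };

console.log( b1.setTitle === b2.setTitle ); // prints true: one function shared through the prototype
console.log( typeof b2.describe );          // prints undefined: describe lives on b1 alone
console.log( b1.describe() );               // prints Book: PHP Hacks
```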
You can use any of several techniques to encode data stored in XML as JavaScript code. How you encode the data depends on the design of your Web 2.0 application and what you intend to do with the data after it's on the page. The key is to make the best use of the dynamic JavaScript language that you generate.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample code used in this article | x-xml2json-samplecode.zip | 7KB | HTTP |
Jack D. Herrington is a senior software engineer with more than 20 years of experience. He's the author of three books: Code Generation in Action, Podcasting Hacks, and PHP Hacks. He has also written more than 30 articles. You can reach Jack at jherr@pobox.com.