Reading XML documents is not easy -- for a human. XML documents are all text, but the visual arrangement of parts does not necessarily correspond well to the conceptual connections between the parts. And finding the content amidst the tags makes skimming difficult. Of course, XML is rarely intended primarily as a format for humans to look at directly. Typically, an XML source is transformed into something else before it becomes ready for human consumption. For prose-oriented documents, usually the target is either an HTML page, or a PDF file (via Formatting Object [FO] tags), or perhaps a printed page. XSLT is often used to perform the transformations to human-readable format. For data-oriented XML documents, the target is usually the data format of a DBMS or an in-memory representation by an application that reads XML files.
Readers of developerWorks, however, are generally developers. Our lot is to look at many things that end-users can and should be spared. When something doesn't work in the behind-the-scenes trail of format transformations, it is our job to eyeball the intermediate formats, XML included. Often during the development process, it is also our responsibility to develop sample or test XML documents to simulate what might come out of or go into some stage in a distributed application (before the real generator or consumer exists).
Compared to some formats, XML is somewhat manageable in its raw form. Unlike binary formats, it is not out of the question to open an XML document in a text editor or text viewer. However, tags can be hard to parse visually, especially if the XML producer does not arrange vertical and horizontal whitespace to make it easier. If a big part of your job is reading raw XML documents, take a look at one of the XML editors I reviewed previously (see Resources). But for someone who only occasionally views XML source files (or when these files need to be viewed by a number of people), XML editors are often too expensive, not just in licensing dollars but also in learning curves.
Almost all developers have a wonderful XML viewer already installed ... well, at least a pretty good one. Recent versions of both Internet Explorer and Mozilla/Netscape make an effort to render XML documents along with HTML ones. Other browsers like Opera and Konqueror also implement CSS2 -- Opera 5+ does a flawless job, Konqueror 2.2 a moderately good one. In general, Mozilla and Netscape 6 do an excellent and accurate job of displaying an XML document in the styles indicated in a CSS2 style sheet. Internet Explorer (at least as of version 6) does a fair job, but seems to ignore the display: inline property. This makes IE6 less suitable for displaying prose-oriented XML documents (but it is still good for data-oriented ones). However, IE6 does have the advantage of displaying XML documents that lack a CSS2 style sheet as a hierarchical tree, and it allows folding subtrees.
Normally I either use XMetal (with some XMetal "rules" provided by my developerWorks colleague Benoît Marchal) or write the source in what I call "smart ASCII" and transform it with txt2dw.py, a tool I wrote that converts the text to the developerWorks XML manuscript format. As an exercise, I decided to write this tip using only a text editor (plus Mozilla 0.9.5). The exercise helped me understand the ins and outs of the Web browser-plus-CSS2 approach.
Here is how I approached things. I wrote some words in the appropriate XML dialect (using an earlier tip as a template). Then I created an empty dW.css CSS2 file to work with and added the following line to my XML document to serve as the style sheet declaration:
<?xml-stylesheet href="dW.css" type="text/css"?>
So far, with that style sheet declaration alone, Mozilla does not do anything to help see the structure of the document. What you need to do next is build up a CSS2 style sheet for prettifying the XML document. An easy approach is to start at the top of the XML document and work your way down. For example, the first thing in a developerWorks article.dtd document is a <seriestitle> for the name of a column and the like. I'll make that look big and bold, and center it for emphasis. Actually, even before that, there are a few defaults that I know the whole document should have (we can override them for individual contexts, as needed). Listing 1 shows the first few lines of my CSS2 file.
Listing 1. Initial style sheet contents
$DOCUMENT {
  font-family: "Times New Roman";
  font-size: 12pt;
  margin-top: 5px;
  margin-left: 10px;
}
* {
  display: block;
  background-color: white;
  padding: 2px;
}
seriestitle {
  font-weight: bold;
  font-size: 18pt;
  text-align: center;
}
From this point, I moved on to the next elements encountered (<papertitle> in this case), one after the other. After a few additions of block-level elements, I figured it would be worthwhile to make sure all the inline elements appear that way. For this, a glance through the DTD helped remind me of the relevant elements. So I included the few lines in Listing 2.
Listing 2. Handling inline elements
/* Inline Typographic Elements */
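Because the * rule in Listing 1 makes every element a block by default, inline elements have to be switched back explicitly. As a minimal sketch of what such rules might look like (the element names b, i, code, and a are assumptions chosen for illustration; they are not taken from the developerWorks DTD, so substitute whatever inline elements your dialect defines):

```css
/* Hypothetical inline elements; replace with the names your DTD defines */
b, i, code, a {
  display: inline;   /* override the block default set by the * rule */
  padding: 0;        /* drop the 2px padding applied by the * rule */
}
b    { font-weight: bold; }
i    { font-style: italic; }
code { font-family: monospace; }
a    { color: blue; text-decoration: underline; }
```

Type selectors such as these are more specific than the universal selector, so they win over the * rule regardless of where they appear in the style sheet.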
Add a few more block-level elements, and you wind up with a very nice presentation of an XML document. Even better, the bit of work I needed to do will be useful every time I need to look at a document in the same XML dialect in the future. Assuming that you use an up-to-date Web browser, you can view the selfsame useful appearance without needing to first transform the XML source to an HTML or PDF format (as developerWorks does as part of its own process).
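The kind of block-level additions described above might look roughly like the following. This is only a sketch: papertitle is the element mentioned earlier, while heading and p are hypothetical names standing in for whatever section-title and paragraph elements your dialect uses, and the sizes are arbitrary.

```css
/* papertitle is named in the text; heading and p are hypothetical names */
papertitle {
  font-size: 16pt;
  font-weight: bold;
  text-align: center;
  margin-bottom: 10px;
}
heading {
  font-size: 14pt;
  font-weight: bold;
  margin-top: 12px;
}
p {
  margin-top: 6px;
  margin-bottom: 6px;
}
```

None of these rules needs its own display: block, since the * rule in Listing 1 already supplies it.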
The procedure for developing a CSS2 style sheet to match an XML document dialect is straightforward. You will find different specific tags to worry about, of course. And for more data-oriented documents, you will almost surely want to use some display: table properties somewhere in the definition. Doing a little work to set up a CSS2 style sheet makes examining XML documents considerably easier.
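For a data-oriented document, the table-related display values can lay repeated records out as rows and columns. The sketch below assumes an invented dialect in which a records element contains record children, each holding a name and a price; all four element names are hypothetical.

```css
/* Hypothetical dialect: <records><record><name/><price/></record>...</records> */
records {
  display: table;
  border-collapse: collapse;
}
record {
  display: table-row;      /* one row per record */
}
name, price {
  display: table-cell;     /* one column per field */
  border: 1px solid gray;
  padding: 2px 8px;
}
price {
  text-align: right;
}
```

How faithfully this renders depends on the browser's support for the CSS2 table model, so it is worth checking the result in the browsers you actually use.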
Let me leave you with a picture of this document as I worked with it, shown in Figure 1. The relevant sources can be found in the Resources links, but different browser versions and platforms may produce somewhat different renderings (and if your browser doesn't do something reasonable, upgrade).
Figure 1. A view of this document viewed with Mozilla, guided by a CSS2 style sheet
Resources
- In XML Matters #6 I reviewed a number of custom XML editors (many supporting CSS2).
- The resources that went into the production of the tip in front of you include the XML file that underlies this tip. Also noteworthy is the CSS2 style sheet that I used (and modified) during the writing of the tip. Moreover, to conform with the necessary XML dialect, I kept the developerWorks DTD open in a window while writing this tip. (To view some of these files, you might need to right click and save the file to your local machine.)
- A wonderful online resource for looking up CSS properties is the CSS2 Tutorial by Miloslav Nic. Particularly valuable is the Index of CSS properties.

David Mertz uses a wholly unstructured brain to write about structured document formats. David may be reached at mertz@gnosis.cx; his life pored over at http://gnosis.cx/dW/.




