This article takes a sneak peek at what's new in XHTML 2.0 and how you might one day put it to use. Readers should be familiar with HTML and/or XHTML 1.0. Familiarity with Cascading Style Sheets (CSS) is helpful, but not required.
Good-bye backward compatibility, hello structure
When the World Wide Web Consortium (W3C) released the first working draft of XHTML 2.0 on 5 August 2002, the major surprise was that, unlike its predecessors, it was not backward compatible. With previous releases, such as the move from HTML 4.01 to XHTML 1.0, and later to XHTML 1.1, the changes were about additions; a browser that could read XHTML 1.0 (Transitional) documents could also understand HTML 4.01 documents. Not so with XHTML 2.0.
If, two years ago, you had announced that today we'd be looking at a version of HTML without an img tag or a bold tag, the vast majority of Web developers would have looked at you in disbelief. Yet here it is. In addition to replacing both forms and frames outright, XHTML 2.0 removes the b, i, and img tags (as well as big, small and tt), and even deprecates br in preparation for removing it from a future release. But why?
The reason is that most of these tags are presentational. Their sole purpose is to give the browser instructions on how their contents should look, while providing absolutely no information on what their contents are. For example, consider these two sentences:
Presentational elements are, <i>for the most part</i>, <b>gone</b>.
and
Presentational elements are, <em>for the most part</em>, <strong>gone</strong>.
In the absence of a style sheet, the sentences appear identical in the browser, but only the second sentence provides information on why. In fact, the em (emphasis) and strong tags have been in HTML from the beginning, but for years authors have largely ignored them, concentrating on presentation at the expense of content.
This doesn't mean, however, that whenever you want to have something in bold or italics you should shoehorn it into these two tags. Instead, the whole purpose of stripping out presentational elements is to try to finish the job the inventors of CSS started, namely that content should be marked up according to what it represents, and style sheets should be used to make it look pretty. For example, Listing 1 uses classes to indicate content types.
Listing 1. Using classes to specify content types
<html>
<head><title>Employee Notice</title>
<style type="text/css">
.duedate { color: red;
font-weight: bold; }
.holiday { color: green;
font-style: italic }
</style>
</head>
<body>
<h1>Notice</h1>
<p>Employees should take note of the following important dates:</p>
<ul>
<li class="duedate">8/28/2002 (Progress reports due)</li>
<li class="holiday">9/1/2002 (Labor Day)</li>
<li class="duedate">10/28/2002 (Final reports due)</li>
</ul>
</body>
</html>
In this page, the type of day is identifiable from the content itself, and the browser can use the class information to decide how to style it, as shown in Figure 1.
Figure 1. Classes can determine what type of content is present, and style sheets can format it appropriately.

Looking at it this way, the break (br) tag can't have a purpose beyond presentation because it doesn't actually have any content. XHTML 2.0 deprecates the br tag in favor of the line tag. The line tag specifies a particular kind of content: a line of text or other content that is generally rendered in such a way that it's followed by a line feed and carriage return. For example, the text:
<p>
public class HelloWorld {<br />
public static void main (String[] args){<br />
System.out.println("Hello world!");<br />
}<br />
}
</p>
becomes
<p>
<line>public class HelloWorld {</line>
<line>public static void main (String[] args){ </line>
<line>System.out.println("Hello world!"); </line>
<line>}</line>
<line>}</line>
</p>
This way, the document has an actual object that represents the line, the same way that a paragraph (p) tag represents a paragraph of content.
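As a rough illustration of the difference, the following Python sketch rewrites a br-separated paragraph into line elements. It is only a sketch under stated assumptions: the function name br_to_lines is hypothetical, the input is assumed to be a well-formed p element containing nothing but text and br tags, and the standard library's xml.etree.ElementTree stands in for whatever tooling you actually use.

```python
import xml.etree.ElementTree as ET


def br_to_lines(p_xml: str) -> str:
    """Rewrite a <p> whose lines end in <br /> as a <p> of <line> elements.

    Hypothetical helper for illustration; not part of any XHTML tool.
    """
    old_p = ET.fromstring(p_xml)

    # The text before the first <br /> is old_p.text; the text that follows
    # each <br /> is stored on that element as its "tail".
    segments = [old_p.text or ""]
    for child in old_p:
        if child.tag != "br":
            raise ValueError(f"unexpected element <{child.tag}> inside <p>")
        segments.append(child.tail or "")

    new_p = ET.Element("p")
    for segment in segments:
        text = segment.strip()
        if text:  # skip the empty segment left after a trailing <br />
            ET.SubElement(new_p, "line").text = text
    return ET.tostring(new_p, encoding="unicode")


if __name__ == "__main__":
    print(br_to_lines("<p>first line<br />second line<br /></p>"))
```

The point of the exercise is that each line becomes a node a program can address, whereas a br is only a marker between two runs of text.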
Why is all of this important? The Web is increasingly becoming a place where communication happens not just between people, but between software applications such as servers and search engine indexers. What's more, the days when everyone (or almost everyone) used the same browser are long gone. Increasingly, developers are redesigning content for different devices, such as PDAs and mobile phones. Voice-activated systems are not that far off. The structural meaning of content is becoming almost as important as the content itself.
To that end, two of the additions to XHTML 2.0 are sections and headings. HTML has always had the numbered headings, h1 through h6, and as of the 5 August 2002 working draft they haven't been deprecated, but it's only a matter of time. In their place, XHTML 2.0 offers a generic heading element (h) and a section element. Sections can be nested, so the level of a heading follows from how deeply its section is nested. A document formerly rendered with numbered headings (Listing 2):
Listing 2. Numbered headings in a document
<html>
<head><title>Adding sections</title></head>
<body>
<h1>The Web's future: XHTML 2.0</h1>
<p>by Nicholas Chase</p>
<h2>Good-bye backward compatibility, hello structure</h2>
<p>Why backward compatibility is over.</p>
<h3>Presentation versus Structure</h3>
<p>Using style sheets rather than presentational elements.</p>
<h3>Lines</h3>
<p>Line breaks are deprecated.</p>
<h2>Sections</h2>
<p>Creating more reasonable sections.</p>
<h2>Navigation lists and menus</h2>
<p>Hierarchical menus.</p>
<h2>Links, links, everywhere</h2>
<p>Adding links.</p>
</body>
</html>
can be replaced with generic headings and sections (Listing 3):
Listing 3. Generic headings and sections
<html>
<head><title>Adding sections</title></head>
<body>
<section>
<h>The Web's future: XHTML 2.0</h>
<p>by Nicholas Chase</p>
<section>
<h>Good-bye backward compatibility, hello structure</h>
<p>Why backward compatibility is over.</p>
<section>
<h>Presentation vs. Structure</h>
<p>Using style sheets rather than presentational elements.</p>
</section>
<section>
<h>Lines</h>
<p>Line breaks are deprecated.</p>
</section>
</section>
<section>
<h>Sections</h>
<p>Creating more reasonable sections.</p>
</section>
<section>
<h>Navigation lists and menus</h>
<p>Hierarchical menus.</p>
</section>
<section>
<h>Links, links, everywhere</h>
<p>Adding links.</p>
</section>
</section>
</body>
</html>
This structure gives two advantages. First, it's much easier for an application such as a search engine crawler to understand the relative importance of content, and second, a section is self-contained. In HTML, a section started with its heading, so no content, such as introductory material, could appear before the heading. The section element removes that constraint because anything within it is part of the section.
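To see why nesting makes relative importance easier to compute, consider this Python sketch of what a crawler might do. It is an assumption-laden illustration, not anything the working draft prescribes: the function name outline is hypothetical, and the "level" it reports is simply the number of section elements enclosing each h element.

```python
import xml.etree.ElementTree as ET


def outline(xhtml: str):
    """Return (depth, heading text) pairs for every <h> in the document.

    Depth is the number of enclosing <section> elements, so a heading
    in a top-level section has depth 1.
    """
    root = ET.fromstring(xhtml)
    headings = []

    def walk(element, depth):
        for child in element:
            if child.tag == "section":
                walk(child, depth + 1)   # one level deeper
            elif child.tag == "h":
                headings.append((depth, child.text or ""))
            else:
                walk(child, depth)       # e.g. <body>, <p>: same level

    walk(root, 0)
    return headings


if __name__ == "__main__":
    doc = (
        "<body><section><h>The Web's future</h>"
        "<section><h>Sections</h></section>"
        "</section></body>"
    )
    for depth, text in outline(doc):
        print("  " * (depth - 1) + text)
```

With numbered headings, the same program would have to trust that the author chose h2 and h3 consistently; here the level is a property of the document tree itself.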
One structural addition that could yield massive benefits to Web developers is the navigation list. A navigation list, designated by the nl tag, works much like its ordered list (ol) and unordered list (ul) cousins, but with a twist: The items of a navigation list appear only when the list is active. In this way, navigation lists are very similar to the hierarchical pop-up menus that are so popular because they provide a lot of navigational information without taking up much screen real estate. For example, a soap opera site might have the following menu (Listing 4):
Listing 4. Using a navigational list
<nl>
<name>Character Options</name>
<li href="stay.html">Stay</li>
<nl>
<name>Leave</name>
<li href="newjob.html">Job transfer</li>
<li href="divorce.html">Divorce</li>
<li href="fataldisease.html">Fatal disease</li>
</nl>
<li href="backburner.html">Back Burner</li>
</nl>
When a user activates the name (Character Options), the list items appear. The working draft is unclear as to whether a sub-list, such as the Leave menu, appears when a user activates the main list or whether a user must activate the sublist item itself. Authors might ultimately control this behavior through styles or events. In any case, when the main element loses focus, the list items disappear.
You may have noticed that even though it was intended as a menu, the previous example has no anchor (a) tags. Instead, the href attribute has been placed right on the li elements. This isn't a feature of navigation lists, but rather a new feature of XHTML 2.0. The hypertext-related attributes, such as href, target, and accesskey, are now part of the Common Attribute Collection, which also includes the Core Attributes (class, id, and title), the Internationalization Attributes (xml:lang, which replaced lang in XHTML 1.1), and the Events Attributes, which come from the XML Events recommendation, as you'll see below.
What this means is that you can turn any element into a link simply by adding an href attribute to it, rather than having to surround individual elements with an anchor tag.
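One practical consequence is that software looking for links can no longer search for anchor tags alone. The Python sketch below shows one way a link extractor might adapt; the function name find_links is hypothetical, and the approach (treat any element carrying href as a link) is an assumption based on the draft's description rather than a prescribed algorithm.

```python
import xml.etree.ElementTree as ET


def find_links(xhtml: str):
    """Return (tag, href, text) for every element that carries an href.

    Elements are visited in document order, whatever their tag name.
    """
    root = ET.fromstring(xhtml)
    return [
        (element.tag, element.get("href"), (element.text or "").strip())
        for element in root.iter()
        if element.get("href") is not None
    ]


if __name__ == "__main__":
    menu = (
        "<nl><name>Character Options</name>"
        '<li href="stay.html">Stay</li>'
        '<li href="backburner.html">Back Burner</li></nl>'
    )
    for tag, href, text in find_links(menu):
        print(tag, href, text)
```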
Does this mean that XLink, after four years of work, has been adopted into XHTML 2.0? In a word, no. In fact, the difference between XLink and the linking specified in XHTML 2.0 is a source of some controversy among those working on the respective recommendations, so changes may yet be made between this first public working draft and the final recommendation. In the meantime, you can duplicate much of XLink's functionality by combining href attributes on arbitrary elements, navigation lists, and the link element, as well as Resource Description Framework (RDF).
One XML-related recommendation that did make it into XHTML 2.0 is XForms.
XML Forms Language (XForms) is a completely new way to look at a form, one in which, as in the rest of XHTML, content, structure, and presentation are completely separate. An XForms page specifies a model, which carries information about the form itself, and then you can scatter form elements around the page, rather than being confined to a single form element. This means that you can even combine elements for different forms in the same area of the page. You can populate the form through an instance document, which you reference from XPath expressions on the form elements. The form elements themselves also represent objects of particular types, rather than describing what they should look like on the page. When you update the data in a form element, the instance document is updated. When the user submits the form, the instance document is actually sent. Take, for example, the following simple form (Listing 5):
Listing 5. A simple HTML form
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Preference Form</title>
</head>
<body>
<h1>Preferences Form</h1>
<form action="myformprocessor.jsp">
<p>
Username: <input type="text" name="userid" />
<br />
Password: <input type="password" name="pass"/>
</p>
<p>
Area preference:
<select name="seatingpreference">
<option value="1">One</option>
<option value="2">Two</option>
<option value="3">Three</option>
</select>
</p>
<p>
<input type="submit" />
</p>
</form>
</body>
</html>
Listing 6 shows an XForms version of the form:
Listing 6. An XForms version of the form
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:xforms="http://www.w3.org/2002/01/xforms">
<head>
<title>Preference Form</title>
<xforms:model>
<xforms:submitInfo method="postxml"/>
<xforms:instance xmlns="">
<preferences>
<person userid="">
<password></password>
</person>
<seatingpreference></seatingpreference>
</preferences>
</xforms:instance>
</xforms:model>
</head>
<body>
<h1>Preferences Form</h1>
<p>
<xforms:input ref="preferences/person/@userid">
<xforms:caption>Username: </xforms:caption>
</xforms:input>
<br />
<xforms:secret ref="preferences/person/password">
<xforms:caption>Password: </xforms:caption>
</xforms:secret>
</p>
<p>
<xforms:selectOne ref="preferences/seatingpreference" selectUI="listbox">
<xforms:caption>Area preference: </xforms:caption>
<xforms:item>
<xforms:value>1</xforms:value>
<xforms:caption>One</xforms:caption>
</xforms:item>
<xforms:item>
<xforms:value>2</xforms:value>
<xforms:caption>Two</xforms:caption>
</xforms:item>
<xforms:item>
<xforms:value>3</xforms:value>
<xforms:caption>Three</xforms:caption>
</xforms:item>
</xforms:selectOne>
</p>
<p>
<xforms:submit>
<xforms:caption>Submit Report</xforms:caption>
</xforms:submit>
</p>
</body>
</html>
Terminology Note: The XForms recommendation specifically notes that there is no singular form of XForms. It is an XForms page, never an XForm page.
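To make the relationship between a form control and the instance document more concrete, here is a small Python sketch of the update step. It is not how an XForms processor is implemented: the function set_bound_value is hypothetical, and it understands only a simplified path syntax assumed for this sketch (child element names separated by slashes, with an optional final @attribute step), not full XPath.

```python
import xml.etree.ElementTree as ET

# Instance data modeled on the instance in Listing 6.
INSTANCE = """<preferences>
  <person userid=""><password></password></person>
  <seatingpreference></seatingpreference>
</preferences>"""


def set_bound_value(root, ref, value):
    """Store value at the node that ref points to.

    ref starts with the root element's name, for example
    "preferences/person/password" or "preferences/person/@userid".
    """
    steps = ref.split("/")
    if steps[0] != root.tag:
        raise ValueError(f"ref must start with <{root.tag}>")
    node = root
    for step in steps[1:]:
        if step.startswith("@"):
            node.set(step[1:], value)   # attribute step ends the walk
            return
        node = node.find(step)          # first child with that name
        if node is None:
            raise ValueError(f"no element named {step!r} in {ref!r}")
    node.text = value


if __name__ == "__main__":
    instance = ET.fromstring(INSTANCE)
    set_bound_value(instance, "preferences/person/@userid", "nick")
    set_bound_value(instance, "preferences/seatingpreference", "2")
    print(ET.tostring(instance, encoding="unicode"))
```

What gets submitted is the updated instance as a whole, which is why the receiving application sees structured XML rather than a flat list of name and value pairs.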
Forms commonly need to be validated; that is, each data field must contain valid data before the form is accepted. XForms uses XML Schemas to constrain submitted data. You can further enhance the functionality of an XForms page with XML Events, which is also included in XHTML 2.0.
You may be familiar with handling events on a Web page by adding event attributes such as onclick and onmouseover. No more. These familiar attributes have been replaced by the integration of the XML Events module into XHTML 2.0. XML Events provides a generic way to specify an action that should happen when an event takes place. The advantage here is that you're not limited to predefined events such as a mouse click. Instead, you can define your own events and what should happen when they're triggered.
XML Events involves the following components. An event, such as a mouse click, is intended for a target. For example, in the page shown in Listing 7:
Listing 7. A page to click
<html>
<head><title>Rides</title></head>
<body>
<ul id="ridelist">
<li href="monorail.html">Monorail</li>
<li href="Matterhorn.html">Matterhorn</li>
<li href="coaster.html">Roller coaster</li>
</ul>
</body>
</html>
the user might click the second li element, Matterhorn. When that happens, the mouse click event travels from the root of the document to the target (li) and back again. The sequence is:
(root) -- html -- body -- ul -- li -- ul -- body -- html -- (root)
The trip down to the target is known as the capture phase, while the trip back up again is known as the bubbling phase. (Not all events can bubble.) At any time during the trip, the event may pass an object that has been registered as an observer, meaning that it's watching for that specific event, and if it sees the event, it performs a particular action. A listener creates the observer. For example, in the following sequence:
<ev:listener observer="ridelist" event="mousedown" handler="#myscript"/>
the listener makes the ul element (or rather, the overall list) an observer, so when the user clicks any list item, the observer (ridelist) executes myscript. (The mechanism for invoking an arbitrary script has yet to be determined.)
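The capture and bubbling trip can be sketched in a few lines of Python. This is only a model of the traversal, written under stated assumptions: the function names are hypothetical, the document root node itself is left out of the path, and the observer is assumed to be notified on the way back up (the bubbling phase), which the text above does not specify.

```python
import xml.etree.ElementTree as ET

# The page from Listing 7.
PAGE = """<html>
<head><title>Rides</title></head>
<body>
<ul id="ridelist">
<li href="monorail.html">Monorail</li>
<li href="Matterhorn.html">Matterhorn</li>
<li href="coaster.html">Roller coaster</li>
</ul>
</body>
</html>"""


def parent_map(root):
    """Map every element to its parent (ElementTree keeps no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}


def propagation_path(root, target):
    """Tag names visited from the top element down to target and back up."""
    parents = parent_map(root)
    chain = [target]
    while chain[-1] in parents:              # climb until the top element
        chain.append(parents[chain[-1]])
    down = [e.tag for e in reversed(chain)]  # capture: html ... li
    up = [e.tag for e in chain[1:]]          # bubble: ul ... html
    return down + up


def observers_notified(root, target, observer_ids):
    """Ids of registered observers the event passes while bubbling up."""
    parents = parent_map(root)
    notified = []
    node = target
    while node is not None:
        if node.get("id") in observer_ids:
            notified.append(node.get("id"))
        node = parents.get(node)
    return notified


if __name__ == "__main__":
    page = ET.fromstring(PAGE)
    matterhorn = page.findall("./body/ul/li")[1]
    print(" -- ".join(propagation_path(page, matterhorn)))
    print(observers_notified(page, matterhorn, {"ridelist"}))
```

Because the observer sits on the ul element, a click on any of the three list items passes it, so a single listener covers the whole list.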
Also replaced in XHTML 2.0 are the widely reviled frames. The first working draft of XFrames made its debut on 6 August 2002, the day after the XHTML 2.0 draft announced that it would use XFrames. XFrames attempts to solve the problems that traditional HTML frames presented, most of which revolved around the difficulty of bookmarking and refreshing pages and the inability of search engines, which don't support frames, to index the appropriate content.
In an XFrames document, the URIs for the included content become part of the URI for the overall document. For example, the following page in Listing 8 might represent an HTML page with three frames:
Listing 8. An XFrames page
<html>
<head><title>XFrames</title></head>
<body>
<row>
<frame id="header" />
<column>
<frame id="menu"/>
<frame id="content"/>
</column>
</row>
</body>
</html>
Notice that the URIs for each frame aren't specified, but each frame has its own unique identifier. The URI for this document might then be:
site.xfm#frames(header=header.xhtml,menu=menu.xhtml,content=main.xhtml)
A browser that understands XFrames would then associate the content of each frame to the appropriate URI. When the user clicks a link and changes the content of an individual frame, the overall URI of the page changes, so it always represents the actual content the user is looking at, and bookmarks and the back button provide accurate content.
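The bookkeeping involved can be sketched in Python. Treat this as a guess at the mechanics based only on the example URI above: the function names are hypothetical, and the fragment grammar assumed here (frames(id=uri,id=uri,...) with no escaping rules) is inferred from that single example rather than taken from the XFrames draft.

```python
import re


def parse_xframes_uri(uri: str):
    """Split an XFrames-style URI into (base, {frame id: content URI})."""
    base, _, fragment = uri.partition("#")
    frames = {}
    match = re.fullmatch(r"frames\((.*)\)", fragment)
    if match:
        for pair in match.group(1).split(","):
            name, sep, target = pair.partition("=")
            if sep:  # ignore empty or malformed entries
                frames[name.strip()] = target.strip()
    return base, frames


def build_xframes_uri(base: str, frames: dict) -> str:
    """Inverse of parse_xframes_uri; keeps the frames in dictionary order."""
    assignments = ",".join(f"{name}={target}" for name, target in frames.items())
    return f"{base}#frames({assignments})"


def follow_link(uri: str, frame_id: str, new_target: str) -> str:
    """URI of the page after one frame's content changes."""
    base, frames = parse_xframes_uri(uri)
    frames[frame_id] = new_target
    return build_xframes_uri(base, frames)


if __name__ == "__main__":
    start = "site.xfm#frames(header=header.xhtml,menu=menu.xhtml,content=main.xhtml)"
    # news.xhtml is a made-up target used only for this demonstration.
    print(follow_link(start, "content", "news.xhtml"))
```

Since the state of every frame lives in the URI, a bookmark taken after the link is followed restores the same combination of frames.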
The final major change as of the 5 August 2002 working draft involves the removal of the img tag and its replacement with the object tag. The object tag has actually been around since HTML 4.0, but developers have mostly used it for embedding multimedia and Java applets. It has always, however, been able to support images. A major advantage of using the object tag is that it's designed to cascade downwards. In other words, if the browser can't display a particular object, it will display that object's contents instead. For example, a browser encountering the following snippet first attempts to load the movie. Failing that, it loads the image. Failing that, it simply displays the text.
<object data="rides.mpeg" type="video/mpeg">
<object data="rollercoaster.jpg" type="image/jpeg">
Jack tries to expand his horizons on the racing coasters.
</object>
</object>
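The fallback rule is easy to express as a short recursive function. The Python sketch below is a simplified model, not a description of any browser: the function name resolve_object is hypothetical, "can display" is reduced to a lookup in a set of supported MIME types, and the snippet uses the registered types video/mpeg and image/jpeg.

```python
import xml.etree.ElementTree as ET

SNIPPET = """<object data="rides.mpeg" type="video/mpeg">
<object data="rollercoaster.jpg" type="image/jpeg">
Jack tries to expand his horizons on the racing coasters.
</object>
</object>"""


def resolve_object(element, supported_types):
    """Decide what to render for an <object>, falling back to its contents.

    Returns ("data", uri) when a displayable object is found, otherwise
    ("text", fallback text) from the innermost object reached.
    """
    if element.get("type") in supported_types:
        return ("data", element.get("data"))
    for child in element:
        if child.tag == "object":        # try the nested alternative
            return resolve_object(child, supported_types)
    # Nothing displayable: collapse whitespace in the fallback text.
    return ("text", " ".join((element.text or "").split()))


if __name__ == "__main__":
    outer = ET.fromstring(SNIPPET)
    for types in ({"video/mpeg", "image/jpeg"}, {"image/jpeg"}, set()):
        print(sorted(types), "->", resolve_object(outer, types))
```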
The only thing that is certain about the 5 August 2002 working draft of XHTML 2.0 is that nothing about it is certain. It will almost certainly change in some way between now and adoption as a recommendation, but the goal of emphasizing structure and semantics isn't likely to change. For this reason, it's a good idea to take a look at the pages you build now, and start getting into the habit of using structure and styles appropriately. Use markup to designate what something is, not what it should look like, and use CSS to do the rest. Overall, think more about the structure of your documents and what you want them to do, and less about what they should look like.
Nicholas Chase has been involved in Web site development for companies such as Lucent Technologies, Sun Microsystems, Oracle, and the Tampa Bay Buccaneers. Nick has been a high school physics teacher, a low-level radioactive waste facility manager, an online science fiction magazine editor, a multimedia engineer, and an Oracle instructor. More recently, he was the Chief Technology Officer of Site Dynamics Interactive Communications in Clearwater, Florida, and is the author of three books on Web development, including Java and XML from Scratch (Que) and the upcoming Primer Plus XML Programming (Sams). He loves to hear from readers and can be reached at nicholas@nicholaschase.com.




