The Web's future: XHTML 2.0

A sneak peek at the changes

Nicholas Chase (nicholas@nicholaschase.com), President, Chase and Chase, Inc.
Nicholas Chase has been involved in Web site development for companies such as Lucent Technologies, Sun Microsystems, Oracle, and the Tampa Bay Buccaneers. Nick has been a high school physics teacher, a low-level radioactive waste facility manager, an online science fiction magazine editor, a multimedia engineer, and an Oracle instructor. More recently, he was the Chief Technology Officer of Site Dynamics Interactive Communications in Clearwater, Florida, and is the author of three books on Web development, including Java and XML from Scratch (Que) and the upcoming Primer Plus XML Programming (Sams). He loves to hear from readers and can be reached at nicholas@nicholaschase.com.

Summary:  Over the years, HTML has only become bigger, never smaller, because new versions had to maintain backward compatibility. That's about to change. On 5 August 2002, the first working draft of XHTML 2.0 was released and the big news is that backward compatibility has been dropped; the language can finally move on. So, what do you as a developer get in return? How about robust forms and events, a better way to look at frames and even hierarchical menus that don't require massive amounts of JavaScript.

Date:  01 Sep 2002
Level:  Intermediate


This article takes a sneak peek at what's new in XHTML 2.0 and how you might one day put it to use. Readers should be familiar with HTML and/or XHTML 1.0. Familiarity with Cascading Style Sheets (CSS) is helpful, but not required.

Good-bye backward compatibility, hello structure

When the World Wide Web Consortium (W3C) released the first working draft of XHTML 2.0 on 5 August 2002, the major surprise was that, unlike its predecessors, it was not backward compatible. With previous releases, such as the move from HTML 4.01 to XHTML 1.0, and later to XHTML 1.1, the changes were about additions; a browser that could read XHTML 1.0 (Transitional) documents could also understand HTML 4.01 documents. Not so with XHTML 2.0.

If, two years ago, you had announced that today we'd be looking at a version of HTML without an img tag or a bold tag, the vast majority of Web developers would have looked at you in disbelief. Yet here it is. In addition to replacing both forms and frames outright, XHTML 2.0 removes the b, i, and img tags (as well as big, small and tt), and even deprecates br in preparation for removing it from a future release. But why?

The reason is that most of these tags are presentational. Their sole purpose is to give the browser instructions on how their contents should look, while providing absolutely no information on what their contents are. For example, consider these two sentences:

Presentational elements are, <i>for the most part</i>, <b>gone</b>.

and

Presentational elements are, <em>for the most part</em>, <strong>gone</strong>.

In the absence of a style sheet, the sentences appear identical in the browser, but only the second sentence provides information on why. In fact, the emphasis (em) and strong tags have been in HTML from the beginning, but for years authors have largely ignored them, concentrating on presentation at the expense of content.

This doesn't mean, however, that whenever you want to have something in bold or italics you should shoehorn it into these two tags. Instead, the whole purpose of stripping out presentational elements is to try to finish the job the inventors of CSS started, namely that content should be marked up according to what it represents, and style sheets should be used to make it look pretty. For example, Listing 1 uses classes to indicate content types.


Listing 1. Using classes to specify content types
		
<html>
<head><title>Employee Notice</title>
<style type="text/css">
    .duedate { color: red;
               font-weight: bold; }
    .holiday { color: green;
               font-style: italic }
</style>
</head>
<body>
<h1>Notice</h1>
<p>Employees should take note of the following important dates:</p>
<ul>
    <li class="duedate">8/28/2002 (Progress reports due)</li>
    <li class="holiday">9/1/2002 (Labor Day)</li>
    <li class="duedate">10/28/2002 (Final reports due)</li>
</ul>
</body>
</html>

In this page, the type of day is identifiable from the content itself, and the browser can use the class information to decide how to style it, as shown in Figure 1.


Figure 1. Classes can determine what type of content is present, and style sheets can format it appropriately.

Looking at it this way, the break (br) tag can't have a purpose beyond presentation because it doesn't actually have any content. XHTML 2.0 deprecates the br tag in favor of the line tag. The line tag specifies a particular kind of content: a line of text or other content that is generally rendered in such a way that it's followed by a line feed and carriage return. For example, the text:

<p>
public class HelloWorld {<br />
public static void main (String[] args){<br />
System.out.println("Hello world!");<br />
}<br />
}
</p>

becomes

<p>
<line>public class HelloWorld {</line>
<line>public static void main (String[] args){ </line>
<line>System.out.println("Hello world!"); </line>
<line>}</line>
<line>}</line>
</p>

This way, the document has an actual object that represents the line, the same way that a paragraph (p) tag represents a paragraph of content.
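
The conversion above is mechanical, so it can be sketched in a few lines of Python. This is only an illustration: the helper name to_line_markup is hypothetical, and it assumes the input is plain text with one logical line per row.

```python
from xml.sax.saxutils import escape


def to_line_markup(text):
    """Wrap each row of plain text in a <line> element inside one paragraph.

    Hypothetical helper: markup-significant characters (&, <, >) are escaped
    so the result stays well-formed.
    """
    lines = "\n".join(f"<line>{escape(row)}</line>" for row in text.splitlines())
    return f"<p>\n{lines}\n</p>"


source = 'public class HelloWorld {\nSystem.out.println("Hello world!");\n}'
print(to_line_markup(source))
```

Because every line becomes an element of its own, a style sheet or script can address individual lines, which a run of br tags does not allow.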

Why is all of this important? The Web is increasingly becoming a place where communication happens not just between people, but between software applications such as servers and search engine indexers. What's more, the days when everyone (or almost everyone) used the same browser are long gone. Increasingly, developers are redesigning content for different devices, such as PDAs and mobile phones. Voice-activated systems are not that far off. The structural meaning of content is becoming almost as important as the content itself.


Sections

To that end, two of the additions to XHTML 2.0 are sections and generic headings. HTML has always had numbered headings, h1 through h6; as of the 5 August 2002 working draft they haven't been deprecated, but it's only a matter of time. In their place, XHTML 2.0 offers a generic heading element (h) and a section element. Sections can be nested, so a heading's level follows from how deeply its section is nested rather than from a number in the tag name. A document formerly marked up with numbered headings (Listing 2):


Listing 2. Numbered headings in a document
		

<html>
<head><title>Adding sections</title></head>
<body>
   <h1>The Web's future: XHTML 2.0</h1>
   <p>by Nicholas Chase</p>
   <h2>Good-bye backward compatibility, hello structure</h2>
   <p>Why backward compatibility is over.</p>
   <h3>Presentation versus Structure</h3>
   <p>Using style sheets rather than presentational elements.</p>
   <h3>Lines</h3>
   <p>Line breaks are deprecated.</p>
   <h2>Sections</h2>
   <p>Creating more reasonable sections.</p>
   <h2>Navigation lists and menus</h2>
   <p>Hierarchical menus.</p>
   <h2>Links, links, everywhere</h2>
   <p>Adding links.</p>
</body>
</html>

can be replaced with generic headings and sections (Listing 3):


Listing 3. Generic headings and sections
		

<html>
<head><title>Adding sections</title></head>
<body>
   <section>
      <h>The Web's future: XHTML 2.0</h> 
      <p>by Nicholas Chase</p>
      <section>
         <h>Good-bye backward compatibility, hello structure</h>
         <p>Why backward compatibility is over.</p>
         <section>
            <h>Presentation vs. Structure</h>
            <p>Using style sheets rather than presentational elements.</p>
         </section>
         <section>
             <h>Lines</h>
             <p>Line breaks are deprecated.</p>
         </section>
      </section>
      <section>
         <h>Sections</h> 
         <p>Creating more reasonable sections.</p>
      </section>
      <section>
          <h>Navigation lists and menus</h>
          <p>Hierarchical menus.</p>
      </section>
      <section>
          <h>Links, links, everywhere</h>
          <p>Adding links.</p>
      </section>
   </section>
</body>
</html>

This structure gives two advantages. First, it's much easier for an application such as a search engine crawler to understand the relative importance of content, and second, a section is self-contained. In HTML, a section started with its heading, so no content, such as introductory material, could appear before the heading. The section element removes that constraint because anything within it is part of the section.
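
To illustrate the first advantage, here is a minimal Python sketch of how an application such as a crawler might recover heading levels from section nesting alone. The outline function and the abbreviated sample document are assumptions made for this example; nothing in the working draft prescribes this algorithm.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the nested-section document (illustrative only).
DOC = """
<body>
  <section>
    <h>The Web's future: XHTML 2.0</h>
    <section>
      <h>Good-bye backward compatibility, hello structure</h>
      <section><h>Lines</h></section>
    </section>
    <section><h>Sections</h></section>
  </section>
</body>
"""


def outline(element, depth=0):
    """Return (level, heading text) pairs in document order.

    The level is simply the number of section elements enclosing the heading.
    """
    entries = []
    for child in element:
        if child.tag == "section":
            entries.extend(outline(child, depth + 1))
        elif child.tag == "h":
            entries.append((depth, child.text))
    return entries


headings = outline(ET.fromstring(DOC.strip()))
for level, text in headings:
    print("  " * (level - 1) + f"level {level}: {text}")
```

The level never appears in the markup itself; moving a section to a different place in the document changes its level without editing any heading.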


Navigation lists and menus

One structural addition that could yield massive benefits to Web developers is the navigation list. A navigation list, designated by the nl tag, works much like its ordered list (ol) and unordered list (ul) cousins, but with a twist: The items of a navigation list appear only when the list is active. In this way, navigation lists are very similar to the hierarchical pop-up menus that are so popular because they provide a lot of navigational information without taking up much screen real estate. For example, a soap opera site might have the following menu (Listing 4):


Listing 4. Using a navigational list
		
<nl>
   <name>Character Options</name>
   <li href="stay.html">Stay</li>
   <nl>
      <name>Leave</name>
      <li href="newjob.html">Job transfer</li>
      <li href="divorce.html">Divorce</li>
      <li href="fataldisease.html">Fatal disease</li>
   </nl>
   <li href="backburner.html">Back Burner</li>
</nl>

When a user activates the name (Character Options), the list items appear. The working draft is unclear as to whether a sub-list, such as the Leave menu, appears when a user activates the main list or whether a user must activate the sublist item itself. Authors might ultimately control this behavior through styles or events. In any case, when the main element loses focus, the list items disappear.


Links, links, everywhere

You may have noticed that even though it was intended as a menu, the previous example has no anchor (a) tags. Instead, the href attribute has been placed right on the li elements. This isn't a feature of navigation lists, but rather a new feature of XHTML 2.0. The hypertext-related attributes, such as href, target, and accesskey, are now part of the Common Attribute Collection, which also includes the Core Attributes (class, id, and title), the Internationalization Attributes (xml:lang, which replaced lang in XHTML 1.1), and the Events Attributes, which come from the XML Events recommendation, as you'll see below.

What this means is that you can turn any element into a link simply by adding an href attribute to it, rather than having to surround individual elements with an anchor tag.
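
One practical consequence is that a tool looking for links can no longer search for anchor tags alone; it has to check every element for an href attribute. The following Python sketch shows the idea. The find_links function and the sample fragment are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: links sit on a heading and a paragraph, not on <a>.
FRAGMENT = """
<body>
  <h href="intro.html">Introduction</h>
  <p href="details.html">Read the details.</p>
  <p>This paragraph is not a link.</p>
</body>
"""


def find_links(markup):
    """Return (tag, href) for every element carrying an href attribute."""
    root = ET.fromstring(markup.strip())
    return [(el.tag, el.get("href")) for el in root.iter() if el.get("href") is not None]


links = find_links(FRAGMENT)
for tag, href in links:
    print(f"<{tag}> links to {href}")
```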

Does this mean that XLink, after four years of work, has been adopted into XHTML 2.0? In a word, no. In fact, the difference between XLink and the linking specified in XHTML 2.0 is a source of some controversy among those who are working on the respective recommendations, so it's possible that changes may be made between this first public working draft and the final recommendation. In the meantime, you can duplicate much of XLink's functionality with a combination of this functionality, navigation lists, and the use of the link element, as well as Resource Description Framework (RDF).

One XML-related recommendation that did make it into XHTML 2.0 is XForms.


XForms

XML Forms Language (XForms) is a completely new way to look at a form: as in the rest of XHTML, content, structure, and presentation are kept completely separate. An XForms page specifies a model, which carries information about the form itself, and then you can scatter form elements around the page, rather than being confined to a single form element. This means that you can even combine elements for different forms in the same area of the page. You can populate the form through an instance document, whose nodes the form elements reference through XPath expressions. The form elements themselves also represent objects of particular types, rather than describing what they should look like on the page. When the user changes the data in a form element, the instance document is updated. When the user submits the form, the instance document itself is what gets sent. Take, for example, the following simple form (Listing 5):


Listing 5. A simple HTML form
		

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Preference Form</title>
</head>
<body>

<h1>Preferences Form</h1>
<form action="myformprocessor.jsp">
<p> 
    Username: <input type="text" name="userid" />
    <br />
    Password: <input type="password" name="pass"/>
</p>

<p>
    Area preference:
        <select name="seatingpreference">
             <option value="1">One</option>
             <option value="2">Two</option>
             <option value="3">Three</option>
        </select>
</p>
<p>
    <input type="submit" />
</p>
</form>
</body>
</html>

Listing 6 shows an XForms version of the form:


Listing 6. An XForms version of the form
		

<html xmlns="http://www.w3.org/1999/xhtml" 
      xmlns:xforms="http://www.w3.org/2002/01/xforms">
<head>
    <title>Preference Form</title>

    <xforms:model>

        <xforms:submitInfo method="postxml"/>

        <xforms:instance xmlns="">
            <preferences>
                <person userid="">
                    <password></password>
                </person>
                <seatingpreference></seatingpreference>
            </preferences>
        </xforms:instance>
    </xforms:model>

</head>
<body>

<h1>Preferences Form</h1>

<p> 
    <xforms:input ref="preferences/person/@userid">
        <xforms:caption>Username: </xforms:caption>
    </xforms:input>

    <br />

    <xforms:secret ref="preferences/person/password">
        <xforms:caption>Password: </xforms:caption>
    </xforms:secret>
</p>

<p>

    <xforms:selectOne ref="preferences/seatingpreference" selectUI="listbox">
        <xforms:caption>Area preference:   </xforms:caption>
        <xforms:item>
            <xforms:value>1</xforms:value>
            <xforms:caption>One</xforms:caption>
        </xforms:item>
        <xforms:item>
            <xforms:value>2</xforms:value>
            <xforms:caption>Two</xforms:caption>
        </xforms:item>
        <xforms:item>
            <xforms:value>3</xforms:value>
            <xforms:caption>Three</xforms:caption>
        </xforms:item>
    </xforms:selectOne>

</p>
<p>
    <xforms:submit>
        <xforms:caption>Submit Report</xforms:caption>
    </xforms:submit>
</p>

</body>
</html>


Terminology Note: The XForms recommendation specifically notes that there is no singular form of XForms. It is an XForms page, never an XForm page.
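
The relationship between form elements and the instance document can be modeled in a few lines of Python. This is a sketch under stated assumptions, not how an XForms processor is implemented: load_instance, set_value, and submission_body are hypothetical helpers, the attribute reference is written in standard XPath form (person/@userid), and only plain element paths with an optional trailing attribute step are supported.

```python
import xml.etree.ElementTree as ET

# The instance data from the model, copied here as a string.
INSTANCE = """
<preferences>
    <person userid="">
        <password></password>
    </person>
    <seatingpreference></seatingpreference>
</preferences>
"""


def load_instance(xml_text):
    """Parse the instance and hang it under a holder so refs can name the root."""
    holder = ET.Element("instance")
    holder.append(ET.fromstring(xml_text.strip()))
    return holder


def set_value(holder, ref, value):
    """Store a control's value in the node its ref points to."""
    path, _, attribute = ref.partition("/@")
    node = holder.find(path)
    if node is None:
        raise KeyError(f"no instance node matches {ref!r}")
    if attribute:
        node.set(attribute, value)   # ref ended in /@name: write an attribute
    else:
        node.text = value            # otherwise write the element's text


def submission_body(holder):
    """Serialize the instance document, which is what gets submitted."""
    return ET.tostring(holder[0], encoding="unicode")


instance = load_instance(INSTANCE)
set_value(instance, "preferences/person/@userid", "nick")
set_value(instance, "preferences/person/password", "secret")
set_value(instance, "preferences/seatingpreference", "2")
print(submission_body(instance))
```

The point of the sketch is that the controls never hold the data themselves; they only read and write nodes in the instance, and submission sends that XML as a whole.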

Form data commonly needs to be validated; in other words, each field must contain data of the expected type and format. XForms uses XML Schema to constrain submitted data. In addition, you can further enhance the functionality of an XForms page by adding XML Events, which is also included in XHTML 2.0.


XML Events

You may be familiar with handling events on a Web page by adding event attributes such as onclick and onmouseover. No more. These familiar attributes have been replaced by the integration of the XML Events module into XHTML 2.0. XML Events provides a generic way to specify an action that should happen when an event takes place. The advantage here is that you're not limited to predefined events such as a mouse click. Instead, you can define your own events and what should happen when they're triggered.

XML Events involves the following components. An event, such as a mouse click, is intended for a target. For example, in the page shown in Listing 7:


Listing 7. A page to click
		

<html>
  <head><title>Rides</title></head>
  <body>
     <ul id="ridelist">
        <li href="monorail.html">Monorail</li>
        <li href="Matterhorn.html">Matterhorn</li>
        <li href="coaster.html">Roller coaster</li>
     </ul>
  </body>
</html>

the user might click the second li element, Matterhorn. When that happens, the mouse click event travels from the root of the document to the target (li) and back again. The sequence is:

(root) -- html -- body -- ul -- li -- ul -- body -- html -- (root)

The trip down to the target is known as the capture phase, while the trip back up again is known as the bubbling phase. (Not all events can bubble.) At any time during the trip, the event may pass an object that has been registered as an observer, meaning that it's watching for that specific event, and if it sees the event, it performs a particular action. A listener creates the observer. For example, in the following sequence:

<ev:listener observer="ridelist" event="mousedown" handler="#myscript"/>

the listener makes the ul element (or rather, the overall list) an observer, so when the user clicks any list item, the observer (ridelist) executes myscript. (The mechanism for invoking an arbitrary script has yet to be determined.)
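
The capture and bubbling phases are easier to follow when traced step by step. The Python sketch below simulates the trip described above; propagation_path and dispatch are hypothetical names, nodes are represented by plain strings, and the rule that a listener fires during the target or bubbling phase unless it asks for the capture phase is a simplification assumed for this example.

```python
def propagation_path(ancestors, target):
    """Return (phase, node) steps: down to the target, then back up."""
    capture = [("capture", node) for node in ancestors]
    bubble = [("bubble", node) for node in reversed(ancestors)]
    return capture + [("target", target)] + bubble


def dispatch(event_type, ancestors, target, listeners):
    """Return (handler, phase, node) for every listener the event triggers."""
    fired = []
    for phase, node in propagation_path(ancestors, target):
        for listener in listeners:
            if listener["observer"] != node or listener["event"] != event_type:
                continue
            wants_capture = listener.get("phase") == "capture"
            # Capture listeners fire on the way down, all others on the way up.
            if wants_capture == (phase == "capture"):
                fired.append((listener["handler"], phase, node))
    return fired


ancestors = ["(root)", "html", "body", "ridelist"]
target = "li:Matterhorn"
listeners = [{"observer": "ridelist", "event": "mousedown", "handler": "#myscript"}]

print(" -- ".join(node for _, node in propagation_path(ancestors, target)))
print(dispatch("mousedown", ancestors, target, listeners))
```

Although the event passes the list twice, the handler runs only once, on the way back up, because the listener did not request the capture phase.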


XFrames

Also replaced in XHTML 2.0 are the widely reviled frames. The first working draft of XFrames made its debut on 6 August 2002, the day after the XHTML 2.0 draft announced that it would be using XFrames. XFrames attempts to solve the problems that traditional HTML frames presented. Most of those problems revolved around the difficulty of bookmarking and refreshing pages, and the inability of search engines that don't support frames to index the appropriate content.

In an XFrames document, the URIs for the included content become part of the URI for the overall document. For example, the following page in Listing 8 might represent an HTML page with three frames:


Listing 8. An XFrames page
		
<html>
<head><title>XFrames</title></head>
<body>
<row>
    <frame id="header" />
    <column>
        <frame id="menu"/>
        <frame id="content"/>
    </column>

</row>
</body>
</html>

Notice that the URIs for each frame aren't specified, but each frame has its own unique identifier. The URI for this document might then be:

site.xfm#frames(header=header.xhtml,menu=menu.xhtml,content=main.xhtml)

A browser that understands XFrames would then associate the content of each frame to the appropriate URI. When the user clicks a link and changes the content of an individual frame, the overall URI of the page changes, so it always represents the actual content the user is looking at, and bookmarks and the back button provide accurate content.
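
As a rough illustration of that bookkeeping, the Python sketch below splits such a URI into its frame assignments and rebuilds it after a navigation. The function names parse_xframes_uri and navigate are hypothetical, and the parsing assumes the simple frames(id=uri,...) form shown above, with no commas or parentheses inside the frame URIs.

```python
def parse_xframes_uri(uri):
    """Split 'doc#frames(id=uri,...)' into the document and a frame mapping."""
    document, _, fragment = uri.partition("#")
    prefix, suffix = "frames(", ")"
    if not (fragment.startswith(prefix) and fragment.endswith(suffix)):
        return document, {}
    frames = {}
    for pair in fragment[len(prefix):-len(suffix)].split(","):
        if not pair:
            continue
        frame_id, _, frame_uri = pair.partition("=")
        frames[frame_id.strip()] = frame_uri.strip()
    return document, frames


def navigate(uri, frame_id, new_uri):
    """Return the overall URI after one frame loads new content."""
    document, frames = parse_xframes_uri(uri)
    frames[frame_id] = new_uri
    pairs = ",".join(f"{key}={value}" for key, value in frames.items())
    return f"{document}#frames({pairs})"


start = "site.xfm#frames(header=header.xhtml,menu=menu.xhtml,content=main.xhtml)"
print(parse_xframes_uri(start))
print(navigate(start, "content", "rides.xhtml"))
```

Because the state of every frame lives in the URI, a bookmark made after the navigation step restores exactly the combination of frames the user was viewing.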


Images as objects

The final major change as of the 5 August 2002 working draft involves the removal of the img tag and its replacement with the object tag. The object tag has actually been around since HTML 4.0, but developers have mostly used it for embedding multimedia and Java applets. It has always, however, been able to support images. A major advantage of using the object tag is that it's designed to cascade downwards. In other words, if the browser can't display a particular object, it will display that object's contents instead. For example, a browser encountering the following snippet first attempts to load the movie. Failing that, it loads the image. Failing that, it simply displays the text.

<object data="rides.mpeg" type="video/mpeg">
    <object data="rollercoaster.jpg" type="image/jpeg">
        Jack tries to expand his horizons on the racing coasters.
    </object>
</object>
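
The fallback rule can be expressed as a short loop. The Python sketch below is a simplified model, not browser code: the ObjectElement class and render function are hypothetical, the MIME types are the registered video/mpeg and image/jpeg, and "can't display" is reduced to "type not in a supported set" (a real browser would also fall back if the resource failed to load).

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ObjectElement:
    data: str                                             # resource to load
    type: str                                             # its MIME type
    fallback: Union["ObjectElement", str, None] = None    # the element's contents


def render(node, supported_types):
    """Walk inward until something displayable is found."""
    while isinstance(node, ObjectElement):
        if node.type in supported_types:
            return f"load {node.data}"
        node = node.fallback          # can't display it: try its contents
    return node if node is not None else ""


snippet = ObjectElement(
    "rides.mpeg", "video/mpeg",
    ObjectElement(
        "rollercoaster.jpg", "image/jpeg",
        "Jack tries to expand his horizons on the racing coasters.",
    ),
)

print(render(snippet, {"video/mpeg", "image/jpeg"}))
print(render(snippet, {"image/jpeg"}))
print(render(snippet, set()))
```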


Next steps

The only thing that is certain about the 5 August 2002 working draft of XHTML 2.0 is that nothing about it is certain. It will almost definitely change in some way between now and adoption as a recommendation, but the goal of emphasizing structure and semantics isn't likely to change. For this reason, it's a good idea to take a look at the pages you build now, and start getting into the habit of using structure and styles appropriately. Use markup to designate what something is, not what it should look like, and use CSS to do the rest. Overall, think more about the structure of your documents and what you want them to do, and not necessarily so much about what they should look like.

