Two competing standards coexist

While the intention of both HTML V5 and XHTML V2 is to improve on the existing versions, the approaches the developers chose to make those improvements are very different. And with differing philosophies come distinct results. For the first time in many years, the direction of upcoming browser versions is uncertain. Uncover the bigger picture behind the details of these two standards.

Adriaan de Jonge, Senior Java Software Engineer, SDB Java

Adriaan de Jonge is part of a team of Java specialists at SDB Professionals in The Hague, The Netherlands. His writing career began with a comparison of XForms and Ruby on Rails. As a Java™ developer, he is especially interested in front-end technology, both Web based and client side. You can reach Adriaan at

20 November 2007


Most people use HTML V4 and XHTML V1 to create Web pages. Relatively few HTML enthusiasts understand the concepts of semantic HTML, validating HTML structures, and improving documents for accessibility. A high-quality HTML document is the result of many trade-offs, design choices, and discussions. Despite all the criticism, no alternative comes close to being as universal as HTML. Most users would settle for the standard as it is, even if it turned out to be the final version ever produced.

Like any other standard, however, HTML will have successors. Even now, specialists are thinking about the next version of HTML, aiming to solve the known issues in the current version. And like any group of people, these specialists disagree on the future direction of this work.

The first proposal for a new HTML version came from a work group initiated by the W3C. This group's idea centered around XHTML V2—a standard that continued previous developments in XHTML toward a purer version and returned to the design philosophy of the first editions of HTML.

Some prominent HTML specialists outside the W3C—browser vendors, Web developers, authors, and other stakeholders—disagreed with the direction of XHTML V2. In 2004, they started an independent work group to propose an alternative direction for the next version of HTML. Under the flag of WHATWG (Web Hypertext Application Technology Working Group), the group put together proposals for HTML V5 and Web Forms V2.

After a few years, the working draft has become a clear description of an alternative direction for HTML. In April 2007, the W3C voted on a proposal to adopt HTML V5 for review, without accepting it as an official recommendation (yet). A great majority voted in its favor. As a result, an interesting situation has arisen: The W3C is working on two competing successors for HTML and XHTML. In theory, both proposals have legitimate justifications. In practice, many hurdles must be overcome before all the major browsers support the standards.

Those are the dry facts leading to the current situation. The more interesting discussion surrounds the actual disagreements between the two proposals. This article describes the essence of each proposal from a bird's-eye view, then looks at the details of each philosophy.

Frequently used acronyms

  • CSS: Cascading Style Sheets
  • HTML: Hypertext Markup Language
  • W3C: World Wide Web Consortium
  • XHTML: Extensible Hypertext Markup Language

A brief history of XHTML

Before you can understand the philosophy behind XHTML V2, you need a bit of history. In the early 1990s, the first version of HTML was based on Standard Generalized Markup Language (SGML). The key difference was the hyperlink feature—the key foundation and success factor for the World Wide Web. Like SGML, HTML allowed authors to describe the structure of a document, separating headers from paragraphs, ordered and unordered lists. The appearance on screen was browser dependent.

With the increasing popularity of the Web, HTML users demanded control over the look and feel of their pages. Browser vendors pushed new features into HTML V2 and V3. Web pages degenerated into inaccessible, complex structures of nested tables as the primary way to control page layout. The rest of the document was filled with font tags and color declarations. Original document structures could not be recognized.

HTML V4 was designed to end this chaos, as it externalized presentation logic to CSS and introduced layers (DIVs) for advanced positioning of content. This meant a paradigm shift compared to HTML V3. To simplify the migration, a Transitional version of HTML V4 supported the old HTML V3 constructs. The Strict version enforced full separation of content and presentation for advanced HTML authors.

The first HTML V4 sites adopted DIVs as the new holy grail and used them for almost every element in a page that needed a little decoration, including (but not limited to) page headers. The style attributes in HTML were a popular placeholder for presentation details. In the end, superfluous tables were eliminated from the page. But content and presentation logic were still mixed throughout. The obligatory CSS file consisted of just a few lines.

More recently, some well-known Web developers started to promote the more elegant approach of HTML V4 with style sheets. In modern browsers, CSS properties aren't limited to DIV elements. You can style all HTML elements any way you like. Many weblogs started to talk about semantic HTML. Without completely banning the DIV element, authors try to use the HTML elements that best describe the content they contain. For example, the navigation menus you find on most Web sites can best be described as unordered lists. Also, instead of using a class named bigHeader on a paragraph element, they prefer the H1 element, using CSS to change its appearance as desired.
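One way to see why semantic elements matter is to look at a page the way a program does. The following is a minimal sketch in Python, not something taken from either specification: the sample markup, the HeadingCollector class, and the outline_of function are all illustrative assumptions. It uses the standard html.parser module to build a page outline from heading elements. The semantic page yields an outline; the page that styles a paragraph with a bigHeader class yields nothing, because the heading exists only in the style sheet.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the text of h1-h6 elements (hypothetical helper for illustration)."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []      # list of (tag, text) pairs in document order
        self._current = None   # heading tag that is currently open, if any
        self._buffer = []      # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.outline.append((tag, "".join(self._buffer).strip()))
            self._current = None


def outline_of(markup):
    """Return the heading outline of an HTML fragment."""
    parser = HeadingCollector()
    parser.feed(markup)
    parser.close()
    return parser.outline


# Semantic markup: a real heading and a menu described as an unordered list.
SEMANTIC = """
<h1>Products</h1>
<ul class="menu">
  <li><a href="/">Home</a></li>
  <li><a href="/contact">Contact</a></li>
</ul>
"""

# Presentational markup: the same content, described only through class names.
PRESENTATIONAL = """
<p class="bigHeader">Products</p>
<div class="menu">
  <div><a href="/">Home</a></div>
  <div><a href="/contact">Contact</a></div>
</div>
"""

print(outline_of(SEMANTIC))        # [('h1', 'Products')]
print(outline_of(PRESENTATIONAL))  # []
```

A text-to-speech device or search engine is in roughly the same position as this script: it can rely on H1 and UL, but it has no way of knowing what an author meant by a private class name.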

In the meantime, the W3C proposed XHTML V1 as an equivalent to HTML V4 rewritten as well-formed and valid XML. For XML users, this simplified the task of transforming XML content to Web pages and validating the result with existing validators. XHTML V1.1 is an effort to separate concerns into separate modules. The modular approach makes it easier to reuse parts of the standard for different purposes and to extend the standards with new features.
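The practical benefit for XML users can be sketched with a few lines of Python. This is an illustration under stated assumptions rather than a recommended toolchain: the sample documents and the helper names link_targets and is_well_formed are made up for this example, and the standard xml.etree.ElementTree module checks only well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

# The XHTML namespace defined by the W3C for XHTML V1 documents.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# A small, well-formed XHTML document (hypothetical sample content).
XHTML_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Menu</title></head>
  <body>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/contact">Contact</a></li>
    </ul>
    <p>First line<br />second line</p>
  </body>
</html>"""

# HTML V4 style markup: the BR and P elements are never closed,
# which HTML tolerates but XML does not.
TAG_SOUP = "<html><body><p>First line<br>second line</body></html>"


def link_targets(document):
    """Return the href of every anchor, using a generic XML parser."""
    root = ET.fromstring(document)
    return [a.get("href") for a in root.iter("{%s}a" % XHTML_NS)]


def is_well_formed(document):
    """Report whether the document parses as XML at all."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


print(link_targets(XHTML_PAGE))    # ['/', '/contact']
print(is_well_formed(XHTML_PAGE))  # True
print(is_well_formed(TAG_SOUP))    # False
```

Because the XHTML page is ordinary XML, no HTML-specific parser is needed to query or transform it. The same generic tool rejects the HTML V4 style fragment outright.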

Even more than HTML V4 users, XHTML V1.1 users separate content and presentation. As always, however, some practical requirements can only be solved by using dirty tricks in CSS. For example, the menu structures described as unordered lists usually consist of good-looking images. Images, however, are not easily read by text-to-speech devices for people with visual disabilities. Also, text browsers such as Lynx don't present images well. A dirty CSS trick allows you to hide the text and show a picture in the browser. But if the content of the menu is different from page to page, it's not very elegant to specify this content in a CSS.

The philosophy behind XHTML V2

The first and most important design philosophy behind XHTML V2 is to further improve the separation between content and presentation, addressing the remaining flaws in HTML V4 and XHTML V1. For example, XHTML V2 offers native support for specifying an image source for each item in an unordered list. The old IMG element with its SRC attribute is replaced by the option to specify an SRC on any element. With this change, the CSS is free of content, and alternative devices can easily present the text instead of the image.

CSS isn't the only challenge for Web developers, though. A lot of time is spent on server interaction with HTML Forms and many lines of JavaScript™ code. Forms are limited to one-dimensional key-value pairs. Developing JavaScript code requires a lot of effort, which becomes useless on alternative interfaces such as text-to-speech devices.

Within the modular approach, XHTML V2 replaces HTML Forms with the XForms module, adding support for common concerns by using appropriate application models. In XForms, you can specify interactive logic, validation rules, and computations without a single line of script. Also, the technique uses rich XML structures instead of key-value pairs, allowing nested sub-forms and repeating elements. Besides offering a powerful engine, XForms makes it easier for text-to-speech devices to translate the full richness of the application.
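The difference between one-dimensional key-value pairs and a rich XML structure is easiest to see side by side. The sketch below is written in Python and is not XForms itself: the order data, the field-naming convention, and the function names as_form_fields and as_xml_instance are assumptions chosen for illustration. It serializes the same order once the way a classic HTML form submits it, and once as nested XML that is similar in spirit to XForms instance data.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical order with two repeating line items.
ORDER = {
    "customer": "Alice",
    "items": [
        {"sku": "A1", "quantity": 2},
        {"sku": "B7", "quantity": 1},
    ],
}


def as_form_fields(order):
    """Flatten the order into key-value pairs, as a classic HTML form sends it.

    The index suffix (sku_0, sku_1, ...) is an ad hoc convention that both
    the page and the server have to agree on; the format itself knows
    nothing about repetition or nesting.
    """
    fields = [("customer", order["customer"])]
    for index, item in enumerate(order["items"]):
        fields.append(("sku_%d" % index, item["sku"]))
        fields.append(("quantity_%d" % index, str(item["quantity"])))
    return urlencode(fields)


def as_xml_instance(order):
    """Serialize the same order as nested XML with repeating item elements."""
    root = ET.Element("order")
    ET.SubElement(root, "customer").text = order["customer"]
    items = ET.SubElement(root, "items")
    for item in order["items"]:
        node = ET.SubElement(items, "item")
        ET.SubElement(node, "sku").text = item["sku"]
        ET.SubElement(node, "quantity").text = str(item["quantity"])
    return ET.tostring(root, encoding="unicode")


print(as_form_fields(ORDER))
# customer=Alice&sku_0=A1&quantity_0=2&sku_1=B7&quantity_1=1

print(as_xml_instance(ORDER))
# <order><customer>Alice</customer><items><item><sku>A1</sku>...</item>...</items></order>
```

In the flat version, the structure lives only in the field names. In the XML version, the repeating item elements carry the structure themselves, which is what allows a declarative engine to attach validation rules and computations to it.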

In addition to XForms, several other concerns from the XHTML definition are extracted into stand-alone specifications that can be reused for other purposes—for example, XML Events, XFrames, and Ruby (for Asian languages).

Along with the separation of presentation, programmability is also separated from the standard. Interaction attributes such as onClick are replaced by the XML Events module. Because the XML Events specification is natively designed for this purpose, it offers a more powerful set of tools to handle user interaction.

To summarize the innovations in XHTML V2, the main philosophy is the separation of concerns. Each concern is no longer a secondary feature of HTML but a primary goal for a new specification. As a result, the new specifications are better optimized in the interest of the issue at hand. However, separation of concerns is a philosophical goal rather than a practical one. A clever developer who has the technical skills to use the full set of tools appropriately can achieve effective results. It is questionable whether average users of current HTML versions are capable of creating XHTML documents of proper quality.

Most likely, XHTML V2 isn't aimed at average HTML authors. But in the hands of the right developer, it leads to very elegant solutions fully optimized for accessibility.

The philosophy behind HTML V5

The WHATWG is taking a far more practical approach in designing HTML V5. Instead of chasing an abstract ideal such as separation of concerns, the group started to document the current actual behavior of major browsers, which is different from the W3C specification. Using this analysis as a foundation, the group investigated how HTML is actually used in practice.

With this information, the group began to propose features that might simplify the lives of average Web developers. Although HTML V5 respects its origins in former versions of HTML, purification is not a primary goal. For example, the original purpose of modeling documents readily gives way to optimizing for Web applications.

Development of Web applications is greatly simplified with a modeling language that has native support for this purpose. For example, HTML V5 has native support for interactive components such as data grids, menus, and toolbars. Using descriptive HTML elements with default behavior saves a lot of time writing JavaScript code to simulate the behavior with general-purpose DIVs.

The HTML V5 specification isn't limited to HTML elements and attributes. It specifies a JavaScript API for specific purposes such as editing documents and drag-and-drop interaction. This approach is diametrically opposed to separating concerns. It simplifies the API for Web developers but increases the size of the specification significantly.

HTML V5 is more similar to HTML V4 than XHTML V2 is to XHTML V1. The migration path is smoother, and it's easier for an experienced HTML V4 developer to become accustomed to the new version. New features follow similar logic. Specific event attributes for specific elements allow HTML editors to provide more appropriate text completion.

Current Web applications rely on Asynchronous JavaScript + XML (Ajax) for server interaction. HTML V5 acknowledges the importance of server interaction and specifies several ways to communicate over the network, dispatch events received from the server, and send messages to documents from other domains without introducing security issues.

The main philosophy behind HTML V5 is to extend HTML V4 with practical features that the average Web developer requires. HTML V5 simplifies the technology as it continues the familiar approach from HTML V4. To address flaws from HTML V4, HTML V5 prefers a minimal approach over a major redesign.

Practical use of the new standards

XHTML V2 and related modules are officially supported by the W3C, and the related modules are becoming key ingredients for other XML specifications that the W3C maintains. Unfortunately, official W3C approval is no guarantee of support by major Web browsers. Supporting plain, vanilla XHTML V2 isn't the issue: Modern browsers already support many features. Proper use of XHTML V2 depends on the availability of the related modules. At the time of writing, it's unclear whether Microsoft will extend Windows® Internet Explorer® with capabilities such as XML Events and XForms. A Mozilla XForms plug-in, including XML Events, has been under development for a few years now. The plug-in proves both the capabilities of the technology and the difficulties in the implementation.

The HTML V5 specification was written in close communication with browser vendors, keeping implementation concerns in mind. The team is skeptical about official W3C approval, though. The FAQ doesn't even try to give a serious answer about the expected date of approval. Regardless of the W3C, browser vendors might decide to implement an unofficial HTML V5 standard anyway. It wouldn't be the first time browser vendors were ahead of official recommendations.

Competitive standards

At this point, neither HTML V5 nor XHTML V2 is an official recommendation yet. Some details might change in the future. What doesn't change is the direction they take, each addressing a different set of flaws in the current standards. It will be interesting to see how future browsers will add support for each new standard. Current browsers support both HTML V4 and XHTML V1. Similarly, future browsers might support both HTML V5 and XHTML V2. The standards will appeal to different audiences.

If you're more interested in XHTML V1.1 than HTML V4, looking for an elegant approach to create documents accessible from multiple devices, you are likely to appreciate the advantages of XHTML V2. If you only use XHTML V1 because of its XML compliance but you prefer the new features in HTML V5, you might appreciate XHTML V5 (HTML V5 rewritten as an XML dialect).

HTML V5 is best appreciated by developers of interactive Web applications using HTML V4. It should also be the more practical alternative for sites maintained using What You See Is What You Get document editors. Considering the situation, though, HTML V4 and XHTML V1 are likely to stay around for a long time.

Zone=XML, Web development
ArticleTitle=HTML V5 and XHTML V2