HTML5, the latest version of the Hypertext Markup Language (HTML), is the most radical revision of the language to date. It introduces many new features in a variety of areas. Some of the more notable additions include:
- Built-in multimedia tags for audio and video
- A canvas tag for drawing content in the browser
- Smarter forms that let you do things such as validation through the use of a required attribute
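As a rough sketch of the three additions listed above, the following page uses a video tag, a canvas tag, and a form field with the required attribute. The file name clip.mp4, the id sketch, and the form action /subscribe are placeholders invented for illustration, and the behavior described in the comments assumes a browser that supports each feature.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>HTML5 Feature Sketch</title>
  </head>
  <body>
    <!-- Built-in video playback; clip.mp4 is a hypothetical file -->
    <video src="clip.mp4" controls>
      Your browser does not support the video tag.
    </video>

    <!-- A drawing surface; a script would do the actual drawing -->
    <canvas id="sketch" width="200" height="100">
      Your browser does not support the canvas tag.
    </canvas>

    <!-- A supporting browser blocks submission while this field is empty -->
    <form action="/subscribe" method="post">
      <input type="email" name="email" required>
      <input type="submit" value="Subscribe">
    </form>
  </body>
</html>
```

The text placed between the opening and closing video and canvas tags is fallback content, shown only by browsers that do not recognize the tag.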
With a new set of structural tags, HTML5 revises the way that HTML documents are structured. The new structural tags focus on dividing an HTML document into logical parts. The name of the tag is descriptive of the type of content it is intended to contain. In this article, learn about these new tags in detail.
Tim Berners-Lee created the original HTML in 1989 to address some of the shortcomings of existing methods of accessing information on the Internet. At that time, finding your way around the Internet was a difficult task. Content on the Internet was treated as individual documents, with no easy method of navigating between them. You essentially had to know the address of the document you were looking for and enter it by hand. To address this issue, Berners-Lee created two technologies: Hypertext Transfer Protocol (HTTP) and HTML.
HTTP is the protocol that web browsers and web servers use to request and deliver content. A URL in your web browser (assuming the browser shows the full URL) will most likely begin with http://. This part of the URL tells the browser what type of protocol to use when making the request to the web server. When the server receives a request for a document, that document is likely written in or converted to HTML. The HTML document is what is sent back to the browser making the request.
HTML is a markup language that tells a web browser how to present content. This content can include links to other documents, providing a user-friendly method of navigating between documents on the Internet.
The combination of HTTP and HTML provides quick and easy navigation of content on the Internet by allowing you to simply click on text links to navigate between documents. After creating these two technologies, Berners-Lee went on to found the World Wide Web Consortium (W3C). The W3C was the guiding force for the first four versions of HTML.
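To make the idea of a text link concrete, here is a minimal sketch of one. The address www.example.com/next.html is a placeholder, not a real document.

```html
<p>
  Read the <a href="http://www.example.com/next.html">next document</a>
  to continue.
</p>
```

The http:// prefix in the href attribute names the protocol the browser uses for the request, and the text between the opening and closing a tags is what the reader clicks.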
The original intent of the Internet was to serve simple text documents. The earliest browsers were all text-based (no fancy windows—just text on a screen). Even the addition of images was a big deal when first introduced. Now, people do everything from sending e-mail messages to watching TV on the Internet. The Internet has become much more than a mechanism for transporting simple text documents. With new features and uses came new challenges and problems that the HTML language was never designed to handle.
The W3C attempted to address the problems of today's Internet with the Extensible Hypertext Markup Language (XHTML) 2.0 standard. However, this standard was not well received and has, for the most part, been abandoned. In 2004, while the W3C was focusing on the XHTML 2.0 standard, a group called the Web Hypertext Application Technology Working Group (WHATWG) began working on the HTML5 standard, which was more warmly received than the XHTML 2.0 standard. The W3C abandoned the XHTML 2.0 standard and is now working with the WHATWG on the development of HTML5.
At the time of this writing, HTML5 has not been officially released. Most of the content on the web is still being written for the HTML 4 specification. However, several browsers include support for the HTML5 specification. Because each browser might support only certain features of HTML5, things can get tricky. Before writing an HTML5-based website, check each of your target browsers to make sure they support the features you'll use for your site.
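One way to check a target browser is to load a small test page in it. The sketch below is an assumption about how such a check might look, not a complete compatibility test: it creates an element for each feature and looks for a property or method that exists only when the browser implements that feature. The id report and the variable names are invented for this example.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>HTML5 Feature Check</title>
  </head>
  <body>
    <ul id="report"></ul>
    <script>
      // Each check creates an element and looks for a property or method
      // that is present only when the browser implements the feature.
      var checks = {
        'canvas': !!document.createElement('canvas').getContext,
        'video': !!document.createElement('video').canPlayType,
        'audio': !!document.createElement('audio').canPlayType,
        'required attribute': 'required' in document.createElement('input')
      };

      // Write one list item per feature into the report list.
      var report = document.getElementById('report');
      for (var name in checks) {
        var item = document.createElement('li');
        item.appendChild(document.createTextNode(
          name + ': ' + (checks[name] ? 'supported' : 'not supported')));
        report.appendChild(item);
      }
    </script>
  </body>
</html>
```

Opening this page in each target browser lists which of the four features that browser reports as available.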
Regardless of the capabilities of your target browsers, you have to tell the browser that you want your content to be rendered using the HTML5 specification. You do this using the doctype declaration.
The doctype declaration tells the browser what version of the markup language the page is written in. It does so by referring to a Document Type Definition (DTD). The DTD specifies the rules used by the markup language so that the browsers correctly render the content.
Doctypes can be a confusing concept. In the current HTML specification, there are many doctypes, and the differences between them aren't entirely clear. Table 1 shows the currently available doctypes and their capabilities.
Table 1. Doctypes and capabilities
| Doctype | Capabilities | Example |
|---|---|---|
| HTML 4.01 strict | Allows all non-deprecated HTML elements and attributes. Presentational tags, such as the font tag, are not allowed. No framesets allowed. | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> |
| HTML 4.01 transitional | Same as HTML strict, but allows for the use of deprecated elements, such as the font tag. No framesets allowed. | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| HTML 4.01 frameset | Same as HTML transitional, but allows for framesets. | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd"> |
| XHTML 1.0 strict | Same as HTML strict, but all content must be written as well-formed XML. For example, all opening tags must have a matching closing tag. No framesets allowed. | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> |
| XHTML 1.0 transitional | Same as HTML transitional, but all content must be written as well-formed XML. No framesets allowed. | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
| XHTML 1.0 frameset | Same as XHTML transitional, but allows for framesets. | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"> |
| XHTML 1.1 | Same as XHTML strict, but also provides capabilities for modules, such as Ruby support for East-Asian languages. | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> |
Fortunately, the doctype declaration is greatly simplified in HTML5. In fact, there is only one doctype. For your browser to render a page using the HTML5 specification, add the doctype shown in Listing 1.
Listing 1. HTML5 doctype
<!DOCTYPE html>
The doctype declaration should be the first thing in an HTML document, before the <html> tag.
The rationale for creating new structural tags is to divide web pages into logical parts with tags that are descriptive of the type of content they contain. Conceptually, think of the web page as a document. Documents contain headers, footers, chapters, and various other conventions that divide the document into logical parts.
This section reviews the current methods of dividing an HTML document using generic sample code. In the rest of this article, you'll revise the original code using the new HTML5 structural tags to see step-by-step how the document is transformed into logical sections.
If you've created even the simplest of HTML documents, then you're familiar with the div tag. The div tag is the major mechanism in the pre-HTML5 era for creating blocks of content in an HTML document. For example, Listing 2 shows how you can use div tags to create a simple page with a header, content area, and footer.
Listing 2. Simple HTML page using div tags
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <title>
      A Simple HTML Page Using Divs
    </title>
  </head>
  <body>
    <div id='header'>Header</div>
    <div id='content'>Content</div>
    <div id='footer'>Footer</div>
  </body>
</html>
This works fine; the div tag is a nice general-purpose tag. However, other than by looking at the id attribute of each div tag, it's hard to tell what section of the document each div tag represents. Though you could argue that the id is enough of an indicator if properly named, id attributes are arbitrary. Many variations can be considered equally valid ids. The tag itself gives no indication of the type of content it is intended to represent.
HTML5 addresses this issue by providing a set of tags that more clearly defines the major blocks of content that make up an HTML document. Regardless of the final content displayed by a web page, most web pages consist of varying combinations of common page sections and elements.
The code in Listing 2 creates a simple page with a header, content area, and footer. These, and other sections and page elements, are quite common, so HTML5 includes tags that break documents into these common sections and indicate the content contained in each section. The new tags are:
- header
- section
- article
- aside
- footer
- nav
The rest of this article gives an overview of each tag. You will also learn about the intended use of the tags by revising the original div-based code example from Listing 2 to use the new HTML5 structural tags.
As the name suggests, the header tag is intended to mark a section of the HTML page as the header. Listing 3 shows the code example from Listing 2 modified to use a header tag.
Listing 3. Adding a header tag
<!DOCTYPE html>
<html>
  <head>
    <title>
      A Simple HTML Page
    </title>
  </head>
  <body>
    <header>Header</header>
    <div id='content'>Content</div>
    <div id='footer'>Footer</div>
  </body>
</html>
The doctype in Listing 3 was also changed to indicate that the browser should use HTML5 to render the page. From this point on, all of the examples assume you are using the correct doctype.
The section tag is meant to identify significant portions of the content on the page. This tag is somewhat analogous to dividing a book into chapters. Adding a section tag to the code example results in the code in Listing 4.
Listing 4. Adding a section tag
<!DOCTYPE html>
<html>
  <head>
    <title>
      A Simple HTML Page
    </title>
  </head>
  <body>
    <header>Header</header>
    <section>
      <p>
        This is an important section of the page.
      </p>
    </section>
    <div id='footer'>Footer</div>
  </body>
</html>
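To illustrate the chapter analogy, a page with more than one significant portion might look like the fragment below. This is only a sketch: the heading text and the use of h2 tags are assumptions added for illustration and do not appear in the running example.

```html
<section>
  <h2>Chapter One</h2>
  <p>
    Content that belongs to the first part of the page.
  </p>
</section>
<section>
  <h2>Chapter Two</h2>
  <p>
    Content that belongs to the second part of the page.
  </p>
</section>
```

Each section groups a heading with the content it introduces, much as a chapter title groups the pages that follow it.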
The article tag identifies major sections of content within a web page. Think of a blog, where each individual post constitutes a significant piece of content. Adding article tags to the code example results in the code shown in Listing 5.
Listing 5. Adding article tags
<!DOCTYPE html>
<html>
  <head>
    <title>
      A Simple HTML Page
    </title>
  </head>
  <body>
    <header>Header</header>
    <section>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
      </article>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
      </article>
    </section>
    <div id='footer'>Footer</div>
  </body>
</html>
The aside tag indicates that the content contained within the tag is related to the main content of the page but is not part of it. It's somewhat analogous to using parentheses to make a comment in the body of text (like this one). The content in the parentheses provides additional information about the element that encloses it. Adding an aside tag to the code example results in the code in Listing 6.
Listing 6. Adding an aside tag
<!DOCTYPE html>
<html>
  <head>
    <title>
      A Simple HTML Page
    </title>
  </head>
  <body>
    <header>Header</header>
    <section>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
        <aside>
          <p>
            This is an aside for the first blog post.
          </p>
        </aside>
      </article>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
      </article>
    </section>
    <div id='footer'>Footer</div>
  </body>
</html>
The footer tag marks the contained content of the element as the footer of the document. Adding a footer tag to the code example results in the code shown in Listing 7.
Listing 7. Adding a footer tag
<!DOCTYPE html>
<html>
  <head>
    <title>
      A Simple HTML Page
    </title>
  </head>
  <body>
    <header>Header</header>
    <section>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
        <aside>
          <p>
            This is an aside for the first blog post.
          </p>
        </aside>
      </article>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
      </article>
    </section>
    <footer>Footer</footer>
  </body>
</html>
At this point, all of the original div tags have been replaced with HTML5 structural tags.
The content contained within the nav tag is intended for navigational purposes. Adding a nav tag to the code example results in the code in Listing 8.
Listing 8. Adding a nav tag
<!DOCTYPE html>
<html>
  <head>
    <title>
      A Simple HTML Page
    </title>
  </head>
  <body>
    <header>Header
      <nav>
        <a href='#'>Some Nav Link</a>
        <a href='#'>Some Other Nav Link</a>
        <a href='#'>A Third Nav Link</a>
      </nav>
    </header>
    <section>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
        <aside>
          <p>
            This is an aside for the first blog post.
          </p>
        </aside>
      </article>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
      </article>
    </section>
    <footer>Footer</footer>
  </body>
</html>
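A common variation, which is not part of the listing above, wraps the navigation links in an unordered list so that each link is a separate list item. The following fragment is a sketch of that variation using the same placeholder links.

```html
<nav>
  <ul>
    <li><a href='#'>Some Nav Link</a></li>
    <li><a href='#'>Some Other Nav Link</a></li>
    <li><a href='#'>A Third Nav Link</a></li>
  </ul>
</nav>
```

Either form marks the links as navigation. The list version simply adds structure around the individual links.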
Listing 9 shows the result of replacing the original div tags with the new HTML5 structural tags.
Listing 9. Finished example
<!DOCTYPE html>
<html>
  <head>
    <title>
      A Simple HTML Page
    </title>
  </head>
  <body>
    <header>Header
      <nav>
        <a href='#'>Some Nav Link</a>
        <a href='#'>Some Other Nav Link</a>
        <a href='#'>A Third Nav Link</a>
      </nav>
    </header>
    <section>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
        <aside>
          <p>
            This is an aside for the first blog post.
          </p>
        </aside>
      </article>
      <article>
        <p>
          This is an important section of content on the page.
          Perhaps a blog post.
        </p>
      </article>
    </section>
    <footer>Footer</footer>
  </body>
</html>
The example is deliberately simple, but when you compare the original div-based example from Listing 2 to the result in Listing 9, the intent of the new structural tags should be clear.
The new HTML5 tags describe the types of content that they contain, and they help divide the document into logical sections. It's still up to you to decide when and where to use the new tags within a document, similar to an author writing a book. While two authors writing the same book may choose different ways of dividing the book into chapters, the act of using chapters still provides a consistent method of dividing the book into sections. Similarly, while two authors of a given web page may choose different structures, the new HTML5 structural tags provide conventions for web page developers that the old div tags do not.
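One practical effect of these conventions is that a style sheet can refer to the tags by name instead of by an arbitrary id. The style block below is a sketch under that assumption. The specific borders and font style are invented for illustration, and the display rule is included because some older browsers do not treat the new tags as block-level by default.

```html
<style>
  /* Make sure the new tags are laid out as blocks in older browsers */
  header, section, article, aside, footer, nav {
    display: block;
  }

  /* Select by tag name rather than by an id such as #header or #footer */
  header { border-bottom: 1px solid #999; }
  aside  { font-style: italic; }
  footer { border-top: 1px solid #999; }
</style>
```

Placed in the head of the finished example, these rules would apply without any id attributes in the body.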
Jeremy Wischusen has over 13 years of experience designing websites and applications for clients such as Purple Communications, myYearbook.com, HBO, and others, building both front- and back-end systems with Flash, Flex, jQuery, PHP, MySQL, MSSQL, and PostgreSQL. He has taught web design, Flash, and ActionScript for clients such as Wyeth Pharmaceuticals, The Vanguard Group, Bucks County Community College, and The University of the Arts.




