Back to basics with HTML

Learn how an HTML page works and how to write well-formed markup

Mike Wilcox (mike@mikewilcox.net), Director of Technology, BetterVideo
Mike Wilcox is director of technology for BetterVideo, a fast-growing startup in Frisco, Texas. He is in charge of front-end engineering and online video services. Mike is a regular speaker on Ajax and other web technologies, and has spoken at the 2009 Rich Web Experience, the 2009 Dallas TechFest, and many other conferences. His open source work is on display in the Dojo Toolkit, where, as a committer, he has implemented many of the multimedia technologies, which include the Multi-file Uploader, the audio and video components, and the vector-based DojoX Drawing.

Summary:  If you've never created an HTML page before, or you have but you're not sure why it works the way it does, this article explains and demystifies the process. It covers the basics of HTML and its construction, the different components and how they work, and navigating a directory with links. This article also covers stylesheets, scripts, and what new things HTML5 includes.

Date:  10 May 2011
Level:  Introductory

Your first web page

HTML stands for Hypertext Markup Language and is the most widely used language for web pages. In a markup language, some of the text is "marked up" with annotations that separate the content itself from the instructions about how to treat that content.

To create an HTML page, open a text editor. Type Hello World and save it with the name HelloWorld.html. Check to ensure that the editor didn't save the file as HelloWorld.html.txt by appending the .txt extension. Now, find the file in your file system, right-click on it, and select Open with... > [browser] to open it with your browser.

Adding markup

The browser needs a way to determine which blocks of text should be large, bold, italicized, and so on. You provide this information to the browser by surrounding the text with opening and closing HTML tags. These tags are enclosed within a less-than sign (<) and a greater-than sign (>); for example, <p>. A closing tag is the same as the opening tag with the addition of a forward slash before the tag name; for example, </p>.

Create a new file named test.html, and enter the markup shown in Listing 1.


Listing 1. Simple markup

<div><h1>TODO List</h1><p><span>I highly </span><strong>recommend</strong> we
<em>walk</em>.</p></div>

The first tag used in Listing 1 is <div>, which is typically used to divide content into sections. Next is <h1>, which is a heading. There are six heading tags, from <h1> to <h6>; by default, browsers display <h1> as the largest and <h6> as the smallest. The <p> tag is a paragraph. Within the paragraph, the <span> tag marks a run of text without changing its appearance, the <strong> tag is displayed by default as bold text, and the <em> tag, which is short for emphasis, is displayed by default as italic text.

When you view the markup from Listing 1 in a browser, the result looks something like Figure 1.


Figure 1. The HTML code from Listing 1 rendered in a browser
Screen shot of the HTML code from Listing 1 rendered in a browser

Display styles: block and inline

There are many display styles with subtle differences, but there are two basic types: block and inline. Block elements hold blocks of text, such as headings or paragraphs, and start on a new line. The <div>, <h1>, and <p> tags are block elements. Inline elements mark up text within a block, such as bold and italics, without starting a new line. The <span>, <strong>, and <em> tags are inline elements.
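
As a minimal sketch of the difference (the text content is made up for illustration), the following markup puts block and inline tags side by side. Each block tag should start on its own line in the browser, while the inline tags flow within the sentence that contains them.

```html
<div>
    <h1>A block heading</h1>
    <p>A block paragraph with <strong>inline bold</strong> and <em>inline italic</em> text.</p>
    <p>A second paragraph, which starts on a new line.</p>
</div>
```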

Semantics

Semantics is the study of meaning and relationships between words or symbols. Applied to an HTML page, semantics means using the appropriate tag name to describe the content it contains.

When users read your web page, they don't need to know what tags you used. But there are other things that read your web page besides humans, such as search engines. When your page is being indexed for search, it is broken into sections that are given different priority. An <h1> tag is considered the highest priority on your page, followed by <h2>, and so on, down to the paragraphs.

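Screen readers are programs that read a page aloud for people with visual impairments. A screen reader can use a <strong> or an <em> tag as a cue to announce that content with more stress or emphasis, although whether it does so depends on the screen reader and its settings. The <b> and <i> tags, by contrast, describe only how the text looks. This is why the use of <strong> and <em> is encouraged over the use of <b> and <i>, respectively.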
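
A short sketch of semantic markup might look like the following. The headings and text are invented for illustration; the point is that the heading levels describe the structure of the page and that <strong> and <em> mark importance and emphasis rather than appearance.

```html
<h1>Trail guide</h1>
<h2>Before you leave</h2>
<p>You <strong>must</strong> carry water. Sturdy shoes are <em>strongly</em> advised.</p>
<h2>On the trail</h2>
<p>Stay on the marked path.</p>
```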

Images

To add an image to your test.html file, first you need an image. You can use one from somewhere on your computer or you can grab an image from the Internet by right-clicking on the image, selecting Save Image As..., and saving the image file to the same folder where your test file is located. Add the image to your test file using the code shown in Listing 2.


Listing 2. Image markup example

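<img src="MyImage.jpg" alt="Description of the image" />

This example tells the browser to render an image using the source found in the path of the src attribute. The alt attribute provides replacement text for screen readers and for cases in which the image cannot be displayed.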

You might also notice that there is no closing tag in Listing 2. There are two types of tags: paired and unpaired. Paired tags, such as <p> and </p>, can contain content. Unpaired tags (also called empty or void elements), such as <img> and <br>, never contain content. For instance, you can't use <img>An image is here</img> because the <img> tag is used to display an image, not text. Because an unpaired tag has no content, it has no closing tag. The forward slash that finishes the tag in Listing 2 is the XHTML way of writing an unpaired tag; in HTML, the slash is optional.
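
As a small illustration (the text and the alt value are made up), here are three unpaired tags next to an empty paired tag. In HTML the unpaired tags can be written without the trailing slash, while the empty div still needs its closing tag.

```html
<p>First line<br>Second line</p>
<hr>
<img src="MyImage.jpg" alt="A short description of the image">
<!-- A paired tag with no content is still paired and must be closed -->
<div></div>
```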

Attributes

Tags can contain attributes that tell the browser how to render the content contained within the tags. For instance, in Listing 2, the src attribute defines the path to MyImage.jpg to create the image. One of the more useful attributes is id, which you can use to find and manipulate an element in a page with the JavaScript language or to apply styles to an element with Cascading Style Sheets (CSS).
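
The sketch below shows one way the id attribute might be used from both CSS and the JavaScript language. The id value logo, the width, and the title text are assumptions chosen for this example; only MyImage.jpg comes from Listing 2.

```html
<html>
    <head>
        <title>Attribute example</title>
        <style>
            /* CSS: the # prefix selects the element whose id is "logo" */
            #logo {
                border: 1px solid black;
            }
        </style>
    </head>
    <body>
        <img id="logo" src="MyImage.jpg" alt="Site logo" width="200">
        <script type="text/javascript">
            // JavaScript: find the same element by its id and set another attribute
            var logo = document.getElementById("logo");
            logo.title = "Found by id";
        </script>
    </body>
</html>
```

The script is placed after the image so that the element already exists when the script looks it up.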


Document linking

An anchor tag (<a>) creates a link. Its href attribute can point to a location in several ways: a hash (#) reference, an absolute Uniform Resource Locator (URL), or a relative URL.

Hash reference

The first way is to target a location within the same document by referencing the id or name attribute of the target, preceded by the hash symbol (#), as shown in Listing 3:


Listing 3. Using a hash to reference another part of the document

<a href="#answerA">See Answer A</a>
<a href="#answerB">See Answer B</a>

<div id="answerA">The answer is 41</div>
<div id="answerB">The answer is 43</div>

When a user clicks on a link that contains a hash reference, the browser scrolls to that point in the document. If the document is too short to scroll, there is no noticeable change, except that the browser address changes to reflect the hash location, as shown in Figure 2:


Figure 2. The browser address bar showing the hash location
Screen shot of the browser address bar showing the hash location, with hash location highlighted

Absolute URLs

A link can, of course, also reference other pages on the Internet by using a URL, as shown in Listing 4:


Listing 4. Common Internet address link

<a href="http://www.ibm.com/developerworks/">developerWorks</a>

Relative URLs

The link shown in Listing 4 is known as an absolute URL because it is a complete address: it begins with the protocol (http://) followed by the domain of the website. A relative URL targets a page relative to another page within the same site. Think of it in the same way that you access your files within the folders on your computer. Figure 3 shows a simple website structure. The only difference between files and folders on your computer and those on the Internet is that a web server is referencing these files, making them available on the Internet, and calling them a website.


Figure 3. A simple website structure
Screen shot of a simple website structure, showing three files and one folder containing two files

Listing 5 shows what relative URL links in pageA.html might look like.


Listing 5. Relative URL examples

<a href="pageB.html">See Page B</a>
<a href="subpages/subA.html">See Sub Page A</a>
<a href="subpages/subB.html#section3">See Sub Page B, Section 3</a>

Notice the first link in Listing 5 does not contain the parts you are used to seeing in an Internet address, such as the http:// protocol or a domain name beginning with www; it's just a page name. The second link shows how you target a page within a folder: you use folder name, forward slash, file name. The third link shows how you can target a section within a page using a hash.
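
Relative URLs also work in the other direction. As a sketch, assuming the folder in Figure 3 is the subpages folder used in Listing 5, links inside subpages/subA.html might look like this. Two dots (..) mean "the folder that contains the current folder."

```html
<!-- Links as they might appear in subpages/subA.html -->
<a href="subB.html">See Sub Page B</a>
<a href="../pageA.html">Back to Page A</a>
<a href="../pageB.html">See Page B</a>
```

The first link needs no folder name because subB.html sits in the same folder as subA.html.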

Default page

The examples shown in Listing 5 use page names in the URLs, but the absolute example in Listing 4 doesn't. To simplify web addresses, web servers have default pages. If no page name is given, one of the default pages is accessed. These default pages can have any name, but most often they are named index.html.

If the default page or the page referenced is not there, the server returns a 404 error, which indicates that the page was not found.


Browsers and well-formed HTML

Well-formed HTML simply means HTML markup that follows the rules. The two basic rules you should adhere to are:

  1. If you open it, close it.
  2. Don't overlap tags.

The first rule means remember to close your tags. Listing 6 shows an example of an open tag.


Listing 6. Markup with an open tag

<strong>My markup <em>should be well-formed</strong>

The intention is probably to emphasize the word "should," but the browser doesn't know that and will probably italicize everything in the page after the <em> tag.
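
One possible fix for Listing 6, assuming the author meant to emphasize only the word "should," is to close the em tag right after that word:

```html
<strong>My markup <em>should</em> be well-formed</strong>
```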

Listing 7 shows well-formed, properly nested markup.


Listing 7. Well-formed markup example

<p>I want my markup to be <strong>really</strong> well-<em>formed</em></p>

Listing 8 shows improperly formed markup with overlapping tags.


Listing 8. Improper markup with overlapping tags

<p>I want my markup to be <strong>really <em>well</strong>-formed</em></p>

You might wonder why Listing 8 is improper. If you try it, it might render as you expect in your browser with the word "well" being both bolded and italicized. HTML parsers are designed to be forgiving, unlike parsers for stricter markup languages such as Extensible Markup Language (XML), which stop with an error when the markup is not well-formed. Therefore, browser implementors work hard to guess how improperly formed HTML should render. However, even if your improper HTML looks correct in your browser, it may look completely different in another browser, or in a future version of your browser.

The HTML tree

It might just look like text to you, but the browser sees markup as objects or elements. These elements are arranged in a parent-child hierarchical relationship called a tree. In a tree, a parent can have multiple children, but a child can have only one parent. Figure 4 shows how the browser sees the well-formed markup from Listing 7.


Figure 4. How a browser views your markup
Illustration of how a browser views your markup, with 'body' parenting 'p,' which is parenting 'strong' and 'em'

Trying to convert the improper markup from Listing 8 doesn't work because the strong and em objects collide. What happens is the browser rewrites your code to create its objects. It has to guess what you mean, and, therefore, your page may not render as you intend.
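
A well-formed way to get the rendering that Listing 8 was probably aiming for is to close the em tag before the strong tag closes, and then open a new em tag for the remaining text. This is only a sketch of the kind of repair a browser might make; it is not a description of what any particular browser does.

```html
<p>I want my markup to be <strong>really <em>well</em></strong><em>-formed</em></p>
```

Here every child has exactly one parent: the first em belongs to strong, and the second em belongs to p.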

To prevent improper HTML, it helps to write your markup using indentation, as shown in Listing 9:


Listing 9. Indented markup

<div>
    <p>
        <span>
            <strong>
                Bold Text
            </strong>
        </span>
        <span>
            <em>
                Italic Text
            </em>
        </span>
    </p>
</div>


Meta components

Until now, the focus of this article has been on the textual section of an HTML document. But there are some meta elements as well. Listing 10 shows a small but valid HTML document.


Listing 10. Sample HTML document

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <title>Example page</title>
    </head>
    <body>
        <h1>Hello world</h1>
    </body>
</html>

DOCTYPE basics

The first line (continued to the second) in Listing 10 shows something called a doctype. A doctype tells the browser which version of HTML the page is written in, and the browser uses it to choose a rendering mode. The rules for how to render HTML markup are determined by a set of standards that change and evolve over the years. The use of doctypes helps browsers keep up with the changes without breaking pages written with older standards.
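
For comparison, HTML5 shortens the doctype considerably. The sketch below is Listing 10 with only the doctype changed.

```html
<!DOCTYPE html>
<html>
    <head>
        <title>Example page</title>
    </head>
    <body>
        <h1>Hello world</h1>
    </body>
</html>
```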

The head and body elements

After the doctype is the html tag, which wraps everything in the document. There are two child elements in the html element: the head and the body. The head is where the meta information goes, starting with the title element. Whatever text you put in title is what you see in the title bar of the browser and is how the page is recognized by search engines and bookmarks. The head may also contain various meta information, such as keywords, a description, style sheets, or scripts. The body is where you put all your displayed textual content.
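
A head element with several kinds of meta information might look like the following sketch. The description, the keywords, and the scripts/MyScript.js path are invented for illustration; the stylesheet path matches the one used in the stylesheet examples in this article.

```html
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Example page</title>
    <meta name="description" content="A short summary of the page.">
    <meta name="keywords" content="html, basics, tutorial">
    <link rel="stylesheet" href="stylesheets/MyStyles.css">
    <script type="text/javascript" src="scripts/MyScript.js"></script>
</head>
```

None of these elements displays anything in the page itself; they describe the page or load resources for it.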

Styles and stylesheets

A stylesheet is an external file that contains the style definitions for the page. The stylesheet is associated with the page with a link tag and targeted with the href attribute. Because of the various ways link tags are used, you must also include the rel attribute set to "stylesheet", as shown in Listing 11:


Listing 11. Example of a linked stylesheet

<link rel="stylesheet" href="stylesheets/MyStyles.css" />

You can also directly include styles in the head within a style tag. The contents of a style element can contain style definitions or @import rules that load other stylesheets. Listing 12 provides an example.


Listing 12. A style element with various styles and an import

<style>
    @import "stylesheets/MyStyles.css";

    #myElement{
        background:black;
        color:white;
    }

    h3{
        font-size:36px;
    }

    .bordered{
        border: 1px solid red;
    }
</style>

Listing 12 also demonstrates some of the basic ways to style HTML elements. The first, #myElement, starts with a hash symbol, which means it is targeting an element with myElement as the id attribute. The second shows that you can directly style elements by tag name. Here, all h3 elements have a font size of 36 pixels. The third way starts with a period, which indicates that it is a class name, and targets any element that contains bordered in its class attribute, such as <div class="bordered">Stuff</div>.
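
The following markup is a sketch of elements that the rules in Listing 12 would match. The text and the second class name, highlighted, are invented for illustration.

```html
<div id="myElement">White text on a black background</div>
<h3>A 36-pixel heading</h3>
<div class="bordered">A red border</div>
<!-- An element can list several classes; "bordered" only has to be one of them -->
<p class="bordered highlighted">Also bordered</p>
```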

The JavaScript language

Asynchronous JavaScript + XML (Ajax) is all the rage these days. Ajax is a name for a set of techniques, built on the JavaScript language, that let a page exchange data with a server without reloading. The JavaScript language itself is the default scripting language of browsers. You can use the JavaScript language to ensure forms are filled out correctly, hide or display elements, or even make them move around on the page.

As an HTML page is opened by the browser, the browser literally reads the content from the top to the bottom and renders it along the way. Pages on the Internet do not just appear; they display a few elements at a time.

Listing 13 contains a script that, when encountered by the browser, executes and displays "Hello World" in an alert box. The browser then stops everything and waits for you to press the OK button. At this point, the title in the browser is rendered because the title element has been read. But the page text "Page Content Here" does not display because the browser hasn't read that far yet.


Listing 13. Hello World browser rendering example

<html>
    <head>
        <title>Example page</title>
        <script type="text/javascript">alert("Hello World");</script>
    </head>
    <body>
        <h1>Page Content Here</h1>
    </body>
</html>
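
Because the browser reads from top to bottom, where you put a script matters. In the sketch below, which is a variation on Listing 13 rather than part of the original example, the script comes after the heading. By the time it runs, the browser has already read the h1 element, so the script can find it and show its text in the alert box.

```html
<html>
    <head>
        <title>Example page</title>
    </head>
    <body>
        <h1>Page Content Here</h1>
        <script type="text/javascript">
            // The h1 above has already been read, so it can be looked up here
            var heading = document.getElementsByTagName("h1")[0];
            alert(heading.innerHTML);
        </script>
    </body>
</html>
```

The same script placed in the head would fail, because the heading would not exist yet when the script ran.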

Functions

The alert box mentioned earlier is launched by calling the alert function. It displays the text passed as an argument between the parentheses immediately after the function name. You can make and call a custom function the same way. Listing 14 shows how the alert is now called from within the customFunction function.


Listing 14. Custom function example

<script type="text/javascript">
    function customFunction(){
        alert("Called via custom!");
    }
    customFunction();
</script>

Defining a function does not run it; the code inside the function runs only when the function is called. If the customFunction(); line is removed, the function will not be called. It's more versatile to call the function during a browser event. An event is an occurrence in the browser, such as a page finishing loading or a user clicking the mouse. One of the most used events is window.onload. It fires when the browser is done reading and rendering all the content. You can make the custom function the handler for the onload event by assigning it to window.onload, as shown in Listing 15. The function name is assigned without parentheses because the function is being handed to the browser to call later, not called immediately.


Listing 15. Custom function fired on the load of the browser

<script type="text/javascript">
function customFunction(){
    alert("Called via custom!");
}
window.onload = customFunction;
</script>

HTML elements have events too. Listing 16 shows how you can call the function on a mouse click. The value of the onclick attribute is JavaScript code that runs when the click happens, so here the function name is followed by parentheses.


Listing 16. Custom function fired on a mouse click

<html>
    <head>
        <title>JavaScript and Events</title>
        <script type="text/javascript">
            function customFunction(){
                alert("Called via mouse click!");
            }
        </script>
    </head>
    <body>
        <div onclick="customFunction()">Click Me!</div>
    </body>
</html>
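
As an alternative sketch, a click handler can also be attached from the script itself with the standard addEventListener method, which keeps the JavaScript out of the markup. The ids toggle and details and the toggleDetails function are names invented for this example. It also shows one way to hide or display an element. Note that some older browsers, such as Internet Explorer 8 and earlier, do not support addEventListener.

```html
<html>
    <head>
        <title>JavaScript and Events</title>
    </head>
    <body>
        <div id="toggle">Click Me!</div>
        <p id="details">Now you see me.</p>
        <script type="text/javascript">
            var toggle = document.getElementById("toggle");
            var details = document.getElementById("details");

            function toggleDetails(){
                // "none" hides the element; an empty string restores its default display
                details.style.display = (details.style.display === "none") ? "" : "none";
            }

            // Pass the function itself (no parentheses) so the browser can call it on each click
            toggle.addEventListener("click", toggleDetails, false);
        </script>
    </body>
</html>
```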


Web servers

A web server is the software that returns the content of a page or other resource that a client requests. There will come a time when you reach the limits of viewing HTML pages directly from your hard drive. Browsers restrict what a page loaded from a file URL, such as file:///Users/Documents/test.html, is allowed to do, because a page with free access to the files on your hard drive would be a security risk. If you start seeing security messages, it's time to install a web server.

Fortunately, it's not difficult to install a web server, and there are many tutorials about it on the Internet. Apache is easy to install, small, and popular. IBM® WebSphere® Application Server is powerful, and there are downloads available so you can test it.


HTML5

A common question is "Should I learn HTML5 first or start with HTML?" It's really all just HTML, and you should start with the basics regardless of the version.

HTML5 does provide new features that developers are excited about. In terms of markup, there are many new tag names that are available to help make web pages more semantic and maintainable. The JavaScript application programming interfaces (APIs) for HTML5 have dramatically increased so that web authors can build full-fledged web applications without the help of plug-ins.
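
As a brief sketch of the new tag names, the page below uses the HTML5 header, nav, article, footer, and video elements. The text, the page names in the links, and the MyVideo.mp4 file are assumptions made for this example.

```html
<!DOCTYPE html>
<html>
    <head>
        <title>HTML5 example</title>
    </head>
    <body>
        <header>
            <h1>My site</h1>
            <nav>
                <a href="pageA.html">Page A</a>
                <a href="pageB.html">Page B</a>
            </nav>
        </header>
        <article>
            <h2>Article title</h2>
            <p>Article text.</p>
            <video src="MyVideo.mp4" controls>
                Your browser does not support the video element.
            </video>
        </article>
        <footer>
            <p>Footer text.</p>
        </footer>
    </body>
</html>
```

Compared with a page built only from div tags, the tag names themselves now say what each section is for, and the video plays without a plug-in in browsers that support the element and the file format.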


Conclusion

In this article, you have learned the basics of well-formed HTML and how to get started with CSS and the JavaScript language. There are many resources and tutorials on the web to help you take your skills to the next level.


Resources

Get products and technologies

  • Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the SOA Sandbox, learning how to efficiently implement service-oriented architecture.

  • WebSphere Application Server Community Edition: Try WebSphere Application Server Community Edition, a no-charge, pre-integrated, lightweight Java Platform, Enterprise Edition 5 (Java EE 5) application server built on Apache Geronimo technology.

  • Apache HTTP Server: Download the Apache HTTP Server.

