Introducing HTML5 video

Why we need HTML5 video and how to use it

What is HTML5 video, and how is it different from what you're used to? What problems does it solve, and what issues does it have? Since HTML5 doesn't require a plug-in, is it open source? Find the answers to these questions and learn the basic terminology for understanding how video works. Learn how to embed HTML5 video, explore the API, and explore browser compatibility.

Mike Wilcox, Director of Technology, BetterVideo

Mike Wilcox is Director of Technology for BetterVideo, a fast-growing startup in Frisco, Texas; he is in charge of front-end engineering and online video services. Mike is a regular speaker on Ajax and other web technologies, and has spoken at the 2009 Rich Web Experience, the 2009 Dallas TechFest, and many other conferences. His open source work is on display in the Dojo Toolkit, where, as a committer, he's implemented many of the multimedia technologies; these include the Multi-file Uploader, the audio and video components, and the vector-based DojoX Drawing. You can reach Mike at mike@mikewilcox.net.



05 October 2010


A brief history of online video

In the 1990s, it was considered cool if postage stamp-sized Apple QuickTime or Windows Media® videos just played on your computer, much less online. QuickTime version 1.0 was released as a technological breakthrough in 1991, and Microsoft answered with Video for Windows® in 1992. RealNetworks released RealAudio Player in 1995; it was one of the first media players capable of streaming audio over the Internet. In the late 1990s and early 2000s, advancements in consumer network bandwidth made online video possible. All of the major players released versions that played streaming and progressively downloaded media. As of 2000, online video was a reality—and an unstandardized mess.

Frequently used acronyms

  • API: Application programming interface
  • CSS: Cascading Style Sheets
  • GUI: Graphical user interface
  • HTML: Hypertext Markup Language
  • UI: User interface
  • W3C: World Wide Web Consortium
  • XHTML: Extensible Hypertext Markup Language

The quality of online video in the early 2000s was hit or miss. RealNetworks' RealPlayer was perhaps one of the best online video players, but it pestered users to upgrade, and its reputation was tainted by allegations that RealNetworks saved private user information. QuickTime was better, at least on a Macintosh system. On Windows, it was crippled by incompatibility problems. Users were left to wonder whether the video would play, whether they would get enough frames for a smooth experience, and whether the audio would be remotely in sync.

And the advancements ended there. Microsoft won the browser wars in 2001 and effectively stopped working on innovations for Windows Internet Explorer®, instead focusing on the security holes caused by the barrage of features that were rushed to market. The W3C was of no help, as it declared that the HTML specification was "done" and turned its focus to XHTML and XHTML2. Developers were turning to Adobe® (then Macromedia) Flash® for innovative features like vector animation, cross-domain communication, multi-file uploads, audio, and video.

Flash video takes over

In 2002, Macromedia answered the demand of its developers with Flash Video, using the Sorenson Spark codec. In 2003, the company introduced the external-video FLV format, and in 2005 it added the VP6 codec, which, at the time, was of very high quality and had great compression. YouTube was launched in 2005 and made exclusive use of the FLV format. The Flash Player had a large installation base, Flash Video worked almost flawlessly, and YouTube had a simple interface for uploading and converting videos. As a result, Flash Video became the de facto standard of the web.

Issues with past solutions

But outside of YouTube, the pains of implementing online video were not eliminated. Placing a Flash video on your personal or company website generally required a strong understanding of Adobe ActionScript® and proprietary tools for encoding the video and creating the player controls. You've been seeing the embed code of a Flash object for years, but as Listing 1 shows, lengthy exposure doesn't make it any less complex.

Listing 1. Flash object embed example
<object id="UNIQUEID" height="520" width="528" codebase="http://download.macromedia.com/"
        classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000">
    <param value="../player/myVideoPlayer.swf" name="movie" />
    <param value="true" name="allowFullScreen" />
    <param value="all" name="allowNetworking" />
    <param value="always" name="allowScriptAccess" />
    <param value="opaque" name="wmode" />
    <param value="myVideoFile.flv" name="FlashVars" />
    <embed height="520" width="528" src="../player/myVideoPlayer.swf"
           id="UNIQUEID" wmode="opaque" allowscriptaccess="always" allownetworking="all"
           allowfullscreen="true" swf="../player/myVideoPlayer.swf"
           flashvars="myVideoFile.flv"
           pluginspage="http://www.macromedia.com/go/getflashplayer"
           type="application/x-shockwave-flash" quality="high" />
</object>

While the other video players' quality was advancing, their ease of use was not, because it was difficult to know whether the video might play on a user's computer. The primary example of this was an attempt to play a WMV file. If the browser were Internet Explorer, the WMV might play. But what if the browser were Mozilla Firefox? On Windows, it would probably play; on a Mac, it would probably not, but still might. The result was that different video formats were needed to work on different browsers and different operating systems. Different formats meant different players, and the need for different players typically meant a complex JavaScript solution.


Introducing HTML5 video

HTML and the World Wide Web, as Tim Berners-Lee saw them, were to be free and open. Berners-Lee founded the W3C to "ensure compatibility and agreement among industry members in the adoption of new standards." But as of 2000, the W3C was busy working on XHTML while developers were desperately turning to the proprietary Flash plug-in.

In 2007, Opera proposed the <video> tag in the Web Hypertext Application Technology Working Group (WHATWG) working draft. The intention was to "make video a first-class web citizen in an easy, open solution to integrate video into web pages and native support for video in browsers." Listing 2 shows the proposal, which was much more elegant and finger-friendly than the verbose object-embed markup needed for plug-ins.

Listing 2. Simple HTML5 video example
<video controls src="demo.ogg"></video>

In July 2009, the W3C announced that it planned to let the charter of the XHTML 2 Working Group expire and to focus on HTML5. Today, all modern browsers (and Windows Internet Explorer version 9) support the <video> tag, and the API is largely consistent (though some of the finer details are still in flux).

HTML5 video has many benefits. No JavaScript or ActionScript code is required, because you simply include the <video> tag and parameters, as shown in Listing 2. It's a first-class citizen of the browser, not a plug-in. This means that if you use JavaScript code, the video is ready when the page is ready and you don't have to wait additional time for the plug-in to load. Although there will be exceptions, the API will be standardized and will work across all browsers. Because it's a native element, there won't be struggles with plug-ins—like the plug-in reloading after the CSS is changed—or with the display, such as the video interfering with scroll bars.

Issues with HTML5 video

The specification for HTML5 video is still young, so there will be problems. The most obvious issue is that it's not supported in Internet Explorer, although the version 9 preview has support. The native UI controls are convenient, but the look and functionality are not consistent among browsers. Sandboxing a third-party video player is more difficult and requires iframes at the very least. In addition, the specification lacks strong full-screen capabilities, which have been taken for granted in Flash; Mozilla recently submitted a proposal to address this.

Flash is still very much in the lead in many other areas, such as streaming, handling different bandwidth capabilities, video capture, and content protection. Perhaps most important is that with Flash, one video file will play across all browsers on all operating systems. The browser vendors were not able to agree on a single HTML5 video format, so currently, you need at least two video files.

HTML5 video advanced

To address the lack of agreement among browser vendors, the specification was modified to handle different types of video (see Listing 3).

Listing 3. HTML5 video with multiple sources
<video controls>
	<source src="demo.ogg" />
	<source src="demo.webm" />
	<source src="demo.mp4" />
</video>

A browser will attempt to play each of the sources in order, so if it can't play an Ogg video, it will try the WebM video and then the MP4 video. If the browser can't handle any of the formats, it will provide a visual clue that no file is loaded. In effect, the video element is backwards-compatible, because a browser that doesn't recognize it will simply ignore the tags and render whatever markup they enclose. You can use this to your advantage by inserting a more familiar element, such as the one shown in Listing 4.

Listing 4. HTML5 video with a fail-over image
<video controls>
	<source src="demo.ogg" />
	<source src="demo.webm" />
	<source src="demo.mp4" />
    <img src="images/videoReplacement.gif" />
</video>

Other solutions involve inserting a Flash embed object instead of an image.
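
When the fallback has to be decided in script rather than in markup, the same try-each-source-in-order logic can be approximated with the canPlayType method. The following is a minimal sketch, not a description of any browser's internal algorithm; the helper name chooseSource, the candidates array, and its file names are assumptions made for illustration.

```javascript
// Sketch only: pick the first source the browser reports it can play.
// `video` is any object with a canPlayType(type) method, such as a <video> element.
// `chooseSource` and the `candidates` list are illustrative, not part of the HTML5 API.
function chooseSource(video, candidates) {
  // A browser without HTML5 video support has no canPlayType method at all
  if (!video || typeof video.canPlayType !== "function") {
    return null;
  }
  for (var i = 0; i < candidates.length; i++) {
    // canPlayType returns "", "maybe", or "probably"; "" means unsupported
    if (video.canPlayType(candidates[i].type) !== "") {
      return candidates[i];
    }
  }
  return null; // nothing playable: show the image or Flash fallback instead
}

var candidates = [
  { src: "demo.ogg",  type: "video/ogg"  },
  { src: "demo.webm", type: "video/webm" },
  { src: "demo.mp4",  type: "video/mp4"  }
];
```

In a page, you would pass document.createElement('video') as the first argument; a null result is the signal to insert the replacement image or the Flash embed object.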


HTML5 video's API

The API for HTML5 video is refreshingly simple, but again, it's still young and in flux. Remember that the HTML5 specification is not driven by the W3C but by the browser vendors. Though this process pushes innovation, because of it, each browser may have some unique features that others don't. Otherwise, the API detailed in Table 1 is largely consistent among supporting browsers.

Table 1. The HTML5 video API
Attributes   Properties    Methods       Events
src          currentSrc    play          play
width        currentTime   pause         pause
height       videoWidth    load          progress
type         videoHeight   canPlayType   error
poster       duration                    timeupdate
autoplay     ended                       ended
loop         error                       abort
controls     paused                      empty
preload      muted                       emptied
             seeking                     waiting
             volume                      loadedmetadata

The difference between attributes and properties is that you can't use properties in markup, whereas you can use attributes in both markup and script. The src attribute of the <video> element overrides the src attribute of the source elements. If you use source elements in markup, src will be an empty string. Of these, only width and height, which set the size of the container, are available immediately. All other properties are available only after the video's metadata has loaded.
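
Because most properties are meaningful only after the metadata arrives, script that reads duration or videoWidth typically waits for the loadedmetadata event. The following is a minimal sketch of that pattern; whenMetadataReady is a hypothetical helper name, not part of the HTML5 API, and it relies only on the readyState property and the standard event listener methods.

```javascript
// Sketch only: run `callback` once the video's metadata is available.
// `whenMetadataReady` is a hypothetical helper, not part of the HTML5 API.
function whenMetadataReady(video, callback) {
  var HAVE_METADATA = 1; // readyState value once duration and dimensions are known
  if (video.readyState >= HAVE_METADATA) {
    callback(video); // metadata already loaded, no need to wait
    return;
  }
  function onLoaded() {
    // Remove the listener so the callback runs only once
    video.removeEventListener("loadedmetadata", onLoaded, false);
    callback(video);
  }
  video.addEventListener("loadedmetadata", onLoaded, false);
}

// Usage in a page (assumes a <video> element exists in the document):
// whenMetadataReady(document.getElementsByTagName("video")[0], function (v) {
//   console.log(v.duration, v.videoWidth, v.videoHeight);
// });
```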

API bugs

Browsers have no critical bugs at this time, but the Apple iPad has some that affect the current API:

  • Dynamic video bug. If you create the <video> element with innerHTML, the source elements won't trigger automatically. The solution is to set the src attribute and invoke the load method. See Listing 5 for an example.
  • Source order bug. If the first source element is for a non-MP4 video, the iPad will stop there and not load. The solution is to always list the MP4 source element first.
  • Poster bug. The iPad doesn't display the poster image. This bug will most likely be fixed soon, but in the meantime, the solution is to create an HTML IMG element to float in its place.
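
Listing 5. Fix for iPad dynamic load bug
window.onload = function(){
	// Create the element through the DOM and attach it to the page
	var video = document.body.appendChild(document.createElement('video'));
	// canPlayType returns an empty string when the format is not supported
	if(video.canPlayType("video/ogg")){
		video.src = "video/myMovie.ogv";
	}else if(video.canPlayType("video/mp4")){
		video.src = "video/myMovie.mp4";
	}
	// The workaround: after setting src, invoke load explicitly
	video.load();
};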
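
The source order workaround can also be applied in script when the list of sources is generated dynamically. The following is a minimal sketch; the helper name mp4First and the shape of the source objects (a src and a type field) are assumptions made for illustration.

```javascript
// Sketch only: reorder a list of sources so that MP4 entries come first,
// as the source order workaround requires. Other entries keep their relative order.
// `mp4First` and the {src, type} object shape are illustrative assumptions.
function mp4First(sources) {
  var mp4 = [];
  var others = [];
  for (var i = 0; i < sources.length; i++) {
    if (sources[i].type === "video/mp4") {
      mp4.push(sources[i]);
    } else {
      others.push(sources[i]);
    }
  }
  return mp4.concat(others);
}
```

The function returns a new array and leaves its input untouched, so the original ordering remains available for browsers that are not affected.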

Video file terminology

To prepare for video development, you need to understand what the terminology means, what makes MP4 and Ogg different, and, of course, how to encode the video. The two main terms used when discussing video are file format and codec.

A file format is also known as a wrapper or a container. MP4, WebM, and OGV are file formats. The container's metadata describes how the data is stored and gives your computer the information it needs to load the libraries required to display the file. A file format generally contains a video stream and an audio stream, each encoded with its own codec, and has instructions for the computer on how to synchronize them.

A codec (short for coder-decoder) is the code that encodes and decodes images, audio, or other data. Encoding usually includes compressing the data. The following are HTML5 video-implemented file formats and their respective codecs:

  • MP4, which uses H264 video and AAC audio
  • WebM, which uses VP8 video and Vorbis audio
  • Ogg, which uses Theora video and Vorbis audio
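
The container and codec pairing above maps onto the MIME type strings that the type attribute and the canPlayType method accept, where the codecs are listed in an optional codecs parameter. The following is a minimal sketch; the FORMATS table and the typeString helper are hypothetical names, and the MP4 codec identifiers shown (H264 baseline profile with low-complexity AAC) are commonly seen values that assume your file was encoded that way.

```javascript
// Sketch only: build a MIME type string with a codecs parameter.
// FORMATS and typeString are illustrative names, not part of the HTML5 API.
var FORMATS = {
  // Assumes H264 baseline profile video and low-complexity AAC audio
  mp4:  { mime: "video/mp4",  codecs: ["avc1.42E01E", "mp4a.40.2"] },
  webm: { mime: "video/webm", codecs: ["vp8", "vorbis"] },
  ogg:  { mime: "video/ogg",  codecs: ["theora", "vorbis"] }
};

function typeString(name) {
  var format = FORMATS[name];
  return format.mime + '; codecs="' + format.codecs.join(", ") + '"';
}

typeString("ogg"); // 'video/ogg; codecs="theora, vorbis"'
```

Passing the full string to canPlayType gives the browser more to go on than the bare container type, because a container alone does not say which codecs are inside it.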

HTML5 video file formats

Currently, the supported formats for HTML5 video are MP4, Ogg, and WebM; remember that, as hard as it is to keep track of these, each browser supports different formats.

Browsers

Table 2 shows which browsers support which file formats.

Table 2. Browser compatibility chart
Browser               MP4   Ogg   WebM
Internet Explorer 9   YES   NO    MAYBE
Firefox 4.0           NO    YES   YES
Google Chrome 6       YES   YES   YES
Apple Safari 5        YES   NO    NO
Opera 10.6            NO    YES   YES

Note: Safari on Mac and Internet Explorer 9 on Windows will support any type if that codec is installed in the operating system. Other browsers (Firefox, Opera, Chrome) will need to specifically implement all video codecs.
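
The claim that you need at least two video files follows directly from Table 2, and can be checked mechanically. The following is a minimal sketch that transcribes the table into a lookup and reports which browsers a given set of formats leaves out; the SUPPORT object and the uncovered helper are illustrative names, and MAYBE is treated as no support to stay conservative.

```javascript
// Sketch only: data transcribed from Table 2. MAYBE is treated as "no".
// SUPPORT and uncovered are illustrative names.
var SUPPORT = {
  "Internet Explorer 9": { mp4: true,  ogg: false, webm: false },
  "Firefox 4.0":         { mp4: false, ogg: true,  webm: true  },
  "Google Chrome 6":     { mp4: true,  ogg: true,  webm: true  },
  "Apple Safari 5":      { mp4: true,  ogg: false, webm: false },
  "Opera 10.6":          { mp4: false, ogg: true,  webm: true  }
};

// Return the browsers in the table that can play none of the given formats.
function uncovered(formats) {
  var missing = [];
  for (var browser in SUPPORT) {
    var playable = false;
    for (var i = 0; i < formats.length; i++) {
      if (SUPPORT[browser][formats[i]]) {
        playable = true;
      }
    }
    if (!playable) {
      missing.push(browser);
    }
  }
  return missing;
}

uncovered(["mp4"]);        // ["Firefox 4.0", "Opera 10.6"]
uncovered(["mp4", "ogg"]); // []
```

No single format empties the list, while MP4 paired with either Ogg or WebM does, which is why two files are the practical minimum for the browsers in the table.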

Smartphones

Smartphones usually have the video codec implemented in the hardware; Apple iPhone, iPad, and Android phones all play only the codec that they came with, which is H264 in an MP4 container. RIM BlackBerry devices use the 3GP video file format, which also uses the H264 codec.


Encoding software

The H264 codec has been widely adopted, so most editing software you'll use can encode an MP4 video. WebM is new, but the tools are already available. Despite the fact that it is open source, Ogg isn't widely used, so there are only a few tools for it. See the Resources section for more information and technologies.

Video encoding terminology

When encoding video, you typically encounter a lot of perplexing terminology. Although you might be able to make an educated guess and output something, it helps to have a decent grasp of the terms. As a result, you can create a video for high-quality versus low-bandwidth, and progressive download versus streaming, and also troubleshoot problems if the video won't play on all devices.

  • Variable bit rate (VBR) and constant bit rate (CBR). VBR adjusts the bit rate according to the complexity of the current image; in contrast, CBR uses the same bit rate throughout the video, regardless of the segment's complexity, and is the typical technique used for streaming.
  • Multi-pass. This term is used to describe making two passes when encoding. The first pass analyzes the data so that the second pass can maximize compression. This feature is not used with streaming.
  • Square/rectangle pixels. This term is an unfortunate artifact from the early video conversion software days. Essentially, it explains why the same picture can be stored as 720x480 using non-square pixels or as 640x480 using square pixels. If the incorrect conversion is used, the image will be stretched.
  • Level. This is an H264 setting. The levels, of which there are 16, are essentially shortcuts to constrain different video components when compressing.
  • Profiles. Profiles are sets of capabilities in an H264 encoding. The most common are baseline, which is used for web, video conferencing, and mobile applications; main, which is used for standard-definition digital TV broadcasts or high-resolution web broadcasts; and high, which is used for broadcast and disc storage applications, particularly Blu-ray.
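
The square versus non-square pixel conversion comes down to one multiplication: the display width is the stored width scaled by the pixel aspect ratio. The following is a minimal sketch; displayWidth is a hypothetical helper, and the 8:9 ratio is an assumption that follows from the two frame sizes quoted above (640 / 720 = 8 / 9). Encoders may use slightly different ratios in practice.

```javascript
// Sketch only: width at which a frame should be displayed, given its stored
// width and its pixel aspect ratio (pixel width : pixel height).
// `displayWidth` is an illustrative helper name.
function displayWidth(storedWidth, parNumerator, parDenominator) {
  return Math.round(storedWidth * parNumerator / parDenominator);
}

displayWidth(720, 8, 9); // 640: a 720x480 frame is shown as 640x480
displayWidth(640, 1, 1); // 640: square pixels need no correction
```

If a 720x480 file is displayed as though its pixels were square, the picture appears 720 wide instead of 640 and therefore looks stretched horizontally.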

Licensing

The MP4 container, the H264 video codec, and the AAC audio codec are proprietary formats covered by patents that MPEG LA licenses. For personal websites or for a business that has only a small number of videos, this is not an issue. Businesses that have many videos, however, must pay careful attention to the licenses and fees, as these can affect their bottom lines. The MP4 container and its codecs are currently free to end users.

The WebM and Ogg containers, the VP8 and Theora video codecs, and the Vorbis audio codec are all under the Berkeley Software Distribution License, royalty-free and open source. Videos can be made, distributed, and viewed without cost. There are, however, rumors that VP8 might infringe on some of the H264 patents, so stay up to date.


Recommendations

MP4 is an industry-standard file format, but there's no guarantee that it will remain that way. There are many reasons to choose one of the other formats:

Ogg Theora:

  • Pros:
    • No cost
    • Works on Linux®
  • Cons:
    • Not widely used
    • No hardware acceleration
  • Use cases:
    • Maintenance of only a few files
    • Small or personal sites
    • Compatibility not a problem
    • Open source fans

WebM:

  • Pros:
    • No cost
    • Acceptance growing very quickly
    • Viewable with Flash soon
    • Hardware acceleration support building
    • Backed by Google/YouTube
  • Cons:
    • Patent lawsuits looming with MPEG LA
    • Not supported on iPhone or iPad
  • Use cases:
    • Good for sites with many files
    • Compatibility not tied to financial bottom line
    • Not worried about compatibility with Apple products
    • Gambling on future success

MP4:

  • Pros:
    • Well-developed industry standard
    • Smallest files and clearest images
    • Uses widely supported hardware acceleration
  • Cons:
    • Hardware acceleration is necessary because it is so processor-intensive to decode
    • License fees looming
    • Future uncertain if Google succeeds with WebM
  • Use cases:
    • Best for overall compatibility and viewer adoption
    • Playable on all browsers and the newer mobile platforms, if you include a Flash player
    • Best solution if only one version of the video is desired

Conclusion

This article showed the progression of digital video from the early days of the Internet to its current state. While a single format that works everywhere is desirable, that's not a real-world scenario. Video codecs use very advanced technologies that take many years to write, which is a glacial pace considering the life cycle speed of the web. As soon as one video format seems to be dominant, another one is already poised to challenge it. Fortunately, if you follow the guidelines laid out in this article and use a bit of forethought, deciding which video format or formats to use does not need to be difficult.

Resources

Learn

Get products and technologies

Discuss
