For several years now, developers have been producing full-fledged interactive experiences that run, more or less, right in the browser. Such sites usually require browser plug-ins (Flash). With the advent of smartphones and tablets, interactive experiences seemed a perfect fit for the new gadgets. Because of the limited processing power of mobile devices, however, browser plug-ins were no longer a viable platform for development.
HTML5 has added a huge palette of in-browser tools that require no extra plug-ins. The HTML5 specification from the W3C is still under development, but browsers are providing support as the spec evolves.
HTML5 audio is a powerful advancement for embedding sound in the browser, especially on mobile devices such as iOS's mobile Safari browser. Though HTML5 audio is a new feature, it has support in iOS. According to developers of the popular mobile application Instapaper, 98.8% of its iOS users in November 2011 were using at least iOS 4 (see Resources). Because HTML5 audio was introduced to mobile Safari in iOS 3, you can be assured there is almost universal support for HTML5 audio on the iOS platform.
In this article, learn about the HTML5 limitations for the desktop and in mobile Safari, and try some solutions for creating interactive sound effects. Also covered are: unsupported events, audio sprites, and how to use directCanvas and multiSound to accelerate HTML5 game performance.
It is important to note that with iOS 6, Apple has added support for the Web Audio API (discussed below), which removes the need for a lot of the workarounds discussed in this article. However, iOS 6 has only been out for a few weeks, so iOS 5 still has the majority of the market. The issues discussed and the workarounds provided in this article are still valid and should be considered when developing audio for mobile Safari.
You can download the source code for the examples used in this article.
Before discussing the limitations in mobile Safari, it's important to understand the limitations of HTML5 audio on the desktop. HTML5 audio is both robust and limiting, depending largely on its implementation. It works well for music players (such as a jukebox player) or simple sound effects, but leaves much to be desired for sound-intensive applications such as games.
Unfortunately, not all browsers support the same audio file format. As shown in Table 1, there are currently four major formats: MP3, OGG, WAV, and AAC.
Table 1. HTML5 audio format support
| Browser | MP3 | Ogg Vorbis | WAV PCM | AAC |
|---|---|---|---|---|
| Internet Explorer 9 | X | | | X |
| Firefox | | X | X | |
| Chrome/Safari/mobile Safari | X | | X | X |
To cover all browsers, it's best to have all audio streams as both Ogg Vorbis and AAC.
Why isn't MP3 included? MP3 comes with hefty royalty payments when distributed commercially. The license requirements for MP3 will claim a distribution fee of 2% of all revenue over $100K (see Resources). For this reason, I prefer AAC over MP3. AAC is not royalty-free, but it has a much more relaxed license that allows free distribution. AAC also provides better compression, allowing for smaller file sizes—a blessing in the web world (see Resources).
Ogg Vorbis wins my vote overwhelmingly because it is open-source, patent-free, and royalty-free. However, only Firefox supports it.
Listing 1 shows what cross-browser compatible HTML markup should look like.
Listing 1. HTML markup for the audio element
<audio>
  <!-- AAC file (Chrome/Safari/IE9) -->
  <source src="sound.m4a" type="audio/mp4" />
  <!-- Ogg Vorbis (Firefox) -->
  <source src="sound.ogg" type="audio/ogg" />
</audio>
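The same format choice can be made from script. The following is a minimal sketch that asks the audio element which candidate it can play, using the standard canPlayType() method. The pickSource name, the candidate list, and the file names are assumptions made for illustration and are not part of the article's source code.

```javascript
// Hypothetical helper: returns the first candidate source the browser
// claims it can play, or null if none match.
function pickSource(audio, candidates) {
  for (var i = 0; i < candidates.length; i++) {
    // canPlayType() returns '', 'maybe', or 'probably'
    if (audio.canPlayType(candidates[i].type) !== '') {
      return candidates[i].src;
    }
  }
  return null;
}

// Usage in a browser (file names are placeholders):
// var audio = document.createElement('audio');
// audio.src = pickSource(audio, [
//   { src: 'sound.m4a', type: 'audio/mp4' },
//   { src: 'sound.ogg', type: 'audio/ogg' }
// ]);
```

Listing candidates in order of preference keeps the decision in one place if more formats are added later.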
When dealing with audio, a powerful feature is the ability to manipulate the sound. Whether it's synthesizing sound on-the-fly, processing sound effects, applying environmental effects, or even doing basic stereo panning, HTML5 audio lacks all manipulation abilities. The audio you load is the audio that is played.
The Web Audio API (Chrome) and Audio Data API (Firefox) help address the missing features by giving you the ability to synthesize and process audio on-the-fly without any browser plug-ins (see Resources). Both APIs are still under development and are only supported in Chrome 14+ and Firefox 4+. Unfortunately, they are also quite different in implementation. There are great libraries to help normalize support, including audiolib.js (see Resources). Chrome's Web Audio API is the standard being pushed through the W3C.
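Because support varies by browser, it can help to detect the Web Audio API before choosing a code path. The following is a minimal sketch, assuming the constructor is exposed either as AudioContext or under the webkit prefix that WebKit-based browsers have used. The getAudioContextCtor name is made up for this example.

```javascript
// Returns the AudioContext constructor if the environment exposes one,
// otherwise null. WebKit-based browsers have used the webkit prefix.
function getAudioContextCtor(globalObj) {
  return globalObj.AudioContext || globalObj.webkitAudioContext || null;
}

// Usage in a browser:
// var Ctor = getAudioContextCtor(window);
// if (Ctor) {
//   var context = new Ctor(); // Web Audio path
// } else {
//   // fall back to the HTML5 audio element
// }
```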
Single sound layering (Polyphonic)
To play the same sound over itself, you must instantiate a separate audio object of that same sound. There is a 1:1 correspondence between the markup and the audio that can be played. No layering is achievable with the current state of HTML5 audio. Other platforms, such as Flash, let you layer a single audio object without having to create a new one.
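On the desktop, one common way to approximate layering is to keep several audio objects for the same sound and rotate through them. The following is a sketch of that idea under stated assumptions: createAudioPool and the factory argument are hypothetical names, and the approach does not help in mobile Safari, where only one stream can play at a time.

```javascript
// Hypothetical pool: keeps `size` (at least 1) audio objects for one
// sound and hands them out round-robin so plays can overlap.
function createAudioPool(createAudio, size) {
  var pool = [];
  var next = 0;
  for (var i = 0; i < size; i++) {
    pool.push(createAudio());
  }
  return {
    play: function () {
      var audio = pool[next];
      next = (next + 1) % pool.length;
      if (audio.readyState > 0) {
        audio.currentTime = 0; // rewind if this copy was used before
      }
      audio.play();
      return audio;
    }
  };
}

// Usage in a desktop browser (file name is a placeholder):
// var lasers = createAudioPool(function () {
//   return new Audio('laser.m4a');
// }, 4);
// lasers.play();
// lasers.play(); // overlaps the first instance
```

The pool size caps how many copies of the sound can overlap, so it is a trade-off between memory and polyphony.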
iOS, mobile Safari, and HTML5 audio limitations
HTML5 audio is already somewhat limited, and mobile Safari adds another layer of limitations to the most basic uses of HTML5 audio.
One of the biggest limitations imposed by mobile Safari is that only a single audio stream can be played at one time. HTML5 media elements in mobile Safari are singletons, so only a single HTML5 audio (and HTML5 video) stream can be playing at one time. Apple has offered no explanation for this limitation, but one can assume it is to reduce data charges (as is the reason for most other iOS HTML5 limitations).
iOS provides mobile Safari with only a single HTML5 media (audio and video) container. If you play an audio stream while another is currently playing, the previous audio stream will be removed from the container and the new stream will be instantiated in its place.
Listing 2 shows how calling play() while another stream is playing will stop the previous stream (in this case, audio1).
Listing 2. Single audio stream
var audio1 = document.getElementById('audio1');
var audio2 = document.getElementById('audio2');
audio1.play(); // this stream will immediately stop when the next line is run
audio2.play(); // this will stop audio1
See and hear this example in action.
It's important to keep in mind that audio and video are interchangeable. If an audio file is played while a video is playing, the video will stop. Only one audio or video stream can be playing at a time, as shown in Listing 3.
Listing 3. Interchangeable audio video stream
var audio = document.getElementById('audio');
var video = document.getElementById('video');
video.play();
// at a later time
audio.play(); // this will stop video
Audio files cannot be auto-played on page load in mobile Safari. Audio files can only be loaded from a user-triggered touch (click) event. If the autoplay attribute is used in the HTML markup, mobile Safari will ignore the attribute and not play the file on page load, like so:
<audio id="audio" src="audio_file.mp3" autoplay></audio>
The Safari Developer Guide has details on the matter (see Resources).
Audio streams cannot be loaded unless triggered by a user touch event such as onmousedown, onmouseup, onclick, or ontouchstart. Figure 1 shows an example.
Figure 1. Workflow to load audio in mobile Safari
If the code in Listing 4 is run on page load, the audio stream will not be loaded, or even downloaded, in mobile Safari.
Listing 4. Playing an audio stream on page load will silently fail
var audio = document.getElementById('audio');
audio.play();
Even if the preload attribute is used in the HTML markup, mobile Safari ignores the attribute and will not load the file until triggered by a user touch event, as shown in Listing 5.
Listing 5. preload attribute not supported in mobile Safari
<audio id="audio" src="audio_file.mp3" preload="auto"></audio>
See and hear this example in action.
On desktop Safari, the code in Listing 5 will download the audio file on page load. However, on mobile Safari, the attribute will be ignored and the audio file will not be downloaded.
There are a few additional quirks to consider when using HTML5 audio in mobile Safari.
There is a delay of a few seconds when initializing a new audio stream because iOS has to instantiate a new audio object. Listing 6 shows when the delay occurs.
Listing 6. HTML5 audio delay when switching between audio objects
var audio1 = document.getElementById('audio1');
var audio2 = document.getElementById('audio2');
audio1.play();
// at a later time
audio2.play();
// there will be a few-seconds delay as iOS is instantiating a new audio object.
// at an even later time
audio1.play(); // there will also be a few-seconds delay, as the audio object
// for audio1 in iOS was destroyed when we played audio2.
See and hear this example in action.
It's important to ensure your logic does not assume the audio streams are loaded on page load. While calling play() will fail silently, trying to set the currentTime on an audio stream that hasn't had its metadata loaded will throw a fatal error, as shown in Listing 7.
Listing 7. Setting currentTime on an audio stream that hasn't had metadata loaded
// run on page load
var audio = document.getElementById('audio');
audio.play(); // This will silently fail
audio.currentTime = 2; // This will throw a fatal error because the metadata
// for the audio does not exist
See and hear this example in action.
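One way to avoid the error is to defer the seek until the metadata has arrived. The following is a minimal sketch using the standard readyState property and loadedmetadata event. The seekWhenReady name is an assumption for illustration, and in mobile Safari the metadata still only loads after a user-triggered event.

```javascript
// Hypothetical helper: sets currentTime right away when metadata is
// available, otherwise defers the seek until 'loadedmetadata' fires.
function seekWhenReady(audio, time) {
  var HAVE_METADATA = 1; // value of HTMLMediaElement.HAVE_METADATA
  if (audio.readyState >= HAVE_METADATA) {
    audio.currentTime = time;
    return;
  }
  var onMetadata = function () {
    audio.removeEventListener('loadedmetadata', onMetadata, false);
    audio.currentTime = time;
  };
  audio.addEventListener('loadedmetadata', onMetadata, false);
}

// Usage in a browser:
// seekWhenReady(document.getElementById('audio'), 2);
```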
Audio files cannot be cached in a mobile manifest on iOS. This is only applicable when using a manifest for an offline web application. If an audio file is included in the manifest, iOS will ignore it and not cache the file. Every time the web application needs access to the audio file it will need to access the file from the network.
Mobile Safari does not respect the volume and playbackRate properties when they are set programmatically with JavaScript. Volume is always under user control, so the volume property always stays at 1. The playbackRate property will report the new value you assign, but the actual rate of playback for the audio stream will not change. This creates some complications with the ratechange event, which is discussed in Unsupported events.
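If your interface includes a volume control, you may want to check whether script can actually change the volume before showing it. The following is a rough sketch that relies on the behavior described above, where the value read back stays at 1 in mobile Safari. The canSetVolume name is hypothetical, and the check is only as reliable as that assumption.

```javascript
// Hypothetical check: tries to change the volume and reads it back.
// Assumes that a browser ignoring the change keeps the old value.
function canSetVolume(audio) {
  var original = audio.volume;
  var probe = original === 0.5 ? 0.25 : 0.5;
  audio.volume = probe;
  var supported = audio.volume === probe;
  audio.volume = original; // restore whatever was there
  return supported;
}

// Usage in a browser:
// if (!canSetVolume(document.getElementById('audio'))) {
//   // hide the custom volume slider
// }
```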
Before iOS 5, the loop attribute was not supported. To work around the lack of support, add a listener for the ended event and, in that function, call play(). Listing 8 shows an example.
Listing 8. Looping audio workaround for iOS < 5
var audio = document.getElementById('audio');
audio.play();
var onEnded = function() {
this.play();
};
audio.addEventListener('ended', onEnded, false);
See and hear this example in action.
Solutions for mobile Safari's HTML5 audio shortcomings all depend on the usage. If you only want to play a single audio file or a playlist of audio files, not much will need to change. However, if interactive sound effects are needed, things can get a bit tricky.
One solution to the single audio stream limitation is to simply swap out the source file with the audio needed, as shown in Listing 9. This is not an ideal solution because you need to wait for the new audio stream to load before you can play it.
Listing 9. Swapping out an audio object's source
var audio = document.getElementById('audio');
audio.play();
// at some later point in your script (does not need to be from a touch event)
audio.src = 'newfile.m4a';
audio.play(); // there will be a slight delay while the new audio file loads
See and hear this example in action.
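If you would rather start playback only once the new file is ready, for example to show a loading indicator in the meantime, the swap can wait for the canplaythrough event. The following is a sketch under that assumption. The swapSource name is hypothetical, and it presumes the audio element has already been activated by an earlier user event.

```javascript
// Hypothetical helper: swaps the source and starts playback only after
// the browser reports it can play the new file through.
function swapSource(audio, src) {
  var onReady = function () {
    audio.removeEventListener('canplaythrough', onReady, false);
    audio.play();
  };
  audio.addEventListener('canplaythrough', onReady, false);
  audio.src = src;
  audio.load(); // start fetching the new file
}

// Usage in a browser:
// swapSource(document.getElementById('audio'), 'newfile.m4a');
```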
A better way to solve the single audio stream limitation is to use an audio sprite. In short, you combine all your audio into a single audio stream and then play portions of the stream. Audio sprites are covered in more detail later in this article.
There is no workaround for the autoplay limitation. As mentioned, audio streams can only be loaded from a user-touch event. When developing for mobile Safari, it's important to adjust your workflow as necessary to accommodate this limitation. (From experience, I know that a lot of refactoring will happen if this isn't taken into consideration from the start.)
Before iOS 4.2.1, you could load an audio file from the callback of a synchronous Ajax call, as in the example in Listing 10.
Listing 10. Loading an audio stream in the callback of an Ajax call before iOS 4.2.1
// run on page load
var audio = document.getElementById('audio');
jQuery.ajax({
url: 'ajax.js',
async: false,
success: function() {
audio.play(); // audio will play in iOS before 4.2.1
}
});
Hear this example in action.
There's an issue with the method in Listing 10: It's a synchronous Ajax call, so the browser is locked until the call is complete. In mobile Safari, locked doesn't mean just the page is locked—the entire application is locked. If an error occurs and mobile Safari gets stuck in a locked state (not terribly likely), the only way to exit is to click the home button and force-close the application.
Apple patched this workaround in iOS 4.2.1, so it no longer works in iOS 4.2.1 or later.
Audio streams cannot be loaded unless triggered by a user event. As shown in Listing 11, onmousedown, onmouseup, onclick, and ontouchstart are valid events that will successfully load an audio stream when called within a callback. Note that this is only for loading an audio file; calling play() on a file that has already loaded will work as expected.
Listing 11. Using a user-triggered event to load an audio stream
// run on page load
var button = document.getElementById('button');
var audio = document.getElementById('audio');
var onClick = function() {
audio.play(); // audio will load and then play
};
button.addEventListener('click', onClick, false);
See and hear this example in action.
At first glance, Listing 11 may seem like an annoying workaround. However, it's a best practice to give your game or interactive experience a splash screen, as in Figure 2, that requires the user to click a button to start. When the user clicks the start button, you can use that event to load the audio in your project.
Figure 2. Cut the Rope HTML5 splash screen
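The splash-screen approach can be sketched as a one-time handler on the start button. This example assumes a single audio element (such as an audio sprite) and uses load() rather than play(), so that nothing is heard until the game asks for it. The loadOnFirstTouch name and the optional callback are hypothetical.

```javascript
// Hypothetical helper: loads the audio the first time the start button
// is activated, then removes its own listeners.
function loadOnFirstTouch(button, audio, onStart) {
  var events = ['touchstart', 'click'];
  var handler = function () {
    for (var i = 0; i < events.length; i++) {
      button.removeEventListener(events[i], handler, false);
    }
    audio.load(); // called inside a user event, so loading is allowed
    if (onStart) {
      onStart();
    }
  };
  for (var j = 0; j < events.length; j++) {
    button.addEventListener(events[j], handler, false);
  }
}

// Usage in a browser:
// loadOnFirstTouch(
//   document.getElementById('button'),
//   document.getElementById('audio'),
//   function () { /* hide the splash screen and start the game */ }
// );
```

Listening for both touchstart and click covers touch and mouse input, and removing both listeners in the handler keeps the audio from being reloaded when a tap produces both events.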
Though HTML5 audio in mobile Safari supports all media events from the desktop, note that some events will never fire because of a few unsupported properties mentioned previously. There are also a few quirks to be aware of.
Table 2 lists all the event callbacks for the audio element and their compatibility on desktop and mobile Safari. The results are based on an HTML5 audio event debugger set up by the author, which you can play around with if you choose.
Table 2. Desktop versus mobile Safari support for media events
| Event | Description | Desktop | Mobile Safari |
|---|---|---|---|
| abort | The browser stops fetching the media before the media was completely downloaded. | X | X |
| canplay | The browser can resume playback of the media data, but estimates that if playback has started, the media resource could not be rendered at the current playback rate up to its end without having to stop for further buffering of content. | X | X |
| canplaythrough | The browser estimates that if playback is started now, the media resource could be rendered at the current playback rate all the way to its end without having to stop for further buffering. | X | X |
| durationchange | The duration property changes. | X | X |
| emptied | The media element network state changes to the NETWORK_EMPTY state. | X | X |
| ended | Playback has stopped at the end of the media resource and the ended property is set to true. | X | X |
| error | An error occurs while fetching the media data. Use the error property to get the current error. | X | X |
| loadeddata | The browser can render the media data at the current playback position for the first time. | X | X |
| loadedmetadata | The browser knows the duration and dimensions of the media resource. | X | X |
| loadstart | The browser begins loading the media data. | X | X |
| pause | Playback pauses after the pause method returns. | X | X |
| play | Playback starts after the play method returns. | X | X |
| playing | Playback starts. | X | X |
| progress | The browser is fetching the media data. | X | X |
| ratechange | Either the defaultPlaybackRate or the playbackRate property changes. | X | X (shouldn't) |
| seeking | The seeking property is set to true and there is time to send this event. | X | X* |
| seeked | The seeking property is set to false. | X | X* |
| stalled | The browser is fetching media data but it has stopped arriving. | X | X |
| suspend | The browser suspends loading the media data and does not have the entire media resource downloaded. | X | X |
| timeupdate | The currentTime property changes as part of normal playback or because of some other condition. | X | X |
| volumechange | Either the volume property or the muted property changes. | X | |
| waiting | The browser stops playback because it is waiting for the next frame. | X | X |
The following list provides some notes on a few of the event callbacks.
- ratechange: The ratechange event is fired whenever the playbackRate is changed. As mentioned, changing the playback rate of an audio stream (as well as video) is not supported in mobile Safari, so the ratechange event should never fire. However, as of iOS 5.1.1, HTML5 audio will still fire the ratechange event even though the actual playback rate hasn't changed.
- volumechange: Volume cannot be set using JavaScript, so the volumechange event will never be fired. Even if the user changes the volume on their device while mobile Safari is open, this event will not fire.
- seeking/seeked: Mobile Safari only supports the seeking and seeked events when the seeking is done through JavaScript, as shown in Listing 12. If the built-in controls are displayed and the user seeks using the progress bar, seeking and seeked do not fire as expected.
Listing 12. Setting currentTime will trigger seeking and seeked events
var audio = document.getElementById('audio');
audio.currentTime = 60; // seeking and seeked will be fired
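To check which of these events fire on a particular device, you can attach one listener per event and record what arrives. The following is a small sketch in the spirit of an event debugger; it is not the author's debugger, and the MEDIA_EVENTS and logMediaEvents names are assumptions. The event names are those listed in Table 2.

```javascript
// The media events listed in Table 2.
var MEDIA_EVENTS = [
  'abort', 'canplay', 'canplaythrough', 'durationchange', 'emptied',
  'ended', 'error', 'loadeddata', 'loadedmetadata', 'loadstart',
  'pause', 'play', 'playing', 'progress', 'ratechange', 'seeking',
  'seeked', 'stalled', 'suspend', 'timeupdate', 'volumechange', 'waiting'
];

// Hypothetical helper: records the type of every media event fired on
// the element by appending it to the `log` array.
function logMediaEvents(audio, log) {
  var record = function (event) {
    log.push(event.type);
  };
  for (var i = 0; i < MEDIA_EVENTS.length; i++) {
    audio.addEventListener(MEDIA_EVENTS[i], record, false);
  }
}

// Usage in a browser:
// var log = [];
// logMediaEvents(document.getElementById('audio'), log);
// // interact with the audio, then inspect `log`
```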
Using an audio sprite is one of the best solutions to overcome the need for multiple sounds in mobile Safari. Much like a CSS image sprite, an audio sprite combines all your audio into a single stream, as shown in Figure 3.
Figure 3. Audio sprite
The principle is straightforward. You will need to store the data for each sprite: starting position, ending position or length, and an ID. When you want to play a particular sprite, you set the currentTime of the audio stream to the start position and call play(). Listing 13 shows an example.
Listing 13. Simple audio sprite implementation
// audioSprite has already been loaded using a user touch event
var audioSprite = document.getElementById('audio');
var spriteData = {
meow1: {
start: 0,
length: 1.1
},
meow2: {
start: 1.3,
length: 1.1
},
whine: {
start: 2.7,
length: 0.8
},
purr: {
start: 5,
length: 5
}
};
// play meow2 sprite
audioSprite.currentTime = spriteData.meow2.start;
audioSprite.play();
Listing 13 will play the meow2 sprite and, because there isn't logic implemented to stop when the sprite is complete, it will also play the whine and purr sprites. By adding a listener for the timeupdate event, as shown in Listing 14, you can watch the currentTime and stop the audio when the sprite reaches its end.
Listing 14. Adding logic to stop the stream when it reaches the end of a sprite
var handler = function() {
if (this.currentTime >= spriteData.meow2.start + spriteData.meow2.length) {
this.pause();
}
};
audioSprite.addEventListener('timeupdate', handler, false);
See and hear this example in action.
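Listings 13 and 14 handle a single hard-coded sprite. The same idea can be generalized so that any sprite can be played by its ID. The following is one possible sketch; the createSpritePlayer name and its return shape are assumptions, and it expects sprite data in the same start and length format as Listing 13.

```javascript
// Hypothetical sprite player: plays one named sprite at a time and
// pauses the stream when that sprite's end time is reached.
function createSpritePlayer(audio, spriteData) {
  var stopAt = null; // end time of the current sprite, or null

  audio.addEventListener('timeupdate', function () {
    if (stopAt !== null && audio.currentTime >= stopAt) {
      audio.pause();
      stopAt = null;
    }
  }, false);

  return {
    play: function (id) {
      var sprite = spriteData[id];
      if (!sprite) {
        return false; // unknown sprite ID
      }
      audio.currentTime = sprite.start;
      stopAt = sprite.start + sprite.length;
      audio.play();
      return true;
    }
  };
}

// Usage in a browser, after the stream was loaded from a user event:
// var player = createSpritePlayer(audioSprite, spriteData);
// player.play('meow2');
```

Registering a single timeupdate listener up front avoids adding a new listener every time a sprite is played.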
A great advantage of using an audio sprite is that, assuming the entire audio sprite is loaded, there is no delay when switching between sprites, unlike when switching between audio streams. Having all streams in one file also cuts down on HTTP requests.
Be aware that changing currentTime isn't 100% accurate. Setting the currentTime to 6.5 can actually seek to 6.7, or 6.2. A small amount of space is needed between each sprite to avoid seeking to the end of another sprite. Adding this space can add a slight delay if the stream seeks to 6.4 when the sprite starts at 6.8 seconds.
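When building the sprite data, the padding between sprites can be computed rather than typed by hand. The following is a sketch that lays clips out end to end with a fixed gap; the layoutSprites name, the input format, and the gap value in the usage note are assumptions, and the right gap depends on how inaccurate seeking is on your target devices.

```javascript
// Hypothetical helper: given clip lengths in playback order, returns
// sprite data with `gap` seconds of silence between sprites.
function layoutSprites(clips, gap) {
  var data = {};
  var position = 0;
  for (var i = 0; i < clips.length; i++) {
    data[clips[i].id] = { start: position, length: clips[i].length };
    position += clips[i].length + gap;
  }
  return data;
}

// Usage (the 0.5 second gap is only an example value):
// var spriteData = layoutSprites([
//   { id: 'meow1', length: 1.1 },
//   { id: 'meow2', length: 1.1 }
// ], 0.5);
```

The audio file itself must be produced with the same gaps for the computed start positions to line up.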
Ensure that the entire audio stream is loaded before accessing any sprites. If the audio stream isn't completely loaded and an attempt is made to access a portion of the stream that hasn't loaded yet, the stream will need to buffer, and a delay will occur while it loads.
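One way to check whether the whole stream has arrived is to compare the element's buffered time ranges with its duration. The following is a minimal sketch using the standard buffered and duration properties. The isFullyBuffered name and the 0.1 second default tolerance are assumptions, since the reported end of the buffered range may not match the duration exactly.

```javascript
// Hypothetical check: true when a single buffered range covers the
// stream from the start to (roughly) its full duration.
function isFullyBuffered(audio, tolerance) {
  var slack = typeof tolerance === 'number' ? tolerance : 0.1;
  var duration = audio.duration;
  // duration is NaN until the metadata has loaded
  if (!(duration > 0) || !isFinite(duration)) {
    return false;
  }
  var ranges = audio.buffered;
  return ranges.length === 1 &&
    ranges.start(0) <= slack &&
    ranges.end(0) >= duration - slack;
}

// Usage in a browser, for example from a 'progress' listener:
// if (isFullyBuffered(audioSprite)) { /* safe to play any sprite */ }
```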
See and hear an example of an audio sprite framework. The example takes into consideration the topics covered in this article.
How directCanvas and multiSound accelerate HTML5 game performance
AppMobi has developed an interesting solution to overcome the various HTML5 limitations on mobile devices with directCanvas and multiSound (see Resources). directCanvas and multiSound use the native capabilities of a device within a standard HTML5 browser application. Slow graphical performance, and the limitations discussed in this article, are no longer an issue; you get the full performance benefits of a native application.
When a user navigates to a site that makes use of directCanvas, the page will prompt the user to download the MobiUs application from the App Store. If the user already has the application installed on their device, then the page will be opened in the MobiUs application.
AppMobi has videos on their site that show side-by-side comparisons of games running in their MobiUs application and games running in mobile Safari. The results are quite amazing, offering a 10X performance boost, as shown in Figure 4.
Figure 4. Average HTML5 performance improvement from mobile Safari to MobiUs app using directCanvas
AppMobi's API site has great documentation, so you can jump right in. The SDK is free to download, and there is also a handy Google Chrome extension that lets you develop in your desktop browser.
Though it's not ideal to require users to install an application on their device, AppMobi's solution is interesting and warrants consideration. Currently, the MobiUs application is not available in the App Store, but MobiUs assures it will be back soon.
Despite the limitations discussed in this article, HTML5 audio is a welcome addition to mobile Safari and you should take advantage of it. In this article, you learned about the limitations on both the desktop and mobile Safari, walked through solutions to the limitations, and explored the advantages of using audio sprites in mobile Safari. Being aware of the mobile Safari limitations can increase its usability for you.
As a developing specification, HTML5 audio is sure to evolve, but there is no reason to wait until the spec is final in (supposedly) 2014. With near universal HTML5 audio compatibility for all iOS users, there is no reason not to use it.
| Description | Name | Size | Download method |
|---|---|---|---|
| Article source code | html5audio.article.source.zip | 4073KB | HTTP |
Learn
- More iOS device and OS version stats from Instapaper: See recent data and trends from people using the Instapaper application.
- License requirements for MP3: Get mp3, mp3HD, and mp3surround patent and licensing information.
- AAC licensing requirements: Learn about licensed products (other than PC Software) standard rates.
- HTML5 Audio: Read more about HTML5 Audio on Wikipedia.
- HTML5 audio lacks the ability to manipulate sound. Read how the Web Audio API (Chrome) and Audio Data API (Firefox) help address the missing features and give you the ability to synthesize and process audio on-the-fly without any browser plug-ins.
- How can I autoplay media in iOS >= 4.2.1 Mobile Safari?: Get the answer to the question.
- appMobi: Learn how appMobi has solved HTML5 shortcomings with directCanvas and multiSound.
- iOS Specific Considerations: Learn the considerations when embedding audio and video using HTML5.
- HTMLMediaElement Class Reference: Learn more from the Safari Extensions Development Guide.
- Getting Started with Web Audio API: Read this article on the HTML5 Rocks website.
- Audio Data API: Learn more about the Audio Data API on the MozillaWiki.
- Web Audio API: Check out the official specification from W3C.
- WHATWG: Explore this community of developers working with the W3C to fine-tune HTML5.
- developerWorks Web development zone: Find articles covering various web-based solutions. See the Web development technical library for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
- The developerWorks community: Personalize your developerWorks experience.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- developerWorks on Twitter: Join today to follow developerWorks tweets.
- developerWorks on-demand demos: Watch demos ranging from product installation and setup for beginners to advanced functionality for experienced developers.
Get products and technologies
- audiolib.js: Install the powerful toolkit for audio written in JS.
- directCanvas: Download the directCanvas SDK, a collection of HTML5 game acceleration technologies, to solve several HTML5 shortcomings.
- IBM product evaluation versions: Download or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

Aaron Gloege discovered his passion for programming in 2006. A course on web development at Brown College led him to dedicate his free time to becoming a self-taught JavaScript, iOS, and PHP guru. After graduating from Brown College in 2007 with an associate degree in applied science of visual communications, Aaron was hired as the lead web and interactive developer at Greatapes/MediaXpress. In early 2011, Aaron started as a software engineer at The Nerdery, where he quickly gained a reputation as a talented developer, and a reliable and dedicated team member and project lead.



