Overcoming iOS HTML5 audio limitations

Solutions and workarounds for mobile Safari

Though HTML5 audio can be great, it has many limitations as a still-developing specification. Mobile Safari introduces even more limitations. In this article, learn about HTML5 limitations in mobile Safari. Working examples provide solutions and comprehensive workarounds. Learn the advantages of using audio sprites in mobile Safari, and try a few unique solutions to bypass all HTML5 limitations in iOS.


Aaron Gloege (agloege@nerdery.com), Software Engineer, The Nerdery

Aaron Gloege discovered his passion for programming in 2006. A course on web development at Brown College led him to dedicate his free time to becoming a self-taught JavaScript, iOS, and PHP guru. After graduating from Brown College in 2007 with an associate degree in applied science of visual communications, Aaron was hired as the lead web and interactive developer at Greatapes/MediaXpress. In early 2011, Aaron started as a software engineer at The Nerdery, where he quickly gained a reputation as a talented developer, and a reliable and dedicated team member and project lead.



09 October 2012


Introduction

For several years now, developers have been producing full-fledged interactive experiences that run, more or less, right in the browser. Such sites usually require browser plug-ins (Flash). With the advent of smartphones and tablets, interactive experiences seemed a perfect fit for the new gadgets. Because of the limited processing power of mobile devices, however, browser plug-ins were no longer a viable platform for development.

Frequently used abbreviations

  • AAC: Advanced Audio Coding
  • CSS: Cascading Style Sheet
  • HTML: HyperText Markup Language
  • MP3: MPEG-1 Audio Layer 3
  • OGG: An open container format
  • WAV: Waveform Audio Format

HTML5 has added a huge palette of in-browser tools that require no extra plug-ins. The HTML5 specification from the W3C is still under development, but browsers are providing support as the spec evolves.

HTML5 audio is a powerful advancement for embedding sound in the browser, especially on mobile devices such as iOS's mobile Safari browser. Though HTML5 audio is a new feature, it has support in iOS. According to developers of the popular mobile application Instapaper, 98.8% of its iOS users in November 2011 were using at least iOS 4 (see Resources). Because HTML5 audio was introduced to mobile Safari in iOS 3, you can be assured there is almost universal support for HTML5 audio on the iOS platform.

In this article, learn about the HTML5 limitations for the desktop and in mobile Safari, and try some solutions for creating interactive sound effects. Also covered are: unsupported events, audio sprites, and how to use directCanvas and multiSound to accelerate HTML5 game performance.

It is important to note that with iOS 6, Apple has added support for the Web Audio API (discussed below), which removes the need for a lot of the workarounds discussed in this article. However, iOS 6 has only been out for a few weeks, so iOS 5 still has the majority of the market. The issues discussed and the workarounds provided in this article are still valid and should be considered when developing audio for mobile Safari.

You can download the source code for the examples used in this article.


Limitations of HTML5 audio

Before discussing the limitations in mobile Safari, it's important to understand the limitations of HTML5 audio on the desktop. HTML5 audio is both robust and limiting, depending largely on its implementation. It works well for music players (such as a jukebox player) or simple sound effects, but leaves much to be desired for sound-intensive applications such as games.

Format support

Unfortunately, not all browsers support the same audio file format. As shown in Table 1, there are currently four major formats: MP3, OGG, WAV, and AAC.

Table 1. HTML5 audio format support
Ogg VorbisWAVPCMAAC
Internet Explorer 9XX
FirefoxXX
Chrome/Safari/mobile SafariXXX

To cover all browsers, it's best to have all audio streams as both Ogg Vorbis and AAC.

Why isn't MP3 included? MP3 comes with hefty royalty payments when distributed commercially. The license requirements for MP3 will claim a distribution fee of 2% of all revenue over $100K (see Resources). For this reason, I prefer AAC over MP3. AAC is not royalty-free, but it has a much more relaxed license that allows free distribution. AAC also provides better compression, allowing for smaller file sizes—a blessing in the web world (see Resources).

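Ogg Vorbis wins my vote overwhelmingly because it is open-source, patent-free, and royalty-free. However, its browser support is limited: Internet Explorer 9 and Safari (desktop and mobile) do not play it, so it cannot be the only format you serve.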

Listing 1 shows what cross-browser compatible HTML markup should look like.

Listing 1. HTML markup for the audio element
<audio>
    <!-- AAC file (Chrome/Safari/IE9) -->
    <source src="sound.m4a" type="audio/mp4" />
    <!-- Ogg Vorbis (Firefox) -->
    <source src="sound.ogg" type="audio/ogg" />
</audio>
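
When the source is chosen from script rather than markup, the browser can be asked which format it is likely to play. The following is a minimal sketch, not part of the article's download: pickSource is a hypothetical helper, and the file names and MIME types are assumptions. It relies only on the standard canPlayType() method, which returns an empty string, "maybe", or "probably".

```javascript
// Hypothetical helper: returns the src of the first candidate the element
// claims it can play, or null if none qualifies.
// `audio` is any object with a canPlayType() method (an audio element in a browser).
function pickSource(audio, candidates) {
    for (var i = 0; i < candidates.length; i++) {
        // canPlayType() returns '' when the type is not supported
        if (audio.canPlayType(candidates[i].type) !== '') {
            return candidates[i].src;
        }
    }
    return null;
}

// Assumed file names and MIME types
var candidates = [
    { src: 'sound.m4a', type: 'audio/mp4' },
    { src: 'sound.ogg', type: 'audio/ogg' }
];

// In a browser:
// var audio = new Audio();
// var src = pickSource(audio, candidates);
// if (src) { audio.src = src; }
```

Because canPlayType() only reports a guess, "maybe" is treated as good enough here; a stricter version could prefer candidates that return "probably".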

Manipulation and effects

When dealing with audio, a powerful feature is the ability to manipulate the sound. Whether it's synthesizing sound on-the-fly, processing sound effects, applying environmental effects, or even doing basic stereo panning, HTML5 audio lacks all manipulation abilities. The audio you load is the audio that is played.

The Web Audio API (Chrome) and Audio Data API (Firefox) help address the missing features by giving you the ability to synthesize and process audio on-the-fly without any browser plug-ins (see Resources). Both APIs are currently still under development and are only supported in Chrome 14+ and Firefox 4+. Unfortunately, they are also quite different in implementation. There are great libraries to help normalize support, including audiolibjs (see Resources). Chrome's Web Audio API is the standard being pushed through the W3C.
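
Because support varies by browser and version, code that wants to use the Web Audio API typically checks for it first and falls back to the audio element otherwise. A minimal detection sketch follows. The function name is hypothetical, and the check for a webkit-prefixed constructor reflects how WebKit-based browsers exposed the API at the time of writing.

```javascript
// Hypothetical helper: returns the Web Audio context constructor if the
// given global object exposes one (prefixed or not), otherwise null.
function getAudioContextConstructor(global) {
    return global.AudioContext || global.webkitAudioContext || null;
}

// In a browser:
// var Ctor = getAudioContextConstructor(window);
// if (Ctor) {
//     var context = new Ctor();   // Web Audio API is available
// } else {
//     // fall back to the <audio> element techniques in this article
// }
```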

Single sound layering (Polyphonic)

To play the same sound over itself, you must instantiate a separate audio object of that same sound. There is a 1:1 correspondence between the markup and the audio that can be played. No layering is achievable with the current state of HTML5 audio. Other platforms, such as Flash, let you layer a single audio object without having to create a new one.
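
On the desktop, the usual way to live with this is to keep a small pool of audio objects for the same sound and cycle through them. The sketch below illustrates the idea under assumptions: createSoundPool and the file name are hypothetical, and the approach does not help in mobile Safari, where only one stream can play at a time.

```javascript
// Hypothetical pool: several audio objects for one sound, used round-robin,
// so a new playback can start while earlier copies are still sounding.
function createSoundPool(createAudio, size) {
    var pool = [];
    var next = 0;
    for (var i = 0; i < size; i++) {
        pool.push(createAudio());
    }
    return {
        play: function () {
            var audio = pool[next];
            next = (next + 1) % pool.length;
            // Only rewind once metadata is available (readyState > 0)
            if (audio.readyState > 0) {
                audio.currentTime = 0;
            }
            audio.play();
            return audio;
        }
    };
}

// In a desktop browser (file name is an assumption):
// var laser = createSoundPool(function () { return new Audio('laser.ogg'); }, 4);
// laser.play();
// laser.play(); // overlaps the first playback
```

The pool size caps how many copies can overlap; once every copy is in use, the oldest one is restarted.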


iOS, mobile Safari, and HTML5 audio limitations

HTML5 audio is already somewhat limited, and mobile Safari adds another layer of limitations to the most basic uses of HTML5 audio.

Single audio stream

One of the biggest limitations imposed by mobile Safari is that only a single audio stream can be played at one time. HTML5 media elements in mobile Safari are singletons, so only a single HTML5 audio (and HTML5 video) stream can be playing at one time. Apple has offered no explanation for this limitation, but one can assume it is to reduce data charges (as is the reason for most other iOS HTML5 limitations).

iOS provides mobile Safari with only a single HTML5 media (audio and video) container. If you play an audio stream while another is currently playing, the previous audio stream will be removed from the container and the new stream will be instantiated in its place.

Listing 2 shows how calling play() while another stream is playing will stop the previous stream—in this case, audio1.

Listing 2. Single audio stream
var audio1 = document.getElementById('audio1');
var audio2 = document.getElementById('audio2');
audio1.play(); // this stream will immediately stop when the next line is run
audio2.play(); // this will stop audio1

See and hear this example in action.

It's important to keep in mind that audio and video are interchangeable. If an audio file is played while a video is playing, the video will stop. Only one audio or video stream can be playing at a time, as shown in Listing 3.

Listing 3. Interchangeable audio video stream
var audio = document.getElementById('audio');
var video = document.getElementById('video');
video.play();

// at a later time
audio.play(); // this will stop video

Autoplay

Audio files cannot be auto-played on page load in mobile Safari. Audio files can only be loaded from a user-triggered touch (click) event. If the autoplay attribute is used in the HTML markup, mobile Safari will ignore the attribute and not play the file on page load, like so:

<audio id="audio" src="audio_file.mp3" autoplay></audio>

The Safari Developer Guide has details on the matter (see Resources).

Loading audio

Audio streams cannot be loaded unless triggered by a user touch event such as onmousedown, onmouseup, onclick, or ontouchstart. Figure 1 shows an example.

Figure 1. Workflow to load audio in mobile Safari

If the code in Listing 4 is run on page load, the audio stream will not be loaded, or even downloaded, in mobile Safari.

Listing 4. Playing an audio stream on page load will silently fail
var audio = document.getElementById('audio');
audio.play();

Even if the preload attribute is used in the HTML markup, mobile Safari ignores the attribute and will not load the file until triggered by a user touch event, as shown in Listing 5.

Listing 5. preload attribute not supported in mobile Safari
<audio id="audio" src="audio_file.mp3" 
preload="auto"></audio>

See and hear this example in action.

On desktop Safari, the code in Listing 5 will download the audio file on page load. However, on mobile Safari, the attribute will be ignored and the audio file will not be downloaded.

Other quirks

There are a few additional quirks to consider when using HTML5 audio in mobile Safari.

There is a delay of a few seconds when initializing a new audio stream, because iOS instantiates a new audio object. Listing 6 shows where the delay occurs.

Listing 6. HTML5 audio delay when switching between audio objects
var audio1 = document.getElementById('audio1');
var audio2 = document.getElementById('audio2');
audio1.play();

// at a later time
audio2.play(); 
// there will be a few-seconds delay as iOS is instantiating a new audio object. 

// at an even later time
audio1.play(); // there will also be a few-seconds delay, as the audio object 
// for audio1 in iOS was destroyed when we played audio2.

See and hear this example in action.

It's important to ensure your logic does not assume the audio streams are loaded on page load. While calling play() will fail silently, trying to set the currentTime on a yet-to-be-loaded audio stream that hasn't had its metadata loaded will throw a fatal error, as shown in Listing 7.

Listing 7. Setting currentTime on audio stream that hasn't had metadata loaded
// run on page load
var audio = document.getElementById('audio');
audio.play(); // This will silently fail
audio.currentTime = 2; // This will throw a fatal error because the metadata 
// for the audio does not exist

See and hear this example in action.
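
One way to guard against this error is to defer the seek until the metadata has arrived. The following sketch is an illustration rather than the author's code; seekWhenReady is a hypothetical helper built on the standard readyState property and loadedmetadata event.

```javascript
// Hypothetical helper: seek immediately if metadata is available,
// otherwise wait for the loadedmetadata event and seek then.
function seekWhenReady(audio, time) {
    var HAVE_METADATA = 1; // same value as HTMLMediaElement.HAVE_METADATA
    if (audio.readyState >= HAVE_METADATA) {
        audio.currentTime = time;
        return;
    }
    var onMetadata = function () {
        audio.removeEventListener('loadedmetadata', onMetadata, false);
        audio.currentTime = time;
    };
    audio.addEventListener('loadedmetadata', onMetadata, false);
}

// In a browser:
// seekWhenReady(document.getElementById('audio'), 2);
```

In mobile Safari the metadata still will not load until a user-triggered event starts the load, so the deferred seek simply waits until that happens.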

Audio files cannot be cached through a cache manifest on iOS. This is only applicable when using a manifest for an offline web application. If an audio file is included in the manifest, iOS will ignore it and not cache the file. Every time the web application needs the audio file, it must fetch the file from the network.

Mobile Safari does not respect the volume and playbackRate properties when they are set programmatically with JavaScript. Volume is always under the user's control, and changing the playback rate is not supported. The volume property always stays at 1. The playbackRate property will report the new value you assign, but the actual rate of playback for the audio stream will not change. This creates some complications with the ratechange event, which is discussed in Unsupported events.
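
If an interface includes its own volume slider, it can be useful to hide that control where it would have no effect. The sketch below is a feature test under the assumption stated above, namely that the volume property keeps reading 1 when script tries to change it. canSetVolume is a hypothetical name.

```javascript
// Hypothetical feature test: try to change the volume and see whether the
// property actually takes the new value. The original value is restored.
function canSetVolume(audio) {
    var original = audio.volume;
    var probe = original === 1 ? 0.5 : 1; // any value different from the original
    audio.volume = probe;
    var changed = audio.volume === probe;
    audio.volume = original;
    return changed;
}

// In a browser:
// if (!canSetVolume(document.getElementById('audio'))) {
//     // hide the custom volume control
// }
```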

Before iOS 5, the loop attribute was not supported. To work around the lack of support, add a listener for the ended event and, in that function, call play(). Listing 8 shows an example.

Listing 8. Looping audio workaround for iOS < 5
var audio = document.getElementById('audio');
audio.play();

var onEnded = function() {
    this.play();
};

audio.addEventListener('ended', onEnded, false);

See and hear this example in action.


Solutions

Solutions for mobile Safari's HTML5 audio shortcomings all depend on the usage. If you only want to play a single audio file or a playlist of audio files, not much will need to change. However, if interactive sound effects are needed, things can get a bit tricky.

Single audio streams

One solution to the single audio stream limitation is to simply swap out the source file with the audio needed, as shown in Listing 9. This is not an ideal solution because you need to wait for the new audio stream to load before you can play it.

Listing 9. Swapping out an audio object's source
var audio = document.getElementById('audio');
audio.play();

// at some later point in your script (does not need to be from a touch event)
audio.src = 'newfile.m4a';
audio.play(); // there will be a slight delay while the new audio file loads

See and hear this example in action.
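
Because of that loading delay, it can help to let the rest of the application know when the new stream has actually started. The sketch below wraps the swap from Listing 9 in a hypothetical swapSource helper; the two callbacks are assumptions added for illustration, and the playing event is the standard media event listed in Table 2.

```javascript
// Hypothetical helper: swap the source, start playback, and report when the
// new stream is loading and when it has actually begun to play.
function swapSource(audio, src, onLoading, onPlaying) {
    var handler = function () {
        audio.removeEventListener('playing', handler, false);
        onPlaying();
    };
    audio.addEventListener('playing', handler, false);
    audio.src = src;
    onLoading();  // for example, show a loading indicator
    audio.play(); // playback begins once enough of the new file has loaded
}

// In a browser (file name as in Listing 9):
// swapSource(audio, 'newfile.m4a',
//     function () { /* show spinner */ },
//     function () { /* hide spinner */ });
```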

A better way to solve the single audio stream limitation is to use an audio sprite. In short, you combine all your audio into a single audio stream and then play portions of the stream. The Audio sprites section has more detail.

Autoplay

There is no workaround for the autoplay limitation. As mentioned, audio streams can only be loaded from a user-touch event. When developing for mobile Safari, it's important to adjust your workflow as necessary to accommodate this limitation. (From experience, I know that a lot of refactoring will happen if this isn't taken into consideration from the start.)

Before iOS 4.2.1, you could load an audio file from the callback of a synchronous Ajax call, as in the example in Listing 10.

Listing 10. Loading an audio stream in the callback of an Ajax call before iOS 4.2.1
// run on page load
var audio = document.getElementById('audio');

jQuery.ajax({
    url: 'ajax.js',
    async: false,
    success: function() {
        audio.play(); // audio will play in iOS before 4.2.1
    }
});

Hear this example in action.

There's an issue with the method in Listing 10: It's a synchronous Ajax call, so the browser is locked until the call is complete. In mobile Safari, locked doesn't mean just the page is locked; the entire application is locked. If an error occurs and mobile Safari gets stuck in a locked state (not terribly likely), the only way to exit is to press the Home button and force-close the application.

Apple patched this workaround in iOS 4.2.1, so it does not work in iOS 4.2.1 or later.

Loading audio

Audio streams cannot be loaded unless triggered by a user event. As shown in Listing 11, onmousedown, onmouseup, onclick, and ontouchstart are valid events that will successfully load an audio stream when called within a callback. Note that this is only for loading an audio file; calling play() on a file that has already loaded will work as expected.

Listing 11. Using a user-triggered event to load an audio stream
// run on page load
var button = document.getElementById('button');
var audio = document.getElementById('audio');

var onClick = function() {
    audio.play(); // audio will load and then play
};

button.addEventListener('click', onClick, false);

See and hear this example in action.

At first glance, Listing 11 may seem like an annoying workaround. However, it's a best practice to give your game or interactive experience a splash screen, as in Figure 2, that requires the user to click a button to start. When the user clicks the start button, you can use that event to load the audio in your project.

Figure 2. Cut the Rope HTML5 splash screen
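
The splash-screen approach can be sketched as a one-time handler on the start button. This is an illustration built on Listing 11, not code from the article's download; startOnFirstTap and the startGame callback are hypothetical names.

```javascript
// Hypothetical helper: on the first tap of the start button, begin loading
// (and playing) the audio, then hand control to the application.
function startOnFirstTap(button, audio, startGame) {
    var onTap = function () {
        // Run only once, even if the button stays on screen
        button.removeEventListener('click', onTap, false);
        // play() is called inside the tap handler, so mobile Safari
        // will load the stream, as in Listing 11
        audio.play();
        startGame();
    };
    button.addEventListener('click', onTap, false);
}

// In a browser:
// startOnFirstTap(document.getElementById('button'),
//                 document.getElementById('audio'),
//                 function () { /* hide the splash screen, start the game */ });
```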

Unsupported events

Though HTML5 audio in mobile Safari supports all media events from the desktop, note that some events will never fire because of a few unsupported properties mentioned previously. There are also a few quirks to be aware of.

Table 2 lists all the event callbacks for the audio element and their compatibility on desktop and mobile Safari. The results are based on an HTML5 audio event debugger set up by the author, which you can play around with if you choose.

Table 2. Desktop versus mobile Safari support for media events
Event | Description | Desktop | Mobile Safari
abort | The browser stops fetching the media before the media was completely downloaded. | X | X
canplay | The browser can resume playback of the media data, but estimates that if playback has started, the media resource could not be rendered at the current playback rate up to its end without having to stop for further buffering of content. | X | X
canplaythrough | The browser estimates that if playback is started now, the media resource could be rendered at the current playback rate all the way to its end without having to stop for further buffering. | X | X
durationchange | The duration property changes. | X | X
emptied | The media element network state changes to the NETWORK_EMPTY state. | X | X
ended | Playback has stopped at the end of the media resource and the ended property is set to true. | X | X
error | An error occurs while fetching the media data. Use the error property to get the current error. | X | X
loadeddata | The browser can render the media data at the current playback position for the first time. | X | X
loadedmetadata | The browser knows the duration and dimensions of the media resource. | X | X
loadstart | The browser begins loading the media data. | X | X
pause | Playback pauses after the pause method returns. | X | X
play | Playback starts after the play method returns. | X | X
playing | Playback starts. | X | X
progress | The browser is fetching the media data. | X | X
ratechange | Either the defaultPlaybackRate or the playbackRate property changes. | X | X (shouldn't)
seeking | The seeking property is set to true and there is time to send this event. | X | X*
seeked | The seeking property is set to false. | X | X*
stalled | The browser is fetching media data but it has stopped arriving. | X | X
suspend | The browser suspends loading the media data and does not have the entire media resource downloaded. | X | X
timeupdate | The currentTime property changes as part of normal playback or because of some other condition. | X | X
volumechange | Either the volume property or the muted property changes. | X |
waiting | The browser stops playback because it is waiting for the next frame. | X | X

The following list provides some notes on a few of the event callbacks.

ratechange
The ratechange event is fired whenever the playbackRate is changed. As mentioned, changing the playback rate of an audio stream (as well as video) is not supported in mobile Safari, so the ratechange event should never fire. However, as of iOS 5.1.1, HTML5 audio will still fire the ratechange event even though the actual playback rate hasn't changed.
volumechange
Volume cannot be set using JavaScript, so the volumechange event will never be fired. Even if the user changes the volume on their device while mobile Safari is open, this event will not fire.
seeking/seeked
Mobile Safari only supports the seeking and seeked events when the seeking is done through JavaScript, as shown in Listing 12. If the built-in controls are displayed and the user seeks using the progress bar, seeking and seeked do not fire as expected.
Listing 12. Setting currentTime will trigger seeking and seeked events
var audio = document.getElementById('audio');
audio.currentTime = 60; // seeking and seeked will be fired
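
To check which of these events fire on a particular device, a small logger can be attached to an audio element. The sketch below is not the author's event debugger; it is a minimal stand-in, with logMediaEvents as a hypothetical name and the event list taken from Table 2.

```javascript
// Event names from Table 2
var MEDIA_EVENTS = [
    'abort', 'canplay', 'canplaythrough', 'durationchange', 'emptied',
    'ended', 'error', 'loadeddata', 'loadedmetadata', 'loadstart',
    'pause', 'play', 'playing', 'progress', 'ratechange', 'seeking',
    'seeked', 'stalled', 'suspend', 'timeupdate', 'volumechange', 'waiting'
];

// Hypothetical helper: call log(name) every time one of the events fires.
function logMediaEvents(audio, log) {
    MEDIA_EVENTS.forEach(function (name) {
        audio.addEventListener(name, function () {
            log(name);
        }, false);
    });
}

// In a browser:
// logMediaEvents(document.getElementById('audio'), function (name) {
//     console.log(name);
// });
```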

Audio sprites

Using an audio sprite is one of the best solutions to overcome the need for multiple sounds in mobile Safari. Much like a CSS image sprite, an audio sprite combines all your audio into a single stream, as shown in Figure 3.

Figure 3. Audio sprite
An audio sprite is several audio streams joined into a single stream

The principle is straightforward. You will need to store the data for each sprite: starting position, ending position or length, and an ID. When you want to play a particular sprite, you set the currentTime of the audio stream to the start position and call play(). Listing 13 shows an example.

Listing 13. Simple audio sprite implementation
// audioSprite has already been loaded using a user touch event
var audioSprite = document.getElementById('audio');
var spriteData = {
    meow1: {
        start: 0,
        length: 1.1
    },
    meow2: {
        start: 1.3,
        length: 1.1
    },
    whine: {
        start: 2.7,
        length: 0.8
    },
    purr: {
        start: 5,
        length: 5
    }
};

// play meow2 sprite
audioSprite.currentTime = spriteData.meow2.start;
audioSprite.play();

Listing 13 will play the meow2 sprite and, because no logic is implemented to stop when the sprite is complete, it will continue into the whine and purr sprites. By adding a listener for the timeupdate event, as in Listing 14, you can watch the currentTime and pause the audio when the sprite reaches its end.

Listing 14. Adding logic to stop the stream when it reaches the end of a sprite
var handler = function() {
    if (this.currentTime >= spriteData.meow2.start + spriteData.meow2.length) {
        this.pause();
    }
};
audioSprite.addEventListener('timeupdate', handler, false);

See and hear this example in action.
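
Listing 14 hard-codes the meow2 sprite. A slightly more general version keeps track of whichever sprite is currently playing. The sketch below is one possible arrangement, not the framework from the article's download; createSpritePlayer is a hypothetical name, and spriteData is assumed to have the shape shown in Listing 13.

```javascript
// Hypothetical sprite player: one timeupdate listener that stops playback
// at the end of whichever sprite was started last.
function createSpritePlayer(audio, spriteData) {
    var stopAt = null; // end time of the current sprite, or null when idle

    audio.addEventListener('timeupdate', function () {
        if (stopAt !== null && audio.currentTime >= stopAt) {
            audio.pause();
            stopAt = null;
        }
    }, false);

    return {
        play: function (id) {
            var sprite = spriteData[id];
            if (!sprite) {
                throw new Error('Unknown sprite: ' + id);
            }
            stopAt = sprite.start + sprite.length;
            audio.currentTime = sprite.start;
            audio.play();
        }
    };
}

// In a browser, after the stream has been loaded from a user touch event:
// var player = createSpritePlayer(audioSprite, spriteData);
// player.play('meow2');
```

Starting a new sprite while another is playing simply replaces the stop time, so the single stream jumps to the new sprite.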

A great advantage of using an audio sprite is that, assuming the entire sprite is loaded, there is no delay when switching between sprites, unlike when switching between separate audio streams. Having all the audio in one file also cuts down on HTTP requests.

Be aware that changing currentTime isn't 100% accurate. Setting the currentTime to 6.5 can actually seek to 6.7, or 6.2. A small amount of space is needed between each sprite to avoid seeking to the end of another sprite. Adding this space can add a slight delay if the stream seeks to 6.4 when the sprite starts at 6.8 seconds.
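
The start positions for a padded sprite can be computed rather than typed by hand. The sketch below assumes each clip's length is known and that the audio file itself is assembled with the same amount of silence between clips. buildSpriteData, the clip names, and the gap value are all hypothetical.

```javascript
// Hypothetical helper: lay clips end to end with `gap` seconds of silence
// between them and return sprite data in the start/length shape.
function buildSpriteData(clips, gap) {
    var data = {};
    var position = 0;
    for (var i = 0; i < clips.length; i++) {
        data[clips[i].id] = { start: position, length: clips[i].length };
        position += clips[i].length + gap;
    }
    return data;
}

// Example with assumed clip lengths and a half-second gap
var padded = buildSpriteData([
    { id: 'first', length: 1 },
    { id: 'second', length: 2 }
], 0.5);
// padded.second.start is 1.5: the first clip (1 second) plus the gap (0.5)
```

A larger gap tolerates less accurate seeking at the cost of a longer file and, when a seek lands early, a longer stretch of silence before the sprite is heard.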

Ensure that the entire audio stream is loaded before accessing any sprites. This is important because if the audio stream isn't completely loaded, and an attempt is made to access a portion of the stream that isn't loaded yet, the stream will need to be buffered and a delay will occur while it loads.
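
One way to check whether the whole stream has arrived is to inspect the element's buffered time ranges. The sketch below is an assumption-laden illustration: isFullyBuffered is a hypothetical helper, and the 0.1-second tolerance is an arbitrary allowance for rounding at either end of the stream. It uses the standard duration and buffered properties.

```javascript
// Hypothetical helper: true when a single buffered range covers the stream
// from its start to its end, within `tolerance` seconds.
function isFullyBuffered(audio, tolerance) {
    var slack = typeof tolerance === 'number' ? tolerance : 0.1; // assumed default
    var duration = audio.duration;
    // duration is NaN until the metadata has loaded
    if (!isFinite(duration) || duration <= 0) {
        return false;
    }
    var ranges = audio.buffered;
    for (var i = 0; i < ranges.length; i++) {
        if (ranges.start(i) <= slack && ranges.end(i) >= duration - slack) {
            return true;
        }
    }
    return false;
}

// In a browser, for example from a progress event listener:
// audio.addEventListener('progress', function () {
//     if (isFullyBuffered(audio)) { /* safe to play any sprite */ }
// }, false);
```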

Full-featured example

See and hear an example of an audio sprite framework. The example takes into consideration the topics covered in this article.


How directCanvas and multiSound accelerate HTML5 game performance

AppMobi has developed an interesting solution to overcome the various HTML5 limitations on mobile devices with directCanvas and multiSound (see Resources). directCanvas and multiSound use the native capabilities of a device within a standard HTML5 browser application. Slow graphical performance, and the limitations discussed in this article, are no longer an issue; you get the full performance benefits of a native application.

When a user navigates to a site that makes use of directCanvas, the page will prompt the user to download the MobiUs application from the App Store. If the user already has the application installed on their device, then the page will be opened in the MobiUs application.

AppMobi has videos on their site that show side-by-side comparisons of games running in their MobiUs application and games running in mobile Safari. The results are quite amazing, offering a 10X performance boost, as shown in Figure 4.

Figure 4. Average HTML5 performance improvement from mobile Safari to MobiUs app using directCanvas
A 4 column line graph showing percentage used

AppMobi's API site has great documentation, so you can jump right in. The SDK is free to download, and there is also a handy Google Chrome extension that lets you develop in your desktop browser.

Though it's not ideal to require users to install an application on their device, AppMobi has an interesting solution that should warrant consideration. Currently, the MobiUs application is not available in the App Store, but MobiUs assures it will be back soon.


Conclusion

Despite the limitations discussed in this article, HTML5 audio is a welcome addition to mobile Safari and you should take advantage of it. In this article, you learned about the limitations on both desktop and mobile Safari, walked through solutions to the limitations, and explored the advantages of using audio sprites in mobile Safari. Being aware of the mobile Safari limitations lets you plan for them and get more out of HTML5 audio.

As a developing specification, HTML5 audio is sure to evolve, but there is no reason to wait until the spec is final in (supposedly) 2014. With near universal HTML5 audio compatibility for all iOS users, there is no reason not to use it.


Download

Description | Name | Size
Article source code | html5audio.article.source.zip | 4073KB

Resources

Learn

Get products and technologies
