Multifunction multimedia machine, Part 4: Mixing hardware and software for cost control

Movie madness


Level: Introductory

Lewin Edwards (sysadm@zws.com), Design Engineer, Freelance

07 Mar 2006

Explore the technical issues in video playback, and see how a blend of hardware and software achieves good performance at a reasonable cost. Also, Lewin Edwards reveals that MP3 does not mean MPEG-3, which alone is worth the price of admission.

The previous articles in this series show you how to create a scriptable, network-connected appliance that can play back still images and scale them to fit an arbitrary display window. I promised a while ago to show you how to get the device to show movies, and this is the episode where I fulfill that promise.

Movie playback is one of those functions where you really don't want to have to reinvent the wheel. The next few paragraphs discuss -- very briefly -- the history behind the common digital video formats and some of the data processing steps they contain. This information should put the task of movie playback in context and give you some idea of the complexities involved. Since we're using a hardware platform of known capabilities, detailed video decoding specifications don't figure into the design equation the way they might in a "real" (commercial) product; our design has hardware with XYZ capabilities, and we have to make do with that.

MPEG standards

Most of the digital video content you need to care about will be encoded with one of the following three standards, developed by the Moving Picture Experts Group (MPEG):

  • MPEG-1 (finalized in 1992). The fundamental goal of MPEG-1 was to encode audio and video signals for storage on standard 650MB compact discs, with a visual quality approximating that of VHS video, at a data rate of 1.5Mbit/sec. You can store roughly an hour of MPEG-1 video on a single CD-ROM at this compression level. The baseline resolution for NTSC MPEG-1 video is 352x240 pixels; this resolution is referred to as SIF (Source Input Format). MPEG-1 is commonly used in the low-cost VideoCD (VCD) movie format, once wildly popular (and still supported to a certain extent) in Asia.
  • MPEG-2 (finalized in 1994). The goal of MPEG-2 was to achieve higher (broadcast) quality at higher bitrates from 3 to 10 Mbit/sec. MPEG-2 is typically used at the full-frame NTSC resolution of 720x480 pixels. It has better support for interlaced video than MPEG-1, and offers greatly superior video quality at higher bitrates. Note that a compliant MPEG-2 decoder can handle MPEG-1 bitstreams, offering good backwards compatibility. You'll probably be very familiar with MPEG-2 implementations in the form of DVD players and satellite TV broadcasts.
  • MPEG-4 (finalized in 1998). This is a much more flexible and advanced encoding system, offering a wide selection of bitrate versus quality setpoints for different types of content. It performs particularly well (compared to MPEG-1 and -2) at very low-bitrate, low-resolution settings. Unfortunately, it is also an exceedingly complicated standard, rarely fully implemented. The standard includes support for programmable visual objects, content copy protection and other intellectual property management features, and much more.
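The "roughly an hour" figure quoted for MPEG-1 follows from simple arithmetic: divide the disc capacity by the stream bitrate. The sketch below is only a back-of-the-envelope check. The function name playing_minutes is invented for illustration, and it assumes that 1MB means 1,000,000 bytes and that no space is lost to filesystem or sector overhead.

```c
/* Hypothetical helper: estimated playing time, in minutes, of a stream
 * stored at a constant bitrate.
 * Assumptions: 1 MB = 1,000,000 bytes; no filesystem or sector overhead. */
double playing_minutes(double capacity_mb, double bitrate_mbit_per_s)
{
    double capacity_mbit = capacity_mb * 8.0;             /* megabytes -> megabits */
    double seconds = capacity_mbit / bitrate_mbit_per_s;  /* playing time in seconds */
    return seconds / 60.0;
}
```

Under those assumptions, playing_minutes(650.0, 1.5) comes out at a little under 58 minutes, which matches the one-hour estimate.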

If you start researching this topic, you might find mention of a couple of other MPEG standards: MPEG-3, which was an abandoned standard intended to apply to high-definition TV signals (MPEG-2 overtook this role), and MPEG-7, which refers, essentially, to a standardized method of encoding metadata for media objects. Neither of these standards has any direct part to play in a media playback appliance of the type we're building.

Video encoder/decoder compatibility with MPEG standards is specified in terms of "profiles" and "levels," written in the form "profile@level." For example, MPEG-2 "Main Profile, Main Level" (MP@ML) specifies, among other things, a maximum video resolution of 720x480 at 30 frames per second (for NTSC systems). Profiles and levels are defined in excruciating detail in the MPEG standards documents (find some exemplar material in Resources). The main reason for specifying these different profiles and levels is to define the minimum system resources (RAM and processing horsepower) needed to implement a basic player for standard off-the-shelf video media; in fact, some compromises in MPEG-1 profiles specifically address the (then) high cost of moving up from a single DRAM chip to multiple chips, or to the next size bigger.

Note, by the way, that MP3 refers not to MPEG-3, but to MPEG Audio Layer 3. Most MP3 files you would find for download are MPEG-1, Layer 3, which covers the usual 32, 44.1, and 48kHz sampling rates; files at lower sampling rates are MPEG-2, Layer 3. The "layers" are different audio codecs, with Layer 2 used mainly in broadcast applications. These "layers" have nothing to do with the video encoder "level." Confused yet?

You'll observe that the various MPEG encoding formats are open, though not public domain, ISO standards. For the purposes of this article, I'm ignoring the closed video encoding formats such as QuickTime, RealVideo, and Windows® Media Video (and a number of others used in various industries that distribute video content for money -- no, I won't be providing links in Resources!). Note that many so-called proprietary formats are technologically practically identical to MPEG-4, but are wrapped in slight modifications or customized digital rights management shims.

Performance requirements

This leads nicely into a discussion of how to dimension the system for smooth video playback. How much processing power is necessary to play back movies smoothly? Unfortunately, you can't answer this question with a simple MHz rating; it leads to a complicated matrix of CPU multimedia acceleration features versus formats supported versus resolution, audio quality, and so forth. In general, however, motion video decoding encounters the following bottlenecks:

  • File transfer speed. Using lower data rates (implying better compression ratios or lower quality reproduction) can improve performance here. MPEG-4, which can generate good output from exceedingly low data rates, wins on this point.
  • Decryption. This only applies to encrypted video streams, obviously. DVDs and broadcast digital video content are usually encrypted.
  • Stream parsing. This is usually not particularly compute-intensive, but it's one of the tasks you need to do.
  • iDCT (inverse Discrete Cosine Transform) performance. The Discrete Cosine Transform is used to transform the input image data from the spatial domain into the frequency domain; the output data is then quantized and run-length compressed. Decoding the video stream involves reversing these processes, which requires use of the iDCT.
  • MC (motion compensation). You'll find a reasonably informative link about this process in Resources, but in brief: you can find big savings in bitrate by locating objects that are moving onscreen and encoding the resultant frame changes as motion vectors. For example, if you have a spaceship moving across a static background, it takes fewer bits to say "take the shape at position x,y in frame #1 and move it ten pixels right in frame #2" than it does to communicate a list of exactly which pixels have changed between the two frames. (For purists, this description is an oversimplification, especially for MPEG-1 and MPEG-2, but it gets the idea across.)
  • Colorspace conversion. Digital video is compressed in the YUV colorspace, for various reasons -- one of which is that it facilitates cunning space-saving hacks such as keeping all the luminance information but throwing away most of the chrominance, since the human eye is less sensitive to color detail than to overall light intensity. Computer monitors almost universally operate in the RGB colorspace, so a conversion is necessary.
  • Output scaling. This is possibly more important for the multimedia player niche than for single-purpose devices like DVD players. Find more detail on this issue below.
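To give a feel for what the colorspace conversion step in the list above costs per pixel, here is a minimal sketch of a YUV-to-RGB conversion in C. It assumes 8-bit "studio range" samples and the ITU-R BT.601 coefficients in a common fixed-point approximation; the article does not say which matrix a given decoder uses, and the function names are invented for illustration. On the platform discussed here, this work would normally be handed to the video hardware rather than done in software.

```c
#include <stdint.h>

/* Clamp an intermediate result to the 0..255 range of an 8-bit channel. */
static uint8_t clamp_u8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* Convert one pixel from 8-bit studio-range YUV (assumed BT.601:
 * Y in 16..235, U and V centered on 128) to 8-bit RGB.
 * Coefficients are scaled by 256; the +128 rounds before the division. */
void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v,
                uint8_t *r, uint8_t *g, uint8_t *b)
{
    int c = (int)y - 16;    /* luminance with the black offset removed */
    int d = (int)u - 128;   /* blue-difference chrominance */
    int e = (int)v - 128;   /* red-difference chrominance */

    *r = clamp_u8((298 * c + 409 * e + 128) / 256);
    *g = clamp_u8((298 * c - 100 * d - 208 * e + 128) / 256);
    *b = clamp_u8((298 * c + 516 * d + 128) / 256);
}
```

Even in this integer form, each pixel needs several multiplications plus clamping, and a 352x240 frame at 30 frames per second contains about 2.5 million pixels per second. That is why offloading the conversion to the video card is worthwhile.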

As it happens, PowerPC® cores are encountered frequently in embedded digital video applications such as set-top boxes. However, in these applications the PowerPC is always surrounded by external hardware that assists with the decoding process (all this magic is usually integrated into a single chip). The PowerPC core mainly coordinates system activities, drives the user interface, and perhaps runs provider-specific middleware. For example, your satellite TV box might contain a Java bytecode interpreter so it can run custom applets from your service provider. Because the PowerPC isn't handling the workload of actual video decompression, you won't find multi-GHz cores in such applications; a typical speed range would be in the 233 to 400MHz ballpark.

So, why would you choose to use a high-performance general-purpose PowerPC (as we're doing here) rather than application-specific decoder hardware plus a lower-end PowerPC core running at a more sedate pace? For this application, the situation is simple -- the hardware platform is already defined, and you want to use it for this new functionality. However, you might intentionally elect to go this route if you're only building a small number of units (perhaps only a prototype run), or you don't have a lot of startup capital.

The reason for this is licensing. The chipsets that are used in DVD players and set-top boxes are very cheap, but you can't buy them, or even see the full datasheets, until you have (a) decided to commit to a huge volume, and (b) acquired all the relevant licenses for the technologies implemented in these parts -- even if you don't intend to use those features. These license fees are tens of thousands of dollars (the DVD standard license is US$10,000 by itself), and that's just to get your foot in the door and acquire the relevant technical documents. You must pay additional per-unit royalty fees as well! By choosing a pure-software approach, you can sidestep this problem by buying software licenses in whatever quantity you need. For an educational project or in-house prototype, you might not even need to negotiate licenses at all.

A second point to consider is flexibility. Hardware decoders -- especially at the low end -- can be quite constrained with respect to the bitstreams they can handle. If you've ever experimented with homemade VideoCD or DVD discs in regular living room DVD players, you've most likely seen all kinds of weird symptoms such as discs that can't be resumed if you pause them, audio synchronization doing strange things, bizarre video corruption artifacts in certain scenes, and so forth. If you're implementing most of the decode operation in software, you can tweak both ends of the encode-decode equation as much as you like without affecting playability.

In fact, the compute-intensity situation isn't quite as bad as you might have been led to infer from the previous couple of paragraphs. In these decadent modern times, all general-purpose computer video chipsets are designed to assist with digital video decoding algorithms. The Radeon 9200 used in the Mac mini, for instance, has hardware support for iDCT and MC, as well as YUV-to-RGB colorspace conversion. Very roughly speaking, this hardware offloads 85% of the decode effort from the CPU. Actually using the acceleration features at a register level is not trivial (especially because the datasheets for these video chips are, again, largely secret), but fortunately the XFree86 folks have done the research and the hard implementation work for us. Note, by the way, that support for ATI products in Linux® is generally quite good; many other vendors' graphics hardware is significantly irksome to get working.

I mentioned at the start of this discourse that you really don't want to reinvent the wheel when it comes to digital video decoding, and hence you need to use off-the-shelf software to do the playback. The package I've chosen to use is mplayer (see Resources), an open source player that is incredibly flexible and very amenable to integration into other programs. If you're pursuing an actual commercial project, you might prefer to look at xine (see Resources). The reason for this is that Linspire has created a fully (DVD) licensed version of this player for their consumer Linux variant, and in the past they have expressed a willingness to sell batches of serial numbers (covering the content-scrambling license) for embedded applications. The binary they make available is, of course, for x86 -- but it seems likely you could negotiate the licensing independently of the specific build that's available from the Linspire Web site.

In case you haven't already gleaned it, licensing in general is a very itchy issue around multimedia formats, especially digital video. MPEG-2 by itself -- not including the various goodies related to DVD playback -- rests on a raft of more than 600 patents held by different parties. This is why central clearinghouses like MPEG LA (see Resources) are, unfortunately, absolutely necessary.

To use mplayer and one of the other external programs we'll add a little later, you'll have to install three additional libraries: libao, libmad, and libid3tag. These are, respectively, a cross-platform audio library, an MPEG audio-decoder library, and a library for extracting ID3 tag information out of MP3 files. For your convenience, I've gathered these libraries together into a single archive for you to download (see Downloads). To make and install each library, simply unpack it, and from within the directory thus created, run:


Listing 1. Building a library

./configure && make all && make install

You use the exact same steps to build mplayer. I used the MPlayer-1.0pre7try2 version, but if you find a newer version by the time you read this, you almost certainly won't have to change anything in these procedures to get it working properly.

At this point, you can also build and install the new version of the slide show program: download and unpack it, run make all followed by make install, and you're there. You can integrate MPEG video content (with .MPE, .MPG, or .MPEG extensions) into your slide shows using the same HTML-style syntax described in the previous article. Note, however, that most Web browsers will not properly preview an image link that points to a .MPG file, so you will probably just see a blank square for any MPEG items in your slide show. Addressing this problem would require adding significant intelligence to the inbuilt HTML parser, so that it could recognize the HTML tags for embedded movies, and that's not a high priority just at the moment.

Take a closer look at how I modified the slide show program. The first change I made was in filescan.h, where I simply added a new media type, MT_MPEG, and a new media_type field in the slide show item structure. I also modified the FL_Identify function in filescan.c so that it would identify the appropriate extensions for MPEG files. The other modifications were to main.c: first at the beginning, where we parse the script file, and later in the slide show playback code.
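As a rough illustration of the kind of extension check involved, here is a minimal sketch. Only the MT_MPEG name and the three extensions come from this article; MT_UNKNOWN, the media_type_t typedef, and the identify_by_extension function are invented for the example, and the real FL_Identify in the sample code may be organized quite differently.

```c
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* MT_MPEG is the name used in the article; MT_UNKNOWN and the typedef
 * are placeholders invented for this sketch. */
typedef enum {
    MT_UNKNOWN = 0,
    MT_MPEG
} media_type_t;

/* Hypothetical stand-in for the extension test in FL_Identify:
 * compares the text after the last '.' against the MPEG extensions,
 * ignoring case. Still-image extensions are omitted for brevity. */
media_type_t identify_by_extension(const char *filename)
{
    static const char *const mpeg_ext[] = { ".mpe", ".mpg", ".mpeg" };
    const char *dot = strrchr(filename, '.');
    size_t i;

    if (dot == NULL)
        return MT_UNKNOWN;   /* no extension at all */

    for (i = 0; i < sizeof mpeg_ext / sizeof mpeg_ext[0]; i++) {
        if (strcasecmp(dot, mpeg_ext[i]) == 0)
            return MT_MPEG;
    }
    return MT_UNKNOWN;
}
```

With this sketch, both "holiday.MPG" and "clip.mpeg" are classified as MT_MPEG, while a name with no recognized extension falls through to MT_UNKNOWN.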

But why go to the trouble of explicitly stating the media format with a separate field in the slide show entry? Surely you can just determine that by looking at the file extension? This is exactly how the FL_Identify function works, after all. However, it turns out that you gain considerable convenience by keeping track of the media type as a separate attribute. An almost trivial benefit is that you don't have to re-parse the filename every time you need to make a decision based on media formats, and you can use a simple switch statement to control execution flow for the different supported formats. (There aren't many such decision points in the code as it stands, but more will come as we add new features.) A more subtle, but also more important, issue is that you might not always be able to identify the media based on the last few characters of the filename, particularly for content that is being fetched directly off the Internet. The slide show program as it stands doesn't support this yet, but eventually you want to be able to embed URLs in the slide show script that refer to remote content. URLs served up by remote programs might have quite indecipherable filenames, and you need another mechanism to know what type of content you're actually trying to display.
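The following sketch shows the idea of storing the type once and dispatching on it later. Apart from MT_MPEG and the media_type field, every name here (slide_item_t, action_t, choose_action, and the other enumerators) is hypothetical and chosen only to make the example self-contained; it is not the layout used in the sample code.

```c
/* MT_MPEG and the media_type field come from the article; all other
 * names are invented for this sketch. */
typedef enum { MT_UNKNOWN = 0, MT_IMAGE, MT_MPEG } media_type_t;

typedef struct {
    const char  *location;    /* local path or, eventually, a URL */
    media_type_t media_type;  /* decided once, when the script is parsed */
} slide_item_t;

typedef enum { ACT_SKIP, ACT_SHOW_IMAGE, ACT_SPAWN_PLAYER } action_t;

/* Decide how to present an item without examining its name again. */
action_t choose_action(const slide_item_t *item)
{
    switch (item->media_type) {
    case MT_IMAGE:
        return ACT_SHOW_IMAGE;     /* draw with the still-image code */
    case MT_MPEG:
        return ACT_SPAWN_PLAYER;   /* hand the file to an external player */
    default:
        return ACT_SKIP;           /* unknown content: move on */
    }
}
```

Because the decision depends only on the stored field, an item whose location is an opaque URL can still be routed correctly, provided something other than the filename supplied its type when the script was parsed.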

The last change I made to main.c is the meaty part that actually spawns mplayer, when appropriate, to play your embedded movie files. It's worth dissecting the command-line switches I pass to mplayer, because they were selected with some care.

The first two switches are -noconsolecontrols, which prevents mplayer from looking at its input stream for command characters, and -really-quiet, which suppresses the generation of a lot of printf'd information that you won't be able to see when mplayer is spawned from your program.

The next switch is -vo xv, which tells mplayer to use the X Video extensions for playback. This gives you access to the video card's colorspace conversion (overlay) hardware, which saves a little CPU time. It also opens the door for the -vm switch mentioned below.

If you type mplayer -vo help to get a list of supported output methods, you'll see that this program supports a multiplicity of different outputs -- it can render video directly to the framebuffer (-vo fbdev -- observe that this mode can be coaxed into running on practically any device that runs Linux and supports bitmapped graphics), to vanilla X devices (-vo x11), using the X Video extensions (-vo xv), through OpenGL, and even to various animated file formats such as animated GIF. The output mode you select depends on what you want to achieve and exactly what capabilities your hardware offers. For instance, if your machine has some kind of OpenGL hardware acceleration, you might choose OpenGL output -- that way, you can use the acceleration hardware to handle scaling (and possibly colorspace conversion).

Next, we have -fs (full screen) and -vm (which allows mplayer to choose a different video mode; it automatically switches back to the previous mode when playback is done, although this behavior can be overridden if desired). This latter feature alone is worth X11's price of admission. The sweet spot for VHS-quality digital video is VideoCD format, which is MPEG-1 at 352x240 (for NTSC, anyway). Even the little old 233MHz iMac can decode this smoothly in software, with accompanying audio. (By the way, that's really impressive performance. An equivalent x86 system would be groaning under the same workload).

Rendering this low-resolution bitstream on your 1024x768 screen, however, would either result in a tiny video window, or waste a lot of CPU cycles scaling the video output up to the full-display resolution. The solution is to allow mplayer to change X's output resolution temporarily. No software scaling is necessary; the actual pixel clock sent out to the monitor is altered, and the monitor's electronics change the raster's speed to match the sync rates coming from the computer. Observe that without this functionality, you would have two competing design goals: still images generally require very high resolution if they are to appear attractive, whereas video content can be much lower-resolution (and in fact, needs to be -- since your platform would have difficulty decoding a full-resolution 1024x768 video file).

The final switch is -framedrop, which simply allows mplayer to skip rendering steps on some frames if the video output starts to lag audio. This can help keep up the appearance of smooth playback if your system is right on the performance borderline for a particular bitstream; an occasional dropped frame here or there won't be noticed as readily as out-of-sync audio or stuttering output.
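Putting the switches together, the spawning step might look something like the sketch below. The switches are the ones described above, but the function names build_mplayer_argv and play_movie are invented here, and the use of fork, execvp, and waitpid is an assumption; the actual code in main.c may launch mplayer in a different way.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fill argv[] with the mplayer command line discussed in the text.
 * Returns the number of arguments (not counting the NULL terminator),
 * or -1 if the array is too small. */
int build_mplayer_argv(const char *movie, const char *argv[], size_t capacity)
{
    static const char *const fixed[] = {
        "mplayer",
        "-noconsolecontrols",  /* ignore command characters on stdin */
        "-really-quiet",       /* suppress status output */
        "-vo", "xv",           /* render through the X Video extensions */
        "-fs",                 /* full screen */
        "-vm",                 /* allow a temporary video mode change */
        "-framedrop"           /* skip frames if video lags audio */
    };
    size_t n = sizeof fixed / sizeof fixed[0];
    size_t i;

    if (capacity < n + 2)      /* room for the movie path and NULL */
        return -1;
    for (i = 0; i < n; i++)
        argv[i] = fixed[i];
    argv[n] = movie;
    argv[n + 1] = NULL;
    return (int)(n + 1);
}

/* Run mplayer on one file and wait for it to finish.
 * Returns mplayer's exit status, or -1 on failure. */
int play_movie(const char *movie)
{
    const char *argv[16];
    pid_t pid;
    int status;

    if (build_mplayer_argv(movie, argv, 16) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;             /* could not create a child process */
    if (pid == 0) {
        execvp(argv[0], (char *const *)argv);
        _exit(127);            /* reached only if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Keeping the argument list in one small function makes it easy to experiment with a different output driver, for example, without touching the process-handling code.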

That's it for this episode. In the next article (which will take less time to release than this one did -- honest!), you'll see how to add Web-based remote control functionality, and as a special bonus I'll show you how to put the elinks embedded Web browser onto the player itself so the same user interface is presented at both local and remote locations. In the meantime, as an exercise, try adding AVI support by modifying filescan.c appropriately, and experiment with various different media files to see what your hardware can handle efficiently.






Downloads

Description            Name                                 Size     Download method
Example source code    pa-madmac4code.tar.gz                7KB      HTTP
Libraries for mplayer  pa-madmac4code-mplayer-libs.tar.gz   1.17MB   HTTP


Resources

Learn

Get products and technologies


About the author

Lewin A.R.W. Edwards works for a Fortune 50 company as a wireless security/fire safety device design engineer. Prior to that, he spent five years developing x86, ARM and PA-RISC-based networked multimedia appliances at Digi-Frame Inc. He has extensive experience in encryption and security software and is the author of two books on embedded systems development. He can be reached at sysadm@zws.com.



