Multimedia, by definition, means a variety of media types. You can store audio, video, and metadata in a myriad of file formats. However, this also means learning to use many tools to manipulate such diverse content.
This is where GStreamer comes to the rescue. By hiding all the different tools and libraries inside its plug-ins and using the general concept of a media pipeline, GStreamer is able to present the manipulation of different types of media in a uniform way. This allows you to concentrate on the media at hand, instead of wondering what pipe diameter your plumbing should have.
The benefits of such a unified approach are immediate. Instead of writing an MP3 player or an AVI/DivX player, you can write a music or video player. When you want to support another format, you don't need to learn and then code for a new library. Instead, you simply install a plug-in for that format. That's all -- you don't even need to recompile. All GStreamer applications pick up the new format on the fly.
GStreamer can answer many problems, such as "I need to store all audio samples coming from various sources in a common format." Because all formats are treated alike, you only need to write one tool. This saves time and makes the solution more robust and easier to maintain. Moreover, after you learn the GStreamer concepts, there's almost no limit to what you can apply it to. If you want to stream audio over a network, you only need to think about the network, because the API (application programming interface) you use for sound and everything else stays the same.
Because of its nature, GStreamer sits a bit above the level of a normal library. Thus, it's important to understand exactly what it is, what it isn't, as well as what it does.
GStreamer is a media processing library. That means it gives you an abstract model of some transformation -- composed of input, output, and different stages -- and allows you to construct concrete instances of such transformations to fit a particular end result and a particular media type. The following are examples of such processing:
- Transcoding an MP3 audio file to Ogg Vorbis
- Playing back an AVI movie file
- Capturing a live performance with an IEEE1394 digital video (DV) camera and saving it as an MPEG-2 stream
To achieve such diverse results, GStreamer operates on the abstract notion of a pipeline. A pipeline is a directed graph in which media flows in a defined direction from the input to the output. Pipelines consist of elements -- another core concept. An element is an object that you can put inside a pipeline, wrapping some
operation on the media inside. You can link elements together so they collectively yield a process that transforms the input into the desired output. By convention, pipelines are depicted with data flowing from the left (upstream) to the right (downstream). That is the same way they are written using
gst-launch, which is described later in this article.
It is important to note that everything so far is completely abstract. There has been no mention of video or audio, and there's a good reason. The model described above is not restricted to any specific media type. As long as you can express it in terms of input, output, and transformation, your pipeline can manipulate it. For instance, your desktop can be a media source, and you can record a screencast of your operation to a video file. In fact, that's what the Istanbul application is designed to do (see the Resources section).
The core of GStreamer itself has no elements. All it provides is knowledge of plumbing. Everything specific is provided by plug-ins. A plug-in is a piece of compiled code, usually distributed as an object file (.so on UNIX® and .dll on Microsoft® Windows®), that provides one or more elements. At startup, GStreamer queries all installed plug-ins to derive the set of elements available for applications. Plug-ins usually call other libraries for specific tasks (for instance, an MPEG-2 decoder probably uses an existing library for handling the MPEG format), but the application doesn't need to know that. All it sees are elements that all look and behave the same.
Some plug-ins are distributed in the core source packages and compiled into the library itself, even though they are conceptually separate entities. Other basic plug-ins are distributed in the gst-plugins-base package; those are present in most installations of GStreamer. Then there are the gst-plugins-good, -bad, and -ugly packages, which collect plug-ins according to the level of support they receive and their licensing terms. Finally, there are plug-ins distributed by third-party vendors or registered for private use by a specific application.
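A quick way to see which elements your installed plug-ins provide is the gst-inspect tool that ships with GStreamer (shown here for the 0.10 series; the exact output depends on which plug-in packages you have installed):

```shell
# List every plug-in and element GStreamer found at startup
gst-inspect-0.10

# Show the details of a single element, including its pads and caps
gst-inspect-0.10 vorbisdec
```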
Now that you understand the pipeline, you need to understand how it maps to GStreamer's implementation. You also get to learn some more terminology along the way.
As I mentioned, the basic unit of processing is an element, represented by the GstElement class. GStreamer is written in C, but it uses the GObject library, known from the GTK+ toolkit, to get object-oriented features (see the Resources section). An element has pads, which are the linking points for other elements. There are two types of pads:
- Sink pads provide input to an element.
- Source pads provide the data an element produces (its output).
Pads have capabilities, called caps. Capabilities dictate what kind of data can flow through a pad. For instance, if you inspect the vorbisdec element, which is a decoder for the free Vorbis codec, you see the output shown in Listing 1. A dollar sign ($) at the beginning of a line means it's a normal UNIX shell command.
Listing 1. A snippet from vorbisdec element information
```
$ gst-inspect-0.10 vorbisdec
[...]
Pad Templates:
  SRC template: 'src'
    Availability: Always
    Capabilities:
      audio/x-raw-float
                 rate: [ 8000, 50000 ]
             channels: [ 1, 6 ]
           endianness: 1234
                width: 32

  SINK template: 'sink'
    Availability: Always
    Capabilities:
      audio/x-vorbis
[...]
```
You can see there are two pad templates: one for source (src) and one for sink (sink). The source pad is always available (the other possible availability values are sometimes and request) and can output raw floating-point audio at rates between 8 kHz and 50 kHz, with one to six channels, in little-endian byte order, and with 32-bit-wide samples. The sink pad, on the other hand, simply accepts a Vorbis-encoded audio stream.
These templates are crucial for the pipeline to function properly. Whenever you attempt to link two elements together to form a pipeline, GStreamer checks whether their pads' templates are compatible. This process is called negotiation. During negotiation, elements try to come up with the best possible format that they both support. If there is none, linking fails. Otherwise, they agree on a common format. That format is no longer a template but something called fixed caps -- meaning all values are concrete and unambiguous. The data can then pass from one element to the other.
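You can see negotiation at work from the command line by placing a caps string between two elements, which restricts the formats they may agree on. This is a sketch only: audiotestsrc and audioconvert are standard GStreamer 0.10 elements, but the right audio sink depends on your system:

```shell
# Force the elements to negotiate 16-bit, stereo, 44.1 kHz raw audio.
# If an element in the chain cannot produce or accept these fixed caps,
# linking fails with a "could not link" error instead of playing.
gst-launch-0.10 audiotestsrc \
    ! audio/x-raw-int,rate=44100,channels=2,width=16 \
    ! audioconvert ! alsasink
```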
Now you know what you need to get started. For that, I'll introduce the Swiss Army knife of GStreamer: the gst-launch tool. gst-launch is one of the most versatile tools you'll come across. It is for GStreamer what the shell is for UNIX. Using it, you can construct even complex pipelines with a special syntax, appropriately called gst-launch syntax, as shown in Listing 2.
Listing 2. An example of a gst-launch line
```
$ gst-launch-0.10 filesrc location="concept.mp3" ! decodebin ! alsasink
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: audioclock0
```
Table 1. Element descriptions for the syntax shown in Listing 2

| Component | Description |
| --- | --- |
| gst-launch-0.10 | The name of the command. The 0.10 suffix refers to the GStreamer version series. |
| filesrc location="concept.mp3" | Creates an element of class filesrc, which reads a file from disk. The location property tells it which file to open. |
| ! | The exclamation mark means link to. Similar to the shell's pipe symbol (\|), it was chosen for its visual similarity and because it doesn't have to be escaped when written in the shell, as long as there are spaces around it. |
| decodebin | An autoplugger provided by GStreamer. An autoplugger is an element that, given a data type on its input, uses other available elements to find a sub-pipeline that provides the requested result -- here, decoding the MP3 data to raw audio samples. |
| alsasink | The correct element to use for audio output on my Linux® system. It talks to the soundcard and feeds it raw audio samples. It also times the whole pipeline, because the soundcard has a natural rate at which it can consume data. |
When I press Enter, gst-launch prints several status messages until the pipeline reaches the PLAYING state. Then the data starts flowing, and I hear the sound, timed by my soundcard (the alsasink element).
As you can see, GStreamer saves you a lot of work. You don't even need to know what type of media you're attempting to decode. Remember, though, that just as the shell can't replace all your C programs, the gst-launch tool can't replace a full GStreamer application. For instance, gst-launch doesn't let you control the pipeline in any way after it's launched, so you can't skip parts of the stream. Nonetheless, it's still incredibly useful -- particularly for quick jobs, such as re-encoding an audio file into another format or simply experimenting with pipelines.
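As an example of such a quick job, the first transformation listed earlier -- transcoding an MP3 file to Ogg Vorbis -- can be written as a single gst-launch line. This is a sketch: the element names (audioconvert, vorbisenc, oggmux) come from the standard GStreamer 0.10 plug-in packages, and the file names are placeholders:

```shell
# Decode song.mp3 to raw audio, re-encode it with the Vorbis encoder,
# wrap the result in an Ogg container, and write it to song.ogg
gst-launch-0.10 filesrc location=song.mp3 ! decodebin \
    ! audioconvert ! vorbisenc ! oggmux \
    ! filesink location=song.ogg
```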
This article provides just a teaser of what you can do with GStreamer. Obviously, creating an audio player with a simple shell command is cool. However, it's a rather poor player, with no user interface or controls. To add those items and much more, you do need to write some code. Even so, GStreamer's API is simple and well thought out. And if you don't fancy C, you can choose from several other bindings, including an actively maintained set of Python language bindings.
To learn more, read the gst-launch man page. The full syntax is a bit richer than shown here, and you can use it to create much more complex and interesting pipelines -- including ones you create from your own code. Yes, you can even have your own gst-launch (check out the gst_parse_launch() function documentation to see how).
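To illustrate, here is a minimal sketch of how an application might hand a gst-launch-style string to gst_parse_launch() using the 0.10 C API. The pipeline string and file name are examples only, and a real player would also watch the pipeline's bus for end-of-stream and error messages:

```c
/* A minimal sketch of a player built on gst_parse_launch().
 * Build against GStreamer 0.10, for example:
 *   gcc player.c $(pkg-config --cflags --libs gstreamer-0.10)
 */
#include <gst/gst.h>

int
main (int argc, char *argv[])
{
  GstElement *pipeline;
  GError *error = NULL;
  GMainLoop *loop;

  gst_init (&argc, &argv);

  /* The same pipeline as Listing 2, built from a string */
  pipeline = gst_parse_launch (
      "filesrc location=concept.mp3 ! decodebin ! alsasink", &error);
  if (pipeline == NULL) {
    g_printerr ("Parse error: %s\n", error->message);
    return 1;
  }

  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  /* Run a main loop so the pipeline keeps playing */
  loop = g_main_loop_new (NULL, FALSE);
  g_main_loop_run (loop);

  gst_object_unref (pipeline);
  return 0;
}
```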
Also, join the mailing list and drop by the project's IRC channel. GStreamer developers are a lively bunch, and there is always someone to help -- or be helped by -- you.
- GStreamer Application Development Manual: Read this manual to learn more about GStreamer concepts and how to write applications with GStreamer.
- GStreamer Plugin Writer's Guide: This guide is handy when you want to create your own plug-in.
- GStreamer 0.10 Core Reference Manual: This is a great reference manual.
- GObject Reference Manual: Read this manual to learn more about the object-oriented library used by GStreamer.
Get products and technologies
- GStreamer homepage: Visit this site for the latest information and downloads.
- Istanbul: Istanbul is a desktop session recorder that uses GStreamer.
- gstreamer-devel: Don't hesitate to post a message on gstreamer-devel, which is a mailing list for development of and with GStreamer.
Maciej Katafiasz is a graduate student in computer science and has been using open source technologies since high school. Annoyed by the lack of simple, working ways to watch movies on his GNOME desktop, he took an interest in the (then still young) GStreamer project and stayed around to help a bit. You can contact him at email@example.com.