Part 1 mostly discussed infrastructure: support code needed in order to get something up on a PS3's screen and an explanation of various platform-specific oddities you'll encounter with this particular combination of hardware. In this article, you will see how to build on that infrastructure to make the system into a fully operational, if somewhat primitive, spectrum analyzer. To download the sample code referenced in this article, go to Part 1.
The basic function of a spectrum analyzer is to decompose an input signal in the frequency domain and display a representation of the energy levels of different frequencies of interest. There are several approaches to building such a device depending on your needs and the acceptable project cost. The simplest type can be seen in stereo systems that have an LED or VFD bar chart display showing the output signal strength in a few discrete frequency bands (typically three to seven). The usual way of implementing such a display is to feed the input signal into a comb filter. Each tooth of the comb is fed to a circuit that is, in essence, a low-pass filter. The output of this second filter is a slow-moving average of the input signal level for one frequency band. This average is fed into a stack of comparators with progressively higher reference voltages, the outputs of which drive the display segments in one column of the bar chart.
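The averaging and comparator stages of one bar-chart column can be sketched in software. This is only an illustration of the principle described above, not code from the sample application: the names bar_column, bar_column_update(), and bar_column_segments() are hypothetical, the band-splitting filter is assumed to exist elsewhere, and a one-pole low-pass filter stands in for the analog averaging circuit.

```c
#include <assert.h>
#include <stddef.h>

/* One display column: smooths the rectified output of one band filter.
 * (Hypothetical helper; the band-splitting stage itself is not shown.) */
typedef struct {
    double level;   /* slow-moving average of the band's signal level */
    double alpha;   /* smoothing factor, 0 < alpha <= 1; smaller = slower */
} bar_column;

/* Rectify one band-filtered sample and fold it into the running average
 * (a one-pole low-pass filter). Returns the updated level. */
double bar_column_update(bar_column *col, double band_sample)
{
    assert(col != NULL && col->alpha > 0.0 && col->alpha <= 1.0);
    double rectified = band_sample < 0.0 ? -band_sample : band_sample;
    col->level += col->alpha * (rectified - col->level);
    return col->level;
}

/* Comparator stack: count how many reference levels the average exceeds.
 * thresholds[] must be sorted in ascending order. */
size_t bar_column_segments(const bar_column *col,
                           const double *thresholds, size_t count)
{
    assert(col != NULL && thresholds != NULL);
    size_t lit = 0;
    while (lit < count && col->level > thresholds[lit])
        lit++;
    return lit;
}
```

Calling bar_column_update() once per sample and bar_column_segments() once per display refresh gives the number of segments to light in that column.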
A traditional analog spectrum analyzer is a considerably more complex beast, but the basic design principle is easy to understand. The front end is essentially a superheterodyne receiver with a wide tuning range. The center frequency of this receiver is voltage-controlled (typically by means of varactor diodes in the receiver's local oscillator). The control input is driven with an internally generated sawtooth waveform from an internal timebase. The same sawtooth drives the horizontal deflection of an oscilloscope trace; the output of the receiver drives the vertical deflection. What you actually see on the scope screen is therefore a graph of frequency (x) against signal energy (y). An example of such a display is shown in Figure 1.
Figure 1. Real-life example graphing frequency (x) against signal energy (y)
The display shows a 20 MHz slice of the broadcast radio spectrum from 90.3 MHz to 110.3 MHz as received by a rather badly mismatched antenna. WHTZ, 100.3 MHz, is the peak at the center of the trace. You can see various other FM radio stations at various signal strengths to either side of it.
The non plus ultra of spectrum analyzers is the (mostly) all-digital design. At the high end, this consists of an extremely high-speed, high-resolution analog-to-digital converter that acquires the input signal in the time domain. A fast digital signal processor then converts this to frequency domain data and displays the result on the screen, optionally performing various filtering or other processing. A high-end digital spectrum analyzer can also perform other intelligent tasks to help you look at a signal of interest. For example, the analyzer might know about frequency-hopping spread spectrum signaling systems and allow you to set up the hop list and protocol timing in the analyzer itself to track an ongoing communication session.
The system you are building is of the all-digital type. Unfortunately, the limiting performance factor for this is right at the front end: the PS3 hardware does not have a convenient method of acquiring high-speed signals. Therefore, this article helps you build something of a proof-of-concept device that is limited to the audible signal range of approximately 20 Hz to 20 kHz, using the Griffin iMic as the data acquisition device.
It's time to get started. The subtasks in this project include:
- Filter the input signal.
- Acquire (digitize) the input waveform.
- Convert the time domain data to the frequency domain.
- Display the data attractively.
The first thing to consider is how to funnel some data from the outside world into the PS3. You can use the Griffin iMic for this purpose, and there is really not much to say about the installation process. Recent Linux kernels include a compatible driver, so setting up the device on YDL is very much plug-and-play. You can verify that the device was detected successfully by running dmesg | tail -10 and checking for the appropriate USB messages, and then use the ALSA mixer to determine if you can tinker with the volume settings for the device.
For the programming side, there are a few different APIs you can use to access audio devices in Linux. The sample code presented with this article uses the Open Sound System API (OSS) mainly because it is uncomplicated to use and because its old age means it is well supported on various hardware and operating system flavors. The iMic is also supported by the Advanced Linux Sound Architecture (ALSA) API. It would not be unreasonably difficult to modify this sample code to work with ALSA.
The data stream will be sampled at 16 bits, 44.1 kHz. While this sampling frequency is not a particularly nice round number from a calculation point of view, it's a safe choice for hardware that was designed for audio recording. Because off-the-shelf hardware is often not tested rigorously against all possible API call parameters, it's prudent to choose popular sample formats and data rates when working with consumer hardware. This caveat basically restricts you to 11.025, 22.050, or the CD-quality sample rate of 44.100 kHz (though 48 kHz is also usually supported by modern audio hardware). I strongly advise you to stick to the upper end of this range. The reason for this is buried in the fact (which some of you might have noticed) that I glossed over step one in my project list. It's time to rectify that omission.
Normally, a digital data acquisition system starts with level-matching and isolation components followed by a sinc filter, that is, an anti-aliasing low-pass filter, which theoretically rolls off to somewhere below the ADC's voltage resolution across the frequency span between the highest-frequency signal of interest and half the sample rate.
For the sample 16-bit ADC, that would theoretically mean 96 dB of attenuation (6 dB per bit) between the audible range of about 20 kHz and half the CD-quality sampling rate, which is 22.050 kHz. This requirement is an unfeasibly tall order for an analog circuit. It would involve an incomprehensibly high-order active filter network or a big compromise on performance parameters, such as passband ripple (or, more likely, both). Observably, the iMic does not contain such analog hardware. While it is possible that the device oversamples the input signal, filters it digitally, and downsamples it, this is very unlikely. Griffin does not publish specifications online, so you can make an educated guess that it is much more likely that the iMic compromises a bit by starting its rolloff earlier, quite probably not reaching the full 96 dB by the 22.050 kHz mark.
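To put a rough number on "unfeasibly tall", here is a back-of-the-envelope estimate. It assumes a simple filter whose stopband slope approaches about 6.02 dB per octave per filter order (as for a Butterworth response); that assumption is mine and is not a statement about the iMic's actual design.

```latex
\begin{align*}
\text{dynamic range} &= 20\log_{10}\!\left(2^{16}\right) \approx 96.3\ \text{dB} \\
\text{transition band} &= \log_2\!\left(\frac{22.05\ \text{kHz}}{20\ \text{kHz}}\right) \approx 0.1408\ \text{octave} \\
\text{required slope} &\approx \frac{96.3\ \text{dB}}{0.1408\ \text{octave}} \approx 684\ \text{dB/octave} \\
N &\approx \frac{684}{6.02} \approx 114
\end{align*}
```

Under that assumption, the filter would need to be on the order of 114th order. Steeper filter families need fewer stages, but only at the cost of compromises such as passband ripple.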
The net result of this discussion is that if you use a reduced sample rate, the iMic's front-end filter is not going to know about this, so it will continue passing through signals at frequencies that cannot be captured by your lower sample rate. This will cause aliasing artifacts to appear in your final output. As a result, the best strategy is to capture at a known-good sample rate. If you find that this generates too much data for the FFT engine to handle in a timely manner, then your next best plan is still to capture at the higher rate, but to run a digital low-pass filter over the raw data then downsample it before passing it on to the next stage.
When I planned this article, I was geared up to port an existing FFT algorithm to the Cell/B.E. platform and thereby impress you with my elite porting skills, but it seems that IBM already beat me to the punch. The latest alpha version of the FFTW library (see Resources) already includes explicit support for the Cell/B.E. processor. Basically the only thing you need to do is build and install the library, and then add -lfftw3 -lm to the linker flags in your Makefile. The tutorials included in the FFTW documentation are adequate to get you started. Note the caveat in the documentation regarding SPE usage. By default, the Cell/B.E. version chews up all available SPEs. Use the call described in the documentation (it is there, but not right alongside the rest of the API description) to scale back FFTW's SPE usage.
Initializing the FFTW library is simply a matter of completing the following two steps:
- Allocate memory for the input and output buffers. It's best to use the fftw_malloc() function for this instead of the regular malloc() because the FFTW-specific function optimizes data alignment. This is particularly important on platforms like the Cell/B.E. processor.
- Develop and select a plan. There is actually a lot of arcane complexity in this step (well explained in the documentation). At its simplest, you call fftw_plan_dft_1d() and tell it the array size, pointers to the input and output arrays, whether you want to go forward or backward, and a set of flags that can be used to squeeze out optimal performance. For the example application, FFTW_ESTIMATE is perfectly acceptable as the flags parameter.
The iMic delivers a stream of time-domain samples in the range 0 to 65535, 22.68 microseconds apart (one sample period at 44.1 kHz). You massage these (note that you are using only one channel of the stereo data stream) and place them in the FFT's input buffer. The example also plots a reduced-size, 128-pixel-high version of the input signal onscreen so you can see what it looks like in an oscilloscope-style format. That's not just a bit of eye candy; it helps you check that your input signal is properly connected and at an appropriate level. You also see a little snippet of code that you can uncomment if you don't have an iMic or a reference frequency source: it stuffs a sine wave directly into the sample buffer.
Now it's time to call fftw_execute(). This passes your sample data on to the SPEs, which crunch them into frequency spectrum data. The 512 real sample points are turned into 512 complex spectrum points in two groups that mirror each other around the center of the output array. As the references are fond of saying, the kth point in the output array represents the energy at a frequency of k/n * Fs, where n is the number of samples and Fs is the sample frequency.
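The k/n * Fs relationship and the mirror symmetry are easy to express as small helpers. The function names here are hypothetical and are not taken from the sample code or from FFTW.

```c
#include <assert.h>
#include <stddef.h>

/* Center frequency of output bin k for an n-point transform: k/n * Fs. */
double bin_frequency_hz(size_t k, size_t n, double sample_rate_hz)
{
    assert(n > 0 && k < n);
    return (double)k / (double)n * sample_rate_hz;
}

/* Bin whose center frequency is closest to freq_hz (valid for 0..Fs/2). */
size_t nearest_bin(double freq_hz, size_t n, double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);
    assert(freq_hz >= 0.0 && freq_hz <= sample_rate_hz / 2.0);
    return (size_t)(freq_hz * (double)n / sample_rate_hz + 0.5);
}

/* For real input, bin n-k is the complex conjugate of bin k, so it
 * carries the same magnitude. Bin 0 (DC) mirrors onto itself. */
size_t mirror_bin(size_t k, size_t n)
{
    assert(n > 0 && k < n);
    return (n - k) % n;
}
```

For the 512-point transform at 44.1 kHz, each bin is about 86.13 Hz wide, bin 256 sits at 22.050 kHz, and a 1 kHz test tone lands nearest bin 12, with its mirror image in bin 500.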
The first entry in the table in memory (0 Hz) is a special case. It represents the DC level of the input signal, and hence is usually off the scale. The spectrum rendering code deliberately clamps the Y-coordinate to allow for this condition. Note that this probably is not representative of the actual DC level at the iMic's input pin. Most likely, the input is capacitively coupled, so there isn't any real DC at the ADC. Rather, this bogus spike represents the fact that your input signal does not vary equally positively and negatively about the 0 V line, but rather varies between 0 and +65535.
Also note that, technically, you should use a logarithmic scale for the spectrum. The reason I don't do this in the sample code is that the resolution is rather low, and you can get a better idea of the signal shape from the linear plot.
Some technical details are in order here, particularly if you are now looking at the source code in puzzlement. Arbitrarily, I chose a 512-point FFT. FFTW does support arbitrary transform sizes, but you can realize much better performance with a size that is a power of 2. Speaking of performance, the generic complex one-dimensional transform I selected is not the optimal choice for your sort of input data. The fastest would be fftw_plan_dft_r2c_1d() (one-dimensional, real input). The reason I went with the generic case is that it is applicable to other sorts of data, and the additional computation load is really child's play to the PS3.
By the way, you shouldn't think of the FFT algorithm as being just a monolithic number-crunching magical black box. Plenty of research has gone into methods of computing FFTs and, among other things, how to factor a given FFT operation across multiple digital signal processors (DSPs). If, for some reason, you find the FFT itself is a bottleneck, there is a great deal of existing code (including, in this case, fftw) that can accelerate your application by splitting it up, if you throw more cores at the problem.
Now that you have sorted the input data into buckets, all that remains to be done is to display it. To do this, calculate the magnitude of each complex output point using simple Pythagoras (the square root of the sum of the squared real and imaginary parts) and plot that number. If you wanted an actual power reading in dB, you should instead plot 20 times the base-10 log of that magnitude. This is not terribly useful information without some kind of reference marker though.
Observe that if you were to plot all 512 output points, you would see a lot of irrelevant information. Everything to the right of the 22.050 kHz mark is aliased and might not actually exist in the input signal. Hence, the code accompanying this article only plots the first 256 points, and it doubles the horizontal size so the display fits neatly under the oscilloscope display.
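The mapping from output bins to screen coordinates might be sketched like this. The constants, the spectrum_bar type, and the full_scale parameter are hypothetical and only mirror the layout described in the text (256 plotted bins, doubled horizontally, with the height clamped so that the DC spike stays on screen).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout constants matching the description in the text. */
enum { FFT_POINTS = 512, PLOT_BINS = FFT_POINTS / 2, X_SCALE = 2 };

/* Where to draw one spectrum bar, relative to the plot's left edge. */
typedef struct { int x, width, bar_height; } spectrum_bar;

/* Place bin k (0..PLOT_BINS-1) on a plot `plot_height` pixels tall.
 * A magnitude equal to full_scale fills the plot; anything larger,
 * such as the DC spike in bin 0, is clamped to the plot height. */
spectrum_bar spectrum_bar_for_bin(size_t k, double magnitude,
                                  double full_scale, int plot_height)
{
    spectrum_bar bar;
    double h;

    assert(k < PLOT_BINS && full_scale > 0.0 && plot_height > 0);
    h = magnitude / full_scale * plot_height;
    if (h < 0.0)
        h = 0.0;
    if (h > plot_height)
        h = plot_height;
    bar.x = (int)k * X_SCALE;
    bar.width = X_SCALE;
    bar.bar_height = (int)h;
    return bar;
}
```

With these constants the last plotted bin (255) starts at x = 510, so the 256 bars span 512 pixels.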
At this point, experiment a bit with performance. In particular, try building the FFTW libraries without Cell/B.E. support so they use only the PPE. The improvement would be more noticeable if you were doing a larger transform because the SPEs are much better at this sort of thing than the PPE. The small size of your data set means the transaction overhead is a significant fraction of the execution time.
So, you have now turned a PS3 into a useful spectrum analyzer. The next article in the series examines the other side of that equation and uses the same hardware as a function generator: the basis of an audio synthesizer, among many other things.
Resources
- Check out all the articles in the series. Part 1 uncovers the design intent of the
project and inspects the details of the user interface implementation.
- Refer to Advanced Linux Sound Architecture (ALSA)
(which kinda sounds like a band from the 1980s) for audio and MIDI
functionality to Linux with fully modularized sound drivers, SMP and thread-safe
design, support for the older OSS API, binary compatibility for most OSS programs,
and a user space library to simplify application programming and provide a higher
- Read the Open Sound System (OSS) 4.0 Programmer's Guide
for a wealth of well-chosen tiny demo applets that demonstrate recording,
playback, and various mixer tweakage. Study this code for the fastest way
to get up to speed on the required steps.
- Go to the home of the
Poor Man's Spectrum Analyzer for
just one of many sites for building garage-project lab equipment.
- Find the latest
3.2alpha2 prerelease version of the fftw Fast
Fourier Transform library,
including IBM-supplied Cell/B.E. optimizations.
- See "25 tips to optimal application performance"
(developerWorks, June 2006) for how you can achieve near theoretical-maximum
performance for real applications on the Cell/B.E. processor by learning about the
processor's architectural characteristics.
- Review Jonathan Bartlett's essential
preliminary article "Programming high-performance applications on the Cell/B.E. processor"
(developerWorks, January 2007) about installing Linux on the PS3.
- Check out the document Sony released describing inter alia
how the ps3fb device interacts with the GPU and your Linux programs.
Note that this is a mirror document; there doesn't appear to be an official copy of this
document on Sony's sites.
- Refer to the Cell
Broadband Engine documentation section of the IBM Semiconductor Solutions Technical Library for a wealth of downloadable manuals,
specifications, and more.
Get products and technologies
- Jump over to Part 1
if you need to find the sample code referenced in this article.
- Look for the
Griffin iMic: my
audio input device of choice. Note that the Web site shows a (newer) white model
of the product. The model I have tested with PPC Linux is the older,
translucent-and-silver version with a switch between the input and output jacks.
- Download Yellow Dog
Linux through a free download from Terra Soft Solutions.
My experience is that all the mirrors are quite slow. I got the install ISO much
faster by searching a P2P network for the filename yellowdog-5.0-phoenix-20061208-PS3.iso.
- Find all Cell/B.E.-related articles, discussion forums, downloads,
and more at the IBM developerWorks Cell
Broadband Engine resource center: your definitive resource for all
Lewin A.R.W. Edwards works for a Fortune 50 company as a wireless security/fire safety device design engineer. Prior to that, he spent five years developing x86, ARM and PA-RISC-based networked multimedia appliances at Digi-Frame Inc. He has extensive experience in encryption and security software and is the author of two books on embedded systems development.