Level: Intermediate
Lewin Edwards (sysadm@zws.com), Design Engineer, Freelance
02 Oct 2007
How do you take the Cell Broadband Engine (Cell/B.E.) processor from an
off-the-shelf Sony PLAYSTATION 3 (PS3) and use it to construct a piece of
Linux®-based laboratory equipment (in essence, take the Cell/B.E. from fab to hab
to lab)? In this series, Lewin Edwards shows you how to go from game console to
simple audio-bandwidth spectrum analyzer and function generator. In this article,
the author shows you how to build on the infrastructure from Part 1 to make the
system into a fully operational, if primitive, spectrum analyzer.
Introduction
Part 1 mostly discussed infrastructure:
support code needed in order to get something up on a PS3's screen and an
explanation of various platform-specific oddities you'll encounter with this
particular combination of hardware. In this article, you will see how to
build on that infrastructure to make the system into a fully operational, if
somewhat primitive, spectrum analyzer. To download the sample code referenced in this
article, go to Part 1.
The target technology
The basic function of a spectrum analyzer is to decompose an input signal in the
frequency domain and display a representation of the energy levels of different
frequencies of interest. There are several approaches to building such a device
depending on your needs and the acceptable project cost. The simplest type can be
seen in stereo systems that have an LED or VFD bar chart display showing the output
signal strength in a few discrete frequency bands (typically three to seven). The
usual way of implementing such a display is to feed the input signal into a comb
filter. Each tooth of the comb is fed to a circuit that is, in essence, a
low-pass filter. The output of this second filter is a slow-moving average of the
input signal level for one frequency band. This average is fed into a stack of
comparators with progressively higher reference voltages, the outputs of which
drive the display segments in one column of the bar chart.
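The averaging and comparator stages described above can be sketched in software. This is a minimal illustration, not taken from the article's sample code: the `band_meter` type, the `alpha` smoothing factor, and the threshold values are all hypothetical names chosen for the example, and the band-pass stage that would feed it is assumed to exist elsewhere.

```c
#include <math.h>
#include <stddef.h>

/* One column of the bar chart: a slow-moving average of the rectified
 * band signal (the "second filter"), followed by a comparator ladder. */
typedef struct {
    double average;   /* current smoothed level */
    double alpha;     /* smoothing factor, 0 < alpha <= 1 (hypothetical tuning knob) */
} band_meter;

/* Feed one band-filtered sample; returns the updated average level. */
double band_meter_update(band_meter *m, double band_sample)
{
    double rectified = fabs(band_sample);
    m->average += m->alpha * (rectified - m->average);
    return m->average;
}

/* Count how many comparator thresholds the level exceeds; that count
 * is the number of lit segments in the column. */
int lit_segments(double level, const double *thresholds, int nthresholds)
{
    int lit = 0;
    for (int i = 0; i < nthresholds; i++) {
        if (level > thresholds[i])
            lit++;
    }
    return lit;
}
```

A smaller `alpha` gives a slower, steadier bar. Spacing the thresholds progressively further apart, as in a real comparator stack, gives a roughly logarithmic display.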
A traditional analog spectrum analyzer is a considerably more complex beast, but
the basic design principle is easy to understand. The front end is essentially a
superheterodyne receiver with a wide tuning range. The center frequency of this
receiver is voltage-controlled (typically by means of varactor diodes in the
receiver's local oscillator). The control input is driven with an internally
generated sawtooth waveform from an internal timebase. The same sawtooth drives
the horizontal deflection of an oscilloscope trace; the output of the receiver
drives the vertical deflection. What you actually see on the scope screen is
therefore a graph of frequency (x) against signal energy (y). An example of such a
display is shown in Figure 1.
Figure 1. Real-life example graphing frequency (x) against signal energy (y)
The display shows a 20 MHz slice of the broadcast radio spectrum from
90.3 MHz to 110.3 MHz as received by a rather badly mismatched antenna. WHTZ, 100.3 MHz, is
the peak at the center of the trace. You can see various other FM radio stations
at various signal strengths to either side of it.
The non plus ultra of spectrum analyzers is the (mostly) all-digital
design. At the high end, this consists of an extremely high-speed, high-resolution
analog-to-digital converter that acquires the input signal in the time domain. A
fast digital signal processor then converts this to frequency domain data and
displays the result on the screen, optionally performing various filtering or other
processing. A high-end digital spectrum analyzer can also perform other
intelligent tasks to help you look at a signal of interest. For example, the
analyzer might know about frequency-hopping spread spectrum signaling systems
and allow you to set up the hop list and protocol timing in the analyzer itself to
track an ongoing communication session.
Complaints department
To prevent advanced readers from complaining: It would theoretically be
possible to bring in signals of any arbitrary frequency (even up in the multiple
gigahertz range) by using an external mixer to heterodyne the source down to the
range of the iMic. In fact, some vendors of spectrum analyzers sell expansion
boxes that do precisely this. However, because the iMic's bandwidth is severely
limited, it would be irksome to scan across such high frequencies.
The system you are building is of the all-digital type. Unfortunately, the
limiting performance factor for this is right at the front end: the PS3 hardware
does not have a convenient method of acquiring high-speed signals. Therefore,
this article helps you build a proof-of-concept device that is limited to the audible
signal range of approximately 20 Hz to 20 kHz, using the Griffin iMic as the data
acquisition device.
The four-step project list
It's time to get started. The subtasks in this project include:
1. Filter the input signal.
2. Acquire (digitize) the input waveform.
3. Convert the time domain data to the frequency domain.
4. Display the data attractively.
Step 2 (yes, 2). Digitizing the input waveform
The first thing to consider is how to funnel some data from the outside
world into the PS3. You can use the Griffin iMic for this purpose, and there is
really not much to say about the installation process. Recent Linux kernels
include a compatible driver, so setting up the device on Yellow Dog Linux (YDL) is very much
plug-and-play. You can verify that the device was detected successfully by running
dmesg | tail -10 and checking for the appropriate
USB messages, and then use the ALSA mixer to check that you can adjust the
volume settings for the device.
For the programming side, there are a few different APIs you can use
to access audio devices in Linux. The sample code presented with this
article uses the Open Sound System API (OSS) mainly because it is
uncomplicated to use and because its old age means it is well supported
on various hardware and operating system flavors. The iMic is also supported by the Advanced
Linux Sound Architecture (ALSA) API. It would not be unreasonably difficult to
modify this sample code to work with ALSA.
The data stream will be sampled at 16 bits, 44.1 kHz. While this sampling
frequency is not a particularly nice round number from a calculation point of
view, it's a safe choice for hardware that was designed for audio recording.
Because off-the-shelf hardware is often not tested rigorously against all possible
API call parameters, it's prudent to choose popular sample formats and data
rates when working with consumer hardware. This caveat basically restricts you to
11.025, 22.050, or the CD-quality sample rate of 44.100 kHz (though 48 kHz is also
usually supported by modern audio hardware). I strongly advise you to stick to the
upper end of this range. The reason for this is buried in the fact (which some
of you might have noticed) that I glossed over step one in my project list. It's time
to rectify that omission.
OK, now Step 1. Filtering the input signal
Normally, a digital data acquisition system starts with level-matching and
isolation components followed by an anti-aliasing low-pass filter (ideally
approaching a sinc, or brick-wall, response). Between the highest-frequency signal
of interest and half the sample rate, this filter should roll off, in theory, to
somewhere below the ADC's voltage resolution.
For the sample 16-bit ADC, that would theoretically mean 96 dB of attenuation (6 dB per
bit) between the audible range of about 20 kHz and half the CD-quality sampling
rate, which is 22.050 kHz. This requirement is an unfeasibly tall order for an
analog circuit. It would involve an incomprehensibly high-order active filter
network or a big compromise on performance parameters, such as passband ripple (or,
more likely, both). Observably, the iMic does not contain such analog hardware.
While it is possible that the device oversamples the input signal, filters
it digitally, and downsamples it, this is very unlikely. Griffin does not
publish specifications online, so an educated guess is the best you can do: it is much
more likely that the iMic compromises a bit by starting its rolloff earlier, and quite
probably does not reach the full 96 dB by the 22.050 kHz mark.
The net result of this discussion is that if you use a reduced sample rate, the
iMic's front-end filter is not going to know about this, so it will continue
passing through signals at frequencies that cannot be captured by your lower sample
rate. This will cause aliasing artifacts to appear in your final output. As a
result, the best strategy is to capture at a known-good sample rate. If you
find that this generates too much data for the FFT engine to handle in a timely
manner, then your next best plan is still to capture at the higher rate, but to
run a digital low-pass filter over the raw data then downsample it before
passing it on to the next stage.
Step 3. Converting data
When I planned this article, I was geared up to port an existing FFT
algorithm to the Cell/B.E. platform and thereby impress you with my elite porting
skills, but it seems that IBM already beat me to the punch. The latest alpha
version of the FFTW library (see Resources) already
includes explicit support for the Cell/B.E. processor. Basically the only thing
you need to do is build and install the library, and then add
-lfftw3 and -lm to the
linker flags in your Makefile. The tutorials included in the FFTW documentation
are adequate to get you started. Note the caveat in the documentation
regarding SPE usage. By default, the Cell/B.E. version chews up all available
SPEs. Use the fftw_cell_set_nspe(n)
call (it is in the documentation, but not right alongside the rest of the API
description) to scale back FFTW's usage to n SPEs.
Initializing the FFTW library is simply a matter of completing the following
two steps:
- Allocate memory for the input and output buffer. It's best to use the
fftw_malloc() function for this instead of the
regular malloc() because the fftw-specific function
optimizes data alignment. This is particularly important on platforms like the
Cell/B.E. processor.
- Develop and select a plan. There is actually a lot of arcane complexity
in this step (well explained in the documentation). At its simplest, you
call
fftw_plan_dft_1d() and tell it the array
size, pointers to the input and output arrays, whether you want to go forward
or backward, and several flags that can be used to squeeze out optimal
performance. For the example application,
FFTW_ESTIMATE is perfectly acceptable as the flags
parameter.
The iMic delivers a stream of time-domain samples in the range 0 to 65535, 22.68
microseconds apart. You massage these (note that you are using only one channel of the
stereo data stream) and place them in the FFT's input buffer. The example also
plots a reduced-size, 128-pixel-high version of the input signal onscreen so
you can see what it looks like in an oscilloscope-style format. That's not just a
bit of eye candy; it also helps you check that your input signal is properly
connected and at an appropriate level. You also see a little snippet of code
that you can uncomment if you don't have an iMic or a reference frequency source:
it stuffs a sine wave directly into the sample buffer.
Now it's time to call fftw_execute(). This passes your
sample data on to the SPEs, which crunch them into frequency spectrum data. The 512
real sample points are turned into two groups of complex spectrum data showing
symmetry around the center. As the references are fond of saying, the k-th
point in the output array represents the energy at a frequency of (k/n) * Fs,
where n is the number of samples and Fs is the sample frequency. With n = 512
and Fs = 44.1 kHz, adjacent points are therefore about 86 Hz apart.
The first entry in the table in memory (0 Hz) is a special case. It
represents the DC level of the input signal, and hence is usually off the scale.
The spectrum rendering code deliberately clamps the Y-coordinate to allow for this
condition. Note that this probably is not representative of the actual DC level at
the iMic's input pin. Most likely, the input is capacitively coupled, so there
isn't any real DC at the ADC. Rather, this bogus spike represents the fact that
your input signal does not vary equally positively and negatively about the 0 V
line, but rather varies between 0 and +65535.
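The origin of that spike is easy to demonstrate: bin 0 of a discrete Fourier transform is just the sum of the input samples. The sketch below shows that sum, plus one possible alternative to clamping, which is to subtract mid-scale before the transform. The sample code does not take this approach, and the function names are hypothetical.

```c
#include <stddef.h>

/* Bin 0 of a DFT is simply the sum of the input samples. */
double dc_bin(const double *in, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += in[i];
    return sum;
}

/* Re-center unsigned 16-bit sample values around zero
 * by subtracting mid-scale. */
void center_samples(double *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] -= 32768.0;
}
```

With 512 samples that hover around 32768, the DC bin comes out at roughly 16.7 million, far larger than any bin a modest audio signal would produce.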
Also note that, technically, you should use a logarithmic scale for
the spectrum. The reason I don't do this in the sample code is that the resolution
is rather low, and you can get a better idea of the signal shape from the linear
plot.
Some technical details are in order here, particularly if you are now looking at
the source code in puzzlement. Arbitrarily, I chose a 512-point FFT.
FFTW does support arbitrary transform sizes, but you can realize much better
performance with a size that is a power of 2. Speaking of performance, the
generic complex one-dimensional transform I selected is not the optimal choice for
your sort of input data. The fastest would be
fftw_plan_dft_r2c_1d() (one-dimensional, real). The
reason I went with the generic case is that it is applicable to other sorts of data,
and the additional computation load is really child's play to the PS3.
By the way, you shouldn't think of the FFT algorithm as being just a monolithic
number-crunching magical black box. Plenty of research has gone into methods of
computing FFTs and, among other things, how to factor a given FFT operation across
multiple digital signal processors (DSPs). If, for some reason, you find the FFT
itself is a bottleneck, there is a great deal of existing code (including, in
this case, fftw) that can accelerate your application by splitting it up, if you
throw more cores at the problem.
Step 4. Making a pretty display
Now that you have sorted the input data into buckets, all that remains to be done is to
display it. To do this, calculate the magnitude of each complex output point
using simple Pythagoras (the square root of the sum of the squares of the real
and imaginary parts) and plot that number. If you wanted an actual power
reading in dB, you should instead plot 20 times the base-10 log of that
magnitude. This is not terribly useful information without some kind of reference
marker though.
Observe that if you were to plot all 512 output points, you would see a lot
of irrelevant information. Because the input samples are real, everything to the
right of the 22.050 kHz mark is a mirror image of the left half and tells you
nothing new about the input signal. Hence, the code
accompanying this article only plots the first 256 points, and it doubles the
horizontal size so the display fits neatly under the oscilloscope display.
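One way to express that plotting rule is sketched below: it takes the first half of the magnitude array, draws each point two columns wide, and clamps the height so that an off-scale DC bin stays inside the plot area. The function name and the `full_scale` parameter are hypothetical. The sample code's actual rendering routine is not reproduced here.

```c
#include <stddef.h>

/* Turn the first half of an n-point magnitude array into bar heights.
 * Each bin is drawn two pixels wide, so n/2 bins fill n columns.
 * full_scale is a hypothetical magnitude that maps to the full height. */
void spectrum_columns(const double *magnitude, size_t n, int height,
                      double full_scale, int *columns)
{
    for (size_t k = 0; k < n / 2; k++) {
        double scaled = magnitude[k] / full_scale * height;
        if (scaled > height)
            scaled = height;      /* clamp, for example, the DC spike */
        if (scaled < 0.0)
            scaled = 0.0;
        int h = (int)scaled;
        columns[2 * k] = h;
        columns[2 * k + 1] = h;
    }
}
```

For a 512-point transform this fills 512 columns from 256 bins, the same width as a 512-sample oscilloscope trace drawn at one pixel per sample.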
At this point, experiment a bit with performance. In
particular, try building the FFTW libraries without Cell/B.E.
support so they use only the PPE, and compare the results. The improvement from
using the SPEs would be more noticeable if you were
doing a larger transform because the SPEs are much better at this sort of thing than
the PPE. With a data set as small as yours, the transaction overhead is a
significant fraction of the execution time.
Surprise! A useful spectrum analyzer
So, you have now turned a PS3 into a useful spectrum analyzer. The next
article in the series examines the other side of that equation and uses the same hardware
as a function generator: the basis of an audio synthesizer, among many other
things.
Resources
Learn
- Use an RSS
feed to request notification for the upcoming articles in this series. (Find out more
about RSS feeds of developerWorks content.)
- Check out all the articles in the series. Part 1 uncovers the design intent of the
project and inspects the details of the user interface implementation.
- Refer to Advanced Linux Sound Architecture (ALSA)
(which kinda sounds like a band from the 1980s), which provides audio and MIDI
functionality to Linux with fully modularized sound drivers, SMP and thread-safe
design, support for the older OSS API, binary compatibility for most OSS programs,
and a user space library to simplify application programming and provide
higher-level functionality.
- Read the Open Sound System (OSS) 4.0 Programmer's Guide
for a wealth of well-chosen tiny demo applets that demonstrate recording,
playback, and various mixer tweakage. Study this code for the fastest way
to get up to speed on the required steps.
- Go to the home of the
Poor Man's Spectrum Analyzer for
just one of many sites for building garage-project lab equipment.
- Find the latest
3.2alpha2 prerelease version of the fftw Fast
Fourier Transform library,
including IBM-supplied Cell/B.E. optimizations.
- See "25 tips to optimal application performance"
(developerWorks, June 2006) for how you can achieve near theoretical-maximum
performance for real applications on the Cell/B.E. processor by learning about the
processor's architectural characteristics.
- Review Jonathan Bartlett's essential
preliminary article "Programming high-performance applications on the Cell/B.E. processor"
(developerWorks, January 2007) about installing Linux on the PS3.
- Check out the document Sony released describing inter alia
how the ps3fb device interacts with the GPU and your Linux programs.
Note that this is a mirror document; there doesn't appear to be an official copy of this
document on Sony's sites.
- Refer to the Cell
Broadband Engine documentation section of the IBM Semiconductor Solutions Technical Library for a wealth of downloadable manuals,
specifications, and more.
- Sign up for the developerWorks newsletter
and get the latest developer news and Cell/B.E. happenings delivered to your inbox each week.
Check Power Architecture when you sign up to receive Cell/B.E. news in your newsletter.
Get products and technologies
- Jump over to Part 1
if you need to find the sample code referenced in this article.
- Look for the
Griffin iMic: my
audio input device of choice. Note that the Web site shows a (newer) white model
of the product. The model I have tested with PPC Linux is the older,
translucent-and-silver version with a switch between the input and output jacks.
- Get Yellow Dog
Linux as a free download from Terra Soft Solutions.
My experience is that all the mirrors are quite slow. I got the install ISO much
faster by searching a P2P network for the filename yellowdog-5.0-phoenix-20061208-PS3.iso.
- Find all Cell/B.E.-related articles, discussion forums, downloads,
and more at the IBM developerWorks Cell
Broadband Engine resource center: your definitive resource for all
things Cell/B.E.
- Contact IBM about custom
Cell/B.E.-based or custom-processor based solutions.
About the author
Lewin A.R.W. Edwards works for a Fortune 50 company as a wireless security/fire safety device design engineer. Prior to that, he spent five years
developing x86, ARM and PA-RISC-based networked multimedia appliances at
Digi-Frame Inc. He has extensive experience in encryption and security
software and is the author of two books on embedded systems development.