Programmers face several new challenges in developing applications for the Cell BE processor. With nine cores, multiple Instruction Set Architectures (ISAs), and non-coherent memory, the design of the Cell BE processor presents an environment where debugging is both more important and more complex than in traditional architectures. The Cell BE SDK contains several tools to aid in debugging, the most important of which are the GNU Debugger, or GDB, and the IBM Full-System Simulator for the Cell Broadband Engine, or SystemSim.
GDB is a command-line debugger available as part of the GNU development environment. Because of the Cell BE processor's unique characteristics, GDB has been modified so that there are actually two versions of the debugger -- ppu-gdb for debugging PPE programs, and spu-gdb for debugging SPU programs. The IBM Full-System Simulator for the Cell BE processor can be used alone or in conjunction with GDB to observe and control program execution in fine detail to facilitate problem diagnosis. The simulator lets you view many aspects of the simulated system with an easy-to-use graphical user interface. You can also control many aspects of the simulator using Tcl commands.
This article describes how to begin debugging Cell BE software, starting with a description of how to debug PPE and SPU programs, followed by a brief description of some simulator features for debugging common problems in Cell BE applications. The final section describes how to debug the Linux® kernel for the Cell BE processor running on the IBM Full-System Simulator.
There are several approaches to debugging programs running on the PPE. If you have access to Cell BE hardware, you can use the standard approach of running the application under GDB. A similar approach is to run the application under GDB inside the simulator. The file system provided with the Cell BE SDK for use inside the simulator already has GDB installed. After the application is running under GDB, you can simply use the standard commands available in GDB to debug the application. Many excellent resources are available on GDB commands and debugging techniques (see Resources).
Another approach that is available both for hardware-based and simulator-based debugging is to run the application under gdbserver. gdbserver is a companion program to GDB that implements the GDB remote serial protocol. This is used to convert GDB into a client/server-style application, where gdbserver launches and controls the application on the target platform, and GDB connects to gdbserver to specify debugging commands. The connection between GDB and gdbserver can be either through a traditional serial line or through TCP/IP.
To use this approach, you must have a version of GDB that supports the 64-bit PowerPC® architecture. On 64-bit PowerPC host systems, this version of GDB might be available as part of the standard OS installation. Otherwise, download and build a version of GDB with the appropriate architecture support. Listing 1 illustrates the steps needed to configure, compile, and install the correct version of GDB. Cut and paste the listing into a file and run it as a shell script (for example, sh file). If the wget of the GDB source fails, download the archive manually from one of the many mirror sites, place it in the base directory, and comment out that line of the script. By default, the install stage installs into /usr/local/; if you do not have write access to /usr/local, pass the --prefix option to configure to specify a different installation directory (for example, configure --target=powerpc64-linux --prefix=/home/sdkuser/local).
Listing 1. The commands to download and build powerpc64-linux-gdb for x86
#
# Script to download and build gdb for ppc64.
#
mkdir -p base
mkdir -p obj
wget -c ftp://ftp.gnu.org/pub/gnu/gdb/gdb-6.3.tar.bz2 -P base
tar jxvf base/gdb-6.3.tar.bz2
pushd obj
../gdb-6.3/configure --target=powerpc64-linux
make all
make install
popd
Remote debugging using gdbserver can occasionally be useful when running applications on real hardware, but it is especially valuable for debugging applications on the simulator since it enables the use of graphical debuggers such as DDD and Eclipse. To employ this approach, you need a version of gdbserver for the target platform and network connectivity. gdbserver typically comes packaged with GDB and is installed on the file system for the simulated system in the Cell BE SDK. For network connectivity to the simulated system, you must enable bogusnet support in the simulator, which creates a special ethernet device that uses a "call-thru" interface to send and receive packets to the host system. See the simulator documentation for details on how to enable bogusnet (see Resources).
To start a remote debugging session, launch the application on the target platform (either hardware or inside the simulator) using gdbserver as follows:
gdbserver :2101 myprog arg1 arg2
where :2101 is a parameter to gdbserver
specifying the TCP/IP port to be used for communication, myprog is the name of the program to be debugged, and
arg1 arg2 are the command line arguments to
myprog. Then start GDB from the client system
(for the simulator this will be the host system of the simulator):
/usr/local/bin/powerpc64-linux-gdbtui myprog
You should have the source and compiled executable version for myprog on the host system. If your program links to dynamic libraries, GDB will attempt to locate these when it attaches to the program. If you are cross-debugging, you will need to direct GDB to the correct versions of the libraries or it will try to load the libraries of the host platform. For the Cell BE SDK 1.1, this is accomplished with the following gdb command:
set solib-absolute-prefix /opt/sce/toolchain-3.2/ppu/sysroot
Then at the (gdb) prompt connect to the server with the command:
target remote 172.20.0.2:2101
Note that the :2101 parameter in this command
matches the TCP/IP port parameter used when starting gdbserver. The IP address of the simulated system is generally fixed at 172.20.0.2,
but you can verify it by issuing the ifconfig command
in the console window of the simulator. Giving the simulator a symbolic
name is useful and can be done by editing the host system's /etc/hosts
file as shown here:
Listing 2. The modified /etc/hosts file
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
172.20.0.2 mambo
Figure 1 shows an example GDB session:
Figure 1. GDB session for myprog attached to gdbserver running on the simulator

To debug SPU programs, you need to have Version 1.0.1 or later of the Cell BE SDK installed. This is the first version that includes SPU GDB and the necessary kernel and library support. As part of the SDK install process, SPU GDB is installed as spu-gdb on the file system to be used by the system running inside the simulator.
You can use SPU GDB to launch and debug stand-alone SPU programs in much
the same way as GDB is used on PPE programs. Stand-alone SPU programs are
self-contained applications that execute entirely on the SPU. Listing 3
presents a simple stand-alone SPU program, Listing 4 presents its trivial
Makefile, and Figure 2 presents a sample
debug session for this program using SPU GDB. (Note: Recursive SPU
programs are generally a bad idea due to the limited size of local
storage. We've made an exception here since it allows us to illustrate
the backtrace command of GDB with a simple
example.)
Listing 3. A simple stand-alone SPU program
#include <stdio.h>
#include <spu_intrinsics.h>

/* Recursive Fibonacci; recursion is used only to give backtrace some depth. */
unsigned int
fibn(unsigned int n)
{
    if (n <= 2)
        return 1;
    return (fibn (n-1) + fibn (n-2));
}

int main(int argc, char **argv)
{
    unsigned int c;
    c = fibn (8);
    printf ("c=%u\n", c);   /* c is unsigned, so use %u */
    return 0;
}
Listing 4. The simple Makefile
CC=/opt/sce/toolchain-3.2/spu/bin/spu-gcc
simple: simple.c
	$(CC) simple.c -g -o simple
Source-level debugging of SPU programs with GDB is similar in nearly all aspects to source-level debugging for the PPE. For example, you can set breakpoints on source lines, display variables by name, display a stack trace, and single-step execution. Figure 2 illustrates the backtrace output for the simple stand-alone SPU program.
Figure 2. SPU GDB session for a stand-alone SPU program

GDB also supports many of the familiar techniques for debugging SPU programs at the assembler code level. For example, you can display register values, examine the contents of memory (which for the SPU means local storage), disassemble sections of the program, and step execution at the machine instruction level. Figure 3 illustrates some of these facilities.
Figure 3. Assembly language features for debugging SPU programs

One point that deserves special mention is the way GDB deals with the SPU registers. Since each SPU register can hold multiple fixed or floating point values of several different sizes, GDB treats each register as a data structure that can be accessed with multiple formats. The GDB ptype command, illustrated in Listing 5, shows the mapping used for SPU registers.
Listing 5. The SPU's registers as a data structure
(gdb) ptype $r80
type = union __gdb_builtin_type_vec128 {
    int128_t uint128;
    float v4_float[4];
    int32_t v4_int32[4];
    int16_t v8_int16[8];
    int8_t v16_int8[16];
}
To display or update a specific slot in an SPU register, specify the appropriate field in the data structure, as shown in Listing 6.
Listing 6. Modifying an SPU register
(gdb) p $r80.uint128
$1 = 0x00018ff000018ff000018ff000018ff0
(gdb) set $r80.v4_int32[2]=0xbaadf00d
(gdb) p $r80.uint128
$2 = 0x00018ff000018ff0baadf00d00018ff0
Debugging Cell BE applications
Cell BE applications generally contain functions that execute on the PPE
as well as on the SPUs. In some cases, the SPU portion of the application
can be recast as a stand-alone SPU program so that the previous approach
can be used. Otherwise you can attach SPU GDB to the SPU program after it
has been created by the PPE portion of the application. To accomplish
this, you must set the environment variable SPU_DEBUG_START to 1. This
directs the libspe library to create each SPE thread in the stopped state
and wait for a signal to start execution. After creating the SPE thread,
libspe also prints a message with the thread ID of the SPU program, which
you must specify to SPU GDB with the -p command
line option. After SPU GDB has attached to the target thread, it signals
the thread to begin execution.
The Full-System Simulator for the Cell Broadband Engine (SystemSim) provides only one console window for interaction with the simulated system, which means you have to either start the application in the background, or background it after it starts running. Whatever method you use, access to a shell prompt is required when it is time to attach to the SPE thread.
One of the most common things programmers do is transfer program data from Main Storage to Local Storage using DMA. If a buffer address is wrong, the program might hang or get a bus error. If mailboxes are used to coordinate communication between the PPE and SPE, and one of the two threads falls out of sync for some reason, the program might hang. In both of these instances it is helpful to be able to use a debugger to track down the source of the problem.
Figure 4 illustrates a typical bus error. The program is run and
immediately gets a bus error. Then the environment variable
SPU_DEBUG_START is set. This stops the SPE threads just after getting
loaded, and gives the programmer a chance to attach to them. After the
debugger has attached, the program is allowed to continue to the point of
failure. Here the debugger reveals that the SPU program was at line 20
when the error occurred. Since line 20 is waiting for the completion of
the DMA that was initiated at line 18, this suggests this DMA is the cause
of the bus error. In this case, the effective address specified on the
DMA, ef_addr, is the address of the parameter
supplied when the SPE thread was created. By inspecting the PPE portion
of the application which started the SPE thread, shown in Figure 5, you can
now identify the source of the problem. The PPE program allocated the
buffers with malloc, but the address passed to the SPE thread looks like a
stack address. The PPE source passed the array buffer itself when it should
have passed the element buffer[i]; the missing array subscript was the
problem.
Figure 4. Debugging a bus error with spu-gdb

Figure 5. Code causing bus error. Red highlight shows error, yellow shows fix.

Beginning with the Cell BE SDK 1.1, it is also possible to do remote debugging of SPU programs using an approach similar to that described above for PPE programs. Remote debugging of SPU programs is performed with the spu-gdbserver application, which operates in a similar fashion to gdbserver. As with remote debugging for PPE programs, bogusnet should be enabled to allow network connectivity between the host and simulated system.
To start a remote debugging session for the SPU portion of an application, start the application just as you would for local debugging. When it is time to attach the debugger to the SPE thread, use spu-gdbserver in place of spu-gdb. For example, to attach to the SPE thread with ID 375, issue the command:
spu-gdbserver :2101 --attach 375
After spu-gdbserver has attached to the SPE thread, start SPU GDB on the
simulator host system and use the target command to attach to the
gdbserver that is running on the simulator. Figure 6 shows a GDB session
running within the Data Display Debugger (DDD) that has connected to the
gdbserver running in the simulator. Note the "target
remote mambo:2101" GDB command in the lower window of the figure.
In this example, mambo is a symbolic name for
the IP address of the system running on the simulator. This name and the
corresponding IP address were added to the host system's /etc/hosts file
as described above.
Many of the graphical front-ends for GDB can easily be configured to use
SPU GDB, enabling full graphical, source-level debugging of SPU programs.
Figure 6 shows the use of DDD with SPU GDB. DDD is included with Fedora
Core 5. To verify that it is installed, run rpm -q ddd; the result should be similar to:
ddd-3.3.11-5.2
If DDD is not already installed, you can install it from the Fedora Core 5
distribution CDs or with yum install ddd.
To run DDD with SPU GDB, use the -debugger argument to specify the name of
the debugger program. If you are using the Cell BE SDK 1.1, this name is
/opt/sce/toolchain-3.2/spu/bin/spu-gdb. For
example:
ddd -debugger /opt/sce/toolchain-3.2/spu/bin/spu-gdb myprog
Figure 6. SPU GDB session running under DDD

Debugging features in SystemSim
The simulator has a vast array of debug facilities. This section explores some that are helpful for SPU debugging.
The SPU Local Store has no memory protection, and memory access wraps from the end of Local Store back to the beginning. An SPU program is free to write anywhere in Local Store, including its own instruction space. A common problem in SPU programming is the corruption of the SPU program text when the stack area overflows into the program area. This problem typically does not become apparent until some later point in the program execution, when the program attempts to execute code in the area that was corrupted, which typically results in an illegal instruction exception. Even with a debugger it can be difficult to track down this type of problem because the cause and effect can occur far apart in the program execution. Adding printf calls just moves the failure point around.
The simulator has a feature that monitors selected addresses or regions
of Local Store for read or write accesses. This feature can identify stack overflow conditions. To make this easy to use, a Tcl
procedure is provided with the simulator that creates the trigger
functions to detect stack overflow for a given SPU program. The Tcl
procedure is called enable_stack_checking, and
is invoked in the simulator command window as follows:
enable_stack_checking [spu_number] [spu_executable_filename]
This procedure uses the nm system utility to determine the area of Local Store that will contain program code and creates trigger functions to trap writes by the SPU into this region (see the SystemSim documentation for further information on trigger functions). Figure 7 shows the simulator console window and command window from a simulator run that employed enable_stack_checking to detect a stack overflow in a Cell BE application.
Note: The simulator's method of detecting stack overflow only looks for stack overflow into the text and static data segments and thus does not detect stack overflows into the heap. Another approach (that currently only works using gcc) is to enable stack checking by the compiler. The -fstack-check compile flag results in the insertion of runtime tests which will detect both forms of stack overflow. The program halts in the event of overflow.
Figure 7. Example of SystemSim's stack overflow detection facility

Another common error in SPU programs is a DMA that specifies an invalid combination of Local Store address, effective address, and transfer size. The alignment rules for DMAs specify that transfers of less than 16 bytes must be "naturally aligned," meaning that the address must be divisible by the size. Transfers of 16 bytes or more must be 16-byte aligned. The size can have a value of 1, 2, 4, 8, 16, or a multiple of 16 bytes to a maximum of 16KB. In addition, the low-order four bits of the Local Store address must match the low-order four bits of the effective address (in other words, they must have the same alignment within a quadword). Any DMA that violates one of these rules generates an alignment exception, which is presented to the user as a bus error.
The simulator checks all these alignment requirements and raises alignment exceptions as necessary to match the behavior of the hardware. In addition, the simulator generates warning messages to aid the programmer in finding and correcting these problems. Figure 8 illustrates a warning message, "WARNING: 441391050: GET command with illegal size (12) (< 16 and not 0, 1, 2, 4, or 8)," (highlighted in red in the figure) issued by the simulator for a DMA alignment exception.
Figure 8. Warning message from simulator for DMA alignment exception

Debugging the Linux kernel can be a difficult task, in part because the kernel is a complex piece of software, but also because the debugger cannot rely on basic OS functions being available or working properly. With the Cell BE SDK, kernel debugging is simplified because the IBM Full-System Simulator, which is part of the SDK, allows a debugger running on the host system to debug a Linux kernel running inside the simulator.
To debug the Linux kernel at the source level, you should build a version of the kernel that contains the debugging information. To do this, you need a version of the Linux kernel source that contains support for the Cell BE platform. The easiest way to get one is to download and install the kernel source RPM from the Linux on CBE-based Systems Web site at the Barcelona Supercomputing Center (BSC; see Resources). The process for building the kernel depends on the host system, installed tools, and other details, and is beyond the scope of this article, which covers only the steps necessary to enable the debugging information. The example commands shown illustrate these steps on a Linux x86 platform with Cell BE SDK 1.1 installed.
To enable debugging information in the kernel, go to the directory where you will build the kernel and type:
ARCH=powerpc PLATFORM=cell CROSS_COMPILE=/opt/sce/toolchain-3.2/ppu/bin/ppu- make xconfig
The make xconfig command brings up the
configuration menu shown in Figure 9. Scroll down and click
"Kernel hacking" in the left-hand set of options, then click
"Compile the kernel with debug info" (DEBUG_INFO) in the right-hand
set of options. This option specifies that symbols and source information
are retained in the generated binary to allow source-level debugging. In
some cases, you might also choose to turn off certain compiler
optimizations to make debugging easier. In particular, disabling the
-fomit-frame-pointer optimization allows the
debugger's backtrace command to work reliably, and changing the optimization
level from -Os to -O0 makes it easier for GDB to associate
individual instructions with a line in the source code. After making all
the desired changes, save the configuration, exit the configuration dialog, and then rebuild the kernel.
Figure 9. The make xconfig screen

Next, start the simulator with the newly built kernel. To ensure that
the simulator is using the new kernel, create a symbolic link named vmlinux to the new kernel in the current directory
before starting the simulator. To verify that the correct kernel is being
used, check the name of the kernel file displayed by the simulator during
start-up.
Now you are ready to start a debug session. First, start the simulator and click the "Service GDB" button; notice that the text of the button changes to "Waiting for GDB...". In another window, change to the directory where you compiled vmlinux and start the GDB session with the command:
/usr/local/bin/powerpc64-linux-gdbtui vmlinux
then at the (gdb) prompt type
break start_kernel
target remote :2345
continue
You should see something very similar to Figure 10.
Figure 10. The kernel debug session

Now GDB is attached to the simulator and can monitor and control the execution of the Linux kernel. From here it is possible to set additional breakpoints, display variables by name, display processor registers, display a stack trace, single-step execution, and so on.
The Cell BE SDK provides most of the standard GNU debug facilities. Combined with the facilities found in the simulation environment, these give developers the tools needed to debug many problems. The key to simplifying any debugging task is to program defensively from the beginning and use sound software engineering principles.
Resources
Learn
- The following references describe commands and debugging techniques available in GDB:
  - Debugging with GDB, Version 6.3, Richard Stallman, Roland Pesch, Stan Shebs, et al., available at the GDB: The GNU Project Debugger Web site.
  - Linux Debugging and Performance Tuning: Tips and Techniques, Steve Best, Prentice Hall, 2006.
  - Exterminate Bugs Faster with GDB (free registration required), William Nagel, Linux Magazine, January 2006, pp. 38-42.
- The following references provide information about GDB that is specific to the PowerPC and Cell BE architectures:
  - "Debugging tools and techniques for Linux on Power," Calvin Sze, IBM developerWorks, August 2005.
  - Porting the GNU Tool Chain to the Cell Architecture, Ulrich Weigand, from the Proceedings of the GCC & GNU Toolchain Developers' Summit, pp. 185-198, June 2005.
- Documentation for the IBM Full-System Simulator is provided with the simulator package. For the release shipped in conjunction with the Cell BE SDK 1.1, this includes:
  - IBM Full-System Simulator User's Guide, in doc/SystemSim.Users.Guide.pdf
  - SystemSim BogusNet HowTo, in doc/bogusnet-howto.pdf
  - Using GDB with Systemsim, in doc/gdb/use.html
- Dan Kegel has a good gdbserver resource page that links to interesting and related material.
- The IBM Semiconductor solutions technical library Cell Broadband Engine documentation section lists specifications, user manuals, and more.
- Find all Cell BE-related articles, discussion forums, downloads, and more at the IBM developerWorks Cell Broadband Engine resource center: your definitive resource for all things Cell BE.
- Keep abreast of all the Cell and other Power Architecture-related news: subscribe to the Power Architecture Community Newsletter.
Get products and technologies
- Get Cell BE: Contact IBM E&TS for custom Cell BE-based or custom-processor based solutions.
- Get the alphaWorks Cell Broadband Engine SDK 1.1.
- See all Power-related downloads on one page.
- Download and install the kernel source RPM from the Linux on CBE-based Systems Web site at the Barcelona Supercomputing Center.
Michael Kistler is a Senior Software Engineer in the IBM Austin Research Laboratory. He joined IBM in 1982 and has held technical and management positions in MVS, OS/2, and Lotus Notes development. He joined the IBM Austin Research Laboratory in May 2000 and is currently working on simulation technologies for IBM's Power and PowerPC processors and systems. His research interests are parallel and cluster computing, fault tolerance, and full system simulation of high-performance computing systems. Mr. Kistler received his BA in Computer Science from Susquehanna University in 1982 and MS in Computer Science from Syracuse University in 1990.




