Before you start
This tutorial presents an overview of the concepts essential to understanding the Cell Broadband Engine (Cell BE) architecture and how a compiler can automatically distribute code to the architecture's Synergistic Processing Elements (SPEs). It is based mainly on the experiences of IBM Research in creating the Octopiler optimizing compiler, but also includes input from the Visual Age XL compiler team. The series should be useful to other compiler writers, and also to anyone who would like a clearer understanding of the Cell BE architecture in general. This tutorial introduces the local store memory of the SPEs, the instruction buffer, auto-SIMDizing code, and the software cache approach to accessing irregular data. Subsequent tutorials discuss these topics in greater detail.
The topics covered in this five-part series include:
- Part 1: Overview: The Cell BE architecture and some of the issues faced in compiler design
- Part 2: Optimizing for the SPE: Optimizations used on the SPEs, such as how the compiler translates scalar code for a vector-only processor
- Part 3: Making the most of SIMD: How a compiler can effectively generate SIMD code for two different architectures (the SPE and VMX), accommodating the various technical constraints of the processors
- Part 4: Partitioning large tasks: How the compiler, or the user, can divide tasks between the SPEs and the main processor
- Part 5: Managing memory: Techniques used by the compiler or the programmer to give the SPEs access to data that can't fit in local storage
Basic familiarity with computer architecture is helpful, but nearly any programmer in possession of a fully functional brain should be able to follow along.



