The little broadband engine that could: Reviewing the newest little SDK that installs natively on PS3

The FC7 kernel built for the SDK is just right for the PS3

Come along on a little train tour of the SDK for Multicore Acceleration 3.0 to see what's different for developers and how you can make good use of the SDK, including native installation on PS3, support for FC7 and RHEL 5.1, enhanced compilers, Fortran and Ada support, BLAS, ALF, and DaCS--oh my!

Peter Seebach (developerworks@seebs.net), Freelance writer, Plethora.net

Peter Seebach usually responds to a new API the way most people respond to a brightly-wrapped present--by shaking it. He keeps wrapping paper and old documentation alike for the sentimental value.



19 February 2008

Introduction

The IBM Cell Broadband Engine™ (Cell/B.E.) Software Development Kit (SDK) has been updated once again. This article addresses what's new and interesting in the 3.0 release from a non-IBM point of view. Version 3.0 came out on October 19, 2007, and it completely replaces the prior 2.1 release. (In fact, if you don't have version 2.1, don't bother looking for it.)

Installing natively

The 3.0 update shows off some major changes. Probably the most significant to a lot of developers is that it can now be installed natively on a PS3 running Linux® without having to modify or cheat the installer. The 2.1 and earlier releases refused to install on a Cell/B.E.-based system without installing a kernel that ran only on native hardware, rather than in the hypervisor environment used by the PS3.

From the Editor: Multiple points of view

The Cell Broadband Engine seems to be lending itself to a disparate range of applications, from blade servers to hybrid supercomputers to game consoles. As someone who walks the halls (at least virtually) of IBM, I have access to lots of engineers and designers whose focus is applying Cell/B.E. technology to the more business-oriented uses and systems. What I'm missing are the developers who want to do PS3 hacks (like my friend and the author, Peter, here). My guess is that your efforts with the Cell/B.E. SDK and the PlayStation 3 in the areas of application development and porting (and any other areas that interest you) will be an important factor in shaping the direction of the Cell/B.E. processor's growth and evolution. So, if you've got experiences or potential story ideas you want to share, post them to the Cell Broadband Architecture Forum.

Of course, if you don't have a Cell/B.E.-based system at all, you can still run under the Full System Simulator included on the extras CD. I wouldn't recommend trying to run the system simulator on a PS3, though; the performance would likely be a little slow.

Downloading and running

The SDK is supported on the Fedora 7 and the RHEL 5.1 systems. Although it might be possible to use it on others, IBM provides no support for that. The CDs for the product come with a small starter RPM that installs the SDK installer. The SDK installer can then be used to install the remainder of the SDK. There are two CDs in the SDK: the developer package, providing the main functionality of the SDK, and the extras package, providing experimental code, add-ons, and additional features such as the Full System Simulator.

The SDK installer uses files from the SDK CDs and files downloaded from the Barcelona Supercomputing Center. If you have the ISO images downloaded already, the SDK installer can mount them and copy most of the files it needs from them, but you will still need to download some files (about 130MB) from the Barcelona site.

With each revision, the SDK installer gets a little easier to use. Compared with the hand-tuning of some earlier revisions, it's pretty polished now. Installation will take a while, partly because the yum installer chews up a fairly large amount of memory while it's running. The SDK installer uses yum after creating special custom repositories holding the SDK components; it's unclear how this is an improvement over installing them directly, but apparently it works.

The installer doesn't install the extras automatically. The extras are considered somewhat experimental, so if you want to install them, you'll have to do it yourself. You might need to install other packages first. For example, the Fedora 7 installation doesn't seem to have all the tcl and tk versions needed, but yum install tcl and yum install tk solves that. Some of the extra RPM files depend on RPMs that might not be installed by default. You might have to install the trace, pdt, alf-trace, and dacs-trace packages, plus their corresponding development packages, before some of the extras will install.


So, what's new?

The SDK has undergone the usual variety of changes, updates, and enhancements. The XL C compiler has been updated from 0.8.2 to 0.9.0, although it's still considered an alpha product. The GNU compiler has also been updated with substantial improvements to autovectorization. There is a workaround for one of the few known errata for the SPE (see Resources). The accelerator library framework (ALF) has been updated as well. There are some brand-new features such as Fortran (both PPE and SPE) and Ada (PPE only) support, and a linear algebra library.

However, perhaps the biggest change is that the documentation is updated and is now installed as part of the SDK. (Trust a writer to think the documentation is the biggest change.) The documentation has always been fairly good, but in previous releases, you had to go looking for it. Now it's all installed in /opt/cell: specifically, in /opt/cell/sdk/docs. The documentation provided with the SDK goes way beyond anything an article can cover, so go read the documentation.

A recurring theme in the updates of the SDK is a move toward standardization and specification. The early SDK offered experimental protocols. The original libspe is now deprecated because it had flaws that were addressed by developing a new and improved API called (with typical creativity) libspe2. The SIMD math library has been polished, improved, and debugged. For example, there's a SIMD math specification now, and the library implements it completely. That's significant progress, especially if you look at the notes in a previous article in this series about working around one of the limitations of a previous version.

The Fedora version of the SDK includes an updated 2.6.23 kernel specifically built to run on Cell/B.E. hardware. While the standard Fedora 7 kernel runs natively on the hardware, the kernel built for the SDK is noticeably smaller. For example, the initrd file is about 40 percent of the size of the standard Fedora one, and the config file used for the kernel is even smaller. Those savings matter on the fairly common PS3 development platform, where even a couple of megabytes of free space can be at a premium.


A tour of the SDK's example code

If you plan to do any actual development, one of the most important parts of the SDK is the example code. The documentation can only get you so far. At the end of the day, reading working code is one of the best ways to learn how to use a system. The addition of man pages helps immensely too. For example, you can now expect man spe_context_create_affinity to show you what you need to know about creating contexts with affinity. (Note: One of the things it tells you is that you can't do it on the PS3 target hardware.)

The example code comes in four tar files. Just extract them in the /opt/cell/sdk/src directory where they are located, and run make to build all the sample code.

There are five subdivisions of the example programs:

Benchmarks
Well, it's a single benchmark, but it's very nice.
Demos
The demo programs show off particular accomplishments.
Examples
The example programs are tiny little programs, each of which highlights a particular coding technique or recipe for accomplishing a single task. For example, the examples/cache directory has a pair of programs showing how to take advantage of a software-managed cache implemented for the SPEs.
Tutorial
The tutorial code is either the most useful or the least useful, depending on your experience level. If you aren't sure where to get started, working through the tutorial is a great idea, and the euler tutorial shows a series of steps in the development of a fairly simple program that does a little bit of everything. Of particular interest is that when a file is unchanged between iterations, it's a symbolic link back to the previous iteration; noticing this could save you a bit of time trying to figure out what changed.
Library code
The library code offers even more little cookbook recipes and reusable code snippets that show elegant (or at least efficient) ways to perform common tasks. If you've been struggling with issues such as how to efficiently perform convolutions on arrays of data or fast Fourier transforms, you'll be happy to see that the code's already been written. Note that most of this code can run on both the PPE and the SPEs. For example, Listing 1 shows a nicely trivial function that produces a component-by-component maximum of two vectors:
Listing 1. Finding a maximum
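static __inline vector signed int
_max_int_v(vector signed int in1, vector signed int in2)
{
#ifdef __SPU__
  /* SPE: build a per-element mask, then select with it. */
  vector unsigned int cmp;

  /* Each element of cmp is all ones where in1 > in2, else all zeros. */
  cmp  = spu_cmpgt(in1, in2);
  /* Take bits from in1 where cmp is set, from in2 elsewhere. */
  return (spu_sel(in2, in1, cmp));
#else
  /* PPE: VMX provides a direct element-wise maximum. */
  return (vec_max(in1, in2));
#endif
}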
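If you don't have Cell/B.E. hardware handy, the compare-then-select idiom in the SPU branch of Listing 1 can be modeled in plain, portable C. What follows is only a sketch of the idea, not SDK code: the names max_select and max_select4 are made up for illustration, and the four-element arrays stand in for one 128-bit vector of signed ints.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the SDK: a scalar model of the
 * compare-then-select idiom used in the SPU branch of Listing 1. */
static inline int32_t max_select(int32_t a, int32_t b)
{
    /* All ones where a > b, all zeros otherwise. This mirrors the
     * per-element mask that a vector compare produces. */
    uint32_t mask = (a > b) ? UINT32_MAX : 0u;

    /* Take bits from a where the mask is set, from b elsewhere. */
    uint32_t bits = ((uint32_t)a & mask) | ((uint32_t)b & ~mask);

    /* Converting back to a signed type is implementation-defined in
     * older C standards; common compilers simply reuse the bit pattern. */
    return (int32_t)bits;
}

/* Apply the same idiom to four lanes, standing in for one vector. */
static inline void max_select4(const int32_t in1[4], const int32_t in2[4],
                               int32_t out[4])
{
    assert(in1 && in2 && out);
    for (int i = 0; i < 4; i++)
        out[i] = max_select(in1[i], in2[i]);
}
```

The point of building a mask instead of branching is that every element goes through the same instructions, which is what keeps the real vector version branch-free.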

An interesting change is the generally improved support for using the PPE's VMX vector unit to help keep things vectorized on both the PPE and the SPEs. It's too easy to get into the habit of thinking of the PPE as a scalar processor, but it has substantial vector hardware, too.
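Going back to the examples/cache directory mentioned in the list above: the idea of a software-managed cache is easier to see in a toy model. The sketch below is not the SDK's cache implementation or its API. It is a minimal direct-mapped cache in portable C, with an ordinary array standing in for main memory and memcpy standing in for the DMA transfer an SPE would use; every name and size in it (sw_cache, LINE_WORDS, NUM_LINES, and so on) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINE_WORDS 16   /* words per cache line (illustrative) */
#define NUM_LINES   4   /* lines in the direct-mapped cache (illustrative) */

/* Hypothetical toy cache; the backing array plays the role of main memory. */
struct sw_cache {
    const int *backing;                    /* stand-in for main memory */
    size_t     backing_words;              /* size of backing store in words */
    int        data[NUM_LINES][LINE_WORDS];/* the "local store" copy */
    size_t     tag[NUM_LINES];             /* which block each line holds */
    int        valid[NUM_LINES];           /* nonzero once a line is filled */
    unsigned   hits, misses;               /* simple statistics */
};

static void sw_cache_init(struct sw_cache *c, const int *backing, size_t words)
{
    memset(c, 0, sizeof *c);
    c->backing = backing;
    c->backing_words = words;
}

/* Read one word through the cache; a miss copies a whole line in. */
static int sw_cache_read(struct sw_cache *c, size_t index)
{
    assert(index < c->backing_words);

    size_t block = index / LINE_WORDS;  /* which line-sized block of memory */
    size_t line  = block % NUM_LINES;   /* direct-mapped: one possible slot */

    if (!c->valid[line] || c->tag[line] != block) {
        /* Miss: on an SPE this copy would be a DMA transfer. */
        size_t start = block * LINE_WORDS;
        size_t count = c->backing_words - start;
        if (count > LINE_WORDS)
            count = LINE_WORDS;
        memcpy(c->data[line], c->backing + start, count * sizeof(int));
        c->tag[line] = block;
        c->valid[line] = 1;
        c->misses++;
    } else {
        c->hits++;
    }
    return c->data[line][index % LINE_WORDS];
}
```

A caller would initialize the cache once with sw_cache_init and then go through sw_cache_read instead of indexing the big array directly. Reads that land in the same line after the first one cost no further copies, which is the whole appeal when each copy is a trip to main memory.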


Tools and libraries

The SDK has developed a broader repertoire of tools and support features. There's a lot more support for performance tuning in the new SDK. The Performance Debugging Tool (PDT) provides tracing support and a visualizer for trace output. The Data Communication and Synchronization (DaCS) library provides support functions for process management, data movement, and synchronization between PPE and SPE.

The SDK is starting to include things everyone was previously developing independently, plus things that people kept asking for. While performance-tuning your code for a novel architecture is always a bit of work, there are a number of tasks that are pretty consistent. The provided linear algebra library is useful because there is a lot of commonality between different programs that need linear algebra. A tuned and debugged library for this can cut development costs substantially. Similarly, one of the extras is a simulation-grade random number generator. I don't know that I've ever met a programmer who hasn't written at least one very bad random number generator. The SDK providing a good one is a very good idea.
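To make the "commonality" point concrete, here is the sort of loop that shows up in program after program: the classic BLAS level 1 "axpy" operation, y = alpha * x + y. This is a plain scalar reference written for illustration, under the assumption that a tuned library would do the same arithmetic with vectorized code. The name ref_saxpy is hypothetical and is not the SDK library's entry point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine, not the SDK's implementation:
 * computes y[i] = alpha * x[i] + y[i] for i in [0, n). */
static void ref_saxpy(size_t n, float alpha, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        y[i] = alpha * x[i] + y[i];
}
```

Everyone can write this loop. What a shared library saves you is the hand-tuned, debugged, SPE-aware version of it.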


Conclusions and a look ahead

The 2.x Cell/B.E. SDK releases marked a significant step away from purely experimental material toward viable and polished material, but they were not quite polished or complete. The fact that the 3.x SDK is being distributed from developerWorks marks a transition from a somewhat alpha-level product to a development system at or above the quality one would usually expect from a commercial product. There's still experimental material, especially on the Extras CD, but the core SDK has been tested, verified, and even debugged a little. If you've been putting off getting into Cell/B.E. development because the API looked to be in flux, it seems to be more stable now.

With the removal of IDL from the base SDK, you might have to revise your old IDL-based fractal generator to use a more hands-on protocol. That sounds like a fun project, and it'll be interesting to see what impact it has on performance.

The next article in the series introduces the DaCS library in more detail.

Resources

Learn

Get products and technologies

Discuss
