Cell/B.E. SDK 3.0 tools, Part 1: Using performance tools

Explore a practical example on the use of performance tools with the SDK 3.0

Gad Haber (gadi@us.ibm.com), Senior Architect, IBM Japan
Dr. Gad Haber joined the IBM Haifa Labs in 1993 as a research staff member in the area of performance analysis and post-link optimization. He managed the Performance Analysis and Optimization Technologies (PAOT) group until 2006, and he is currently involved with promoting the Cell/B.E. performance tools in the IBM Austin Labs. Dr. Haber works in the IBM Systems and Technology Group in Enterprise Systems Development.

Summary:  This introductory tutorial, designed as a companion for the IBM SDK for Multicore Acceleration, Version 3.0 (otherwise known as the Cell Broadband Engine® SDK), teaches you how to use five performance tools that reside in the SDK 3.0: OProfile, Cell Performance Counter, Performance Debugging Tool, the PDT Trace Reader, and FDPR-Pro. The Visual Performance Analyzer, available separately, is also highlighted.

Date:  08 Apr 2008
Level:  Introductory

Preparing and building for profiling

Follow these steps to set up a sandbox-style project tree structure so you have more flexibility when modifying and generating files.

Step 1: Copy the application from the SDK tree

Working in a sandbox tree means keeping your own copy of the project in a location you can write to (for example, your home directory): cp -R /opt/cell/sdk/demos/FFT16M ~/.
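The copy step can be wrapped so that it refuses to overwrite an existing sandbox. This is a minimal sketch; copy_sandbox is a hypothetical helper name, not part of the SDK.

```shell
# Copy a project tree into a sandbox directory without clobbering an
# existing copy. copy_sandbox is a hypothetical helper, not an SDK tool.
copy_sandbox() {
    src=$1        # for example /opt/cell/sdk/demos/FFT16M (no trailing slash)
    dest_dir=$2   # for example $HOME
    target="$dest_dir/$(basename "$src")"

    if [ ! -d "$src" ]; then
        echo "source tree not found: $src" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "sandbox already exists: $target" >&2
        return 1
    fi
    # Print the path of the new sandbox on success.
    cp -R "$src" "$dest_dir"/ && echo "$target"
}
```

A call such as copy_sandbox /opt/cell/sdk/demos/FFT16M "$HOME" would then mirror the cp command above.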

Step 2: Prepare the makefile

Change to your newly created project tree and locate the makefiles. You should find three of them:

  • ~/FFT16M/Makefile
  • ~/FFT16M/ppu/Makefile
  • ~/FFT16M/spu/Makefile
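One way to confirm that all three files are present is to search the tree. A small sketch, where list_makefiles is a hypothetical helper name:

```shell
# List every file named Makefile under a project root, sorted so the
# output order is stable between runs.
list_makefiles() {
    find "$1" -type f -name Makefile | sort
}
```

Running list_makefiles ~/FFT16M should print the three paths listed above if the copy succeeded.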

Next, make a few modifications to the makefiles so that they do not try to install executables back into the SDK tree, and add the compilation flags needed to generate profiling data. Listing 1 shows the modified ~/FFT16M/ppu/Makefile for gcc.


Listing 1. Changing ~/FFT16M/ppu/Makefile for gcc
                    
#######################################################################
##     Target
#######################################################################
#

PROGRAM_ppu= fft

#######################################################################
##     Objects
#######################################################################
#

IMPORTS = ../spu/fft_spu.a -lspe2 -lpthread -lm -lnuma

#INSTALL_DIR= $(EXP_SDKBIN)/demos
#INSTALL_FILES= $(PROGRAM_ppu)
LDFLAGS_gcc = -Wl,-q
CFLAGS_gcc = -g

#######################################################################
##      buildutils/make.footer
#######################################################################
#

ifdef CELL_TOP
   include $(CELL_TOP)/buildutils/make.footer
else
   include ../../../../buildutils/make.footer
endif

Note that:

  • Install directives are commented out.
  • No makefile modifications beyond the ones described are required.
  • The changes differ depending on whether you use gcc or xlc as the compiler.
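The same edits could also be scripted instead of made by hand. The sketch below assumes the unmodified makefile contains uncommented INSTALL_DIR= and INSTALL_FILES= lines starting in column one (inferred from the commented lines in Listing 1); prepare_makefile is a hypothetical helper name. It handles only the install lines and the two flag variables, so any other compiler-specific line, such as a compiler selection variable, still has to be added manually.

```shell
# Comment out the install directives and add the profiling flags right
# after INSTALL_FILES, as in Listing 1. Hypothetical helper; it assumes
# the INSTALL_* assignments start at the beginning of a line.
prepare_makefile() {
    file=$1
    compiler=$2   # gcc or xlc
    awk -v cc="$compiler" '
        /^INSTALL_(DIR|FILES)[ \t]*=/ {
            print "#" $0                      # disable the install directive
            if ($0 ~ /^INSTALL_FILES/) {
                print "LDFLAGS_" cc " = -Wl,-q"
                print "CFLAGS_" cc " = -g"
            }
            next
        }
        { print }                             # copy every other line unchanged
    ' "$file" > "$file.new" && mv "$file.new" "$file"
}
```

Because already-commented lines no longer match, running the helper a second time leaves the file unchanged.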

Listing 2 shows the same makefile modified for xlc.


Listing 2. Changing ~/FFT16M/ppu/Makefile for xlc
                    
#######################################################################
##      Target
#######################################################################
#

PROGRAM_ppu= fft

#######################################################################
##      Objects
#######################################################################
#

PPU_COMPILER = xlc

IMPORTS = ../spu/fft_spu.a -lspe2 -lpthread -lm -lnuma

#INSTALL_DIR= $(EXP_SDKBIN)/demos
#INSTALL_FILES= $(PROGRAM_ppu)
LDFLAGS_xlc = -Wl,-q
CFLAGS_xlc = -g

#######################################################################
##       buildutils/make.footer
#######################################################################
#

ifdef CELL_TOP
   include $(CELL_TOP)/buildutils/make.footer
else
   include ../../../../buildutils/make.footer
endif

These listings add the -g compiler flag and the -Wl,-q linker option so that the line number and relocation information, respectively, are preserved in the final integrated executable. Listing 3 shows the changes to ~/FFT16M/spu/Makefile for gcc, and Listing 4 shows the changes to the same file for xlc.


Listing 3. Changing ~/FFT16M/spu/Makefile for gcc
                    
#######################################################################
##       Target
#######################################################################
#

PROGRAMS_spu:= fft_spu
LIBRARY_embed:= fft_spu.a

#######################################################################
##       Local Defines
#######################################################################
#

# needed to keep size of program down
CFLAGS_gcc:= -g --param max-unroll-times=1
LDFLAGS_gcc = -Wl,-q -g

#######################################################################
##       buildutils/make.footer
#######################################################################
#

ifdef CELL_TOP
   include $(CELL_TOP)/buildutils/make.footer
else
   include ../../../../buildutils/make.footer
endif


Listing 4. Changing ~/FFT16M/spu/Makefile for xlc
                    
#######################################################################
##       Target
#######################################################################
#

SPU_COMPILER = xlc
PROGRAMS_spu:= fft_spu
LIBRARY_embed:= fft_spu.a

#######################################################################
##       Local Defines
#######################################################################
#

CFLAGS_xlc:= -g -qnounroll -O5
LDFLAGS_xlc:= -O5 -qflag=e:e -Wl,-q -g

#######################################################################
##       buildutils/make.footer
#######################################################################
#

ifdef CELL_TOP
   include $(CELL_TOP)/buildutils/make.footer
else
   include ../../../../buildutils/make.footer
endif
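Before building, it can help to confirm that each edited makefile actually carries the two flags. The following is a rough sketch, not an SDK tool; has_flag and check_profiling_flags are hypothetical helper names, and the check relies on grep -w, which GNU and BSD grep provide.

```shell
# has_flag FILE VARIABLE FLAG
# Succeeds if a "VARIABLE =" or "VARIABLE :=" line in FILE contains FLAG
# as a whole word (so -g does not match -gdwarf-2, for example).
has_flag() {
    grep -E "^$2[[:space:]]*:?=" "$1" | grep -qwF -- "$3"
}

# check_profiling_flags FILE COMPILER
# Prints one line per missing flag and returns non-zero if any is missing.
check_profiling_flags() {
    file=$1
    compiler=$2   # gcc or xlc
    missing=0
    has_flag "$file" "CFLAGS_$compiler" "-g" ||
        { echo "$file: CFLAGS_$compiler lacks -g"; missing=1; }
    has_flag "$file" "LDFLAGS_$compiler" "-Wl,-q" ||
        { echo "$file: LDFLAGS_$compiler lacks -Wl,-q"; missing=1; }
    return $missing
}
```

For example, check_profiling_flags ~/FFT16M/spu/Makefile gcc prints nothing and returns success when both flags are in place.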

Before building, set the default compiler to match the makefile changes you made by issuing /opt/cell/sdk/buildutils/cellsdk_select_compiler [gcc|xlc].

Now you can proceed with the build: cd ~/FFT16M ; CELL_TOP=/opt/cell/sdk make.
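The compiler selection and the build can be combined in one small wrapper. This is a sketch under stated assumptions: build_fft and the DRY_RUN variable are hypothetical names introduced here, and make -C stands in for changing into the project directory first.

```shell
# Select the compiler, then build the sandbox copy of the demo.
# With DRY_RUN set to a non-empty value, the commands are printed
# instead of executed.
build_fft() {
    compiler=$1                  # gcc or xlc
    project=${2:-$HOME/FFT16M}   # sandbox copy of the demo
    run=${DRY_RUN:+echo}         # "echo" in dry-run mode, empty otherwise

    case $compiler in
        gcc|xlc) ;;
        *) echo "usage: build_fft gcc|xlc [project-dir]" >&2; return 2 ;;
    esac

    $run /opt/cell/sdk/buildutils/cellsdk_select_compiler "$compiler" || return 1
    $run env CELL_TOP=/opt/cell/sdk make -C "$project"
}
```

Running DRY_RUN=1 build_fft gcc first is a cheap way to review the exact commands before they touch the SDK configuration.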
