IBM Support

SDK 1.10: Using PPA plug-in to find bottlenecks in programs

Technical Blog Post



1. The IBM Power Systems Performance Advisor plug-in
The IBM Power Systems Performance Advisor (PPA) plug-in lets users profile C/C++ applications against a selected set of processor-specific metrics. PPA uses ocount, an OProfile tool that counts native hardware events, to gather processor performance data and calculate the metrics.
 

The IBM SDK for Linux on Power (SDK) 1.10 provides a new version of the PPA plug-in. This version provides visualization charts for metrics and also includes a drill-down feature which can be used to investigate the occurrence of specific events in the source code.

 

2. Using the tool

To use the plug-in, first configure a launcher. In the PPA launcher configuration (Figure 1), select the CPU Model (either POWER7 or POWER8), the Analysis Type (one of the metric categories), and the specific metrics to collect during profiling. In the "Main" tab, specify the program and working directory. Arguments to the program and environment variables can be specified in the other tabs. Finally, click the Profile button to start the analysis.

image

Figure 1: PPA launcher configuration

 

After profiling completes, a view opens and displays each metric value with its corresponding events. For some metrics a chart is also displayed, comparing metric (or event) values.

 

3. Matrix transpose example
This example uses a C program that transposes a 20k x 20k matrix of "double" elements. Suppose we want to analyze how the program uses the memory cache. Select the "Memory Cache" category, which calculates metrics about load and store misses.
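The source of the example program is not reproduced in this post. As a rough idea of the kind of code being profiled, here is a minimal sketch of a naive transpose. The function name `transpose_naive`, the flat row-major layout, and the loop order are all assumptions made for illustration, not the actual main.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical naive transpose (not the original program):
 * writes the transpose of the n x n matrix b into a.
 * Both matrices are stored as flat row-major arrays.
 */
void transpose_naive(double *a, const double *b, size_t n)
{
    assert(a != NULL && b != NULL && a != b);

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            /* a is written row by row, while consecutive values of j
             * read b with a stride of n doubles (down a column). */
            a[i * n + j] = b[j * n + i];
        }
    }
}
```

With this loop order, each iteration of the inner loop loads from a different row of `b`, which is the sort of access pattern the "Memory Cache" metrics are meant to expose.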
 

After the profile run, PPA generated the charts shown in Figures 2 and 3.

 

image

Figure 2: Cache loads and misses chart

 

image

Figure 3: Cache stores and misses chart

 

The first chart compares the number of cache loads and load misses at all three cache levels, and the second chart compares cache stores and store misses, also at all three levels. Both indicate that the program incurs a high number of load and store misses, especially in the L1 cache.

 

In addition to the charts, PPA displays a view of the counted event values and the calculated metrics, as shown in Figure 4. The L1 cache load miss ratio was 68.61% and the L1 cache store miss ratio was 73.16%.

 

image

Figure 4: L1 cache load and store misses

 

To identify which parts of the code produce the events used in a metric's formula, use the drill-down feature. In this example, the "L1 cache load miss ratio" metric divides the total number of load misses in the L1 cache by the total number of load references in L1.
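The ratio described above is simple enough to write down as code. The following is a minimal sketch, not PPA's implementation: the function name `miss_ratio_percent` is hypothetical, and the result is scaled to a percentage only because that is how the values appear in the metrics view.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: miss ratio as a percentage,
 * computed as misses divided by references.
 */
double miss_ratio_percent(uint64_t misses, uint64_t references)
{
    /* A ratio is undefined when no references were counted. */
    assert(references > 0);

    return 100.0 * (double)misses / (double)references;
}
```

For instance, `miss_ratio_percent(1, 4)` evaluates to 25.0. The raw event counts behind the 68.61% figure are not given in this post, so they are not reproduced here.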

 

To find out which pieces of code are responsible for the L1 load misses, click the button next to the "PM_LD_MISS_L1" event value to start the PPA drill-down. Because the drill-down runs the program, it can take some time, depending on the program's execution time. When the drill-down job finishes, a view opens and displays the source code elements related to the selected event (Figure 5).
 

In the view, results are sorted so that the functions with the most occurrences of the selected event come first. Double-click an item to open the source code at the corresponding file and line. In Figure 5, double-clicking an item opened line 92 of main.c. This line copies the elements of matrix "b" to matrix "a", and it is the bottleneck for L1 cache load misses.

 

image

Figure 5: Drill down using PM_LD_MISS_L1 event
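One plausible reason a copy line like this dominates the L1 load misses is the distance in memory between successive loads. The sketch below works through that arithmetic under two assumptions that this post does not confirm: the matrices are row-major, and one of them is walked down a column. The helper names are hypothetical. The 128-byte value is the cache line size of POWER7 and POWER8.

```c
#include <assert.h>
#include <stddef.h>

/* Cache line size on POWER7 and POWER8, in bytes. */
#define CACHE_LINE_BYTES 128u

/* Bytes between vertically adjacent elements of a row-major n x n matrix. */
size_t column_stride_bytes(size_t n, size_t elem_size)
{
    assert(n > 0 && elem_size > 0);
    return n * elem_size;
}

/* Nonzero when every step down a column is guaranteed to touch a new cache line. */
int column_step_changes_line(size_t n, size_t elem_size)
{
    return column_stride_bytes(n, elem_size) >= CACHE_LINE_BYTES;
}
```

For the 20k x 20k matrix of doubles used here, `column_stride_bytes(20000, sizeof(double))` is 160000 bytes, far larger than one cache line, so under these assumptions each load in a column walk would fall in a different line from the previous one.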

 

4. More Information
For more information about PPA, access the IBM SDK for Linux on Power online documentation: http://www.ibm.com/support/knowledgecenter/linuxonibm/liaal/iplsdkprofilepsa.htm

 

