Profile Guided Optimization (PGO)

Profile guided optimization (PGO), also known as profile-directed feedback (PDF), is a compiler optimization technique that uses profile data collected from sample runs of a program to improve its runtime performance.

Important: IBM Open XL C/C++ for AIX 17.1.4 supports the following operating systems:
  • IBM® AIX® 7.2: TL5 SP3 or later
  • IBM AIX 7.3: TL0 or later

However, to use PGO, your operating system must be IBM AIX 7.2 TL5 SP5 or later, or IBM AIX 7.3 TL1 or later.

If you use PGO on IBM AIX 7.2 TL5 SP4 or earlier, or IBM AIX 7.3 TL0 SP1 or earlier, you might encounter the following errors:
  • A segmentation fault when using the ibm-llvm-profdata utility:
    
    PLEASE submit a bug report to https://ibm.biz/openxlcpp-support and include the crash backtrace. 
    Stack dump: 
    0. Program arguments: /opt/IBM/openxlC/17.1.3/bin/ibm-llvm-profdata "ibm-llvm-profdata merge" -o default.profdata 
    default_15853201381331839107_0.profraw Location 0x0000e944 
    
    --- End of call chain --- 
    Segmentation fault(coredump)
    
  • Undefined symbols when linking:
    
    ld: 0711-317 ERROR: Undefined symbol: __start___llvm_prf_cnts
    ld: 0711-317 ERROR: Undefined symbol: __stop___llvm_prf_cnts
    ld: 0711-317 ERROR: Undefined symbol: __start___llvm_prf_data
    ld: 0711-317 ERROR: Undefined symbol: __stop___llvm_prf_data
    ld: 0711-317 ERROR: Undefined symbol: __stop___llvm_prf_names
    ld: 0711-317 ERROR: Undefined symbol: __start___llvm_prf_names
    ld: 0711-317 ERROR: Undefined symbol: __stop___llvm_prf_vnds
    ld: 0711-317 ERROR: Undefined symbol: __start___llvm_prf_vnds
  • Linker errors:
    ld: 0711-151 SEVERE ERROR: SETOPT: Invalid option name: NAMEDSECTS:ss
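
The minimum levels stated above can be checked before attempting a PGO build. The following is a minimal sketch, not part of the product: the function name pgo_level_ok is hypothetical, and the sketch assumes the level string has the layout printed by oslevel -s (for example, 7200-05-05-2246). Levels other than AIX 7.2 and AIX 7.3 are reported as not confirmed, because the requirement above does not cover them.

```shell
# Hypothetical helper: succeed if an AIX level string meets the PGO minimum
# stated above (AIX 7.2 TL5 SP5 or later, or AIX 7.3 TL1 or later).
# Assumes the "VRMF-TL-SP-build" layout of `oslevel -s`, e.g. 7200-05-05-2246.
pgo_level_ok() {
    level=$1
    base=${level%%-*}      # e.g. 7200
    rest=${level#*-}       # e.g. 05-05-2246
    tl=${rest%%-*}         # e.g. 05
    rest=${rest#*-}        # e.g. 05-2246
    sp=${rest%%-*}         # e.g. 05
    # Drop one leading zero, because some shells read such numbers as octal.
    tl=${tl#0}
    sp=${sp#0}
    case $base in
        7200) [ "$tl" -gt 5 ] || { [ "$tl" -eq 5 ] && [ "$sp" -ge 5 ]; } ;;
        7300) [ "$tl" -ge 1 ] ;;
        *)    return 1 ;;  # not covered by the stated requirement
    esac
}

# Typical use on an AIX system:
#   pgo_level_ok "$(oslevel -s)" || echo "AIX level is too old for PGO"
```

Taking the level string as an argument, rather than calling oslevel inside the function, keeps the check usable on a build host that is not the target system.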

PGO is supported in IBM Open XL C/C++ for AIX 17.1.4. There are two ways to generate and use profile data; for more information, refer to the "Profile Guided Optimization" section in the Clang documentation. PGO data files generated in IBM Open XL C/C++ for AIX 17.1.4 are incompatible with the PDF files of IBM XL C/C++ for AIX 16.1.0 or earlier releases.

The cleanpdf, showpdf, and mergepdf commands are replaced by the ibm-llvm-profdata utility in IBM Open XL C/C++ for AIX 17.1.4.

PGO for thread-local storage (TLS) variables

IBM Open XL C/C++ for AIX can instrument your program to record how it uses local-exec thread-local storage (TLS) variables. The instrumented program produces TLS profile files, which then guide the compiler and linker to generate short instruction sequences for the hottest local-exec TLS variables.

To enable PGO instrumentation and generate profile files for a program that contains local-exec TLS variables, specify the -fprofile-local-exec-tls option. To instruct the compiler to use the instrumentation data in the generated TLS profile files to optimize the program, specify the -flocal-exec-tls-profile-use option.

For the detailed usage of the -fprofile-local-exec-tls and -flocal-exec-tls-profile-use options, refer to -fprofile-local-exec-tls and -flocal-exec-tls-profile-use.

Example

$ cat x.c
#include <stdio.h>
#include <stdlib.h>

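int main( int argc, char *argv[] ) {
   /* Require one argument so that argv[1] is never NULL. */
   if (argc < 2) {
      fprintf( stderr, "usage: %s <number>\n", argv[0]);
      return 1;
   }
   long i = strtol(argv[1], NULL, 10);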
   if (i > 5)
      printf( "i is bigger than 5\n");
   else
      printf( "i is <= 5\n");
   return 0;
}

To enable PGO instrumentation and instruct the compiler to instrument the code that is being compiled, specify the -fprofile-generate[=<directory>] option. If a directory is specified, the raw profile file is stored in that directory. Otherwise, it is stored in the current directory. The raw profile file is named default_%m.profraw, where %m is replaced at run time with an identifier derived from the instrumented binary.

$ ibm-clang x.c -Ofast -fprofile-generate
$ ./a.out 43
i is bigger than 5
$ ls
a.out  default_15822678448124319226_0.profraw  x.c

After the raw profile file is generated, run the ibm-llvm-profdata utility on the raw profile file to make it consumable by the compiler. Note that this step is necessary even when there is only one raw profile, since the merge operation also changes the file format.

$ ibm-llvm-profdata merge -o default.profdata default_15822678448124319226_0.profraw
$ ls
a.out  default_15822678448124319226_0.profraw  default.profdata  x.c

To instruct the compiler to use the instrumentation data to optimize the program, specify the -fprofile-use[=<merge profile file path>] option, where merge profile file path is the location of the merged profile file. If merge profile file path is a directory or is omitted, the name of the merged profile file is assumed to be default.profdata.

$ ibm-clang x.c -Ofast -fprofile-use
$

Note: If the directories specified in the -fprofile-generate and -fprofile-use options are different and you need to map counters to static functions, also specify the -mllvm -static-func-full-module-prefix=false option.
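
The instrument, run, merge, and rebuild steps of this example can be collected into one script. The following is a minimal sketch under stated assumptions rather than a supported tool: the names run, pgo_build, CC, PROFDATA, DRY_RUN, pgo_train, and pgo_opt are hypothetical, and only the compiler and ibm-llvm-profdata options already shown in this topic are used, plus the standard -o output option. The sketch keeps the raw and merged profiles in the same directory.

```shell
# Hypothetical wrapper around the PGO steps shown in this example.
CC=${CC:-ibm-clang}
PROFDATA=${PROFDATA:-ibm-llvm-profdata}

# Print the command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: pgo_build <source.c> <profile-dir> <training-argument>
pgo_build() {
    src=$1
    dir=$2
    arg=$3
    # 1. Build an instrumented binary that stores raw profiles in $dir.
    run "$CC" "$src" -Ofast -fprofile-generate="$dir" -o pgo_train || return 1
    # 2. Run it on representative input to produce the raw profile file.
    run ./pgo_train "$arg" || return 1
    # 3. Merge the raw profiles; this also converts the file format.
    run "$PROFDATA" merge -o "$dir/default.profdata" "$dir"/*.profraw || return 1
    # 4. Rebuild and let the compiler read the merged profile.
    run "$CC" "$src" -Ofast -fprofile-use="$dir/default.profdata" -o pgo_opt
}

# Preview the commands without running them:
#   DRY_RUN=1; pgo_build x.c profiles 43
```

The dry-run switch makes it possible to review the exact command lines before a long training run, and the training argument should represent typical input, because the optimized build favors the paths that the profile marks as hot.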

Related information