Profile Guided Optimization (PGO)

Profile guided optimization (PGO), also known as profile-directed feedback (PDF), is a compiler optimization technique in computer programming that uses profiling to improve program runtime performance. 

PGO is supported in IBM® Open XL C/C++ for AIX® 17.1.0, and there are two ways to generate and use profile data. You can find more information on PGO at https://ibm.biz/clang-v13-manual#profile-guided-optimization. PGO data files generated in IBM Open XL C/C++ for AIX 17.1.0 are incompatible with the PDF files in IBM XL C/C++ for AIX 16.1.0 or earlier releases.

The cleanpdf, showpdf, and mergepdf commands are replaced by the ibm-llvm-profdata utility in IBM Open XL C/C++ for AIX 17.1.0.

Notes:
  • In IBM Open XL C/C++ for AIX 17.1.0.0, PGO must be used together with link-time optimization (LTO), which is enabled with the -flto option. For details about LTO, see the Link Time Optimization (LTO) topic.
  • Starting from IBM Open XL C/C++ for AIX 17.1.0.1, you do not need to use LTO with PGO.

Example:
$ cat x.c
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char *argv[] ) {
   long i = strtol(argv[1], NULL, 10);
   if (i > 5)
      printf( "i is bigger than 5\n");
   else
      printf( "i is <= 5\n");
   return 0;
}

To instruct the compiler to instrument the code that is being compiled for PGO, specify the -fprofile-generate[=<directory>] option. If a directory is specified, the raw profile file is stored in that directory. Otherwise, it is stored in the current directory. The raw profile file is named default_%m.profraw, where %m is replaced at run time by an identifier that is unique to the instrumented binary.

$ ibm-clang x.c -Ofast -flto -fprofile-generate
$ ./a.out 43
i is bigger than 5
$ ls
a.out  default_15822678448124319226_0.profraw  x.c
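
The directory form of the option keeps raw profiles out of the working directory. The following is a minimal sketch under stated assumptions, not text from the product documentation: the function name pgo_generate, the CC variable, and the names profiles and x_instr are hypothetical, and -o is the usual driver option for naming the output file.

```shell
# Hypothetical wrapper around the instrumentation build described above.
# Assumption: CC names the compiler driver and defaults to ibm-clang.
pgo_generate() {
    src=$1      # source file to instrument, for example x.c
    dir=$2      # directory that receives the raw profile files
    exe=$3      # name of the instrumented executable
    # -flto is required only on 17.1.0.0 (see the notes above).
    "${CC:-ibm-clang}" "$src" -Ofast -flto "-fprofile-generate=$dir" -o "$exe"
}

# Usage (needs the compiler on PATH):
#   pgo_generate x.c profiles x_instr && ./x_instr 43
```

After the instrumented program runs, the raw profile is expected in the profiles directory instead of the current directory.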

After the raw profile file is generated, run the ibm-llvm-profdata tool on the raw profile file to make it consumable by the compiler. Note that this step is necessary even when there is only one raw profile, since the merge operation also changes the file format.

$ ibm-llvm-profdata merge -o default.profdata default_15822678448124319226_0.profraw
$ ls
a.out  default_15822678448124319226_0.profraw  default.profdata  x.c
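
When the instrumented program is run several times, each raw profile can be passed to a single merge step. The sketch below assumes that ibm-llvm-profdata accepts several input files in one merge invocation, as the upstream llvm-profdata tool does; the function name pgo_merge and the PROFDATA variable are hypothetical.

```shell
# Hypothetical wrapper: merge every raw profile found in one directory.
# Assumption: PROFDATA names the merge tool and defaults to ibm-llvm-profdata.
pgo_merge() {
    dir=$1      # directory that holds the *.profraw files
    out=$2      # merged profile to create, for example default.profdata
    # Expand the glob into the positional parameters of this function.
    set -- "$dir"/*.profraw
    # If nothing matched, the pattern is left unexpanded and names no file.
    [ -e "$1" ] || { echo "no raw profiles in $dir" >&2; return 1; }
    "${PROFDATA:-ibm-llvm-profdata}" merge -o "$out" "$@"
}

# Usage (needs the tool on PATH):
#   pgo_merge . default.profdata
```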

To instruct the compiler to use the instrumentation data to optimize the program, specify the -fprofile-use[=<merge profile file path>] option, where merge profile file path is the file location of the merged profile file. If merge profile file path is a directory or omitted, the name of the merged profile file is assumed to be default.profdata.

$ ibm-clang x.c -Ofast -flto -fprofile-use
$
Note: Starting from IBM Open XL C/C++ for AIX 17.1.0.1, you do not need to specify the -flto option in the previous example.
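
The merged profile can also be named explicitly, which helps when it is not in the current directory. The following sketch is illustrative only: the function name pgo_use, the CC variable, and the output name x_opt are hypothetical. It checks that the merged profile exists before it invokes the compiler, and it treats a directory argument as <directory>/default.profdata, as described above.

```shell
# Hypothetical wrapper around the optimizing build described above.
# Assumption: CC names the compiler driver and defaults to ibm-clang.
pgo_use() {
    src=$1      # source file to optimize, for example x.c
    prof=$2     # merged profile file, or a directory that contains default.profdata
    exe=$3      # name of the optimized executable
    if [ -d "$prof" ]; then file=$prof/default.profdata; else file=$prof; fi
    [ -f "$file" ] || { echo "missing merged profile: $file" >&2; return 1; }
    # -flto is required only on 17.1.0.0 (see the notes above).
    "${CC:-ibm-clang}" "$src" -Ofast -flto "-fprofile-use=$prof" -o "$exe"
}

# Usage (needs the compiler on PATH):
#   pgo_use x.c default.profdata x_opt
```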

By default, the -bcdtors:all option is passed to the linker. Object files that are compiled with -fprofile-generate contain constructor functions that are generated by the compiler. As a result, the linker picks up any archive member built with -fprofile-generate that defines constructor or destructor functions, even if the member is not otherwise referenced. This might cause duplicate object files from archives to be included in the final executable. To resolve this issue, specify the -bcdtors:mbr option together with -fprofile-generate on the link step. With the -bcdtors:mbr option, the linker does not pick up an archive member if it is not referenced or if it contains constructor functions with no reachable code.
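
An instrumented link step with this option might look like the following sketch. The function name pgo_link_instrumented, the CC variable, and the file names main.o, libutil.a, and app are hypothetical. Passing -bcdtors:mbr directly to the compiler driver is an assumption based on the wording above.

```shell
# Hypothetical wrapper for linking instrumented objects and archives.
# Assumption: CC names the compiler driver and defaults to ibm-clang.
pgo_link_instrumented() {
    exe=$1      # name of the instrumented executable
    shift       # remaining arguments: object files and archives to link
    # On 17.1.0.0, include -flto among the arguments as well.
    "${CC:-ibm-clang}" "$@" -fprofile-generate -bcdtors:mbr -o "$exe"
}

# Usage (needs the compiler on PATH):
#   pgo_link_instrumented app main.o libutil.a
```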