Profile Guided Optimization (PGO)
Profile guided optimization (PGO), also known as profile-directed feedback (PDF), is a compiler optimization technique in computer programming that uses profiling to improve program runtime performance.
- IBM® AIX® 7.2: TL5 SP3 or later
- IBM AIX 7.3: TL0 or later
However, to use PGO, your operating system must be IBM AIX 7.2 TL5 SP5 or later, or IBM AIX 7.3 TL1 or later. On earlier levels, errors such as the following can occur:
- A segmentation fault when using the ibm-llvm-profdata utility:
PLEASE submit a bug report to https://ibm.biz/openxlcpp-support and include the crash backtrace.
Stack dump:
0. Program arguments: /opt/IBM/openxlC/17.1.3/bin/ibm-llvm-profdata "ibm-llvm-profdata merge" -o default.profdata default_15853201381331839107_0.profraw
Location 0x0000e944
--- End of call chain ---
Segmentation fault(coredump)
- Undefined symbols when linking:
ld: 0711-317 ERROR: Undefined symbol: __start___llvm_prf_cnts
ld: 0711-317 ERROR: Undefined symbol: __stop___llvm_prf_cnts
ld: 0711-317 ERROR: Undefined symbol: __start___llvm_prf_data
ld: 0711-317 ERROR: Undefined symbol: __stop___llvm_prf_data
ld: 0711-317 ERROR: Undefined symbol: __stop___llvm_prf_names
ld: 0711-317 ERROR: Undefined symbol: __start___llvm_prf_names
ld: 0711-317 ERROR: Undefined symbol: __stop___llvm_prf_vnds
ld: 0711-317 ERROR: Undefined symbol: __start___llvm_prf_vnds
- Linker errors:
ld: 0711-151 SEVERE ERROR: SETOPT: Invalid option name: NAMEDSECTS:ss
PGO is supported in IBM Open XL C/C++ for AIX 17.1.4. There are two ways to generate and use profile data. For more information on PGO, refer to the "Profile Guided Optimization" section in Clang documentation. PGO data files generated in IBM Open XL C/C++ for AIX 17.1.4 are incompatible with the PDF files of IBM XL C/C++ for AIX 16.1.0 or earlier releases.
The cleanpdf, showpdf, and mergepdf commands are replaced by the ibm-llvm-profdata utility in IBM Open XL C/C++ for AIX 17.1.4.
PGO optimization for Thread Local Store variables
IBM Open XL C/C++ for AIX can instrument your program considering the usage of local-exec Thread Local Store (TLS) variables. The instrumented program can produce TLS profile files, which are then used to guide the compiler and linker to generate short instruction sequences for the hottest local-exec TLS variables.
To enable PGO instrumentation and generate profile files for a program that contains local-exec TLS variables, specify the -fprofile-local-exec-tls option. To instruct the compiler to use the instrumentation data in the generated TLS profile files to optimize the program, specify the -flocal-exec-tls-profile-use option.
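As a rough illustration of the kind of code this feature targets, the sketch below keeps a per-thread counter in a TLS variable and updates it inside a loop, so accesses to the variable are frequent. The names tls_hits and count_even are hypothetical, and the sketch assumes that the variable ends up using the local-exec model, which depends on how the program is compiled and linked. The two options are given to the compiler and do not require changes to the source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread counter. Assumption: it is defined in the main
   executable and accessed with the local-exec TLS model. */
static _Thread_local unsigned long tls_hits;

/* Counts the even values in an array. The TLS variable is written on every
   matching iteration, which makes it a frequently accessed TLS variable. */
unsigned long count_even(const int *values, size_t n) {
    assert(values != NULL || n == 0);
    tls_hits = 0;
    for (size_t i = 0; i < n; ++i) {
        if (values[i] % 2 == 0)
            ++tls_hits;
    }
    return tls_hits;
}
```

Calling count_even from the threads of an instrumented build during a training run would exercise tls_hits in the way the TLS profile is meant to capture.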
For detailed usage of these options, refer to -fprofile-local-exec-tls and -flocal-exec-tls-profile-use.
Example
$ cat x.c
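#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    /* Guard against a missing argument; argv[1] is NULL when none is given. */
    if (argc < 2) {
        fprintf(stderr, "usage: %s <number>\n", argv[0]);
        return 1;
    }
    long i = strtol(argv[1], NULL, 10);
    if (i > 5)
        printf("i is bigger than 5\n");
    else
        printf("i is <= 5\n");
    return 0;
}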
To enable PGO instrumentation, compile the program with the -fprofile-generate[=<directory>] option. If a directory is specified, the raw profile file is stored in that directory; otherwise, it is stored in the current directory. The raw profile file is named default_%m.profraw, where %m is replaced by an identifier unique to the instrumented binary, as in default_15822678448124319226_0.profraw in the run below.
$ ibm-clang x.c -Ofast -fprofile-generate
$ ./a.out 43
i is bigger than 5
$ ls
a.out default_15822678448124319226_0.profraw x.c
After the raw profile file is generated, run the ibm-llvm-profdata utility on the raw profile file to make it consumable by the compiler. Note that this step is necessary even when there is only one raw profile, since the merge operation also changes the file format.
$ ibm-llvm-profdata merge -o default.profdata default_15822678448124319226_0.profraw
$ ls
a.out default_15822678448124319226_0.profraw default.profdata x.c
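To have the compiler use the merged profile data, recompile the program with the -fprofile-use option:
$ ibm-clang x.c -Ofast -fprofile-use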
$
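The training run above took the i > 5 branch, and branch behavior of this kind is what the profile records. As a hand-written analogue only (it is not what the compiler emits), the sketch below marks the same branch as likely with __builtin_expect, a GCC and Clang builtin that is assumed here to be available in ibm-clang. The function name describe is hypothetical. With PGO, the compiler derives this kind of hint from measured counts instead of from annotations in the source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper that mirrors the branch in x.c. */
const char *describe(const char *arg) {
    assert(arg != NULL);
    long i = strtol(arg, NULL, 10);
    /* Manual hint: the condition is expected to be true most of the time. */
    if (__builtin_expect(i > 5, 1))
        return "i is bigger than 5";
    return "i is <= 5";
}
```

A hint written this way stays fixed in the source, whereas a profile can be regenerated whenever the typical input changes.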