In a couple of previous posts (TOC Overflow: what is it, and why should you care? and Dealing with TOC overflow: the traditional approach) I presented the issue of TOC overflow. Now I will discuss some features of the XL compilers that can help bypass TOC overflow while minimizing any negative effects on runtime performance.
1. Minimal TOC: The option -qminimaltoc makes the compiler generate code that uses a single entry in the TOC for each compilation unit (in C/C++ a compilation unit is a source file). To make this work, the code must follow an extra level of indirection to access TOC-based variables. This means that the program will be larger and slower than if it did not have TOC overflow, but it will still be faster than using the -bbigtoc option. This is similar to the -mminimal-toc option in gcc.
Furthermore, -qminimaltoc does not need to be used on all compilation units, so you can minimize the performance impact by using this flag only on compilation units that are not relevant for performance.
2. IPA: IPA is short for inter-procedural analysis, a form of compiler optimization that looks at the whole program, not just a single compilation unit. For this, the optimizer is invoked during the linking phase of your application, to perform transformations that can affect multiple compilation units.
Applying this process significantly reduces TOC pressure, and in most cases completely eliminates TOC overflow. It does so by restructuring your program to reduce the number of global symbols. The result is similar to what could be achieved by editing the source, but without the widespread manual changes.
In the XL compilers, IPA is implied at optimization levels -O4 and -O5, but those also include other complex optimizations which may not be as relevant to commercial application development. One good alternative is the option -qipa=level=0, which applies a minimal level of whole-program optimization. This is often sufficient to eliminate TOC overflow, but in very large applications you may need -qipa=level=1 instead, which will perform a more aggressive reduction of the TOC requirements, at the cost of a longer compilation process.
Note that for whole-program analysis to be performed, the -qipa option needs to be specified on both the compile and link command lines. This means that the linking of the program has to be done through the compiler driver (xlc, xlC or cc) instead of directly through the system linker (ld). For maximum effect, all source files should be compiled with -qipa, but it is possible to mix and match objects compiled with different options and have them interoperate.
If you try these options please add comments to this post describing your results.