IBM Support

Using Optimization Switches on IBM SDK for Linux on Power

Technical Blog Post


Abstract

Using Optimization Switches on IBM SDK for Linux on Power

Body

      Overview

      When programming, optimization flags are often welcome because they help produce better-performing applications.

      The IBM Software Development Kit (SDK) for Linux on Power IDE is a rich programming tool that provides several compiler optimization flags to help you produce better code and better-performing applications.
      It can also suggest and describe the flags available to you, which lets you weigh the pros and cons of each optimization.
    
      Keep in mind that the Build Advisor plugin was introduced in version 1.8. This post gives an overview of the Build Advisor plugin and the switch and flag options it offers for compiling your code:
    

image

       Location of the Build Advisor plugin tab and some available advice for the code on the top window.

 

       The Build Advisor plugin switches and flags:
        Some optimizations are enabled only when an -O<level> option (e.g. -O3, -Ofast) is set on the command line when triggering the compilation. Without such an option, they are disabled by default.
        Keep in mind that turning on optimization flags makes the compiler attempt to improve performance and/or code size at the expense of compilation time (and possibly the ability to debug the program).
        
        The Build Advisor plugin starts by checking your code in search of what it considers the optimum combination of flags for performance.
        

        To start with, it verifies the optimization level (-Os, -O1, -O2 or -O3):

  • -O3

The highest optimization level. It enables a considerable number of switches and flags focused on code and performance optimization. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-loop-vectorize, -ftree-loop-distribute-patterns, -ftree-slp-vectorize, -fvect-cost-model, -ftree-partial-pre and -fipa-cp-clone options. For more details on -O2 and -O1, please refer to [1] in the References section below.

 
        Then it also checks the compiler:
 
            Advance-toolchain

If you are using the distro GCC, chances are high that you are not running the most recent, best-optimizing compiler, so you should consider getting the IBM Advance Toolchain compiler. The AT switch advises: "Consider using IBM Advance Toolchain compiler."

 

Build Advisor checks for some POWER specific flags [2]:

Note: refer to the Advance Toolchain documentation for further information [3]

  • -mcpu=cpu_type

Sets the architecture you are targeting. In this context, cpu_type can be 'power7', 'power8', 'powerpc64' or 'powerpc64le'.

  • -mtune=cpu_type

Sets the instruction scheduling parameters for machine type cpu_type, but does not set the architecture type, register usage, or choice of mnemonics, as -mcpu would.

  • -mcmodel=medium

This is the suggested default and the one you should use. This switch generates PowerPC64 code for the medium model: the Table of Contents (TOC) and other static data may be up to a total of 4G in size.

 

From the GCC documentation [2]: 

"Options of the form -fflag specify machine-independent flags. Most flags have both positive and negative forms; the negative form of -ffoo is -fno-foo."

  • -ftree-vectorize

-ftree-vectorize performs vectorization on trees and is composed of two other switches: -ftree-loop-vectorize and -ftree-slp-vectorize. As both of those flags are enabled by -O3, -ftree-vectorize is therefore enabled by -O3 as well.

 

Optimize Even More

There is a set of flags that can optimize your code even more, but the price can be increased compilation time, larger code size, or non-compliance with some default standards.

image

       Enable special flags from the project Preferences > Build Advisor option, and then check the flags you want advice on.

 

When the Build Advisor plugin does not detect the maximum optimization flag set, it will suggest that you use it. You will see a message like "Consider updating lower level optimization (-O) with -O3."

The flags are:

  • -Ofast

Disregards strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs. It turns on -ffast-math and the Fortran-specific -fno-protect-parens and -fstack-arrays. WARNING: Be aware that it can produce math results that are not IEEE 754 compliant.

 

Enhancing loops when possible

  • -fpeel-loops

Peels loops. In some cases, this can completely remove loops with a small constant number of iterations.

  • -funroll-loops

Unrolls loops, replicating the body of the loop N times to reduce loop overhead and improve scheduling opportunities.

  • Feedback-Directed Optimization (FDO)

Feedback-directed optimization (FDO) is a technique that alters a program's execution based on tendencies observed in its present or past runs. 

FDO requires two compilations: use -fprofile-generate for the first and -fprofile-use for the second. Build Advisor will tell you which flags to use when you check the "extra advice" FDO box in the Build Advisor settings window:

  • -fprofile-generate

Using -fprofile-generate, the compiler inserts instrumentation code in the program that collects statistics at runtime about execution frequencies over the different code paths.

  • -fprofile-use

Using -fprofile-use, the user program is compiled a second time. The profile information saved when the instrumented program ran is used to guide the optimizers.

  • -flto

The Link Time Optimizer (LTO) switch essentially links all object files of your application together and then reads and instantiates their functions as if they were part of a single translation unit. To further understand how -flto works, please refer to the GCC Optimize Options documentation [1].

 

Conclusion

To produce code with optimal performance, a programmer can rely on the SDK for guidance. In scenarios where compilation time is not an expensive variable in the software development process, the set of flags and switches suggested by the Build Advisor plugin can make a real difference in the final product.

 

References     

       [1] https://gcc.gnu.org/onlinedocs/gcc-4.9.3/gcc/Optimize-Options.html

       [2] https://gcc.gnu.org/onlinedocs/gcc-4.9.3/gcc/RS_002f6000-and-PowerPC-Options.html

       [3] ibm.co/AdvanceToolchain

