-qtune
Category
Pragma equivalent
None.
Purpose
Tunes instruction selection, scheduling, and other architecture-dependent performance enhancements to run best on a specific hardware architecture. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.
Syntax
                 .-balanced-.
>>- -q--tune--=--+-auto-----+--+-----------------+-------------><
                 +-ppc970---+  |    .-st-------. |
                 +-pwr4-----+  '-:--+-balanced-+-'
                 +-pwr5-----+       +-smt2-----+
                 +-pwr6-----+       +-smt4-----+
                 +-pwr7-----+       '-smt8-----'
                 '-pwr8-----'
Defaults
-qtune=balanced:balanced when no valid -qarch setting is in effect. Otherwise, the default depends on the effective -qarch setting. For details, see Table 1.
Parameters for CPU suboptions
The following CPU suboptions allow you to specify a particular architecture for the compiler to target for best performance:
- auto: Optimizations are tuned for the platform on which the application is compiled.
- balanced: Optimizations are tuned across a selected range of recent hardware.
- ppc970: Optimizations are tuned for the PowerPC® 970 processor.
- pwr4: Optimizations are tuned for the POWER4 hardware platforms.
- pwr5: Optimizations are tuned for the POWER5 hardware platforms.
- pwr6: Optimizations are tuned for the POWER6® hardware platforms.
- pwr7: Optimizations are tuned for the POWER7® or POWER7+™ hardware platforms.
- pwr8: Optimizations are tuned for the POWER8® hardware platforms.
Parameters for SMT suboptions
The following simultaneous multithreading (SMT) suboptions allow you to optionally specify an execution mode for the compiler to target for best performance.
- balanced: Optimizations are tuned for performance across various SMT modes for a selected range of recent hardware.
- st: Optimizations are tuned for single-threaded execution.
- smt2: Optimizations are tuned for SMT2 execution mode (two threads).
- smt4: Optimizations are tuned for SMT4 execution mode (four threads).
- smt8: Optimizations are tuned for SMT8 execution mode (eight threads).
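The option value has the shape cpu[:smt], with the CPU and SMT suboptions drawn from the two lists above. As a reading aid only, the following Python sketch splits such a value and checks each part against those lists. The names parse_qtune, CPU_SUBOPTIONS, and SMT_SUBOPTIONS are hypothetical; the sketch does not model how the compiler itself parses options, and it returns None for an omitted SMT suboption rather than assuming a default.

```python
# Suboption names copied from the lists in this section.
CPU_SUBOPTIONS = {"auto", "balanced", "ppc970", "pwr4", "pwr5", "pwr6", "pwr7", "pwr8"}
SMT_SUBOPTIONS = {"balanced", "st", "smt2", "smt4", "smt8"}


def parse_qtune(option):
    """Split '-qtune=cpu[:smt]' into (cpu, smt); smt is None when omitted."""
    prefix = "-qtune="
    if not option.startswith(prefix):
        raise ValueError(f"not a -qtune option: {option!r}")
    # partition() yields an empty separator when no ':' is present.
    cpu, sep, smt = option[len(prefix):].partition(":")
    if cpu not in CPU_SUBOPTIONS:
        raise ValueError(f"unknown CPU suboption: {cpu!r}")
    if sep and smt not in SMT_SUBOPTIONS:
        raise ValueError(f"unknown SMT suboption: {smt!r}")
    return cpu, (smt if sep else None)
```

For instance, parse_qtune("-qtune=pwr8:smt4") yields the pair ("pwr8", "smt4"), while a misspelled suboption raises ValueError. Whether a given pair is accepted for a particular -qarch setting is a separate question, answered by Table 1.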
Usage
To produce a program that runs on more than one architecture but is tuned for a particular one, combine the -qarch and -qtune options: -qarch sets the range of architectures on which the code must run, and -qtune selects the architecture to tune for. These options are primarily of benefit for floating-point intensive programs.
By arranging (scheduling) the generated machine instructions to take maximum advantage of hardware features such as cache size and pipelining, -qtune can improve performance. It only has an effect when used in combination with options that enable optimization.
A particular SMT suboption is valid if the effective -qarch option supports the specified SMT mode. The acceptable combinations of the -qarch and SMT tune options are listed in Table 1. The compiler ignores any invalid -qarch/-qtune SMT combination.
Although changing the -qtune setting may affect the performance of the resulting executable, it has no effect on whether the executable can be executed correctly on a particular hardware platform.
Table 1. Acceptable -qarch/-qtune combinations

| -qarch option | Default -qtune setting | Available -qtune CPU settings | Available -qtune SMT settings |
|---|---|---|---|
| ppc | balanced:balanced | auto, pwr4, pwr5, pwr6, pwr7, pwr8, ppc970, balanced | balanced, st |
| ppcgr | balanced:balanced | auto, pwr4, pwr5, pwr6, pwr7, pwr8, ppc970, balanced | balanced, st |
| ppc64 | balanced:balanced | auto, pwr4, pwr5, pwr6, pwr7, pwr8, ppc970, balanced | balanced, st |
| ppc64gr | balanced:balanced | auto, pwr4, pwr5, pwr6, pwr7, pwr8, ppc970, balanced | balanced, st |
| ppc64grsq | balanced:balanced | auto, pwr4, pwr5, pwr6, pwr7, pwr8, ppc970, balanced | balanced, st |
| ppc64v | balanced:balanced | auto, ppc970, pwr6, pwr7, pwr8, balanced | balanced, st |
| ppc970 | ppc970:st | auto, ppc970, balanced | balanced, st |
| pwr4 | pwr4:st | auto, pwr4, pwr5, pwr6, pwr7, pwr8, ppc970, balanced | balanced, st |
| pwr5 | pwr5:st | auto, pwr5, pwr6, pwr7, pwr8, balanced | balanced, st |
| pwr5x | pwr5:st | auto, pwr5, pwr6, pwr7, pwr8, balanced | balanced, st, smt2 |
| pwr6 | pwr6:st | auto, pwr6, pwr7, pwr8, balanced | balanced, st, smt2 |
| pwr6e | pwr6:st | auto, pwr6, balanced | balanced, st |
| pwr7 | pwr7:st | auto, pwr7, pwr8, balanced | balanced, st, smt2, smt4 |
| pwr8 | pwr8:st | auto, pwr8, balanced | balanced, st, smt2, smt4, smt8 |
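The table can also be read as a lookup keyed by the -qarch setting. The following Python sketch transcribes Table 1 into a dictionary and answers two questions: what the default -qtune setting is for a given -qarch value, and whether a CPU/SMT pair is listed for it. The names TABLE_1, default_tune, and is_listed are hypothetical, and the sketch deliberately does not model what the compiler falls back to when it ignores an invalid combination, since this section does not say.

```python
# CPU settings shared by several rows of Table 1.
_ALL = ("auto", "pwr4", "pwr5", "pwr6", "pwr7", "pwr8", "ppc970", "balanced")
_PWR5_UP = ("auto", "pwr5", "pwr6", "pwr7", "pwr8", "balanced")

# arch: (default -qtune setting, CPU settings, SMT settings), per Table 1.
TABLE_1 = {
    "ppc":       ("balanced:balanced", _ALL, ("balanced", "st")),
    "ppcgr":     ("balanced:balanced", _ALL, ("balanced", "st")),
    "ppc64":     ("balanced:balanced", _ALL, ("balanced", "st")),
    "ppc64gr":   ("balanced:balanced", _ALL, ("balanced", "st")),
    "ppc64grsq": ("balanced:balanced", _ALL, ("balanced", "st")),
    "ppc64v":    ("balanced:balanced",
                  ("auto", "ppc970", "pwr6", "pwr7", "pwr8", "balanced"),
                  ("balanced", "st")),
    "ppc970":    ("ppc970:st", ("auto", "ppc970", "balanced"), ("balanced", "st")),
    "pwr4":      ("pwr4:st", _ALL, ("balanced", "st")),
    "pwr5":      ("pwr5:st", _PWR5_UP, ("balanced", "st")),
    "pwr5x":     ("pwr5:st", _PWR5_UP, ("balanced", "st", "smt2")),
    "pwr6":      ("pwr6:st", ("auto", "pwr6", "pwr7", "pwr8", "balanced"),
                  ("balanced", "st", "smt2")),
    "pwr6e":     ("pwr6:st", ("auto", "pwr6", "balanced"), ("balanced", "st")),
    "pwr7":      ("pwr7:st", ("auto", "pwr7", "pwr8", "balanced"),
                  ("balanced", "st", "smt2", "smt4")),
    "pwr8":      ("pwr8:st", ("auto", "pwr8", "balanced"),
                  ("balanced", "st", "smt2", "smt4", "smt8")),
}


def default_tune(arch):
    """Return the default -qtune setting listed for a -qarch value."""
    return TABLE_1[arch][0]


def is_listed(arch, cpu, smt=None):
    """Return True if the CPU (and SMT, when given) setting is listed for arch."""
    _, cpus, smts = TABLE_1[arch]
    return cpu in cpus and (smt is None or smt in smts)
```

With this data, is_listed("pwr7", "pwr8", "smt4") is true, whereas is_listed("pwr7", "pwr8", "smt8") is false because smt8 appears only in the pwr8 row.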
Predefined macros
None.
Examples
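To tune the executable testing, compiled from myprogram.c, for POWER7 hardware:
xlc -o testing myprogram.c -qtune=pwr7
To tune the same program for POWER8 hardware running in SMT4 mode:
xlc -o testing myprogram.c -qtune=pwr8:smt4
Because -qtune takes effect only in combination with options that enable optimization, add such an option to these commands in practice.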



