With SLES9 gcc-3.3 and RHEL4 gcc-3.4, new tuning options became
available to Linux users on the mainframe. gcc-4.1 introduces
the additional argument 'z9-109' for both the '-mtune' and
'-march' parameters. The first distributions that support
taking specific advantage of System z9-109 models
are Novell/SUSE SLES10 and Red Hat Enterprise Linux AS 5.
'-march=g5 | g6 | z900 | z990 | z9-109'
generates code optimized for the particular CPU, using
the instruction set of the CPU. Code generated for one CPU
type will not necessarily run on a different CPU type. The
'-march' parameter is upward compatible: code compiled with
'-march=z900' will run on a System z9, but code compiled with
'-march=z9-109' is not guaranteed to run on an eServer
zSeries 900. For eServer zSeries 800 use the 'z900' argument;
for eServer zSeries 890 use the 'z990' argument.
'-mtune=g5 | g6 | z900 | z990 | z9-109'
generates code scheduled for the particular CPU, using only
the instructions permitted by the '-march' setting. Code tuned
for one CPU type will still run on a different CPU type, but
may perform worse there. For eServer zSeries 800
use the 'z900' argument; for eServer zSeries 890 use the
'z990' argument.
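The model-to-argument mapping described above can be sketched as a small shell helper. This is only an illustration: the function name gcc_cpu_arg, the file name app.c and the '-O2' level are assumptions, and the script prints the compile command instead of running it.

```shell
#!/bin/sh
# gcc_cpu_arg (hypothetical helper): map a machine model to the
# -march/-mtune argument named in the text.
gcc_cpu_arg() {
    case "$1" in
        z800|z900) echo z900 ;;     # zSeries 800 uses the z900 argument
        z890|z990) echo z990 ;;     # zSeries 890 uses the z990 argument
        z9-109)    echo z9-109 ;;
        g5|g6)     echo "$1" ;;
        *) echo "no mapping for: $1" >&2; return 1 ;;
    esac
}

# Print (not run) a compile command for a zSeries 890.
# app.c is a placeholder source file.
cpu=$(gcc_cpu_arg z890)
echo "gcc -O2 -march=$cpu -mtune=$cpu -o app app.c"
```

Remove the 'echo' in the last line to run the compile on a Linux on System z machine.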
In most cases '-march' or '-mtune' improves application
performance when the code is compiled for the machine it
actually runs on. The arguments 'z990' and 'z9-109' exploit
the superscalar capabilities of these machines: two execution
units per CPU can run up to two instructions and a branch in
parallel, provided the instructions are suitably arranged.
The improvement varies; it depends on the generated machine
code and on whether the CPU is the bottleneck.
Our experiments show performance improvements of up to 20 percent.
Use '-march' if the generated
optimized code is to be executed on only that single target
machine type. If more than one target machine type is involved,
use the argument for the oldest model (the '-march'
parameter is upward compatible).
Use '-mtune' if the generated code is to be optimized
for a specific target machine type
but must still run on other machine types, too.
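The two rules above can be combined: take '-march' from the oldest machine type the binary must run on, and '-mtune' from the machine type it should run fastest on. A minimal sketch, assuming the ordering g5, g6, z900, z990, z9-109 from oldest to newest; the helper names oldest_cpu and build_flags are made up for this example.

```shell
#!/bin/sh
# Known -march/-mtune arguments from the text, ordered oldest to newest.
CPU_ORDER="g5 g6 z900 z990 z9-109"

# oldest_cpu (hypothetical helper): print the oldest of the given CPU types.
oldest_cpu() {
    for known in $CPU_ORDER; do
        for wanted in "$@"; do
            if [ "$known" = "$wanted" ]; then
                echo "$known"
                return 0
            fi
        done
    done
    echo "unknown CPU type(s): $*" >&2
    return 1
}

# build_flags (hypothetical helper): the first argument is the machine
# type to tune for; all arguments together are the machine types the
# binary must run on.
build_flags() {
    march=$(oldest_cpu "$@") || return 1
    echo "-march=$march -mtune=$1"
}

# Tune for z9-109, but stay runnable on z990 and z900 as well.
build_flags z9-109 z990 z900
```

For the call shown, the script prints '-march=z900 -mtune=z9-109', which can be passed to gcc as is.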
For SLES10 gcc-4.1 the defaults are '-mtune=z9-109'
and '-march=z900'.
In other 64-bit environments the defaults are '-mtune=z900'
and '-march=z900'.
In a 31-bit environment the defaults are '-mtune=g5'
and '-march=g5'.
Use the highest optimization level with the option
'-O3'.
The option '-funroll-loops' might help to
speed up your application.
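Because the gain from '-O3' and '-funroll-loops' depends on the application, it is worth building several variants and timing each one. The sketch below only prints the compile commands. The '-O2' baseline, the file name app.c, the base flags and the helper name build_variants are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical project file and base flags; adjust both to your build.
SRC=app.c
BASE="-march=z900 -mtune=z9-109"

# build_variants (hypothetical helper): print one compile command per
# optimization setting so each resulting binary can be timed separately.
build_variants() {
    for opt in "-O2" "-O3" "-O3 -funroll-loops"; do
        # Turn the option string into a file-name suffix,
        # for example "O3-funroll-loops".
        tag=$(echo "$opt" | tr -d ' ' | sed 's/^-//')
        echo "gcc $opt $BASE -o app-$tag $SRC"
    done
}

build_variants
```

Pipe the output to 'sh' on the build machine to create the binaries, then compare their run times under a representative workload.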
For a detailed description of the -m options
defined for the S/390 (31-bit) and System z (64-bit) architectures
see the gcc page 'S/390 and zSeries Options' at