The PowerPC® floating-point hardware performs calculations in either IEEE
single-precision (equivalent to REAL(4) in Fortran programs) or IEEE
double-precision (equivalent to REAL(8) in Fortran programs).
Keep the following considerations in mind:
Double precision provides greater range (approximately 10**(-308) to 10**308)
and precision (about 15 decimal digits) than single precision (approximate
range 10**(-38) to 10**38, with about 7 decimal digits of precision).
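The range and precision figures above can be checked on a given system with the standard inquiry intrinsics. The following is a minimal sketch, assuming that kind values 4 and 8 select IEEE single and double precision (true for XL Fortran, but a compiler convention rather than a language guarantee). The program and variable names are illustrative only.

```fortran
program kind_limits
  implicit none
  real(4) :: s   ! assumed to be IEEE single precision
  real(8) :: d   ! assumed to be IEEE double precision

  ! Inquiry intrinsics depend only on the kind of their argument,
  ! so s and d do not need to be assigned a value first.
  print *, 'REAL(4): largest value      =', huge(s)
  print *, 'REAL(4): smallest normal    =', tiny(s)
  print *, 'REAL(4): decimal digits     =', precision(s)
  print *, 'REAL(4): decimal exponent   =', range(s)

  print *, 'REAL(8): largest value      =', huge(d)
  print *, 'REAL(8): smallest normal    =', tiny(d)
  print *, 'REAL(8): decimal digits     =', precision(d)
  print *, 'REAL(8): decimal exponent   =', range(d)
end program kind_limits
```

PRECISION reports the number of decimal digits that are guaranteed to survive a round trip, so for IEEE single precision it returns 6 rather than the "about 7" quoted above.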
Computations that mix single and double operands are performed in double
precision, which requires conversion of the single-precision operands to double precision.
These conversions do not affect performance.
Converting double-precision values to single precision (for example,
when you specify the SNGL intrinsic or store the result of a double-precision
computation into a single-precision variable) requires a rounding operation.
The rounding operation produces the correctly rounded single-precision value
for the IEEE rounding mode in effect. Because of rounding error, that value may
be less precise than the original double-precision value. Conversions
from double precision to single precision may reduce the performance
of your code.
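The two conversion directions described above can be sketched as follows. This is an illustration only, again assuming that kinds 4 and 8 are IEEE single and double precision. The variable names are hypothetical.

```fortran
program mixed_precision
  implicit none
  real(4) :: s, back
  real(8) :: d, mixed

  s = 0.1_4          ! nearest single-precision value to 0.1
  d = 0.1_8          ! nearest double-precision value to 0.1

  ! Mixed expression: s is widened to double (an exact conversion),
  ! then the addition is carried out in double precision.
  mixed = s + d

  ! Narrowing: REAL(d, 4) rounds the same way as SNGL(d) or as a plain
  ! assignment of d to a single-precision variable.
  back = real(d, 4)

  print *, 'single widened to double :', real(s, 8)
  print *, 'double value             :', d
  print *, 'mixed sum (double)       :', mixed
  print *, 'rounding error of narrowing:', real(back, 8) - d
end program mixed_precision
```

The last line prints the difference between the rounded single-precision value and the original double-precision value, which is the rounding error introduced by the narrowing conversion.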
Programs that manipulate large amounts of floating-point data may run
faster if they use REAL(4) rather than REAL(8) variables, because the smaller
data size reduces memory traffic, which can be a performance bottleneck for some
applications. (You need to ensure that REAL(4) variables provide you with
acceptable range and precision.)
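The difference in data size can be made concrete with the STORAGE_SIZE intrinsic. This sketch assumes a compiler that supports this Fortran 2008 intrinsic. The element count n is a hypothetical value chosen for illustration.

```fortran
program storage_compare
  implicit none
  integer, parameter :: n = 1000000   ! hypothetical number of elements
  real(4) :: s4
  real(8) :: s8
  integer :: bytes4, bytes8

  ! STORAGE_SIZE returns the size of one element in bits.
  bytes4 = storage_size(s4) / 8
  bytes8 = storage_size(s8) / 8

  print *, 'bytes per REAL(4) element:', bytes4
  print *, 'bytes per REAL(8) element:', bytes8
  print *, 'bytes for n REAL(4) elements:', n * bytes4
  print *, 'bytes for n REAL(8) elements:', n * bytes8
end program storage_compare
```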
The floating-point hardware also provides a special set of
double-precision operations that multiply two numbers and add a third number
to the product. These combined multiply-add (MAF) operations are
performed at the same speed as an individual multiply or add.
The MAF operations provide an extension to the IEEE
standard because they perform the multiply and the add with one rounding
error rather than two. The MAF operations are therefore faster and more
accurate than the equivalent separate operations.
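The effect of the single rounding can be observed with a small experiment. The sketch below is an illustration under stated assumptions: the language does not control whether the compiler turns `a * b - p` into one multiply-add operation, so the result depends on the compiler, its options, and the target hardware. The names are hypothetical, and the VOLATILE attribute is used only to discourage compile-time folding.

```fortran
program multiply_add_demo
  implicit none
  real(8), volatile :: a, b
  real(8) :: p, resid

  a = 1.0d0 + 2.0d0**(-30)
  b = 1.0d0 - 2.0d0**(-30)

  ! The exact product is 1 - 2**(-60), which is not representable in
  ! double precision and rounds to 1.0.
  p = a * b

  ! If this expression is evaluated as one multiply-add, the product is
  ! not rounded before the subtraction, so resid is the rounding error
  ! of p. If the multiply and subtract are separate, resid is zero.
  resid = a * b - p

  print *, 'p     =', p
  print *, 'resid =', resid
end program multiply_add_demo
```

A nonzero residual indicates that a combined multiply-add was used. A zero residual indicates two separately rounded operations.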
Extended-precision values
XL Fortran extended precision does not use the format that the IEEE standard
suggests. The standard suggests extended formats that use more bits in both the
exponent (for greater range) and the fraction (for greater precision).
XL Fortran extended precision, equivalent to REAL(16) in Fortran programs,
is implemented in software. Extended precision provides the same range as
double precision (about 10**(-308) to 10**308) but more precision (a variable
amount, about 31 decimal digits or more). The software support is restricted
to round-to-nearest mode. Programs that use extended precision must ensure
that this rounding mode is in effect when extended-precision calculations
are performed. See Selecting the rounding mode for the different ways you can control
the rounding mode.
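One portable way to make sure that round-to-nearest is in effect is the standard IEEE_ARITHMETIC module. This is a minimal sketch, assuming the compiler provides that module. The section referenced above may describe other, compiler-specific methods.

```fortran
program ensure_round_nearest
  use, intrinsic :: ieee_arithmetic
  implicit none
  type(ieee_round_type) :: saved_mode

  ! Remember the rounding mode that is currently in effect.
  call ieee_get_rounding_mode(saved_mode)

  ! Switch to round-to-nearest if another mode is active.
  if (saved_mode /= ieee_nearest) then
    call ieee_set_rounding_mode(ieee_nearest)
  end if

  ! ... extended-precision (REAL(16)) calculations would go here ...

  ! Restore the caller's rounding mode afterwards.
  call ieee_set_rounding_mode(saved_mode)
end program ensure_round_nearest
```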
Programs that specify extended-precision values as hexadecimal, octal,
binary, or Hollerith constants must follow these conventions:
Extended-precision numbers are composed of two double-precision numbers
with different magnitudes that do not overlap. That is, the binary exponents
differ by at least the number of fraction bits in a REAL(8).
The high-order double-precision value (the one that comes first in storage)
must have the larger magnitude. The value of the extended-precision number
is the sum of the two double-precision values.
For a value of NaN or infinity, you must encode one of these values within
the high-order double-precision value. The low-order value is not significant.
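The conventions above can be illustrated by splitting a sum into a high-order and a low-order double-precision part. The sketch below uses the classic two-sum technique. It shows what a valid pair looks like and is not a description of how XL Fortran implements REAL(16) internally. All names are hypothetical.

```fortran
program two_sum_demo
  implicit none
  real(8) :: a, b, hi, lo, bv

  a = 1.0d0
  b = 2.0d0**(-60)

  ! hi is the double-precision sum, rounded to nearest.
  hi = a + b

  ! lo recovers the part of the exact sum that was lost in rounding.
  ! The parentheses fix the order of evaluation.
  bv = hi - a
  lo = (a - (hi - bv)) + (b - bv)

  ! The pair (hi, lo) represents the exact value a + b:
  ! hi has the larger magnitude, and the two parts do not overlap.
  print *, 'hi =', hi
  print *, 'lo =', lo
  print *, 'exponent difference  =', exponent(hi) - exponent(lo)
  print *, 'binary digits in REAL(8) =', digits(hi)
end program two_sum_demo
```

Here the exact sum 1 + 2**(-60) cannot be held in one double-precision value, but it is represented exactly by the pair, with the high-order part first.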
Because an XL Fortran extended-precision value can be the sum of two values
with greatly different exponents, leaving a number of assumed zeros in the
fraction, the format actually has a variable precision with a minimum of about
31 decimal digits. You get more precision in cases where the exponents of
the two double values differ by more than the number of fraction bits
in a double-precision value. This encoding allows an efficient implementation
intended for applications that require more precision, but no more range, than
double precision.
Notes:
Wherever rounding errors caused by compile-time folding of expressions
are discussed, keep in mind that this folding produces different results
for extended-precision values more often than for other precisions.
Special numbers, such as NaN and infinity, are not fully supported for
extended-precision values. Arithmetic operations do not necessarily propagate
these numbers in extended precision.
XL Fortran does not always detect floating-point exception conditions
(see Detecting and trapping floating-point exceptions) for extended-precision values. If you turn on
floating-point exception trapping in programs that use extended precision,
XL Fortran may also generate signals even though no exception condition
has actually occurred.