Floating-point formats

XL C/C++ supports the following binary floating-point formats:
  • 32-bit single precision, with an approximate absolute normalized range of 0 and 10^-38 to 10^38 and with a precision of about 7 decimal digits
  • 64-bit double precision, with an approximate absolute normalized range of 0 and 10^-308 to 10^308 and with a precision of about 16 decimal digits
  • 128-bit extended precision, with slightly greater range than double-precision values, and with a precision of about 32 decimal digits

The 128-bit extended precision format of XL C/C++ differs from the binary128 format that the IEEE standard suggests. The IEEE standard suggests that extended formats use more bits in the exponent, for greater range, and more bits in the fraction, for higher precision.

Special numbers, such as NaN, infinity, and negative zero, might not be representable as 128-bit extended precision values. Arithmetic operations do not necessarily propagate these numbers in extended precision.


