Approximate numeric data types

Use approximate numeric data types to store floating-point values.

The system stores numbers by using 6 or 15 digits of precision. The system stores the location of the decimal point separately so that it can represent large values. Approximate numeric types can be an option for space considerations because real and double-precision numeric data types are stored compactly, but they can produce imprecise rounding during computations.

Use caution when you are using approximate numerics. Do not use approximate numerics for distribution columns, join columns, or columns that require mathematical operations such as SUM and AVG.

Floating point data types also have inherent performance implications. For example, the system cannot run a fast hash join on a floating point data type, but instead must run a slower sort and merge join.

The following table describes the approximate numeric data types.

Table 1. Approximate numeric data types
Type	Value	Disk usage
float(p)	Floating point number with precision p, from 1 to 15. Precision less than 6 is equivalent to 6. Precision 7 - 15 is equivalent to 15.	Precision of 6 or less (4 bytes) Precision 7 - 15 (8 bytes)
real	Equivalent to float(6).	4 bytes
double precision	14 digits of precision are stored.	8 bytes

IBM® Netezza® SQL prefers type names real and double precision, with float(p) being closer to an alias for either of the preferred forms real or double precision.

Not only is floating point summation approximate, but more importantly, it is non-associative; that is, the result depends on the order in which the partial sums are combined. This result is different from integer and numeric summations that are precise and always produce the same result irrespective of any reordering.

In the massively parallel Netezza, sums and averages are partially evaluated on the SPUs and then combined at the host to produce the final result. Because SPUs return results to the host asynchronously, floating point summations might produce different results for different runs. This effect is noticeable if the values span a large dynamic range or if there are large values of differing signs that tend to cancel each other out.