num_quantiles - Number of quantiles for columns configuration parameter

This parameter controls the number of quantiles that will be collected when the WITH DISTRIBUTION option is specified on the RUNSTATS command.

Configuration type
Parameter type
  • Configurable online
  • Configurable by member in a Db2® pureScale® environment
Propagation class
Default [range]
20 [0 - 32 767]
Unit of measure

Increasing the value of this parameter increases the amount of statistics heap (stat_heap_sz) used when collecting statistics.

The quantile statistics help the optimizer understand the distribution of data values within a column. A higher value results in more information being available to the query optimizer but requires additional catalog space. When 0 or 1 is specified, no quantile statistics are retained, even if you request that distribution statistics be collected.

You can also specify the number of quantiles collected as part of the RUNSTATS command at the table or the column level, by using the NUM_QUANTILES command parameter. If none is specified, the num_quantiles configuration parameter value is used. Changing the number of quantiles that will be collected through the RUNSTATS command is easier than making the change using the num_quantiles database configuration parameter.

Updating this parameter can help obtain better selectivity estimates for range predicates over data that is non-uniformly distributed. Among other optimizer decisions, this information has a strong influence on whether an index scan or a table scan will be chosen. (It is more efficient to use a table scan to access a range of values that occur frequently and it is more efficient to use an index scan for a range of values that occur infrequently.)

After changing the value of this parameter, you need to:
  • Run the RUNSTATS command again to collect statistics with the changed number of frequent values
  • Rebind any packages containing static SQL or XQuery statements.

When using RUNSTATS, you have the ability to limit the number of quantiles collected at both the table level and the column level. This allows you to optimize on space occupied in the catalogs by reducing the distribution statistics for columns where they could not be exploited and yet still using the information for critical columns.

Recommendation: The default value for this parameter provides reasonably accurate estimates in most cases. You can consider increasing the value if you observe significant and consistent differences between:
  • Selectivity estimates from the explain output; and
  • Actual selectivity of range predicates over non-uniformly distributed column data.
A reasonable practical value for this parameter lies in the range of 10 to 50.