num_quantiles - Number of quantiles for columns configuration parameter
This parameter controls the number of quantiles that will be collected when the WITH DISTRIBUTION option is specified on the RUNSTATS command.
- Configuration type
- Database
- Parameter type
- Configurable online
- Configurable by member in a Db2® pureScale® environment
- Propagation class
- Immediate
- Default [range]
- 20 [0 - 32 767]
- Unit of measure
- Counter
Increasing the value of this parameter increases the amount of statistics heap (stat_heap_sz) used when collecting statistics.
The quantile
statistics
help the optimizer understand the distribution of data values within
a column. A higher value results in more information being available
to the query optimizer but requires additional catalog space. When 0 or 1 is
specified, no quantile statistics are retained, even if you request
that distribution statistics be collected.
You can also specify the number of quantiles collected as part of the RUNSTATS command at the table or the column level, by using the NUM_QUANTILES command parameter. If none is specified, the num_quantiles configuration parameter value is used. Changing the number of quantiles that will be collected through the RUNSTATS command is easier than making the change using the num_quantiles database configuration parameter.
Updating this parameter can help obtain better selectivity estimates for range predicates over data that is non-uniformly distributed. Among other optimizer decisions, this information has a strong influence on whether an index scan or a table scan will be chosen. (It is more efficient to use a table scan to access a range of values that occur frequently and it is more efficient to use an index scan for a range of values that occur infrequently.)
- Run the RUNSTATS command again to collect statistics with the changed number of frequent values
- Rebind any packages containing static SQL or XQuery statements.
When using RUNSTATS, you have the ability to limit the number of quantiles collected at both the table level and the column level. This allows you to optimize on space occupied in the catalogs by reducing the distribution statistics for columns where they could not be exploited and yet still using the information for critical columns.
- Selectivity estimates from the explain output; and
- Actual selectivity of range predicates over non-uniformly distributed column data.