RUNSTATS on expression-based indexes

Issuing the RUNSTATS command on a table with expression-based indexes updates and stores the index statistics as usual. However, statistics for each expression-based key column within the index are also collected. The optimizer uses these statistics to choose query execution plans for queries that involve the expressions.

The statistics for the expression-based keys are collected and stored in a statistical view. This statistical view is automatically created when you create an expression-based index. Issuing the RUNSTATS command on a table and its expression-based index resembles issuing the RUNSTATS command on the table and then issuing it again on the statistical view. However, although you can issue the RUNSTATS command directly on the statistical view, you should issue the command on the table and its indexes instead. This results in the gathering and updating of the statistics that are stored in the statistical view and the statistics for the index and table. If you issue the RUNSTATS command on the statistical view, the statistical view statistics are updated, but the index and table statistics are untouched.

To issue the RUNSTATS command on an index with expression-based keys, explicitly include the index by name, or implicitly include the index by using the AND INDEXES ALL clause or the FOR INDEXES ALL clause. If you do not explicitly or implicitly include the index, the index and the statistical view statistics are not updated.

You cannot name expressions as columns for the RUNSTATS command. However, expressions are considered key columns. Therefore, to gather distribution statistics for expression-based columns, include the index in the RUNSTATS command, or use a statistics profile. In addition, specify that distribution statistics are to be gathered on all columns or on key columns. Statistical view statistics are updated for non-expression columns only if they are included in the RUNSTATS column specification. Statistical view statistics for the non-expression columns can be disregarded, as they are not referenced by the optimizer.

Expression-based key columns exist only in the index. They do not exist in the base table. Therefore, the INDEXSAMPLE clause, rather than the TABLESAMPLE clause, determines how the expression column data is sampled.

Examples

The following conditions apply to all the examples:
  • Table TBL1 is created with expression-based index IND1.
  • Associated statistical view IND1_V is automatically created.

Example 1

The RUNSTATS command is issued on table TBL1 in two ways:
  • No index is specified:
    runstats on table TBL1
  • An index that is not IND1 is specified:
    runstats on table TBL1 with distribution on key columns and index IND2

The results in both cases are the same: the index statistics for IND1 are not updated, and the statistical view statistics for IND1_V are not updated, even though the ON KEY COLUMNS parameter was specified in the second case. (Specifying the ALL COLUMNS parameter would not change the results, either.) To gather statistics on expression-based key columns in an expression-based index, you must explicitly or implicitly include that index in the RUNSTATS command.

Example 2

The RUNSTATS command is issued on table TBL1 in three ways:
  • Index IND1 is specified:
    runstats on table TBL1 and index IND1
    runstats on table TBL1 on columns (c1,c3) and indexes IND1,IND2
  • The ALL parameter is specified:
    runstats on table TBL1 on key columns and indexes all

In all of these cases, the index statistics for IND1 are updated. As well, the statistical view statistics for IND1_V are updated with basic column statistics for all expression columns. These results apply even though the ON COLUMNS AND clause was specified.