Expression-based indexes and statistics profiles

The RUNSTATS command's statistics profile facility can automatically gather customized statistics for a particular table, including a table with an expression-based index. This facility simplifies statistics collection by storing the RUNSTATS command's options for convenient future use.

The RUNSTATS command provides the option to register and use a statistics profile, which specifies the type of statistics that are to be collected for a particular table. To register a profile and collect statistics at the same time, issue the RUNSTATS command with the SET PROFILE parameter. To register a profile only, issue the RUNSTATS command with the SET PROFILE ONLY parameter. To collect statistics with a profile that was already registered, issue the RUNSTATS command with the USE PROFILE parameter.

A statistics profile is useful if you want to gather nonstandard statistics for particular columns. However, an expression cannot be named as a column for a RUNSTATS command. Therefore, to gather nonstandard statistics, you would issue the RUNSTATS command on the base table and the statistical view. This strategy raises the problem of keeping the table and statistical view statistics in sync. Instead, issuing the RUNSTATS command uses a table's statistics profile and, if that profile includes an index with expression-based keys in its specifications, then any statistics profile on the statistical view that is associated with that index is also applied. So, all statistics are gathered with a just one RUNSTATS operation on the table because all relevant profiles are applied.

If the table does not have a statistics profile that is associated with it (or you do not specify the USE PROFILE parameter), then no profile on the associated statistical views is applied while the RUNSTATS facility is running. This behavior applies only to the statistical views associated with indexes that includes expression-based keys and not statistical views in general.

Creating a statistics profile on a statistical view that is associated with an index, which includes expression-based keys, automatically creates a statistics profile on the base table (but only if one does not exist).

If a statistics profile is associated with the base table and a statistics profile is associated with the statistical view, the profiles are used as follows while the RUNSTATS command is running on a table with an expression-based index:
  • The table's profile controls the statistics for the non-expression columns and the RUNSTATS command overall.
  • The statistical view's statistics profile controls the statistics for the expression columns.

The auto_runstats database configuration parameter and real-time statistics (rts) feature apply the statistics profiles in the same manner as a manually issued RUNSTATS command. If the statistics profile for a statistical view exists and the underlying table has a statistical profile that includes an expression-based index, the auto_runstats database configuration parameter and the rts feature apply the statistics profile for the statistical view. Neither the auto_runstats database configuration parameter nor the rts feature act directly on the expression-based index's statistical view. The statistical view statistics are gathered when the auto_runstats database configuration parameter and the rts feature act on the table itself. In addition, the rts feature does not fabricate statistical view statistics for the expression-based index.

Examples

The following initial conditions apply to the examples:
  • Table TBL1 has no statistics profile.
  • Table TBL1 has an expression-based index IND1.
  • Index IND1 has an automatically generated statistical view IND1_V.
  • The NUM_FREQVALUES database configuration parameter is set to the default value of 10.

Example 1

The following command sets a statistics profile on the statistical view IND1_V, which gathers extra distribution statistics:

RUNSTATS ON VIEW IND1_V WITH DISTRIBUTION DEFAULT NUM_FREQVALUES 40 SET PROFILE ONLY

Because there is no statistics profile on the table, a statistics profile is generated when the command is issued in the following form:

RUNSTATS ON TABLE TBL1 WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL

Now, the following command is issued:

RUNSTATS ON TABLE TBL1 USE PROFILE

The statistics profile on the table and the statistics profile on the statistic view are applied. Statistics are gathered for the table and the expression columns in the index. The columns in the table have the usual amount of frequent value statistics and the columns in the statistical view have more frequent value statistics.

Example 2

The following command sets a statistics profile that samples a large table with the aim to shorten the execution time for the RUNSTATS command and to reduce the impact on the overall system (the profile is set, but the command to sample the table is not issued):
RUNSTATS ON TABLE TBL1 WITH DISTRIBUTION AND INDEXES ALL TABLESAMPLE SYSTEM (2.5) INDEXSAMPLE SYSTEM (10) SET PROFILE ONLY UTIL_IMPACT_PRIORITY 30

It is decided that distribution statistics are not needed on the expression-based keys in index IND1. However, LIKE statistics are required on the second key column in the index. According to the definition for statistical view IND1_V in the catalogs, the second column in the view is named K01.

The following command is issued to modify the statistics profile to gather LIKE statistics on column K01 from statistical view IND1_V (the profile is set, but the command to gather statistics is not issued):
RUNSTATS ON VIEW IND1_V ON ALL COLUMNS AND COLUMNS(K01 LIKE STATISTICS) SET PROFILE ONLY

Now the statistics profile is set and the following command is issued to gather statistics that are based on the statistics profile:

RUNSTATS ON TABLE TBL1 USE PROFILE
The statistics profile on the table and the statistics profile on the statistic view are applied. Statistics are gathered for the table and the expression-based columns. The results are as follows:
  • Distribution statistics are gathered for the base table columns but not for the expression-based key columns.
  • The LIKE statistics are gathered for the specified expression-based key column.
  • While the RUNSTATS command is running, the expression-based key column values are sampled at the rate dictated by the INDEXSAMPLE SYSTEM(10) parameter in the table's profile.
  • The table's UTIL_IMPACT_PRIORITY parameter setting governs the priority of the entire RUNSTATS command operation.