Avoiding problems with correlated columns

Two columns in a table are said to be correlated if the values in the columns do not vary independently.

About this task

Db2 might not determine the best access path when your queries include correlated columns.

Procedure

Begin program-specific programming interface information. To address problems with correlated columns:

  • For leading indexed columns, run the RUNSTATS utility. For all other column groups, run the RUNSTATS utility with the COLGROUP option
  • Run the RUNSTATS utility to collect column correlation information for any column group with the COLGROUP option.
  • Update the catalog statistics manually.
  • Use SQL that forces access through a particular index.

Results

The RUNSTATS utility collects the statistics that Db2 needs to make proper choices about queries. With RUNSTATS, you can collect statistics on the concatenated key columns of an index and the number of distinct values for those concatenated columns. This gives Db2 accurate information to calculate the filter factor for the query.

Example

For example, RUNSTATS collects statistics that benefit queries like this:

SELECT * FROM T1
 WHERE C1 = 'a' AND C2 = 'b' AND C3 = 'c' ;

Where:

  • The first three index keys are used (MATCHCOLS = 3).
  • An index exists on C1, C2, C3, C4, C5.
  • Some or all of the columns in the index are correlated in some way.End program-specific programming interface information.