How Db2 uses multiple filter factors to determine the cost of a query

When Db2 estimates the cost of a query, it determines the filter factor repeatedly and at various levels.

Begin program-specific programming interface information. For example, suppose that you execute the following query:

SELECT COLS FROM T1
  WHERE C1 = 'A'
  AND   C3 = 'B'
  AND   C4 = 'C';

Table T1 consists of columns C1, C2, C3, and C4. Index I1 is defined on table T1 and contains columns C1, C2, and C3.

Suppose that the simple predicates in the compound predicate have the following characteristics:

C1='A'
Matching predicate
C3='B'
Screening predicate
C4='C'
Stage 1, nonindexable predicate

To determine the cost of accessing table T1 through index I1, Db2 performs these steps:

  1. Estimates the matching index cost. Db2 determines the index matching filter factor by using single-column cardinality and single-column frequency statistics because only one column can be a matching column.
  2. Estimates the total index filtering. This includes matching and screening filtering. If statistics exist on column group (C1,C3), Db2 uses those statistics. Otherwise Db2 uses the available single-column statistics for each of these columns.

    Db2 also uses FULLKEYCARDF as a bound. Therefore, it can be critical to have column group statistics on column group (C1, C3) to get an accurate estimate.

  3. Estimates the table-level filtering. If statistics are available on column group (C1,C3,C4), Db2 uses them. Otherwise, Db2 uses statistics that exist on subsets of those columns.
Important: If you supply appropriate statistics at each level of filtering, Db2 is more likely to choose the most efficient access path.

You can use RUNSTATS to collect any of the needed statistics. End program-specific programming interface information.