How Db2 uses multiple filter factors to determine the cost of a query
When Db2 estimates the cost of a query, it determines the filter factor repeatedly and at various levels.
For example, suppose that
you execute the following query:
SELECT COLS FROM T1
WHERE C1 = 'A'
AND C3 = 'B'
AND C4 = 'C';
Table T1 consists of columns C1, C2, C3, and C4. Index I1 is defined on table T1 and contains columns C1, C2, and C3.
Suppose that the simple predicates in the compound predicate have the following characteristics:
- C1='A'
- Matching predicate
- C3='B'
- Screening predicate
- C4='C'
- Stage 1, nonindexable predicate
To determine the cost of accessing table T1 through index I1, Db2 performs these steps:
- Estimates the matching index cost. Db2 determines the index matching filter factor by using single-column cardinality and single-column frequency statistics because only one column can be a matching column.
- Estimates the total index filtering. This includes matching and
screening filtering. If statistics exist on column group (C1,C3), Db2 uses
those statistics. Otherwise Db2 uses
the available single-column statistics for each of these columns.
Db2 also uses FULLKEYCARDF as a bound. Therefore, it can be critical to have column group statistics on column group (C1, C3) to get an accurate estimate.
- Estimates the table-level filtering. If statistics are available on column group (C1,C3,C4), Db2 uses them. Otherwise, Db2 uses statistics that exist on subsets of those columns.
Important: If you supply appropriate statistics at each
level of filtering, Db2 is more
likely to choose the most efficient access path.
You can use RUNSTATS to collect any of the needed statistics.