Filter factor estimation for the XMLEXISTS predicate
Db2 uses available statistics for implicit and explicit XML objects to estimate the filter factors for XMLEXISTS predicates.
Db2 can use frequency and histogram statistics for XML indexes to estimate the filter factor of XMLEXISTS predicates when the following conditions are true:
- The predicate is one of the following types:
- Equality predicates: =
- Range predicates: >, >=, >=, <=
- The right side of the predicate is a literal value.
- Frequency or histogram statistics are collected for an XML index that matches the XMLEXISTS predicate.
Db2 applies a formula to the frequency or histogram statistics on the XML indexes to calculate the filter factors of XMLEXISTS predicates on the base table.
If the preceding conditions are not true, Db2 calculates the filter factor for the XPath predicates by applying a formula that uses the following values:
- The filter factor of the auxiliary table. The filter factors of all XPath predicates in the XMLEXISTS predicate indicate the filtering on the XML table. If the FULLKEYCARD value is available for the node ID index, it is used as the default CARDF value of the XML table.
- The FIRSKEYCARDF value for the implicitly created node ID index
- The CARDF value for the base table.
The following rules apply to these values:
- When the statistics are available for the node ID index, the FIRSTKEYCARDF value is the number of distinct DOCID values in the XML table.
- When the statistics are available for the XML index, the FIRSTKEYCARDF value is used as the COLCARD value of the comparison operands in the XPath predicates.
- When statistics are not available, the default filter factor, is the same as for non-XPath predicates that have the same comparison type.
- If the XML index can be used to evaluate the XPath predicate, the default filter factor is redefined based on the FIRSTKEYCARDF value. The filter factor is the same as non-XPath predicates with the same comparison type and the same COLCARD value.
- If frequency statistics are not available for range predicates, Db2 uses interpolation that is based on the HIGH2KEY and LOW2KEY values and the key value of the comparison.
- Because the frequency statistics are not available, for range predicates, Db2 uses interpolation that is based on HIGH2KEY and LOW2KEY values, and the key value of the comparison is used.
- When no statistics are available for the node ID index or the
XML index, the following default statistics are used to estimate the
filter factor:
- NLEAF = XML-table-CARDF / 300
- NLEVELS uses the default value for the node ID index and the XML
index for the XMLEXISTS predicate.
Because the index statistics are not available to help the default filter factor estimation, the predicate filter factor is set according to the predicate comparison type.