The Db2® cost-based optimizer uses an estimate of the number of rows processed by an access plan operator to accurately cost that operator. This cardinality estimate is the single most important input to the optimizer cost model, and its accuracy largely depends upon the statistics that the RUNSTATS command collects from the database.
- Comparisons involving expressions. For example,
price > MSRP + Dealer_markup.
- Relationships spanning multiple tables. For example,
product.name = 'Alloy wheels' and product.key = sales.product_key.
- Any relationships other than predicates involving independent attributes and simple comparison operations.
When a query is compiled, the optimizer matches the query to the available statistical views. When the optimizer computes cardinality estimates for intermediate result sets, it uses the statistics from the view to compute a better estimate.
Queries do not need to reference the statistical view directly in order for the optimizer to use the statistical view. The optimizer uses the same matching mechanism that is used for materialized query tables (MQTs) to match queries to statistical views. In this respect, statistical views are similar to MQTs, except that they are not stored permanently, do not consume disk space, and do not have to be maintained.
Once the view is enabled for optimization, it is identified as a statistical view in the SYSCAT.TABLES catalog view with a 'Y' in position 13 of the PROPERTY column.
CREATE VIEW SV_TIME_FACT AS ( SELECT T.* FROM TIME T, SALES S WHERE T.TIME_KEY = S.TIME_KEY) ALTER VIEW SV_TIME_FACT ENABLE QUERY OPTIMIZATION RUNSTATS ON TABLE DB2DBA.SV_TIME_FACT WITH DISTRIBUTION
Without a statistical view, the optimizer assumes that all fact table TIME_KEY values corresponding to a particular TIME dimension YEAR_MON value occur uniformly within the fact table. However, sales might have been strong in December, resulting in many more sales transactions than during other months.
SELECT SUM(S.PRICE) FROM SALES S, TIME T, PRODUCT P WHERE T.TIME_KEY = S.TIME_KEY AND T.YEAR_MON = 200712 AND P.PROD_KEY = S.PROD_KEY AND P.PROD_DESC = ‘Power drill'
Statistics that are gathered on queries that have complex expressions in the predicate can be used by the optimizer to calculate accurate cardinality estimates which results in better access plans.
For many star join queries many statistical views might need to be created, however, if you have referential integrity constraints you can narrow down these many statistical views. The statistics for the other views can be inferred from the reduced number of statistical views by using referential integrity constraints.
Another way to obtain better access plans is to apply column group statistics on statistical views. These group statistics help to adjust filter factors which help to gather more accurate statistics which the optimizer can use to obtain accurate cardinality estimates.
can also be gathered automatically from statistical views through
the automatic statistics collection feature in Db2.
This new feature can be enabled or disabled by using the auto_stats_views database
configuration parameter. This database configuration parameter is
off by default and can be enabled by using the UPDATE DB
CFG command. The statistics collected by the automatic statistics
collection is equivalent to issuing the command
view view_name with distribution.
throttling can be used for statistical views to restrict the performance
impact on the workload. For example, the command
on view view_name util_impact_priority 10 contains
the impact on the workload within a limit specified by the util_impact_lim database
manager configuration parameter while statistics are collected on
A statistical view cannot directly or indirectly reference a catalog table.