Collecting accurate catalog statistics, including advanced statistics features

Accurate database statistics are critical for query optimization. Perform RUNSTATS command operations regularly on any tables that are critical to query performance.

You might also want to collect statistics on system catalog tables, if an application queries these tables directly and if there is significant catalog update activity, such as that resulting from the execution of data definition language (DDL) statements. Automatic statistics collection can be enabled to allow the Db2® data server to automatically perform a RUNSTATS command operation. Real-time statistics collection can be enabled to allow the Db2 data server to provide even more timely statistics by collecting them immediately before queries are optimized.

If you are collecting statistics manually by using the RUNSTATS command, use the following options at a minimum:
Distribution statistics make the optimizer aware of data skew. Detailed index statistics provide more details about the I/O required to fetch data pages when the table is accessed by using a particular index. Collecting detailed index statistics uses considerable processing time and memory for large tables. The SAMPLED option provides detailed index statistics with nearly the same accuracy but requires a fraction of the CPU and memory. These options are used by automatic statistics collection when a statistical profile is not provided for a table.

To improve query performance, consider collecting more advanced statistics, such as column group statistics or LIKE statistics, or creating statistical views.

Statistical views are helpful when gathering statistics for complex relationships. Gathering statistics for statistical views can be automated through the automatic statistics collection feature in Db2. Enabling or disabling the automatic statistic collection of statistical views is done by using the auto_stats_views database configuration parameter. To enable this function, issue the following command:
update db cfg for dbname using auto_stats_views on
To disable this feature, issue the following command:
update db cfg for dbname using auto_stats_views off
This database configuration parameter is off by default. The command that is issued to automatically collect statistics on statistical views is equivalent to the following command:
runstats on view view_name with distribution
Collecting statistics for a large table or statistical view can be time consuming. Statistics of the same quality can often be collected by considering just a small sample of the overall data. Consider enabling automatic sampling for all background statistic collections; this may reduce the statistic collection time. To enable this function, issue the following command:
update db cfg for dbname using auto_sampling on

Collected statistics are not always exact. In addition to providing more efficient data access, an index can help provide more accurate statistics for columns which are often used in query predicates. When statistics are collected for a table and its indexes, index objects can provide accurate statistics for the leading index columns.