Collecting accurate catalog statistics, including advanced statistics features
Accurate database statistics are critical for query optimization. Perform RUNSTATS command operations regularly on any tables that are critical to query performance.
You might also want to collect statistics on system catalog tables, if an application queries these tables directly and if there is significant catalog update activity, such as that resulting from the execution of data definition language (DDL) statements. Automatic statistics collection can be enabled to allow the Db2® data server to automatically perform a RUNSTATS command operation. Real-time statistics collection can be enabled to allow the Db2 data server to provide even more timely statistics by collecting them immediately before queries are optimized.
Distribution statistics make the optimizer aware of data skew. Detailed index statistics provide more details about the I/O required to fetch data pages when the table is accessed by using a particular index. Collecting detailed index statistics uses considerable processing time and memory for large tables. The SAMPLED option provides detailed index statistics with nearly the same accuracy but requires a fraction of the CPU and memory. These options are used by automatic statistics collection when a statistical profile is not provided for a table.
RUNSTATS ON TABLE DB2USER.DAILY_SALES WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL
To improve query performance, consider collecting more advanced statistics, such as column group statistics or LIKE statistics, or creating statistical views.
To disable this feature, issue the following command:
update db cfg for dbname using auto_stats_views on
update db cfg for dbname using auto_stats_views off
runstats on view view_name with distribution
update db cfg for dbname using auto_sampling on
Collected statistics are not always exact. In addition to providing more efficient data access, an index can help provide more accurate statistics for columns which are often used in query predicates. When statistics are collected for a table and its indexes, index objects can provide accurate statistics for the leading index columns.
Enabling automatic collection of column group statistics
update db cfg for dbname using auto_cg_stats on
The automatic collection of column group statistics will generate a profile describing the statistics that need to be collected. If a user profile does not exist, the background statistics collection will initially perform an automatic discovery of pair-wise column group statistics within the table and set a statistics profile. After the discovery is completed, statistics are gathered on the table using the existing statistics profile feature. The set of column groups discovered is preserved across subsequent discoveries.
If a statistics profile is already manually set, it will be used as is and the discovery is not performed. The automatically generated statistics profile can be used together with any PROFILE option of the RUNSTATS command. If the profile is updated using the UPDATE PROFILE option, any further discovery is blocked on the table, but the set of column group statistics already set in the profile will continue to be collected automatically as well as with a manual RUNSTATS that includes the USE PROFILE option.
The UNSET PROFILE command can be used to remove the statistics profile to restart the discovery process.
update db cfg for dbname using auto_cg_stats off
Disabling this feature will prevent any further discovery, but the statistic profiles will persist and will continue to be used. If there is a need to remove the profile, use the UNSET PROFILE option of RUNSTATS.