The DB2® optimizer
uses information and statistics in the DB2 catalog
to determine optimal access to the database based on the provided
query. Statistical information is collected for specific tables and
indexes in the local database when you run the RUNSTATS utility.
When significant numbers of table rows are added or removed, or
if data in columns for which you collect statistics is updated, execute RUNSTATS again
to update the statistics.
Use the RUNSTATS utility to collect statistics in the following
situations:
- When data was loaded into a table and the appropriate indexes
were created
- When you create a new index on a table. Execute RUNSTATS for
the new index only if the table was not modified since you last ran RUNSTATS on
it.
- When a table has been reorganized with the REORG utility
- When the table and its indexes have been extensively updated by
data modifications, deletions, and insertions. "Extensive" in
this case might mean that 10 to 20 percent of the table and index
data was affected.
- Before binding or rebinding, application programs whose performance
is critical
- When you want to compare current and previous statistics. If you
update statistics at regular intervals, you can discover performance
problems early.
- When the prefetch quantity is changed
- When you have used the REDISTRIBUTE DATABASE PARTITION GROUP utility
The
RUNSTATS command has several formats that
primarily determine the depth and breadth or statistics that are collected.
If you collect more statistics, the command takes more time to run.
The following options are included:
- Collecting either SAMPLED or DETAILED index statistics
- Collecting statistics on all columns or only columns used in JOIN
operations
- Collecting distribution statistics on all, key, or no columns.
Distribution statistics are useful when you have an uneven distribution
of data on key columns.
Take care when running RUNSTATS, because the
collected information impacts the selection of access paths by the
optimizer. Implement RUNSTATS as part of a regularly
scheduled maintenance plan if some of the conditions occur. To ensure
that the index statistics are synchronized with the table, execute RUNSTATS to
collect table and index statistics at the same time.
Consider some of the following factors when deciding what type
of statistics to collect:
- Collect statistics only for the columns that join tables or in
the WHERE, GROUP BY, and similar clauses of queries. If these columns
are indexed, you can specify the columns with the ONLY ON KEY COLUMNS
clause for the RUNSTATS command.
- Customize the values for num_freqvalues and num_quantiles for
specific tables and specific columns in tables.
- Collect detailed index statistics with the SAMPLE DETAILED clause
to reduce the amount of background calculation performed for detailed
index statistics. The SAMPLE DETAILED clause reduces the time required
to collect statistics and produces adequate precision in most cases.
- When you create an index for a populated table, add the COLLECT
STATISTICS clause to create statistics as the index is created.