Start of change

Collecting statistics by using DB2 utilities

You can run certain DB2® utilities to collect access path statistics for your database objects. Accurate statistics are an essential component of access path selection.

Before you begin

  • Consider identifying statistics to collect by using a query optimization tool such as IBM® Data Studio or IBM Data Server Manager.

About this task

You can use DB2 online utilities to collect statistics for database objects. The purpose of the RUNSTATS utility is to collect statistics for database objects. However, you can also collect inline statistics when you run certain other DB2 utilities.

Procedure

To collect statistics for database objects, use any of the following approaches:

  • Run the RUNSTATS utility. The RUNSTATS utility collects the most complete and accurate statistics.
  • Specify the STATISTICS keyword to specify the collection of inline statistics when you run one of the following utilities:
    • LOAD
    • REBUILD INDEX
    • REORG INDEX
    • REORG TABLESPACE
    You might be able to avoid the cost of running the RUNSTATS utility by collecting inline statistics.
    However, certain limitations apply to inline statistics. For example:
    • You cannot collect column group statistics with the STATISTICS keyword. You must run the RUNSTATS utility to collect column group statistics.
    • You cannot use statistics profiles with inline statistics.
    For details of the limitations, see the information for each utility.
    Important: Statistics that are collected with inline statistics are likely to differ from statistics that are collected by the RUNSTATS utility. Certain resources and values that RUNSTATS uses might be unavailable in the context of a utility that collects inline statistics. Estimations must be used in place of these missing values or uncertainties, and the resulting statistics might be less exact. Consequently, you might need to evaluate whether the inline statistics are suitable to support access path selection for your query workload.

    For example:

    • If the DISCARDDN option is specified when you collect inline statistics with the LOAD utility, the statistics are collected before the rows are discarded. If the number of discarded rows is large enough, the inaccuracy of the resulting statistics might be significant. As a general rule, if the number of discarded rows exceeds 20 percent of the total number of rows in the table, run the RUNSTATS separately, after running the LOAD utility, to collect accurate statistics.

What to do next

Consider taking the following actions to standardize and automate statistics collection:
End of change