Cleaning stale or unneeded Db2 catalog statistics

You can avoid possible access path problems from stale statistics and avoid collection of costly statistics that provide no benefit by removing stale or unneeded catalog statistics. The removal of unneeded statistics is particularly important in Db2 12 when you can use profile-based automation for statistics collection.

About this task

The RUNSTATS utility and inline statistics with other utilities replace existing statistics with new values only based on the keywords that you specify in the utility control statements. They do not remove any previously collected statistics that are not specified. For example, assume that you collect statistics one time with the NUMCOLS 2 COUNT 10 keywords. If you then omit those keywords the next time that you collect statistics, the statistics that were collected for those keywords remain in the catalog. Such values are cleaned up only if you take action to remove them.

Important: Cleanup of stale statistics becomes more important in Db2 12 or later, with the introduction of automated of statistics collection with profiles based on existing statistics. The same situation can also occur in Db2 11 with USE PROFILE FROM EXISTING STATS, but Db2 12 automates it. For more information, see Applying statistics recommendations to statistics profiles automatically.

If you previously collected some specialized statistics for a one-off tuning effort, Db2 can start automatically and routinely collecting those costly statistics again, even if they are not needed, and were previously collected only long ago.

For example, assume that for table T1 your SYSCOLDIST catalog table contains the old HISTOGRAM statistics for columns C3 and C5 and COLGROUP statistics for columns C2 and C4. Then you collect statistics with the following RUNSTATS job:
RUNSTATS TABLE(T1) INDEX(ALL)
Assume that Db2 recommends collecting FREQVAL statistics on column C1. The statistics profile that Db2 creates is likely to specify the following options:
TABLE(T1) INDEX(ALL)
COLGROUP(C1) FREQVAL COUNT 10
COLGROUP(C2, C4)
COLGROUP(C3) HISTOGRAM NUMQUANTILES 100
COLGROUP(C5) HISTOGRAM NUMQUANTILES 100

If you then collect statistics with USE PROFILE, Db2 begins to collect all of those statistics. After the first collection, all of the new entries have the same STATSTIME value, and you cannot tell which were not previously collected on a regular basis. The ongoing collection of such statistics increases the cost for collecting statistics.

Tip: Enhanced query tuning capabilities that can help you with this task are available in IBM® Db2 Query Workload Tuner for z/OS® and IBM Db2 Administration Foundation for z/OS.

Procedure

To identify and remove stale Db2 catalog statistics, complete the following steps:

  1. Run an SQL statement to identify old statistics.
    The following example statement identifies histogram, multi-column cardinality (COLGROUP), or multi-column frequency statistics that do not correspond to the most recent statistics collection for the table space or index.
    SELECT CD.TYPE, CD.NUMCOLUMNS, CD.TBOWNER,
    CD.TBNAME, CD.NAME, MIN(CD.STATSTIME) AS MINSTAT,
    COUNT(*) AS OCCURANCES
    FROM SYSIBM.SYSCOLDIST CD
    WHERE CD.STATSTIME < CURRENT TIMESTAMP - 1 MONTH
    AND (CD.TYPE IN ('C', 'H')
      OR CD.NUMCOLUMNS > 1
      OR CD.STATSTIME < CURRENT TIMESTAMP - 1 YEAR)AND NOT EXISTS(SELECT 1
    FROM SYSIBM.SYSINDEXES I
    WHERE I.TBCREATOR = CD.TBOWNER
      AND I.TBNAME = CD.TBNAME
      AND I.STATSTIME > '0001-01-01-00.00.00.000000'
      AND CD.STATSTIME BETWEEN I.STATSTIME - 8 DAYS
      AND I.STATSTIME + 8 DAYS)
      AND NOT EXISTS(SELECT 1
    FROM SYSIBM.SYSTABLES T
    WHERE T.CREATOR = CD.TBOWNER
      AND T.NAME = CD.TBNAME
      AND T.STATSTIME > '0001-01-01-00.00.00.000000'
      AND CD.STATSTIME BETWEEN T.STATSTIME - 8 DAYS
      AND T.STATSTIME + 8 DAYS)
    GROUP BY TYPE, NUMCOLUMNS, TBOWNER, TBNAME, NAME
    ORDER BY TYPE, NUMCOLUMNS, TBOWNER, TBNAME, NAME
    WITH UR;
  2. Take one of the following actions to remove the identified statistics:
    • Use DELETE statements with appropriate WHERE clauses to remove the unneeded SYSCOLDIST records. You might want to first insert the records that you plan to delete into a separate table, in case an access path problem occurs after the DELETE, and you need to restore them.
    • Use statistics profiles to remove the unneeded statistics without impacting concurrently running dynamic SQL:
      1. Update existing statistics profiles to remove any keywords related to unneeded statistics, or if no profile exists, create a new one that collect only your standard RUNSTATS options.
      2. Run the RUNSTATS TABLESPACE, REORG TABLESPACE, or LOAD utility with USE PROFILE. The utility collects new statistics, and it also deletes any existing statistics that are not included in the profile.
    • Remove all access path statistics for the identified objects, as described in Resetting access path statistics. Then collect statistics for the objects again, with your standard RUNSTATS options.