DB2 Version 10.1 for Linux, UNIX, and Windows

RUNSTATS command using the ADMIN_CMD procedure

Updates statistics in the system catalog about the characteristics of a table and/or associated indexes, or statistical views. These characteristics include number of records, number of pages, and average record length. The optimizer uses these statistics when determining access paths to the data.

For a table, call the RUNSTATS command when the table has had many updates, or after reorganizing the table. For a statistical view, call the RUNSTATS command when changes to underlying tables have substantially affected the rows returned by the view. The view must have been previously enabled for use in query optimization by using the ALTER VIEW statement.

Scope

The RUNSTATS command can be issued from any database partition in the db2nodes.cfg file. It can be used to update the catalogs on the catalog database partition.

For tables, this command collects statistics for a table on the database partition from which it is invoked. If the table does not exist on that database partition, the first database partition in the database partition group is selected.

For views, this command collects statistics using data from tables on all participating database partitions.

Authorization

For tables, one of the following authorities:
  • SYSADM
  • SYSCTRL
  • SYSMAINT
  • DBADM
  • SQLADM
  • CONTROL privilege on the table
  • LOAD authority

You do not need any explicit privilege to use this command on any declared temporary table that exists within its connection.

For statistical views, one of the following authorities:
  • SYSADM
  • SYSCTRL
  • SYSMAINT
  • DBADM
  • SQLADM
  • CONTROL privilege on the statistical view

Required connection

Database

Command syntax

Read syntax diagramSkip visual syntax diagram
>>-RUNSTATS ON--+-TABLE-+--object-name-------------------------->
                '-VIEW--'                

>--+------------------------+----------------------------------->
   +-USE PROFILE------------+   
   +-UNSET PROFILE----------+   
   '-| Statistics Options |-'   

>--+------------------------------------+----------------------><
   '-UTIL_IMPACT_PRIORITY--+----------+-'   
                           '-priority-'     

Statistics Options

                                 .-ALLOW WRITE ACCESS-.   
|--+--------------------------+--+--------------------+--------->
   '-| Table Object Options |-'  '-ALLOW READ ACCESS--'   

>--+----------------------------+------------------------------->
   '-| Table Sampling Options |-'   

>--+----------------------------+--+---------------------+------|
   '-| Index Sampling Options |-'  '-| Profile Options |-'   

Table Object Options

|--+-FOR--| Index Clause |--+-----------------------+----------------------------------+--|
   |                        '-EXCLUDING XML COLUMNS-'                                  |   
   '-+-------------------------+--+-----------------------+--+-----------------------+-'   
     '-| Column Stats Clause |-'  '-EXCLUDING XML COLUMNS-'  '-AND--| Index Clause |-'     

Table Sampling Options

|--TABLESAMPLE--+-BERNOULLI-+--(--numeric-literal--)------------>
                '-SYSTEM----'                          

>--+-----------------------------------+------------------------|
   '-REPEATABLE--(--integer-literal--)-'   

Index Sampling Options

|--INDEXSAMPLE--+-BERNOULLI-+--(--numeric-literal--)------------|
                '-SYSTEM----'                          

Profile Options

   .-SET PROFILE NONE--------------.   
|--+-------------------------------+----------------------------|
   '-+-SET----+--PROFILE--+------+-'   
     '-UPDATE-'           '-ONLY-'     

Index Clause

|--+-------------------------+--+-INDEXES-+--------------------->
   | .-SAMPLED---.           |  '-INDEX---'   
   '-+-----------+--DETAILED-'                
     '-UNSAMPLED-'                            

     .-,----------.     
     V            |     
>--+---index-name-+-+-------------------------------------------|
   '-ALL------------'   

Column Stats Clause

|--+-| On Cols Clause |------------------------------+----------|
   '-+--------------------+--| Distribution Clause |-'   
     '-| On Cols Clause |-'                              

On Cols Clause

   .-ON ALL COLUMNS-----------------------------------------------------.   
   |                                           .-,-----------------.    |   
   |                                           V                   |    |   
|--+-ON--+----------------------+--COLUMNS--(----| Column Option |-+--)-+--|
   |     '-+-ALL-+--COLUMNS AND-'                                       |   
   |       '-KEY-'                                                      |   
   '-ON KEY COLUMNS-----------------------------------------------------'   

Distribution Clause

|--WITH DISTRIBUTION--+-------------------------+--------------->
                      '-| On Dist Cols Clause |-'   

>--+--------------------------+---------------------------------|
   '-| Default Dist Options |-'   

On Dist Cols Clause

   .-ON ALL COLUMNS-----------------------------------------------------------------------------------.   
   |                                           .-,-----------------------------------------------.    |   
   |                                           |                    .--------------------------. |    |   
   |                                           V                    V                          | |    |   
|--+-ON--+----------------------+--COLUMNS--(----| Column Option |----+----------------------+-+-+--)-+--|
   |     '-+-ALL-+--COLUMNS AND-'                                     +-| Frequency Option |-+        |   
   |       '-KEY-'                                                    '-| Quantile Option |--'        |   
   '-ON KEY COLUMNS-----------------------------------------------------------------------------------'   

Default Dist Option

            .--------------------------.   
            V                          |   
|--DEFAULT----+-| Frequency Option |-+-+------------------------|
              '-| Quantile Option |--'     

Frequency Option

|--NUM_FREQVALUES--integer--------------------------------------|

Quantile Option

|--NUM_QUANTILES--integer---------------------------------------|

Column Option

|--+-column-name--+-----------------+-+-------------------------|
   |              '-LIKE STATISTICS-' |   
   |    .-,-----------.               |   
   |    V             |               |   
   '-(----column-name-+--)------------'   

Command parameters

object-name
Identifies the table or statistical view on which statistics are to be collected. This parameter must not be a hierarchy table. For typed tables, object-name must be the name of the root table of the table hierarchy. The fully qualified name or alias in the form: schema.object-name must be used. The schema is the user name under which the table was created.
USE PROFILE
This option allows RUNSTATS to employ a previously stored statistics profile to gather statistics for a table or statistical view. The statistics profile is created using the SET PROFILE options and is updated using the UPDATE PROFILE options.
UNSET PROFILE
Specify this option to remove an existing statistics profile. For example,
RUNSTATS ON tablemyschema.mytable UNSET PROFILE
FOR INDEXES
Collects and updates statistics for the indexes only. If no table statistics had been previously collected on the table, basic table statistics are also collected. These basic statistics do not include any distribution statistics. This option cannot be used for views.
SAMPLED
Used together only with the DETAILED parameter. Specifying this option does not change the default functionality from DETAILED. This option is left in for compatibility with previous versions of DB2®. This option cannot be used for views.
UNSAMPLED
This option, when used with the DETAILED option, forces RUNSTATS to examine every entry in the index to compute the extended index statistics. This option cannot be used for views and it cannot be used together with scan index sampling (INDEXSAMPLE keyword). This option significantly increases RUNSTATS resource consumption, while rarely providing significant improvement over the DETAILED or SAMPLED DETAILED options, which are equivalent.
DETAILED
Calculates extended index statistics. The extended index statistics are the CLUSTERFACTOR and PAGE_FETCH_PAIRS statistics that are gathered for relatively large indexes. Not all index entries are examined, a CPU sampling technique is employed instead to improve performance. This option cannot be used for views.
index-name
Identifies an existing index defined on the table. If you do not specify the fully qualified name in the form: schema.index-name, the default schema is assumed. This option cannot be used for views.
EXCLUDING XML COLUMNS
Use this clause to omit all XML type columns from statistics collection. Using this clause facilitates the collection of statistics on non-XML columns because the inclusion of XML data can require greater system resources. The EXCLUDING XML COLUMNS clause takes precedence over other clauses that specify XML columns for statistics collection. For example, if you use the EXCLUDING XML COLUMNS clause, and you also specify XML type columns with the ON COLUMNS clause or you use the ON ALL COLUMNS clause, all XML type columns will be ignored during statistics collection. For DB2 V9.7 Fix Pack 1 and later releases, distribution statistics over XML type columns are not collected when this clause is specified.
AND INDEXES
Collects and updates statistics for both the table and the indexes. This option cannot be used for views.
ON ALL COLUMNS
To collect statistics on all eligible columns, use the ON ALL COLUMNS clause. Columns can be specified either for basic statistics collection (On Cols clause) or in conjunction with the WITH DISTRIBUTION clause (On Dist Cols clause). The ON ALL COLUMNS specification is the default option if neither of the column specific clauses are specified.

If it is specified in the On Cols clause, all columns will have only basic column statistics collected unless specific columns are chosen as part of the WITH DISTRIBUTION clause. Those columns specified as part of the WITH DISTRIBUTION clause will also have basic and distribution statistics collected.

If the WITH DISTRIBUTION ON ALL COLUMNS is specified both basic statistics and distribution statistics are collected for all eligible columns. Anything specified in the On Cols clause is redundant and therefore not necessary.

ON COLUMNS
To collect statistics on specific columns, column groups, or both, use the ON COLUMNS. A column group is a parenthesized comma-separated list of columns for which you want to collect combined statistics.

The column and column groups are specified as a parenthesized comma-separated list.

When you run the RUNSTATS command on a table without gathering index statistics and specify a subset of columns for which statistics are to be gathered:
  • Statistics for columns not specified in the RUNSTATS command but which are the first column in an index are not reset.
  • Statistics for all other columns not specified in the RUNSTATS command are reset.

This clause can be used in the On Cols clause and the On Dist Cols clause. Collecting distribution statistics for a group of columns is not currently supported.

If XML type columns are specified in a column group, the XML type columns are ignored for collecting distinct values for the group. However, basic XML column statistics are collected for the XML type columns in the column group.

ON KEY COLUMNS
Instead of listing specific columns, you can choose to collect statistics on columns that make up all the indexes defined on the table. It is assumed here that critical columns in queries are also those used to create indexes on the table. If there are no indexes on the table, it is as good as an empty list and no column statistics will be collected. It can be used in the On Cols clause or the On Dist Cols clause. It is redundant in the On Cols clause if specified in both clauses since the WITH DISTRIBUTION clause is used to specify collection of both basic and distribution statistics. XML type columns are by definition not a key column and will not be included for statistics collection by the ON KEY COLUMNS clause. This option cannot be used for views.
column-name
Name of a column in the table or statistical view. If you specify the name of an ineligible column for statistics collection, such as a nonexistent column or a mistyped column name, error (-205) is returned. Two lists of columns can be specified, one without distribution and one with distribution. If the column is specified in the list that is not associated with the WITH DISTRIBUTION clause only basic column statistics will be collected. If the column appears in both lists, distribution statistics will be collected (unless NUM_FREQVALUES and NUM_QUANTILES are set to zero).
LIKE STATISTICS
When this option is specified additional column statistics might be collected for columns of type CHAR and VARCHAR with a code page attribute of single-byte character set (SBCS), FOR BIT DATA, or UTF-8.  The statistics are collected if the runstats utility determines that such statistics are appropriate after analyzing column values.  These statistics are the SUB_COUNT and the SUB_DELIM_LENGTH statistics in SYSSTAT.COLUMNS.  They are used by the query optimizer to improve the selectivity estimates for predicates of the type "column LIKE '%xyz'"and "column LIKE '%xyz%'".
WITH DISTRIBUTION
This clause specifies that both basic statistics and distribution statistics are to be collected on the columns. If the ON COLUMNS clause is not specified, distribution statistics are collected on all the columns of the table or statistical view (excluding columns that are ineligible such as CLOB and LONG VARCHAR). If the ON COLUMNS clause is specified, distribution statistics are collected only on the column list provided (excluding those ineligible for statistics collection). If the clause is not specified, only basic statistics are collected.

Collection of distribution statistics on column groups is currently not supported; distribution statistics will not be collected when column groups are specified in the WITH DISTRIBUTION ON COLUMNS clause.

DEFAULT
If NUM_FREQVALUES or NUM_QUANTILES are specified, these values will be used to determine the maximum number of frequency and quantile statistics to be collected for the columns, if these are not specified for individual columns in the ON COLUMNS clause. If the DEFAULT clause is not specified, the values used will be those in the corresponding database configuration parameters.
NUM_FREQVALUES
Defines the maximum number of frequency values to collect. It can be specified for an individual column in the ON COLUMNS clause. If the value is either not specified or is specified as '-1' for an individual column, the frequency limit value is picked up from that specified in the DEFAULT clause. If it is not specified there either, the maximum number of frequency values to be collected will be what is set in the num_freqvalues database configuration parameter.
NUM_QUANTILES
Defines the maximum number of distribution quantile values to collect. It can be specified for an individual column in the ON COLUMNS clause. If the value is either not specified or is specified as '-1' for an individual column, the quantile limit value is picked up from that specified in the DEFAULT clause. If it is not specified there either, the maximum number of quantile values to be collected will be what is set in the num_quantiles database configuration parameter.

For DB2 V9.7 Fix Pack 1 and later releases, distribution statistics for each index over XML data uses a maximum of 250 quantiles as the default. The default can be changed by specifying the NUM_QUANTILES parameter in the ON COLUMNS or the DEFAULT clause. The num_quantiles database configuration parameter is ignored while collecting XML distribution statistics.

ALLOW WRITE ACCESS
Specifies that other users can read from and write to the tables while statistics are calculated. For statistical views, these are the base tables referenced in the view definition.

The ALLOW WRITE ACCESS option is not recommended for tables that will have a lot of inserts, updates or deletes occurring concurrently. The RUNSTATS command first performs table statistics and then performs index statistics. Changes in the table's state between the time that the table and index statistics are collected might result in inconsistencies. Although having up-to-date statistics is important for the optimization of queries, it is also important to have consistent statistics. Therefore, statistics should be collected at a time when inserts, updates or deletes are at a minimum.

ALLOW READ ACCESS
Specifies that other users can have read-only access to the tables while statistics are calculated. For statistical views, these are the base tables referenced in the view definition.
TABLESAMPLE BERNOULLI
This option allows RUNSTATS to collect statistics on a sample of the rows from the table or statistical view. Bernoulli sampling considers each row individually, including that row with probability P/100 (where P is the value of numeric-literal) and excluding it with probability 1-P/100. Thus, if the numeric-literal were evaluated to be the value 10, representing a 10 percent sample, each row would be included with probability 0.1 and be excluded with probability 0.9. Unless the optional REPEATABLE clause is specified, each execution of RUNSTATS will usually yield a different such sample of the table. All data pages will be retrieved through a table scan but only the percentage of rows as specified through the numeric-literal parameter will be used for the statistics collection.
TABLESAMPLE SYSTEM
This option allows RUNSTATS to collect statistics on a sample of the data pages from the tables. System sampling considers each page individually, including that page with probability P/100 (where P is the value of numeric-literal) and excluding it with probability 1-P/100. Unless the optional REPEATABLE clause is specified, each execution of RUNSTATS will usually yield a different such sample of the table. The size of the sample is controlled by the numeric-literal parameter in parentheses, representing an approximate percentage P of the table to be returned. Only a percentage of the data pages as specified through the numeric-literal parameter will be retrieved and used for the statistics collection.

For statistical views, SYSTEM sampling can only be applied to a single base table referenced in the view definition. If the view contains multiple tables, SYSTEM sampling is possible if a single table among all the tables in the statistical view can be identified as being joined with all primary keys or unique index columns of the other tables used in the view. If the statistical view does not meet those conditions, Bernoulli sampling will be used instead and a warning will be returned.

numeric-literal
The numeric-literal parameter specifies the size of the sample to be obtained, as a percentage P. This value must be a positive number that is less than or equal to 100, and can be between 1 and 0. For example, a value of 0.01 represents one one-hundredth of a percent, such that 1 row in 10,000 would be sampled, on average. A value of 0 or 100 will be treated by the DB2 database system as if sampling was not specified, regardless of whether TABLESAMPLE BERNOULLI or TABLESAMPLE SYSTEM is specified. A value greater than 100 or less than 0 will be treated as an error (SQL1197N) by the DB2 database system.
REPEATABLE (integer-literal)
Adding the REPEATABLE clause to the TABLESAMPLE clause ensures that repeated executions of RUNSTATS return the same sample. The integer-literal parameter is a non-negative integer representing the seed to be used in sampling. Passing a negative seed will result in an error (SQL1197N). The sample set might still vary between repeatable RUNSTATS invocations if activity against the table or statistical view resulted in changes to the table or statistical view data since the last time TABLESAMPLE REPEATABLE was run. Also, the method by which the sample was obtained as specified by the BERNOULLI or SYSTEM keyword, must also be the same to ensure consistent results.
INDEXSAMPLE BERNOULLI
Use this option to collect index statistics on a sample of the rows in the index. Bernoulli sampling considers each row individually, including the row with probability P/100 (where P is the value of the numeric-literal) and excluding it with probability 1-P/100. Thus, if the numeric-literal were evaluated to be the value 10, representing a 10 percent sample, each row would be included with probability 0.1 and be excluded with probability 0.9. Each execution of RUNSTATS is likely to yield a different sample of the index. All index pages are retrieved through an index scan but only the percentage of rows as specified through the numeric-literal parameter is used for the statistics collection. This option is not supported on statistical views.
INDEXSAMPLE SYSTEM
Use this option to collect statistics on a sample of the index pages. System sampling considers each page individually, including the page with probability P/100 (where P is the value of the numeric-literal) and excluding it with probability 1-P/100. Each execution of the RUNSTATS command usually yields a different sample of the index. The size of the sample is controlled by the numeric-literal parameters in parentheses, representing an approximate percentage P of the index to be returned. Only a percentage of the index pages as specified through the numeric-literal parameter is retrieved and used for the statistics collection. This option is not supported on statistical views.
SET PROFILE NONE
Specifies that no statistics profile will be set for this RUNSTATS invocation.
SET PROFILE
Allows RUNSTATS to generate and store a specific statistics profile in the system catalog tables and executes the RUNSTATS command options to gather statistics.
SET PROFILE ONLY
Allows RUNSTATS to generate and store a specific statistics profile in the system catalog tables without running the RUNSTATS command options.
UPDATE PROFILE
Allows RUNSTATS to modify an existing statistics profile in the system catalog tables, and runs the RUNSTATS command options of the updated statistics profile to gather statistics. You cannot use the UPDATE PROFILE option to remove clauses that are in a statistics profile.
UPDATE PROFILE ONLY
Allows RUNSTATS to modify an existing statistics profile in the system catalog tables without running the RUNSTATS command options of the updated statistics profile. You cannot use the UPDATE PROFILE ONLY option to remove clauses that are in a statistics profile.
UTIL_IMPACT_PRIORITY priority
Specifies that RUNSTATS will be throttled at the level specified by priority. priority is a number in the range of 1 to 100, with 100 representing the highest priority and 1 representing the lowest. The priority specifies the amount of throttling to which the utility is subjected. All utilities at the same priority undergo the same amount of throttling, and utilities at lower priorities are throttled more than those at higher priorities. If priority is not specified, the RUNSTATS will have the default priority of 50. Omitting the UTIL_IMPACT_PRIORITY keyword will invoke the RUNSTATS utility without throttling support. If the UTIL_IMPACT_PRIORITY keyword is specified, but the util_impact_lim configuration parameter is set to 100, then the utility will run unthrottled.

In a partitioned database, when used on tables, the RUNSTATS command collects the statistics on only a single database partition. If the database partition from which the RUNSTATS command is executed has a partition of the table, then the command executes on that database partition. Otherwise, the command executes on the first database partition in the database partition group across which the table is partitioned.

Example

Collect statistics on all columns used in indexes and on all indexes.
CALL SYSPROC.ADMIN_CMD ('RUNSTATS ON TABLE employee
  ON KEY COLUMNS and INDEXES ALL')

Usage notes

  1. When there are detached partitions on a partitioned table, index keys that still belong to detached data partitions which require cleanup will not be counted as part of the keys in the statistics. These keys are not counted because they are invisible and no longer part of the table. They will eventually get removed from the index by asynchronous index cleanup. As a result, statistics collected before asynchronous index cleanup is run will be misleading. If the RUNSTATS command is issued before asynchronous index cleanup completes, it will likely generate a false alarm for index reorganization or index cleanup based on the inaccurate statistics. Once asynchronous index cleanup is run, all the index keys that still belong to detached data partitions which require cleanup will be removed and this may eliminate the need for index reorganization.

    For partitioned tables, you are encouraged to issue the RUNSTATS command after an asynchronous index cleanup has completed in order to generate accurate index statistics in the presence of detached data partitions. To determine whether or not there are detached data partitions in the table, you can check the status field in the SYSCAT.DATAPARTITIONS catalog view and look for the value L (logically detached), I (index cleanup), or D (detached with dependent MQT).

    The RUNSTATS command collects statistics for all index partitions of a partitioned index. Statistics in the SYSSTAT.INDEXES view for the partitioned index represent an index partition, except for FIRSTKEYCARD, FIRST2KEYCARD, FIRST3KEYCARD, FIRST4KEYCARD, and FULLKEYCARD statistics. Because these statistics are used in cardinality estimates, they are for the entire index and not for an index partition. Distribution statistics (frequent values and quantiles) are not collected for partitioned indexes, but are gathered if RUNSTATS is run on the table. Statistics on the leading columns of a partitioned index might not be as accurate as statistics on the leading columns of a nonpartitioned index.

  2. Command execution status is returned in the SQLCA resulting from the CALL statement.
  3. It is recommended to run the RUNSTATS command:
    • On tables that have been modified considerably (for example, if a large number of updates have been made, or if a significant amount of data has been inserted or deleted or if LOAD has been done without the statistics option during LOAD).
    • On tables that have been reorganized (using REORG, REDISTRIBUTE DATABASE PARTITION GROUP).
    • On tables which have been row compressed.
    • When a new index has been created.
    • Before binding applications whose performance is critical.
    • When the prefetch quantity is changed.
    • On statistical views whose underlying tables have been modified substantially so as to change the rows that are returned by the view.
    • After LOAD has been executed with the STATISTICS option, use the RUNSTATS utility to collect statistics on XML columns. Statistics for XML columns are never collected during LOAD, even when LOAD is executed with the STATISTICS option. When RUNSTATS is used to collect statistics for XML columns only, existing statistics for non-XML columns that have been collected by LOAD or a previous execution of the RUNSTATS utility are retained. In the case where statistics on some XML columns have been collected previously, the previously collected statistics for an XML column will either be dropped if no statistics on that XML column are collected by the current command, or be replaced if statistics on that XML column are collected by the current command.
  4. The options chosen must depend on the specific table and the application. In general:
    • If the table is a very critical table in critical queries, is relatively small, or does not change too much and there is not too much activity on the system itself, it might be worth spending the effort on collecting statistics in as much detail as possible.
    • If the time to collect statistics is limited, if the table is relatively large, or if the table is updated frequently, it might be beneficial to execute RUNSTATS limited to the set of columns that are used in predicates. This way, you will be able to execute the RUNSTATS command more often.
    • If time to collect statistics is very limited and the effort to tailor the RUNSTATS command on a table by table basis is a major issue, consider collecting statistics for the "KEY" columns only. It is assumed that the index contains the set of columns that are critical to the table and are most likely to appear in predicates.
    • If time to collect statistics is very limited and table statistics are to be gathered, consider using the TABLESAMPLE option to collect statistics on a subset of the table data.
    • If time to collect statistics is very limited and index statistics are to be gathered, consider using the INDEXSAMPLE option to collect statistics on a subset of the index data.
    • If there is skew in certain columns and predicates of the type "column = constant", it might be beneficial to specify a larger NUM_FREQVALUES value for that column
    • Collect distribution statistics for all columns that are used in equality predicates and for which the distribution of values might be skewed.
    • For columns that have range predicates (for example "column >= constant", "column BETWEEN constant1 AND constant2") or of the type "column LIKE '%xyz'", it might be beneficial to specify a larger NUM_QUANTILES value.
    • If storage space is a concern and one cannot afford too much time on collecting statistics, do not specify high NUM_FREQVALUES or NUM_QUANTILES values for columns that are not used in predicates.
    • If index statistics are requested, and statistics have never been run on the table containing the index, statistics on both the table and indexes are calculated.
    • If statistics for XML columns in the table are not required, the EXCLUDING XML COLUMNS option can be used to exclude all XML columns. This option takes precedence over all other clauses that specify XML columns for statistics collection.
  5. After the command is run, note the following:
    • A COMMIT should be issued to release the locks.
    • To allow new access plans to be generated, the packages that reference the target table must be rebound.
    • Executing the command on portions of the table could result in inconsistencies as a result of activity on the table since the command was last issued. In this case a warning message is returned. Issuing RUNSTATS on the table only might make table and index level statistics inconsistent. For example, you might collect index level statistics on a table and later delete a significant number of rows from the table. If you then issue RUNSTATS on the table only, the table cardinality might be less than FIRSTKEYCARD, which is an inconsistency. In the same way, if you collect statistics on a new index when you create it, the table level statistics might be inconsistent.
  6. The RUNSTATS command drops previously collected distribution statistics if table statistics are requested. For example, RUNSTATS ON TABLE, or RUNSTATS ON TABLE … AND INDEXES ALL will cause previously collected distribution statistics to be dropped. If the command is run on indexes only then previously collected distribution statistics are retained. For example, RUNSTATS ON TABLE … FOR INDEXES ALL causes the previously collected distribution statistics to be retained. If the RUNSTATS command is run on XML columns only, then previously collected basic column statistics and distribution statistics are retained. In the case where statistics on some XML columns have been collected previously, the previously collected statistics for an XML column are either be dropped if no statistics on that XML column are collected by the current command, or be replaced if statistics on that XML column are collected by the current command.
  7. For DB2 V9.7 Fix Pack 1 and later releases, distribution statistics are collected on indexes over XML data defined on an XML column. When the RUNSTATS command is run on a table with the WITH DISTRIBUTION clause, the following apply to the collection of distribution statistics on a column of type XML:
    • Distribution statistics are collected for each index over XML data specified on an XML column.
    • The RUNSTATS command must collect both distribution statistics and table statistics to collect distribution statistics for indexes over XML data defined on an XML column. Table statistics must be gathered in order for distribution statistics to be collected since XML distribution statistics are stored with table statistics.

      An index clause is not required to collect XML distribution statistics. Specifying only an index clause does not collect XML distribution statistics

      By default, XML distribution statistics use a maximum of 250 quantiles for each index over XML data. When collecting distribution statistics on an XML column, you can change the maximum number of quantiles by specifying a value with NUM_QUANTILES parameter in the ON COLUMNS or the DEFAULT clause.

    • Distribution statistics are collected for indexes over XML data of type VARCHAR, DOUBLE, TIMESTAMP, and DATE. Distribution statistics are not collected over indexes of type VARCHAR HASHED.
    • Distribution statistics are not collected for partitioned indexes over XML data defined on a partitioned table.
  8. For range-clustered tables, there is a special system-generated index in the catalog tables which represents the range ordering property of range-clustered tables. When statistics are collected on this type of table, if the table is to be included as part of the statistics collection, statistics will also be collected for the system-generated index. The statistics reflect the fast access of the range lookups by representing the index as a two-level index with as many pages as the base data table, and having the base data clustered perfectly along the index order.
  9. In the On Dist Cols clause of the command syntax, the Frequency Option and Quantile Option parameters are currently not supported for column GROUPS. These options are supported for single columns.
  10. There are three prefetch statistics that cannot be computed when working in DMS mode. When looking at the index statistics in the index catalogs, you will see a -1 value for the following statistics:
    • AVERAGE_SEQUENCE_FETCH_PAGES
    • AVERAGE_SEQUENCE_FETCH_GAP
    • AVERAGE_RANDOM_FETCH_PAGES
  11. A statistics profile can be set or updated for the table or statistical view specified in the RUNSTATS command, by using the set profile or update profile options. The statistics profile is stored in a visible string format, which represents the RUNSTATS command, in the STATISTICS_PROFILE column of the SYSCAT.TABLES system catalog table.
  12. Statistics collection on XML type columns is governed by two DB2 database system registry values: DB2_XML_RUNSTATS_PATHID_K and DB2_XML_RUNSTATS_PATHVALUE_K. These two parameters are similar to the NUM_FREQVALUES parameter in that they specify the number of frequency values to collect. If not set, a default of 200 will be used for both parameters.
  13. RUNSTATS acquires an IX table lock on SYSTABLES and a U lock on the row for the table on which statistics are being gathered at the beginning of RUNSTATS. Operations can still read from SYSTABLES including the row with the U lock. Write operations are also possible, providing they do not occur against the row with the U lock. However, another reader or writer will not be able acquire an S lock on SYSTABLES because of RUNSTATS' IX lock.
  14. Statistics are not collected for columns with structured types. If they are specified, columns with these data types are ignored.
  15. Only AVGCOLLEN and NUMNULLS are collected for columns with LOB or LONG data types.
  16. AVGCOLLEN represents the average space in bytes when the column is stored in database memory or a temporary table. This value represents the length of the data descriptor for LOB or LONG data types, except when LOB data is inlined on the data page.
    Note: The average space required to store the column on disk may be different than the value represented by this statistic.
  17. The UNSAMPLED DETAILED option is available to change the way index statistics are collected, but it should be used only in cases where its clear that the default or DETAILED doesnt work.
  18. When using the INDEXSAMPLE keyword you cannot specify different index sampling rates for different indexes within a single command. For example:
    runstats on table orders and index o_ck indexsample system(5), 
       index o_ok indexsample system(10)
    is invalid. The following two RUNSTATS commands can be used to achieve the required result:
    runstats on table orders and index o_ck indexsample system(5)
    runstats on table orders for index o_ok indexsample system(10)
  19. If you have modified the table since statistics were last collected on the table or its indexes, you should run RUNSTATS ON TABLE … AND INDEXES ALL. If you use RUNSTATS ON TABLE … FOR INDEXES ALL, the resulting statistics might be inconsistent.