For efficient RUNSTATS operations and
subsequent query-plan analysis, collect distribution statistics on
only those columns that queries reference in WHERE, GROUP BY, and
similar clauses. You can also collect cardinality statistics on combined
groups of columns. The optimizer uses such information to detect column
correlation when it estimates selectivity for queries that reference
the columns in a group.
About this task
The following example is based on a database named SALES
that contains a CUSTOMERS table with indexes CUSTIDX1 and CUSTIDX2.For
privileges and authorities that are required to use the RUNSTATS utility,
see the description of the RUNSTATS command.
When
you collect statistics for a table in a partitioned database environment, RUNSTATS operates
only on the database partition from which the utility is executed.
The results from this database partition are extrapolated to the other
database partitions. If this database partition does not contain a
required portion of the table, the request is sent to the first database
partition in the database partition group that contains the required
data.
Procedure
To collect statistics on specific columns:
- Connect to the SALES database.
-
Execute one of the following commands from the Db2® command line,
depending on your requirements:
- To collect distribution statistics on columns ZIP and
YTDTOTAL:
runstats on table sales.customers
with distribution on columns (zip, ytdtotal)
- To collect distribution statistics on the same columns, but with different distribution
options:
runstats on table sales.customers
with distribution on columns (
zip, ytdtotal num_freqvalues 50 num_quantiles 75)
- To collect distribution statistics on the columns that are indexed in CUSTIDX1 and
CUSTIDX2:
runstats on table sales.customer
on key columns
- To collect statistics for columns ZIP and YTDTOTAL and a column group that includes REGION
and
TERRITORY:
runstats on table sales.customers
on columns (zip, (region, territory), ytdtotal)
- Suppose that statistics for non-XML columns were collected previously using the
LOAD command with the STATISTICS parameter. To collect
statistics for the XML column
MISCINFO:
runstats on table sales.customers
on columns (miscinfo)
- To collect statistics for the non-XML columns
only:
runstats on table sales.customers
excluding xml columns
The
EXCLUDING XML COLUMNS clause takes precedence over all other clauses that
specify XML columns.
- For Db2 V9.7 Fix Pack 1 and
later releases, the following command collects distribution statistics using a maximum of 50
quantiles for the XML column MISCINFO. A default of 20 quantiles is used for all other columns in
the
table:
runstats on table sales.customers
with distribution on columns ( miscinfo num_quantiles 50 )
default num_quantiles 20
Note: The
following are required for distribution statistics to be collected on the XML column MISCINFO:
- Both table and distribution statistics must be collected.
- An index over XML data must be defined on the column, and the data type specified for the index
must be VARCHAR, DOUBLE, TIMESTAMP, or DATE.