IBM Support

75 ways to demystify DB2 #71: Techtip : How can I speed up statistics collection in DB2?

Technical Blog Post


Abstract

75 ways to demystify DB2 #71: Techtip : How can I speed up statistics collection in DB2?

Body

During RUNSTATS command execution: If collecting statistics using all data in the table is taking too long, specify an option to scan just a sample of the data in the table. Although the statistics are not as accurate as if you had taken a full scan, using a sample can provide a good compromise between RUNSTATS command execution time and better statistics. The RUNSTATS command includes an optional parameter, the TABLESAMPLE option, which allows you to select the percentage of the table you want it to scan.


The following sample command collects statistics on a table using 10 percent of the data in the table:
RUNSTATS ON TABLE <tabschema>.<tabname> WITH DISTRIBUTION AND INDEXES ALL TABLESAMPLE SYSTEM (10)

 

When collecting index statistics for a table, speed up the elapsed time of the statistics collection through using several options:
Specify only the columns for which you want to collect statistics by using the ON COLUMNSoption of the RUNSTATScommand, as shown in the following sample command:
RUNSTATS ON TABLE <tabschema>.<tabname> ON COLUMNS (column1, column2)

Specify only the key columns, as shown in the following sample command:
RUNSTATS ON TABLE <tabschema>.<tabname> ON KEY COLUMNS

Exclude XML columns if you do not need statistics for them, as shown in the following sample command.
RUNSTATS ON TABLE <tabschema>.<tabname> ON ALL COLUMNS EXCLUDING XML COLUMNS


During LOAD/REPLACE: If you are replacing the full contents of a table using the LOAD command with the REPLACE option, collecting statistics during the LOAD operation can be faster than
performing separate operations for LOADand RUNSTATS. Create a profile for the table and then use the STATISTICS USE PROFILEoption with the LOADcommand.
The following sample command uses a statistics profile to collect statistics while data is loaded into a table:
LOAD FROM xxx.del OF DEL REPLACE INTO <tabschema>.<tabname> STATISTICS USE PROFILE

[{"Business Unit":{"code":"BU029","label":"Data and AI"}, "Product":{"code":"SSEPGG","label":"DB2 for Linux, UNIX and Windows"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":""}]

UID

ibm13286839