Modifying catalog statistics to influence access path selection
If you have the proper authority, you can influence access path selection by using an SQL statements to change statistics values in the catalog. However, doing so is not generally recommended, except as a last resort.
Before you begin
Consequently, the following recommendations apply if you make any such changes:
- Save the original catalog statistics or SQL statements before you consider making any changes to control the choice of access path.
- Before and after you make any changes, take performance measurements.
- Be prepared to back out any changes that have degraded performance.
- When you migrate to a new release, evaluate the performance again.
- Plan to keep track of the changes you make and of the plans or packages that have access path changes because of the changed statistics.
- Record when the statistics were modified by setting the value of the STATIME column to the current TIMESTAMP value for every statistics record that you modify in the catalog tables.
- Consider correlations among catalog tables before updating statistics values.
About this task
Procedure
To modify statistics values, use any of the following approaches:
Example
For example, assume that the following SQL statement has a problem with data correlation:
SELECT * FROM PART_HISTORY -- SELECT ALL PARTS
WHERE PART_TYPE = 'BB' P1 -- THAT ARE 'BB' TYPES
AND W_FROM = 3 P2 -- THAT WERE MADE IN CENTER 3
AND W_NOW = 3 P3 -- AND ARE STILL IN CENTER 3
Db2 does not know that 50% of the parts that were made in Center 3 are still in Center 3. The problem can be circumvented by making a predicate non-indexable. However, suppose that hundreds of users are writing queries similar to that query. Having all users change their queries would be impossible. In this type of situation, the best solution might be to change the catalog statistics.
One catalog table that you can update is the SYSIBM.SYSCOLDIST catalog table, which gives information about a column or set of columns in a table. Assume that because columns W_NOW and W_FROM are correlated, and that only 100 distinct values exist for the combination of the two columns, rather than 2500 (50 for W_FROM * 50 for W_NOW). Insert a row like this to indicate the new cardinality:
INSERT INTO SYSIBM.SYSCOLDIST
(FREQUENCY, FREQUENCYF, IBMREQD,
TBOWNER, TBNAME, NAME, COLVALUE,
TYPE, CARDF, COLGROUPCOLNO, NUMCOLUMNS)
VALUES(0, -1, 'N',
'USRT001','PART_HISTORY','W_FROM',' ',
'C',100,X'00040003',2);
You can also use the RUNSTATS utility to put this information in SYSCOLDIST.
You tell Db2 about the frequency of a certain combination of column values by updating SYSIBM.SYSCOLDIST. For example, you can indicate that 1% of the rows in PART_HISTORY contain the values 3 for W_FROM and 3 for W_NOW by inserting this row into SYSCOLDIST:
INSERT INTO SYSIBM.SYSCOLDIST
(FREQUENCY, FREQUENCYF, STATSTIME, IBMREQD,
TBOWNER, TBNAME, NAME, COLVALUE,
TYPE, CARDF, COLGROUPCOLNO, NUMCOLUMNS)
VALUES(0, .0100, '2006-12-01-12.00.00.000000','N',
'USRT001','PART_HISTORY','W_FROM',X'00800000030080000003',
'F',-1,X'00040003',2);