Detecting correlated columns

The first indication that correlated columns might be a problem is poor response times when Db2 has chosen an inappropriate access path. If you suspect that you have problems with correlated columns, you can issue SQL statements to test whether columns are correlated.

Procedure

To determine whether columns are correlated:

Issue SQL statements and compare the results as shown in the following example.

Example

Begin program-specific programming interface information. If you suspect two columns in a table, such as the CITY and STATE columns in the CREWINFO table might be correlated, then you can issue the following SQL queries that reflect the relationships between the columns:

SELECT COUNT (DISTINCT CITY) AS CITYCOUNT,
  COUNT (DISTINCT STATE) AS STATECOUNT FROM CREWINFO;

The result of the count of each distinct column is the value of COLCARDF in the Db2 catalog table SYSCOLUMNS. Multiply the previous two values together to get a preliminary result:

CITYCOUNT  x  STATECOUNT    =      ANSWER1

Then issue the following SQL statement:

SELECT COUNT(*) FROM
  (SELECT DISTINCT CITY, STATE
   FROM CREWINFO) AS V1;           (ANSWER2)

Compare the result of the previous count (ANSWER2) with ANSWER1. If ANSWER2 is less than ANSWER1, then the suspected columns are correlated. End program-specific programming interface information.