Referential integrity constraints help reduce the number of statistical views
Statistical views can be used to obtain good cardinality estimates for the Db2® query optimizer, but they also require resources to maintain and process. However, by using referential integrity constraints, you can cut down on the number of statistical views used to gather statistics.
If you want to collect accurate statistics for a star join query that joins different sets of tables, you can create multiple statistical views or one statistical view by using referential integrity constraints.
create table D1 (D1_PK int not null primary key,
D1_C1 int,
D1_C2 int,
D1_C3 int,
D1_FK int not null);
create table D2 (D2_PK int not null primary key,
D2_C1 int,
D2_C2 int,
D2_C3 int,
D2_FK int not null);
create table D3 (D3_PK int not null primary key,
D3_C1 int,
D3_C2 int,
D3_C3 int,
D3_FK int not null);
create table D4 (D4_PK int not null primary key,
D4_C1 int,
D4_C2 int,
D4_C3 int,
D4_FK int not null);
create table D5 (D5_PK int not null primary key,
D5_C1 int,
D5_C2 int,
D5_C3 int,
D5_FK int not null);
create table F (F_FK1 int not null,
F_FK2 int not null,
F_FK3 int not null,
F_FK4 int not null,
F_FK5 int not null,
F_C1 int, F_C2 int);
Consider
that table F is altered in the following way:alter table F add foreign key (F_FK1)
references D1 (D1_PK) on delete cascade;
alter table F add foreign key (F_FK2)
references D2 (D2_PK) on delete cascade;
alter table F add foreign key (F_FK3)
references D3 (D3_PK) on delete cascade;
alter table F add foreign key (F_FK4)
references D4 (D4_PK) on delete cascade;
alter table F add foreign key (F_FK5)
references D5 (D5_PK) on delete cascade;
Also, consider
that you want to provide statistics for the following query:select distinct * from F, D1, D2, D3 where F_FK1 = D1_PK and F_FK2
= D2_PK and F_FK3 = D3_PK and D1_C1='ON' and D2_C2>='2009-01-01';
To
gather accurate statistics you can create the complete set of views,
as follows:create view SV1 as (select D1.* from F, D1 where F_FK1 = D1_PK);
alter view SV1 enable query optimization;
create view SV2 as(select D2.* from F, D2 where F_FK2 = D2_PK);
alter view SV2 enable query optimization;
create view SV3 as(select D3.* from F, D3 where F_FK3 = D3_PK);
alter view SV3 enable query optimization;
create view SV4 as(select D1.*, D2.*, D3.* from F, D1, D2, D3 where
F_FK1 = D1_PK and F_FK2 = D2_PK and F_FK3 = D3_PK);
alter view SV4 enable query optimization;
create view SV5 as (select D1.*, D2.*, D3.*, D4.*, D5.* from F, D1, D2, D3, D4, D5
where
F_FK1 = D1_PK and F_FK2 = D2_PK and F_FK3 = D3_PK
and F_FK4 = D4_PK and F_FK5 = D5_PK
);
alter view SV5 enable query optimization;
The
statistics for SV4, SV3, SV2, and SV1 are inferred from SV5 based on referential integrity
constraints. The referential integrity constraints between F, D1, D2, D3, D4, and D5 ensure that the
joins among them are lossless. These lossless joins let us infer that the following cardinalities
are the same:- The cardinality of the join result between F and D1
- The cardinality of SV1
- The cardinality of the statistical view SV5
For referential integrity constraints that permit NULL values, the percentage of NULL values that are in foreign key values can affect statistics inferred from a statistical view.
If the NULL values are a significant percentage of the foreign key values, the statistics and cardinality estimates inferred by the Db2 optimizer by using the constraints can be inaccurate. Prevent these inaccuracies by preventing the referential integrity constraints from being used to infer statistics. To prevent the constraints from being used to infer statistics, disable query optimization by using the DISABLE QUERY OPTIMIZATION option of the ALTER TABLE command.
You can detect this situation by using the explain facility. The explain facility shows the proportion of NULL values in the foreign key by producing an explain diagnostic message if the number of NULL values is possibly too high and if statistics are being inferred from a statistical view.