The query optimizer uses the zigzag join
method if: the tables and query fit the prerequisites for zigzag join,
and a performance improvement is the result.
About this task
Use this task to ensure that your tables
and query meet the prerequisites that are required for a zigzag join.
In addition, you can use this task to manipulate the use of zigzag
join, if you are not satisfied with the query optimizer choices.
The zigzag join method calculates the Cartesian product
of rows from the dimension tables without actually materializing the
Cartesian product. This method probes the fact table with a multicolumn
index so that the fact table is filtered along two or more dimension
tables simultaneously. The probe into the fact table finds matching
rows. The zigzag join then returns the next combination of values
that is available from the fact table index. This next combination
of values, which is known as feedback, is used to skip over probe
values that are provided by the Cartesian product of dimension tables
that do not find a match in the fact table. Filtering the fact table
on two or more dimension tables simultaneously, and skipping probes
that are known to be unproductive, makes the zigzag join an efficient
method for querying large fact tables.
Procedure
- Ensure that the tables included in the zigzag
join fit the required criteria.
Each dimension tables must
have the following properties: a primary key, a unique constraint,
or a unique index that does not have a random ordering of index keys
that are defined on it. To define primary keys, unique constraints,
or unique indexes, use commands such as the following example:
-- defining a unique constraint on a dimension table using composite keys
create table dim1 (
d0 int not null,
d1 int not null,
c1 int,
constraint pk1_uniq unique (d0, d1)
);
-- defining a primary key on a dimension table
create table dim2 (
d2 int primary key not null,
c2 int
);
-- creating a unique index on a dimension table
create table dim3 (
d3 int,
c3 int
);
create unique index uniq_ind on dim3(d3);
- Write a suitable query.
The query must have
equality join predicates between each dimension tables primary key,
unique index, or unique constraint and the fact table columns.
For example:
select count(*)
from dim1,dim2,dim3,fact
where dim1.d0 = fact.f0
and dim1.d1 = fact.f1
and dim2.d2 = fact.f2
and dim3.d3 = fact.f3
and dim1.c1 = 10
and dim2.c2 < 20;
- Ensure that there is a suitable multicolumn index on the
fact table.
The
multicolumn index must include: columns that are used in the zigzag
query in equality join predicates between the fact table and primary
keys, unique indexes, or unique constraints from at least two of the
dimension tables.
To define such a multicolumn index,
use a command such as the following example:
create index fact_ind on fact (f0, f1, f2, f3);
If no suitable multicolumn index exists, an
informational diagnostic message is displayed in the output of the db2exfmt command.
- Run the query in EXPLAIN
mode and then issue the db2exfmt command to format
the EXPLAIN output.
Examine the output to determine whether
the zigzag join was used and whether the wanted performance was achieved.
- Optional: If the zigzag join
method was not used or if the wanted performance was not achieved,
you might want to create another multicolumn index.
Review
the “Extended diagnostic information” section of db2exfmt command
output. If an error message is listed in the output, follow the suggestions
(to generate a new index, for instance).
- Optional: If the wanted performance
was not achieved, determine whether there was a gap in the index.
Review the gap information (Gap Info) section in the
db2exfmt output.
Gap Info: Status
--------- ------
Index Column 0: No Gap
Index Column 1: Positioning Gap
Index Column 2: No Gap
Index Column 3: No Gap
Index Column 4: No Gap
If the section indicates
that the query contains predicates that are inconsistent with a composite
index, consider a new index or modifying an existing index to avoid
the index gap.