Ensuring that queries fit the required criteria for the zigzag join
The query optimizer uses the zigzag join method if: the tables and query fit the prerequisites for zigzag join, and a performance improvement is the result.
About this task
Use this task to ensure that your tables and query meet the prerequisites that are required for a zigzag join. In addition, you can use this task to manipulate the use of zigzag join, if you are not satisfied with the query optimizer choices.
The zigzag join method calculates the Cartesian product of rows from the dimension tables without actually materializing the Cartesian product. This method probes the fact table with a multicolumn index so that the fact table is filtered along two or more dimension tables simultaneously. The probe into the fact table finds matching rows. The zigzag join then returns the next combination of values that is available from the fact table index. This next combination of values, which is known as feedback, is used to skip over probe values that are provided by the Cartesian product of dimension tables that do not find a match in the fact table. Filtering the fact table on two or more dimension tables simultaneously, and skipping probes that are known to be unproductive, makes the zigzag join an efficient method for querying large fact tables.
- Ensure that the tables included in the zigzag
join fit the required criteria. Each dimension tables must have the following properties: a primary key, a unique constraint, or a unique index that does not have a random ordering of index keys that are defined on it. To define primary keys, unique constraints, or unique indexes, use commands such as the following example:
-- defining a unique constraint on a dimension table using composite keys create table dim1 ( d0 int not null, d1 int not null, c1 int, constraint pk1_uniq unique (d0, d1) ); -- defining a primary key on a dimension table create table dim2 ( d2 int primary key not null, c2 int ); -- creating a unique index on a dimension table create table dim3 ( d3 int, c3 int ); create unique index uniq_ind on dim3(d3);
- Write a suitable query. The query must have equality join predicates between each dimension tables primary key, unique index, or unique constraint and the fact table columns.For example:
select count(*) from dim1,dim2,dim3,fact where dim1.d0 = fact.f0 and dim1.d1 = fact.f1 and dim2.d2 = fact.f2 and dim3.d3 = fact.f3 and dim1.c1 = 10 and dim2.c2 < 20;
- Ensure that there is a suitable multicolumn index on the
The multicolumn index must include: columns that are used in the zigzag query in equality join predicates between the fact table and primary keys, unique indexes, or unique constraints from at least two of the dimension tables.To define such a multicolumn index, use a command such as the following example:
create index fact_ind on fact (f0, f1, f2, f3);If no suitable multicolumn index exists, an informational diagnostic message is displayed in the output of the db2exfmt command.
- Run the query in EXPLAIN
mode and then issue the db2exfmt command to format
the EXPLAIN output. Examine the output to determine whether the zigzag join was used and whether the wanted performance was achieved.
- Optional: If the zigzag join
method was not used or if the wanted performance was not achieved,
you might want to create another multicolumn index. Review the “Extended diagnostic information” section of db2exfmt command output. If an error message is listed in the output, follow the suggestions (to generate a new index, for instance).
- Optional: If the wanted performance
was not achieved, determine whether there was a gap in the index.
Review the gap information (Gap Info) section in the db2exfmt output.
If the section indicates that the query contains predicates that are inconsistent with a composite index, consider a new index or modifying an existing index to avoid the index gap.
Gap Info: Status --------- ------ Index Column 0: No Gap Index Column 1: Positioning Gap Index Column 2: No Gap Index Column 3: No Gap Index Column 4: No Gap