Join optimization algorithm
The query optimizer must determine the join columns, join operators, local row selection, dial implementation, and dial ordering for a join query.
The join columns and join operators depend on the following situations:
- Join column specifications of the query
- Join order
- Interaction of join columns with other row selection
Join specifications not implemented for the dial are deferred until a later dial or, if an inner join, processed as row selection.
When looking for an existing index to access a secondary dial, the query optimizer looks at the left-most key columns of the index. For a given dial and index, the join specifications which use the left-most key columns can be used. For example:
DECLARE BROWSE2 CURSOR FOR
SELECT * FROM EMPLOYEE, EMP_ACT
WHERE EMPLOYEE.EMPNO = EMP_ACT.EMPNO
AND EMPLOYEE.HIREDATE = EMP_ACT.EMSTDATE
OPTIMIZE FOR 99999 ROWS
For the index over EMP_ACT with key columns EMPNO, PROJNO, and EMSTDATE, the join operation is performed only on column EMPNO. After the join is performed, index scan-key selection is done using column EMSTDATE.
The query optimizer also uses local row selection when choosing the best use of the index for the secondary dial. If the previous example had been expressed with a local predicate as:
DECLARE BROWSE2 CURSOR FOR
SELECT * FROM EMPLOYEE, EMP_ACT
WHERE EMPLOYEE.EMPNO = EMP_ACT.EMPNO
AND EMPLOYEE.HIREDATE = EMP_ACT.EMSTDATE
AND EMP_ACT.PROJNO = '123456'
OPTIMIZE FOR 99999 ROWS
The index with key columns EMPNO, PROJNO, and EMSTDATE are fully utilized by combining join and selection into one operation against all three key columns.
When creating a temporary index, the left-most key columns are the usable join columns in that dial position. All local row selection for that dial is processed when selecting entries for inclusion into the temporary index. A temporary index is like the index created for a select/omit keyed logical file. The temporary index for the previous example has key columns of EMPNO and EMSTDATE.
Since the optimizer tries a combination of join and local row selection, you can achieve almost all the advantages of a temporary index by using an existing index. In the preceding example, using either implementation, an existing index could be used or a temporary index could be created. A temporary index is built with the local row selection on PROJNO applied during the index creation. The temporary index has key columns of EMPNO and EMSTDATE to match the join selection.
If, instead, an existing index was used with key columns of EMPNO, PROJNO, EMSTDATE (or PROJNO, EMP_ACT, EMSTDATE), the local row selection can be applied at the same time as the join selection. This method contrasts to applying the local selection before the join selection, as happens when the temporary index is created. Or applying the local selection after the join selection, as happens when only the first key column of the index matches the join column.
The existing index implementation is more likely to provide faster performance because join and selection processing are combined without the overhead of building a temporary index. However, the existing index could have slightly slower I/O processing than the temporary index because the local selection is run many times rather than once. In general, create indexes with key columns for the combination of join and equal selection columns as the left-most keys.