Hybrid join (METHOD=4)

The hybrid join method applies only to an inner join, and requires an index on the join column of the inner table.

Begin program-specific programming interface information. The following figure illustrates a hybrid join.
Figure 1. Hybrid join (SORTN_JOIN='Y')
Begin figure description. A flow chart with sample data that shows how obtains RIDs in the order required by list prefetch for a hybrid join. End figure description.

The method requires obtaining RIDs in the order needed to use list prefetch. The steps are shown in Figure 1. In that example, both the outer table (OUTER) and the inner table (INNER) have indexes on the join columns.

Db2 performs the following steps:

  •  1  Scans the outer table (OUTER).
  •  2  Joins the outer table with RIDs from the index on the inner table. The result is the phase 1 intermediate table. The index of the inner table is scanned for every row of the outer table.
  •  3  Sorts the data in the outer table and the RIDs, creating a sorted RID list and the phase 2 intermediate table. The sort is indicated by a value of Y in column SORTN_JOIN of the plan table. If the index on the inner table is a well-clustered index, Db2 can skip this sort; the value in SORTN_JOIN is then N.
  •  4  Retrieves the data from the inner table, using list prefetch.
  •  5 Concatenates the data from the inner table and the phase 2 intermediate table to create the final composite table.

Possible EXPLAIN results for hybrid join

The following table shows possible EXPLAIN results from a hybrid join and an explanation of each column value.

Table 1. Explanation of EXPLAIN results for a hybrid join
Column value Explanation
METHOD='4' A hybrid join was used.
SORTC_JOIN='Y' The composite table was sorted.
SORTN_JOIN='Y' The intermediate table was sorted in the order of inner table RIDs. A non-clustered index accessed the inner table RIDs.
SORTN_JOIN='N' The intermediate table RIDs were not sorted. A clustered index retrieved the inner table RIDs, and the RIDs were already well ordered.
PREFETCH='L' Pages were read using list prefetch.

Performance considerations for hybrid join

Hybrid join uses list prefetch more efficiently than nested loop join, especially if indexes exist on the join predicate with low cluster ratios. It also processes duplicates more efficiently because the inner table is scanned only once for each set of duplicate values in the join column of the outer table.

If the index on the inner table is highly clustered, it is unnecessary to sort the intermediate table (SORTN_JOIN=N). The intermediate table is placed in a table in memory rather than in a work file.

When hybrid join is used

Hybrid join is often used under the following situations.

  • A non-clustered index or indexes are used on the join columns of the inner table.
  • The outer table has duplicate qualifying rows. End program-specific programming interface information.