Subquery tuning

DB2® automatically performs some subquery tuning by subquery to join transformation and through subquery correlation and de-correlation.

Begin program-specific programming interface information.
However, you should be aware of the differences among the subqueries such as those in the following examples. You might need to code a query in one of the ways below for performance reasons that stem from restrictions to DB2 in transforming a given query, or because DB2 cannot accurately estimate the cost of the various transformation choices during query optimization.

Each of the following three queries retrieves the same rows. All three retrieve data about all designers in departments that are responsible for projects that are part of major project MA2100. These three queries show that you can retrieve a result in several ways.

Query A: a join of two tables

SELECT DSN8A10.EMP.* FROM DSN8A10.EMP, DSN8A10.PROJ
    WHERE  JOB = 'DESIGNER'
 		AND  WORKDEPT = DEPTNO
		AND  MAJPROJ = 'MA2100';

Query B: a correlated subquery

SELECT * FROM DSN8A10.EMP X
   WHERE  JOB = 'DESIGNER'
     AND  EXISTS (SELECT 1 FROM DSN8A10.PROJ
                  WHERE  DEPTNO = X.WORKDEPT
                    AND  MAJPROJ = 'MA2100');

Query C: a noncorrelated subquery

SELECT * FROM DSN8A10.EMP
    WHERE  JOB = 'DESIGNER'
      AND  WORKDEPT IN (SELECT DEPTNO FROM DSN8A10.PROJ
                        WHERE  MAJPROJ = 'MA2100');

Choosing between a subquery and a join

If you need columns from both tables EMP and PROJ in the output, you must use the join. Query A might be the one that performs best, and as a general practice you should code a subquery as a join whenever possible. However, in this example, PROJ might contain duplicate values of DEPTNO in the subquery, so that an equivalent join cannot be written. In that case, whether the correlated or non-correlated form is most efficient depends upon where the application of each predicate in the subquery provides the most benefit.

When looking at a problematic subquery, check if the query can be rewritten into another format, especially as a join, or if you can create an index to improve the performance of the subquery. Consider the sequence of evaluation for the different subquery predicates and for all other predicates in the query. If one subquery predicate is costly, look for another predicate that could be evaluated first to reject more rows before the evaluation of problem subquery predicate.

End program-specific programming interface information.