# IT05401: IN A RARE CONDITION, A QUERY OF A CHAIN OF EQUALITY JOIN PREDICATES BETWEEN 4 OR MORE TABLES COULD PRODUCE EXTRA ROWS

## Subscribe

You can track all active APARs for this component.

## Error description

• ```This problem deals with inner joins between 4 or more tables
and an outer join.

For a chain of join predicates between 4 columns, we need to
apply at least 3 join predicates among the 4 columns. This APAR
documents a case where only 2 join predicates are applied and a
redundant predicate is incorrectly picked up.

Assume we have tables A, B, C, D and X with the definitions
shown in the following SQL:

set schema test;
create table A(a1 int, a2 varchar(12));
create table B(b1 int, b2 varchar(12));
create table C(c1 int, c2 varchar(12));
create table D(d1 int, d2 varchar(12));
create table X(x1 int, x2 varchar(12));

An example of this problem is illustrated by the following
SQL:

select a2, b2, c2, d2
from   c
inner join d on c2=d2
left join x on c1=x1
inner join a on c2=a2
inner join b on a1=b1 and d2=b2
;

If DB2 chooses an access plan with the following join order:

( ( C inner join D ) outer join X ) ... [1]
inner join                        ... [2]
( A inner join B )                  ... [3]

and the inner join in [2] is a hash join. The order of [1] and
[3] may be swapped, that is, either [1] HSJOIN [3], or [3]
HSJOIN [1] could trigger the problem.

A sample data and small twist on the statistic in the SQL
below demonstrate the result of those extra rows.

insert into C values (1, '1');
insert into D values (1, '1');
insert into A values (1, '1');
insert into B values (1, '1');
insert into B values (1, 'extra row');
insert into B values (1, 'extra row');

runstats on table test.A;
runstats on table test.B;
runstats on table test.C;
runstats on table test.D;
runstats on table test.X;

-- To influence the optimizer to join A and B first
update sysstat.columns set colcard=3 where tabschema='TEST'
and tabname='B' and colname='B1';

Then, the 2 "extra row"s of B are returned in the result set
because the join predicate d2=b2 is incorrectly picked to be the
redundant predicate and by not applying it.

A2             B2            C2           D2
------------ ------------ ------------ ------------
1            1            1            1
1            extra row    1            1
1            extra row    1            1

3 record(s) selected.
```

## Local fix

• ```Setting the optimization level to 3, 1 or 0 as the command
"db2 set current query optimization 3" could avoid the problem.
```

## Problem summary

• ```****************************************************************
* USERS AFFECTED:                                              *
* All users                                                    *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 version 10.1.0.5.                             *
****************************************************************
```

## Problem conclusion

• ```The problem is first fixed in DB2 version 10.1.0.5.
```

IT05401

DB2 FOR LUW

DB2FORLUW

A10

CLOSED PER

NoPE

YesHIPER

NoSpecatt

2014-11-06

2015-07-10

2015-07-10

DB2 FOR LUW

DB2FORLUW

## Applicable component levels

• ### RA10 PSN

UP

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSEPGG","label":"DB2 for Linux- UNIX and Windows"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"10.1","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Modified date:
13 January 2022