IBM Support

IC63387: OPTIMIZER MAY OVERESTIMATE JOIN CARDINALITY UNDER CERTAIN CONDITIONS WHEN STATISTICS ARE NOT COLLECTED ON ONE OF THE TABLE

Subscribe

You can track all active APARs for this component.

 

APAR status

  • Closed as program error.

Error description

  • The DB2 Query Optimizer may choose a non-optimal query execution
    plan for an SQL statement that contains a join between two or
    more tables, when the following conditions are satisfied for at
    least one the joins:
    
    1.  the join consists of two or more equality predicates
    2.  statistics are collected on one of the tables and not the
    other
    3.  a unique index or primary key exists on the table that
    statistics are not collected on, with the join predicates
    applied to all the key columns
    4.  a column group statistic or index on the join predicate
    columns does not exist on the table that statistics are
    collected on
    
    For example, consider the following SQL statement
    
    SELECT *
    FROM T1, T2
    WHERE T1.A=T2.A and T1.B=T2.B
    
    If an index on T1(A,B) does not exist and statistics are
    collected on T1 as follows:
    RUNSTATS ON TABLE <schema>.T1 WITH DISTRIBUTION AND DETAILED
    INDEXES ALL;
    and statistics are not collected on T2 but (A,B) is defined as
    the primary key of the table then, under these conditions, the
    optimizer may overestimate the cardinality of the join which may
    result in a non-optimal query execution plan.
    

Local fix

  • Any of the following actions will resolve this issue:
    
    1.  Collect statistics on all tables, or enable automatic
    statistics collection.  Collecting statistics is one of the best
    practices to minimize the impact of SQL statements on
    performance, as described in the best practice paper on "Writing
    and Tuning Queries for Optimal Performance".  This paper is
    available at:
    http://www.ibm.com/developerworks/data/bestpractices/
    
    2.  Collect column group statistics on the table that statistics
    are collected on.  In the example, column group statistics can
    be collected on T1(A,B) as follows:
    RUNSTATS ON TABLE <schema>.T1 ON ALL COLUMNS AND COLUMNS ((A,B))
    WITH DISTRIBUTION AND DETAILED INDEXES ALL;
    If automatic runstats is enabled, a statistics profile will be
    required.  The following article provides more information on
    how the DB2 Query Optimizer makes use of column group statistics
    and how to collect them:
    http://www.ibm.com/developerworks/data/library/techarticle/dm-06
    12kapoor/index.html
    

Problem summary

  • ERROR DESCRIPTION:
    
    The DB2 Query Optimizer may choose a non-optimal query execution
    plan for an SQL statement that contains a join between two or
    more tables, when the following conditions are satisfied for at
    least one the joins:
    
    1.  the join consists of two or more equality predicates
    2.  statistics are collected on one of the tables and not the
    other
    3.  a unique index or primary key exists on the table that
    statistics are not collected on, with the join predicates
    applied to all the key columns
    4.  a column group statistic or index on the join predicate
    columns does not exist on the table that statistics are
    collected on
    
    For example, consider the following SQL statement
    
    SELECT *
    FROM T1, T2
    WHERE T1.A=T2.A and T1.B=T2.B
    
    If an index on T1(A,B) does not exist and statistics are
    collected on T1 as follows:
    RUNSTATS ON TABLE <schema>.T1 WITH DISTRIBUTION AND DETAILED
    INDEXES ALL;
    and statistics are not collected on T2 but (A,B) is defined as
    the primary key of the table then, under these conditions, the
    optimizer may overestimate the cardinality of the join which may
    result in a non-optimal query execution plan.
    

Problem conclusion

  • The APAR is first fixed in the DB2 v9.5 fixpak6
    

Temporary fix

  • Any of the following actions will resolve this issue:
    
    
    1.  Collect statistics on all tables, or enable automatic
    statistics collection.  Collecting statistics is one of the best
    practices to minimize the impact of SQL statements on
    performance, as described in the best practice paper on "Writing
    and Tuning Queries for Optimal Performance".  This paper is
    available at:
    http://www.ibm.com/developerworks/data/bestpractices/
    
    2.  Collect column group statistics on the table that statistics
    are collected on.  In the example, column group statistics can
    be collected on T1(A,B) as follows:
    RUNSTATS ON TABLE <schema>.T1 ON ALL COLUMNS AND COLUMNS ((A,B))
    WITH DISTRIBUTION AND DETAILED INDEXES ALL;
    If automatic runstats is enabled, a statistics profile will be
    required.  The following article provides more information on
    how the DB2 Query Optimizer makes use of column group statistics
    and how to collect them:
    http://www.ibm.com/developerworks/data/library/techarticle/dm-06
    12kapoor/index.html
    

Comments

APAR Information

  • APAR number

    IC63387

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    950

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2009-09-24

  • Closed date

    2010-05-13

  • Last modified date

    2010-05-13

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IC63389

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • R950 PSY

       UP

[{"Business Unit":{"code":"BU048","label":"IBM Software"},"Product":{"code":"SSEPGG","label":"DB2 for Linux, UNIX and Windows"},"Component":"","ARM Category":[],"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"9.5","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
13 May 2010