IBM Support

[One page problem in plan optimization] Will an access plan be remain unchanged after collecting stats if no further stats is corrected and auto_runstats is disabled ?

Technical Blog Post


Abstract

[One page problem in plan optimization] Will an access plan be remain unchanged after collecting stats if no further stats is corrected and auto_runstats is disabled ?

Body

We are sometimes asked questions about plan stability. For example.
-We can correct stats and check the access plan, but if we no longer correct stats, will the plan be stable ?
-Does applying db2look -m output generates the same plan on another databases ?  
 
The answers are no, not always. Even after collecting stats, DB2 may fabricate stats which may change the access plan if the data fits into initially allocated one page of the table and exceeded later on.

Here's the outlined repro scenario of the symptom. Note, It can be even re-created since v8.2 and still re-creatable on db2 v111.1.1.
1. create a table or truncate a table.
2. import/insert/load to the table so that only a page is used to store the data and run runstats against the table.
3. correct db2exfmt output for an update query's access plan and see IXSCAN is chosen.
4. import much more data, say 100,000 rows
5. correct db2exfmt output for the update query's access plan and see TBSCAN is now chosen. The plan has been changed without running additional runstats.
Note, diagnostics message indicated

The reason of the plan change is due to stats fabrication. DB2 maintains Insert/Update/Delete (IUD) counter and fabricate table cardinality based on the initially corrected stats.
However, filter factor is preserved in the scenario, as the result, the optimal plan may change from the initial one.

To avoid such problems, you should correct stats after loading/importing/inserting a certain amount of rows which enough to exceed initial one page of the table.
Note, you can check this situation with Extended Diagnostic Information in db2exfmt output telling  fabricated statistics. This can lead to poor cardinality and predicate filtering estimates.
The size of the table changed significantly since the last time the RUNSTATS command was run.

 

// repro steps: consists of two steps
//step1: import four rows into a brand new table called a.a and collect db2exfmt with db2caem utility
db2 connect to sample
db2 -tvf $HOME/sqllib/misc/EXPLAIN.DDL
db2 update db cfg for sample using AUTO_SAMPLING off AUTO_STATS_VIEWS off AUTO_STMT_STATS off AUTO_RUNSTATS off AUTO_TBL_MAINT off AUTO_MAINT off
db2 terminate
db2stop force
db2start
db2 connect to sample
db2 -tvf exp1.sql
db2 "drop table a.a"
db2 "create table a.a ( key integer not null, val char(254) not null with default) in userspace1"
db2 "create index a.ix1a on a.a (key)"

db2 "insert into a.a values ( 0,'1'),( 1,'1'),( 2,'2'),(3,'3'),(4,'4')"
db2 "runstats on table a.a and indexes all"
db2look -d sample -e -z a -t a -m -o db2look.1.out
mkdir u1caemout
db2caem -d sample -sf u1.sql -o u1caemout

// Access plan of the UPDATE is INDEX SCAN as follows in the first db2caem output.
                  Rows
               Rows Actual
                 RETURN
                 (   1)
                  Cost
                   I/O
                   |
                    3
                    0
                 UPDATE
                 (   2)
                 31.3216
                   NA
               /---+----\
              3            5
              3           NA
           FETCH    TABLE: A
           (   3)          A
           9.27544        Q1
             NA
         /---+----\
        3            5
        3           NA
     IXSCAN   TABLE: A
     (   4)          A
     1.80484        Q2
       NA
       |
        5
       NA
 INDEX: A
      IX1A
       Q2


                Predicates:
                ----------
                2) Stop Key Predicate,
                        Comparison Operator:            Less Than (<)
                        Subquery Input Required:        No
                        Filter Factor:                  0.6

                        Predicate Text:
                        --------------
                        (Q2.KEY < 3)

//step2. importing additional 1,000,000 rows and correct db2exfmt with db2caem utility without running runstats.
db2 -tvf exp1.sql
db2 "import from expa.del of del commitcount 10000 insert into a.a"
db2look -d sample -e -z a -t a -m -o db2look.2.out
mkdir u1caemout2
db2caem -d sample -sf u1.sql -o u1caemout2

// Access plan of the UPDATE becomes TABLE SCAN as follows in the second db2caem output. You can see Extended Diagnostic Information.

            Rows
         Rows Actual
           RETURN
           (   1)
            Cost
             I/O
             |
           60027.6
              0
           UPDATE
           (   2)
           453339
             NA
         /---+----\
     60027.6      100046
        3           NA
     TBSCAN   TABLE: A
     (   3)          A
     12568.6        Q1
       NA
       |
     100046
       NA
 TABLE: A
        A
       Q2

 

Extended Diagnostic Information:
--------------------------------

Diagnostic Identifier:  1
Diagnostic Details:     EXP0045W  The table named "A       "."A" has
                        fabricated statistics.  This can lead to poor
                        cardinality and predicate filtering estimates.  The
                        size of the table changed significantly since the
                        last time the RUNSTATS command was run.

// supplemental data
// u1.sql file containing UPDATE statement
update a.a set val='a'||substr(val,2,253) where key between 1 and 2
;

//exp1.sql file containing export statement to generate 1,000,000 rows to import
export to expa.del of del
with w1 as (
select row_number() over( order by a.tabname ) number from syscat.columns a, syscat.columns b
fetch first 100000 rows only
)
, w2 as (
select
(number + 4 )         key
,char(int(number + 4)) val
from w1
)
select key, val from w2
;

 

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

UID

ibm11140418