Recreate optimizer access plans using db2look

A tool to aid in tuning queries

The db2look tool gives you a way to recreate the same access plan for your test environment that DB2® Universal Database™ (DB2 UDB) is using for your production environment. Here are all the details for setting it up.

Share:

Samir Kapoor, DB2 UDB Advanced Support Analyst, IBM  

Samir Kapoor photoSamir Kapoor is an IBM Certified Advanced Database Administrator - DB2 UDB V8.1 for Linux, UNIX, and Windows. Samir currently works with the DB2 UDB Advanced Support - Down system division (DSD) team and has in-depth knowledge in the engine area.



Kaarel Truuvert, DB2 UDB Query Optimization Developer, IBM

Kaarel Truuvert photoKaarel Truuvert is a senior developer in the DB2 UDB Optimizer group. He has extensive experience in the diagnosis and resolution of customer problems.



04 August 2005

Also available in Japanese

Introduction

Working as a DB2 UDB Support Analyst, I frequently hear the following question from customers who are dealing with an optimizer or query planning problem:

"How do I recreate the same query access plan on my test environment that I am getting on my production environment?"

Many times, there is a need to replicate a production environment to a test environment, including recreating the same access plan for query analysis purposes.

For example, in production, you could experience performance problems caused by a query that is using a poor access plan and want to replicate the access plan on a test system in order try some different strategies, such as manipulating the statistics, changing the optimization level, trying different settings for DB2 registry variables, and so on, in order to improve performance.

In an ideal world, you would want to have the test environment as closely matched to production as possible. That is, you would want to use exactly the same hardware, operating system maintenance level and configuration, DB2 level, and configuration in both environments, as well as using the same data in test as in production. However, this ideal cannot always be achieved. If the production environment has a very large amount of data, you may simply not have the capacity to keep a test copy of the production system.

The db2look utility can be used to achieve your goal, even though you can't replicate all the details of production.

This article will explain how you can mimic a production system on a test system without the need for actual data in order to recreate a query planning problem. This ability will help you to debug queries and understand access plan issues without interrupting work in the production environment. Note, however, that if you want to test the execution of the resulting access plan, you will still need to load data from production onto test (as much data as possible). And there is always the possibility that the differences between the test and production systems are still enough that the execution characteristics on test do not match those on production. This part of analysis (performance tuning) is as much an art as science.

Other problems in the optimizer or query compiler area, such as SQL0901N errors or instance crashes, can also be recreated using the methods explained in this article. You can try various strategies, such as testing the most recent fix pack (if the system is at an older fix level), different optimization levels, different registry variables, and so on, in order to see if these changes will correct the problem.

Let's look at the options to use with db2look in order to achieve this goal.


The db2look command and its options

Here are the commands you use to capture the needed information from your production system:

Listing 1. Commands to recreate an optimizer problem
db2look -d <dbname> -l -o storage.out                        
db2look -d <dbname> -f -fd -o config.out
db2look -d <dbname> -e -a -m -o db2look.out 
db2look -d <dbname> -e -a -m -t table1 table2 .... tableX -o table.ddl

Now let's look at these db2look command options in more detail.

Generate buffer pool, tablespace, and database partition group information

db2look -d <dbname> -l -o storage.out

Here is a description of the options used in the db2look command above:

  • -d: Database name -- This must be specified.
  • -l: Generates database layout. This is the layout for database partition groups, buffer pools and tablespaces.
  • -o: Redirects the output to the given file name. If the -o option is not specified, then output goes to standard output (stdout), generally the screen.

The -l option is important to mimicking your production environment. Ideally, you want to have the same buffer pools, database partition groups (if you're in a multi-partition environment), and tablespace information (including temporary tablespaces). However, if you are constrained by memory and cannot allocate the large buffer pools that you have in production, then use the db2fopt command. I'll discuss this command in more detail later in this section.

It is not always possible to have the same tablespaces set up in test that you have in production. For example, you may have devices set up with large sizes, and you may not have the flexibility to create the same device sizes in test. Or, you may not have a separate tablespace device available at all in the test environment. In addition, you might not have the same paths set up in test that you have in production. You would need to alter the paths, devices, and files appropriately to fit your test environment.

The important information used by the optimizer for a tablespace is the following. This is what you would want to make sure are the same on both test and production. (Note: The numbers shown here are an example. You should use the same settings on test as you do on production.)

PREFETCHSIZE 16
EXTENTSIZE 16
OVERHEAD 12.670000
TRANSFERRATE 0.180000

If a tablespace is "managed by database" on production, it should also be "managed by database" on test, using the same type of container (file or device) as used in production. If it is "managed by system" on production, it should also be that way on test.

Note: If this is a system with multiple physical partitions (MPP), the number of partitions in the database partition group must be the same on test. However, the number of physical machines does not have to be the same. The number of logical partitions in the whole MPP environment must be the same on both test and production.

Generate configuration parameters and registry variables

db2look -d <dbname> -f -fd -o config.out

Here, I've used the following parameters:

  • -f: Extracts configuration parameters and registry variables. If this option is specified, -wrapper and -server options will be ignored.
  • -fd: Generates db2fopt statements for opt_buffpage and opt_sortheap, along with other configuration and registry settings.

The output of the command looks like this:

Listing 2. Sample output from db2look command
$ db2look -d sample -f -fd

-- No userid was specified, db2look tries to use Environment variable USER
-- USER is: SKAPOOR
-- This CLP file was created using DB2LOOK Version 8.2
-- Timestamp: Sat Mar 26 00:13:36 EST 2005
-- Database Name: SAMPLE
-- Database Manager Version: DB2/6000 Version 8.2.2
-- Database Codepage: 819
-- Database Collating Sequence is: UNIQUE

CONNECT TO SAMPLE; 

-------------------------------------------------------- 
-- Database and Database Manager configuration parameters 
-------------------------------------------------------- 

UPDATE DBM CFG USING cpuspeed 6.523521e-07; 
UPDATE DBM CFG USING intra_parallel NO; 
UPDATE DBM CFG USING federated NO; 
UPDATE DBM CFG USING fed_noauth NO; 

!db2fopt SAMPLE update opt_buffpage 50000; 
!db2fopt SAMPLE update opt_sortheap 10000; 
UPDATE DB CFG FOR SAMPLE USING locklist 1000; 
UPDATE DB CFG FOR SAMPLE USING dft_degree 1;
UPDATE DB CFG FOR SAMPLE USING maxlocks 10; 
UPDATE DB CFG FOR SAMPLE USING avg_appls 1; 
UPDATE DB CFG FOR SAMPLE USING stmtheap 2048; 
UPDATE DB CFG FOR SAMPLE USING dft_queryopt 5; 

--------------------------------- 
-- Environment Variables settings 
--------------------------------- 

!db2set DB2_ANTIJOIN=yes; 
!db2set DB2_INLIST_TO_NLJN=yes; 

COMMIT WORK;
CONNECT RESET; 
TERMINATE;

The -f and -fd options are key options to use in order to extract configuration parameter and environment variables, which the optimizer uses during access plan phase. In Listing 2, above, note the following output that resulted from the -fd option:

!db2fopt SAMPLE update opt_buffpage 50000; 
!db2fopt SAMPLE update opt_sortheap 10000;

The db2fopt command tells the optimizer to use the specified value for "Buffer pool size," rather than adding up the pages of the buffer pools available. (Buffer pool size in the db2exfmt output is discussed further in the buffer pool size section below.). For example, say that you cannot afford to have large buffer pools due to memory constraints on the test system and would like to configure the size the same without actually having them in reality. Use the -fd option, which would generate the db2fopt commands required to tell the optimizer to use the specified size rather than calculating based on the buffer pools available for this database.

It works the same way for sort heap, as we'll see in the sort heap section below.

Here is the usage for the db2fopt command. Note, that the -fd option with db2look picks the values for you but you should be aware of the usage and as well aware of how to reset the values so that we go back to using syscat.bufferpools for bufferpool pages and sortheap in the database configuration.

C:\>db2fopt
Usage: db2fopt <database-alias> update [opt_buffpage <value>] [opt_sortheap <value>f]
   or  db2fopt <database-alias> get    [opt_buffpage] [opt_sortheap]]

If you would like to set the values for opt_buffpage and opt_sortheap, issue:

db2fopt <dbname> update opt_buffpage <value> opt_sortheap <value>

For example:

C:\>db2fopt sample update opt_buffpage 50000 opt_sortheap 10000

Update succeeded

Make sure to terminate and reconnect to the database.

If you would like to view the values, issue:
C:\>db2fopt sample get opt_buffpage opt_sortheap opt_buffpage value is 50000 opt_sortheap value is 10000

And if you would like to reset the values so that we do not use these two parameters, and go back to using syscat.bufferpools for bufferpool pages estimate, and database configuration for sortheap size, issue:

C:\>db2fopt sample update opt_buffpage -1 opt_sortheap -1

Update succeeded

Make sure to terminate and reconnect to the database.

To make sure that they are reset, use the get option in db2fopt again:

C:\>db2fopt sample get opt_buffpage opt_sortheap

opt_buffpage value is -1
opt_sortheap value is -1

If you're a DBA, you will probably be using the DB2 SQL Explain Tool (db2exfmt) to gain an understanding of your SQL access plan. The db2exfmt tool is used to format the contents of the explain tables. If you look at the output of one of the access plans using db2exfmt from production, you will notice the following at the top of the plan. (Note: For the most part, these parameters are picked up by the -f and -fd option in the db2look output, with the exception of the dbheap setting).

Listing 3. Sample output from db2exfmt
Database Context:
---------------- 
        Parallelism:            None				
        CPU Speed:              6.523521e-07       
       Comm Speed:             100 
        Buffer Pool size:       50000 
        Sort Heap size:         10000 
        Database Heap size:     5120 
        Lock List size:         1000 
        Maximum Lock List:      10 
        Average Applications:   1 
        Locks Available:        7849 

Package Context: 
--------------- 
        SQL Type:               Dynamic 
        Optimization Level:     5 
        Blocking:               Block All Cursors 
        Isolation Level:        Cursor Stability 

---------------- STATEMENT 1  SECTION 201 ---------------- 
        QUERYNO:                1 
        QUERYTAG:               CLP 
        Statement Type:         Select 
        Updatable:              No 
        Deletable:              No 
        Query Degree:           1

If you go a bit further down in the db2exfmt output, right after the access plan, you will see if you have any registry settings that affect the optimizer plan.

Note: Again, unfortunately not all the relevant registry variables are listed by db2look -f. You will need to add the ones that are missing. In general, your registry variable settings on the test system should be identical, or as close as possible, to the settings on production.

Listing 4. Registry settings that affect access plan
1) RETURN: (Return Result) 
        Cumulative Total Cost:          57.6764 
        Cumulative CPU Cost:            191909 
        Cumulative I/O Cost:            2 
        Cumulative Re-Total Cost:       5.37264 
        Cumulative Re-CPU Cost:         134316 
        Cumulative Re-I/O Cost:         0 
        Cumulative First Row Cost:      26.9726 
        Estimated Buffer pool Buffers:   2 

        Arguments: 
        --------- 
        BLDLEVEL: (Build level) 
                DB2 v8.1.0.80 : s041221 
        ENVVAR  : (Environment Variable) 
	    	DB2_ANTIJOIN=yes 
                DB2_INLIST_TO_NLJN = yes 
        STMTHEAP: (Statement heap size) 
                2048

Create data definition language (DDL)

The following db2look command creates the DDL to duplicate all database objects, along with the configuration and statistical information.

db2look -d <dbname> -e -a -m -o db2look.out

Here we've used the following parameters:

  • -a: Generate statistics for all creators. If this option is specified, then the -u option will be ignored.
  • -e: Extract DDL file needed to duplicate database. This option generates a script containing DDL statements. The script can be run against another database to recreate database objects.
  • -m: Run the db2look utility in mimic mode. This option generates a script containing SQL UPDATE statements. These SQL UPDATE statements capture all the statistics. This script can be run against another database to replicate the original one. When the -m option is specified, the -p, -g, and -s options are ignored .

Gather statistics and DDL for a database subset

To gather statistics and ddl for only certain tables and related objects, use the following command:

db2look -d <dbname> -e -a -m -t <table1> <table2> .. <tableX> -o table.ddl

Here, I've used the following additional parameter:

Note: The -m option is very important. This option will collect all the statistics from the system tables. The statistics must be the same in test as they are in production and are key to your being able to mimic the production environment in your test environment.

  • -t: Generate statistics for the specified tables. The maximum number of tables that can be specified is 30.

In addition, if you do not use the -a option, you could use the -z option:

  • -z: Schema name. If -z and -a are both specified, then -z will be ignored. Schema name is ignored for the federated section.

More details db2exfmt output

Database manager level configuration parameters

Note: Use the command db2 "get dbm cfg", in order to look at these parameters and db2 "update dbm cfg using <parameter> <value>" to update a database manager configuration parameter.

Parallelism:
This parameter indicates whether inter- or intra-partition parallelism is enabled. If this is DPF with multiple partitions, then you will see Inter Partition Parallelism. If this is just SMP (intra_parallel enabled) single node environment, then you will see Intra Partition Parallelism. If both intra_parallel enabled and multiple partition environment, you will see both Inter and Intra partitions parallelism for this parameter. And finally, if there is no inter- or intra-parallelism, this parameter will show NONE.

CPU Speed (cpuspeed):
The CPU speed (in milliseconds per instruction) is used by the SQL optimizer to estimate the cost of performing certain operations.

Communications speed: (comm_bandwidth)
The value specified for the communications bandwidth (in megabytes per second) is used by the SQL optimizer to estimate the cost of performing certain operations between partition servers of a partitioned database system.

Database level configuration parameters

Note: Use the command db2 "get db cfg for <dbname>" in order to look at these parameters and db2 "update db cfg for <dbname> using <parameter> <value>") to update a database configuration parameter.

Buffer pool size:
The buffer pool size shown in db2exfmt output is determined by the buffpage parameter, if using buffpage as default for one buffer pool, or a calculation based on the contents of syscat.bufferpools. The number shown is the total number of buffer pool pages that are allocated for the database. For example, let's say we have the following buffer pools:

Table 1. Buffer pool setup
BUFFERPOOLNAMESIZE
IBMDEFAULTBP1000
BP11000
BP24000
BPIND11000
BPIND21000
BPLONG1000
BPTEMP1000
Total:10,000

The db2exfmt output would show the total size as the sum of the number of pages in all the bufferpools. In our example above, it is 10,000. Note: Pagesize does not matter, just the number of pages.

You could use the -fd option in db2look to use the db2fopt alternative if you cannot afford to have the same amount of buffer pool allocated in test as in production.

In MPP, the opt_buffpage is calculated per node, as the optimizer uses the total buffer pool information for the node on which the query is running. Hence, this change will only apply to the node on which this tool is being run.

Sort heap size (SORTHEAP)
This parameter defines the maximum number of private memory pages to be used for private sorts or the maximum number of shared memory pages to be used for shared sorts.

You should set this to the same value as in production. Again, using the -fd option in db2look, you will notice the following:

!db2fopt SAMPLE update opt_sortheap 256;

This will override the sortheap configuration parameter, and this is what the optimizer will use as the sortheap value. Again, in reality, the actual sortheap allocated at runtime will be determined by the sortheap setting in the database configuration. As with opt_buffpage, you can use opt_sortheap if you cannot afford to allocate the same size of sortheap on your test system as you can on production.

Database heap size: (DBHEAP)
There is one database heap per database, and the database manager uses it on behalf of all applications connected to the database. It contains control block information for tables, indexes, table spaces, and buffer pools.

Lock list size: (LOCKLIST)
This parameter indicates the amount of storage that is allocated to the lock list.

Maximum lock list: (MAXLOCKS)
This parameter defines a percentage of the lock list held by an application that must be filled before the database manager performs escalation.

The locklist and maxlocks would help determine the type of locks that will be held during a certain scan (index scan or table scan), along with the isolation level. For example, in the plan you will notice, say, an index scan operation:

	 IXSCAN: (Index Scan)

        		TABLOCK : (Table Lock intent)
                	INTENT SHARE

Note: Do not be concerned if Locks Available differs in your db2exfmt output from your test system, compared to production -- the difference has no effect on query planning.

Average applications: (AVG_APPLS)
This parameter is used by the SQL optimizer to help estimate how much buffer pool will be available at run-time for the access plan chosen (since the buffer pool is shared by all active applications connected to the database).

Optimization Level: (DFT_QUERYOPT)
The query optimization class is used to direct the optimizer to use different degrees of optimization when compiling SQL queries

Query Degree: (DFT_DEGREE)
The degree of intra-partition parallelism for an SQL statement. if set to ANY, the optimizer is sensitive to the actual number of cpus that are online. if you use ANY, then the number of cpus on test and production should be configured the same, unless intra_parallel is disabled.

In addition to the above changes, there are some others that you must make sure are the same.

Number of frequent values retained: (NUM_FREQVALUES)
This parameter allows you to specify the number of "most frequent values" that will be collected when the WITH DISTRIBUTION option is specified on the RUNSTATS command.

Number of quantiles retained: (NUM_QUANTILES)
This parameter controls the number of quantiles that will be collected when the WITH DISTRIBUTION option is specified on the RUNSTATS command.

The above two, NUM_FREQVALUES and NUM_QUANTILES must be the same on the test system as in production in order to make sure you collect the same number of frequent and quantile values on test system as in production.

SQL statement heap (4KB): (STMTHEAP)
The statement heap is used as a workspace for the SQL compiler during compilation of an SQL statement. This parameter specifies the size of this workspace. If this parameter is smaller on test than that in production, you may start seeing SQL0101N message due to lack of space in statement heap needed to compile the query. As well, you may see SQL0437W RC=1 , dropping down to greedy join enumeration if there is not enough statement heap for dynamic join enumeration.


Examples of recreating the optimizer/query plan issue on test system

Example 1:

OS: Windows 2000
DB2LEVEL: V8.2 Fixpack 8 ESE Single partition

Note: Create the TEST database with same codepage, territory and collating sequence as production.

TEST and PRODUCTION same OS and db2level.

Databases:

Production Database: SAMPLE
Test Database: DUMMYDB

Sample database created using: db2sampl
Dummy database created using:
db2 create db DUMMYDB

PRODUCTION ENVIRONMENT:

--------------------------------------------------------
-- Database SAMPLE and Database Manager configuration parameters
--------------------------------------------------------
UPDATE DBM CFG USING cpuspeed 9.446886e-007;
UPDATE DBM CFG USING intra_parallel NO;
UPDATE DBM CFG USING federated NO;
UPDATE DBM CFG USING fed_noauth NO;

!db2fopt SAMPLE update opt_buffpage 250;
!db2fopt SAMPLE update opt_sortheap 256;

UPDATE DB CFG FOR SAMPLE USING locklist 50;
UPDATE DB CFG FOR SAMPLE USING dft_degree 1;
UPDATE DB CFG FOR SAMPLE USING maxlocks 22;
UPDATE DB CFG FOR SAMPLE USING avg_appls 1;
UPDATE DB CFG FOR SAMPLE USING stmtheap 2048;
UPDATE DB CFG FOR SAMPLE USING dft_queryopt 5;

---------------------------------
-- Environment Variables settings
---------------------------------

!db2set DB2_INLIST_TO_NLJN=yes;
!db2set DB2_HASH_JOIN=yes;

In addtion to the above, the following configuration should also be noted in the database configuration:

db2 get db cfg for sample > dbcfg_sample.out

Database heap (4KB)                     (DBHEAP) 	= 	600
SQL statement heap (4KB)                (STMTHEAP) 	= 	2048
Number of frequent values retained     (NUM_FREQVALUES) = 	10
Number of quantiles retained            (NUM_QUANTILES) = 	20

Make sure that you stop and start the instance after making the database manager configuration (dbm cfg) changes. For the sample database, run runstats as follows for the ORG and the SALES table:

db2 connect to sample
db2 runstats on table <schema>.org with distribution and indexes all
db2 runstats on table <schema>.sales with distribution and indexes all
db2 terminate

Now, generate the EXPLAIN tables, by executing the EXPLAIN.DDL file, which is under the <install directory>\sqllib\misc directory:

db2 connect to sample
db2 -tvf <intall path>\EXPLAIN.DDL
db2 terminate

Save the following commands in a file called query.sql:

connect to sample
set current explain mode explain
select * from org a, staff b where a.deptnumb=b.dept and b.dept=15
set current explain mode no
terminate

Now, execute the file as follows:

db2 -tvf query.sql

The above will only compile the query in explain mode. You should see the following on the screen:

C:\>db2 -tvf query.sql
connect to sample

   Database Connection Information

 Database server        = DB2/NT 8.2.1
 SQL authorization ID   = SKAPOOR
 Local database alias   = SAMPLE

set current explain mode explain
DB20000I  The SQL command completed successfully.

select * from org a, staff b where a.deptnumb=b.dept and b.dept=15
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

set current explain mode no
DB20000I  The SQL command completed successfully.

C:\>db2 terminate
DB20000I  The TERMINATE command completed successfully.

Generate the access plan using db2exfmt as follows:

db2exfmt -d SAMPLE -g TIC -w -1 -n % -s % -# 0 -o prod_sample_exfmt.txt

Examine the contents of prod_sample_exfmt.txt file. You should see the following access plan generated:

Access Plan:
-----------
	Total Cost: 		25.8823
	Query Degree:		1

              Rows 
             RETURN 
             (   1) 
              Cost 
               I/O 
               |
                4 
             HSJOIN
             (   2) 
             25.8823 
                2 
          /-----+-----\
        4                1 
     TBSCAN           TBSCAN 
     (   3)           (   4) 
     12.9682          12.913 
        1                1 
       |                |
       35                8 
 TABLE: SKAPOOR   TABLE: SKAPOOR  
      STAFF             ORG

Now, this is the plan you are getting on your production sample database. You would like to mimic this plan on your test environment.

Gather the following information from the production sample database:

db2look -d SAMPLE -l -o storage.out                        
db2look -d SAMPLE -f -fd -o config.out
db2look -d SAMPLE -e -a -m -t ORG SALES -o table.ddl

TEST ENVIRONMENT:
Change the database name in each of the files gathered above where we connect to the database from SAMPLE to DUMMYDB.

For example, if you view the contents of the 3 files, you will notice:

CONNECT TO SAMPLE;

Change this to:

CONNECT TO DUMMYDB;

Take the files over to the test environment. In this example, all the tables are created under the default tablespace, USERSPACE1. So they should also be created in the same SMS tablespace on the test system with same configuration as dumped out in storage.out, including PREFETCHSIZE, EXTENTSIZE, and so on.

Make one little change in the config.out file. Change the following:

UPDATE DB CFG FOR SAMPLE USING dft_queryopt 5;

to,

UPDATE DB CFG FOR SAMPLE USING dft_queryopt 3;

and now save the config.out file.

Now, execute storage.out, config.out, and table.ddl as follows:

db2 -tvf storage.out > storage_output.out
db2 -tvf config.out > config_output.out
db2 -tvf table.ddl > table.out

Examine the output files to make sure that all the commands ran successfully. Also, make changes to DBHEAP, STMTHEAP, NUM_FREQVALUES, NUM_QUANTILES for DUMMYDB as it is for the SAMPLE DB shown in the production environment setup. Also, check that your registry variable settings are as identical as possible.

Stop and start the instance, using db2stop and db2start. Create the explain tables again for this DUMMYDB database:

db2 connect to dummydb;
<install path>\sqllib\misc\db2 -tvf EXPLAIN.DDL
db2 terminate;

Now, run the query against the DUMMYDB database, modifying the database name from SAMPLE to DUMMYDB in the query.sql file you generated earlier when running against the SAMPLE database.

C:\>db2 -tvf query.sql
connect to dummydb

   Database Connection Information

 Database server        = DB2/NT 8.2.1
 SQL authorization ID   = SKAPOOR
 Local database alias   = DUMMYDB

set current explain mode explain
DB20000I  The SQL command completed successfully.

select * from org a, staff b where a.deptnumb=b.dept and b.dept=15
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

set current explain mode no
DB20000I  The SQL command completed successfully.

C:\>db2 terminate
DB20000I  The TERMINATE command completed successfully.

Generate the db2exfmt output:

db2exfmt -d DUMMYDB -g TIC -w -1 -n % -s % -# 0 -o test_dummydb_exfmt.txt

Examine the contents of test_dummydb_exfmt.txt and look at the access plan:

Access Plan:
-----------
	Total Cost: 		25.8843
	Query Degree:		1

              Rows 
             RETURN 
             (   1) 
              Cost 
               I/O 
               |
                4 
             MSJOIN
             (   2) 
             25.8843 
                2 
          /-----+-----\
        1                4 
     TBSCAN           TBSCAN 
     (   3)           (   5) 
     12.913           12.9682 
        1                1 
       |                |
        8               35 
 TABLE: SKAPOOR   TABLE: SKAPOOR  
       ORG             STAFF

You get a different access plan in test than production. In this case, it is obvious, as we changed the DFT_QUERYOPT (default query optimization) on test system from 5 to 3. So, you see a Merge Join plan rather then the Hash Join plan, and the Total Cost differs a tiny bit, as well.

Since the plans don't match (let's say you are not sure why), check the configuration in the db2exfmt output. See Table 2.

As you can see, between the TEST and PRODUCTION, the only difference is the Optimization Level, which we intentionally changed from 5 to 3 just to show why one would not be successful in replicating the production access plan on a testing environment.

In this case, you should update the DFT_QUERYOPT, using the following UPDATE statement, to 5:

UPDATE DB CFG FOR SAMPLE USING dft_queryopt 5

Then terminate and reconnect to the database. Issue query.sql again against the DUMMYDB, and generate the access plan using the db2exfmt command. This time you should see the same access plan. If not, further make sure that all the optimizer-related parameters discussed in this article are identical.

Example 2:

This example shows the importance of the -m option in the db2look command. The statistics are gathered with the -m option as mentioned earlier, and should be the same in both test and production. In this example, we'll see how the plan changes when the statistics are not updated properly.

The database manager configuration, database configuration and db2set registry variables are the same as in Example 1 above. The schema name here is SKAPOOR. Replace it with the schema for your table. The databases are the same, SAMPLE and DUMMY as in Example 1. The platform and db2level used here were AIX 5.1 with DB2 UDB ESE V8.2, Fix pack 8, single partition.

Perform the following commands on the sample database:

db2 "connect to sample"
db2 "create index name_ind on staff (name,id)"
db2 "runstats on table skapoor.staff with distribution and indexes all"
db2 "set current explain mode explain"
db2 "select name from staff where id=10 order by name"
db2 "set current explain mode no"
db2 "terminate"

Generate the access plan using db2exfmt. You will see the following access plan:

Access Plan:
-----------
        Total Cost:             0.111065
        Query Degree:           1

      Rows
     RETURN
     (   1)
      Cost
       I/O
       |
        1
     IXSCAN
     (   2)
    0.111065
        0
       |
       35
 INDEX: SKAPOOR
    NAME_IND

Gather the db2look information from the sample database:

db2look -d sample -l -o storage.out
db2look -d sample -e -a -m -t STAFF -o db2look.out
db2look -d sample -f -fd -o config.out

Modify the files so that you are connecting to the dummy database instead of sample as we did previously in Example 1 above.

Modify one of the statistics manually. Search for the following statement in the db2look.out file (Note that the schema name, TABSCHEMA and INDSCHEMA will differ in your case):

UPDATE SYSSTAT.INDEXES
SET NLEAF=1,
    NLEVELS=1,
    FIRSTKEYCARD=35,
    FIRST2KEYCARD=35,
    FIRST3KEYCARD=-1,
    FIRST4KEYCARD=-1,
    FULLKEYCARD=35,
    CLUSTERFACTOR=-1.000000,
    CLUSTERRATIO=100,
    SEQUENTIAL_PAGES=0,
    DENSITY=0,
    AVERAGE_SEQUENCE_GAP=0.000000,
    AVERAGE_SEQUENCE_FETCH_GAP=0.000000,
    AVERAGE_SEQUENCE_PAGES=0.000000,
    AVERAGE_SEQUENCE_FETCH_PAGES=0.000000,
    AVERAGE_RANDOM_PAGES=1.000000,
    AVERAGE_RANDOM_FETCH_PAGES=0.000000,
    NUMRIDS=35,
    NUMRIDS_DELETED=0,
    NUM_EMPTY_LEAFS=0
WHERE INDNAME = 'NAME_IND' AND INDSCHEMA = 'SKAPOOR '
      AND TABNAME = 'STAFF' AND TABSCHEMA = 'SKAPOOR ';

Now, change the FIRSTKEYCARD, FIRST2KEYCARD, FULLKEYCARD AND NUMRIDS from 35 to 37. Save the db2look.out file and now run the 3 files:

db2 -tvf config.out > config_output.out
db2 -tvf storage.out > storage_output.out
db2 terminate
db2stop
db2start
db2 -tvf db2look.out > db2look_output.out

Examine the contents of the first two files, config_output.out and storage_output.out to make sure they ran successfully. Now, examine the contents of db2look_output.out file. You will see that the following update statement below failed:

UPDATE SYSSTAT.INDEXES SET NLEAF=1, NLEVELS=1, FIRSTKEYCARD=37, FIRST2KEYCARD=37
, FIRST3KEYCARD=-1, FIRST4KEYCARD=-1, FULLKEYCARD=37, CLUSTERFACTOR=-1.000000, C
LUSTERRATIO=100, SEQUENTIAL_PAGES=0, DENSITY=0, AVERAGE_SEQUENCE_GAP=0.000000, A
VERAGE_SEQUENCE_FETCH_GAP=0.000000, AVERAGE_SEQUENCE_PAGES=0.000000, AVERAGE_SEQ
UENCE_FETCH_PAGES=0.000000, AVERAGE_RANDOM_PAGES=1.000000, AVERAGE_RANDOM_FETCH_
PAGES=0.000000, NUMRIDS=37, NUMRIDS_DELETED=0, NUM_EMPTY_LEAFS=0 WHERE INDNAME =
 'NAME_IND' AND INDSCHEMA = 'SKAPOOR ' AND TABNAME = 'STAFF' AND TABSCHEMA = 'SK
APOOR '
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL1227N  The catalog statistic "37" for column "FULLKEYCARD" is out of range
for its target column, has an invalid format, or is inconsistent in relation
to some other statistic. Reason Code = "8".  SQLSTATE=23521

As you can see, the above UPDATE statement for the index, NAME_IND, failed because the FULLKEYCARD is greater then the cardinality (CARD) of the table. The CARD is 35 as can be seen by the following update statement in the db2look.out file:

UPDATE SYSSTAT.TABLES
SET CARD=35,
    NPAGES=1,
    FPAGES=1,
    OVERFLOW=0,
    ACTIVE_BLOCKS=0
WHERE TABNAME = 'STAFF' AND TABSCHEMA = 'SKAPOOR ';

Now again, run the same query:

db2 "select name from staff where id=10 order by name"

in explain mode and generate the access plan. You will see it is different:

Access Plan:
-----------
        Total Cost:             12.972
        Query Degree:           1

      Rows
     RETURN
     (   1)
      Cost
       I/O
       |
        1
     TBSCAN
     (   2)
     12.972
        1
       |
        1
     SORT
     (   3)
     12.9708
        1
       |
        1
     TBSCAN
     (   4)
     12.9682
        1
       |
       35
 TABLE: SKAPOOR
      STAFF

This example shows that if RUNSTATS is run while there is WRITE activity happening on the table, the statistics can be inconsistent as in this example. Hence, the UPDATE statements to update the statistics can fail with the SQL1227N error message. It is very important that all the UPDATE statements run successfully, and if there are any inconsistencies, they should be fixed and re-run. In this case, solution would be to change back the KEYCARDS and NUMRIDS from 37 to 35.

Example 3:

You want to mimic the whole database in production to test, in a single partition environment.

Note: You may need to change the database name in each output of db2look, if the database names are different on test compared to production.

  1. Step1: Collect the db2look with the -l option to gather the tablespace/bufferpoo/database node group information.

    db2look -d <dbname> -l -o storage.out

    Modify the tablespace information to fit your test environment. Say, for example, in production you have the following tablespace:
    ------------------------------------
    -- DDL Statements for TABLESPACES --
    ------------------------------------
    
    CREATE REGULAR TABLESPACE DMS1 IN DATABASE PARTITION GROUP IBMDEFAULTGROUP 
    PAGESIZE 4096 MANAGED BY DATABASE
             USING (	FILE '/data/dms1'20000,
                    	FILE '/data/dms2'20000,
    		FILE '/data/dms3'20000)
             EXTENTSIZE 32
             PREFETCHSIZE 32
             BUFFERPOOL IBMDEFAULTBP
             OVERHEAD 12.670000
             TRANSFERRATE 0.180000
             DROPPED TABLE RECOVERY ON;

    If you do not have the same paths set up on test, then change the location above. If you are only planning to mimic the environment, not copy the data over, then make the file sizes smaller, and use fewer containers if needed. If you do not have the same buffer pools created, then you can also change the buffer pool name. Buffer pool must have same pagesize. Don't change tablespace pagesize Once you are done with this and have created the databsae, run the storage.out file:

    db2 -tvf storage.out

    Redirect the output if needed to make sure it all went through successfully. For example:

    db2 -tvf storage.out > storage_results.out
  2. Step 2: Collect the configuration and environment variable information from production and run it on test:
    db2look -d sample -f -fd -o config.out

    Remember, in MPP environments, this information will be collected for the node where you ran this. If your DB2 registry and database and database manager configuration are different on different database partitions, you would want to gather the information from each node separately. However, if you cannot have the same partitions in test as in production, then use this information from the node where you execute the query in production and then use this information on test.

    Note that if you have different number of partitions on test, your simulation will be imperfect.

    On test, run the config.out file as follows:

    db2 -tvf config.out

    The above is taking under consideration that the optimizer will use the db2fopt information to look at the total buffer pools allocated and sortheap, which will be the setting now in your test environment. Again, this is a technique used when you don't have the same buffer pools due to memory constraints and sortheap in test as in production Also, other configuration parameters along with environment variables discussed earlier in this article will also be updated.

  3. Step 3: Collect the DDL information for all objects from production, as we are mimicking the whole database, and run the db2look in test.

    On production:

    db2look -d sample -e -a -m -o db2look.out

    On test:

    db2 -tvf db2look.out

    To see the output result, issue:

    db2look -tvf db2look.out > db2look.results

    Once the above steps are complete, ensure that the dbheap database configuration parameter is set to the same value in test as in production.

  4. Step 4: Get the access plan using db2exfmt from both test and production, and make sure that the following is the same as it is in production:

    Database Context:
    ----------------
            Parallelism:            None
            CPU Speed:              4.762804e-07
            Comm Speed:             100
            Buffer Pool size:       128500
            Sort Heap size:         128
            Database Heap size:     5120
            Lock List size:         12250
            Maximum Lock List:      10
            Average Applications:   4
            Locks Available:        78400
    
    Package Context:
    ---------------
            SQL Type:               Dynamic
            Optimization Level:     3
            Blocking:               Block All Cursors
            Isolation Level:        Cursor Stability
    
    
    ---------------- STATEMENT 1  SECTION 201 ----------------
            QUERYNO:                1
            QUERYTAG:               CLP
            Statement Type:         Select
            Updatable:              No
            Deletable:              No
            Query Degree:           1

    Now, look at the access plan. If they are the same, then you are successful in recreating the access plan. Note also that you should also look towards the bottom of the db2exfmt output to verify that the tablespace configuration match.

Example 4:

Production: MPP with 16 physical / 4 logical per partition.
Test: MPP with only 4 physical machines available with 4 logical per partition.
Tables, views/MQTs involved in the query.

In this example, the simulation may not work accurately. You must have the name number of partitions in both test and production. However, you can still try to recreate, but it will not be right.

Therefore, you must add 16*4=64 partitions to the test environment in order to get a proper recreation. You do not need 16 physical machines in the test environment; that is, you can have four physical machines with 16 logical partitions per physical. It's up to you, but you must have 64 logical partitions overall, the same as production.

So now, after making the changes to add the same number of logical partitions to test environment, the test should look like the original production setup, shown in the table below.

Table 3. Production setup
DBPARTITIONS
ALLNODES (on nodes 1 to 64)
NODE1 (db partition defined on node 1)
NODE2 (db partition defined on node 5)
TABLESPACES
TABSPACE1 (DMS using devices defined in dbpartition ALLNODES)
TABSPACE2 (DMS using SMS defined in dbpartition NODE1)
TABSPACE3 (DMS using DMS defined in dbpartition NODE2)
TABLES
TAB1 in TABSPACE1
TAB2 in TABSPACE2
TAB3 in TABSPACE3
MQT:
MQT defined on TAB3
VIEW:
VIEW1 defined that include two tables TAB1 and TAB2

Make sure you gather the db2look using the -f and -fd from the node from which the query is being issued to make sure that the buffer pool info, as discussed earlier, is picked from this node and the registry setting, along with db cfg and dbm cfg, is picked up from the node where the query is being ran. In my experience, customers usually have same configuration on all the nodes, except for buffer pools, which is a very important setting.

Steps to follow:

  1. Step1: Gather the storage information from production:

    db2look -d <dbname> -l -o storage.out
  2. Step2: Modify the tablespace/buffer pool information to fit the test environment. If you do not have devices available, then use DMS file containers, but be aware that if you do so, then your simulation will no longer be completely accurate and you may not be able to reproduce exactly the same plan. Also, if you do not wish to use the same number of containers in test that are in production, shorten the list and use fewer containers. But again, you will have to make sure that if a tablespace is of type DMS or SMS on production, it must remain the same type in test and, to ensure a completely accurate simulation, the DMS container type should also match.
  3. Step3: Gather the configuration information using the following command:

    db2look -d <dbname> -f -fd -o config.out
  4. Step4: Now, gather the db2look information for just the objects that we are interested in. In our example, we want all related information, including table DDL, views, and MQTs associated with the tables:

    db2look -d <dbname> -e -a -m -t TAB1 TAB2 TAB3 -o db2look.out

    Once you have gathered all of this information and modified the tablespace/buffer pool information, execute the db2look output files on test environment, and, again, get the db2exfmt output from both production and test and compare them.

Example 5:

This is a classic example of an issue where you have collected RUNSTATS information during activity on the table. You're getting SQL1227N error messages, and you will not be able to recreate the issue unless you modify the statistics manually.

Say for example, the table has one million rows and there is a primary key defined on an integer column. You ran RUNSTATS with distribution and indexes all option, allowing write access to the table. By the time we got the table statictics, there were 100,000 additional records inserted into the table. So for the table statistics, the CARD will show as 1,100,000. However, by the time we started collecting index statistics, for example, for the primary key defined on the integer column, there were 10,000 additional records inserted, hence, the number of rows in the table is 1,110,000, and hence the primary index FIRSTKEYCARD would be 1,110,000. So you can see the inconsistency. The CARD for the table statistics says that there should be 1,100,000 records in the table, whereas the FIRSTKEYCARD for the primary index statistics says that there are 1,110,000 records in the table. The update for the index statistics would fail with a SQL1227N rc=8 error message (in this case) as the FIRSTKEYCARD of the index is greater than CARD of the table. You will have to manually fix this inconsistency and, for this example, make the FIRSTKEYCARD equal to CARD, which is 1,100,000, or the opposite -- that is, increase CARD to equal FIRSTKEYCARD, which is 1,110,000.

There are many other inconsistencies you could encounter. Make sure that when you run the db2look with the -m option on test that you save the output to a file, review all the inconsistencies, and fix them. I have only given an example of one inconsistency here; there are many others that you can run into, and it is left to the user to fix all these inconsistencies and then rerun db2look, redirecting the output to a file to make sure all the update statistics ran fine without any problems.

Example 6:

This is an example where in production you get a SQL0437W rc=1 warning message and you are not seeing it test. In this case, follow the above examples to recreate the issue. Make sure that STMTHEAP is the same. If it is different (for example, higher in test for some reason then in production), then you may not see the same warning. Again, the other parameters that we've already discussed matter as well.

SQL0437W rc=2 and other return codes can be recreated as well, following the same method.

Other errors messages, such as SQL0101N and SQL0901N, can also be recreated using the same method. Even traps in the compiler/optimizer area can be recreated. This is very useful when you are at an older fix pack level and want to try the latest fix pack level to see if that avoids the problem or when you want to play around with different optimization levels to see if that would overcome the problem temporarily.


Conclusion

The db2look utility is quite a powerful utility that you can use to recreate access plan problems along with compiler issues like those we have talked about in this article. Once the problem is recreated, you can test out many variables that can affect performance, such as changing optimization levels, playing with registry variables, and updating the statistics without affecting the production, as well as testing new fix pack levels. You'll find this a handy utility for debugging problems and improving query performance.

Resources

  • Learn more:
    • developerWorks DB2 zone: Learn more about DB2. You'll find technical documentation, how-to articles, education, downloads, product information, and more.
  • Discuss:

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=91239
ArticleTitle=Recreate optimizer access plans using db2look
publish-date=08042005