Technical Blog Post
Database Manager and Database configuration parameters affecting query plan
The performance of the query in DB2 can be affected by many thing like the indexes created, runstats used, reopt bind options etc.. There are also a number of Database Manager and Database configuration parameters that affect the optimizer choosing the optimal plan in DB2.
Some are listed below -
Database Manager configuration parameters -
1) Parallelism - The first one we will consider is Parallelism. When a number of tasks are run in parallel we say parallelism is enabled. This can improve the performance of a system highly. Parallelism can occur at two levels. At the partition level and at the query level.
Inter Query - When multiple queries are executed at the same time by the system.
Intra Query - When parts of the same query are run(processed) in parallel at the same time.
Intra Query can be
I) Interpartition -Single query runs parallel across multiple partition. This could be on the same machine or across multiple machines. Here degree of parallelism determined by number of partitions.
II) Intrapartition -Single query broken up and run in parallel on single partition. An operation like index creation, database load or a complex query can be divided into multiple part to be run in parallel on the same database partition.
a)intra_parallel (DBM cfg) -When this value is set to YES (1) implies that the Database Manager is set to use intrapartition parallelism.
b)max_querydegree (DBM cgf) -This sets the maximum degree of intrapartition parallelism for any query at the instance level. If this is set to -1(ANY), it implies degree is determined by optimizer.
c)dft_degree (DB cfg) -This sets the default value for CURRENT DEGREE special register and the DEGREE bind option. CURRENT DEGREE special register is used to specify intrapartition parallelism for a dynamic sql statement at compile time and DEGREE bind option is used for static sql statements.
III). Both Interpartition and Intrapartition is enabled.
2) cpuspeed (DBM cfg) -The CPU speed, also know as the processor speed, is the amount of cycles that a CPU can perform per second. The cpu speed influences the optimizer in determining the access paths. This can also be used to model a different env. if you are trying repro the issue or test it with a different value. Setting it to -1 will re-calculates this value.
3) comm_bandwidth (Megabytes per second)(DBM cfg) - This parameter indicates communication bandwidth between database partitions. This is used by optimizer is calculating the cost of certain operations between database partitions.
4) sheapthres (DBM cfg) -Sheapthres is the soft limit on total number of amount of memory consumed by the private sorts at a given time. We will dig more into this when we discuss the Database Configuration parameters
Database level configuration parameters -
1) Bufferpool size (buffpage) -A good Bufferpool size will reduce the disk I/O and can save on a lot of time for the optimizer as more work can be done in memory. This can improve the overall performance of the query. The total number of bufferpool pages allocated for the database and used by the optimizer is seen in the db2exfmt output.
2) SORT HEAP size (SORTHEAP) - This parameter defines the maximum number of private memory pages to be used for private sorts or the maximum number of shared memory pages to be used for shared sorts.
(sheapthres and sheapthres_shr)
3) Database heap size (DBHEAP) - There is one database heap allocated per database. It contains control block information for tables, indexes, tablespaces, and bufferpools.
4) Lock list (LOCKLIST) and Maximum lock list (MAXLOCKS)
LOCKLIST is the storage allocated for locks per database and it contains the locks held by all applications concurrently connected to the database.
MAXLOCKS is the percentage of the lock list held by an application that must be filled before the database manager performs lock escalation. Optimizer considers these two to see if any lock escalation is needed.
5) Average Number of Active Applications (AVG_APPLS)
The optimizer uses the avg_appls parameter to help estimate how much of the buffer pool might be available at run-time for the access plan chosen. Ex. a value of 1 implies the entire buffer pool will be available to the application.
6)Optimization Level (DFT_QUERYOPT)
This parameter specifies the default value for the CURRENT DEGREE special register and the DEGREE bind option.
7) Number of frequent values retained(NUM_FREQVALUES)
This parameter allows you to specify the number of "most frequent values" that will be collected when the WITH DISTRIBUTION option is specified on the RUNSTATS command. The "most frequent value" statistics help the optimizer understand the distribution of data values within a column. A higher value results in more information being available to the query optimizer but requires additional catalog space.
8) Number of quantiles retained(NUM_QUANTILES)
This parameter controls the number of quantiles that will be collected when the WITH DISTRIBUTION option is specified on the RUNSTATS command.
The "quantile" statistics help the optimizer understand the distribution of data values within a column. A higher value results in more information being available to the query optimizer but requires additional catalog space. When 0 or 1 is specified, no quantile statistics are retained, even if you request that distribution statistics be collected.
9) Statement heap size (STMTHEAP)
This parameter specifies the limit of the statement heap, which is used as a work space for the SQL or XQuery compiler during compilation of an SQL or XQuery statement.