Performance is a vital key to the success of your on demand applications. When those applications are using IBM® DB2 Universal Database™ as a data store, it's essential that you begin with a fundamental knowledge of how to achieve the best possible performance with DB2 UDB. In this article I'll give in-depth recommendations for tuning a DB2 UDB V8 system.
We'll talk about performance issues from the beginning to the end of the process. You can follow the flow from creating a new database to running with your application. You will see how to use the DB2 auto-configuration utilities to initially configure your database manager and database environment. Then I'll discuss best practices for creating buffer pools, table spaces, tables, and indexes. There are some important configuration parameters you may want to adjust from their initial settings to better support your application, so we'll take a look at those configuration parameters as well.
We'll cover tuning based on monitor output in detail. I'll show you how to use snapshot monitoring to help tune your SQL, buffer pools, and various database manager and database configuration parameters. Then, we'll take a closer look at the SQL your application issues to DB2. We'll cover statement event monitoring, showing you how to capture the SQL your application is issuing. Using Explain, you can generate the access plan that the SQL is taking and look for opportunities to better optimize. We'll examine the Design Advisor, a tool which can recommend new indexes or evaluate existing indexes, based on SQL workloads that you supply to it. Finally, I'll discuss some DB2 SQL options.
In addition, on-going maintenance is very important to maintain optimal performance. So we'll cover the important utilities to help you do this. For those who are using the DB2 ESE Database Partitioning Feature (DPF), I've included a section covering the issues you should be concerned with to keep the database performing well. Sometimes there is an external bottleneck (from DB2) which can prevent you from achieving your performance goals. Common bottlenecks and the utilities you can use to monitor them are listed. Finally, the paper ends with a listing of valuable IBM resources to help you find valuable DB2 information.
The article is intended for those with an intermediate skill in DB2 database administration.
Before you start
Before you begin the performance tuning process, make sure you have applied the latest DB2 fix pack. There are often performance enhancements in fix packs. DB2 FixPak 4 was used as a basis for this article. If you are using a pre-FP4 release, not all options discussed may be available in your environment.
When you are tuning, it is ideal to have a reproducible scenario of database use (that is, the workload your application runs against DB2) that you can use for tailoring your tuning efforts. For example, if the workload has variations of 10% is elapsed time from run to run, it is very difficult to know what effect tuning has really had. Additionally, in cases where the workload varies from run to run, it is hard to measure the changes to the database manager and database configuration parameters.
Always keep track of all changes. This can be useful for the development of tuning scripts or recommendations, as a history for other DBAs, and for backing out of any bad changes.
At the end of most sections, there are pointers to relevant sections of the DB2 v8 HTML Documentation. The online documentation can be found at http://publib.boulder.ibm.com/infocenter/db2help/index.jsp.
The "Top 10" performance boosters
Here are the top 10 things you can do to get the most performance out of your database. Usually, you will find that about 90% of maximum performance is achieved using about 10% of possible configuration changes. I'll discuss each item in detail in the appropriate section below (identified in parenthesis):
- Ensure that you have enough disks (6-10 per CPU is a good start). Each table space's containers should span all available disks. Some table spaces, such as SYSCATSPACE and those with a small number of tables do not need to be spread across all disks, while those with large user or temporary tables should (Table Spaces).
- Buffer pools should make use of about 75% (OLTP) or 50% (OLAP) of available memory (Buffer Pools).
- Runstats should be performed on all tables, including the system catalog tables (Runstats).
- Use the Design Advisor to recommend and review indexes for SQL workloads (Design Advisor).
- Use the Configuration Advisor to configure the database manager and database for your application environment (Configuration Advisor).
- Logging should occur on a separate high-speed disk, identified by the NEWLOGPATH database configuration parameter (Experimenting).
- Concurrency can be increased by committing often (SQL statement tuning).
- SORTHEAP should be increased to avoid sort overflows (DBM and DB configuration).
- Table space type should be SMS for the system catalog table space, and temporary table spaces and DMS raw (device) or file for the rest. Run db2empfa to enable multi-page file allocation for the SMS table spaces; this will allow SMS table spaces to grow an extent at a time (instead of a page), which can speed up heavy insert operations and sorts which spill to disk (Table spaces).
- Use parameter markers for repeated statements (SQL statement tuning).
Creating the database
Creating a database will by default create 3 System Managed Storage (SMS) table spaces (SYSCATSPACE, TEMPSPACE1, and USERSPACE) and a 4 MB buffer pool (IBMDEFAULTBP), which all use a 4 KB page size. It is usually a good idea to drop TEMPSPACE1 and USERSPACE and recreate them according to the recommendations below. In almost all cases, SYSCATSPACE does not need to be optimized further, but you may notice a slight performance improvement by spreading its containers over a few disks (discussed later).
When creating the database, you can take advantage of the autoconfigure options to initially automatically configure the database for the environment. This is handy if your application programmatically creates a DB2 database, as you can provide these options from your application to DB2. The other option you have to automatically configure your database is the more powerful Configuration Advisor GUI, which not only configures your database but your instance as well. However, to use the Configuration Advisor, the database must exist. We'll discuss the Configuration Advisor in a following section.
In Listing 1, we use the autoconfigure option of the CREATE DATABASE command to create a database in Windows that has its SYSCATSPACE spanned across two available disks.
Listing 1. Creating a database with autoconfigure options
create database prod1 catalog tablespace managed by system using ('c:\proddb\cattbs\01','d:\proddb\cattbs\02') extentsize 16 prefetchsize 32 autoconfigure using mem_percent 50 workload_type simple num_stmts 10 tpm 20 admin_priority performance num_local_apps 2 num_remote_apps 200 isolation CS bp_resizeable yes apply db and dbm
Table 1 shows the valid autoconfigure input keywords and values:
Table 1. Autoconfigure options
|Keyword||Valid Value||Default Value||Description|
|mem_percent||1-100||25||Percentage of physical memory to dedicate to the database. If other applications are running on this server (not including the OS), set this to less than 100.|
|workload_type||simple, mixed, complex||mixed||Simple workloads tend to be I/O intensive and mostly transactions (OLTP), whereas complex workloads tend to be CPU intensive and mostly queries (OLAP/DSS).|
|num_stmts||1-1000000||25||Number of statements per unit of work.|
|tpm||1-200000||60||Transactions per minute.|
|admin_priority||performance, recovery, both||both||Optimize for better performance (more transactions per minute) or better recovery time.|
|num_local_apps||0-5000||0||Number of connected local applications.|
|num_remote_apps||0-5000||100||Number of connected remote applications.|
|isolation||RR, RS, CS, UR||RR||Isolation level of applications connecting to this database (Repeatable Read, Read Stability, Cursor Stability, Uncommitted Read).|
|bp_resizeable||yes, no||yes||Whether or not the buffer pools can be resized online.|
The Configuration Advisor
If you have already used autoconfigure while creating the database, this step is not as important. The Configuration Advisor is a GUI tool which allows you to automatically configure a database and instance based on the answers you provide to a series of questions. You will often notice a pretty significant increase in performance by running this tool. It can be launched from the Control Center by right-clicking on the database and selecting "Configuration Advisor…". Once you have answered all the questions, you can generate and optionally apply the results. Figure 1 shows a screenshot of the results page:
Figure 1. Configuration Advisor Results screen
Creating buffer pools
Properly defining buffer pools is one of the major keys to having a well-performing system. With 32-bit operating systems, it is important to be aware of the limit on shared memory, which restricts a database’s buffers pools (that is, database global memory) from exceeding the following limits (64-bit systems do not have such a limit):
- AIX - 1.75 GB
- Linux - 1.75 GB
- Sun - 3.35 GB
- HP-UX - approximately 800 MB
- Windows - 2-3 GB (use ‘/3GB’ switch in boot.ini on NT/2000)
Use the following formula for calculating your approximate database global memory usage:
Listing 2. Calculating database global memory usage (shared memory)
buffer pools + dbheap + util_heap_sz + pkgcachesz + aslheapsz + locklist + approx 10% overhead
If INTRA_PARALLEL is enabled, then add the value of sheapthres_shr to the total.
Determining how many buffer pools
You will need at least one buffer pool for each page size used by a table space in your database. Usually, the default IBMDEFAULTBP buffer pool is left for the system catalogs. New buffer pools are created to handle different page sizes and behaviors of the table spaces.
For starters, you should begin with one buffer pool for each page size being used, especially for OLAP/DSS type workloads. DB2 is very good at self-tuning its buffer pools and keeping the most frequently accessed rows in memory, so one buffer pool may be all you need. (This option also avoids the complexity of managing multiple buffer pools.)
If you have the time and require improvements, you may want to use multiple buffer pools. The idea is to keep the most frequently accessed rows in a buffer pool. Sharing a buffer pool with tables that are accessed randomly or infrequently can cause "polluting" of your buffer pool, by consuming space and possibly pushing a frequently accessed row to disk, for a row which may never be accessed again. Keeping indexes in their own buffer pool can also significantly improve performance when indexes are heavily used (for example, index scans).
This ties in closely to our discussion of table spaces, as you will want to assign buffer pools based on table behavior within the table space. With the multiple buffer pool approach, a good starting point would be to use 4 buffer pools as follows:
- A medium-sized buffer pool for temporary table spaces
- A large-sized buffer pool for index table space
- A large-sized buffer pool for table spaces which contain frequently accessed tables
- A small-sized buffer pool for table spaces which contain infrequently accessed, randomly accessed, or sequentially accessed tables
DMS Table spaces which contain only LOB data can be assigned to any buffer pool, as LOBs do not use buffer pool space.
Determining memory allocation for buffer pools
Never allocate more memory than you have available or you will incur costly OS memory paging. Generally speaking, it may be pretty hard to know how much memory to initially allocate to each buffer pool without monitoring.
For OLTP type workloads, 75% of available memory being allocated to the buffer pool(s) is a very good starting point.
For OLAP/DSS, the rule of thumb is to give 50% of your available memory to a single buffer pool (given that only one page size is being used) and the other 50% to the SORTHEAP.
Using block-based buffer pools
OLAP queries that rely heavily on prefetching may benefit from a block-based buffer pool. By default, all buffer pools are page-based, which means that prefetches will place contiguous pages on disk into non-contiguous memory. If a block-based buffer pool is used, DB2 will use block I/Os to read multiple pages into the buffer pool in a single I/O which significantly improves the performance of sequential prefectching.
A block-based buffer pool consists of both the standard page area and a block area. The NUMBLOCKPAGES parameter of the CREATE and ALTER BUFFERPOOL SQL statement is used to define the size of the block memory, while the BLOCKSIZE parameter specifies the size of the blocks, and hence the number of pages to read from a disk in a block I/O.
Table spaces which share the same extent size should be the only users of a specific block-based buffer pool. Set the BLOCKSIZE equal to the table space’s EXTENT SIZE that is using the buffer pool.
Deciding how much memory to dedicate to blocking within the buffer pool is more complicated. If there is a lot of sequential prefetching occurring, you will likely want more block-based buffer pool available. NUMBLOCKPAGES should be a multiple of BLOCKSIZE and cannot be more than 98% of the number of pages for the buffer pool. Set small at first (not more than 15% or so of the total buffer pool size). It can be adjusted later, based on snapshot monitoring.
- Concepts ==> Administration ==> Database objects ==> Buffer Pool Management
- Reference ==> SQL ==> SQL Statements ==> CREATE BUFFERPOOL
- Reference ==> SQL ==> SQL Statements ==> ALTER BUFFERPOOL
- Concepts ==> Administration ==> Performance tuning ==> Operational performance ==> Memory-use organization
Creating table spaces
Since table spaces are assigned a buffer pool, the previous section on buffer pools is very relevant to performance issues involving table spaces. Using the DB2 Control Center is the easiest and recommended way to create and configure table spaces (right-click on your database’s Table Spaces folder and select Create…).
Determining the type of table space to create (DMS or SMS)
System Managed Storage (SMS) should be used for system temporary table spaces and the system catalog table space, as it allows for the table space to grow and shrink dynamically. DMS may be more efficient in cases where there are large temporary tables that are flushed to disk (either by not having enough sort space or by explicitly creating temporary tables). When using SMS, you should run the utility ‘db2empfa’ which will enable multi-page file allocation to grow the table space one extent at a time, as opposed to one page at a time.
Database Managed Storage (DMS) should be used for all other table spaces. DMS allows for a single table to span multiple table spaces (index, user data, and LOB) which decreases contention between index, user, and LOB data during prefetch and update operations, thereby improving data access times. Using DMS raw may even squeeze an additional 5-10% performance increase.
Determining page size
In order for you to create a table, there must be a table space with a page size large enough to contain a row. You have the option of using 4, 8, 16 or 32 KB page sizes. Sometimes you simply have to use a larger page size to get around some database manager limits. For example, the maximum size of the table space is proportional to the page size of the table space. A table space’s size (per partition) is limited to 64 GB using a 4K page size and to 512 GB using a 32K page size.
For OLTP applications that perform random update operations, a smaller page size is usually preferable as it consumes less space in the buffer pool.
For OLAP applications that access large numbers of consecutive rows at a time, a larger page size is usually better because it reduces the number of I/O requests that are required to read a specific number of rows. Larger page sizes also allow you to reduce the number of levels in the index by keeping more row pointers on a single page. There is however an exception to this. If your row size is smaller than the page size / 255, there will be wasted space on each page due to there being a maximum of 255 rows per page (does not apply to index data pages). In this situation, a smaller page size may be more appropriate.
For example, if you were using a 32K page size to store rows with an average length of 100 bytes, only 100 * 255 = 25500 bytes (24.9 KBs) could be stored on a 32 KB page. This means about 7 KBs per 32 KB page is being wasted. This could be a lot of space in a large database.
Determining how many table spaces
Like buffer pools, you should begin with one table space for each page size being used. For each page size being used, a system temporary table space with a matching page size will need to exist (to support sorts and reorgs). You would then assign all table spaces which share a matching page size to a buffer pool with the same page size.
If you are still concerned about performance and have time to invest, you would use DMS table spaces and group tables based on use. Additionally, follow the recommendations above for using multiple buffer pools.
For each page size, create a:
- System temporary table space
- Regular table space for Indexes
- Regular table space for frequently accessed tables
- Regular table space for infrequently accessed, randomly accessed, and sequentially accessed tables
- Large table space for LOB data
A good starting point is about 6-10 disks per CPU dedicated to table spaces. Each table space should span all disks, that is, have one container (and no more) on each available disk.
You should create the same number of Logical Volumes (UNIX) on each disk as you have table spaces. This way, each table space will have its own logical volume on each disk for container placement. If you are not using DMS raw, you will need to create a file system within each logical volume.
Disk array and storage subsystems
For large disk systems, you should use a single container. Additionally, you will need to set the DB2 Profile Registry variable DB2_PARALLEL_IO for that table space. This is discussed in the Profile registry section.
The Extent Size specifies the number of PAGESIZE pages that will be written to a container before skipping to the next container and is defined at table space creation time (and cannot be easily modified after). Smaller tables are handled more efficiently with smaller extents.
Rule of thumb is based on the average size of a table in the table space:
- Less than 25 MB, use an Extent Size of 8
- Between 25 and 250 MB, use an Extent Size of 16
- Between 250 MB and 2 GB, use an Extent Size of 32
- Greater than 2 GB, use an Extent Size of 64
Use larger values for OLAP databases and tables that are mostly scanned (query only) or have a very high growth rate.
If the table space resides on a disk array, set the extent size to the stripe size (that is, data written to one disk of the array).
Prefetch size can be changed easily using ALTER TABLESPACE. The optimal
setting seems to be:
Prefetch Size = (# Containers of the table space on different physical disks) * Extent Size
If the table space resides on a disk array, set it as:
PREFETCH SIZE = EXTENT SIZE * (# of non-parity disks in array).
- Concepts ==> Administration ==> Database design ==> Physical ==> Table Space Design
- Reference ==> SQL ==> SQL Statements ==> CREATE TABLESPACE
- Reference ==> SQL ==> SQL Statements ==> ALTER TABLESPACE
Multidimensional clustering (MDC)
MDC provides for flexible, continuous, and automatic clustering of data along multiple dimensions. It improves query performance and reduces the need for REORG and index maintenance during insert, update, and delete.
Multidimensional clustering will physically cluster the table’s data along multiple dimensions simultaneously, which is similar to having multiple, independent clustered indexes on the table. MDCs are typically used to help speed up the performance of complex queries on large tables. There is no need to use REORG to recluster the index since MDCs maintain clustering in each dimension automatically and dynamically.
The best candidates for an MDC are those queries which have range, equality, and join predicates that access many rows. Never use a unique column as a dimension since that can cause a table to be unnecessarily large. Avoid using too many dimensions if there are not a significant number of rows with each combination of dimension values (i.e., cell). For best performance, you will want at least enough rows to fill a full block of each cell, which is equal to the extent size of the table space that the table resides in.
- Concepts ==> Administration ==> Database design ==> Logical ==> Multidimensional clustering (MDC)
- Concepts ==> Administration ==> Database objects ==> Tables ==> Multidimensional clustering (MDC) tables
Materialized query tables (MQTs)
MQTs can be used to improve queries which use the GROUP BY, GROUPING, RANK, or ROLLUP OLAP functions. Their use is transparent to the user and DB2 chooses when to use them for optimization purposes. The MQT is used by DB2 to internally maintain summarized results of the required grouping, which allows a user to access a maintained grouping instead of reading through what could be multiple GBs of data to find the answer. These can also be replicated across partitions to help performance of collocated joins by avoiding the broadcasting of this information between partitions.
CREATE TABLE options
Avoid using the VARCHAR data type for columns of 30 bytes or less as it typically wastes space; instead use CHAR. Wasting space can even affect query times if the volume is significant.
When using an IDENTITY or SEQUENCE, use at least the default cache size of 20 (unless there is a big concern about gaps in the numbering). This way you are not calling on the DBM to generate a number every time one is needed and you are avoiding the logging which occurs with number generation.
VALUE COMPRESSION and COMPRESS SYSTEM DEFAULT can save disk space when a table uses many null and system default values. System default values are the default values used for the data types when no specific values are specified. This can also help improve query time when volume is significant. Compressed columns will incur a small overhead if a value is inserted or updated.
ALTER TABLE options
Use APPEND ON for tables which have heavy inserts to avoid the search for free space during the insert process and instead simply append the row to the end of the table. If you rely on the table being in a special order and cannot afford to perform REORGs, avoid using APPEND ON.
Set LOCKSIZE to TABLE for read-only or exclusive access tables. This avoids the time to lock the rows and reduces the amount of LOCKLIST needed.
Use PCTFREE to maintain free space to help with future INSERTs, LOADs and REORGs. The default is 10; try 20-35 for tables with clustered indexes and a heavy insert volume. If using with APPEND ON, set PCTFREE to 0.
Use VOLATILE to encourage an index scan. Volatile indicates that the cardinality of the table can vary significantly from quite large to empty. This helps to encourage the optimizer to use an index as opposed to a table scan, regardless of the statistics on the table. However, this only works in the cases where the index contains all the columns referenced or when the index is able to apply a predicate in the index scan.
Use NOT LOGGED INITIALLY to turn logging off for the duration of a transaction (that is, until COMMIT).
VALUE COMPRESSION and COMPRESS SYSTEM DEFAULT also work in the ALTER TABLE command.
- Reference ==> SQL ==> SQL Statements ==> CREATE TABLE
- Reference ==> SQL ==> SQL Statements ==> ALTER TABLE
Thanks to the Design Advisor, the burden of designing indexes has been eliminated. The Design Advisor is used to recommend and evaluate indexes for specific SQL workloads (that is, a set of SQL statements) and will be discussed shortly.
Nonetheless, the following are some index related issues that you should be aware of:
- When queries are completing in a reasonable time, avoid adding indexes as they can slow down update operations and consume extra space. It is sometimes possible to have one larger index that will cover several queries.
- Columns with high cardinality are good candidates for indexing.
- Avoid using more than five columns in an index due to management overhead.
- For multi-column indexes, place the column which is referenced most in queries first in the definition.
- Avoid adding an index which is similar to a preexisting index. It creates more work for the optimizer and will slow down update operations. Instead, alter the preexisting index to contain additional columns. For example, there is an existing index i1 on (c1,c2) of a table. You notice that your query using "where c2=?", so you create an additional index i2 on (c2). This similar index adds nothing, as it is redundant to i1 and is now additional overhead.
- If your table is read-only and contains a large number of rows, try to define an index which contains all columns referenced in the query using the INCLUDE clause of CREATE INDEX (INCLUDEd columns are not part of the index but are stored as part of the index page to avoid additional data FETCHES).
Clustered indexes can be created to order the rows in the table in the same physical order as the desired result set. They are created using the CLUSTER option of the CREATE INDEX statement. Do not create clustered indexes on volatile tables as the index will not be used. For best performance, create the index over small data types (like integer and char(10)), unique columns, and columns most often used in range searches.
Clustered indexes allow for a more linear access pattern to data pages and more effective prefetching, and help avoid sorts. This means longer inserts, but quicker selects. Consider increasing the free space on data and index pages when using clustered indexes to about 15-35 (instead of the default 10 for PCTFREE) for high volumes of inserts. For tables which are heavily inserted into, consider using a single dimension MDC table (perhaps using a generated column like idcolumn/1000 or INT(date)/100). This will cause the data to be block indexed (on the dimension) rather than on the row. The resulting index is smaller and log contention is significantly reduced during insert.
CREATE INDEX options
Use PCTFREE 0 if index is on a read-only table and 10 for others to help speed up inserts by having space available. Additionally, this should be higher for tables with clustered indexes to ensure that the clustered indexes do not become too fragmented. In such a case where there is a high volume of inserts, a value of 15 – 35 may be more appropriate.
Use ALLOW REVERSE SCANS to allow for an index to be scanned bi-directionally, which means quicker retrieval of ascending and descending result sets. This has no negative performance impact since the index structure does not change internally to support this feature.
INCLUDE can be used to include additional non-indexed columns in the index page to promote index-only access and avoid data page fetches.
UNIQUE columns can be used to efficiently enforce the uniqueness of a column or set of columns.
TYPE-2 INDEXES drastically reduce next-key locking, allow for index columns greater than 255 bytes by default, allow for both online REORG and RUNSTATS, and support the new multidimensional clustering ability. All new indexes in v8 will be created as type-2 except when there was already a (pre-migration) type-1 index defined on the table. Use REORG INDEXES to convert type-1 to type-2 indexes.
- Concepts ==> Administration ==> Performance tuning ==> Operational performance ==> Table and index management ==> Index planning
- Concepts ==> Administration ==> Performance tuning ==> Operational performance ==> Table and index management ==> Performance tips for indexes
- Reference ==> SQL ==> SQL Statements ==> CREATE INDEX
Profile registry configuration
DB2 profile registry variables typically affect how the optimizer and the DB2 engine itself behave. There are many profile registry variables, but most of them have a very specific purpose which will not be used in most DB2 environments. Below are some of the commonly used ones.
Table 2 lists the basic administration commands for the profile registry:
Table 2. Profile registry administration
|db2set –all||Lists all currently set DB2 registry variables.|
|db2set –g | -i variable=value||Sets the specified DB2 registry variable at either the global (-g) or instance (-i) level.|
Note: Do not place spaces between the variable and the value, or the variable will be reset to its default value.
This will help promote parallel access to any table space(s) which resides on a disk array. If all of your table spaces are on disk arrays, set this equal to *. If only some of the table spaces are on disk arrays, use "db2 list tablespaces" to retrieve their ID’s and set this equal to their ID’s (use comma to separate ID’s). For best performance, make sure that your table space’s prefetch size is significantly larger than its extent size.
OFF by default, enabling to ON will defer locking until after predicate evaluation has taken place. Enabling this can be very useful for reducing lock contention in applications which have been migrated from Oracle.
OFF by default, enabling to ON will allow statements using CS or RS to skip deleted keys of indexes and deleted rows of tables. Again, this can be very useful for reducing lock contention in applications which have been migrated from Oracle.
Enabled by default. OLTP may benefit from disabling DB2_HASH_JOIN (NO).
NO by default; if you are using DB2’s Database Partitioning Feature (DPF) and have multiple logical partitions, set to YES to improve inter-partition communication at a cost of 1 less shared memory segment being available to buffer pools. Use NO when database partitioning is not being used.
(AIX 4.3) DB2_MMAP_READ and DB2_MMAP_WRITE
Both are enabled by default. If you are using AIX 4.3, 32-bit DB2, and memory is limiting you from increasing the size of your buffer pools, set this to OFF to free up one more memory segment. This should free up approximately 256 MBs of shared memory (which you should be able to use some of for your buffer pools). Conduct tests to make sure that the change improves performance as sometimes using memory mapped reading and writing of disk buys better performance than increasing the buffer pool size, though this is not as common.
- Reference ==> Registry and environment variables
Configuring to avoid runtime errors
Initial application runs typically uncover some problems with the values of some configuration parameters. If you don’t receive any errors or warnings during your application run, then you are safe. If you do, see the discussion of database manager and database configuration parameter administration later in the article. The following configuration parameters will complain if there is not enough memory to process your SQL:
This is the amount of memory allocated for database system monitor data. Memory is allocated from the monitor heap when you perform database monitoring activities such as snapshot monitoring or activating an event monitor. If DB2 returns an error if there is not enough memory available, try a value of 256. If again you encounter an error, increase by increments of 256 until the error goes away.
This is the maximum amount of memory that can be allocated to store each query in the agent's private memory. The query heap is also used for the memory allocated for blocking cursors. The query heap size must be greater than or equal to ASLHEAPSZ. If you receive an error from DB2 that performance may be less than optimal, at a minimum, set to a value at least five times larger than ASLHEAPSZ which will allow for queries larger than ASLHEAPSZ and provide enough memory for three or four concurrent blocking cursors.
This specifies the maximum number of concurrent applications that can be connected (both local and remote) to a database. At the absolute minimum, set this >= (# of user connections). For more details, see the discussion of MAXAGENTS later on.
The statement heap is used as a work space for the SQL compiler during compilation of an SQL statement. This area is allocated and released for every SQL statement handled. If you receive a warning or error, increase by 256 until the error goes away.
The application heap is the private memory available to be used by the database manager on behalf of a specific agent. The heap is allocated when an agent or subagent is initialized for an application and the amount allocated will be the minimum amount needed to process the request, if more is needed it can be allocated up to the maximum specified by this parameter. Increase in increments of 256 until the error goes away.
Snapshot monitoring for improved performance
Use snapshot monitoring to identify the behavior of a database over a period of time, showing things such as how memory is being utilized and how locks are being acquired. Monitoring is the approach used to fine-tune configuration and identify problems, such as long statement execution time. If you have already used the Configuration Advisor, you may not be able to achieve big gains in performance at this time.
The easiest way to collect data to analyze is to use a script to perform sampled snapshot monitoring while running your application. A script like either the one shown in Listing 3 or the one shown in Listing 4 will collect all the information you need to get on your way. Start by running the script for a duration of 60 seconds over several intervals; this way you should get a nice sample of application behavior and not have too much information to work through.
Listing 3. getsnap.ksh (UNIX)
#!/usr/bin/ksh # take a snapshot after specified sleep period for a number of iterations # parameters: (1) database name # (2) directory for output # (3) interval between iterations (seconds) # (4) maximum number of iterations # # Note: You may receive an error about the monitor heap being too small. You may # want to set mon_heap_sz to 2048 while monitoring. if [ $# -ne 4 ] then echo "4 parameters required: dbname output_dir sleep_interval iterations"; exit fi dbname=$1 runDir=$2 sleep_interval=$3 iterations=$4 stat_interval=3 stat_iterations=$(($sleep_interval/$stat_interval)) if [[ -d $runDir ]]; then echo "dir: $runDir already exists, either remove it or use another directory name" exit fi mkdir $runDir cd $runDir db2 update monitor switches using bufferpool on lock on sort on statement on \ table on uow on # repeat the snapshot loop for the specified iterations let i=1 while [ i -le $iterations ] do if [ $i -le 9 ] then i2="0$i" else i2="$i" fi echo "Iteration $i2 (of $iterations) starting at `date`" vmstat $stat_interval $stat_iterations > vmstat_$i2 iostat $stat_interval $stat_iterations > iostat_$i2 db2 -v reset monitor all sleep $sleep_interval db2 -v get snapshot for dbm > snap_$i2 db2 -v get snapshot for all on $dbname >> snap_$i2 echo "Iteration $i2 (of $iterations) complete at `date`" let i=$i+1 done db2 update monitor switches using bufferpool off lock off sort off statement off \ table off uow off db2 terminate
Listing 4. getsnap.bat (Windows)
@echo off REM REM take a snapshot after specified sleep period for a number of iterations REM parameters: (1) database name REM (2) file name id REM (3) interval between iterations (seconds) REM (4) maximum number of iterations REM REM Note: You may receive an error about the monitor heap being too small. You may REM want to set mon_heap_sz to 2048 while monitoring. :CHECKINPUT IF ""=="%4" GOTO INPUTERROR GOTO STARTPRG :INPUTERROR echo %0 requires 4 parameters: dbname filename_id sleep_interval iterations echo e.g. "getsnap.bat sample 0302 60 3" GOTO END :STARTPRG SET dbname=%1 SET fileid=%2 SET sleep_interval=%3 SET iterations=%4 db2 update monitor switches using bufferpool on lock on sort on statement on table on uow on REM repeat the snapshot loop for the specified iterations SET i=1 :SNAPLOOP IF %i% LSS 10 SET i2=0%i% IF %i% GTR 9 SET i2=%i% echo Starting Iteration %i2% (of %iterations%) db2 -v reset monitor all sleep %sleep_interval% db2 -v get snapshot for dbm > snap%i2%_%fileid% db2 -v get snapshot for all on %dbname% >> snap%i2%_%fileid% echo Completing Iteration %i2% (of %iterations%) SET /a i+=1 IF %i% GTR %iterations% GOTO ENDLOOP GOTO SNAPLOOP :ENDLOOP db2 update monitor switches using bufferpool off lock off sort off statement off table off uow off db2 terminate :END
Note that the two scripts differ slightly in behavior, but both will produce the desired snapshot output.
Snapshot monitoring, amongst other things, will be used as a means to find the optimal setting of many of the DBM and DB configuration parameters in the following sections.
- Reference ==> System monitor ==> Snapshot monitor
Dynamic SQL statements
The scripts shown in Listing 3 and Listing 4 will issue a "get snapshot for all on dbname" which includes all the output for "get snapshot for dynamic SQL on dbname". If you find that you are not capturing many SQL statements, increase your monitoring duration. The "Dynamic SQL Snapshot Result" section of the output for one statement will look as shown in Listing 5:
Listing 5. Sample Dynamic SQL Snapshot
Dynamic SQL Snapshot Result Database name = SAMPLE Database path = C:\DB2\NODE0000\SQL00003\ Number of executions = 1 Number of compilations = 1 Worst preparation time (ms) = 1624 Best preparation time (ms) = 1624 Internal rows deleted = 0 Internal rows inserted = 0 Rows read = 41 Internal rows updated = 0 Rows written = 0 Statement sorts = 0 Total execution time (sec.ms) = 0.134186 Total user cpu time (sec.ms) = 0.000000 Total system cpu time (sec.ms) = 0.000000 Statement text = select * from sales ...
You can see that there are some useful strings to search for in the output.
"Number of executions" - This can help you identify important statements which should likely be well-tuned. It is also very useful for helping to calculate the average execution time of the statement.
For the statements that are executed a large number of times, a single execution may not place large demands on the system, but the cumulative result may degrade performance significantly. Try to understand how the application uses this SQL and perhaps there may be a slight redesign of the application logic that could lead to an improvement in performance.
It may also be useful to see if parameter markers could be used so that only one package is created for the statement. Parameter markers are used as place holders in a dynamically prepared statement (when the access plan is generated). At execution time, values are supplied for the parameter markers and the statement runs.
For example, to search for the most frequently executed statements:
grep -n " Number of executions" snap.out | grep -v "= 0" | sort -k 5,5rn | more
findstr /C:" Number of executions" snap.out | findstr /V /C:"= 0"
"Rows read" - This will help identify the Dynamic SQL statements which read the most rows. A high number of rows read will typically indicate that table scans are occurring. A high value can also indicate index scans with little or no selectivity, which can be just about as bad as a table scan.
You can use Explain to see if this is indeed the case. If a table scan is occurring, the problem may be remedied by performing a RUNSTATS on the table or by feeding the SQL statement to the DB2 Design Advisor to have it recommend a better index. If it is an index scan with poor selectivity, perhaps a better index is needed. Try the Design Advisor.
grep -n " Rows read" snap.out | grep -v "= 0" | sort -k 5,5rn
findstr /C:" Rows read" snap.out | findstr /V /C:"= 0"
"Total execution time" - This is the total execution time for all executions of the statement. It is very handy to divide this number by the number of executions. If you are seeing a very long average execution time for the statement, it could be because of table scans and/or a lock-wait situation. Heavy I/O from index scans and page fetches is another cause. By using an index, you can typically avoid both table scans and lock-waits. Locks are released on commit, so perhaps committing more often might remedy a lock-wait problem.
grep -n " Total execution time" snap.out | grep -v "= 0.0" | sort -k 5,5rn | more
findstr /C:" Total execution time" snap.out | findstr /V /C:"= 0.0" |sort /R
"Statement text" – Displays the text of the statement. If you notice repeated statements that are identical except for differing in values for the predicate of the WHERE, parameter markers could be use to avoid re-compilation of the statement. This can help avoid costly repeated prepares by using the same package. This text can also be input into the Design Advisor to generate optimal indexes.
grep -n " Statement text" snap.out | more
findstr /C:"Statement text" snap.out
Buffer pool sizing
Using "get snapshot for all on dbname" will generate one snapshot for each buffer pool on the database. Listing 6 shows such a snapshot:
Listing 6. Sample bufferpool snapshot
Bufferpool Snapshot Bufferpool name = IBMDEFAULTBP Database name = SAMPLE Database path = C:\DB2\NODE0000\SQL00002\ Input database alias = SAMPLE Snapshot timestamp = 02-20-2004 06:24:45.991065 Buffer pool data logical reads = 370
Buffer pool data physical reads = 54 Buffer pool temporary data logical reads = 0 Buffer pool temporary data physical reads = 0 Buffer pool data writes = 3 Buffer pool index logical reads = 221
Buffer pool index physical reads = 94 Buffer pool temporary index logical reads = 0 Buffer pool temporary index physical reads = 0 Total buffer pool read time (ms) = 287 Total buffer pool write time (ms) = 1 Asynchronous pool data page reads = 9 Asynchronous pool data page writes = 0 Buffer pool index writes = 0 Asynchronous pool index page reads = 0 Asynchronous pool index page writes = 0 Total elapsed asynchronous read time = 0 Total elapsed asynchronous write time = 0 Asynchronous data read requests = 3 Asynchronous index read requests = 0 No victim buffers available = 0 Direct reads = 86 Direct writes = 4 Direct read requests = 14 Direct write requests = 2 Direct reads elapsed time (ms) = 247 Direct write elapsed time (ms) = 56 Database files closed = 0 Data pages copied to extended storage = 0 Index pages copied to extended storage = 0 Data pages copied from extended storage = 0 Index pages copied from extended storage = 0 Unread prefetch pages = 0 Vectored IOs = 3 Pages from vectored IOs = 9 Block IOs = 0 Pages from block IOs = 0 Physical page maps = 0 Node number = 0 Tablespaces using bufferpool = 4 Alter bufferpool information: Pages left to remove = 0 Current size = 250 Post-alter size = 250
To determine the efficiency of a buffer pool, you want to calculate its
buffer pool hit ratio (BPHR). The important information that you need has
been bolded above. An ideal BPHR, if possible, is somewhere over 90%. The
formula is as follows:
BPHR (%) = (1 - (("Buffer pool data physical reads" + "Buffer pool index physical reads") /
("Buffer pool data logical reads" + "Buffer pool index logical reads"))) * 100
In the above snapshot for the IBMDEFAULTBP buffer pool, we can compute the BPHR as:
= (1-((54 + 94) / (370 + 221))) * 100 = (1-(148 / 591)) * 100 = (1- 0.2504) * 100 = 74.96
In this case, the BPHR is about 75%. Currently the buffer pool is only 250 * 4KB pages (1MB). It would be well worthwhile to increase the size of this buffer pool and see if the BPHR increases. If the BPHR remains low, you may need to redesign your logical layout as discussed in the Creating buffer pools and Creating table spaces sections.
Block-based buffer pool efficiency
If this is a block-based buffer pool and you are seeing a low number of "Block IOs", consider altering the buffer pool and increasing the size of NUMBLOCKPAGES. If you see more "Block IOs" occurring as a result, consider increasing again. If you see diminishing returns, decrease the size.
DBM and DB configuration
DB2 has dozens of configurable parameters. Many are automatically configured by DB2, while others have defaults which have proven to work well in most environments. In the following, we describe only those parameters which most commonly require additional consideration.
Some database manager (that is, instance) configuration parameters may be changed online (take effect immediately), while others require that the instance be recycled (that is, DB2STOP followed by DB2START). The same applies to the database configuration parameters. Some take effect instantly while others require the database to be deactivated and reactivated. The documentation on each configuration parameter specifies whether the parameter is configurable online or not.
The basic administration commands for the database manager and database configuration file are shown in Table 3:
Table 3. Database Manager and Database Configuration Administration
|GET DBM CFG [SHOW DETAIL]||Lists the current values of the database manager configuration file.|
|UPDATE DBM CFG USING config_param value||Sets the specified database manager configuration parameter to the specified value.|
|GET DB CFG FOR db_name [SHOW DETAIL]||Lists the current values of a specific database's configuration file..|
|UPDATE DB CFG FOR db_name USING config_param value||Sets the specified database manager configuration parameter to the specified value.|
After you make a change to a configuration parameter, you can identify if
the setting took effect immediately (online) with the following DB2 CLP
GET DBM CFG SHOW DETAIL
GET DB CFG FOR dbname SHOW DETAIL
For example, in the following case, MAX_QUERYDEGREE and MAXTOTFILOP have been increased to 3 and 19000 respectively. If the parameter has been configured online, the Delayed Value will be the same as the Current Value. Otherwise, an instance restart or database reactivation is required.
Listing 7. Show Details example
Database Manager Configuration Node type = Enterprise Server Edition with local and remote clients Description Parameter Current Value Delayed Value ------------------------------------------------------------------------------------------- Maximum query degree of parallelism (MAX_QUERYDEGREE) = 3 3 Maximum total of files open (MAXTOTFILOP) = 16000 19000
Some of the following configuration parameters are allocated from shared memory, so you should keep the OS limits in mind (discussed above). You must ensure that you are not over allocating memory. If you over allocate memory, it will cause the OS to page which can be disastrous to performance.
- Reference ==> Configuration parameters ==> Database manager
- Reference ==> Configuration parameters ==> Database
Listing 8. Database manager snapshot
*Snapshot always shows Current size (bytes) = High water mark (bytes) because memory is allocated at database activation.
Listing 9. Database snapshot
Database Snapshot Database name = SAMPLE Database path = C:\DB2\NODE0000\SQL00002\ Input database alias = SAMPLE Database status = Active Catalog database partition number = 0 Catalog network node name = Operating system running at database server= NT Location of the database = Local First database connect timestamp = 02-20-2004 06:19:00.847979 Last reset timestamp = Last backup timestamp = Snapshot timestamp = 02-20-2004 06:23:17.252491 High water mark for connections = 1 (MAXAPPLS) Application connects = 1 Secondary connects total = 0 Applications connected currently = 1 (AVG_APPLS) Appls. executing in db manager currently = 0 Agents associated with applications = 1 Maximum agents associated with applications= 1 Maximum coordinating agents = 1 Locks held currently = 0 Lock waits = 0 Time database waited on locks (ms) = 0 Lock list memory in use (Bytes) = 1000 (LOCKLIST and MAXLOCKS) Deadlocks detected = 0 Lock escalations = 0 (LOCKLIST and MAXLOCKS)
Exclusive lock escalations = 0 (LOCKLIST and MAXLOCKS) Agents currently waiting on locks = 0 Lock Timeouts = 0 (LOCKTIMEOUT) Number of indoubt transactions = 0 Total Private Sort heap allocated = 0 (SHEAPTHRES and SORTHEAP)
Total Shared Sort heap allocated = 0 (SHEAPTHRES_SHR and SORTHEAP)
Shared Sort heap high water mark = 0 (SHEAPTHRES_SHR and SORTHEAP)
Total sorts = 0 Total sort time (ms) = 0 Sort overflows = 0 (SORTHEAP) Active sorts = 0 Buffer pool data logical reads = 370 Buffer pool data physical reads = 54 Buffer pool temporary data logical reads = 0 Buffer pool temporary data physical reads = 0 Asynchronous pool data page reads = 9 (NUM_IOSERVERS)
Buffer pool data writes = 3 (CHNGPGS_THRESH and NUM_IOCLEANERS)
Asynchronous pool data page writes = 0 (CHNGPGS_THRESH and NUM_IOCLEANERS)
Buffer pool index logical reads = 221 Buffer pool index physical reads = 94 Buffer pool temporary index logical reads = 0 Buffer pool temporary index physical reads = 0 Asynchronous pool index page reads = 0 (NUM_IOSERVERS)
Buffer pool index writes = 0 (CHNGPGS_THRESH and NUM_IOCLEANERS)
Asynchronous pool index page writes = 0 (CHNGPGS_THRESH and NUM_IOCLEANERS) Total buffer pool read time (ms) = 287 Total buffer pool write time (ms) = 1 Total elapsed asynchronous read time = 0 Total elapsed asynchronous write time = 0 Asynchronous data read requests = 3 Asynchronous index read requests = 0 No victim buffers available = 0 LSN Gap cleaner triggers = 0 Dirty page steal cleaner triggers = 0 (CHNGPGS_THRESH)
Dirty page threshold cleaner triggers = 0 (CHNGPGS_THRESH)
Time waited for prefetch (ms) = 0 (NUM_IOSERVERS) Unread prefetch pages = 0 Direct reads = 86 Direct writes = 4 Direct read requests = 14 Direct write requests = 2 Direct reads elapsed time (ms) = 247 Direct write elapsed time (ms) = 56 Database files closed = 0 (MAXFILOP) Data pages copied to extended storage = 0 Index pages copied to extended storage = 0 Data pages copied from extended storage = 0 Index pages copied from extended storage = 0 Host execution elapsed time = 0.000039 Commit statements attempted = 6 Rollback statements attempted = 1 Dynamic statements attempted = 281 Static statements attempted = 7 Failed statement operations = 1 Select SQL statements executed = 4 Update/Insert/Delete statements executed = 0 DDL statements executed = 2 Internal automatic rebinds = 0 Internal rows deleted = 0 Internal rows inserted = 0 Internal rows updated = 0 Internal commits = 1 Internal rollbacks = 0 Internal rollbacks due to deadlock = 0 Rows deleted = 0 Rows inserted = 0 Rows updated = 0 Rows selected = 336 Rows read = 375 Binds/precompiles attempted = 0 Log space available to the database (Bytes)= 5095757 (LOGPRIMARY and LOGSECOND) Log space used by the database (Bytes) = 4243 Maximum secondary log space used (Bytes) = 0 Maximum total log space used (Bytes) = 6498 (LOGPRIMARY and LOGSECOND)
Secondary logs allocated currently = 0 (LOGPRIMARY and LOGSECOND)
Log pages read = 0 (LOGBUFSZ)
Log pages written = 5 (LOGBUFSZ) Appl id holding the oldest transaction = 38 Package cache lookups = 10 (PKGCACHESZ)
Package cache inserts = 8 (PKGCACHESZ)
Package cache overflows = 0 (PKGCACHESZ)
Package cache high water mark (Bytes) = 191140 (PKGCACHESZ) Application section lookups = 281 Application section inserts = 6 Catalog cache lookups = 18 (CATALOGCACHE_SZ)
Catalog cache inserts = 9 (CATALOGCACHE_SZ)
Catalog cache overflows = 0 (CATALOGCACHE_SZ)
Catalog cache high water mark = 0 (CATALOGCACHE_SZ) Workspace Information Shared high water mark = 0 Corresponding shared overflows = 0 Total shared section inserts = 0 Total shared section lookups = 0 Private high water mark = 21102 Corresponding private overflows = 0 Total private section inserts = 6 Total private section lookups = 6 Number of hash joins = 0 Number of hash loops = 0 Number of hash join overflows = 0 (SORTHEAP)
Number of small hash join overflows = 0 (SORTHEAP)
At times, it can be very handy to use ‘grep’ (UNIX) and ‘findstr" (Windows) to perform a preliminary search of the snapshot output for any problem issues. If you find something, you can investigate further by opening the snapshot output and locating the problem.
For example, to identify if any deadlocks have occurred:
grep -n "Deadlocks detected" snap.out | grep -v "= 0" | more
findstr /C:"Deadlocks detected" snap.out | findstr /V /C:"= 0"
This is the total amount of memory that concurrent private sorts may consume for all databases in the instance. Additional incoming sorts will be given a smaller amount of memory to use. For OLTP, a good starting point is about 20000, while 40000-60000 tends to work better for OLAP.
When "Piped sorts accepted" is a low value compared to "Piped sorts requested", performance can typically be improved by increasing the size of SHEAPTHRES. If "Post threshold sorts" (sorts which requested heaps after SHEAPTHRES had been exceeded) is a high value (i.e., double-digit), try increasing the size of SHEAPTHRES. "Total Private Sort heap allocated" should be less than SHEAPTHRES. If it is not, increase SHEAPTHRES.
This is the maximum number of database manager agents available to accept application requests for all databases within the instance. This parameter can be useful in memory constrained environments to limit the total memory usage of the database manager, because each additional agent requires additional memory.
If your machine is not memory constrained, increase MAXAGENTS until "Agents stolen from another application" is 0. Additionally, "Local connections" + "Remote connections to db manager" will give you an indication of the number of concurrent connections to the instance. "High water mark for agents registered" will report the largest number of agents ever connected at one time to the database manager. "Max agents overflow" reports the number of times a request to create a new agent was received when MAXAGENTS had already been reached. Finally, "Agents Registered" show the number of agents currently registered in the database manager instance that is being monitored.
This specifies the number of local databases that can be concurrently active. On a production system, it is recommended that there be one database per instance, so you should set to 1. Otherwise, set it to the maximum number of concurrently active databases. Use "Active local databases" if unsure.
NUM_INITAGENTS and NUM_POOLAGENTS (DBM)
NUM_INITAGENTS specifies the number of idle agents that are created in the pool at db2start and helps speed up connections at the beginning of database use. NUM_POOLAGENTS is related, but has more effect on performance once the database has been running for a while. When the Connections Concentrator is OFF (default; MAX_CONNECTIONS = MAX_COORDAGENTS), NUM_POOLAGENTS specifies the maximum size of the agent pool. When the Concentrator is ON (MAXCONNECTIONS > MAX_COORDAGENTS) it is used as a guideline for how large the agent pool will be when the system workload is low.
NUM_INITAGENTS and NUM_POOLAGENTS should be set to the average number of expected concurrent instance-level connections, which is usually low for OLAP and higher for OLTP. For performance benchmarks where there is a substantial ramp up of connections, set NUM_INITAGENTS to the number of expected connections (this will significantly reduce time required to ramp up connections by reducing resource contention). In 3-tier environments where connection pooling is used, NUM_INITAGENTS and NUM_POOLAGENTS have little affect on performance as connections are continuously maintained by the application server even when the application is idle.
"Idle agents" shows the number of idle agents in the agent pool, while "Agents assigned from pool" shows the number of times an agent has been assigned from the agent pool. "Agents created from empty pool" shows the number of agents that had to be created because of an empty pool, which can be misleading right after db2start. After db2start, it will simply show that NUM_INITAGENTS of agents were created. If the ratio of "Agents created from empty pool" / "Agents assigned from pool" is high (5:1 or more), it could indicate that NUM_POOLAGENTS should be increased; this could also indicate that the overall workload for system is too high. You can adjust the workload by lowering MAXCAGENTS. If the ratio is low, it suggests that NUM_POOLAGENTS is probably set too high and that some agents are wasting system resources.
Only used within a DPF environment with multiple logical partitions and it specifies the number of 4 KB buffers that are used for internal communication. If DPF is not being used, this will not even show in the snapshot output. Additionally, this information will be from the partition that the snapshot was run against. For example, in the snapshot above DBM snapshot above, "Node FCM information corresponds to" shows a value of 2, so it was taken from partition number 2. "Get snapshot for dbm global" can be used to get an aggregate of all partition values.
The FCM Node section of DBM snapshots can be used to see where the major inter-partition communication is taking place for investigation purposes. Heavy communication could indicate a need for more FCM buffer memory, a need for a different partitioning key, or perhaps different table to table space assignments. If "Free FCM buffers low water mark" is less than 15 percent of FCM_NUM_BUFFERS, increase FCM_NUM_BUFFERS until "Free FCM buffers low water mark" is greater than or equal to it, to ensure adequate FCM resources are always available.
Only change if your application issues complex SQL (e.g., joins, functions, recursive, etc.), otherwise leave at 1. This helps to estimate how much buffer pool will be available at runtime for an access plan. It should be set to a low number that is the average number of "Applications connected currently" multiplied by the percentage issuing complex SQL.
LOCKLIST and MAXLOCKS (DB)
There is one lock list per database and it contains the locks held by all applications concurrently connected to the database. On 32-bit platforms, the first lock on an object requires 72 bytes while each additional requires 36 bytes. On 64-bit platforms, the first lock requires 112 bytes while additional require 56 bytes.
When the percentage of the LOCKLIST used by one application reaches MAXLOCKS, the database manager will perform a lock escalation, where row locks for a given table will be traded in for a single table lock. Also, if LOCKLIST is close to being exhausted, the database manager will identify a connection holding the most row locks on a table and trade those for a table lock to free up LOCKLIST memory. Locking entire tables can greatly reduce concurrency – the chance of deadlock is also increased.
If "Lock list memory in use (Bytes)" exceeds 50 percent of the defined LOCKLIST size, then increase the number of 4K pages in the LOCKLIST. If there are "Lock escalations" or “Exclusive lock escalations” occurring, increase either LOCKLIST or MAXLOCKS, or both.
The database snapshot section on locking contains a lot of valuable information. Look for high values in "Locks held currently", "Lock waits", "Time database waited on locks (ms)", "Agents currently waiting on locks", and "Deadlocks detected" which could be the symptoms of less than optimal access plans, long transactions, or application concurrency problems. If you are seeing deadlocks, you will want to create an event monitor for deadlocks with details to see what is actually occurring.
For more detailed information on locking issues, see the article Diagnosing and Resolving Lock Problems with DB2 Universal Database by Bill Wilkins.
Some things you can do to reduce locking include:
- Ensure that your application is using the lowest isolation level required.
- Perform frequent COMMITs.
- When performing many updates, explicitly lock the entire table (using LOCK TABLE statement) before updating.
- Use Cursor Stability isolation level (default) when possible to decrease the number of share locks held. (Uncommitted Read can further decrease locking, if your application can function with dirty reads.)
This specifies the number of seconds that an application will wait to obtain a lock. This can help to avoid global deadlock situations. With -1, the application will appear to freeze if lock-wait is encountered. The Bill Wilkins article on locking also covers this in greater detail.
About 60 (seconds) for OLAP and 10 for OLTP for production environments is a good starting point. -1 should be used for development environments to identify and resolve lock-wait situations. If there are a large number of concurrent users, you may want to increase the OLTP time to avoid rollbacks.
If "Lock Timeouts" is a high number, it could be caused by (1) too low a LOCKTIMEOUT value, (2) a transaction holding locks for an extended period, or (3) lock escalations.
This is a hard-limit on the total amount of memory that concurrent shared sorts may consume within an instance. This is only applicable if: (1) INTRA_PARALLEL=YES, or (2) Concentrator is on (MAX_CONNECTIONS > MAX_COORDAGENTS). Sorts that use a cursor with the WITH HOLD option, will be allocated from shared memory.
"Shared Sort heap high water mark" shows the most shared sort memory allocated at one time. If this value is always much lower than SHEAPTHRES_SHR, you should decrease SHEAPTHRES_SHR to save memory for other database functions. If this value is very close to SHEAPTHRES_SHR, you may want to increase SHEAPTHRES_SHR. "Total Shared Sort heap allocated" is the total number of allocated pages of sort heap space for all sorts. If the element value is greater than or equal to SHEAPTHRES_SHR, it means that the sorts are not getting the full sort heap as defined by the SORTHEAP parameter. Increase the size of SHEAPTHRES_SHR to help avoid this.
When setting, try to keep it as a multiple of SORTHEAP.
This parameter specifies the maximum number of private memory pages to be used for private sorts or the maximum number of shared memory pages to be used for shared sorts. Each sort has a separate sort heap that is allocated as needed, by the database manager.
What is often well understood is that sort overflows occur when the amount of memory needed for a sort exceeds SORTHEAP. Less well understood is that if your statistics are out of date, or if data is skewed and you have not collected distribution statistics, an overflow can occur if DB2 requests too small a sort heap, and the actual sort operation exceeds the requested amount. Therefore, it is very important to keep statistics up to date. Additionally, make sure that the sort is not the result of a missing index.
A good starting point is 128 for OLTP and between 4096 - 8192 for OLAP. If there are a lot of "Sort overflows" (double-digit), you likely need to increase SORTHEAP. If the "Number of hash join overflows" is not 0, increase SORTHEAP by increments of 256 until it is. If "Number of small hash join overflows" is not 0, increase SORTHEAP by 10% increments until small hash join overflows is zero.
Use this to specify the percentage of changed pages in the buffer pool at which time the asynchronous page cleaners will be started to write the changes to disk to make room for new data in the buffer pool. In a read-only environment page cleaners are not used. In OLTP, a value of 20-40 should improve performance (20 in the case of very heavy update activity), as lowering the value makes the I/O Cleaners more aggressive in their writing out of dirty buffer pool pages, but with less work each time. If there are not a lot of INSERTs or UPDATEs, the default of 60 should be fine for OLTP and OLAP.
If "Dirty page steal cleaner triggers" is a double-digit number, try lowering. If high "Buffer pool data writes" and low "Asynchronous pool data page writes", try lowering.
As of FixPak 4, there is an alternate page cleaning algorithm available which may boost your specific buffer pool performance. You need to set the Profile Registry variable DB2_USE_ALTERNATE_PAGE_CLEANING=YES, which will ignore CHNGPGS_THRESH. Ensure NUM_IOSERVERS is at least 3 otherwise it will become an inhibitor to the new algorithm.
This specifies the number of asynchronous page cleaners for a database, which write changed pages from the buffer pool to disk. Start by setting it equal to the number of CPUs on the system. When I/O Cleaners are triggered they are all started at the same time, therefore you do not want to have so many that they actually impact performance and block other processing.
Decrease NUM_IOCLEANERS if Asynchronous Write Percentage (AWP) is 90% or higher, increase if less than 90%.
AWP = (("Asynchronous pool data page writes" + "Asynchronous pool index page writes") * 100) / ("Buffer pool data writes" + "Buffer pool index writes")
I/O servers are used to perform prefetching and this parameter specifies the maximum number of I/O servers for a database. Non-prefetch I/Os are scheduled directly from the database agents and as a result are not constrained by this parameter. Start by setting equal to the number of physical disks that the database resides on (even if many disks in a disk array or many disks in a logical volume) + 1 or 2, but not more than 4-6 times the # of CPUs. Having too large a value will not hurt performance, unless it exceeds the 4-6 times the # of CPUs.
If you see "Time waited for prefetch (ms)" in the seconds, you may want to add an IO Server to see if performance improves.
This parameter specifies the maximum number of files that can be open for each database agent. If opening a file causes this value to be exceeded, some files in use by this agent are closed. Excessive opening and closing will degrade performance. Both SMS table spaces and DMS table space file containers are treated as files. More files are generally used by SMS.
Increase until the number of "Database files closed" is 0.
LOGPRIMARY, LOGSECOND and LOGFILSIZ (DB)
LOGPRIMARY specifies the number of primary log files to be pre-allocated, while LOGSECOND are allocated on an as-needed basis. LOGFILSIZ defines the size of each log file.
If there is a high number of "Secondary logs allocated currently", you will want to increase LOGFILSIZ or LOGPRIMARY (while making sure that LOGPRIMARY + LOGSECOND does not exceed 256). You can also use "Maximum total log space used (Bytes)" to help figure out your dependency on log file space (primary + secondary logs).
Log file size has an implication on disaster recovery configurations where log shipping is used. A large log file will have better performance, but potentially increases the degree of lost transactions. When the primary system goes down, the last log file and its transactions may never be sent to the secondary because the file was not closed before failure. The larger the log file, the greater the degree of lost transactions due to the lost log file.
This parameter allows you to specify the amount of the database heap (DBHEAP) to use as a buffer for log records before writing to disk. The log records are written to disk when a transaction commits or the log buffer is full. Buffering the log records will result in log records being written to disk less frequently with more log records being written at each time. A good starting point is at least 256 pages for most OLTP and 128 for OLAP. When you consistently see more than a couple of "Log pages read" , you may need to increase the size. Log pages may also be read if a rollback occurs.
If you get an error while attempting to increase LOGBUFSZ, increase DBHEAP by the same amount first and try again.
The package cache is used for caching sections for static and dynamic SQL statements. Caching packages allows the database manager to reduce its internal overhead by eliminating the need to access the system catalogs when reloading a package; or, in the case of dynamic SQL, eliminating the need for recompilation.
PKGCACHESZ should be larger than "Package cache high water mark (Bytes)". If "Package cache overflows" is not 0, you can attempt to drive this counter to zero by increasing PKGCACHESZ.
The Package Cache Hit Ratio (PCHR) should be as close to 100% as possible (without taking needed memory away from buffer pools). Calculate with the following formula:
PCHR = (1-("Package cache inserts" / "Package cache lookups"))*100
This is used to cache system catalog information, such as SYSTABLE, authorization, and SYSROUTINES information. Caching catalog information is very important, especially when DPF is used, where internal overhead is reduced by eliminating the need to access the system catalogs (catalog partition) to obtain information that has been previously retrieved.
Keep increasing until Catalog Cache Hit Ratio (CCHR) is 95% or better for OLTP:
CCHR = (1-("Catalog cache inserts" / "Catalog cache lookups"))*100
Also increase if the value of "Catalog cache overflows" is greater than 0. You can also use the "Catalog cache high water mark (Bytes)" to determine the most memory ever consumed by the catalog cache. If the High water mark equals the Maximum size allowed, you will want to increase the catalog cache heap size.
Experimenting: DBM and DB Configuration
The following parameters may buy you extra performance. However, their impact is not directly reported by any specific monitor in the snapshot. Instead, you may want to change one at a time and measure the overall performance of the application. The best measurement you can take is to examine the overall SQL execution times from several snapshots, pre-change and post-change.
This parameter specifies whether the database manager can use intra-partition parallelism. The default of NO is best for high concurrent connections (mainly OLTP), while YES works best for few concurrent connections and more complex SQL (OLAP/DSS). Mixed workloads typically benefit from NO.
When enabled, it causes sort memory to be allocated from shared memory. Additionally, it can cause excessive system overhead if the level of concurrency increases significantly. If the system is non-OLTP, there is a 4:1 ratio of CPUs to partitions, and the CPU busy rate runs less than 50 percent on average, INTRA_PARALLEL would likely improve performance.
This is used to specify a default level of optimization to use when compiling SQL queries. Try the default of 5 or 3 for mixed OLTP/OLAP, lower for OLTP, higher for OLAP. For simple SELECTS or short runtime queries (which generally take less than one second to complete) 1 and 0 may be appropriate. Try a class of 1 or 2 if you have many tables with many join predicates on the same column. Try using a class of 7 for longer running queries which take more than 30 seconds to complete or if you are inserting to a UNION ALL VIEW (which were added in FixPak4). You should avoid using a class of 9 under most circumstances.
This parameter specifies the maximum amount of memory that can be used simultaneously by the BACKUP, RESTORE, and LOAD utilities. If you are using LOAD, set UTIL_HEAP_SZ to at least 10000 pages per CPU.
This specifies a string of up to 242 bytes to change the location where the log files are written and stored. This can point to either a fully qualified path name or raw device. Changing the log path to a separate local high-speed disk (which is not in use for anything other than logging) can significantly improve performance.
Further SQL analysis
The Design Advisor
If you have a specific problem query or set of queries, you can input that workload into the DB2 Design Advisor (db2advis) to have it recommend a set of efficient indexes. If you do not know the SQL, you can
- Capture dynamic SQL using snapshots
- Collect all the SQL issued over a period of time with a statement event monitor
- extract static SQL from the SYSCAT.STATEMENTS catalog view
Using the statement event monitor will be discussed a little later in this SQL analysis section.
The Design Advisor can be used from either the DB2 Control Center or from the CLP command line. The following discusses both interfaces.
Using the DB2 Control Center
In the Control Center, expand the object tree until you find the database of interest. Right-click on the database name and select Design Advisor… from the pop-up menu. You will now be able to construct a workload by looking at recently executed SQL, looking through the packages, or by manually adding SQL statements.
Using the DB2 CLP
When using the CLP, output is displayed to the screen which can be captured into a script and executed. Some common examples follow.
To recommend indexes for a specific SQL statement against the ‘sample’
database, with a limit of 1 minute for index identification:
db2advis -d sample -s "select count(*) from sales where region = 'Quebec'" -t 1
To recommend indexes for multiple statements, we can build a text file which looks like:
db2advis -d sample -s "--#SET FREQUENCY 10 SELECT * FROM SALES; --#SET FREQUENCY 2 SELECT FIRSTNME FROM EMPLOYEE WHERE EMPNO = ?;
where the frequency is a weighting of the number of times that SQL
statement is executed in comparison with the other SQL statements in the
input file. To generate results:
db2advis -d dbname -i sqlstmts_file > recindexes.out
Since the output is displayed to the screen, we use a redirection to capture the index definitions in a file which can then be run as a DB2 script.
You can also pipe dynamic SQL from a snapshot into the Design Advisor:
get snapshot for dynamic SQL on dbname write to file
This will store the snapshot in an internal file format. You can then
insert the results into a Design Advisor table, using the following SQL:
insert into advise_workload(select 'myworkload',0,stmt_text,cast(generate_unique() as char(254)), num_executions, 1,1,0,0,cast(null as char) from table (snapshot_dyn_sql('dbname', -1)) as snapshot_dyn_sql)
The default frequency for each SQL statement in a workload is 1 and the
default importance is also 1. The generate_unique() function assigns a
unique identifier to the statement. Both of these columns can be updated
by the user to a more meaningful value. To generate your indexes:
db2advis -d dbname -w myworkload
Once you have executed the Design Advisor it will populate the advise_index
table. You can query this table to list all of the Design Advisor’s
recommendations with the following query:
SELECT CAST(CREATION_TEXT as CHAR(200)) FROM ADVISE_INDEX
Event monitoring for SQL
The CREATE EVENT MONITOR statement defines a monitor that will record certain events as they occur when using the database. The definition of each event monitor also specifies where the database should record the events. Event monitors can be created to record information relating to the following types of events: DATABASE, TABLES, DEADLOCKS [WITH DETAILS], TABLESPACES, BUFFERPOOLS, CONNECTIONS STATEMENTS, and TRANSACTIONS.
Listing 10. getevmon.ksh (UNIX)
#!/usr/bin/ksh # create an event monitor and capture its output # parameters: (1) database name # (2) monitor output file # (3) interval between iterations (seconds) # Note: You may receive an error about the monitor heap being too small. You may want to set # mon_heap_sz to 2048 while monitoring. if [ $# -ne 3 ] then echo "Requires 3 Parameters: dbname monitor_output_file interval_in_#seconds"; exit fi MON=evmon # "nonblocked" may cause loss of data but has less impact on system than default "blocked". MONTYPE=nonblocked SLEEP=$3 DB=$1 #EVENTS="deadlocks with details" #EVENTS="tables, statements, deadlocks with details, connections" EVENTS="statements" OUTFILE=$2 OUTDIR="TMPEVMON" mkdir $OUTDIR chmod 777 $OUTDIR cd $OUTDIR db2 connect to $DB db2 -v drop event monitor $MON db2 -v create event monitor $MON for $EVENTS \ write to file "'`pwd`'" buffersize 64 $MONTYPE db2 -v set event monitor $MON state = 1 echo "" echo "Event Monitor active at `date`; sleeping for $SLEEP seconds before turning it off." sleep $SLEEP db2 -v set event monitor $MON state = 0 cd .. db2evmon -db $DB -evm $MON > $OUTFILE db2 -v drop event monitor $MON db2 terminate rm -fr $OUTDIR echo echo db2evmon output is in $OUTFILE
Listing 11. getevmon.bat (Windows)
@echo off REM create an event monitor and capture its output REM parameters: (1) database name REM (2) monitor output file REM (3) interval to monitor for (seconds) REM Note: You may receive an error about the monitor heap being too small. You may want to set REM mon_heap_sz to 2048 while monitoring. :CHECKINPUT IF ""=="%3" GOTO INPUTERROR GOTO STARTPRG :INPUTERROR echo %0 requires 3 parameters: dbname filename sleep_interval echo e.g. "%0 sample evmon0302.out 60" GOTO END :STARTPRG SET dbname=%1 SET outfile=%2 SET sleep_interval=%3 SET MON=evmon REM "nonblocked" may cause loss of data but has less impact on system than default "blocked". SET MONTYPE=nonblocked REM SET EVENTS="deadlocks with details" REM SET EVENTS="tables, statements, deadlocks with details, connections" SET EVENTS="statements" SET OUTDIR="c:\temp\evmon" mkdir %OUTDIR% db2 connect to %dbname% db2 -v drop event monitor %MON% db2 -v create event monitor %MON% for %EVENTS% write to file '%OUTDIR%' buffersize 64 %MONTYPE% db2 -v set event monitor %MON% state = 1 echo Sleeping for %sleep_interval% seconds before turning off. sleep %sleep_interval% db2 -v set event monitor %MON% state = 0 db2evmon -db %dbname% -evm %MON% > %OUTFILE% db2 -v drop event monitor %MON% db2 terminate rmdir /s /q %OUTDIR% echo db2evmon output is in %OUTFILE% :END
The output will contain the exact text of all dynamic SQL statements; however for static SQL statements it will list the package name and section number. db2expln with the package name and section number can be used to extract the statement text, or you could also query the syscat.statements view to extract the text.
Listing 12 shows an example of an event captured in the output of the statement event monitor:
Listing 12. Sample Statement Event Monitor output
42) Statement Event ... Appl Handle: 16 Appl Id: *LOCAL.DB2.010746204025 Appl Seq number: 0003 Record is the result of a flush: FALSE ------------------------------------------- Type : Dynamic Operation: Close Section : 201 Creator : NULLID Package : SQLC2E03 Consistency Token : AAAAAJHR Package Version ID : Cursor : SQLCUR201 Cursor was blocking: TRUE Text : select * from staff ------------------------------------------- Start Time: 10-06-2003 17:27:38.800490
Stop Time: 10-06-2003 17:27:38.806619
Exec Time: 0.006129 seconds Number of Agents created: 1 User CPU: Not Available System CPU: Not Available Fetch Count: 35 Sorts: 0 Total sort time: 0 Sort overflows: 0 Rows read: 35 Rows written: 0 Internal rows deleted: 0 Internal rows updated: 0 Internal rows inserted: 0 SQLCA: sqlcode: 0 sqlstate: 00000
Since there could be literally thousands of statement events, the easiest way to pinpoint problems is by using grep (UNIX) or findstr (Windows). The following are some useful strings to search for in the output:
" sqlcode: -"
This is useful to find errors, such as -911 RC 2, which means a deadlock and RC 68 which is a lock timeout.
grep -n " sqlcode: -" stmtevmon_output
findstr /C:" sqlcode: -" stmtevmon_output
" Rows read: "
This identifies the number of rows read by a statement (does not count index entries or direct table reads). A really high number could mean that an index is needed or that statistics are out of date.
grep -n " Rows read: " stmtevmon_output | grep -v ": 0" | sort -k 4,4rn | more
findstr /C:" Rows read: " stmtevmon_output | findstr /V /C:" Rows read: 0" | sort
" Exec Time: "
This is the actual execution time for the statement and will include any lock wait time. Sometimes it is handy to start at the end of the event monitor data and start searching back for "Exec Time" and see if there is a pattern or repetition of a costly SQL statement. You can then examine the SQL statement with EXPLAIN to see what the problem might be.
grep -n " Exec Time: " stmtevmon_output | grep -v ": 0.0" | sort -k 4,4rn | more
findstr /C:" Exec Time: " stmtevmon_output | findstr /V /C:" Exec Time: 0.0" |sort
" Sort overflows:"
Shows where costly sort overflows are occurring. This could indicate the need for an index, RUNSTATS, or a larger SORTHEAP.
grep -n " Sort overflows:" stmtevmon_output | grep -v ": 0" | sort -k 4,4rn | more
findstr /C:" Sort overflows: " stmtevmon_output |findstr /V /C:" Sortoverflows: 0"|sort
" Fetch Count:"
Useful to see how many fetches were performed against the result set. Instead of recording each FETCH operation. DB2 tracks them at the statement level by incrementing this field. You can limit fetches with the FETCH FIRST clause.
grep -n " Fetch Count:" stmtevmon_output | grep -v ": 0" | sort -k 4,4rn | more
findstr /C:" Fetch Count: " stmtevmon_output | findstr /V /C:" Fetch Count: 0" | sort
Once you identify some elements you want to take a closer look at, open the statement event monitor output and search on the string you are interested in. Once you locate the statement, the following fields will be useful:
" Operation: "
This field can provide you with the general flow of statements. It identifies Prepare, Open, Fetch, Close, Commit, etc.
" Text : "
This is the statement text for dynamic SQL. For static SQL, look for "Section :" and "Package :".
" Start Time: " or " Stop Time: " or " Time: "
This help to identify the start time, stop time or both for each statement. Stop time and subsequent Start time can also tell you how big of a gap between statements, which may be useful if you suspect DB2 is spending time somewhere else (possibly Stored Procedure overhead).
Explain for access plans
The SQL explain facility is part of the SQL Compiler that is used to show the access plan of a statement and the environment in which it was compiled. Explain information can both be captured and displayed in a number of ways.
This information can help you:
- understand the execution plan chosen for a query
- assist in designing application programs
- determine when an application should be rebound
- assist in database design
Before you can obtain explain data for an SQL statement, you must have a set of explain tables defined using the same schema as the authorization ID that invokes the explain facility. Please see ‘sqllib/misc/explain.ddl‘ or ‘sqllib\misc\explain.ddl’ under your installation of DB2 for the Explain Tables' DDL.
To clear the explain tables, issue:
delete from schema.explain_instance
All other explain tables will be cleared as a result, because of the cascading delete referential integrity.
When analyzing the explain output, try to identify if the following are occurring:
- ORDER BY, GROUP BY, or DISTINCT operators used on the same sets of columns and base tables repeatedly could benefit from an index or Materialized Query Table (MQT) to eliminate sorts.
- Expensive operations such as large sorts, sort overflows, and heavy table usage could benefit from more sort space, better indexes, updated statistics, or different SQL.
- Table scans that could benefit from an index.
- Full or unselective index scans, which do not use a start or stop key, or use start and stop keys with a wide range of values.
- Single column fetches which may be avoided by using an INCLUDE index.
Visual Explain is very easy and straightforward to use. It can explain statements containing parameter markers (denoted with "?"), but it can be difficult to share the results with other people since it is GUI-based. From the Control Center, Right-click on the database that you wish to explain an SQL statement for and select "Explain SQL…". In the SQL text box, you can input your SQL statement and then click on "OK" to generate the graph. Figure 2 is an example of what the overall graph could look like:
Figure 2. Visual Explain results screen
You can double click on any of the nodes to obtain more detailed analysis.
The text-based options of db2exfmt and db2expln are not as easy to read (at first), but are much easier to share with other people as you can simply send them the output file.
Again, it is a good practice to clear out the explain tables prior to
capturing new data. Issue:
delete from schema.explain_instance
All other explain tables will be cleared as a result, because of the cascading delete referential integrity.
All explain output (including Visual Explain) is read from the bottom up.
Figure 3. Text Explain screen
Instead of having details displayed on more than one screen, as in Visual Explain, all details are listed in the single output file. Each operator is numbered in the above diagram, as you work down the document each operator will be explained in detail. In the above diagram, an operator can be interpreted as follows:
Listing 13. Reading Text Explain Operators
5.7904 - # of rows returned (based on statistics calculation) HSJOIN - type of operator ( 2) - operator # 75.536 - cumulative timerons 3 - I/O costs
Number of rows returned, timerons (cost), and I/O are all optimizer estimates, and in some cases may be quite different than the actual number. A timeron is a DB2 unit of measurement that is used to give a rough relative estimate of the resources, or cost, required by the database server to execute two plans for the same query. The resources calculated in the estimate include weighted processor and I/O costs.
You can use db2exfmt to explain a single statement. For example,
Listing 14. Generating Text Explain output for a single statement
explain all for SQL_statement db2exfmt -d dbname -g tic -e explaintableschema -n % -s % -w -1 -# 0 -o outfile
If you build a text file of several "explain all…" statements separated by a ";", you can explain multiple statements at one time:
Listing 15. Generating Text Explain output for multiple statements
db2 -tf file_with_statements db2exfmt -d dbname -g tic -e explaintableschema -n % -s % -w % -# 0 -o outfile
Finally, if you want to explain static SQL contained within a package, you
will need to use the db2expln tool:
db2expln -database dbname -schema schema_name -package package -output outfile.txt
- Concepts ==> Administration ==> Performance tuning ==> SQL explain facility
- Tutorials ==> Tutorials (Visual Explain Tutorial)
SQL statement tuning
The following are some techniques you can take advantage of to squeeze extra performance out of the statement execution:
Use SELECT ... FOR UPDATE to protect rows that may be updated in a subsequent positioned UPDATE statement. This will cause update (U) locks to be placed on all rows that have been selected. This helps prevent deadlocks if you will be later updating the rows.
Use SELECT … FOR READ ONLY (or FETCH ONLY) to indicate that the result table is read-only. This means that the cursor cannot be referred to in a positioned UPDATE or DELETE statement. This can help improve the performance of FETCH operations since it allows DB2 to perform blocking (return multiple rows to the client for a given FETCH request).
Optimize return time with SELECT … OPTIMIZE FOR n ROWS. It will make to optimizer return N rows quickly, instead of minimizing the cost of the whole answer set, which is the default behavior. In addition, if using the READ ONLY clause, this will influence the number of rows that will be returned in each block (there will not be more than n rows in a block). This does not limit the number of rows that can be fetched, but may degrade performance if more than n rows are fetched. In order for this clause to have an impact on data buffers, the value of n * row size cannot exceed the size of the communication buffer (defined by DBM CFG RQRIOBLK or ASLHEAPSZ).
You can limit query result set size with SELECT … FETCH FIRST n ROWS.
Mass DELETEs/Purging can be accomplished by ALTER TABLE … ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE. Since this operation is not logged, if something goes wrong you will be forced to drop the table. A safer alternative is to use the Import utility with the Replace option and a null file.
To decrease the chances of lock-wait or deadlock, release locks frequently by using commit.
Make sure that repeated statements use parameter markers instead of literals. Compile time is more significant with OLTP, so replacing literals with parameter markers avoids repeated compilations. When using parameter markers, the optimizer assumes uniform distribution of values, so skewed data could mean that a poor access plan will be chosen. Typically, it’s OLAP type environments that suffer from the assumption of uniform values.
Specify the isolation level at the statement level for better granularity and improved performance and concurrency. DB2 supports Uncommitted Read, Cursor Stability, Read Stability, and Repeatable Read (UR, CS, RS, and RR). For example, SELECT * FROM STAFF WITH UR would perform a SELECT using Uncommitted Read (minimal locking).
- Reference ==> SQL ==> Queries and subqueries ==> select-statement
Specifying selectivity when using parameter markers with skewed data
Specifying the selectivity can be very useful when data is not evenly distributed. SELECTIVITY refers to the probability that any row will satisfy a predicate (i.e., be true). A highly selective predicate is desirable. Such predicates return fewer rows for future operators to work on, thereby requiring less CPU and I/O to satisfy the query.
For example, a selectivity of 0.01 (1%) for a predicate operating on a table with 1,000,000 rows would mean that 10,000 rows are estimated to qualify and 990,000 to not qualify.
You can influence the optimizer by rather than just using the assumption of uniform distribution, use an artificially low selectivity value (such as 0.000001) to "guarantee" that an index on that column will be used. This is especially useful if you expect a table to grow and want to ensure that it sticks to using indexes on specific columns. SELECTIVITY of 1 can be used to discourage DB2 from using an index on a specific column.
Using an optimization level of 5 (e.g., DFT_QUERYOPT=5) works most predictably with this technique. Also, you must first set the registry variable DB2_SELECTIVITY=YES and restart the instance before being able to use the SELECTIVITY clause.
You can specify the SELECTIVITY clause for the following predicates:
- A basic predicate in which at least one expression contains host variables/parameter markers (a basic predicate includes simple comparisons like =, <>, <, and <=, but does not include things like IN, BETWEEN, and IS NULL.)
- A LIKE predicate in which the MATCH expression, predicate expression, or escape expression contains host variables/parameter markers
The selectivity value must be a numeric literal value in the inclusive range from 0 to 1. If SELECTIVITY is not specified, an intelligent default value is used. A value of 0.01 would mean that the predicate is expected to filter out all but one percent of all the rows in the table. Providing SELECTIVITY should be considered a last resort.
Listing 16. Specifyin Selectivity Example
SELECT c1, c2, c3, FROM T1, T2, T3 WHERE T1.x = T2.x AND T2.y = T3.y AND T1.x >= ? selectivity 0.000001 AND T2.y < ? selectivity 0.5 AND T3 = ? selectivity 1
- Reference ==> SQL ==> Language elements ==> Predicates ==> Search Conditions
The following are useful techniques to maintain optimal performance of your database. Remember the scope of commands when you are using a partitioned database. For example, the RUNSTATS command collects statistics for a table on the database partition from which it is invoked, while REORG operates over all partitions of the database partition group. See the ‘Scope’ section of the documentation for each command.
REORG and REORGCHK
REORG will eliminate overflow rows and reclaim space from deleted rows of tables and indexes, which is very useful if there have been a large number of deletes, updates, or inserts. It can also be used to put a table into a certain index sequence (e.g., in support of a clustered index). REORGs can be performed online and paused. REORCHK is used to identify those tables and indexes that are in need of a REORG and can also be used to collect statistics for all the tables in the database.
To collect updated statistics and identify if tables or indexes need to be
REORGCHK UPDATE STATISTICS ON TABLE ALL
To identify if tables or indexes need to be reorged based on the current
REORGCHK CURRENT STATISTICS ON TABLE ALL
The tables which have been identified as needing a REORG will show one or
more asterisks in the reorg columns (F1 to F8) of the REORGCHK output. Now
you can actually REORG the identified tables. To REORG a table without
regard to order and simply reclaim space from pseudo-deleted rows and
remove overflow rows:
REORG TABLE schema.tablename
To perform a REORG on the all the indexes for a table:
REORG INDEXES ALL FOR TABLE schema.tablename
To order a table in physical sequence according to a specific index (great
for ORDER BY or a clustered index):
REORG TABLE schema.tablename INDEX schema.indexname
- Reference ==> Commands ==> Command Line Processor (CLP) ==> REORG
- Reference ==> Commands ==> Command Line Processor (CLP) ==> REORGCHK
After REORGanizing tables and indexes, it is always a good idea to recollect the table and index statistics so that the optimizer can create the most appropriate access plan. You may find it useful to perform a sampled RUNSTATS (for better performance on large databases) or perform a background run.
To identify whether RUNSTATS has been performed before on tables and indexes:
Listing 17. Identifying Runstats Time
SELECT char(tabname,40) FROM syscat.tables WHERE type = ‘T’ AND stats_time is null SELECT char(indname,40) FROM syscat.indexes WHERE stats_time is null Or, to list runstat times (oldest first) SELECT char(tabname,40), stats_time FROM syscat.tables WHERE type = ‘T’ ORDER by stats_time
The System Catalog tables also benefit from having RUNSTATS performed on them.
The following are some useful commands:
Table 4. Useful Runstats Commands
|RUNSTATS ON TABLE schema.table||Collects statistics for a specific table.|
|RUNSTATS ON TABLE schema.table AND INDEXES ALL||Collects statistics for a specific table and all of its indexes.|
|RUNSTATS ON TABLE schema.table AND SAMPLED DETAILED INDEXES ALL||Collects statistics on a specific table and all of its indexes using extended index statistics and a CPU sampling technique, which is useful for very large indexes (1+ million rows) where time available for RUNSTATS is limited.|
|RUNSTATS ON TABLE schema.table WITH DISTRIBUTION||Collects additional statistics on a specific table (and optionally, specific columns), which is useful when data is not evenly distributed.|
It can also be handy to use a SELECT statement to create a script, where you simply pipe the results of a SELECT statement to a file. Use a statement similar to the following and pipe the results to a file:
Listing 18. Generating a Runstats CLP Script
SELECT 'RUNSTATS ON TABLE ' || rtrim(tabschema) || '.' || char(tabname,40) || ' AND DETAILED INDEXES ALL;' FROM syscat.tables WHERE type = 'T' ORDER BY tabschema, tabname;
- Reference ==> Commands ==> Command Line Processor (CLP) ==> RUNSTATS
After performing a REORG and RUNSTATS, you will want to REBIND all database
packages so that their static SQL can take advantage of the most recent
system statistics. Use DB2RBIND to rebind all the database packages:
db2rbind dbname -l logfile.out ALL
REBIND can be used to rebind single packages.
- Reference ==> Commands ==> System ==> db2rbind
- Reference ==> Commands ==> Command Line Processor (CLP) ==> REBIND
Database partitioning feature (DPF) performance
The Database partitioning feature (DPF) of DB2 v8.1 ESE allows for the partitioning of a database within a single server or across a cluster of servers. This provides increased scalability to support very large databases, complex workloads, and increased parallelism for administration tasks. The following section contains recommendations to help you get the best performance from DPF.
When to partition?
Before 64-bit DB2 came along, partitioning was commonly used to get around the restrictive shared memory limits of 32-bit architecture (typically around 2 GBs per database). The better option to take advantage of memory is to use a large SMP 64-bit server, as it avoids the complexities and overhead of partitioning altogether.
However, in some cases, partitioning can greatly speedup up the execution of Select, LOAD, BACKUP, and RESTORE. With each partition added, you reduce the amount of data that the processors on each partition need to work with. This improvement is not typically found in databases which have a small number of partitions, as the overhead of hashing the rows and sending data outweighs the performance gain of having less data to work with.
Another reason to partition is to overcome some per partition Database Manager limits (e.g., maximum table size of 64GBs per partition for a 4K page size).
How Many Partitions
This is a difficult question to answer, as some systems perform best with 1 CPU per partition, while others at 8 or more. The idea is to keep the CPUs assigned to each partition busy. Given an SMP machine, a good starting point is about 4 CPUs per partition. If you find that CPU usage is continually low (e.g., less than 40%), you may want to consider adding more partitions.
In general, the fewer partitions that there are per machine the better, as there is a greater chance of both local bypass and collocation (explained below).
Choosing an ideal partitioning key
By choosing a proper partitioning key you can help ensure a balanced data distribution and workload along with efficient table collocation.
Typically the following should be kept in mind when choosing a partitioning key:
- Always specify the partitioning key instead of letting it default.
- The partitioning key must be a subset of the primary key or unique index.
- A column with many distinct values makes a good candidate. If the column has only a limited number of distinct values, only a few hashing numbers will be generated which will increase the odds of having skewed data and an unbalanced workload.
- If the partitioning key consists of too many columns (typically 4 or more) performance degradation will be caused by simply generating the hashing numbers.
- Use the same set of columns as a partitioning key on tables which are frequently joined, to increase number of collocated joins.
- An Integer column is more efficient than using a character column which is more efficient than using a decimal column.
- Reference ==> SQL ==> Language elements ==> Data types ==> Partition-compatible data types
Table collocation allows for the local processing of a query (within the same logical partition) which avoids unnecessary data movement among the participating partitions. To help ensure table collocation, use the join columns as partitioning keys and place those joined tables in table spaces which share the same partition group. The joined tables’ partitioning keys should have the same number of columns and corresponding data types.
If some tables cannot be partitioned on the same keys as the tables that they are typically joined with and those tables are modest is size and typically read-only, replicated Materialized Query Tables (MQTs) may be an efficient solution for performance. This allows for the contents of an entire table (or portion of the table) to be replicated to every partition in the database partition group. However, if the table is frequently updated this can hurt performance due to the increase in resource usage.
To create a simple replicated MQT, use the following syntax:
CREATE TABLE replicated_table AS (SELECT * FROM source_table) DATA INITIALLY DEFERRED REFRESH IMMEDIATE REPLICATED
For more information on MQTs, see the appropriate section under that Table discussion.
If you are using AIX and have enabled the DB2 Profile Registry variable DB2_FORCE_FCM_BP=YES, then when using multiple logical partitions (i.e., on the same machine) data that is transferred between partitions is done so through shared memory, which is extremely fast.
- Release information ==> Version 8.1 ==> New features ==> Performance enhancements ==> Materialized query tables
- Reference ==> SQL ==> Functions ==> Scalar ==> DBPARTITIONNUM
Use REDISTRIBUTE DATABASE PARTITION GROUP to rebalance data across partitions and update the hash partitioning maps to be more balanced. This is useful if you have added a partition or observe unbalanced data volumes amongst the current partitions.
You can use the HASHEDVALUE and DBPARTITIONNUM SQL functions to determine
the current data distribution across hash partitions or database
partitions. Avoid having too much data on one or more partitions and too
little data on one or more partitions. The PARTITION function returns the
partitioning map index for each row of the table while the DBPARTITIONNUM
function returns the partition number of the row. For example, to find the
current distribution of a table:
SELECT DBPARTITIONNUM(column), COUNT(*) FROM table GROUP BY DBPARTITIONNUM(column)
- Reference ==> Commands ==> Command Line Processor (CLP) ==> REDISTRIBUTE DATABASE PARTITION GROUP
Utilization and bottlenecks
The following identifies potential problems which are typically outside the realm of DB2 tuning, but can significantly decrease database performance.
A system is considered to be CPU bound if total CPU utilization (user + system) is greater than 80%. The rule of thumb is to keep CPU utilization (for the most part) below 80%, as this leaves the capacity to handle short bursts of increased activity. If you seem to be CPU-bound, it is hard to generalize what the problem could be. It could be anything from an inefficient access plan to high-connection concurrency which requires more CPU.
On Unix, monitor with vmstat (e.g., "vmstat 3"). On Windows, monitor with Perfmon.exe or Task Manager. Ignore short bursts (1-3 seconds) of 100% busy CPU and focus on the long-term average CPU.
Listing 19 shows some output from RedHat 8.0 Linux with the important columns bolded. The columns that you want to look at are the User CPU utilization (us) and System CPU utilization (sy). The id column shows idle time. You should always disregard the first line. Here we see a system that has a pretty high User CPU utilization percentage and then the system returns to normal.
Listing 19. Sample vmstat output
[db2inst1@65658161 db2inst1]$ vmstat 3 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 0 0 0 704388 65460 165592 0 0 0 21 20 9 2 2 6 0 1 1 0 642096 65660 206060 0 0 515 2149 911 1047 18 9 72 2 1 0 0 639712 65668 208292 0 0 139 2287 862 938 30 10 60 1 0 0 0 629772 65692 215496 0 0 0 581 568 422 94 1 4 1 0 0 0 625764 65696 218956 0 0 0 1809 612 423 91 1 8 1 0 0 0 623752 65704 220752 0 0 11 1741 712 549 85 8 7 0 0 0 0 633548 65712 217768 0 0 11 1264 728 700 17 4 79 1 0 0 0 633556 65712 217768 0 0 0 87 621 280 5 7 88 0 0 0 0 633556 65712 217768 0 0 0 0 519 150 0 0 100 1 0 0 0 633556 65712 217768 0 0 0 0 523 154 0 0 100
High CPU utilization can sometimes be attributed to excessive table scans or index scans on large tables. Look for indexing opportunities by analyzing SQL statements which have very high "rows read" values (in SQL snapshots).
You can also monitor at the process level, to get a better idea of what may be consuming all of the CPU. Monitor on UNIX with ps (e.g., "ps uax") and on Windows with Perfmon.exe or Task Manager. Ignore short bursts (1 to 3 seconds) of 100% and instead focus on the long-term average.
For example, on RedHat Linux 8.0, we can see how much CPU each process is consuming by issuing "ps uax":
Listing 20. Sample ps output
[db2inst1@65658161 tmp]$ ps uax USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND db2inst1 9967 0.0 2.5 123416 26620 ? S Feb15 0:00 db2agent (idle) db2inst1 10020 2.1 7.4 435200 76952 ? R Feb15 2:12 db2agent (TEST1) db2inst1 3643 0.1 3.9 249544 41220 ? S 13:17 0:00 db2loggw (TEST1) db2inst1 3649 0.0 4.0 249540 41320 ? S 13:17 0:00 db2pclnr
A system is considered to be I/O bound when the disk is regularly more than 45% busy (usually the %util column). If you are experiencing a disk bottleneck, make sure that your table spaces’ containers are spread across all available disks. If disk utilization is still high, you likely need more disks.
Unfortunately, iostat has different forms of output depending on which OS you use it on. Monitor on UNIX with iostat (e.g., "iostat 3") and on Windows with Perfmon.exe. Ignore short bursts (1-3 seconds) of 100% busy and instead focus on the long-term average.
If you are using Linux, use the "iostat –d –x 3" command to enable extended disk information and look for disks which have a service time of more than 50 ms (svctm) and a disk that is more than a 45% busy). The following output has had some columns of data ommitted for formatting reasons.
Listing 21. Sample iostat output
[db2inst1@65658161 tmp]$ iostat -d -x 3 Linux 2.4.18-14 (65658161.torolab.ibm.com) 02/18/2004 Device: r/s w/s rsec/s wsec/s rkB/s wkB/s await svctm %util /dev/hda 0.01 2.25 0.19 41.81 0.09 20.91 0.60 1.88 0.42 /dev/hda1 0.00 0.00 0.00 0.00 0.00 0.00 277.20 176.86 0.00 /dev/hda2 0.00 0.00 0.00 0.00 0.00 0.00 4.11 4.11 0.00 /dev/hda3 0.01 2.25 0.19 41.81 0.09 20.91 0.58 0.66 0.15 Device: r/s w/s rsec/s wsec/s rkB/s wkB/s await svctm %util /dev/hda 0.00 383.67 0.00 5632.00 0.00 2816.00 8.10 1.35 51.97 /dev/hda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 /dev/hda2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 /dev/hda3 0.00 383.67 0.00 5632.00 0.00 2816.00 8.10 2.16 82.93
Memory and paging-space utilization
From a DB2 perspective, a system is considered to be memory bound when paging is occurring. Once paging begins to occur, performance usually drops drastically.
On UNIX, use "lsps –a" to list the paging space characteristics and vmstat (e.g., "vmstat 3") to monitor. On Windows, use Perfmon.exe or Task Manager.
For example, on RedHat Linux 8.0 you will want to pay attention to the swap information. Specifically, the swap in (si) and swap out (so) columns show the amount of memory swapped in from disk and swapped out to disk in kB/sec (on AIX it is measured in 4K pages/sec).
Listing 22. Sample iostat output
[db2inst1@65658161 tmp]$ vmstat 3 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 0 0 0 675160 66088 175908 0 0 0 21 22 9 2 2 6 0 0 0 0 675096 66088 175908 0 2 0 37 624 246 5 7 88 2 0 0 0 665376 66088 183408 0 0 11 88 666 826 6 8 85 1 0 0 0 665376 66088 183408 1 0 0 76 623 452 5 8 88 2 0 0 0 654748 66096 191112 0 0 79 48 619 847 2 4 94 3 0 0 0 652760 66096 191192 0 0 15 47 578 791 2 2 96
Though the network is not typically a significant bottleneck, some tuning may improve performance. A system could be considered to be Network bound if CPU and I/O utilization are both quite low, and there are performance problems occurring when communicating across the network to the DB2 server. The most serious degradation can be caused in a partitioned database if a partitioning strategy results in non-collocated joins.
You can monitor on UNIX with netpmon (e.g., "netpmon -O all -o netpmon.out") and on Windows with Perfmon.exe.
|Code sample||scripts.zip ( HTTP | FTP )||3.10 KB|
- The DB2 Web Site is the DB2 product web site, where all DB2 related information can be found.
- The DB2 Product Technical Library is a repository of DB2 documentation including articles, HOWTO documents, FAQs, and more.
- DB2 White Papers is a repository of DB2 white papers and technical articles.
- IBM PartnerWorld for Developers is a repository of DB2 developer resources, forums, papers, discussions, etc.
- " Diagnosing and Resolving Lock Problems with DB2 for Linux, UNIX, and Windows " ( developerWorks , October 2003) discusses how DB2 implements its locking strategy and how to resolve related problems.