DB2 Version 9.7 for Linux, UNIX, and Windows

# Table space impact on query optimization

Certain characteristics of your table spaces can affect the access plans that are chosen by the query compiler.

These characteristics include:
• Container characteristics
Container characteristics can have a significant impact on the I/O cost that is associated with query execution. When it selects an access plan, the query optimizer considers these I/O costs, including any cost differences when accessing data from different table spaces. Two columns in the SYSCAT.TABLESPACES catalog view are used by the optimizer to help estimate the I/O costs of accessing data from a table space:
• OVERHEAD provides an estimate of the time (in milliseconds) that is required by the container before any data is read into memory. This overhead activity includes the container's I/O controller overhead as well as the disk latency time, which includes the disk seek time.
You can use the following formula to estimate the overhead cost:
``````   OVERHEAD = average seek time in milliseconds
+ (0.5 * rotational latency)``````
where:
• 0.5 represents the average overhead of one half rotation
• Rotational latency (in milliseconds) is calculated for each full rotation, as follows:
``   (1 / RPM) * 60 * 1000``
where:
• You divide by rotations per minute to get minutes per rotation
• You multiply by 60 seconds per minute
• You multiply by 1000 milliseconds per second
For example, assume that a disk performs 7200 rotations per minute. Using the rotational-latency formula:
``   (1 / 7200) * 60 * 1000 = 8.328 milliseconds``
This value can be used to estimate the overhead as follows, assuming an average seek time of 11 milliseconds:
``````   OVERHEAD = 11 + (0.5 * 8.328)
= 15.164``````
• TRANSFERRATE provides an estimate of the time (in milliseconds) that is required to read one page of data into memory.
If each table space container is a single physical disk, you can use the following formula to estimate the transfer cost in milliseconds per page:
``   TRANSFERRATE = (1 / spec_rate) * 1000 / 1024000 * page_size``
where:
• You divide by spec_rate, which represents the disk specification for the transfer rate (in megabytes per second), to get seconds per megabyte
• You multiply by 1000 milliseconds per second
• You divide by 1 024 000 bytes per megabyte
• You multiply by the page size (in bytes); for example, 4096 bytes for a 4-KB page
For example, suppose that the specification rate for a disk is 3 megabytes per second. Then:
``````   TRANSFERRATE = (1 / 3) * 1000 / 1024000 * 4096
= 1.333248``````
or about 1.3 milliseconds per page.

If the table space containers are not single physical disks, but are arrays of disks (such as RAID), you must take additional considerations into account when estimating the TRANSFERRATE.

If the array is relatively small, you can multiply the spec_rate by the number of disks, assuming that the bottleneck is at the disk level. However, if the array is large, the bottleneck might not be at the disk level, but at one of the other I/O subsystem components, such as disk controllers, I/O busses, or the system bus. In this case, you cannot assume that the I/O throughput capacity is the product of the spec_rate and the number of disks. Instead, you must measure the actual I/O rate (in megabytes) during a sequential scan. For example, a sequential scan resulting from select count(*) from big_table could be several megabytes in size. In this case, divide the result by the number of containers that make up the table space in which BIG_TABLE resides. Use this result as a substitute for spec_rate in the formula given above. For example, a measured sequential I/O rate of 100 megabytes while scanning a table in a four-container table space would imply 25 megabytes per container, or a TRANSFERRATE of (1 / 25) * 1000 / 1 024 000 * 4096 = 0.16 milliseconds per page.

Containers that are assigned to a table space might reside on different physical disks. For best results, all physical disks that are used for a given table space should have the same OVERHEAD and TRANSFERRATE characteristics. If these characteristics are not the same, you should use average values when setting OVERHEAD and TRANSFERRATE.

You can obtain media-specific values for these columns from hardware specifications or through experimentation. These values can be specified on the CREATE TABLESPACE and ALTER TABLESPACE statements.

• Prefetching

When considering the I/O cost of accessing data in a table space, the optimizer also considers the potential impact that prefetching data and index pages from disk can have on query performance. Prefetching can reduce the overhead that is associated with reading data into the buffer pool.

The optimizer uses information from the PREFETCHSIZE and EXTENTSIZE columns of the SYSCAT.TABLESPACES catalog view to estimate the amount of prefetching that will occur.
• EXTENTSIZE can only be set when creating a table space. An extent size of 4 or 8 pages is usually sufficient.
• PREFETCHSIZE can be set when you create or alter a table space. The default prefetch size is determined by the value of the dft_prefetch_sz database configuration parameter. Review the recommendations for sizing this parameter and make changes as needed, or set it to AUTOMATIC.

After making changes to your table spaces, consider executing the runstats utility to collect the latest statistics about indexes and to ensure that the query optimizer chooses the best possible data-access plans before rebinding your applications.