Many factors affect the performance of sort operations,
but you can follow certain recommendations to reduce I/O contention
and minimize sort row size.
About this task
A sort operation is invoked when a cursor is opened for a SELECT statement that requires sorting. The following factors affect the performance of Db2 sort processing:
- Sort pool size
- I/O contention
- Sort row size
- Whether the data is already sorted
For any SQL statement that initiates sort activity, the IBM® OMEGAMON® for Db2 Performance Expert on z/OS® SQL activity reports provide information about the efficiency of the sort that is involved.
Procedure
To minimize the performance impacts of sort processing,
use any of the following approaches:
- Increase the size of the sort pool.
The larger the sort pool, the more efficient the sort. The maximum size of the sort work area allocated for each concurrent sort user depends on the value that you specified for the SRTPOOL subsystem parameter. To determine a rough estimate for the maximum sort size, use the following formula:
32000 × (16 + sort key length + sort data length)
For sort key length and sort data length, use values that represent the maximum values for the queries that you run. To determine these values, refer to the QW0096KL (key length) and QW0096DL (data length) fields in IFCID 96, as mapped by macro DSNDQW01. You can also determine these values from an SQL activity trace.
If a column in the ORDER BY clause is not in the SELECT clause, include that column in both the sort data length and the sort key length, as shown in the following example:
SELECT C1, C2, C3
FROM tablex
ORDER BY C1, C4;
If C1, C2, C3, and C4 are each 10 bytes in length, you could estimate the sort pool size as follows:
32000 × (16 + 20 + (10 + 10 + 10 + 10)) = 2432000 bytes
The values used in the example are described in the following table:
Table 1. Values used in the sort pool size example
Attribute                                             Value
Maximum number of sort nodes                          32000
Size (in bytes) of each node                          16
Sort key length (ORDER BY C1, C4)                     20
Sort data length (each column is 10 bytes in length)  10 + 10 + 10 + 10
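As a rough sketch of the estimate above, the formula can be computed as follows (the helper function and variable names are illustrative, not part of Db2):

```python
# Rough sort pool size estimate: 32000 × (16 + key length + data length).
# The helper name and column lengths are illustrative only.

MAX_SORT_NODES = 32000  # maximum number of sort nodes
NODE_SIZE = 16          # size in bytes of each node

def sort_pool_estimate(sort_key_length: int, sort_data_length: int) -> int:
    """Return a rough maximum sort pool size, in bytes."""
    return MAX_SORT_NODES * (NODE_SIZE + sort_key_length + sort_data_length)

# SELECT C1, C2, C3 FROM tablex ORDER BY C1, C4;
# C4 is not in the SELECT clause but still counts in both lengths.
key_length = 10 + 10               # ORDER BY C1, C4
data_length = 10 + 10 + 10 + 10    # C1, C2, C3, plus C4
print(sort_pool_estimate(key_length, data_length))  # 2432000
```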
- Minimize I/O contention on the I/O paths to the physical work files by allocating the work files on different I/O paths and volumes. Using disk devices with Parallel Access Volume (PAV) support is another way to significantly reduce I/O contention.
When I/Os occur in the merge phase of a sort, Db2 uses sequential prefetch to bring pages into the buffer pool with a prefetch quantity of eight pages. However, if the buffer pool is constrained, Db2 uses a prefetch quantity of four pages or fewer, or disables prefetch entirely because not enough pages are available.
- Allocate additional physical work files in excess of the defaults, and put those work files in their own buffer pool.
Segregating work file activity enables you to better monitor and tune sort performance. It also allows Db2 to handle sorts more efficiently because the buffers are available only for sort work, without interference from other Db2 work.
- Increase the amount of available space for work files.
Applications that use created temporary tables use work file space until a COMMIT or ROLLBACK occurs. (If a cursor is defined WITH HOLD, the data is held past the COMMIT.) If sort operations happen while the temporary tables exist, you might need to provide more space for the work files.
Applications that require star join, materialized views, materialized nested table expressions, non-correlated subqueries, or triggers also use work files.
- Write applications to sort only columns that require sorting.
Each column in the sort key is counted twice when the sort row size is calculated. A smaller sort row size means that more rows can fit in the sort pool.
- Select VARCHAR columns only when they are required.
Varying-length columns in the sort key, but not varying-length columns in the sort data, are padded to their maximum length when the sort row size is calculated.
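A minimal sketch of how the previous two recommendations affect sort row size, assuming hypothetical column lengths (the function below is illustrative, not a Db2 API):

```python
NODE_SIZE = 16  # size in bytes of each sort node

def sort_row_size(key_lengths, data_lengths):
    """Rough sort row size: sort key columns count once in the key
    portion (VARCHAR keys padded to their maximum declared length)
    and again in the data portion, so each sorted column is counted
    twice."""
    return NODE_SIZE + sum(key_lengths) + sum(data_lengths)

# ORDER BY a VARCHAR(30) column: 30 bytes in the key (padded) plus
# 30 bytes in the data portion, alongside a 10-byte unsorted column.
print(sort_row_size([30], [30, 10]))  # 86
```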
- Set the buffer pool sequential steal threshold (VPSEQT) to 99% unless a sparse index is used to access the work files.
The default value of 80% allows 20% of the buffers to go unused. A value of 99% prevents space map pages, which are randomly accessed, from being overwritten by massive prefetch.
- Increase the buffer pool deferred write threshold (DWQT) or data set deferred write threshold (VDWQT) values.
If the DWQT or VDWQT threshold is reached, writes are scheduled. For a large sort that uses many logical work files, scheduled writes are difficult to avoid, even if a very large buffer pool is specified. As you increase the value of VDWQT, watch for buffer shortage conditions; if shortages occur, either increase the work file buffer pool size or reduce VDWQT.