Tuning sort performance

Because queries often require sorted or grouped results, proper configuration of the sort heap is crucial to good query performance.

Sorting is required when:
  • No index that satisfies a requested order exists (for example, a SELECT statement that uses the ORDER BY clause)
  • An index exists, but sorting would be more efficient than using the index
  • An index is created
  • An index is dropped, which causes index page numbers to be sorted

Elements that affect sorting

The following factors affect sort performance:
  • Settings for the following configuration parameters:
    • Sort heap size (sortheap), which specifies the amount of memory to be used for each sort
    • Sort heap threshold (sheapthres) and the sort heap threshold for shared sorts (sheapthres_shr), which control the total amount of memory that is available for sorting across the instance
  • The number of statements in a workload that require a large amount of sorting
  • The presence or absence of indexes that might help avoid unnecessary sorting
  • The use of application logic that does not minimize the need for sorting
  • Parallel sorting, which improves sort performance, but which can only occur if the statement uses intrapartition parallelism
  • Whether or not the sort is overflowed. If the sorted data cannot fit into the sort heap, which is a block of memory that is allocated each time a sort is performed, the data overflows into a temporary table that is owned by the database.
  • Whether or not the results of the sort are piped. If sorted data can return directly without requiring a temporary table to store the sorted list, it is a piped sort.

    In a piped sort, the sort heap is not freed until the application closes the cursor that is associated with the sort. A piped sort can continue to use up memory until the cursor is closed.

Although a sort can be performed entirely in sort memory, this might cause excessive page swapping. In this case, you lose the advantage of a large sort heap. For this reason, you should use an operating system monitor to track changes in system paging whenever you adjust the sorting configuration parameters.

Techniques for managing sort performance

Identify particular applications and statements where sorting is a significant performance problem:
  1. Set up event monitors at the application and statement level to help you identify applications with the longest total sort time.
  2. Within each of these applications, find the statements with the longest total sort time.

    You can also search through the explain tables to identify queries that have sort operations.

  3. Use these statements as input to the Design Advisor, which will identify and can create indexes to reduce the need for sorting.
You can use the self-tuning memory manager (STMM) to automatically and dynamically allocate and deallocate memory resources required for sorting. To use this feature:
  • Enable self-tuning memory for the database by setting the self_tuning_mem configuration parameter to ON.
  • Set the sortheap and sheapthres_shr configuration parameters to AUTOMATIC.
  • Set the sheapthres configuration parameter to 0.
You can also use the database system monitor and benchmarking techniques to help set the sortheap, sheapthres_shr, and sheapthres configuration parameters. For each database manager and for each database:
  1. Set up and run a representative workload.
  2. For each applicable database, collect average values for the following performance variables over the benchmark workload period:
    • Total sort heap in use (the value of the sort_heap_allocated monitor element)
    • Active sorts and active hash joins (the values of the active_sorts and active_hash_joins monitor elements)
  3. Set sortheap to the average total sort heap in use for each database.
    Note: If long keys are used for sorts, you might need to increase the value of the sortheap configuration parameter.
  4. Set the sheapthres. To estimate an appropriate size:
    1. Determine which database in the instance has the largest sortheap value.
    2. Determine the average size of the sort heap for this database.

      If this is too difficult to determine, use 80% of the maximum sort heap.

    3. Set sheapthres to the average number of active sorts, times the average size of the sort heap computed previously. This is a recommended initial setting. You can then use benchmark techniques to refine this value.

IBM® Data Server Manager provides tools for improving the performance of single SQL statements and the performance of groups of SQL statements, which are called query workloads. For more information about this product, see Tuning.