Indexes to avoid sorts
In addition to selective access to data, indexes can also order data, and sometime eliminate the need to sort the data.
Some sorts can be avoided
if index keys are in the order needed by ORDER BY, GROUP BY, a join
operation, or DISTINCT in an aggregate function. In other cases, such
as when list prefetch is used, the index does not provide useful ordering,
and the selected data might have to be sorted.
When it is absolutely necessary to prevent a sort, consider creating an index on the column or columns necessary to provide that ordering. Consider also using the clause OPTIMIZE FOR 1 ROW to discourage Db2 from choosing a sort for the access path.
Consider the following query:
SELECT C2, SUM(C3)
FROM T1
WHERE C1 = 17
GROUP BY C2;
An ascending index on C1 or an index on (C1,C2,C3®) could eliminate a sort.
Backward index scan
In some cases, Db2 can use a backward index scan on a descending index to avoid a sort on ascending data. Similarly, an ascending index can be used to avoid a sort on descending data. For Db2 to use a backward index scan, the following conditions must be true:
- The index includes the columns in the ORDER BY clause in the same order that they appear in the ORDER BY clause.
- Each column in the sequence must have the opposite sequence (ASC or DESC) of the ORDER BY clause.
Example: backward index scan
Suppose that an index exists on the ACCT_STAT table. The index is defined by the following columns: ACCT_NUM, STATUS_DATE, STATUS_TIME. All of the columns in the index are in ascending order. Now, consider the following SELECT statements:
SELECT STATUS_DATE, STATUS
FROM ACCT_STAT
WHERE ACCT_NUM = :HV
ORDER BY STATUS_DATE DESC, STATUS_TIME DESC;
SELECT STATUS_DATE, STATUS
FROM ACCT_STAT
WHERE ACCT_NUM = :HV
ORDER BY STATUS_DATE ASC, STATUS_TIME ASC;
By using a backward index scan, Db2 can use the same index for both statements.
Randomized index key columns
You might also be able to avoid a sort by using the RANDOM option to create an index with a randomized key column, as long as the randomized key column is not included within an ORDER BY clause.
Example: randomized index key columns
You can avoid sorts in query that uses GROUP BY processing by using an index with a randomized key. Consider the following statements:
CREATE INDEX I1
ON T1(C1, C2 RANDOM, C3);
SELECT C2, SUM(C3)
FROM T1
WHERE C1 = 17
GROUP BY C2;
The query can use index I1 because all equal values of the original column C2 are stored contiguously on the index, and have identical random values stored. Although, the order of the query's output would appear to be arbitrary (as opposed to the output if an ASC or DESC index was used), the correctness of the results is not effected. Only the order in which the result tuples are represented to the application is effected by the randomization. If you want to see the results in order, you must enforce the order with an ORDER BY statement, which requires a sort.
When sorts are more efficient
Not all sorts are inefficient. For example, if the index that provides ordering is not an efficient one and many rows qualify, it is possible that using another access path to retrieve and then sort the data could be more efficient than the inefficient, ordering index.
Indexes
that are created to avoid sorts can sometimes be non-selective. If
these indexes require data access and if the cluster ratio is poor,
these indexes are unlikely to be chosen. Accessing many rows by using
a poorly clustered index is often less efficient than accessing rows
by using a table space scan and sort. Both table space scan and sort
benefit from sequential access.