Data access through index scans

An index scan occurs when the database manager accesses an index to narrow the set of qualifying rows (by scanning the rows in a specified range of the index) before accessing the base table; to order the output; or to retrieve the requested column data directly (index-only access).

When scanning the rows in a specified range of the index, the index scan range (the start and stop points of the scan) is determined by the values in the query against which index columns are being compared. In the case of index-only access, because all of the requested data is in the index, the indexed table does not need to be accessed.

If indexes are created with the ALLOW REVERSE SCANS option, scans can also be performed in a direction that is opposite to that with which they were defined.

The optimizer chooses a table scan if no appropriate index is created, or if an index scan would be more costly. An index scan might be more costly:

When the table is small.
When the index-clustering ratio is low.
When the query requires most of the table rows.
Where extra sorts are required when a partitioned index is used. A partitioned index cannot preserve the order in certain cases.
Where extra sorts are required when an index with random ordering is used.

To determine whether the access plan uses a table scan or an index scan, use the Db2® explain facility.

Index scans to limit a range

To determine whether an index can be used for a particular query, the optimizer evaluates each column of the index, starting with the first column, to see if it can be used to satisfy equality and other predicates in the WHERE clause. A predicate is an element of a search condition in a WHERE clause that expresses or implies a comparison operation. Predicates can be used to limit the scope of an index scan in the following cases:

Tests for IS NULL or IS NOT NULL
Tests for strict and inclusive inequality
Tests for equality against a constant, a host variable, an expression that evaluates to a constant, or a keyword
Tests for equality against a basic subquery, which is a subquery that does not contain ANY, ALL, or SOME; this subquery must not have a correlated column reference to its immediate parent query block (that is, the select for which this subquery is a subselect).

Note: A range predicate or IS NOT NULL predicate on the random column of an index cannot be used as a start-stop predicate for that index. Only equality predicates on the random column of an index can be used in a start-stop predicate. However, the random index can still be chosen to satisfy the query by doing a full index scan. The random index can also be chosen to satisfy the query by using jumpscan with the random column treated as a gap.

The following examples show how an index could be used to limit the range of a scan.

Consider an index with the following definition:
```
   INDEX IX1:  NAME    ASC,
               DEPT    ASC,
               MGR     DESC,
               SALARY  DESC,
               YEARS   ASC
```
The following predicates could be used to limit the range of a scan that uses index IX1:
```
   where
     name = :hv1 and
     dept = :hv2
```
or
```
   where
     mgr = :hv1 and
     name = :hv2 and
     dept = :hv3
```
The second WHERE clause demonstrates that the predicates do not have to be specified in the order in which the key columns appear in the index. Although the examples use host variables, other variables, such as parameter markers, expressions, or constants, could be used instead.
In the following WHERE clause, only the predicates that reference NAME and DEPT would be used for limiting the range of the index scan:
```
   where
     name = :hv1 and
     dept = :hv2 and
     salary = :hv4 and
     years = :hv5
```
Because there is a key column (MGR) separating these columns from the last two index key columns, the ordering would be off. However, after the range is determined by the name = :hv1 and dept = :hv2 predicates, the other predicates can be evaluated against the remaining index key columns.
Consider an index that was created using the ALLOW REVERSE SCANS option:
```
   create index iname on tname (cname desc) allow reverse scans
```
In this case, the index (INAME) is based on descending values in the CNAME column. Although the index is defined for scans running in descending order, a scan can be done in ascending order. Use of the index is controlled by the optimizer when creating and considering access plans.

Index scans to test inequality

Certain inequality predicates can limit the range of an index scan. There are two types of inequality predicates:

Strict inequality predicates
The strict inequality operators that are used for range-limiting predicates are greater than (>) and less than (<).
Only one column with strict inequality predicates is considered for limiting the range of an index scan. In the following example, predicates on the NAME and DEPT columns can be used to limit the range, but the predicate on the MGR column cannot be used for that purpose.
```
   where
     name = :hv1 and
     dept > :hv2 and
     dept < :hv3 and
     mgr < :hv4
```
However, in access plans that use jump scans, multiple columns with strict inequality predicates can be considered for limiting the range of an index scan. In this example (assuming the optimizer chooses an access plan with a jump scan), the strict inequality predicates on the DEPT, and MGR columns can be used to limit the range. While this example focuses on strict inequality predicates, note that the equality predicate on the NAME column is also used to limit the range.
Inclusive inequality predicates
The inclusive inequality operators that are used for range-limiting predicates are:
- >= and <=
- BETWEEN
- LIKE
Multiple columns with inclusive inequality predicates can be considered for limiting the range of an index scan or a jump scan. In the following example, all of the predicates can be used to limit the range.
```
   where
     name = :hv1 and
     dept >= :hv2 and
     dept <= :hv3 and
     mgr <= :hv4
```
Suppose that :hv2 = 404, :hv3 = 406, and :hv4 = 12345. The database manager will scan the index for departments 404 and 405, but it will stop scanning department 406 when it reaches the first manager whose employee number (MGR column) is greater than 12345.

Index scans to order data

If a query requires sorted output, an index can be used to order the data if the ordering columns appear consecutively in the index, starting from the first index key column. Ordering or sorting can result from operations such as ORDER BY, DISTINCT, GROUP BY, an = ANY subquery, a > ALL subquery, a < ALL subquery, INTERSECT or EXCEPT, and UNION. Exceptions to this are as follows:

If the index is partitioned, it can be used to order the data only if the index key columns are prefixed by the table-partitioning key columns, or if partition elimination eliminations all but one partition.
Ordering columns can be different from the first index key columns when index key columns are tested for equality against constant values or any expression that evaluates to a constant.
Indexes with random ordering cannot be used to order data.

Consider the following query:

   where
     name = 'JONES' and
     dept = 'D93'
   order by mgr

For this query, the index might be used to order the rows, because NAME and DEPT will always be the same values and will therefore be ordered. That is, the preceding WHERE and ORDER BY clauses are equivalent to:

   where
     name = 'JONES' and
     dept = 'D93'
   order by name, dept, mgr

A unique index can also be used to truncate a sort-order requirement. Consider the following index definition and ORDER BY clause:

   UNIQUE INDEX IX0: PROJNO  ASC

   select projno, projname, deptno
     from project
     order by projno, projname

Additional ordering on the PROJNAME column is not required, because the IX0 index ensures that PROJNO is unique: There is only one PROJNAME value for each PROJNO value.

Jump scans

Queries against tables with composite (multi-column) indexes present a particular challenge when designing indexes for tables. Ideally, a query's predicates are consistent with a table's composite index. This would mean that each predicate could be used as a start-stop key, which would, in turn, reduce the scope of the index needing to be searched. When a query contains predicates that are inconsistent with a composite index, this is known as an index gap. As such, index gaps are a characteristic of how a query measures up to a table's indexes.

Unconstrained index gap

Consider the following query against a table with the IX1 composite index defined earlier in this topic:

   where
     name = :hv1 and
     dept = :hv2 and
     mgr = :hv3 and
     years = IS NOT NULL

This query contains an index gap: on the SALARY key part of the composite index (this is assuming that the access plan contains an index scan on the composite index). The SALARY column cannot be included as a start-stop predicate. The SALARY column is an example of an unconstrained index gap.

Note: For some queries it can be difficult to assess whether or not there are index gaps. Use the EXPLAIN output to determine if index gaps are present.

Constrained index gap

Consider the following query against a table with the IX1 composite index defined earlier in this topic:

   where
     name = :hv1 and
     dept = :hv2 and
     mgr = :hv3 and
     salary < :hv4 and
     years = :hv5

This query contains an index gap on the SALARY key part of the composite index (this is assuming that the access plan contains an index scan on the composite index). Since the SALARY column in the query is not an equality predicate, start-stop values cannot be generated for this column. The SALARY key part represents a constrained index gap.

To avoid poor performance in queries with index gaps, the optimizer can perform a jump scan operation. In a jump scan operation, the index manager identifies qualifying keys for small sections of a composite index where there are gaps, and fills these gaps with these qualifying keys. The end result is that the index manager skips over parts of the index that will not yield any results.

Jump scan restrictions

For queries being issued where you expect a jump scan, verify that the target table has an appropriate composite index and that the query predicates introduce an index gap. The Db2 optimizer will not create plans with jump scans if there are no index gaps.

Jump scans do not scan the following types of indexes:

range-clustered table indexes
extended indexes (for example, spatial indexes)
XML indexes
text indexes (for Text Search)

With jump scans, a gap column with an IN-List predicate might be treated as an unconstrained gap column. Also, in databases with Unicode Collation Algorithm (UCA) collation, jump scans might treat LIKE predicates with host variables or parameter markers as unconstrained gaps.

Note: When evaluating queries, there can be cases where the optimizer chooses an access plan that does not include a jump scan operation, even if index gaps are present. This would occur if the optimizer deems an alternative to using a jump scan to be more efficient.