Designing indexes

Indexes are typically used to speed up access to a table. However, they can also serve a logical data design purpose.

For example, a unique index does not allow entry of duplicate values in the columns, thereby guaranteeing that no two rows of a table are the same. Indexes can also be created to order the values in a column in ascending or descending sequence.

Important: When creating indexes, keep in mind that although they can improve read performance, they negatively impact write performance. This negative impact occurs because for every row that the database manager writes to a table, it must also update any affected indexes. Therefore, create indexes only when there is a clear overall performance advantage.

When creating indexes, also take into account the structure of the tables and the type of queries that are most frequently performed on them. For example, columns that appear in the WHERE clause of a frequently issued query are good candidates for indexes. In less frequently run queries, however, the cost that an index incurs for performance in INSERT and UPDATE statements might outweigh the benefits.

Similarly, columns that figure in a GROUP BY clause of a frequent query might benefit from the creation of an index, particularly if the number of values used to group the rows is small relative to the number of rows being grouped.

When creating indexes, keep in mind that they can also be compressed. You can modify the indexes later, by enabling or disabling compression by using the ALTER INDEX statement.

To remove or delete indexes, you can use the DROP INDEX command. Dropping indexes has the reverse requirements of inserting indexes; that is, to remove (or mark as deleted) the index entries.

Guidelines and considerations when designing indexes

  • Although the order of the columns that make up an index key does not make a difference to index key creation, it might make a difference to the optimizer when it is deciding whether to use an index. For example, if a query has an ORDER BY col1,col2 clause, an index created on (col1,col2) could be used, but an index created on (col2,col1) is of no help. Similarly, if the query specified a condition such as where col1 >= 50 and col1 <= 100 or where col1=74, then an index on (col1) or on (col1,col2) could be helpful, but an index on (col2,col1) is far less helpful.
    Note: Whenever possible, order the columns in an index key from the most distinct to the least distinct. This ordering provides the best performance.
  • Any number of indexes can be defined on a particular table, to a maximum of 32 767, and they can have a beneficial effect on the performance of queries. The index manager must maintain the indexes during update, delete and insert operations. Creating a large number of indexes for a table that receives many updates can slow down processing of requests. Similarly, large index keys can also slow down processing of requests. Therefore, use indexes only where a clear advantage for frequent access exists.

  • Column data which is not part of the unique index key but which is to be stored or maintained in the index is called an include column. Include columns can be specified for unique indexes only. When creating an index with include columns, only the unique key columns are sorted and considered for uniqueness. The use of include columns can enable index only access for data retrieval, thus improving performance.

  • If the table that is being indexed is empty, an index is still created, but no index entries are made until the table is loaded or rows are inserted. If the table is not empty, the database manager creates the index entries while processing the CREATE INDEX statement.

  • For a clustering index, the database manager attempts to place new rows for the table physically close to existing rows with similar key values (as defined by the index).

  • If you want a primary key index to be a clustering index, a primary key should not be specified on the CREATE TABLE statement. Once a primary key is created, the associated index cannot be modified. Instead, issue a CREATE TABLE without a primary key clause. Then issue a CREATE INDEX statement, specifying clustering attributes. Finally, use the ALTER TABLE statement to add a primary key that corresponds to the index just created. This index is used as the primary key index.

  • If you have a partitioned table, by default, any index that you create is a partitioned index unless you create a unique index that does not include the partitioning key. You can also create the index as a nonpartitioned index.

    Starting in Db2® version 9.7 Fix Pack 1, you can create an index over XML data on a partitioned table as either partitioned or nonpartitioned. The default is a partitioned index.

    Partitioned indexes offer benefits when performing roll-in operations with partitioned tables (attaching a data partition to another table by using the ATTACH PARTITION clause on the ALTER table statement.) With a partitioned index, you can avoid the index maintenance that you would otherwise have to perform with nonpartitioned indexes. When a partitioned table uses a nonpartitioned index, you must use the SET INTEGRITY statement to maintain the nonpartitioned index by incorporating the index keys from newly attached partitions. Not only is this time consuming, it also can require a large amount of log space, depending on the number of rows that are being rolled in.

  • Indexes consume disk space. The amount of disk space varies depending on the length of the key columns and the number of rows being indexed. The size of the index increases as more data is inserted into the table. Therefore, consider the amount of data being indexed when planning the size of the database. Some of the indexing sizing considerations include:
    • Primary and unique key constraints will always create a system-generated unique index.
    • The creation of an MDC table will also create system-generated block indexes.
    • XML columns will always cause system-generated indexes, including column path indexes and region indexes, to be created.
    • It is beneficial to create indexes on foreign key constraint columns.
    • Whether the index is compressed or not (using the COMPRESS option).
  • The maximum number of columns that can be in an index is influenced by several factors:
    • In general, the maximum of columns in an index is 64.
    • If you are indexing a typed table, the maximum number of columns in an index is 63.
    • The maximum number of columns in an index with random ordering is reduced by one for each column that is specified with random ordering, because each random-ordering column requires two columns of space in the index.
  • The maximum length of an index key is influenced by several factors:
    • The maximum length of an index key, including all components, is IndexPageSize ÷ 4.
    • The maximum length of an index key must not be greater than the index key length limit for the page size, as dictated by general SQL limits.
  • During database upgrade, existing indexes are not compressed. If a table is enabled for data row compression, new indexes created after the upgrade might be compressed, unless the COMPRESS NO option is specified on the CREATE INDEX statement.