An index is an ordered list of the key values of a column or columns of a table. There are two reasons why you might create an index.
- To ensure uniqueness of values in a column or columns.
- To improve performance of queries against the table. The DB2 optimizer uses indexes to improve performance when performing queries by minimizing the number of rows to be searched. The optimizer can also eliminate a sort to present the results of a query in the order supported by an index.
Indexes can be defined as unique or non-unique. Non-unique indexes
allow duplicate key values, and unique indexes allow only one
occurrence of a key value in the list. Unique indexes allow a single
null value to be present. However, a second null value would cause a
duplicate and therefore is not allowed. Note: z/OS allows
UNIQUE WHERE NOT NULL index, which allows
many rows with null values.
Indexes are created using the
SQL statement. Indexes are also created implicitly in support of a
PRIMARY KEY or
UNIQUE constraint. When a unique index is
created implicitly, the key data is checked for uniqueness and the
operation fails if duplicates or any null values are found.
Indexes are created with individual fields as ascending or descending. The option you choose depends on how the application accesses the data.
In the example, you have a primary key on the BOOKID column. Often,
users conduct searches on the book title, so an index on BOOKNAME
would be appropriate. The following statement creates a non-unique
ascending index on the BOOKNAME column:
CREATE INDEX IBOOKNAME ON BOOKS (BOOKNAME);
The index name, IBOOKNAME, is used to create and drop the index. Other than that, the name is not used in queries or updates to the table.
By default, an index is created in ascending order, but you can specify
each column in an index with either
DECENDING. The following statement defines
an index on the AUTHORID and BOOKNAME columns. The values of the
AUTHORID column are sorted in descending order, and the values of the
BOOKNAME column are sorted in ascending order within the same
AUTHORID, as shown in Listing 37.
Listing 37. Sorting order
CREATE INDEX I2BOOKNAME ON BOOKS (AUTHORID DESC, BOOKNAME ASC);
When an index is created, the keys are stored in the specified order. The index helps improve the performance of queries requiring the data in the specified order. An ascending key, for example, can be used to determine the result of the MIN column function.
DB2 does not let you create multiple indexes with the same definition. This applies even to indexes that you create implicitly in support of a primary key or unique constraint. Because the BOOKS table already has a primary key defined on the BOOKID column, attempting to create an index on BOOKID column fails.
Creating an index can take a long time. DB2 reads each row to extract the keys, sort those keys, and then writes the list to the database. If the table is large, then a temporary tablespace is used sort the keys.
The index is stored in a tablespace. Note: Linux, UNIX, and Windows, separates the indexes into a separate tablespace, then identifies the tablespace when you create the table by using the INDEXES IN clause. The location of a table's indexes is set when the table is created, and cannot be changed unless the table is dropped and recreated.
DB2 also provides the
DROP INDEX SQL
statement to remove an index from the database. There is no way to
modify an index. If you need to change an index (to add another column
to the key) for example, you have to drop and then re-create it.
You can create one index on each table as the clustering index. A clustering index is useful when the table data is often referenced in a particular order. The clustering index defines the order in which data is stored in the database. During inserts, DB2 attempts to place new rows close to rows with similar keys. Then, during queries requiring data in the clustering index sequence, the data can be retrieved faster.
To create an index as the clustering index, specify the
CLUSTER clause on the
CREATE INDEX statement, as shown in Listing
Listing 38. Clustering index
CREATE INDEX IAUTHBKNAME ON BOOKS (AUTHORID,BOOKNAME) CLUSTER;
This statement creates an index on the AUTHORID and BOOKNAME columns as the clustering index. This index would improve the performance of queries written to list authors and all the books that they have written.
When creating an index, you have the option to include extra column data that is stored with the key, but is not actually part of the key itself and is not sorted. The main reason for including additional columns in an index is to improve the performance of certain queries. With this data already available in the index page, DB2 does not need to access the data page to fetch it. Included columns can only be defined for unique indexes. However, the included columns are not considered when enforcing uniqueness of the index. (Note: z/OS permits an expression based upon columns of the table to be included in an index.)
Suppose that you often need to get a list of book names ordered by BOOKID. The query would look like what is shown in Listing 39.
Listing 39. Query for books ordered by BOOKID
SELECT BOOKID,BOOKNAME FROM BOOK ORDER BY BOOKID;
You can create an index that might improve performance, as shown in Listing 40.
Listing 40. Index to improve performance
CREATE UNIQUE INDEX IBOOKID ON BOOKS (BOOKID) INCLUDE(BOOKNAME);
As a result, all the data required for the query result is present in the index and no data pages need to be retrieved.
So why not just include all the data in the indexes? First of all, this would require more physical space in the database because the table data would essentially be duplicated in the index. Second, all the copies of the data would need to be updated whenever the data value is updated, and this would be significant overhead in a database where many updates occur.
Consider the following when creating indexes.
- Because indexes are a permanent list of the key values, they require space in the database. Creating many indexes requires additional storage space in your database. The amount of space required is determined by the length of the key columns. DB2 provides a tool to help you estimate the size of an index.
- Indexes are additional copies of the values so they must be maintained as the data in the table is changed via INSERT, UPDATE and DELETE. If table data is frequently changed, consider what impact maintenance of additional indexes may have on performance.
- Indexes can significantly improve performance of particular queries when defined on the appropriate columns.
Note: Optim Query Tuner can provide detailed and specific advice on which indexes to define for either DB2 for Linux, Unix, and Windows, or for DB2 for z/OS.