How to identify unused indexes in DB2 for Linux, UNIX, and Windows
Get to know different approaches for determining indexes never or rarely used by database queries
DB2 Design Advisor (db2advis)
The DB2 Design Advisor represents a simple method to identify unused indexes. You can use Design Advisor to analyze a set of SQL statements and their execution frequency. As a result of such an analysis, the Design Advisor returns the following information:
- Additional index definitions that would improve performance of the queries considered by reducing the runtime of the queries)
- Names of existing indexes that are used for query execution by DB2
- Names of existing indexes that are not used for query execution by DB2
Let's look at a Design Advisor usage scenario based on the DB2 test database SAMPLE.
First, create the SAMPLE database by executing the
db2sampl command on the command line
as shown in Listing 1.
Listing 1. Create the SAMPLE database
Before calling the Design Advisor, you must run the
EXPLAIN.DDL script to create the Explain tables that it requires.
You can find the script in the sqllib/misc subdirectory of your DB2 installation.
You only have to run the script once.
Listing 2 shows the commands to run the script.
Listing 2. Create the Explain tables
db2 "CONNECT TO SAMPLE" db2 -tf "C:\Program Files\IBM\SQLLIB\MISC\EXPLAIN.DDL"
When calling the Design Advisor, you have to define the set of SQL statements that you want to analyze. You can do this in several ways. One possibility is to provide a text file containing the SQL statements to analyze. For example, Listing 3 shows a sample file named queries.sql that contains a list of SQL statements.
Listing 3. DB2 Design Advisor input file queries.sql
-- EMPLOYEE queries --#SET FREQUENCY 123 SELECT LASTNAME FROM EMPLOYEE WHERE EMPNO = '000010'; --#SET FREQUENCY 456 SELECT LASTNAME FROM EMPLOYEE WHERE WORKDEPT = 'A00'; -- DEPARTMENT queries --#SET FREQUENCY 245 SELECT DEPTNAME FROM DEPARTMENT WHERE DEPTNO = 'A00'; --#SET FREQUENCY 678 SELECT DEPTNAME FROM DEPARTMENT WHERE MGRNO = '000010'; -- PROJECT queries --#SET FREQUENCY 345 SELECT PROJNAME FROM PROJECT WHERE PROJNO = 'OP1000';
Following are some things to consider when creating a text file of SQL statements you want Design Advisor to analyze:
Comment lines start with
--and are ignored by the Design Advisor.
If known, you can optionally specify an estimate for the execution frequency of each
This helps Design Advisor to better estimate how much database performance would be improved by
creating new indexes.
To do this, add a separate line starting with
--#SET FREQUENCYjust before the query definition (although these lines start with
--, they are not treated as comment lines).
Each SQL statement in the text file must be terminated with a semicolon (
- The SQL statements may contain parameter markers. Parameter markers are allowed because the Design Advisor does not actually execute the statements, it only evaluates alternative access plans for them.
For all tables and views not fully qualified (no explicit schema is given), the
authorization ID of the caller is used as the default schema.
Optionally, you can overwrite the default schema with the
-qoption when calling the Design Advisor.
Listing 4 shows a sample command you can use to call the Design Advisor and specify queries.sql as the input file that contains the SQL statements to analyze.
Listing 4. Call the DB2 Design Advisor with the input file queries.sql
db2advis -d sample -i queries.sql -m I -l -1 -t 0 -o db2advis_file.txt
The options on the above command have the following meanings:
-iInput file containing the SQL statements to analyze.
-mBesides relational indexes, the Design Advisor can also recommend Materialized Query Tables (MQTs), Multidimensional Clustering (MDC) tables, and re-partitioning of partitioned tables. In this sample scenario,
-mhas a value of
Ito indicate that only relational indexes should be recommended (this is also the default).
-lSize limit for the definition of new indexes or MQTs. The value of
-1means no size limit.
-tRuntime limit for execution of the Design Advisor. The value of
0means no time limit.
-oOutput file where you want to write the recommendations of the Design Advisor.
The output file created by the Design Advisor consists of three sections: recommended additional indexes, recommended existing indexes, and unused indexes. For this scenario, you are primarily interested in the section that lists the unused indexes. Listing 5 shows a sample of the unused indexes section.
Listing 5. Unused indexes section in DB2 Design Advisor output
-- UNUSED EXISTING INDEXES -- ============================ -- DROP INDEX "FECHNER "."XDEPT3"; -- DROP INDEX "FECHNER "."XPROJ2"; -- ===========================
For this scenario, the Design Advisor indicates that the indexes XDEPT3 and XPROJ2 are not used. However, keep in mind that this recommendation is based only on the set of SQL statements that the Design Advisor analyzed (the statements contained in the input file). Besides the SQL statements in the input file, there may be other statements that depend on the identified indexes for fast execution. Therefore, you should carefully check and consider any index identified as unused before you actually drop it.
As an alternative to an input file containing SQL statements,
you can also have Design Advisor evaluate the contents of the database's SQL statement cache.
You do this by using the
-g option on the db2advis command, as
shown in Listing 6.
Listing 6. Call the DB2 Design Advisor for the SQL statement cache contents
db2advis -d sample -g -m I -l -1 -t 0 -o db2advis_sql_cache.txt
Calling the Design Advisor this way requires less effort than using a manually created input file containing SQL statements. However, you still need to be aware that the results depend on the current contents of the SQL statement cache. Statements not found in the cache when the Design Advisor is called are not analyzed. Therefore, when using the SQL statement cache as input, you should run the Design Advisor at different points throughout the day to evaluate a broader range of statements.
db2pd utility is generally used for problem
determination and monitoring purposes, but
you can also use it to query information about database activity.
One piece of information that you can obtain with the
db2pd utility is the number of times indexes have been accessed since
the database was activated.
Listing 7 shows how to use the
db2pd utility to get this
information for all indexes on all tables in a database.
Listing 7. Query table and index metrics with the
db2pd -db sample -tcbstats all -file db2pd_tab_all.txt
The options on the above call to the
db2pd utility have the following
-tcbstats allShow all table and index metrics
If you want to restrict the output of the
db2pd utility to show metrics for
only a certain table and its indexes, use the
option to specify the tablespace ID and table ID.
To do this, you first need to determine the tablespace ID and table ID of the table
by executing a
SELECT statement on the SYSCAT.TABLES catalog view, as shown
in Listing 8.
Listing 8. Query database catalog to determine tablespace ID and table ID of a table
db2 "SELECT TBSPACEID, TABLEID FROM SYSCAT.TABLES WHERE TABSCHEMA = 'FECHNER' AND TABNAME = 'DEPARTMENT'"
Listing 9 shows a sample result set from a tablespace ID and table ID query such as the one above.
Listing 9. Result set of tablespace ID and table ID query
TBSPACEID TABLEID --------- ------- 2 5 1 record(s) selected.
After determining the tablespace ID and table ID of the table you want to query,
you can restrict the
db2pd output to just that table.
You do this by including the
tableid suboptions following
-tcbstats option, as shown in Listing 10.
Listing 10. Restrict
db2pd -tcbstats output to a
db2pd -db sample -tcbstats all tbspaceid=2 tableid=5 -file db2pd_tab_dept.txt
Listing 11 shows a sample result set from a
db2pd query such as the
Listing 11. Table and index metrics as shown by the
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:20:34 TCB Table Information: Address TbspaceID TableID PartID MasterTbs MasterTab TableName SchemaNm ObjClass DataSize LfSize LobSize XMLSize 0x797DF2B8 2 5 n/a 2 5 DEPARTMENT FECHNER Perm 1 0 0 0 TCB Table Stats: Address TableName SchemaNm Scans UDI RTSUDI PgReorgs NoChgUpdts Reads FscrUpdates Inserts Updates Deletes OvFlReads OvFlCrtes RowsComp RowsUncomp CCLogReads StoreBytes BytesSaved 0x797DF2B8 DEPARTMENT FECHNER 0 0 0 0 0 951 0 0 0 0 0 0 0 0 0 - - TCB Index Information: Address InxTbspace ObjectID PartID TbspaceID TableID MasterTbs MasterTab TableName SchemaNm IID IndexObjSize 0x797E0330 2 5 n/a 2 5 2 5 DEPARTMENT FECHNER 3 8 0x797E0330 2 5 n/a 2 5 2 5 DEPARTMENT FECHNER 2 8 0x797E0330 2 5 n/a 2 5 2 5 DEPARTMENT FECHNER 1 8 TCB Index Stats: Address TableName IID PartID EmpPgDel RootSplits BndrySplts PseuEmptPg EmPgMkdUsd Scans IxOnlyScns KeyUpdates InclUpdats NonBndSpts PgAllocs Merges PseuDels DelClean IntNodSpl 0x797E0330 DEPARTMENT 3 n/a 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0x797E0330 DEPARTMENT 2 n/a 0 0 0 0 0 678 0 0 0 0 1 0 0 0 0 0x797E0330 DEPARTMENT 1 n/a 0 0 0 0 0 245 0 0 0 0 1 0 0 0 0
The part of the result set that concerns index usage is in the TCB Index Stats section. The sample output above shows that the DEPARTMENT table has three indexes. The indexes are shown by their index IDs (the IID column), not by their names. The Scans column shows how many times the indexes were accessed since the database was activated:
- The index with ID 1 was accessed 245 times.
- The index with ID 2 678 times.
- The index with ID 3 was never accessed.
Because the index with ID 3 has not been accessed, you can conclude that it is not
used, or at least that it is seldom used.
To get the name of the index with ID 3, query the SYSCAT.INDEXES
catalog view using a
SELECT statement, as shown in Listing 12.
Listing 12. Query database catalog to determine an index name based on its index ID (IID)
db2 "SELECT INDSCHEMA, INDNAME FROM SYSCAT.INDEXES WHERE TABSCHEMA = 'FECHNER' AND TABNAME = 'DEPARTMENT' AND IID = 3"
The output of the above
SELECT statement is shown in Listing 13.
Listing 13. Result set of index name query
INDSCHEMA INDNAME ------------------------------ ------------------------------ FECHNER XDEPT3 1 record(s) selected.
In addition to the Scans column, the TCB Index Stats section of the
results has a IxOnlyScns column
that shows the number of index only scans.
An Index only scan is an index access that is not followed by a table access because
the index itself contains all the requested data.
So if the IxOnlyScns counter is not zero, you have to add the numbers of both the
Scans column and the IxOnlyScns column to get the total number of times the index has
As already mentioned when discussing use of the Design Advisor, you have to be careful
when interpreting index usage information.
The same is true when using the
The fact that a certain index has not been used
up until a certain point in time,
does not necessarily mean that the index will never be used.
So be cautious when deciding whether or not to drop an index that appears not to be used.
If you do decide to drop an index, you should first save its
INDEX statement so that you can easily recreate it if necessary.
DB2 Workload Manager (WLM)
The DB2 Design Advisor as well as the
db2pd utility were
introduced in DB2 Version 8 and continue to be available in DB2 Version 9.
Starting with DB2 Version 9.5, you can also use the DB2 Workload Manager (WLM) to gathering index usage information.
Generally, WLM is a priced DB2 feature that must be separately licensed.
However, the extended database monitoring options that are part of WLM can be utilized
without a separate license.
This section does not provide an introduction to DB2 WLM.
It does provide step-by-step instructions you can use to gather and evaluate index usage information
with the help of WLM-specific event monitors and two Perl scripts that are part of every DB2 installation.
Like the Design Advisor, WLM requires that you have created the Explain tables before
you can use it (see Listing 2 for details on running the
EXPLAIN.DLL script to create the Explain tables).
You must also first run the
wlmevmon.ddl script that is in the sqllib/misc
The script creates and enables the
WLM-specific event monitors:
These event monitors write the data they collect to tables.
By default, these tables are created in tablespace USERSPACE1.
If you want the tables to reside in another tablespace, alter the script accordingly
before you run it.
Listing 14 shows the command to run the
Listing 14. Create the WLM-specific event monitors
db2 -tf "C:\Program Files\IBM\SQLLIB\MISC\wlmevmon.ddl"
By default, the event monitors are created with the
AUTOSTART option, which means that the
event monitors are automatically started the next time the database is activated.
Before you can use the event monitors after you first create them, you either have to deactivate and
activate the database or manually start the event monitors, as shown in Listing 15.
Listing 15. Start the WLM-specific event monitors
db2 "SET EVENT MONITOR DB2ACTIVITIES STATE 1" db2 "SET EVENT MONITOR DB2STATISTICS STATE 1" db2 "SET EVENT MONITOR DB2THRESHOLDVIOLATIONS STATE 1"
To understand the next command, you need a basic knowledge of the WLM concept of service class. WLM allows you to assign workloads with different response time requirements to different service classes and service subclasses. Beginning with DB2 Version 9.5, the usage of service classes is deeply integrated into the DB2 engine. Even if WLM is not explicitly used (that is, no service classes are defined by the database administrator), all user transactions are executed within the context of the predefined service class. The predefined service class is SYSDEFAULTUSERCLASS and its corresponding subclass is SYSDEFAULTSUBCLASS. WLM-specific event monitors are always activated for a certain service (sub-)class. This means that they only collect data for service (sub-)classes they have been activated for.
In this sample scenario, you activate the
monitor for the service subclass SYSDEFAULTSUBCLASS.
This way, you gather the monitor data for all user transactions because
application-specific service classes have not been defined.
Because there are no application-specific service classes,
all user transactions are executed within the default service class and its subclass.
Listing 16 shows the command to activate the activity event monitor for the default
Listing 16. Activate the activity event monitor for the default service (sub-)class
db2 "ALTER SERVICE CLASS SYSDEFAULTSUBCLASS UNDER SYSDEFAULTUSERCLASS COLLECT ACTIVITY DATA ON ALL DATABASE PARTITIONS WITH DETAILS"
Your collection of monitor data should cover a time frame in which you would expect normal database activity (for example, between 9 a.m. and 5 p.m. during a workday). After you finish collecting monitor data, deactivate the activity event monitor, as shown in Listing 17.
Listing 17. Deactivate the activity event monitor for the default service (sub-)class
db2 "ALTER SERVICE CLASS SYSDEFAULTSUBCLASS UNDER SYSDEFAULTUSERCLASS COLLECT ACTIVITY DATA NONE"
For evaluation of index usage based on collected WLM event monitor data, DB2 provides two Perl scripts
You can find these scripts in the samples\perl subdirectory of your DB2 installation.
Both scripts expect the database name and the database username and password as input parameters.
wlmhist.pl script first.
It reads the SQL statements executed from the event monitor tables and runs the Explain utility for each statement to generate the corresponding access plan.
After reading the SQL statements, the script extracts access plan information from the
Explain tables and writes it to a table named WLMHIST (if the WLMHIST table does not
already exist the script creates it).
Next, run the
wlmhistrep.pl script to evaluate the data stored in the
The script creates a report that provides details about table and index usage.
wlmhistrep.pl, you also have to specify the name of the resulting report file
in addition to the database name and the database username and password.
Listing 18 shows the commands for running the Perl scripts that help you evaluate the index usage based on collected WLM event monitor data.
Listing 18. Execute the
wlmhistrep.pl Perl scripts
cd "C:\Program Files\IBM\SQLLIB\samples\perl" perl wlmhist.pl sample userid password perl wlmhistrep.pl sample userid password wlmhistrep.txt
The report generated by
wlmhistrep.pl consists of four sections:
used (hit) tables, unused (not hit) tables, used (hit) indexes, and unused (not hit) indexes.
For this scenario, you are mainly interested in the unused indexes section.
Listing 19 shows a sample report.
Listing 19. Sample report generated by
INDEXES HIT REPORT FOR DATABASE sample _______________________________________________________ TABLE NAME TABLE SCHEMA OBJECT NAME OBJECT SCHEMA TOTAL HITS __________________ _______________ __________________ _______________ __________ ... DEPARTMENT FECHNER PK_DEPARTMENT FECHNER 245 DEPARTMENT FECHNER XDEPT2 FECHNER 678 EMPLOYEE FECHNER PK_EMPLOYEE FECHNER 123 EMPLOYEE FECHNER XEMP2 FECHNER 456 PROJECT FECHNER PK_PROJECT FECHNER 345 ... INDEXES NOT HIT REPORT FOR DATABASE sample ___________________________________________________________ TABLE NAME TABLE SCHEMA INDEX NAME INDEX SCHEMA INDEX TYPE __________________ _______________ __________________ _______________ __________ ... DEPARTMENT FECHNER XDEPT3 FECHNER REG PROJECT FECHNER XPROJ2 FECHNER REG ...
The INDEXES HIT REPORT section has a TOTAL HITS column that indicates how many times an index was accessed. Indexes that were never accessed are shown in the INDEXES NOT HIT REPORT section.
MON_GET_INDEX table function
The db2pd utility
section of this article showed you how to use the
db2pd utility to retrieve index metrics.
However, if you are using DB2 Version 9.7, you can retrieve the same information by querying the new
MON_GET_INDEX table function.
Using this table function instead of
evaluation of the results because the index IDs (IIDs) can be converted to index names by joining with the SYSCAT.INDEXES catalog view.
Listing 20 shows a sample of how to call the
MON_GET_INDEX table function.
Listing 20. Sample call to the MON_GET_INDEX table function (file name mon_get_index.sql)
SELECT SUBSTR(SI.INDSCHEMA, 1, 30) AS INDSCHEMA, SUBSTR(SI.INDNAME, 1, 30) AS INDNAME, MGI.INDEX_SCANS, MGI.INDEX_ONLY_SCANS FROM TABLE(MON_GET_INDEX('FECHNER', 'DEPARTMENT', -2)) as MGI, SYSCAT.INDEXES AS SI WHERE MGI.TABSCHEMA = SI.TABSCHEMA AND MGI.TABNAME = SI.TABNAME AND MGI.IID = SI.IID ORDER BY MGI.INDEX_SCANS DESC;
MON_GET_INDEX expects the following parameters:
- Table schema
- Table name
- Database partition number
For a single partition database, you can just specify
(current partition) or
-2 (all partitions) for the
database partition number and the results will be the same.
If you omit a table name (that is, you specify
NULL or an empty string), information is returned for all tables in the schema specified.
If you omit table name and table schema, information is returned for all tables in the database.
Listing 21 shows a sample result set from a query such as the one above.
Listing 21. Results from the MON_GET_INDEX table function
db2 -tf mon_get_index.sql INDSCHEMA INDNAME INDEX_SCANS INDEX_ONLY_SCANS -------------------- -------------------- -------------------- -------------------- FECHNER XDEPT2 678 0 FECHNER PK_DEPARTMENT 245 0 FECHNER XDEPT3 0 0 3 record(s) selected.
table function and
db2pd utility provide
the same index metrics (number of index scans and index only scans).
However, you may find it easier to work with the result set returned by the
than it is to evaluate the output of the
Unused and seldom used indexes cause UID operations (
DELETE) and database maintenance activities (such as
REORG) to run unnecessarily long.
Such indexes also require storage space, within tablespace containers as well as
within backup images, that you might better use for other purposes.
This article demonstrates different approaches to identify unused and seldom used indexes defined within a database.
Based on that information, you can delete indexes that are not required.
By doing so, you may be able to significantly improve the overall performance of your DB2 LUW database.
- The complete DB2 9.7 LUW documentation online in HTML format: DB2 9.7 Information Center for Linux, UNIX, and Windows
- The DB2 LUW technical support site: DB2 9 for Linux UNIX and Windows Support. From here you can search for APARs, download fixpacks, get DB2 LUW documentation in PDF format, etc.
- The following DB2 LUW Best Practice papers are designed to provide practical guidance for the most common DB2 9 product configurations. By applying these recommendations, you may improve the value of your DB2 data servers and align yourself with IBM's technical direction for DB2. These papers are authored by leading experts in IBM's development and consulting teams, and have been extensively tested: Best practices for DB2 for Linux, UNIX, and Windows
- Download IBM product evaluation versions or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2, Lotus®, Rational®, Tivoli®, and WebSphere®.