In the first article of this series, Tuning SQL with Optim Query Tuner, Part 1: Understanding access paths, the concept of an access path was introduced. For a given SQL statement, there are often multiple access path choices, and the different access paths usually have different performance characteristics. Before SQL is executed, the DB2 optimizer estimates the cost of the candidate access paths and selects the path with the least estimated cost. This process is included in the PREPARE step for a dynamic SQL statement, or in the BIND step for a static SQL statement.
Although the DB2 optimizer is helpful for selecting the best access path, the result depends on the input data, which is often not accessed or controlled by the optimizer. If you are a developer or DBA, it is helpful for you to understand how to tune queries so that you can provide the best input to the DB2 optimizer.
In this article, you are introduced to a methodology for tuning single queries, including the rationale for why it's important to understand how to tune queries even when a world-class optimizer exists in DB2. You then use a sample query to explain the methodology for tuning a query, using the related features of Optim Query Tuner, which can be of great benefit in helping you to understand, analyze, and tune single queries.
Note that this article is mainly designed for SQL tuning on DB2 for z/OS, but most of the query optimization concepts and the SQL tuning methodology in this article are also applicable to DB2 for Linux®, UNIX®, and Windows®.
If you want to try the sample query in this article by yourself, you can download the sample project file in the download section of this article, and then import the project file into Data Studio (stand-alone or IDE package with Fix Pack 1 or later) or any of the Optim Query Tuner products.
To import the sample project do the following:
- Open the IBM Query Tuning Perspective of your Data Studio or Optim Query Tuner product.
- Select File > Import..
- In the Import wizard, navigate to Query Tuner > Projects, and then click Next.
- Click Browse, and select the directory that contains the downloaded zip file to see a list of projects in the Projects window.
- Select samplequerytuningproject, and click Finish.
- The sample project should now appear in your Project Explorer. If you do not see a Project Explorer Window, make sure you are in the IBM Query Tuning Perspective and select Window > Reset Perspective. Alternatively, you can select Window > Show View > Project Explorer.
About Optim query tuning solutions
Optim query tuning solutions provide an environment to identify and tune poorly performing SQL statements with advisors and tools that help guide you to a solution. Query tuning capabilities are delivered in the following products:
- Basic, single-query tuning and query formatting capabilities are available in Data Studio 2.2.1 (both stand-alone and the IDE). This product is available at no charge for both DB2 for z/OS and DB2 for Linux, UNIX, and Windows. Be aware that while the information in this article series explains how you can use Data Studio to interpret the access path graphs, not all of the capabilities that are described are available in Data Studio.
- Single-query tuning and query formatting, as well as the larger set of advisors, are available in Optim Query Tuner. This product is available for both DB2 for z/OS and DB2 for Linux, UNIX, and Windows.
- Query workload tuning, single-query tuning, and the full set of advisors are available in Optim Query Workload Tuner. This product is only available for DB2 for z/OS (as of the time this article was written).
For brevity, this article series uses the name Optim Query Tuner to refer to the set of advisors and tools that Optim query tuning solutions provide. Where applicable, specific product names are provided when describing capabilities that may not be available in all the products listed above.
Note that this article mainly focuses on the query tuning methodology and uses screen captures from Optim Query Tuner to illustrate the points. This article is not intended to provide "how to" information on using Query Tuner. For more information about navigating to the various product features, and to see a detailed introduction about how to launch the various functions in OQT, please refer to the Resources section.
Overview of query optimization
As shown in Figure 1, the DB2 optimizer chooses the best access path.
Figure 1. Overview of the DB2 optimizer
The optimizer compares the cost of each candidate access path based on the information from multiple inputs, for example, see the following:
- Catalog statistics
The DB2 optimizer is a cost-based optimizer. The cornerstone of cost-based optimization is a set of statistics that enables the optimizer to accurately estimate the cost of all candidate access paths and to differentiate efficient access paths from inefficient ones. The statistics in the DB2 catalog tables are used to estimate the cost of the access path. For example, information in the catalog tables SYSTABLES and SYSTABLESPACE tells you how many rows and pages hold the data in your table.
- Physical database design
Physical database design includes table design, index design, materialized query table design, and the design of the other physical database objects. Index design has a critical impact on access path selection. As was mentioned in the previous article, for single table access, there are two types of access methods: table space scan (TBSCAN) and index scan (IXSCAN). Index scans are usually the most efficient way to access data, especially when the table is large but the number of qualified rows is small.
- SQL statement
The SQL statement itself also impacts the access path selection. For example, improperly coded predicates might prevent the optimizer from using an index scan even when the index is available. Also, before selecting the access path, the optimizer first performs a series of query transformations to increase the number of access paths available. If the SQL statement is badly coded, it is difficult for the optimizer to transform the query, leaving fewer options available for choosing an optimal access path.
- Other considerations for access path selection
In addition to considering catalog statistics, physical database design, and the SQL statement itself, the DB2 optimizer also considers the central processor model, number of central processors, buffer pool size, RID pool size, and other system resource settings. For example, the access path can change from one system to another if they have different buffer pool sizes, even if all the catalog statistics are identical.
The DB2 optimizer is comprehensive and quite powerful. Why, then, is query tuning required if the DB2 optimizer is working? There are two reasons for why query tuning may still be required:
- The DB2 optimizer is not all-knowing.
Although the DB2 optimizer has a lot of information upon which to base its plans, it cannot know what does not yet exist. For example, the optimizer does not know the characteristics of the data unless you have run RUNSTATS to populate the catalog with relevant statistics. In addition, some items are just not knowable until runtime. For example, the optimizer does not know the values of host variables or parameter markers (if these are contained in the query) until the query is executed.
- The DB2 optimizer is not all-controlling.
As mentioned before, the physical database design, the SQL statement, and the system resource settings impact how the optimizer selects the best access path, yet both database and query design are tasks that are outside of the control of the DB2 optimizer. This is where DBAs and developers play a role in helping or hurting SQL performance.
The purpose of query tuning is to provide the best possible input to the optimizer so that the optimizer can choose the best access path. This involves effort from both application developers and DBAs.
For application developers:
- Follow SQL coding standards and guidelines.
You need to comply with SQL coding standards and guidelines when you write your SQL statements. For example, write indexable or stage 1 predicates and avoid writing queries without join predicates (also known as Cartesian join).
- Exploit REOPT bind options properly.
For SQL statements with variables, the optimizer uses a default filter factor to determine the best access path at bind time. In some cases, the access path does not perform well at runtime if the statement contains host variables, parameter markers, or special registers. You can use the REOPT bind options to reoptimize the access path either at bind time or at run time.
- Collect sufficient and accurate statistics.
Insufficient or inaccurate statistics lead to inaccurate cost estimates for the candidate access paths and is the most common reason for the selection of inefficient access paths. Meanwhile, the collection and refresh of all statistics would consume too much of the resource which is not necessary. Given the number of INSERT, UPDATE, and DELETE operations and changes in data distributions, you need to collect the right statistics regularly and with minimum resource consumption.
- Optimize the index design.
You need to design indexes to support efficient access with local predicates and join predicates. You may also need to design indexes to avoid data sort and to provide index only access.
- Tune application as a whole.
To ensure good performance of the application, it is necessary to tune the application as a whole. The effort required to tune the whole application, by evaluating every single statement, is overwhelming. Also, performance improvement on one statement might regress the performance of the other statements in the application. Therefore, it is critical to tune the application as a whole, which is also know as workload tuning. This article will focus on single-query tuning; the next article of this series will extend the methodology in this article to introduce the workload tuning in detail.
This article describes a methodology for understanding potential query performance problems and how to address those potential problems. Using Optim Query Tuner makes the process more straightforward.
Query tuning methodology
Overview of query tuning methodology
In order to perform query tuning, you need to first understand what you want to tune, in this case it is the query itself and its current access plan chosen by optimizer, then figure out how to tune it.
Based on this idea, you should perform the following tasks to tune the query completely, which you can do from within Query Tuner:
- Format the problem query to make it easier to read and comprehend the logic of the query.
- Annotate the problem query with relevant statistics to better understand what the DB2 optimizer is using for its estimations.
- Analyze the query access plan to visualize the choices the optimizer makes in accessing the data.
- Perform statistics analysis to ensure that the DB2 optimizer always has the most current and the most needed statistics.
- Perform predicate analysis to see if the predicates are being as selective as possible.
- Perform index analysis to make sure that the right indexes exist to help avoid unnecessary table scans.
In the subsequent sections, the SQL statement shown in Listing 1 is used as a sample to explain each individual task of query tuning in detail. As you can imagine, there is quite a bit of inter-dependence among these tasks. For example, changing the statistics that are gathered may very likely affect the results of the predicate analysis. Also, you might need to iterate through one or more of these tasks several times until a particular performance issue is resolved.
Listing 1. Sample query that is used in this article
SELECT CCUS.CUST_FIRST_NAME , CCUS.CUST_LAST_NAME , CINT.CUST_INTEREST_RANK , CILO.CUST_INTERST FROM CUST_CUSTOMER AS CCUS , CUST_INTEREST_LOOKUP AS CILO , CUST_INTEREST AS CINT WHERE ( CCUS.CUST_CITY = 'Singapore' AND CCUS.CUST_PROV_STATE = 'Singapore' AND CCUS.CUST_CODE IN ( SELECT COHE.CUST_CODE FROM CUST_ORDER_HEADER AS COHE , CUST_ORDER_STATUS AS COST WHERE ( COHE.CUST_ORDER_DATE >='2009-01-01 00:00:00.001' AND COST.CUST_ORDER_STATUS IN ( 'Shipped', 'Back-ordered', 'In-process' ) AND COHE.CUST_ORDER_STATUS_CODE = COST.CUST_ORDER_STATUS_CODE ) ) AND CCUS.CUST_CODE = CINT.CUST_CODE AND CINT.CUST_INTEREST_CODE = CILO.CUST_INTEREST_CODE ) ORDER BY CCUS.CUST_LAST_NAME ASC , CCUS.CUST_FIRST_NAME ASC , CINT.CUST_INTEREST_RANK ASC
Format the problem query
Before tuning a query, you need to understand the following aspects of the problem query.
- The semantics of the query: Which tables are accessed in the query? What kinds of predicates are used on each referenced table? What kinds of predicates are used to join the referenced tables?
- The access path of the query: How are the tables accessed? Is the whole table scanned or is it accessed by an index? If an index, which index or indexes? What is the join sequence and join method?
As you can see in Figure 2, the original unformatted problem query is hard to read and understand.
Figure 2. Unformatted query
Optim Query Tuner can format the problem query, which provides a good starting point for analysis. In the formatted query, each table reference, each column reference under the SELECT clause, and each predicate is shown on its own line. For the sample query in this article, the formatted query is shown in Figure 3.
Figure 3. Formatted query
As you can imagine, for long complicated SQL, simply formatting the query can save hours of DBA time. Now it is easy to find out which tables are accessed, how many there are in the query, and how these tables are joined. Formatted queries give you the ability to the following:
- Drill into parts of the query in more detail, such as referenced views and sub-queries, by expanding and collapsing sections of a complex SQL.
- Easily see how a specific table is accessed in the SQL. When you click any line in the formatted query, other lines of the query that contain column or table references from the same table are also highlighted.
- Customize the formatting ordering of the predicates according to various criteria such as local predicates or join predicates, and table references.
Getting back to the formatted query shown in Figure 3, you can see the following:
- The query accesses the following three tables: CUST_CUSTOMER, CUST_INTEREST and CUST_INTEREST_LOOKUP to get the qualified customer name and interest information.
- The three tables are joined with equal join predicates.
- There are three predicates on CUST_CUSTOMER. The first two predicates
are simple equal predicates (CCUS.CUST_CITY = 'Singapore',
CCUS.CUST_PROV_STATE = 'Singapore'). The third predicate is an IN-list
predicate, which contains a non-correlated sub-query:
- The sub-query accesses the CUST_ORDER_HEADER and CUST_ORDER_STATUS tables to get the qualified customer code, and those two tables are joined with an equal join predicate.
- There is a range local predicate on CUST_ORDER_HEADER, and an IN-list local predicate on CUST_ORDER_STATUS.
- There is no local predicate on the other two tables (CUST_INTEREST and CUST_INTEREST_LOOKUP).
- The result is ordered by the customer name and interest.
Query Tuner also makes it easy to spot where the DB2 optimizer has transformed a query. As a reminder, transformations are the adjustments that the DB2 optimizer makes to the query to try to improve the performance of the query; for example, it might add a predicate for transitive closure to facilitate join sequence assessment.
As an illustration, given a predicate such as A.CUSTNO BETWEEN ? AND ? AND C.CUSTNO = A.CUSTNO, DB2 can derive that predicate C.CUSTNO BETWEEN ? AND ? must also be true, so the DB2 query transformation can add that predicate enabling it to consider another index.
For the sample in this article, the transformed query is shown in Figure 4.
Figure 4. Transformed query
As you can see, the optimizer creates a virtual table to process the IN-list sub-query. In addition, the non-correlated sub-query has been transformed to a correlated sub-query. This is an optimization that was introduced in DB2 for z/OS V9.1, which enables DB2 to optimize a query as a whole rather than as several independent query blocks. When a query is optimized as a whole, DB2 can consider the effect of one query block on another, and can consider reordering query blocks to determine an optimal query path.
Annotate the problem query
Alongside the formatted SQL predicates and table references, Query Tuner includes annotations of the relevant catalog statistics and cost estimation information, such as cardinality and estimated qualified rows, as shown in Figure 5. Having this information readily available can help DBAs accelerate analysis and reduce downtime for urgent situations.
Figure 5. Annotated problem query
For each table reference in the query, Query Tuner adds the following annotations:
- CARDF (see number 1 in Figure 5): Table cardinality, which denotes the total number of rows in the table. CUST_CUSTOMER is the largest table (31,284) while CUST_INTEREST_LOOKUP is the smallest table (338).
- QUALIFIED_ROWS (see number 2 in Figure 5): Number of qualified rows after applying local predicates on the table. Although CUST_CUSTOMER and CUST_INTEREST almost have the same table cardinality, the qualified rows for CUST_CUSTOMER is just over 1, which means that there are very selective local predicates on the table. Contrast that with CUST_INTEREST, in which the number of qualified rows is the same as the cardinality (31,255). This indicates that there is no filtering on the table. This is explainable because there are two local predicates on the table CUST_CUSTOMER while there are no local predicates on CUST_INTEREST.
- NPAGESF (see number 3 in Figure 5): Total number of pages on which rows of this table appear.
For each predicate in the query, Query Tuner adds statistics annotations for the referenced columns in the predicate and also the cost estimation for the predicate:
- COLCARDF (see number 4 in Figure 5): Column cardinality, which denotes the estimated number of distinct values in the column. If the predicate contains more than one column, the column cardinality for each of the referenced columns is separated with backlash ("/") in the same order as the columns appear in the predicate.
- MAX_FREQ (see number 5 in Figure 5): The maximum frequency of all the possible column values. The frequency for a specific column value is the percentage of rows in the table that contain the specific value for a column. For example, if there are five distinct values for a column (COLCARDF=5), and if the data is uniformly distributed, the MAX_FREQ is about 20% because each of the different column values populates 20% of the table rows. If the column cardinality is 5 and yet MAX_FREQ is far more than 20%, that means the data in the table is not uniformly distributed on the column. In other word, there is data skew on the column.
- FF (see number 6 in Figure 5): Filter factor for the predicate. The filter factor is a number between 0 and 1 that estimates the proportion of rows in a table for which the predicate is true. The filter factor indicates how selective the predicate is. The more selective the predicate, the earlier the predicate should be applied.
Analyze the query access plan
Query Tuner provides a visualization of the processing that your data server uses to run a query. This visualization is called the access plan graph. From the access plan graph, you can see what choices the optimizer has made regarding how the query will be processed and the rationale for those choices. The diagram consists of nodes that represent tables, indexes, operations, and returned data. The nodes are arranged and connected by links that indicate the flow of the process. The graph is read left to right, bottom to top. And each node is annotated by statistics, estimated costs, selectivity information and so forth that are used to determine the access plan flow.
Understanding the access plan is important for understanding and influencing performance, as well as for stabilizing performance. Refer to the previous article of this series for more information about reading and understanding access paths.
Figure 6 is the access path graph generated by Query Tuner for the sample query in this article.
Figure 6. Access plan graph
(View a larger version of Figure 6.)
From the access plan graph in Figure 6, you can see the following:
- The query contains two query blocks, QB1 (see number 1 in Figure 6) and QB2 (see number 2 in Figure 6). QB2 represents the IN-list sub-query while QB1 is the main sub-query.
- QB2 is joined with CUST_CUSTOMER table in the outer query block QB1, which means the IN-list sub-query has been transformed as correlated sub-query, though it is a non-correlated sub-query in the original query.
- The access plan for QB1 can be summarized as below:
TBSCAN(CUST_CUSTOMER) NLJ ISCAN(CUST_INTEREST) NLJ
- The 3 tables in QB1 are joined with nested loop join (NLJ)
- The 3 tables in QB1 are joined in the following join sequence: CUST_CUSTOMER -> CUST_INTEREST -> CUST_INTEREST_LOOKUP
- CUST_CUSTOMER is accessed by table scan while CUST_INTEREST and CUST_INTEREST_LOOKUP are accessed by index scan
- The access plan for QB2 can be summarized as below:
TBSCAN(CUST_ORDER_HEADER) NLJ ISCAN(CUST_ORDER_STATUS).
- The 2 tables in QB2 are joined with nested loop join (NLJ)
- The 2 tables in QB2 are joined in the following join sequence: CUST_ORDER_HEADER -> CUST_ORDER_STATUS
- CUST_ORDER_HEADER is accessed by table scan while CUST_ORDER_STATUS is accessed by index scan
From the access plan graph in Figure 6, you can do some initial performance analysis:
- The access to the inner tables (CUST_INTEREST and CUST_INTEREST_LOOKUP) is by index scan. It is a reasonably efficient access plan.
- The access to the leading table of the outer sub-query (CUST_CUSTOMER)
and the inner sub-query (CUST_ORDER_HEADER) is by table scan, which
could be a potential problem:
- The table cardinality for CUST_CUSTOMER is about 30000. However since it is a leading table and will be accessed by the table scan only once, it may cause some performance problems, but it should not be a disaster.
- The inner sub-query is a correlated sub-query. Depending on how many qualified rows are returned from the outer table (CUST_CUSTOMER), it may be accessed many times. From either the annotation in Figure 5 or the access plan graph in Figure 6, you can see that the estimated qualified rows for CUST_CUSTOMER is 1; in other words, the optimizer thinks that CUST_ORDER_HEADER will be scanned just once. Given that the table cardinality is about 50000; it should not be a performance disaster, either.
Thus far, you are certain that CUST_CUSTOMER will be scanned only one time because it is a leading table in the outer sub-query. However you are not sure if CUST_ORDER_HEADER will be scanned only once, since the qualified rows from CUST_CUSTOMER are calculated with the existing statistics and predicates:
- The table cardinality for CUST_CUSTOMER is 31284 (see number 7 in Figure 5).
- There are two local predicates on the table which are very selective:
- CCUS.CUST_CITY = 'Singapore', FF=0.00727 (see number 8 in Figure 5)
- CCUS.CUST_PROV_STATE = 'Singapore', FF=0.004 (see number 9 in Figure 5)
- Thus the qualified rows estimated by the optimizer are about 31284*0.00727*0.004 = 1.
What if the estimated qualified rows from CUST_CUSTOMER are incorrect? For example, what if the two local predicates are not as selective as estimated by the optimizer? This may cause serious performance problems since the CUST_ORDER_HEADER could be accessed many times.
One way to validate the suspicious performance bottleneck is by looking into the runtime statistics of the query, which can be obtained by turn on the performance trace on IFCID 318. Another option is to use Query Tuner to capture statements from the statement cache and to view the statement runtime information, as shown in Figure 7.
Figure 7. Runtime statistics
(View a larger version of Figure 7.)
The highlighted line (ending with a "B" in Figure 7) shows the runtime information for the sample query in this article. As you can see, the query was executed three times, and the average elapsed time is about 307 seconds, which is very slow. The total number of table scans (STAT_RSCN) is about 1764, i.e., more than 580 (1764/3) table scans per execution. This is far more than what can be estimated from the access plan graph, which is about 2 (one for CUST_CUSTOMER, and another for CUST_ORDER_HEADER). This further confirms our suspicion that the estimated number qualified rows from CUST_CUSTOMER is way off from reality.
Another way to validate this is to issue a query as below to count the real value of the qualified rows.
Listing 2. Count query
SELECT COUNT(*) FROM CUST_CUSTOMER AS CCUS WHERE CCUS.CUST_CITY = 'Singapore' AND CCUS.CUST_PROV_STATE = 'Singapore'
The result of the above count query shows that there are about 588 qualified rows from CUST_CUSTOMER. In other words, the optimizer overestimated the selectivity of the local predicates on the table. In the following sections of the article, you will analyze the problem query from the statistics, predicate and index perspectives, and then show why the overestimation happens, as well as how to resolve it.
Perform statistics analysis
From the query annotation information, you can easily see what kinds of statistics are available, and what kinds of statistics are missing. For the query in this article, the basic statistics of the referenced tables, columns and indexes were collected, such as the table cardinality, column cardinality, an so on.
On the other hand, some distribution statistics, such as column frequency,
have never been collected. For example, as shown in Figure 8, MAX_FREQ for
the column CUST_CITY in the predicate
CCUS.CUST_CITY = 'Singapore' is shown as
missing. Missing statistics can cause the optimizer to overestimate or to
underestimate the selectivity of a predicate, and eventually to select an
inefficient access path.
Figure 8. Statistics analysis
The same problem can occur even if the statistics were collected before but have not been refreshed for so long that they are now obsolete. This is particularly true if the data has changed dramatically since the last statistics collection. There is an attribute RUNSTATS TIMESTAMP on the table or index node of the access plan graph that indicates the last time when the statistics were collected. Note, you can specify a threshold in the Query Tuner preferences to define how old statistics are before they are considered obsolete. By default, if the statistics are over 1 year old, they will be identified as obsolete.
If different statistics are collected at different times, it may result in a situation where the statistics are inconsistent with each other on DB2 for z/OS. For example, if table level statistics are collected on a particular table, and later a significant number of rows are inserted into this table, collecting statistics for the indexes may only end up with the index columns having column cardinality greater than the table cardinality. The inconsistent statistics may also mislead the optimizer into choosing a bad access path.
Perform predicate analysis
Meanwhile, you can also do predicate analysis to find if there is any potential problem in the annotation view, as shown in Figure 9.
Figure 9. Predicate analysis
(View a larger version of Figure 9.)
According to the annotation information in the figure, the table reference CUST_CUSTOMER (CCUS) has two local predicates which are very selective, with filter factor 0.00727 and 0.004, respectively.
Since no frequency statistics were collected (MAX_FREQ=(missing)), the
filter factor of each predicate is calculated based on the assumption that
the data is uniformly distributed on the columns. For example, the column
cardinality of CUST_CITY is 1376, thus the filter factor of the predicate
CCUS.CUST_CITY='Singapore' is computed as
How much confidence can one place in the estimate of predicate selectivity?
If the data is skewed on the columns, for example if the majority of the
data has a city name of
Singapore, then the
current selectivity could be greatly overestimated. In order to get a more
accurate estimate, you need to collect frequency statistics on the columns
CUST_CITY and CUST_PROV_STATE of table CUST_CUSTOMER.
Another potential problem is the local predicate on the table reference CUST_ORDER_HEADER (COHE) in the inner sub-query (see also in Figure 9). Since it is a range predicate and there are no frequency and histogram statistics available, the filter factor is calculated with the basic statistics information HIGH2KEY and LOW2KEY. The estimate could be incorrect if there is data skew on the column. In order to get a more accurate estimate, you need to collect histogram statistics on the column CUST_ORDER_DATE of table CUST_ORDER_STATUS_CODE.
Now that you've analyzed the selectivity of the local predicates, let's now look at the table selectivity.
For the three tables in the outer sub-query, there is no local predicate on CUST_INTEREST_LOOKUP (CILO) and CUST_INTEREST (CINT); therefore if either of those tables has been processed as leading table, the table selectivity is zero, which is very bad.
If CUST_CUSTOMER (CCUS) is processed as leading table, given the two local predicates, the table selectivity of CUST_CUSTOMER is 0.00727*0.004, and the qualified rows of CUST_CUSTOMER is about 1 (31284*0.00727*0.004). This seems very selective, but this value is calculated with the assumption that the two columns (CUST_CITY, CUST_PROV_STATE) are not correlated with each other, which is not true. Because Singapore is one of the cities in the state of Singapore, in order to get a more accurate estimate of the table selectivity, you need to collect the cardinality and frequency statistics of the column group (CUST_CITY, CUST_PROV_STATE).
After CUST_CUSTOMER is processed, the join predicate
could be considered as a local predicate on the table CUST_INTEREST
(CINT). Given the table cardinality of CUST_INTEREST is about 31255, you
can see that for each qualified row from CUST_CUSTOMER, there is only one
matching record in CUST_INTEREST. That is, table CUST_INTEREST could be
very selective through the join predicate.
After CUST_CUSTOMER and CUST_INTEREST are processed, the join predicate
CINT.CUST_INTEREST_CODE = CILO.CUST_INTEREST_CODE
could be considered as a local predicate on the table CUST_INTEREST_LOOKUP
(CILO). Given the table cardinality of CUST_INTEREST_LOOKUP is about 338,
you can see that for each qualified row from CUST_INTEREST, there are
about 13 (338/26) matching records in CUST_INTEREST_LOOKUP, which is a
reasonably good level of selectivity.
With the above statistics and predicate analysis, you have a clear idea on which statistics should be collected. After you collect those statistics, you get the new access plan graph shown in Figure 10.
Figure 10. Access plan graph after RUNSTATS
From this new access plan graph, you can see the following:
- Since statistics on column group (CUST_CITY and CUST_PROV_STATE) on the table CUST_CUSTOMER have been collected, the DB2 optimizer now knows the correct number of qualified rows after local filtering is 590 instead of 1.
- The join sequence of the inner sub-query is changed, the original sequence is CUST_ORDER_HEADER-> CUST_ORDER_STATUS, now it is CUST_ORDER_STATUS -> CUST_ORDER_HEADER; because histogram statistics on CUST_ORDER_DATE column of CUST_ORDER_HEADER table have been collected.
- With the correct qualified rows from CUST_CUSTOMER, the inner sub-query is processed as non-correlated sub-query, which is materialized into a work file and then accessed with sparse index scan in memory.
The line which is ended with an "S" in Figure 11 shows the runtime information of the sample query after RUNSTATS. The average elapsed time is about 2.8 seconds, a significant performance improvement compared to that before RUNSTATS (about 307 seconds).
Figure 11. Runtime statistics after RUNSTATS
(View a larger version of Figure 11.)
Also as shown in Figure 11, you can see that there are still two table scans for each execution of the query. From the access plan graph in Figure 10, you can easily figure out the two table scans come from CUST_CUSTOMER and CUST_ORDER_HEADER respectively. You will see how this can be fixed by index analysis in the next section.
Perform index analysis
Optim Query Tuner provides reports in HTML and text format that contain information about the tables, indexes, and predicates that are involved with a particular SQL statements. For the sample query in this article, Figure 12 shows the index report, which describes the existing indexes of the referenced tables.
Figure 12. Index report
For the convenience of discussion, the existing indexes are summarized in Table 1.
Table 1. Existing indexes
|Table name||Index name||Index column|
Now let's look at the predicates for table CUST_CUSTOMER as extracted from the formatted query in Figure 3:
- CCUS.CUST_CITY = 'Singapore'
- CCUS.CUST_PROV_STATE = 'Singapore'
- CCUS.CUST_CODE = CINT.CUST_CODE
Given the existing indexes in Table 1 and the predicate above, only the
join predicate (CCUST.CUST_CODE = CINT.CUST_CODE) can benefit from the
IDX_CUST_CUSTOMER. The two local
predicates cannot benefit from matching or screening index scans from any
of the existing indexes.
To support both the local predicates and the join predicate with index scans, you should create an index on the columns CUST_CITY, CUST_PROV_STATE and CUST_CODE.
For table CUST_INTEREST and CUST_INTEREST_LOOKUP, each has a join predicate, and the existing index can support a matching index scan on the join predicates as shown below.
- CCUS.CUST_CODE = CINT.CUST_CODE
- CINT.CUST_INTEREST_CODE = CILO.CUST_INTEREST_CODE
For table CUST_ORDER_HEADER, both of the existing indexes do not support the local and join predicates on the table, as shown below.
- COHE.CUST_ORDER_DATE >= '2009-01-01 00:00:00.001'
- COHE.CUST_ORDER_STATUS_CODE = COST.CUST_ORDER_STATUS_CODE
In order to support both the local predicate and join predicate with an index scan, you should create an index on the columns CUST_ORDER_DATE and CUST_ORDER_STATUS_CODE. Also, in order to get better performance with index only access, you can also add CUST_CODE as a key column. Considering that this table is part of the IN-list sub-query, if CUST_CODE is the first key column, the DB2 optimizer may consider global query optimization to access the sub-query as correlated sub-query.
For table CUST_ORDER_STATUS, the first index BQT01_CUST_ORDER_STATUS supports index access for the local predicate but cannot support the join predicate. The second index CUST_ORDER_STAT_PK can support index access for the join predicate but cannot support the local predicate:
- COST.CUST_ORDER_STATUS IN ( 'Shipped', 'Back-ordered', 'In-process' )
- COHE.CUST_ORDER_STATUS_CODE = COST.CUST_ORDER_STATUS_CODE
To achieve better performance on the table CUST_ORDER_STATUS, you need to create an index with both of the columns CUST_ORDER_STATUS_CODE and CUST_ORDER_STATUS.
Given the previous index analysis, we propose the following three indexes to achieve better performance:
Table 2. New indexes
|Table name||Index name||Index column|
After creating the three new indexes, you get the new access plan graph shown in Figure 13.
Figure 13. Access plan graph after creating indexes
(View a larger version of Figure 13.)
With the newly created indexes, all tables are accessed with index scans. The sub-query is changed to correlated sub-query again, and the two tables in the sub-query are accessed using only the index, which provides the best performance.
The line which is ended with an "I" in Figure 14 shows the runtime information of the sample query after creating the new indexes. The average elapsed time is about 0.246 seconds, a further improvement compared to the performance after RUNSTATS (about 2.8 seconds).
Figure 14. Runtime statistics after creating indexes
(View a larger version of Figure 14.)
With the step-by-step analysis on statistics, predicate, and index, the performance of the problem query is improved significantly. Table 2 shows a summary of the performance improvements. Row 1 shows the runtime statistics before tuning. Row 2 shows the statistics after collecting the relevant statistics, and row 3 shows the results after creating the suggested indexes.
Table 3. Table 3. Performance comparison
|Tuning Task||Execution count||STAT_GPAG||STAT_ELAP||STAT_CPU||AVG_STAT_ELAP||AVG_STAT_CPU|
Accelerate query tuning with Optim Query Tuner
So far, you've seen how you can analyze and tune a problem query with the assistance of Optim Query Tuner capabilities, such as query annotation, access plan graph, and so on. Although the tools are very intuitive and user friendly, you still need to have enough background knowledge on the DB2 optimizer and SQL access paths to fully take advantage of the powerful functions provided by the tools. Of course, you also need to spend some time on analysis.
However, in most cases, you do not need to perform query tuning manually. Optim Query Tuner provides a series of advisors to automate the query tuning by giving recommendations, which you can review and implement directly from the product to resolve a performance issue.
- Statistics advisor: recommendations for collecting the required statistics to facilitate access path selection.
- Query advisor: recommendations for rewriting the query for better efficiency.
- Index advisor: recommendations for indexes that may help improve performance.
Recommendation: Typically, it is suggested that you run the advisors in the following order:
- Run the statistics advisor and take actions to make statistics current.
- Run the query advisor and rewrite the SQL if required and possible.
- Run the index advisor and create new indexes or adjust existing ones. Accurate and complete statistics will help drive good access path advice, and of course the index advisor should not be executed without good statistics.
The following sections go into more detail on the previous advisors.
Execute statistics advisor to improve statistics quality and collection
As was mentioned at the beginning of this article, database statistics are the facts on which the optimizer makes decisions about access plans. Consequently, if the statistics are inaccurate, outdated, or conflicting, the optimizer will create inaccurate estimates for the costs of the steps in a query plan, leading to poor performance.
In DB2, the command
RUNSTATS TABLE ALL INDEX ALL
collects all uniform statistics, many of which may not be necessary for
improving query performance. At the same time, the utility does not
collect certain key statistics, such as multi-column and distributed
statistics. There are often data correlations among columns. For example,
there is a strong correlation between CUST_CITY and CUST_PROV_STATE
columns for the sample query in this article. Collecting statistics of
individual columns may not be enough to provide the information required,
so you need to collect column group statistics.
Query Tuner statistics advisor makes it easier by providing RUNSTATS recommendation to collect the required statistics by identifying the following problematic statistics status.
- Missing statistics: When statistics are missing, the optimizer assumes the default values to determine costs, which could be completely inaccurate.
- Conflicting statistics: Inconsistent statistics may cause the optimizer to derive the wrong estimated costs and to make the wrong decision on access plan assessment. Inconsistent statistics may be caused when users collect the partial statistics, for example collecting table and index statistics individually, at different points in time.
- Obsolete statistics: Old statistics may no longer represent the current state of the table.
The statistics advisor also assesses the relative importance of the advice and gives the following two types of recommendations.
- Repair: Recommendations of this type indicate that important
statistics are missing, or that conflicts exist among the statistics.
The recommendation includes a
RUNSTATScommand that you can run to capture and repair the relevant statistics.
- Complete: Recommendations of this type cover the Repair ones as well
as other statistics which can be refreshed for maintenance purpose.
The recommendation provides a
RUNSTATSjob that you might want to include periodically in your maintenance cycle to maintain the health of the relevant statistics.
As you drill into the advice for a specific recommendation, as shown in
Figure 15, you'll see that the statistics advisor generates a
RUNSTATS command that you can use to collect or
repair statistics. You can save them for later execution, or run them
directly from the Query Tuner client if you have the appropriate
authority. Note, in order to execute the
RUNSTATS recommendations from Query Tuner, the
SYSPROC.DSNUTILU stored procedure needs to be
available and operative on the server side.
You also see the explanation behind the recommendation. In Figure 15, you can see that the advisor found several instances of conflicting or missing statistics as well as providing the explanation about why the advisor thinks they are conflicting.
Figure 15. Statistics advisor recommendations
You have already seen how improving statistics quality means that you are giving the DB2 optimizer accurate data on which to base decisions about optimizing the access plan and thus potentially improving performance and reducing CPU. However, there is another benefit. By collecting only the needed statistics, you can avoid collecting irrelevant statistics that drive up CPU overhead and create unnecessary workload inside limited maintenance windows. In other words, the statistics advisor can help improve statistics quality as well as improve the efficiency of statistics data collection.
Execute query advisor to improve query design
Before access path selection, the DB2 optimizer transforms SQL statements into forms that are semantically identical, such as by using predicate pushdown or predicate transitive closure. As a result, it can improve the possible access paths. In contrast, Query Tuner helps the person who writes or tunes queries to identify errors and oversights by making suggestions that further constrain the query, increase index utilization, and reduce data reads. The query advisor looks for opportunities in a query to do the following.
- Minimize the number of index pages and data rows that have to be read. For example, you can minimize the rows read by having predicates in the query that can determine the needed rows from just the index alone.
- Minimize sort operations. For example, whether ORDER BY or GROUP BY clauses are needed in the query, or if they can be resolved via index access.
Specifically, the query advisor checks for the following.
- Missing join predicates, but only if a foreign key is defined.
- Stage 2 predicates that can improve performance if rewritten as Stage 1 or indexable. See the Resources for a link to the DB2 for z/OS Information Center where you can find more information about Stage 1 and Stage 2 predicates.
- Stage 1 predicates that can improve performance if rewritten as indexable.
- Additional local predicates that are not automatically provided by DB2 that can provide predicate transitive closure.
- Predicates that are pushed down to a nested table expression or materialized view without changing the result, and not already done automatically by DB2.
- Additional predicates added to a complex WHERE clause, containing
()parentheses. This might improve performance without changing the result.
- The use of
SELECT *which can be replaced with specific column list.
In Figure 16, you can see the details of a query advisor recommendation along with the rationale for the recommendation. The UI automatically highlights the section of the query related to the recommendation. In this way, DBAs or developers learn about query tuning as they go.
Figure 16. Query advisor recommendations
Execute index advisor to improve query efficiency
Query Tuner also provides index advice. It analyzes the query and recommends additional indexes that would benefit the query access. Index advisor might recommend indexes for the following reasons.
- Foreign keys that do not have indexes defined.
- Indexes that will provide index filtering and/or screening for the SQL statement.
- Indexes that will provide index-only access for the SQL statement.
- Indexes that can help to avoid sorts.
Figure 17 shows the index recommendations for the sample query in this article, together with an estimated performance improvement and DASD space requirement. It also creates the DDL required to create the indexes, and gives you the ability to run them right away, assuming you have proper authority, or save them for later review and execution.
Figure 17. Index advisor recommendations
You can also customize recommendations and the DDL creation. For example you can specify a maximum for the number of columns that can be part of an index key and change the Creator ID that is used when index DDL is generated. You can also specify the FREEPAGE, PCTFREE, and CLUSTERRATIO values to be assumed for new indexes, and whether or not index leaf page size can exceed 4 KB.
You can also test the recommended indexes together with the indexes that you suggest with the "what-if analysis" before deployment. It allows you to apply various constraints to the index advisor and see how those constraints modify the recommendations. The "what-if analysis" uses a virtual index under the covers to assess the effective of the indexes. By doing so, you can choose to create the indexes that can help give you the performance that you want.
This article introduced a methodology for tuning single queries, and used a sample query to explain each step of the methodology in detail. It includes how to understand the problem query better by using the visual aids provided by query formatting, annotation, reports, and the access plan graph. You have learned how to analyze and tune the problem query from the perspective of statistics, predicates, and indexes. You have also been shown ways to accelerate query tuning by using the Query Tuner advisors.
The next article of this series will extend the discussion to a query workload scope, and introduce why workload tuning is needed the challenge for workload tuning, with an example to introduce the workload tuning in detail.
|Sample project file for this article||samplequerytuningproject.zip||338KB|
- Use an RSS feed to request notification for the upcoming articles in this series. (Find out more about RSS feeds of developerWorks content.)
- See the "Tuning SQL with Optim Query Tuner, Part 1: Understanding access paths" (developerWorks, 2010 June) article to help you learn more details about understanding access paths.
- Watch the demo "Optim Performance Management solution" (developerWorks, 2010 April) to see how one fictional company uses Optim solutions to resolve problems before they affect the business, including use of Optim Query Tuner for DB2 for Linux, UNIX, and Windows.
- Go to the Optim Query Workload Tuner for DB2 for z/OS product web page or the Optim Query Tuner for DB2 for Linux, UNIX, and Windows product web page for more information, including how to purchase the product.
- Read the DB2 for Linux, UNIX, and Windows product information on joins in the DB2 Information Center.
- See a detailed introduction about how to launch the various functions in OQT in the Information Center.
- Read the DB2 for Linux, UNIX, and Windows product information on predicate types in the DB2 Information Center.
- Read the DB2 for z/OS product information on nested loop joins in the DB2 for z/OS Information Center.
- Read the DB2 for z/OS product information on merge scan joins in the DB2 for z/OS Information Center.
- Read the DB2 for z/OS product information on stage 1 and stage 2 predicates in the DB2 for z/OS Information Center.
- Read the "Best practices when using Data Studio and Optim Development Studio with DB2 for z/OS" (developerWorks, 2010 June) article contains information to help you configure Data Studio for query tuning on DB2 for z/OS.
- Explore the developerWorks Optim family page to learn more about Optim solutions. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts. See this page for a schedule of virtual technical briefings around the Optim integrated data management portfolio.
- Learn more about Information Management at the developerWorks Information Management zone. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
- Follow developerWorks on Twitter.
Get products and technologies
- Download the Data Studio no-charge software.
- Build your next development project with IBM trial software, available for download directly from developerWorks.
- Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the SOA Sandbox learning how to implement Service Oriented Architecture efficiently.
- Participate in the discussion forum.
- Check out the Integrated Data Management experts blog and get involved in the Integrated Data Management community space, which has a comprehensive list of resources and downloads.
- Check out the developerWorks blogs and get involved in the developerWorks community.