Improve SQL execution times using the DB2 statement concentrator feature and the DB2 statement reoptimization feature

The article explains how using parameter markers and literals in SQL statements influences SQL execution times. It advises whether to use parameter markers or literals, depending on the type of the application and data distribution characteristics of the database. You will also learn about two DB2® for Linux®, UNIX®, and Windows® features to improve SQL execution times, especially when changes to a database application are not possible. Learn how to use the DB2 statement concentrator feature when literals are used improperly. Learn how to use the DB2 statement reoptimization feature when parameter markers are used improperly.

Share:

Dirk Fechner (fechner@de.ibm.com), IT Specialist, IBM

Author photoDirk Fechner works as an IT-Specialist for IBM Software Group Services His area of expertise is design, implementation, and administration of DB2 databases on distributed platforms as well as database application development. He has eleven years of experience with DB2 for Linux, UNIX, and Windows. Dirk is an IBM Certified Advanced Database Administrator and an IBM Certified Application Developer.



07 April 2011

Also available in Chinese

Introduction

Database APIs such as JDBC and ODBC/CLI offer database developers the choice to use SQL statements with or without parameter markers (also called host variables). In many cases, one of these options is chosen without considering the implications. This can cause unexpected database performance issues when the database application is subsequently executed. For users of packaged software, such problems are even worse because changes to the application are not possible. This article explains in which cases you should work with parameter markers and in which cases using literals would be better. The article then presents two DB2 LUW features that help to resolve performance issues caused by inadequate use of literals and parameter markers respectively: the DB2 statement concentrator feature and the DB2 statement reoptimization feature.

Understanding SQL execution phases

Execution of an SQL statement is divided into three phases:

Prepare
In the prepare phase, the DB2 optimizer creates the access plan to execute the SQL statement.
Execute
In the execute phase, DB2 determines the rows that make up the result set (SELECT), inserts new rows (INSERT), deletes rows (DELETE), or updates existing rows (UPDATE). The access plan created in the prepare phase determines the database internal operations and the order in which the operations are executed.
Fetch
In the fetch phase, if the SQL statement is a SELECT, DB2 returns the result set to the application.

The execution time for an SQL statement is the time required to go through all three phases. DB2 LUW provides the db2batch utility to determine the times that an SQL statement spends in each of the three phases (prepare, execute, and fetch). db2batch reads SQL statements from a flat file, executes the statements, and writes the results to an output file. The option -i complete tells db2batch to report the time required for each phase separately, as shown in Listing 1.

Listing 1. Example db2batch statement
db2batch -d <dbname> -f <sqlfile> -a <user>/<password> -r<outfile> -i complete

Using parameter markers or literals

This section describes when to use parameter markers and when to use literals.

The case for parameter markers

Parameter markers should be used if all of the following conditions are true for an SQL statement:

  • The SQL statement is executed many times.
  • The time required for the prepare phase of the SQL statement makes up a significant portion of the statement's overall execution time.
  • The best access plan for the SQL statement does not depend on explicit values in the statement's filter conditions.

The DB2 optimizer stores the access plan created in the prepare phase in the SQL statement cache (also called package cache in DB2) for reuse purposes. So when the same SQL statement is executed again, DB2 finds the previously created access plan in the statement cache and reuses the access plan. This saves the prepare phase and thus reduces execution time. Avoiding the prepare phase especially pays off if an SQL statement is executed many times and if the prepare phase makes up a significant portion of the statement's overall execution time.

Consider this example scenario:

  • The execution time of an SQL statement is 4 ms. These 4 ms comprise the total time required for the prepare, execute, and fetch phases: Prepare = 1 milliseconds (ms); Execute = 2 ms; Fetch = 1 ms.
  • If the statement is executed 100,000 times, the total execution time is 100,000 x 4 ms = 400 seconds = 6 minutes 40 seconds
  • If the access plan is reused, the time for the prepare phase is saved and the total execution time is reduced by 25%: 1 x 4 ms + 99,999 x 3 ms = 300,001 seconds = 5 minutes.

Another advantage of access plan reuse is a reduced CPU load. Access plan creation in the prepare phase requires CPU resources. So if an access plan is reused, a database server requires less CPU resources to execute the corresponding SQL statement repeatedly.

Remember that DB2 can only reuse an access plan if SQL statement texts completely match. For example, the two SQL statements in Listing 2 are not identical to DB2, although they differ only slightly in their filter conditions. DB2 would not reuse an access plan in this example.

Listing 2. SQL statements using literals in the filter conditions
SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = '000040'
SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = '000120'

To enable access plan reuse, rewrite the SQL statement using a parameter marker, as shown in Listing 3.

Listing 3. SQL statement using a parameter marker in the filter condition
SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = ?

All major database interfaces offer the capability of executing SQL statements that contain parameter markers. JDBC, for example, provides the class java.sql.PreparedStatement for this purpose, as shown in Listing 4.

Listing 4. JDBC java.sql.PreparedStatement code sample
java.sql.Connection con;
java.sql.PreparedStatement prepstmt;
java.sql.ResultSet rs;
String query;

// initialize the database connection
con = ...

// prepare the SQL statement (implicitly creates a prepared SQL statement object)
query = "SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = ?";
prepstmt = con.prepareStatement(query);

// bind a value to the parameter marker and execute the SQL statement
prepstmt.setString(1, "000040");
rs = prepstmt.executeQuery();
// process the result set...

// bind another value to the parameter marker and execute the SQL statement again
prepstmt.setString(1, "000120");
rs = prepstmt.executeQuery();
// process the result set...

SQL statements well-suited for parameter marker usage (because they conform to the three rules mentioned above) are normally quite simple and therefore are typical for online transaction processing (OLTP). In most cases, well-suited SQL statements are very simple SELECT statements that reference only a single table or are INSERT, UPDATE, and DELETE (UID) operations that are executed many times. Listing 5 shows the low complexity of such SQL statements.

Listing 5. Simple SQL statements suited for parameter marker usage
SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = ?
INSERT INTO DEPARTMENT (DEPTNO, DEPTNAME, ADMRDEPT) VALUES (?, ?, ?)
UPDATE EMPLOYEE SET JOB = ? WHERE WORKDEPT = ?

The case for literals

As the previous section described, parameter marker usage in SQL statements fosters access plan reuse. Thus time and CPU resources are saved in the prepare phase when SQL statements are executed many times. But using parameter markers is not always better than using literals (explicit values). Always use literals if the optimal access plan for the SQL statement varies according to the literals in the filter conditions. This condition generally applies to complex queries executed by applications used for reporting purposes. Such applications rarely execute the exact same query twice. Therefore, time and CPU savings in the prepare phase is minimal even if parameter markers are used.

Literals can also be a better choice in simple SQL statements, as shown in Listing 6.

Listing 6. Simple SQL statement not suited for parameter marker usage
SELECT SUM(SALES) FROM SALES WHERE SALES_DATE BETWEEN ? AND ?

For the example in Listing 6, assume the following:

  • All rows in table SALES have a SALES_DATE between 2010-01-01 and 2010-12-31.
  • For each day there are roughly 1,000 rows, yielding a total of 365,000 rows for the whole year 2010.
  • There is an index on column SALES_DATE.

Depending on the values inserted for the parameter markers following the prepare phase, data access using the SALES_DATE index or a complete table scan makes more sense. For example, if only a single day should be evaluated (roughly 1,000 rows), accessing the rows using the SALES_DATE index is preferable. However, if the whole year 2010 should be evaluated (all 365,000 rows), accessing the rows using the index causes processing overhead, because the complete table has to be read anyway. Accessing the rows using a table scan is a better choice in this case. Because the DB2 optimizer does not know the values of the parameter markers in the prepare phase, a one-size-fits-all access plan is generated that cannot be optimal in all cases. For this example query, it would be better to use literals instead of parameter markers and to accept the repeated prepare phase, as shown in Listing 7.

Listing 7. Better version of the SQL statement above using literals instead of parameter markers
SELECT COUNT(SALES) FROM SALES WHERE SALES_DATE BETWEEN '2010-05-26' AND '2010-05-26'
SELECT COUNT(SALES) FROM SALES WHERE SALES_DATE BETWEEN '2010-01-01' AND '2010-12-31'

In a JDBC application, use java.sql.Statement instead of java.sql.PreparedStatement to execute such queries, as shown in Listing 8.

Listing 8. JDBC java.sql.Statement code sample
java.sql.Connection con;
java.sql.Statement stmt;
java.sql.ResultSet rs;
String query;

// initialize the database connection
con = ...

// create an SQL statement object
stmt = con.createStatement();

// execute the SQL statement with literals
query = "SELECT COUNT(SALES) FROM SALES " +
        "WHERE SALES_DATE BETWEEN '2010-05-26' AND '2010-05-26'";
rs = stmt.executeQuery(query);
// process the result set...

// execute the same SQL statement but with different literals
query = "SELECT COUNT(SALES) FROM SALES " +
        "WHERE SALES_DATE BETWEEN '2010-01-01' AND '2010-12-31'";
rs = stmt.executeQuery(query);
// process the result set...

Using java.sql.PreparedStatement usually makes the source code more readable compared to using java.sql.Statement that might require ugly string concatenations. However, if literals are important to get optimal access plans, the shorter execution times of the SQL statements are more important than the elegancy of the application's source code.


Fixing performance issues caused by inadequate use of parameter markers or literals at the database level

When you understand the advantages and disadvantages of parameter markers and literals, you can choose wisely between the two options during application development to achieve optimal performance for execution of SQL statements. But most often SQL performance issues arise in situations in which you cannot influence how an application executes SQL statements, as in any of the following situations:

Packaged software
When software is purchased as a product, the software developer decides how the application accesses the database. Changes to the implementation are not possible.
Persistence frameworks
When a third-party persistence framework is used for custom application development, it is not possible to influence the way in which the persistence layer generates and executes SQL statements.
Legacy applications
In case of custom legacy applications, changes to the implementation are possible. But often such changes can cost too much time and money.

When one of these situations applies, database performance can be bad because of improper use of parameter markers and literals. DB2 offers the following two features to improve performance on the database side:

DB2 Statement Concentrator
The DB2 Statement Concentrator feature enables access plan reuse when an application improperly uses literals instead of parameter markers.
DB2 Statement Reoptimization
The DB2 Statement Reoptimization feature enables reoptimization of access plans after the prepare phase when an application improperly uses parameter markers instead of literals.

Forcing access plan reuse using the DB2 statement concentrator feature

This section describes an example of how to use the DB2 statement concentrator feature.

Table 1 describes the contents of the example table DEMO.TAB1, which consists of 10,000 rows and 3 columns.

Table 1. Contents of sample table DEMO.TAB1
IDCOL1COL2
111
221
331
441
551
.........
99991
10001
10111
.........
199991
20001
20111
.........
1000001

Where:

  • ID is a continuous ID from 1 to 10,000.
  • COL1 contains the result of the operation.
  • ID MOD 100 is the remainder of ID divided by 100.
  • COL2 contains the value 1 for all rows at the beginning.

In addition to a primary key index on column ID, two more indexes are defined: an index DEMO.IDX1 on column COL1 and an index DEMO.IDX2 on column COL2.

The DB2 CLP script in Listing 9 creates and fills the table DEMO.TAB1 (together with its indexes) in the DB2 sample database.

Listing 9. DB2 CLP script to create table DEMO.TAB1 and its indexes in the SAMPLE database
!db2sampl;

CONNECT TO SAMPLE;

CREATE TABLE DEMO.TAB1
(
    ID INT NOT NULL PRIMARY KEY,
    COL1 INT,
    COL2 INT
);

INSERT INTO DEMO.TAB1 (ID, COL1, COL2)
WITH TMP (ID, COL1, COL2) AS
(
    VALUES (1, 1, 1)
    UNION ALL
    SELECT (ID + 1), MOD((ID + 1), 100), 1
    FROM TMP
    WHERE ID < 10000
)
SELECT ID, COL1, COL2 FROM TMP;

COMMIT;

CREATE INDEX DEMO.IDX1 ON DEMO.TAB1 (COL1 ASC);
CREATE INDEX DEMO.IDX2 ON DEMO.TAB1 (COL2 ASC);

RUNSTATS ON TABLE DEMO.TAB1 AND INDEXES ALL;

CONNECT RESET;

Assume an application repeatedly reads the IDs of all rows containing the same value in COL1. The corresponding SELECT statements use literals in the filter conditions, as shown in Listing 10.

Listing 10. SELECT statements accessing table DEMO.TAB1 using literals in the filter condition
SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42;
SELECT ID FROM DEMO.TAB1 WHERE COL1 = 73;

As expected, DB2 cannot reuse the access plan generated for the first statement (filter condition COL1 = 42) when executing the second statement (COL1 = 73). These values are generated by the literals in the filter conditions, and DB2 considers the two statements to be different. A snapshot of the DB2 statement cache confirms that each statement has its own entry in the statement cache and each entry shows only 1 execution (Number of executions = 1) and 1 compilation (Number of compilations = 1), as shown in Listing 11.

Listing 11. Dynamic SQL snapshot for SELECT statements with literals
GET SNAPSHOT FOR DYNAMIC SQL ON SAMPLE;

      Dynamic SQL Snapshot Result

 Database name              = SAMPLE

 Database path              = C:\DB2\NODE0000\SQL00001\

 Number of executions       = 1
 Number of compilations     = 1
 ...
 Statement text             = SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42


 Number of executions       = 1
 Number of compilations     = 1
 ...
 Statement text             = SELECT ID FROM DEMO.TAB1 WHERE COL1 = 73

Because the values in COL1 are equally distributed, parameter marker usage would have been the right choice because in this case each of these queries returns exactly 100 rows. Because changes to the example application are not possible (for any of the reasons mentioned above), performance of the SQL statements can be improved by using the DB2 statement concentrator feature. The statement concentrator can be activated at the database level by setting the DB CFG parameter STMT_CONC to the value LITERALS, as shown in Listing 12.

Listing 12. Activating the DB2 statement concentrator in the database configuration
UPDATE DB CFG USING STMT_CONC LITERALS IMMEDIATE;

After activating the statement concentrator, the two SELECT statements are executed again. The resulting contents of the DB2 statement cache are shown in Listing 13.

Listing 13. Dynamic SQL snapshot for SELECT statements but with DB2 statement concentrator activated
SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42;
SELECT ID FROM DEMO.TAB1 WHERE COL1 = 73;

GET SNAPSHOT FOR DYNAMIC SQL ON SAMPLE;

      Dynamic SQL Snapshot Result

 Database name              = SAMPLE

 Database path              = C:\DB2\NODE0000\SQL00001\

 Number of executions       = 2
 Number of compilations     = 1
 ...
 Statement text             = SELECT ID FROM DEMO.TAB1 WHERE COL1 = :L0


 Number of executions       = 0
 Number of compilations     = 0
 ...
 Statement text             = SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42


 Number of executions       = 0
 Number of compilations     = 0
 ...
 Statement text             = SELECT ID FROM DEMO.TAB1 WHERE COL1 = 73

Both SELECT statements still have their own entry in the statement cache. But the number of executions and compilations (Number of executions/compilations = 0) indicates that the SELECT statements were not executed in their original representation with literals. Instead the DB2 statement concentrator automatically generated a more generic representation of the SELECT statement by replacing the literals with a parameter marker: SELECT ID FROM DEMO.TAB1 WHERE COL1 = :L0. The corresponding entry in the statement cache shows that this generic SELECT statement was executed 2 times (Number of executions = 2) and compiled only 1 time (Number of compilations = 1). By activating the statement concentrator, the repeated prepare phase for this kind of SELECT statement could be saved.

The access plan information, which is generated by entering the statement in Listing 14 and then formatted using the db2exfmt utility, also proves that DB2 rewrote the original SELECT statement.

Listing 14. Generating the access plan with DB2 statement concentrator activated
EXPLAIN ALL FOR
SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42;

In addition to the usual sections Original Statement and Optimized Statement, the db2exfmt representation of the access plan contains an additional section Effective Statement that reflects the substitution of the literal with a parameter marker, as shown in Listing 15.

Listing 15. Excerpt from the access plan showing the original and the effective statement
Original Statement:
------------------
SELECT ID 
FROM DEMO.TAB1 
WHERE COL1 = 42


Effective Statement:
-------------------
SELECT ID 
FROM DEMO.TAB1 
WHERE COL1 = :L0


Optimized Statement:
-------------------
SELECT Q1.ID AS "ID" 
FROM DEMO.TAB1 AS Q1 
WHERE (Q1.COL1 = :L0)

The DB2 statement concentrator can improve database performance if an application executes many similar SQL statements with literals whose access plans DB2 would normally reuse if parameter markers were used instead.

To analyze the problem of improper usage of parameter markers in the next section, deactivate the statement concentrator, as shown in Listing 16.

Listing 16. Deactivating the DB2 statement concentrator in the database configuration
UPDATE DB CFG USING STMT_CONC OFF IMMEDIATE;

Getting an optimal access plan using DB2 statement reoptimization

To determine optimal access plans for certain kinds of SQL statements, the DB2 optimizer needs a very accurate information about the filter factors associated with the conditions specified in the WHERE clauses. In such cases, the use of literals instead of parameter markers is vital for good SQL performance. This is especially true if data in certain columns of a table is not equally distributed (see the article about distribution statistics in the Resources section). To generate an uneven distribution of values in column COL2 of the sample table DEMO.TAB1, execute the DB2 CLP script in Listing 17.

Listing 17. DB2 CLP script to update table DEMO.TAB1
CONNECT TO SAMPLE;

UPDATE DEMO.TAB1 SET COL2 = 0 WHERE ID = 1;

RUNSTATS ON TABLE DEMO.TAB1 WITH DISTRIBUTION ON COLUMNS (COL2) AND INDEXES ALL;

COMMIT;

CONNECT RESET;

Before the script is executed, each of the 10,000 rows in table DEMO.TAB1 has a value of 1 in column COL2. After executing the script, there is exactly 1 row (the row with ID = 1) that has a value of 0 in column COL2. The distribution of values in column COL2 is extremely uneven. The value 0 occurs only 1 time, while the value 1 occurs 9,999 times. The script also generates new statistics for table DEMO.TAB1. During statistics generation, distribution information is gathered for the values in column COL2. Thus the uneven distribution of values is reflected in the database's statistics information.

Listing 18 shows a sample SELECT statement that uses parameter markers in its filter conditions.

Listing 18. Sample SELECT using parameter markers
SELECT ID FROM DEMO.TAB1 WHERE COL1 = ? AND COL2 = ?;

To analyze how DB2 executes this statement, generate the access plan information in the explain tables as shown in as shown in Listing 19, and then use the db2exfmt utility to format the access plan.

Listing 19. Generating an access plan for sample SELECT with parameter markers
EXPLAIN ALL FOR
SELECT ID FROM DEMO.TAB1 WHERE COL1 = ? AND COL2 = ?;

Listing 20 shows the resulting db2exfmt output.

Listing 20. Access plan for sample SELECT with parameter markers
Access Plan:
-----------
	Total Cost: 		39,6529
	Query Degree:		1

            Rows 
           RETURN
           (   1)
            Cost 
             I/O 
             |
             50 
           FETCH 
           (   2)
           39,6529 
           32,9772 
         /---+----\
       100         10000 
     RIDSCN   TABLE: DEMO    
     (   3)        TAB1
     7,63471        Q1
        1 
       |
       100 
     SORT  
     (   4)
     7,63438 
        1 
       |
       100 
     IXSCAN
     (   5)
     7,61811 
        1 
       |
      10000 
 INDEX: DEMO    
      IDX1
       Q1

The db2exfmt utility for formatting access plans

A detailed access plan for an SQL statement can be created by performing the following steps:

  1. Locate the script to generate the Explain tables. The script is in the directory <DB2 instance owner>/sqllib/misc/EXPLAIN.DDL for UNIX or Linux or in the directory <DB2 install dir>\SQLLIB\MISC\EXPLAIN.DDL for Windows.
  2. Create the explain tables (required only once) by executing the script using the DB2 CLP
    db2 CONNECT TO <dbname> USER <userid>
    db2 -tf EXPLAIN.DDL
  3. Fill the explain tables by executing the SQL statement in EXPLAIN mode using one of the following options:
    • db2 SET CURRENT EXPLAIN MODE EXPLAIN
      db2 <sqlstmt>
      db2 SET CURRENT EXPLAIN MODE NO
    • db2 EXPLAIN ALL FOR <sqlstmt>
  4. Read the explain tables and write the access plan to a file using
    db2exfmt -d <dbname> -u <userid> <pwd> -g -o <outfile> -w -1 -n % -s % -# 0

The access plan is generated during the prepare phase. At this point, the DB2 optimizer does not know the values for the parameter markers in the filter conditions of the SELECT statement because the application provides the filter values just before the statement's execute phase (after the prepare phase). The access plan shows that the DB2 optimizer uses the index on column COL1 (index DEMO.IDX1) to access the rows in table DEMO.TAB1 for the following reasons:

  • Independent of the value that is finally inserted for the parameter marker, the filter condition COL1 = ? always returns 100 rows, because the values in column COL1 are evenly distributed (the DB2 optimizer handles the filter factor accurately).
  • On the other hand, the filter factor for condition COL2 = ? varies greatly, depending on the value inserted for the parameter marker, 1 or 0. If 1 is inserted, the filter factor is not good because 9,999 rows satisfy the condition. However, if 0 is inserted, the filter factor is excellent, because only a single row satisfies the condition.
  • Because parameter markers are used in the filter conditions, DB2 tries to mitigate the risk of a bad access plan and chooses a conservative data access pattern by using the index on column COL1. This access pattern ensures a good execution time independent of the value (1 or 0) that is inserted for the parameter marker in the filter condition COL2 = ?.

By examining the access plan for the same SELECT statement but with literals in the filter conditions, you can see that the DB2 optimizer uses the exact filter factors to generate a better access plan for the case COL2 = 0, as shown in Listing 21.

Listing 21. Generating an access plan for sample SELECT with literals
EXPLAIN ALL FOR
SELECT ID FROM DEMO.TAB1 WHERE COL1 = 1 AND COL2 = 0;

The resulting db2exfmt output is shown in Listing 22.

Listing 22. Access plan for sample SELECT with literals
Access Plan:
-----------
	Total Cost: 		15,139
	Query Degree:		1

            Rows 
           RETURN
           (   1)
            Cost 
             I/O 
             |
            0,01 
           FETCH 
           (   2)
           15,139 
              2 
         /---+----\
        1          10000 
     IXSCAN   TABLE: DEMO    
     (   3)        TAB1
     7,57519        Q1
        1 
       |
      10000 
 INDEX: DEMO    
      IDX2
       Q1

With literals in the filter conditions, the DB2 optimizer recognizes that accessing table DEMO.TAB1 using the index on column COL2 (index DEMO.IDX2) is more efficient than accessing the table using the index on column COL1 (index DEMO.IDX1) that was chosen for the parameterized query. The lower costs (Total Cost) of the access plan with literals confirm the optimized data access: 15 timerons instead of 39 timerons for the access plan with parameter markers.

If optimal access plans for certain SQL statements depend on the use of literals in the filter conditions but the application uses parameter markers instead, DB2 offers the possibility to reoptimize SQL statements. Reoptimization means that DB2 still generates the access plan in the prepare phase based on the original SQL statement text containing parameter markers. But at the point at which the application has inserted values for the parameter markers, DB2 reassesses the previously generated access plan and reoptimizes it if necessary before the access plan executes. In contrast to the DB2 statement concentrator, statement reoptimization cannot be simply activated or deactivated by setting a DB CFG parameter. Reoptimization is controlled at the database interface level.

To execute SQL statements that JDBC or ODBC/CLI sends, DB2 internally uses the so-called CLI packages. These packages are usually bound to the database with the BIND command, as shown in Listing 23.

Listing 23. Binding CLI packages
BIND "C:\PROGRA~1\IBM\SQLLIB\bnd\@db2cli.lst"
    BLOCKING ALL GRANT PUBLIC;

As the BIND command contains no explicit schema name, the CLI packages are bound in the default schema NULLID. By default DB2 uses the CLI packages in schema NULLID when it receives SQL statements using JDBC or ODBC/CLI. The CLI packages in schema NULLID do not use reoptimization for the execution of SQL statements.

To reoptimize, bind the CLI packages a second time in another schema named NULLIDRA. The schema name is specified in the BIND command using the option COLLECTION, as shown in Listing 24.

Listing 24. Binding CLI packages with option REOPT ALWAYS
BIND "C:\PROGRA~1\IBM\SQLLIB\bnd\@db2cli.lst"
    BLOCKING ALL COLLECTION NULLIDRA GRANT PUBLIC;

NULLIDRA is a reserved schema name in DB2 that indicates that the CLI packages should be bound with the additional option REOPT ALWAYS. Because this option is implicitly used for schema name NULLIDRA, it is not necessary to explicitly specify REOPT ALWAYS.

Just binding the CLI packages in schema NULLIDRA is not enough. A JDBC or ODBC/CLI application must explicitly tell the database that it would like to use the CLI packages in schema NULLIDRA instead of the default CLI packages in schema NULLID (remember that the NULLID and NULLIDRA CLI packages coexist in the database). For Java applications, the CLI packages' schema can be specified by setting the JDBC property currentPackageSet. The JDBC property can be appended to the database URL or it can be set within the data source definition for the database. Appending the JDBC property to the database URL is shown in Listing 25.

Listing 25. JDBC database URL setting the property currentPackageSet to schema NULLIDRA
jdbc:db2://localhost:50000/sample:currentPackageSet=NULLIDRA;

For the sample query with parameter markers, the Java source code is shown in Listing 26.

Listing 26. Sample Java class using the CLI packages bound with REOPT ALWAYS
public class ReoptAlwaysSample {

    public static void main(String[] args) {
        try {
            // load driver
            Class.forName("com.ibm.db2.jcc.DB2Driver").newInstance();

            // set database URL, user, and password

            // use REOPT ALWAYS CLI packages
            String databaseUrl =
                "jdbc:db2://localhost:50000/sample:currentPackageSet=NULLIDRA;";

            String user = "userid";
            String password = "password";

            // get connection
            java.sql.Connection con =
                java.sql.DriverManager.getConnection(databaseUrl, user, password);

            // execute query
            String query = "SELECT ID FROM DEMO.TAB1 WHERE COL1 = ? AND COL2 = ?";
            java.sql.PreparedStatement prepstmt = con.prepareStatement(query);
            prepstmt.setInt(1, 1);
            prepstmt.setInt(2, 0);
            java.sql.ResultSet rs = prepstmt.executeQuery();

            // in the real world, result set evaluation would occur here
            // ...

            rs.close();
            prepstmt.close();

            con.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

You can identify several places in the access plan for the SELECT statement in the Java class ReoptAlwaysSample that indicate that the statement was reoptimized during execution.

The header section of the db2exfmt output shows that the CLI packages in schema NULLIDRA were used for statement execution, as shown in Listing 27.

Listing 27. db2exfmt header showing schema NULLIDRA for CLI package SYSSH200
******************** EXPLAIN INSTANCE ********************

DB2_VERSION:       09.07.2
SOURCE_NAME:       SYSSH200
SOURCE_SCHEMA:     NULLIDRA
SOURCE_VERSION:      
EXPLAIN_TIME:      2011-01-07-18.01.01.921001
EXPLAIN_REQUESTER: FECHNER

The access plan tree has changed. It shows that the rows in table DEMO.TAB1 are accessed using index DEMO.IDX2 on column COL2. In other words, the access plan tree for the SELECT statement with parameter markers is now identical to the optimal access plan tree for the SELECT statement with literals in the filter conditions, as shown in Listing 28.

Listing 28. Access plan showing that the optimal index is chosen when SQL Statement Reoptimization occurs
Original Statement:
------------------
SELECT ID 
FROM DEMO.TAB1 
WHERE COL1 = ? AND COL2 = ?

Optimized Statement:
-------------------
SELECT Q1.ID AS "ID" 
FROM DEMO.TAB1 AS Q1 
WHERE (Q1.COL2 = :?) AND (Q1.COL1 = :?)

Access Plan:
-----------
	Total Cost: 		15,139
	Query Degree:		1

            Rows 
           RETURN
           (   1)
            Cost 
             I/O 
             |
            0,01 
           FETCH 
           (   2)
           15,139 
              2 
         /---+----\
        1          10000 
     IXSCAN   TABLE: DEMO    
     (   3)        TAB1
     7,57519        Q1
        1 
       |
      10000 
 INDEX: DEMO    
      IDX2
       Q1

The details sections for the index scan operation (3) and the fetch operation (2) also show that reoptimization occurred for the SELECT statement. Both details sections contain the literal values that were inserted for the parameter markers during the reoptimization process, as shown in Listing 29.

Listing 29. Details section of db2exfmt output explicitly showing the values used for SQL statement reoptimization
	2) FETCH : (Fetch)
            ...
		Predicates:
		----------
		3) Sargable Predicate, 
			Comparison Operator: 		Equal (=)
			Subquery Input Required: 	No
			Filter Factor: 			0,01

			Predicate Text:
			--------------
			(Q1.COL1 = :? /* REOPT VALUE=1*/)

	3) IXSCAN: (Index Scan)
            ...
		Predicates:
		----------
		2) Start Key Predicate, 
			Comparison Operator: 		Equal (=)
			Subquery Input Required: 	No
			Filter Factor: 			0,0001

			Predicate Text:
			--------------
			(Q1.COL2 = :? /* REOPT VALUE=0*/)

		2) Stop Key Predicate, 
			Comparison Operator: 		Equal (=)
			Subquery Input Required: 	No
			Filter Factor: 			0,0001

			Predicate Text:
			--------------
			(Q1.COL2 = :? /* REOPT VALUE=0*/)

Some additional comments concerning this sample:

  • CLI packages were bound in schema NULLIDRA (RA = REOPT ALWAYS), so an SQL statement is reoptimized for each execution. Under certain circumstances it might be sufficient for the SQL statement to be reoptimized for only the first execution. If this is the desired behavior, CLI packages are bound in schema NULLIDR1 (R1 = REOPT ONCE). When schema NULLIDR1 is specified, the BIND command implicitly uses the option REOPT ONCE to bind CLI packages. REOPT ONCE is a trade-off between optimal access plans and reuse of access plans, because in this case reoptimization only takes place for an SQL statement's first execution. For subsequent executions, the access plan is reused independently of the values that are bound to the parameter markers.
  • Generation of access plan information for the SELECT statement in the Java class was achieved by setting the additional JDBC property currentExplainMode to YES, as shown in Listing 30.
    Listing 30. JDBC database URL setting the properties currentPackageSet and currentExplainMode
    jdbc:db2://localhost:50000/sample:currentPackageSet=NULLIDRA;currentExplainMode=YES;
  • The CLI keyword CurrentPackageSet is the equivalent of the JDBC property currentPackageSet (used for setting the CLI packages' schema) in ODBC/CLI applications. You can set CurrentPackageSet for the data source (= database) in the CLI configuration file db2cli.ini.

Setting currentPackageSet to NULLIDRA for a JDBC or ODBC/CLI application activates reoptimization for the execution of each SQL statement in the application. But in many cases, only a few SQL statements should be reoptimized, because reoptimizing each statement could have a negative performance impact. You can control reoptimization at the statement level using an optimization profile.

Controlling reoptimization using a profile

See Resources for a link to an article that introduces optimization profiles.

This section describes how to control reoptimization for the SQL statement in the example using an optimization profile.

  1. Create an optimization profile for the SELECT statement, as shown in Listing 31. Optimization profiles are XML documents. The sample optimization profile is named profile1.xml.
    Listing 31. Optimization profile specifying REOPT ALWAYS for a certain query
    <?xml version="1.0" encoding="UTF-8"?>
    <OPTPROFILE VERSION="9.1.0.0">
        <STMTPROFILE ID="REOPT_QUERY_1">
            <STMTKEY>
                <![CDATA[SELECT ID FROM DEMO.TAB1 WHERE COL1 = ? AND COL2 = ?]]>
            </STMTKEY>
            <OPTGUIDELINES>
                <REOPT VALUE="ALWAYS"/>
            </OPTGUIDELINES>
        </STMTPROFILE>
    </OPTPROFILE>

    An optimization profile contains DB2 optimizer hints for one or more SQL statements. Each SQL statement has its own section <STMTPROFILE> in the XML document. The section contains the text of the SQL statement (subsection <STMTKEY>) as well as the optimizer hints (subsection <OPTGUIDELINES>) that the DB2 optimizer should consider when generating the access plan for the statement. The sample optimization profile contains a SELECT statement and the corresponding hint <REOPT VALUE="ALWAYS"/>. This hint tells the DB2 optimizer to execute this SELECT statement using the CLI packages in schema NULLIDRA, which means that the statement will be reoptimized for each execution.

  2. Set the DB2 registry variable DB2_OPTPROFILE to a value of YES to explicitly activate the use of optimization profiles in DB2, as shown in Listing 32.
    Listing 32. Activating optimization profile support by setting the DB2 Registry variable DB2_OPTPROFILE
    db2set DB2_OPTPROFILE=YES
    
    db2stop
    db2start
  3. Restart the DB2 instance for DB2 registry changes to take effect.
  4. Call the stored procedure SYSPROC.SYSINSTALLOBJECTS to create the SYSTOOLS.OPT_PROFILE table to store the optimization profile, as shown in Listing 33.
    Listing 33. Creating table SYSTOOLS.OPT_PROFILE for storing optimization profiles
    CALL SYSPROC.SYSINSTALLOBJECTS('OPT_PROFILES', 'C',
        CAST(NULL AS VARCHAR(128)), CAST(NULL AS VARCHAR(128)));
  5. Use the DB2 IMPORT utility to insert the XML file containing the optimization profile definition (profile1.xml) into table SYSTOOLS.OPT_PROFILE.
    1. Create an input file for the IMPORT utility (such as profile1.del) to reference the XML file and to provide a schema (DEMO) and a name (PROFILE1) for the optimization profile, as shown in Listing 34.
      Listing 34. Input file profile1.del for importing the optimization profile
      "DEMO","PROFILE1","profile1.xml"
    2. Call the IMPORT utility for the input file, as shown in Listing 35.
      Listing 35. Importing the optimization profile in table SYSTOOLS.OPT_PROFILE
      IMPORT FROM profile1.del OF DEL MODIFIED BY LOBSINFILE 
           INSERT INTO SYSTOOLS.OPT_PROFILE;
  6. Flush the DB2 optimization profile cache to ensure that changes to optimization profiles will take effect, as shown in Listing 36.
    Listing 36. Flushing the DB2 optimization profile cache
    FLUSH OPTIMIZATION PROFILE CACHE ALL;

Because the sample Java application should no longer use reoptimization for all SQL statements, the property currentPackageSet is removed. It is replaced with the property optimizationProfile, which defines the fully qualified name of the optimization profile (= DEMO.PROFILE1) that should be used for SQL statements that the application executes, as shown in Listing 37. Besides replacing one Java property with another, no additional changes to the source code are required.

Listing 37. Sample Java class using the optimization profile
public class ReoptAlwaysSample {

    public static void main(String[] args) {
        try {
            // load driver
            Class.forName("com.ibm.db2.jcc.DB2Driver").newInstance();

            // set database URL, user, and password

            // specify optimization profile to use
            String databaseUrl = "jdbc:db2://localhost:50000/sample"
                                 + ":optimizationProfile=DEMO.PROFILE1;";

            ...
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The header of the db2exfmt output shows that the application now uses the default CLI packages in schema NULLID again (without reoptimization), as shown in Listing 38.

Listing 38. db2exfmt header showing default schema NULLID for CLI package SYSSH200
******************** EXPLAIN INSTANCE ********************

DB2_VERSION:       09.07.2
SOURCE_NAME:       SYSSH200
SOURCE_SCHEMA:     NULLID  
SOURCE_VERSION:      
EXPLAIN_TIME:      2011-01-21-19.10.55.500001
EXPLAIN_REQUESTER: FECHNER

Note that there is a new section Profile information. This section shows the name of the optimization profile that the application uses, as well as the statement key (= <STMTKEY>) that contains the SQL statement text of the SELECT statement. The fact that the correct <STMTKEY> appears proves that SQL statement text matching succeeded for the SELECT statement, as shown in Listing 39.

Listing 39. Profile Information section in db2exfmt output
Profile Information:
--------------------
OPT_PROF: (Optimization Profile Name)
	DEMO.PROFILE1
STMTPROF: (Statement Profile Name)
	REOPT_QUERY_1

The access plan tree confirms that selective reoptimization using the optimization profile works. The optimal index for execution of the SELECT statement (DEMO.IDX2) is chosen, as shown in Listing 40.

Listing 40. Access plan showing that the optimal index is chosen with the optimization profile
Access Plan:
-----------
	Total Cost: 		15,139
	Query Degree:		1

            Rows 
           RETURN
           (   1)
            Cost 
             I/O 
             |
            0,01 
           FETCH 
           (   2)
           15,139 
              2 
         /---+----\
        1          10000 
     IXSCAN   TABLE: DEMO    
     (   3)        TAB1
     7,57519        Q1
        1 
       |
      10000 
 INDEX: DEMO    
      IDX2
       Q1

Conclusion

The article explains how using parameter markers and literals in SQL statements influences SQL execution times. It advises whether to use parameter markers or literals, depending on the type of the application and data distribution characteristics of the database. It also describes two DB2 for Linux, UNIX, and Windows features to improve SQL execution times, especially when changes to a database application are not possible. You learned how to use the DB2 statement concentrator feature when literals are used improperly. And you learned how to use the DB2 statement reoptimization feature when parameter markers are used improperly.

Resources

Learn

Get products and technologies

  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=645471
ArticleTitle=Improve SQL execution times using the DB2 statement concentrator feature and the DB2 statement reoptimization feature
publish-date=04072011