Database APIs such as JDBC and ODBC/CLI offer database developers the choice to use SQL statements with or without parameter markers (also called host variables). In many cases, one of these options is chosen without considering the implications. This can cause unexpected database performance issues when the database application is subsequently executed. For users of packaged software, such problems are even worse because changes to the application are not possible. This article explains in which cases you should work with parameter markers and in which cases using literals would be better. The article then presents two DB2 LUW features that help to resolve performance issues caused by inadequate use of literals and parameter markers respectively: the DB2 statement concentrator feature and the DB2 statement reoptimization feature.
Understanding SQL execution phases
Execution of an SQL statement is divided into three phases:
- Prepare
- In the prepare phase, the DB2 optimizer creates the access plan to execute the SQL statement.
- Execute
- In the execute phase, DB2 determines the rows that make up
the result set (
SELECT), inserts new rows (INSERT), deletes rows (DELETE), or updates existing rows (UPDATE). The access plan created in the prepare phase determines the database internal operations and the order in which the operations are executed. - Fetch
- In the fetch phase, if the SQL statement is a
SELECT, DB2 returns the result set to the application.
The execution
time for an SQL statement is the time required to go through all three
phases.
DB2 LUW provides the db2batch utility to
determine the times that an SQL statement spends in each of the three
phases (prepare, execute, and fetch).
db2batch reads SQL statements from a flat
file, executes the statements, and writes the results to an output
file. The option -i complete tells
db2batch to report the time required for
each phase separately, as shown in Listing 1.
Listing 1. Example db2batch statement
db2batch -d <dbname> -f <sqlfile> -a <user>/<password> -r<outfile> -i complete |
Using parameter markers or literals
This section describes when to use parameter markers and when to use literals.
The case for parameter markers
Parameter markers should be used if all of the following conditions are true for an SQL statement:
- The SQL statement is executed many times.
- The time required for the prepare phase of the SQL statement makes up a significant portion of the statement's overall execution time.
- The best access plan for the SQL statement does not depend on explicit values in the statement's filter conditions.
The DB2 optimizer stores the access plan created in the prepare phase in the SQL statement cache (also called package cache in DB2) for reuse purposes. So when the same SQL statement is executed again, DB2 finds the previously created access plan in the statement cache and reuses the access plan. This saves the prepare phase and thus reduces execution time. Avoiding the prepare phase especially pays off if an SQL statement is executed many times and if the prepare phase makes up a significant portion of the statement's overall execution time.
Consider this example scenario:
- The execution time of an SQL statement is 4 ms. These 4 ms comprise the total time required for the prepare, execute, and fetch phases: Prepare = 1 milliseconds (ms); Execute = 2 ms; Fetch = 1 ms.
- If the statement is executed 100,000 times, the total execution time is 100,000 x 4 ms = 400 seconds = 6 minutes 40 seconds
- If the access plan is reused, the time for the prepare phase is saved and the total execution time is reduced by 25%: 1 x 4 ms + 99,999 x 3 ms = 300,001 seconds = 5 minutes.
Another advantage of access plan reuse is a reduced CPU load. Access plan creation in the prepare phase requires CPU resources. So if an access plan is reused, a database server requires less CPU resources to execute the corresponding SQL statement repeatedly.
Remember that DB2 can only reuse an access plan if SQL statement texts completely match. For example, the two SQL statements in Listing 2 are not identical to DB2, although they differ only slightly in their filter conditions. DB2 would not reuse an access plan in this example.
Listing 2. SQL statements using literals in the filter conditions
SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = '000040' SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = '000120' |
To enable access plan reuse, rewrite the SQL statement using a parameter marker, as shown in Listing 3.
Listing 3. SQL statement using a parameter marker in the filter condition
SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = ? |
All major database interfaces offer the capability of executing SQL
statements that contain parameter markers. JDBC, for example, provides the
class java.sql.PreparedStatement for this
purpose, as shown in Listing 4.
Listing 4. JDBC java.sql.PreparedStatement code sample
java.sql.Connection con; java.sql.PreparedStatement prepstmt; java.sql.ResultSet rs; String query; // initialize the database connection con = ... // prepare the SQL statement (implicitly creates a prepared SQL statement object) query = "SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = ?"; prepstmt = con.prepareStatement(query); // bind a value to the parameter marker and execute the SQL statement prepstmt.setString(1, "000040"); rs = prepstmt.executeQuery(); // process the result set... // bind another value to the parameter marker and execute the SQL statement again prepstmt.setString(1, "000120"); rs = prepstmt.executeQuery(); // process the result set... |
SQL statements well-suited for parameter marker usage
(because they conform to the three rules mentioned above) are normally
quite simple and therefore are typical for online
transaction processing (OLTP). In most cases, well-suited SQL statements are very simple
SELECT statements that reference only a single
table or are INSERT, UPDATE, and DELETE (UID) operations that are executed many
times. Listing 5 shows the low complexity of such SQL
statements.
Listing 5. Simple SQL statements suited for parameter marker usage
SELECT LASTNAME, FIRSTNME FROM EMPLOYEE WHERE EMPNO = ? INSERT INTO DEPARTMENT (DEPTNO, DEPTNAME, ADMRDEPT) VALUES (?, ?, ?) UPDATE EMPLOYEE SET JOB = ? WHERE WORKDEPT = ? |
As the previous section described, parameter marker usage in SQL statements fosters access plan reuse. Thus time and CPU resources are saved in the prepare phase when SQL statements are executed many times. But using parameter markers is not always better than using literals (explicit values). Always use literals if the optimal access plan for the SQL statement varies according to the literals in the filter conditions. This condition generally applies to complex queries executed by applications used for reporting purposes. Such applications rarely execute the exact same query twice. Therefore, time and CPU savings in the prepare phase is minimal even if parameter markers are used.
Literals can also be a better choice in simple SQL statements, as shown in Listing 6.
Listing 6. Simple SQL statement not suited for parameter marker usage
SELECT SUM(SALES) FROM SALES WHERE SALES_DATE BETWEEN ? AND ? |
For the example in Listing 6, assume the following:
- All rows in table
SALEShave aSALES_DATEbetween 2010-01-01 and 2010-12-31. - For each day there are roughly 1,000 rows, yielding a total of 365,000 rows for the whole year 2010.
- There is an index on column
SALES_DATE.
Depending on the values inserted for the parameter markers following the
prepare phase, data access using the SALES_DATE
index or a complete table scan makes more sense. For example, if only a
single day should be evaluated (roughly 1,000 rows), accessing the
rows using the SALES_DATE index is
preferable. However, if the whole year 2010 should be evaluated (all
365,000 rows), accessing the rows using the index causes
processing overhead, because the complete table has to be read anyway. Accessing
the rows using a table scan is a better choice in this case. Because the
DB2 optimizer does not know the values of the parameter markers in the
prepare phase, a one-size-fits-all access plan is generated that cannot be
optimal in all cases. For this example query, it would be better to use
literals instead of parameter markers and to accept the repeated prepare
phase, as shown in Listing 7.
Listing 7. Better version of the SQL statement above using literals instead of parameter markers
SELECT COUNT(SALES) FROM SALES WHERE SALES_DATE BETWEEN '2010-05-26' AND '2010-05-26' SELECT COUNT(SALES) FROM SALES WHERE SALES_DATE BETWEEN '2010-01-01' AND '2010-12-31' |
In a JDBC application, use java.sql.Statement
instead of java.sql.PreparedStatement to execute such
queries, as shown in Listing 8.
Listing 8. JDBC java.sql.Statement code sample
java.sql.Connection con;
java.sql.Statement stmt;
java.sql.ResultSet rs;
String query;
// initialize the database connection
con = ...
// create an SQL statement object
stmt = con.createStatement();
// execute the SQL statement with literals
query = "SELECT COUNT(SALES) FROM SALES " +
"WHERE SALES_DATE BETWEEN '2010-05-26' AND '2010-05-26'";
rs = stmt.executeQuery(query);
// process the result set...
// execute the same SQL statement but with different literals
query = "SELECT COUNT(SALES) FROM SALES " +
"WHERE SALES_DATE BETWEEN '2010-01-01' AND '2010-12-31'";
rs = stmt.executeQuery(query);
// process the result set...
|
Using java.sql.PreparedStatement usually makes the source
code more readable compared to using
java.sql.Statement that might require ugly
string concatenations. However, if literals are important to get optimal access plans, the shorter execution times of the
SQL statements are more important than the elegancy of the application's
source code.
When you understand the advantages and disadvantages of parameter markers and literals, you can choose wisely between the two options during application development to achieve optimal performance for execution of SQL statements. But most often SQL performance issues arise in situations in which you cannot influence how an application executes SQL statements, as in any of the following situations:
- Packaged software
- When software is purchased as a product, the software developer decides how the application accesses the database. Changes to the implementation are not possible.
- Persistence frameworks
- When a third-party persistence framework is used for custom application development, it is not possible to influence the way in which the persistence layer generates and executes SQL statements.
- Legacy applications
- In case of custom legacy applications, changes to the implementation are possible. But often such changes can cost too much time and money.
When one of these situations applies, database performance can be bad because of improper use of parameter markers and literals. DB2 offers the following two features to improve performance on the database side:
- DB2 Statement Concentrator
- The DB2 Statement Concentrator feature enables access plan reuse when an application improperly uses literals instead of parameter markers.
- DB2 Statement Reoptimization
- The DB2 Statement Reoptimization feature enables reoptimization of access plans after the prepare phase when an application improperly uses parameter markers instead of literals.
Forcing access plan reuse using the DB2 statement concentrator feature
This section describes an example of how to use the DB2 statement concentrator feature.
Table 1 describes the contents of the example table
DEMO.TAB1, which consists of 10,000 rows and 3
columns.
Table 1. Contents of sample table DEMO.TAB1
| ID | COL1 | COL2 |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 1 |
| 4 | 4 | 1 |
| 5 | 5 | 1 |
| ... | ... | ... |
| 99 | 99 | 1 |
| 100 | 0 | 1 |
| 101 | 1 | 1 |
| ... | ... | ... |
| 199 | 99 | 1 |
| 200 | 0 | 1 |
| 201 | 1 | 1 |
| ... | ... | ... |
| 10000 | 0 | 1 |
Where:
IDis a continuous ID from 1 to 10,000.COL1contains the result of the operation.ID MOD 100is the remainder ofIDdivided by 100.COL2contains the value 1 for all rows at the beginning.
In addition to a primary key index on column ID, two
more indexes are defined: an index DEMO.IDX1
on column COL1 and an index
DEMO.IDX2 on column
COL2.
The DB2 CLP script in Listing 9 creates and fills the table
DEMO.TAB1 (together with its indexes) in the
DB2 sample database.
Listing 9. DB2 CLP script to create table DEMO.TAB1 and its indexes in the SAMPLE database
!db2sampl;
CONNECT TO SAMPLE;
CREATE TABLE DEMO.TAB1
(
ID INT NOT NULL PRIMARY KEY,
COL1 INT,
COL2 INT
);
INSERT INTO DEMO.TAB1 (ID, COL1, COL2)
WITH TMP (ID, COL1, COL2) AS
(
VALUES (1, 1, 1)
UNION ALL
SELECT (ID + 1), MOD((ID + 1), 100), 1
FROM TMP
WHERE ID < 10000
)
SELECT ID, COL1, COL2 FROM TMP;
COMMIT;
CREATE INDEX DEMO.IDX1 ON DEMO.TAB1 (COL1 ASC);
CREATE INDEX DEMO.IDX2 ON DEMO.TAB1 (COL2 ASC);
RUNSTATS ON TABLE DEMO.TAB1 AND INDEXES ALL;
CONNECT RESET;
|
Assume an application repeatedly reads the IDs of all rows
containing the same value in COL1. The
corresponding SELECT statements use literals in
the filter conditions, as shown in Listing 10.
Listing 10. SELECT statements accessing table DEMO.TAB1 using literals in the filter condition
SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42; SELECT ID FROM DEMO.TAB1 WHERE COL1 = 73; |
As expected, DB2 cannot reuse the access plan generated for the first
statement (filter condition COL1 = 42) when
executing the second statement (COL1 = 73).
These values are generated by the literals in the filter conditions, and DB2
considers the two statements to be different. A snapshot of the DB2
statement cache confirms that each statement has its own entry in
the statement cache and each entry shows only 1 execution
(Number of executions = 1) and 1 compilation
(Number of compilations = 1), as shown in Listing 11.
Listing 11. Dynamic SQL snapshot for SELECT statements with literals
GET SNAPSHOT FOR DYNAMIC SQL ON SAMPLE;
Dynamic SQL Snapshot Result
Database name = SAMPLE
Database path = C:\DB2\NODE0000\SQL00001\
Number of executions = 1
Number of compilations = 1
...
Statement text = SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42
Number of executions = 1
Number of compilations = 1
...
Statement text = SELECT ID FROM DEMO.TAB1 WHERE COL1 = 73
|
Because the values in COL1 are equally
distributed, parameter marker usage would have been the right choice because in this case each of these queries returns exactly 100 rows.
Because changes to the example application are not possible (for any of
the reasons mentioned above), performance of the SQL statements can be
improved by using the DB2 statement concentrator feature. The statement
concentrator can be activated at the database level by setting the
DB CFG parameter
STMT_CONC to the value
LITERALS, as shown in Listing 12.
Listing 12. Activating the DB2 statement concentrator in the database configuration
UPDATE DB CFG USING STMT_CONC LITERALS IMMEDIATE; |
After activating the statement concentrator, the two
SELECT statements are executed again. The
resulting contents of the DB2 statement cache are shown in Listing 13.
Listing 13. Dynamic SQL snapshot for SELECT statements but with DB2 statement concentrator activated
SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42;
SELECT ID FROM DEMO.TAB1 WHERE COL1 = 73;
GET SNAPSHOT FOR DYNAMIC SQL ON SAMPLE;
Dynamic SQL Snapshot Result
Database name = SAMPLE
Database path = C:\DB2\NODE0000\SQL00001\
Number of executions = 2
Number of compilations = 1
...
Statement text = SELECT ID FROM DEMO.TAB1 WHERE COL1 = :L0
Number of executions = 0
Number of compilations = 0
...
Statement text = SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42
Number of executions = 0
Number of compilations = 0
...
Statement text = SELECT ID FROM DEMO.TAB1 WHERE COL1 = 73
|
Both SELECT statements still have their own
entry in the statement cache. But the number of
executions and compilations
(Number of executions/compilations = 0)
indicates that the SELECT statements were not executed
in their original representation with literals. Instead the DB2 statement
concentrator automatically generated a more generic representation of the
SELECT statement by replacing the literals with
a parameter marker:
SELECT ID FROM DEMO.TAB1 WHERE COL1 = :L0. The
corresponding entry in the statement cache shows that this generic
SELECT statement was executed 2 times
(Number of executions = 2) and compiled only 1
time (Number of compilations = 1).
By activating the statement concentrator, the repeated prepare phase for this
kind of SELECT statement could be saved.
The access plan information, which is generated by entering the statement
in Listing 14 and then formatted using the db2exfmt utility, also proves that DB2 rewrote the original SELECT statement.
Listing 14. Generating the access plan with DB2 statement concentrator activated
EXPLAIN ALL FOR SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42; |
In addition to the usual sections Original Statement and
Optimized Statement, the db2exfmt representation of the access plan contains an additional
section Effective Statement that reflects the
substitution of the literal with a parameter marker, as shown in Listing 15.
Listing 15. Excerpt from the access plan showing the original and the effective statement
Original Statement: ------------------ SELECT ID FROM DEMO.TAB1 WHERE COL1 = 42 Effective Statement: ------------------- SELECT ID FROM DEMO.TAB1 WHERE COL1 = :L0 Optimized Statement: ------------------- SELECT Q1.ID AS "ID" FROM DEMO.TAB1 AS Q1 WHERE (Q1.COL1 = :L0) |
The DB2 statement concentrator can improve database performance if an application executes many similar SQL statements with literals whose access plans DB2 would normally reuse if parameter markers were used instead.
To analyze the problem of improper usage of parameter markers in the next section, deactivate the statement concentrator, as shown in Listing 16.
Listing 16. Deactivating the DB2 statement concentrator in the database configuration
UPDATE DB CFG USING STMT_CONC OFF IMMEDIATE; |
Getting an optimal access plan using DB2 statement reoptimization
To determine optimal access plans for certain kinds of SQL statements, the
DB2 optimizer needs a very accurate information about the filter factors
associated with the conditions specified in the
WHERE clauses. In such cases, the use of
literals instead of parameter markers is vital for good SQL performance.
This is especially true if data in certain columns of a table is not
equally distributed (see the article about distribution statistics in the
Resources section). To generate an uneven
distribution of values in column COL2 of the
sample table DEMO.TAB1, execute the DB2 CLP
script in Listing 17.
Listing 17. DB2 CLP script to update table DEMO.TAB1
CONNECT TO SAMPLE; UPDATE DEMO.TAB1 SET COL2 = 0 WHERE ID = 1; RUNSTATS ON TABLE DEMO.TAB1 WITH DISTRIBUTION ON COLUMNS (COL2) AND INDEXES ALL; COMMIT; CONNECT RESET; |
Before the script is executed, each of the 10,000 rows in table
DEMO.TAB1 has a value of 1 in column
COL2. After executing the script, there is
exactly 1 row (the row with ID = 1) that has a
value of 0 in column COL2. The
distribution of values in column COL2 is
extremely uneven. The value 0 occurs only 1 time, while the value 1 occurs
9,999 times. The script also generates new statistics for table
DEMO.TAB1. During statistics generation,
distribution information is gathered for the values in column
COL2. Thus the uneven distribution of values is
reflected in the database's statistics information.
Listing 18 shows a sample SELECT statement
that uses parameter markers in its filter conditions.
Listing 18. Sample SELECT using parameter markers
SELECT ID FROM DEMO.TAB1 WHERE COL1 = ? AND COL2 = ?; |
To analyze how DB2 executes this statement, generate the access plan
information in the explain tables as shown in as shown in Listing 19,
and then use the db2exfmt utility to format the
access plan.
Listing 19. Generating an access plan for sample SELECT with parameter markers
EXPLAIN ALL FOR SELECT ID FROM DEMO.TAB1 WHERE COL1 = ? AND COL2 = ?; |
Listing 20 shows the resulting db2exfmt output.
Listing 20. Access plan for sample SELECT with parameter markers
Access Plan:
-----------
Total Cost: 39,6529
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
50
FETCH
( 2)
39,6529
32,9772
/---+----\
100 10000
RIDSCN TABLE: DEMO
( 3) TAB1
7,63471 Q1
1
|
100
SORT
( 4)
7,63438
1
|
100
IXSCAN
( 5)
7,61811
1
|
10000
INDEX: DEMO
IDX1
Q1
|
The access plan is generated during the prepare phase. At this point,
the DB2 optimizer does not know the values for the parameter markers
in the filter conditions of the SELECT
statement because the application provides the filter values just before
the statement's execute phase (after the prepare phase). The access
plan shows that the DB2 optimizer uses the index on column
COL1 (index
DEMO.IDX1) to access the rows in table
DEMO.TAB1 for the
following reasons:
- Independent of the value that is finally inserted for the parameter
marker, the filter condition
COL1 = ?always returns 100 rows, because the values in columnCOL1are evenly distributed (the DB2 optimizer handles the filter factor accurately). - On the other hand, the filter factor for condition
COL2 = ?varies greatly, depending on the value inserted for the parameter marker, 1 or 0. If 1 is inserted, the filter factor is not good because 9,999 rows satisfy the condition. However, if 0 is inserted, the filter factor is excellent, because only a single row satisfies the condition. - Because parameter markers are used in the filter conditions, DB2 tries to
mitigate the risk of a bad access plan and chooses a conservative data
access pattern by using the index on column
COL1. This access pattern ensures a good execution time independent of the value (1 or 0) that is inserted for the parameter marker in the filter conditionCOL2 = ?.
By examining the access plan for the same SELECT
statement but with literals in the filter conditions, you can see that the
DB2 optimizer uses the exact filter factors to generate a
better access plan for the case COL2 = 0, as
shown in Listing 21.
Listing 21. Generating an access plan for sample SELECT with literals
EXPLAIN ALL FOR SELECT ID FROM DEMO.TAB1 WHERE COL1 = 1 AND COL2 = 0; |
The resulting db2exfmt output is shown in Listing 22.
Listing 22. Access plan for sample SELECT with literals
Access Plan:
-----------
Total Cost: 15,139
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
0,01
FETCH
( 2)
15,139
2
/---+----\
1 10000
IXSCAN TABLE: DEMO
( 3) TAB1
7,57519 Q1
1
|
10000
INDEX: DEMO
IDX2
Q1
|
With literals in the filter conditions, the DB2 optimizer recognizes that
accessing table DEMO.TAB1 using the index on
column COL2 (index
DEMO.IDX2) is more efficient than accessing the
table using the index on column COL1 (index
DEMO.IDX1) that was chosen for the
parameterized query. The lower costs
(Total Cost) of the access plan with literals
confirm the optimized data access: 15 timerons instead of 39 timerons for the
access plan with parameter markers.
If optimal access plans for certain SQL statements
depend on the use of literals in the filter conditions but the application
uses parameter markers instead, DB2 offers the possibility to reoptimize SQL statements.
Reoptimization means that DB2 still generates the access plan in the
prepare phase based on the original SQL statement text containing
parameter markers. But at the point at which the application has
inserted values for the parameter markers, DB2 reassesses the previously
generated access plan and reoptimizes it if necessary before
the access plan executes. In contrast to the DB2 statement
concentrator, statement reoptimization cannot be simply
activated or deactivated by setting a DB CFG
parameter. Reoptimization is controlled at the database
interface level.
To execute SQL statements that JDBC or ODBC/CLI sends, DB2
internally uses the so-called CLI packages. These packages are usually
bound to the database with the BIND
command, as shown in Listing 23.
Listing 23. Binding CLI packages
BIND "C:\PROGRA~1\IBM\SQLLIB\bnd\@db2cli.lst"
BLOCKING ALL GRANT PUBLIC;
|
As the BIND command contains no explicit schema
name, the CLI packages are bound in the default schema
NULLID. By default DB2 uses the CLI packages in
schema NULLID when it receives SQL statements
using JDBC or ODBC/CLI. The CLI packages in schema
NULLID do not use reoptimization for the
execution of SQL statements.
To reoptimize, bind the CLI packages a
second time in another schema named NULLIDRA.
The schema name is specified in the BIND
command using the option COLLECTION, as shown
in Listing 24.
Listing 24. Binding CLI packages with option
REOPT ALWAYS
BIND "C:\PROGRA~1\IBM\SQLLIB\bnd\@db2cli.lst"
BLOCKING ALL COLLECTION NULLIDRA GRANT PUBLIC;
|
NULLIDRA is a reserved schema name in DB2 that
indicates that the CLI packages should be bound with the additional option
REOPT ALWAYS. Because this option is implicitly used
for schema name NULLIDRA, it is not necessary
to explicitly specify REOPT ALWAYS.
Just binding the CLI packages in schema
NULLIDRA is not enough. A JDBC or ODBC/CLI
application must explicitly tell the database that it would like to use
the CLI packages in schema NULLIDRA instead of
the default CLI packages in schema NULLID
(remember that the NULLID and
NULLIDRA CLI packages coexist in the database).
For Java applications, the CLI packages' schema can be specified by
setting the JDBC property currentPackageSet.
The JDBC property can be appended to the database URL or it can be set
within the data source definition for the database. Appending the JDBC
property to the database URL is shown in Listing 25.
Listing 25. JDBC database URL setting the property
currentPackageSet to schema NULLIDRAjdbc:db2://localhost:50000/sample:currentPackageSet=NULLIDRA; |
For the sample query with parameter markers, the Java source code is shown in Listing 26.
Listing 26. Sample Java class using the CLI packages bound with
REOPT ALWAYS
public class ReoptAlwaysSample {
public static void main(String[] args) {
try {
// load driver
Class.forName("com.ibm.db2.jcc.DB2Driver").newInstance();
// set database URL, user, and password
// use REOPT ALWAYS CLI packages
String databaseUrl =
"jdbc:db2://localhost:50000/sample:currentPackageSet=NULLIDRA;";
String user = "userid";
String password = "password";
// get connection
java.sql.Connection con =
java.sql.DriverManager.getConnection(databaseUrl, user, password);
// execute query
String query = "SELECT ID FROM DEMO.TAB1 WHERE COL1 = ? AND COL2 = ?";
java.sql.PreparedStatement prepstmt = con.prepareStatement(query);
prepstmt.setInt(1, 1);
prepstmt.setInt(2, 0);
java.sql.ResultSet rs = prepstmt.executeQuery();
// in the real world, result set evaluation would occur here
// ...
rs.close();
prepstmt.close();
con.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
|
You can identify several places in the access plan for the SELECT
statement in the Java class ReoptAlwaysSample
that indicate that the statement was reoptimized during execution.
The header section of the db2exfmt output shows
that the CLI packages in schema NULLIDRA were
used for statement execution, as shown in Listing 27.
Listing 27.
db2exfmt header showing schema NULLIDRA for CLI package SYSSH200******************** EXPLAIN INSTANCE ******************** DB2_VERSION: 09.07.2 SOURCE_NAME: SYSSH200 SOURCE_SCHEMA: NULLIDRA SOURCE_VERSION: EXPLAIN_TIME: 2011-01-07-18.01.01.921001 EXPLAIN_REQUESTER: FECHNER |
The access plan tree has changed. It shows that the rows in table
DEMO.TAB1 are accessed using index
DEMO.IDX2 on column
COL2. In other words, the access plan tree for the
SELECT statement with parameter markers is now identical
to the optimal access plan tree for the SELECT
statement with literals in the filter conditions, as shown in Listing 28.
Listing 28. Access plan showing that the optimal index is chosen when SQL Statement Reoptimization occurs
Original Statement:
------------------
SELECT ID
FROM DEMO.TAB1
WHERE COL1 = ? AND COL2 = ?
Optimized Statement:
-------------------
SELECT Q1.ID AS "ID"
FROM DEMO.TAB1 AS Q1
WHERE (Q1.COL2 = :?) AND (Q1.COL1 = :?)
Access Plan:
-----------
Total Cost: 15,139
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
0,01
FETCH
( 2)
15,139
2
/---+----\
1 10000
IXSCAN TABLE: DEMO
( 3) TAB1
7,57519 Q1
1
|
10000
INDEX: DEMO
IDX2
Q1
|
The details sections for the index scan operation (3) and the
fetch operation (2) also show that reoptimization occurred for the
SELECT statement. Both details sections contain
the literal values that were inserted for the parameter markers during the
reoptimization process, as shown in Listing 29.
Listing 29. Details section of
db2exfmt output explicitly showing the values used for SQL statement
reoptimization
2) FETCH : (Fetch)
...
Predicates:
----------
3) Sargable Predicate,
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0,01
Predicate Text:
--------------
(Q1.COL1 = :? /* REOPT VALUE=1*/)
3) IXSCAN: (Index Scan)
...
Predicates:
----------
2) Start Key Predicate,
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0,0001
Predicate Text:
--------------
(Q1.COL2 = :? /* REOPT VALUE=0*/)
2) Stop Key Predicate,
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0,0001
Predicate Text:
--------------
(Q1.COL2 = :? /* REOPT VALUE=0*/)
|
Some additional comments concerning this sample:
- CLI packages were bound in schema
NULLIDRA(RA = REOPT ALWAYS), so an SQL statement is reoptimized for each execution. Under certain circumstances it might be sufficient for the SQL statement to be reoptimized for only the first execution. If this is the desired behavior, CLI packages are bound in schemaNULLIDR1(R1 = REOPT ONCE). When schemaNULLIDR1is specified, theBINDcommand implicitly uses the optionREOPT ONCEto bind CLI packages.REOPT ONCEis a trade-off between optimal access plans and reuse of access plans, because in this case reoptimization only takes place for an SQL statement's first execution. For subsequent executions, the access plan is reused independently of the values that are bound to the parameter markers. - Generation of access plan information for the
SELECTstatement in the Java class was achieved by setting the additional JDBC propertycurrentExplainModetoYES, as shown in Listing 30.
Listing 30. JDBC database URL setting the propertiescurrentPackageSetandcurrentExplainModejdbc:db2://localhost:50000/sample:currentPackageSet=NULLIDRA;currentExplainMode=YES;
- The CLI keyword
CurrentPackageSetis the equivalent of the JDBC propertycurrentPackageSet(used for setting the CLI packages' schema) in ODBC/CLI applications. You can setCurrentPackageSetfor the data source (= database) in the CLI configuration file db2cli.ini.
Setting currentPackageSet to NULLIDRA for a
JDBC or ODBC/CLI application activates reoptimization for the execution of
each SQL statement in the application. But in many cases, only a few SQL
statements should be reoptimized, because reoptimizing each statement could
have a negative performance impact. You can control
reoptimization at the statement level using an optimization profile.
Controlling reoptimization using a profile
See Resources for a link to an article that introduces optimization profiles.
This section describes how to control reoptimization for the SQL statement in the example using an optimization profile.
- Create an optimization profile for the
SELECTstatement, as shown in Listing 31. Optimization profiles are XML documents. The sample optimization profile is namedprofile1.xml.
Listing 31. Optimization profile specifyingREOPT ALWAYSfor a certain query<?xml version="1.0" encoding="UTF-8"?> <OPTPROFILE VERSION="9.1.0.0"> <STMTPROFILE ID="REOPT_QUERY_1"> <STMTKEY> <![CDATA[SELECT ID FROM DEMO.TAB1 WHERE COL1 = ? AND COL2 = ?]]> </STMTKEY> <OPTGUIDELINES> <REOPT VALUE="ALWAYS"/> </OPTGUIDELINES> </STMTPROFILE> </OPTPROFILE>
An optimization profile contains DB2 optimizer hints for one or more SQL statements. Each SQL statement has its own section
<STMTPROFILE>in the XML document. The section contains the text of the SQL statement (subsection<STMTKEY>) as well as the optimizer hints (subsection<OPTGUIDELINES>) that the DB2 optimizer should consider when generating the access plan for the statement. The sample optimization profile contains aSELECTstatement and the corresponding hint<REOPT VALUE="ALWAYS"/>. This hint tells the DB2 optimizer to execute thisSELECTstatement using the CLI packages in schemaNULLIDRA, which means that the statement will be reoptimized for each execution. - Set the DB2 registry variable
DB2_OPTPROFILEto a value ofYESto explicitly activate the use of optimization profiles in DB2, as shown in Listing 32.
Listing 32. Activating optimization profile support by setting the DB2 Registry variableDB2_OPTPROFILEdb2set DB2_OPTPROFILE=YES db2stop db2start
- Restart the DB2 instance for DB2 registry changes to take effect.
- Call the stored procedure
SYSPROC.SYSINSTALLOBJECTSto create theSYSTOOLS.OPT_PROFILEtable to store the optimization profile, as shown in Listing 33.
Listing 33. Creating tableSYSTOOLS.OPT_PROFILEfor storing optimization profilesCALL SYSPROC.SYSINSTALLOBJECTS('OPT_PROFILES', 'C', CAST(NULL AS VARCHAR(128)), CAST(NULL AS VARCHAR(128)));
- Use the DB2
IMPORTutility to insert the XML file containing the optimization profile definition (profile1.xml) into tableSYSTOOLS.OPT_PROFILE.- Create an input file for
the
IMPORTutility (such as profile1.del) to reference the XML file and to provide a schema (DEMO) and a name (PROFILE1) for the optimization profile, as shown in Listing 34.
Listing 34. Input file profile1.del for importing the optimization profile"DEMO","PROFILE1","profile1.xml"
- Call the
IMPORTutility for the input file, as shown in Listing 35.
Listing 35. Importing the optimization profile in tableSYSTOOLS.OPT_PROFILEIMPORT FROM profile1.del OF DEL MODIFIED BY LOBSINFILE INSERT INTO SYSTOOLS.OPT_PROFILE;
- Create an input file for
the
- Flush the DB2 optimization profile cache to ensure that changes
to optimization profiles will take effect, as shown in Listing 36.
Listing 36. Flushing the DB2 optimization profile cacheFLUSH OPTIMIZATION PROFILE CACHE ALL;
Because the sample Java application should no longer use reoptimization for all
SQL statements, the property currentPackageSet
is removed. It is replaced with the property optimizationProfile, which defines the fully qualified name
of the optimization profile (= DEMO.PROFILE1) that should be used for SQL statements that the application
executes, as shown in Listing 37. Besides replacing one Java
property with another, no additional changes to the source code are
required.
Listing 37. Sample Java class using the optimization profile
public class ReoptAlwaysSample {
public static void main(String[] args) {
try {
// load driver
Class.forName("com.ibm.db2.jcc.DB2Driver").newInstance();
// set database URL, user, and password
// specify optimization profile to use
String databaseUrl = "jdbc:db2://localhost:50000/sample"
+ ":optimizationProfile=DEMO.PROFILE1;";
...
} catch (Exception e) {
e.printStackTrace();
}
}
}
|
The header of the db2exfmt output shows that
the application now uses the default CLI packages in schema
NULLID again (without reoptimization), as shown
in Listing 38.
Listing 38.
db2exfmt header showing default schema NULLID for CLI package SYSSH200******************** EXPLAIN INSTANCE ******************** DB2_VERSION: 09.07.2 SOURCE_NAME: SYSSH200 SOURCE_SCHEMA: NULLID SOURCE_VERSION: EXPLAIN_TIME: 2011-01-21-19.10.55.500001 EXPLAIN_REQUESTER: FECHNER |
Note that there is a new section Profile information.
This section shows the name of the optimization profile that the
application uses, as well as the statement key (=
<STMTKEY>) that contains the SQL
statement text of the SELECT statement. The fact that the
correct <STMTKEY> appears proves that
SQL statement text matching succeeded for the
SELECT statement, as shown in Listing 39.
Listing 39. Profile Information section in
db2exfmt outputProfile Information: -------------------- OPT_PROF: (Optimization Profile Name) DEMO.PROFILE1 STMTPROF: (Statement Profile Name) REOPT_QUERY_1 |
The access plan tree confirms that selective reoptimization using
the optimization profile works. The
optimal index for execution of the SELECT
statement (DEMO.IDX2) is chosen, as shown in Listing 40.
Listing 40. Access plan showing that the optimal index is chosen with the optimization profile
Access Plan:
-----------
Total Cost: 15,139
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
0,01
FETCH
( 2)
15,139
2
/---+----\
1 10000
IXSCAN TABLE: DEMO
( 3) TAB1
7,57519 Q1
1
|
10000
INDEX: DEMO
IDX2
Q1
|
The article explains how using parameter markers and literals in SQL statements influences SQL execution times. It advises whether to use parameter markers or literals, depending on the type of the application and data distribution characteristics of the database. It also describes two DB2 for Linux, UNIX, and Windows features to improve SQL execution times, especially when changes to a database application are not possible. You learned how to use the DB2 statement concentrator feature when literals are used improperly. And you learned how to use the DB2 statement reoptimization feature when parameter markers are used improperly.
Learn
- Get detailed advice on how to use
statement reoptimization for different database interfaces in Improve DB2 for Linux, UNIX, and Windows query performance using the
REOPT bind option. DB2 LUW supports statement reoptimization
not only for dynamic SQL as used by JDBC or ODBC/CLI, but also for static
SQL as used by SQLJ, embedded SQL, or stored procedures.
- Read Influence query optimization with optimization profiles and
statistical views in DB2 9 for an introduction to DB2 LUW optimization
profiles.
- Refer to Distribution statistics uses with the DB2 optimizer to understand how
distribution statistics influence DB2 LUW access plans.
- Check out the DB2 9.7 Information Center for Linux, UNIX, and
Windows for the complete DB2 9.7 LUW product
documentation online in HTML format.
- Visit the DB2 9 for Linux, UNIX, and Windows support site to
search for APARs, to download fixpacks, to get DB2 LUW documentation in PDF
format, and more.
- Read Best practices for DB2 for Linux, UNIX, and Windows for practical guidance for the most common DB2
9 product configurations. By applying these recommendations, you can
improve the value of your DB2 data servers and align yourself with the IBM
technical direction for DB2. These papers are authored by leading experts
on IBM development and consulting teams, and the papers have been extensively
tested:
- Learn more about Information Management at the developerWorks Information Management
zone. Find technical documentation,
how-to articles, education, downloads, product information, and
more.
- Stay current with
developerWorks technical events and webcasts.
- Follow developerWorks on
Twitter.
Get products and technologies
- Build your next
development project with
IBM trial software,
available for download directly from developerWorks.
Discuss
- Check out the
developerWorks
blogs and get involved in the
developerWorks community.

Dirk Fechner works as an IT-Specialist for IBM Software Group Services His area of expertise is design, implementation, and administration of DB2 databases on distributed platforms as well as database application development. He has eleven years of experience with DB2 for Linux, UNIX, and Windows. Dirk is an IBM Certified Advanced Database Administrator and an IBM Certified Application Developer.




