Before you start
About this tutorial
In this tutorial, learn how to enable static SQL execution for an existing .NET application (this tutorial uses the testconn20.exe utility as the .NET application, which is included in DB2 client and data server installations). You can also follow this tutorial using an application of your choice that accesses DB2 data sources on any platform.
In this tutorial, learn:
- How static SQL execution can provide improved problem determination, performance, and security for your .NET database applications
- How to enable the pureQuery functionality in the IBM Data Server Provider for .NET that the application is using to retrieve data from a DB2 database
- How to configure the IBM Data Server Provider for .NET to capture the application's SQL
- How to configure the captured SQL and bind it to a database package
- How to configure the IBM Data Server Provider for .NET to statically execute the application's SQL
This tutorial is written for DB2 database administrators and application developers who would like to explore how they can optimize their existing .NET applications to improve and stabilize database access performance, to leverage workload management features, simplify problem determination, and improve database access security.
To follow this tutorial, you need:
- A simple .NET-based application of your choice that accesses a DB2 data server using the IBM Data Server Provider for .Net
- A pureQuery-enabled IBM Data Server Provider for .Net
- IBM Data Studio pureQuery Runtime, which is currently only available as a licensed product that is part of the Data Studio portfolio
- The authority to create a database package in the database that is accessed by the .NET application of your choice
Problem determination, performance and security - why pureQuery could matter to you, the DBA
As a database administrator you are sometimes at the mercy of the developers. You've probably seen your fair share of unpredictable SQL performance and had trouble identifying which .NET application has submitted a particular SQL that is causing issues. This is because .NET applications typically run under a single, generic package identifier.
Let's illustrate with an example. If you are a DBA for a DB2 for z/OS® system, you may be quite familiar with the following Tivoli® OMEGAMON® XE Performance Expert for DB2 monitoring screen (Figure 1):
Figure 1. Tivoli OMEGAMON. Which application is using package SYSxxx00???
With almost all of the dynamic SQL applications using packages like "SYSxxx00", identifying specific programs is difficult. How would you even know which application to look to first for identifying a problem query? Wouldn't it be nice to be able to correlate SQL with its originating application even if the .NET developer didn't think about this beforehand and coded according to best practices? With client optimization, you, the DBA, can determine specific package names to correlate with specific applications, significantly improving your ability to chase down problems in this environment.
A key reason that many people like static SQL, especially when they are paying for CPU usage, is the potential to reduce CPU consumption. With dynamic SQL, there can be significant overhead for determining an access plan, sending prepare messages back and forth, and so on. While the pureQuery client optimization feature won't turn poorly written SQL into a road runner, it eliminates this overhead and can therefore reduce DB2 CPU usage, sometimes significantly.
To illustrate, IBM has run some lab tests using the IRWW benchmark to measure the benefit of static SQL when compared with dynamic SQL for .NET applications. The chart below (Figure 2) shows that in this particular environment, the lab was able to significantly reduce CPU utilization by using static SQL. They even saw good results when DB2 was finely tuned and was able to attain 100% statement cache hit ratios. For more information about this performance study, see the article "More pureQuery Performance: Now with .NET applications accessing DB2 for z/OS" (see Resources).
Figure 2. .NET-based IRWW benchmark average CPU consumption per transaction (lower is better)
In addition, because the access plan is determined ahead of time (at bind time), your applications are much more likely to yield consistent, stable performance.
Finally, from a security point of view, static execution of SQL provides greater control over the SQL that is executed and can help reduce the risk of SQL injection, which is a known security risk for dynamic SQL. This is because with static packages, you only need to authorize users to execute the package, which means they are limited to only executing SQL that's in the package. With dynamic SQL, you need to provide access to the underlying data object.
If you're interested in learning more about how pureQuery client optimization can help you transform your existing .NET applications to use static SQL (without modifying any source code!), then read on.
Enable an existing application for pureQuery using client optimization
A key benefit of pureQuery is the ability for database applications to take advantage of static SQL when they may not have been able to before. As mentioned before, static SQL (as defined by DB2) is the ability to precompile an application's SQL before the application is executed for the first time. Having this information can provide insights that may improve performance, problem determination, and security.
Support for static SQL in new application development has been around for a long time for traditional programming languages (for example, COBOL or C) and more recently for Java applications (for example, through Java's SQLJ language extension and the Java pureQuery API). Existing ODBC- and CLI-based applications that were implemented without the programming languages static SQL support can leverage static SQL through DB2's CLI/ODBC Static Profiling feature. With the introduction of the pureQuery client optimization feature, existing .NET applications can now also reap the benefits of static SQL.
The pureQuery client optimization feature was designed with those existing applications in mind, for which a re-implementation of the data access layer (to take advantage of static SQL) is not a viable option. The process of enabling an existing application for pureQuery (or in other words, static SQL) is divided into four phases, which are depicted in Figure 3.
Figure 3. pureQuery client optimization process overview
- Capture: You configure the IBM Data Server Provider for .Net, which is used by .NET applications to access DB2 data servers, to capture the application's successfully executed SQL while that application is running. The captured information is stored in a text file, which is processed in a subsequent phase by a DB2 utility and consumed by the IBM Data Server Provider for .Net.
- Configure: The captured SQL and its associated metadata is enriched with information that enables a DB2 utility to create one or more database packages, which enable the IBM Data Server Provider for .Net to execute SQL statically.
- Bind: Use the
db2caputility to process the configured metadata file and create database packages that correlate to the captured SQL.
- Execute: Configure the IBM Data Server Provider for .Net to statically process SQL you had previously captured, configured, and bound.
This process is an iterative process that is typically performed in a test environment. Once the target application has been validated, you can move it and its associated pureQuery-related artifacts (the configured capture file and the database packages) to the production environment.
Enable pureQuery for the IBM Data Server Provider for .Net
The IBM Data Server Provider for .Net, which is IBM's recommended .NET data provider for DB2 databases, has been extended to provide pureQuery functionality that enables you to selectively dynamically or statically execute an application's SQL without having to modify your existing application. This pureQuery-enabled provider is included in the following products:
- IBM Data Server Driver for ODBC, CLI, and .NET, Version 9.5.3 (or later)
- IBM Data Server Runtime Client, Version 9.5.3 (or later)
- IBM Data Server Client, Version 9.5.3 (or later)
- DB2 for Linux®, UNIX®, and Windows®, Version 9.5, Fix pack 3 (or later)
The provider's pureQuery functionality is a separately licensed feature and is disabled by default. To unlock the functionality, you must acquire an IBM Data Studio pureQuery Runtime license and apply it. This section of the tutorial walks you through the process of installing the pureQuery Runtime and shows you how to apply the pureQuery license.
Install the pureQuery Runtime
To install your licensed pureQuery Runtime on the machine where your .NET application resides, download it from Passport Advantage and follow the simple instructions that are well documented in the pureQuery Runtime installation section of the Integrated Data Management Information Center. (See Resources for a link to the Information Center to access the instructions.)
Activate the IBM Data Server Provider's pureQuery functionality
The pureQuery license activation steps vary depending on which of the above products you have installed.
If you installed the IBM Data Server Driver for ODBC, CLI, and .NET, copy the license file dspq_rt.lic, which you can find in the (default pureQuery Runtime installation) directory C:\Program Files\IBM\Data Studio pureQuery Runtime\pureQuery\, to the driver's licence directory. This .NET provider license directory is located by default in C:\Program Files\IBM\IBM DATA SERVER DRIVER\license\.
If you installed the .NET driver as part of another DB2 client or as part of a data server, use the DB2 License Center or run the DB2 license management command, and specify the full path and license file name as a command line parameter:
db2licm -a "C:\Program Files\IBM\Data Studio pureQuery Runtime\pureQuery\dspq_rt.lic"
Once you've activated the pureQuery license, you are ready to step through the process that enables your existing application for pureQuery.
Select a .NET application for this tutorial
The pureQuery functionality of the IBM Data Server Provider for .Net is configured by adding custom properties to the application's database connection string. To keep it simple, this tutorial uses the testconn20.exe utility as the .NET application, which is included in DB2 client and data server installations (in the C:\Program Files\IBM\SQLLIB\BIN\ directory). This utility is commonly used to validate that the IBM Data Server Provider for .Net is configured correctly. It takes as an argument a database connection string, which can be easily manipulated, as shown in Figure 4. The utility only accesses catalog tables and can therefore be used with any database. For illustrative purposes, this tutorial uses the GSDB database, which is commonly used as a data source in Data Studio tutorials.
Figure 4. Example program output
If you prefer, you can follow this tutorial using an application that you've developed and that you are familiar with as long as you can manipulate its database connection string.
Capture the application's SQL
Traditionally, if you've developed an application that leverages static SQL, you had to embed special programming language constructs into the source code that define the SQL operations. These language constructs had to be processed by a precompiler that generated additional source code and SQL metadata information. This SQL metadata was used to generate a database package (the process is called bind), which contains the SQL access plan that is used to execute the SQL when the application is running. The generated source code contains the logic to execute the SQL that was embedded in the database package.
While pureQuery client optimization's ability to statically process a .NET application's SQL is based on the same fundamental concept, it is taking a slightly different approach that does not require any source code analysis and does not generate any additional source code. The IBM Data Server Provider for .Net can be configured to execute in a special mode that captures the SQL that an application is submitting. This captured SQL and its associated metadata serve two purposes:
- (By the bind utility) to create one or more database packages that contain the SQL access plans
- (By the IBM Data Server Provider at runtime) to determine whether a compiled version of an SQL statement is present in a database package, which enables it to run this SQL statically
Configure the IBM Data Server Provider for .Net to capture SQL
The IBM Data Server Provider for .Net evaluates runtime configuration
settings to determine whether special pureQuery processing is
required. If set to
captureMode property instructs the data
provider to capture information about each SQL statement that is
successfully executed in a metadata file, which is identified by the
value of the
Let's take a look at an example and assume that your application connects to the GSDB database. Your regular connection string would look like the following if the application is accessing a local database and doesn't provide alternate user id and password:
To enable the Data Server Provider's SQL capture mode and store the
captured SQL in a file, you'd append the
pureQueryXml properties to this connection
string, and assign the appropriate values to them:
To associate the captured metadata with a unique package identifier,
you have to define two additional connection string properties while
capturing an application's SQL:
collection property identifies the
collection in which the package resides, whereas the
rootPkgName property defines the base
database package name that the bind utility will use to generate one
or, if necessary more, package names. Both properties are considered
captureMode is set to
While running in SQL capture mode, the provider can optionally
generate a trace file that provides basic information about the
capture activity. The trace information identifies, among other
things, how many new (or, in other words, previously not captured in
this pureQueryXml file) SQL statements have been recorded. To activate
tracing, specify the
traceFile property and
assign a file name as a value.
A typical database connection string for the pureQuery capture activity looks like this (a line feed was added to improve readability):
Now that you've explored how to configure the capture mode for the Data Server Provider, you are ready to run the application and capture its SQL. If you are using the testconn20.exe application while you are working on this tutorial, you will specify the database connection properties when you run the application. If you are using your own application, please alter its current database connection string and append the properties discussed in this tutorial.
Run the example application to capture its SQL
The pureQuery capture operation is completely transparent to the application. The Data Server Provider is processing the application's SQL as it would do if it is not configured to exploit the pureQuery technology. The only indication that you have that a capture activity is in progress is the growing capture file size and the trace file information.
If you are using the testconn20.exe application, follow the instructions below to run it. If you are using your own application, run it as you would normally do.
- Open a command prompt window and change to the C:\Program Files\IBM\SQLLIB\BIN directory.
- Enter the following command to run the application (a line feed
was added to improve readability):
testconn20.exe "database=GSDB;captureMode=ON;pureQueryXml=c:\myapp.xml; collection=NULLID;rootPkgName=TOOT;traceFile=c:\pdqcapturetrace.txt"
Figure 5. Example program output while capturing SQL
If the application returns database connection error SQL8029N, the Data Server Provider was unable to verify the pureQuery license. If another connection error occurs, verify that the pureQuery property names and values are correct.
Browse (and configure) the captured SQL
Let's take a look at the trace file information before browsing the SQL capture file.
- Open the trace file that you've specified as the
traceFileproperty value using an editor of your choice. Note that it displays information about how many SQL statements were executed while the Data Server Provider was running in capture mode.
Figure 6. pureQuery capture trace output
- Close the trace file.
- Open the file that you've specified as the
pureQueryXmlproperty value to browse the captured SQL. Note that this file is a plain text file even though the property name implies that this file uses XML markup. You will see in a minute, when this tutorial discusses how to bind this file using the DB2 tooling, why the capture file currently does not use XML.
Figure 7. pureQuery capture file
The file consists of multiple sections, each one containing either database package configuration information, bind options, or SQL statement information.
The [COMMON] section associates this capture file with one or more consistency tokens, which are used internally to determine whether a certain database package that is present in a database matches the current SQL capture file. Since you have not bound this SQL file to a database package yet, no consistency token has been assigned. The
db2caputility will update this section as necessary when it creates the database package.
The [BINDOPTIONS] section defines, as the name implies, the database package characteristics. The
bindutility uses this information to create the database package that the Data Server Provider will use to statically execute SQL. You can customize these bind options as necessary prior to running the bind operation, which this tutorial discusses in the next section.
The [STATEMENTx] sections describe the SQL that was captured. One such section is created for each captured SQL. Each section contains information about the SQL text, the cursor characteristics, as well as input and output parameter information. The
bindutility uses this information when it creates a database package, and the Data Server Provider uses it to determine whether it can statically execute an SQL statement. You should not modify these sections.
- Close the pureQuery capture file.
Bind the application's SQL to a database package
In the current release, the pureQuery for .NET technology leverages the
db2cap utility, which was introduced as
part of DB2's CLI/ODBC Static Profiling feature, to bind the captured
SQL to one or more database packages. This utility is included in the
- IBM DB2 Connect
- IBM Data Server client
- IBM Data Server Runtime Client
- DB2 for Linux, UNIX, and Windows
db2cap utility accepts the following
db2cap [-h | -?] bind capture-file -d db-alias [-u userid [-p password]]
capture-file identifies the pureQuery SQL
db-alias identifies the
target database where the
creates the database package based on the information that is stored
capture-file. Note that in order to bind
a package to a remote database, that database has to be catalogued
db2cap uses the
password credentials to connect to the
target database and perform the bind operation. The bind operation
will fail unless the specified user holds the following privileges:
- Access privileges to any database objects referenced by SQL statements recorded in the capture file
- Sufficient authority to set bind options such as OWNER and
QUALIFIER if they are different from
- BINDADD authority if the package is being bound for the first time; otherwise, BIND authority
Bind the captured SQL to a database package
To bind the captured SQL to a database package:
- Open a DB2 command window.
- Invoke the
db2caputility and specify the capture file name as well as the target database name.
db2cap bind c:\myapp.xml -d GSDB
The bind operation should complete successfully.
db2capbind utility output
db2capprocesses the capture file, it updates it. Take a look at what has changed. Open the file that you've specified as the
pureQueryXmlproperty value. A consistency token has been added, uniquely identifying the version of the package. Each SQL has been assigned a section number that enables the runtime to statically execute the SQL.
Figure 9. Updated pureQuery SQL capture file
Now that you've created a database package you can verify that it contains the SQL that you've captured.
Verify the database package
Run the following query against the GSDB database to retrieve a list of statements that are in the package you've just created.
SELECT PKGSCHEMA,PKGNAME,TEXT FROM SYSCAT.STATEMENTS WHERE PKGSCHEMA = 'NULLID' AND PKGNAME LIKE 'TOOT%'
If you are using the testconn20.exe application, this query should return two rows:
NULLID TOOT1 DECLARE SQLCURCAPCS0 CURSOR WITH HOLD FOR SELECT * FROM SYSIBM.SYSTABLES FETCH FIRST 5 rows only NULLID TOOT1 CALL SYSIBM.SQLTABLES(:H00002 :H00001,:H00003 :H00001,:H00004 :H00001,:H00005 :H00001,:H00006 :H00001)
You've now successfully created the database package that the IBM Data Server Provider will use to statically execute these statements.
Statically run the .NET application's SQL
Configure the IBM Data Server Provider for .Net to statically execute SQL
Once the captured metadata has been configured and bound to a database
package, you can configure the IBM Data Server Provider to statically
execute SQL by setting the
database connection property to
DYNAMIC is the default if this property is
not defined). To determine how an SQL statement can be processed
statically, the provider retrieves, upon startup, the content of the
metadata file (which is identified by the
pureQueryXml property), and matches
incoming SQL requests with the SQL that was previously captured and
mapped to a database package. If the lookup and mapping is successful,
the provider statically executes the SQL. If the lookup fails, the SQL
will be executed dynamically.
If you'd like to determine whether a particular SQL is executed
dynamically or statically, define the connection property
traceFile and assign a log file name to it.
If logging is enabled, the provider will log an entry for each SQL
that it was unable to map to the database package. The following code
shows a typical connection string definition that you would use during
testing to validate that the application behaves as expected when
static SQL execution is enabled:
Run the example application's SQL statically
- Open a command prompt window, and change to the C:\Program Files\IBM\SQLLIB\BIN directory.
- Enter the following to run the application (a line feed was added
to improve readability):
testconn20.exe "database=GSDB;executionMode=STATIC;pureQueryXml=c:\myapp.xml; traceFile=c:\pdqtrace.txt"
The example application (or the program you are using) should generate the same output as before. Let's verify that all SQL was executed statically.
- Open the log file that you've specified as the
traceFileproperty value. If you've executed the application the same way you did when you captured its SQL, you should see an output similar to the one shown in Figure 10:
Figure 10. pureQuery static SQL execution log file
- Let's try a scenario where you haven't captured all of the
application's SQL. To simulate this scenario, remove one of the
captured SQL statements from the capture file and re-run the
application. If you are using your own application, skip the steps
below and simply run a part of the application for which you
haven't yet captured any SQL.
- Open the file that you've specified as the
pureQueryXmlproperty value to edit the captured SQL.
- Locate one of the [STATEMENTx] sections, as shown in
Figure 11. Remove SQL statement from pureQuery capture file
- Delete all associated SQL metadata entries for this
section. Since you are not invoking the
db2capbind utility, the capture file and the database package are now out of sync. Even though the SQL is present in the database package, the Data Provider is unable to map the incoming statement because it is not present in the metadata file.
Figure 12. pureQuery capture file after SQL metadata has been removed
- Save the modified capture file.
- Open the file that you've specified as the
- Run the application again, like you did in step 2 above, and open
the log file. A warning is displayed, indicating that one or more
SQL statements that you've removed (or didn't originally capture)
could not be executed statically.
Figure 13. Unmatched SQL is displayed in the pureQuery log file
In a typical scenario where the pureQuery technology is used to statically run SQL, you'd now re-run the capture process to iteratively capture SQL that was previously not recorded. This approach is common during development cycles when new code is added to the application.
You have successfully completed the process of enabling an existing .NET application for static SQL execution — without having to change a line of code. Consider using this technology if you are interested in improving quality of service for your existing applications by improving and stabilizing performance, and increasing accuracy of capacity planning. You may be able to get more value from your existing investments because of increased system throughput and simplified problem resolution.
The pureQuery technology benefits that were discussed in this tutorial are not limited to .NET applications. The corresponding pureQuery Java technology provides even more benefits that go beyond the benefits of static SQL to further enhance performance, problem determination support, and database security.
The author would like to thank Kathy Zeidenstein, Brent Gross, Paul Wirth, and Reece Laurie for their contributions to this tutorial.
- IBM Data Studio pureQuery Runtime: Learn more about Data Studio pureQuery Runtime.
- Integrated Data Management Information Center: Installing Data Studio pureQuery Runtime: Find more instructions for installing and uninstalling Data Studio pureQuery Runtime.
- "More pureQuery Performance: Now with .NET applications accessing DB2 for z/OS" (IBM Database Magazine, March 2009): Get a brief overview of pureQuery use with the IBM Data Server Provider for .NET and the results of a performance study comparing the benefits of pureQuery client optimization for .NET applications.
- developerWorks Information Management zone: Learn more about Information Management. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
- Technology bookstore: Browse for books on these and other technical topics.
Get products and technologies
- Build your next development project with IBM trial software, available for download directly from developerWorks.
- Participate in the discussion forum.
- Data Studio Community Space: Get the most current information regarding Data Studio.
- Data Studio team blog on developerWorks: Find advice, hints and tips, and lessons learned from a team of Data Studio experts and their experiences working with customers around the world
- Participate in developerWorks blogs and get involved in the developerWorks community.