Optimize your existing .NET applications using IBM Data Studio's pureQuery

Improve performance and security for existing .NET applications that access DB2

IBM® Data Studio pureQuery Runtime 2.1 includes a new feature, called client optimization, that enables database administrators and developers to take advantage of the performance and security benefits of static SQL execution against IBM DB2® databases without having to modify their existing custom-developed or packaged .NET applications. In this tutorial, learn how to enable this capability for an existing .NET application.

Patrick Titzler (ptitzler@us.ibm.com), Enablement Architect, IBM

Patrick TitzlerPatrick Titzler is an engineer in the Data Studio Enablement team at IBM's Silicon Valley Lab in San Jose, CA, and focuses on the application development components of the Data Studio portfolio. Prior to joining this team, Patrick was a member of Data Warehousing and OLAP application development teams. Patrick holds a Master's degree in Computer Science from University of Rostock, Germany.



26 March 2009

Before you start

About this tutorial

In this tutorial, learn how to enable static SQL execution for an existing .NET application (this tutorial uses the testconn20.exe utility as the .NET application, which is included in DB2 client and data server installations). You can also follow this tutorial using an application of your choice that accesses DB2 data sources on any platform.

Objectives

In this tutorial, learn:

  • How static SQL execution can provide improved problem determination, performance, and security for your .NET database applications
  • How to enable the pureQuery functionality in the IBM Data Server Provider for .NET that the application is using to retrieve data from a DB2 database
  • How to configure the IBM Data Server Provider for .NET to capture the application's SQL
  • How to configure the captured SQL and bind it to a database package
  • How to configure the IBM Data Server Provider for .NET to statically execute the application's SQL

Prerequisites

This tutorial is written for DB2 database administrators and application developers who would like to explore how they can optimize their existing .NET applications to improve and stabilize database access performance, to leverage workload management features, simplify problem determination, and improve database access security.

System requirements

To follow this tutorial, you need:

  • A simple .NET-based application of your choice that accesses a DB2 data server using the IBM Data Server Provider for .Net
  • A pureQuery-enabled IBM Data Server Provider for .Net
  • IBM Data Studio pureQuery Runtime, which is currently only available as a licensed product that is part of the Data Studio portfolio
  • The authority to create a database package in the database that is accessed by the .NET application of your choice

Problem determination, performance and security - why pureQuery could matter to you, the DBA

As a database administrator you are sometimes at the mercy of the developers. You've probably seen your fair share of unpredictable SQL performance and had trouble identifying which .NET application has submitted a particular SQL that is causing issues. This is because .NET applications typically run under a single, generic package identifier.

Let's illustrate with an example. If you are a DBA for a DB2 for z/OS® system, you may be quite familiar with the following Tivoli® OMEGAMON® XE Performance Expert for DB2 monitoring screen (Figure 1):

Figure 1. Tivoli OMEGAMON. Which application is using package SYSxxx00???
Tivoli OMEGAMON: DB2 thread activity listing for dynamic SQL

With almost all of the dynamic SQL applications using packages like "SYSxxx00", identifying specific programs is difficult. How would you even know which application to look to first for identifying a problem query? Wouldn't it be nice to be able to correlate SQL with its originating application even if the .NET developer didn't think about this beforehand and coded according to best practices? With client optimization, you, the DBA, can determine specific package names to correlate with specific applications, significantly improving your ability to chase down problems in this environment.

A key reason that many people like static SQL, especially when they are paying for CPU usage, is the potential to reduce CPU consumption. With dynamic SQL, there can be significant overhead for determining an access plan, sending prepare messages back and forth, and so on. While the pureQuery client optimization feature won't turn poorly written SQL into a road runner, it eliminates this overhead and can therefore reduce DB2 CPU usage, sometimes significantly.

To illustrate, IBM has run some lab tests using the IRWW benchmark to measure the benefit of static SQL when compared with dynamic SQL for .NET applications. The chart below (Figure 2) shows that in this particular environment, the lab was able to significantly reduce CPU utilization by using static SQL. They even saw good results when DB2 was finely tuned and was able to attain 100% statement cache hit ratios. For more information about this performance study, see the article "More pureQuery Performance: Now with .NET applications accessing DB2 for z/OS" (see Resources).

Figure 2. .NET-based IRWW benchmark average CPU consumption per transaction (lower is better)
.NET-based IRWW benchmark average CPU consumption per transaction

In addition, because the access plan is determined ahead of time (at bind time), your applications are much more likely to yield consistent, stable performance.

Finally, from a security point of view, static execution of SQL provides greater control over the SQL that is executed and can help reduce the risk of SQL injection, which is a known security risk for dynamic SQL. This is because with static packages, you only need to authorize users to execute the package, which means they are limited to only executing SQL that's in the package. With dynamic SQL, you need to provide access to the underlying data object.

If you're interested in learning more about how pureQuery client optimization can help you transform your existing .NET applications to use static SQL (without modifying any source code!), then read on.


Enable an existing application for pureQuery using client optimization

A key benefit of pureQuery is the ability for database applications to take advantage of static SQL when they may not have been able to before. As mentioned before, static SQL (as defined by DB2) is the ability to precompile an application's SQL before the application is executed for the first time. Having this information can provide insights that may improve performance, problem determination, and security.

Support for static SQL in new application development has been around for a long time for traditional programming languages (for example, COBOL or C) and more recently for Java applications (for example, through Java's SQLJ language extension and the Java pureQuery API). Existing ODBC- and CLI-based applications that were implemented without the programming languages static SQL support can leverage static SQL through DB2's CLI/ODBC Static Profiling feature. With the introduction of the pureQuery client optimization feature, existing .NET applications can now also reap the benefits of static SQL.

The pureQuery client optimization feature was designed with those existing applications in mind, for which a re-implementation of the data access layer (to take advantage of static SQL) is not a viable option. The process of enabling an existing application for pureQuery (or in other words, static SQL) is divided into four phases, which are depicted in Figure 3.

Figure 3. pureQuery client optimization process overview
pureQuery client optimization process overview
  • Capture: You configure the IBM Data Server Provider for .Net, which is used by .NET applications to access DB2 data servers, to capture the application's successfully executed SQL while that application is running. The captured information is stored in a text file, which is processed in a subsequent phase by a DB2 utility and consumed by the IBM Data Server Provider for .Net.
  • Configure: The captured SQL and its associated metadata is enriched with information that enables a DB2 utility to create one or more database packages, which enable the IBM Data Server Provider for .Net to execute SQL statically.
  • Bind: Use the db2cap utility to process the configured metadata file and create database packages that correlate to the captured SQL.
  • Execute: Configure the IBM Data Server Provider for .Net to statically process SQL you had previously captured, configured, and bound.

This process is an iterative process that is typically performed in a test environment. Once the target application has been validated, you can move it and its associated pureQuery-related artifacts (the configured capture file and the database packages) to the production environment.


Enable pureQuery for the IBM Data Server Provider for .Net

The IBM Data Server Provider for .Net, which is IBM's recommended .NET data provider for DB2 databases, has been extended to provide pureQuery functionality that enables you to selectively dynamically or statically execute an application's SQL without having to modify your existing application. This pureQuery-enabled provider is included in the following products:

  • IBM Data Server Driver for ODBC, CLI, and .NET, Version 9.5.3 (or later)
  • IBM Data Server Runtime Client, Version 9.5.3 (or later)
  • IBM Data Server Client, Version 9.5.3 (or later)
  • DB2 for Linux®, UNIX®, and Windows®, Version 9.5, Fix pack 3 (or later)

The provider's pureQuery functionality is a separately licensed feature and is disabled by default. To unlock the functionality, you must acquire an IBM Data Studio pureQuery Runtime license and apply it. This section of the tutorial walks you through the process of installing the pureQuery Runtime and shows you how to apply the pureQuery license.

Install the pureQuery Runtime

To install your licensed pureQuery Runtime on the machine where your .NET application resides, download it from Passport Advantage and follow the simple instructions that are well documented in the pureQuery Runtime installation section of the Integrated Data Management Information Center. (See Resources for a link to the Information Center to access the instructions.)

Activate the IBM Data Server Provider's pureQuery functionality

The pureQuery license activation steps vary depending on which of the above products you have installed.

If you installed the IBM Data Server Driver for ODBC, CLI, and .NET, copy the license file dspq_rt.lic, which you can find in the (default pureQuery Runtime installation) directory C:\Program Files\IBM\Data Studio pureQuery Runtime\pureQuery\, to the driver's licence directory. This .NET provider license directory is located by default in C:\Program Files\IBM\IBM DATA SERVER DRIVER\license\.

If you installed the .NET driver as part of another DB2 client or as part of a data server, use the DB2 License Center or run the DB2 license management command, and specify the full path and license file name as a command line parameter:

db2licm -a "C:\Program Files\IBM\Data Studio pureQuery Runtime\pureQuery\dspq_rt.lic"

Once you've activated the pureQuery license, you are ready to step through the process that enables your existing application for pureQuery.

Select a .NET application for this tutorial

The pureQuery functionality of the IBM Data Server Provider for .Net is configured by adding custom properties to the application's database connection string. To keep it simple, this tutorial uses the testconn20.exe utility as the .NET application, which is included in DB2 client and data server installations (in the C:\Program Files\IBM\SQLLIB\BIN\ directory). This utility is commonly used to validate that the IBM Data Server Provider for .Net is configured correctly. It takes as an argument a database connection string, which can be easily manipulated, as shown in Figure 4. The utility only accesses catalog tables and can therefore be used with any database. For illustrative purposes, this tutorial uses the GSDB database, which is commonly used as a data source in Data Studio tutorials.

Figure 4. Example program output
Example program output

If you prefer, you can follow this tutorial using an application that you've developed and that you are familiar with as long as you can manipulate its database connection string.


Capture the application's SQL

Traditionally, if you've developed an application that leverages static SQL, you had to embed special programming language constructs into the source code that define the SQL operations. These language constructs had to be processed by a precompiler that generated additional source code and SQL metadata information. This SQL metadata was used to generate a database package (the process is called bind), which contains the SQL access plan that is used to execute the SQL when the application is running. The generated source code contains the logic to execute the SQL that was embedded in the database package.

While pureQuery client optimization's ability to statically process a .NET application's SQL is based on the same fundamental concept, it is taking a slightly different approach that does not require any source code analysis and does not generate any additional source code. The IBM Data Server Provider for .Net can be configured to execute in a special mode that captures the SQL that an application is submitting. This captured SQL and its associated metadata serve two purposes:

  • (By the bind utility) to create one or more database packages that contain the SQL access plans
  • (By the IBM Data Server Provider at runtime) to determine whether a compiled version of an SQL statement is present in a database package, which enables it to run this SQL statically

Configure the IBM Data Server Provider for .Net to capture SQL

The IBM Data Server Provider for .Net evaluates runtime configuration settings to determine whether special pureQuery processing is required. If set to ON, the captureMode property instructs the data provider to capture information about each SQL statement that is successfully executed in a metadata file, which is identified by the value of the pureQueryXml property.

Configuring pureQuery configuration properties

Define pureQuery configuration properties in the database connection string (which is typically done in the application configuration file) to limit their scope to a single application. Define these properties in the IBM Data Server driver configuration file db2dsdriver.cfg to apply them to a data source that can be shared by multiple applications.

Let's take a look at an example and assume that your application connects to the GSDB database. Your regular connection string would look like the following if the application is accessing a local database and doesn't provide alternate user id and password:

database=GSDB

To enable the Data Server Provider's SQL capture mode and store the captured SQL in a file, you'd append the captureMode and pureQueryXml properties to this connection string, and assign the appropriate values to them:

database=GSDB;captureMode=ON;pureQueryXml=c:\myapp.xml

To associate the captured metadata with a unique package identifier, you have to define two additional connection string properties while capturing an application's SQL: collection and rootPkgName. The collection property identifies the collection in which the package resides, whereas the rootPkgName property defines the base database package name that the bind utility will use to generate one or, if necessary more, package names. Both properties are considered mandatory if captureMode is set to ON.

database=GSDB;captureMode=ON;pureQueryXml=c:\myapp.xml;collection=NULLID;rootPkgName=TOOT

While running in SQL capture mode, the provider can optionally generate a trace file that provides basic information about the capture activity. The trace information identifies, among other things, how many new (or, in other words, previously not captured in this pureQueryXml file) SQL statements have been recorded. To activate tracing, specify the traceFile property and assign a file name as a value.

A typical database connection string for the pureQuery capture activity looks like this (a line feed was added to improve readability):

database=GSDB;captureMode=ON;pureQueryXml=c:\myapp.xml;
collection=NULLID;rootPkgName=TOOT;traceFile=c:\pdqcapturetrace.txt

Now that you've explored how to configure the capture mode for the Data Server Provider, you are ready to run the application and capture its SQL. If you are using the testconn20.exe application while you are working on this tutorial, you will specify the database connection properties when you run the application. If you are using your own application, please alter its current database connection string and append the properties discussed in this tutorial.

Run the example application to capture its SQL

The pureQuery capture operation is completely transparent to the application. The Data Server Provider is processing the application's SQL as it would do if it is not configured to exploit the pureQuery technology. The only indication that you have that a capture activity is in progress is the growing capture file size and the trace file information.

If you are using the testconn20.exe application, follow the instructions below to run it. If you are using your own application, run it as you would normally do.

  1. Open a command prompt window and change to the C:\Program Files\IBM\SQLLIB\BIN directory.
  2. Enter the following command to run the application (a line feed was added to improve readability):
    testconn20.exe "database=GSDB;captureMode=ON;pureQueryXml=c:\myapp.xml;
    collection=NULLID;rootPkgName=TOOT;traceFile=c:\pdqcapturetrace.txt"
Figure 5. Example program output while capturing SQL
Example program output while capturing SQL

If the application returns database connection error SQL8029N, the Data Server Provider was unable to verify the pureQuery license. If another connection error occurs, verify that the pureQuery property names and values are correct.

Browse (and configure) the captured SQL

Let's take a look at the trace file information before browsing the SQL capture file.

  1. Open the trace file that you've specified as the traceFile property value using an editor of your choice. Note that it displays information about how many SQL statements were executed while the Data Server Provider was running in capture mode.
    Figure 6. pureQuery capture trace output
    pureQuery capture trace output
  2. Close the trace file.
  3. Open the file that you've specified as the pureQueryXml property value to browse the captured SQL. Note that this file is a plain text file even though the property name implies that this file uses XML markup. You will see in a minute, when this tutorial discusses how to bind this file using the DB2 tooling, why the capture file currently does not use XML.
    Figure 7. pureQuery capture file
    pureQuery capture file

    The file consists of multiple sections, each one containing either database package configuration information, bind options, or SQL statement information.

    The [COMMON] section associates this capture file with one or more consistency tokens, which are used internally to determine whether a certain database package that is present in a database matches the current SQL capture file. Since you have not bound this SQL file to a database package yet, no consistency token has been assigned. The db2cap utility will update this section as necessary when it creates the database package.

    The [BINDOPTIONS] section defines, as the name implies, the database package characteristics. The bind utility uses this information to create the database package that the Data Server Provider will use to statically execute SQL. You can customize these bind options as necessary prior to running the bind operation, which this tutorial discusses in the next section.

    The [STATEMENTx] sections describe the SQL that was captured. One such section is created for each captured SQL. Each section contains information about the SQL text, the cursor characteristics, as well as input and output parameter information. The bind utility uses this information when it creates a database package, and the Data Server Provider uses it to determine whether it can statically execute an SQL statement. You should not modify these sections.

  4. Close the pureQuery capture file.

Bind the application's SQL to a database package

In the current release, the pureQuery for .NET technology leverages the db2cap utility, which was introduced as part of DB2's CLI/ODBC Static Profiling feature, to bind the captured SQL to one or more database packages. This utility is included in the following products:

  • IBM DB2 Connect
  • IBM Data Server client
  • IBM Data Server Runtime Client
  • DB2 for Linux, UNIX, and Windows

The db2cap utility accepts the following program arguments:

db2cap [-h | -?] bind capture-file -d db-alias [-u userid [-p password]]

capture-file identifies the pureQuery SQL capture file. db-alias identifies the target database where the db2cap utility creates the database package based on the information that is stored in capture-file. Note that in order to bind a package to a remote database, that database has to be catalogued locally.

db2cap uses the userid and password credentials to connect to the target database and perform the bind operation. The bind operation will fail unless the specified user holds the following privileges:

  • Access privileges to any database objects referenced by SQL statements recorded in the capture file
  • Sufficient authority to set bind options such as OWNER and QUALIFIER if they are different from userid
  • BINDADD authority if the package is being bound for the first time; otherwise, BIND authority

Bind the captured SQL to a database package

To bind the captured SQL to a database package:

  1. Open a DB2 command window.
  2. Invoke the db2cap utility and specify the capture file name as well as the target database name.
    db2cap bind c:\myapp.xml -d GSDB

    The bind operation should complete successfully.
    Figure 8. db2cap bind utility output
    db2cap bind utitlity output
  3. While db2cap processes the capture file, it updates it. Take a look at what has changed. Open the file that you've specified as the pureQueryXml property value. A consistency token has been added, uniquely identifying the version of the package. Each SQL has been assigned a section number that enables the runtime to statically execute the SQL.
    Figure 9. Updated pureQuery SQL capture file
    Updated pureQuery SQL capture file

    Now that you've created a database package you can verify that it contains the SQL that you've captured.

Verify the database package

Run the following query against the GSDB database to retrieve a list of statements that are in the package you've just created.

SELECT PKGSCHEMA,PKGNAME,TEXT 
FROM SYSCAT.STATEMENTS 
WHERE PKGSCHEMA = 'NULLID' AND PKGNAME LIKE 'TOOT%'

If you are using the testconn20.exe application, this query should return two rows:

 NULLID  TOOT1  DECLARE SQLCURCAPCS0 CURSOR WITH HOLD FOR SELECT * 
    FROM SYSIBM.SYSTABLES FETCH FIRST 5 rows only
 NULLID  TOOT1  CALL SYSIBM.SQLTABLES(:H00002 :H00001,:H00003 :H00001,:H00004 
    :H00001,:H00005 :H00001,:H00006 :H00001)

You've now successfully created the database package that the IBM Data Server Provider will use to statically execute these statements.


Statically run the .NET application's SQL

Configure the IBM Data Server Provider for .Net to statically execute SQL

Once the captured metadata has been configured and bound to a database package, you can configure the IBM Data Server Provider to statically execute SQL by setting the executionMode database connection property to STATIC (DYNAMIC is the default if this property is not defined). To determine how an SQL statement can be processed statically, the provider retrieves, upon startup, the content of the metadata file (which is identified by the pureQueryXml property), and matches incoming SQL requests with the SQL that was previously captured and mapped to a database package. If the lookup and mapping is successful, the provider statically executes the SQL. If the lookup fails, the SQL will be executed dynamically.

database=GSDB;executionMode=STATIC;pureQueryXml=c:\myapp.xml

If you'd like to determine whether a particular SQL is executed dynamically or statically, define the connection property traceFile and assign a log file name to it. If logging is enabled, the provider will log an entry for each SQL that it was unable to map to the database package. The following code shows a typical connection string definition that you would use during testing to validate that the application behaves as expected when static SQL execution is enabled:

database=GSDB;executionMode=STATIC;pureQueryXml=c:\myapp.xml;traceFile=c:\pdqtrace.txt

Run the example application's SQL statically

  1. Open a command prompt window, and change to the C:\Program Files\IBM\SQLLIB\BIN directory.
  2. Enter the following to run the application (a line feed was added to improve readability):
    testconn20.exe "database=GSDB;executionMode=STATIC;pureQueryXml=c:\myapp.xml;
    traceFile=c:\pdqtrace.txt"

    The example application (or the program you are using) should generate the same output as before. Let's verify that all SQL was executed statically.

  3. Open the log file that you've specified as the traceFile property value. If you've executed the application the same way you did when you captured its SQL, you should see an output similar to the one shown in Figure 10:
    Figure 10. pureQuery static SQL execution log file
    PureQuery static SQL execution log file
  4. Let's try a scenario where you haven't captured all of the application's SQL. To simulate this scenario, remove one of the captured SQL statements from the capture file and re-run the application. If you are using your own application, skip the steps below and simply run a part of the application for which you haven't yet captured any SQL.
    1. Open the file that you've specified as the pureQueryXml property value to edit the captured SQL.
    2. Locate one of the [STATEMENTx] sections, as shown in Figure 11:
      Figure 11. Remove SQL statement from pureQuery capture file
      Remove SQL statement from pureQuery capture file
    3. Delete all associated SQL metadata entries for this section. Since you are not invoking the db2cap bind utility, the capture file and the database package are now out of sync. Even though the SQL is present in the database package, the Data Provider is unable to map the incoming statement because it is not present in the metadata file.
      Figure 12. pureQuery capture file after SQL metadata has been removed
      pureQuery capture file after SQL metadata has been removed
    4. Save the modified capture file.
  5. Run the application again, like you did in step 2 above, and open the log file. A warning is displayed, indicating that one or more SQL statements that you've removed (or didn't originally capture) could not be executed statically.
    Figure 13. Unmatched SQL is displayed in the pureQuery log file
    Unmatched SQL is displayed in the pureQuery log file

    In a typical scenario where the pureQuery technology is used to statically run SQL, you'd now re-run the capture process to iteratively capture SQL that was previously not recorded. This approach is common during development cycles when new code is added to the application.


Summary

You have successfully completed the process of enabling an existing .NET application for static SQL execution — without having to change a line of code. Consider using this technology if you are interested in improving quality of service for your existing applications by improving and stabilizing performance, and increasing accuracy of capacity planning. You may be able to get more value from your existing investments because of increased system throughput and simplified problem resolution.

The pureQuery technology benefits that were discussed in this tutorial are not limited to .NET applications. The corresponding pureQuery Java technology provides even more benefits that go beyond the benefits of static SQL to further enhance performance, problem determination support, and database security.

Acknowledgements

The author would like to thank Kathy Zeidenstein, Brent Gross, Paul Wirth, and Reece Laurie for their contributions to this tutorial.

Resources

Learn

Get products and technologies

  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=378707
ArticleTitle=Optimize your existing .NET applications using IBM Data Studio's pureQuery
publish-date=03262009