Note: The information contained in this document has not been submitted to any formal IBM test and is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. Although IBM may have reviewed each item for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Anyone attempting to adapt these techniques to their own environments does so at their own risk.
Any performance data contained in this document were determined in various controlled laboratory environments and are for reference purposes only. Customers should not adapt these performance numbers to their own environments as system performance standards. The results that may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.
IBM recently announced Data Studio pureQuery Runtime 2.1, which now supports client optimization for .NET applications. This capability, previously only available for Java applications, enables transformation of .NET applications to use static SQL, without changing a line of code. In this article, we'll provide a brief overview of pureQuery use with the IBM Data Server Provider for .NET, and the results of a performance study comparing the benefits of pureQuery client optimization for .NET applications.
In a previous performance article, "IBM Data Studio pureQuery Runtime for z/OS Performance," we described the performance and throughput improvements on DB2 for z/OS with Java applications that use pureQuery. These measurements included the use of both the pureQuery API and the client optimization feature, which enables existing Java applications to execute SQL statically, without changing the Java source code.
Since publishing the article, there has been a lot of interest in IBM's pureQuery capabilities from the DB2 for z/OS community because of its ability to lower costs to deploy Java applications, making it easier to manage workloads, and reduce security risks. However, many of IBM's customers use .NET instead of, or in addition to, Java for data access, and they, too, wanted to enjoy these benefits. As of Data Studio pureQuery Runtime 2.1, now they can.
The client optimization feature of pureQuery allows one to realize all the benefits of static SQL execution. This capability works with .NET access against the following IBM data servers:
- DB2 for Linux, Unix, or Windows V8 and later
- DB2 for z/OS Version 8 and 9, and later through DB2 Connect
- DB2 for i5/OS Version 5 R3 and later through DB2 Connect.
The focus of this article will be on DB2 for z/OS access, although many of the same principles apply for all supported IBM data servers.
Before you begin: You don't need to install additional software on the DB2 for z/OS system. On your Windows server where the application is running, you must install IBM Data Studio pureQuery Runtime 2.1 (or later) along with at least one product that includes the IBM Data Server Provider for .NET (such as the IBM Data Server Runtime Client, version 9.5.3 or later). Finally, you must activate the pureQuery license. All these prerequisites are described in the DB2 Information Center.
Conceptually, client optimization for .NET applications is similar to the process for Java applications described in this developerWorks tutorial, but some of the underlying technology is a little different. The process consists of the following steps as illustrated in Figure 1:
- Configure the application runtime to support SQL capture and execute the application so the SQL statements can be captured as they are executing.
- Bind the captured SQL into DB2 packages using a utility included with the prerequisite clients.
- Execute the application in static mode, and the successfully captured SQL will run statically.
(A developerWorks Tutorial walks through this process in more detail.)
Figure 1. Client optimization process for .NET.
The static SQL packages created by the client optimization tooling are just like any other packages DB2 DBAs manage. Using the unique package names created during the bind process (instead of the generic package ID used for dynamic SQL) can greatly reduce the amount of time and frustration it takes to pinpoint performance problems. Figure 2 shows the difference you'll see when using Tivoli OMEGAMON to monitor your DB2 subsystem: Instead of a system-generated package ID (in this case SYSSH200) for all database access under .NET, the DBA has specified a package name of PAWTST1 for a particular subset of statements in a particular application.
These unique package names can also be used by Workload Manager (WLM) to assign Service Classes to packages, allowing thread transaction level prioritization, similar to CICS and IMS/TM. Furthermore, package-based accounting statistics enable charge-back accounting. From a security point of view, static execution offers a tighter security model by enabling execution of just the SQL statements in the package rather than requiring access to an entire table or view.
Figure 2. With static SQL, even .NET applications can have unique package identifiers.
The focus of this article, of course, is performance, and static SQL not only offers the potential to significantly reduce CPU overhead, but also enables you to lock in your access plans to ensure that performance is consistent. Let's now dig into the details of the performance benefits static SQL can bring to .NET applications.
The performance study used the IBM Relational Warehouse Workload (IRWW) running in IIS V6.0 with ASP.NET 2.0. The workload was executed on DB2 for z/OS V9 running on a three-way Z10 processor (see Figure 3).
Figure 3. Performance study architecture.
IRWW is an OLTP application based on an order entry and tracking system. The IRWW models a wholesale supplier managing orders, and it consists of seven transaction types. The frequency of transactions is set to simulate a realistic scenario; the mix used in the benchmark environment was 47 percent update transactions, 53 percent read-only transactions. The workload is triggered by a client program which generates HTTP requests for the required transaction mix. The performance is reported in Normalized Throughput Rate - Transactions Per Second (ITR).
A prerelease version of the IBM Data Server Provider for .NET was used for the study. The scope of the performance study was to compare normalized throughput and CPU consumption, running the workload with dynamic SQL execution and in static mode using the client optimization feature. The performance numbers were extracted by running the workload and monitoring the system.
The Internal Throughput Rate (ITR) is the notional throughput rate assuming that the CPUs are 100 percent busy. For example, consider an application with a transaction rate of 200 transactions per second at 75 percent CPU consumption. The ITR for this application would be 200 * 100/75 = 267 tps. This is the notional transaction rate that could be achieved if the CPUs were 100 percent busy.
The dynamic statement cache hit ratio is a key factor in determining how much improvement your application can achieve by switching from dynamic to static SQL execution. As the article "'No Excuses' Database Programming: Make your Java programs fly with Static SQL and pureQuery" explains, much of the overhead for dynamic statements is in preparing the statement to run. However, since statements that have already been prepared can reside in the dynamic statement cache, a repeated execution of that same exact statement would not require the statement to be recompiled on the server.
During our testing, apart from measurements with 100 percent statement cache hit ratio, the dynamic statement cache was also sized to achieve a hit ratio close to 80 percent, which is typical for real-world applications. Figure 4 compares the normalized throughput (ITR) of the different modes of SQL execution when executing the IRWW.
Figure 4. Normalized throughput numbers.
The results highlight the fact that static execution can achieve significantly greater throughput than dynamic SQL execution: pureQuery client optimization can improve the throughput by 159 percent ((473-1226)/473=-1.59) when compared to dynamic SQL execution while running with a statement cache hit of 79 percent. Even with a statement cache hit of 100 percent, you can achieve an improvement in throughput of 21 percent by using pureQuery client optimization to convert the dynamic SQL to static. If you're wondering why the results are better even against a dynamic hit ratio of 100 percent, it's because the overhead of processing and parsing the prepare request, statement cache lookup, and creation of the prepare reply (including extended describe information) is eliminated.
Figure 5 shows the average CPU consumed by a transaction with the different modes of SQL execution.
Figure 5. Comparison of CPU consumed per transaction.
By using client optimization, we were able to achieve a 61 percent reduction in CPU per transaction compared to dynamic SQL execution with 79 percent statement cache hit, and a 17 percent reduction in CPU compared to dynamic SQL execution with a statement cache hit of 100 percent.
The results of the performance study show that static SQL execution in .NET applications using the Data Studio pureQuery Runtime 2.1 and the IBM Data Server Provider for .NET can significantly reduce the CPU cost per transaction of an application. In the benchmark environment employed for this study, this approach of transforming to static SQL execution resulted in a 61 percent reduction in CPU consumed per transaction compared to dynamic SQL execution when the dynamic cache hit ratio is 79 percent. If your cache hit ratios are less than 79 percent, your results may be even better.
The tests also show that even with a finely tuned system achieving 100 percent cache hit ratio, client optimization can still be of benefit, delivering a 17 percent reduction in CPU per transaction in our lab tests. These impressive results were obtained with no change to the application code.
Furthermore, the use of static SQL offers additional benefits including improved problem isolation, the ability to use Workload Manager to do workload prioritization similar to that for CICS/COBOL and IMS/TM, tighter security, and enhanced accounting.
Rajesh Sambandhan is an advisory software engineer on the IBM Information Management team, India Software Labs. He has been in the industry for more than seven years, working primarily in J2EE and Test Driven Development.
Todd Munk is an advisory software engineer on the DB2 for z/OS Performance team at Silicon Valley Laboratory. He has been with the IBM fifteen years working primarily with distributed access to DB2 for z/OS including ODBC/CLI, JDBC/SQLJ, DB2 Connect, and WebSphere.
Simon Harris is a Performance Engineer within the WebSphere Federation Server development team in the Silicon Valley Laboratory. Simon has been working with federated database technology since its inception in IBM in 1995, supporting many customers in both a pre- and post-sales capacity throughout Europe, the Middle-East, and Africa.