Note:The information contained in this document has not been submitted to any formal IBM test and is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While IBM may have reviewed each item for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Anyone attempting to adapt these techniques to their own environments does so at their own risk.
Any performance data contained in this document were determined in various controlled laboratory environments and are for reference purposes only. Customers should not adapt these performance numbers to their own environments as system performance standards. The results that may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.
In this article, we’ll provide a brief overview of the pureQuery platform, a summary of enhancements introduced since this article was first published, and the updated results of a performance study comparing pureQuery access methods with a variety of popular Java data access alternatives (including JDBC). The results of the study show the potential for significant throughput improvements and CPU reduction using pureQuery static execution as compared to the alternatives. And, by using client optimization, these benefits can be realized without making changes to the application.
This article includes performance results of using static execution for an existing JDBC application. See the Resources for an article on the performance results of using pureQuery for .NET applications.
pureQuery is a high-performance Java data access platform that simplifies the tasks of developing and managing applications that access data. It consists of tools (IBM Optim Development Studio), APIs, and a runtime (IBM Optim pureQuery Runtime (for z/OS or for Linux, UNIX, and Windows)).
The Resources section lists several resources that provide information about the pureQuery platform. In a nutshell, by using pureQuery, it is possible to optimize data access performance and improve security and manageability for any new or existing Java or .NET application. (This article focuses on Java performance.)
Key concepts that will help you understand the information presented in this article include the following:
Each of these is described in more detail in this section.
For a summary of considerations of whether pureQuery is right for your environment, see the Appendix.
pureQuery supports data access using both Data Access Objects (DAO) interfaces and annotated methods, and built-in inline methods using an out-of-the box data interface. It is important to note that you do not need to use the API to take advantage of pureQuery. However, if you do, either approach is designed to be easy for developers to learn and to reduce the amount of code generated.
With the use of inline methods, the SQL is included directly in the Java code and uses the out-of-the-box data interface. Use of inline methods supports dynamic execution directly, but you can use client optimization with inline methods to support static execution. With the DAO interface using annotated methods, the SQL is maintained either within the application, or in separate interface files that you define. The interface files make it easier for the actual Java application to remain independent of the persistence approach. The DAO interface using annotated methods supports both dynamic and static execution. In other words, no intervening client optimization processing is required for this style to make use of static SQL. This article does include performance measurements for using DAOs to implement static execution.
Client optimization, a key enhancement that was introduced in the 1.2 release, makes it possible to take advantage of static SQL for existing JDBC (or .NET) applications, without modifying — or even having access to — the source code. This capability also works on Java applications that use a framework for persistence, such as Hibernate. With client optimization, you execute the application while the pureQuery runtime is in “capture” mode, typically as part of a use case test validation that exercises various program execution paths. The DBA can then bind the captured SQL into packages at the database server. Subsequent executions of the application allow the pureQuery runtime to execute the captured SQL statically. In this way, client optimization gives applications the performance and security benefits of static SQL, without the need to modify or recompile the application.
For Java applications, including those that use the pureQuery DAO interfaces or those that have been “client optimized,” Optim Development Studio enables developers or DBAs to quickly locate specific SQL statements in the Java source, down to the specific line number (see Figure 1). This ability to visualize relationships among Java source and SQL is known as the SQL Outline view (previously known as the pureQuery outline).
The outline view also provides a database schema perspective, which can be useful when doing impact analysis or problem isolation. For example, if you’re considering dropping a column, you can find out where in the Java program that column is being used and modify the SQL appropriately. Finally, the outline view also provides performance data on executing SQL, so that you can identify SQL hotspots quickly and directly from there go into Visual Explain or launch into Optim Query Tuner to help analyze and tune the SQL.
Figure 1. Optim Development Studio outline view
Since this article was originally published in July 2008, pureQuery has made huge leaps forward in terms of developer and DBA collaboration and productivity, as well as with additional enhancements to help you take best advantage of client optimization. Here is a summary of some, but not all, of the enhancements. More information about these enhancements can be found in the Resources section of this article. We call out especially Chapter 5 of the Meeting Service Level Objectives Redbook, which has a nice summary of the enhancements delivered in Version 2.2.
- Ability to substitute parameter markers for literals when capturing SQL. Prior to this enhancement, client optimization would need to capture and bind each unique occurrence of a statement that might differ only in the literal values being used. Now, with literal substitution, those statements can be consolidated into a single statement for the purposes of binding and package creation.
- Ability to replace poorly performing SQL with equivalent SQL that you have tuned without requiring a change to the application. Also, to tune in context, you have the ability to launch Optim Query Tuner directly from Optim Development Studio.
- Ability to visualize, save, and compare performance metrics in Optim Development Studio.
- Ability to merge multiple capture files in support of clustered application environments.
- New as of Version 2.2 Fix Pack 3, the ability to use a secure repository to centralize management of pureQuery artifacts.
Both JDBC Type 2 and Type 4 drivers are commonly used by applications to access DB2 data. The Type 2 driver requires the DB2 client to be installed on the same machine as the application; whereas the Type 4 driver is 100 percent Java and doesn’t require a DB2 client to be installed (it directly implements the network protocol for the data source). This article presents the performance numbers using both Type 2 and Type 4 connections.
Figure 2 shows pureQuery runtime running natively on z/OS using either Type 2 or Type 4 connections.
Figure 2. Running Optim pureQuery Runtime natively on z/OS
The performance study used the IBM Relational Warehouse Workload (IRWW) running in WebSphere Application Server V6.1. The workload was executed on DB2 for z/OS V9 running on a three-way Z9 processor.
IRWW is an OLTP application based on an order entry and tracking system. The IRWW models a wholesale supplier managing orders and consists of seven transaction types. The frequency of transactions is set to simulate a realistic scenario; the mix used in the benchmark environment was 47 percent update transactions, 53 percent read-only transactions. The workload is triggered by a Java client program which generates HTTP requests for the required transaction mix. The performance is reported in Normalized Throughput Rate - Transactions Per Second (ITR).
The Internal Throughput Rate (ITR) is the notional throughput rate assuming that the CPUs are 100 percent busy. For example, consider an application with a transaction rate of 200 transactions per second at 75 percent CPU consumption. The ITR for this application would be 200 * 100/75 = 267 tps. This is the notional transaction rate that could be achieved if the CPUs were 100 percent busy.
IBM pureQuery Runtime for z/OS 2.2 was used for the study. The scope of the performance study was to compare normalized throughput and CPU consumption of SQL, comparing both the pureQuery DAO interfaces and client optimized JDBC with the technology alternatives. The IRWW was modified to incorporate each of these technologies, and the performance numbers were extracted by running the workload and monitoring the system. In this article, the scope is limited to the following technologies:
- pureQuery DAO interfaces using annotated methods (static execution)
- JDBC with pureQuery client optimization (static execution)
- JDBC (dynamic execution) - this serves as the baseline measurement against which the alternatives are measured
- EJB2 (dynamic execution)
- OpenJPA (dynamic execution).
Note that the IRWW and this test don’t make use of all pureQuery features that can help improve performance such as heterogeneous batching and literal substitution.
The dynamic statement cache hit ratio is a key factor in determining how much improvement your application can achieve by switching from dynamic to static SQL execution. As the "No Excuses" article explains, much of the overhead for dynamic statements is in preparing the statement to run. However, since statements that have already been prepared can reside in the dynamic statement cache, a repeated execution of that same exact statement would not require another prepare operation. Thus, if every SQL statement the application executes is already in the dynamic statement cache (that is, the hit ratio is 100 percent), the CPU reduction benefits of switching to static SQL won’t be as great when compared with the situation in which only half of the dynamic SQL statements were found in the dynamic statement cache (50 percent hit ratio). During our testing, the dynamic statement cache was sized to achieve a hit ratio of approximately 80 percent; which is typical for some real-world applications.
CPU measurements during the benchmark are taken across the z/OS LPAR — there was no zIIP capacity for offload.
In executing the IRWW with the JDBC Type 2 driver, the statement cache hit ratios in Table 1 were observed:
Table 1. Statement cache hit ratio observed in executing the IRWW with JDBC Type 2 driver.
|API||Dynamic statement cache hit ratio|
Figure 3 uses dynamic JDBC as the baseline and compares the normalized throughput (ITR) of the different APIs when executing the IRWW using the JDBC Type 2 driver.
Figure 3. Results of normalized throughput using JDBC Type 2 driver.
The results highlight the fact that static execution can achieve significantly greater throughput than dynamic JDBC. The chart in Figure 3 shows that pureQuery with Client Optimization Static achieved a 42 percent improvement in ITR (normalized transaction rate over CPU consumption) compared to native dynamic JDBC alone. pureQuery DAO interfaces using static execution achieved an even more impressive ITR improvement of 50% over dynamic JDBC. On the test system, the pureQuery client optimization static test resulted in an overall system-wide CPU reduction of 14%, and the pureQuery DAO interfaces using static execution achieved a system-wide CPU reduction of approximately 17%. The client optimization results are particularly impressive since no source code had to be modified.
With the JDBC Type 4 driver, the statement cache hit ratios in Table 2 were observed during the tests:
Table 2. Statement cache hit ratio observed in executing the IRWW with JDBC Type 4 driver.
|API||Dynamic statement cache hit ratio (%)|
Figure 4 shows the normalized throughput (ITR) of the different APIs when executing the IRWW using the JDBC Type 4 driver.
Figure 4. Results of Normalized throughput using the JDBC Type 4 driver.
The results with the Type 4 driver show that, as with the Type 2 driver, static execution (whether with pureQuery DAO or with client optimization) can achieve significantly improved throughput efficiency over JDBC.
The normalized chart in Figure 4 shows that with the IRWW, pureQuery client optimization static achieved a 24 percent ITR improvement over using native dynamic JDBC alone. pureQuery DAO interfaces using static execution achieved an ITR improvement of 37% over dynamic JDBC. On the test system, the pureQuery client optimization static test resulted in an overall system-wide CPU reduction of 4%, and the pureQuery DAO interfaces (static) achieved a system-wide CPU reduction on approximately 13%.
During each test, the CPU consumption of both the IRWW application and DB2 were individually monitored. With both the Type 2 and 4 drivers, moving from dynamic JDBC to pureQuery static, the proportion of CPU consumed by DB2 decreased, while the proportion of CPU consumed by the application increased. The application CPU increases because it is able to process more transactions within the same unit of time, while the DB2 CPU decreases because it no longer has to prepare the SQL statement.
The results of the performance study show that static execution of SQL using Optim pureQuery Runtime for z/OS can significantly improve the throughput and reduce the CPU cost per transaction of an application. In the benchmark environment employed for this study, this approach resulted in a 13 percent reduction in total LPAR CPU consumption using the Type 4 driver and a 17 percent reduction when using the Type 2 driver, compared to native JDBC. Furthermore, with the client optimization feature there’s no need to change a single line of source code within your application, or even to have access to the source code, a fact that greatly reduces the cost of implementing pureQuery runtime.
The reduction in CPU and increased throughput when using pureQuery DAO interfaces with static execution and client optimization results because the SQL statements no longer have to be dynamically prepared and executed. If the JDBC dynamic statement cache hit ratio for the IRWW workload was nearer 100 percent, then the performance benefit of introducing the pureQuery DAOs (static) or using client optimization (static) would be reduced. Similarly, if the cache hit ratio was less than those quoted during the benchmark, the benefit would be even greater.
This study also demonstrates the performance advantage of using JDBC and pureQuery over JPA or EJB2. Using both the Type 2 and Type 4 drivers, native JDBC shows an improvement in throughput efficiency over JPA and EJB2, and both pureQuery Client optimization (using static execution) and pureQuery DAOs (using static execution) show a significant improvement.
The authors would like to thank Bill Bireley, Kevin Foster, Stephen Brodsky, and Kathy Zeidenstein for their review and suggestions. A special thanks to Soonnee Chang and Todd Munk for running the performance tests.
In our recommendations, we attempt to not just focus on the use of pureQuery alone, but also indicate those cases in which pureQuery technology (such as client optimization) can be used to help with real world environments.
For data-centric applications, from a performance and security perspective, static SQL provides significant benefit, including the potential for greater application growth before requiring purchase of additional hardware. pureQuery DAO (annotated method) style provides ease of use and diagnostic benefits for developers as well the ability to deploy statically with no change to the application.
If your developers require object management or are reluctant to move from their existing framework, then you can still use client optimization to gain the benefits of static SQL for those applications.
If the environment for your existing JDBC applications is already expertly tuned (that is, you are closer to 100% hit ratio in your statement cache), then it is unlikely you would want to convert those existing applications to pureQuery for performance reasons alone, but you will still benefit from the d improved management and problem resolution, improved plan stability, and the improved security and governance model of static SQL.
Consider using client optimization to test the advantages of static SQL and to provide developers and DBAs with the outline view, which lets you more closely pinpoint the location of a problematic SQL statement in the Java source, as well as providing the ability to do SQL performance analysis in the test environment.
If you are using SQLJ you are already getting the performance benefits of static SQL, but you may wish to use client optimization to take advantage of the problem determination and impact analysis capabilities in pureQuery. If you like the benefits of pureQuery from a development and administrative perspective, it might be worth considering doing new development on pureQuery. For example, in addition to the developer tooling and enhanced diagnostic capabilities with pureQuery, it is simpler to bind and deploy with pureQuery than with SQLJ.
- If you're interested in how static access can benefit .NET applications running
against DB2 for z/OS, be sure to read
More pureQuery Performance: Now with .NET applications accessing DB2 for z/OS" (IBM Data magazine,
pureQuery platform web page includes an overview of the pureQuery platform components, products, and includes an FAQ.
- Watch the demo:
pureQuery for z/OS: Improve ROI for existing applications
"No Excuses" Database Programming for Java
(Data Management Magazine, Jan 2008)
to learn more about the static SQL and its benefits.
Learn about the client optimization process using this tutorial:
Optimize your existing JDBC applications using pureQuery
(developerWorks, Aug 2008).
If you're interested in best practices in using the pureQuery API, including information about heterogeneous batching, be sure to read
Write high performance Java data access applications, Part 3: pureQuery API best practices (developerWorks, Aug 2008).
Read the Integrated Data Management Information Center topic on
Dan Galvin's blog on
Bill Bireley's blog on
Read the Integrated Data Management Information Center topic on
Demo: Optim Performance Management Solution: See how one fictional company uses Optim solutions for performance management.
Includes client optimization scenario.
developerWorks Optim family page: Learn more about Optim solutions.
Find technical documentation, how-to articles, education, downloads, product information, and more.
Get products and technologies
Simon Harris is a Performance Engineer within the WebSphere Federation Server development team in the Silicon Valley Laboratory. Simon has been working with federated database technology since its inception in IBM in 1995, supporting many customers in both a pre- and post-sales capacity throughout Europe, the Middle-East, and Africa.