pureQuery is a high-performance data access platform that consists of development tools, a Java™ API, a runtime, and monitoring services. Using the lightweight pureQuery API, you can write data access code that accesses DB2, Informix, and even Oracle databases. However, many of the benefits of pureQuery — the ability to visualize and modify generated SQL, use of database access best practices, support for performance features such as heterogeneous batching, and the ability to gather performance data from queries — do not require use of the API. For more information about pureQuery, see Resources.
A key use case for using pureQuery with existing applications can be seen with DB2 data servers, in which you can convert the SQL execution mode from dynamic execution (the default for Java applications) to static execution. This removes the need to do preparation at runtime and can reduce SQL injection risk. For more information about the benefits of static execution, refer to Resources.
With pureQuery, you can also take advantage of IBM Data Studio capabilities, such as visualizing performance metrics and correlating generated SQL statements with the originating Java source code. This makes pureQuery beneficial for use with existing applications that access Informix and Oracle databases. Figure 1 is a snapshot of IBM Data Studio with some key capabilities highlighted.
Figure 1. IBM Data Studio outline view
For more information about the benefits of IBM Data Studio, see Resources.
There are two ways you can take advantage of pureQuery for existing applications that are not written with the pureQuery API. You can use these methods separately or together:
- Use client optimization to have the Optim pureQuery Runtime capture SQL (and associated performance and application metadata) while the Java program is running. This approach is flexible in that you can use it with any Java application, including packaged applications. However, it does require some additional setup and requires execution of various paths in the code to ensure that all SQL is captured. After the SQL is captured, you can use the full power of IBM Data Studio for modifying the SQL, visualizing access paths, conducting performance testing, and doing impact analysis. Resources contains a link to a tutorial that describes how to use client optimization.
- Use integration modules to enable programs to take advantage of the full power of pureQuery and IBM Data Studio without requiring an initial capture step. This is a new approach and is the focus of this series. With this approach, the integration module extracts the SQL and its metadata before the Hibernate/iBATIS program is run. This support was previously available for WebSphere® OpenJPA (see Resources for a link to an article on this subject). Now, with new integration modules, this support is also available for Hibernate and iBATIS applications. With these modules, you can use tooling to visualize and optimize SQL without performing a runtime capture (you can also use it in conjunction with capture, as described in Gathering SQL not captured by integration module.)
The IBM Integration Module for Hibernate and pureQuery enables the following benefits for your Hibernate applications:
- Your Hibernate applications can use static SQL without the need for
initial pureQuery client optimization capture. The Integration Module
contains a tool called
HibernateGenthat generates the SQL that will be used by the Hibernate application. The SQL can be bound statically for access to DB2 databases and can be reviewed, shared, optimized, and replaced without changing the application itself. The SQL can also be correlated with application code for tracing performance problems back to Hibernate source code, or it can be correlated with database object changes for impact analysis using IBM Data Studio. - Your Hibernate applications can use the heterogeneous batching
feature in pureQuery. This feature enables you to batch multiple
INSERT,UPDATE, andDELETErequests even when they reference multiple tables. For example, an update of an employee object that requires updating several database tables that make up the object could be done in a single trip to the database rather than multiple trips. Each network round trip to the database server is expensive in terms of performance. So having fewer round trips but larger batches is better for performance. This support is available for DB2 and for Informix databases.
This article describes how to set up and use the Integration Module for Hibernate and pureQuery. We first describe the static SQL feature of the IBM Integration Module and then the heterogeneous batching feature. The article assumes you have some knowledge of Hibernate and that you are using Hibernate V3.2 or later. You can take advantage of the benefits provided by the Integration Module without making changes to your application code.
What is static SQL and how can it help performance?
Static SQL in DB2 is a powerful capability that enables you to streamline data access at runtime. It does this by performing some work ahead of time, such as determining the database access path. This helps make the runtime perform faster and in a more consistent manner. Figure 2 shows a comparison between the execution of dynamic SQL and static SQL.
Figure 2. Static SQL execution compared to dynamic SQL execution
The security model for static SQL is also different from that of dynamic SQL. With static SQL, users only receive privileges to execute the output of the bind process, known as the package. That package contains the SQL statements. Therefore, if all access to a table is static, you would only need to grant access to the package, not the entire table or view. Also, by using static SQL, you reduce the opportunity for malicious injection, which is a well-known security issue with dynamic SQL.
For more information about static SQL, see Resources.
Overview of static SQL creation with Integration Module for Hibernate and pureQuery
This section compares the default execution of a Hibernate application (dynamic SQL) to one that uses the Integration Module to create and use static SQL.
Dynamic SQL execution by Hibernate: JDBC
Figure 3 shows the high-level execution of a Hibernate application without the Integration Module. Hibernate generates the SQL statement strings and executes them against the database using the JDBC prepare and execute operations.
Figure 3. Dynamic SQL execution of a Hibernate application without the Integration Module
Static SQL execution by Hibernate with Integration Module: pureQuery
Figure 4 shows the high-level execution of a Hibernate application with the Integration Module. The DB2 packages have been created for static SQL execution. The figure depicts the following:
- A SQL generation step (
HibernateGen) gathers all the SQL that the Hibernate application uses into a pureQuery XML file (these files have a file type of .pdqxml). This file can then be used in IBM Data Studio for tuning or impact analysis without requiring aBINDstep. The pureQuery XML capture file can be bound into static packages using the StaticBinder. - During application execution, the pureQuery Runtime redirects the JBDC calls to execute the DB2 static packages that were created in the generation step. The pdq.properties file controls the pureQuery Runtime execution behavior (dynamic or static) and also provides the location of the pureQuery XML file.
Figure 4. Static SQL execution of a Hibernate application with the Integration Module
Which SQL statements can be extracted from a Hibernate application?
The following SQL statements can be extracted by the Integration Module:
- SQL for Named Queries — both Java Persistence Query Language (JPQL) and Hibernate Query Language (HQL)
- SQL for Named Native Queries — named SQL Queries, both Java Persistence API (JPA) and Hibernate API specified
- SQL for Create, Read, Update, and Delete (CRUD) operations on Hibernate Beans/entities — both JPA and native API
- SQL for CRUD operations on Collections in Hibernate Beans/entities — both JPA and native API
- SQL for Criteria Queries — Native Hibernate Criteria API
SQL for In-line Queries are not generated by the Integration Module. You should avoid embedding HQL queries in your Java code unless they are truly dynamic queries. In such cases, static SQL cannot be applied to those queries. For information on externalizing HQL queries, see Resources.
What is heterogeneous batching and how can it help performance?
Hibernate uses JDBC for data access. JDBC supports a form of batching called homogeneous batching. This form of batching is restricted to single tables and to multiple executions of the same SQL statement as long as the statement has input parameters. However, when updating multiple entity types in a transaction, or when using entity mappings involving secondary tables, join inheritance, or table-per-class inheritance, a transaction can contain multiple SQL statements involving multiple tables. So in these cases, JDBC batching is not effective.
In addition, JDBC batching only supports adding different SQL statements to the batch as long as the SQL statements do not have any input parameters. However, typical applications that use Hibernate (or OpenJPA) would use SQL statements with input parameters due to the dynamic nature of the applications. In those scenarios, JDBC batching does not allow batching of different SQL statements.
Consider the batching scenario, shown in Listing 1.
Listing 1. A batching scenario
PreparedStatement pstmt = conn.prepareStatement("update T1 set name=? where id=?");
pstmt.setString(1,"name1");
pstmt.setInt(2,1);
pstmt.addBatch();
pstmt.setString(2,"name2");
pstmt.setInt(2,2);
pstmt.addBatch();
pstmt.executeBatch();
|
You cannot add the SQL statement
delete from T1 where ID = ? to the batch shown
in Listing 1. Instead, you would need to create another
preparedStatement for the delete, and another
message would be sent to the database server to prepare the statement for
the delete query.
You may be able to use heterogeneous batching, which is available with pureQuery, to improve performance for situations not covered by JDBC batching. For example, look again at the scenario shown in Listing 1. Using pureQuery, you could add the SQL statements in Listing 2 to the same pureQuery batch. pureQuery would then take care of preparing the statement and sending all these statements to the database server in a single message. This can improve performance.
Listing 2. Heterogeneous batching
update T1 set name=? where id=? delete from T1 where ID = ? |
Table 1 summarizes some characteristics of programs that can take advantage of pureQuery heterogeneous batching.
Table 1. Scenarios where pureQuery heterogeneous batching can be useful
| Program characteristic | Supported by JDBC homogeneous batching? | Supported by pureQuery heterogeneous batching? |
|---|---|---|
| Batch the updates to entities that map to multiple tables | No | Yes |
| Batch the updates against the same table but different columns | No (unless using literals instead of input parameters) | Yes |
Batch multiple operations (INSERT,
UPDATE,
DELETE) against the same table | No (unless using literals instead of input parameters) | Yes |
| Batch updates against multiple entities | No (unless using literals instead of input parameters) | Yes |
Heterogeneous batching does not provide additional application performance for the following scenarios:
- The application consists only of
SELECTstatements. This is because heterogeneous batching in the current release of the Integration Module only improves performance ofINSERT,UPDATE, andDELETEstatements. - The application uses entities with table identity or database sequence key generation.
Overview of heterogeneous batching enablement using the Integration Module
This section compares the execution of a Hibernate application that uses pureQuery heterogeneous batching to one that does not.
Default batching in Hibernate: JDBC
Figure 5 shows the high-level execution flow of the sample Hibernate
application without pureQuery heterogeneous batching. (The sample
application is available in the Download section.)
Hibernate generates INSERT,
UPDATE, and DELETE
statement strings and passes them to the class persister for execution.
The persister calls the batcher to perform the JDBC prepare and execute
operations.
Figure 5. Execution of a Hibernate application without pureQuery heterogeneous batching
Heterogeneous batching in Hibernate: pureQuery
Figure 6 shows the high-level execution flow of the sample Hibernate
application with pureQuery heterogeneous batching. Hibernate generates
INSERT, UPDATE, and
DELETE statement strings and passes them to the
class persister for execution.
The IBM Integration Module implements the batcher, eventListener, and
persister interfaces to enable SQL INSERT,
UPDATE, and DELETE
statements to be routed to pureQuery rather than JDBC and to take
advantage of pureQuery batching. The FlushListener notifies the batcher to
execute the batch once all the SQL statements are added to the batch from
a transaction. The current version of the Integration Module does not
support heterogeneous batching of SELECT
statements, so they do not flow through the custom batcher and persister.
Figure 6. Execution of a Hibernate application with pureQuery heterogeneous batching
Prerequisites for using the IBM Integration Module for Hibernate and pureQuery
The Resources section contains a link you can use to download the IBM Integration Module for Hibernate and pureQuery.
To use the pureQuery Integration Module, you must have:
- Optim pureQuery Runtime V2.2.0.3 or higher (see Resources to download a 30-day trial version of IBM Data Studio and Optim pureQuery Runtime. The runtime must be on the same machine as IBM Data Studio.)
- Ensure that you have pdq.jar and pdqmgmt.jar in your classpath.
- IBM Data Server Driver for JDBC and SQLJ V3.58 or V4.7 (available with IBM Data Studio, or see Resources to download).
- Hibernate V3.2 or later.
The IBM Integration Module for Hibernate and pureQuery works when the application is accessing any of the following database systems:
- DB2 for Linux®, UNIX®, and Windows® V8.2, Fix Pack 11 or later
- DB2 for Linux, UNIX, and Windows V9.1, V9.5, or V9.7
- DB2 for i V5R4 (heterogeneous batching feature only)
- DB2 for z/OS® V8 or later
- Informix Dynamic Server V11.10 or V11.5 (heterogeneous batching feature only)
Sample application used in this article
The DemoHibApp.zip file (see Download) contains a sample application you can use to try out this capability if you are using Hibernate V3.2 or later.
Enabling pureQuery static SQL for Hibernate applications
This section uses the sample Hibernate application to help explain how the IBM Integration Module works for static SQL.
The latest version of HibernateGen is available
as a plug-in for IBM Data Studio. The installation and usage of the
HibernateGen plug-in to generate the pureQuery
XML is described in the following sections. To use HibernateGen via
a command prompt, refer to the
HibernateStaticGeneratorQuickStart.pdf
available for download with the Integration module.
Installing the HibernateGen plug-in for generating pureQuery XML file
Follow the steps below to install the HibernateGen plug-in:
- Copy the file hibernategen.jar under the plugins folder of the Integration Module into the plugins folder under the installation directory of IBM Data Studio.
- Restart IBM Data Studio with the
-cleanoption.
Say your IBM Data Studio is installed at location C:\IBM\DS, issue the command as in Listing 3 from directory C:\IBM\DS to restart IBM Data Studio.
Listing 3. Launching IBM Data Studio with the
-clean option
C:\IBM\DS> eclipse.exe -clean
|
Setting up the sample application
Unzip the DemoHibApp.zip file and import the sample application as a Java project (named DemoHibApp) into your IBM Data Studio workspace (C:\workspace\, for example).
The sample application uses Hibernate V3.2, the Hibernate entity manager, and the Hibernate annotations. Therefore, to compile the application, you need to have the following JAR files in your classpath along with the other JAR files required for Hibernate:
- hibernate3.jar
- hibernate-entitymanager.jar
- hibernate-annotations.jar
- hibernate-common-annotations.jar
- ejb3-persistence.jar
- dom4j-1.6.1.jar
- log4j-1.2.15.jar
- javassist-3.4.GA.jar
- antlr-2.7.6.jar
- commons-collections-3.1.jar
- slf4j-api-1.5.2.jar
- slf4j-log4j12-1.5.2.jar
- commons-logging-1.0.4.jar
- pdqhibernate-3.2.jar (shipped with integration module)
To use the IBM Integration Module, you also need the following PDQ and DB2 JAR files in your classpath:
- pdq.jar
- pdqmgmt.jar
- db2jcc jars
To add the PDQ and DB2 JAR files into the project's build path, right-click on
the project and select Data Access Development >
Add pureQuery support.... This will create a folder
named dataAccessFolder under the project and
adds the necessary JAR files to the project's build path.
Prepare the sample application's database
Create a database named DEMODB. Hibernate automatically creates the needed tables during the first run of the application.
The sample application uses the following entities:
- CustomerInfo with subclass Customer
- District
- Item
- Order
Generate the pureQuery XML file for static SQL execution
To generate the pureQuery XML file for the sample application, invoke
HibernateGen by right-clicking the
configuration file (hibernate.cfg.xml or persistence.xml), then select
Data Access Development > HibernateGen..., as
shown below. This will launch a dialog wizard, as shown below.
Provide the database connection details, which will be used to connect to
the database and validate the generated SQLs. You can also specify other
optional parameters, like the name of the pureQuery XML file that will be
generated. To know more about what the options stand for in the dialog
wizard, hover the mouse over respective widgets.
Click Finish to generate the pureQuery XML file.
Figure 7. Invoking HibernateGen in IBM Data Studio
Figure 8. Dialog Wizard launched when HibernateGen is invoked
On completion, verify that a file named hibernate.cfg.xml.pdqxml (if using JPA, file with name as the persistent-unit name specified and extension pdqxml) was created in the directory dataAccessFolder under your project. Details of the number of SQLs captured and other logs will be displayed on the HibernateGen Messages console.
Figure 9 shows a screenshot of the HibernateGen Messages after run on the sample application.
Figure 9. Screenshot showing HibernateGen messages logged on console
NOTE:
- If using Hibernate-native APIs, by default, the connection details are fetched from the hibernate.cfg.xml file. To select a different database, click the Edit... button on the dialog box and select the database against which the generated SQLs are to be validated.
- If using JPA, you will need to specify the
puNameto be used. - If using JPA, database connection details will be fetched from
puName. If connection details are provided along withpuName, those will be used. HibernateGengenerates the SQL with the JDBC default forresultsetHoldability. However, WebSphere Application Server uses a different default forresultsetHoldability. If you are deploying your Hibernate application on WebSphere Application Server, check theresultsetHoldabilityof the WebSphere Application Server data source and specify it in theresultsetHoldabilityon the pureQuery XML Options page.
Configure and bind the pureQuery XML file for static SQL execution
In order to run the application statically, you first need to configure, then bind the generated hibernate.cfg.xml.pdqxml file.
Follow the steps below to configure and bind hibernate.cfg.xml.pdqxml:
- Edit the file dataAccessFolder\Default.genProps in the DemoHibApp project to add the line as shown in Listing 4.
- Save the file. If asked for rebuild of project, click Yes.
- Right-click dataAccessFolder\hibernate.cfg.xml.pdqxml and invoke the binder by selecting Data Access Development > Bind..., as shown in Figure 10.
Listing 4. Line to add in Default.genProps file
C:\workspace\DemoHibApp\dataAccessFolder\hibernate.cfg.cml.pdqxml= -rootPkgName test |
Figure 10. Binding the pureQuery XML file in IBM Data Studio
Run the application using static SQL
Edit Provider in persistence.xml
If you are using persistence.xml, in order to enable pureQuery static SQL
support, you need to ensure that the provider property is specified and set
to com.ibm.pdq.hibernate3.PDQHibernatePersistence
in the persistence.xml file.
For the sample application, this property is already set, as shown in the first line of Listing 5.
Listing 5. Required property in persistence.xml to enable pureQuery static SQL
<provider>com.ibm.pdq.hibernate3.PDQHibernatePersistence</provider> <!-- <provider>org.hibernate.ejb.HibernatePersistence</provider> --> |
If you are using native Hibernate APIs, you need to use the Configuration Classes provided in the integration module as described below. Refer to the section Why is the Provider change required to enable static SQL usage with Hibernate? to know why provider change is required.
Edit pdq.properties file
Listing 6 shows the properties used in the pdq.properties file included with the sample application.
Listing 6. pdq.properties file for the sample application
pdq.captureMode=OFF pdq.executionMode=STATIC pdq.pureQueryXml=hibernate.cfg.xml.pdqxml |
The following explains the property settings:
captureMode=OFFindicates that capture is turned off. You only need capture turned on when you are using the client optimization process described later in this article.executionMode=STATICindicates that static execution mode should be used (instead of dynamic).pureQueryXml=hibernate.cfg.xml.pdqxmlprovides the name and location of the pureQuery XML file that was created when you ranHibernateGen.
For more information about all PDQ properties, see Resources.
In the actual pdq.properties file included with the sample application,
all the properties are commented out (each line begins with
#). Before proceeding, edit the file, delete
the # character to uncomment the three
properties, and save the file. When you are done, the file should look
like Listing 6.
If you are using persistence.xml, you can now run the sample application by executing MainClient.java. To run MainClient.java, right-click on the file and select Run As > Java Application. SQL statements from the application are executed statically.
If you are using hibernate.cfg.xml, you can run the sample application by executing MainClientSessionFactory.java. Notice that in MainClientSessionFactory.java, we use the configuration class provided by the Integration Module as shown in Listing 7. This is required to solve the problem described in section Why is the Provider change required to enable static SQL usage with Hibernate?
Listing 7. Usage of the configuration class provided by the Integration Module in hibernate.cfg.xml applications
public static void main(String[] args) {
_sFactory = new PDQAnnotationConfiguration().configure()
.buildSessionFactory();
|
If you are deploying and running the sample application on WebSphere Application Server, see Resources regarding pureQuery client optimization in Web application environments.
Enabling pureQuery heterogeneous batching for Hibernate applications
Set properties in persistence.xml
To enable pureQuery heterogeneous batching with Hibernate V3.2 applications using the IBM Integration Module, ensure that the following properties are specified in your persistence.xml file:
hibernate.jdbc.factory_classmust be set tocom.ibm.pdq.hibernate3.batcher.PDQBatcherFactory.hibernate.ejb.event.flushmust be set tocom.ibm.pdq.hibernate3.ejb3.listener.PDQEJB3FlushEventListener.
The sample application includes these properties in a comment section of persistence.xml. So for the sample application, you can simply edit the persistence.xml file, uncomment those properties, and save the file. With the comments removed, this section of the file would look similar to Listing 8.
Listing 8. Required properties in persistence.xml to enable pureQuery heterogeneous batching
<property name="hibernate.jdbc.factory_class" value = "com.ibm.pdq.hibernate3.batcher.PDQBatcherFactory"/> <property name="hibernate.ejb.event.flush" value = "com.ibm.pdq.hibernate3.ejb3.listener.PDQEJB3FlushEventListener"/> <property name="hibernate.connection.provider_class" value= "org.hibernate.connection.C3P0ConnectionProvider"/> |
For optimal performance, it is important to have statement caching turned
on and set to a reasonable size. Hibernate applications can use c3p0
connection pooling to enable statement caching in J2SE applications. To
enable c3P0 statement caching, set the
hibernate.connection.provider_class to
org.hibernate.connection.C3P0ConnectionProvider
in the persistence.xml as shown in Listing 8. For J2EE applications, you can set statement caching on the application
server.
The c3p0.properties file specifies the size of the statement cache as shown in Listing 9.
Listing 9. c3p0 file size specification
c3p0.maxStatementsPerConnection = 200 |
The size you specify for the statement cache size should be relatively large. Typically, this is in the range of 100-200. In a J2SE environment, if the statement cache is too small or the batch size too large, you may receive SQL exceptions resulting from inconsistencies in the statement cache.
If your Hibernate application is using the native Hibernate API, you must specify the following properties in the hibernate.cfg.xml file:
hibernate.jdbc.factory_classmust be set tocom.ibm.pdq.hibernate3.batcher.PDQBatcherFactory.- The flush event listener class must be set to
com.ibm.pdq.hibernate3.listener.PDQFlushListener.
The sample application includes these properties in comment sections of hibernate.cfg.xml. So for the sample application, you can simply edit the hibernate.cfg.xml file, uncomment those properties, and save the file. With the comments removed, the sections of the file would look similar to Listing 10.
Listing 10. Required properties in hibernate.cfg.xml to enable pureQuery heterogeneous batching
<property name="hibernate.jdbc.factory_class">
com.ibm.pdq.hibernate3.batcher.PDQBatcherFactory</property>
<event type="flush">
<listener class="com.ibm.pdq.hibernate3.listener.PDQFlushListener"/>
</event>
|
Run the application with pureQuery heterogeneous batching
Listing 11 shows the persistence.xml file configured to run the sample application with pureQuery static SQL and heterogeneous batching enabled. It defines the entities, database connection properties, and other required configuration properties.
The batch_size property controls the size of a
JDBC batch. It is also used by the Integration Module to control the size
of a pureQuery batch. The recommended batch size is in the range of
50-100. Most of the benefit of batching is seen in the range 1-25. If the
batch size is too large, you may receive an SQL exception with the SQL
code -805. This means the database server has run out of dynamic cursors.
Listing 11. persistence.xml for the sample Hibernate application
<persistence-unit name="test" transaction-type="RESOURCE_LOCAL">
<provider>com.ibm.pdq.hibernate3.PDQHibernatePersistence</provider>
<!-- <provider>org.hibernate.ejb.HibernatePersistence</provider> -->
<class>demo.CustomerInfo</class>
<class>demo.Customer</class>
<class>demo.District</class>
<class>demo.Item</class>
<class>demo.Order</class>
<properties>
<property name="hibernate.connection.driver_class"
value="com.ibm.db2.jcc.DB2Driver"/>
<property name="hibernate.connection.url"
value="jdbc:db2:demodb"/>
<property name="hibernate.connection.username"
value="xx"/>
<property name="hibernate.connection.password"
value="xx"/>
<property name="hibernate.dialect" value="org.hibernate.dialect.DB2Dialect"/>
<property name="hibernate.hbm2ddl.auto"
value="update"/>
<property name="hibernate.jdbc.batch_size"
value = "100"/>
<property name="hibernate.jdbc.factory_class" value =
"com.ibm.pdq.hibernate3.batcher.PDQBatcherFactory"/>
<property name="hibernate.ejb.event.flush" value =
"com.ibm.pdq.hibernate3.ejb3.listener.PDQEJB3FlushEventListener"/>
<property name="hibernate.connection.provider_class" value=
"org.hibernate.connection.C3P0ConnectionProvider"/>
</properties>
</persistence-unit>
|
NOTE: If you are using Informix Dynamic Server, you need
to set the hibernate.dialect property in
persistence.xml to InformixDialect as
follows:
<property name="hibernate.dialect" value="org.hibernate.dialect.InformixDialect" /> |
Now you can run the sample application by executing MainClient.java. The custom pureQuery batcher ensures that heterogeneous batching occurs.
If you are using hibernate.cfg.xml, run the sample application by
executing the MainClientSessionFactory.java
class, which uses SessionFactory API. You should also include the
c3p0.properties file in the application path
with the c3p0.maxStatementsPerConnection
property specified.
Comparing sample application elapsed-time performance
This section describes how to determine if the performance of the sample application improves when running it with pureQuery heterogeneous batching.
To determine if pureQuery heterogeneous batching provides an elapsed-time benefit when running the sample application, you need to run the application with and without heterogeneous batching.
One method to get the elapsed time is to run the application using
MainClient.java or
MainClientSessionFactory.java. Each of these
displays the total elapsed time it takes for the program to run.
Another method is to turn on the trace for the batcher. This lets you actually see how the Integration Module is creating the heterogeneous batches by showing you when the batch has started, when statements are added to the batch, and when the batch ends. The sample application includes a file named log4j.properties you can use to turn on the trace for the batcher. After the batch ends, the current directory contains a file named demo.log that contains messages similar to what is shown in Listing 12.
Listing 12. Trace messages in the demo.log
TRACE main com.ibm.pdq.hibernate3.batcher.PDQBatcher - PureQuery batch started
Added to the batch: insert into District (d_city, d_name, d_next_o_id,
d_state, d_street_1, d_street_2, d_tax, d_ytd, d_zip, version, d_id)
values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Added to the batch: insert into Customer (city, fname, lname, state,
street, version, zipcode, c_balance, c_credit, c_credit_lim, c_data,
c_delivery_cnt, c_discount, c_payment_cnt, c_since, c_ytd_payment,
district_d_id, c_d_id, c_id)
values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Added to the batch: insert into Item (i_data, i_name, i_price, version, i_id)
values (?, ?, ?, ?, ?)
Added to the batch: insert into Order (district_d_id, itime, o_all_local, o_c_id,
o_carrier_id, o_entry_d, o_ol_cnt, version, o_d_id, o_id)
values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Added to the batch: update District set d_city=?, d_name=?, d_next_o_id=?, d_state=?,
d_street_1=?, d_street_2=?, d_tax=?, d_ytd=?, d_zip=?, version=?
where d_id=? and version=?
Added to the batch: update Customer set city=?, fname=?, lname=?, state=?, street=?,
version=?, zipcode=?, c_balance=?, c_credit=?, c_credit_lim=?, c_data=?,
c_delivery_cnt=?, c_discount=?, c_payment_cnt=?, c_since=?, c_ytd_payment=?,
district_d_id=? where c_d_id=? and c_id=? and version=?
Added to the batch: update Item set i_data=?, i_name=?, i_price=?, version=?
where i_id=? and version=?
TRACE main com.ibm.pdq.hibernate3.batcher.PDQBatcher - About to execute the batch
TRACE main com.ibm.pdq.hibernate3.batcher.PDQBatcher - Batch Executed.
Executed batch size=7
TRACE main com.ibm.pdq.hibernate3.batcher.PDQBatcher - Number of batches
executed in the session = 1
|
The authors of this article tested the sample application with the batch size set to 100 against DB2 on a local machine and on a remote machine. This testing produced the following results on DB2 for Linux, UNIX, and Windows:
- When DB2 and the sample application are running on a local machine, pureQuery improved elapsed-time performance over JDBC by more than 40 percent, depending on the number of objects per transaction. This is depicted in Figure 11. Seven SQL statements are created per object (in turn, seven entity objects are created per object).
- When DB2 is running on a remote machine and the sample application is running on a local machine, elapsed-time performance improvements ranged from 12 percent to more than 50 percent, depending on the number of objects per transaction. This is depicted in Figure 12. Seven SQL statements are created per object (in turn, seven entity objects are created per object).
In Figure 11, the x-axis represents the number of objects per transaction. The y-axis represents to the elapsed time taken for the transaction in milliseconds. Notice that the chart indicates that JDBC batching is taking more elapsed time than pureQuery.
Figure 11. JDBC vs. pureQuery with local database server
In Figure 12, the x-axis represents the number of objects per transaction. The y-axis represents the elapsed time taken for the transaction in milliseconds. Notice that the chart indicates that JDBC batching is taking more elapsed time than pureQuery. Also notice that, in this case, as the number of objects per transaction increases, the delta elapsed time between pureQuery and JDBC increases.
Figure 12. JDBC vs. pureQuery with remote database server
Using static SQL and heterogeneous batch together
Both the static SQL and heterogeneous batching features are orthogonal. This means that a Hibernate application can use static SQL and heterogeneous batching at the same time, or it can use each feature independently. There is no overlap of any steps in setting up each feature. To set up static SQL, follow the steps listed in Enabling pureQuery static SQL for Hibernate applications. To set up heterogeneous batching, follow the steps listed in Enabling pureQuery heterogeneous batching for Hibernate applications. To set up both, simply follow the steps in both sections; the order does not matter.
Gathering SQL not captured by integration module
If your Hibernate application uses JPA V2.0 Criteria API or your application uses inline queries, you can use incremental capture to get the SQL from these queries and convert to static SQL. To do this, change the pdq.properties, as shown in Listing 13.
Listing 13. pdq.properties for incremental capture
pdq.captureMode=ON pdq.executionMode=STATIC pdq.pureQueryXml=test.pdqxml pdq.OutputPureQueryXml=incremental_capture.pdqxml |
When you run the application, it creates a file named
incremental_capture.pdqxml that contains SQL
not present in the PDQXML file created by the
HibernateGen command. You can use the pureQuery
Merge utility to merge the two files. This gives you a single consolidated
PDQXML that contains SQL from both files.
To run the Merge utility:
- Right-click hibernate.cfg.xml.pdqxml.
- Select Data Access Development > Merge with....
- Select incremental_capture.pdqxml in the file browser that popped up.
- Save the output pureQuery XML file as test_Merged.pdqxml.
For more information about the pureQuery Merge utility, see Resources.
After you run the sample Merge utility, the test_Merged.pdqxml file is ready for binding. Run the StaticBinder on test_Merged.pdqxml, as explained in Configure and bind the pureQuery XML file for static SQL execution, then change the pdq.properties as shown in Listing 14.
Listing 14. pdq.properties for static SQL execution after incremental capture
pdq.captureMode=OFF pdq.executionMode=STATIC pdq.pureQueryXml=test_Merged.pdqxml |
Run the application again. Now the SQLs not generated by HibernateGen will be executed statically.
Why is the provider change required to enable static SQL usage with Hibernate?
The usage of static SQL requires that the SQL string generated by Hibernate be exactly the same each time. There is a known issue in Hibernate that causes Hibernate to generate the same SQL, but with different column aliases across runs of the application. (Refer to this forum post on the Hibernate website for details: how are column alias names generated).
To enable static SQL, you need to work around this issue in one of the following ways:
- Apply the Hibernate Defect HHH-2448 patch to Hibernate.
- Use the Configuration classes provided by the Integration Module.
This article, Part 1 of a two-part "Accelerate Hibernate and iBATIS applications using pureQuery" series, described how to use IBM Integration Module for pureQuery and Hibernate with Hibernate applications (Hibernate JPA applications as well as Hibernate-native applications). It also provided a brief overview of how to use pureQuery client optimization incremental capture and informal elapsed-time performance measurements. Part 2 focuses on iBATIS applications.
The IBM Integration Module for pureQuery and Hibernate delivers significant performance gains on DB2 when using static SQL. It also delivers significant elapsed time gains on DB2 and Informix servers when heterogeneous batching is used with suitable applications. These performance improvements apply in both J2EE and J2SE environments.
The authors would like thank Asim Singh and Ambrish Bhargava for reviewing the article and providing great feedback.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample Hibernate application for this article | DemoHibApp.zip | 32KB | HTTP |
Information about download methods
Learn
- "Optimize your existing JDBC applications using pureQuery" is a must-read tutorial on the process of
client optimization.
- Check out "Integrating JPA and pureQuery: Leveraging DB2 static execution for
the Java Persistence API."
- Visit IBM pureQuery for an overview of the pureQuery
platform, including the value it brings, which products it ships with, and
an FAQ.
- ""No Excuses" Database Programming for Java" (IBM Database Magazine, May 2008)
explains the benefits of static SQL and how to exploit those benefits
using the Java language.
- "Hibernate:
Externalize HQL Queries" (Javalobby, August 2005) describes how to
use named queries to externalize HQL queries.
- More information on pureQuery properties, Configure Utility, StaticBinder, and Merge Utility.
-
Visit the developerWorks Optim family page to learn more about Optim
solutions, find technical documentation, how-to articles, education,
downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
-
Managing pureQuery client optimization in Web application
environments (developerWorks, February 2010). Learn how to
optimize applications on a single application server node.
- Learn more about Information Management at the developerWorks Information Management
zone. Find technical documentation,
how-to articles, education, downloads, product information, and
more.
- Stay current with
developerWorks technical events and webcasts.
- Follow developerWorks on
Twitter.
Get products and technologies
- Download the IBM Integration Module for pureQuery and Hibernate.
- Download and try
IBM
Data Studio and pureQuery Runtime.
- Download the IBM Data Server Driver for JDBC and SQLJ (JCC Driver).
- Build your next
development project with
IBM trial software,
available for download directly from developerWorks.
Discuss
- Participate in the discussion forum.
- Check out the Managing the data lifecycle blog.
- Check out the
developerWorks
blogs and get involved in the
developerWorks community.

Praveen Devarao is a software engineer at IBM India Software Labs working on varied open source technologies and offerings for IBM DB2 and IBM Informix. He started his career at IBM and has about four years of experience on technologies related to relational databases, Ruby on Rails, Perl, Python, PHP, and Java data access frameworks like Hibernate.

Sandhya Turaga is a Software Engineer in the object query development team at IBM Silicon Valley Lab. Previously, while working as an intern at IBM, she worked on query performance improvement for ObjectGrid's object query language. She graduated with an MS in Computer Science from San Jose State University in May 2008, where her thesis work was on Grid computing.

Mario Briggs leads the open source offerings for IBM DB2 and IBM Informix, including PHP, Ruby/Rails, Python/Django, Perl, and Java data access frameworks. He also leads the RDF support in DB2. He has about 14 years of experience in software development with many of those years spent in the area of data access, relational engines, and application-database performance.

Kathy Zeidenstein has worked at IBM for a bazillion years. She has worked in Information Development, product management, and product marketing. Currently, she is the manager responsible for Information Development for Optim solutions for database development, administration, and design. Previously, she worked on the IBM Optim Solutions technical enablement team and was responsible for community development and communications.





