Accelerate Hibernate and iBATIS applications using pureQuery, Part 1: Enable static SQL and heterogeneous batching for Hibernate applications

When extended with the downloadable IBM® Integration Module, the IBM Optim® pureQuery Runtime simplifies the process of generating DB2® static SQL for Hibernate and iBATIS applications. It does this without requiring you to make changes to your application code or to gather SQL from production workloads. The Optim pureQuery Runtime also enables Hibernate and iBATIS applications that access DB2 or Informix® to benefit from the heterogeneous batching feature in pureQuery. With it, you can batch multiple INSERT, UPDATE, and DELETE requests before sending them across the network, even when the requests reference multiple tables. This article is Part 1 of a two-part "Accelerate Hibernate and iBATIS applications using pureQuery" series. It describes using the IBM Integration Module with Hibernate applications and includes a downloadable sample application that illustrates how to easily enable static SQL and heterogeneous batch functions with Hibernate applications.

Share:

Praveen Devarao (praveendrl@in.ibm.com), Software Engineer, IBM

Praveen DevaraoPraveen Devarao is a software engineer at IBM India Software Labs working on varied open source technologies and offerings for IBM DB2 and IBM Informix. He started his career at IBM and has about four years of experience on technologies related to relational databases, Ruby on Rails, Perl, Python, PHP, and Java data access frameworks like Hibernate.



Sandhya Turaga (sturaga@us.ibm.com), Software Engineer, IBM

Sandhya Turaga photoSandhya Turaga is a Software Engineer in the object query development team at IBM Silicon Valley Lab. Previously, while working as an intern at IBM, she worked on query performance improvement for ObjectGrid's object query language. She graduated with an MS in Computer Science from San Jose State University in May 2008, where her thesis work was on Grid computing.



Mario Briggs (mario.briggs@in.ibm.com), Senior Software Engineer, IBM

Mario  Briggs photoMario Briggs leads the open source offerings for IBM DB2 and IBM Informix, including PHP, Ruby/Rails, Python/Django, Perl, and Java data access frameworks. He also leads the RDF support in DB2. He has about 14 years of experience in software development with many of those years spent in the area of data access, relational engines, and application-database performance.



Kathy Zeidenstein (krzeide@us.ibm.com), Senior Software Engineer, IBM

Author Photo: Kathy ZeidensteinKathy Zeidenstein has worked at IBM for a bazillion years. She has worked in Information Development, product management, and product marketing. Currently, she is the manager responsible for Information Development for Optim solutions for database development, administration, and design. Previously, she worked on the IBM Optim Solutions technical enablement team and was responsible for community development and communications.



Abhigyan Agrawal, Software Engineer, Consultant

Abhigyan Agrawal photoAbhigyan Agrawal was a software engineer on the Open Source Team at IBM India Software Labs. He worked on providing DB2 Static SQL support in iBATIS and developing the IBM Python drivers. He graduated with an MTech in Information Technology from IIIT, Bangalore.



21 October 2011 (First published 12 August 2010)

Also available in Chinese Russian Portuguese

Overview

pureQuery is a high-performance data access platform that consists of development tools, a Java™ API, a runtime, and monitoring services. Using the lightweight pureQuery API, you can write data access code that accesses DB2, Informix, and even Oracle databases. However, many of the benefits of pureQuery — the ability to visualize and modify generated SQL, use of database access best practices, support for performance features such as heterogeneous batching, and the ability to gather performance data from queries — do not require use of the API. For more information about pureQuery, see Resources.

A key use case for using pureQuery with existing applications can be seen with DB2 data servers, in which you can convert the SQL execution mode from dynamic execution (the default for Java applications) to static execution. This removes the need to do preparation at runtime and can reduce SQL injection risk. For more information about the benefits of static execution, refer to Resources.

With pureQuery, you can also take advantage of IBM Data Studio capabilities, such as visualizing performance metrics and correlating generated SQL statements with the originating Java source code. This makes pureQuery beneficial for use with existing applications that access Informix and Oracle databases. Figure 1 is a snapshot of IBM Data Studio with some key capabilities highlighted.

Figure 1. IBM Data Studio outline view
Screenshot shows the correlation between a query in Java source code with referenced tables in schema

For more information about the benefits of IBM Data Studio, see Resources.

There are two ways you can take advantage of pureQuery for existing applications that are not written with the pureQuery API. You can use these methods separately or together:

  • Use client optimization to have the Optim pureQuery Runtime capture SQL (and associated performance and application metadata) while the Java program is running. This approach is flexible in that you can use it with any Java application, including packaged applications. However, it does require some additional setup and requires execution of various paths in the code to ensure that all SQL is captured. After the SQL is captured, you can use the full power of IBM Data Studio for modifying the SQL, visualizing access paths, conducting performance testing, and doing impact analysis. Resources contains a link to a tutorial that describes how to use client optimization.
  • Use integration modules to enable programs to take advantage of the full power of pureQuery and IBM Data Studio without requiring an initial capture step. This is a new approach and is the focus of this series. With this approach, the integration module extracts the SQL and its metadata before the Hibernate/iBATIS program is run. This support was previously available for WebSphere® OpenJPA (see Resources for a link to an article on this subject). Now, with new integration modules, this support is also available for Hibernate and iBATIS applications. With these modules, you can use tooling to visualize and optimize SQL without performing a runtime capture (you can also use it in conjunction with capture, as described in Gathering SQL not captured by integration module.)

The IBM Integration Module for Hibernate and pureQuery enables the following benefits for your Hibernate applications:

  • Your Hibernate applications can use static SQL without the need for initial pureQuery client optimization capture. The Integration Module contains a tool called HibernateGen that generates the SQL that will be used by the Hibernate application. The SQL can be bound statically for access to DB2 databases and can be reviewed, shared, optimized, and replaced without changing the application itself. The SQL can also be correlated with application code for tracing performance problems back to Hibernate source code, or it can be correlated with database object changes for impact analysis using IBM Data Studio.
  • Your Hibernate applications can use the heterogeneous batching feature in pureQuery. This feature enables you to batch multiple INSERT, UPDATE, and DELETE requests even when they reference multiple tables. For example, an update of an employee object that requires updating several database tables that make up the object could be done in a single trip to the database rather than multiple trips. Each network round trip to the database server is expensive in terms of performance. So having fewer round trips but larger batches is better for performance. This support is available for DB2 and for Informix databases.

This article describes how to set up and use the Integration Module for Hibernate and pureQuery. We first describe the static SQL feature of the IBM Integration Module and then the heterogeneous batching feature. The article assumes you have some knowledge of Hibernate and that you are using Hibernate V3.2 or later. You can take advantage of the benefits provided by the Integration Module without making changes to your application code.


What is static SQL and how can it help performance?

Static SQL in DB2 is a powerful capability that enables you to streamline data access at runtime. It does this by performing some work ahead of time, such as determining the database access path. This helps make the runtime perform faster and in a more consistent manner. Figure 2 shows a comparison between the execution of dynamic SQL and static SQL.

Figure 2. Static SQL execution compared to dynamic SQL execution
Dynamic SQL flow chart shows six steps as opposed to static SQL fow chart that has only three steps

The security model for static SQL is also different from that of dynamic SQL. With static SQL, users only receive privileges to execute the output of the bind process, known as the package. That package contains the SQL statements. Therefore, if all access to a table is static, you would only need to grant access to the package, not the entire table or view. Also, by using static SQL, you reduce the opportunity for malicious injection, which is a well-known security issue with dynamic SQL.

For more information about static SQL, see Resources.


Overview of static SQL creation with Integration Module for Hibernate and pureQuery

This section compares the default execution of a Hibernate application (dynamic SQL) to one that uses the Integration Module to create and use static SQL.

Dynamic SQL execution by Hibernate: JDBC

Figure 3 shows the high-level execution of a Hibernate application without the Integration Module. Hibernate generates the SQL statement strings and executes them against the database using the JDBC prepare and execute operations.

Figure 3. Dynamic SQL execution of a Hibernate application without the Integration Module
Image shows that Hibernate application generates SQL strings, goes to JDBC driver, then the database

Static SQL execution by Hibernate with Integration Module: pureQuery

Figure 4 shows the high-level execution of a Hibernate application with the Integration Module. The DB2 packages have been created for static SQL execution. The figure depicts the following:

  1. A SQL generation step (HibernateGen) gathers all the SQL that the Hibernate application uses into a pureQuery XML file (these files have a file type of .pdqxml). This file can then be used in IBM Data Studio for tuning or impact analysis without requiring a BIND step. The pureQuery XML capture file can be bound into static packages using the StaticBinder.
  2. During application execution, the pureQuery Runtime redirects the JBDC calls to execute the DB2 static packages that were created in the generation step. The pdq.properties file controls the pureQuery Runtime execution behavior (dynamic or static) and also provides the location of the pureQuery XML file.
Figure 4. Static SQL execution of a Hibernate application with the Integration Module
Image shows that HibernateGen creates pdqxml, which creates static packages in DB2

Which SQL statements can be extracted from a Hibernate application?

The following SQL statements can be extracted by the Integration Module:

  • SQL for Named Queries — both Java Persistence Query Language (JPQL) and Hibernate Query Language (HQL)
  • SQL for Named Native Queries — named SQL Queries, both Java Persistence API (JPA) and Hibernate API specified
  • SQL for Create, Read, Update, and Delete (CRUD) operations on Hibernate Beans/entities — both JPA and native API
  • SQL for CRUD operations on Collections in Hibernate Beans/entities — both JPA and native API
  • SQL for Criteria Queries — Native Hibernate Criteria API

SQL for In-line Queries are not generated by the Integration Module. You should avoid embedding HQL queries in your Java code unless they are truly dynamic queries. In such cases, static SQL cannot be applied to those queries. For information on externalizing HQL queries, see Resources.


What is heterogeneous batching and how can it help performance?

Hibernate uses JDBC for data access. JDBC supports a form of batching called homogeneous batching. This form of batching is restricted to single tables and to multiple executions of the same SQL statement as long as the statement has input parameters. However, when updating multiple entity types in a transaction, or when using entity mappings involving secondary tables, join inheritance, or table-per-class inheritance, a transaction can contain multiple SQL statements involving multiple tables. So in these cases, JDBC batching is not effective.

In addition, JDBC batching only supports adding different SQL statements to the batch as long as the SQL statements do not have any input parameters. However, typical applications that use Hibernate (or OpenJPA) would use SQL statements with input parameters due to the dynamic nature of the applications. In those scenarios, JDBC batching does not allow batching of different SQL statements.

Consider the batching scenario, shown in Listing 1.

Listing 1. A batching scenario
PreparedStatement pstmt = conn.prepareStatement("update T1 set name=? where id=?");
pstmt.setString(1,"name1");
pstmt.setInt(2,1);
pstmt.addBatch();

pstmt.setString(2,"name2");
pstmt.setInt(2,2);
pstmt.addBatch();

pstmt.executeBatch();

You cannot add the SQL statement delete from T1 where ID = ? to the batch shown in Listing 1. Instead, you would need to create another preparedStatement for the delete, and another message would be sent to the database server to prepare the statement for the delete query.

You may be able to use heterogeneous batching, which is available with pureQuery, to improve performance for situations not covered by JDBC batching. For example, look again at the scenario shown in Listing 1. Using pureQuery, you could add the SQL statements in Listing 2 to the same pureQuery batch. pureQuery would then take care of preparing the statement and sending all these statements to the database server in a single message. This can improve performance.

Listing 2. Heterogeneous batching
update T1 set name=? where id=?
delete from T1 where ID = ?

Table 1 summarizes some characteristics of programs that can take advantage of pureQuery heterogeneous batching.

Table 1. Scenarios where pureQuery heterogeneous batching can be useful
Program characteristicSupported by JDBC homogeneous batching?Supported by pureQuery heterogeneous batching?
Batch the updates to entities that map to multiple tablesNoYes
Batch the updates against the same table but different columnsNo (unless using literals instead of input parameters)Yes
Batch multiple operations (INSERT, UPDATE, DELETE) against the same tableNo (unless using literals instead of input parameters)Yes
Batch updates against multiple entitiesNo (unless using literals instead of input parameters)Yes

Heterogeneous batching does not provide additional application performance for the following scenarios:

  • The application consists only of SELECT statements. This is because heterogeneous batching in the current release of the Integration Module only improves performance of INSERT, UPDATE, and DELETE statements.
  • The application uses entities with table identity or database sequence key generation.

Overview of heterogeneous batching enablement using the Integration Module

This section compares the execution of a Hibernate application that uses pureQuery heterogeneous batching to one that does not.

Default batching in Hibernate: JDBC

Figure 5 shows the high-level execution flow of the sample Hibernate application without pureQuery heterogeneous batching. (The sample application is available in the Download section.) Hibernate generates INSERT, UPDATE, and DELETE statement strings and passes them to the class persister for execution. The persister calls the batcher to perform the JDBC prepare and execute operations.

Figure 5. Execution of a Hibernate application without pureQuery heterogeneous batching
Image shows that Hiberntae APIs go to batcher and persister for the entity operations

Heterogeneous batching in Hibernate: pureQuery

Figure 6 shows the high-level execution flow of the sample Hibernate application with pureQuery heterogeneous batching. Hibernate generates INSERT, UPDATE, and DELETE statement strings and passes them to the class persister for execution.

The IBM Integration Module implements the batcher, eventListener, and persister interfaces to enable SQL INSERT, UPDATE, and DELETE statements to be routed to pureQuery rather than JDBC and to take advantage of pureQuery batching. The FlushListener notifies the batcher to execute the batch once all the SQL statements are added to the batch from a transaction. The current version of the Integration Module does not support heterogeneous batching of SELECT statements, so they do not flow through the custom batcher and persister.

Figure 6. Execution of a Hibernate application with pureQuery heterogeneous batching
Image shows that operations go through custom listener, batcher, and persister to pureQuery runtime, then to data server driver, then to the database

Prerequisites for using the IBM Integration Module for Hibernate and pureQuery

The Resources section contains a link you can use to download the IBM Integration Module for Hibernate and pureQuery.

Software

To use the pureQuery Integration Module, you must have:

  • Optim pureQuery Runtime V2.2.0.3 or higher (see Resources to download a 30-day trial version of IBM Data Studio and Optim pureQuery Runtime. The runtime must be on the same machine as IBM Data Studio.)
  • Ensure that you have pdq.jar and pdqmgmt.jar in your classpath.
  • IBM Data Server Driver for JDBC and SQLJ V3.58 or V4.7 (available with IBM Data Studio, or see Resources to download).
  • Hibernate V3.2 or later.

Supported databases

The IBM Integration Module for Hibernate and pureQuery works when the application is accessing any of the following database systems:

  • DB2 for Linux®, UNIX®, and Windows® V8.2, Fix Pack 11 or later
  • DB2 for Linux, UNIX, and Windows V9.1, V9.5, or V9.7
  • DB2 for i V5R4 (heterogeneous batching feature only)
  • DB2 for z/OS® V8 or later
  • Informix Dynamic Server V11.10 or V11.5 (heterogeneous batching feature only)

Sample application used in this article

The DemoHibApp.zip file (see Download) contains a sample application you can use to try out this capability if you are using Hibernate V3.2 or later.


Enabling pureQuery static SQL for Hibernate applications

This section uses the sample Hibernate application to help explain how the IBM Integration Module works for static SQL.

The latest version of HibernateGen is available as a plug-in for IBM Data Studio. The installation and usage of the HibernateGen plug-in to generate the pureQuery XML is described in the following sections. To use HibernateGen via a command prompt, refer to the HibernateStaticGeneratorQuickStart.pdf available for download with the Integration module.

Installing the HibernateGen plug-in for generating pureQuery XML file

Follow the steps below to install the HibernateGen plug-in:

  1. Copy the file hibernategen.jar under the plugins folder of the Integration Module into the plugins folder under the installation directory of IBM Data Studio.
  2. Restart IBM Data Studio with the -clean option.

Say your IBM Data Studio is installed at location C:\IBM\DS, issue the command as in Listing 3 from directory C:\IBM\DS to restart IBM Data Studio.

Listing 3. Launching IBM Data Studio with the -clean option
    C:\IBM\DS> eclipse.exe -clean

Setting up the sample application

Unzip the DemoHibApp.zip file and import the sample application as a Java project (named DemoHibApp) into your IBM Data Studio workspace (C:\workspace\, for example).

Configure the classpath

The sample application uses Hibernate V3.2, the Hibernate entity manager, and the Hibernate annotations. Therefore, to compile the application, you need to have the following JAR files in your classpath along with the other JAR files required for Hibernate:

  • hibernate3.jar
  • hibernate-entitymanager.jar
  • hibernate-annotations.jar
  • hibernate-common-annotations.jar
  • ejb3-persistence.jar
  • dom4j-1.6.1.jar
  • log4j-1.2.15.jar
  • javassist-3.4.GA.jar
  • antlr-2.7.6.jar
  • commons-collections-3.1.jar
  • slf4j-api-1.5.2.jar
  • slf4j-log4j12-1.5.2.jar
  • commons-logging-1.0.4.jar
  • pdqhibernate-3.2.jar (shipped with integration module)

To use the IBM Integration Module, you also need the following PDQ and DB2 JAR files in your classpath:

  • pdq.jar
  • pdqmgmt.jar
  • db2jcc jars

To add the PDQ and DB2 JAR files into the project's build path, right-click on the project and select Data Access Development > Add pureQuery support.... This will create a folder named dataAccessFolder under the project and adds the necessary JAR files to the project's build path.

Prepare the sample application's database

Create a database named DEMODB. Hibernate automatically creates the needed tables during the first run of the application.

The sample application uses the following entities:

  • CustomerInfo with subclass Customer
  • District
  • Item
  • Order

Generate the pureQuery XML file for static SQL execution

To generate the pureQuery XML file for the sample application, invoke HibernateGen by right-clicking the configuration file (hibernate.cfg.xml or persistence.xml), then select Data Access Development > HibernateGen..., as shown below. This will launch a dialog wizard, as shown below. Provide the database connection details, which will be used to connect to the database and validate the generated SQLs. You can also specify other optional parameters, like the name of the pureQuery XML file that will be generated. To know more about what the options stand for in the dialog wizard, hover the mouse over respective widgets.

Click Finish to generate the pureQuery XML file.

Figure 7. Invoking HibernateGen in IBM Data Studio
Screenshot shows the invocation of HibernateGen from IBM Data Studio
Figure 8. Dialog Wizard launched when HibernateGen is invoked
Screenshot showing the dialog wizard launched on invocation of HibernateGen

On completion, verify that a file named hibernate.cfg.xml.pdqxml (if using JPA, file with name as the persistent-unit name specified and extension pdqxml) was created in the directory dataAccessFolder under your project. Details of the number of SQLs captured and other logs will be displayed on the HibernateGen Messages console.

Figure 9 shows a screenshot of the HibernateGen Messages after run on the sample application.

Figure 9. Screenshot showing HibernateGen messages logged on console
Screenshot shows HibernateGen messages logged on console

NOTE:

  1. If using Hibernate-native APIs, by default, the connection details are fetched from the hibernate.cfg.xml file. To select a different database, click the Edit... button on the dialog box and select the database against which the generated SQLs are to be validated.
  2. If using JPA, you will need to specify the puName to be used.
  3. If using JPA, database connection details will be fetched from puName. If connection details are provided along with puName, those will be used.
  4. HibernateGen generates the SQL with the JDBC default for resultsetHoldability. However, WebSphere Application Server uses a different default for resultsetHoldability. If you are deploying your Hibernate application on WebSphere Application Server, check the resultsetHoldability of the WebSphere Application Server data source and specify it in the resultsetHoldability on the pureQuery XML Options page.

Configure and bind the pureQuery XML file for static SQL execution

In order to run the application statically, you first need to configure, then bind the generated hibernate.cfg.xml.pdqxml file.

Follow the steps below to configure and bind hibernate.cfg.xml.pdqxml:

  • Edit the file dataAccessFolder\Default.genProps in the DemoHibApp project to add the line as shown in Listing 4.
  • Save the file. If asked for rebuild of project, click Yes.
  • Right-click dataAccessFolder\hibernate.cfg.xml.pdqxml and invoke the binder by selecting Data Access Development > Bind..., as shown in Figure 10.
Listing 4. Line to add in Default.genProps file
C:\workspace\DemoHibApp\dataAccessFolder\hibernate.cfg.cml.pdqxml= -rootPkgName test
Figure 10. Binding the pureQuery XML file in IBM Data Studio
Image shows binding of the pureQuery XML file in IBM Data Studio

Run the application using static SQL

Edit Provider in persistence.xml

If you are using persistence.xml, in order to enable pureQuery static SQL support, you need to ensure that the provider property is specified and set to com.ibm.pdq.hibernate3.PDQHibernatePersistence in the persistence.xml file.

For the sample application, this property is already set, as shown in the first line of Listing 5.

Listing 5. Required property in persistence.xml to enable pureQuery static SQL
<provider>com.ibm.pdq.hibernate3.PDQHibernatePersistence</provider>
<!-- <provider>org.hibernate.ejb.HibernatePersistence</provider> -->

If you are using native Hibernate APIs, you need to use the Configuration Classes provided in the integration module as described below. Refer to the section Why is the Provider change required to enable static SQL usage with Hibernate? to know why provider change is required.

Edit pdq.properties file

Listing 6 shows the properties used in the pdq.properties file included with the sample application.

Listing 6. pdq.properties file for the sample application
pdq.captureMode=OFF
pdq.executionMode=STATIC
pdq.pureQueryXml=hibernate.cfg.xml.pdqxml

The following explains the property settings:

  • captureMode=OFF indicates that capture is turned off. You only need capture turned on when you are using the client optimization process described later in this article.
  • executionMode=STATIC indicates that static execution mode should be used (instead of dynamic).
  • pureQueryXml=hibernate.cfg.xml.pdqxml provides the name and location of the pureQuery XML file that was created when you ran HibernateGen.

For more information about all PDQ properties, see Resources.

In the actual pdq.properties file included with the sample application, all the properties are commented out (each line begins with #). Before proceeding, edit the file, delete the # character to uncomment the three properties, and save the file. When you are done, the file should look like Listing 6.

If you are using persistence.xml, you can now run the sample application by executing MainClient.java. To run MainClient.java, right-click on the file and select Run As > Java Application. SQL statements from the application are executed statically.

If you are using hibernate.cfg.xml, you can run the sample application by executing MainClientSessionFactory.java. Notice that in MainClientSessionFactory.java, we use the configuration class provided by the Integration Module as shown in Listing 7. This is required to solve the problem described in section Why is the Provider change required to enable static SQL usage with Hibernate?

Listing 7. Usage of the configuration class provided by the Integration Module in hibernate.cfg.xml applications
public static void main(String[] args) {
	
    _sFactory = new PDQAnnotationConfiguration().configure()
						.buildSessionFactory();

If you are deploying and running the sample application on WebSphere Application Server, see Resources regarding pureQuery client optimization in Web application environments.


Enabling pureQuery heterogeneous batching for Hibernate applications

Set properties in persistence.xml

To enable pureQuery heterogeneous batching with Hibernate V3.2 applications using the IBM Integration Module, ensure that the following properties are specified in your persistence.xml file:

  • hibernate.jdbc.factory_class must be set to com.ibm.pdq.hibernate3.batcher.PDQBatcherFactory.
  • hibernate.ejb.event.flush must be set to com.ibm.pdq.hibernate3.ejb3.listener.PDQEJB3FlushEventListener.

The sample application includes these properties in a comment section of persistence.xml. So for the sample application, you can simply edit the persistence.xml file, uncomment those properties, and save the file. With the comments removed, this section of the file would look similar to Listing 8.

Listing 8. Required properties in persistence.xml to enable pureQuery heterogeneous batching
<property name="hibernate.jdbc.factory_class" value = 
"com.ibm.pdq.hibernate3.batcher.PDQBatcherFactory"/>

<property name="hibernate.ejb.event.flush" value = 
"com.ibm.pdq.hibernate3.ejb3.listener.PDQEJB3FlushEventListener"/>

<property name="hibernate.connection.provider_class" value= 
"org.hibernate.connection.C3P0ConnectionProvider"/>

Configure statement caching

For optimal performance, it is important to have statement caching turned on and set to a reasonable size. Hibernate applications can use c3p0 connection pooling to enable statement caching in J2SE applications. To enable c3P0 statement caching, set the hibernate.connection.provider_class to org.hibernate.connection.C3P0ConnectionProvider in the persistence.xml as shown in Listing 8. For J2EE applications, you can set statement caching on the application server.

The c3p0.properties file specifies the size of the statement cache as shown in Listing 9.

Listing 9. c3p0 file size specification
c3p0.maxStatementsPerConnection = 200

The size you specify for the statement cache size should be relatively large. Typically, this is in the range of 100-200. In a J2SE environment, if the statement cache is too small or the batch size too large, you may receive SQL exceptions resulting from inconsistencies in the statement cache.

Configure hibernate.cfg.xml

If your Hibernate application is using the native Hibernate API, you must specify the following properties in the hibernate.cfg.xml file:

  • hibernate.jdbc.factory_class must be set to com.ibm.pdq.hibernate3.batcher.PDQBatcherFactory.
  • The flush event listener class must be set to com.ibm.pdq.hibernate3.listener.PDQFlushListener.

The sample application includes these properties in comment sections of hibernate.cfg.xml. So for the sample application, you can simply edit the hibernate.cfg.xml file, uncomment those properties, and save the file. With the comments removed, the sections of the file would look similar to Listing 10.

Listing 10. Required properties in hibernate.cfg.xml to enable pureQuery heterogeneous batching
<property name="hibernate.jdbc.factory_class">
     com.ibm.pdq.hibernate3.batcher.PDQBatcherFactory</property> 
<event type="flush">
<listener class="com.ibm.pdq.hibernate3.listener.PDQFlushListener"/>
</event>

Run the application with pureQuery heterogeneous batching

Listing 11 shows the persistence.xml file configured to run the sample application with pureQuery static SQL and heterogeneous batching enabled. It defines the entities, database connection properties, and other required configuration properties.

The batch_size property controls the size of a JDBC batch. It is also used by the Integration Module to control the size of a pureQuery batch. The recommended batch size is in the range of 50-100. Most of the benefit of batching is seen in the range 1-25. If the batch size is too large, you may receive an SQL exception with the SQL code -805. This means the database server has run out of dynamic cursors.

Listing 11. persistence.xml for the sample Hibernate application
<persistence-unit name="test" transaction-type="RESOURCE_LOCAL">

<provider>com.ibm.pdq.hibernate3.PDQHibernatePersistence</provider>
<!-- <provider>org.hibernate.ejb.HibernatePersistence</provider> -->

    <class>demo.CustomerInfo</class>     
    <class>demo.Customer</class> 
    <class>demo.District</class>
    <class>demo.Item</class>
    <class>demo.Order</class>
 		
<properties>
<property name="hibernate.connection.driver_class"
value="com.ibm.db2.jcc.DB2Driver"/>
					
<property name="hibernate.connection.url" 
value="jdbc:db2:demodb"/>
					
<property name="hibernate.connection.username" 
value="xx"/>
					
<property name="hibernate.connection.password" 
value="xx"/>  
        
<property name="hibernate.dialect" value="org.hibernate.dialect.DB2Dialect"/>

<property name="hibernate.hbm2ddl.auto" 
value="update"/>

<property name="hibernate.jdbc.batch_size" 
value = "100"/>

<property name="hibernate.jdbc.factory_class" value = 
"com.ibm.pdq.hibernate3.batcher.PDQBatcherFactory"/>

<property name="hibernate.ejb.event.flush" value = 
"com.ibm.pdq.hibernate3.ejb3.listener.PDQEJB3FlushEventListener"/>

<property name="hibernate.connection.provider_class" value= 
"org.hibernate.connection.C3P0ConnectionProvider"/>

</properties>
</persistence-unit>

NOTE: If you are using Informix Dynamic Server, you need to set the hibernate.dialect property in persistence.xml to InformixDialect as follows:

<property name="hibernate.dialect" value="org.hibernate.dialect.InformixDialect" />

Now you can run the sample application by executing MainClient.java. The custom pureQuery batcher ensures that heterogeneous batching occurs.

If you are using hibernate.cfg.xml, run the sample application by executing the MainClientSessionFactory.java class, which uses SessionFactory API. You should also include the c3p0.properties file in the application path with the c3p0.maxStatementsPerConnection property specified.


Comparing sample application elapsed-time performance

This section describes how to determine if the performance of the sample application improves when running it with pureQuery heterogeneous batching.

Gather elapsed time

To determine if pureQuery heterogeneous batching provides an elapsed-time benefit when running the sample application, you need to run the application with and without heterogeneous batching.

One method to get the elapsed time is to run the application using MainClient.java or MainClientSessionFactory.java. Each of these displays the total elapsed time it takes for the program to run.

Another method is to turn on the trace for the batcher. This lets you actually see how the Integration Module is creating the heterogeneous batches by showing you when the batch has started, when statements are added to the batch, and when the batch ends. The sample application includes a file named log4j.properties you can use to turn on the trace for the batcher. After the batch ends, the current directory contains a file named demo.log that contains messages similar to what is shown in Listing 12.

Listing 12. Trace messages in the demo.log
TRACE main com.ibm.pdq.hibernate3.batcher.PDQBatcher - PureQuery batch started

Added to the batch: insert into District (d_city, d_name, d_next_o_id, 
   d_state, d_street_1, d_street_2, d_tax, d_ytd, d_zip, version, d_id) 
   values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

Added to the batch: insert into Customer (city, fname, lname, state, 
  street, version, zipcode, c_balance, c_credit, c_credit_lim, c_data, 
  c_delivery_cnt, c_discount, c_payment_cnt, c_since, c_ytd_payment, 
  district_d_id, c_d_id, c_id) 
   values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

Added to the batch: insert into Item (i_data, i_name, i_price, version, i_id) 
   values (?, ?, ?, ?, ?)

Added to the batch: insert into Order (district_d_id, itime, o_all_local, o_c_id, 
    o_carrier_id, o_entry_d, o_ol_cnt, version, o_d_id, o_id) 
      values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

Added to the batch: update District set d_city=?, d_name=?, d_next_o_id=?, d_state=?, 
    d_street_1=?, d_street_2=?, d_tax=?, d_ytd=?, d_zip=?, version=? 
       where d_id=? and version=?

Added to the batch: update Customer set city=?, fname=?, lname=?, state=?, street=?, 
  version=?, zipcode=?,    c_balance=?, c_credit=?, c_credit_lim=?, c_data=?, 
   c_delivery_cnt=?, c_discount=?, c_payment_cnt=?, c_since=?, c_ytd_payment=?, 
    district_d_id=? where c_d_id=? and c_id=? and version=?

Added to the batch: update Item set i_data=?, i_name=?, i_price=?, version=? 
   where i_id=? and version=?

TRACE main com.ibm.pdq.hibernate3.batcher.PDQBatcher - About to execute the batch
TRACE main com.ibm.pdq.hibernate3.batcher.PDQBatcher - Batch Executed.
  Executed batch size=7
TRACE main com.ibm.pdq.hibernate3.batcher.PDQBatcher - Number of batches 
   executed in the session = 1

Sample results

The authors of this article tested the sample application with the batch size set to 100 against DB2 on a local machine and on a remote machine. This testing produced the following results on DB2 for Linux, UNIX, and Windows:

Disclaimer

Any performance data contained in this document was determined in various controlled laboratory environments and are for reference purposes only. Customers should not adapt these performance numbers to their own environments as system performance standards. The results that may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.

  • When DB2 and the sample application are running on a local machine, pureQuery improved elapsed-time performance over JDBC by more than 40 percent, depending on the number of objects per transaction. This is depicted in Figure 11. Seven SQL statements are created per object (in turn, seven entity objects are created per object).
  • When DB2 is running on a remote machine and the sample application is running on a local machine, elapsed-time performance improvements ranged from 12 percent to more than 50 percent, depending on the number of objects per transaction. This is depicted in Figure 12. Seven SQL statements are created per object (in turn, seven entity objects are created per object).

In Figure 11, the x-axis represents the number of objects per transaction. The y-axis represents to the elapsed time taken for the transaction in milliseconds. Notice that the chart indicates that JDBC batching is taking more elapsed time than pureQuery.

Figure 11. JDBC vs. pureQuery with local database server
Image shows JDBC vs. pureQuery with local database server

In Figure 12, the x-axis represents the number of objects per transaction. The y-axis represents the elapsed time taken for the transaction in milliseconds. Notice that the chart indicates that JDBC batching is taking more elapsed time than pureQuery. Also notice that, in this case, as the number of objects per transaction increases, the delta elapsed time between pureQuery and JDBC increases.

Figure 12. JDBC vs. pureQuery with remote database server
Image shows JDBC vs. pureQuery with remote database server

Using static SQL and heterogeneous batch together

Both the static SQL and heterogeneous batching features are orthogonal. This means that a Hibernate application can use static SQL and heterogeneous batching at the same time, or it can use each feature independently. There is no overlap of any steps in setting up each feature. To set up static SQL, follow the steps listed in Enabling pureQuery static SQL for Hibernate applications. To set up heterogeneous batching, follow the steps listed in Enabling pureQuery heterogeneous batching for Hibernate applications. To set up both, simply follow the steps in both sections; the order does not matter.


Gathering SQL not captured by integration module

If your Hibernate application uses JPA V2.0 Criteria API or your application uses inline queries, you can use incremental capture to get the SQL from these queries and convert to static SQL. To do this, change the pdq.properties, as shown in Listing 13.

Listing 13. pdq.properties for incremental capture
pdq.captureMode=ON
pdq.executionMode=STATIC
pdq.pureQueryXml=test.pdqxml
pdq.OutputPureQueryXml=incremental_capture.pdqxml

When you run the application, it creates a file named incremental_capture.pdqxml that contains SQL not present in the PDQXML file created by the HibernateGen command. You can use the pureQuery Merge utility to merge the two files. This gives you a single consolidated PDQXML that contains SQL from both files.

To run the Merge utility:

  1. Right-click hibernate.cfg.xml.pdqxml.
  2. Select Data Access Development > Merge with....
  3. Select incremental_capture.pdqxml in the file browser that popped up.
  4. Save the output pureQuery XML file as test_Merged.pdqxml.

For more information about the pureQuery Merge utility, see Resources.

After you run the sample Merge utility, the test_Merged.pdqxml file is ready for binding. Run the StaticBinder on test_Merged.pdqxml, as explained in Configure and bind the pureQuery XML file for static SQL execution, then change the pdq.properties as shown in Listing 14.

Listing 14. pdq.properties for static SQL execution after incremental capture
pdq.captureMode=OFF
pdq.executionMode=STATIC
pdq.pureQueryXml=test_Merged.pdqxml

Run the application again. Now the SQLs not generated by HibernateGen will be executed statically.


Why is the provider change required to enable static SQL usage with Hibernate?

The usage of static SQL requires that the SQL string generated by Hibernate be exactly the same each time. There is a known issue in Hibernate that causes Hibernate to generate the same SQL, but with different column aliases across runs of the application. (Refer to this forum post on the Hibernate website for details: how are column alias names generated).

To enable static SQL, you need to work around this issue in one of the following ways:


Conclusion

This article, Part 1 of a two-part "Accelerate Hibernate and iBATIS applications using pureQuery" series, described how to use IBM Integration Module for pureQuery and Hibernate with Hibernate applications (Hibernate JPA applications as well as Hibernate-native applications). It also provided a brief overview of how to use pureQuery client optimization incremental capture and informal elapsed-time performance measurements. Part 2 focuses on iBATIS applications.

The IBM Integration Module for pureQuery and Hibernate delivers significant performance gains on DB2 when using static SQL. It also delivers significant elapsed time gains on DB2 and Informix servers when heterogeneous batching is used with suitable applications. These performance improvements apply in both J2EE and J2SE environments.


Acknowledgements

The authors would like thank Asim Singh and Ambrish Bhargava for reviewing the article and providing great feedback.


Download

DescriptionNameSize
Sample Hibernate application for this articleDemoHibApp.zip32KB

Resources

Learn

Get products and technologies

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, Open source, Java technology
ArticleID=505540
ArticleTitle=Accelerate Hibernate and iBATIS applications using pureQuery, Part 1: Enable static SQL and heterogeneous batching for Hibernate applications
publish-date=10212011