Accelerate Hibernate and iBATIS applications using pureQuery, Part 3: Auto-tune data fetch strategies in Hibernate applications with pureQuery

Development teams that build applications using Hibernate as the Object Relational Mapper (ORM), or persistence mechanism, spend significant time tuning the amount of data that Hibernate fetches from the database, and the number of SQL queries that Hibernate uses in each business use-case of the application. Database administrators have to deal with the database being slowed by SQL queries that join hundreds of tables, or thousands of SQL statements being issued in a single unit-of-work or transaction. These problems offset the productivity which was the reason for using Hibernate/ORM in the first place. In this article, learn how the IBM® InfoSphere® Optim® pureQuery auto-tuning feature for Hibernate automates the process of determining these problems and automatically fixing them without intervention. Both the application development team and DBAs benefit from the solution.

Mario Briggs (mario.briggs@in.ibm.com), Senior Software Engineer, IBM

Mario  Briggs photoMario Briggs leads the open source offerings for IBM DB2 and IBM Informix, including PHP, Ruby/Rails, Python/Django, Perl, and Java data access frameworks. He also leads the RDF support in DB2. He has about 14 years of experience in software development with many of those years spent in the area of data access, relational engines, and application-database performance.



Ganesh Choudhary (kumar.ganesh@in.ibm.com), System Software Engineer, IBM

Photograph of author Ganesh ChoudharyGanesh Choudhary is a System Software Engineer in the Open Source Team at IBM India Software Labs. He works on pureQuery auto-tuning feature for Hibernate applications. He graduated with an M.Tech in Information Technology from IIIT, Bangalore.



21 October 2011

Also available in Russian

Overview

This article is about a new capability called auto-tuning feature for Hibernate that is available with IBM InfoSphere Optim pureQuery Runtime v3.1. As the name suggests, this capability automates the most common problems faced by users of Hibernate. It automatically optimizes the database access patterns of Hibernate in each use-case of the application without manual intervention. Today, the process of ensuring that the optimal database access pattern is used in each and every use-case of an ORM application is a manual, tedious and error-prone process that consumes significant person hours. With this feature, optimizing each use-case's database access pattern leads to significant performance benefits of the system.

This article covers how to set up and use this feature. First, it describes the data fetch strategy problems faced by ORM applications, and then it describes how the pureQuery solution solves the problem. The article assumes that you have knowledge of Hibernate, and that you are using Hibernate 3.2 or later. You can take advantage of the benefits provided by pureQuery auto-tuning for Hibernate without making changes to your application code.

The first and second article in the series focused on accelerating Hibernate and iBATIS applications respectively by using DB2® Static SQL and pureQuery heterogeneous batching. The first article also covered the following two topics:

  • A review of the pureQuery platform, which consists of development tools, a Java API, a runtime, and monitoring services that you can use to visualize performance metrics and correlate SQL statements with originating Java code.
  • Describes how you can obtain the benefits of pureQuery without having to use the pureQuery API, through capabilities for DB2 Static SQL and heterogeneous batching as provided by the IBM Integration Modules.

Overview of the pureQuery auto-tuning feature for Hibernate

This section covers the auto-tuning feature capability available with InfoSphere Optim pureQuery Runtime and how it solves the most common problems of Hibernate applications. The most common problems faced by users of Hibernate/ORM are that the application does the following:

  • Issues too many SQL statements to the database for many of the application use-cases. In fact, the number of SQL statements issued by the application can run into hundreds or thousands per use-case.
  • Fetches an excess of data or rows from the database for what is required for a given business use-case.
  • Issues too many SQL statements, and loads excess data or rows from the database for what is required for a given business use-case.

All three scenarios can put a strain on the database server resources and can render the performance of the application and database system unacceptable.

Users consider these issues a big problem when they are faced with Hibernate/ORM applications. It is also the problem that they encounter most frequently and spend the most time and effort solving.

The problems of Hibernate/ORM applications can be understood by looking at the example in Figure 1.

Figure 1. Object model and relational model of a sample Hibernate application
The Application object model (top of the figure) defines the application in terms of object-oriented Java objects. The Database model (at the bottom) shows how the same objects are stored in a relational database system using tables and primary key / foreign key relationships

The application consists of the application object model (the first model in Figure 1) which defines the application in terms of object-oriented Java objects. The database entity relationship model (the second model in Figure 1) describes how the same objects are stored in a relational database system using tables and primary key and foreign key relationships. The ORM library sits between these two layers and converts data between the two formats.

An application object model consists of a hierarchy of inter-connected objects. For an application use-case, the entire object hierarchy might not be required. Only those portions of the hierarchy which the use-case deals with might be required. Application performance suffers if the entire hierarchy is loaded for each use-case.

Problem definition

The ORM mappings allow for a single definition (in mapping files or annotations) of how an object that is linked to another object in the hierarchy should be loaded (eagerly or lazily), and whether to use an SQL join or a separate SQL statement. For a well-tuned Hibernate/ORM application, in each application use-case, the settings for an object's relationship to other objects (lazy or eager) should be uniquely defined to ensure that no extra loading of data occurs. Also the optimal settings for how the data should be retrieved from the database (SQL join or separate SQL statement) should be uniquely defined.

Clearly these settings (lazy or eager data loading and SQL join or separate SQL statements for data retrieval) for a single link in the object model can be different in use-cases, and specifying the wrong settings in a use-case can cause poor performance.

Manually determining these optimal settings for each use-case can be a cumbersome process, especially in large applications, and more so when the application deals with complex object models. Sometimes developers resort to trial and error methods in use-cases where performance is totally unacceptable, and they might ignore other use cases. This optimization process can consume lot of time and resources. This is the reason it has been rated as a problem by users that consumes the most time and effort.

Solution - pureQuery auto-tuning feature for Hibernate

The pureQuery auto-tuning solution determines the unique or optimal settings for each link in the object model within each use-case. This is achieved by analyzing the data usage patterns within the application use-case as the application runs. On subsequent runs of the same application use-case, the optimal settings are automatically fed into the system so that performance is optimal. Again, no code changes are required to use this feature and it can be used with existing applications.


The pureQuery auto-tuning for Hibernate architecture

Figure 2 shows the high-level execution of a Hibernate application with the pureQuery auto-tuning feature for Hibernate.

Figure 2. High-level execution of a Hibernate application
diagram shows Hibernate app connected to pureQuery auto-tuning, connected to Hibernate, connected to JDBC Driver, connected to DB2

The pureQuery auto-tuning feature for Hibernate sits between the application and Hibernate, and analyzes the data fetched by Hibernate based on application requests. It also analyzes the application usage pattern of that data. Based on the analysis, pureQuery auto-tuning for Hibernate produces the optimized settings. It then feeds back the optimized settings to Hibernate so that the database access is now optimized. The pureQuery auto-tuning for Hibernate properties control the analysis and optimization process.


Prerequisites for using pureQuery auto-tuning feature for Hibernate

Software

To use the pureQuery auto-tuning feature for Hibernate, you must have the following:

  • IBM InfoSphere Optim pureQuery Runtime 3.1 or higher. If you don't have it, refer to the Resources section for a link to download IBM Data Studio 3.1 which contains Optim pureQuery Runtime. The pureQuery Runtime must be on the same machine as IBM Data Studio.
  • IBM Data Server Driver for JDBC and SQLJ 3.58 or 4.7. It is available with IBM Data Studio, or refer to the Resources section for a link to download.
  • Hibernate 3.2 or later (has been tested with 3.2. and 3.3.1).

Supported databases

The pureQuery auto-tuning feature for Hibernate works when the application is accessing any of the following database systems:

  • DB2 for Linux®, UNIX®, and Windows® V8.2, Fix Pack 11 or later
  • DB2 for Linux, UNIX, and Windows V9.1, V9.5, or V9.7
  • DB2 for z/OS® V8 or later
  • Informix® V11.10 or higher

Sample application used in this article

The AutoTuningSample.zip file included with this article contains a sample application you can use to try out this capability, if you are using Hibernate 3.2 or later.


Enabling pureQuery auto-tuning feature for Hibernate applications

Configure the class path

The sample application uses Hibernate 3.3, the Hibernate entity manager, and the Hibernate annotations. Therefore, to compile the application, you need to have the following jar files in your class path along with the other jar files required by the application:

  • hibernate3.jar
  • hibernate-entitymanager.jar
  • dom4j-1.6.1.jar
  • hibernate-annotations.jar
  • hibernate-commons-annotations.jar
  • log4j-1.2.15.jar
  • javassist-3.4.GA.jar
  • antlr-2.7.6.jar
  • commons-collections-3.1.jar
  • slf4j-api-1.5.2.jar
  • slf4j-log4j12-1.5.2.jar
  • commons-logging-1.0.4.jar
  • jta1.1.jar
  • ejb-persistence.jar

To use the pureQuery auto-tuning feature for Hibernate, you also need the following jar files in your class path:

  • The IBM Data Server driver for JDBC, and SQLJ 3.58 or 4.7 jar files (the db2jcc jar files).
  • The pureQuery jar files pdq.jar, pdqmgmt.jar, and pdqhibtune.jar. The jar files are part of the pureQuery Runtime installation.

Prepare the sample application database

Complete the following steps to prepare and run the sample application.

  1. Create a database named TRADEDB. Hibernate automatically creates the needed tables.
  2. Unzip the AutoTuningSample.zip file and import the sample application as a Java project into your IBM Data Studio workspace.
  3. Include the same jar files listed in Configure the class path in the project build path. You can add the pureQuery and IBM Data Server for JDBC and SQLJ jar files in the project build path by right-clicking the project and selecting Data Access Development > Add Data Access Development Support. Select the Add pureQuery support to project check box.
  4. Edit the hibernate.cfg.xml file and type the username and password to access your database. In the sample application it is shown in Listing 1.
    Listing 1. Database user-name and password property in configuration
    <property name="hibernate.connection.username">xx</property> 
    <property name="hibernate.connection.password">xx</property>
  5. Run the tradeApp.utils.PopulateDatabase class by right-clicking the class, selecting Run as > Java application in IBM Data Studio to populate the entities in the database.

Sample application entities and use-cases

The sample application uses an application object model that consists of the following two entities:

  • Account, which resides in the tradeApp.domain.Account.java file. Account is a skeletal entity for demo purposes and has the following attributes: name, age, a collection of orders, and ID (which is the primary key).
  • Order, which resides in the tradeApp.domain.Order.java file. Order is also a skeletal entity for demo purposes and has the following attributes: orderDate, orderState, the Account which placed the order, and ID (which is the primary key).

The application consists of the following two use cases:

  • View Orders
  • End-of-day order processing

The View Orders use-case is an implementation of a customer viewing orders that is located in the file tradeApp.dao.AccountDao.java, as shown in Listing 2.

Listing 2. Code for 'View Orders' use-case
public List <Account> queryAccounts(int accountId) {
    Criteria query = sess.createCriteria(Account.class);
    query.add(Restrictions.eq("id", accountId));

    List<Account> accounts = query.list();
    sess.close();
    return accounts;
    }

Here you create a query to return a particular account as identified by the accountId. You then issue the query to the database invoking the list method on the query and then return the matching account. The caller of this method is in the TradeApp.java file of the client program, which then displays the account data and its orders on the console. The code to display the account and its orders is in the tradeApp.view.View.java file.

The End-of-day order processing use case implementation is located in the tradeApp.dao.OrderDao.java file, as shown in Listing 3.

Listing 3. Code for use-case 'End-of-day Order Processing'
public void processOrders(java.sql.Date date) {
    Query query = session.createQuery("from Order o where o.orderDate = ? ");
    query.setDate(0,date);
		
    List<Order> orders = query.list();
    for (Order o : orders) {
        // Upgrade customer if reqd based on order value
	    Account a = o.getAccount();
	    a.upgradeCustomerIfRequired(); 
			
	    //set OrderState to IN_PROCESS
	    o.setOrderState("PROCESS"); 
			 
    }
		
    // push changes to database
    /* session.flush(); */

    session.close();
}

Here you create a query to return all orders as of a specified date and where the OrderState is NEW. You then issue the query to the database invoking the list method on the query. You process each matching order and update the OrderState to PROCESS, and also upgrade the account if required. To save your changes to the database, you call the flush method on the session. This is commented out in the code to enable you to run the application repeatedly.

The pureQuery auto-tuning feature for Hibernate properties

The hibernate.properties file contains the auto-tuning feature properties and is located in the src folder. These are shown in Listing 4.

Listing 4. Properties that control the auto-tuning feature
pdq.hibernate.fetchMode.analyze=ON 
pdq.hibernate.tunedSettings= pureQueryFolder/tunedSettings.xml
pdq.hibernate.fetchMode.optimize=ON
  • The pdq.hibernate.fetchMode.analyze property is used to turn on or off analysis of data usage patterns while the application is executing.
  • The pdq.hibernate.tunedSettings property controls where the optimized fetch-mode settings are saved when pdq.hibernate.fetchMode.analyze is set to ON. It also specifies where the optimized fetch-mode settings should be retrieved from when the pdq.hibernate.fetchMode.optimize is set to ON.
  • The pdq.hibernate.fetchMode.optimize property is used to turn on or off usage of the optimized hibernate fetch-modes when the application is executing.

Note: In a production or application server environment where you may need to change the values of the auto-tuning feature properties on the fly without restarting the application, pureQuery Runtime supports reading-in these properties from a pureQuery Runtime repository. The next article in this series covers this aspect.


Run the application in Analyze Mode

Set the JavaAgent JVM argument before running the application. In IBM Data Studio, right-click TradeApp.java > Run As > Run Configurations. In the Arguments tab of Run Configurations, set VM arguments as -javaagent: < path of pdqhibernatetuneagent.jar > and then click Run, as shown in Figure 3.

Figure 3. Setting JavaAgent JVM argument in IBM Data Studio
screen capture shows javaagent listed in VM arguments

This executes the TradeApp application. Once the application has run, an optimized fetch-mode settings file (tunedSettings.xml as defined in the property pdq.hibernate.tunedSettings) is produced at the location given as the value of pdq.hibernate.tunedSettings property in the configuration file.

The output of the run is displayed in the Console view in IBM Data Studio and is shown in Listing 5.

Listing 5. SQL executed by Hibernate Application before auto-tuning feature
SQL executed 203
-----------------------------------------------------------------------------------------
Execution count |        SQL Statement  
 1              select this_.id as id1_0_, this_.ORDERSTATE as ORDERSTATE1_0_,
 		this_.ORDERDATE as ORDERDATE1_0_,this_.ACCOUNT_ID as ACCOUNT4_1_0_ 
 		from DB2ADMIN.Orders this_ where this_.ORDERDATE=? and this_.ORDERSTATE=?
 		
 1              select this_.ID as ID0_0_, this_.NAME as NAME0_0_, this_.AGE as AGE0_0_
 		from DB2ADMIN.Accounts this_ where this_.ID=?
 		
 101            select orders0_.ACCOUNT_ID as ACCOUNT4_1_, orders0_.id as id1_,
 	        orders0_.id as id1_0_,orders0_.ORDERSTATE as ORDERSTATE1_0_,
 	        orders0_.ORDERDATE as ORDERDATE1_0_, orders0_.ACCOUNT_ID as ACCOUNT4_1_0_
 	        from DB2ADMIN.Orders orders0_ where orders0_.ACCOUNT_ID=?
 	        
 100            select account0_.ID as ID0_0_, account0_.NAME as NAME0_0_, account0_.AGE
                as AGE0_0_ from DB2ADMIN.Accounts account0_ where account0_.ID=?
-----------------------------------------------------------------------------------------

As you can see, there are 203 SQL statements run by the application's two use-cases.


Inspecting the TunedSettings XML

A snippet of the TunedSettings XML file generated in the previous step is shown in Listing 6.

Listing 6. TunedSettings.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<useCases>
  <useCase>
    <entities>
      <entity name="tradeApp.domain.Account">
        <attribute name="orders" fetchMode="select" lazy="false">
          <tuning-info accept="true" fetchMode="subselect" lazy ="false"/>
        </attribute>
      </entity>
      <entity name="tradeApp.domain.Order">
        <attribute name="account" fetchMode="select" lazy="false">
          <tuning-info accept="true" fetchMode="select" lazy ="true"/>
        </attribute>
      </entity>
    </entities>
  <id>java.lang.Thread.getSta..................................</id>
  </useCase>
  <useCase>
    <entities>
      <entity name="tradeApp.domain.Order">
        <attribute name="account" fetchMode="select" lazy="false"/>
          <tuning-info accept="true" fetchMode="join" lazy ="false"/>
        </attribute>
      </entity>
      <entity name="tradeApp.domain.Account">
        <attribute name="orders" fetchMode="select" lazy="false"/>
          <tuning-info accept="true" fetchMode="select" lazy ="true"/>
        </attribute>
      </entity>
      </entities>
      <hqls>
        <hql>
          <applicationHql>from Order o where o.orderDate = ? </applicationHql>
          <tunedHql>
             from Order o  left outer join fetch o.account  where o.orderDate = ?
          </tunedHql>
          <addedJoins>
            <join>left outer join fetch o.account</join>
          </addedJoins>
        </hql>
      </hqls>
      
  <id>java.lang.Thread.getSta..................................</id>
  </useCase>
</useCases>

The TunedSettings XML file has optimization suggestions for the following two use-cases.

  • In the first use-case, the ORM query was fired on tradeApp.domain.Account entity. For this use-case, the orders attribute defined in the tradeApp.domain.Account entity is eager and the fetch-mode is select. However, the tuned setting suggests that this entity should be fetched eagerly with the subselect fetch mode.

    Similarly, for the account attribute defined in the tradeApp.domain.Order was eager, and the fetch-mode was select. However, the tuned setting for this attribute suggests that this attribute should be fetched lazily with select fetch-mode.

  • In the second use-case, the ORM Query was fired on tradeApp.domain.Order entity. For this use-case, the account attribute defined in the tradeApp.domain.Order entity is eager with fetch-mode select. However, the tuned setting suggests that this entity should be fetched eagerly with join.

    Similarly, the orders attribute tradeApp.domain.Account is eager, and fetch-mode is select. However, the tuned setting suggests that this attribute should be fetched lazily with select fetch-mode.


    Further in this use-case an HQL was used. The <applicationHql> tag contains the HQL used by the application, and the <tunedHql> tag contains the optimized HQL for this use-case which includes joining the account when fetching the order. HQL tuning is restricted to adding fetch joins to the original application HQL only. This is because when Hibernate executes HQL/JPAQL, joins need to be specified in the HQL/JPAQL itself, and joins specified in the configuration settings are not honored.

Any optimization suggestion can be turned OFF if required by setting accept=”false” in the tuning-info tag, as shown in Listing 7.

Listing 7. Turning OFF optimization suggestions in Tuned Setting XML file
<tuning-info accept="false" fetchMode="select" lazy ="true"/>

HQL optimization suggestion can be turned OFF if required by setting the <tunedHql> tag as empty as shown in Listing 8.

Listing 8. Turning OFF HQL/JPAQL optimization suggestions in Tuned Setting XML file
<hql>
    <applicationHql>from Order o where o.orderDate = ? </applicationHql>
    <tunedHql></tunedHql>
    ...

Run the application with the optimization suggestions

Run the TradeApp application again. The output of the run with Optimized Settings is displayed in the Console view in IBM Data Studio and is shown in Listing 9.

Listing 9. SQL executed by Hibernate application with auto-tuning optimization suggestions
SQL executed 3
-----------------------------------------------------------------------------------------
Execution count |        SQL Statement  
 1                select this_.ID as ID0_0_, this_.NAME as NAME0_0_, this_.AGE as AGE0_0_ 
 	          from DB2ADMIN.Accounts this_ where this_.ID=?
 
 1                select orders0_.ACCOUNT_ID as ACCOUNT4_1_, orders0_.id as id1_,
 		  orders0_.id as id1_0_, orders0_.ORDERSTATE as ORDERSTATE1_0_,
 		  orders0_.ORDERDATE as ORDERDATE1_0_, orders0_.ACCOUNT_ID 
 		  as ACCOUNT4_1_0_ from DB2ADMIN.Orders orders0_ where 
 		  orders0_.ACCOUNT_ID=?
 
 1                select this_.id as id1_1_, this_.ORDERSTATE as ORDERSTATE1_1_, 
                  this_.ORDERDATE as ORDERDATE1_1_, this_.ACCOUNT_ID as ACCOUNT4_1_1_,
                  account2_.ID as ID0_0_, account2_.NAME as NAME0_0_, account2_.AGE as
                  AGE0_0_ from DB2ADMIN.Orders this_ left outer join DB2ADMIN.Accounts
                  account2_ on this_.ACCOUNT_ID=account2_.ID where this_.ORDERDATE=?
                  and this_.ORDERSTATE=?
---------------------------------------------------------------------------------------

As you can see, the number of SQL statements executed is only three this time compared to 203 without the auto-tuning optimization suggestions.


Setting up pureQuery auto-tuning feature for Hibernate applications deployed on a WebSphere Application Server

If your Hibernate application is deployed on a WebSphere® Application Server, all steps remain the same as previously described except for the step to set up the JavaAgent JVM argument. The following section describes how to set up the JavaAgent JVM argument.

Setting JavaAgent JVM argument on a WebSphere Application Server

  1. Open the administrative console and click Servers > Server Types > WebSphere Application Servers, and click server1 as shown in Figure 4.
    Figure 4. Select the server from WebSphere Administrative Console
    screen shows WebSphere application servers in the console
  2. In server properties, click Java and Process Management > Process definition as shown in Figure 5.
    Figure 5. Select process definition
    screen cap shows process definition selected
  3. Click Java Virtual Machine as shown in Figure 6.
    Figure 6. Select Java Virtual Machine
    screen cap shows Java Virtual Machine selected
  4. Type -javaagent: <Path to pdqhibtuneag.jar> in Generic JVM arguments as shown in Figure 7.
    Figure 7. Set JVM argument
    shows generic JVM arguments
  5. Click Apply, and when prompted, save the settings to the master configuration. Restart the server for changes to take effect.

Conclusion

In this article we delved into common problems faced with Hibernate applications, and how the IBM InfoSphere Optim pureQuery auto-tuning feature for Hibernate solves these problems. The article also described how to use this feature with Hibernate applications (both Hibernate JPA applications as well as Hibernate native applications).

Through the sample application, we demonstrated the ability of this feature to optimize the database access from using 203 SQL statements to just three SQL statements. This gives significant benefits and performance gains for Hibernate applications and does not require spending time and manual effort to achieve the benefits. These performance gains apply in both JEE and J2SE environments.


Download

DescriptionNameSize
pureQuery Auto-tuning for Hibernate Sample AppAuto-tuningSample.zip12KB

Resources

Learn

Get products and technologies

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, Open source, Java technology
ArticleID=766498
ArticleTitle=Accelerate Hibernate and iBATIS applications using pureQuery, Part 3: Auto-tune data fetch strategies in Hibernate applications with pureQuery
publish-date=10212011