Weighing the options for Apache Geronimo EJB transactions, Part 3: Bringing it all together

Jonathan Sagorin (jonathan@javaoncall.com), Freelance software developer
Jonathan Sagorin is a freelance developer. He has spent the majority of his 10-year career working as a consultant delivering custom Java solutions. In his spare time he attempts softball and improv (not necessarily at the same time, although his softball teammates might disagree).

Summary:  Jonathan Sagorin wraps up his thorough coverage of Enterprise JavaBeans™ (EJB) transactions in this last installment of a three-part series. Discover the quirks and additional implementation and configuration choices related to both container- and bean-managed EJB transactions in the Apache Geronimo application server.

Date:  15 Aug 2006
Level:  Intermediate

Introduction

In Part 1 and Part 2 of this series, you briefly looked at bean-managed and container-managed EJB transactions and how to implement them in the Geronimo application server. So what's next? What other transaction settings are available, and what other considerations should you take into account when using EJB transactions?

This article starts by summarizing the transaction choices from Parts 1 and 2: container-managed or bean-managed transactions. You'll then learn about concurrency control strategies, which ensure that concurrent transactions execute without losing data. You'll also look at isolation levels (how to control the isolation of a transaction from other transactions) and find out how to set transaction timeouts. Finally, you'll discover some of the pros and cons of using distributed transactions.


EJB transactions: What are my options?

When implementing EJB transactions, you have two options: container-managed or bean-managed transactions.

With container-managed transactions, you specify transaction behavior in your deployment descriptor. The EJB container is responsible for controlling transaction boundaries. You specify transaction attributes for the entire enterprise bean, for individual methods on the bean, or for both. The choices for transaction attributes are:

  • Required
  • RequiresNew
  • Supports
  • Mandatory
  • NotSupported
  • Never

With bean-managed transactions, you programmatically control your transaction boundaries and decide when transactions begin, commit, and roll back. Within bean-managed transactions, you can choose between implementing Java Transaction API (JTA) or Java Database Connectivity (JDBC) transactions. JTA transactions use the javax.transaction.UserTransaction interface to control transactions, while JDBC transactions control the behavior of transactions by performing operations directly through the java.sql.Connection interface.
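To make the JDBC option concrete, here is a minimal sketch of demarcating a transaction directly through java.sql.Connection. The class name JdbcTransactionSketch, the Work interface, and the recording stand-in connection are assumptions made for illustration so that the example runs without a database; only the Connection methods themselves come from the JDBC API.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class JdbcTransactionSketch {

    /** Hypothetical unit of work; a real bean would run its SQL statements here. */
    interface Work {
        void run(Connection con) throws SQLException;
    }

    /** Runs the work in one JDBC transaction: commit on success, roll back on failure. */
    static void inTransaction(Connection con, Work work) throws SQLException {
        boolean previous = con.getAutoCommit();
        con.setAutoCommit(false);           // the transaction boundary starts here
        try {
            work.run(con);
            con.commit();                   // make the changes permanent
        } catch (SQLException | RuntimeException e) {
            con.rollback();                 // undo everything done since setAutoCommit(false)
            throw e;
        } finally {
            con.setAutoCommit(previous);    // restore the connection's original mode
        }
    }

    /** Stand-in Connection (an assumption) that only records the names of the methods called. */
    static Connection recordingConnection(List<String> calls) {
        return (Connection) Proxy.newProxyInstance(
            JdbcTransactionSketch.class.getClassLoader(),
            new Class<?>[] { Connection.class },
            (proxy, method, args) -> {
                calls.add(method.getName());
                // getAutoCommit is the only call used here that must return a value
                return method.getReturnType() == boolean.class ? Boolean.TRUE : null;
            });
    }

    /** Returns the JDBC calls made for one successful (true) or failing (false) unit of work. */
    static List<String> demo(boolean succeed) {
        List<String> calls = new ArrayList<>();
        try {
            inTransaction(recordingConnection(calls), con -> {
                if (!succeed) {
                    throw new SQLException("simulated failure");
                }
            });
        } catch (SQLException expected) {
            calls.add("rethrown");
        }
        return calls;
    }
}
```

With a real connection obtained from your data source, you would call inTransaction from the bean method and keep any long-running processing outside that call.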

If you're using session beans or message-driven beans (MDBs), you can implement bean-managed or container-managed transactions. Entity beans, however, can only use container-managed transactions.

Table 1 summarizes these options by transaction type for each enterprise bean implementation.


Table 1. Transaction type options by enterprise bean
Transaction type     Session bean   Entity bean   Message-driven bean
Bean-managed         x              -             x
Container-managed    x              x             x

If you're unsure which transaction type to use for your bean, Sun Microsystems recommends using container-managed transactions with the Required attribute for your enterprise bean.

For developers, using container-managed transactions is simpler and requires less work. No transactional logic is required in your bean method. You demarcate transaction boundaries at the method level on the enterprise bean. Your bean method must either run within the context of a transaction or not.

If you require finer control of your transaction boundaries, or you expect to have long-running processes within your enterprise beans, use bean-managed transactions. Transactions should run for as short a time as possible, and container-managed transactions can only be demarcated at the bean-method level. That is not granular enough for a method that also performs long-running work.

By using bean-managed transactions, you can keep transactions short-lived. You can isolate the database operations within the transaction and allow the longer-running processes to run outside the scope of the transaction. This reduces the time during which other transactions are blocked from accessing the same data.


Concurrency control strategies

There are two strategies for implementing concurrency control using EJBs: You can follow either a pessimistic or an optimistic locking strategy.

With pessimistic locking, you acquire locks on the data you need to modify and hold them for the duration of the transaction, which blocks anyone else from modifying the same data. This strategy is used in systems where there's a good chance someone else might attempt to modify the data being worked on. It provides reliable access to data, but is best suited to smaller-scale systems: as the system scales and more locks are required, performance degrades.

Optimistic locking doesn't hold on to locks during a transaction. You take a positive outlook and assume that the data won't be modified by some other transaction while you're using it. You're able to handle the exceptional cases when data conflicts occur, but assume this will be rare. When updates to data are required, you implement strategies to check that the data has not changed between the time it was last read and then again prior to modification. If the data has not changed, you perform the update. This strategy is suitable for large-scale systems. The disadvantage is that you must implement code to detect and handle data conflicts.
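One common way to implement this check is a version number stored with each row. The sketch below models the idea in memory, assuming a hypothetical OptimisticLockSketch class; against a database, the same check is usually expressed as a conditional UPDATE, as noted in the comment.

```java
import java.util.HashMap;
import java.util.Map;

class OptimisticLockSketch {

    /** A stored value plus the version counter used to detect conflicting updates. */
    static final class Row {
        final String value;
        final int version;

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Row> table = new HashMap<>();

    synchronized void insert(String key, String value) {
        table.put(key, new Row(value, 0));
    }

    synchronized Row read(String key) {
        return table.get(key);
    }

    /**
     * Applies the update only if the row still has the version the caller read.
     * In SQL this is typically one statement such as:
     *   UPDATE t SET value = ?, version = version + 1 WHERE id = ? AND version = ?
     * followed by a check that exactly one row was updated.
     */
    synchronized boolean update(String key, int expectedVersion, String newValue) {
        Row current = table.get(key);
        if (current == null || current.version != expectedVersion) {
            return false;   // data changed since it was read: re-read and retry, or report a conflict
        }
        table.put(key, new Row(newValue, expectedVersion + 1));
        return true;
    }

    /** Two clients read the same row; only the first update is accepted. */
    static String demo() {
        OptimisticLockSketch store = new OptimisticLockSketch();
        store.insert("x", "foo");
        Row readByA = store.read("x");
        Row readByB = store.read("x");
        boolean first = store.update("x", readByA.version, readByA.value + "bar");
        boolean second = store.update("x", readByB.version, readByB.value + "baz"); // stale version
        return first + "," + second + "," + store.read("x").value;
    }
}
```

The rejected update is the conflict-handling code the paragraph above mentions: the caller decides whether to re-read and retry or to report the conflict to the user.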

The next section discusses changing the isolation level of a transaction. This is an example of pessimistic locking, where you control the degree and effectiveness of pessimistic locking by changing the isolation level of the transaction.


Isolation

One of the ACID properties of a transaction is isolation. Isolation allows the actions of transactions (whether reading or writing data) to be independent, or isolated, from other concurrently running transactions. By controlling the isolation, each transaction behaves as if it is the only transaction modifying the database at that moment. The degree to which a transaction is isolated from other transactions is called the isolation level.

Lock mechanisms and synchronization are used to control isolation levels. As isolation levels increase, more locks and synchronization are required. As locks are held on the data resource, other transactions attempting to perform any data operations must wait until the lock is released. So, increased isolation comes at the expense of performance. Conversely, as isolation level is decreased, performance improves because transactions spend less time waiting for locks to be released.

Data consistency

Isolation levels are set on a transaction to address the following data consistency problems:

  • Dirty reads
  • Unrepeatable reads
  • Phantom reads

Dirty reads

A dirty read occurs when a transaction reads data that another transaction has written but not yet committed. If the writing transaction rolls back, the value that was read never actually existed in the database.

Consider the following scenario, where two transactions are reading and updating a String field X on a database. String X's initial value is foo:

  1. Transaction 1 reads the value of String X, which is foo.
  2. Transaction 1 concatenates String X's current value with bar and saves it to the database.
  3. The new value of X is foobar. Transaction 1 has not yet issued a commit statement.
  4. Transaction 2 reads String X's value, which is foobar.
  5. Transaction 1 aborts.
  6. Transaction 2 concatenates String X with bar and saves it to the database.
  7. X's new value is foobarbar. Because transaction 1 aborted, the correct value is foobar.
  8. Transaction 2 has performed a dirty read on String X's value.

The problem here is that one transaction can change a value, and a second transaction might read this value before the initial change has been committed. The data is dirty and does not represent the true state of the database.
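The scenario above can be replayed with a small in-memory model. DirtyReadSketch is a hypothetical class, not a database: it keeps one committed value and one pending (uncommitted) value so that you can see what transaction 2 ends up saving with and without dirty reads.

```java
class DirtyReadSketch {

    private String committed = "foo";   // String X as stored in the database
    private String pending;             // uncommitted write by transaction 1, or null

    /** Transaction 1 writes a new value but does not commit it. */
    void writeUncommitted(String value) {
        pending = value;
    }

    /** Transaction 1 aborts, discarding its uncommitted write. */
    void abort() {
        pending = null;
    }

    /** A reader sees uncommitted data only if dirty reads are allowed. */
    String read(boolean allowDirtyRead) {
        return (allowDirtyRead && pending != null) ? pending : committed;
    }

    /** Replays steps 1 to 7 and returns the value transaction 2 saves. */
    static String replay(boolean allowDirtyRead) {
        DirtyReadSketch db = new DirtyReadSketch();
        db.writeUncommitted(db.read(false) + "bar");    // steps 1-3: transaction 1 writes foobar
        String seenByTx2 = db.read(allowDirtyRead);     // step 4: transaction 2 reads X
        db.abort();                                     // step 5: transaction 1 aborts
        return seenByTx2 + "bar";                       // steps 6-7: transaction 2 appends bar
    }
}
```

Calling replay(true) yields foobarbar, the incorrect result from the list, while replay(false) yields foobar because transaction 2 only ever sees committed data.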

Unrepeatable reads

An unrepeatable read occurs when an application reads data from a database, and when it rereads the data (perhaps later in the same transaction) the data has been changed. Consider this scenario, where two applications are reading and updating the same data:

  1. Application 1 reads the value of String X, which is foo.
  2. Application 2 updates the value of String X to foobar.
  3. Application 1 rereads the value of String X and finds it has changed to foobar.

So between reads, the value of the data has changed and has become inconsistent.

Phantom reads

A phantom read is similar to an unrepeatable read. However, with phantom reads, new data is inserted into the database. An application reads a set of data from a database and finds that when it rereads the same set of data, additional data has been added. Consider this scenario, where two applications are reading and updating the same data on a database:

  1. Application 1 searches for data matching certain criteria and gets back a data set with five rows.
  2. Application 2 adds five additional rows to the database that satisfy application 1's search criteria.
  3. When application 1 rereads the database based on its initial criteria (and expects to find five rows), 10 rows are returned.

Once again, the data has become inconsistent between reads.

Selecting an isolation level

Four levels of transaction isolation are listed below in order of isolation from lowest (weakest) to highest (strongest). Remember that as you increase the isolation level, performance of your application decreases.

  • Read uncommitted -- Use this option only for nonmission-critical systems with unshared data (which is rarely the case in applications). Performance is at its best, but you'll sacrifice concurrency control. Use this option if you're sure there will be no other concurrent transactions. By using this option, none of the data problems listed above is solved.
  • Read committed -- This is the default isolation level for most databases and is the default for Apache Geronimo. Only committed data is read, so this option solves the dirty-read problem. Additional locks are required on the database, so performance is slower than with Read uncommitted.
  • Repeatable read -- By using this isolation level, you address the dirty-read and unrepeatable-read problems. You're guaranteed that any rows you read can be reread at a later time and their values will not have changed.
  • Serializable -- This is the strictest isolation level and addresses all three data problems. Use this level for mission-critical systems, when you want your transaction to behave in a truly isolated fashion and in complete independence of other transactions. You'll be guaranteed data consistency, but be aware that this isolation comes at a performance cost.

Table 2 summarizes isolation level choices and shows how each addresses the three data problems listed earlier.


Table 2. Solutions to data problems using isolation levels
Data solution               Read uncommitted   Read committed   Repeatable read   Serializable
Solves dirty reads          -                  x                x                 x
Solves unrepeatable reads   -                  -                x                 x
Solves phantom reads        -                  -                -                 x
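The same mapping can be expressed in code. The following sketch, built around a hypothetical IsolationGuide class, maps the java.sql.Connection isolation constants to the data problems each level solves according to Table 2, and picks the weakest level that solves a given set of problems. It encodes the standard definitions only; a particular database may behave more strictly at a given level.

```java
import java.sql.Connection;
import java.util.EnumSet;
import java.util.Set;

class IsolationGuide {

    enum Anomaly { DIRTY_READ, UNREPEATABLE_READ, PHANTOM_READ }

    /** Data problems solved at the given java.sql.Connection isolation level, per Table 2. */
    static Set<Anomaly> prevented(int level) {
        switch (level) {
            case Connection.TRANSACTION_READ_UNCOMMITTED:
                return EnumSet.noneOf(Anomaly.class);
            case Connection.TRANSACTION_READ_COMMITTED:
                return EnumSet.of(Anomaly.DIRTY_READ);
            case Connection.TRANSACTION_REPEATABLE_READ:
                return EnumSet.of(Anomaly.DIRTY_READ, Anomaly.UNREPEATABLE_READ);
            case Connection.TRANSACTION_SERIALIZABLE:
                return EnumSet.allOf(Anomaly.class);
            default:
                throw new IllegalArgumentException("Unknown isolation level: " + level);
        }
    }

    /** Weakest (and therefore cheapest) standard level that solves all the given problems. */
    static int weakestLevelFor(Set<Anomaly> mustPrevent) {
        int[] weakestFirst = {
            Connection.TRANSACTION_READ_UNCOMMITTED,
            Connection.TRANSACTION_READ_COMMITTED,
            Connection.TRANSACTION_REPEATABLE_READ,
            Connection.TRANSACTION_SERIALIZABLE
        };
        for (int level : weakestFirst) {
            if (prevented(level).containsAll(mustPrevent)) {
                return level;
            }
        }
        throw new IllegalStateException("Unreachable: Serializable solves every problem");
    }
}
```

Choosing the weakest sufficient level reflects the trade-off described earlier: each step up in isolation costs performance.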

Isolation levels in bean-managed transactions

Isolation levels are specified through the underlying database resource manager. With bean-managed transactions, you have access to the underlying connection programmatically. Because you have access to the java.sql.Connection interface, you can change the isolation level for the connection using the method setTransactionIsolation(int level).

Set the appropriate isolation level using these constants:

  • Connection.TRANSACTION_READ_UNCOMMITTED
  • Connection.TRANSACTION_READ_COMMITTED
  • Connection.TRANSACTION_REPEATABLE_READ
  • Connection.TRANSACTION_SERIALIZABLE

Other methods of interest are:

  • Connection.getTransactionIsolation()
  • DatabaseMetaData.supportsTransactionIsolationLevel(int)

(Refer to the Javadoc for java.sql.Connection and java.sql.DatabaseMetaData for more information about these methods.)

Note: By modifying isolation levels, you're requesting the database resource manager to change the isolation for this resource. There's no requirement that database vendors must support this. In fact, many database vendors won't allow you to do so. Be careful when changing the isolation level. Check your database resource manager documentation to find out which isolation levels are supported.

Also, set the transaction isolation level before you begin a transaction. Never switch an isolation level halfway through a transaction. Most resource managers also require you to use the same level of isolation for all participants in a transaction.
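The following sketch puts these methods together: it asks the driver whether a level is supported, requests it before any transactional work starts, and reports the level actually in effect. IsolationSetupSketch and its fake connection are assumptions for illustration (the fake pretends to support every level up to a chosen maximum) so that the example runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

class IsolationSetupSketch {

    /**
     * Requests the given isolation level if the driver reports support for it.
     * Call this before the transaction begins. Returns the level now in effect.
     */
    static int requestIsolation(Connection con, int level) throws SQLException {
        DatabaseMetaData meta = con.getMetaData();
        if (meta.supportsTransactionIsolationLevel(level)) {
            con.setTransactionIsolation(level);
        }
        return con.getTransactionIsolation();   // unchanged if the level was not supported
    }

    /** Fake Connection (an assumption) that supports levels up to maxSupported. */
    static Connection fakeConnection(final int maxSupported) {
        final ClassLoader loader = IsolationSetupSketch.class.getClassLoader();
        final int[] current = { Connection.TRANSACTION_READ_COMMITTED };
        final Object meta = Proxy.newProxyInstance(
            loader, new Class<?>[] { DatabaseMetaData.class },
            (proxy, method, args) -> {
                if (method.getName().equals("supportsTransactionIsolationLevel")) {
                    return ((Integer) args[0]) <= maxSupported;
                }
                throw new UnsupportedOperationException(method.getName());
            });
        return (Connection) Proxy.newProxyInstance(
            loader, new Class<?>[] { Connection.class },
            (proxy, method, args) -> {
                switch (method.getName()) {
                    case "getMetaData":
                        return meta;
                    case "setTransactionIsolation":
                        current[0] = (Integer) args[0];
                        return null;
                    case "getTransactionIsolation":
                        return current[0];
                    default:
                        throw new UnsupportedOperationException(method.getName());
                }
            });
    }

    /** Runs requestIsolation against the fake and returns the level in effect. */
    static int demo(int maxSupported, int requested) {
        try {
            return requestIsolation(fakeConnection(maxSupported), requested);
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Checking the returned level lets the bean decide what to do when the database silently keeps its default, for example by failing fast instead of running with weaker isolation than it expects.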

Isolation levels in container-managed transactions

When using container-managed transactions, there's no way to specify the isolation level in the deployment descriptor. By default, Geronimo uses the Read Committed isolation level for the EJB container. If you need more granular control of your isolation levels, consider using bean-managed transactions with JDBC transactions.


Transaction timeouts

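When using bean-managed transactions with JTA transactions, you can use the setTransactionTimeout method on the javax.transaction.UserTransaction interface. This sets the maximum time (in seconds) a transaction will run before it aborts. Call it before you begin the transaction: the timeout applies to transactions that the current thread starts afterward.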
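As a rough sketch of the call order, the example below sets the timeout, begins the transaction, and then commits or rolls back. To keep it compilable without a Java EE library, it declares a local Tx interface as a stand-in for the four javax.transaction.UserTransaction methods it uses; Tx, TimeoutSketch, and the recording implementation are assumptions for illustration. In a real bean, you would obtain the UserTransaction from the bean's EJBContext with getUserTransaction().

```java
import java.util.ArrayList;
import java.util.List;

class TimeoutSketch {

    /**
     * Local stand-in (an assumption) for the UserTransaction methods used below.
     * The real interface declares more specific checked exceptions.
     */
    interface Tx {
        void setTransactionTimeout(int seconds) throws Exception;
        void begin() throws Exception;
        void commit() throws Exception;
        void rollback() throws Exception;
    }

    /** Runs the work in a transaction that may last at most the given number of seconds. */
    static void runWithTimeout(Tx tx, int seconds, Runnable work) throws Exception {
        tx.setTransactionTimeout(seconds);  // must come before begin() to affect this transaction
        tx.begin();
        try {
            work.run();
        } catch (RuntimeException e) {
            tx.rollback();                  // the work failed: undo it
            throw e;
        }
        tx.commit();                        // fails if the transaction was already rolled back
    }

    /** Records the order of calls made against a fake transaction. */
    static List<String> demo() {
        final List<String> calls = new ArrayList<>();
        Tx recording = new Tx() {
            public void setTransactionTimeout(int seconds) {
                calls.add("setTransactionTimeout(" + seconds + ")");
            }
            public void begin() { calls.add("begin"); }
            public void commit() { calls.add("commit"); }
            public void rollback() { calls.add("rollback"); }
        };
        try {
            runWithTimeout(recording, 30, () -> calls.add("work"));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return calls;
    }
}
```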


Distributed transactions

When multiple participants within a single transaction are physically distributed across a network, the transaction is known as a distributed transaction. Distributed transactions allow for different types of resources to participate in the transaction. Examples of distributed transactions are:

  • A single session bean begins a transaction and updates database A. It invokes a second session bean running on the same application server to update database B. The first session bean commits the transaction. Both database updates occur in the same transaction.
  • A single session bean begins a transaction and updates database A. It invokes a second session bean running on a different application server to update database B. The transaction managers for each application server will ensure both databases are updated in the same transaction.
  • A single session bean begins a transaction and updates database A, followed by a Java Message Service (JMS) operation. Both units of work are part of the same transaction. If the JMS operation were to fail, the transaction would not update the database.

Several transaction managers must work together to perform a distributed transaction. Usually a single transaction manager (called the transaction coordinator or distributed transaction manager) is appointed to coordinate the other transaction managers.

Transaction managers in turn will coordinate with resource managers to perform the necessary commits or rollbacks on their resources (perhaps a database or a messaging server). Most databases have their transaction managers and resource managers tightly coupled together.

Two-phase commits

Distributed transactions are accomplished by communicating using a protocol called the two-phase commit. From the name, you've probably figured out there are two phases:

  • First phase, or prepare to commit:
    • The transaction coordinator sends a signal to each transaction manager to prepare its operation.
    • The transaction manager writes the steps (or detail) of the operation (usually data updates) to a transaction log. In case of failure, the transaction manager uses these steps to repeat the operation.
    • The transaction manager creates a transaction locally and notifies the resource manager to perform the operation on the resource (for example, a database or a message server).
    • The resource manager performs the operation and indicates success (ready to commit signal) or failure (ready to roll back) to the transaction manager.
    • The resource manager waits for further instructions from the transaction manager.
    • The transaction manager indicates success or failure to the transaction coordinator.
  • Second phase, or commit phase: The results of the first phase are communicated to all transaction managers in the second phase. If any transaction manager reports failure, all transaction participants must roll back.
    • The transaction coordinator tells all transaction managers to commit (or roll back).
    • All transaction managers pass the commit or rollback information on to their resource managers.
    • The resource manager indicates success or failure back to the transaction manager.
    • The transaction manager indicates success or failure to the transaction coordinator.
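The two phases can be sketched as a small in-memory simulation. TwoPhaseCommitSketch and its Participant class are hypothetical names; each participant stands for a transaction manager together with its resource manager. The sketch deliberately omits the transaction log, network failures, and recovery, which a real coordinator must handle.

```java
import java.util.Arrays;
import java.util.List;

class TwoPhaseCommitSketch {

    /** Hypothetical participant: a transaction manager fronting one resource. */
    static final class Participant {
        final String name;
        final boolean canPrepare;   // whether its operation succeeds in phase one
        String state = "ACTIVE";

        Participant(String name, boolean canPrepare) {
            this.name = name;
            this.canPrepare = canPrepare;
        }

        /** Phase one: perform the operation and vote ready-to-commit (true) or failure (false). */
        boolean prepare() {
            state = canPrepare ? "PREPARED" : "FAILED";
            return canPrepare;
        }

        void commit() { state = "COMMITTED"; }

        void rollback() { state = "ROLLED_BACK"; }
    }

    /** Coordinator logic: returns true if the global transaction committed. */
    static boolean run(List<Participant> participants) {
        boolean allPrepared = true;
        for (Participant p : participants) {    // phase one: collect every vote
            if (!p.prepare()) {
                allPrepared = false;
            }
        }
        for (Participant p : participants) {    // phase two: same outcome for everyone
            if (allPrepared) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allPrepared;
    }

    /** Two databases take part; the second one may fail to prepare. */
    static String demo(boolean secondCanPrepare) {
        List<Participant> participants = Arrays.asList(
            new Participant("databaseA", true),
            new Participant("databaseB", secondCanPrepare));
        boolean committed = run(participants);
        return committed + ":" + participants.get(0).state + "," + participants.get(1).state;
    }
}
```

When databaseB cannot prepare, databaseA is rolled back as well, even though its own operation succeeded. That all-or-nothing outcome is the point of the protocol.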

Can we talk?

The biggest challenge in distributed transactions is agreeing on the common communication protocol between transaction managers. There is a standardized protocol for two-phase commits called the XA protocol, but not all vendors support this standard. The XA protocol defines the interfaces between the transaction manager and the resource manager.

Geronimo's transaction manager is Java Open Transaction Manager (JOTM), an open source transaction manager. It implements the XA protocol and complies with JTA. Remember, JTA is the interface used in this series to communicate with the transaction manager. You used it with bean-managed transactions to demarcate when you start, commit, or roll back transactions.

As long as all transaction participants agree on the communication protocol, they can participate in the same distributed transaction.

It's not perfect

Be aware that distributed transactions are slower than local transactions. Every transaction participant needs more system resources, and network chatter between the transaction coordinator and the participants affects system response time. Distributed transactions simply take longer because of the sheer number of transaction managers and resource managers involved.


Summary

In this last part of the introductory series on EJB transactions, you reviewed the transaction options and looked at additional configuration choices for using EJB transactions with Geronimo: concurrency control strategies, isolation levels, transaction timeouts, and distributed transactions.



