In Part 1 and Part 2 of this series, you briefly looked at bean-managed and container-managed EJB transactions and how to implement them in the Geronimo application server. So what's next? What other transaction settings are available, and what other considerations should you take into account when using EJB transactions?
This article starts by summarizing the transaction choices from Parts 1 and 2: container-managed or bean-managed transactions. You'll then learn about concurrency control strategies, which are methods that ensure transactions are executed without data loss. You'll also look at isolation levels -- how to control the isolation of a transaction from other transactions -- and find out how to set transaction timeouts. Finally, you'll discover some of the pros and cons of using distributed transactions.
When implementing EJB transactions, you have two options: container-managed or bean-managed transactions.
With container-managed transactions, you specify transaction behavior in your deployment descriptor. The EJB container is responsible for controlling transaction boundaries. You specify transaction attributes for the entire enterprise bean, for individual methods on the bean, or for both. The choices for transaction attributes are Required, RequiresNew, Mandatory, NotSupported, Supports, and Never.
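As a rough illustration of how the container interprets a transaction attribute, the following sketch models the decision it makes when a bean method is called. The enum and method names (`TxAttribute`, `TxAction`, `resolve`) are hypothetical and exist only for this example; in a real application the container applies these rules for you based on the deployment descriptor.

```java
/** What the container does with the caller's transaction (hypothetical names). */
enum TxAction { JOIN_CALLER, START_NEW, RUN_WITHOUT_TX, REJECT }

/** The six EJB container-managed transaction attributes. */
enum TxAttribute {
    REQUIRED, REQUIRES_NEW, MANDATORY, NOT_SUPPORTED, SUPPORTS, NEVER;

    /** Decides how a call is handled, given whether the caller already has a transaction. */
    TxAction resolve(boolean callerHasTx) {
        switch (this) {
            case REQUIRED:      return callerHasTx ? TxAction.JOIN_CALLER : TxAction.START_NEW;
            case REQUIRES_NEW:  return TxAction.START_NEW;       // caller's transaction, if any, is suspended
            case MANDATORY:     return callerHasTx ? TxAction.JOIN_CALLER : TxAction.REJECT;
            case NOT_SUPPORTED: return TxAction.RUN_WITHOUT_TX;  // caller's transaction, if any, is suspended
            case SUPPORTS:      return callerHasTx ? TxAction.JOIN_CALLER : TxAction.RUN_WITHOUT_TX;
            case NEVER:         return callerHasTx ? TxAction.REJECT : TxAction.RUN_WITHOUT_TX;
            default:            throw new AssertionError(this);
        }
    }
}
```

Here `REJECT` stands for the container throwing an exception back to the caller instead of invoking the method.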
With bean-managed transactions, you programmatically control your transaction boundaries and decide when transactions begin, commit, and roll back. Within bean-managed transactions, you can choose between implementing Java Transaction API (JTA) or Java Database Connectivity (JDBC) transactions. JTA transactions use the javax.transaction.UserTransaction interface to control transactions, while JDBC transactions control the behavior of transactions by performing operations directly through the java.sql.Connection interface.
If you're using session beans or message-driven beans (MDBs), you can implement bean-managed or container-managed transactions. Entity beans, however, can only use container-managed transactions.
Table 1 summarizes these options by transaction type for each enterprise bean implementation.
Table 1. Transaction type options by enterprise bean
|Transaction type||Session bean||Entity bean||Message-driven bean|
|Container-managed||Yes||Yes||Yes|
|Bean-managed||Yes||No||Yes|
If you're unsure of which transaction type to use for your bean, Sun Microsystems recommends using container-managed transactions with the Required attribute for your enterprise bean.
For developers, using container-managed transactions is simpler and requires less work. No transactional logic is required in your bean method. You demarcate transaction boundaries at the method level on the enterprise bean. Your bean method must either run within the context of a transaction or not.
If you require finer control of your transaction boundaries, or if you expect to have long-running processes within your enterprise beans, use bean-managed transactions. Transactions should run for as short a time as possible, and with container-managed transactions the demarcation boundaries are fixed at the bean-method level, which may not be granular enough.
By using bean-managed transactions, you can keep your transactions short-lived. You can isolate the database operations within the transaction and allow the longer-running processes to run outside the scope of the transaction. This keeps the time during which other transactions are blocked from accessing the same data to a minimum.
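The following sketch shows one way to arrange this: the slow preparation runs before the transaction begins, and only the database update runs inside it. To keep the example compilable without an application server, it uses a hypothetical `SimpleTx` interface as a stand-in for the `begin`, `commit`, and `rollback` methods of `javax.transaction.UserTransaction`; the `demo` method and its in-memory transaction exist only to show the order of events.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for part of javax.transaction.UserTransaction (assumption: avoids needing the JTA jar). */
interface SimpleTx {
    void begin() throws Exception;
    void commit() throws Exception;
    void rollback() throws Exception;
}

class ShortTransactionSketch {
    /** Runs slow preparation outside the transaction and only the update inside it. */
    static void process(SimpleTx tx, Runnable slowPreparation, Runnable databaseUpdate) throws Exception {
        slowPreparation.run();      // long-running work: no transaction is open, so no locks are held
        tx.begin();
        try {
            databaseUpdate.run();   // keep only the data access inside the transaction
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();          // undo partial work before propagating the failure
            throw e;
        }
    }

    /** Demonstration only: records the order of events using an in-memory SimpleTx. */
    static List<String> demo(boolean failUpdate) {
        List<String> events = new ArrayList<>();
        SimpleTx tx = new SimpleTx() {
            public void begin() { events.add("begin"); }
            public void commit() { events.add("commit"); }
            public void rollback() { events.add("rollback"); }
        };
        try {
            process(tx, () -> events.add("prepare"), () -> {
                events.add("update");
                if (failUpdate) throw new IllegalStateException("simulated failure");
            });
        } catch (Exception e) {
            events.add("failed");
        }
        return events;
    }
}
```

In a session bean you would pass the real `UserTransaction` (for example, the one obtained from the bean's context) instead of the stand-in.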
There are two strategies for implementing concurrency control using EJBs: You can follow either a pessimistic or an optimistic locking strategy.
With pessimistic locking, you acquire locks for the duration of the transaction on the data you need to modify, which blocks anyone else from modifying the same data. This strategy is used in systems where there's a good chance someone else might attempt to modify the data being worked on. It provides reliable access to data but is best suited to smaller-scale systems, because as the system scales and more locks are required, performance will degrade.
Optimistic locking doesn't hold on to locks during a transaction. You take a positive outlook and assume that the data won't be modified by some other transaction while you're using it. You're able to handle the exceptional cases when data conflicts occur, but assume this will be rare. When updates to data are required, you implement strategies to check that the data has not changed between the time it was last read and then again prior to modification. If the data has not changed, you perform the update. This strategy is suitable for large-scale systems. The disadvantage is that you must implement code to detect and handle data conflicts.
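One common way to detect such conflicts is to keep a version number with each row and to apply an update only if the version is unchanged since it was read. The sketch below illustrates the idea with an in-memory map; `VersionedStore` and its methods are hypothetical names, not part of any EJB or Geronimo API.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a table with a version column (hypothetical; a real bean would use SQL). */
class VersionedStore {
    /** A row snapshot: the value plus the version it was read at. */
    static final class Row {
        final String value;
        final int version;
        Row(String value, int version) { this.value = value; this.version = version; }
    }

    private final Map<String, Row> rows = new HashMap<>();

    synchronized void insert(String key, String value) { rows.put(key, new Row(value, 0)); }

    synchronized Row read(String key) { return rows.get(key); }

    /** Applies the update only if the row is still at the version the caller read; false means conflict. */
    synchronized boolean updateIfUnchanged(String key, Row readEarlier, String newValue) {
        Row current = rows.get(key);
        if (current == null || current.version != readEarlier.version) {
            return false;   // someone else modified the row: the caller must reread and retry
        }
        rows.put(key, new Row(newValue, current.version + 1));
        return true;
    }
}
```

Against a database, the same check is often written as an UPDATE statement whose WHERE clause includes the version that was read, followed by a test of the returned update count.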
The next section discusses changing the isolation level of a transaction. This is an example of pessimistic locking, where you control the degree and effectiveness of pessimistic locking by changing the isolation level of the transaction.
One of the ACID properties of a transaction is isolation. Isolation allows the actions of transactions (whether reading or writing data) to be independent, or isolated, from other concurrently running transactions. By controlling the isolation, each transaction behaves as if it is the only transaction modifying the database at that moment. The degree to which a transaction is isolated from other transactions is called the isolation level.
Lock mechanisms and synchronization are used to control isolation levels. As isolation levels increase, more locks and synchronization are required. As locks are held on the data resource, other transactions attempting to perform any data operations must wait until the lock is released. So, increased isolation comes at the expense of performance. Conversely, as isolation level is decreased, performance improves because transactions spend less time waiting for locks to be released.
Isolation levels are set on a transaction to address the following data consistency problems:
- Dirty reads
- Unrepeatable reads
- Phantom reads
A dirty read occurs when a transaction reads data that another transaction has written but not yet committed. The data being read is out of sync with the actual data in the database.
Consider the following scenario, where two transactions are reading and updating a String field X on a database. String X's initial value is foo.
- Transaction 1 reads the value of String X, which is foo.
- Transaction 1 concatenates String X's current value with bar and saves it to the database.
- The new value of X is foobar. Transaction 1 has not yet issued a commit statement.
- Transaction 2 reads String X's value, which is foobar.
- Transaction 1 aborts.
- Transaction 2 concatenates String X with bar and saves it to the database.
- X's new value is foobarbar, when its correct value should be foobar.
- Transaction 2 has performed a dirty read on String X's value.
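The scenario can be replayed with a toy model, assuming an initial value of foo. The `DirtyReadDemo` class below is purely illustrative: it tracks a single uncommitted write and is not how a real database implements isolation (a real database might, for example, make transaction 2 wait on a lock instead).

```java
/** Toy single-value store that tracks one uncommitted write (hypothetical; not a real database). */
class DirtyReadDemo {
    private String committed;
    private String pending;   // uncommitted value written by an open transaction, or null

    DirtyReadDemo(String initial) { committed = initial; }

    void writeUncommitted(String value) { pending = value; }
    void abort() { pending = null; }
    void commit() { if (pending != null) { committed = pending; pending = null; } }

    /** A dirty read sees pending writes; a committed read does not. */
    String read(boolean allowDirtyRead) {
        return (allowDirtyRead && pending != null) ? pending : committed;
    }

    /** Replays the scenario and returns the final committed value of X. */
    static String run(boolean allowDirtyRead) {
        DirtyReadDemo x = new DirtyReadDemo("foo");
        x.writeUncommitted(x.read(false) + "bar");   // transaction 1 writes foobar, no commit yet
        String seenByTx2 = x.read(allowDirtyRead);   // transaction 2 reads X
        x.abort();                                   // transaction 1 aborts; X is foo again
        x.writeUncommitted(seenByTx2 + "bar");       // transaction 2 appends bar to what it read
        x.commit();
        return x.read(false);
    }
}
```

Calling `DirtyReadDemo.run(true)` reproduces the dirty read and ends with foobarbar, while `DirtyReadDemo.run(false)` ends with foobar.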
The problem here is one transaction can change a value, and a second transaction might read this value before the initial change has been committed. The data is dirty and does not represent the true state of the data.
An unrepeatable read occurs when an application reads data from a database, and when it rereads the data (perhaps later in the same transaction) the data has been changed. Consider this scenario, where two applications are reading and updating the same data:
- Application 1 reads the value of String X.
- Application 2 updates the value of String X.
- Application 1 rereads the value of String X and finds that it has changed.
So between reads, the value of the data has changed and has become inconsistent.
A phantom read is similar to an unrepeatable read. However, with phantom reads, new data is inserted into the database. An application reads a set of data from a database and finds that when it rereads the same set of data, additional data has been added. Consider this scenario, where two applications are reading and updating the same data on a database:
- Application 1 searches for data on certain criteria and returns a data set with five rows.
- Application 2 adds five additional rows to the database that satisfy application 1's search criteria.
- When application 1 rereads the database based on its initial criteria (and expects to find five rows), 10 rows are returned.
Once again, the data has become inconsistent between reads.
Four levels of transaction isolation are listed below in order of isolation from lowest (weakest) to highest (strongest). Remember that as you increase the isolation level, performance of your application decreases.
- Read uncommitted -- Use this option only for non-mission-critical systems with unshared data (which is rarely the case in applications). Performance is at its best, but you'll sacrifice concurrency control. Use this option if you're sure there will be no other concurrent transactions. This option solves none of the data problems listed above.
- Read committed -- This is the default isolation level for most databases and is the default for Apache Geronimo. Only committed data is read, so this option solves the dirty-read problem. Additional locks are required on the database, so performance will be slower.
- Repeatable read -- By using this isolation level, you address the dirty-read and unrepeatable-read problems. You're guaranteed that any rows you read can be reread at a later time and their values will not have changed.
- Serializable -- This is the strictest isolation level and addresses all three data problems. When you want your transaction to behave in a truly isolated fashion and in complete independence of other transactions, use this level. You'll be guaranteed data consistency. Use for mission-critical systems to guarantee truly isolated transaction behavior. But be aware, this isolation comes at a performance cost.
Table 2. Solutions to data problems using isolation levels
|Data solution||Read uncommitted||Read committed||Repeatable read||Serializable|
|Solves dirty reads||-||x||x||x|
|Solves unrepeatable reads||-||-||x||x|
|Solves phantom reads||-||-||-||x|
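The mapping in Table 2 can also be expressed in code. The sketch below uses the isolation constants defined on `java.sql.Connection`; the `ReadAnomaly` enum and the `IsolationLevels.prevents` helper are hypothetical names introduced for this example. It reflects the standard definitions of the levels, and a particular database may behave more strictly than its nominal level requires.

```java
import java.sql.Connection;

/** The three read problems discussed in this article. */
enum ReadAnomaly { DIRTY_READ, UNREPEATABLE_READ, PHANTOM_READ }

class IsolationLevels {
    /**
     * Returns true if the given java.sql.Connection isolation constant prevents the anomaly.
     * Relies on the constants increasing with strictness (1, 2, 4, 8).
     */
    static boolean prevents(int isolationLevel, ReadAnomaly anomaly) {
        switch (anomaly) {
            case DIRTY_READ:        return isolationLevel >= Connection.TRANSACTION_READ_COMMITTED;
            case UNREPEATABLE_READ: return isolationLevel >= Connection.TRANSACTION_REPEATABLE_READ;
            case PHANTOM_READ:      return isolationLevel >= Connection.TRANSACTION_SERIALIZABLE;
            default:                throw new AssertionError(anomaly);
        }
    }
}
```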
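Isolation levels are specified through the underlying database resource manager. With bean-managed transactions, you have access to the underlying connection programmatically. Because you have access to the java.sql.Connection interface, you can change the isolation level for the connection using the method setTransactionIsolation(int level).
Set the appropriate isolation level using these constants:
- Connection.TRANSACTION_READ_UNCOMMITTED
- Connection.TRANSACTION_READ_COMMITTED
- Connection.TRANSACTION_REPEATABLE_READ
- Connection.TRANSACTION_SERIALIZABLE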
Other methods of interest are:
(Refer to the Sun JavaDoc API in the Resources section for more information about these methods.)
Note: By modifying isolation levels, you're requesting the database resource manager to change the isolation for this resource. There's no requirement that database vendors must support this. In fact, many database vendors won't allow you to do so. Be careful when changing the isolation level. Check your database resource manager documentation to find out which isolation levels are supported.
Also, set the transaction isolation level before you begin a transaction. Never switch an isolation level halfway through a transaction. Most resource managers also require you to use the same level of isolation for all participants in a transaction.
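A minimal sketch of this advice follows: it asks the driver whether the level is supported, sets the level, and only then starts the JDBC transaction by turning off auto-commit. The `IsolationSetup` class and `beginWithIsolation` helper are hypothetical names; the `demo` method uses recording proxies in place of a real database purely so that the example can run on its own.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class IsolationSetup {
    /**
     * Requests the isolation level before any transactional work starts.
     * Returns false, leaving the connection untouched, if the driver reports the level as unsupported.
     */
    static boolean beginWithIsolation(Connection con, int level) throws SQLException {
        DatabaseMetaData meta = con.getMetaData();
        if (!meta.supportsTransactionIsolationLevel(level)) {
            return false;
        }
        con.setTransactionIsolation(level);  // set the level first...
        con.setAutoCommit(false);            // ...then start the JDBC transaction
        return true;
    }

    /** Demonstration only: runs the helper against a recording fake instead of a real database. */
    static List<String> demo(int level, boolean supported) {
        List<String> calls = new ArrayList<>();
        ClassLoader cl = IsolationSetup.class.getClassLoader();
        DatabaseMetaData meta = (DatabaseMetaData) Proxy.newProxyInstance(
                cl, new Class<?>[] { DatabaseMetaData.class },
                (proxy, method, args) -> supported);   // answers supportsTransactionIsolationLevel
        Connection con = (Connection) Proxy.newProxyInstance(
                cl, new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("getMetaData")) return meta;
                    calls.add(method.getName() + "(" + args[0] + ")");
                    return null;                       // the setters used here return void
                });
        try {
            beginWithIsolation(con, level);
        } catch (SQLException e) {
            calls.add("error");
        }
        return calls;
    }
}
```

With a real connection, you can also call `getTransactionIsolation()` afterward to confirm which level the driver actually applied.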
When using container-managed transactions, there's no way to specify the isolation level in the deployment descriptor. By default, Geronimo uses the Read Committed isolation level for the EJB container. If you need more granular control of your isolation levels, consider using bean-managed transactions with JDBC transactions.
When using bean-managed transactions with JTA transactions, you can use the setTransactionTimeout method on the javax.transaction.UserTransaction interface. This sets the maximum time (in seconds) a transaction will run before it aborts.
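Because the timeout applies to transactions that are begun after it is set, the call belongs before `begin`. The sketch below shows that ordering. As an assumption made so the example compiles without the JTA library, it uses a hypothetical `TimedTx` interface that mirrors the relevant methods of `javax.transaction.UserTransaction`; the 30-second value is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for part of javax.transaction.UserTransaction (assumption: avoids needing the JTA jar). */
interface TimedTx {
    void setTransactionTimeout(int seconds) throws Exception;
    void begin() throws Exception;
    void commit() throws Exception;
    void rollback() throws Exception;
}

class TimeoutSketch {
    /** Sets the timeout first, because it applies to transactions begun afterward. */
    static void runWithTimeout(TimedTx tx, int seconds, Runnable work) throws Exception {
        tx.setTransactionTimeout(seconds);
        tx.begin();
        try {
            work.run();
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
    }

    /** Demonstration only: records the order of calls using an in-memory TimedTx. */
    static List<String> demo() {
        List<String> calls = new ArrayList<>();
        TimedTx tx = new TimedTx() {
            public void setTransactionTimeout(int seconds) { calls.add("setTransactionTimeout(" + seconds + ")"); }
            public void begin() { calls.add("begin"); }
            public void commit() { calls.add("commit"); }
            public void rollback() { calls.add("rollback"); }
        };
        try {
            runWithTimeout(tx, 30, () -> calls.add("work"));
        } catch (Exception e) {
            calls.add("failed");
        }
        return calls;
    }
}
```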
When multiple participants within a single transaction are physically distributed across a network, the transaction is known as a distributed transaction. Distributed transactions allow for different types of resources to participate in the transaction. Examples of distributed transactions are:
- A single session bean begins a transaction and updates database A. It invokes a second session bean running on the same application server to update database B. The first session bean commits the transaction. Both database updates occur in the same transaction.
- A single session bean begins a transaction and updates database A. It invokes a second session bean running on a different application server to update database B. The transaction managers for each application server will ensure both databases are updated in the same transaction.
- A single session bean begins a transaction and updates database A, followed by a Java Message Service (JMS) operation. Both units of work are part of the same transaction. If the JMS operation were to fail, the transaction would not update the database.
Several transaction managers must work together to perform a distributed transaction. Usually a single transaction manager (called the transaction coordinator or distributed transaction manager) is appointed to coordinate the other transaction managers.
Transaction managers in turn will coordinate with resource managers to perform the necessary commits or rollbacks on their resources (perhaps a database or a messaging server). Most databases have their transaction managers and resource managers tightly coupled together.
Distributed transactions are accomplished by communicating using a protocol called the two-phase commit. From the name, you've probably figured out there are two phases:
- First phase, or prepare to commit:
  - The transaction coordinator sends a signal to each transaction manager to prepare its operation.
  - The transaction manager writes the steps (or detail) of the operation (usually data updates) to a transaction log. In case of failure, the transaction manager uses these steps to repeat the operation.
  - The transaction manager creates a transaction locally and notifies the resource manager to perform the operation on the resource (for example, a database or a message server).
  - The resource manager performs the operation and indicates success (ready to commit signal) or failure (ready to roll back) to the transaction manager.
  - The resource manager waits for further instructions from the transaction manager.
  - The transaction manager indicates success or failure to the transaction coordinator.
- Second phase, or commit phase: The results of the first phase are communicated to all transaction managers in the second phase. If any transaction manager reports failure, all transaction participants must roll back.
  - The transaction coordinator tells all transaction managers to commit (or roll back).
  - All transaction managers pass the commit or rollback information on to their resource managers.
  - The resource manager indicates success or failure back to the transaction manager.
  - The transaction manager indicates success or failure to the transaction coordinator.
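The two phases above can be sketched as a small coordinator loop. This is a simplified model under stated assumptions: the `Participant` interface and `TwoPhaseCoordinator` class are hypothetical names, each participant stands for a transaction manager together with its resource manager, and the logging, timeouts, and crash recovery that a real transaction manager needs are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical participant: a transaction manager fronting one resource. */
interface Participant {
    boolean prepare();   // phase 1: perform the work and vote ready to commit (true) or not (false)
    void commit();       // phase 2, when every participant voted yes
    void rollback();     // phase 2, when at least one participant voted no
}

class TwoPhaseCoordinator {
    /** Returns true if every participant committed, false if every participant rolled back. */
    static boolean execute(List<Participant> participants) {
        boolean allReady = true;
        for (Participant p : participants) {
            allReady &= p.prepare();                 // phase 1: collect every vote
        }
        for (Participant p : participants) {
            if (allReady) p.commit(); else p.rollback();   // phase 2: same outcome for all
        }
        return allReady;
    }

    /** Demonstration only: two recording participants, A always votes yes. */
    static List<String> demo(boolean secondVotesYes) {
        List<String> log = new ArrayList<>();
        List<Participant> participants = new ArrayList<>();
        participants.add(recording("A", true, log));
        participants.add(recording("B", secondVotesYes, log));
        execute(participants);
        return log;
    }

    private static Participant recording(String name, boolean vote, List<String> log) {
        return new Participant() {
            public boolean prepare() { log.add(name + ":prepare"); return vote; }
            public void commit() { log.add(name + ":commit"); }
            public void rollback() { log.add(name + ":rollback"); }
        };
    }
}
```

The point to notice is that no participant commits until every participant has voted, which is what lets a single failure roll back the whole transaction.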
The biggest challenge in distributed transactions is agreeing on the common communication protocol between transaction managers. There is a standardized protocol for two-phase commits called the XA protocol, but not all vendors support this standard. The XA protocol defines the interfaces between the transaction manager and the resource manager.
Geronimo's transaction manager is Java Open Transaction Manager (JOTM), an open source transaction manager. It implements the XA protocol and complies with JTA. Remember, JTA is the interface used in this series to communicate with the transaction manager. You used it with bean-managed transactions to demarcate when you start, commit, or roll back transactions.
As long as all transaction participants agree on the communication protocol, they can participate in the same distributed transaction.
Be aware that distributed transactions are slower than local transactions. All transaction participants need more system resources, and network chatter between the transaction coordinator and the transaction participants will affect system response time. Distributed transactions simply take longer because of the number of transaction managers and resource managers involved.
In this, the last part of the introductory series on EJB transactions, you've seen a summary of options and a discussion on some additional configuration options and choices for using EJB transactions with Geronimo.
- Read "Dive into EJB Web applications on Apache Geronimo" (developerWorks, July 2005), a great article about using EJBs with Geronimo.
- Check out "Deploy J2EE applications on Apache Geronimo" (developerWorks, January 2006) for information on deploying JavaServer Pages (JSPs), servlets, and different EJBs on Apache Geronimo.
- Get more information about deployment descriptors in Geronimo.
- Read more about Sun's EJB specification.
- Learn about XDoclet.
- Read Nicholas Chase's interview with David Blevins, the cofounder of OpenEJB, in "OpenEJB and Apache Geronimo's EJB implementation" (developerWorks, May 2006) to learn how OpenEJB came to be chosen as the EJB implementation in Geronimo.
- Read "Building a better J2EE server, the open source way" (developerWorks, May 2005) for great information about the development process behind Geronimo, including its JSR 88 support.
- Check out the developerWorks Apache Geronimo project area for articles, tutorials, and other resources to help you get started developing with Geronimo today.
- Find helpful resources for beginners and experienced users at the Get started now with Apache Geronimo section of developerWorks.
Get products and technologies
- Download Apache Geronimo.