Choosing the Right EJB Type: Some Design Criteria

When implementing your application design with EJBs, how do you decide what kind to use? This article reviews some EJB basics and describes the different kinds of EJBs (session, entity, stateless, stateful). It then gives you six rules of thumb that can help you decide what kinds of EJBs are best in a given situation, with special attention to the issues of database persistence and efficiency.

Kyle Brown (brownkyl@us.ibm.com), Senior Java Consultant, IBM WebSphere Services, IBM

Kyle Brown is an Executive Java Consultant with IBM Software Services for WebSphere. Kyle provides consulting services, education, and mentoring on object-oriented topics and Java 2 Enterprise Edition (J2EE) technologies to Fortune 500 clients. He is a co-author of Enterprise Java Programming with IBM WebSphere, the WebSphere AEs 4.0 Workbook for Enterprise Java Beans, 3rd Edition, and The Design Patterns Smalltalk Companion. He is also a frequent conference speaker on the topics of Enterprise Java, OO Design, and Design Patterns. Kyle can be reached at brownkyl@us.ibm.com.



01 August 2000

Introduction

When you are trying to master any new technology, you are inevitably faced with a number of choices between what seem to be nearly identical alternatives. Choosing which road to take is often more of an art than a science in such cases. Unfortunately, that seems to be the state of the art today when it comes to building systems that use Enterprise JavaBeans (EJBs). So far, very little guidance has been given to developers in choosing how to implement their designs using the new technology. This is particularly true when issues of database persistence and efficiency need to be considered.

This article will address some of those issues, and provide some much-needed guidance. After reviewing some EJB basics, I will examine six different "rules of thumb" that can help determine when to use a particular type of EJB in different situations.

Some EJB Basics

First of all, let's review the different types of EJBs.

Important: If you are unfamiliar with the EJB specification, you should start by reviewing [Sun 99] or [Monson-Haefel 00] to gain a basic understanding of what an EJB is. Study the diagram below, and then walk through the explanation of the different "branches" of the diagram:

An EJB is a software component. It is distributed, transactional, and possibly persistent. EJBs run within the context of a software system called an EJB Server. An EJB Server may contain one or more EJB Containers that are responsible for handling the details of persistence, thread safety, and other concerns. There are many different EJB Servers on the market today, such as IBM WebSphere Application Server, Advanced Edition (hereafter referred to as WebSphere AE) and BEA WebLogic. There are two basic types of EJBs: Session EJBs and Entity EJBs. Session EJBs are function-oriented components -- they represent a set of behaviors that reside on a server and can be invoked by clients. There are two sub-types of Session EJBs: Stateless Session EJBs and Stateful Session EJBs.

Stateless Session EJB -- Represents a set of related behaviors (methods) that do not retain client-specific state between invocations. Contrary to popular belief, stateless session beans can in fact possess instance variables, but those variables must be shared among multiple potential clients (that is, they should be read-only). This often-overlooked fact can be key to understanding some potential uses of stateless Session EJBs.

If you are familiar with a traditional transaction-processing system like CICS or Encina, then you can think of each method call to a Stateless Session EJB as an individual non-conversational transaction. In fact, the EJB specification allows some flexibility in the design of an EJB Server such that each method call to a single EJB client stub for a Stateless Session EJB may be directed to a different instance of a Stateless Session EJB -- possibly each residing on a different physical machine. This allows the EJB server to be flexible in its approach to workload management, providing safety and scalability.

Stateful Session EJB -- Closer to a traditional "distributed object" as implemented with CORBA or RMI than a stateless session EJB. Each stateful session EJB is "owned" by a single client stub and is uniquely connected to that stub. As a result, stateful session EJBs may retain client state across method invocations. That is to say, a client may call one method that sets a variable value and then be assured that a later method invocation to retrieve that value will return the same value.

Entity Beans model data entities and provide shared, distributed, transactional access to persistent data. As such, Entity Beans provide concurrent shared access to multiple users. An individual Entity EJB represents a single persistent entity -- for instance, a row in a relational database (RDB). In contrast, Session EJBs model business processes and provide exclusive access to a single client, either for the length of a method call (in the stateless case) or for the life of the bean (in the stateful case).

There are two basic subtypes of Entity EJBs: Container-Managed Persistence (CMP) Entity EJBs and Bean-Managed Persistence (BMP) Entity EJBs. CMP entities are those whose persistence (for example, the storing and retrieving of their data from a backing store like an RDB) is managed by the EJB container. This means that the container would, for instance, manage both generating and executing SQL code to read and write to the RDB. On the other hand, Bean-Managed Persistence (BMP) EJBs leave the management of such details as what SQL is executed to the developer of the EJB. Each BMP EJB is responsible for storing and retrieving its own state from a backing store in response to specific "hook" messages (like ejbLoad() and ejbStore()) that are sent to it at appropriate times during its lifecycle.

Distributed object design is a subject that, if not exactly a science, is at least fairly well understood in the industry. There are a number of excellent books like [Orfali] and [Mobray] that have been in publication for several years that give guidance on mastering the art of distributed object design with technologies like CORBA. Since Session EJBs bear a lot of resemblance to standard distributed objects built using CORBA or RMI, most of the same "design patterns" and strategies that apply to these technologies also apply to Session EJB design.

However, it is when you start to consider Entity EJB design that people begin to have differing opinions, and where the expertise starts to wear thin. This subject is a more challenging one, and one that fewer people are comfortable with. Some hints for designing systems that use both types of EJBs will be the focus of the remainder of the article.

Choosing EJBs

Possibly the best way to learn a new subject is by following the "Socratic Method" -- in other words, to learn by asking questions. In this spirit, the following questions can help you understand more about your own designs by helping you learn when to choose each type of EJB. Ask these questions about your own requirements, and then ponder the considerations given afterward in order to understand how to apply EJBs in your particular environment.

Is there a set of objects (perhaps constituting a logical subsystem) that are both read and updated relatively frequently, with complex relationships between them changing rapidly?

In a nutshell, this is the case for using Container-Managed Persistence (CMP) Entity EJBs. When a set of complex relationships exists between different EJBs, the complexity of programming the relationship management becomes a key driver in choosing a solution. CMP Entity EJBs, especially as the code generators in VisualAge for Java, Enterprise Edition implement them, are a compelling solution to this problem. The code generators in VisualAge for Java can handle complex single and multi-valued relationships between different EJBs, and handle EJB inheritance issues. VisualAge for Java also handles complex mapping between the EJB design and underlying relational database stores [Brown 99a], including both composition (1-1 and 1-N relationships) and implementation inheritance. When a tool can generate this code, rather than it being laboriously hand coded, a system can be adapted more easily to changes in the requirements or the underlying data model.

Another feature of CMP that makes it attractive is its ability to manage optimization of the set of SQL calls that must be made in order to read or write the persistent state of an EJB. For instance, the CMP model in WebSphere AE allows a set of Entity EJBs to be read from a relational database in a find() method with only a single SQL SELECT call. This is much more efficient than the default BMP case, which requires N+1 SQL calls to do the same thing, unless some complex caching scheme is used (which will be the subject of a future article).
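
To make the N+1 count concrete, here is a minimal sketch that simulates the two loading strategies against an in-memory table and counts the queries each one issues. The names FakeDb, QueryCountDemo, loadOneByOne, and loadInOneQuery, and the three sample rows, are hypothetical and invented for illustration; no real EJB container or JDBC driver is involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a relational table; it only counts queries. */
class FakeDb {
    private final Map<Integer, String> rows = new LinkedHashMap<>();
    int queryCount = 0;

    FakeDb() {
        // Made-up sample data
        rows.put(1, "Ann");
        rows.put(2, "Bob");
        rows.put(3, "Cid");
    }

    /** Simulates "SELECT id FROM Customer": one query that returns only keys. */
    List<Integer> selectKeys() {
        queryCount++;
        return new ArrayList<>(rows.keySet());
    }

    /** Simulates "SELECT name FROM Customer WHERE id = ?": one query per row. */
    String selectNameById(int id) {
        queryCount++;
        return rows.get(id);
    }

    /** Simulates "SELECT id, name FROM Customer": one query that returns all state. */
    Map<Integer, String> selectAll() {
        queryCount++;
        return new LinkedHashMap<>(rows);
    }
}

class QueryCountDemo {
    /** Default BMP style: the finder fetches keys, then each bean loads itself. */
    static int loadOneByOne(FakeDb db) {
        db.queryCount = 0;
        for (int id : db.selectKeys()) {
            db.selectNameById(id);
        }
        return db.queryCount;
    }

    /** Single-SELECT style: the finder brings back the state of every row at once. */
    static int loadInOneQuery(FakeDb db) {
        db.queryCount = 0;
        db.selectAll();
        return db.queryCount;
    }

    public static void main(String[] args) {
        FakeDb db = new FakeDb();
        System.out.println("one by one:    " + loadOneByOne(db) + " queries");
        System.out.println("single select: " + loadInOneQuery(db) + " queries");
    }
}
```

With N rows the first strategy issues N+1 queries (one for the keys plus one per row), while the second always issues one, which is why the gap widens as the finder returns more beans.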

Since so much work has gone into optimizing and designing the CMP support in the WebSphere AE and VisualAge for Java, it is difficult for programmers to significantly improve upon its performance in the default case. So, for most persistence tasks involving a relational database, CMP should be the first choice.

Do you have an object structure in your design that corresponds to a relational join, or do you see the need for relational join to improve overall system performance?

For instance, consider the following problem -- let's say we are building a frequent flyer Web site for an airline. This site might have a Customer object as one of its main parts. Furthermore, let's say that the Customer has an Address that indicates where the Customer lives. Here we have a one-to-one relationship between a Customer and their Address.

If we make both the Customer and the Address CMP Entity EJBs, as in the previous solution, then obtaining a Customer and his Address takes two SQL SELECT statements: one to select the Customer from the Customer table (inside the findByPrimaryKey() of the Customer bean), and another to select the Address from the Address table (inside the findByPrimaryKey() of the Address bean).

However, after some thought, we realize this isn't necessary -- in SQL we can retrieve the data in both the Customer and the Address objects with a single SELECT statement using a join if the two tables are linked by a foreign key relationship.

The problem is that while you can do this in SQL, it isn't possible to do this in the current implementation of CMP in WebSphere AE. In fact, no EJB Server that I am aware of really handles relational joins well. VisualAge for Java does generate code for WebSphere AE to handle simple relational joins with Entity EJBs, but only where two tables share the same primary key. It will not handle general foreign key relationships like the one described above.

So, we are then stuck with using a BMP solution. However, this is a good thing. We can create what is called a Dependent Object, which simply means a Java Bean (not an EJB) that the Customer EJB will create and return when asked for an Address. We will create our Address Java bean in the ejbLoad() of our Customer BMP EJB.

Let's see how this would work in the following code fragment, which illustrates part of the implementation of the Customer BMP Entity Bean:

public class CustomerBean implements EntityBean {
   ...
   private String customerName;
   private int accountBalance;
   private Address address;
   ...
}

Here we see that in addition to fields like customerName and accountBalance, we have a Dependent Object of type Address. The Address is defined below:

public class Address implements java.io.Serializable {
   public String streetAddress;
   public String city;
   public String state;
   public String zip;
}

Now we can see how the ejbLoad() method would work and fill in the appropriate pieces of the EJB from each table:

public void ejbLoad() throws java.rmi.RemoteException {
   boolean wasFound = false;
   boolean foundMultiples = false;
   CustomerKey key = (CustomerKey) getEntityContext().getPrimaryKey();
   makeConnection();
   try {
      // SELECT the customer and its address with a single join
      String sqlString = "SELECT a.name, a.accountBalance, b.street, b.city, b.state, "
         + "b.zip FROM Customer a, Address b WHERE (a.id = ? AND a.addressFK = b.id)";
      PreparedStatement sqlStatement = jdbcConn.prepareStatement(sqlString);
      sqlStatement.setInt(1, key.id);
      // Execute query
      ResultSet sqlResults = sqlStatement.executeQuery();
      // Advance cursor (there should be only one item)
      // wasFound will be true if there is one
      wasFound = sqlResults.next();
      if (wasFound) {
         // If the join ran correctly, then set the internal
         // variables and create the dependent Address object
         this.setName(sqlResults.getString(1));
         this.setAccountBalance(sqlResults.getInt(2));
         Address anAddress = new Address();
         anAddress.streetAddress = sqlResults.getString(3);
         anAddress.city = sqlResults.getString(4);
         anAddress.state = sqlResults.getString(5);
         anAddress.zip = sqlResults.getString(6);
         this.setAddress(anAddress);
         // foundMultiples will be true if more than one is found.
         foundMultiples = sqlResults.next();
      }
   } catch (Exception e) { // DB error
      throw new RemoteException("Database Exception " + e + " caught in ejbLoad()");
   } finally {
      // Release the connection even if the query failed
      dropConnection();
   }
   if (!wasFound) {
      // Report finding no row for the key
      throw new RemoteException("No row found for primary key in ejbLoad().");
   }
   if (foundMultiples) {
      // Report finding multiple rows for the key
      throw new RemoteException("Multiple rows found for unique key in ejbLoad().");
   }
}

Now this solution has been kept simple so that it can be easily understood -- it only describes a single dependent object. However, consider what happens when we say that the customer should also be able to view the recent trips they have taken to see how many miles were accumulated on those trips. There is a one-to-many relationship between the customer and the trips that they have taken.

If we also join the Trip table into our previous SQL query, we can retrieve all of the information from all three tables in a single SQL statement. The CMP solution, however, requires three SELECT statements. It is when this difference (three statements vs. one statement) is multiplied over thousands of requests that the performance difference becomes noticeable.
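
A three-table join returns one row per trip, with the customer and address columns repeated on every row, so the loading code has to fold those rows back into one customer and a list of trips. The following is a minimal sketch of that folding step, using an in-memory list in place of a JDBC ResultSet. The classes JoinedRow, Trip, CustomerState, and JoinAssembler, their fields, and the sample data are all hypothetical; they are not part of the Customer bean shown earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** One row of a hypothetical Customer/Address/Trip join result. */
class JoinedRow {
    final String customerName;
    final String city;
    final String destination;
    final int miles;

    JoinedRow(String customerName, String city, String destination, int miles) {
        this.customerName = customerName;
        this.city = city;
        this.destination = destination;
        this.miles = miles;
    }
}

/** Dependent object for one trip (plain Java, not an EJB). */
class Trip {
    final String destination;
    final int miles;

    Trip(String destination, int miles) {
        this.destination = destination;
        this.miles = miles;
    }
}

/** The state a Customer bean might hold after loading (assumed shape). */
class CustomerState {
    String name;
    String city;
    final List<Trip> trips = new ArrayList<>();

    int totalMiles() {
        int total = 0;
        for (Trip t : trips) {
            total += t.miles;
        }
        return total;
    }
}

class JoinAssembler {
    /**
     * Folds the joined rows into one customer plus its trips.
     * An inner join returns no rows for a customer without trips,
     * so this sketch assumes at least one trip exists.
     */
    static CustomerState assemble(List<JoinedRow> rows) {
        if (rows.isEmpty()) {
            throw new IllegalArgumentException("no rows returned for customer");
        }
        CustomerState state = new CustomerState();
        // Customer and address columns repeat on every row; read them once.
        state.name = rows.get(0).customerName;
        state.city = rows.get(0).city;
        // Each row contributes exactly one trip.
        for (JoinedRow row : rows) {
            state.trips.add(new Trip(row.destination, row.miles));
        }
        return state;
    }

    public static void main(String[] args) {
        // Made-up sample rows standing in for the result of one SELECT
        List<JoinedRow> rows = Arrays.asList(
            new JoinedRow("Pat Doe", "Raleigh", "SFO", 2400),
            new JoinedRow("Pat Doe", "Raleigh", "ORD", 650));
        CustomerState c = assemble(rows);
        System.out.println(c.name + " flew " + c.totalMiles()
            + " miles on " + c.trips.size() + " trips");
    }
}
```

In a real ejbLoad() the same loop would run over the ResultSet, reading the customer columns on the first call to next() and one Trip per row after that.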

Do you have an object structure that represents an N-ary relationship in your underlying database?

In the previous section, you saw that SQL joins can be problematic for the current WebSphere AE CMP solution. This is compounded when you have to deal with N-ary relationships involving multiple tables joined together by a single relationship table. The same solution outlined above can help in that situation, however. Using BMP you can develop a single EJB that represents the group of related objects (the association table and the tables referenced by that table) and load each group in a single SQL statement. This solution is detailed in [Brown 00], which includes source code and an analysis of the performance gains from using a BMP solution.

Does your application contain a relatively small set of objects that are very rarely, or never updated, but whose state is frequently read?

Almost every program has at least some examples of this sort of object. For instance, insurance applications have a number of codes for different medical procedures that change very rarely, perhaps once a year. Another, more common type of object like this is a political entity like a county or a state. These change exceedingly rarely, but the list of them may be expanded if an application must be made to work internationally.

So why not use one of the previous solutions and represent these objects as BMP Entity beans or CMP Entity beans? The key difference with this case is that these objects are not transactional -- they are read-only. To understand why this makes a difference, consider some of the benefits to using Entity EJBs that are provided by the EJB container. All Entity EJBs are automatically:

  • Distributed -- so that changes can be made from many different clients.
  • Transactional -- so that those changes are kept isolated from one another (depending on the transaction isolation setting used).
  • Persistent -- so that you don't have to worry about deciding in your bean when to save or retrieve state from the database.

However, these advantages come at a substantial run-time cost. There is substantial overhead in the distribution aspect, since each EJB will be its own distributed object, necessitating stubs and associated objects for each instance. There is also overhead in the transactional and persistent aspects, since the frameworks that handle these aspects must come into play at each method call to an Entity EJB.

Also, you need to think about how the state of the EJBs will be held in the memory of the EJB server. The EJB specification defines three caching options. In the default used by WebSphere AE (Option C), an Entity's state is read at the beginning of each transaction -- even if the value did not change from the last time it was read. In Option A caching, which is also available in WebSphere AE, the state is read once and will always be the same, regardless of the transaction. However, Option A caching means that each instance of the EJB will be held in memory (or on disk in the passivation cache), which is a significant run-time penalty since each Entity EJB is composed of not one instance of a class, but many instances of many classes.
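
The difference in database reads between the two options can be seen in a toy model. The sketch below is not how any container is implemented; it only counts how often a bean's state would be loaded over a series of transactions under each policy. The classes CountingEntity and CommitOptionDemo and the flag keepStateBetweenTransactions are hypothetical names chosen for this illustration.

```java
/** Hypothetical stand-in for an entity bean instance; it only counts loads. */
class CountingEntity {
    int loads = 0;
    boolean stateValid = false;

    /** Stands in for the container-driven load of state from the database. */
    void ejbLoad() {
        loads++;
        stateValid = true;
    }
}

class CommitOptionDemo {
    /**
     * Runs the given number of transactions against one bean and
     * returns how many times its state was loaded.
     *
     * keepStateBetweenTransactions = true  models Option A (state kept in memory)
     * keepStateBetweenTransactions = false models Option C (state reloaded each time)
     */
    static int run(boolean keepStateBetweenTransactions, int transactions) {
        CountingEntity bean = new CountingEntity();
        for (int t = 0; t < transactions; t++) {
            // At the start of a transaction, load only if no valid state is held.
            if (!bean.stateValid) {
                bean.ejbLoad();
            }
            // ... business methods would run here ...
            // At commit, Option C gives up the in-memory state.
            if (!keepStateBetweenTransactions) {
                bean.stateValid = false;
            }
        }
        return bean.loads;
    }

    public static void main(String[] args) {
        System.out.println("Option A loads: " + run(true, 5));
        System.out.println("Option C loads: " + run(false, 5));
    }
}
```

The trade-off described above is visible here: the Option A model reads once but must keep the instance resident, while the Option C model holds nothing between transactions and pays one read per transaction.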

So what is the best solution to this problem, since Entity EJBs carry such high overhead? This situation is best handled by a stateless Session bean that returns a set of Java beans (dependent objects) whose state is only read once, usually on program startup. If we read the state of these beans once and hold them in memory for the lifetime of the stateless session bean, we will save a large number of needless calls to the back-end storage mechanism.

Let's describe this solution with a simple example. Let's say we are writing a mortgage application that needs to be able to calculate the property tax on a particular piece of property. Let's say that one of the pieces of information needed in that calculation is a table listing the county that the property is located in, and the tax rate for that county.

We can easily define a stateless session bean to access this information as follows:

public class PropertyTaxLookupFacadeBean implements SessionBean {
   private javax.ejb.SessionContext mySessionCtx = null;
   final static long serialVersionUID = 3206093459760846163L;
}

The part of this EJB that we are interested in would be the lookup() method in its public interface, which returns the property tax rate for a particular county:

public int lookup(String county) {
   return PropertyTaxLookup.getInstance().lookup(county);
}

Now, this method does something interesting. It defers the lookup to another class: a class named PropertyTaxLookup. In fact, this is the heart of this solution. We can make the table of property taxes be held in a Singleton instance (that is, it uses the Singleton pattern from [Gamma]):

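import java.util.*;

/**
 * Holds the table of county property tax rates in memory.
 * A single shared instance is created on first use (Singleton).
 */
public class PropertyTaxLookup {
   // Tax rates (Integers) keyed by county name (Strings)
   private java.util.Hashtable taxRates;
   // The lone shared instance
   private static PropertyTaxLookup instance;

   // Private so that clients must go through getInstance()
   private PropertyTaxLookup(java.util.Hashtable rates) {
      taxRates = rates;
   }
}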

The PropertyTaxLookup class defines a Hashtable of taxRates (Integers that hold the tax rate, keyed by Strings that represent the names of the county) and a static instance of PropertyTaxLookup that is the Singleton. The lookup() method that does the work of retrieving the tax rate is shown below:

public int lookup(String county) throws NoSuchElementException {
   Object result = taxRates.get(county);
   if (result == null)
      throw new NoSuchElementException("County not found");
   return ((Integer) result).intValue();
}

Clients (like the Session bean described above) use the getInstance() method that returns the singleton instance of PropertyTaxLookup. This method is shown below:

public static PropertyTaxLookup getInstance() {
   if (instance == null) {
      PropertyTaxBroker broker = new PropertyTaxBroker();
      instance = new PropertyTaxLookup(broker.retrievePropertyTaxes());
   }
   return instance;
}

The getInstance() method above connects to the last part of our solution, a "Broker" class responsible for executing the SQL to retrieve the property taxes from the database on startup. The class definition of the PropertyTaxBroker is shown below:

public class PropertyTaxBroker {
   Connection conn = null;
   DataSource ds = null;
   private static String SELECT_STRING = "select county, taxRate from PropertyTaxEntry";
   private static String DATASOURCE_NAME = "jdbc/MyDataSource";
}

The last part of our solution is the method that retrieves the property tax information from SQL. This method, which simply uses JDBC, is shown below:

public Hashtable retrievePropertyTaxes() {
   System.out.println("**** Retrieving Property tax rates ");
   Connection jdbcConn = null;
   Hashtable taxes = new Hashtable();
   try {
      jdbcConn = getConnection();
      // SELECT from database
      PreparedStatement sqlStmt = jdbcConn.prepareStatement(SELECT_STRING);
      // Execute query
      ResultSet sqlResults = sqlStmt.executeQuery();
      String countyName = null;
      int countyTaxRate = 0;
      while (sqlResults.next()) {
         countyName = sqlResults.getString(1);
         countyTaxRate = sqlResults.getInt(2);
         taxes.put(countyName, new Integer(countyTaxRate));
      }
    } catch (Exception e) { // DB error
      System.out.println("**** Exception caught in retrievePropertyTaxes " + e);
    } finally {
      dropConnection(jdbcConn);
    }
   return taxes;
}

This solution uses a static variable on the class PropertyTaxLookup to hold the "lone" instance of PropertyTaxLookup. Another option would have been to hold the reference to the singleton in the stateless session bean itself, which would guarantee that the instance will not be garbage collected (although the solution listed should prevent this on most JVMs).
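
Because several pooled stateless session bean instances may call getInstance() at the same time, it can be worth making the lazy creation safe under concurrency. One possible variation, sketched below, is the initialization-on-demand holder idiom, which relies on the JVM's guarantee that a class is initialized exactly once. This is an alternative to the article's listing, not the original code: the class name LazyTaxLookup, the loadRates() stand-in for the Broker, the loadCount counter, and the sample counties and rates are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class LazyTaxLookup {
    /** Demo-only counter showing how many times the rates were loaded. */
    static int loadCount = 0;

    private final Map<String, Integer> taxRates;

    private LazyTaxLookup(Map<String, Integer> rates) {
        // Copy and wrap so the shared table is read-only after creation.
        this.taxRates = Collections.unmodifiableMap(new HashMap<>(rates));
    }

    /** The JVM initializes Holder once, on the first call to getInstance(). */
    private static class Holder {
        static final LazyTaxLookup INSTANCE = new LazyTaxLookup(loadRates());
    }

    static LazyTaxLookup getInstance() {
        return Holder.INSTANCE;
    }

    /** Hypothetical stand-in for the Broker's database read; values are made up. */
    private static Map<String, Integer> loadRates() {
        loadCount++;
        Map<String, Integer> rates = new HashMap<>();
        rates.put("Wake", 62);
        rates.put("Durham", 71);
        return rates;
    }

    int lookup(String county) {
        Integer rate = taxRates.get(county);
        if (rate == null) {
            throw new NoSuchElementException("County not found: " + county);
        }
        return rate.intValue();
    }

    public static void main(String[] args) {
        System.out.println("Wake rate: " + getInstance().lookup("Wake"));
        System.out.println("Loads performed: " + loadCount);
    }
}
```

The design choice here is that no explicit locking appears in the code; the single load is a side effect of class initialization, and the table is never modified afterward, which matches the read-only use described in this section.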

Is there a set of objects in your design that are essentially write-only, like a system log?

Sometimes you need to write a lot of rows to a database efficiently, but do not need to worry about write-contention or reading these objects back from the database. The most common example of this is a log of events. Often there are two different applications that write and read the log, and the reading happens a great deal less often than the writing.

The solution for this problem is similar to the solution to the previous problem, and for similar reasons. Here, we don't need to worry about sharing this information outside of the database, so the fastest and simplest solution is not even to represent a log entry as an EJB at all (or at least not for writing purposes).

So, here we have an EJB that represents the log, and another Broker class whose purpose it is to record log entries to the database in an efficient way. The broker would simply use JDBC to write a new entry into the log table whenever it is needed.

In testing this solution, we have found that it is approximately four times faster to record information by directly writing to the database than it is to create (and then discard) an EJB for the same purpose.
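
A Broker of this kind can be quite small. The following is a minimal sketch using plain JDBC, under several assumptions: the table name EventLog, its three columns, the class name LogBroker, and the 254-character message limit are all hypothetical, and the caller is assumed to supply an open Connection (for example, one obtained from a DataSource by the session bean that represents the log).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

class LogBroker {
    /** Assumed table and column names; adjust to the real schema. */
    static final String INSERT_SQL =
        "INSERT INTO EventLog (loggedAt, severity, message) VALUES (?, ?, ?)";

    /** Assumed width of the message column. */
    static final int MAX_MESSAGE_LENGTH = 254;

    /** Trims a message to the assumed column width; null becomes an empty string. */
    static String clip(String message) {
        if (message == null) {
            return "";
        }
        if (message.length() <= MAX_MESSAGE_LENGTH) {
            return message;
        }
        return message.substring(0, MAX_MESSAGE_LENGTH);
    }

    /** Writes one log row directly, without creating an Entity EJB. */
    static void record(Connection conn, String severity, String message)
            throws SQLException {
        // try-with-resources closes the statement even if the insert fails
        try (PreparedStatement stmt = conn.prepareStatement(INSERT_SQL)) {
            stmt.setTimestamp(1, new Timestamp(System.currentTimeMillis()));
            stmt.setString(2, severity);
            stmt.setString(3, clip(message));
            stmt.executeUpdate();
        }
    }
}
```

A caller would invoke something like LogBroker.record(conn, "INFO", "customer 42 updated") and then release the connection. The sketch does not read entries back, which fits the write-only nature of the log described above.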

Do you need to display and scroll through a large (>50 element) list in your application?

Many applications need to be able to display large lists of data in order to let a user select from that list. In general, this should be avoided because scrolling through a large list is a poor UI design choice, but there are times where it is the only option.

When you are retrieving data to display in a list, you generally only need a small subset of data. Often, lists contain only a unique identifier and some sort of user-readable representation of the list element. In this case, using a custom finder method to retrieve a large set of Entity EJBs, only to then use a few data elements in each EJB, is a huge waste of resources. So instead of retrieving an enumeration of EJBs and then walking over the enumeration, create a simple Stateful Session EJB that retrieves only the necessary pieces of data through a very minimal SQL query. You can then return the information in a very simple form, such as a Hashtable that maps key values to the Strings that will be displayed in the list.

Once the user has selected an element from the list, you can use an EJB to retrieve and operate on only that selected object by finding the EJB with a findByPrimaryKey() method using the key value that corresponds to the selected element.

This solution is over twice as fast as iterating through an Enumeration of Entity EJBs in most circumstances, and generates less garbage as well.

Summary

This article has shown how to apply some "Template architectures" or patterns to solving some of the thornier issues in designing with EJBs. We have examined a number of issues regarding persistence in EJB designs and reviewed some strategies for resolving those issues in your code.

Acknowledgements

I'd like to thank the world-class team of reviewers that have provided some extremely useful feedback on this article. Thanks go to Harvey Gunther, Keys Botzum and Geoff Hambrick for their insightful comments and great suggestions.

Resources

  • [Brown 99] Kyle Brown, et al., A small pattern language for Distributed Component Design, submitted to the Pattern Languages of Programs (PLOP) '99 Conference
  • [Mobray] Tom Mobray and Ralph Malveau, CORBA Design Patterns, John Wiley & Sons, 1997
  • [Monson-Haefel 00] Richard Monson-Haefel, Enterprise JavaBeans, 2nd Edition, O'Reilly & Associates, Sebastopol, CA, 2000
  • [Orfali] Robert Orfali and Dan Harkey, Client Server Programming with Java and CORBA, John Wiley & Sons, 1997
  • [Sun 99] The Enterprise JavaBeans Specification, version 1.1
