The EJB Advocate: Making entity EJB components perform

Using poorly designed EJB™ components can lead to serious performance problems during system testing or (worse) in production. The EJB Advocate shows how to design method signatures to minimize the "chattiness" between layers and get the most out of your EJBs.

Geoff Hambrick (ghambric@us.ibm.com), Distinguished Engineer, IBM

Geoff Hambrick is a lead consultant with the IBM Software Services for WebSphere Enablement Team and lives in Round Rock, Texas (near Austin). The Enablement Team generally helps support the pre-sales process through deep technical briefings and short-term Proof of Concept engagements. Geoff was appointed an IBM Distinguished Engineer in March of 2004 for his work in creating and disseminating best practices for developing J2EE applications hosted on IBM WebSphere Application Server.



23 February 2005

From the IBM WebSphere Developer Technical Journal.

In each column, The EJB Advocate presents the gist of a typical back-and-forth exchange with actual customers and developers in the course of recommending a solution to an interesting design issue. Any identifying details have been obscured, and no "innovative" or proprietary architectures are presented. For more information, see Introducing the EJB Advocate.

The problem

Dear EJB Advocate,

My first J2EE project was to build a Web application that renders products from our company catalog for customers to add to their shopping cart for purchase. I was a big advocate of object technology and the idea of entity EJBs. That is, until I started testing a prototype on my laptop using IBM® WebSphere® Studio Application Developer's test environment. To make a long story short, it took over five seconds for my servlet to load a page's worth of local entity EJBs for display. When I recoded the servlet to use JDBC, it took less than one second.

I hate to say this, but sign me:
No Longer a Fan

It was obvious right away that the entity EJB architecture was flawed: an apples-to-apples comparison of well-designed CMP entity EJBs with hand-coded JDBC components has never shown this level of difference. However, their focus on early testing deserved some praise, so I covered that first in my reply. My thought was that by suggesting one extra test, I might help the team figure out the problem for themselves.

Dear No Longer,

It was good to see that you were using local versus remote entity EJBs, so that you didn't pay the additional performance penalty of remote method invocations. This overhead could have easily doubled the amount of time it took to load the page. One of the best practices associated with using entity EJBs is to always use local ones. We briefly discussed this approach in last month's column.

I was also very happy to hear that you immediately built a prototype to test the performance of local entity EJBs versus directly-coded JDBC as the persistence layer. Many teams don't do candidate architecture performance testing until they build out the entire application, and ultimately end up with a lot of rework -- and maybe a critical situation -- in production. Others will attempt to hide the details of the persistence mechanism behind an "access bean" to make it easy to switch later, adding yet another layer to the system (both JDBC and entity EJBs are already abstractions of the persistence mechanism).

I suspect I know what the problem is, but I'd like you to turn on tracing of the JDBC datasource to see how many SQL statements are getting executed (see the WebSphere Application Server Information Center). You can also turn on EJB tracing which, as Matt Oberlin points out in his Meet the Experts article, is pretty easy to do in WebSphere Studio Application Developer (Figure 1).

Figure 1. Turning on EJB trace in WebSphere Studio Application Developer

Further, and as Stacy Joines points out in her excellent book on WebSphere performance, gathering precise performance measurements is really important to finding and fixing bottlenecks. The reason I ask you to capture this more precise measure is that it is likely you will see many more SQL statements for the entity bean case than for the direct JDBC case, and that accounts for the difference in performance. In fact, I predict that you will see one SQL statement executed for each attribute read from each product entity, plus one!
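
As a rough illustration of that prediction, the arithmetic can be sketched as below. The class and method names are hypothetical, and the model assumes the worst case, in which every entity call runs in its own transaction and issues its own statement; actual counts depend on deployment options.

```java
public class SqlStatementEstimate {

    /**
     * Worst-case estimate: one statement for the finder, plus one
     * statement for every attribute read from every entity returned.
     * (Assumes each call runs in its own transaction.)
     */
    static int estimate(int entities, int attributesRead) {
        int finderStatements = 1;
        int getterStatements = entities * attributesRead;
        return finderStatements + getterStatements;
    }

    public static void main(String[] args) {
        // Hypothetical page: 20 entities, 3 attributes displayed each
        System.out.println(estimate(20, 3)); // prints 61
    }
}
```

Comparing an estimate like this with the number of statements in the JDBC trace is a quick way to confirm or rule out the per-call transaction explanation.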

Let me know what you find out.

OK then,
Your EJB Advocate


The good, the bad, and the ugly

Dear EJB Advocate,

How did you guess? I display up to ten products per page, with five attributes per product (SKU, description, price, a link to an image, and a date when the item will be available). I ended up with fifty-one SQL statements executed for the entity EJB case and only one SQL statement for the direct JDBC case! It is no wonder that the entity EJBs did not come anywhere close to the performance of JDBC. It seems like I made the right choice to go with JDBC.

Sign me still:
No Longer a Fan

I had hoped that the SQL trace data would make it obvious what the problem was -- that "No Longer" would no longer rely on imprecise "stopwatch" measurements, and would try to find the cause of the difference or problem before giving up. Helping someone get over disillusionment is harder than I thought! Here was my reply:

Dear No Longer,

I suspect that you are invoking your entities outside the scope of a global transaction, which will cause each call to an entity to be executed in a separate transaction and result in its own SQL statement (depending on your deployment options). Specifically, there will be one call to the home for the finder returning the next ten entities, and five calls to each entity for the get() methods associated with the attributes displayed: 1 + (10 x 5) = 51 statements. Calling entities outside of a global transaction is definitely a "worst" practice (more commonly called an anti-pattern). In fact, it is so important to avoid calling entities outside of a transaction that some EJB developers (this EJB Advocate included) suggest declaring transactions to be "mandatory" in the EJB deployment descriptor by using <trans-attribute>Mandatory</trans-attribute> within the <container-transaction> tag. This declaration causes an exception to be thrown if no transaction is already active when the entity is accessed.
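
For reference, such a declaration might look like the following fragment of ejb-jar.xml. This is a sketch only: the ejb-name Product is assumed for illustration, and the wildcard applies the attribute to every method of the bean.

```xml
<assembly-descriptor>
  <container-transaction>
    <method>
      <!-- Assumed bean name; use the ejb-name of your own entity -->
      <ejb-name>Product</ejb-name>
      <method-name>*</method-name>
    </method>
    <trans-attribute>Mandatory</trans-attribute>
  </container-transaction>
</assembly-descriptor>
```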

There are two ways to wrap logic that may be calling entity EJBs in a global transaction and greatly improve performance. One is the "easy way" and one is the "right way."

The easy way is to explicitly add code to your servlet to start and end a global transaction around the calls to the EJBs, like so:

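...
import java.io.IOException;
import javax.naming.InitialContext;
import javax.servlet.ServletException;
import javax.servlet.http.*;
import javax.transaction.*;
...
public class YourServlet extends HttpServlet {

    public void doGet (
        HttpServletRequest req, HttpServletResponse res
    ) throws ServletException, IOException {
        try {
            // Look up the transaction object provided by the container
            InitialContext initCtx = new InitialContext();
            UserTransaction userTran =
             (UserTransaction)initCtx.lookup(
                "java:comp/UserTransaction"
            );

            userTran.begin();

            // Use entity to load data
            ...
            userTran.commit();
        } catch (Exception e) {
            // Simplified: production code should also roll back
            throw new ServletException(e);
        }
    }
    ...
}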

Some teams go a little further and create a superclass servlet to handle this behavior using a technique called template inheritance (also known as the Template Method pattern). The superclass would be declared abstract. Its doGet() method would be declared final and would call down to an abstract doGetYourParent() method implemented by each subclass that inherits the behavior, such as YourServlet. The parent class code might look like the following:

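...
import java.io.IOException;
import javax.naming.InitialContext;
import javax.servlet.ServletException;
import javax.servlet.http.*;
import javax.transaction.*;
...
public abstract class YourParentServlet extends HttpServlet {

    public final void doGet (
        HttpServletRequest req, HttpServletResponse res
    ) throws ServletException, IOException {
        try {
            InitialContext initCtx = new InitialContext();
            UserTransaction userTran =
             (UserTransaction)initCtx.lookup(
                "java:comp/UserTransaction"
            );

            userTran.begin();
            // Call down to the subclass inside the transaction
            doGetYourParent(req, res);
            userTran.commit();
        } catch (Exception e) {
            // Simplified: production code should also roll back
            throw new ServletException(e);
        }
    }

    protected abstract void doGetYourParent(
        HttpServletRequest req, HttpServletResponse res
    ) throws ServletException, IOException;
    ...
}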

The changes required to your subclass servlets in order to use template inheritance are pretty simple:

  1. Change YourServlet to extend YourParentServlet instead of HttpServlet, and
  2. Rename its doGet(), doPost() and other HttpServlet methods to do<Method>YourParent().

One major benefit of the template inheritance approach is it makes it easy to consistently and transparently add qualities of service, like transaction start and stop, cache checking, error handling, and other behaviors, that your team will likely want to include to ensure robustness.
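
The shape of the technique can be tried outside a container with a minimal sketch like the one below. It deliberately leaves out the servlet and transaction APIs so that it runs standalone; the class names and the string log standing in for begin, commit, and rollback are illustrative assumptions, not part of any WebSphere API.

```java
import java.util.ArrayList;
import java.util.List;

public class TemplateSketch {

    /** Parent class: owns the fixed begin/work/commit sequence. */
    static abstract class TransactionalHandler {
        final List<String> log = new ArrayList<>();

        // final, so subclasses cannot bypass the wrapper
        public final void handle(String request) {
            log.add("begin");
            try {
                doHandle(request);
                log.add("commit");
            } catch (RuntimeException e) {
                log.add("rollback");
                throw e;
            }
        }

        // Subclasses supply only the request-specific work
        protected abstract void doHandle(String request);
    }

    /** Child class: knows nothing about transactions. */
    static class CatalogHandler extends TransactionalHandler {
        @Override
        protected void doHandle(String request) {
            log.add("load " + request);
        }
    }

    public static void main(String[] args) {
        CatalogHandler handler = new CatalogHandler();
        handler.handle("catalog");
        System.out.println(handler.log); // [begin, load catalog, commit]
    }
}
```

Because the wrapper lives in one place, adding another quality of service later (a cache check, say) would mean changing only the parent class.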

Regardless of your approach to starting a global transaction, you should notice a drastic drop in the number of SQL statements in the trace (depending on the access intents and other deployment options; see the WebSphere Application Server Information Center on how to set attributes like the Collection Increment to the number you would like to read -- ten in your case).

But even if you make these changes and eliminate every place where CMPs are called outside the scope of a global transaction, a load analysis tool (one that measures the performance of the system under near-production conditions), such as IBM Rational® Performance Tester, will still show a significant difference between JDBC and entity EJB code in throughput and CPU utilization. This is true even if profiling tools like JInsight and path analysis tools like IBM Tivoli® Monitoring for Transaction Performance do not show a difference.

The "right way" to fix the code depends on the details of your design. You may already be pretty close, so let me ask you a question: are you using a JavaServer™ Page to render the page from a "data transfer object" (a POJO with get/set methods only) loaded by the servlet (a J2EE best practice)? Or does the servlet render the HTML reply directly?

OK then,
Your EJB Advocate


Hide the ugly behind a custom facade

Dear EJB Advocate,

You are right that I used a JSP to render the page following the Model 2 approach. In other words, the servlet loads an array of up to ten "ProductView" objects (which is the same as your data transfer object except that it is Serializable), and then calls the JSP. Just to be clear, here is the relevant code in the servlet, written in the style you suggested in your previous reply:

...
import java.io.IOException;
import java.util.*;
import javax.naming.InitialContext;
import javax.servlet.ServletException;
import javax.servlet.http.*;
import javax.transaction.*;
...
public class YourServlet extends HttpServlet {

    public void doGet (
        HttpServletRequest req, HttpServletResponse res
    ) throws ServletException, IOException {
        try {
            InitialContext initCtx = new InitialContext();
            UserTransaction userTran =
             (UserTransaction)initCtx.lookup(
                "java:comp/UserTransaction"
            );

            userTran.begin();
            ProductLocalHome home =
             (ProductLocalHome)initCtx.lookup(
                "java:comp/env/ProductLocal"
            );

            // last and count are obtained earlier (not shown)
            Collection products = home.findNextNFrom(last, count);
            int size = 0;
            ProductView[] results = new ProductView[products.size()];
            Iterator i = products.iterator();
            while (i.hasNext()) {
                ProductLocal product = (ProductLocal)i.next();
                ProductView result = new ProductView();
                // Each of these used to cause a tran and SQL
                result.setSku(product.getSku());
                result.setDesc(product.getDesc());
                result.setPrice(product.getPrice());
                result.setImage(product.getImage());
                result.setDate(product.getDate());
                results[size++] = result;
            }
            // Set into the HttpServletRequest
            req.setAttribute("ProductView", results);

            // Invoke the ProductView JSP (not interesting here)
            ...
            userTran.commit();
        } catch (Exception e) {
            throw new ServletException(e);
        }
    }
    ...
}

By the way, the JSP uses a custom tag to navigate through an array of ProductView objects (actually this tag navigates through an array of any kind of object simulating the "bean tag"), from which the "bean property tags" can be used to substitute the properties. I hope this is enough detail.

I was glad to find that this "easy way" code actually brought the performance of entity EJBs close to that of JDBC (as measured by JInsight). I also used the "mandatory" CMT attribute, which verified that this took care of all the calls to the CMP outside of a transaction. However, using JDBC was still significantly better in a head-to-head comparison using our load testing tool (we use LoadRunner now, but will take a look at the Rational Performance Tester you mentioned).

Thanks, but I am still signed,
No Longer a Fan

This time, I got more than I had hoped for. No Longer provided code samples, which are far more precise than descriptions or block diagrams. And it seemed like they took the hint to use more precise measures of system performance using load and path analysis tools.

If No Longer had intermixed the HTML rendering code (the view) with that to retrieve the data, I would have wanted to delve into servlet best practices (not a stretch, since I am an all around J2EE Advocate, too). I would have had to explain the best practice to use data transfer objects to flow data from the servlet to the JSP. The code that I include in my reply below would not have looked so familiar to No Longer.

Dear No Longer,

Thanks for the code sample. I much prefer that to any other form, since code is where the "rubber meets the road" with respect to performance. Nothing beats a static analysis to see if the design follows best practices. Together with load and path analysis, you get a pretty complete picture of how to find and fix the bottlenecks.

It is great that you are following Model 2 best practices and have gone a step farther to provide a custom navigation tag for arrays. It is also great that you are already using data transfer objects, basically the same as Service Data Objects. Having this architecture makes the "right way" to get a global transaction the "easiest way" as well.

In other words, you can create a session facade EJB to encapsulate the logic associated with gathering the data for the page (the array of ProductView objects). This pattern is discussed in last month's column, as well as in Kyle Brown's book (see Resources), among others. The session facade might look something like this:

import java.util.*;
import javax.ejb.*;
import javax.naming.InitialContext;
...
public class YourSession implements SessionBean {

    public ProductView[] getCatalog (
        ProductKey lastKey, int count
    ){
        try {
            InitialContext initCtx = new InitialContext();
            ProductLocalHome home =
             (ProductLocalHome)initCtx.lookup(
                "java:comp/env/ProductLocal"
            );

            // This call used to cause a trans and SQL
            Collection products = home.findNextNFrom(lastKey, count);
            int size = 0;
            ProductView[] results = new ProductView[products.size()];
            Iterator i = products.iterator();
            while (i.hasNext()) {
                ProductLocal product = (ProductLocal)i.next();
                ProductView result = new ProductView();
                // Each of these used to cause a tran and SQL
                result.setSku(product.getSku());
                result.setDesc(product.getDesc());
                result.setPrice(product.getPrice());
                result.setImage(product.getImage());
                result.setDate(product.getDate());
                results[size++] = result;
            }
            // Instead of setting into the HttpServletRequest
            return results;
        } catch (Exception e) {
            // Report lookup or finder failures to the container
            throw new EJBException(e);
        }
    }
    ...
}

As you might notice in the example above, a benefit of using the Model 2 architecture with data transfer objects is that most of the logic in your servlet doGet() method simply moves into the session facade getCatalog() method. A major benefit of this move is that the logic to get the next page of products is now usable outside of the context of a servlet (like from within a message-driven bean, or another EJB). A remote interface can be provided as well (automatically generated by tooling in WebSphere Studio Application Developer), making it available from a J2EE client. The use of the data transfer object minimizes the chattiness between the layers -- only one stateless call is needed. In any event, the servlet no longer needs to deal with a transaction. It looks something like:

...
public class YourServlet extends HttpServlet {

    public void doGet (
        HttpServletRequest req, HttpServletResponse res
    ) throws ServletException, IOException {
        try {
            InitialContext initCtx = new InitialContext();
            MySessionLocalHome home =
             (MySessionLocalHome)initCtx.lookup(
                "java:comp/env/MySession"
            );

            // Get the last key field as before
            ProductView[] results = home.create().getCatalog(last, 10);

            // Load array into HttpServletRequest and invoke JSP
            ...
        } catch (Exception e) {
            throw new ServletException(e);
        }
    }
    ...
}

I know it seems like this best practice trades code to start and end a transaction with code to invoke an EJB method, but there are a couple of additional benefits beyond the ability to reuse the logic in other situations. First, the local session reference can be cached in the servlet init() method to eliminate the lookup in the doGet(). Second, and most importantly, handling transactions can be very complex, especially where exceptions are concerned. Improper handling can lead to "leaks" that result in their own kind of performance problems. In short, another best practice is to use container managed transactions wherever possible.
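
To make the "leak" concrete, here is a small standalone sketch. FakeTransaction is a hypothetical stand-in that merely counts open transactions; it is not the JTA UserTransaction interface, and real rollback handling involves more cases than are shown here.

```java
public class TransactionLeakSketch {

    /** Hypothetical stand-in for a transaction; counts open transactions. */
    static class FakeTransaction {
        int open = 0;
        void begin()    { open++; }
        void commit()   { open--; }
        void rollback() { open--; }
    }

    /** Naive demarcation: an exception skips commit() and leaves the transaction open. */
    static void naive(FakeTransaction tx, Runnable work) {
        tx.begin();
        work.run();
        tx.commit();
    }

    /** Careful demarcation: every begin() is paired with a commit() or a rollback(). */
    static void careful(FakeTransaction tx, Runnable work) {
        tx.begin();
        try {
            work.run();
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
    }

    public static void main(String[] args) {
        Runnable failing = () -> { throw new IllegalStateException("load failed"); };

        FakeTransaction leaky = new FakeTransaction();
        try { naive(leaky, failing); } catch (IllegalStateException expected) { }
        System.out.println("naive leaves open: " + leaky.open);   // 1

        FakeTransaction safe = new FakeTransaction();
        try { careful(safe, failing); } catch (IllegalStateException expected) { }
        System.out.println("careful leaves open: " + safe.open);  // 0
    }
}
```

With container-managed transactions, this pairing is the container's responsibility rather than something each servlet has to get right.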

Regardless, this "right way" code will run essentially as fast as the "easy way" code (given that a local session interface is used). But the problem remains: end-to-end, the local entity version will still perform significantly worse than JDBC (now behind the session EJB, which effectively encapsulates the model from the view). The reason is that even though the get<Property>() methods on the entity are local, there is still a lot of overhead to check for security and transactions. Some estimates place this overhead at around ten thousand instructions per call, which for this example would be insignificant when measured by a path analysis tool: 50 x 10,000 = 500,000 total. But what if the "count" above were 100 and the number of properties accessed were 100? The total is then 100 x 100 x 10,000 = 100 million instructions, which is starting to be a measurable difference. This phenomenon associated with "scale" is why load testing is the best way to find real performance differences. Path testing then lets you find the likely culprit, which you can follow up with static analysis (a code review). In this case, the number of instructions it takes to access a property on a data transfer object is estimated in the 10s, not 10s of thousands: orders of magnitude better than accessing even a local EJB under a global transaction.
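
The arithmetic can be worked through in a short sketch. Both constants are the rough estimates quoted above (ten thousand instructions per EJB call, about ten per data transfer object access) and should be treated as assumptions; the method names are hypothetical. The second method compares the result with a design in which each entity is called only once and the properties are then read from a data transfer object.

```java
public class CallOverheadEstimate {

    // Rough, assumed figures based on the estimates quoted in the text
    static final long INSTRUCTIONS_PER_EJB_CALL = 10_000L;
    static final long INSTRUCTIONS_PER_DTO_CALL = 10L;

    /** Every property read goes through the entity's local interface. */
    static long perPropertyCalls(long entities, long properties) {
        return entities * properties * INSTRUCTIONS_PER_EJB_CALL;
    }

    /** One EJB call per entity returns a DTO; properties are then read from the DTO. */
    static long oneCallPerEntity(long entities, long properties) {
        return entities * INSTRUCTIONS_PER_EJB_CALL
             + entities * properties * INSTRUCTIONS_PER_DTO_CALL;
    }

    public static void main(String[] args) {
        System.out.println(perPropertyCalls(10, 5));     // 500000
        System.out.println(perPropertyCalls(100, 100));  // 100000000
        System.out.println(oneCallPerEntity(100, 100));  // 1100000
    }
}
```

Under these assumed figures, the 100 x 100 case drops from 100 million instructions to about 1.1 million when each entity is called once.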

The key to optimal EJB performance is to use data transfer objects and custom methods to create, get, and set all the needed properties for a given use case in one call. This minimizes the chattiness between the session facade and entity EJB. In general, it is possible to design your entity EJBs with the right set of methods such that a user need only make one call after a find -- either a create, retrieve, update, or delete method. The following code illustrates what the entity EJB would look like with a custom get:

import java.math.BigDecimal;
import java.util.Date;
import javax.ejb.EntityBean;
...
public abstract class ProductBean implements EntityBean {

    // Here are the properties needed for the custom method
    // NOW NO LONGER ON INTERFACE TO PREVENT INDIVIDUAL ACCESS
    public abstract ProductKey getSku();
    public abstract void setSku(ProductKey value);

    public abstract String getDesc();
    public abstract void setDesc(String value);

    public abstract BigDecimal getPrice();
    public abstract void setPrice(BigDecimal value);

    public abstract String getImage();
    public abstract void setImage(String value);

    public abstract Date getDate();
    public abstract void setDate(Date value);

    // Probably more properties than these above

    // Here is the custom get method
    public ProductView getProductView (
    ){
        ProductView result = new ProductView();
        // Now cannot cause a tran and SQL
        result.setSku(getSku());
        result.setDesc(getDesc());
        result.setPrice(getPrice());
        result.setImage(getImage());
        result.setDate(getDate());
        return result;
    }
    ...
}

To enforce the use of the custom methods, many EJB designers only expose custom methods on the interface, like the following:

public interface Product extends EJBLocalObject {
    public ProductView getProductView ();
    public void setProductView(ProductView value);
}

The home would have the custom creates and finds:

public interface ProductHome extends EJBLocalHome {
    public Product create(ProductView value)
        throws CreateException;
    public Collection findNextNFrom(ProductKey last, int count)
        throws FinderException;
}

You might have noticed that I did not use "Local" as part of the entity EJB interface name (for either the home or the bean). Since I never expose a remote interface to an entity EJB, it seems like overkill to add length to the class name.

In any event, the session facade would change as follows to exploit these custom methods:

import java.util.*;
import javax.ejb.*;
import javax.naming.InitialContext;
...
public class YourSession implements SessionBean {

    public ProductView[] getCatalog (
        ProductKey lastKey, int count
    ){
        try {
            InitialContext initCtx = new InitialContext();
            ProductHome home =
             (ProductHome)initCtx.lookup(
                "java:comp/env/ProductLocal"
            );

            // This call used to cause a trans and SQL
            Collection products = home.findNextNFrom(lastKey, count);
            int size = 0;
            ProductView[] results = new ProductView[products.size()];
            Iterator i = products.iterator();
            while (i.hasNext()) {
                // Treat a next on an iterator of entities like a find
                Product product = (Product)i.next();

                // Now only one call after find, which is the ideal
                results[size++] = product.getProductView();
            }
            // Instead of setting into the HttpServletRequest
            return results;
        } catch (Exception e) {
            // Report lookup or finder failures to the container
            throw new EJBException(e);
        }
    }
    ...
}

Note that we simply moved the code that was in the loop of the session EJB getCatalog() method into the entity EJB getProductView() method; then we replaced the moved code with a single method call. If you load test the code implemented in this manner, you will likely notice a much more reasonable overhead of using entity EJBs with respect to JDBC. Given that entity EJBs will be far more maintainable than JDBC, the tradeoff will be well worth it.
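
The effect of the custom method on the number of calls crossing the container boundary can be illustrated with a standalone sketch. Everything here is a simplified stand-in: FakeProduct is not an EJB, the Counter merely simulates container interception, and the two-property ProductView is an assumption made to keep the example short.

```java
public class ChattinessSketch {

    /** Counts calls that would pass through the container. */
    static class Counter {
        int calls = 0;
    }

    /** Hypothetical data transfer object with two properties. */
    static class ProductView {
        String sku;
        String desc;
    }

    /** Stand-in for an entity; every call from outside is "intercepted". */
    static class FakeProduct {
        private final Counter counter;
        private final String sku;
        private final String desc;

        FakeProduct(Counter counter, String sku, String desc) {
            this.counter = counter;
            this.sku = sku;
            this.desc = desc;
        }

        String getSku()  { counter.calls++; return sku; }
        String getDesc() { counter.calls++; return desc; }

        // Custom method: one intercepted call; fields are read directly
        ProductView getProductView() {
            counter.calls++;
            ProductView view = new ProductView();
            view.sku = sku;
            view.desc = desc;
            return view;
        }
    }

    /** Caller copies each property itself: one call per property. */
    static int fineGrainedCalls(int products) {
        Counter counter = new Counter();
        for (int i = 0; i < products; i++) {
            FakeProduct p = new FakeProduct(counter, "SKU" + i, "desc" + i);
            ProductView view = new ProductView();
            view.sku = p.getSku();
            view.desc = p.getDesc();
        }
        return counter.calls;
    }

    /** Caller asks each entity once for a filled-in view. */
    static int coarseGrainedCalls(int products) {
        Counter counter = new Counter();
        for (int i = 0; i < products; i++) {
            FakeProduct p = new FakeProduct(counter, "SKU" + i, "desc" + i);
            ProductView view = p.getProductView();
        }
        return counter.calls;
    }

    public static void main(String[] args) {
        System.out.println(fineGrainedCalls(10));   // 20
        System.out.println(coarseGrainedCalls(10)); // 10
    }
}
```

The gap widens with the number of properties: the fine-grained count grows with every property added, while the coarse-grained count stays at one call per entity.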

And just for future reference, here is a UML diagram showing the general end-to-end architecture. It shows the relationship between JSPs and servlets, session and entity EJBs, and key and view data transfer objects:

Figure 2. UML diagram showing end-to-end architecture

I hope this discussion helps to renew your enthusiasm for entity EJBs. At the very least, I hope you have some new tools and techniques for precisely measuring the performance differences and evaluating the tradeoffs.

OK then,
Your EJB Advocate


Conclusion

Through this exchange, we saw why using a session facade in front of a local entity EJB is so important to ensure that only one transaction is executed per UI event, and to minimize the number of SQL calls sent to the database. Just as significant, we saw the importance of using a data transfer object (now being called an SDO) and custom methods on entity EJBs to minimize the chattiness between layers, even when the local interface is used. And, incidentally, we saw that using an SDO enables you to flow data all the way from the entity to the JavaServer Page rendering the HTML, passing through the session facade and the servlet (the model and view controllers respectively).

We discussed how template inheritance can be used to add behaviors (like starting and committing a global transaction) transparently to the servlet code. Even though the use of a session facade minimizes the need for this approach, template inheritance in servlets may still be useful in cases where a servlet calls more than one session EJB in the context of doGet() or doPost().

We also discussed declaring transactions on entity EJBs to be mandatory, as well as not exposing the individual CMP attributes to the entity interface. Both of these best practices help to enforce your usage policies.

One process (rather than design) best practice we covered was related to measuring performance using load and path analysis tools, then analyzing the code and configuration to find and fix bottlenecks. The idea is to use tools like Rational Performance Tester, Tivoli Monitoring for Transaction Performance, and JInsight that capture number of calls, the amount of data passed back and forth, and the amount of time that each call takes. Also, we hinted at a best practice that static analysis should be based on code, not on class or high level sequence diagrams (even though these are useful to get an overview).

What's next...

If you have an interesting problem associated with using EJBs of any type, please feel free to contact the EJB Advocate. Otherwise, in the next column, we will examine Service Data Objects, and how EJBs (both entities and sessions) will play a role within a Service Oriented Architecture.

Resources
