The EJB Advocate: Which type of EJB component should assemble the data returned by a service?

The EJB Advocate takes a top-down view of service-oriented architectures in order to get to the bottom of whether a session or entity EJB component should assemble the data transfer objects returned by the service.


Geoff Hambrick (ghambric@us.ibm.com), Distinguished Engineer, IBM

Geoff Hambrick is a lead consultant with the IBM Software Services for WebSphere Enablement Team and lives in Round Rock, Texas (near Austin). The Enablement Team generally helps support the pre-sales process through deep technical briefings and short-term Proof of Concept engagements. Geoff was appointed an IBM Distinguished Engineer in March of 2004 for his work in creating and disseminating best practices for developing J2EE applications hosted on IBM WebSphere Application Server.



17 August 2005


From the IBM WebSphere Developer Technical Journal.

In each column, The EJB Advocate presents the gist of a typical back-and-forth dialogue exchange with actual customers and developers in the course of recommending a solution to an interesting design issue. Any identifying details have been obscured, and no "innovative" or proprietary architectures are presented. For more information, see Introducing the EJB Advocate.

The role of entity EJB components in services

In the previous column, we introduced some characteristics of service-oriented architectures; for example, to minimize the chattiness between a service and its client, the service must be coarse-grained, stateless, and must normally assemble a data transfer object (DTO) that collects all the properties returned to the client application. We saw how well-designed EJB methods exhibit these characteristics, whether associated with a session EJB instance or an entity EJB home. This month, however, a reader asks whether we went too far in suggesting that entity EJB components be designed as stateless services too.


The problem: which data transfer object is returned by which entity?

Dear EJB Advocate,

I have been following your articles lately and had to comment about the use of data transfer objects (DTOs). I agree with you that they are especially important in the services layer, where you want coarse-grained methods, especially for facades representing services that hide implementation, and especially for remote facades (since a remote invocation is much more expensive than a local one).

Further, recommending code that lets you ask an entity EJB component to "give me a DTO of yourself" seems reasonable and well encapsulated. However, a problem is that for two different use cases, one might want some attributes of a given entity EJB component, while another might want different attributes -- so you would need two DTO classes. Which DTO type should the getDTO() method return?

And there is another problem. A different use case might want a DTO composed from multiple instances of the same or different types (a customer, the open order, and the products in the order). In which entity EJB are you going to implement that composite function?

It seems like it is hard to get away from using individual getters (and setters) on entity EJB components in the session facade. A benefit of putting the logic to assemble the DTO in the session bean (or as I prefer, a helper class used by the session) is that you do not have to worry about custom code getting out of sync if you need to delete and regenerate the entity for some reason (like adding a new property needed for yet another service).

I guess my real question boils down to this: isn't it "crying wolf" to return DTOs from entity CMPs when you recommend using local interfaces to entity EJB components anyway? I understand your point about trying to minimize chattiness between layers, but it seems like you are trading away maintainability for performance.

Please sign us,
Never Cry Wolf


Chalk one up for good encapsulation techniques

Dear Never,

You make some really good points that need to be addressed. Let me take the questions one at a time, starting from the top:

  1. Given that two use cases need different data, which DTO type should the getDTO() method return?

    My silly answer is that getDTO() should, of course, return an instance of DTO. More seriously, the name you choose for a DTO should clearly indicate what data it is returning. If two use cases require two different sets of data, the DTO class names should indicate it: maybe something like UseCase1DTO and UseCase2DTO. Therefore, the two method signatures in the interface would be:

    Snippet 1. Multiple signatures for DTO methods
        public UseCase1DTO getUseCase1DTO();
        public UseCase2DTO getUseCase2DTO();

    Or more abstractly:

    Snippet 2. A less confusing way to show abstract examples
        public <DTO> get<DTO>();

    And as a final serious point on convention: if you want to have a getDTO() method associated with an entity, one approach is to reserve this method name for one that returns an instance of <Entity>DTO, where this DTO contains all the single-valued properties of the associated entity (those properties with a maximum cardinality of 1). Your team may choose another convention; just be sure you are consistent.
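As a minimal sketch of the naming advice above, here is one entity with two use-case-specific DTO methods. Everything in it is an assumption made for illustration: the Customer class is a plain-Java stand-in rather than a real CMP entity bean, and the two DTO names (CustomerContactData, CustomerBillingData) and their fields are hypothetical.

```java
import java.io.Serializable;

// Hypothetical DTO for a "contact" use case: only the fields that use case needs.
final class CustomerContactData implements Serializable {
    final String customerId;
    final String name;
    final String email;

    CustomerContactData(String customerId, String name, String email) {
        this.customerId = customerId;
        this.name = name;
        this.email = email;
    }
}

// Hypothetical DTO for a "billing" use case: a different slice of the same entity.
final class CustomerBillingData implements Serializable {
    final String customerId;
    final String billingAddress;
    final int creditLimit;

    CustomerBillingData(String customerId, String billingAddress, int creditLimit) {
        this.customerId = customerId;
        this.billingAddress = billingAddress;
        this.creditLimit = creditLimit;
    }
}

// Plain-Java stand-in for an entity EJB (a real CMP bean would be abstract,
// with container-managed accessors).
class Customer {
    private final String id;
    private final String name;
    private final String email;
    private final String billingAddress;
    private final int creditLimit;

    Customer(String id, String name, String email, String billingAddress, int creditLimit) {
        this.id = id;
        this.name = name;
        this.email = email;
        this.billingAddress = billingAddress;
        this.creditLimit = creditLimit;
    }

    // The method name says which DTO, and therefore which data, is returned.
    CustomerContactData getCustomerContactData() {
        return new CustomerContactData(id, name, email);
    }

    CustomerBillingData getCustomerBillingData() {
        return new CustomerBillingData(id, billingAddress, creditLimit);
    }
}
```

The client never sees the entity's individual attributes, so either DTO can change shape without touching callers that use the other one.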

  2. Given that a use case needs data from multiple entities (such as a customer, the open order, and the products in the order), which entity "owns" the get<ComposedDTO>() method?

    The answer from the point of view of defining a composed DTO method on an entity EJB gets into object-oriented design best practices. We touched on OO delegation briefly in the column on Making entity EJB components perform, Part 2. The choice of entity depends on the relationships between the entities and the data you want to return. In general, you want to pick an entity that can delegate to the other entities as needed to compose the entire structure. For your specific example, you wanted a DTO returning data from Customer, its open Order, and the Products in the Order (represented by a Line Item to hold the link attributes for the quantity and amount). Let's assume for a moment that the class diagram in Figure 1 describes the business objects and the relationships between them.

    Figure 1. An example class diagram showing entities and relationships

    For your example, a natural entity to "own" the composed DTO method would be Customer, since it can delegate to the Order, which can delegate to the Line Item which can delegate to the Product. For the name of the method on the Customer entity, we might choose something like: getCustomerOpenOrderDetails().
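The delegation chain just described might look roughly like the following sketch. It uses plain Java classes as stand-ins for the four entities; only getCustomerOpenOrderDetails() is a name taken from the discussion, while the DTO classes, their fields, the other method names, and the use of integer cents for prices are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical DTOs; field names are assumptions made for illustration.
class ProductData {
    final String description;
    final int priceInCents;
    ProductData(String description, int priceInCents) {
        this.description = description;
        this.priceInCents = priceInCents;
    }
}

class LineItemData {
    final int quantity;
    final int amountInCents;
    final ProductData product;
    LineItemData(int quantity, int amountInCents, ProductData product) {
        this.quantity = quantity;
        this.amountInCents = amountInCents;
        this.product = product;
    }
}

class CustomerOpenOrderDetails {
    final String customerName;
    final List<LineItemData> items; // empty when the customer has no open order
    CustomerOpenOrderDetails(String customerName, List<LineItemData> items) {
        this.customerName = customerName;
        this.items = items;
    }
}

// Plain-Java stand-ins for the entities in Figure 1.
class Product {
    private final String description;
    private final int priceInCents;
    Product(String description, int priceInCents) {
        this.description = description;
        this.priceInCents = priceInCents;
    }
    int getPriceInCents() { return priceInCents; }
    ProductData getProductData() { return new ProductData(description, priceInCents); }
}

class LineItem {
    private final Product product;
    private final int quantity;
    LineItem(Product product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }
    // Delegates to Product for the product part of the structure.
    LineItemData getLineItemData() {
        int amount = quantity * product.getPriceInCents();
        return new LineItemData(quantity, amount, product.getProductData());
    }
}

class Order {
    private final boolean open;
    private final List<LineItem> lineItems;
    Order(boolean open, List<LineItem> lineItems) {
        this.open = open;
        this.lineItems = lineItems;
    }
    boolean isOpen() { return open; }
    // Delegates to each LineItem in turn.
    List<LineItemData> getLineItemDetails() {
        List<LineItemData> result = new ArrayList<LineItemData>();
        for (LineItem item : lineItems) {
            result.add(item.getLineItemData());
        }
        return result;
    }
}

class Customer {
    private final String name;
    private final List<Order> orders;
    Customer(String name, List<Order> orders) {
        this.name = name;
        this.orders = orders;
    }
    // The "owning" method: starts at Customer and delegates down the chain.
    CustomerOpenOrderDetails getCustomerOpenOrderDetails() {
        for (Order order : orders) {
            if (order.isOpen()) {
                return new CustomerOpenOrderDetails(name, order.getLineItemDetails());
            }
        }
        return new CustomerOpenOrderDetails(name, new ArrayList<LineItemData>());
    }
}
```

Each class only knows how to describe itself and whom to ask for the next level down, which is what keeps the composite method on Customer short.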

    The relationships shown on the diagram enable multiple paths by which we could retrieve a DTO that composes information from all four entity types. For example:

    • Given a customer, retrieve the open order details (the one as described above).
    • Given a customer, retrieve all the associated order details.
    • Given an order, retrieve its customer and details.
    • Given a line item, retrieve its associated product, containing order and customer.
    • Given a product, retrieve all the associated order line items, and customers.

    The entity owning the method is indicated in the "given a(n) <Entity>" clause above -- it represents the starting point, so the assumption is that any service facade (or EJB home method) associated with this method will take parameters that can identify one or more of that entity type to begin the delegation. Hopefully this answers your second question.

  3. Isn't it "crying wolf" (trading away maintainability for performance) to recommend returning DTOs from entity EJBs when you also recommend using local interfaces?

    We went through an interesting trade-off analysis like this back in the early EJB 1.x days. We discovered that with respect to returning data from entities, there were two extreme approaches and one middle-ground approach, resulting in three basic styles of access:

    a. One-attribute-at-a-time
      On the surface, the most maintainable approach is to eschew DTOs altogether and have the client get one attribute at a time. But you pay for that maintainability in the number of messages. The average use case ends up with lots of small messages and, when remote interfaces are involved, the end-to-end communication overhead has a major impact on response time and throughput. As another minus (as you point out), you also break encapsulation, and breaking encapsulation itself has a negative impact on maintainability. For example, if you decide to factor a set of attributes out into a related entity, you would need to change every client that used those attributes!
    b. One-size-fits-all-DTO
      The other end of the spectrum is to return all of the single-valued data associated with the target entity in one DTO, as we discussed above in addressing question 1. In this case, you minimize the number of remotable messages and still have a relatively maintainable situation. You make only one call per entity and get all the data there is to get. You only have to develop one DTO and retrieval method per entity. Of course, the tradeoff is that the DTO almost always contains more data than you need for a given unit of work, especially for real-world entities that tend to have lots of attributes. Therefore, this approach has a major impact on the amount of data that is retrieved from the backend and transferred for each unit of work. On the plus side, you only have to modify the DTO when you add or remove an attribute, and you only have to modify a client if it is using an attribute that was deleted.
    c. Custom-DTO
      The middle ground (the EJB Advocate is reminded of Goldilocks and the Three Bears) is to tailor the DTO structure to return exactly what you need. One call. Perfect size. Just right. The problem here is that you must do a pretty good use case analysis to get the right set of DTOs and methods. When you implement a new class of use case, you have to create a new DTO and go tweak the EJB to retrieve it.
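To make the trade-off between the three styles concrete, here is a small sketch that counts how many calls cross the client/service boundary for a use case that needs only a name and an email address. The CustomerService class, its call counter, and the DTO names are hypothetical and exist only for this illustration; a real measurement would of course involve a container and a network.

```java
// Hypothetical full DTO: everything the customer has.
class CustomerData {
    final String name;
    final String email;
    final String phone;
    final String address;
    CustomerData(String name, String email, String phone, String address) {
        this.name = name;
        this.email = email;
        this.phone = phone;
        this.address = address;
    }
}

// Hypothetical custom DTO: exactly what the "contact" use case needs.
class CustomerContactData {
    final String name;
    final String email;
    CustomerContactData(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

// Stand-in for a customer service; "calls" counts boundary crossings.
class CustomerService {
    int calls = 0;
    private final String name = "Ada";
    private final String email = "ada@example.com";
    private final String phone = "555-0100";
    private final String address = "1 Main St";

    // One-attribute-at-a-time: one call per attribute.
    String getName() { calls++; return name; }
    String getEmail() { calls++; return email; }

    // One-size-fits-all: one call, but all four attributes travel.
    CustomerData getCustomerData() {
        calls++;
        return new CustomerData(name, email, phone, address);
    }

    // Custom DTO: one call, and only the two attributes needed.
    CustomerContactData getCustomerContactData() {
        calls++;
        return new CustomerContactData(name, email);
    }
}

class AccessStylesDemo {
    public static void main(String[] args) {
        CustomerService oneAtATime = new CustomerService();
        oneAtATime.getName();
        oneAtATime.getEmail();

        CustomerService custom = new CustomerService();
        custom.getCustomerContactData();

        System.out.println("one-at-a-time calls: " + oneAtATime.calls);
        System.out.println("custom DTO calls: " + custom.calls);
    }
}
```

The call count grows with the number of attributes in the first style and stays at one in the other two; what differs between those two is how much unneeded data rides along.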

    When using EJB 1.x entities, option c was considered the best because distributed object application design is all about minimizing the number of calls (which could be remote). Option b was a good compromise to get some maintainability. Times were simpler then: there were none of those pesky CMRs to cause you to think beyond "dependent objects" (clusters of properties that actually represent an object "contained" in the entity).

    But now that EJB 2.x is here with local interfaces and CMRs, I have wondered whether option a is the best (we mentioned this offhandedly in the column on Making entity EJB components perform, Part 1). Still, we stuck to our guns that using custom DTOs as described in option c is best, for the same performance reasons, though they are not nearly so acute. We also figured that since many people were already used to the EJB 1.x "best practice," it was easier than retraining them (and it made the conversion much easier).

Hope this helps,
Your EJB Advocate


The maintainability problem revisited

Dear EJB Advocate,

I understand and agree with your answers to the first two questions. That helps a lot. With respect to your analysis for my third question, I understand why you thought option c was best back in the EJB 1.x days from a performance perspective, but you didn't really address the enormity of the maintainability problem that you can eliminate if you exploit EJB 2.x.

Think about just the simple diagram from your reply. There were five different "top level" methods to compose information from all four entities. You failed to mention the methods associated with each to handle the delegation. Just looking at the methods associated with getting the open order from a customer:

  1. Given an order, return it and the line items with product. This delegates to:
  2. Given a line item, return it and the product, which delegates to:
  3. Given a product, return its DTO.

Then there are compositions that do not include all four entities. By looking only at customer, I can come up with three more methods:

  1. Just return data from customer and nothing else.
  2. Return just the customer and the open order (no details).
  3. Return the customer and all associated orders (but again, no details).

Now, add to this all the permutations of getting partial attributes from each entity. In the final analysis, the combinations are enormous, especially when you take into account (as you said yourself) that there are usually a lot more attributes associated with each entity.

Therefore, where I wasn't clear in my third question is that you seem to be trading away a huge amount of maintainability for what is now a small amount of performance.

Thanks in advance, but I still must sign off with:
Never Cry Wolf


Data transfer objects are associated with views

Dear Never Cry Wolf,

I could make a joke and say that now it is you who are "crying wolf." You would not be likely to see very many of those permutations in a custom-developed application. To get to the bottom of why I say this, let's start at the top - from the use cases. While this article is not intended to be a full tutorial on our approach to object-oriented analysis and design, we will briefly walk through some of the major artifacts to make the point that the number of permutations will be relatively limited.

The entities shown in the class diagram of Figure 1 are associated with an order management business process. That process can be described using a state transition diagram (STD), such as the one in Figure 2, that shows the stages in the lifecycle of an individual order being managed.

Figure 2. An example state-transition diagram showing the lifecycle of an order

Just for your information, Figure 2 is extended with UML actor notation to show the owner of the instance in a given state. We have used this simple extension for years as a way to organize use cases and tie them to the business process (in fact, we like to joke that a Use Case diagram is just a state-transition diagram that hasn't "hatched" yet).

Another approach that we follow a little differently from the norm is to develop a class diagram for each state in the lifecycle model, to show the constraints on attributes and relationships that hold in that state. These changing constraints tend to get lost in a "monolithic" class diagram that does not take state into account. For example, the class diagram in Figure 1 shows the relationships and attributes that hold in the open state of the lifecycle shown in Figure 2. The class diagram for an order in the submitted state might look like Figure 3.

Figure 3. An example class diagram showing an order in the submitted state

This approach to providing separate class diagrams per state explains why some classes in Figure 3 and Figure 1 use a [] in the class name to indicate the state, tying the "dynamic" state transition model together with the "static" class diagram. By comparing Figure 1 and Figure 3, you can easily see the change in the "shape" of the Order as it moves from open to submitted states in the business process shown in Figure 2. The states in Figure 2 show the changes in behavior.

Together, the static and dynamic models define the complete set of services in a service-oriented architecture. Each state in the lifecycle model can be mapped one-to-one to a session EJB component. Each transition in the STD maps to a method on the session bean associated with the state, and handles the updates. Each class in the relationship model can be mapped to an entity EJB component. Relationships in the diagram map naturally to CMRs. So for the two states, opened and submitted, you can infer that there will be two session EJBs, with four and three update methods each. For the two class diagrams shown, you can infer that there will be 15 entity EJB components, with a number of attributes and CMRs.

To keep this concrete, let's take the open order state as an example. We like to start with a pure Java interface that we can reuse throughout, like so:

Snippet 3. A pure Java interface showing the methods derived from the STD
    public interface OpenOrder {
        OrderKey open(CustomerKey cust)
            throws CustomerNotFound, OrderAlreadyOpen;

        int addLineItem(CustomerKey cust, ProductKey product, int qty)
            throws CustomerNotFound, OrderNotOpen, InvalidQuantity;

        void submit(CustomerKey cust)
            throws CustomerNotFound, OrderNotOpen, OrderHasNoLineItems;

        void cancel(CustomerKey cust)
            throws CustomerNotFound, OrderNotOpen;
    }

This interface can be reused in the session EJB interface and implementation like so:

Snippet 4. Session bean interface and implementation class
    public interface OpenOrderSession
        extends javax.ejb.EJBLocalObject, OpenOrder
    {
        // no additional methods; the business methods come from OpenOrder
    }

    public class OpenOrderSessionBean
        implements javax.ejb.SessionBean, OpenOrder
    {
        // implementations go here
    }
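To show how the transitions of the open state might turn into guarded methods, here is a minimal in-memory sketch of the OpenOrder interface, written in plain Java with no EJB container. The interface and its exception names come from Snippet 3, but their definitions are not given in the column, so the key classes, the exception classes, and the InMemoryOpenOrder class below are all assumptions. In particular, the meaning of the int returned by addLineItem() is not stated; this sketch assumes it is the number of line items now in the order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical key types; the column names them but does not define them.
class CustomerKey { final String id; CustomerKey(String id) { this.id = id; } }
class ProductKey { final String id; ProductKey(String id) { this.id = id; } }
class OrderKey { final int id; OrderKey(int id) { this.id = id; } }

// Hypothetical checked exceptions matching the names used in Snippet 3.
class CustomerNotFound extends Exception {}
class OrderAlreadyOpen extends Exception {}
class OrderNotOpen extends Exception {}
class InvalidQuantity extends Exception {}
class OrderHasNoLineItems extends Exception {}

interface OpenOrder {
    OrderKey open(CustomerKey cust) throws CustomerNotFound, OrderAlreadyOpen;
    int addLineItem(CustomerKey cust, ProductKey product, int qty)
        throws CustomerNotFound, OrderNotOpen, InvalidQuantity;
    void submit(CustomerKey cust)
        throws CustomerNotFound, OrderNotOpen, OrderHasNoLineItems;
    void cancel(CustomerKey cust) throws CustomerNotFound, OrderNotOpen;
}

// In-memory stand-in for the session bean implementation.
class InMemoryOpenOrder implements OpenOrder {
    private final Set<String> customers = new HashSet<String>();
    // customer id -> line items of that customer's open order
    private final Map<String, List<String>> openOrders = new HashMap<String, List<String>>();
    private int nextOrderId = 1;

    InMemoryOpenOrder(String... customerIds) {
        customers.addAll(Arrays.asList(customerIds));
    }

    public OrderKey open(CustomerKey cust) throws CustomerNotFound, OrderAlreadyOpen {
        requireCustomer(cust);
        if (openOrders.containsKey(cust.id)) throw new OrderAlreadyOpen();
        openOrders.put(cust.id, new ArrayList<String>());
        return new OrderKey(nextOrderId++);
    }

    // Assumption: returns the number of line items now in the order.
    public int addLineItem(CustomerKey cust, ProductKey product, int qty)
            throws CustomerNotFound, OrderNotOpen, InvalidQuantity {
        List<String> items = requireOpenOrder(cust);
        if (qty <= 0) throw new InvalidQuantity();
        items.add(product.id + " x" + qty);
        return items.size();
    }

    public void submit(CustomerKey cust)
            throws CustomerNotFound, OrderNotOpen, OrderHasNoLineItems {
        List<String> items = requireOpenOrder(cust);
        if (items.isEmpty()) throw new OrderHasNoLineItems();
        openOrders.remove(cust.id); // the order leaves the open state
    }

    public void cancel(CustomerKey cust) throws CustomerNotFound, OrderNotOpen {
        requireOpenOrder(cust);
        openOrders.remove(cust.id);
    }

    private void requireCustomer(CustomerKey cust) throws CustomerNotFound {
        if (!customers.contains(cust.id)) throw new CustomerNotFound();
    }

    private List<String> requireOpenOrder(CustomerKey cust)
            throws CustomerNotFound, OrderNotOpen {
        requireCustomer(cust);
        List<String> items = openOrders.get(cust.id);
        if (items == null) throw new OrderNotOpen();
        return items;
    }
}
```

Each exception corresponds to a transition that is not legal from the current state, which is one way the "dynamic" model ends up visible in the method signatures.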

If you wish, you can use EJB Home methods instead of session EJBs. In this case, the Customer entity becomes the natural "gateway" to the business logic because it represents the actor driving the business process in that state. The home interface can simply extend the same OpenOrder interface that the session interface did above, as the following code snippet shows:

Snippet 5. Customer entity home interface reusing OpenOrder interface
    public interface CustomerHome
        extends javax.ejb.EJBLocalHome, OpenOrder
    {
        // other Home methods like findByPrimaryKey() and create()
    }

Whichever approach you choose, these models and mappings give you the components needed to implement the update methods. These dynamic and static models could be used to derive all the read methods too, if we went through every permutation that you enumerated in your analysis. But we have found it best to derive the read methods from a User Interface (UI) screen flow that delivers the data needed to support invoking the business process functions.

For documenting a screen flow, we also like to use state transition diagrams. A screen flow STD can be thought of as capturing the lifecycle of a typical "session" of the actor that owns the state in the business process model. The states show screens and pop-up dialogs, while the transitions show user initiated events. To illustrate, Figure 4 shows a screen flow for a customer, showing how it interacts with the order management process.

Figure 4. An example screen flow for a customer order management session

The interaction between models, if any, is specified within {} on the transitions showing side effects in terms of invoking transitions on the business process. Some events, like submit, enter into a confirmation dialog state. Only if "OK" is triggered by the user does the side effect to actually submit the order occur. These confirmation states are shown with italics.

Another state shown in italics is the "Home" state, which represents how that role (Customer in this case) initiates and ends their session. And yet another special state shown in italics is associated with the business process - since a given actor may interact with multiple processes. Both of these states are special because they are nothing more than navigations (they usually show up as menus or tabs, depending on the style of UI).

The other states represent the "real" screens (or fragments of screens; we will save that discussion for a later article since it gets into general J2EE best practices). For each screen (whether special or not), you can use a class diagram to capture the visible data for that state, similar to the way that a class diagram associated with a business process shows the persistent data associated with a state. Figure 5 shows a combined class diagram for all the states except the confirmation ones.

Figure 5. An example class diagram showing the visible data for the session

Now we can get to the point of why there will not be that many DTOs and associated methods. Figure 5 shows seven DTOs with content (a relationship counts). These map very straightforwardly to Java classes. The transitions into a state on the screen flow shown in Figure 4 map to additional methods on the OpenOrder interface. The DTO each method returns is described in the class diagram in Figure 5. The convention we like to use is <TargetState>Data (instead of DTO). The parameters to each method are shown as data flows associated with the transition, and the transition's name becomes part of the method name. Some examples are shown in Snippet 6:

Snippet 6. Some read only methods derived from UI
    CustomerHomeData getCustomerHomeData(CustomerKey cust)
        throws CustomerNotFound;

    ProductCatalogData getProductData(CustomerKey cust)
        throws CustomerNotFound;

    ProductCatalogData getNextProductData(
        CustomerKey cust,
        ProductKey last
    )
        throws CustomerNotFound, ProductNotFound;

    ProductCatalogData getPreviousProductData(
        CustomerKey cust,
        ProductKey first
    )
        throws CustomerNotFound, ProductNotFound;

    OrderDetailsData getOpenOrderDetailData(CustomerKey cust)
        throws CustomerNotFound, OrderNotFound;

    OrderDetailsData getOrderDetailData(OrderKey order)
        throws OrderNotFound;

    OrderStatusData getOrderStatusData(CustomerKey cust)
        throws CustomerNotFound;

As you can see, there are very few read only methods. There are even fewer "root" DTOs because they are reused. In the worst case for this example, there are only eight DTOs needed to support the business process. And even if you do OO delegation in the entity EJB components to load the full structure, the total number of get<DTO>() methods will be relatively few.

Sorry to give you the fire hose on general OO analysis and design techniques just to answer a relatively simple question, but in practice I still think that the benefits of encapsulation afforded by returning the data structure needed outweigh the maintenance problems.

Do you agree?

OK then,
Your EJB Advocate.


The performance problem revisited

Dear EJB Advocate,

It was interesting to see how you do object-oriented analysis and map those work products down to EJB components. Your approach convinced me that the maintainability problem is not as bad as it could be. But I was still left with a nagging doubt that I couldn't ignore.

When I thought about it some more, it boiled down to a completely different question: if there was no performance penalty at all for calling the get<attribute>() methods of a CMP, would you still recommend that the session facade, DTO assembler or entity Home method use a get<DTO>() method on the entity rather than simply get the attributes it needs one at a time from the associated entities?

For example, look at the DetailItemData DTO in Figure 5. I suppose if I was to follow your advice and use OO delegation, the LineItem entity would have a getDetailItemData() method. This method would follow the CMR to get a reference to the Product. Rather than invoke getDescription() and getPrice() on the Product, you would have me create a new ProductDescriptionAndPriceData object and invoke getProductDescriptionAndPriceData() on product instead. Then I would copy the fields out of this structure, along with the productId and quantity (and computed amount) into the DetailItemData structure.

This approach seems like a real waste to me, unless the performance was really that bad.

Later,
Never Cry Wolf


Trust your intuition, but verify it

Dear Never Cry Wolf,

Your persistent (no pun intended) questioning really got the EJB Advocate wondering if his advocacy for using EJBs has not gone far enough.

I have to admit that if there were no performance problem with using the get<attribute>() methods on local entity EJB components, your approach of assembling the needed data in the facade, the entity home, or a delegated method would be much more attractive. It ends up with fewer components to compile at build time and less garbage to collect at runtime, so I am really glad you trusted your intuition that the real question was not being answered. This exchange shows that sometimes you have to iterate to get to the bottom of an issue.

I, on the other hand, had an intuition that the performance of local method calls was still significant. But I didn't bother to test the performance in any meaningful way to verify the theory. Instead I was relying on the axiom if it ain't broke, don't fix it - not only because doing performance testing is hard work, but also because modifying all of my presentation materials about best practices and existing examples is harder.

This axiom is reasonable for "legacy" applications, but it would be better to verify my original intuition so that new development projects could choose the right approach for the right situation.

Therefore, I had a team test the performance and found to my happy surprise that there is no appreciable difference between the "client" calling a local get<Attribute>() method through the local interface and having the entity bean implementation call its own abstract get<Attribute>() method - as long as the client and entity are both within a global transaction scope.

With this new data, I am happy to modify my approach such that if there is not already an existing get<DTO>() that provides the information, go ahead and use the get<Attribute>() methods to assemble the data needed. This applies to a delegated method on an entity EJB, an entity Home method, a session facade, or (as you call it) a DTO assembler class.
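As a rough sketch of that procedural style, here is how an assembler might build the DetailItemData structure you described directly from the getters on LineItem and Product, with no intermediate product DTO. The classes are plain-Java stand-ins for local entity EJB components; DetailItemAssembler is a hypothetical name, and the field types (including prices as integer cents) are assumptions.

```java
// Plain-Java stand-ins for the Product and LineItem entities.
class Product {
    private final String productId;
    private final String description;
    private final int priceInCents;
    Product(String productId, String description, int priceInCents) {
        this.productId = productId;
        this.description = description;
        this.priceInCents = priceInCents;
    }
    String getProductId() { return productId; }
    String getDescription() { return description; }
    int getPrice() { return priceInCents; }
}

class LineItem {
    private final Product product;
    private final int quantity;
    LineItem(Product product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }
    Product getProduct() { return product; } // stands in for following the CMR
    int getQuantity() { return quantity; }
}

// The DTO from Figure 5, with assumed field types.
class DetailItemData {
    final String productId;
    final String description;
    final int price;
    final int quantity;
    final int amount;
    DetailItemData(String productId, String description, int price, int quantity, int amount) {
        this.productId = productId;
        this.description = description;
        this.price = price;
        this.quantity = quantity;
        this.amount = amount;
    }
}

// Hypothetical assembler: reads the attributes it needs one at a time.
class DetailItemAssembler {
    static DetailItemData assemble(LineItem item) {
        Product product = item.getProduct();
        int price = product.getPrice();
        int quantity = item.getQuantity();
        return new DetailItemData(
            product.getProductId(),
            product.getDescription(),
            price,
            quantity,
            price * quantity); // computed amount
    }
}
```

The same assemble() logic could live in a session facade, an entity Home method, or a delegated method on LineItem; what changes is only where the code sits, not how it reads the data.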

I am also happy to discover that there are some people out there who are even more of an EJB advocate than I! If we ever meet in person, I will have to buy you dinner.

OK then,
Your EJB Advocate.


Conclusion

Here are a few interesting principles that can be gleaned from this discussion:

  • Your analysis work products should cover the dynamic and static aspects of both the business domain and the user interface.
  • Systematically map your analysis work products to the appropriate EJB components and DTOs.
  • Session beans are excellent for implementing the transitions associated with a state in the business process (updates) or screen flow (reads).
  • Entity bean home methods are best when associated with a "gateway" object representing an actor that owns one or more states in a business process.
  • In either case, exploit CMRs associated with a gateway object where possible, and use either get<DTO> (OO delegation) or get<Attributes> (procedural assembly) as needed to get the data to return. A lot depends on whether you already have an appropriate DTO.
  • Finally, the EJB Advocate is willing to admit that he is wrong. It just may take a while.

We are sure you can find others. That should keep you busy until next time.


Acknowledgements

Special thanks to Bobby Woolf, who provided major inspiration and material for this article. Check out Bobby's blog, J2EE in Practice.
