Best practices and tuning for large objects in WebSphere Enterprise Service Bus

Design patterns and tuning

Ensuring optimum performance is attained on systems processing large objects is an issue commonly faced by users of middleware software. In general, objects of 1 MB or more can be considered 'large' and require special attention. This article aims to provide you with the necessary information and advice required to successfully utilise the WebSphere Enterprise Service Bus (ESB) V7 product to process large objects efficiently in a 64-bit production environment.


Martin Ross (martin.ross@uk.ibm.com), Performance Analyst, IBM

Martin Ross is a performance analyst for WebSphere Enterprise Service Bus based in Hursley, United Kingdom, and has worked in the product area since 2006. He holds a degree in Software Engineering from the University of Southampton.



03 October 2011




Considerations and affecting factors

This section provides information on the main considerations and affecting factors when processing large messages.

JVM limitations

The main advantages of 64-bit architectures relate to memory management and accessibility. The increased data bus width enables support for addressable memory above the 4 GB generally available on 32-bit architectures. Although the limit on the size of the Java heap is operating system dependent, it is not unusual to have a limit of around 1.4 GB for a 32-bit JVM. The increased memory support that 64-bit architectures deliver alleviates the constraints on Java heap sizes that can become a limiting factor on 32-bit systems when performing operations on large data objects.

As a general rule you should always run with 64-bit JVMs to service large objects.

Size of in-memory BO

It should be noted that the size of the in-memory business object (BO) can be much larger than the representation available on the wire. There can be several reasons for this, notably character encoding differences, modifications made as the message flows through the system, and copies held of the BO during a transaction to allow for error handling and roll-back.

Number of concurrent objects

The achievable response time degrades as the number of concurrently processed objects increases, although modern SMP hardware is helping to alleviate this limitation to a degree. To achieve the best possible response times from your system you can limit the number of messages being concurrently processed; this is of particular note when processing large data objects, due to the stress placed on the Java heap.

Limiting the number of concurrently processed messages can be achieved by:

  • Restricting the number of clients used to drive the workload.
  • Tuning the appropriate thread pools to restrict the number of concurrent threads.

Network

Network bandwidth can be a limiting factor when processing large messages. Consider a simple client-server model where the client sends a negligibly sized request message and receives a 50 MB (400 Mbit) response over a 1 Gbit/s LAN. The maximum theoretical throughput can be calculated as follows:

Bandwidth (1,000 Mbit/s) / Message size (400 Mbit) = 2.5 messages per second

This equates to an achievable response time of 400 ms, assuming a single client thread.

In reality, the nominal transfer rate of a network interface card (NIC) is not achievable at the application layer due to the overheads of the lower layers (TCP/IP etc.). A maximum throughput of around 70% of the NIC rating is not unusual.
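The calculation above, including the efficiency factor, can be sketched in Java. The class and method names are illustrative only:

```java
/** Illustrative estimate of theoretical throughput for large responses over a LAN. */
class ThroughputEstimate {

    /** Messages per second, given link bandwidth, response size, and NIC efficiency (0.0-1.0). */
    public static double messagesPerSecond(double bandwidthMbits, double messageSizeMB, double efficiency) {
        double messageSizeMbits = messageSizeMB * 8; // 1 byte = 8 bits
        return (bandwidthMbits * efficiency) / messageSizeMbits;
    }

    /** Response time in milliseconds for a single client thread. */
    public static double responseTimeMs(double messagesPerSecond) {
        return 1000.0 / messagesPerSecond;
    }
}
```

With the 70% efficiency figure quoted above, the same 50 MB response over a 1 Gbit/s LAN drops from 2.5 to 1.75 messages per second.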

When processing messages over a multi-tiered configuration (Figure 1) the network load on the middle tier is effectively double that of the client or service provider – this has the effect of halving the achievable throughput from the scenario defined above.

Figure 1. Multi-tiered configuration
Multi-tiered configuration

Application design patterns

This section provides a number of design patterns to improve performance for processing of large messages.

Decomposing inputs

Decomposing an input message is a technique that splits a large message into multiple smaller messages, each submitted individually.

If the large message is primarily a collection of smaller business objects then the solution is to group the smaller objects into conglomerate objects less than 1MB in size. If there are temporal dependencies or an "all or nothing" requirement for the individual objects then the solution becomes more complex.
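A minimal sketch of this grouping in Java, assuming the smaller business objects are available as serialised byte arrays; the class name and size threshold are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch: groups small objects into batches under a 1 MB threshold. */
class InputDecomposer {
    static final long MAX_BATCH_BYTES = 1_048_576; // 1 MB

    public static List<List<byte[]>> decompose(List<byte[]> objects) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentSize = 0;
        for (byte[] obj : objects) {
            // Start a new batch when adding this object would exceed the threshold
            if (currentSize + obj.length > MAX_BATCH_BYTES && !current.isEmpty()) {
                batches.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(obj);
            currentSize += obj.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each resulting batch can then be submitted as a separate, smaller message.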

Claim check pattern

The claim check pattern pertains to a technique for reducing the size of the in-memory BO when only a few attributes of a large message are required by the mediation.

  1. Detach the data payload from the message
  2. Extract the required attributes into a smaller 'control' BO
  3. Persist the larger data payload to a data store and store the 'claim check' as a reference in the 'control' BO
  4. Process the smaller 'control' BO, which has a smaller memory footprint
  5. At the point where the solution needs the whole large payload again, check out the large payload from the data store using the 'claim check' key
  6. Delete the large payload from the data store
  7. Merge the attributes in the 'control' BO with the large payload, taking the changed attributes in the 'control' BO into account
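Steps 3, 5, and 6 above can be sketched with a hypothetical in-memory store. In production the map would be replaced by a real data store such as a database or cache; all names here are illustrative:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative claim check store; a database or cache would back a real implementation. */
class ClaimCheckStore {
    private final Map<String, byte[]> store = new ConcurrentHashMap<>();

    /** Step 3: persist the large payload and return the 'claim check' key. */
    public String checkIn(byte[] payload) {
        String claim = UUID.randomUUID().toString();
        store.put(claim, payload);
        return claim;
    }

    /** Steps 5 and 6: retrieve the large payload and delete it from the store. */
    public byte[] checkOut(String claim) {
        return store.remove(claim);
    }
}
```

The mediation then carries only the small 'control' BO holding the claim key, and checks the payload back out when the full message is needed again.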

Co-located services

The most significant architectural decision is to utilise a separate JVM (a dedicated server) for processing large messages, especially if you are running a transaction mix of small message payloads (high throughput, low response times) and large message payloads. Employ this technique even if the large message payloads are only occasional but exhibit relatively long response times.

On systems that host a number of services, with a mix of services that handle large and small message payloads, the GC and message processing overhead incurred due to handling the larger messages can have a detrimental effect on the performance of the other services.

If we take two example services:

  • ServiceA – predominantly handles large message payloads
  • ServiceB – predominantly handles small message payloads (high throughput / low response times)

Ensuring that ServiceA is located on a separate JVM to ServiceB has multiple benefits:

  • GC and message processing overhead of handling larger messages on ServiceA does not affect the high throughput and low response time performance of ServiceB as dramatically
  • You can independently tune the separate JVMs so that they are optimised for the expected workloads

Performance tuning

This section provides information and advice on a number of tuning options that should be understood and correctly configured to obtain optimum performance.

JVM Tuning

This section describes tuning considerations relating to the JVM.

What is Garbage Collection?

Garbage Collection (GC) is a form of memory management for the JVM. A GC is usually triggered by an allocation failure, which occurs when an object cannot be allocated on the JVM heap due to insufficient available space. The aim of the GC is to clear the JVM heap of objects that are no longer required, freeing enough space for the object that previously failed allocation. If a GC is triggered and there is still not enough room for the object, the JVM heap is exhausted.

Generational GC is a policy best suited to applications that create many short-lived objects, which is typical of middleware solutions. The JVM heap is split into three sections (allocate space, survivor space, and tenured space), and although this provides performance optimisations in a number of situations, you need to be aware of how the JVM heap is being utilised when processing large messages. Due to JVM heap size constraints this can be a limiting factor on 32-bit JVMs, so on such architectures it is recommended that you do not use the generational GC policy for processing large messages. This is not the case on 64-bit JVMs, due to the increased memory support.
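On IBM JVMs (as used by WebSphere ESB), the GC policy and heap sizes are set through generic JVM arguments. The sizes below are purely illustrative starting points, not sizing recommendations; always size the heap against your measured workload:

```
# 64-bit JVM: generational GC (gencon) with an explicit nursery size
-Xgcpolicy:gencon -Xms1024m -Xmx4096m -Xmn512m

# 32-bit JVM (not recommended for large objects): flat heap via the throughput policy
-Xgcpolicy:optthruput -Xms1024m -Xmx1280m
```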

Increase the size of the JVM Heap?

Processing a number of large messages, especially when running with concurrent threads, can lead to JVM Heap exhaustion. Increasing the size of your JVM Heap can alleviate the majority of cases where JVM Heap exhaustion has been an issue – however, a balance is needed so that side-effects of this change do not inhibit the performance.

Increasing the size of your JVM Heap to compensate for JVM Heap exhaustion will result in more objects being able to be allocated before a GC is triggered. This has the side-effect of increasing the interval times between GCs, and increasing the time it takes to process an allocation failure.

During a GC all other JVM threads are temporarily blocked. For example, if a global GC regularly takes 3 seconds to complete and you have a Service Level Agreement (SLA) of 1 second on response times, any transaction interrupted by a global GC will exceed that 1 second response time.

If you are running on a 32-bit JVM (not recommended for large object processing) you can maximise the space available to process large BOs by not using generational garbage collection. This results in a “flat heap”, where the entire heap space is available for transient object allocation rather than just the nursery space.

Is there an alternative approach?

If multiple large messages are being processed by a service at the same time, then available space within the JVM Heap can quickly disappear. Limiting the number of Web Container Threads will give the administrator additional control over the number of messages being concurrently processed. This can help alleviate the issue of Heap exhaustion without the need to increase the JVM Heap to an excessive size.

Additionally, you could ensure that only a single large message is being processed at once by using a single client to drive messages into WebSphere ESB; this will help to reduce memory consumption and provide optimal response times. Throttling incoming client requests so that large messages arrive sequentially into WebSphere ESB can be achieved with a front-end server, such as a DataPower appliance.
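A client-side throttle of this kind can be sketched with a standard Java Semaphore. The class below is purely illustrative and not a WebSphere ESB API; with one permit, large-message requests are forced to arrive sequentially:

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Illustrative client-side throttle limiting in-flight large-message requests. */
class LargeMessageThrottle {
    private final Semaphore permits;

    public LargeMessageThrottle(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Blocks until a permit is free, so at most maxConcurrent requests run at once. */
    public <T> T send(Supplier<T> request) {
        permits.acquireUninterruptibly();
        try {
            return request.get();
        } finally {
            permits.release();
        }
    }
}
```

Constructing the throttle with `new LargeMessageThrottle(1)` serialises all large-message sends through a single permit.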

The administrative tuning section of this article details the parameters and settings available from the WebSphere ESB Administrative Console.

Administrative tuning

This section describes a number of relevant parameters and tuning considerations, with recommendations and information on where these can be applied in the Administrative Console:

MDB ActivationSpec

There are a few ways to access the MDB ActivationSpec tuning parameters:

Resources > Resource Adapters > J2C Activation Specifications > ActivationSpec Name

Resources > JMS > Activation Specifications > ActivationSpec Name

Figure 2. Activation Specifications
Activation Specifications

There are two properties that need to be considered when processing large messages:

Figure 3. Activation Specification Properties
Activation Specification Properties

maxConcurrency – this property controls the number of messages that can be concurrently delivered from the JMS queue to the MDB threads.

maxBatchSize – this property determines how many messages are taken from the messaging layer and delivered to the application layer in a single step.

Thread Pools

The following thread pools will typically need to be tuned:

  • Default
  • ORB.thread.pool
  • WebContainer

The maximum size of these thread pools can be configured under Servers > Application Servers > Server Name > Thread Pools > Thread Pool Name

Figure 4. Thread Pools
Thread Pools

JMS Connection Pool

There are a few ways to access the JMS Connection Factories and JMS Queue Connection Factories from the Admin Console:

Resources > Resource Adapters > J2C Connection Factories > Factory Name

Resources > JMS > Connection Factories > Factory Name

Resources > JMS > Queue Connection Factories > Factory Name

Figure 5. Connection Factories
Connection Factories

From the admin panel for the connection factory open Additional Properties > Connection Pool Properties. From here you can control the maximum number of connections.

Figure 6. Connection Factory Properties
Connection Factory Properties

Conclusion

The increased memory support that 64-bit architectures deliver alleviates the constraints on Java heap sizes that can become a limiting factor on 32-bit systems when performing operations on large data objects.

Increasing the size of your JVM Heap can alleviate the majority of cases where JVM Heap exhaustion has been an issue – however, a balance is needed so that side-effects of this change do not inhibit the performance.

  • Tune the JVM appropriately to balance GC intervals and GC pause times.
  • Consider available design patterns aimed at reducing the stress on the JVM.
  • Use a dedicated server for processing large messages.
  • Constrain concurrency or single-thread requests through the large message server.
