Stability and performance play an important role in service-oriented architecture (SOA) integration solutions. WebSphere Process Server (hereafter called Process Server) and WebSphere Enterprise Service Bus (ESB) are two products that help build successful SOA integration solutions. Once a solution is put together, a tuning phase is usually performed to ensure that the solution delivers the throughput or response time that the business requires, ideally as captured in service level agreements. This tuning effort usually maximizes the throughput for a balanced situation that consists of a constant rate of injected requests and constant performance of the invoked backend services.
In real life, balanced conditions are not always met. There might be times where the input rate is unusually high, the backend services are slower than expected, or the system performance is worse than usual. This article discusses how to deal with such system overload situations.
This article assumes you have intermediate to advanced administration knowledge of Process Server.
Note: The content of this article also applies to IBM® Business Process Manager (BPM) Advanced.
Process Server is a middleware product that helps build SOA integration solutions. Fundamental components of Process Server are Business Process Choreographer (BPC), Human Task Manager, and ESB.
Figure 1. Components of WebSphere Process Server
As shown in Figure 1, the building blocks of Process Server solutions are SOA components that interact with each other with the help of Service Component Architecture (SCA). Service components built for Process Server such as mediation flow components, business processes, or business state machines:
- Are exposed and invoked as services with SCA binding, WebService binding, JMS binding, MQ binding, or other bindings.
- Can invoke other components or backend services via SCA calls.
Typically, a collection of components is aggregated to an SCA module. An SCA module is a unit of installation, administration, and operation.
Multiple SCA modules usually work together to build a solution that solves a specific business problem. Developing such a solution is one part of the picture. Once such a solution has been created, you must tune it to make sure that it fulfills its quality of service requirements. During the tuning effort, make sure the system can handle a higher than expected average load during production. Once a solution is tuned and taken into production, it must be monitored constantly to ensure that no bottlenecks evolve over time.
Tuning a Process Server solution
To tune the performance of a Process Server solution, the system is usually exposed to a certain load. Then the system is monitored and bottlenecks are identified and removed. Usually, after removing a bottleneck, the next bottleneck shows up and is removed. This process is repeated until system performance is satisfactory, which means that the system meets the throughput or response time requirements typically captured by service level agreement contracts.
The first steps in the tuning cycle are to monitor and tune operating system and database performance. These activities are outside the scope of this article. Refer to the Resources section of this article for references.
You then need to tune Process Server. Important parameters to monitor and modify include the following:
- Available heap space
- Garbage collection policy
- Connection pool sizes of the JDBC data sources and JMS connection factories
- Thread pool sizes
- Maximum concurrency parameters of Message Driven Beans (MDBs)
Each change in one of the parameters may require changes in additional parameters. For example, if a WebSphere thread pool size is increased, it may be necessary to also increase the connection pool sizes, and in turn the maximum sessions or applications parameter of the database. Tuning usually requires several iterations until a stable configuration with satisfactory performance is reached.
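The way one parameter change cascades into others can be pictured with a small calculation. The following is a minimal sketch under simplifying assumptions that do not come from the product documentation: every thread may hold one database connection at the same time, and every cluster member has its own connection pool against the same database. The function name and the pool sizes are illustrative only.

```python
def required_pool_sizes(thread_pool_sizes, connections_per_thread=1, cluster_members=1):
    """Rule-of-thumb lower bounds derived from thread pool sizes.

    thread_pool_sizes maps a thread pool name to its maximum size, for the
    pools whose threads use the same data source (illustrative values).
    """
    # Worst case: every thread holds its connection(s) at the same time.
    connection_pool_min = sum(thread_pool_sizes.values()) * connections_per_thread
    # Assumption: each cluster member opens its own pool against the same database.
    database_connections_min = connection_pool_min * cluster_members
    return connection_pool_min, database_connections_min


pools = {"WebContainer": 50, "Default": 30}
print(required_pool_sizes(pools, cluster_members=2))  # (80, 160)

# Raising one thread pool raises both downstream bounds.
pools["WebContainer"] = 70
print(required_pool_sizes(pools, cluster_members=2))  # (100, 200)
```

Real systems rarely need the full worst-case values, so treat the results as a starting point for the next tuning iteration rather than as required settings.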
Tuning is done with a maximum load that is significantly higher than the average load during typical production time, so that the system can handle unexpected load spikes during production. Nevertheless, there may be situations in production where the extra load, the degradation of backend services, or the degradation of database performance is so severe that the system gets into trouble.
Reasons overload situations occur
Several conditions can lead to an unusually high load in the Process Server system, including, but not limited to, the following:
- Unusually high volume of incoming requests
- Failures in clustered solutions
- Slow backend systems
- Slow database
- Slow network
Unusually high volume of incoming requests
Process Server executes custom-built solutions that contain SCA modules and mediation modules. These modules usually process requests obtained from external sources. There are several ways in which requests can arrive at Process Server, such as:
- Web service requests arrive via HTTP/HTTPS or SOAP over JMS transport
- Asynchronous JMS requests arrive via JMS queues
- Inbound adapters may poll the file system or databases, or be triggered by Enterprise Information Systems
All these external sources are subject to fluctuations. There may be times where more customer requests arrive than expected.
In addition, there may be operational reasons for peaks in the incoming requests. For example, if Process Server is shut down during a scheduled service interval, an inbound MQ queue may fill up. Once Process Server starts again, the messages from this inbound queue will flood the system with requests.
Failures in clustered solutions
Another possibility is that, in a clustered solution, one cluster member experiences a hardware failure. This member is then shut down, and the remaining members must each process a higher load because the total load is now distributed over fewer machines. In the extreme case of a two-machine cluster, the remaining machine must cope with twice its usual load.
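The effect of a member failure on the remaining members is simple arithmetic. The sketch below assumes that the load is distributed evenly across the cluster, which real workload management only approximates; the function names and numbers are illustrative.

```python
def load_per_member(total_load, members, failed=0):
    """Load each remaining member must process, assuming even distribution."""
    remaining = members - failed
    if remaining <= 0:
        raise ValueError("no cluster members left to process the load")
    return total_load / remaining


def load_increase_factor(members, failed):
    """Factor by which the load on each surviving member grows."""
    return members / (members - failed)


print(load_increase_factor(2, 1))            # 2.0
print(load_per_member(900, 3, failed=1))     # 450.0
```

A cluster that is tuned to run close to its limit with all members available therefore has no headroom left once a member fails.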
Slow backend systems
Process Server solutions invoke services offered by backend systems. These backend services might be extremely slow at times. For example, there might be batch jobs running in the backend servers at night that consume system resources not available for processing backend services.
Slow database
Process Server uses several databases that contain business rules, JMS messages managed by message engines, and model and instance data for business processes. These databases might become unusually slow at specific times for a variety of reasons. Tables and indexes may fill up and degrade performance, other activities on the system that hosts the database may affect performance negatively, or a hardware failure may occur in a disk cache. There are literally hundreds of possible reasons why database performance may temporarily or permanently degrade. A slow database is one of the main reasons for performance problems in Process Server. Low performance may mean that Process Server cannot handle the incoming load.
Slow network
Process Server uses databases to persist messages, rules, and processes. These databases are usually accessed via the network. The same applies to the invocation of backend services. In addition, data files (such as transaction logs) often reside in network-attached file systems. If the network slows down, overall Process Server performance decreases, and Process Server may not be able to handle the incoming load.
Avoiding overload situations
Strategies that you can use to limit the load include, but are not limited to, the following:
- Statically limit the number of threads that accept external requests.
- Statically limit the number of threads that process requests inside Process Server.
- If backends are slow, queue the invocation of backend services.
- Monitor and throttle the system once the overload is approaching.
It is essential to distinguish between external requests that are processed synchronously, versus requests that are processed asynchronously. They create different loads on the system and require different handling.
For synchronous requests, the caller's thread is blocked until control returns from the call. Usually, the implementation of the caller is executed in the caller's thread.
For asynchronous requests, the caller issues a request and then continues working. The response to the request arrives at some time in the future and can be processed by the same or another thread in the caller. To match a response with a specific request, a correlation mechanism must be employed. Typically, an asynchronous interaction is implemented by the client sending a JMS message to a JMS queue, where it is picked up by a Message Driven Bean (MDB) that invokes the target service. Once the target service has processed the request, it sends the response back to the client's response queue, where it is picked up by an MDB on the client side that invokes the client code that handles the reply. A side effect is that an asynchronous invocation may spawn one or more new threads that execute the called service. These threads create an additional load in the system.
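The asynchronous pattern with correlation can be pictured with in-memory queues. The following is a minimal sketch, not JMS code: the two `queue.Queue` objects stand in for the request and response queues, the worker thread plays the role of the MDB that invokes the target service, and the upper-casing "service" and all names are hypothetical.

```python
import queue
import threading
import uuid

request_queue = queue.Queue()    # stands in for the service's input queue
response_queue = queue.Queue()   # stands in for the client's response queue


def service_worker():
    """Plays the role of the MDB: picks up requests and invokes the service."""
    while True:
        message = request_queue.get()
        if message is None:          # shutdown signal
            break
        correlation_id, payload = message
        # The reply carries the correlation ID of the request it answers.
        response_queue.put((correlation_id, payload.upper()))


def call_async(payloads):
    """Send all requests first, then match replies to requests by ID."""
    pending = {}
    for payload in payloads:
        correlation_id = str(uuid.uuid4())
        pending[correlation_id] = payload
        request_queue.put((correlation_id, payload))
    # The caller is not blocked while the service works; it collects replies later.
    results = {}
    for _ in payloads:
        correlation_id, reply = response_queue.get(timeout=5)
        results[pending[correlation_id]] = reply
    return results


worker = threading.Thread(target=service_worker)
worker.start()
replies = call_async(["order-1", "order-2"])
request_queue.put(None)
worker.join()
print(replies)
```

Note that the service runs on its own thread, separate from the caller, which is the additional load the paragraph above refers to.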
Limit the number of threads that accept external requests
You can limit the number of threads that accept requests from external sources. Typical synchronous requests are EJB calls, Web Service requests, and service invocations with a synchronous SCA binding. These synchronous requests are handled by threads from ORB.thread.pool or the WebContainer thread pool. Limiting the number of threads in these thread pools puts an upper limit to the number of synchronous requests that are processed concurrently in the system.
If you know that a specific application uses an unusually high amount of resources, you may want to use a dedicated thread pool for inbound requests for this application. In this way, you can control how many threads are available for that application. For details, refer to the Using a dedicated thread pool for Web services section.
Asynchronous services are usually based on JMS. In this implementation, service requests arrive in a JMS queue (such as the module's input queue) as JMS messages. The WebSphere infrastructure delivers these JMS messages to the onMessage() method of an MDB (the module MDB), which in turn invokes the code that implements the service. The operation of an MDB is governed by an activation specification that controls how many threads the MDB can use to accept requests.
The activation specification of MDBs is found in the Resources > JMS > Activation specification panel. Depending on the messaging provider used by this activation specification, the number of threads utilized by the MDB is configured differently:
- If the activation specification uses the "Default messaging provider", then the number of threads used by the MDB is controlled by the activation specification's property "Maximum concurrent MDB invocations per endpoint".
- If the activation specification uses the "Platform Messaging Component SPI Resource Adapter" (these are the activation specifications for the SCA module MDBs), then the number of threads used by the MDB is configured by setting the "maxConcurrency" parameter of the activation specification's custom properties.
The threads used to execute the MDB's onMessage() method are taken from specific thread pools. Which thread pool is used depends on the resource adapter used by the MDB or activation specification. Table 1 summarizes the most important resource adapters and thread pools in Process Server.
Table 1. Thread pool usage
|Resource Adapter|Used by|Thread pool used|
|SIB JMS Resource Adapter|JMS-based MDBs, such as ProcessContainerMDB|SIBJMSRAThreadPool|
|Platform Messaging Component SPI Resource Adapter|SCA-generated MDBs that use the SIB's SPI|Default|
|WebSphere MQ Resource Adapter|MQ binding|WMQJCAResourceAdapter|
Limiting the number of threads in these thread pools provides a means to control the overall number of threads that can accept asynchronous requests. These asynchronous requests can originate from external or internal clients.
If all services that are provided by an application have an asynchronous binding, and you want to control the number of threads on a module level, then limit the "maxConcurrency" custom property of the activation specification of the Module-MDB or the export MDBs.
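Whether the limit is a thread pool size or a "maxConcurrency" value, the effect is the same: an upper bound on the number of requests processed at the same time, with the remaining requests waiting. The following sketch illustrates that effect with a bounded pool from the Python standard library. It is an analogy only, not WebSphere code, and the function name and timings are assumptions.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_limit(max_threads, request_count, work_seconds=0.01):
    """Process requests on a bounded pool and report the peak concurrency seen."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def handle(_request):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(work_seconds)     # stands in for the actual service work
        with lock:
            active -= 1

    # Requests beyond max_threads wait in the executor's internal queue.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        list(pool.map(handle, range(request_count)))
    return peak


peak = run_with_limit(max_threads=4, request_count=50)
print(peak)   # never more than 4, regardless of how many requests arrive
```

The bound protects the system, but it also means that excess requests accumulate in front of the pool, which is why queue capacities matter as much as thread limits.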
Limit the number of threads working inside Process Server
In addition to limiting the number of threads that accept incoming requests, you might want to limit the number of threads inside Process Server that process the work initiated by asynchronous requests. These are mainly the threads that perform navigation in long running business processes. Typically, a long running business process is started by a request from an external source: a Web service invocation, an asynchronous JMS binding, or a call to BPC's EJB API. Once the process-starting transaction is finished, the remaining transactions for this long running process are executed in sequence, and potentially concurrently, in thread pools other than the one used by the starting transaction:
- If the BPC is configured to use JMS-based navigation, then the navigation threads run under the "ProcessContainerMDB" message driven bean. This MDB is controlled by the activation specification "BPEInternalActivationSpec". Therefore, limiting the "Maximum concurrent MDB invocations per endpoint" property of this JMS activation specification limits the number of threads that execute the navigation logic.
- If the BPC is configured to use the WorkManager-based navigation (WMBN), then ProcessContainerMDB is used for navigation only in exceptional cases. Therefore, the "Maximum concurrent MDB invocations per endpoint" property of BPEInternalActivationSpec is set much lower than in the case of the JMS-based navigation. In this case, most of the navigation work happens inside the thread pool of BPENavigationWorkManager. Therefore, pay attention to the size of this thread pool.
The goal is to configure the system in such a way that it is capable of accepting sufficient load to meet its performance requirements during good times, yet limits the load sufficiently to prevent overloading of the system in bad times. These two goals are somewhat contradictory, and therefore it may not always be possible to meet both of them. Usually, the system is then configured in such a way that it can accept sufficient load during good times, but has more threads than can be tolerated during bad times.
To prevent overload, you may need to monitor the system and take action if the load increases too much.
Detecting a system overload
If a system is overloaded, a series of symptoms occur, including, but not limited to, the following:
- The SystemOut.log is full of exceptions.
- The system reacts only very slowly to administrative requests.
- The CPU utilization may go up to 100%.
- Transactions start to time out.
- Threads start to hang. In this case, CPU utilization may go down, but the queues start to build up.
Since it is difficult and time consuming to recover from an overload situation, you need to monitor your system constantly and identify metrics that indicate that an overload situation is approaching before it actually occurs. It is much easier to keep a system out of trouble than to recover from a real problem situation.
The indicator that shows an overload situation is approaching depends on the application scenario. It can be, but is not limited to, one of the following:
- High number of briefly persisted processes in running state (briefly persisted processes are long running business processes that run only for a short period of time, such as a few minutes).
- High number of messages in certain JMS queues.
- High number of active threads in certain thread pools.
- High number of records in the SAVED_ENGINE_MESSAGE_B_T table of BPEDB.
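A monitoring script can compare such indicators against thresholds and raise an alert when any of them is exceeded. The following is a minimal sketch: the metric names and threshold values are hypothetical, and how the values are collected (for example from performance monitoring data or a database query) is left out.

```python
def approaching_overload(metrics, thresholds):
    """Return the names of all indicators whose value exceeds its threshold."""
    return sorted(
        name for name, limit in thresholds.items()
        if metrics.get(name, 0) > limit
    )


# Hypothetical thresholds; suitable values depend on the application scenario.
thresholds = {
    "running_briefly_persisted_processes": 500,
    "input_queue_depth": 10000,
    "active_navigation_threads": 45,
    "saved_engine_messages": 1000,
}

# Hypothetical snapshot of current values.
metrics = {
    "running_briefly_persisted_processes": 120,
    "input_queue_depth": 25000,
    "active_navigation_threads": 50,
    "saved_engine_messages": 0,
}

print(approaching_overload(metrics, thresholds))
# ['active_navigation_threads', 'input_queue_depth']
```

Setting the thresholds well below the values observed during an actual overload leaves time to react before the system becomes unresponsive.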
What to do if the system approaches overload
Modifications of parameters in the WebSphere configuration, such as thread pool sizes, properties of activation specifications, or service integration bus queues, take effect only after the server is restarted. This makes it challenging to take adequate action once a system overload is approaching, because modifying parameters and then stopping and restarting the server is usually not an option. Fortunately, there are some measures that you can take:
- The most elegant solution is to take precaution for this situation and develop applications to include mediation modules that can accept only a defined number of requests per time interval. Ideally, you can modify these numbers dynamically at runtime. For details on how to do this, see Just in time throttler and dispatcher for WebSphere ESB. An alternate approach is described in this article, WebSphere Process Server throughput management, Part 2.
- Another possibility is to make use of the Store and Forward feature. When developing an application, you can set the Store and Forward qualifier on components that are invoked asynchronously. If you detect an overload situation, you can use the Store and Forward widget in Business Space to set the state of critical components to "Store". These components are then no longer invoked. Instead, the request messages for these services are stored. Once the constrained situation is resolved, you can set the state of these components back to "Forward", and the stored requests (and all new requests) are then forwarded for processing. For details, see the Store and forward sections in the Information Center.
- The two previous options require specific constructs in the applications that have to be defined during application modeling. If you missed this opportunity, you can still stop individual messaging endpoints temporarily. This can be done via administrative scripts. For details, see How to stop a messaging endpoint.
Stopping a messaging endpoint prevents the associated message driven beans from accepting incoming messages. This reduces the system load created by the JMS request messages that are injected into the system to invoke services with asynchronous binding. It is also possible to stop the endpoint for ProcessContainerMDB temporarily. This reduces the system load created by the process navigation threads. If most navigation transactions fail, this might be the desired action until the underlying infrastructure problem is resolved or relieved.
If messaging endpoints are stopped, the queues associated with these MDBs will fill up over time. Therefore, it is vital to configure the queue capacities large enough and to resume the stopped endpoints as soon as possible once the constrained situation is resolved or relieved.
- The next possible action is to stop the enterprise applications. The effect is similar to that in the previous bullet, but in this case, all services that belong to the application will stop accepting incoming requests. This includes not only services with asynchronous binding, but also services with synchronous binding.
If you intend to stop the applications, consider the following:
- Applications that contain long running BPEL processes cannot be stopped as long as running instances of these processes exist. Therefore, you might want to build separate modules for long running BPEL processes and for microflows. You can stop applications containing microflows if an overload situation evolves. However, you usually cannot stop applications with long running processes.
- If services with asynchronous binding stop accepting requests, the only effect a client sees is that it takes longer until a reply arrives. If services with synchronous binding stop accepting requests, however, their clients might receive exceptions or timeouts. Therefore, stop the applications only if their clients can deal with exceptions or timeouts.
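The "Store" and "Forward" states described in the list above, and the similar effect of stopping and resuming a messaging endpoint, can be pictured as a gate in front of a component. The following is a conceptual sketch only. The class and method names are hypothetical and do not reflect the product's implementation or API.

```python
from collections import deque


class StoreAndForwardGate:
    """Holds back requests while in "Store" state, releases them on "Forward"."""

    def __init__(self, target):
        self._target = target        # callable that processes one request
        self._stored = deque()
        self._state = "Forward"

    @property
    def stored_count(self):
        return len(self._stored)

    def submit(self, request):
        if self._state == "Store":
            self._stored.append(request)   # component is not invoked
        else:
            self._target(request)

    def set_state(self, state):
        if state not in ("Store", "Forward"):
            raise ValueError("state must be 'Store' or 'Forward'")
        self._state = state
        if state == "Forward":
            # Release stored requests in arrival order before new ones arrive.
            while self._stored:
                self._target(self._stored.popleft())


processed = []
gate = StoreAndForwardGate(processed.append)
gate.submit("a")
gate.set_state("Store")
gate.submit("b")
gate.submit("c")
print(processed, gate.stored_count)   # ['a'] 2
gate.set_state("Forward")
print(processed, gate.stored_count)   # ['a', 'b', 'c'] 0
```

The sketch also shows why storage capacity matters: while the gate is closed, every incoming request adds to the backlog.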
Using a dedicated thread pool for Web services
If you want to fine-tune the number of threads used for specific Web services, then configure these Web services to use a dedicated thread pool.
Web services run inside the Web container. WebSphere uses transport chains with associated transport channels to control Web container traffic and thread pools.
To assign a dedicated thread pool to a specific Web service, you need to:
- Create a new thread pool.
- Create a new transport chain with a dedicated port and assign the new thread pool to it.
- Add the dedicated port to the virtual hosts.
- Use the new port for invocations of the Web service.
Create a new thread pool
You can create a new thread pool via the admin console panel by selecting Application servers > server1 > Thread pools. The thread pool page is shown in Figure 2. With the New button on this panel, you can create a new thread pool, for example "MyThreadPool".
Figure 2. Thread pools in WebSphere
Create a new transport chain
After this, you can create a new transport chain. Go to Application servers > server1 > Web container transport chains and click the New button (Figure 3). This opens a dialog where you specify the name of the transport chain and the port it listens on. This port is used to address the transport chain explicitly.
Figure 3. Transport chains in WebSphere
Once the new transport chain is defined, you follow the link to it as shown in Figure 4.
Figure 4. Transport chain settings
If you follow the "TCP inbound channel" link in the new transport chain, you can modify the thread pool used. Specify the thread pool you created.
Add a new port to virtual hosts
The last configuration step is to add the new TCP/IP port as a new host alias, as shown in Figure 5. Select Environment > Virtual hosts > default_host.
Figure 5. Virtual host Alias settings
Now, if you invoke Web services inside your specific module via the new port (in the example, it is 9087), WebSphere will use only threads from "MyThreadPool" to execute the service.
Conclusion
The simple truth is that if the rate at which a system processes requests is lower than the rate at which requests are injected into it during the same time interval, then some requests must be queued. This works only for a limited time; otherwise, the queues involved fill up and eventually overflow. Therefore, it is vital that:
- Queue sizes are large enough to allow the queues to buffer sufficient work.
- The system can process significantly more than the typical average load injected into the system. This reduces the number of overload situations and speeds up recovery if an overload situation evolves.
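Both points can be quantified with simple rate arithmetic. The sketch below assumes constant injection and processing rates, which is a simplification, and the function names and numbers are illustrative.

```python
def seconds_until_overflow(queue_capacity, current_depth, inject_rate, process_rate):
    """Time until the queue is full, or None if the queue is not growing."""
    growth = inject_rate - process_rate          # messages per second
    if growth <= 0:
        return None
    return (queue_capacity - current_depth) / growth


def seconds_to_drain(backlog, inject_rate, process_rate):
    """Time to work off a backlog, or None if there is no spare capacity."""
    surplus = process_rate - inject_rate         # messages per second
    if surplus <= 0:
        return None
    return backlog / surplus


# 120 requests/s arrive, 100 requests/s are processed, capacity 100,000 messages.
print(seconds_until_overflow(100_000, 0, 120, 100))   # 5000.0

# Recovery with a backlog of 100,000 messages and 80 requests/s still arriving:
print(seconds_to_drain(100_000, 80, 100))             # 5000.0
print(seconds_to_drain(100_000, 80, 160))             # 1250.0
```

The last two lines show the effect of spare capacity: the more the processing rate exceeds the average injection rate, the faster the system recovers after an overload.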
Depending on the scenario, such as the topology and the involved modules, services, and backend systems, it might be appropriate to throttle one of the following:
- Request injection into the system: The queues will build up in front of the system.
- Request processing inside the system: The queues will fill up inside the system.
- Request invocation of backend services: The backend input queues will fill up.
During system tuning, make sure that there is enough contingency in the system so that it can deal smoothly with load spikes. For the case of severe problems such as backend performance degradation, monitor your system continuously. Once you see an overload situation approaching, throttle the load in the system.
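Throttling request injection means accepting only a defined number of requests per time interval, ideally with a limit that can be changed at runtime. The following is a minimal fixed-window sketch of that idea. It is not the mediation-based throttler referenced in this article; the class name is hypothetical, and the clock is injectable only so that the example runs deterministically.

```python
import time


class IntervalThrottle:
    """Accepts at most max_requests per interval; the limit can change at runtime."""

    def __init__(self, max_requests, interval_seconds, clock=time.monotonic):
        self.max_requests = max_requests     # may be adjusted while running
        self._interval = interval_seconds
        self._clock = clock
        self._window_start = clock()
        self._count = 0

    def try_accept(self):
        now = self._clock()
        if now - self._window_start >= self._interval:
            # A new interval begins: reset the counter.
            self._window_start = now
            self._count = 0
        if self._count < self.max_requests:
            self._count += 1
            return True
        return False                         # caller should queue or reject


fake_now = [0.0]                             # simulated clock for the example
throttle = IntervalThrottle(max_requests=2, interval_seconds=1.0,
                            clock=lambda: fake_now[0])
first_window = [throttle.try_accept() for _ in range(3)]
print(first_window)                          # [True, True, False]

fake_now[0] = 1.0                            # next interval
throttle.max_requests = 1                    # tighten the limit at runtime
second_window = [throttle.try_accept() for _ in range(2)]
print(second_window)                         # [True, False]
```

Requests that are not accepted must wait somewhere, so a throttle of this kind only moves the queue in front of the system. It does not remove the need for sufficient queue capacity.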
Resources
- Operating a WebSphere Process Server environment, Parts 1 to 5
- Case study: Tuning WebSphere Application Server V7 and V8 for performance
- Best Practices: Tuning and Monitoring Database System Performance
- WebSphere Process Server Performance Tuning Worksheet
- WebSphere Process Server 6.1 – Tuning Automatic Processes
- IBM Redbook: WebSphere Business Process Management 6.2.0 Performance Tuning
- Just in time throttler and dispatcher for WebSphere ESB
- The WebSphere Process Server store and forward feature
- How to stop a messaging endpoint
- Configuring MDB or SCA throttling for the default messaging provider
- WebSphere Process Server throughput management, Part 2
- developerWorks WebSphere Process Server and WebSphere Integration Developer resource page