
Clustering WebSphere Process Server V6.0.2, Part 1: Understanding the topology

Basic steps guide: Install, configure, and set up WebSphere Process Server V6.0.2 clusters and the golden topology

Michele Chilanti (chilanti@us.ibm.com), Consulting IT Specialist, IBM
Michele Chilanti is a Consulting IT Specialist with the IBM Software Services organization. He has over 15 years of experience working with a variety of products of the IBM software portfolio. Currently, he consults on a daily basis with IBM customers in the areas of business process modeling, implementation, and deployment. Michele regularly presents at conferences worldwide, and has authored a number of IBM and external technical publications.
Sriram Madapura (sriramm@us.ibm.com), Senior IT Specialist, IBM
Sriram Madapura is a Managing Consultant with the IBM Software Services organization. Currently he consults in the area of process integration, implementation, and deployment using IBM WebSphere products.

Summary:  Set up a basic clustered IBM® WebSphere® Process Server V6.0.2 installation using a step-by-step approach for a reasonably simple, yet robust, clustered topology that improves availability and scalability. This two-part article series covers WebSphere Process Server V6.0.2 and updates the IBM WebSphere Developer Technical Journal article Basic steps for clustering WebSphere Process Server V6.0 (see Resources).

Date:  18 Apr 2007
Level:  Intermediate

Introduction

Clustering is a key technique that you can use to improve the availability and the scalability of a WebSphere Process Server environment. With clustering, you can:

  • Increase the system's availability by providing redundant Java™ Virtual Machine (JVM) processes or hardware components which can ensure some level of continuity of service in case of failures.
  • Provide a mechanism to accommodate additional workload (scalability) by making additional processes and systems available to run transactions.

The concepts of failover and scalability are largely independent. As a result, a topology that ensures scalability may not be very good at ensuring availability, and vice versa.

With WebSphere Process Server, you can use clustering techniques in many different ways to address availability and scalability.

This article is the first in a series that briefly presents a few topological solutions for WebSphere Process Server clustering and discusses the trade-offs of the various approaches. The second article of this series guides you through the steps to set up what we expect will become the most commonly adopted clustering topology for WebSphere Process Server Version 6.0.2.


Key topologies for WebSphere Process Server clustering

Macroscopically, every WebSphere Process Server environment involves three fundamental layers: WebSphere Process Server applications, one or more relational databases, and the messaging infrastructure, as shown in Figure 1:


Figure 1. Components to be clustered
  1. WebSphere Process Server applications include the process server infrastructure code, such as the Business Process Choreographer, and any user applications that exploit the process server functions. These applications require a WebSphere Process Server application server to be installed and run. Conceptually, clustering of WebSphere Process Server applications is not significantly different from clustering a plain J2EE application on a WebSphere Application Server V6 environment.
  2. One or more relational databases. WebSphere Process Server requires certain application configuration and runtime information to be stored in relational database tables. The messaging infrastructure, which is discussed next, also uses relational database tables for persistence. Clustering relational databases for scalability and availability is a well-established discipline, so we do not discuss those techniques here. Instead, this article discusses how to set up the databases and schemas needed to support a WebSphere Process Server cluster. Figure 1 shows the four groups of database objects that you deal with when clustering WebSphere Process Server (items 1 through 4 below); a fifth, the ESB mediation log database, is listed for completeness:
    1. The "common database" (WPRCSDB).
    2. The Business Process Choreographer database (BPEDB).
    3. The Messaging Databases (Messaging DB) that are required for message persistence.
    4. The database objects for the Common Event Infrastructure (CEI DB), where we store, primarily, the event data.
    5. The ESB Mediation Log database (EsbLogMedDb), that can be used if you want to log every message processed by an ESB mediation. This database schema contains a single table. Since this database is relevant to the functions of the WebSphere Enterprise Service Bus product, we are not going to deal with it in this article.
  3. Messaging infrastructure. WebSphere Process Server also requires a messaging infrastructure. Some of that messaging infrastructure must use WebSphere Service Integration Buses (SI Buses) and the WebSphere Default Messaging JMS Provider. In this article, we do not consider other messaging providers for any of the messaging requirements of the WebSphere Process Server infrastructure. We assume that you rely on the SI Buses entirely, which at this point is the recommended practice.

    Clustering the messaging infrastructure is perhaps the most complex aspect of the overall clustering discussion. In general, because the SI Bus requires a WebSphere Application Server to run, the messaging infrastructure can also be clustered using the WebSphere clustering techniques. However, there are a number of considerations that you need to understand when you select the topology to adopt. The next section discusses some of those considerations. Notice that there are four SI Buses, each of which needs to be taken into consideration when creating the clusters:
    1. Two buses for the Service Component Architecture (SCA) support (the SCA.SYSTEM and SCA.APPLICATION buses).
    2. One bus for the Business Process Choreographer (BPC SI Bus).
    3. One for the CEI asynchronous event publishing.

Clustering the messaging infrastructure

A WebSphere SI Bus allows you to define, among other things, "Destinations" (such as Queues or Topics) that applications can use to send or retrieve messages.

In order to make those destinations physically available, you have to designate an application server process (or cluster of processes) where the messaging infrastructure can be run. You do so by "adding a member" to the SI Bus.

A "member" can be either a single server of a Network Deployment cell, or a cluster of servers.

When you add a member to the bus, a "Messaging Engine" (ME) is also created on the member. The ME is the component in the application server process which implements the logic of messaging infrastructure itself.

After you add a cluster as a member of an SI Bus, each cluster member is capable of running the Messaging Engine. However, only one cluster member will have an active Messaging Engine at any given time. The high-availability policy that applies to the Messaging Engine is a "One-of-N" policy.
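The behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class and method names (MessagingEngine, BusMemberCluster, add_to_bus, fail) are hypothetical and are not WebSphere APIs, and the rule that the first running member in definition order becomes active is an assumption made for the example. The bus labels are the four buses listed earlier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MessagingEngine:
    """Hypothetical stand-in for a messaging engine (not a WebSphere API)."""
    bus: str
    active_on: Optional[str] = None  # cluster member currently running the engine


class BusMemberCluster:
    """A cluster that has been added as a member of one or more SI Buses."""

    def __init__(self, members: List[str]) -> None:
        self.members = list(members)   # preference order is an assumption
        self.running = set(members)    # members whose JVM is currently up
        self.engines: List[MessagingEngine] = []

    def add_to_bus(self, bus: str) -> MessagingEngine:
        """Adding the cluster to a bus creates one messaging engine for that bus."""
        engine = MessagingEngine(bus)
        self.engines.append(engine)
        self._activate(engine)
        return engine

    def _activate(self, engine: MessagingEngine) -> None:
        """One-of-N: the engine is active on exactly one running member, if any."""
        engine.active_on = next(
            (m for m in self.members if m in self.running), None)

    def fail(self, member: str) -> None:
        """Simulate a member crash and fail its engines over to a survivor."""
        self.running.discard(member)
        for engine in self.engines:
            if engine.active_on == member:
                self._activate(engine)


cluster = BusMemberCluster(["member1", "member2"])
engines = [cluster.add_to_bus(bus)
           for bus in ("SCA.SYSTEM", "SCA.APPLICATION", "BPC", "CEI")]
before = [e.active_on for e in engines]
cluster.fail("member1")
after = [e.active_on for e in engines]
print(before)
print(after)
```

In this model every engine has at most one active instance at any time, which is why a single engine provides availability but not additional messaging capacity.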

There are two key options when it comes to clustering Messaging Engines:

  1. You have a single messaging engine that gets created automatically when you add the cluster as a member of the SI Bus. As we explained, this operation creates a Messaging Engine which in turn uses a "One-of-N" policy for high-availability, resulting in a single instance of the Messaging Engine being active.

    In this case, there is only one physical repository for the messages associated with the destination. This scenario ensures availability; however, scalability can only be achieved by providing additional computing power to the server (essentially, by configuring the application server on more powerful hardware).



    Figure 2. An Active/Stand-by clustered topology for messaging

  2. Multiple Messaging Engines are active at the same time. After you have added the cluster to the SI Bus, which results in the creation of the first messaging engine, you can also manually create additional messaging engines on that cluster for that SI Bus. Each messaging engine will operate with a "One-of-N" policy, but since you have multiple engines, you can now have multiple active instances. You can create your own High Availability (HA) Policies to define where each active instance should run by default, and thus evenly distribute the active engines across the various cluster members.

    This solution, however, implies that the destinations on the SI Bus are partitioned. In other words, each instance of the ME controls and works with a portion of the entire queue; there is no longer a single physical repository of messages for a certain destination. Such a topology does ensure scalability and some degree of availability. However, there are three considerations that we would like you to keep in mind before you pursue this topology:
    1. Since there is no longer a single physical bucket of messages, these topologies are not always adequate when requirements such as preserving the sequencing of messages are to be enforced.
    2. Message consumers may be statically bound to one particular partition of the queue. If that's the case, you may end up with stranded messages that nobody is ready to consume if that particular consumer partition dies.
    3. These topologies are more complex to set up and administer than topologies where there is a single active ME at one time (non-partitioned).
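The first two considerations can be made concrete with a toy model. The sketch below is an illustration only: the function names are hypothetical, the round-robin distribution of messages across partitions is an assumption, and each consumer is assumed to be statically bound to a single partition.

```python
from collections import deque
from typing import Deque, Dict, List


def partition(messages: List[int], engines: List[str]) -> Dict[str, Deque[int]]:
    """Spread messages over one partition per engine (round-robin is an assumption)."""
    queues: Dict[str, Deque[int]] = {name: deque() for name in engines}
    for index, message in enumerate(messages):
        queues[engines[index % len(engines)]].append(message)
    return queues


def consume(queues: Dict[str, Deque[int]], live_consumers: List[str]) -> List[int]:
    """Each live consumer drains only the partition it is bound to."""
    delivered: List[int] = []
    for name in live_consumers:
        while queues[name]:
            delivered.append(queues[name].popleft())
    return delivered


messages = [1, 2, 3, 4, 5, 6]

# Both consumers alive: every message is delivered, but not in send order.
delivered = consume(partition(messages, ["ME1", "ME2"]), ["ME1", "ME2"])

# The consumer bound to the ME2 partition has died: its messages are stranded.
queues = partition(messages, ["ME1", "ME2"])
survivors = consume(queues, ["ME1"])
stranded = list(queues["ME2"])

print(delivered, survivors, stranded)
```

Because no single queue holds all the messages, the delivery order depends on how the consumers interleave, and messages in a partition without a live consumer stay where they are.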

We suggest using the topology depicted in Figure 3 only when it is proven that the messaging infrastructure is indeed the bottleneck of the solution:


Figure 3. An Active/Active clustered topology with partitioned queues

Now that you understand the two fundamental options to cluster a messaging engine, we can discuss where the messaging engine can be located with respect to the WebSphere Process Server applications.

There are two options here, too:

  1. The messaging engine is "co-located" with the WebSphere Process Server applications. In other words, the messaging engine runs within the same cluster as the WebSphere Process Server applications.
  2. The messaging engine is located in its own cluster, separate from the WebSphere Process Server applications.

Combining these two placement options with the two queue options (non-partitioned or partitioned) leads to four possible combinations:

  1. Messaging engine and WebSphere Process Server applications are located in the same cluster and the queues are non-partitioned. In this case, any message-driven bean (MDB) that runs in the WebSphere Process Server applications is forcibly "bound" to the local Messaging Engine. This is a design point of the SI Bus: if an MDB has a local ME available, it always connects to it, even if the ME is inactive. Because of this, only one cluster member has active instances of the MDB. This option seriously limits the overall scalability of the solution if you use interruptible (long-running) business processes or asynchronous SCA invocations. These topologies, however, may be simpler to set up and manage. As a rule, we recommend that you do not use these topologies unless you are positive that one of the following is true:
    1. You will never use long-running processes or asynchronous SCA calls.
    2. You do use long-running processes or asynchronous SCA calls, but you are not concerned that all but one of the cluster members are essentially in stand-by.

      Figure 4. The Simplest Clustering Topology: Active/Stand by



      This topology is simple to set up and requires only a small number of servers and a single cluster. However, as we mentioned, it has a major disadvantage that may strongly limit the scalability of such a configuration.
  2. WebSphere Process Server Applications and Messaging Engines are located in the same cluster, but the queues are partitioned.

    This topology is a variation of the previous one, and is shown in Figure 5.



    Figure 5. An Active/Active topology with partitioned queues



    This topology has the advantage (over the previous one) of providing a scalable environment where multiple MDBs are active at the same time, albeit on different partitions of the same queues. Let's review the trade-offs of this topological choice:

    1. Messaging Engines scale up with the number of cluster members. This is a plus.
    2. Also, the fact that each ME is located in the same server as the MDB that consumes messages from it minimizes latency. This is also a plus.
    3. Destinations are partitioned, making it a difficult environment to administer, and posing issues on preserving message sequencing. This is a disadvantage.
    4. True failover of the Messaging Engines is not guaranteed by this solution. If, for example, ME1 crashed on the server where it is currently active, the HA Manager would indeed activate an ME on a surviving server (let's say, the server where ME2 is active). However, no consumers would be able to use it on that server. Any messages contained in the first partition at the time of the crash would be "stranded".
  3. Messaging engine and WebSphere Process Server Applications are located in separate clusters. This is the topology we recommend you adopt, whenever practically possible.
    1. Non-partitioned queues, active/stand-by topology.

      Figure 6. Segregating the messaging engines in a separate cluster



      This topology allows you to achieve true scalability of the WebSphere Process Server applications, because it allows multiple MDBs to be active at the same time.

      In addition, since the queues are non-partitioned, there are no particular restrictions as to what kind of applications can run on the WebSphere Process Server cluster.

      It also allows you to tune and configure the WebSphere Process Server cluster independently of the Messaging Engine Cluster.

      The only caveat is that the messaging engine can scale only by being placed on a more powerful system.
    2. Partitioned queues, active/active topology.

      Figure 7. Queues for maximum messaging scalability



      This topology allows full scalability and separate administration of the WebSphere Process Server cluster and of the ME cluster.

      However, because of the partitioned destinations, there are some considerations that you need to keep in mind as to issues such as preserving the sequencing when processing messages, and potential imbalance of the workload, depending on how the various partitions are populated. The setup of such topologies is also a complex task.

      Because of the added complexity, we recommend adopting partitioned destinations when it is proven that the key limiting factor of throughput is the messaging engine, and with a thorough understanding of the limitations on the applications.
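The trade-offs of the four combinations can be summarized in a few lines of code. The following is a sketch that encodes only the rules stated above; the type and function names are hypothetical, and the model assumes one active messaging engine per cluster member in the partitioned cases.

```python
from typing import NamedTuple


class Combination(NamedTuple):
    colocated: bool     # messaging engines share the application cluster
    partitioned: bool   # destinations are split across several active engines


def active_mdb_members(combo: Combination, app_members: int) -> int:
    """How many application cluster members can run active MDB instances."""
    if combo.colocated and not combo.partitioned:
        # MDBs bind to the local engine, which is active on one member only.
        return 1
    return app_members


def messaging_scales_out(combo: Combination) -> bool:
    """Only partitioned destinations allow more than one active engine."""
    return combo.partitioned


def sequencing_is_simple(combo: Combination) -> bool:
    """A single physical repository of messages keeps sequencing straightforward."""
    return not combo.partitioned


COMBINATIONS = {
    "1. co-located, non-partitioned": Combination(True, False),
    "2. co-located, partitioned": Combination(True, True),
    "3.1 separate, non-partitioned": Combination(False, False),
    "3.2 separate, partitioned": Combination(False, True),
}

for name, combo in COMBINATIONS.items():
    print(name,
          active_mdb_members(combo, app_members=2),
          messaging_scales_out(combo),
          sequencing_is_simple(combo))
```

Only combination 3.1 lets every application member run active MDBs while keeping a single repository of messages per destination, which matches the recommendation above.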


Classifying the topologies

Now that you understand what the options are with regard to the relative configuration of WebSphere Process Server components and of the Messaging Engines, let's classify the various topologies.


Table 1. Topology classifications
Topology name: Description

Bronze: This topology includes one cluster that runs all the components:
  1. WebSphere Process Server applications
  2. Messaging Engines (one per SI Bus)
  3. CEI

As we said, this topology may not ensure scalability if you use long-running Business Process Execution Language (BPEL) processes or asynchronous SCA interactions.

Silver: This topology includes two separate clusters:
  1. One cluster where we install the WebSphere Process Server applications and the CEI Event Server.
  2. One cluster for the messaging engines.

This topology ensures that the WebSphere Process Server applications and the CEI Event Server "scale up" with the number of physical servers assigned to their cluster. This topology is a very reasonable choice for most small to medium size environments.

Gold: This topology introduces a third cluster that is dedicated (essentially) to running the CEI Event Server.
You end up with three clusters:
  1. One cluster for WebSphere Process Server applications.
  2. One for the messaging engines.
  3. One for the CEI Event Server.

This topology is recommended when volumes are high and scalability is a key requirement. By decoupling the CEI Event Server and the WebSphere Process Server applications, you ensure that these two components do not compete for the same resources (memory and CPU).
This topology is also conducive to creating a centralized event server. You may envision events coming from multiple sources being processed by this event server. The CEI cluster can also be isolated on its own set of physical boxes.

In this document, we are going to discuss how to set up the Gold topology, by setting up three server clusters.
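The three classifications can also be written down as data, which makes it easy to check which components compete for the same cluster. This is a minimal sketch: the component labels ("apps", "messaging", "cei") and the helper names are hypothetical shorthand for the components in Table 1.

```python
from typing import Dict, List, Set

# Each topology is a list of clusters; each cluster is the set of components it runs.
TOPOLOGIES: Dict[str, List[Set[str]]] = {
    "Bronze": [{"apps", "messaging", "cei"}],
    "Silver": [{"apps", "cei"}, {"messaging"}],
    "Gold": [{"apps"}, {"messaging"}, {"cei"}],
}


def cluster_count(topology: str) -> int:
    """Number of clusters the topology requires."""
    return len(TOPOLOGIES[topology])


def share_cluster(topology: str, first: str, second: str) -> bool:
    """True if both components run in the same cluster and so share its resources."""
    return any(first in cluster and second in cluster
               for cluster in TOPOLOGIES[topology])


for name in TOPOLOGIES:
    print(name,
          cluster_count(name),
          share_cluster(name, "apps", "messaging"),
          share_cluster(name, "apps", "cei"))
```

Reading the output row by row shows the progression: Bronze co-locates everything, Silver separates the messaging engines, and Gold also separates the CEI Event Server.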


Clustering the relational databases

As we briefly mentioned above, you can ensure the high availability and scalability of a relational database platform through a number of known and proven techniques. We consider this topic to be outside the scope of this article.


Target topology

For our sample scenario, we have adopted the golden topology with three separate clusters, one for the WebSphere Process Server applications and one each for the Messaging Engine and CEI Event Server.

For the gold topology, the recommended configuration comprises at least three systems and a separate box (or cluster of boxes) to host the databases:

  1. One system (the smallest), would be dedicated to running the Deployment Manager. You would only create the WebSphere Process Server Deployment Manager profile on this system.
  2. On the other two systems, you would create a WebSphere Process Server Custom Profile and add each profile to the Deployment Manager cell. Each profile includes a node.

You would then create three clusters, across both systems, as indicated in Figure 8:


Figure 8. The Gold Topology with three systems for WebSphere Process Server


As a variation of the topology in Figure 8, you could install the Deployment Manager on one of the two nodes, but keep in mind that this might make backup and recovery more complicated.

If you have more than three systems, you could also attempt to minimize the licensing requirements of your topology by using a plain WebSphere Application Server profile for the Messaging Engines (the Messaging Engines do not require running in a WebSphere Process Server server and can run on a node where plain WebSphere Application Server is installed).


Figure 9. A more complex "Gold" topology


Figure 9 shows you a topology where we have created the following:

  1. WebSphere Process Server Deployment Manager profile on System A
  2. WebSphere Process Server Custom Profile on System B and System C
  3. WebSphere Application Server Custom Profile on System D and System E

We then federated the four nodes to the Deployment Manager and configured the Messaging Engines only on System D and System E.

The topology illustrated in this article "mimics" what we show in Figure 9. We didn't use as many systems, but we did create a WebSphere Process Server Deployment Manager profile, a WebSphere Process Server Custom Profile, and a WebSphere Application Server Custom Profile.

Below (Figure 10) is a picture that summarizes the target topology: On the box ISSW1, we installed the Deployment Manager and the WebSphere Process Server Custom profiles. On the box ISSW2, we created the WebSphere Application Server Custom Profile.

On a third box, we have DB2® and the databases needed for the topology. This topology eliminates the single point of failure that results from creating a cluster that resides on a single physical box.


Figure 10. The topology described in the step-by-step section


Of course, in this example we used "vertical clustering" (creating cluster members on the same physical box) to demonstrate how to set up the topology. In a real-life environment, vertical clustering is almost never used alone, because it provides no failover if the physical box crashes.
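The difference between vertical and horizontal clustering can be checked mechanically. The sketch below is an illustration under stated assumptions: the member and box names are hypothetical, and a cluster is modeled simply as a mapping from each member to the physical box that hosts it.

```python
from typing import Dict, Set


def boxes_used(cluster: Dict[str, str]) -> Set[str]:
    """Physical boxes that host at least one member of the cluster."""
    return set(cluster.values())


def survives_box_failure(cluster: Dict[str, str]) -> bool:
    """True if at least one member remains no matter which single box crashes."""
    return len(boxes_used(cluster)) >= 2


# Vertical clustering: both members share one box (hypothetical names).
vertical = {"member1": "boxA", "member2": "boxA"}

# Horizontal clustering: members are spread across two boxes.
horizontal = {"member1": "boxA", "member2": "boxB"}

print(survives_box_failure(vertical))
print(survives_box_failure(horizontal))
```

A vertical cluster still protects against the failure of a single JVM process, but only a cluster that spans at least two boxes keeps a member running when a box is lost.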


High-level steps for installing the cluster (golden topology)

Below is a list of the high-level steps involved in installing and configuring a cluster that implements the topology shown in Figure 10.

  1. Install product binaries and fix packs on all the machines.
  2. Create the profiles.
    1. Federate the nodes.
  3. Create the ME Cluster.
    1. Use the new Business Integration Wizard, which creates and configures the SI Buses required by the SCA runtime.
  4. Create the WebSphere Process Server Cluster.
    1. Use scripts to create BPC database.
    2. Use the new Business Integration Wizard, which configures the Business Process Container and the Human Task Container.
  5. Create the CEI Cluster.
    1. Perform a manual install of CEI Applications.
  6. Perform the manual changes to Data Sources.

Conclusion

So far we have discussed, at a high level, some of the topological options and some of the trade-offs you need to deal with when planning to cluster WebSphere Process Server. We have also described a commonly adopted clustering scheme, the "golden topology", and we outlined the steps needed to implement it.

The second article of this two-part series walks you through each individual step, from installing the product, to configuring the clusters and the related resources.


Resources
