Direct routed publish/subscribe cluster performance

In direct routed publish/subscribe clusters, information such as clustered topics and proxy subscriptions is pushed to all members of the cluster, irrespective of whether all cluster queue managers are actively participating in publish/subscribe messaging. This process can create a significant additional load on the system. To reduce the effect of cluster management on performance, you can perform updates at off-peak times, define a much smaller subset of queue managers that are involved in publish/subscribe and make that subset an "overlapping" cluster, or switch to topic host routing.

There are two sources of workload on a queue manager in a publish/subscribe cluster:
  • Directly handling messages for application programs.
  • Handling messages and channels needed to manage the cluster.

In a typical point-to-point cluster, the cluster system workload is largely limited to information that members of the cluster explicitly request when they need it. Therefore, in anything other than a very large point-to-point cluster (for example, one that contains thousands of queue managers), you can largely discount the performance effect of managing the cluster. However, in a direct routed publish/subscribe cluster, information such as clustered topics, queue manager membership, and proxy subscriptions is pushed to all members of the cluster, irrespective of whether all cluster queue managers are actively participating in publish/subscribe messaging. This can create a significant additional load on the system. Therefore, you need to consider the effect of cluster management on queue manager performance, in terms of both when the management work occurs and how much of it there is.

Performance characteristics of direct routed clusters

Compare how a point-to-point cluster and a direct routed publish/subscribe cluster handle the core management tasks.

First, a point-to-point cluster:

  1. When a new cluster queue is defined, the destination information is pushed to the full repository queue managers, and only sent to other cluster members when they first reference a cluster queue (for example, when an application attempts to open it). This information is then cached locally by the queue manager to remove the need to remotely retrieve the information each time the queue is accessed.
  2. Adding a queue manager to a cluster does not directly affect the load on other queue managers. Information about the new queue manager is pushed to the full repositories, but channels to the new queue manager from other queue managers in the cluster are only created and started when traffic begins to flow to or from the new queue manager.

In summary, the load on a queue manager in a point-to-point cluster is related to the message traffic it handles for application programs and is not directly related to the size of the cluster.

Second, a direct routed publish/subscribe cluster:

  1. When a new cluster topic is defined, the information is pushed to the full repository queue managers, and from there directly to all members of the cluster, causing channels to be started to each member of the cluster from the full repositories if not already started. If this is the first direct clustered topic, each queue manager member is sent information about all other queue manager members in the cluster.
  2. When a subscription is created to a cluster topic on a new topic string, the information is pushed directly from that queue manager to all other members of the cluster immediately, causing channels to be started to each member of the cluster from that queue manager if not already started.
  3. When a new queue manager joins an existing cluster, information about all clustered topics (and all queue manager members if a direct cluster topic is defined) is pushed to the new queue manager from the full repository queue managers. The new queue manager then synchronizes knowledge of all subscriptions to cluster topics in the cluster with all members of the cluster.

In summary, cluster management load at any queue manager in a direct routed publish/subscribe cluster grows with the number of queue managers, clustered topics, and changes to subscriptions on different topic strings within the cluster, irrespective of the local use of those cluster topics on each queue manager.

In a large cluster, or one where the rate of change of subscriptions is high, this level of cluster management can be a significant overhead across all queue managers.
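The difference in scaling described above can be illustrated with a rough counting model. This is only a sketch under stated assumptions, not a description of how the queue manager accounts for its work: it assumes that each new cluster queue definition is pushed once to each full repository, and that each subscription on a new topic string is pushed once to every other cluster member. The function names and the figure of two full repositories are hypothetical and chosen for illustration.

```python
# Simplified counting model of cluster management pushes.
# Assumption: illustration only; real traffic also depends on channel
# state, resynchronization, and topic definitions.


def point_to_point_pushes(full_repositories: int, new_queues: int) -> int:
    """Pushes caused by defining new cluster queues.

    Assumption: each definition is pushed only to the full repositories;
    other members fetch it later, on first reference.
    """
    return full_repositories * new_queues


def direct_routed_pushes(members: int, new_topic_strings: int) -> int:
    """Pushes caused by one queue manager subscribing on new topic strings.

    Assumption: each new topic string is pushed once to every other member.
    """
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return (members - 1) * new_topic_strings


if __name__ == "__main__":
    # Five new definitions or topic strings, for clusters of growing size.
    for members in (10, 100, 1000):
        print(members,
              point_to_point_pushes(2, 5),
              direct_routed_pushes(members, 5))
```

In this model the point-to-point figure stays constant as the cluster grows, whereas the direct routed figure grows with the number of members, which matches the two summaries given earlier.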

Reducing the effect of direct routed publish/subscribe on performance

To reduce the effect of cluster management on the performance of a direct routed publish/subscribe cluster, consider the following options:
  • Perform cluster, topic, and subscription updates at off-peak times of the day.
  • Define a much smaller subset of queue managers that are involved in publish/subscribe, and make that subset an "overlapping" cluster. Define your cluster topics in this smaller cluster only. Although some queue managers are now in two clusters, the overall effect of publish/subscribe is reduced:
    • The size of the publish/subscribe cluster is smaller.
    • Queue managers not in the publish/subscribe cluster are much less affected by cluster management traffic.
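
As a rough illustration of why a smaller overlapping cluster helps, the following sketch compares the per-subscription fan-out before and after the split. It assumes, as a simplification, that a subscription on a new topic string is pushed once to every other member of the cluster in which the topic is defined. The function names and cluster sizes are hypothetical.

```python
# Simplified model of the saving from a smaller publish/subscribe cluster.
# Assumption: illustration only; one push per other member per new topic string.


def pushes_per_new_topic_string(cluster_size: int) -> int:
    """Members that receive a proxy subscription for one new topic string."""
    return max(cluster_size - 1, 0)


def overlap_saving(total_members: int, pubsub_members: int) -> float:
    """Fraction of per-subscription pushes avoided by defining cluster topics
    in a smaller overlapping cluster instead of the full cluster."""
    if not 0 < pubsub_members <= total_members:
        raise ValueError("pubsub_members must be between 1 and total_members")
    before = pushes_per_new_topic_string(total_members)
    after = pushes_per_new_topic_string(pubsub_members)
    return 0.0 if before == 0 else 1 - after / before


if __name__ == "__main__":
    # 200 queue managers in total, 20 of which take part in publish/subscribe.
    print(pushes_per_new_topic_string(200), pushes_per_new_topic_string(20))
    print(round(overlap_saving(200, 20), 3))
```

In this example, each new topic string reaches 19 queue managers instead of 199, and the queue managers outside the smaller cluster receive none of that traffic.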

If the previous options do not adequately resolve your performance issues, consider using a topic host routed publish/subscribe cluster instead. For a detailed comparison of direct routing and topic host routing in publish/subscribe clusters, see Designing publish/subscribe clusters.