Why are messages piling up in Process Server queues and not getting processed?
BinHu
You have an SCA business flow running inside WebSphere Process Server (WPS), and you use Service Integration Bus (SIBus) or WebSphere MQ to integrate your business systems. Intermittently, messages pile up in Process Server's internal queues and are either not picked up or processed very slowly. Why does this happen, and what should you do?
Locating the problem application:
1. First, find out where the messages are getting stuck by checking your queues. Depending on which messaging engine you use, the queues may be defined on SIBus or WebSphere MQ. Work with your application team to understand where the queues are defined. In this article, we suppose the messages are piled up in a WebSphere MQ queue named BSCS
2. Next, trace back which application uses this queue. This is usually done in the WebSphere Application Server (WAS) admin console under Resources -> JMS -> Queues. Find the queue definition on this page and note its JNDI name. In this case, it is jms/
3. Then review the application design in WebSphere Integration Developer (WID) or IBM Integration Designer (IID), and check the MQ binding end-point configuration to find which module and component references this resource by its JNDI name. In this case, it is the MQ Export Binding IF_A
You can see that this is a receive queue, which means this queue is used to place the incoming messages from the upstream MQ server.
Now you have a rough picture: a large number of messages are piling up in the receive queue (entry queue) of the Process Server MQ Export Binding, and Process Server cannot consume them quickly enough. The backlog keeps growing, and Process Server appears to be processing more slowly than expected.
So for this problem, what are some of the possible causes?
Cause 1: A sudden flood of data from the upstream system into Process Server
The first thing to check is whether your upstream system suddenly sent a large number of messages to Process Server. Talk to the upstream application team that designed the application putting messages on this queue. Is it possible that the application put a large number of messages on the queue in a very short time? Ask whether you can get the application log so that you can measure the rate at which messages were being put.
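As a minimal sketch of that log check, the following assumes you have already extracted one ISO-8601 timestamp per message put from the upstream application log. The log format, the function names, and the "three times the median" burst threshold are all illustrative assumptions, not something WebSphere MQ or Process Server provides.

```python
import statistics
from collections import Counter
from datetime import datetime


def puts_per_minute(timestamps):
    """Count message puts per minute from ISO-8601 timestamp strings."""
    counts = Counter(
        datetime.fromisoformat(ts).replace(second=0, microsecond=0)
        for ts in timestamps
    )
    return dict(sorted(counts.items()))


def burst_minutes(per_minute, factor=3.0):
    """Return the minutes whose put count exceeds factor * median count.

    The factor is an arbitrary illustrative threshold; pick one that
    matches what "normal" load looks like for your application.
    """
    if not per_minute:
        return []
    baseline = statistics.median(per_minute.values())
    return [minute for minute, n in per_minute.items() if n > factor * baseline]
```

If the minutes flagged by `burst_minutes` line up with the times the queue depth started growing, the upstream workload is a likely cause.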
Cause 2: WebSphere MQ or the network may have gone down and recovered from an incident
The second thing to check is whether your WebSphere MQ server has been running well the whole time. Is it possible that the MQ queue manager or a channel went down for some reason, blocking messages from being put, and that once it was restored the retry mechanism suddenly put a large number of messages on the queue at high speed? Also, is it possible that the network has slowed down, causing long latency?
Cause 3: Process Server was not tuned to handle the concurrency required by a sudden peak load
For JMS applications, an Activation Spec is used to monitor the target destination. The Activation Spec connects to the remote WebSphere MQ queue manager and opens the destination to monitor for new messages. In this case, our Activation Spec jms/
You can see that the Activation Spec is monitoring a destination jms/
How can we tune MQ binding resources for the best concurrent processing?
For JMS applications that connect to Process Server with WebSphere MQ as the provider, there are several places to tune.
1. Max Server Sessions in Activation Spec
The Activation Spec has an associated server session pool, and its size controls the number of messages that the Message Driven Bean (MDB) can process concurrently. The default size of the server session pool is 10, which means that up to 10 messages can be processed at the same time by the Activation Spec's associated MDB. In this case, we found the value set to 3, so no more than 3 messages can be processed at a time. Consider increasing this value to improve the concurrent processing capability.
What happens if an Activation Spec tries to hand a message to the Message Driven Bean but all server sessions are already busy processing messages? In this situation, the Activation Spec blocks until a server session becomes free. As soon as a server session is available, the Activation Spec loads it with the message reference and then schedules a new piece of work so that the server session can run again. During that period, messages pile up in the receive queue, and it looks as if Process Server is unable to process them.
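To get a feel for how much the server session pool size matters, here is a rough back-of-the-envelope sketch. It assumes each server session processes one message at a time, that the average processing time per message is constant, and that nothing else (thread pool, downstream components, the MQ server) is limiting. The function names and the sample numbers are illustrative assumptions only.

```python
import math


def max_throughput(max_server_sessions, avg_seconds_per_message):
    """Upper bound on messages per second that the MDB can consume."""
    return max_server_sessions / avg_seconds_per_message


def seconds_to_drain(backlog, arrival_rate, max_server_sessions,
                     avg_seconds_per_message):
    """Estimate how long a backlog takes to clear.

    arrival_rate is the incoming messages per second during the drain.
    Returns math.inf when messages arrive at least as fast as they can
    be consumed, because the backlog then never shrinks.
    """
    net_rate = max_throughput(max_server_sessions,
                              avg_seconds_per_message) - arrival_rate
    if net_rate <= 0:
        return math.inf
    return backlog / net_rate
```

For example, with 0.5 seconds per message and 4 messages arriving per second, a backlog of 1000 messages takes about 500 seconds to drain with 3 server sessions and about 62.5 seconds with 10. With 3 sessions and 6 messages arriving per second, the backlog never drains under these assumptions.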
The application server has a dedicated thread pool named WMQJ
What happens if there are not enough threads in the thread pool to work on message delivery? Message processing is delayed in this case as well. For the detailed reasons why this happens, please see the references at the bottom of this post.
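A simple sanity check for thread pool sizing can be sketched as follows. It assumes, as a simplification, that every Activation Spec on the server draws its threads from the same pool and that each busy server session holds one thread, so the pool should be at least as large as the sum of the Max Server Sessions values. The spec names and numbers are hypothetical.

```python
def thread_shortfall(max_server_sessions_by_spec, thread_pool_max):
    """Return how many threads are missing if every spec is fully busy.

    max_server_sessions_by_spec maps an Activation Spec name to its
    Max Server Sessions value. A result of 0 means the pool is large
    enough under the stated assumption.
    """
    needed = sum(max_server_sessions_by_spec.values())
    return max(0, needed - thread_pool_max)


# Hypothetical values for illustration only.
specs = {"specA": 10, "specB": 10, "specC": 5}
shortfall = thread_shortfall(specs, thread_pool_max=20)
```

A non-zero shortfall suggests that raising Max Server Sessions alone will not help, because the extra sessions would wait for a thread.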
Increasing the concurrency at the MQ Export Binding entry point means that more messages flow into Process Server. Do not tune only the MQ Export Binding resources; also consider the components that follow the MQ binding, such as BPEL processes and MQ Import Bindings, so that messages do not simply pile up at the next bottleneck. Increase the parameters gradually and monitor the overall server throughput rather than setting a large value all at once. Remember that the goal is to increase the overall concurrent processing capability of Process Server, not just that of the MQ Export Binding, so consider all components together.
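The idea that the slowest component sets the overall rate can be sketched like this. The stage names and the capacity figures (messages per second) are made-up values for illustration; in practice you would measure them in your own environment.

```python
def bottleneck(stage_capacity):
    """Return (stage name, capacity) of the slowest stage in the chain.

    stage_capacity maps each stage to the messages per second it can
    sustain. End-to-end throughput cannot exceed the smallest value.
    """
    name = min(stage_capacity, key=stage_capacity.get)
    return name, stage_capacity[name]


# Hypothetical capacities before and after tuning only the export binding.
before = {"MQ Export Binding": 6.0, "BPEL process": 8.0, "MQ Import Binding": 15.0}
after = dict(before, **{"MQ Export Binding": 20.0})
```

Here `bottleneck(before)` points at the MQ Export Binding, while `bottleneck(after)` points at the BPEL process: the end-to-end rate rises only from 6 to 8 messages per second, and the backlog moves downstream.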
This article first described a simple way to trace the problem back from the queue to the JMS resources and finally to the application design. We then discussed several possible reasons why messages can pile up in an MQ queue without being processed by Process Server. Consider the upstream application first, because we have seen this kind of situation many times: Process Server is doing nothing wrong, but an upstream application sends bad data or a massive amount of data, which leaves processing pending on the Process Server side. Next, consider the environment: a firewall, the network, third-party software, or WebSphere MQ server issues can all cause messages to pile up. Finally, if you believe the concurrent processing capability of Process Server needs tuning, you can follow the steps above to tune the Max Server Sessions of the Activation Spec in question and the max size of the WMQJ
3. IBM Business Process Manager V7.5 Performance Tuning and Best Practices (see Section 4.3.3 Message-driven bean ActivationSpec)