How WebSphere Application Server V6 handles poison messages
IBM WebSphere Application Server Version 6.x provides support for asynchronous messaging based on the Java™ Message Service (JMS) specification. Using either the default messaging provider or IBM WebSphere MQ, you can write message-driven beans (MDBs) that listen on a destination (either a message queue or a topic). When a message arrives at the destination, the MDB is invoked and its onMessage() method is called. If a "poison message" is delivered to an MDB, the application can choose to reject it. In this situation, what happens to the message and how does the application server behave?
This article assumes a basic knowledge of JMS.
What is a poison message?
A poison message is simply a message that the receiving MDB application is unable to process. It could be that the message has become corrupt, is in an unexpected format, or contains information that cannot be handled by the MDB's business logic. For example, suppose you have an MDB that processes book orders. If the MDB receives an order for a book that doesn’t exist, the message could be considered a poison message.
If a poison message is delivered to an MDB, the bean can do one of three things:
- Roll back the message to the destination that it came from. This can be done if the MDB is running within a transaction, and ensures that the message is not lost. Returning the message to its original destination will give the MDB the chance to process the message again. This is useful if the application was unable to handle the message due to a temporary problem, such as a database being unavailable. To roll back the message, the MDB should call the setRollbackOnly() method on the message-driven context associated with the bean.
- Move the message to a different destination. This is particularly useful when the MDB is not running inside a transaction, as it prevents the poison message from being lost. A systems administrator can examine the message at a later date to find out why it could not be processed, and potentially move it back to the destination being monitored by the MDB so that it can be reprocessed.
- Discard the message, by doing nothing. This means that the message is gone forever.
It is the responsibility of the MDB application to determine if it has received a poison message, and how it should be handled. There is no way for the JMS provider or the application server to determine if a message is corrupt or cannot be processed.
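The three choices can be sketched in plain Java. This is only an illustration under stated assumptions: RollbackContext is a hypothetical stand-in for javax.ejb.MessageDrivenContext (so that the sketch runs without an application server), BookOrderHandler and parkedOrders are invented names, and the policy for choosing between the three options is an assumption rather than something the application server prescribes.

```java
// Hypothetical stand-in for javax.ejb.MessageDrivenContext, so that the
// sketch compiles and runs without an application server.
interface RollbackContext {
    void setRollbackOnly();
}

final class BookOrderHandler {
    enum Outcome { PROCESSED, ROLLED_BACK, MOVED, DISCARDED }

    // Null when the bean is not running inside a transaction.
    private final RollbackContext context;
    // Stand-in for a second destination that an administrator can inspect.
    final java.util.List<String> parkedOrders = new java.util.ArrayList<String>();

    BookOrderHandler(RollbackContext context) {
        this.context = context;
    }

    /** Decides what to do with one order; the policy itself is an assumption. */
    Outcome handle(String bookTitle, boolean bookExists, boolean databaseUp) {
        if (bookExists && databaseUp) {
            return Outcome.PROCESSED;
        }
        if (!databaseUp && context != null) {
            // Temporary problem inside a transaction: roll back for redelivery.
            context.setRollbackOnly();
            return Outcome.ROLLED_BACK;
        }
        if (bookTitle != null) {
            // Keep the message somewhere else so that it is not lost.
            parkedOrders.add(bookTitle);
            return Outcome.MOVED;
        }
        // Nothing worth keeping: doing nothing discards the message.
        return Outcome.DISCARDED;
    }
}
```

In a real MDB the same setRollbackOnly() call would be made on the message-driven context that the container passes to the bean.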
Rolling back a poison message
An MDB running inside of a transaction can choose to roll back a message that cannot be processed. What does the application server do in this situation? The answer depends on what JMS provider is being used.
Using the default messaging provider
MDBs that are configured to use the default messaging provider monitor queues or topic spaces hosted by a service integration bus. When messages arrive on the queue or are published on the topic, they are delivered to the MDB. The behaviour of the application server when an MDB rolls back a message depends on the values of four properties:
The JMS message property Redelivery count indicates the number of times a JMS message has been delivered to an application. This property is incremented if an MDB has rejected the message after delivery.
The JMS destination property Maximum failed deliveries specifies the number of times a message on a destination will be delivered to an MDB before it is moved to the exception destination that has been defined for this destination. The default value of the property is 5, which means that if a message is rolled back 5 times, the application server will move it to a different location. This property can be changed on the destination’s configuration panel in the WebSphere Application Server administrative console (Figure 1).
Figure 1. Maximum failed deliveries property for the TestDestination
The JMS destination property Exception destination tells the application server what to do with poison messages that have been rolled back the number of times specified in the Maximum failed deliveries property. The Exception destination property can have one of three values:
- System: Route messages to the system exception destination _SYSTEM.Exception.Destination.<messaging engine name>
- None: Leave the message on the original destination.
- Specify: Move the message to a user-specified exception destination.
The default value for this property is "System," so any message that has been rolled back the number of times specified by Maximum failed deliveries will be moved to the system exception destination defined for the messaging engine that hosts the destination. Like the Maximum failed deliveries property, Exception destination can be changed via the destination’s configuration panel in the WebSphere Application Server administrative console (Figure 2).
Figure 2. Exception destination for the TestDestination is set to "System"
The messaging engine custom property sib.processor.blockedRetryTimeout tells the application server how long to wait before redelivering a poison message to an MDB, if and only if the Exception destination property is set to "None." This property is only available with WebSphere Application Server Versions 6.0.2.x and 6.1, and its default value is different for each version:
- In Version 6.0.2.x, the default value for this property is 0 milliseconds, which means any messages that are rolled back will be immediately redelivered to an MDB.
- In Version 6.1, the default value is 5000 milliseconds, which equates to 5 seconds. Any messages that are rolled back with Version 6.1 will therefore remain on the original destination for 5 seconds before they are redelivered to an MDB.
When a message is rolled back, its Redelivery count is incremented and compared against the value of Maximum failed deliveries for the destination from which the message originally came.
If the Redelivery count is less than the Maximum failed deliveries, the message is returned to the destination so that it can be reprocessed.
If the Redelivery count is equal to or greater than Maximum failed deliveries, the messaging engine will either move the message to the Exception destination specified, or wait for the period of time specified by sib.processor.blockedRetryTimeout before attempting to deliver it again, if the Exception destination is set to None. This behaviour is shown in Figure 3.
Figure 3. How the default messaging provider handles poison messages
The default behaviour
By default, the Maximum failed deliveries property has the value 5, and the Exception destination is set to System. If these default values are used, what happens when a poison message arrives on a destination and is then delivered to an MDB?
Since the MDB is unable to process the message, it rolls it back, which causes the Redelivery count to be increased to 1. The messaging engine returns the message to the destination, since the Redelivery count is less than the Maximum failed deliveries for the destination.
The MDB receives the message again, but is still unable to process it, so the MDB performs a rollback as before. The Redelivery count of the message is now set to 2, which is still less than Maximum failed deliveries for the destination, so the messaging engine puts the message back where it originated.
This pattern repeats until the message has been rolled back 5 times.
Now, the value of the Redelivery count is the same as the destination’s Maximum failed deliveries. Rather than return the message to its original destination, the messaging engine moves the message to the destination’s Exception destination, which is specified as _SYSTEM.Exception.Destination.<messaging engine name>.
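The comparison that the messaging engine makes after each rollback can be sketched in plain Java. This is a model of the rules described in this section, not the messaging engine's own code; the class, enum, and method names are invented for illustration.

```java
final class DefaultProviderRollback {
    enum ExceptionDestination { SYSTEM, NONE, SPECIFY }
    enum Action { RETURN_TO_DESTINATION, MOVE_TO_EXCEPTION_DESTINATION, HOLD_THEN_REDELIVER }

    /** Applies the comparison made after an MDB rolls a message back. */
    static Action onRollback(int countBeforeRollback, int maxFailedDeliveries,
                             ExceptionDestination exceptionDestination) {
        int redeliveryCount = countBeforeRollback + 1; // incremented by the rollback
        if (redeliveryCount < maxFailedDeliveries) {
            return Action.RETURN_TO_DESTINATION;
        }
        if (exceptionDestination == ExceptionDestination.NONE) {
            // The message stays on the original destination; redelivery waits
            // for the period given by sib.processor.blockedRetryTimeout.
            return Action.HOLD_THEN_REDELIVER;
        }
        return Action.MOVE_TO_EXCEPTION_DESTINATION;
    }

    public static void main(String[] args) {
        // Defaults: Maximum failed deliveries = 5, Exception destination = System.
        for (int count = 0; count < 5; count++) {
            System.out.println("rollback " + (count + 1) + ": "
                    + onRollback(count, 5, ExceptionDestination.SYSTEM));
        }
    }
}
```

With the default values, the first four rollbacks return the message to its destination and the fifth moves it to the exception destination, which matches the walk-through above.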
Using WebSphere MQ on distributed platforms
The message listener service is used by MDBs that listen on JMS destinations hosted by WebSphere MQ. MDBs are bound to listener ports, which are configured to monitor individual queues or topics hosted on a queue manager. When a message is put onto a queue or is published on a specific topic, the message is detected by the listener port and delivered to the MDB. When an MDB rolls a message back, the behaviour of the application server depends on four properties:
The listener port property Maximum retries specifies the number of times the message listener service will deliver a message to an MDB before the listener port is stopped. The default value for this property is 0, which means that the first time an MDB rolls back a message, the listener port associated with it is shut down. The Maximum retries property can be changed on the listener port’s configuration panel in the WebSphere Application Server administrative console, shown in Figure 4.
Figure 4. Maximum retries for listener port TestMDBListener
The JMS message property Redelivery count indicates the number of times a JMS message has been delivered to an application. This property is incremented if an MDB has rejected the message after delivery.
The WebSphere MQ queue property Backout threshold (BOTHRESH) specifies the maximum number of times a message can be put onto a queue before it is moved to a different location. The default value for this property is 0, which means that the message listener service will never attempt to re-queue messages that have been rolled back by an MDB. The value of Backout threshold can be set using either the WebSphere MQ command line utility runmqsc, or the Queue Properties panel in the WebSphere MQ Explorer (Figure 5).
Figure 5. Backout threshold property for test queue
The WebSphere MQ queue property Backout requeue queue (BOQNAME) is the queue location where a message is moved when the message has been rolled back onto a queue the number of times specified in the Backout threshold property. Backout requeue queue has no default value, which means that the application server will move any messages that have exceeded the backout threshold to the SYSTEM.DEAD.LETTER.QUEUE. The Backout requeue queue property can be set using either the WebSphere MQ utility runmqsc, or the WebSphere MQ Explorer. The Queue Properties panel is shown in Figure 6.
Figure 6. Backout requeue queue property for the test queue is set to SYSTEM.DEAD.LETTER.QUEUE
The first thing the message listener service does when a poison message is rolled back is to increment the Redelivery count of the message and return it to the queue where it came from.
The message listener service compares the Redelivery count to the listener port’s Maximum retries property. If the Redelivery count is equal to or greater than Maximum retries, the listener port is stopped. Otherwise, the listener port continues running.
The next time the listener port detects the message, the message listener service compares the Redelivery count with the queue’s Backout threshold. If the Redelivery count is less than the Backout threshold, the message is left on the queue, ready to be reprocessed. However, if the Redelivery count is equal to the Backout threshold, the message listener service moves the message onto the Backout requeue queue. If no Backout requeue queue has been defined, the message is moved to the SYSTEM.DEAD.LETTER.QUEUE. This behaviour is shown in Figure 7.
Figure 7. How the application server handles poison messages when using the WebSphere MQ JMS Provider
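The two checks made by the message listener service can be written down as a pair of small functions. The following plain-Java sketch only models the rules as this article describes them; the class and method names are invented, and it treats a Backout threshold of 0 as "never requeue", in line with the property description above.

```java
final class ListenerPortRules {
    static final String DEAD_LETTER_QUEUE = "SYSTEM.DEAD.LETTER.QUEUE";

    /** Check made after a rollback: should the listener port be stopped? */
    static boolean stopListenerPort(int redeliveryCount, int maximumRetries) {
        return redeliveryCount >= maximumRetries;
    }

    /**
     * Check made the next time the message is detected. Returns the queue the
     * message is moved to, or null if it stays where it is for redelivery.
     */
    static String requeueTarget(int redeliveryCount, int backoutThreshold,
                                String backoutRequeueQueue) {
        if (backoutThreshold <= 0 || redeliveryCount < backoutThreshold) {
            return null; // left on the original queue
        }
        boolean noRequeueQueue =
                backoutRequeueQueue == null || backoutRequeueQueue.trim().isEmpty();
        return noRequeueQueue ? DEAD_LETTER_QUEUE : backoutRequeueQueue.trim();
    }

    public static void main(String[] args) {
        // Defaults: Maximum retries 0, Backout threshold 0, no requeue queue.
        System.out.println("stop port: " + stopListenerPort(1, 0));
        System.out.println("requeue to: " + requeueTarget(1, 0, null));
    }
}
```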
The default behaviour
By default, the Maximum retries property has a value of 0 and both the Backout threshold and Backout requeue queue properties have no value. So, what is the default behaviour when a poison message is delivered to an MDB that is listening on a JMS destination being hosted by WebSphere MQ?
The MDB rolls back the message, which means that the Redelivery count of the message will be increased to 1. The message listener service now compares the Redelivery count with the value of the listener port’s Maximum retries property, which is 0. As the Redelivery count is greater than Maximum retries, the listener port is stopped.
When the listener port is restarted, it will detect the poison message again and compare the Redelivery count of the message with the value of the queue’s Backout threshold property. This property has no value, so the message is delivered to the MDB. If the MDB is still unable to process it, the message will be rolled back onto the queue, and its Redelivery count will be incremented to 2. Once again, the message listener service will compare the Redelivery count to the Maximum retries property, and, once again, the listener port will be shut down.
The next time the listener port is restarted, it will detect the message, and the whole cycle will be repeated.
As a result of this behaviour, it is possible to end up in a situation where a poison message blocks the processing of other messages on the queue.
When a message is rolled back, it is returned to its original position on the queue. Listener ports always start processing messages from the top of the queue, so if the very first message on the queue is a poison message, the listener port will detect it, and deliver it to the MDB. As the message cannot be processed, the MDB will roll it back, which will cause it to go back to the top of the queue again. The listener port will then be shut down. When it restarts, the listener port will detect the poison message again, and redeliver it to the MDB. The MDB will roll it back, again causing the listener port to stop.
Change the default behaviour
As you can see, the default behaviour will continue until the poison message is deleted from the queue by a systems administrator.
To prevent this from happening, you need to ensure that the queue being monitored by the listener port has both a Backout threshold and a Backout requeue queue defined, and that the value of the Backout threshold is less than the listener port’s Maximum retries.
For example, suppose that you have a listener port defined called TestMDBListener, which is monitoring the WebSphere MQ queue test for messages. The listener port has its Maximum retries property set to 10, and the queue has a Backout threshold of 1 and a Backout requeue queue of SYSTEM.DEAD.LETTER.QUEUE:
A message arrives on the queue test, is detected by the listener port, and delivered to your MDB. Now, suppose that your MDB is unable to process this message and rolls it back. The Redelivery count of the message is now set to 1.
The message listener service compares this value to the listener port’s Maximum retries property, which has a value of 10. Since the Redelivery count is less than Maximum retries, the listener port continues running.
The next time the listener port detects the message, the message listener service checks the message’s Redelivery count and finds that it has a value of 1. It now looks at the Backout threshold for the queue test, which also has a value of 1. Therefore, the message listener service decides to back the message out.
The message listener service queries the queue’s Backout requeue queue property. This is set to SYSTEM.DEAD.LETTER.QUEUE, so the message listener service removes the message from the test queue, and puts it onto this one.
The message listener service now goes back to monitoring test, waiting for more messages to arrive.
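To see why the relationship between Backout threshold and Maximum retries matters, the following plain-Java sketch simulates a queue that holds one poison message followed by some good ones. It is a simplified model built from the rules in this article, with invented names; it assumes the MDB always rolls back the poison message and that the Redelivery count survives a listener port restart.

```java
final class HeadOfQueueSimulation {
    /**
     * Returns how many of the good messages queued behind a poison message
     * get processed, given how many times the listener port is started.
     */
    static int goodMessagesProcessed(int goodMessages, int maximumRetries,
                                     int backoutThreshold, int listenerStarts) {
        int redeliveryCount = 0; // Redelivery count of the poison message
        for (int start = 0; start < listenerStarts; start++) {
            boolean running = true;
            while (running) {
                // Message detected: has it reached the Backout threshold?
                if (backoutThreshold > 0 && redeliveryCount >= backoutThreshold) {
                    // Poison message is moved away; the rest can be processed.
                    return goodMessages;
                }
                // Message delivered, MDB rolls it back.
                redeliveryCount++;
                if (redeliveryCount >= maximumRetries) {
                    running = false; // listener port is stopped
                }
            }
        }
        return 0; // poison message is still at the top of the queue
    }

    public static void main(String[] args) {
        // Defaults, even after five restarts: nothing behind the poison message runs.
        System.out.println(goodMessagesProcessed(3, 0, 0, 5));
        // Maximum retries 10, Backout threshold 1: queue drains on the first start.
        System.out.println(goodMessagesProcessed(3, 10, 1, 1));
    }
}
```

With the default settings the sketch prints 0, and with the settings from the TestMDBListener example it prints 3.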
How the Maximum sessions property affects poison messages
The listener port’s Maximum sessions property defines the maximum number of messages that can be processed concurrently by the listener. If this property has a value of 10, and there are 10 messages on the queue being monitored by the listener port, then all 10 messages will be processed at the same time. This has implications when one of the messages cannot be processed and is rolled back.
As you have seen, when a poison message is rolled back by an MDB, the default behaviour of the application server is to return the message to the queue that it came from and stop the listener port. However, if the Maximum sessions property is set to a value greater than 1, it is possible that the message might be reprocessed before the listener port shuts down.
In most situations, this should not cause any problems, as by the time the message is rolled back again, the listener port will have stopped and will not attempt to deliver it again. However, if the MDB performs some transactional and non-transactional work, there is a possibility that the non-transactional work will be performed again. To prevent this situation, the Backout threshold property should be set to 1; this forces the application server to move the message to the Backout requeue queue rather than redeliver it before the listener port shuts down.
How the Maximum messages property affects poison message handling
There is another listener port property, Maximum messages, that also affects the application server’s behaviour when a poison message is detected. This property specifies the number of messages that will be processed by an MDB in a single transaction. The default value of Maximum messages is 1, which means that each message will be processed in its own transaction.
If this value is increased, then the MDB will process a batch of messages in one go. If one message in the batch cannot be processed and is rolled back, then all of the messages in that batch will also be rolled back.
So, for example, suppose that Maximum messages is set to 10. When the listener port starts up, ten messages are delivered to the MDB, which attempts to process them sequentially. Suppose the first five messages are processed successfully, and the sixth one fails. When the MDB rolls this one back, the application server also rolls back the five messages that were successfully processed, as well as the four that have yet to be looked at.
Now, if the Backout threshold of the queue being monitored by the listener port is set to 1, all ten messages will be moved to the Backout requeue queue!
This is something you need to be aware of and consider carefully when thinking about increasing the value of Maximum messages.
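The effect of batching can be illustrated with a small plain-Java model. It is a sketch of the behaviour described above rather than anything taken from the message listener service, and the class and method names are invented: one failure in the batch rolls back the whole transaction, so every message in the batch has its Redelivery count incremented.

```java
final class BatchRollback {
    /**
     * Processes one batch in a single transaction. processable[i] says whether
     * message i can be handled. Returns each message's Redelivery count afterwards.
     */
    static int[] processBatch(boolean[] processable) {
        int[] redeliveryCounts = new int[processable.length];
        for (int i = 0; i < processable.length; i++) {
            if (!processable[i]) {
                // The rollback covers the whole batch, not just message i.
                for (int j = 0; j < redeliveryCounts.length; j++) {
                    redeliveryCounts[j]++;
                }
                break;
            }
        }
        return redeliveryCounts;
    }

    /** Counts the messages that have reached the Backout threshold. */
    static int movedToBackoutQueue(int[] redeliveryCounts, int backoutThreshold) {
        int moved = 0;
        for (int count : redeliveryCounts) {
            if (backoutThreshold > 0 && count >= backoutThreshold) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        boolean[] batch = new boolean[10];
        java.util.Arrays.fill(batch, true);
        batch[5] = false; // the sixth message is the poison message
        System.out.println(movedToBackoutQueue(processBatch(batch), 1));
    }
}
```

For a batch of ten in which the sixth message fails, the sketch reports that all ten messages are moved when the Backout threshold is 1.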
Listener ports and alias queues
As well as being configured to look for messages on WebSphere MQ local and remote queues, listener ports can be set up to monitor alias queues. When setting up your application server to do this, it is important that the local or remote queue being pointed to by the alias queue has the Backout threshold and Backout requeue queue properties set, and that the application server has permission to inquire the values of these attributes on the local or remote queue.
If the application server is unable to determine the values of these attributes, then it uses a value of 20 for the Backout threshold and leaves the Backout requeue queue property unset. This means that any poison messages detected by the listener port monitoring the alias queue will be rolled back 20 times before being moved to the SYSTEM.DEAD.LETTER.QUEUE.
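The fallback can be summarized in a few lines of plain Java. This is only a restatement of the rule above as code, with invented names; a null argument stands for an attribute that the application server could not inquire.

```java
final class AliasQueueBackoutSettings {
    static final int FALLBACK_BACKOUT_THRESHOLD = 20;
    static final String DEAD_LETTER_QUEUE = "SYSTEM.DEAD.LETTER.QUEUE";

    /** inquiredThreshold is null when the attribute could not be read. */
    static int effectiveThreshold(Integer inquiredThreshold) {
        return inquiredThreshold == null
                ? FALLBACK_BACKOUT_THRESHOLD
                : inquiredThreshold.intValue();
    }

    /** inquiredRequeueQueue is null or blank when unset or unreadable. */
    static String effectiveTarget(String inquiredRequeueQueue) {
        boolean unset = inquiredRequeueQueue == null
                || inquiredRequeueQueue.trim().isEmpty();
        return unset ? DEAD_LETTER_QUEUE : inquiredRequeueQueue.trim();
    }
}
```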
WebSphere MQ security considerations
One question that comes up quite a lot is: What WebSphere MQ authorizations does my WebSphere Application Server system need in order to back out messages?
In order for WebSphere Application Server to back out messages, the user ID that the application server is running under needs to have the following permissions on the backout requeue queue:
- Pass All Context
- Set All Context
If the application server does not have these permissions, the following error message will be generated when the message listener service attempts to back out a poison message:
WMSG0018E: Error on JMSConnection for MDB <MDB Name> , JMSDestination <Destination name> : javax.jms.JMSException: MQJMS1081: Message requeue failed
Using the WebSphere MQ provider on z/OS
The way poison messages are handled when using the WebSphere MQ JMS provider on z/OS is the same as described above, with one minor difference.
WebSphere MQ on z/OS uses in-memory copies of messages. The first time a message is detected by a listener port, WebSphere MQ stores a copy of it in memory before passing it to the application server. If the message is processed successfully, the in-memory copy is deleted and the actual message is deleted from storage.
In the situation where the MDB rolls back the message, WebSphere MQ will increment the Redelivery count of the in-memory copy. The application server then looks at the value of this property on the copy to determine whether the actual message should be moved to the Backout requeue queue and whether the listener port should be stopped. If the value of the in-memory copy’s Redelivery count has reached the Backout threshold, the actual message is moved to the Backout requeue queue, and the in-memory copy is deleted.
This has implications if the WebSphere MQ system is stopped before a message has been backed out.
Suppose we have another listener port defined, called TestMDBListener2. The listener port has the Maximum retries property set to 10. This is configured to monitor the WebSphere MQ queue test2 looking for messages. The queue is hosted by a WebSphere MQ queue manager running on z/OS, and has the Backout threshold property set to 5:
A message arrives on the queue and is detected by TestMDBListener2. A copy of the message is made, and is delivered to an MDB. However, the message cannot be processed, so the MDB rolls it back. The Redelivery count of the in-memory copy of the message is incremented, and now has the value 1.
The listener port detects the message again, and once again delivers it to the MDB, where it is rolled back for a second time. The in-memory copy of the message now has a Redelivery count of 2.
Suppose that the z/OS queue manager is shut down at this point. Since the application server has only been working on an in-memory copy of the message, the Redelivery count of the actual message stored on the queue is still set to 0. When the queue manager is restarted, the listener port will detect the message again. A new in-memory copy of the message is made, which has a Redelivery count of 0. When the message is rolled back by the MDB, the value of the copy’s Redelivery count will be incremented to 1.
This behaviour continues until the MDB has rolled back the message 5 times. At this point, the application server determines that the Redelivery count of the in-memory copy of the message is equal to the queue’s Backout threshold, and moves the actual message to the queue specified by the Backout requeue queue property.
In this scenario, the poison message has actually been processed 7 times before it was backed out, which is more than the Backout threshold that has been defined for the queue.
This behaviour is not ideal!
To prevent this from happening, WebSphere MQ on z/OS needs to be configured to update the Redelivery count of the actual message in storage as well as the in-memory copy when a rollback occurs. This means that the Redelivery count value will be persisted, and therefore can survive a queue manager restart. To do this, the HardenGetBackout property (the HARDENBO attribute in MQSC) needs to be enabled for the queue being monitored by the MDB.
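The difference that hardening makes can be reproduced with a small plain-Java simulation of the TestMDBListener2 scenario. It is a model of the behaviour described in this section with invented names, and it assumes a single queue manager restart and an MDB that always rolls the message back.

```java
final class BackoutCountHardening {
    /**
     * Returns how many times the poison message is delivered before it is
     * backed out. The queue manager restarts once, after the given number of
     * rollbacks (0 means it never restarts).
     */
    static int deliveriesBeforeBackout(int backoutThreshold,
                                       int restartAfterRollbacks,
                                       boolean hardened) {
        int storedCount = 0;   // Redelivery count of the message in storage
        int inMemoryCount = 0; // Redelivery count of the in-memory copy
        int deliveries = 0;
        while (inMemoryCount < backoutThreshold) {
            deliveries++;      // message delivered, MDB rolls it back
            inMemoryCount++;
            if (hardened) {
                storedCount = inMemoryCount; // count is persisted as well
            }
            if (deliveries == restartAfterRollbacks) {
                // Restart: a new in-memory copy is made from the stored message.
                inMemoryCount = storedCount;
            }
        }
        return deliveries;
    }

    public static void main(String[] args) {
        System.out.println(deliveriesBeforeBackout(5, 2, false)); // prints 7
        System.out.println(deliveriesBeforeBackout(5, 2, true));  // prints 5
    }
}
```

Without hardening, the restart after the second rollback resets the count and the message is delivered 7 times; with hardening, it is delivered 5 times, which matches the Backout threshold.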
This article described what a poison message is, and explained what JMS applications can do when they encounter them. You also learned how the default messaging provider and the WebSphere MQ JMS provider handle situations where an MDB rolls back a poison message, how the default behaviour can be changed, and how some listener port properties affect the behaviour of the application server when using the WebSphere MQ JMS Provider.