How WebSphere Application Server V8.x handles poison messages

This article describes how IBM® WebSphere® Application Server Version 8.x handles poison JMS messages, looks at the behaviour of both the default messaging provider and the IBM WebSphere MQ messaging provider, and provides information on how the default behaviour can be changed. This content is part of the IBM WebSphere Developer Technical Journal.


Paul Titheridge, Service Specialist, WebSphere MQ Level 3 Support team, IBM

Paul Titheridge has been working on the WebSphere MQ Level 3 Support team at the IBM Hursley Lab in the United Kingdom since 2003, and he specialises in resolving issues related to interactions between WebSphere MQ and WebSphere Application Server. You can contact Paul at pault@uk.ibm.com.



30 July 2014


Introduction

IBM WebSphere Application Server Version 8 and later (V8.x) provides support for asynchronous messaging based on the Java™ Message Service (JMS) version 1.1 specification. Using either the default messaging provider or IBM WebSphere MQ, you can write message-driven beans (MDBs) that listen on a destination (either a message queue or a topic). When a message arrives at the destination, the MDB is invoked and its onMessage() method called. If a "poison message" is delivered to an MDB, the application can choose to reject it.

In this situation, what happens to the message and how does the application server behave?


What is a poison message?

This article assumes a basic knowledge of JMS.

A poison message is simply a message that the receiving MDB application is unable to process. It could be that the message has become corrupt, is in an unexpected format, or contains information that cannot be handled by the MDB's business logic. For example, suppose you have an MDB that processes book orders. If the MDB receives an order for a book that doesn’t exist, the message could be considered a poison message.

If a poison message is delivered to an MDB, the bean can do one of three things:

  • Roll back the message to the destination that it came from.

    This can be done if the MDB is running within a transaction, and ensures that the message is not lost. Returning the message to its original destination will give the MDB the chance to process the message again. This is useful if the application was unable to handle the message due to a temporary problem, such as a database being unavailable. To roll back the message, the MDB should call the setRollbackOnly() method on the message-driven context associated with the bean.

  • Move the message to a different destination.

    This is particularly useful when the MDB is not running inside a transaction, as it prevents the poison message from being lost. A systems administrator can examine the message at a later date to find out why it could not be processed, and potentially move it back to the destination being monitored by the MDB so that it can be reprocessed.

  • Discard the message, by doing nothing.

    This means that the message is gone forever.

It is the responsibility of the MDB application to determine if it has received a poison message and how it should be handled. There is no way for the JMS provider or the application server to determine if a message is corrupt or cannot be processed.
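
As a rough illustration of that responsibility, the following plain-Java sketch shows how the business logic of a book-order MDB might classify a message and pick one of the three options above. Everything in it (the OrderTriage class, the Action values, the catalogue set, the databaseAvailable flag, and the mapping from problem to action) is hypothetical and is not part of JMS or WebSphere Application Server. In a real MDB, the ROLL_BACK outcome would correspond to calling setRollbackOnly() on the message-driven context.

```java
import java.util.Set;

final class OrderTriage {

    /** What the MDB decides to do with a delivered message. */
    enum Action { PROCESS, ROLL_BACK, MOVE_TO_OTHER_DESTINATION, DISCARD }

    /**
     * Hypothetical policy for a book-order MDB. The mapping from problem to
     * action is an assumption for illustration; JMS does not prescribe it.
     */
    static Action decide(String bookId, Set<String> catalogue, boolean databaseAvailable) {
        if (bookId == null || bookId.trim().isEmpty()) {
            return Action.DISCARD;                   // empty order: nothing worth keeping
        }
        if (!databaseAvailable) {
            return Action.ROLL_BACK;                 // temporary problem: try again later
        }
        if (!catalogue.contains(bookId)) {
            return Action.MOVE_TO_OTHER_DESTINATION; // keep it for an administrator to examine
        }
        return Action.PROCESS;
    }

    public static void main(String[] args) {
        Set<String> catalogue = Set.of("BOOK-1", "BOOK-2"); // made-up identifiers
        System.out.println(decide("BOOK-1", catalogue, true));
        System.out.println(decide("BOOK-1", catalogue, false));
        System.out.println(decide("BOOK-9", catalogue, true));
        System.out.println(decide("", catalogue, true));
    }
}
```

Keeping the decision in one small method makes it easy to see, and to test, which kinds of failure are treated as temporary and which are treated as permanent.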


Rolling back a poison message

An MDB running inside of a transaction can choose to roll back a message that cannot be processed. What does the application server do in this situation? The answer depends on what JMS provider is being used.

Using the default messaging provider

MDBs that are configured to use the default messaging provider monitor queues or topic spaces hosted by a service integration bus. When messages arrive on the queue or are published on the topic, they are delivered to the MDB. The behaviour of the application server when an MDB rolls a message back depends on the values of three properties:

  • The JMS message property Redelivery count indicates the number of times a JMS message has been delivered to an application. This property is incremented if an MDB has rejected the message after delivery.
  • The JMS destination property Maximum failed deliveries specifies the number of times a message on a destination will be delivered to an MDB before it is moved to the exception destination that has been defined for this destination. The default value of the property is 5, which means that if a message is rolled back five times, the application server will move it to a different location. This property can be changed on the destination’s configuration panel in the WebSphere Application Server administrative console (Figure 1).
Figure 1. Maximum failed deliveries property for the TestDestination

The JMS destination property Exception destination tells the application server what to do with poison messages that have been rolled back the number of times specified in the Maximum failed deliveries property. The Exception destination property can have one of three values:

  • System: Route messages to the system exception destination _SYSTEM.Exception.Destination.<messaging engine name>
  • None: Leave the message on the original destination.
  • Specify: Move the message to a user-specified exception destination.

The default value for this property is System, so any message that has been rolled back the number of times specified by Maximum failed deliveries will be moved to the system exception destination defined for the messaging engine that hosts the destination. Like the Maximum failed deliveries property, Exception destination can be changed via the destination’s configuration panel in the WebSphere administrative console (Figure 2).

Figure 2. Exception destination for the TestDestination is set to "System"

If the Exception destination property is set to “None” then, by default, the application server will look at the value of the messaging engine property Default blocked destination retry interval to determine how long to wait before redelivering a poison message to an MDB. This property has the default value of 5000 milliseconds, which equates to 5 seconds.

Figure 3. Default blocked destination retry interval

It is possible to override this time period for individual JMS destinations by setting two JMS destination properties:

  • Override messaging engine blocked retry timeout default needs to be selected.
  • Blocked retry timeout in milliseconds needs to be set to a value of 0 or greater. The default value of this property is -1, which means that if the checkbox Override messaging engine blocked retry timeout default has been selected the poison message will never be redelivered. Setting the property to the value of 0 means that any messages that are rolled back will be immediately redelivered to an MDB.
Figure 4. Blocked retry timeout in milliseconds for the TestDestination is set to 10000 milliseconds (or 10 seconds)
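
The interaction between the messaging engine default and the per-destination override can be summarised in a few lines of plain Java. This is only a sketch of the rules described above, not WebSphere code: the BlockedRetry class and its method are hypothetical names, and treating every negative override value like -1 is an assumption.

```java
import java.util.OptionalLong;

final class BlockedRetry {

    /** Default value of "Default blocked destination retry interval", as stated in the article. */
    static final long ENGINE_DEFAULT_MS = 5000;

    /**
     * Returns the delay in milliseconds before a poison message is redelivered,
     * or an empty value if it will never be redelivered.
     *
     * @param overrideSelected     "Override messaging engine blocked retry timeout default" is selected
     * @param destinationTimeoutMs "Blocked retry timeout in milliseconds" on the destination
     * @param engineIntervalMs     "Default blocked destination retry interval" on the messaging engine
     */
    static OptionalLong effectiveDelayMs(boolean overrideSelected,
                                         long destinationTimeoutMs,
                                         long engineIntervalMs) {
        if (!overrideSelected) {
            return OptionalLong.of(engineIntervalMs);   // destination value is ignored
        }
        if (destinationTimeoutMs < 0) {
            return OptionalLong.empty();                // -1 (the default): never redelivered
        }
        return OptionalLong.of(destinationTimeoutMs);   // 0 means immediate redelivery
    }

    public static void main(String[] args) {
        System.out.println(effectiveDelayMs(false, -1, ENGINE_DEFAULT_MS));
        System.out.println(effectiveDelayMs(true, -1, ENGINE_DEFAULT_MS));
        System.out.println(effectiveDelayMs(true, 10000, ENGINE_DEFAULT_MS));
    }
}
```

The second call shows the trap described above: selecting the override checkbox without also changing the timeout from -1 means the message is never redelivered.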

When a message is rolled back, its Redelivery count is incremented and compared against the value of Maximum failed deliveries for the destination from which the message originally came.

If the Redelivery count is less than the Maximum failed deliveries, the message is returned to the destination so that it can be reprocessed.

If the Redelivery count is equal to or greater than Maximum failed deliveries, the messaging engine will either move the message to the Exception destination specified or, if the Exception destination is set to None, wait for the blocked retry interval before attempting to deliver it again. That interval is the messaging engine's Default blocked destination retry interval (custom property name sib.processor.blockedRetryTimeout), unless the destination overrides it with Blocked retry timeout in milliseconds. This behaviour is shown in Figure 5.

Figure 5. How the default messaging provider handles poison messages

The default behaviour

By default, the Maximum failed deliveries property has the value 5, and the Exception destination is set to System. If these default values are used, what happens when a poison message arrives on a destination and is then delivered to an MDB?

  1. Since the MDB is unable to process the message, it rolls it back, which causes the Redelivery count to be increased to 1. The messaging engine returns the message to the destination because the Redelivery count is less than the Maximum failed deliveries for the destination.
  2. The MDB receives the message again, but is still unable to process it, so the MDB performs a rollback as before. The Redelivery count of the message is now set to 2, which is still less than Maximum failed deliveries for the destination, so the messaging engine puts the message back where it originated.
  3. This pattern repeats until the message has been rolled back 5 times.
  4. Now, the value of the Redelivery count is the same as the destination’s Maximum failed deliveries. Rather than return the message to its original destination, the messaging engine moves the message to the destination’s Exception destination, which is specified as _SYSTEM.Exception.Destination.<messaging engine name>.
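
The steps above can be traced with a small plain-Java model of the decision the messaging engine takes after each rollback. This is a sketch of the rules as this article describes them, not the service integration bus implementation; the SibRedelivery class, its enums and its method names are all hypothetical.

```java
final class SibRedelivery {

    /** The three values of the Exception destination property. */
    enum ExceptionDestination { SYSTEM, NONE, SPECIFY }

    enum Outcome { RETURNED_TO_DESTINATION, MOVED_TO_EXCEPTION_DESTINATION, HELD_FOR_BLOCKED_RETRY }

    /** Decision taken once a rollback has raised the Redelivery count to redeliveryCount. */
    static Outcome afterRollback(int redeliveryCount, int maxFailedDeliveries,
                                 ExceptionDestination mode) {
        if (redeliveryCount < maxFailedDeliveries) {
            return Outcome.RETURNED_TO_DESTINATION;
        }
        // Threshold reached: with None the message stays put and is retried
        // after the blocked retry interval; otherwise it is moved.
        return mode == ExceptionDestination.NONE
                ? Outcome.HELD_FOR_BLOCKED_RETRY
                : Outcome.MOVED_TO_EXCEPTION_DESTINATION;
    }

    /** Number of rollbacks of an always-failing message before it stops being returned normally. */
    static int rollbacksUntilNotReturned(int maxFailedDeliveries, ExceptionDestination mode) {
        int redeliveryCount = 0;
        while (true) {
            redeliveryCount++; // the MDB rolls the message back
            if (afterRollback(redeliveryCount, maxFailedDeliveries, mode)
                    != Outcome.RETURNED_TO_DESTINATION) {
                return redeliveryCount;
            }
        }
    }

    public static void main(String[] args) {
        // Defaults from the article: Maximum failed deliveries = 5, Exception destination = System.
        System.out.println(rollbacksUntilNotReturned(5, ExceptionDestination.SYSTEM));
        System.out.println(afterRollback(5, 5, ExceptionDestination.SYSTEM));
    }
}
```

With the default values, the model returns the message to its destination after the first four rollbacks and moves it to the exception destination on the fifth.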

Using the WebSphere MQ messaging provider

MDBs that use the WebSphere MQ messaging provider can either use activation specifications or listener ports to monitor queues or topics hosted by WebSphere MQ. When a message is put onto a queue or is published on a specific topic, the message is detected by the activation specification or listener port and delivered to the MDB.

When an MDB rolls a message back, the behaviour of the application server is different depending on whether the MDB was bound to an activation specification or a listener port.

Activation specifications

If an MDB that was configured to use an activation specification rolls a message back, the behaviour of the application server depends on five properties:

  • The first two:
    • Stop endpoint if message delivery fails and
    • Number of sequential delivery failures before suspending endpoint

    are activation specification advanced properties, and work together to determine if an activation specification should stop after a message has been rolled back.

    The Stop endpoint if message delivery fails property is a checkbox. When selected, the activation specification will keep a count of the number of rollbacks that have been performed by the MDBs that are using it. When a message is rolled back, the rollback counter increases by one. When a message is successfully processed by an MDB, the rollback counter is reset to zero.

    If the rollback counter reaches the value specified by the Number of sequential delivery failures before suspending endpoint property, then the activation specification stops.

    By default, the Stop endpoint if message delivery fails checkbox is selected and the Number of sequential delivery failures before suspending endpoint property has the value 0. This means that, as soon as an MDB rolls a message back, the activation specification will stop. These properties can be changed on the Advanced properties panel for an activation specification in the WebSphere administrative console (see Figure 6).

    Figure 6. The Stop endpoint if message delivery fails and Number of sequential delivery failures before suspending endpoint properties for TestActivationSpec
  • The JMS message property Redelivery count indicates the number of times a JMS message has been delivered to an application. This property is incremented if an MDB has rejected the message after delivery.
  • The WebSphere MQ queue property Backout threshold (BOTHRESH) specifies the maximum number of times a message can be put onto a queue before it is moved onto a different location. The default value for this property is 0, which means that an activation specification will never attempt to re-queue messages that have been rolled back by an MDB. The value of Backout threshold can be set using either the WebSphere MQ command line utility runmqsc, or the Queue Properties panel in the WebSphere MQ Explorer (Figure 7).
    Figure 7. Backout threshold property for test queue
  • The WebSphere MQ queue property Backout requeue queue (BOQNAME) names the queue to which a message is moved once it has been rolled back onto a queue the number of times specified in the Backout threshold property. Backout requeue queue has no default value, which means that the application server will move any messages that have reached the backout threshold to the SYSTEM.DEAD.LETTER.QUEUE. The Backout requeue queue property can be set using either the WebSphere MQ utility runmqsc, or the WebSphere MQ Explorer. The Queue Properties panel is shown in Figure 8.
    Figure 8. Backout requeue queue property for the test queue is set to SYSTEM.DEAD.LETTER.QUEUE

    When an activation specification detects a message on a JMS destination, the first thing it does is compare the value of the message's Redelivery count to the value of the queue’s Backout threshold. If the Redelivery count is less than the Backout threshold, the message is delivered to the MDB for processing. However, if the Redelivery count is equal to the Backout threshold, the WebSphere MQ messaging provider moves the message onto the Backout requeue queue. If no Backout requeue queue has been defined, the message is moved to the SYSTEM.DEAD.LETTER.QUEUE.

    If the message is delivered to the MDB and is then rolled back, the activation specification puts the message back onto the JMS destination that it came from and increments the value of Redelivery count.

    The activation specification then checks if the Stop endpoint if message delivery fails checkbox is selected. If it is, then the activation specification increments its internal rollback counter and compares the value of the counter to the value of the Number of sequential delivery failures before suspending endpoint property. If the two values are equal, then the activation specification is stopped.

    This behaviour is shown in Figure 9.

    Figure 9. How the application server handles poison messages when using the WebSphere MQ JMS Provider

The default behaviour

By default, the Stop endpoint if message delivery fails checkbox is selected, the Number of sequential delivery failures before suspending endpoint property has a value of 0, the Backout threshold has its default value of 0, and the Backout requeue queue has no value. So, what is the default behaviour when a poison message is delivered to an MDB that is using an activation specification to monitor a JMS destination being hosted by WebSphere MQ?

The MDB rolls back the message, which means that the Redelivery count of the message will be increased to 1. The activation specification now checks the Stop endpoint if message delivery fails checkbox and finds it has been selected, so it increments the rollback counter.

The activation specification then checks the value of the Number of sequential delivery failures before suspending endpoint property, and compares this to the value of the rollback counter. As the rollback counter is greater than the Number of sequential delivery failures before suspending endpoint, the activation specification is stopped.

When the activation specification is restarted, it will detect the poison message again and compare the Redelivery count of the message with the value of the queue’s Backout threshold property. This property still has its default value of 0, so no backout is attempted and the message is delivered to the MDB. If the MDB is still unable to process it, the message will be rolled back onto the queue, and its Redelivery count will be incremented to 2. Once again, the activation specification will check the Stop endpoint if message delivery fails checkbox, find that it is selected, and increment the rollback counter. The rollback counter now has a value of 2, which is greater than the value of Number of sequential delivery failures before suspending endpoint, so the activation specification is stopped again.

The next time the activation specification is restarted, it will detect the message, and the whole cycle will be repeated.

As a result of this behaviour, it is possible to end up in a situation where a poison message blocks the processing of other messages on the queue.

When a message is rolled back, it is returned to its original position on the queue. Activation specifications always start processing messages from the top of the queue, so if the very first message on the queue is a poison message, the activation specification will detect it, and deliver it to the MDB. As the message cannot be processed, the MDB will roll it back, which will cause it to go back to the top of the queue again. The activation specification will then be shut down. When it restarts, the activation specification will detect the poison message again, and redeliver it to the MDB. The MDB will roll it back, again causing the activation specification to stop.


Change the default behaviour

As you can see, the default behaviour will continue until the poison message is deleted from the queue by a systems administrator.

To prevent this from happening, you need to:

  • Ensure that the queue being monitored by the activation specification has both a Backout threshold and a Backout requeue queue defined.
  • Either:
    • Unselect the Stop endpoint if message delivery fails checkbox, or
    • Leave the Stop endpoint if message delivery fails checkbox selected, and set the Number of sequential delivery failures before suspending endpoint property to a value greater than the Backout threshold.

For example, suppose that you have an activation specification defined called TestActivationSpecification, which is monitoring the WebSphere MQ queue test for messages. The activation specification has the Stop endpoint if message delivery fails checkbox selected, and the Number of sequential delivery failures before suspending endpoint property set to the value 5. The queue test has a Backout threshold of 1 and a Backout requeue queue of SYSTEM.DEAD.LETTER.QUEUE.

A message arrives on the queue test, is detected by the activation specification, and delivered to your MDB. Now, suppose that your MDB is unable to process this message and rolls it back. The Redelivery count of the message is now set to 1.

The WebSphere MQ messaging provider checks the Stop endpoint if message delivery fails checkbox for the activation specification and finds it is selected, so it increments the rollback counter.

It then compares the value of the rollback counter to the value of the Number of sequential delivery failures before suspending endpoint property, which has a value of 5. This is greater than the value of the rollback counter, so the activation specification continues running.

The next time the activation specification detects the message, the WebSphere MQ messaging provider checks the message’s Redelivery count and finds it has a value 1. It now looks at the Backout threshold for the queue test, which also has a value of 1. Therefore, the WebSphere MQ messaging provider decides to back the message out.

The WebSphere MQ messaging provider queries the queue’s Backout requeue queue property. This is set to SYSTEM.DEAD.LETTER.QUEUE, so the WebSphere MQ messaging provider removes the message from the test queue, and puts it onto this one.

The activation specification then goes back to monitoring the queue test for more messages to arrive.
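
The walkthrough above, and the default behaviour it replaces, can be compared using a simplified plain-Java model of a single poison message that is always rolled back. This is a sketch under stated assumptions rather than the WebSphere MQ messaging provider's code: the ActivationSpecSim class and its names are hypothetical, a Backout threshold of 0 is treated as "never back out", the endpoint is treated as stopped once the rollback counter reaches or exceeds the configured number of failures, and restarts of the activation specification are not modelled.

```java
final class ActivationSpecSim {

    enum Outcome { BACKED_OUT, ENDPOINT_STOPPED, STILL_REDELIVERING }

    static final class Result {
        final Outcome outcome;
        final int deliveries;   // times the message was handed to the MDB
        final String movedTo;   // queue the message was moved to, or null

        Result(Outcome outcome, int deliveries, String movedTo) {
            this.outcome = outcome;
            this.deliveries = deliveries;
            this.movedTo = movedTo;
        }
    }

    static Result run(boolean stopOnFailure, int failuresBeforeSuspend,
                      int backoutThreshold, String backoutRequeueQueue,
                      int maxDeliveries) {
        int redeliveryCount = 0;
        int rollbackCounter = 0;
        int deliveries = 0;
        while (true) {
            // Step 1: the backout check happens before the message is delivered.
            if (backoutThreshold > 0 && redeliveryCount >= backoutThreshold) {
                String target = backoutRequeueQueue != null
                        ? backoutRequeueQueue : "SYSTEM.DEAD.LETTER.QUEUE";
                return new Result(Outcome.BACKED_OUT, deliveries, target);
            }
            if (deliveries == maxDeliveries) {
                // Safety limit of the simulation only, not a product setting.
                return new Result(Outcome.STILL_REDELIVERING, deliveries, null);
            }
            // Step 2: deliver to the MDB, which rolls the message back.
            deliveries++;
            redeliveryCount++;
            // Step 3: count the failure and decide whether to stop the endpoint.
            if (stopOnFailure && ++rollbackCounter >= failuresBeforeSuspend) {
                return new Result(Outcome.ENDPOINT_STOPPED, deliveries, null);
            }
        }
    }

    public static void main(String[] args) {
        Result defaults = run(true, 0, 0, null, 100);
        Result example = run(true, 5, 1, "SYSTEM.DEAD.LETTER.QUEUE", 100);
        System.out.println(defaults.outcome + " after " + defaults.deliveries + " delivery");
        System.out.println(example.outcome + " to " + example.movedTo);
    }
}
```

Under the default settings the model stops the endpoint after the first rollback, whereas the example configuration backs the message out to SYSTEM.DEAD.LETTER.QUEUE after a single delivery and leaves the endpoint running.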


How the Maximum server sessions property affects poison messages

The activation specification advanced property Maximum server sessions defines the maximum number of messages that can be processed concurrently. If this property has a value of 10, and there are 10 messages on the destination being monitored by the activation specification, then all 10 messages will be processed at the same time, each by an internal server session associated with the activation specification.

It is important to note that if the Stop endpoint if message delivery fails checkbox is selected for an activation specification, and the Number of sequential delivery failures before suspending endpoint property is set to a value greater than zero, then the rollback counter maintained by the activation specification applies across all server sessions.

This means that if different poison messages are rolled back by different server sessions at the same time, then the activation specification might stop without trying to move the poison messages to the backout queue.

In addition to this, as soon as a message detected by an activation specification is successfully processed by a server session, the rollback counter for the activation specification is reset to zero.

For example, suppose our activation specification TestActivationSpecification has:

  • Stop endpoint if message delivery fails selected.
  • Number of sequential delivery failures before suspending endpoint property set to 3.
  • Maximum server sessions property set to 5.

The activation specification is configured to monitor the queue called test, which has the Backout threshold property set to 5 and the Backout requeue queue property set to SYSTEM.DEAD.LETTER.QUEUE.

When the activation specification starts up, there are ten poison messages on the queue. What happens?

The activation specification detects the first five messages, and delivers them to five server sessions for processing.

The first server session hands the first poison message to an MDB. The MDB tries to process it, finds it is unable to do so and rolls it back onto the queue. The redelivery count of the message is now set to 1, and the rollback counter associated with the activation specification is set to 1.

In parallel, the second server session gives the second poison message to another instance of the same MDB. This MDB instance tries to process the message, is unable to do so, and so rolls it back onto the queue. The redelivery count of this message is now set to 1. The internal rollback counter for the activation specification is set to 2.

While all this processing is going on, the third server session passes the third poison message to another MDB instance. This MDB instance also rolls the message back, as it is unable to process it. The redelivery count of the third poison message is set to 1, and more importantly the rollback counter for the activation specification is set to 3.

At this point, the WebSphere MQ messaging provider detects that the rollback counter is equal to the value of the Number of sequential delivery failures before suspending endpoint for the activation specification. As a result of this, the WebSphere MQ messaging provider stops the activation specification.

Poison messages 4 and 5 will still be processed by server sessions four and five respectively, as the messages were given to the server sessions before the activation specification was stopped.

It is also possible that poison messages 6 and 7 might also be processed before the activation specification is stopped. This is because they will be delivered to the first and second server sessions once those server sessions have rolled back poison messages 1 and 2.

In order to change this behaviour, and ensure that the activation specification always tries to move poison messages to the specified backout queue rather than stopping, ensure that the Stop endpoint if message delivery fails checkbox is unselected.
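
The effect of the shared rollback counter can be seen in a simplified, sequential plain-Java model of the scenario above. It is a sketch only: the SharedRollbackCounter class and its names are hypothetical, every message is assumed to be a poison message (so the counter is never reset by a success), messages are taken in strict rotation, and the in-flight deliveries that may still complete after the stop is triggered are not modelled.

```java
final class SharedRollbackCounter {

    static final class Result {
        final boolean stopped;   // the activation specification was stopped
        final int rollbacks;     // total rollbacks across all messages
        final int backedOut;     // messages moved to the backout queue

        Result(boolean stopped, int rollbacks, int backedOut) {
            this.stopped = stopped;
            this.rollbacks = rollbacks;
            this.backedOut = backedOut;
        }
    }

    static Result simulate(int messages, boolean stopOnFailure,
                           int failuresBeforeSuspend, int backoutThreshold) {
        if (backoutThreshold <= 0) {
            throw new IllegalArgumentException("model needs a Backout threshold above 0");
        }
        int[] redeliveryCount = new int[messages];   // one count per message
        boolean[] movedToBackout = new boolean[messages];
        int rollbackCounter = 0;                     // ONE counter for the whole endpoint
        int rollbacks = 0;
        int backedOut = 0;
        boolean stopped = false;

        while (backedOut < messages && !stopped) {
            for (int i = 0; i < messages && !stopped; i++) {
                if (movedToBackout[i]) {
                    continue;
                }
                if (redeliveryCount[i] >= backoutThreshold) {
                    movedToBackout[i] = true;        // backed out instead of delivered
                    backedOut++;
                    continue;
                }
                redeliveryCount[i]++;                // delivered, then rolled back
                rollbacks++;
                if (stopOnFailure && ++rollbackCounter >= failuresBeforeSuspend) {
                    stopped = true;
                }
            }
        }
        return new Result(stopped, rollbacks, backedOut);
    }

    public static void main(String[] args) {
        Result selected = simulate(10, true, 3, 5);
        Result unselected = simulate(10, false, 3, 5);
        System.out.println("selected:   stopped=" + selected.stopped
                + " backedOut=" + selected.backedOut);
        System.out.println("unselected: stopped=" + unselected.stopped
                + " backedOut=" + unselected.backedOut);
    }
}
```

In the first run, three different messages are each rolled back once, which is enough to stop the endpoint even though no single message is anywhere near the Backout threshold of 5. In the second run, every message eventually reaches the threshold and is backed out.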


Listener ports

Listener ports have been available since WebSphere Application Server V5, and provide an alternative mechanism for MDBs to monitor JMS destinations for messages. The behaviour of listener ports has not changed since WebSphere Application Server V6.1, and has been stabilised since WebSphere Application Server V7, which means that no new functionality has been added to them for a while.

This means that the information on listener ports in the developerWorks article How WebSphere Application Server V6 handles poison messages is still valid for WebSphere Application Server V8 and later.


WebSphere MQ security considerations

One question that comes up quite a lot is:

What WebSphere MQ authorizations does my WebSphere Application Server system need in order to back out messages?

In order for WebSphere Application Server to back out messages, the user ID under which the application server is running needs to have the following permissions on the backout requeue queue:

  • Get
  • Inquire
  • Pass All Context
  • Put
  • Set All Context

If the application server does not have these permissions, the application server will move the message to the dead letter queue that has been defined for the queue manager.


Using the WebSphere MQ provider on z/OS

The way poison messages are handled when using the WebSphere MQ JMS provider on z/OS® is the same as described above, with one minor difference.

WebSphere MQ on z/OS uses in-memory copies of messages. The first time a message is detected by a listener port, WebSphere MQ stores a copy of it in memory before passing it to the application server. If the message is processed successfully, the in-memory copy is deleted and the actual message is deleted from storage.

In the situation where the MDB rolls back the message, WebSphere MQ will increment the Redelivery count of the in-memory copy. The application server then looks at the value of this property on the copy to determine whether the actual message should be moved to the Backout requeue queue and whether the listener port should be stopped. If the value of the in-memory copy’s Redelivery count is more than the Backout threshold, the actual message is moved to the Backout requeue queue, and the in-memory copy deleted.

This has implications if the WebSphere MQ system is stopped before a message has been backed out.

Suppose you have an activation specification defined, called TestActivationSpec2. The activation specification is configured to monitor the WebSphere MQ queue test2 for messages, and to stop after 10 sequential delivery failures. The queue is hosted by a WebSphere MQ queue manager running on z/OS, and has the Backout threshold property set to 5.

A message arrives on the queue and is detected by TestActivationSpec2. A copy of the message is made, and is delivered to an MDB. However, the message cannot be processed, so the MDB rolls it back. The Redelivery count of the in-memory copy of the message is incremented, and now has the value 1.

The activation specification detects the message again, and once again delivers it to the MDB, where it is rolled back for a second time. The in-memory copy of the message now has a Redelivery count of 2.

Suppose that the z/OS queue manager is shut down at this point. Since the application server has only been working on an in-memory copy of the message, the Redelivery count of the actual message stored on the queue is still set to 0. When the queue manager is restarted, the activation specification will detect the message again. A new in-memory copy of the message is made, which has a Redelivery count of 0. When the message is rolled back by the MDB, the value of the copy’s Redelivery count will be incremented to 1.

This behaviour continues until the MDB has rolled back the message 5 times. At this point, the application server determines that the Redelivery count of the in-memory copy of the message is equal to the queue’s Backout threshold, and moves the actual message to the queue specified by the Backout requeue queue property.

In this scenario, the poison message has actually been processed seven times before it was backed out, which is more than the Backout threshold that has been defined for the queue.

This behaviour is not ideal.

To prevent this from happening, WebSphere MQ on z/OS needs to be configured to update the Redelivery count of the actual message in storage as well as the in-memory copy when a rollback occurs. This means that the Redelivery count value will be persisted, and therefore can survive a queue manager restart. To do this, the HardenGetBackout property (the HARDENBO attribute in MQSC) needs to be enabled for the queue being monitored by the MDB.
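
The arithmetic behind the "seven deliveries" scenario, and the effect of hardening the backout count, can be checked with a short plain-Java model. It is a sketch of the behaviour described in this section, not of WebSphere MQ internals: the HardenedBackoutCount class and its parameter names are hypothetical, and a single queue manager restart is assumed to happen after a given number of rollbacks.

```java
final class HardenedBackoutCount {

    /**
     * Returns how many times an always-failing message is delivered before it
     * is backed out, given one queue manager restart.
     *
     * @param backoutThreshold       Backout threshold of the queue
     * @param rollbacksBeforeRestart rollbacks that occur before the restart (0 = no restart)
     * @param hardened               whether the count is also written to the stored message
     */
    static int deliveriesBeforeBackout(int backoutThreshold,
                                       int rollbacksBeforeRestart,
                                       boolean hardened) {
        int storedCount = 0;     // Redelivery count persisted with the actual message
        int inMemoryCount = 0;   // Redelivery count on the in-memory copy
        int deliveries = 0;
        boolean restarted = false;

        while (inMemoryCount < backoutThreshold) {
            deliveries++;        // delivered to the MDB, which rolls it back
            inMemoryCount++;
            if (hardened) {
                storedCount = inMemoryCount;   // count survives a restart
            }
            if (!restarted && deliveries == rollbacksBeforeRestart) {
                restarted = true;
                inMemoryCount = storedCount;   // new in-memory copy after restart
            }
        }
        return deliveries;
    }

    public static void main(String[] args) {
        // Scenario from the article: threshold 5, restart after 2 rollbacks.
        System.out.println(deliveriesBeforeBackout(5, 2, false));
        System.out.println(deliveriesBeforeBackout(5, 2, true));
    }
}
```

Without hardening, the model delivers the message seven times (two before the restart and five after it). With hardening, the count carries over the restart and the message is backed out after five deliveries, matching the Backout threshold.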


Conclusion

This article described what a poison message is, and explained what JMS applications can do when they encounter one. You also learned how the default messaging provider and the WebSphere MQ JMS provider handle situations where an MDB rolls back a poison message, how the default behaviour can be changed, and how some activation specification properties affect the behaviour of the application server when using the WebSphere MQ JMS provider.
