APAR status
Closed as program error.
Error description
A WebSphere Application Server v8.5 system was upgraded to v9.0. An activation specification running within the application server repeatedly stopped when the system was under load, with a corresponding message in the SystemOut.log of the form:

CWWMQ0007W: The message endpoint myApp#MyAppEjb.jar#myMDB has been paused by the system. Message delivery failed to the endpoint more than 0 times. The last attempted delivery failed with the following error:
javax.jms.TransactionRolledBackException:
  at com.ibm.mq.connector.inbound.AbstractWorkImpl.xaStateChanged
  at com.ibm.mq.connector.xa.XAObservable.update
  at com.ibm.mq.connector.xa.XARWrapper.rollback
  at com.ibm.tx.jta.impl.JTAXAResourceImpl.rollback
  at com.ibm.tx.jta.impl.RegisteredResources.deliverOutcome
  at com.ibm.tx.jta.impl.RegisteredResources.distributeOutcome
  at com.ibm.tx.jta.impl.RegisteredResources.distributeRollback
  at com.ibm.tx.jta.impl.TransactionImpl.internalRollback
  at com.ibm.tx.jta.impl.TransactionImpl.internalRollback
  at com.ibm.tx.jta.impl.TransactionImpl.rollback
  at com.ibm.ws.tx.jta.TransactionImpl.rollback
  at com.ibm.ws.tx.jta.TranManagerImpl.rollback
  at com.ibm.tx.jta.impl.TranManagerSet.rollback
  at com.ibm.ejs.csi.TranStrategy.rollback
  at com.ibm.ejs.csi.TranStrategy.postInvoke
  at com.ibm.ejs.csi.TransactionControlImpl.postInvoke
  at com.ibm.ejs.container.EJSContainer.postInvoke
  at com.ibm.ws.ejbcontainer.mdb.MessageEndpointBase.afterDelivery
  at com.ibm.mq.connector.inbound.AbstractWorkImpl.run
  at com.ibm.ejs.j2c.work.WorkProxy.run
  at com.ibm.ws.util.ThreadPool$Worker.run

The activation specification had the default configuration, in which the endpoint is suspended on the first delivery failure. This is configured through the Administration Console using the path:

Resources -> JMS -> Activation specifications -> [select activation specification] -> Advanced properties

At the bottom of the page, there is an entry for:

"Number of sequential delivery failures before suspending endpoint"

However, the MDB's method:

javax.jms.MessageListener.onMessage(javax.jms.Message)

was not failing to process the message. It was also noticed that, before the activation specification shut down, the following message was written to the SystemOut.log:

CWSJY0003W: JMSCC0108: An attempt to get a message for delivery to an message listener was made, but the message was not there.
Local fix
Set the queue manager property "MARKINT" to a value larger than the longest time you expect the MDB's onMessage() method to take. For example, if your MDB takes in the region of 45 seconds to complete its processing, changing the MARKINT value to 60 seconds using the runmqsc syntax:

ALTER QMGR MARKINT(60000)

will limit the number of times this problem occurs on your WebSphere Application Server system.

Also consider changing the activation specification behaviour that controls when the endpoint is shut down. This is configured in the Administration Console:

Resources -> JMS -> Activation specifications -> [select ActSpec] -> Advanced properties

At the bottom of the page, you will see an entry for:

"Number of sequential delivery failures before suspending endpoint"

This has the default value of '0'. If you change this to a larger value, for example '5', the endpoint will only pause if 5 sequential failures are recorded, which is less likely.
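As a rough illustration of the sizing rule above, the following Java sketch derives a MARKINT value in milliseconds from the longest expected onMessage() duration plus some headroom, and prints the matching runmqsc command. The class MarkIntervalAdvisor, its methods and the 15-second headroom are hypothetical and chosen only for this example; they are not part of IBM MQ or WebSphere Application Server.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not an IBM MQ or WebSphere API.
final class MarkIntervalAdvisor {

    // MARKINT is expressed in milliseconds, so add the headroom to the
    // longest expected onMessage() duration and convert to milliseconds.
    static long markIntervalMillis(Duration longestOnMessage, Duration headroom) {
        if (longestOnMessage.isNegative() || headroom.isNegative()) {
            throw new IllegalArgumentException("durations must not be negative");
        }
        return Math.addExact(longestOnMessage.toMillis(), headroom.toMillis());
    }

    // Builds the MQSC command text shown in this APAR for a given value.
    static String alterQmgrCommand(long markIntervalMillis) {
        return "ALTER QMGR MARKINT(" + markIntervalMillis + ")";
    }

    public static void main(String[] args) {
        // The 45-second MDB from the example above, with an assumed 15 seconds of headroom.
        long markInt = markIntervalMillis(Duration.ofSeconds(45), Duration.ofSeconds(15));
        System.out.println(alterQmgrCommand(markInt)); // prints ALTER QMGR MARKINT(60000)
    }
}
```

The printed command still has to be run in runmqsc against the queue manager; the sketch only does the arithmetic.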
Problem summary
****************************************************************
USERS AFFECTED:
Users of WebSphere Application Server (tWAS) v9.0 who are using activation specifications which are driven off MQ destinations, where the activation specifications are running in a transacted environment.

This problem does not affect MQ activation specifications running within the WebSphere Liberty Profile, or other application servers which are using MQ activation specifications.

Platforms affected:
MultiPlatform
****************************************************************
PROBLEM DESCRIPTION:
The default operation of MQ activation specifications is to use a two-phase message get mechanism, the two phases being:

(1) Browse the destination looking for suitable messages.
(2) Once a suitable message is found, a destructive get is issued on the message, before the MDB's onMessage() method is called.

Because of these two phases, it is possible that a message browsed in phase (1) has disappeared by the time phase (2) attempts to consume it. For example, the message may have been removed by another application consuming messages from the same queue, or the message may have expired between the browse and the attempt to get it.

The first time this happens, the activation specification puts out the warning message:

JMSCC0108: An attempt to get a message for delivery to an message listener was made, but the message was not there.

The objective of this message is to warn the application server administrator that the system is not fully optimised. The most likely cause is that the message's "MARKINT" timer has expired, and the message is browsed for a second time by an activation specification browsing thread.

This "MARKINT" mechanism is intended to prevent multiple activation specifications from trying to process the same message on a destination.
It does this by placing a flag on the message to indicate that the message has already been browsed. The flag expires after a period of time which defaults to 5000ms (5 seconds). The activation specification browsing thread instructs the queue manager not to return any messages which have this flag on them.

This means that the MDB instances that are already running have a maximum of 5 seconds to finish processing their current message and become available to process the newly browsed message. If no MDB thread completes its existing processing within this 5 second period, the browse-mark flag is removed from the message, making it available to browsing threads once again.

There has been a change of behaviour in the WebSphere Application Server between JCA 1.5 (as used in WAS 8.5 and earlier with the MQ v7 Resource Adapter), and JCA 1.7 (as used by WAS 9.0 with the MQ v9.0 Resource Adapter).

When using JCA 1.5, if the message was not available in phase (2), the transaction under which the attempt to consume the message was made was subsequently committed.

When using JCA 1.7, in this scenario the application server detects that the MDB has not been called, and rolls the transaction back. Rolling the transaction back then triggered the MQ Resource Adapter to flag to the application server that the message delivery had failed. This counted towards the delivery failure count for the activation specification, which, once exceeded, would trigger the application server to shut the activation specification down.

The default value for the sequential delivery failures was 0, meaning that the activation specification would shut down the first time the JMSCC0108 message was output.
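The browse-mark timing described above can be pictured with a small toy model. This is only a sketch of the two-phase mechanism as this APAR describes it: the class BrowseMarkQueue and its methods are hypothetical and do not correspond to any IBM MQ or WebSphere API, and time is passed in explicitly as milliseconds rather than read from a clock.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Toy model only: this type does not exist in IBM MQ or WebSphere.
final class BrowseMarkQueue {
    private final long markIntervalMillis;
    // Message id -> time (ms) until which the browse mark hides the message.
    private final Map<String, Long> markedUntil = new LinkedHashMap<>();

    BrowseMarkQueue(long markIntervalMillis) {
        this.markIntervalMillis = markIntervalMillis;
    }

    void put(String messageId) {
        markedUntil.put(messageId, 0L); // 0 means "not currently marked"
    }

    // Phase (1): return the first message whose mark has expired, and mark it again.
    Optional<String> browse(long nowMillis) {
        for (Map.Entry<String, Long> entry : markedUntil.entrySet()) {
            if (entry.getValue() <= nowMillis) {
                entry.setValue(nowMillis + markIntervalMillis);
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    // Phase (2): destructive get. A false result models the JMSCC0108 situation.
    boolean destructiveGet(String messageId) {
        return markedUntil.remove(messageId) != null;
    }

    public static void main(String[] args) {
        BrowseMarkQueue queue = new BrowseMarkQueue(5000); // default MARKINT per this APAR
        queue.put("msg-1");
        System.out.println(queue.browse(0));               // browsed and marked
        System.out.println(queue.browse(3000));            // hidden by the mark
        System.out.println(queue.browse(6000));            // mark expired, browsed a second time
        System.out.println(queue.destructiveGet("msg-1")); // first scheduled delivery gets it
        System.out.println(queue.destructiveGet("msg-1")); // second finds the message gone
    }
}
```

In this model the same message is handed out twice because no consumer picked it up within the mark interval, so the second destructive get finds nothing. That is the condition under which the warning is written.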
Problem conclusion
The MQ Resource Adapter has been updated such that when it fails to find a message on the destination during phase (2), and the application server subsequently rolls back the transaction, the resource adapter does not inform the application server that message delivery had failed.

This reverts the behaviour of the shutting down of the activation specification back to how it was in earlier versions of the WebSphere Application Server when using JCA 1.5.

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version      Maintenance Level
v9.0 LTS     9.0.0.6
v9.1 CD      9.1.2
v9.1 LTS     9.1.0.2

The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
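The effect of this change on the endpoint's failure accounting can be sketched as follows. This is a toy model under stated assumptions, not the application server's implementation: the class EndpointFailureTracker and the Outcome enum are hypothetical, the "pause when the count exceeds the configured value" rule is inferred from the wording of CWWMQ0007W ("more than 0 times"), and leaving the counter unchanged for a missing message after the fix is also an assumption.

```java
// Toy model only: these types are hypothetical, not WebSphere or IBM MQ APIs.
final class EndpointFailureTracker {
    enum Outcome { DELIVERED, LISTENER_FAILED, MESSAGE_NOT_FOUND }

    // Models "Number of sequential delivery failures before suspending endpoint".
    private final int failureThreshold;
    // true models the behaviour before the fix, false the behaviour after it.
    private final boolean countMissingMessageAsFailure;
    private int sequentialFailures;
    private boolean paused;

    EndpointFailureTracker(int failureThreshold, boolean countMissingMessageAsFailure) {
        this.failureThreshold = failureThreshold;
        this.countMissingMessageAsFailure = countMissingMessageAsFailure;
    }

    void record(Outcome outcome) {
        if (paused) {
            return; // a paused endpoint receives no further deliveries
        }
        switch (outcome) {
            case DELIVERED:
                sequentialFailures = 0; // a success breaks the sequence
                break;
            case LISTENER_FAILED:
                sequentialFailures++;
                break;
            case MESSAGE_NOT_FOUND:
                // Assumption: after the fix the counter is simply left unchanged.
                if (countMissingMessageAsFailure) {
                    sequentialFailures++;
                }
                break;
        }
        // Assumption taken from the CWWMQ0007W wording: pause once the count exceeds the threshold.
        if (sequentialFailures > failureThreshold) {
            paused = true;
        }
    }

    boolean isPaused() {
        return paused;
    }

    public static void main(String[] args) {
        EndpointFailureTracker beforeFix = new EndpointFailureTracker(0, true);
        beforeFix.record(Outcome.MESSAGE_NOT_FOUND);
        System.out.println("before fix paused: " + beforeFix.isPaused());

        EndpointFailureTracker afterFix = new EndpointFailureTracker(0, false);
        afterFix.record(Outcome.MESSAGE_NOT_FOUND);
        System.out.println("after fix paused: " + afterFix.isPaused());
    }
}
```

With the default value of 0, the model pauses the endpoint on the first missing message before the fix and leaves it running afterwards, while a genuine onMessage() failure still pauses it in both cases.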
Temporary fix
Comments
APAR Information
APAR number
IT26571
Reported component name
IBM MQ BASE M/P
Reported component ID
5724H7261
Reported release
900
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2018-10-09
Closed date
2018-10-23
Last modified date
2018-10-23
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
IBM MQ BASE M/P
Fixed component ID
5724H7261
Applicable component levels
Product: IBM MQ; Version: 9.0; Platform: Platform Independent; Business Unit: Cloud & Data Platform; Line of Business: Automation
Document Information
Modified date:
23 October 2018