Dear Readers,
Since I have seen this issue a couple of times before, and since everybody will deploy something someday, I thought it would be worth a post.
Symptoms
- It appears our patient has a very bad flu and cannot deploy snapshots to his Process Server (PS) anymore
- He is using a golden topology and the issue is impacting several process applications
- The following exceptions occur in SystemOut.log of PS-AppTarget:
[10/27/14 23:36:39:757 EDT] 0000007d wle I Exception in EJB call
[10/27/14 23:36:39:757 EDT] 0000007d wle E CWLLG2164E: Install did not complete successfully because of an unexpected exception. Error: C:**** com.lombardisoftware.client.delegate.BusinessDelegateException: CWLLG0433E: Tracking definitions were not sent successfully. Details: UOWManager transaction processing failed; nested exception is com.ibm.wsspi.uow.UOWException: javax.transaction.RollbackException
...
Caused by: com.ibm.wsspi.uow.UOWException: javax.transaction.RollbackException
...
Diagnosis
- Let's take a look at that exception. The "install did not complete successfully" (CWLLG2164E) because "Tracking definitions were not sent successfully" (CWLLG0433E). I see.
- So that makes me think... Hmm, tracking definitions... Isn't that stuff related to Performance Data Warehouse (PDW) and data defined in the process app/toolkit? And those are tracked there if I am not mistaken.
- Oh well, PDW-related data might be logged in the Support-Cluster. Worth a shot.
- The analysis in the SystemOut.log of the PS-Support-Cluster reveals the following:
[10/27/14 23:36:39:133 EDT] 00000035 XATransaction E J2CA0027E: An exception occurred while invoking prepare on an XA Resource Adapter from DataSource jms/DataDefLoaderConnectionFactory, within transaction ID {XidImpl: formatId(), gtrid_length(), bqual_length(), data()} : javax.transaction.xa.XAException: CWSIC8007E: An exception was caught from the remote server with Probe Id 3-013-0010. Exception: CWSIC2029E: This transaction cannot commit as an operation that was performed within the transaction boundary failed. The first operation that failed generated the following exception: com.ibm.ws.sib.processor.exceptions.SIMPLimitExceededException: CWSIK0025E: The destination DataDefLoaderQueueDestination.PSMECluster on messaging engine PSME.PERFDW.BPMCell.Bus is not available because the high limit for the number of messages for this destination has already been reached...
...
Caused by: com.ibm.wsspi.sib.core.exception.SILimitExceededException: CWSIK0025E: The destination DataDefLoaderQueueDestination.PSMECluster on messaging engine PSME.PERFDW.BPMCell.Bus is not available because the high limit for the number of messages for this destination has already been reached.
- To sum it up, the exception says that the destination DataDefLoaderQueueDestination.PSMECluster on messaging engine PSME.PERFDW.BPMCell.Bus is not available because the high limit for the number of messages for this destination has already been reached. Aha!
- Now THAT is the information I was looking for. It seems that the message limit of the DataDefLoaderQueue has been reached, hence the tracking definitions could not be sent, and therefore the deployment failed.
- That does make sense, I guess.
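If you would rather not read the Support-Cluster log by eye, the check above can be sketched in a few lines of Python. This is only a minimal sketch: the function name find_full_destinations is my own invention, and the pattern assumes the CWSIK0025E message is worded exactly as in the excerpt above.

```python
import re

# Matches the CWSIK0025E text as it appears in the log excerpt above.
# Assumption: destination and messaging engine names contain no whitespace.
LIMIT_RE = re.compile(
    r"CWSIK0025E: The destination (?P<destination>\S+) "
    r"on messaging engine (?P<engine>\S+) is not available"
)


def find_full_destinations(log_lines):
    """Return the set of (destination, messaging engine) pairs reported as full.

    log_lines is any iterable of strings, e.g. an open SystemOut.log file.
    A pair is reported once, even if the message repeats in "Caused by" lines.
    """
    hits = set()
    for line in log_lines:
        for match in LIMIT_RE.finditer(line):
            hits.add((match.group("destination"), match.group("engine")))
    return hits
```

To try it, open the SystemOut.log of the PS-Support-Cluster and pass the file object to find_full_destinations; each returned pair names a queue whose message limit has been hit.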
Potential treatments
To heal the patient, I'd recommend cleaning up messages from the DataDefLoader queue. This should free up space for tracking definitions, and deployments should no longer fail with this exception.
To do so, log in to your WAS admin console and open the Runtime tab under Service Integration > Buses > bus_name > Bus members > Messaging engines for SingleCluster > cluster_name > Queue points > DataDefLoaderQueueDestination_Bus > messages (the messaging engine must be running).
There are usually messages with the content "Request PDW transfer" (approx. 20 bytes in size). These messages are just trigger messages that tell the PDW application to start tracking data transfers, and it is usually safe to delete them.
In addition, we can go all the way with antibiotics, which accelerate the treatment:
The patient might identify BPDs and toolkits where autotracking is enabled, see:
Furthermore, we might use a script to remove trigger messages, as described here:
And if that ain't helping, take two of these and call me in the morning.
Your Doc D.