I was recently at a customer site where a message had been lost somewhere in their MQ infrastructure, and I was asked to document how to find it.
Take a scenario where messages are sent from z/OS to a queue in a cluster on a distributed platform.
- Check the chinit job log for messages. You can use the MQLOG* exec described elsewhere in this blog to display error-type messages. Check for messages similar to:
- +CSQX506E +cpf CSQXRCTL Message receipt confirmation not received for channel z_to_Linux
- +CSQX527E +cpf CSQXRCTL Unable to send message for channel z_to_Linux
- +CSQX544E +cpf CSQXRCTL Messages for channel z_to_Linux sent to remote dead-letter queue
- Check that the channels are started. Use +cpf DIS CHS(...) to check.
- If this is a cluster channel, find out the possible locations. Use +cpf DIS QCLUSTER(queuename). This will report information like
Check the queue managers listed (MQPC in the above example). If the message was put 10 minutes ago, and the last message was sent over a channel more than an hour ago, the message was clearly not sent over this channel.
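The timing check above can be sketched in a few lines of Python. This is only an illustration, not a tool: it assumes you have copied the put time of the missing message and the channel's last-message date and time (for example the LSTMSGDA and LSTMSGTI values from the channel status display, assumed here to be in yyyy-mm-dd and hh.mm.ss form) by hand. The function name and the five-minute tolerance for clock differences are made up for this example.

```python
from datetime import datetime, timedelta


def could_have_used_channel(put_time, last_msg_date, last_msg_time,
                            tolerance=timedelta(minutes=5)):
    """Return True if the channel last sent a message at or after the put time.

    put_time      -- datetime when the application put the message
    last_msg_date -- date of the last message on the channel, assumed "yyyy-mm-dd"
    last_msg_time -- time of the last message on the channel, assumed "hh.mm.ss"
    tolerance     -- allowance for clock differences (hypothetical default)
    """
    last_sent = datetime.strptime(
        f"{last_msg_date} {last_msg_time}", "%Y-%m-%d %H.%M.%S"
    )
    return last_sent >= put_time - tolerance


put_time = datetime(2024, 1, 15, 12, 0, 0)
# Channel last sent a message over an hour before the put: rule it out.
print(could_have_used_channel(put_time, "2024-01-15", "10.45.00"))
# Channel sent a message shortly after the put: keep it as a candidate.
print(could_have_used_channel(put_time, "2024-01-15", "12.00.30"))
```

Both times are treated as being in the same time zone; if the systems run on different clocks, convert them before comparing.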
For each potential queue manager:
- Check the logs. Are any problems reported?
- Check the application queue on the system. If it is a remote or clustered queue, find out where the queue is located and see if the message is there.
- If there are messages on the queue, the message may be stuck there. Use DIS QSTATUS to display the age of the oldest message on the queue and the number of input handles open. If the number of input handles is 0, no application has the queue open for input.
- Use DIS QMGR DEADQ to identify the dead-letter queue for the queue manager. This may be a remote or clustered queue, so you will have to find where the queue(s) are located. Check whether the dead-letter queue has depth > 0. If so, investigate the messages on the queue.
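The queue status reasoning above can be written down as a small Python sketch. It assumes you have already read the values from the DIS QSTATUS output: CURDEPTH (current depth), IPPROCS (handles open for input) and MSGAGE (age of the oldest message in seconds, which is only reported when queue monitoring is active, so it may be missing). The function name, the argument names and the wording of the findings are hypothetical.

```python
def triage_queue(curdepth, ipprocs, msgage, seconds_since_put):
    """Turn queue status values into a list of plain-text findings.

    curdepth          -- current number of messages on the queue
    ipprocs           -- number of handles open for input
    msgage            -- age in seconds of the oldest message, or None if not reported
    seconds_since_put -- how long ago the missing message was put
    """
    findings = []
    if curdepth == 0:
        findings.append("queue is empty, so the message is not here now")
        return findings
    if ipprocs == 0:
        findings.append("no application has the queue open for input")
    if msgage is not None and msgage >= seconds_since_put:
        findings.append("oldest message is old enough to be the missing one")
    return findings


# 12 messages on the queue, nobody reading, oldest message 15 minutes old,
# missing message put 10 minutes ago.
for finding in triage_queue(12, 0, 900, 600):
    print(finding)
```

The same function can be pointed at the dead-letter queue values: any depth greater than 0 there is worth investigating.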
What else may have happened to it?
- It may have expired, in which case it is deleted.
- If the EXPIRY report option is specified, a report message will be sent to the reply-to queue. Did the application reading this queue know what to do with a report message? Did it report the event, or did it just throw it away?
- Is the report message stuck somewhere, or is it undeliverable?
- Did an application process it? An application may have logic like: if message type A then do A_logic, else if message type B then do B_logic, else ignore it and get the next message.
- Is it a shared queue - so the message was processed on a different LPAR?
- Did someone clear the queue perhaps using the CLEAR QLOCAL command?
- Did the application that put the message commit it, or did it roll back? If it rolled back, the message was not successfully put.
- It may be a poisoned message: an application does MQGET, abends, rolls back, and does this repeatedly. Applications need logic to say: if the message has been backed out more than 3 times, do not look inside it; just put it somewhere safe, such as the dead-letter queue.
- Do not assume that all your queue managers are identical. On 99% of your distributed queue managers the definition may be the same: a local queue with a maximum message size of 10 KB. On one of your queue managers, the queue may be defined as a cluster queue going somewhere totally different, or may have a maximum message size of 1 KB.
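The poisoned-message logic mentioned in the list above might look something like the following Python sketch. It is deliberately simplified and does not call MQ at all: the message is a plain dictionary, `backout_count` stands in for the backout count that MQ returns in the message descriptor, and the `handle` function, the `process` callback and the `safe_queue` list are all hypothetical names for this illustration.

```python
def handle(message, process, safe_queue, threshold=3):
    """Process a message unless it has been backed out too many times.

    message    -- dict with "backout_count" and "payload" (illustrative layout)
    process    -- callable that does the real work on the payload
    safe_queue -- list standing in for a safe place such as the dead-letter queue
    threshold  -- maximum number of backouts tolerated before giving up
    """
    if message["backout_count"] > threshold:
        # Do not look inside the message; just move it somewhere safe.
        safe_queue.append(message)
        return "moved"
    process(message["payload"])
    return "processed"


safe_queue = []
seen = []

# A message backed out 4 times is moved without its payload being examined.
print(handle({"backout_count": 4, "payload": "bad data"}, seen.append, safe_queue))
# A message with no backouts is processed normally.
print(handle({"backout_count": 0, "payload": "good data"}, seen.append, safe_queue))
```

If an application uses this pattern, the safe queue is one more place to look for the missing message.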