IBM Support

High CPU in the IBM MQ channel initiator job during reallocation of messages on the SYSTEM.CLUSTER.TRANSMIT.QUEUE

Troubleshooting


Problem

The IBM MQ for z/OS channel initiator (CHIN) is experiencing extremely high CPU usage. Recycling IBM MQ does not help: the CHIN shows the same symptoms immediately after restart. One or more CLUSSDR channels may appear to be hung.

Symptom

Additional symptoms may include:
- CURDEPTH is high in DISPLAY QUEUE or DISPLAY QSTATUS output for SYSTEM.CLUSTER.TRANSMIT.QUEUE
- MSGAGE and QTIME increase in DISPLAY QSTATUS output for SYSTEM.CLUSTER.TRANSMIT.QUEUE

Cause

IBM MQ tries to reallocate cluster messages on the SYSTEM.CLUSTER.TRANSMIT.QUEUE to a different cluster channel, if possible, when a CLUSSDR goes into retry mode.

The problem occurs when the SYSTEM.CLUSTER.TRANSMIT.QUEUE contains thousands of messages that must be examined for reallocation. Small buffer pools aggravate it, because examining the messages then requires I/O to the page set.
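
As a rough illustration of the buffer pool effect, the following Python sketch estimates whether the messages on the transmission queue could fit in a buffer pool. It assumes 4 KB pages and buffers, and it ignores message headers, page overhead, and any other queues sharing the same buffer pool, so the result is an optimistic lower bound at best. The function names and the sample figures are hypothetical.

```python
PAGE_SIZE = 4096  # bytes; assumes 4 KB pages and 4 KB buffers


def pages_for_queue(depth, avg_message_bytes):
    """Rough lower bound on the 4 KB pages needed to hold `depth` messages.

    Ignores headers and per-page overhead (a simplifying assumption).
    """
    total_bytes = depth * avg_message_bytes
    return -(-total_bytes // PAGE_SIZE)  # integer ceiling division


def fits_in_buffer_pool(depth, avg_message_bytes, buffers):
    """True if the estimated pages do not exceed the number of buffers.

    Assumes the buffer pool is used by this queue alone, which is rarely
    the case in practice.
    """
    return pages_for_queue(depth, avg_message_bytes) <= buffers


# Hypothetical figures: 50,000 messages averaging 2 KB each.
pages_needed = pages_for_queue(50_000, 2048)
```

If the estimate exceeds the number of buffers, a pass over the queue is likely to need page set I/O; if it does not, the estimate alone cannot rule page set I/O out.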

Resolving The Problem

Check for channel error messages on both ends of the channel. Fix the channel problem. Depending on the error message in the CHIN joblog, this may include:
  • Start the listener on the remote end
  • Use ADOPTMCA / AdoptNewMCA on the remote end so that the CLUSRCVR channel can be automatically restarted after a communications failure. If this feature has not been implemented, stop the receiver channel with MODE(FORCE), then start it again to re-enable it.
  • Use TCPKEEP / KeepAlive / RCVTIME to recover from communications failures
  • Resolve network problems
  • If the channel problem cannot be fixed immediately, you can stop the channel to prevent it from retrying. See the additional information below about stopping the channel. Another option is to decrease LONGRTY or increase LONGTMR, so that retries occur less frequently.
  • If you do not need your messages bound to a specific instance of the cluster queue, then use the MQOO_BIND_NOT_FIXED option. This will allow the messages to be allocated to another channel. The reallocation process is less efficient if each iteration must examine thousands of messages put using the MQOO_BIND_ON_OPEN option.
  • Make sure your buffer pool allocations are large enough, as described in "Defining your buffer pools".
  • Be sure the target queue on the remote end is not full. If it is, the CLUSRCVR channel trying to put the message will receive reason 2053 MQRC_Q_FULL. If there is no Dead Letter Queue (DLQ) defined on the remote queue manager, or if the DLQ is full, the messages will build up on the sending SYSTEM.CLUSTER.TRANSMIT.QUEUE and the CLUSSDR channel will enter reallocation and retry processing.
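
To judge the effect of changing LONGRTY or LONGTMR, the following Python sketch counts how many retry attempts fall within a time window. It is a simplified model, not a description of how MQ implements retry: it assumes SHORTRTY attempts at SHORTTMR-second intervals followed by LONGRTY attempts at LONGTMR-second intervals, that each attempt fails immediately, and that every retry causes one reallocation pass over all bound messages. The default values used (10, 60, 999999999, 1200) are the commonly documented channel defaults; check your own channel definitions. The function names are hypothetical.

```python
def retry_offsets(shortrty=10, shorttmr=60, longrty=999_999_999, longtmr=1200):
    """Yield the seconds after the initial failure at which each retry occurs.

    Simplified model: short retries first, then long retries, with no time
    spent in the attempts themselves.
    """
    elapsed = 0
    for _ in range(shortrty):
        elapsed += shorttmr
        yield elapsed
    for _ in range(longrty):
        elapsed += longtmr
        yield elapsed


def retries_within(window_seconds, **channel_attrs):
    """Count the retry attempts that fall inside the given time window."""
    count = 0
    for offset in retry_offsets(**channel_attrs):
        if offset > window_seconds:
            break
        count += 1
    return count


def messages_examined(window_seconds, bound_depth, **channel_attrs):
    """Estimate total message examinations in the window.

    Assumes one reallocation pass per retry and that every bound message
    (for example, one put with MQOO_BIND_ON_OPEN) is examined on each pass.
    """
    return retries_within(window_seconds, **channel_attrs) * bound_depth
```

Under these assumptions, a 24-hour outage with the default values gives retries_within(86400) = 81 attempts, and raising LONGTMR to 3600 reduces that to 33. With 50,000 bound messages on the queue, the difference is roughly 4 million message examinations versus 1.65 million.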
 

Additional information



With WebSphere MQ for z/OS V7 and above:
1) Message CSQX191I is issued when a channel enters reallocation processing.
    The following diagnostic messages for reallocation processing were added by APARs:

      PI97990 / UI58200 for MQ V8.0.0
      PH00940 / UI58206 for MQ V9.0.0
      PH03175 / UI60812 for MQ V9.1.0

    CSQX179I
       Channel channel-name message reallocation is in progress,
       msg-progress messages of msg-total processed

    CSQX180I
       Channel channel-name completed message reallocation,
       msg-processed messages processed

2) STOP CHANNEL with MODE(FORCE) or MODE(TERMINATE) is allowed to interrupt reallocation processing.

3) Message CSQX192E is issued when STOP CHANNEL is entered without MODE(FORCE) or MODE(TERMINATE) while the channel is in reallocation processing.
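
If you want to track reallocation progress from a saved copy of the CHIN joblog, the following Python sketch extracts the counts from CSQX179I and CSQX180I. The patterns are built only from the message templates shown above. The tokens between the message ID and the word "Channel" in the sample (MQ01, CSQXXXXX), the channel name TO.QM2, and the line wrapping are hypothetical, so adjust the patterns to match your actual joblog layout.

```python
import re

# Up to three optional tokens (for example a command prefix or module name)
# may sit between the message ID and "Channel"; this allowance is an assumption.
_PREFIX = r"\s+(?:\S+\s+){0,3}?"

PROGRESS = re.compile(
    r"CSQX179I" + _PREFIX +
    r"Channel\s+(?P<channel>\S+)\s+message\s+reallocation\s+is\s+in\s+progress,"
    r"\s+(?P<done>\d+)\s+messages\s+of\s+(?P<total>\d+)\s+processed"
)
COMPLETE = re.compile(
    r"CSQX180I" + _PREFIX +
    r"Channel\s+(?P<channel>\S+)\s+completed\s+message\s+reallocation,"
    r"\s+(?P<done>\d+)\s+messages\s+processed"
)


def parse_reallocation(log_text):
    """Return the reallocation events found in log_text, in order of appearance."""
    matches = [(m.start(), m, False) for m in PROGRESS.finditer(log_text)]
    matches += [(m.start(), m, True) for m in COMPLETE.finditer(log_text)]
    events = []
    for _, m, complete in sorted(matches, key=lambda item: item[0]):
        events.append({
            "channel": m.group("channel"),
            "processed": int(m.group("done")),
            # CSQX180I carries no total, so it is reported as None.
            "total": None if complete else int(m.group("total")),
            "complete": complete,
        })
    return events


# Hypothetical joblog excerpt, for illustration only.
SAMPLE_LOG = """\
CSQX179I MQ01 CSQXXXXX Channel TO.QM2 message reallocation is in
 progress, 1500 messages of 42000 processed
CSQX180I MQ01 CSQXXXXX Channel TO.QM2 completed message
 reallocation, 42000 messages processed
"""

events = parse_reallocation(SAMPLE_LOG)
```

Whitespace between words is matched loosely so that a message wrapped across joblog lines is still recognized.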
 

Recycling the CHIN will also interrupt reallocation processing, but reallocation will occur again after startup if the channel again fails to start.


Effects that may be seen if reallocation processing is interrupted:
- Messages are delivered in a different order from that in which they were put.
- Messages remain on the SYSTEM.CLUSTER.TRANSMIT.QUEUE until the channel is manually restarted despite there being alternative destinations.


Product Synonym

WMQ WebSphere MQ

Document Information

Modified date:
17 July 2020

UID

swg21199910