A fix is available
APAR status
Closed as program error.
Error description
MQ Development describes the following sequence of events. A START CHINIT command was issued from a CSQINP2 data set. The channel initiator address space started, but its initialisation did not complete immediately. Meanwhile, automation issued a second START CHINIT command. This happened before the first CHINIT instance had successfully connected to the QMGR, and this timing window allowed a second channel initiator address space to start.
The first CHINIT instance then connected to the QMGR and placed its ASID and xGwa address in the QMGR's MGBL control block. While processing the second START CHINIT command, the command processor overwrote the saved CHINIT ASID in the QMGR's MGBL with the ASID of the second CHINIT instance. The second CHINIT instance then tried to connect to the QMGR and detected that another CHINIT was already connected. As a result, the second CHINIT instance failed to start with message CSQX007E and reason MQRC_DUPLICATE_RECOV_COORD.
This left the QMGR's MGBL control block in an inconsistent state: the xGwa address correctly pointed to the xGwa in the original CHINIT address space, but the saved ASID belonged to a channel initiator address space that had since ended. This ASID was later claimed by a different job.
The QMGR checks whether the CHINIT address space is connected in various ways, but one of the first checks is that the ASCB associated with the ASID saved in the MGBL has the correct CHINIT jobname. If this check fails, the QMGR assumes that the CHINIT is not started. Later attempts to start the CHINIT can therefore start new CHINIT instances, because the QMGR does not consider the CHINIT to be started. These instances subsequently end because they fail to connect to the QMGR, with message CSQX007E.
This APAR investigates improving the serialisation between multiple CHINIT instances that are starting, to guard against this inconsistent state.
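The sequence above can be replayed in a small toy model to show how the saved ASID and the xGwa owner end up disagreeing. This is only an illustrative sketch: the class names, method names, and ASID values are hypothetical, and none of it represents the actual MQ control block layout or code.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Mgbl:
    """Toy stand-in for the MGBL fields mentioned in the APAR (hypothetical)."""
    saved_asid: Optional[int] = None   # ASID the QMGR believes is the CHINIT
    xgwa_owner: Optional[int] = None   # ASID whose xGwa address is recorded


class ToyQmgr:
    """Hypothetical model of the pre-fix behaviour; not MQ's actual code."""

    def __init__(self) -> None:
        self.mgbl = Mgbl()
        self.connected: Optional[int] = None
        self.live_chinits: Set[int] = set()

    def launch(self, asid: int) -> None:
        # START CHINIT causes a channel initiator address space to start.
        self.live_chinits.add(asid)

    def command_saves_asid(self, asid: int) -> None:
        # Pre-fix: the command processor records the ASID unconditionally.
        self.mgbl.saved_asid = asid

    def connect(self, asid: int) -> bool:
        if self.connected is not None:
            # Duplicate instance: CSQX007E, and the address space ends.
            self.live_chinits.discard(asid)
            return False
        self.connected = asid
        self.mgbl.saved_asid = asid
        self.mgbl.xgwa_owner = asid
        return True

    def chinit_looks_active(self) -> bool:
        # Stand-in for the jobname check on the ASCB of the saved ASID.
        return self.mgbl.saved_asid in self.live_chinits


def replay() -> ToyQmgr:
    qm = ToyQmgr()
    first, second = 0x50, 0x51      # arbitrary example ASIDs
    qm.launch(first)                # START CHINIT from CSQINP2
    qm.launch(second)               # automation's START CHINIT
    qm.connect(first)               # first instance connects
    qm.command_saves_asid(second)   # second command overwrites the ASID
    qm.connect(second)              # rejected; second instance ends
    return qm
```

After `replay()`, the model still has the first instance connected and owning the xGwa, while `chinit_looks_active()` reports that no channel initiator is running, which mirrors the inconsistent state described above.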
Local fix
Allow enough time between START CHINIT commands for the channel initiator address space to initialise before any further START CHINIT command is issued, whether manually or through automation.
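For automation, one way to apply this local fix is to suppress repeat start requests that arrive within a settling interval. The sketch below is hypothetical: `StartChinitGuard`, `issue_start`, and the default interval of 60 seconds are illustrative assumptions, not MQ or automation product interfaces, and the APAR does not state how long the interval needs to be.

```python
import time
from typing import Callable, Optional


class StartChinitGuard:
    """Hypothetical helper: drops START CHINIT requests issued too soon."""

    def __init__(
        self,
        issue_start: Callable[[], None],       # assumed callback that issues the command
        settle_seconds: float = 60.0,          # arbitrary value, tune per site
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._issue_start = issue_start
        self._settle = settle_seconds
        self._clock = clock
        self._last_issued: Optional[float] = None

    def request_start(self) -> bool:
        """Issue the command unless one was issued within the settle window."""
        now = self._clock()
        if self._last_issued is not None and now - self._last_issued < self._settle:
            return False   # previous start may still be initialising
        self._last_issued = now
        self._issue_start()
        return True
```

The guard only spaces out requests; it does not check whether the earlier channel initiator actually connected, so it narrows the timing window rather than closing it.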
Problem summary
****************************************************************
* USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
*                 Release 1 Modification 0 and Release 2       *
*                 Modification 0.                              *
****************************************************************
* PROBLEM DESCRIPTION: Message "CSQM131I +xxxx CSQMDCST        *
*                      CHANNEL INITIATOR NOT ACTIVE, CLUSTER   *
*                      AND CHANNEL COMMANDS INHIBITED" is      *
*                      issued when the channel initiator is    *
*                      active.                                 *
****************************************************************
If multiple attempts to start the channel initiator occur within a short interval, a timing window exists in which multiple xxxxCHIN address spaces are started. One of these successfully connects to the queue manager, and the others fail after issuing "CSQX007E +xxxx CSQXADPI Unable to connect to queue manager xxxx MQCC=2 MQRC=2163 (MQRC_DUPLICATE_RECOV_COORD)".
Depending on which instance connects, and on when the associated START CHINIT command ran, the wrong ASID can be stored in the MGBL. When subsequent commands (for example DIS CHSTATUS or STOP CHINIT) check whether the channel initiator is running, this incorrect ASID value causes the command to determine wrongly that the channel initiator is not running, which prevents the command from being executed.
Problem conclusion
Channel initiator startup processing is changed so that the ASID is stored only for a channel initiator address space that has successfully connected to the queue manager.
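The idea behind this change can be sketched in a toy model in which the ASID is recorded only on a successful connect, so a rejected duplicate instance can no longer disturb it. The class, its methods, and the ASID values are hypothetical illustrations and do not reflect the actual MQ implementation.

```python
from typing import Optional, Set


class ToyFixedQmgr:
    """Hypothetical model of the post-fix rule; not MQ's actual code."""

    def __init__(self) -> None:
        self.saved_asid: Optional[int] = None
        self.connected: Optional[int] = None
        self.live_chinits: Set[int] = set()

    def launch(self, asid: int) -> None:
        # START CHINIT starts an address space but records no ASID yet.
        self.live_chinits.add(asid)

    def connect(self, asid: int) -> bool:
        if self.connected is not None:
            # Duplicate instance is rejected and ends; saved ASID is untouched.
            self.live_chinits.discard(asid)
            return False
        self.connected = asid
        self.saved_asid = asid   # stored only after a successful connect
        return True

    def chinit_looks_active(self) -> bool:
        # Stand-in for the check made by later commands.
        return self.saved_asid in self.live_chinits


qm = ToyFixedQmgr()
qm.launch(0x50)                  # arbitrary example ASIDs
qm.launch(0x51)
first_ok = qm.connect(0x50)      # first instance connects
second_ok = qm.connect(0x51)     # duplicate is rejected
```

In this model the saved ASID stays with the connected instance whichever order the two start commands complete in, so the later activity check succeeds.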
Temporary fix
Comments
APAR Information
APAR number
PH37290
Reported component name
IBM MQ Z/OS V9
Reported component ID
5655MQ900
Reported release
100
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-05-17
Closed date
2022-02-16
Last modified date
2022-04-01
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UI79379 UI79380
Modules/Macros
CSQMCCHT CSQMSCHI
Fix information
Fixed component name
IBM MQ Z/OS V9
Fixed component ID
5655MQ900
Applicable component levels
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
02 April 2022