A fix is available
APAR status
Closed as program error.
Error description
Peer Level Recovery for a CF structure can hang when multiple queue managers are restarted at the same time. They previously disconnected in an unclean fashion, causing EeplExistingConnection events to be triggered. GRS ENQs are held for the CSQ_ADMIN structure and an application structure.

Resulting symptoms include:
- /D GRS,C,LATCHID shows a GRS ENQ wait for SYSZCSQE CSQESTOP_RESTARTING_QMGR_&_PEERS_RECOVERING_AT_SAME TIME
- Title: ABEND=S026,REASON=08118001,CONNECTOR HANG: CONNAME =name,JOBNAME=ssid9MSTR
- Title: ABN=5C6-00C510A4,U=SYSOPR ,C=R3600.710.CFM - CSQERWI2,M=CSQGFRCV,LOC=CSQELPLM.CSQERWI2+00001C02
  00C510A4 was preceded by IXLRSNCODERSPNOTREC EQU X'00000C27' "All surviving connections have not responded via IXLEERSP for the requested connection."
- Title: QUEUE MANAGER TERMINATION REQUESTED, REASON=00C510AB
- CSQE007I CSQESTE EEPLEXISTINGCONNECTION event received for structure <structure> connection name <name>
- CSQE007I CSQESTE EEPLDISCFAILCONNECTION event received for structure <structure> connection name <name>
- CSQE021I CSQECONN Structure <structure> connection as <name> warning, RC=00000004 reason=02010407 codes=00000000 00000000 00000000
- CSQE008I CSQESTE Recovery event from <queue manager> received for structure <structure>
- IXL040E CONNECTOR NAME: <name>, JOBNAME: ssidMSTR, ASID: nnnn HAS NOT RESPONDED AFTER CONNECTING DURING A USER SYNC POINT. USER SYNC POINT PROCESSING FOR STRUCTURE MQSPARTSP CANNOT CONTINUE...
- IXL049E HANG RESOLUTION ACTION FOR CONNECTOR NAME: <name> TO STRUCTURE <structure>, JOBNAME: ssidMSTR, ASID: nnnn ...
- IXL050I CONNECTOR NAME: <name> TO STRUCTURE <structure>, JOBNAME: ssidMSTR, ASID: nnnn HAS NOT PROVIDED A REQUIRED RESPONSE AFTER 1020 SECONDS. TERMINATING CONNECTOR TASK TO RELIEVE THE HANG.

Additional Symptom(s) Search Keyword(s): QSG queue sharing group CFSTRUCT PLR peer level recovery timing deadlock ABEND 5C6 S5C6 S05C6 ABEND5C6 ABENDS5C6 00C510A4 00C510AB
Local fix
Stop and restart the queue manager that holds the "CSQESTOP_RESTARTING_QMGR_&_PEERS_RECOVERING_AT_SAME TIME" enqueue for the CSQ_ADMIN structure.
Problem summary
****************************************************************
* USERS AFFECTED: All users of WebSphere MQ for z/OS Version 7 *
*                 Release 1 Modification 0.                    *
****************************************************************
* PROBLEM DESCRIPTION: The QMGR hangs during peer level        *
*                      recovery for an application structure   *
*                      for an EEPLEXISTINGCONNECTION event.    *
*                      Abend 026 is issued, followed by an     *
*                      early termination of the queue manager  *
*                      with reason code 00C510AB.              *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
The queue manager is processing a USYNC initiated for an EEPLEXISTINGCONNECTION event. It checks whether the queue manager instance is the same as when the event occurred by comparing the instance numbers stored for the application structure and the admin structure. They match, indicating that the queue manager has not yet stopped. The queue manager processing the recovery then requests an ENQ for the queue manager that is the subject of the recovery, expecting it to be released when that queue manager terminates. Because the queue manager is not terminating, the ENQ is never released and the hang condition occurs. When the CF timeout limit for connections is reached, abend 026 is issued in the connected TCB to resolve the hang.
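The hang described above can be modelled with an ordinary lock. The following is only an illustrative sketch in Python, not WebSphere MQ code: `threading.Lock` stands in for the GRS ENQ, the `timeout` argument stands in for the CF connector time out limit, and the names `recover_unconditionally`, `owner_enq` and `cf_timeout_s` are hypothetical.

```python
import threading


def recover_unconditionally(owner_enq: threading.Lock, cf_timeout_s: float) -> str:
    """Model of the failing path: wait for the owner's ENQ to be released.

    Hypothetical stand-in, not the CSQESTE logic. The timeout plays the
    role of the CF limit after which the hang is forcibly relieved.
    """
    if owner_enq.acquire(timeout=cf_timeout_s):
        try:
            return "recovered on behalf of owner"
        finally:
            owner_enq.release()
    # The owner never terminated, so the ENQ was never released.
    return "hang relieved by timeout"


# The owning queue manager is still running and holds its ENQ.
owner_enq = threading.Lock()
owner_enq.acquire()

result = recover_unconditionally(owner_enq, cf_timeout_s=0.05)
print(result)
```

In this model the recovering side makes no progress until the timeout fires, which corresponds to the point at which abend 026 is issued in the APAR description.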
Problem conclusion
The code was changed to get the ENQ conditionally. If the ENQ is obtained, the connection will be recovered on behalf of the queue manager owning the connection. If the ENQ is not obtained, the queue manager will have recovered its own connection, thus no further processing is required.
100Y CSQESTE
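The idea of a conditional request can be sketched as a non-blocking lock attempt. This is a minimal analogy in Python, assuming `threading.Lock` as a stand-in for the ENQ; it is not the actual change made to CSQESTE, and `recover_connection`, `owner_enq` and `recover` are hypothetical names.

```python
import threading
from typing import Callable


def recover_connection(owner_enq: threading.Lock, recover: Callable[[], None]) -> bool:
    """Try to take the owner's ENQ without waiting (hypothetical model).

    Returns True if recovery was performed on behalf of the owner,
    False if the ENQ was held and nothing further was done.
    """
    if owner_enq.acquire(blocking=False):
        try:
            recover()  # owner is gone: recover its connection
            return True
        finally:
            owner_enq.release()
    # ENQ not obtained: the owner is handling its own connection.
    return False


recovered = []

# Case 1: the owner has terminated, so its ENQ is free.
free_enq = threading.Lock()
did_recover = recover_connection(free_enq, lambda: recovered.append("peer"))

# Case 2: the owner is still running and holds its ENQ.
held_enq = threading.Lock()
held_enq.acquire()
skipped = recover_connection(held_enq, lambda: recovered.append("unexpected"))

print(did_recover, skipped, recovered)
```

The point of the design is that the second case returns immediately instead of waiting, so the recovering queue manager cannot stall on an ENQ that will not be released.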
Temporary fix
*********
* HIPER *
*********
Comments
APAR Information
APAR number
PM98694
Reported component name
WMQ Z/OS V7
Reported component ID
5655R3600
Reported release
100
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2013-10-08
Closed date
2013-11-20
Last modified date
2014-01-02
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UI12722
Modules/Macros
CSQESTE
Fix information
Fixed component name
WMQ Z/OS V7
Fixed component ID
5655R3600
Applicable component levels
R100 PSY UI12722
UP13/12/24 P F312
Document Information
Modified date:
02 January 2014