A fix is available
APAR status
Closed as program error.
Error description
The customer reported that a queue manager became unresponsive. They could connect neither via the panels nor via the Admin tool over a svrconn channel. The THR=* formatter shows that several threads, including the command server thread RTSSRV01, are waiting for a latch. There is a deadlock (deadly embrace) involving this latch.

*PEB 31266B70 ACE 31266B10 CCB 31233808 SRB 312F5C70 PROG 8000 SYSTEM 201.RAHEAD02
  ROB is Q'd on a Latch waiter chain
  Susp by MQ EBSUS14 3146C5F1 ROBSOT 314613CA
  Latch Waited on 7EF58FE8 DSA 7CF52420 ()
  Suspend issued at 1900/05/18 14:47:26.863393
  Latch 7EF58FE8 is HELD by EB 3127A690 <===
  7CF527D0 7CF52420 7CF52198 7CF51F80 7CF51D40 7CF4EEC8 7CF4E248
  -------- -------- CSQP3GET CSQP1GET CSQP1RAH CSQIRAHP --------
  LOWN 7ED72040 ITHR 7D0291B0 SOFTLOG 00000000 MTHR 7D03E7A0
  Open Handles = 0 LastGETexp 0.0

*PEB 3127A690 ACE 3127A630 CCB 3126D838 SRB 312F6C60 PROG 8000 SYSTEM 215.DWP_O305
  ROB is Q'd on a Latch waiter chain
  Latch Held Mask 00010000 = (16)BMXL2/RMCRMST/RLMARQC
  Susp by MQ EBSUS14 3146C5F1 ROBSOT 314613CA
  Latch Waited on 7D6AFFC0 DSA 7EEF9288 ()
  Suspend issued at 1970/10/06 20:14:13.556985
  Latch 7D6AFFC0 is HELD by EB 31266B70 <===
  This is a BDSC Dlatch for Psid 00000006 Page 00003BE2
  7EEF9638 7EEF9288 7EEF9108 7EEF7270 7EEF6EC8 7EEF6248
  -------- -------- CSQP4DWP CSQP2DWP CSQP1DWP --------

Tasks suspended waiting for the latches will be paused in module CSQVSRX. In rare cases, EOM processing can require one of the held latches. The EOM task then hangs until the system abnormally terminates it with abend S30D, which leads to abnormal queue manager termination.

Additional Symptom(s) Search Keyword(s): S30D ABEND30D ABENDS30D S030D
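The deadly embrace in the formatter output can be confirmed by following each "Latch Waited on" value to its "HELD by EB" owner until an EB repeats. The following Python is only an illustrative sketch of that reading, not an IBM tool: the function name find_cycle and the two dictionaries are hypothetical, and the EB and latch addresses are copied by hand from the output above.

```python
def find_cycle(waits_on, held_by, start):
    """Follow waiter -> latch -> holder edges from `start`.

    Returns the list of EBs forming a cycle, or None if the chain ends.
    """
    path = [start]
    task = start
    while task in waits_on:
        latch = waits_on[task]
        holder = held_by.get(latch)
        if holder is None:
            return None          # latch is free, so the chain ends here
        if holder in path:
            return path[path.index(holder):]   # holder already seen: cycle
        path.append(holder)
        task = holder
    return None                  # last holder is not waiting on anything


# Values transcribed from the THR=* output (EB address -> latch waited on).
waits_on = {"31266B70": "7EF58FE8",   # 201.RAHEAD02
            "3127A690": "7D6AFFC0"}   # 215.DWP_O305
# Latch address -> EB shown on the "is HELD by EB" line.
held_by = {"7EF58FE8": "3127A690",
           "7D6AFFC0": "31266B70"}

cycle = find_cycle(waits_on, held_by, "31266B70")
print(cycle)
```

Each EB in the returned list waits on a latch held by the next one, and the last waits on a latch held by the first, which is the condition the two "<===" lines indicate.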
Local fix
Restart the queue manager. It may be necessary to cancel it.
Problem summary
****************************************************************
* USERS AFFECTED: All users of WebSphere MQ for z/OS Version 7 *
*                 Release 1 Modification 0.                    *
****************************************************************
* PROBLEM DESCRIPTION: Applications hang during MQI calls due  *
*                      to BDSC latch contention.               *
*                      Internal MQ processing and queue        *
*                      manager shutdown also hangs.            *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
An application releasing an updated page was unable to latch the buffer used for the page, and consequently left the buffer on the buffer pool LRU chain.

During deferred write processing, CSQP4DWP marked the buffer as clean after the update to the pageset had completed (making it eligible for page stealing). Later, while holding a latch on the LRU chain, it attempted to check whether the buffer needed to be added back on to the LRU chain. To do this it requested a latch on the buffer.

However, between the page being marked clean and CSQP4DWP obtaining the LRU latch, the readahead task CSQP1RAH stole the buffer and obtained the buffer latch. CSQP1RAH then attempted to get another buffer, and suspended waiting for the LRU latch.

This resulted in the reported deadlock between the deferred write processor (which held the LRU latch and required the BDSC latch) and the readahead task (which held the BDSC latch and required the LRU latch).

Any application or queue manager tasks requiring pages using the same buffer pool will also hang waiting for the LRU latch unless the requested page is already available in a buffer.
Problem conclusion
CSQP4DWP is updated to prevent the pages it is processing from being eligible for stealing until after they have been added back to the LRU chain, preventing this deadlock situation from occurring.
100Y
CSQP3GET
CSQP4DWP
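The effect of the changed ordering can be illustrated with a small single-threaded model that replays the interleaving from the problem summary. This is a toy sketch under stated assumptions, not the CSQP4DWP logic: the Buffer class, its fields, the run function and the "RAH"/"DWP" labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    stealable: bool = False              # may the readahead task steal it?
    on_lru: bool = False                 # linked on the LRU chain?
    latch_holder: Optional[str] = None   # toy stand-in for the buffer latch


def run(fixed: bool) -> str:
    """Replay the interleaving once; return 'deadlock' or 'completed'."""
    buf = Buffer()

    # 1. Deferred write finishes the pageset update and marks the page clean.
    #    Original order: clean means immediately eligible for stealing.
    #    Changed order: eligibility is withheld until the buffer is relinked.
    buf.stealable = not fixed

    # 2. Window before the deferred writer takes the LRU latch: the
    #    readahead task steals the buffer if allowed and keeps its latch.
    if buf.stealable:
        buf.latch_holder = "RAH"

    # 3. The deferred writer obtains the LRU latch.
    lru_holder = "DWP"

    # 4. If readahead stole the buffer, it now wants another buffer and
    #    must wait for the LRU latch held by the deferred writer.
    rah_waits_for_lru = buf.latch_holder == "RAH" and lru_holder == "DWP"

    # 5. The deferred writer needs the buffer latch to check the buffer.
    dwp_waits_for_buffer = buf.latch_holder is not None
    if rah_waits_for_lru and dwp_waits_for_buffer:
        return "deadlock"

    # No contention: relink the buffer, then make it eligible for stealing.
    buf.latch_holder = "DWP"
    buf.on_lru = True
    buf.stealable = True
    buf.latch_holder = None
    return "completed"


print(run(fixed=False), run(fixed=True))
```

In the model, the only difference between the two runs is when the buffer becomes eligible for stealing. Withholding eligibility closes the window in step 2, so the two latches are never held in opposite order.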
Temporary fix
*********
* HIPER *
*********
Comments
APAR Information
APAR number
PI41605
Reported component name
WMQ Z/OS V7
Reported component ID
5655R3600
Reported release
100
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2015-05-21
Closed date
2015-08-26
Last modified date
2015-12-09
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
PI47171 UI30530
Modules/Macros
CSQP3GET CSQP4DWP
Fix information
Fixed component name
WMQ Z/OS V7
Fixed component ID
5655R3600
Applicable component levels
R100 PSY UI30530
UP15/10/08 P F510
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
09 December 2015