A fix is available
APAR status
Closed as program error.
Error description
The rebuild request for the lock structure was queued because IRLM was processing connects and setting FENCE mode one member at a time. During this time, a series of connection failures occurred. The connection failures were also queued because the REBUILD was pending.
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All HIR2220 (IRLM2.2) and HIR2230 (IRLM2.3)  *
*                 users data sharing SYSPLEXDS.                *
****************************************************************
* PROBLEM DESCRIPTION: IRLM failed to respond to the rebuild   *
*                      quiesce event of the lock structure,    *
*                      causing the whole sysplex to hang with  *
*                      message IXL040E and ABEND=S026 with     *
*                      REASON=08118001                         *
****************************************************************
* RECOMMENDATION: INSTALL CORRECTIVE SERVICE FOR APAR/PTF      *
****************************************************************
A series of New Connection events arrived from peer members. A co-exist rebuild was then initiated, which caused most of the newly connected connectors to disconnect from the lock structure. This produced another series of Failed Connection (FailConn) events to be processed.
IRLM put up the fence to serialize the global initialization process for each New Connection event, so the Rebuild Quiesce QE was queued behind the fence. If the ID of a FailConn QE differs from the fence ID, the FailConn QE is queued to the rebuild pending queue. These FailConn QEs are processed only after the rebuild process completes.
The New Connection QE was holding the fence while waiting for a response from the peer member. The response never arrived because the peer member had disconnected from the structure and its FailConn QE was stuck in the rebuild pending queue. The rebuild process could not continue either, because its Rebuild Quiesce QE was queued behind the fence. Without a response to the Rebuild Quiesce event, XES issued message IXL040E and abended S026 with reason 08118001.
Problem conclusion
The code is modified to queue those failed-connection QEs to the work-to-do queue instead of the rebuild pending queue when the failing member's disconnect occurs during global initialization.
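The queueing interaction behind this hang can be illustrated with a small toy model. The sketch below is not IRLM code: the event tuples, the queue names, the awaiting set, and the simulate function are all hypothetical, and it assumes that the member holding the fence stops waiting for a peer once that peer's failed-connection event is processed. The fixed flag switches between routing a failed-connection event to the rebuild pending queue (the failing behavior) and processing it from the work-to-do queue while global initialization is in progress (the corrected behavior).

```python
from collections import deque

# Toy model only. Event names, queue names, and the "awaiting" set are
# hypothetical illustrations, not IRLM internals.
EVENTS = [
    ("new", "A", ("B",)),   # member A connects; its init needs a response from peer B
    ("quiesce",),           # Rebuild Quiesce event arrives while the fence is up
    ("fail", "B"),          # peer B disconnects before responding
]


def simulate(events, fixed):
    """Return (rebuild_done, fence_holder, events left in rebuild pending)."""
    work = deque(events)        # work-to-do queue
    behind_fence = deque()      # events held until the fence drops
    rebuild_pending = deque()   # events held until the rebuild completes
    fence = None                # ID of the member in global initialization
    awaiting = set()            # peers whose response the fence holder needs
    rebuild_requested = False
    rebuild_done = False

    while work:
        ev = work.popleft()
        kind = ev[0]
        if kind == "new":
            if fence is not None:
                behind_fence.append(ev)
            else:
                fence, awaiting = ev[1], set(ev[2])
        elif kind == "quiesce":
            if fence is not None:
                rebuild_requested = True
                behind_fence.append(ev)
            else:
                rebuild_done, rebuild_requested = True, False
                work.extend(rebuild_pending)
                rebuild_pending.clear()
        elif kind == "fail":
            member = ev[1]
            defer = rebuild_requested and member != fence
            if defer and fixed and fence is not None:
                defer = False   # corrected routing: handle it during global init
            if defer:
                rebuild_pending.append(ev)
            else:
                awaiting.discard(member)  # stop waiting for a departed peer
                if member == fence:
                    awaiting.clear()      # the initializing member itself left
        # Drop the fence once no responses are outstanding.
        if fence is not None and not awaiting:
            fence = None
            work.extendleft(reversed(behind_fence))
            behind_fence.clear()

    return rebuild_done, fence, list(rebuild_pending)
```

In this model, simulate(EVENTS, fixed=False) ends with the fence still held by A, the rebuild not done, and the failed-connection event for B parked in the rebuild pending queue. With fixed=True, the same event is processed, the fence drops, and the Rebuild Quiesce event goes through.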
Temporary fix
*********
* HIPER *
*********
Comments
**** PE13/08/12 FIX IN ERROR. SEE APAR PM94539 FOR DESCRIPTION
APAR Information
APAR number
PM65217
Reported component name
IRLM V2
Reported component ID
569516401
Reported release
230
Status
CLOSED PER
PE
No
HIPER
Yes
Special Attention
No
Submitted date
2012-05-22
Closed date
2012-06-21
Last modified date
2013-09-26
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UK79709 UK79710
Modules/Macros
DXRRL752 DXRRS752
Fix information
Fixed component name
IRLM V2
Fixed component ID
569516401
Applicable component levels
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
26 September 2013