A fix is available
APAR status
Closed as program error.
Error description
System-managed (SM) duplexed structures were established between CF1 and CF2. When CF1 was lost, the in-flight request to CF1 was flagged with an IFCC because CF1 did not respond. Path validation toward CF1 was performed and also failed with an IFCC, since CF1 was lost. The in-flight request to CF2 was also flagged with an IFCC, but path validation toward CF2 succeeded, confirming that CF2 was still available. During recovery on the CF2 thread, IXLE1REC concluded that duplexing had to be broken but chose to give up the copy on CF2 instead of the copy on CF1. Because the 'surviving' copy of the structure on CF2 was dropped and CF1 was no longer accessible, applications suffered further delay while the structure had to be rebuilt.
A connector to the structure could issue:
DXR139E with RC0C RSN0C540C06
DXR122E with ABENDU2025
ANALYSIS:
Confirm which CF was lost by finding IXC518I in the syslog or in the output of VERBX MTRACE.
Use IXC577I to confirm the surviving copy on the CF name mentioned in the IXC518I.
Use IXC579I for the copy on the surviving CF.
KNOWN IMPACT:
Application failure because the structure cannot be accessed (RC0C with RSN 0xxx0C06 on an IXL request), which leads to application (IRLM) shutdown.
VERIFICATION STEPS:
If the dump with collector of the structure is available, check the output from CTRACE COMP(SYSXES) SUB((.....)) to locate entries with the following IDs:
09100006 with ARWEMFID xx, TmsCctraced with 10
070B0001 070B0005 with McftMfid xx, TmsCCByte 10
030D0001 09060005 MFID xx TMSCCByte 10 for the lost CF, or 09060007 MFID xx for the surviving CF
070B0008 ARWEMFID xx as surviving CF, LocalRequestRetcode with 00000FFF
PE Information:
Users affected: Installations exploiting parallel sysplex with duplexed lock structures, using either the synchronous or asynchronous duplexing protocol, and having the following OA60275 PTFs installed:
HBB77C0 UJ08016
HBB77B0 UJ08015
HBB77A0 UJ08025
Note: HBB77D0 PTF UJ08017 is not PE.
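The ANALYSIS steps above amount to locating a handful of message IDs in the syslog. A minimal sketch of that search follows, assuming the syslog has been saved as plain text lines. The function name `find_messages` is hypothetical, and the script matches message IDs only, since the exact message text layout is not given here.

```python
import re
from collections import defaultdict

# Message IDs named in the error description (ANALYSIS and connector symptoms).
MESSAGE_IDS = ("IXC518I", "IXC577I", "IXC579I", "DXR139E", "DXR122E")

# Match an ID only as a whole word, so longer tokens are not picked up.
_ID_PATTERN = re.compile(r"\b(" + "|".join(MESSAGE_IDS) + r")\b")


def find_messages(syslog_lines):
    """Return {message_id: [(line_number, text), ...]} for the IDs of interest.

    Hypothetical helper: it only reports where each ID appears. Interpreting
    the CF names inside each message is left to the reader.
    """
    hits = defaultdict(list)
    for number, line in enumerate(syslog_lines, start=1):
        # A set avoids reporting the same line twice for a repeated ID.
        for msg_id in set(_ID_PATTERN.findall(line)):
            hits[msg_id].append((number, line.rstrip()))
    return dict(hits)


if __name__ == "__main__":
    # Placeholder lines for illustration only; these are NOT real message texts.
    sample = [
        "IXC518I <text elided>",
        "unrelated line",
        "IXC577I <text elided>",
        "IXC579I <text elided>",
    ]
    for msg_id, entries in sorted(find_messages(sample).items()):
        for number, text in entries:
            print(f"{msg_id} at line {number}: {text}")
```

Reading the IXC518I, IXC577I and IXC579I hits in line order should make it easier to compare the CF named as lost with the CF whose copy was kept.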
User Impact: OA60275 provides toleration support for coupling facilities (CFs) defined on z16 processors, which implement CFLEVEL 25. OA60275 provided the required toleration support but regressed subchannel recovery for interface control checks (IFCCs) when they occur during processing of specific lock structure commands at CFLEVELs below 25. The regression causes z/OS to break duplexing for the structure, and may cause loss of the structure. Loss of structure occurs only when the break-duplexing causes failover to a CF that is inaccessible or that contains an instance of the structure that is no longer viable. This is very uncommon except in test scenarios involving CF deactivation. Break-duplexing without loss of structure is inconvenient but does not impact normal workload execution. In the absence of hardware issues, IFCCs will be rare in a normal configuration.
Local fix
Install PTFs for OA63312. Until then:
o Avoid running tests involving the deactivation of a CF containing lock structures.
o If break-duplexings are disruptive, consider running with lock structures in simplex mode.
Problem summary
****************************************************************
USERS AFFECTED:
Installations exploiting parallel sysplex at z/OS V2R2 (HBB77A0) and above, using synchronously- or asynchronously-duplexed lock structures, with any coupling facility (CF) below CFLEVEL 25 in the configuration.
****************************************************************
PROBLEM DESCRIPTION:
Inappropriate break-duplexing and potential loss of structure when a command to a CF lock structure incurs an interface control check (IFCC).
****************************************************************
RECOMMENDATION:
An IPL is required to activate this fix on each system of the sysplex. A rolling IPL is sufficient to activate the fix.
****************************************************************
For a specific subset of locking commands, an operation directed to a duplexed CF lock structure instance that resides in a CF below CFLEVEL 25 and incurs an IFCC will inappropriately break duplexing. If this occurs, subchannel recovery processing recommends to CFRM that it retain the peer structure instance. If the peer instance is no longer viable, for example if the CF housing it has been deactivated, the structure is lost.
SYSPLEXDS R3931/K D/T3931
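The conditions in the problem summary can be restated as a small decision table. The sketch below is a toy model of the behavior described before the fix, not the z/OS recovery logic itself. The names `Outcome` and `pre_fix_outcome` and all parameter names are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    NOT_AFFECTED = "regression not triggered"
    DUPLEXING_BROKEN = "duplexing broken, peer instance retained"
    STRUCTURE_LOST = "duplexing broken, peer not viable, structure lost"


def pre_fix_outcome(cflevel, command_in_affected_subset, ifcc, peer_viable):
    """Toy model of the pre-fix behavior described in the problem summary.

    cflevel: CFLEVEL of the CF holding the instance the command was sent to.
    command_in_affected_subset: True for the "specific subset" of locking
        commands (the summary does not enumerate them).
    ifcc: True if the command incurred an interface control check.
    peer_viable: True if the peer structure instance is still usable.
    """
    # The regression applies only to IFCCs on affected commands below CFLEVEL 25.
    if not (ifcc and command_in_affected_subset and cflevel < 25):
        return Outcome.NOT_AFFECTED
    # Duplexing is broken and the peer instance is recommended for retention;
    # the structure survives only if that peer is actually viable.
    return Outcome.DUPLEXING_BROKEN if peer_viable else Outcome.STRUCTURE_LOST


if __name__ == "__main__":
    # Example: the CF housing the peer instance has been deactivated.
    print(pre_fix_outcome(24, True, True, peer_viable=False).value)
```

In this model the structure is lost only in the last branch, which corresponds to the case the summary gives as an example: the CF housing the peer instance has been deactivated.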
Problem conclusion
Subchannel recovery will not break duplexing for a duplexed locking command that incurs an IFCC. Instead, when appropriate, the command will be internally restarted.
KEYWORDS: D/T3931 R3931/K
Temporary fix
*********
* HIPER *
*********
Comments
APAR Information
APAR number
OA63312
Reported component name
CROSS SYS.EXT.S
Reported component ID
5752SCIXL
Reported release
7D0
Status
CLOSED PER
PE
YesPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-05-23
Closed date
2022-08-04
Last modified date
2022-09-01
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UJ08992 UJ08993 UJ08994 UJ08995
Modules/Macros
IXLE1REC
Fix information
Fixed component name
CROSS SYS.EXT.S
Fixed component ID
5752SCIXL
Applicable component levels
R7A0 PSY UJ08995 UP22/08/17 P F208
R7D0 PSY UJ08994 UP22/08/17 P F208
R7B0 PSY UJ08992 UP22/08/17 P F208
R7C0 PSY UJ08993 UP22/08/17 P F208
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
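Selecting the PTF for a component level can be expressed as a simple lookup. The sketch below uses only the PTF numbers listed in this APAR. It assumes that component levels R7A0 through R7D0 correspond to FMIDs HBB77A0 through HBB77D0. The function name `check_system` and the shape of its input are hypothetical.

```python
# OA60275 PTFs flagged as PE in the error description, keyed by FMID.
# HBB77D0 is absent because the APAR notes that UJ08017 is not PE.
PE_PTFS = {
    "HBB77A0": "UJ08025",
    "HBB77B0": "UJ08015",
    "HBB77C0": "UJ08016",
}

# OA63312 fix PTFs from "Applicable component levels".
# Assumption: level R7x0 corresponds to FMID HBB77x0.
FIX_PTFS = {
    "HBB77A0": "UJ08995",
    "HBB77B0": "UJ08992",
    "HBB77C0": "UJ08993",
    "HBB77D0": "UJ08994",
}


def check_system(fmid, installed_ptfs):
    """Return (pe_ptf_installed, fix_ptf, fix_installed) for one system.

    fmid: FMID string such as "HBB77C0".
    installed_ptfs: iterable of PTF numbers applied on that system.
    Raises KeyError if the FMID has no OA63312 PTF listed in this APAR.
    """
    installed = set(installed_ptfs)
    fix_ptf = FIX_PTFS[fmid]
    pe_ptf = PE_PTFS.get(fmid)  # None when no PE PTF is listed for the FMID
    pe_installed = pe_ptf is not None and pe_ptf in installed
    return pe_installed, fix_ptf, fix_ptf in installed


if __name__ == "__main__":
    # A system with the PE PTF applied but not yet the fix.
    print(check_system("HBB77C0", ["UJ08016"]))
```

The list of installed PTFs would have to come from the installation's own maintenance records. How that list is produced is outside the scope of this sketch.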
Document Information
Modified date:
01 September 2022