IBM Support

OA63312: INCORRECTLY HANDLING THE BREAKING DUPLEX FOR SM DUPLEX STR IN CAUSING DXR139E IXLRT RC0C RSN0C540C06. LO 22/08/29 PTF PECHANGE

A fix is available


APAR status

  • Closed as program error.

Error description

  • System-managed (SM) duplexed structures were established between
    CF1 and CF2. When CF1 was lost, the in-flight request toward CF1
    was detected with an IFCC due to the lack of response from CF1.
    Path validation toward CF1 was performed and also failed with an
    IFCC because CF1 was lost. The in-flight request toward CF2 was
    also detected with an IFCC, but path validation toward CF2
    succeeded, confirming that CF2 was still available. During
    recovery on the CF2 thread, IXLE1REC concluded that duplexing had
    to be broken but decided to give up the copy on CF2 instead of
    the copy on CF1. Because the 'surviving' copy of the structure on
    CF2 was dropped and CF1 was no longer accessible, applications
    suffered further delay because the structure had to be rebuilt.
    Connectors to the structure could issue:
    DXR139E with RC0C RSN0C540C06 and DXR122E with ABENDU2025
    
    ANALYSIS:
    Confirm which CF was lost by finding message IXC518I in the
    syslog or in the output of VERBX MTRACE.
    Use IXC577I to confirm that the surviving copy is on the CF
    named in the IXC518I. Look for IXC579I for the copy on the
    surviving CF.
    
    KNOWN IMPACT:
    Application failure because the structure cannot be accessed
    (RC0C with RSN 0xxx0C06 on the IXL request), which leads to
    application (IRLM) shutdown.
    
    VERIFICATION STEPS:
    If a dump that includes the connector to the structure is
    available, check the output from
    CTRACE COMP(SYSXES) SUB((.....)) to locate entries with the
    following IDs:
    09100006 with ARWEMFID xx, TmsCctraced with 10
    070B0001
    070B0005 with McftMfid xx, TmsCCByte 10
    030D0001
    09060005 MFID xx TMSCCByte 10  for the lost CF or
    09060007 MFID xx for the surviving CF
    070B0008 ARWEMFID xx as surviving CF, LocalRequestRetcode with
    00000FFF
    
    PE Information:
    
    Users affected
    
      Installations exploiting parallel sysplex with duplexed lock
      structures, using either the synchronous or asynchronous
      duplexing protocol, and having the following OA60275 PTFs
      installed:
    
      HBB77C0 UJ08016   HBB77B0 UJ08015   HBB77A0 UJ08025
      Note: HBB77D0 PTF UJ08017 is not PE.
    
    User Impact:
      OA60275 provides toleration support for coupling facilities
      (CFs) defined on z16 processors, which implement CFLEVEL 25.
      OA60275 provided the required toleration support but regressed
      subchannel recovery for interface control checks (IFCCs) when
      they occur during processing of specific lock structure
      commands at CFLEVELs below 25.  The regression causes z/OS to
      break duplexing for the structure, and may cause loss of the
      structure.
    
      Loss of structure occurs only when the break-duplexing causes
      failover to a CF that is inaccessible or that contains an
      instance of the structure that is no longer viable.  This is
      very uncommon except in test scenarios involving CF
      deactivation.
    
      Break-duplexing without loss of structure is inconvenient but
      does not impact normal workload execution.  In the absence of
      hardware issues, IFCCs will be rare in a normal configuration.
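
The ANALYSIS and VERIFICATION STEPS above both come down to searching saved text output for known identifiers. The following is a minimal Python sketch of that search, assuming the syslog (or VERBX MTRACE output) and the formatted CTRACE COMP(SYSXES) output have been saved as plain text. The names scan_for_ids and missing_ids are hypothetical helpers, not part of any IBM tool, and the sketch only locates lines; it does not interpret message or trace entry contents.

```python
import re
from collections import defaultdict

# Message IDs from the ANALYSIS step of this APAR.
SYSLOG_MESSAGE_IDS = ("IXC518I", "IXC577I", "IXC579I")

# Trace entry IDs from the VERIFICATION STEPS of this APAR.
CTRACE_ENTRY_IDS = ("09100006", "070B0001", "070B0005", "030D0001",
                    "09060005", "09060007", "070B0008")


def scan_for_ids(lines, ids):
    """Return {id: [(line_number, text), ...]} for each id found.

    An id only matches as a whole token, that is, when it is not
    embedded in a longer run of letters or digits.
    """
    patterns = {
        ident: re.compile(
            r"(?<![0-9A-Za-z])" + re.escape(ident) + r"(?![0-9A-Za-z])")
        for ident in ids
    }
    hits = defaultdict(list)
    for number, text in enumerate(lines, start=1):
        for ident, pattern in patterns.items():
            if pattern.search(text):
                hits[ident].append((number, text.rstrip()))
    return dict(hits)


def missing_ids(hits, ids):
    """Return the ids that were expected but never found, in order."""
    return [ident for ident in ids if ident not in hits]
```

To use it, read the saved output into a list of lines, call scan_for_ids(lines, SYSLOG_MESSAGE_IDS) or scan_for_ids(lines, CTRACE_ENTRY_IDS), and then review the returned lines by hand against the field values listed above (for example TmsCCByte 10 or LocalRequestRetcode 00000FFF).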
    

Local fix

  • Install PTFs for OA63312.  Until then:
    
    o Avoid running tests involving the deactivation of a CF
      containing lock structures.
    
    o If break-duplexings are disruptive, consider running with
      lock structures in simplex mode.
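
As an illustration of the first step, the PTF numbers quoted in this APAR can be put into a small lookup. This is a minimal Python sketch, not an IBM-provided procedure. It assumes that the list of installed PTFs has already been obtained by some other means, and that component levels R7A0 through R7D0 correspond to FMIDs HBB77A0 through HBB77D0. The function name check_ptfs is hypothetical.

```python
# PTF numbers transcribed from this APAR.
# FMID -> (OA60275 PTF flagged as PE, or None; OA63312 fix PTF)
PTFS_BY_FMID = {
    "HBB77A0": ("UJ08025", "UJ08995"),
    "HBB77B0": ("UJ08015", "UJ08992"),
    "HBB77C0": ("UJ08016", "UJ08993"),
    # The HBB77D0 PTF for OA60275 (UJ08017) is noted as not PE.
    "HBB77D0": (None, "UJ08994"),
}


def check_ptfs(fmid, installed_ptfs):
    """Report which OA63312 PTF applies to an FMID and what is installed.

    installed_ptfs is any iterable of PTF numbers; how that list is
    produced is outside the scope of this sketch.
    """
    pe_ptf, fix_ptf = PTFS_BY_FMID[fmid.strip().upper()]
    installed = {ptf.strip().upper() for ptf in installed_ptfs}
    return {
        "fix_ptf": fix_ptf,
        "fix_installed": fix_ptf in installed,
        "pe_ptf_installed": pe_ptf is not None and pe_ptf in installed,
    }
```

For example, check_ptfs("HBB77C0", ["UJ08016"]) reports that the PE PTF is installed and that UJ08993 is still to be applied. The sketch reports facts only; whether a system is actually exposed also depends on the CFLEVEL and duplexing conditions described in this APAR.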
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * Installations exploiting parallel sysplex at                 *
    * z/OS V2R2 (HBB77A0) and above, using                         *
    * synchronously- or asynchronously-duplexed                    *
    * lock structures, with any coupling facility                  *
    * (CF) below CFLEVEL 25 in the configuration.                  *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * Inappropriate break-duplexing and                            *
    * potential loss of structure when                             *
    * a command to a CF lock structure                             *
    * incurs an interface control check                            *
    * (IFCC).                                                      *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * An IPL is required to activate this fix on                   *
    * each system of the sysplex.  A rolling IPL                   *
    * is sufficient to activate the fix.                           *
    ****************************************************************
    For a specific subset of locking commands, an operation directed
    to a duplexed CF lock structure instance that resides in a CF
    below CFLEVEL 25 and incurs an IFCC will inappropriately break
    duplexing.  If this occurs, subchannel recovery processing
    recommends to CFRM that it retain the peer structure instance.
    If the peer instance is no longer viable, for example if the CF
    housing it has been deactivated, the structure is lost.
    
    SYSPLEXDS
    R3931/K
    D/T3931
    

Problem conclusion

  • Subchannel recovery will not break duplexing for a duplexed
    locking command that IFCCs.  Instead, when appropriate, the
    command will be internally restarted.
    KEYWORDS: D/T3931 R3931/K
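
The behavior before and after the fix, as described in the problem summary and conclusion, can be restated as a small decision table. The Python sketch below is a simplified reading of that text only. It is not the logic of IXLE1REC, and the function name, parameter names, and returned strings are all hypothetical.

```python
def ifcc_recovery_outcome(cf_level, affected_command, peer_viable,
                          fix_applied):
    """Describe what happens when a command to a duplexed CF lock
    structure instance incurs an IFCC, per this APAR's description.

    cf_level         -- CFLEVEL of the CF holding the targeted instance
    affected_command -- True if the command is in the affected subset
    peer_viable      -- True if the peer structure instance is usable
    fix_applied      -- True if the OA63312 PTF is active
    """
    # The APAR only covers the affected commands at CFLEVELs below 25.
    if cf_level >= 25 or not affected_command:
        return "outside the scope of this APAR"
    if fix_applied:
        return "duplexing kept; command restarted when appropriate"
    # Without the fix, duplexing is broken and recovery recommends
    # that CFRM retain the peer instance.
    if peer_viable:
        return "duplexing broken; peer instance retained"
    return "duplexing broken; structure lost"
```

The last branch corresponds to the reported failure: the retained instance was in a CF that was no longer accessible, so the structure was lost.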
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    OA63312

  • Reported component name

    CROSS SYS.EXT.S

  • Reported component ID

    5752SCIXL

  • Reported release

    7D0

  • Status

    CLOSED PER

  • PE

    YesPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2022-05-23

  • Closed date

    2022-08-04

  • Last modified date

    2022-09-01

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UJ08992 UJ08993 UJ08994 UJ08995

Modules/Macros

  • IXLE1REC
    

Fix information

  • Fixed component name

    CROSS SYS.EXT.S

  • Fixed component ID

    5752SCIXL

Applicable component levels

  • R7A0 PSY UJ08995

       UP22/08/17 P F208

  • R7D0 PSY UJ08994

       UP22/08/17 P F208

  • R7B0 PSY UJ08992

       UP22/08/17 P F208

  • R7C0 PSY UJ08993

       UP22/08/17 P F208

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.


Document Information

Modified date:
01 September 2022