IBM Support

OA61283: PSA OVERLAY AFTER 'P CSF' ISSUED CSFZTCA TCABUFPT IEAVESAR ABEND0C4 PIC4 IEAVTRTV ABEND0DC

A fix is available


APAR status

  • Closed as program error.

Error description

  • After a P CSF (STOP ICSF) console command was issued to
    terminate ICSF, the console stopped responding, evidence of
    various PSA overlays was recorded in LOGREC, and multiple abends
    were seen, including ABEND0D5 from CSFGCARC attempting to PC
    into ICSF during end-of-memory (EOM) termination processing.
    
    
    
    ANALYSIS:
    ICSF termination processing (CSFINIT2) calls CSFMICTC with a
    parm of #TERM to terminate ICSF CTRACE. CSFMICTC will zero out
    the first CTRACE buffer pointer (TcaBufPT(1)) as part of this
    CTRACE termination processing.  Later, CSFINIT2 tells the ICSF
    subtasks to terminate.  One of these subtasks encounters a
    failure in CSFVCGKF and decides to write a CTRACE entry
    (GkfFail). The CTRACE TCA indicates that the first CTRACE buffer
    address (which had been zeroed) should be used. This causes
    the ICSF subtask to write the GkfFail CTRACE entry into the PSA.
    
    
    The PSA overlays caused various system-wide issues, and the ICSF
    address space eventually encountered an ABEND0DC and entered
    memterm processing, which hung.  Since ICSF did not complete its
    termination, residual ICSF control blocks (including PC entries
    in the PC Information table and the CSFGCARC resource manager)
    were still being used by other address spaces, causing further
    fallout (including ABEND0D5 on various MASTER tasks).
    
    
    Several conditions must all hold for this overlay to occur.
    First, some number of outstanding keyusage or keylifecycle audit
    events must need to be hardened during ICSF termination
    processing.  This can happen when applications are actively
    interacting with ICSF while ICSF is terminating.  Second, a
    large number of these keyusage/keylifecycle updates must be
    failing (causing many GkfFail CTRACE records to be cut).  Third,
    ICSF CTRACE processing (during termination) must be writing to
    its 4th CTRACE buffer, fill it, and switch over to its 1st
    CTRACE buffer (the only buffer whose address is zeroed during
    termination).  At that point, ICSF tries to write CTRACE
    entries at address zero, which results in one or more
    unrecorded ABEND0C4 PIC4 failures.  Finally, ICSF must continue
    attempting to write CTRACE entries into low-core storage until
    it increments past the first x'200' bytes of "protected"
    storage, after which the writes succeed and overwrite the PSA.
    
    If CTRACE buffer 1, 2, or 3 is in use instead and becomes full
    during ICSF termination, no PSA overlay would occur.  If fewer
    ICSF CTRACE entries are cut than the threshold needed to get
    past the first x'200' bytes of protected storage, no PSA overlay
    would occur.
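
The sequence described above can be modeled with a short Python sketch.
It is an illustration only, not ICSF code: the entry size, buffer size,
buffer addresses, and the cur_end field are hypothetical, since the APAR
does not state them. Only the four-buffer rotation, the zeroed first
pointer, the x'200' protected range, and the x'1000' first-page boundary
come from the text.

```python
from dataclasses import dataclass

PROTECTED_LIMIT = 0x200  # from the APAR: first x'200' bytes are "protected"
FIRST_PAGE_END = 0x1000  # from the APAR: overlay lands between zero and x'1000'


@dataclass
class Tca:
    buf_pt: list   # four buffer addresses; buf_pt[0] models TcaBufPT(1)
    cur_idx: int   # 1-based index of the buffer in use (models TcaCurIdx)
    cur_ent: int   # address of the next entry (models TcaCurent)
    cur_end: int   # end of the buffer in use (hypothetical field)


def write_entry(tca, entry_size, buf_size):
    """Unvalidated write: the buffer address is used without checking it."""
    if tca.cur_ent + entry_size > tca.cur_end:
        # Buffer full: rotate 1 -> 2 -> 3 -> 4 -> 1 and trust the stored address.
        tca.cur_idx = tca.cur_idx % 4 + 1
        tca.cur_ent = tca.buf_pt[tca.cur_idx - 1]
        tca.cur_end = tca.cur_ent + buf_size
    addr = tca.cur_ent
    tca.cur_ent += entry_size  # modeled: the cursor advances even if the store fails
    if addr < PROTECTED_LIMIT:
        return "ABEND0C4 PIC4"
    if addr < FIRST_PAGE_END:
        return "PSA overlay"
    return "ok"


def simulate(start_idx, n_entries, entry_size=0x80, buf_size=0x400):
    """Write n_entries after termination has zeroed the first buffer pointer."""
    bufs = [0x10000 * i for i in range(1, 5)]  # hypothetical buffer addresses
    start = bufs[start_idx - 1]
    tca = Tca(bufs, start_idx, start, start + buf_size)
    tca.buf_pt[0] = 0  # CTRACE termination zeroes TcaBufPT(1)
    return [write_entry(tca, entry_size, buf_size) for _ in range(n_entries)]


if __name__ == "__main__":
    print(simulate(4, 13))
```

With these assumed sizes, simulate(4, 13) yields eight normal writes,
four simulated ABEND0C4 PIC4 failures, and then one PSA overlay.
simulate(4, 12) stops short of the x'200' boundary, and simulate(3, 13)
only rotates from buffer 3 into buffer 4, so neither produces an overlay.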
    
    
    KNOWN IMPACT:
    Unpredictable fallout can occur after a PSA overlay occurs. An
    IPL is required to fully clean up the overlaid storage.
    
    VERIFICATION STEPS:
    1. Look for any LOGREC entries cut by IEAVTRTV showing "damaged"
    storage.  The damaged storage will contain ICSF CTRACE
    eyecatchers. In this case, the CTRACE entry was GkfFail cut by
    CSFVCGKF and contained eyecatchers such as KEK, KEKHDR,
    KFPTYPES, and KFPS, but other CTRACE entries are also possible.
    2. Locate the CcvtTca pointer and review its contents.
    TcaBufPT(1) will contain a word of zeros, but indexes 2-4 will
    contain non-zero addresses. TcaCurIdx = 00000001 and TcaCurent
    will contain an address within the first page of storage
    (between zero and x'1000').
    3. Confirm that CcvtCtr=00000000.
    4. Find systrace (ttch) in the standalone dump showing the
    timeframe of the PSA overlay and confirm (via IEAVESAR records)
    that an ICSF task encountered one or more ABEND0C4 PIC4 abends
    shortly before the overlay was noticed. These ABEND0C4 PIC4s
    are likely indicative of ICSF attempting to write (CTRACE
    entries) into low-core storage.
    
    ADDITIONAL KEYWORDS:
    CATKEYS: CATHANG
    

Local fix

  • BYPASS/CIRCUMVENTION:
    As a local workaround, there are a few options to prevent this
    PSA overlay.  Option 1, which we think would be easiest to
    implement, is to issue SETICSF commands just prior to the 'P
    CSF' to dynamically turn off all key usage and key lifecycle
    auditing.  This will flush any pending writes from the buffer
    such that the ICSF termination flow will not encounter any still
    needing to be written.  The following console commands can be
    issued to turn off the auditing and then stop ICSF:
    
    SETICSF OPT,AUDITKEYLIFECKDS,LAB=NO,TOK=NO
    SETICSF OPT,AUDITKEYLIFEPKDS,LAB=NO,TOK=NO
    SETICSF OPT,AUDITKEYLIFETKDS,TOKO=NO,SESSO=NO
    SETICSF OPT,AUDITKEYUSGCKDS,LAB=NO,TOK=NO
    SETICSF OPT,AUDITKEYUSGPKDS,LAB=NO,TOK=NO
    SETICSF OPT,AUDITPKCS11USG,TOKO=NO,SESSO=NO,NOKEY=NO
    P CSF
    
    Since this would be a dynamic change, the auditing would revert
    to its original ICSF options dataset settings the next time ICSF
    is started.  There is no harm in turning off auditing options
    that are already off, but you'll want to ensure you turn off all
    the options that may be ON.  (Confirm by reviewing the audit
    settings that are ON in the LPAR's ICSF options data set, and
    ensuring SETICSF is used to turn them all OFF).
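
For sites that script their shutdown, the command order for option 1
can be kept in one place. The Python sketch below only builds and prints
the command text listed above; it does not issue console commands, and
the names AUDIT_OFF and stop_sequence are hypothetical.

```python
# Audit options and the keyword settings that turn them off, copied from
# the command list in this APAR.
AUDIT_OFF = {
    "AUDITKEYLIFECKDS": "LAB=NO,TOK=NO",
    "AUDITKEYLIFEPKDS": "LAB=NO,TOK=NO",
    "AUDITKEYLIFETKDS": "TOKO=NO,SESSO=NO",
    "AUDITKEYUSGCKDS": "LAB=NO,TOK=NO",
    "AUDITKEYUSGPKDS": "LAB=NO,TOK=NO",
    "AUDITPKCS11USG": "TOKO=NO,SESSO=NO,NOKEY=NO",
}


def stop_sequence():
    """Return the SETICSF commands followed by the stop command, in order."""
    commands = [f"SETICSF OPT,{opt},{vals}" for opt, vals in AUDIT_OFF.items()]
    commands.append("P CSF")  # the stop must come after all auditing is off
    return commands


if __name__ == "__main__":
    for command in stop_sequence():
        print(command)
```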
    
    The second option, if you are concerned about possibly losing
    some auditing records by turning the auditing off while
    ICSF-related work is still active on your system, is to ensure
    all ICSF-related work is terminated prior to stopping ICSF.
    This can be done by shutting down OMVS and JES before stopping
    ICSF.
    
    The third option would be a combination of the two options
    proposed above -- First shut down OMVS and JES, and then issue
    the SETICSF OPT commands just prior to issuing the STOP CSF.
    
    
    RECOVERY ACTION:
    If this overlay is encountered, an IPL will be required to
    fully clean up.  Please capture a standalone dump to confirm
    root cause matches this APAR.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * Users of ICSF                                                *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * During ICSF termination, ICSF CTRACE                         *
    * processing attempts to continue                              *
    * writing entries without validating                           *
    * that the CTRACE buffer pointer is                            *
    * still valid.                                                 *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    Problem Summary
    ------------------------------------------------------------
    During ICSF termination, ICSF CTRACE processing fails to
    validate the CTRACE buffer pointer.
    

Problem conclusion

  • Problem Conclusion
    ------------------------
    CSFMICTB will validate the CTRACE buffer pointer prior to use.
    In addition, CTRACE will be terminated later in STOP processing
    to allow for any final CTRACE entries to be recorded.
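
The kind of check described above can be illustrated in Python. The APAR
does not say how CSFMICTB validates the pointer, so this sketch assumes
the simplest rule: a zero address is treated as unusable and the entry
is dropped. The function name and return convention are hypothetical.

```python
def next_buffer(buf_pt, cur_idx):
    """Pick the next trace buffer, refusing one whose pointer is zero.

    buf_pt is the list of buffer addresses and cur_idx the 1-based index
    of the buffer that just filled. Returns (new_idx, base_address), or
    None when the next buffer has no valid address (assumed rule).
    """
    new_idx = cur_idx % len(buf_pt) + 1
    base = buf_pt[new_idx - 1]
    if base == 0:
        return None  # caller skips the trace entry instead of storing at zero
    return new_idx, base


if __name__ == "__main__":
    after_term = [0, 0x20000, 0x30000, 0x40000]  # made-up addresses
    print(next_buffer(after_term, 4))  # wrap from buffer 4 to buffer 1
    print(next_buffer(after_term, 1))
```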
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    OA61283

  • Reported component name

    ICSF/MVS

  • Reported component ID

    568505101

  • Reported release

    7D0

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-04-16

  • Closed date

    2021-05-18

  • Last modified date

    2021-06-01

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UJ05632 UJ05633 UJ05634 UJ05637

Modules/Macros

  • CSFMICTC CSFINIT  CSFMICTB CSFINIT2
    

Fix information

  • Fixed component name

    ICSF/MVS

  • Fixed component ID

    568505101

Applicable component levels

  • R7D0 PSY UJ05633

       UP21/05/19 P F105

  • R7C1 PSY UJ05632

       UP21/05/19 P F105

  • R7D1 PSY UJ05634

       UP21/05/19 P F105

  • R7C0 PSY UJ05637

       UP21/05/19 P F105

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.


Document Information

Modified date:
03 June 2021