A fix is available
APAR status
Closed as program error.
Error description
After issuing a P CSF (STOP ICSF) console command to terminate ICSF, the console stopped responding, evidence of various PSA overlays was recorded in LOGREC, and multiple abends were seen, including ABEND0D5 from CSFGCARC attempting to PC into ICSF during end-of-memory (EOM) termination processing.

ANALYSIS:
ICSF termination processing (CSFINIT2) calls CSFMICTC with a parm of #TERM to terminate ICSF CTRACE. As part of this CTRACE termination processing, CSFMICTC zeroes the first CTRACE buffer pointer (TcaBufPT(1)). Later, CSFINIT2 tells the ICSF subtasks to terminate. One of these subtasks encounters a failure in CSFVCGKF and decides to write a CTRACE entry (GkfFail). The CTRACE TCA indicates that the first CTRACE buffer address (which had been zeroed) should be used. This causes the ICSF subtask to write the GkfFail CTRACE entry into the PSA.

The PSA overlays caused various system-wide issues, and the ICSF address space eventually encountered an ABEND0DC and entered memterm processing, which hung. Since ICSF did not complete its termination, residual ICSF control blocks (including PC entries in the PC Information table and the CSFGCARC resource manager) were still being used by other address spaces, causing further fallout (including ABEND0D5 on various MASTER tasks).

Several factors determine whether this overlay occurs. First, there must be some number of outstanding keyusage or keylifecycle audit events that need to be hardened during ICSF termination processing. This can happen when applications are actively interacting with ICSF while ICSF is terminating. Second, a large number of these keyusage/keylifecycle updates must be failing (causing many GkfFail CTRACE records to be cut).

Even with these two factors in place, ICSF CTRACE processing (during termination) would need to be writing to its existing 4th CTRACE buffer, fill it up, and switch over to its 1st CTRACE buffer (the only buffer whose address is zeroed during termination). At that point, ICSF would try to write CTRACE entries into address zero, which results in (one or more) unrecorded ABEND0C4 PIC4 failures. ICSF would then need to continue attempting to write CTRACE entries into low-core storage until it increments past the first x'200' bytes of "protected" storage, finally being "allowed" to overwrite the PSA.

If CTRACE buffer 1, 2, or 3 is in use instead and becomes full during ICSF termination, no PSA overlay occurs. If fewer ICSF CTRACE entries are cut than the threshold needed to get past the first x'200' bytes of protected storage, no PSA overlay occurs.

KNOWN IMPACT:
Unpredictable fallout can occur after a PSA overlay. An IPL is required to fully clean up the overlaid storage.

VERIFICATION STEPS:
1. Look for any LOGREC entries cut by IEAVTRTV showing "damaged" storage. The damaged storage will contain ICSF CTRACE eyecatchers. In this case, the CTRACE entry was GkfFail cut by CSFVCGKF and contained eyecatchers such as KEK, KEKHDR, KFPTYPES, and KFPS, but other CTRACE entries are also possible.
2. Locate the CcvtTca pointer and review its contents. TcaBufPT(1) will contain a word of zeros, but indexes 2-4 will contain non-zero addresses. TcaCurIdx = 00000001 and TcaCurent will contain an address within the first page of storage (between zero and x'1000').
3. CcvtCtr=00000000
4. Find systrace (ttch) in the standalone dump showing the timeframe of the PSA overlay and confirm that an ICSF task encountered one or more ABEND0C4 PIC4 abends shortly before the overlay (via IEAVESAR records) was noticed. These ABEND0C4 PIC4s are likely indicative of ICSF attempting to write (CTRACE entries) into low-core storage.

ADDITIONAL KEYWORDS: CATKEYS: CATHANG
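The overlay conditions described in the analysis above can be modeled with a small Python sketch. It is only an illustration of the arithmetic, not ICSF code: the function names and the x'40' entry length used in the usage note are assumptions, while the x'200' protected range, the x'1000' first-page boundary, and the 4-buffer wrap come from the APAR text.

```python
LOW_CORE_PROTECTED = 0x200  # from the APAR: first x'200' bytes are "protected"
FIRST_PAGE_END = 0x1000     # from the APAR: first page of storage
BUFFER_COUNT = 4            # from the APAR: buffers 1 through 4


def next_buffer_index(cur_idx):
    """Return the 1-based buffer index used after cur_idx fills (4 wraps to 1)."""
    return cur_idx % BUFFER_COUNT + 1


def simulate_writes(buffer_ptr, entry_len, attempts):
    """Model repeated CTRACE writes starting at buffer_ptr.

    Returns (failed_writes, overlay_address). overlay_address is None when
    no write landed in unprotected first-page storage.
    entry_len is a hypothetical fixed entry size; real entries may vary.
    """
    cursor = buffer_ptr
    failed = 0
    for _ in range(attempts):
        if cursor < LOW_CORE_PROTECTED:
            # Models an unrecorded ABEND0C4 PIC4; the cursor still advances.
            failed += 1
        elif cursor < FIRST_PAGE_END:
            # The store is no longer blocked: this is the PSA overlay.
            return failed, cursor
        # Any other address models a write into a real trace buffer.
        cursor += entry_len
    return failed, None
```

With an assumed entry length of x'40', `simulate_writes(0, 0x40, 8)` records eight failed writes and no overlay, while a ninth attempt reaches x'200' and overlays the PSA. A non-zero buffer pointer never reaches low storage in this model, which matches the statement that buffers 2 through 4 are not affected.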
Local fix
BYPASS/CIRCUMVENTION:
There are three local workarounds that prevent this PSA overlay.

Option 1, which we think would be easiest to implement, is to issue SETICSF commands just prior to the 'P CSF' to dynamically turn off all key usage and key lifecycle auditing. This flushes any pending writes from the buffer so that the ICSF termination flow does not encounter any audit events still needing to be written. The following console commands can be issued to turn off the auditing and then stop ICSF:

  SETICSF OPT,AUDITKEYLIFECKDS,LAB=NO,TOK=NO
  SETICSF OPT,AUDITKEYLIFEPKDS,LAB=NO,TOK=NO
  SETICSF OPT,AUDITKEYLIFETKDS,TOKO=NO,SESSO=NO
  SETICSF OPT,AUDITKEYUSGCKDS,LAB=NO,TOK=NO
  SETICSF OPT,AUDITKEYUSGPKDS,LAB=NO,TOK=NO
  SETICSF OPT,AUDITPKCS11USG,TOKO=NO,SESSO=NO,NOKEY=NO
  P CSF

Because this is a dynamic change, the auditing reverts to the settings in the ICSF options data set the next time ICSF is started. There is no harm in turning off auditing options that are already off, but you'll want to ensure you turn off all the options that may be ON. (Confirm by reviewing the audit settings that are ON in the LPAR's ICSF options data set, and ensuring SETICSF is used to turn them all OFF.)

Option 2, if you are concerned about possibly losing some auditing records by turning the auditing off while ICSF-related work is still active on your system, is to ensure all ICSF-related work is terminated prior to stopping ICSF. This can be done by shutting down OMVS and JES before stopping ICSF.

Option 3 is a combination of the two options above: first shut down OMVS and JES, and then issue the SETICSF OPT commands just prior to issuing the STOP CSF.

RECOVERY ACTION:
If this overlay is encountered, an IPL will be required to fully clean up. Please capture a standalone dump to confirm that the root cause matches this APAR.
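For sites that script their shutdown procedures, the Option 1 command sequence could be generated from a table, as in the following Python sketch. The option names and keywords are copied from the commands listed in this APAR; the function name and table layout are hypothetical, and the sketch only builds the command text. How the commands are submitted to the console is left to the site's own automation.

```python
# Audit options and their keywords, copied from the commands in this APAR.
AUDIT_OPTIONS = [
    ("AUDITKEYLIFECKDS", ["LAB", "TOK"]),
    ("AUDITKEYLIFEPKDS", ["LAB", "TOK"]),
    ("AUDITKEYLIFETKDS", ["TOKO", "SESSO"]),
    ("AUDITKEYUSGCKDS", ["LAB", "TOK"]),
    ("AUDITKEYUSGPKDS", ["LAB", "TOK"]),
    ("AUDITPKCS11USG", ["TOKO", "SESSO", "NOKEY"]),
]


def build_stop_sequence(options=AUDIT_OPTIONS):
    """Return the console command strings: audit-off commands, then P CSF."""
    commands = []
    for name, keywords in options:
        settings = ",".join(f"{kw}=NO" for kw in keywords)
        commands.append(f"SETICSF OPT,{name},{settings}")
    # The stop command is always last, after all auditing is turned off.
    commands.append("P CSF")
    return commands


if __name__ == "__main__":
    for command in build_stop_sequence():
        print(command)
```

Keeping the stop command inside the same generated list makes it harder to issue 'P CSF' before every audit option has been turned off.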
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
*     Users of ICSF                                            *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
*     During ICSF termination, ICSF CTRACE                     *
*     processing attempts to continue                          *
*     writing entries without validating                       *
*     that the CTRACE buffer pointer is                        *
*     still valid.                                             *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************

Problem Summary
------------------------------------------------------------
During ICSF termination, ICSF CTRACE processing fails to validate the CTRACE buffer pointer.
Problem conclusion
Problem Conclusion
------------------------
CSFMICTB will validate the CTRACE buffer pointer prior to use. In addition, CTRACE will be terminated later in STOP processing to allow for any final CTRACE entries to be recorded.
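The first part of the fix, validating the buffer pointer before use, can be sketched in Python as follows. This is a minimal model under stated assumptions, not the CSFMICTB implementation: the `Tca` class, its field names, and the choice to drop the entry when the pointer is zero are all hypothetical. The APAR states only that the pointer is validated prior to use.

```python
from dataclasses import dataclass, field


@dataclass
class Tca:
    """Hypothetical model of the trace control area described in the APAR."""
    buf_ptrs: list                 # buf_ptrs[0] models TcaBufPT(1)
    cur_idx: int = 1               # 1-based, like TcaCurIdx
    written: list = field(default_factory=list)


def write_entry(tca, entry):
    """Record entry only if the current buffer pointer is non-zero.

    Returns True when the entry was recorded and False when it was dropped
    because the buffer pointer had already been zeroed (assumed behavior).
    """
    ptr = tca.buf_ptrs[tca.cur_idx - 1]
    if ptr == 0:
        # Pointer was cleared by trace termination: do not store anything.
        return False
    tca.written.append((ptr, entry))
    return True
```

In this model, a `Tca` whose first pointer is zero and whose `cur_idx` is 1 rejects a GkfFail entry instead of storing it at address zero, while the same call with `cur_idx` set to 2 records it normally.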
Temporary fix
*********
* HIPER *
*********
Comments
APAR Information
APAR number
OA61283
Reported component name
ICSF/MVS
Reported component ID
568505101
Reported release
7D0
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-04-16
Closed date
2021-05-18
Last modified date
2021-06-01
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UJ05632 UJ05633 UJ05634 UJ05637
Modules/Macros
CSFMICTC CSFINIT CSFMICTB CSFINIT2
Fix information
Fixed component name
ICSF/MVS
Fixed component ID
568505101
Applicable component levels
R7D0 PSY UJ05633
UP21/05/19 P F105
R7C1 PSY UJ05632
UP21/05/19 P F105
R7D1 PSY UJ05634
UP21/05/19 P F105
R7C0 PSY UJ05637
UP21/05/19 P F105
Document Information
Modified date:
03 June 2021