Recovering from a Coordinated CKDS administration failure
z/OS Cryptographic Services ICSF Administrator's Guide SA22-7521-17
This information describes how to use ICSF diagnostic information to recover from a coordinated CKDS administration failure. The coordinated CKDS administration functions perform multiple steps to validate the environment, including verifying master key registers across the CKDS sysplex cluster and validating the CKDSs involved in the operation. If the environment is verified and meets the criteria for the operation, the initiating system of the coordinated CKDS administration function attempts to coordinate the function across all members of the CKDS sysplex cluster (all ICSF instances sharing the same active CKDS).

Coordinated CKDS change master key or coordinated CKDS refresh messages

The coordinated CKDS refresh and coordinated CKDS change master key dialogs result in one or more dialog messages indicating the success or failure of the operation. In the case of a failure, there should be enough information in the dialog message to identify the problem. If there is not enough information in the dialog, you must use the ICSF job log to further identify the problem.

During coordinated CKDS change master key and coordinated CKDS refresh, a sequence of messages is written to the ICSF job log. CSFM622I messages are written to provide status for internal steps taken by the function. For example, one of the very first steps for a coordinated CKDS change master key operation is to make a copy of the in-storage KDS that will be used for the subsequent reencipher step. When this copy is made, the following CSFM622I message is written to the ICSF job log.
If a failure occurs during a coordinated CKDS change master key or coordinated CKDS refresh operation, failure messages are written to the ICSF job log that provide diagnostic information for determining the cause of the problem. Depending on how far the function has progressed, steps may be required to back out of the overall operation. CSFM622I messages are also used to provide status for backout steps. Additionally, all failure cases end with the following CSFM616I message to provide further diagnostic information.
An explanation of the return code and reason code provided in the CSFM616I message can be found in the "Return and Reason Codes" section of the z/OS Cryptographic Services ICSF Application Programmer’s Guide. The rest of the information in this message is IBM internal diagnostic information.

The sequence of messages written to the ICSF job log during a coordinated CKDS change master key or coordinated CKDS refresh should indicate how far along the function progressed and, if a failure occurred, should include enough diagnostic information to determine the cause of the problem. Use the CSFM622I messages to determine how far along the function progressed before the failure. Then use the failure messages to determine why the problem occurred.

New master key register mismatch

For a coordinated CKDS change master key, all CKDS sysplex cluster members must have their symmetric (DES and/or AES) new master key registers pre-loaded with the same master key values. Either the DES or AES new master key registers may be pre-loaded, or both may be pre-loaded on all CKDS sysplex cluster members. If a CKDS sysplex cluster member's symmetric new master key registers do not match the initiator’s new master key registers, the following error message will be displayed on the dialog. In this example, it is the AES new master key register that does not match.
In addition, the following message will be written to the ICSF job log.
To resolve this problem, the security administrator should compare all CKDS sysplex cluster members’ symmetric new master key registers to ensure they match the initiator’s exactly. If a CKDS sysplex cluster member’s symmetric new master key registers do not match, the security administrator should re-load them or clear them to match the values on the initiating system. Additional information about the failure can be determined by looking up the return and reason codes in the "Return and Reason Codes" section of the z/OS Cryptographic Services ICSF Application Programmer’s Guide.

Cataloged failures

If any of the data sets used by the operation are not cataloged, one of the following dialog messages will be displayed:
In addition, a CSFM619I and/or a CSFM623I message will be written to the ICSF job log. To correct this problem, make sure the necessary data sets are cataloged and retry the function.

Mainline processing failure

If a coordinated CKDS change master key or coordinated CKDS refresh operation fails during one of its internal mainline processing steps, a dialog message will be displayed indicating the problem and a CSFM620I message will be written to the ICSF job log. For example, when using the rename option, if the active CKDS cannot be renamed to the archive data set name, the following dialog message will be displayed:
In addition, the following message will be written to the ICSF job log:
To correct this problem, the security administrator and/or system programmer should determine if there is a conflict with the archive data set name that caused the failure. The CSFM620I message is also used for other internal mainline processing failures, such as a problem loading or processing the target or backup data sets. In either case, the CSFM620I message should provide enough information for the security administrator and/or system programmer to further investigate the problem.

Backout processing failure

If a failure occurs during mainline processing of a coordinated KDS change master key or coordinated KDS refresh, backout processing will attempt to undo any steps that have already completed in the operation. A CSFM620I message will be written to the ICSF job log to indicate the mainline processing failure. Additionally, backout processing messages will be written to the ICSF job log indicating the status of the backout. If backout processing fails, a dialog message will indicate the problem. For example:
A series of CSFM622I messages will be written to the ICSF job log to track the status of the backout steps. If there is a failure during backout processing, a CSFM621I message will be written to the ICSF job log indicating the failure. When a failure in backout processing occurs, use the overall sequence of CSFM620I, CSFM621I, and CSFM622I messages to determine which step the function failed on, and which step failed during backout processing. In this situation, it is likely that other messages listed in this section are also written to the ICSF job log to help determine the root cause of the problem.

Set master key failure

If there was a problem setting the master key on either the initiating system or a target system of a coordinated CKDS change master key, a dialog message will indicate the failure and a CSFM625I message will be written to that system’s ICSF job log. For example, if the step for setting the AES master key fails, the following dialog message will be displayed:
The following message will be written to the ICSF job log for this failure.
If this failure occurs on the initiating system, the entire change master key process will be canceled and the target systems will not be affected by the operation. Check the status of the coprocessor with the serial number identified in the message to determine if it requires maintenance.

If this failure occurs on a target system, the initiating system and other target systems may have successfully changed their master key. If the initiating system has set the master key and completed the coordinated CKDS change master key function, the active CKDS is now reenciphered under the new master key. Check the status of the coprocessor with the serial number identified in the message to determine if it requires maintenance. After the coprocessor's status is resolved, the target system must perform a single-system change master key in order to remain in sync with the active CKDS. Follow the steps in Reentering master keys when they have been cleared (PCIXCC, CEX2C, or CEX3C).

Back-level ICSF releases in the sysplex

The coordinated CKDS change master key and coordinated CKDS refresh functions are only available if all ICSF instances in the CKDS sysplex group are running FMID HCR7790 or later. If an ICSF instance at a level lower than HCR7790 joins the sysplex group, a CSFM631I message (indicating all downlevel systems) will be written to the ICSF job log and the operation will fail. To resolve this problem, all downlevel systems must either be removed from the CKDS sysplex group or upgraded to HCR7790 or higher. If this is not possible, the coordinated CKDS change master key and coordinated CKDS refresh functions cannot be used. The single-system CKDS change master key and single-system CKDS refresh can be used with ICSF instances running at supported FMID levels.
Rename failures

If there is a failure during the optional rename step of coordinated CKDS change master key or coordinated CKDS refresh, CSFM629I and CSFM630I messages will be written to the ICSF job log to indicate the reason for the failure. The rename function uses the IDCAMS processor to perform the actual VSAM data set rename. CSFM629I messages are used to route IDCAMS processor messages to the ICSF job log when the IDCAMS processor fails to perform the rename. The CSFM629I messages contain the reason for the IDCAMS failure. These messages are followed by a CSFM630I message that indicates which data set name failed to be renamed to which new name.

CKDS data sets are KSDS (key-sequenced) VSAM data sets. They consist of 3 parts: a cluster name, a data name, and an index name. For example, if you use the sample JCL provided in the "Steps to create the CKDS" section of the z/OS Cryptographic Services ICSF System Programmer’s Guide, the cluster name, data name, and index name will be the following, in order:

CSF.CSFCKDS
CSF.CSFCKDS.DATA
CSF.CSFCKDS.INDEX

When the rename option is selected, all 3 parts of the active CKDS will be renamed to the archive name, and all 3 parts of the target CKDS will be renamed to the active name. When renaming the data and index portions of a CKDS VSAM data set, the suffix format of the original data set is maintained. For example, if the preceding data set names are used for the active CKDS, and the archive data set name is specified as CSF.CSFCKDS.ARC, the 3 portions of the active CKDS will be renamed to:

CSF.CSFCKDS.ARC
CSF.CSFCKDS.ARC.DATA
CSF.CSFCKDS.ARC.INDEX

In the case of a failure during rename processing, the coordinated function will attempt to back out and rename the data sets back to their original names. If the backout fails, you may end up with a partially renamed data set. This can be corrected by performing an IDCAMS ALTER from JCL.

Whenever a rename failure occurs, scan the ICSF job log of the initiating system for CSFM629I and CSFM630I messages. These messages will indicate which data set part failed during rename and whether backout processing was able to rename the data set back to its original name. If backout processing was able to rename back to the original name, check the catalog for the name that could not be used as the rename target. Most likely you have a conflict with the archive data set name and need to either rename existing data sets in your catalog or choose a different archive name. If backout processing failed to rename your data set back to the original name, use ISPF to confirm that the data set parts match what is reported in the CSFM629I and CSFM630I messages.

For example, if during a coordinated CKDS change master key operation the active CKDS cluster name is successfully renamed to the archive name, but the data portion fails to be renamed, backout processing begins. If backout processing fails to rename the data set back to the original active CKDS name, ICSF will shut down all instances in the CKDS sysplex cluster because the active CKDS name is only half renamed. In this scenario, the following set of messages may be reported in the ICSF job log.
This sequence of messages indicates that the active CKDS cluster name of CSF.CSFCKDS was renamed to CSF.CSFCKDS.ARC. Then, the CSF.CSFCKDS.DATA data portion of the active CKDS failed to be renamed to CSF.CSFCKDS.ARC.DATA because another data set in the catalog was already using this name. At this point, the coordinated CKDS change master key function tried to back out and rename the cluster portion of the CKDS from CSF.CSFCKDS.ARC back to its original name of CSF.CSFCKDS. However, the rename failed because another data set with the name CSF.CSFCKDS now exists in the catalog. The result is a half-renamed active CKDS, which causes ICSF to shut down across the CKDS sysplex cluster.

The first step in resolving this problem is to confirm in ISPF that the following data set names reported in the messages above do exist:

CSF.CSFCKDS.ARC
CSF.CSFCKDS.DATA
CSF.CSFCKDS.INDEX

Once this is confirmed, the next step is to rename the cluster name back to the original name manually by calling IDCAMS ALTER from JCL. Before doing that, note that the messages above indicate that backout processing already failed to rename the cluster name back because another data set is now using that name. The data set that has taken that name must first be renamed to a different name, because this name is needed to restore the active CKDS. Once the cluster name conflict has been resolved, issue IDCAMS ALTER from JCL to rename the CSF.CSFCKDS.ARC cluster name back to the original active CKDS name of CSF.CSFCKDS. For example:

//DEFINE EXEC PGM=IDCAMS,REGION=4M
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
  ALTER CSF.CSFCKDS.ARC -
        NEWNAME(CSF.CSFCKDS)
/*

ICSF may be restarted on all instances that were previously taken down. Processing should resume as normal and the coordinated CKDS change master key with rename option may be issued again with an archive data set name that does not have a conflict in the catalog.
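Before restarting ICSF, it may help to confirm that the catalog shows the restored cluster name with its original data and index parts. The following is a minimal sketch and not part of the documented procedure. It assumes the sample data set names used in this section, uses a hypothetical step name (VERIFY), and omits the JOB statement, which is installation-specific. IDCAMS LISTCAT with the ALL option lists the cluster entry together with its associated data and index components.

```jcl
//* Illustrative only: add the JOB statement required by your installation.
//* VERIFY is a hypothetical step name; the data set name is the sample
//* active CKDS name used in this section.
//VERIFY EXEC PGM=IDCAMS,REGION=4M
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
  LISTCAT ENTRIES(CSF.CSFCKDS) ALL
/*
```

If the listing in SYSPRINT shows CSF.CSFCKDS associated with CSF.CSFCKDS.DATA and CSF.CSFCKDS.INDEX, the three parts of the active CKDS are consistent again. The same step, run against the intended archive name, can be used to check for a catalog conflict before the coordinated function is retried.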
Copyright IBM Corporation 1990, 2014