Immediate copy failure reason codes

This topic lists and explains the failure codes that can be returned with a Rewind-Unload command when one or more of the immediate copies associated with the volume did not complete.

If an immediate copy fails before the Rewind-Unload command completes, a host must be able to isolate the reason for the failure. Enhanced sense data is captured to provide the extra information that software or the operator can use to determine subsequent recovery actions. Failure reason codes are returned that contain the enhanced sense data. Upon receipt of these codes, the software or operator can choose to continue with the copies completed, resubmit the job, or fail the job. Possible reasons for an immediate copy failure include:
  • Copies are disabled because the cluster is out of physical scratch volumes
  • Copies are disabled because the cluster is out of cache space
  • Copies have been disabled using a Host Console Request
  • A cluster has entered the Service state
The failure reason codes identify the applicable reason for the failure, and indicate whether the failure is expected (due to service or copy disabled) or unexpected (due to link failures or copy timeout). The sense data also includes counts of the immediate copies required and the number actually completed.

Sense data: immediate copy count

The firmware storage manager (FSM) Error field displays an error code in the following format: X'1C'. It provides information about the number of copies that were required and the number of copies that were completed, given the condition identified by the reason code in byte 19:
Reason codes X'00'-X'7F'
The condition affecting the number of copies was unexpected.
Reason codes X'80'-X'FF'
The condition affecting the number of copies was expected.
The following sense data, specified in byte 18, is also provided:
Bits 0-3
Copies expected
Number of immediate mode copies required based on management class definition
Copy count override, including the source copy
Bits 4-7
Copies completed
Number of immediate mode copies completed, including the source copy
Note: When operating in a System z® (z/OS®) environment, this sense information is also provided and formatted in the IOS000I message.
Table 1 and Table 2 list the possible reason codes and their conditions. The conditions are valid when:
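The layout of bytes 18 and 19 described above can be illustrated with a minimal decoding sketch. It assumes IBM bit numbering, in which bit 0 is the most significant bit, so bits 0-3 are taken as the high nibble of byte 18 and bits 4-7 as the low nibble. The names `ImmediateCopySense` and `decode_immediate_copy_sense` are hypothetical, and extracting the two bytes from the sense buffer is left to the caller.

```python
from typing import NamedTuple


class ImmediateCopySense(NamedTuple):
    """Decoded view of sense bytes 18 and 19 (hypothetical helper type)."""
    copies_expected: int     # byte 18, bits 0-3
    copies_completed: int    # byte 18, bits 4-7
    reason_code: int         # byte 19
    expected_condition: bool  # True for X'80'-X'FF', False for X'00'-X'7F'


def decode_immediate_copy_sense(byte18: int, byte19: int) -> ImmediateCopySense:
    """Split byte 18 into its two counts and classify the reason code in byte 19.

    Assumption: bit 0 is the most significant bit (IBM convention), so
    bits 0-3 are the high nibble and bits 4-7 are the low nibble.
    """
    for name, value in (("byte18", byte18), ("byte19", byte19)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must be in the range 0x00-0xFF, got {value!r}")

    copies_expected = (byte18 >> 4) & 0x0F
    copies_completed = byte18 & 0x0F
    # Reason codes X'80'-X'FF' describe expected conditions;
    # X'00'-X'7F' describe unexpected conditions.
    expected_condition = byte19 >= 0x80
    return ImmediateCopySense(copies_expected, copies_completed,
                              byte19, expected_condition)


if __name__ == "__main__":
    sense = decode_immediate_copy_sense(0x32, 0x82)
    print(f"required={sense.copies_expected} completed={sense.copies_completed} "
          f"reason=0x{sense.reason_code:02X} expected={sense.expected_condition}")
```

Under these assumptions, a byte 18 value of X'32' decodes as three copies required and two completed, and a byte 19 value of X'82' falls in the expected range.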
  • Not all copies or the required RUN copies defined in Copy Override Settings have succeeded.
  • The number of required copies is not satisfied due to one or more conditions described in the Description column.
Table 1. Immediate copy failure reason codes, unexpected reasons
Completion code Reason code tag Description
0x01 GCM_RUN_FAIL_NO_SOURCE All the consistent copy sources are unavailable.

The reason is undetermined and can include: host copy is disabled, cluster is offline, an unexpected outage occurred, or the cluster is in forced service mode.

0x02 GCM_RUN_FAIL_CL_OUTAGE At least one copy source is available.

One or more target or source clusters have experienced an unexpected outage (offline, dual link failures, or other).

0x03 GCM_RUN_FAIL_JOB_TIMEOUT At least one copy source is available.

The target cluster is running, but the monitor cluster detects that the job has timed out because it is taking too long (40 minutes or longer).

None of the failed sites have an unexpected outage.

0x04 GCM_RUN_FAIL_JOB_DOWNGRADE At least one copy source is available.

One or more targets downgraded a copy job from immediate to immediate-deferred after 3 unsuccessful copy attempts, or because all the sources are already migrated.

None of the failed sites have an unexpected outage.

None of the failed sites have a job timeout.

0x05 GCM_RUN_FAIL_UNEXPECT_RSN At least one copy source is available.

One or more targets have an unexpected outage that is not categorized by any other reason code.

Table 2. Immediate copy failure reason codes, expected reasons
Completion code Reason code tag Description
0x80 GCM_RUN_FAIL_NO_SCRATCH At least one copy source is available.

One or more targets are copy-disabled due to an out-of-physical-scratch state.

None of the failed sites have an unexpected error reason.

0x81 GCM_RUN_FAIL_LOW_CACHE_RESOURCE At least one copy source is available.

One or more targets are copy-disabled due to low cache resources.

None of the failed sites have an unexpected error reason.

None of the failed sites are copy-disabled due to an out-of-physical-scratch state.

0x82 GCM_RUN_FAIL_CL_IN_SERVICE At least one copy source is available.

One or more targets or sources are in service.

None of the failed sites have an unexpected error reason.

None of the failed sites are copy-disabled due to an out-of-physical-scratch state.

None of the failed sites are in a low-cache-resources state.

0x83 GCM_RUN_FAIL_HOST_COPY_DISABLE At least one copy source is available.

One or more targets or sources are host copy-disabled.

None of the failed sites have an unexpected error reason.

None of the failed sites are copy-disabled due to an out-of-physical-scratch state.

None of the failed sites are in a low-cache-resources state.

0x84 GCM_RUN_FAIL_HOT_LVOL At least one copy source is available.

The copy is deferred due to a hot token at the target cluster.

Reconciliation is required first.

None of the sources or targets are host copy-disabled.

None of the failed sites have an unexpected error reason.

None of the failed sites are copy-disabled due to an out-of-physical-scratch state.

None of the failed sites are in a low-cache-resources state.

If multiple reason codes are applicable, the reason code with the lowest number is surfaced. This is because reason code numbers are inversely related to the critical level of the situation: a lower code number indicates a more critical situation in the domain. For example, consider a three-way domain where all three clusters have RUN copy mode enabled and no copy override settings are defined. When one target is in service and another target is offline, the following two reason codes are applicable:
  • 0x02: GCM_RUN_FAIL_CL_OUTAGE
  • 0x82: GCM_RUN_FAIL_CL_IN_SERVICE
However, only the reason code with the smaller number (0x02) is returned.
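The precedence rule above can be sketched as a simple minimum over the applicable codes. This is an illustration only, not the product's implementation: the `REASON_TAGS` mapping repeats the codes and tags from Table 1 and Table 2, and the function name `surfaced_reason_code` is hypothetical.

```python
from typing import Iterable

# Reason codes and tags as listed in Table 1 (unexpected) and Table 2 (expected).
REASON_TAGS = {
    0x01: "GCM_RUN_FAIL_NO_SOURCE",
    0x02: "GCM_RUN_FAIL_CL_OUTAGE",
    0x03: "GCM_RUN_FAIL_JOB_TIMEOUT",
    0x04: "GCM_RUN_FAIL_JOB_DOWNGRADE",
    0x05: "GCM_RUN_FAIL_UNEXPECT_RSN",
    0x80: "GCM_RUN_FAIL_NO_SCRATCH",
    0x81: "GCM_RUN_FAIL_LOW_CACHE_RESOURCE",
    0x82: "GCM_RUN_FAIL_CL_IN_SERVICE",
    0x83: "GCM_RUN_FAIL_HOST_COPY_DISABLE",
    0x84: "GCM_RUN_FAIL_HOT_LVOL",
}


def surfaced_reason_code(applicable: Iterable[int]) -> int:
    """Return the one code that would be reported when several apply.

    Lower numbers indicate a more critical situation, so the smallest
    applicable code is the one surfaced.
    """
    codes = set(applicable)
    if not codes:
        raise ValueError("at least one applicable reason code is required")
    unknown = codes.difference(REASON_TAGS)
    if unknown:
        raise ValueError(f"unknown reason codes: {sorted(unknown)}")
    return min(codes)


if __name__ == "__main__":
    # One target in service (0x82) and another target offline (0x02).
    code = surfaced_reason_code([0x82, 0x02])
    print(f"0x{code:02X}: {REASON_TAGS[code]}")
```

Because every unexpected code (X'01'-X'05') is numerically lower than every expected code (X'80'-X'84'), taking the minimum also means an unexpected condition always takes precedence over an expected one.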