Alternate CPU recovery (ACR)

ACR is initiated on an operative CPU when that CPU receives a signal that another CPU has had an ending error. ACR has two major functions:
  • To configure offline the malfunctioning CPU
  • To initiate the release of system resources held on the malfunctioning CPU

If the failing CPU has an Integrated Cryptographic Feature (ICRF), the ICRF is also taken offline.

ACR initiates the release of any resources held on the failing CPU by passing control to the recovery routines for the work that was running on that CPU. ACR allows the operating system to continue normal operation on the remaining CPU(s), although the task that was interrupted by the error on the failing CPU might be ended.
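The sequence described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not how the operating system implements ACR: the `Cpu` class, the `alternate_cpu_recovery` function, and the log strings are all hypothetical names invented for this example, and the recovery routines are stand-in callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Cpu:
    """Hypothetical stand-in for one CPU in the configuration."""
    cpu_id: int
    online: bool = True
    has_crypto_feature: bool = False
    crypto_online: bool = False
    # Stand-ins for the recovery routines of the work running on this CPU.
    recovery_routines: List[Callable[[], str]] = field(default_factory=list)


def alternate_cpu_recovery(cpus: Dict[int, Cpu], failing_id: int) -> List[str]:
    """Toy model of ACR, run on behalf of an operative CPU."""
    failing = cpus[failing_id]
    log: List[str] = []

    # Function 1: configure the malfunctioning CPU offline,
    # together with its cryptographic feature if it has one.
    failing.online = False
    if failing.has_crypto_feature:
        failing.crypto_online = False
        log.append(f"crypto feature on CPU {failing_id} offline")

    # Function 2: initiate the release of resources by passing control
    # to the recovery routines for the work on the failing CPU.
    for routine in failing.recovery_routines:
        log.append(routine())
    failing.recovery_routines.clear()

    # Completion report (illustrative wording, not real message text).
    log.append(f"ACR complete, CPU {failing_id} offline")
    return log


# Usage: CPU 1 fails while holding a resource; CPU 0 keeps running.
cpus = {
    0: Cpu(0),
    1: Cpu(1, has_crypto_feature=True, crypto_online=True,
           recovery_routines=[lambda: "released lock A"]),
}
events = alternate_cpu_recovery(cpus, failing_id=1)
```

In this model the surviving CPU is untouched, which mirrors the point that the system continues on the remaining CPU(s) while only the interrupted work is affected.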

When ACR is complete, it issues message IEA858E, which identifies the CPU that was configured offline. At this point, the operator can try to configure the failing CPU back online using a CONFIG CPU(x),ONLINE command. The attempt might or might not succeed, depending on the error that caused the CPU to be configured offline.

Some hardware malfunctions might cause a subsequent CONFIG CPU(x),ONLINE command for that CPU to fail, or might cause the problem to recur when the CPU is brought back online. In these cases, hardware support personnel need to service the CPU before it can be successfully brought back into the system.

However, if a CPU was configured offline because a threshold was reached or because of an operating system problem, a subsequent request to configure the CPU back online might work.
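The operator's choice after ACR completes can be summarized as a simple decision rule. The following Python sketch is a simplification under stated assumptions: the `OfflineCause` categories and the `online_command_if_worth_trying` helper are hypothetical names for this example, and the rule treats every hardware malfunction as needing service first, even though only some hardware malfunctions prevent a successful reconfiguration.

```python
from enum import Enum, auto
from typing import Optional


class OfflineCause(Enum):
    """Hypothetical categories for why a CPU was configured offline."""
    HARDWARE_MALFUNCTION = auto()
    THRESHOLD_REACHED = auto()
    OPERATING_SYSTEM_PROBLEM = auto()


def online_command_if_worth_trying(cpu_id: str, cause: OfflineCause) -> Optional[str]:
    """Return the command text to try, or None if service should come first.

    cpu_id is the CPU identifier exactly as it would be entered
    in the CONFIG command.
    """
    if cause is OfflineCause.HARDWARE_MALFUNCTION:
        # Simplifying assumption: call hardware support before retrying,
        # because the command might fail or the problem might recur.
        return None
    # A threshold or operating system problem: a retry might work.
    return f"CONFIG CPU({cpu_id}),ONLINE"


# Usage with an illustrative CPU identifier.
command = online_command_if_worth_trying("1", OfflineCause.THRESHOLD_REACHED)
```

Returning `None` rather than raising an error keeps the helper usable in a checklist-style script, where the absence of a command simply means the next step is to contact hardware support.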