Unit of work recovery

A unit of work in CICS® is also the unit of recovery; that is, it is the atomic component of the transaction, in which the changes made must either all be committed or all be backed out.

A transaction can be composed of a single unit of work or multiple units of work. In CICS, recovery is managed at the unit of work level.

For recovery purposes, CICS recovery manager is concerned only with the units of work that have not yet completed a syncpoint because of some failure. This topic discusses how CICS handles these failed units of work.

The CICS recovery manager has to manage the recovery of the following types of unit of work failure:
In-flight-failed
The transaction fails before the current unit of work reaches a syncpoint, as a result of either a task abend or the abnormal termination of CICS. The transaction is abnormally terminated, and recovery manager initiates backout of any changes made by the unit of work.

See Transaction backout.

Commit-failed
A unit of work fails during commit processing while taking a syncpoint. A partial copy of the unit of work is shunted to await retry of the commit process when the problem is resolved.

This does not cause the transaction to terminate abnormally.

See Commit-failed recovery.

Backout-failed
A unit of work fails while backing out updates to file control recoverable resources. (The concept of backout-failed applies in principle to any resource that performs backout recovery, but CICS file control is the only resource manager to provide backout failure support.) A partial copy of the unit of work is shunted to await retry of the backout process when the problem is resolved.
Note: Although the failed backout may have been attempted as a result of the abnormal termination of a transaction, the backout failure itself does not cause the transaction to terminate abnormally.

For example, if a transaction initiates backout through an EXEC CICS SYNCPOINT ROLLBACK command, CICS returns a normal response (not an exception condition) and the transaction continues executing. It is up to recovery manager to ensure that locks are preserved until backout is eventually completed.

If some resources involved in a unit of work are backout-failed, while others are commit-failed, the unit of work as a whole is flagged as backout-failed.

See Backout-failed recovery.

Indoubt-failed
A distributed unit of work fails while in the indoubt state of the two-phase commit process. The transaction is abnormally terminated. Because of the abend, any units of work that would normally follow the one that failed indoubt are not executed.

A partial copy of the unit of work is shunted to await resynchronization when CICS re-establishes communication with its coordinator. This shunting happens only when the transaction resource definition specifies, with WAIT(YES), that units of work are to wait in the event of failure while indoubt. If the transaction is defined with WAIT(NO), CICS takes the action specified on the ACTION parameter, and the unit of work cannot become indoubt-failed.

See Indoubt failure recovery.
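
The four failure types described in this topic can be summarized in a small illustrative model. The following Python sketch is not CICS code and does not call any CICS interface. The names Failure, Handling, HANDLING, uow_flag, and on_indoubt_failure are hypothetical and exist only for this example, and the ACTION values BACKOUT and COMMIT are an assumption, because this topic does not list them. The sketch records whether each failure type itself abends the transaction, what recovery follows, how a unit of work with mixed resource failures is flagged, and how the WAIT and ACTION settings affect an indoubt failure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Failure(Enum):
    IN_FLIGHT = "in-flight-failed"
    COMMIT = "commit-failed"
    BACKOUT = "backout-failed"
    INDOUBT = "indoubt-failed"


@dataclass(frozen=True)
class Handling:
    transaction_abends: bool  # does this failure abnormally terminate the transaction?
    recovery: str             # what happens to the unit of work afterwards


# Summary of the descriptions in this topic (illustrative wording only).
HANDLING = {
    Failure.IN_FLIGHT: Handling(True, "back out changes made by the unit of work"),
    Failure.COMMIT: Handling(False, "shunt; retry commit when the problem is resolved"),
    Failure.BACKOUT: Handling(False, "shunt; retry backout when the problem is resolved"),
    Failure.INDOUBT: Handling(True, "shunt; resynchronize with the coordinator"),
}


def uow_flag(resource_failures: Iterable[Failure]) -> Optional[Failure]:
    """Flag for the whole unit of work, given per-resource failures.

    Backout-failed takes precedence when some resources are backout-failed
    and others are commit-failed.
    """
    seen = set(resource_failures)
    if Failure.BACKOUT in seen:
        return Failure.BACKOUT
    if Failure.COMMIT in seen:
        return Failure.COMMIT
    return None


def on_indoubt_failure(wait: bool, action: str) -> str:
    """Outcome of a failure while indoubt, given the WAIT and ACTION settings.

    With wait=True the unit of work is shunted (indoubt-failed). With
    wait=False the ACTION setting is taken instead, so the unit of work
    never becomes indoubt-failed. The values "BACKOUT" and "COMMIT" are
    assumed for this sketch.
    """
    if wait:
        return "shunt"
    if action not in ("BACKOUT", "COMMIT"):
        raise ValueError("action must be 'BACKOUT' or 'COMMIT'")
    return action.lower()


if __name__ == "__main__":
    print(uow_flag([Failure.COMMIT, Failure.BACKOUT]).value)  # backout-failed
    print(on_indoubt_failure(wait=False, action="BACKOUT"))   # backout
```

Keeping the per-type handling in a lookup table mirrors the way this topic describes each failure type independently, while the two functions capture the rules that involve more than one setting or resource.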