
Peer recovery for data sharing members

Peer recovery enables a Db2 data sharing group to automatically recover retained locks for failed members, without the need for other automation tools or components.

Peer recovery is controlled by the PEER_RECOVERY subsystem parameter. For each member, you can specify whether it assists failed members, receives assistance from peers when it fails, or both.
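The roles described above can be modeled with a short sketch. It assumes that PEER_RECOVERY accepts the settings NONE, RECOVER, ASSIST, and BOTH; confirm the values against the subsystem parameter reference for your Db2 level. The Python names (PeerRecovery, eligible_assistants) and the member names are hypothetical and are not a Db2 interface.

```python
from enum import Enum


class PeerRecovery(Enum):
    # Assumed settings of the PEER_RECOVERY subsystem parameter.
    NONE = "NONE"        # member neither assists nor is assisted
    RECOVER = "RECOVER"  # member receives assistance from peers when it fails
    ASSIST = "ASSIST"    # member assists failed peers
    BOTH = "BOTH"        # member assists peers and receives assistance


def can_assist(setting):
    return setting in (PeerRecovery.ASSIST, PeerRecovery.BOTH)


def can_be_recovered(setting):
    return setting in (PeerRecovery.RECOVER, PeerRecovery.BOTH)


def eligible_assistants(failed, settings):
    """Return the members that may attempt peer recovery for `failed`.

    `settings` maps each member name to its PeerRecovery setting.
    """
    if not can_be_recovered(settings[failed]):
        return []
    return sorted(
        name
        for name, setting in settings.items()
        if name != failed and can_assist(setting)
    )
```

In this model, a member set to ASSIST helps its peers but is not itself restarted by them, and a member set to NONE is outside peer recovery entirely.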

Peer recovery is accomplished by an assisting member initiating a restart light operation for the failed member. The restart uses the LIGHT(YES) option, along with the subsystem parameter module that the failed member last used. The first assisting peer member to obtain the lock for the failed member attempts the restart light operation. If that restart fails, the same assisting member does not issue another restart light operation for that instance of the failure, so each assisting member tries to restart the failed member only one time. If all of the restart attempts fail, the failed member remains failed, and manual intervention is required to restart the member and remove its retained locks.
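The retry behavior can be illustrated with a minimal simulation. This is a sketch of the rule "each assisting member tries only once per failure instance", not of Db2 internals: the order in which peers obtain the lock is modeled as a plain list, and try_restart_light is a hypothetical callback that stands in for the restart light operation.

```python
def peer_recover(failed_member, assistants, try_restart_light):
    """Simulate peer recovery for one instance of a member failure.

    assistants: assisting members, in the order they obtain the lock
        for the failed member (an assumption of this model).
    try_restart_light: hypothetical callable (assistant, failed_member)
        that returns True if the restart light operation succeeds.

    Returns (recovered, attempted), where `attempted` lists the members
    that tried the restart, each exactly once.
    """
    attempted = []
    for assistant in assistants:
        attempted.append(assistant)  # one attempt per assisting member
        if try_restart_light(assistant, failed_member):
            return True, attempted
    # Every assisting member has tried once: the member remains failed
    # and must be restarted manually to remove its retained locks.
    return False, attempted
```

For example, if only a third member could restart the failed one, the first two would each make a single failed attempt before the third succeeds. If no member succeeds, the function returns False after one attempt per member, which corresponds to the case that requires manual intervention.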

If a member fails before it joins the XCF group, peer recovery is not used. This situation is unlikely because the member joins the XCF group early in the restart process.

Before using peer recovery, ensure that every data sharing member can start on every LPAR.

In coexistence mode, members on Db2 11 do not participate in peer recovery.

Attention: Do not use peer recovery if you use z/OS® Automatic Restart Manager (ARM) or other automation to restart failed members.