Peer recovery is the process by which one instance of DFSMStvs
completes the work that another instance of DFSMStvs left in an
incomplete state when it failed. The only function of
peer recovery is to complete that work and then end; a peer recovery
instance of DFSMStvs does not accept any new work.
Peer recovery occurs only in cases of system failure, not merely
when DFSMStvs fails. These rules apply to peer recovery:
- Peer recovery occurs only when both DFSMStvs and the system on
which it was running fail. Peer recovery does not occur when the DFSMStvs
instance failed but the system continued to run. In that
case, either DFSMStvs restarts automatically, or the DFSMStvs
instance was stopped and was not meant to be restarted.
- All resource managers that had shared interest in
units of recovery restart on the same system. For this reason, DFSMStvs uses the
automatic restart manager (ARM) to manage the grouping. If the installation
is not using ARM, or if ARM is unavailable, you can manually initiate
peer recovery by issuing the VARY SMS command, as follows:
VARY SMS,TRANVSAM(001),PEERRECOVERY,ACTIVE
Issue the VARY SMS command on another system. That system then runs peer
recovery for the instance specified on the command, as well as for any
instances for which that instance was performing peer recovery.
Normally, if a DFSMStvs instance was disabling or disabled due to an
operator command, peer recovery does not run, because that instance of
DFSMStvs was told to come down and not restart. If you want peer recovery
to occur anyway, issue this command:
VARY SMS,TRANVSAM(001),PEERRECOVERY,ACTIVEFORCE
- A peer recovery DFSMStvs instance is started only if a primary
DFSMStvs instance is running on the system. If
DFSMStvs is not started on the system, peer recovery does not run.
- Peer recovery starts asynchronous tasks to process
outstanding units of recovery in parallel. As the tasks complete,
additional tasks are started, until all the outstanding units of recovery
are processed or an operator command is issued to request the end
of peer recovery processing. If such a command is issued, tasks that
are already running are allowed to complete, then peer recovery processing
ends.
- Peer recovery is allowed to run if the state of the failed DFSMStvs
instance was quiescing, enabling, or enabled, since peer recovery
would only complete the quiesce process. It is not allowed if the
state was quiesced since, in this case, there should be no work to
do. It also is not allowed if the state was disabled; this implies
that the installation did not want the DFSMStvs instance to do any
work. If you want peer recovery to be performed on behalf of a DFSMStvs
instance that was disabling or disabled, you must use the VARY SMS
command with the ACTIVEFORCE keyword.
If the DFSMStvs instance was
disabling due to an RRS failure, peer recovery is allowed to
run. In this case, DFSMStvs is reinitialized when RRS becomes available
again.
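
The rules above can be summarized as a small decision function. The following Python sketch is illustrative only and is not part of DFSMStvs: the names TvsState and peer_recovery_allowed, and all parameter names, are hypothetical. It also assumes that the ACTIVEFORCE keyword does not override the quiesced state, which the rules above do not state explicitly.

```python
from enum import Enum


class TvsState(Enum):
    """States of the failed DFSMStvs instance (names are illustrative)."""
    ENABLING = "enabling"
    ENABLED = "enabled"
    QUIESCING = "quiescing"
    QUIESCED = "quiesced"
    DISABLING = "disabling"
    DISABLED = "disabled"


def peer_recovery_allowed(state, *, system_failed, primary_running,
                          force=False, rrs_failure=False):
    """Return True if peer recovery may run for a failed instance.

    state           -- TvsState of the failed instance when it failed
    system_failed   -- True if the system hosting the instance also failed
    primary_running -- True if a primary DFSMStvs instance is running on
                       the system that would perform peer recovery
    force           -- True if ACTIVEFORCE was specified on VARY SMS
    rrs_failure     -- True if the instance was disabling due to an RRS failure
    """
    # Peer recovery occurs only when both DFSMStvs and its system failed.
    if not system_failed:
        return False
    # A peer recovery instance starts only alongside a primary instance.
    if not primary_running:
        return False
    # Quiescing, enabling, and enabled instances are eligible.
    if state in (TvsState.QUIESCING, TvsState.ENABLING, TvsState.ENABLED):
        return True
    # Quiesced: there should be no work to do.
    # Assumption: ACTIVEFORCE does not change this outcome.
    if state is TvsState.QUIESCED:
        return False
    # Disabling because of an RRS failure is still eligible.
    if state is TvsState.DISABLING and rrs_failure:
        return True
    # Disabling or disabled by operator command: only with ACTIVEFORCE.
    return force
```

For example, under these assumptions, peer_recovery_allowed(TvsState.DISABLED, system_failed=True, primary_running=True) is false, and the same call with force=True is true, which mirrors the difference between the ACTIVE and ACTIVEFORCE keywords.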