Recovery procedure on a fault-tolerant agent with the use of the resetFTA command

If the Symphony file is corrupt on a fault-tolerant agent, you can use the resetFTA command to automate the recovery procedure.

Complete removal and replacement of the Symphony file causes some loss of data, such as job status events or the contents of the Mailbox.msg and tomaster.msg message queues. If state information about a job was contained in those queues, that job is rerun. The following procedure minimizes that loss and indicates what is lost. Apply this procedure with caution.

The procedure renames the Symphony, Sinfonia, and *.msg files on the fault-tolerant agent where the Symphony corruption occurred and generates an updated Sinfonia file, which is sent to the fault-tolerant agent. You can therefore resume operations quickly on the affected fault-tolerant agent, minimize loss of job and job stream information, and reduce recovery time.

The procedure involves two agents: the fault-tolerant agent where the Symphony file is corrupt, and its domain manager.

You can start the command from any IBM Workload Scheduler workstation, with the exception of the fault-tolerant agent where the corruption occurred. Connection to the target fault-tolerant agent and to its domain manager is established using the netman port number. The default port number is 31111.
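Because the command depends on a netman connection to both workstations, it can help to confirm that the port is reachable before you start. The following is a minimal sketch of such a check, not part of the product: the function name and the host names in the comment are hypothetical, and the port is the documented default of 31111.

```python
import socket

# Default netman port named in the text; change it if your installation differs.
NETMAN_PORT = 31111


def netman_reachable(host: str, port: int = NETMAN_PORT, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (hypothetical host names for the agent and its domain manager):
#   netman_reachable("fta1.example.com")
#   netman_reachable("dm1.example.com")
```

A successful TCP connection shows only that something is listening on the port. It does not prove that netman is healthy or that resetFTA will succeed.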

When you start the resetFTA command, the following operations are performed in the specified order:
On the fault-tolerant agent
  • The following files are renamed:
    • Appserverbox.msg
    • clbox.msg
    • Courier.msg
    • Intercom.msg
    • Mailbox.msg
    • Monbox.msg
    • Moncmd.msg
    • Symphony
    • Sinfonia
The operations are performed asynchronously, to ensure that all target files have been renamed before starting the procedure on the domain manager.
On the domain manager
  1. A backup of the Sinfonia file is created.
  2. The Symphony file is copied to the Sinfonia file.
  3. The target fault-tolerant agent is linked.
  4. The updated Sinfonia file is sent to the target fault-tolerant agent.
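The file operations listed above can be modeled in a scratch directory to make the order of events concrete. This is an illustrative sketch only and is not how resetFTA is implemented. The function names and the `.bak` suffix are assumptions, because the text does not state the names given to the renamed files. Steps 3 and 4 (linking and sending) are not modeled.

```python
from __future__ import annotations

import shutil
from pathlib import Path

# Files that the text lists as renamed on the fault-tolerant agent.
FTA_FILES = [
    "Appserverbox.msg", "clbox.msg", "Courier.msg", "Intercom.msg",
    "Mailbox.msg", "Monbox.msg", "Moncmd.msg", "Symphony", "Sinfonia",
]


def rename_fta_files(fta_dir: Path, suffix: str = ".bak") -> list[Path]:
    """Rename each listed file that exists. The suffix is an assumption."""
    renamed = []
    for name in FTA_FILES:
        src = fta_dir / name
        if src.exists():
            dst = src.with_name(name + suffix)
            src.rename(dst)
            renamed.append(dst)
    return renamed


def refresh_sinfonia(dm_dir: Path, suffix: str = ".bak") -> Path:
    """Model steps 1 and 2: back up Sinfonia, then copy Symphony over it."""
    sinfonia = dm_dir / "Sinfonia"
    if sinfonia.exists():
        shutil.copy2(sinfonia, sinfonia.with_name("Sinfonia" + suffix))
    shutil.copy2(dm_dir / "Symphony", sinfonia)
    return sinfonia
```

Run the sketch only against temporary directories, never against a live IBM Workload Scheduler installation.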

The syntax of the command is as follows:

Syntax

resetFTA cpu

Arguments

cpu
Is the fault-tolerant agent to be reset.
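If you run the command from a script, a thin wrapper such as the following might be used. It is a sketch under stated assumptions: resetFTA is on the PATH, the IBM Workload Scheduler environment is already set, and the script runs on a workstation other than the affected fault-tolerant agent. The helper names and the workstation name FTA1 are hypothetical, and only the documented `resetFTA cpu` form is used.

```python
from __future__ import annotations

import subprocess


def build_reset_command(cpu: str) -> list[str]:
    """Build the argument list for `resetFTA cpu`, the only syntax the text gives."""
    tokens = cpu.split()
    # Reject empty names and names containing whitespace.
    if len(tokens) != 1 or tokens[0] != cpu:
        raise ValueError("workstation name must be a single non-empty token")
    return ["resetFTA", cpu]


def run_reset(cpu: str) -> int:
    """Run resetFTA and return its exit code; the text does not document the codes."""
    completed = subprocess.run(build_reset_command(cpu), check=False)
    return completed.returncode


# Example call (FTA1 is a hypothetical workstation name):
#   run_reset("FTA1")
```

Passing the arguments as a list rather than a shell string avoids shell interpretation of the workstation name.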

This command is not available in the Dynamic Workload Console.

For more information, see the section about the resetFTA command in IBM Workload Scheduler: User's Guide and Reference.