IBM Support

FTA linking and initialization issues

Troubleshooting


Problem

A Tivoli Workload Scheduler (TWS) Fault Tolerant Agent (FTA) will not link to the master or domain manager.

Symptom

Initialization is not completing for the FTA.

Cause

  • TWS processes not properly stopping
  • Network issues
  • File system filling up (causing corruption in some TWS files)

Resolving The Problem

CAUTION: These steps to reinitialize the FTA can cause data loss. This procedure assumes that a quick resolution is needed and that the FTA has not been initialized since the last plan extend (Final or JnextPlan).

NOTE: An alternate procedure is documented in the TWS Troubleshooting guide. It sends a modified, updated Sinfonia/Symphony file from the master to the FTA during the middle of a production day. Because it does not resend the entire Symphony file from the beginning of the production day, you do not have to "clean up" (cancel or confirm as successful) the jobs and schedules that have already run on the FTA up to the point of the linking failure. This is especially helpful if the linking failure occurs after many jobs and schedules have already run and you do not want them to run again. See "Initialization problems" for this procedure.

To reinitialize the FTA (especially if not concerned with loss of job/schedule information from the FTA or if no jobs/schedules have run yet in the current production day on the FTA), complete the following steps:

1. Stop all TWS processes on the FTA; run the following as <twsuser>:

conman "unlink @;noask"
conman "stop;wait"
conman "shut;wait"

shutdown.cmd (Windows)

If the conman commands do not work, use:

<twshome>\unsupported\listproc and <twshome>\unsupported\killproc (Windows)

ps -ef | grep <twsuser>, then kill -9 <pid> for each remaining TWS process (Unix)
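
The stop sequence in step 1 can be wrapped in a small script. The following is a minimal sketch for Unix, not an IBM-supplied tool: the names run, stop_tws, and DRY_RUN are hypothetical, and only the three conman commands come from the step above. With DRY_RUN left at its default of 1, the script only prints what it would run.

```shell
#!/bin/sh
# Sketch only: run as <twsuser> on the FTA.
# run, stop_tws, and DRY_RUN are illustrative names, not part of TWS.

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Issue the three conman commands from step 1, in order.
# Stop at the first failure so the exit status shows that conman did not work.
stop_tws() {
    run conman "unlink @;noask" || return 1
    run conman "stop;wait" || return 1
    run conman "shut;wait" || return 1
}

stop_tws
```

Set DRY_RUN=0 to execute the commands for real. If stop_tws returns a nonzero status, fall back to the manual process cleanup described above.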

2. Move or remove from <twshome>:

a. Symphony - depending on the timing, this could cause jobs to run a second time
b. Sinfonia
c. Jobtable
d. *.msg
e. pobox/*.msg
f. Master/Domain Manager pobox/FTANAME.msg

NOTE: Step 2 has a high probability of losing job completion information. Use it with caution during the middle of the production day. If you must perform it during the day, also complete the following:

g. From the master or domain manager that the FTA links to directly:

cp Symphony Sinfonia

This will prevent the FTA from rerunning work it has already completed.
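
Items a through e of step 2 can be moved aside rather than deleted, so that the files remain available for later inspection. The following is a minimal sketch for Unix that assumes the TWS processes on the FTA are already stopped. The function name move_fta_files and the backup directory layout are hypothetical. Item f (the FTANAME.msg file in the master or domain manager pobox) still has to be handled on that system.

```shell
#!/bin/sh
# Sketch only: move the step 2 files (items a-e) out of <twshome>.
# move_fta_files and the backup layout are illustrative, not part of TWS.

# usage: move_fta_files <twshome> <backup_dir>
move_fta_files() {
    home=$1
    backup=$2
    mkdir -p "$backup/pobox" || return 1

    # a-c: Symphony, Sinfonia, and Jobtable
    for f in Symphony Sinfonia Jobtable; do
        if [ -f "$home/$f" ]; then
            mv "$home/$f" "$backup/" || return 1
        fi
    done

    # d: message files directly under <twshome>
    for f in "$home"/*.msg; do
        if [ -f "$f" ]; then
            mv "$f" "$backup/" || return 1
        fi
    done

    # e: message files under <twshome>/pobox
    for f in "$home"/pobox/*.msg; do
        if [ -f "$f" ]; then
            mv "$f" "$backup/pobox/" || return 1
        fi
    done
}
```

Call it with the TWS home directory and an empty backup directory of your choice. Files that do not exist are skipped, and nothing outside the listed names is touched.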

3. Issue <twshome>/StartUp

4. From the Master or Domain Manager issue:

conman "link FTANAME"

Instead of going directly from step 1 to step 2, you can work through the following passes, checking the results after each one, to minimize information loss:

Perform steps 1, 3, and 4 - check the results

Perform steps 1, 2 (a, b, and c), 3, and 4 - check the results
Note: This is less aggressive if plan extend has just been run.

or

Perform steps 1, 2 (d, e, and f), 3, and 4 - check the results
Note: This is less aggressive if .msg files are 48K (Unix) or 1K (Windows). The order in which the message files are removed may be switched depending on the situation. Step 2 f may be left out and used in another loop if desired.
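
Before choosing which pass to run, it can help to see how large the message files currently are. The following is a minimal sketch for Unix. The helper name list_msg_sizes is hypothetical. It only prints each .msg file's size in bytes followed by its path, and leaves the judgment about which files are safe to remove to the operator.

```shell
#!/bin/sh
# Sketch only: list_msg_sizes is an illustrative helper, not part of TWS.

# usage: list_msg_sizes <twshome>
list_msg_sizes() {
    home=$1
    for f in "$home"/*.msg "$home"/pobox/*.msg; do
        if [ -f "$f" ]; then
            # Read from stdin so wc prints only the byte count;
            # tr removes the padding that some wc implementations add.
            size=$(wc -c < "$f" | tr -d ' ')
            echo "$size $f"
        fi
    done
}
```

Run it as list_msg_sizes followed by the TWS home directory, and compare the reported sizes against the sizes given in the note above.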

Also see Technote 1193396 in the "Related information" section below for help with troubleshooting your network. Network problems are a very common cause of linking failures between the TWS master and an FTA.

Related Information

Product: IBM Workload Scheduler
Versions: 8.5, 8.5.1, 8.6, 9.1, 9.2, 9.3
Platform: Platform Independent

Product Synonym

Maestro;TWS;IWS;TWA

Document Information

Modified date:
17 June 2018

UID

swg21296908