Fault-tolerant agents not linking to master domain manager
A fault-tolerant agent does not link to its master domain manager, and none of the other link problem scenarios documented here applies.
Cause and solution:
The cause of this problem might not be easy to discover, but it almost certainly involves a mismatch between the levels of the various files used on the fault-tolerant agent.
To resolve the problem, if all other attempts have failed, perform the following cleanup procedure. Note, however, that this procedure loses data (unless the fault-tolerant agent is failing to link after a fresh installation), so it should not be undertaken lightly.
- Using conman "unlink @;noask" or the Dynamic Workload Console, unlink the agent from the master domain manager
- Stop IBM Workload Scheduler, in particular netman, as follows:
- conman "stop;wait"
- conman "shut;wait"
- On Windows™ only: shutdown
- Stop the SSM agent, as follows:
- On Windows, stop the Windows service: IBM Workload Scheduler SSM Agent (for <TWS_user>).
- On UNIX™, run stopmon.
Note: If the conman commands do not work, enter the following command:
- UNIX
- ps -ef | grep <TWS_user> & kill -9
- Windows
- <TWA_home>\TWS\unsupported\listproc & killproc
- Risk of data loss: Removing the following files can cause significant loss of data. Further, if jobs have run on the fault-tolerant agent for the current plan, the fault-tolerant agent will rerun those jobs unless you intervene. Remove or rename the following files:
<TWS_home>\TWS\*.msg \Symphony \Sinfonia \Jobtable \pobox\*.msg
Note: See Corrupt Symphony file recovery for additional options.
- Start netman with StartUp run as the <TWS_user>
- Issue a "link" command from the master domain manager to the fault-tolerant agent
- Issue a conman start command on the fault-tolerant agent.
The IBM® technical note describing this procedure also contains some advice about starting with a lossless version of this procedure (by omitting step 3, the step that removes or renames files) and then repeating the procedure in increasingly aggressive ways, with the intention of minimizing data loss. See http://www.ibm.com/support/docview.wss?uid=swg21296908
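
Because the procedure allows the files to be renamed rather than removed, one cautious way to carry out that step on UNIX is to move them into a backup directory so they can be restored if needed. The following is a minimal sketch, not part of the IBM procedure: the function name backup_tws_files is hypothetical, and both arguments (the directory that holds Symphony, Sinfonia, Jobtable, the *.msg files, and the pobox subdirectory, and the backup location) are assumptions that you must adapt to your installation. Run it only after the scheduler processes have been stopped as described above.

```shell
#!/bin/sh
# Hypothetical helper (not an IBM-supplied tool): moves the files named in
# the cleanup step into a backup directory instead of deleting them.
#   $1 = directory containing Symphony, Sinfonia, Jobtable, *.msg and pobox/
#        (assumption: adjust to your installation)
#   $2 = backup directory to create (assumption: any writable location)
backup_tws_files() {
    tws_dir=$1
    backup_dir=$2

    # Create the backup directory, including a pobox subdirectory.
    mkdir -p "$backup_dir/pobox" || return 1

    # Top-level message files plus the plan-related files.
    for f in "$tws_dir"/*.msg "$tws_dir"/Symphony "$tws_dir"/Sinfonia "$tws_dir"/Jobtable; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        if [ -e "$f" ]; then
            mv "$f" "$backup_dir/" || return 1
        fi
    done

    # Message files in the pobox subdirectory.
    for f in "$tws_dir"/pobox/*.msg; do
        if [ -e "$f" ]; then
            mv "$f" "$backup_dir/pobox/" || return 1
        fi
    done

    return 0
}
```

For example, backup_tws_files /path/to/TWS /path/to/backup moves the listed files and leaves everything else in the directory untouched. Moving the files still has the consequences described in the data loss warning; the backup only makes it possible to inspect or restore them later.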