Fault-tolerant agents not linking to master domain manager

A fault-tolerant agent does not link to its master domain manager, and none of the other link problem scenarios documented here apply.

Cause and solution:

The cause of this problem might not be easy to discover, but it almost certainly involves a mismatch between the levels of the various files used on the fault-tolerant agent.

To resolve the problem, if all other attempts have failed, perform the following cleanup procedure. Note that this procedure loses data (unless the fault-tolerant agent is failing to link after a fresh installation), so do not undertake it lightly.

Perform the following steps:
  1. Using conman "unlink @;noask" or the Dynamic Workload Console, unlink the agent from the master domain manager.
  2. Stop IBM Workload Scheduler, in particular netman, as follows:
    1. conman "stop;wait"
    2. conman "shut;wait"
    3. On Windows™ only: shutdown
    4. Stop the SSM agent, as follows:
      • On Windows, stop the Windows service: IBM Workload Scheduler SSM Agent (for <TWS_user>).
      • On UNIX™, run stopmon.
    Note: If the conman commands do not work, enter the following commands to list the processes still owned by the <TWS_user> and kill each one by its process ID:
    UNIX
    ps -ef | grep <TWS_user>
    kill -9 <PID>
    Windows
    <TWA_home>\TWS\unsupported\listproc & killproc
  3. Start netman with StartUp run as the <TWS_user>.
  4. Issue a "link" command from the master domain manager to the fault-tolerant agent.
  5. Issue a conman start command on the fault-tolerant agent.
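
The agent-side part of this procedure (steps 2 and 3 on UNIX) can be sketched as a small shell script. This is only an illustration, not part of the product: TWS_USER and DRY_RUN are hypothetical variables introduced for the example, "twsuser" is a placeholder name, and conman, stopmon, and StartUp are assumed to be on the PATH of the user running the script. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Agent-side sketch for UNIX. TWS_USER and DRY_RUN are hypothetical variables
# introduced for this example; conman, stopmon, and StartUp are assumed to be
# on the PATH of the user running the script.

TWS_USER="${TWS_USER:-twsuser}"   # placeholder, replace with your <TWS_user>
DRY_RUN="${DRY_RUN:-1}"           # 1 = print the commands only, 0 = run them

# Print a command, and run it only when DRY_RUN is 0.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 2: stop the scheduler processes, netman, and the SSM agent.
stop_agent() {
    run conman "stop;wait"
    run conman "shut;wait"
    run stopmon
}

# Fallback from the note: show the PID and command name of anything still
# owned by the TWS user, so that each one can be ended with kill -9 <PID>.
list_leftovers() {
    ps -u "$TWS_USER" -o pid= -o comm=
}

# Step 3: start netman again (run this as the TWS user).
start_netman() {
    run StartUp
}
```

With DRY_RUN left at 1, calling stop_agent and then start_netman prints the command sequence without touching the agent, which makes it possible to review the sequence first. Steps 1, 4, and 5 stay manual in this sketch: the unlink and link commands are issued as described in the steps above, and conman start is run on the agent only after the link command has been issued.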

The IBM® technical note describing this procedure also advises starting with a lossless version of the procedure (by omitting step 3) and then repeating it in increasingly aggressive ways, with the intention of minimizing data loss. See http://www.ibm.com/support/docview.wss?uid=swg21296908