Recovering the control node after cluster manager stops

If a CLSTRMGR_KILL test runs on the control node and stops it, no automatic action is taken to recover from the failure, so you must reboot the control node manually. After the node reboots, testing continues.

To monitor testing after the Cluster Test Tool starts again, review the output in the /var/hacmp/log/cl_testtool.log file. To determine whether a test procedure has completed, run the tail -f command on the /var/hacmp/log/cl_testtool.log file.
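
As a minimal sketch of the monitoring step, the following shell fragment wraps the log path named above in a small helper. The LOG variable and the recent_log function are illustrative names, not part of the Cluster Test Tool.

```shell
# Log path taken from the text above; LOG is an illustrative variable
# that can be overridden when experimenting on another file.
LOG="${LOG:-/var/hacmp/log/cl_testtool.log}"

# recent_log (hypothetical helper): print the last N lines of the log
# (default 20) without following it.
recent_log() {
    tail -n "${1:-20}" "$LOG"
}

# To follow the log interactively while tests run, use:
#   tail -f "$LOG"
```

Calling recent_log 50 after the node reboots prints the last 50 lines, which may be enough to see where testing resumed before switching to tail -f.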

You can avoid manually rebooting the control node during testing by:

  • Editing the /etc/cluster/hacmp.term file to change the default action after an abnormal exit.

    The clexit.rc script checks for the presence of this file and, if the file is executable, the script calls it instead of halting the system automatically.

  • Configuring the node to auto-Initial Program Load (IPL) before running the Cluster Test Tool.
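
The first option can be sketched as a small script. This is an illustration only, not the shipped contents of /etc/cluster/hacmp.term: the LOG_FILE and DRY_RUN variables and the handle_abnormal_exit function are hypothetical names, and the use of shutdown -Fr to restart the node is an assumption about what a site might choose as its action. The sketch defaults to a dry run so that it can be tried safely.

```shell
#!/bin/sh
# Hypothetical sketch of a custom /etc/cluster/hacmp.term script.
# LOG_FILE and DRY_RUN are illustrative names, not product settings.
LOG_FILE="${LOG_FILE:-/tmp/hacmp_term_example.log}"
DRY_RUN="${DRY_RUN:-1}"

handle_abnormal_exit() {
    # Record that the script was invoked.
    echo "$(date): hacmp.term invoked after abnormal exit" >> "$LOG_FILE"
    if [ "$DRY_RUN" = "1" ]; then
        # Safe default: report the action instead of performing it.
        echo "dry run: would reboot now"
    else
        # Assumed action: fast shutdown followed by a restart.
        shutdown -Fr
    fi
}

handle_abnormal_exit
```

Because clexit.rc calls the file only if it is executable, a script like this would also need execute permission (for example, chmod +x /etc/cluster/hacmp.term) before it takes effect.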