Cluster does not return to a stable state

The Cluster Test Tool stops running tests after a timeout if the cluster does not return to a stable state either while a test is running or as a result of a test being processed.

The timeout is based on ongoing cluster activity and on the cluster-wide event-duration time until warning values. Before the Cluster Test Tool stops, it displays an error on the screen and logs the error to the Cluster Test Tool log file.

After the cluster returns to a stable state, cluster components such as resource groups, networks, and nodes might not be in the state that the tests in the list expect. If the tool cannot run a test because of the state of the cluster, the tool generates an error and then continues to process tests.

If the cluster state does not let you continue a test, you can:

  1. Reboot cluster nodes and restart the Cluster Manager.
  2. Inspect the Cluster Test Tool log file and the hacmp.out file to get more information about what may have happened when the test stopped.
  3. Review the settings of the following cluster timers, and make sure that they are appropriate for your cluster:
    • Time until warning
    • Stabilization interval
    • Monitor interval
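
As an aid to step 2, the following is a minimal sketch of a script that pulls the most recent suspicious lines out of the two log files. It is not part of the Cluster Test Tool. The log paths (/var/hacmp/log/cl_testtool.log and /var/hacmp/log/hacmp.out) are assumed defaults that may differ on your installation, and the keyword list and the names find_suspect_lines and scan_log are hypothetical, chosen for illustration only.

```python
from pathlib import Path

# Assumed default log locations; adjust them to match your installation.
DEFAULT_LOGS = ("/var/hacmp/log/cl_testtool.log", "/var/hacmp/log/hacmp.out")

# Hypothetical keywords; the wording in real log entries may differ.
KEYWORDS = ("error", "fail", "timeout", "timed out")


def find_suspect_lines(lines, keywords=KEYWORDS, last_n=20):
    """Return up to last_n (line_number, text) pairs that contain a keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.rstrip("\n")))
    # Keep only the most recent matches, which are closest to the failure.
    return hits[-last_n:] if last_n > 0 else []


def scan_log(path, **kwargs):
    """Scan one log file; return an empty list if the file does not exist."""
    log = Path(path)
    if not log.is_file():
        return []
    with log.open(errors="replace") as handle:
        return find_suspect_lines(handle, **kwargs)


if __name__ == "__main__":
    for path in DEFAULT_LOGS:
        print(f"== {path}")
        for number, line in scan_log(path):
            print(f"{number}: {line}")
```

Run the script on the node where the test stopped. The matching is a case-insensitive substring search, so treat the output as a starting point for reading the logs, not as a diagnosis.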

For information about timers in the Cluster Test Tool, and about how application monitor timers can affect whether the tool times out, see Working with timer settings.