Application monitoring

You can monitor a set of applications that you define through the SMIT interface.

You can configure multiple application monitors and associate them with one or more application controllers. By supporting multiple monitors per application, PowerHA® SystemMirror® can support more complex configurations. For example, you can configure one monitor for each instance of an Oracle Parallel Server in use. Alternatively, you can configure a custom monitor to check the health of the database, and a process termination monitor to detect termination of the database process immediately.

You assign each monitor a unique name in SMIT.

You can configure either a process monitor or a custom monitor. For example, you can supply a customized script to PowerHA SystemMirror that sends a request to a database to check that it is functioning. A nonzero exit code from the script indicates a failure of the monitored application, and PowerHA SystemMirror responds by trying to recover the resource group that contains the application.
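As a rough illustration of the custom monitor described above, the following sketch wraps a health probe in a shell function and maps its result to a zero or nonzero status. The probe path /usr/local/bin/db_probe and the names PROBE_CMD and check_app are hypothetical placeholders and are not part of PowerHA SystemMirror; substitute whatever request is appropriate for your application.

```shell
#!/bin/sh
# Sketch of a custom application monitor script (not an official template).
# PROBE_CMD is a hypothetical placeholder: replace it with a real command
# that sends a request to the application and exits 0 only when it is healthy.
PROBE_CMD="${PROBE_CMD:-/usr/local/bin/db_probe}"

# Return 0 when the probe succeeds, 1 when it fails or cannot be run.
check_app() {
    if $PROBE_CMD >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

# Record the result. A real monitor script would finish with: exit "$rc"
rc=0
check_app || rc=1
```

Keeping the probe in a single function makes it easy to test the script by hand before you register it as a monitor, for example by setting PROBE_CMD to a command with a known result.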

For each configured monitor, when a problem is detected, PowerHA SystemMirror attempts to restart the application, retrying up to a specified restart count. You select one of the following responses for PowerHA SystemMirror to take when an application cannot be restarted within the restart count:

  • The fallover option causes the resource group containing the application to fall over to the node with the next-highest priority according to the resource policy.
  • The notify option causes PowerHA SystemMirror to generate a server_down event, to inform the cluster of the failure.

You can customize the restart process through the Notify Method, Cleanup Method, and Restart Method for the application monitor.

Note: If the System Resource Controller (SRC) is configured to restart the application, this can interfere with actions taken by application monitoring. Disable the SRC restart for the application (application start and stop scripts should not use the SRC unless the application is not restartable). For a custom monitor, the script is responsible for operating correctly; application monitoring acts on the value that the script returns.

If a monitored application is under the control of the System Resource Controller, verify that the action and multi values are -O and -Q. The -O flag specifies that the subsystem is not restarted if it stops abnormally. The -Q flag specifies that multiple instances of the subsystem are not allowed to run at the same time. You can check these values with the following command:

lssrc -Ss <Subsystem> | cut -d : -f 10,11

If the values are not -O and -Q, they must be changed using the chssys command.
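The check and the change can be combined in a small helper. The following is a minimal sketch, assuming that lssrc -Ss prints a header line that begins with # followed by one colon-separated data line; the function names src_flags_ok and fix_src_flags and the subsystem name myapp are hypothetical.

```shell
#!/bin/sh
# Read `lssrc -Ss <Subsystem>` output on stdin and succeed only when
# fields 10 and 11 (action and multi) of the data line are -O and -Q.
src_flags_ok() {
    # Drop the header line, which starts with '#', then keep fields 10 and 11.
    flags=$(grep -v '^#' | cut -d : -f 10,11)
    [ "$flags" = "-O:-Q" ]
}

# Check a subsystem and change it with chssys only when needed.
fix_src_flags() {
    subsys=$1
    if ! lssrc -Ss "$subsys" | src_flags_ok; then
        chssys -s "$subsys" -O -Q
    fi
}

# Example call on an AIX node ("myapp" is a hypothetical subsystem name):
# fix_src_flags myapp
```

Running the check first avoids changing a subsystem definition that is already correct.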