Steps for configuring a custom application monitor

This topic explains the steps for configuring a custom application monitor.

About this task

To set up a custom application monitoring method, complete the following steps:

Procedure

  1. From the command line, enter smit sysmirror.
  2. In SMIT, select Cluster Applications and Resources > Resources > Configure User Applications (Scripts and Monitors) > Application Monitors > Configure Custom Application Monitors > Add a Custom Application Monitor, and press Enter.

    A list of defined application controllers is displayed.

  3. Select the application controller for which you want to add a monitoring method.
  4. In the Add Custom Application Monitor panel, fill in field values as follows. The Monitor Method and Monitor Interval fields require you to supply your own scripts and specify your own preference for the polling interval:
    Table 1. Add Custom Application Monitor fields
    Field Value
    Application Controller Name Select the application controller from the picklist.
    Monitor Mode Select the mode in which the application monitor monitors the application:
    • Startup monitoring. In this mode, the application monitor checks that the application controller has successfully started within the specified stabilization interval. If you are configuring a monitor for an application that is included in a parent resource group, select this mode (in addition to other monitors that you might need for dependent resource groups).
    • Long-running monitoring. In this mode, the application monitor periodically checks that the application controller is running. The checking starts after the specified stabilization interval has passed. This is the default.
    • Both. In this mode, the application monitor checks that the application controller has started successfully within the stabilization interval, and periodically checks that the application controller is running after the stabilization interval has passed.
    Note: If the Monitor Mode field is set to the Long-running monitoring mode, the application monitor does not run when the cluster services are started. The application controller runs the start script each time to start your application irrespective of whether the application is already running. When the resource group that contains the application is online, the application monitor is used to monitor the application.

    You can also specify the startup application monitor to run when you start the cluster services. When the startup application monitor runs, it determines whether the application is already active. The application controller does not run the start script when the application monitor indicates that the application is up and running.

    You can use the same application monitor for both the startup monitoring mode and long-running monitoring mode, by selecting Both as the value for the Monitor Mode field.

    The mode that you select for your application monitor determines whether the application monitor is run when the cluster services are started or when the resource group is online. You must carefully select these modes when you write your monitoring scripts so that proper checks are performed and actions are taken when the application monitor runs. For example, when the cluster services are started for the first time, the other resources on which the application relies might not be fully active. In this case, you might consider specifying the startup application monitor to run when you start the cluster services.

    Monitor Method Enter a script or executable for custom monitoring of the health of the specified application. Do not leave this field blank.

    Note that the method must return a zero value if the application is healthy and a non-zero value if a problem is detected.

    The method can log messages by printing them to standard output (stdout). For long-running monitors, the output is stored in the /var/hacmp/log/clappmond.application monitor name.resource group name.monitor.log file. For startup monitors, this output is stored in the /var/hacmp/log/clappmond.application controller name.resource group name.monitor.log file. In PowerHA® SystemMirror® Version 7.1.1, or earlier, there is a single log file that is overwritten each time the application monitor is restarted. In PowerHA SystemMirror Version 7.1.2, or later, a new log file is created each time the application monitor is restarted.

    Monitor Interval Enter the polling interval (in seconds) for checking the health of the application. If the monitor does not respond within this interval, it is considered hung.
    Monitor Retry Count Specifies the number of times PowerHA SystemMirror tries to restart the custom application monitor before performing any other actions. The default value is 0. This field is related to the Restart Count.
    Hung Monitor Signal Specify the signal that the system sends to stop the Monitor Method script if it does not return within the time specified for the Monitor Interval. The default is SIGKILL (9).
    Stabilization Interval Specify the time (in seconds). PowerHA SystemMirror uses the stabilization period for the monitor in different ways, depending on which monitor mode is selected in this SMIT panel:
    • If you select the startup monitoring mode, the stabilization interval is the period within which PowerHA SystemMirror monitors that the application has successfully started. When the specified time expires, PowerHA SystemMirror terminates the monitoring of the application startup, and continues event processing. If the application fails to start within the stabilization interval, the resource group's acquisition fails on the node, and PowerHA SystemMirror launches resource group recovery actions to acquire the resource group on another node. The number of seconds you specify should be approximately equal to the time it takes for the application to start. This depends on the application you are using.
    • If you select the long-running mode for the monitor, the stabilization interval is the period during which PowerHA SystemMirror waits for the application to stabilize, before beginning to monitor that the application is running successfully. For instance, with a database application, you may wish to delay monitoring until after the start script and initial database search have been completed. You may need to experiment with this value to balance performance with reliability.
    • If you select Both as the monitor mode, the application monitor uses the stabilization interval to wait for the application to start successfully. It uses the same interval to wait before it starts checking periodically that the application is running successfully on the node.
    Note: In most circumstances, this value should not be zero.
    Restart Count Specify the number of times to try restarting the application before taking any other actions. The default is 3.
    Restart Interval Specify the interval (in seconds) that the application must remain stable before resetting the restart count. Do not set this to be shorter than (Restart Count) x (Stabilization Interval + Monitor Interval). The default is 10% longer than that value. If the restart interval is too short, the restart count will be reset too soon and the desired failure response action may not occur when it should.
    Action on Application Failure Specify the action to be taken if the application cannot be restarted within the restart count. You can keep the default choice notify, which runs an event to inform the cluster of the failure, or select fallover, in which case the resource group containing the failed application moves over to the cluster node with the next highest priority for that resource group.
    Notify Method (Optional) Specify the full pathname of a user-defined method that performs notification when a monitored application fails. This method runs each time an application is restarted, fails completely, or falls over to the next node in the cluster.

    Configuring this method is strongly recommended.

    Cleanup Method (Optional) Specify an application cleanup script to be invoked when a failed application is detected, before invoking the restart method. The default is the application controller stop script defined when the application controller was set up.

    With application monitoring, the application might already be stopped when this script is called, so the application controller stop script might fail.

    Restart Method (Required if Restart Count is not zero.) The default restart method is the application controller start script defined previously, when the application controller was set up. You can specify a different method here if desired.
  5. Press Enter.
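
The Restart Interval guidance in Table 1 is easier to check with concrete numbers. The following is a minimal sketch of that arithmetic, not part of the product. The function names are hypothetical, the 60-second stabilization interval and 30-second monitor interval are illustrative values only, and rounding the default down to a whole second is an assumption.

```python
def minimum_restart_interval(restart_count, stabilization_interval, monitor_interval):
    """Lower bound from Table 1:
    (Restart Count) x (Stabilization Interval + Monitor Interval), in seconds."""
    return restart_count * (stabilization_interval + monitor_interval)


def default_restart_interval(restart_count, stabilization_interval, monitor_interval):
    """Table 1 describes the default as 10% longer than the lower bound.
    Rounding down to a whole second is an assumption of this sketch."""
    minimum = minimum_restart_interval(
        restart_count, stabilization_interval, monitor_interval
    )
    return minimum * 110 // 100


# Restart Count 3 is the documented default; 60 and 30 are illustrative values.
print(minimum_restart_interval(3, 60, 30))  # 270
print(default_restart_interval(3, 60, 30))  # 297
```

With these values, a Restart Interval shorter than 270 seconds would reset the restart count too soon.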

Results

SMIT checks the values for consistency and enters them into the PowerHA SystemMirror Configuration Database. When the resource group comes online, the application monitor in the long-running mode starts. The application startup monitor starts before the resource group is brought online.
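
Each time the monitor runs, it calls the script or executable named in the Monitor Method field. The following is a minimal sketch of the logic such a method might contain, assuming that Python is available on the cluster nodes and that the application start script records its process ID in a file. The path /var/run/myapp.pid and all function names are hypothetical. A real check would usually test the application itself, not only its process.

```python
import os

# Hypothetical PID file written by the application's start script.
PID_FILE = "/var/run/myapp.pid"


def process_is_running(pid_file):
    """Return True if the PID recorded in pid_file belongs to a live process."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError) as error:
        # Messages printed to stdout end up in the monitor log file.
        print(f"cannot read a PID from {pid_file}: {error}")
        return False
    if pid <= 0:
        print(f"invalid PID {pid} in {pid_file}")
        return False
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence; nothing is delivered
    except ProcessLookupError:
        print(f"process {pid} is not running")
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        pass
    print(f"process {pid} is running")
    return True


def monitor_exit_status(pid_file=PID_FILE):
    """Map the health check to an exit status: 0 if healthy, 1 if a problem is detected."""
    return 0 if process_is_running(pid_file) else 1
```

To use this as a monitor method, the script would import sys and end with sys.exit(monitor_exit_status()), so that the exit status reaches PowerHA SystemMirror. The check must finish well within the Monitor Interval, otherwise the monitor is considered hung and is sent the Hung Monitor Signal.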

When you synchronize the cluster, verification ensures that all methods you have specified exist and are executable on all nodes.
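
Before synchronizing, it can save a verification cycle to confirm on each node that the method files are in place. The following is a minimal local sketch, not the verification that PowerHA SystemMirror performs. The function name and the example paths are hypothetical, and the sketch handles plain file paths only, without arguments.

```python
import os


def missing_or_not_executable(paths):
    """Return the paths that are not regular files or that the current user cannot execute."""
    return [
        path
        for path in paths
        if not (os.path.isfile(path) and os.access(path, os.X_OK))
    ]


# Hypothetical method paths; replace them with the values entered in SMIT.
methods = [
    "/usr/local/cluster/myapp_monitor",
    "/usr/local/cluster/myapp_notify",
    "/usr/local/cluster/myapp_cleanup",
]
for path in missing_or_not_executable(methods):
    print(f"missing or not executable on this node: {path}")
```

The check covers only the node on which it runs, so it would have to be repeated on every node in the cluster.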