How can I generate an alert when a managed server stops and handle the alert automatically when the server restarts?
You can use a linked rule to generate an alert when a managed server stops and automatically handle the alert when the server restarts.
When IBM® Sterling Control Center Monitor can no longer communicate with a monitored server, it assumes that the server stopped and generates an event of type Server Status with a message ID of CCTR034E. Likewise, when IBM Sterling Control Center Monitor establishes communication with a monitored server, it assumes that the server started and generates an event of type Server Status with a message ID of CCTR033E. You can use these two events with a linked rule to generate an alert when a server goes down and to automatically clear that alert if and when the server restarts.
A linked rule is nothing more than a normal rule with additional attributes. To construct a rule to alert you when a server is down:
- Build a rule with a parameter that has a Key of Message ID, an Operator of Matches, and a Value of CCTR034E. CCTR034E is the event generated by the IBM Sterling Control Center Monitor engine when it determines that a server stopped. You can specify additional parameters to limit when this rule is triggered.
- Select alert1 as the Action to generate an alert of severity 1 when the server stops.
- Select Enabled on the Linked Rules wizard that appears next.
- For the linked rule parameters, specify a parameter with a Key of Message ID, an Operator of Matches, and a Value of CCTR033E. CCTR033E is the event generated by the IBM Sterling Control Center Monitor engine when it finds that a managed server started. You can specify other parameters as necessary to ensure that only the appropriate CCTR033E events are matched.
- To clear the server-down alert when the system sees that the server restarted, choose alert0 for the Resolution action and No Operation for the Non-Resolution action.
- Alternatively, select alert0 for the Non-Resolution action as well if you always want the alert cleared. The Non-Resolution action does not occur until the number of minutes specified for the timeout elapses.
- Set the timeout value to the time the IBM Sterling Control Center Monitor engine waits before taking either the Resolution or the Non-Resolution action. That is, set the timeout value to the maximum time you feel it takes for a server outage to be handled.
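The behavior that these steps configure can be pictured with a small Python model. This is only an illustrative sketch, not IBM Sterling Control Center code or an API of the product: the class `LinkedRule`, its methods, and the minute-based clock are hypothetical names introduced here. It also assumes that alert0 clears the open alert, that No Operation leaves the alert open, and that a restart event arriving exactly at the timeout counts as too late.

```python
from dataclasses import dataclass, field

SERVER_DOWN = "CCTR034E"  # Server Status event: server assumed stopped
SERVER_UP = "CCTR033E"    # Server Status event: server assumed started


@dataclass
class LinkedRule:
    """Hypothetical model of a rule with linked-rule attributes enabled."""
    timeout_minutes: int
    resolution_action: str = "alert0"
    non_resolution_action: str = "No Operation"
    # server name -> minute at which the server-down event was seen
    pending: dict = field(default_factory=dict)
    # server name -> severity of the open alert
    open_alerts: dict = field(default_factory=dict)

    def _apply(self, action, server):
        # Assumption: alert0 clears the alert; No Operation leaves it open.
        if action == "alert0":
            self.open_alerts.pop(server, None)

    def expire(self, minute):
        """Take the Non-Resolution action for every wait that timed out."""
        for server, started in list(self.pending.items()):
            if minute - started >= self.timeout_minutes:
                del self.pending[server]
                self._apply(self.non_resolution_action, server)

    def on_event(self, server, message_id, minute):
        """Process one Server Status event observed at the given minute."""
        self.expire(minute)
        if message_id == SERVER_DOWN:
            # Primary rule matched: alert1 raises a severity 1 alert.
            self.open_alerts[server] = 1
            self.pending[server] = minute
        elif message_id == SERVER_UP and server in self.pending:
            # Linked rule matched before the timeout: Resolution action.
            del self.pending[server]
            self._apply(self.resolution_action, server)


rule = LinkedRule(timeout_minutes=30)
rule.on_event("serverA", SERVER_DOWN, minute=0)   # alert raised
rule.on_event("serverA", SERVER_UP, minute=10)    # alert cleared by alert0
print(rule.open_alerts)
```

In this model, a server that restarts within the timeout has its alert cleared by the Resolution action. A server that stays down past the timeout keeps its alert when the Non-Resolution action is No Operation, and loses it when the Non-Resolution action is also alert0.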