How to monitor number of available processes with APM v8
Albook
One of the most common scenarios in availability and performance monitoring concerns running processes.
For example, suppose you must always have at least 3 instances of a process running, and you want to be informed with an alert when fewer than 3 are running.
Using the Linux OS agent, you will notice there is an attribute called "Process Count".
It seems this is what we are looking for, so the first attempt is made using this attribute.
Suppose you want to monitor a process whose command line contains the string "sleep" (used here just as a sample), with several instances started and active on the server with different parameters:
/bin/exec1 sleep 1500
You could use the Process Filter attribute to be sure you match the wanted string in the process command line (attribute Command Line).
So something like:
In this way we keep only the processes whose command line contains the string "sleep". In a situation/threshold context, the Command Line attribute is then populated with the result of the regular expression processing.
Then you add the Process Count attribute, which is calculated over the processes having the same value in the "Command Line" attribute.
The resulting threshold formula would be something like this:
I gave this threshold a try, but as soon as it started, it immediately fired and generated an alert.
Additional investigation revealed why we cannot use Process Count in this kind of scenario.
The attribute Process_Count is expected to contain a number that represents the processes having the same value of Command Line attribute.
As we know, when we create a situation/threshold using Process Filter, Command Line output should contain the result of the regexp processing.
I was expecting Process_Count to use "Command Line" attribute after Process Filter processing, but this is not the case.
Actually, the Process_Count is calculated at data collection time, outside the situation/threshold context, so it is based on the original value of "Command Line".
In our example, the original "Command Line" attribute contains:
/bin/exec1 sleep 1500
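To make the mismatch concrete, here is a minimal Python sketch, not agent code. It assumes that Process_Count behaves like a count of processes sharing the exact same full command line, as described above. Only the first command line comes from this article; the other sample command lines and the function names are hypothetical.

```python
import re
from collections import Counter

# Sample process table. Only the first entry appears in the article;
# the remaining command lines are hypothetical illustrations.
command_lines = [
    "/bin/exec1 sleep 1500",
    "/bin/exec1 sleep 1600",
    "/bin/exec1 sleep 1700",
    "/usr/sbin/sshd -D",
]

def per_command_line_counts(cmds):
    """Count processes that share the exact same full command line
    (assumed model of how Process_Count is computed at collection time)."""
    return Counter(cmds)

def count_matching(cmds, pattern):
    """Count processes whose command line matches the filter pattern."""
    return sum(1 for cmd in cmds if re.search(pattern, cmd))

exact_counts = per_command_line_counts(command_lines)
filtered_total = count_matching(command_lines, r"sleep")

# Each full command line is unique, so its per-command-line count is 1,
# even though three processes match the "sleep" filter.
print(exact_counts["/bin/exec1 sleep 1500"])
print(filtered_total)
```

Under this assumption, a condition such as "Process Count lower than 3" is true for every matching row, which would explain why the threshold fired immediately.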
There is, however, a different method to create a working threshold without using Process Count.
There is a flag called "Count" that can be used to count the instances having the same value, more or less like the "count of group members" function in ITM v6.
In our scenario we can use the Timestamp attribute to calculate how many times our process string appears in the data collection, but you can use any other numeric attribute available from the list.
To select the processes of interest, we can use the Process Filter attribute, which in my test I defined as follows:
At the end you will have a threshold formula like:
As long as I had 3 or more processes containing the "sleep" string, the threshold did not trigger; then I killed 1 or 2 of them, and I received the threshold alarm as expected.
As an additional step, I created 2 new processes containing the "sleep" string to bring the count back to 3 or more, and the alarm was closed.
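The test sequence above can be simulated with a short Python sketch. This is only an illustration of the counting logic, assuming the threshold condition is "number of filtered rows lower than 3". The function names, the sample command lines other than `/bin/exec1 sleep 1500`, and the `r"sleep"` pattern are hypothetical stand-ins for the real Process Filter definition.

```python
import re

MIN_INSTANCES = 3  # minimum number of running instances, as in the article

def count_matching(cmds, pattern=r"sleep"):
    """Count rows that survive the filter (stand-in for the Count flag)."""
    return sum(1 for cmd in cmds if re.search(pattern, cmd))

def threshold_fires(cmds, minimum=MIN_INSTANCES):
    """True when fewer than `minimum` matching processes are running."""
    return count_matching(cmds) < minimum

# Step 1: three matching processes are running (hypothetical parameters).
running = [
    "/bin/exec1 sleep 1500",
    "/bin/exec1 sleep 1600",
    "/bin/exec1 sleep 1700",
]
fires_initially = threshold_fires(running)

# Step 2: one process is killed, leaving two matches.
after_kill = running[:2]
fires_after_kill = threshold_fires(after_kill)

# Step 3: two new processes are started, giving four matches.
after_restart = after_kill + ["/bin/exec1 sleep 1800", "/bin/exec1 sleep 1900"]
fires_after_restart = threshold_fires(after_restart)

print(fires_initially, fires_after_kill, fires_after_restart)
```

Counting the filtered rows directly does not depend on the full command lines being identical, which is why this approach works where Process_Count did not.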
So, if you need to monitor the number of available processes and you need to apply some filtering to them, forget about the Process_Count attribute and just use the method described above.
Hope it helps.