Three minutes with Adaptive Polling and ITNM
kcstone
IBM Tivoli Network Manager 3.9 fixpack 3 is seeing a surge in the use of the adaptive polling or rapid polling feature.
This post covers the scenario behind the feature and some of the numbers used to configure it, and explains why you see three minutes in the documentation yet won't find a three-minute timer anywhere.
Here is a common scenario:
A device fails a chassis ping from the ITNM poller.
Operators are alerted via the event list or another mechanism to take immediate action on the downed device.
When the next polling cycle hits, the device responds. Was it a false alarm?
If adaptive polling is configured, a rapid poll begins once the ping fails. The ping fail event is still in the ObjectServer, but ITNM rapid polls the device, and if it was a false alarm the event can be cleared in a few seconds rather than minutes. Operators then know that the device failure may have been a false alarm.
The documentation link above gives three steps to set up adaptive polling. For step 2, use this technote rather than copying a dynamic view.
Here is the documentation on the specific parameters to use with adaptive polling.
Here is a quick review of some of the numbers:
polling interval - the number of seconds between each rapid poll. Once the default chassis ping fail occurs, the adaptive polling policy pings the device at this interval. The documentation uses '10' as an example, but when you log in to the ITNM GUI you will see '15'. I personally don't like an interval smaller than 10 seconds, so that TCP/IP retry is not part of the equation.
tally - how many times you want to poll the device before rapid polling stops. Tally is cumulative, so remember that it counts the initial ping fail plus the rapid polls.
Did you see the three minutes in those numbers? The documentation arrives at three minutes by combining those two numbers.
A 10-second polling interval with a Tally of 18 gives 10 × 18 = 180 seconds, or three minutes. So this rapid poll runs for three minutes.
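If you want to play with your own numbers, the arithmetic can be sketched in a few lines of Python. This is only the multiplication described above, not anything ITNM itself runs, and the function name `rapid_poll_window` is made up for this post.

```python
def rapid_poll_window(interval_seconds: int, tally_limit: int) -> int:
    """Approximate length of the rapid-poll window, in seconds.

    Hypothetical helper: it simply multiplies the polling interval
    by the Tally limit, as the documentation example does.
    """
    return interval_seconds * tally_limit


# The documentation example: 10-second interval, Tally of 18
window = rapid_poll_window(10, 18)
print(f"{window} seconds = {window / 60} minutes")  # 180 seconds = 3.0 minutes
```

Swap in the GUI value of 15 seconds with the same Tally and the window grows to 270 seconds, so the "three minutes" really is just a product of the two example values.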
If any of those 18 polls is successful, the event is cleared.
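The "first success clears it" idea can be pictured with a small Python sketch. It is a simplified model, not ITNM code: the function name `first_successful_poll` and the list of True/False ping results are assumptions made purely for illustration.

```python
from typing import Optional, Sequence


def first_successful_poll(results: Sequence[bool], tally_limit: int = 18) -> Optional[int]:
    """Return the 1-based number of the rapid poll that would clear the event.

    Hypothetical model: only the first `tally_limit` rapid polls count.
    Returns None if none of them succeeds, meaning the event stays open.
    """
    for number, succeeded in enumerate(results[:tally_limit], start=1):
        if succeeded:
            return number
    return None


# The device misses two rapid polls, then answers on the third.
print(first_successful_poll([False, False, True]))  # 3

# The device never answers during the rapid-poll window.
print(first_successful_poll([False] * 18))  # None
```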
Look at another scenario.
A user has the default chassis ping set to 5 minutes between polls.
If there is a failure, the goal is just to poll the device a couple more times quickly to verify.
polling interval 15 seconds
Tally <= 3
In this example, once a device-down event occurs, the device is polled three more times in the next thirty seconds before rapid polling stops. Two rapid polls push the Tally to 3. The fourth poll overall (the initial ping fail counts as the first) makes the Tally 4, which removes the device from rapid polling.
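To see how the Tally filter plays out, here is a small Python sketch of the counting. It assumes the behaviour described above: the initial ping fail sets Tally to 1, every failed rapid poll adds 1, and the device stays in rapid polling while Tally <= the limit. The function name `rapid_polls_before_stop` is hypothetical and only models the counting, not the poller.

```python
def rapid_polls_before_stop(tally_limit: int) -> int:
    """Count the rapid polls for a device that never answers.

    Assumptions (from the scenario in this post, not from ITNM source):
    the initial chassis ping fail is Tally 1, each failed rapid poll
    adds 1, and rapid polling continues while Tally <= tally_limit.
    """
    tally = 1  # the initial chassis ping fail
    rapid_polls = 0
    while tally <= tally_limit:
        rapid_polls += 1
        tally += 1
        print(f"rapid poll {rapid_polls}: Tally is now {tally}")
    return rapid_polls


print(rapid_polls_before_stop(3))  # three rapid polls, final Tally of 4
```

With a limit of 3, the trace shows Tally moving to 2, 3 and then 4, at which point the device no longer matches the filter.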
Thanks for your three minutes!