Troubleshooting the processing of historical poll data

You can define thresholds for the age of data in the historical poll data tables and for the size of the poll data tables. A violated threshold indicates that the polling load might be too great, or that there is an issue with the historical poll data system or with the polling database. Threshold violations cause log messages to be generated, and also generate Tivoli Netcool/OMNIbus ObjectServer alerts, which can be viewed in the Tivoli Netcool/OMNIbus Web GUI Event Viewer.

About this task

A Tivoli Netcool/OMNIbus ObjectServer alert is generated when any of the following events occurs. These alerts have an Alert Group setting of ITNM Status:
Maximum insertion rate to the raw poll data table is exceeded
The maximum insertion rate to the raw poll data table ncpolldata.pollData is set in the config database table config.tableMonitor. If this value is exceeded, then an ITNM Status alert is raised in the Tivoli Netcool/OMNIbus Web GUI Event Viewer, and a log message is written to the NCHOME/log/precision/ncp_poller.SnmpPoller.poller_name.domain log file.
If this maximum insertion rate is exceeded, then this is an indication that the polling load is too great. You might need to reduce the polling load by decreasing the number of devices that you are polling, by decreasing the number of metrics that you are collecting on these devices, or by changing your polling intervals.
Age counts are exceeded for the historical poll data tables
Maximum age settings for the historical poll data tables in the ncpolldata database are set in the config database table config.tableMonitor. If these values are exceeded then an ITNM Status alert is raised in the Tivoli Netcool/OMNIbus Web GUI Event Viewer and a log message is written to the NCHOME/log/precision/ncp_poller.SnmpPoller.poller_name.domain log file.
If any of these maximum age settings is exceeded, then this is an indication that the automatic data purging mechanism is not working. The Apache Storm system, which processes historical poll data, might not be running or might not be keeping up with polling and purging. Alternatively, there might be a database server issue on the server that hosts the ncpolldata database.
No heartbeat received from Apache Storm
The system checks a timestamp within the ncpolldata database. If this timestamp is not updating, then this indicates that no heartbeat has been received from the Apache Storm system, which processes historical poll data. In this case an ITNM Status alert is raised in the Tivoli Netcool/OMNIbus Web GUI Event Viewer.
If the heartbeat is not being received from Apache Storm, then this could mean that Apache Storm is not running or needs attention.
Batches processed by Apache Storm falling behind polling batches
The polling engine ncp_poller collects raw poll data in batches and assigns a batch ID to each batch. When processing this raw poll data into historical poll data, Apache Storm marks each batch to indicate that it has been processed. The batch ID status of the latest poll data batches is checked regularly. If the timings associated with the batch ID of the last set of raw poll data show that the data processed by Apache Storm is falling behind the latest batches written by ncp_poller, then an ITNM Status alert is raised in the Tivoli Netcool/OMNIbus Web GUI Event Viewer.
If batch processing is falling behind, then this could mean Apache Storm is not running or needs attention.
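
The four conditions described above can be summarized as simple threshold comparisons. The following Python sketch is illustrative only and is not how the product implements its checks: the Thresholds class, the check_poll_data_health function, and all parameter names are hypothetical, and the real limits are read from the config.tableMonitor table, whose columns are not shown here. It assumes that the relevant counts and timestamps have already been obtained from the ncpolldata database.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Thresholds:
    # Hypothetical limits; the real values are held in config.tableMonitor.
    max_insert_rate: float        # rows per second into the raw poll data table
    max_data_age: timedelta       # oldest row tolerated in a historical table
    max_heartbeat_gap: timedelta  # longest tolerated silence from Apache Storm
    max_batch_lag: timedelta      # longest tolerated delay behind ncp_poller


def check_poll_data_health(
    now: datetime,
    rows_inserted: int,
    window: timedelta,
    oldest_row: datetime,
    last_heartbeat: datetime,
    latest_polled_batch: datetime,
    latest_processed_batch: datetime,
    limits: Thresholds,
) -> List[str]:
    """Return one description per violated condition (empty list if healthy)."""
    problems = []

    # Insertion rate over the sampling window (window must be non-zero).
    rate = rows_inserted / window.total_seconds()
    if rate > limits.max_insert_rate:
        problems.append("insertion rate exceeded: polling load may be too great")

    # Data older than the maximum age suggests that purging is not working.
    if now - oldest_row > limits.max_data_age:
        problems.append("age count exceeded: purging may not be working")

    # A stale heartbeat timestamp suggests that Apache Storm is not running.
    if now - last_heartbeat > limits.max_heartbeat_gap:
        problems.append("no heartbeat: Apache Storm may not be running")

    # Processed batches trailing polled batches suggests that Storm is behind.
    if latest_polled_batch - latest_processed_batch > limits.max_batch_lag:
        problems.append("batch processing behind: Apache Storm may need attention")

    return problems
```

Each returned string corresponds to one of the alert conditions in the list above, so an empty result means that none of the sketched thresholds was violated for the values supplied.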