ITM Agent Insights: Monitoring logs with ITM: Monitoring circular logs with LO agent
DougDoering 120000E6GG Visits (5071)
Can the LO agent be used to monitor a "circular" or "rolling" log file?
But there are limitations.
Is it recommended?
It is up to the user to determine if the need to montor a circular log can be accomplished with the LO agent given its design limitations.
The LO agent is primarily intended to read data that is appended to a monitored log, and works by periodically polling the identified file and making checks to determine if there is "new" data to process, and then processes those "new" entries.
The LO agent does perform some checks to identify if a monitored log has been replaced (checking the "checksum" of the first line of data to see if it has changed, checking to see if the creation date has changed, checking to see if the overall file size has shrunk since the last time the file was read). This allows the LO agent to read a file from the beginning treating all the log entries as "new" data, which can allow for "rolling" or "replaced" logs to be monitored.
If the LO agent does NOT identify the monitored log as being replaced, by design it will only process the "appended" data from the point that the LO agent was at based on the Current File Position.
The LO agent can be configured to maintain a "restart" location in a .rst file using NumEventsToCatchup= setting. This allows the LO agent to be configured to keep a pointer to where it was in a monitored file for tracking entries written to the monitored log while the LO agent was stopped. The configuration option for controlling whether the LO agent will try to maintain a "restart" location to process entries written to a monitored log while LO agent is not running is the NumEventsToCatchup setting. The default setting is to only monitor "new" data appended to a monitored log while the LO agent is running (Num
The LO agent looks at the monitored log periodically based on the "PollInterval" to see if there is "new" data. It is important to understand how the LO agent determines that there is unprocessed data.
If in the .conf file setting the following is specified:
This means that the LO agent is NOT going to try and maintain a restart checkpoint file (.rst) so will NOT try to process entries that are written while the LO agent is not running.
The behavior with a "circular" log is going to depend on whether the rolling log is already at its max size or is still filling when the LO agent starts.
If the monitored log is already full when the LO agent starts, the LO agent will advance to the "end" of the file, so the current file position will be at the end of the file, and the LO agent will not process anything currently in the file.
The LO agent will periodically check the status of the monitored log to see if there is any new "appended" data, but since the monitored log is of a fixed size, its filesize will not grow if it is already at its maximum size, so the LO agent will not process any data when it stats the monitored log on the subsequent PollInterval checks.
Eventually, a circular log will end up overwriting the first line of data when the log wraps, and the next PollInterval after that happens, the LO agent will detect the beginning of the file being overwritten (assuming the checksum of the first line of data changes) when the monitored log is queried.
The LO agent detects this as "file replacement" and will begin reading the file from the beginning and processing it to the end. The first "PollInterval" that happens after the beginning of a monitored file is overwritten will trigger the LO agent to process the current entire contents of the monitored log in bulk as one batch of "new" data, since the LO agent will process log entries until it hits the end of the file.
Then the LO agent will again sit idle, periodically checking to see if the file has grown in size (which it won't since it is fixed size and full), and again will wait until the beginning of the circular log is overwritten before it detects that the file has been "replaced" based on the checksum logic of the first line. Once that happens, it again will trigger the entire current contents of the monitored log to be processed all at once.
The LO agent is never going to monitor a circular log in "real-time" once the log fills to its max size, because the only time that the LO agent is going to detect that there is "new" data since the file size would no longer grow would be based on the "checksum" changing when the first line is modified, or if the file is somehow otherwise modified so that the creation date is changing, or the file is "unavailable" so the OS returns a "file not found" and then the LO agent "rediscovers" the file on a subsequent PollInterval / NewFilePollInterval and processes the contents as "new" data.
With a circular log there is no real way to maintain a meaningful restart checkpoint (.rst) once the file fills to its "full" size, so there is the possibility of losing entries that are written when the LO agent isn't running.
On restart, the LO agent is always going to advance to the "end" of the file, which is simply the "full" size position if it has already filled, and then wait for "appended" data or the file to be "replaced" which the LO agent can detect when the beginning is overwritten when a circular log wraps.
Depending on how frequently the circular log fills, the LO agent may not process a log entry for a long time after it was actually written to the monitored circular log.
If a circular log wraps multiple times while the LO agent is down, those log entries are lost and never processed.
With a circular log, the same "log entry" may be reprocessed when the log is being detected as "replaced" and the LO agent processes the file from beginning.
This is often reported as "duplicate" situation alerts for that same log entry when it is reprocessed.
This can happen with circular logs if initially the log monitoring is started when the circular log is filling for the first time (so has not yet reached its maximum size).
In this case, the LO agent will process new "appended" data on each PollInterval as the file fills since it will see the file size growing.
Once the circular log reaches its maximum size and wraps, it will overwrite the beginning of the log, triggering the LO agent to process the entire file contents, and the LO agent will re-process those log entries it had already sent that are still conained in the log when it is read from the beginning and processed in its entirety to the end of file.
Because of the limitations of how LO agent determines when there is data to process, it is not recommended to monitor circular logs with LO agent.
Since "duplicate" log entries is an issue that could be reported in logs that are only appended to, it is often necessary to gather detailed RAS1 logging to try and diagnose an issue with a file being detected as replaced / switched to see if the reason events are being "re-sent" is due to the LO agent detecting that it should read the monitored log from the beginning, enable detailed RAS1 logging.
kumpfdp - will show trace points related to "PollInterval" processing
Review of the RAS1 logs to look for indications of a file being read from the beginning around the time leading up to any "duplicate" events.
Look for indications of "File switched or re-created or copied" in the detailed RAS1 logs.
Detailed analysis of RAS1 logs would need to be done by L2 Support to identify the specifics of any restart of monitoring.
The LO agent performs several checks on each "PollInterval" when it periodically checks a monitored log for "new" data to process.
If using FileComparisonMode= and telling the LO agent to pick between multiple files that could match LogSources= or RegexLogSources= specification, and are telling the LO agent that it should "pick" one match over another, if the LO agent detects that the log to monitor based on FileComparisonMode setting switches... the "new" log will be processed from the beginning. If there are two files being written to - LogA and LogB - and a FileComparisonMode= comp
Keywords: LFA LO kloagent
Additional ITM Agent Insights series of IBM Tivoli Monitoring Agent blogs are indexed under ITM
Subscribe and follow us for all the latest information directly on your social feeds: