Forcing checkpointing after a specified number of minutes
When the network is down, or communication between servers is unreliable, Impact may lose a
block of events that was sent to the secondary server for processing. If this happens, Impact
continues to hold the checkpoint for those events, and the events themselves, in memory, which may
cause an OutOfMemory error.
Checkpointing means persisting the Serial or
Statechange field of events to the etc/eventreader.state file,
so that an event reader knows whether it has handled a block of events.
For example, when processing of a block is slow, you may see the following messages in the logs:
INFO [EventBroker] AbstractEventReader: checkPoint: The Block ID = 248262
is not the one I was expecting: 248260
INFO [EventBroker] Hold the events with identifier :248262 until earlier
block of events are processed
In this case, event block 248260 has not reported back to the
primary cluster member that its processing is complete. The block may still be in processing, or the
confirmation may have been lost, possibly because of network issues. The primary cluster member holds
all events that follow this event block in memory, which may cause an OutOfMemory error.
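To check a log for this condition, a script can search for the block ID mismatch message. The following is a minimal sketch that assumes the message wording matches the sample above. The function name find_stalled_blocks is hypothetical and is not part of Impact.

```python
import re

# Pattern based on the sample log message shown above. Any whitespace,
# including a wrapped line break, is tolerated between the words.
MISMATCH = re.compile(
    r"The\s+Block\s+ID\s*=\s*(\d+)\s+is\s+not\s+the\s+one\s+I\s+was\s+expecting:\s*(\d+)"
)

def find_stalled_blocks(log_text):
    """Map each expected (unconfirmed) block ID to the later block IDs
    that were reported while the reader was still waiting for it."""
    stalled = {}
    for match in MISMATCH.finditer(log_text):
        received, expected = int(match.group(1)), int(match.group(2))
        stalled.setdefault(expected, []).append(received)
    return stalled

sample = """INFO [EventBroker] AbstractEventReader: checkPoint: The Block ID = 248262
is not the one I was expecting: 248260
"""
print(find_stalled_blocks(sample))  # {248260: [248262]}
```

An expected block ID that keeps appearing with a growing list of later blocks suggests that the earlier block has not been confirmed and that events are accumulating in memory.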
To avoid this problem, you can set the maxminutestoforcecheckpoint property in the OMNIbus event reader properties file: $IMPACT_HOME/etc/<servername>_<omnibuseventreadername>.props.
For example, add the following property:
impact.<omnibuseventreadername>.maxminutestoforcecheckpoint=5
This forces checkpointing to occur after the specified number of minutes. The Impact server can then continue processing and checkpointing events.
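If the property has to be applied to several event readers, the edit can be scripted. The following is a minimal sketch that works on the text of a properties file and assumes simple key=value lines. The function name set_force_checkpoint and the reader name myreader are hypothetical; editing the file by hand has the same effect.

```python
def set_force_checkpoint(props_text, reader_name, minutes):
    """Return props_text with the maxminutestoforcecheckpoint property
    for reader_name set to minutes, replacing an existing entry if any."""
    key = f"impact.{reader_name}.maxminutestoforcecheckpoint"
    new_line = f"{key}={minutes}"
    lines = props_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Replace only a line that sets exactly this key.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"

# Example with a hypothetical reader name and an unrelated existing property.
print(set_force_checkpoint("some.other.property=1\n", "myreader", 5), end="")
```

Keeping the function free of file access makes it easy to test; the caller reads the properties file, passes its contents in, and writes the returned text back.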
There are two possible reasons for missing checkpoints:
- Missing checkpoint when events are processed successfully: An exception such as NullPointerException may be thrown when the event reader checkpoints the block, so the block is not missing from processing, only from checkpointing.
- Missing checkpoint when event processing fails or times out: You can find the exception for these events in the event processor logs and in impactserver.log.