Failure to append messages to Buffer Pool

FAL uses buffers to hold JSON events in memory before flushing them to disk.

If the buffer pool becomes resource constrained, IO threads wait a short time for space to become available before dropping events. This prevents the filesystem from being throttled while threads wait for available buffer space.
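The wait-then-drop behavior described above can be illustrated with a minimal Python sketch. This is not FAL's implementation: the class name BoundedEventBuffer, its parameters, and the use of Python's standard queue module are all assumptions made for illustration only.

```python
import queue
import threading


class BoundedEventBuffer:
    """Hypothetical bounded buffer: wait briefly for space, then drop."""

    def __init__(self, capacity, wait_seconds):
        # capacity and wait_seconds are illustrative, not FAL settings.
        self._queue = queue.Queue(maxsize=capacity)
        self._wait_seconds = wait_seconds
        self._lock = threading.Lock()
        self.dropped = 0

    def append(self, event):
        """Try to buffer an event; return False if it had to be dropped."""
        try:
            # Block for at most wait_seconds while the buffer is full.
            self._queue.put(event, timeout=self._wait_seconds)
            return True
        except queue.Full:
            # No space became available in time: count the drop
            # instead of stalling the caller indefinitely.
            with self._lock:
                self.dropped += 1
            return False

    def flush(self):
        """Drain all buffered events, freeing space for new appends."""
        drained = []
        while True:
            try:
                drained.append(self._queue.get_nowait())
            except queue.Empty:
                return drained
```

In this sketch, a caller that receives False from append knows the event was dropped, and the dropped counter plays the role of the total reported once writes succeed again.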

This issue can occur periodically when the IO load on the node is heavy, and it does not indicate a fatal error. The message Failed to append to Buffer Pool causes the health state of audit to change to degraded. The degraded state should clear from system health on its own once buffer space becomes available. After normal function resumes, a message like < Wrote message to audit log successfully! Total messages that could not be sent: 372. > appears in the logs, showing the number of messages that had to be dropped.
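To track how many events were dropped, the count can be extracted from the recovery message. The following Python sketch assumes only the message text quoted above. Any surrounding log line layout (timestamps, prefixes) is not specified here, and the function name dropped_message_count is hypothetical.

```python
import re

# Matches the count in the recovery message quoted in this section.
DROPPED_PATTERN = re.compile(r"Total messages that could not be sent: (\d+)")


def dropped_message_count(log_line):
    """Return the dropped-message count from a log line, or None if absent."""
    match = DROPPED_PATTERN.search(log_line)
    return int(match.group(1)) if match else None
```

For example, passing the sample message from this section returns 372, and a line without the message returns None.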

If possible, limit the scope of the audit. Rather than auditing all events on the entire filesystem, remove from the audit any events or filesets that do not need to be audited. This can reduce buffer space contention, but it is not guaranteed to solve the issue.