Operations

You can monitor the operations of the Notification Service by using a consumer application to view the topic on the Kafka cluster.

Alternatively, you can upload logs from the Manager to a host and view the notification.log and report.log files from the Accesser devices to see notification send status.
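As an illustration of the consumer approach, the following is a minimal sketch and not part of the product. It assumes the third-party kafka-python package, a cluster reachable without TLS or SASL, and message values that are UTF-8 text (JSON is attempted first, with a fallback to plain text). The function names, the idle timeout, and the topic and broker values in the usage note are hypothetical.

```python
import json


def decode_notification(raw):
    """Decode one message value: try UTF-8 JSON first, else return the text."""
    text = raw.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def watch_topic(topic, bootstrap_servers, idle_timeout_ms=10000):
    """Print every message on the topic, stopping after an idle period."""
    # Imported here so decode_notification works without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        auto_offset_reset="earliest",         # start from the oldest retained message
        consumer_timeout_ms=idle_timeout_ms,  # end iteration when no messages arrive
    )
    try:
        for message in consumer:
            print(message.offset, decode_notification(message.value))
    finally:
        consumer.close()
```

For example, calling watch_topic("my-notification-topic", ["kafka1.example.com:9092"]) would print whatever is currently on that placeholder topic. A cluster that requires TLS or authentication needs additional KafkaConsumer settings such as security_protocol and ssl_cafile.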

Verifying that the Notification Service is working correctly

After you create a configuration and assign it to a vault, you can verify that the Notification Service is working in the following ways:
  • Ensure that there is write or delete I/O going to the vault. If there is no I/O, then there is no way to know if the configuration is working.
  • Check for incidents on the Manager device. If the Accesser devices are experiencing send failures, an incident will be opened.
  • In the Manager Web Interface, navigate to Maintenance > Troubleshooting Console. Run the nut health statistic command on the Accesser devices that have Notification Service-enabled vaults.
    • Note any notificationService.{uuid}.sends and notificationService.{uuid}.sendFailures.
      • If the sends are greater than the send failures, then the Notification Service is working. It is normal to see a small number of failures; the first few notifications tend to fail because of race conditions during initialization inside the Apache Kafka library. The retry agent handles these cases.
      • If sends and send failures are equal, then the Notification Service is not working.
      • If there is a 100 percent failure rate, there might be incorrect hostnames or ports, bad authentication, or bad SSL certificates.
      • If notificationService.{uuid}.producerAllocated is false, you might have incorrectly entered the cluster's hostnames.
      Tip: The notificationService.{uuid}.sendFailurePercentage metric is reset every minute, while the send counts are not.
  • Use a Kafka Consumer to read the topic that you put into the Notification Service Configuration.
  • Check the notification.log file on the Accesser devices.
    • Check the success field.
    • Any failures can be cross-referenced in the file by request_id to see further retries and successes.
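The rules for interpreting the send counters can be summarized in a short sketch. It assumes that you copy the sends, sendFailures, and producerAllocated values by hand from the command output; it does not parse that output, and the function name and returned messages are hypothetical. The zero-sends case is an added assumption that follows from the first item in the list: without I/O there is nothing to judge.

```python
def assess_notification_service(sends, send_failures, producer_allocated=True):
    """Apply the checklist rules to counters copied from one Accesser device."""
    if not producer_allocated:
        # producerAllocated is false: the cluster hostnames may be wrong.
        return "not working: producer not allocated; check the cluster hostnames"
    if sends == 0:
        # Assumption: with no sends recorded, there has been no I/O to judge.
        return "unknown: no sends recorded; generate write or delete I/O first"
    if send_failures >= sends:
        # Every send failed (100 percent failure rate).
        return ("not working: all sends failed; check hostnames, ports, "
                "authentication, and SSL certificates")
    # Some sends succeeded; a few early failures are expected and retried.
    return "working: sends exceed send failures"
```

Because the send counts are cumulative, a device that recovered after an earlier outage can still show a high failure count; the sendFailurePercentage metric, which resets every minute, is the better indicator of current behavior.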