Displaying log anomaly detection resources and models

During natural language log anomaly detection training, models are produced for the resources identified in the log data provided as input to the algorithm. When training completes, the system provides information on the models that were generated.

About this task

To produce a model for a resource, that resource must have at least 2,000 log lines and its log data must meet formatting requirements. When training completes, a Models tile is displayed containing the details described below.
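The eligibility rule stated above can be sketched in a few lines of Python. This is only an illustration of the rule as this topic describes it, not the product's implementation: the function name `model_status`, its parameters, and the returned dictionary keys are hypothetical and chosen to mirror the column names in the resource table.

```python
MIN_LOG_LINES = 2_000  # minimum per resource, as stated in this topic


def model_status(log_lines_found: int, format_ok: bool) -> dict:
    """Hypothetical helper: derive the per-resource status values.

    log_lines_found: number of log lines found for the resource.
    format_ok: whether the resource's data meets formatting needs
               (how that is decided is outside this sketch).
    """
    sufficient = log_lines_found >= MIN_LOG_LINES
    return {
        "sufficient_data": "Passed" if sufficient else "Failed",
        "data_format": "Passed" if format_ok else "Failed",
        # A model is created only if both checks pass.
        "model_created": sufficient and format_ok,
    }
```

For example, a resource with 1,999 log lines fails the sufficiency check even if its formatting is valid, so no model is created for it.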

Models

The main part of the tile displays a donut graph showing resources with models in green and resources without models in red.

  • The total number of models identified is shown in the middle of the donut graph.
  • Hover over the donut graph to display the number of resources with models and the number of resources without models.

View resources

Click this link at the top right of the Models tile to display resource and model details.

The resource table contains the following columns.

Resource name: Name of the resource for which models were produced.

Log lines found: Number of log lines found for this resource in the data supplied when configuring the algorithm. A minimum of 2,000 log lines is required per resource to meet data sufficiency requirements.

Note: The number of log lines displayed here might be smaller than the number of log lines provided by the log data integration and presented in a log aggregator dashboard. This is because log lines are deduplicated during the training process. A log line is considered a duplicate when its combination of timestamp, instance_id, and message exactly matches that of another log line. Duplicate log lines are not saved in Elasticsearch and are not used for inference in log anomaly detection. Deduplication has no impact on the quality of the generated models. It lowers data storage requirements and helps to manage Flink restart issues caused by the same data being collected multiple times.

Sufficient data: Displays one of the following values:

  • Passed: 2,000 or more log lines were found for this resource.
  • Failed: Fewer than 2,000 log lines were found for this resource.

Data format: Indicates whether the data found for this resource meets data formatting needs. This column takes one of the following values:

  • Passed: Formatting needs are met.
  • Failed: Formatting needs are not met. You might need to modify the field mapping of the data. For more information, see Mapping data.

Model status: Indicates whether a model was created for this resource. A model is created only if both the sufficiency and formatting needs are met.
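
The deduplication described for the Log lines found column can be illustrated with a minimal Python sketch. It is not the product's training code. It assumes that each log line is available as a dictionary with `timestamp`, `instance_id`, and `message` keys, and that `instance_id` identifies the resource. Both are assumptions made for illustration, as is the function name `count_unique_lines`.

```python
from collections import Counter


def count_unique_lines(records):
    """Count log lines per resource after dropping exact duplicates.

    Assumption: each record is a dict with "timestamp", "instance_id",
    and "message" keys, and "instance_id" identifies the resource.
    """
    seen = set()
    counts = Counter()
    for rec in records:
        # Two lines are duplicates only if all three fields match exactly.
        key = (rec["timestamp"], rec["instance_id"], rec["message"])
        if key in seen:
            continue  # duplicate: not stored, not counted
        seen.add(key)
        counts[rec["instance_id"]] += 1
    return counts


# Hypothetical sample data: the second record repeats the first exactly.
sample = [
    {"timestamp": 1700000000, "instance_id": "cart", "message": "timeout"},
    {"timestamp": 1700000000, "instance_id": "cart", "message": "timeout"},
    {"timestamp": 1700000001, "instance_id": "cart", "message": "timeout"},
]
unique_counts = count_unique_lines(sample)
```

In this sample, three lines are supplied but only two are counted for the `cart` resource, which is why the displayed number can be lower than the number shown in a log aggregator dashboard.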