Examples of statistical baseline log anomaly detection
This topic provides examples of log anomalies generated by the statistical baseline log anomaly detection algorithm and when they are likely to be found.
Examples presented in this topic
The following examples are described in this topic.
Example: unexpected 500 error
You recently installed the sock-shop application, which is made up of multiple microservices, such as front-end
and orders
. Most of the microservices that make up this application generate their own log files.
Note For the statistical baseline log anomaly detection algorithm to work, there is no formal minimum log line count. The main requirement is that all microservices generate a reasonable amount of log data.
The example proceeds as follows:
-
Initially the microservices are all running OK and generate clean log files, without any text entities that might indicate an error. See Clean log file lines for an example of what a log like this might look like.
-
Shortly after this, some of the pods associated with the microservices start shutting down unexpectedly; as a result of this,
HTTP 500
errors appear for the first time in the live log stream. -
Live log data is analyzed in 10 second slots. Now, when the latest logs are compared to the baseline reference values, several error entities are identified:
500
,error
, andillegIllegalStateException
. The algorithm determines that the number500
is an error, as it is accompanied by the entitystatus
, which points to an HTTP 500 error, a general HTTP status code indicating that something has gone wrong on a server but the server could not be more specific about what the exact problem is. These entities have not been identified before during normal behavior and since they are statistically significant (trivially so in this example) thus point to an anomaly. See Error log file lines for an example of what a log like this might look like.Note If this was an error message that occurs on a regular basis, say an optional library that is not used in a given piece of code, then the statistical baseline log anomaly detection algorithm will automatically learn this, and consequently treat this error message as normal and therefore no longer alert on it.
-
Log anomaly alerts are generated for each log anomaly and sent to the in-cluster ObjectServer. See Log anomaly alert format for an example of what a log anomaly alert might look like.
Reference information
The following information is referenced in this task.
Clean log file lines
These example JSON log snippets display clean log lines for the front-end and orders microservices within the sock-shop application. The log algorithm extracts entities from the _line
keyword within these log lines and then
performs the analysis described in Workflow of statistical baseline log anomaly detection on these extracted entities.
Clean log file lines from the front-end microservice
The following log files lines do contain numbers but they are not accompanied by error entities. The numbers in these log message are actually part of user address data (for example 322 Main Street
) and credit card data, and
can be seen in the _line
parameter in the code snippet below. You might need to scroll to the right to see the full content of the _line
parameter.
[
{
"_app":"front-end",
"_file":"/var/log/containers/front-end-85c54f4fb-jvzqn_sock-shop_front-end-e19f0a886c8a03aa63e8138def1df8f33a60981ec41cdbc2aca7c120d16e7a42.log",
"_host":"front-end-85c54f4fb-jvzqn",
"_ip":"172.30.148.208",
"_ipremote":"52.117.50.10",
"_lid":"1264263893737455616",
"_line":"Posting Address: {\"number\":\"\",\"street\":\"322 Main Street\",\"city\":\"Gotham City\",\"postcode\":\"12345\",\"country\":\"United States\",\"userID\":\"1q2w3e4r5t6y7u8i9o0p\"}",
"_logtype":"json",
"_mac":"00:00:00:00:94:60",
"_tag":["k8s"],"_ts":1601424000273,
"city":"Gotham City",
"container":"front-end",
"containerid":"e19f0a886c8a03aa63e8138def1df8f33a60981ec41cdbc2aca7c120d16e7a42",
"country":"United States",
"namespace":"sock-shop",
"node":"kube-bqvhtmdd031d903ud4g0-simulationd-default-00000105",
"number":"",
"pod":"front-end-85c54f4fb-jvzqn",
"postcode":"12345",
"street":"322 Main Street",
"userID":"0p9o8i7u6y5t4r3e2w1q"
},
{
"_app":"front-end",
"_file":"/var/log/containers/front-end-85c54f4fb-jvzqn_sock-shop_front-end-e19f0a886c8a03aa63e8138def1df8f33a60981ec41cdbc2aca7c120d16e7a42.log",
"_host":"front-end-85c54f4fb-jvzqn",
"_ip":"172.30.148.208",
"_ipremote":"52.117.50.10",
"_lid":"1264263893737455617",
"_line":"\u001B[0mPOST /addresses \u001B[32m200 \u001B[0m9.663 ms - -\u001B[0m",
"_logtype":"customapp",
"_mac":"00:00:00:00:94:60",
"_tag":["k8s"],
"_ts":1601424000282,
"container":"front-end",
"containerid":"e19f0a886c8a03aa63e8138def1df8f33a60981ec41cdbc2aca7c120d16e7a42",
"namespace":"sock-shop",
"node":"kube-bqvhtmdd031d903ud4g0-simulationd-default-00000105",
"pod":"front-end-85c54f4fb-jvzqn"
}
]
Clean log file lines from the orders microservice
The following log files lines do contain numbers but they are not accompanied by error entities. The numbers in these log message are actually part of user address data (for example 322 Main Street
) and credit card data, and
can be seen in the _line
parameter in the code snippet below. You might need to scroll to the right to see the full content of the _line
parameter.
[
{"_app":"orders",
"_file":"/var/log/containers/orders-b8cb69f5c-gn87g_sock-shop_orders-9d699fcc72f3283bf66ecb0bb850251e73dd5a7f5a0026b4fcb6facc6c533baf.log",
"_host":"orders-b8cb69f5c-gn87g",
"_ip":"172.30.148.208",
"_ipremote":"52.117.50.10",
"_lid":"1264263925219901440",
"_line":"2020-09-30 00:00:07.959 INFO [orders,4ecc59a2bbf00a18,276a9b5feb58dcd3,true] 7 --- [p-nio-80-exec-8] w.w.s.o.controllers.OrdersController : Sending payment request: PaymentRequest{address=Address{id=null, number='', street='322 Main Street', city='Gotahm City', country='United States', postcode='12345'}, card=Card{id=null, longNum='1234567812345678', expires='12/56', ccv='321'}, customer=Customer{id=null, firstName='John', lastName='Doe', username='John_Doe', addresses=[], cards=[]}}",
"_logtype":"json",
"_mac":"00:00:00:00:94:60",
"_tag":["k8s"],
"_ts":1601424007959,
"addresses":"[],",
"cards":"[]}}",
"container":"orders",
"containerid":"9d699fcc72f3283bf66ecb0bb850251e73dd5a7f5a0026b4fcb6facc6c533baf",
"firstName":"'John',",
"lastName":"'Doe',",
"level":"INFO",
"message":"[orders,4ecc59a2bbf00a18,276a9b5feb58dcd3,true] 7 --- [p-nio-80-exec-8] w.w.s.o.controllers.OrdersController : Sending payment request: PaymentRequest{address=Address{id=null, number='', street='322 Main Street', city='Gotham City', country='United States', postcode='12345'}, card=Card{id=null, longNum='1234567812345678', expires='12/56', ccv='321'}, customer=Customer{id=null, firstName='John', lastName='Doe', username='John_Doe', addresses=[], cards=[]}}",
"namespace":"sock-shop",
"node":"kube-bqvhtmdd031d903ud4g0-simulationd-default-00000105",
"pod":"orders-b8cb69f5c-gn87g",
"username":"'John_Doe',"}
]
Error log file lines
The following example JSON log snippet displays error log lines for the front-end microservice within the sock-shop application. The log algorithm extracts entities from the _line
keyword within these log lines and then performs
the analysis described in Workflow of statistical baseline log anomaly detection on these extracted entities.
Error log file line from the front-end microservice
The following log line contains numbers accompanied by error entities; in particular, the _line
parameter contains the number 500
, as well as the error entities status
, exception
, and
IllegalStateException
. You might need to scroll to the right to see the full content of the _line
parameter.
[
{"_account": "6a8e297e13",
"_cluster": "ld70",
"_host": "front-end-85c54f4fb-jvzqn",
"_ingester": "mezmo-agent/1.6.2 (Ubuntu/16.04) k8s",
"_logtype": "json",
"_mac": "00:00:00:00:94:60",
"_tag": ["k8s"],
"_file": "/var/log/containers/front-end-85c54f4fb-jvzqn_sock-shop_front-end-e19f0a886c8a03aa63e8138def1df8f33a60981ec41cdbc2aca7c120d16e7a42.log",
"_line": "Order response: {\"timestamp\":1599236098757,\"status\":500,\"error\":\"Internal Server Error\",\"exception\":\"java.lang.IllegalStateException\",\"message\":\"Unable to create order due to unspecified IO error.\",\"path\":\"/orders\"}",
"_ts": 1599236098759,
"_app": "front-end",
"pod": "front-end-85c54f4fb-jvzqn",
"namespace": "sock-shop",
"container": "front-end",
"containerid": "e19f0a886c8a03aa63e8138def1df8f33a60981ec41cdbc2aca7c120d16e7a42",
"node": "kube-bqvhtmdd031d903ud4g0-simulationd-default-00000105",
"_ip": "172.30.148.208",
"_ipremote": "52.117.50.10",
"__key": "logline:6a8e297e13:front-end:front-end-85c54f4fb-jvzqn:ld70",
"_bid": "e4975ea4-6634-4f88-accc-8eba410d2617:61301:ld70",
"level": null,
"timestamp": 1599236098757,
"error": "Internal Server Error",
"exception": "java.lang.IllegalStateException",
"message": "Unable to create order due to unspecified IO error.",
"_status": 500,
"_path": "/orders",
"_id": "1255087168836907020"}
]
Log anomaly alert example
This example alert incorporates data from the identified log anomalies into an ObjectServer alert structure.
- This alert has been assigned a severity of 5.
- The
summary
parameter explains that 24 errors have been found in the front-end component. - The
details.log_anomaly_detector.text_dict.template_list
parameter lists the error entities identified.
[
{"id": "d708346f-046a-35ae-b0ab-1a9ad8f1bafa",
"occurrenceTime": "2020-09-04T16:15:40.000Z",
"sender": {"name": "Log Anomaly"},
"event_link": "https://api.us-south.logging.cloud.ibm.com/v1/export?to=1599236150000&from=1599236140000",
"summary": "24 errors found in component: front-end",
"description": "500: 8\njava.lang.IllegalStateException: 8\nerror_log: 8",
"severity": 5,
"type":
{"eventType": "problem",
"classification": "log anomaly",
"condition": "c9680808-14bb-11ec-b72f-acde48001122"},
"resource":
{"name": "front-end",
"type": "application",
"application": "front-end",
"entity": "front-end"},
"expirySeconds": 900,
"details":
{"log_anomaly_detector":
{"start_timestamp": 1599236140000,
"end_timestamp": 1599236150000,
"original_group_id": "c9680808-14bb-11ec-b72f-acde48001122",
"causality":
{"service": ["front-end"]},
"detected_at": 1632338288103.0,
"source_application_id": "front-end",
"log_anomaly_confidence": 1.0,
"log_anomaly_model": ["oob_entities"],
"prediction_error":
{"oob_entities_error": 24},
"error_templates": [0, 1, 2],
"template_list": ["500", "java.lang.IllegalStateException", "error_log"],
"count_vector": [8, 8, 8],
"text_dict":
{"template_ids": ["oob_entities_500", "oob_entities_java.lang.IllegalStateException", "oob_entities_error_log"],
"template_list": ["500", "java.lang.IllegalStateException", "error_log"],
"count_vector": [8, 8, 8],
"expected_count_vector": [0.29, 0.29, 0.48],
"expected_count_lower_bound": [0, 0, 0],
"expected_count_upper_bound": [3.0, 3.0, 3.21]
},
"application_group_id": "ss",
"application_id": "1",
"model_version": "5",
"severity_from_model": 0,
"description": "500: 8\njava.lang.IllegalStateException: 8\nerror_log: 8"
}
},
"timestamp": 1632338288103.0}
]
Example: WebSphere log anomaly
The following example shows a WebSphere log. In log anomalies for a specific domain like WebSphere, entities are included in the logs. WebSphere log anomaly detection identifies which entities are error indicators. Message ID, ibm_messageId
,
and log levels, loglevel
, are examples of WebSphere entities.
{
"type": "was_message",
"host": "abc-123.myserver.com",
"ibm_cellName": "gsV900Cell",
"ibm_nodeName": "node4",
"ibm_serverName": "nodeagent",
"ibm_sequence": "1616166373781_0000000000002",
"message": "TRAS0017I: The startup trace state is *=info.",
"ibm_datetime": "2021-03-19T11:06:13.781-0400",
"ibm_messageId": "TRAS0017I",
"ibm_threadId": "00000001",
"module": "com.ibm.ejs.ras.ManagerAdmin",
"loglevel": "INFO",
"unix_ts": 1616166373781,
"start": 1616166370000,
"end": 1616166380000
}
On the Incidents and alerts page, the following example from the Summary parameter explains that two logs were found in the qotd-qrcode
container. This parameter also says that a primary message
code of SSLC0008E
is contributing to the error. This alert is further explained by the Alert details panel. For more information, see Domain-specific log anomaly alert details.
Log Anomaly found 2 logs containing errors for container(s): qotd-qrcode with a primary message code of SSLC0008E
Example: IBM MQ log anomaly
The following example shows an IBM MQ log. In log anomalies for a specific domain like IBM MQ, entities are included in the logs. IBM MQ log anomaly detection identifies which entities are error indicators. Message ID, ibm_messageId
,
and log levels, loglevel
, are examples of IBM MQ entities.
{
"ibm_messageId": "AMQ6287I",
"ibm_arithInsert1": 0,
"ibm_arithInsert2": 0,
"ibm_commentInsert1": "Linux 4.18.0-193.70.1.el8_2.x86_64 (Cloud Pak Linux (x86-64 platform) 64-bit)",
"ibm_commentInsert2": "/opt/mqm (Installation1)",
"ibm_commentInsert3": "9.3.0.0 (p930-L220526)",
"ibm_datetime": "2022-06-01T01:13:49.673Z",
"ibm_serverName": "QMA",
"type": "mq_log",
"host": "slot11-qma-ibm-mq-0",
"loglevel": "INFO",
"module": "amqxeida.c:6755",
"ibm_sequence": "1654046029_673594295",
"ibm_processId": "37",
"ibm_threadId": "3",
"ibm_version": "9.3.0.0",
"ibm_processName": "amqzmuc0",
"ibm_userName": "mqm",
"ibm_installationName": "Installation1",
"ibm_installationDir": "/opt/mqm",
"message": "AMQ6287I: IBM MQ V9.3.0.0 (p930-L220526)."
}
On the Incidents and alerts page, the following example from the Summary parameter explains that two logs were found in the qotd-qrcode
container. This parameter also says that a primary message
code of SSLC0008E
is contributing to the error. This alert is further explained by the Alert details panel. For more information, see Domain-specific log anomaly alert details.
Log Anomaly found 2 logs containing errors for container(s): qotd-qrcode with a primary message code of SSLC0008E