Managing embedded JSON
Log lines that contain embedded JSON might not be parsed correctly by the natural language log anomaly detection algorithm. If your log lines contain embedded JSON, apply one of the methods described here to convert them to unstructured data.
The following snippet shows an example of a log line containing embedded JSON.
021-09-23T21:43:40.067+0000 I NETWORK [conn5083706] received client metadata from 127.0.0.1:49736 conn5083706: { name: \"PyMongo\", version: \"3.8.0\", type: \"Linux\", architecture: \"x86_64\", version: \"3.10.0-1160.25.1.el7.x86_64\", platform: \"CPython3.7.3.final.0\" }
The braces (curly brackets) and the text between them form the embedded JSON.
You can handle these log lines by applying one of the following options:
- None: The default option. The JSON is not processed or modified, so the example log line is left unchanged:
021-09-23T21:43:40.067+0000 I NETWORK [conn5083706] received client metadata from 127.0.0.1:49736 conn5083706: { name: \"PyMongo\", version: \"3.8.0\", type: \"Linux\", architecture: \"x86_64\", version: \"3.10.0-1160.25.1.el7.x86_64\", platform: \"CPython3.7.3.final.0\" }
- Flatten: This option flattens the JSON object by removing the opening and closing braces. The example log line then resembles the following:
021-09-23T21:43:40.067+0000 I NETWORK [conn5083706] received client metadata from 127.0.0.1:49736 conn5083706: name: \"PyMongo\", version: \"3.8.0\", type: \"Linux\", architecture: \"x86_64\", version: \"3.10.0-1160.25.1.el7.x86_64\", platform: \"CPython3.7.3.final.0\"
- Filter: This option removes the JSON object from the log line by replacing it with an empty string. The example log line then resembles the following:
021-09-23T21:43:40.067+0000 I NETWORK [conn5083706] received client metadata from 127.0.0.1:49736 conn5083706:
Note: Examine your data before applying this option, especially if your log lines are complete JSON objects.
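The effect of the flatten and filter options can be approximated with a short Python sketch. This is only an illustration of the behavior described above, not the product's implementation: the helper names `flatten` and `filter_json` are hypothetical, the braced section is found with a simple regular expression, and the whitespace normalization is an added assumption so that the output matches the examples.

```python
import re

# Hypothetical helpers for illustration only; not the product's implementation.
# Matches from the first "{" to the last "}" on the line, without validating the content.
_BRACED = re.compile(r"\{.*\}")


def _normalize(text: str) -> str:
    """Collapse runs of whitespace and trim the ends (an assumption of this sketch)."""
    return " ".join(text.split())


def flatten(line: str) -> str:
    """Remove the opening and closing braces, keeping the text between them."""
    return _normalize(line.replace("{", "").replace("}", ""))


def filter_json(line: str) -> str:
    """Replace the braced section with an empty string."""
    return _normalize(_BRACED.sub("", line))


line = 'received client metadata from 127.0.0.1:49736 conn5083706: { name: "PyMongo", version: "3.8.0" }'
print(flatten(line))
print(filter_json(line))
```

Running the sketch on a shortened version of the example log line prints the line without braces for `flatten`, and only the text up to `conn5083706:` for `filter_json`.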
Which option to select
Perform the following steps to get an idea of which option to select:
- Retrieve a sample of at least 30,000 lines of your log data.
- In your sample, determine the percentage of log lines that either are JSON objects or combine text with JSON objects.
- Use the following table to help you select an option.
JSON percentage in data | Suggested option |
---|---|
Less than 30% of log lines contain JSON objects | none |
Between 30% and 80% of log lines contain JSON objects | filter |
More than 80% of log lines contain JSON objects | none or flatten |
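The steps above can be sketched in Python as follows. This is a rough aid rather than an official tool: the function names are hypothetical, a line is assumed to "contain JSON" if it has an opening brace followed later by a closing brace, and a value of exactly 30%, which the table does not cover, is treated here as "none".

```python
import re

# Assumption: a line counts as containing JSON if a "{" is followed later by a "}".
_JSON_LIKE = re.compile(r"\{.*\}")


def json_percentage(lines):
    """Return the percentage of non-empty lines that contain a braced section."""
    non_empty = [line for line in lines if line.strip()]
    if not non_empty:
        return 0.0
    hits = sum(1 for line in non_empty if _JSON_LIKE.search(line))
    return 100.0 * hits / len(non_empty)


def suggest_option(percentage):
    """Map a JSON percentage to the option suggested in the table."""
    if percentage > 80:
        return "none or flatten"
    if percentage > 30:
        return "filter"
    return "none"  # assumption: exactly 30% is handled like "less than 30%"


# Tiny stand-in for a real sample; use at least 30,000 lines in practice.
sample = [
    'I NETWORK received client metadata: { name: "PyMongo" }',
    "I NETWORK connection accepted",
    "I NETWORK end connection",
    "I STORAGE checkpoint complete",
]
pct = json_percentage(sample)
print(f"{pct:.1f}% of lines contain JSON, suggested option: {suggest_option(pct)}")
```

To use the sketch on real data, replace `sample` with the lines read from your log file.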
Limitations
Be aware of the following limitations:
- Neither the flatten option nor the filter option checks the validity of the JSON object.
- In some cases, log data can include partial JSON. In this case, the filter option extracts the message that precedes the first opening brace of the partial JSON object.
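Both limitations can be illustrated with a minimal Python sketch. It assumes that "partial JSON" means an opening brace with no matching closing brace, and the helper names `is_valid_json` and `filter_partial` are hypothetical. The embedded object from the example log line is not strictly valid JSON, because its keys are unquoted, yet the options still act on it.

```python
import json


def is_valid_json(fragment: str) -> bool:
    """Return True only if the fragment parses as strict JSON."""
    try:
        json.loads(fragment)
        return True
    except json.JSONDecodeError:
        return False


def filter_partial(line: str) -> str:
    """Keep only the text before the first opening brace (hypothetical helper)."""
    head, brace, _ = line.partition("{")
    return head.rstrip() if brace else line


# Unquoted keys make this invalid as strict JSON, but it is still treated as embedded JSON.
embedded = '{ name: "PyMongo", version: "3.8.0" }'
print(is_valid_json(embedded))

# A truncated log line with an opening brace and no closing brace.
partial = 'received client metadata conn5083706: { name: "PyMongo", version: "3.8'
print(filter_partial(partial))
```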
Selecting an option
To select a JSON processing option, see Adding an integration.