Mapping log data from incoming sources

With any integration, you can specify your own mappings to the standard IBM Cloud Pak for AIOps values to optimize search performance for any data type.

For each type of integration (Custom, ELK, Falcon LogScale, Kafka, Mezmo, and Splunk), you can edit which fields map to the standard values. By default, each integration type (including custom integrations) uses a default field mapping based on what the source data typically looks like. When you map source fields, you replace or supplement those default values with fields from your own logs.

Mapping log anomaly data

The Log Anomaly service detects abnormal activity in system logs, which helps Site Reliability Engineers (SREs) react to issues more effectively. Two types of anomalies are considered: quantitative anomalies (where constant linear or quantitative relationships are broken) and sequential anomalies (where a log sequence deviates from the normal patterns of program flow).
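For intuition only, the following toy Python sketch (an illustration, not the Log Anomaly service's actual algorithm) flags a quantitative anomaly when a normally constant count relationship between two log templates breaks; a sequential anomaly would instead flag an ordering of templates that deviates from learned program flow.

# Toy illustration only; not the product's detection algorithm.
# Assume the logs normally show one "end connection" template per
# "connection accepted" template within a time window.
EXPECTED_RATIO = 1.0

def quantitative_anomaly(accepted: int, ended: int) -> bool:
    if accepted == 0:
        return ended > 0
    return abs(ended / accepted - EXPECTED_RATIO) > 0.5  # illustrative threshold

print(quantitative_anomaly(2, 2))  # False: the usual 1:1 relationship holds
print(quantitative_anomaly(4, 0))  # True: connections opened but never closed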

IBM Cloud Pak for AIOps ingests logs from standard or custom data sources. Raw logs from supported standard data sources are transformed to a normalized format, as shown in the following example. Data from custom data sources must first be transformed into the normalized format before it can be used in an integration (or, eventually, for AI model training).

Log anomaly raw data

To maintain the structure of the log anomaly data in the Elastic database, you must transform raw log anomaly data into a normalized format. The following examples illustrate raw JSON log anomaly data that comes from the ELK Stack, Falcon LogScale, and Splunk:

ELK Stack raw data example

{"_index":"syslog-console-example.com-20201018","_type":"_doc","_id":"t1IaPXUBuJIWpBPwb1fG","_score":null,"_source":{"MESSAGEID":"DSNJ001I","sysplexName":"MCMPLEX1","JOBNAME":"DC33MSTR","TEXT":"DSNJ001I  -DC11 DSNJW307 CURRENT COPY 2 ACTIVE LOG\nDATA SET IS DSNAME=DSN11.LOGCOPY2.DS02,\nSTARTRBA=00000000000C20358000,ENDRBA=00000000000C2078FFFF\"\n","ROUTECODE":"80000000000000000000000000000000","SYSPLEX_SYSTEM_JOB":"MCMPLEX1.ABCLP20.DC33MSTR","SYSPLEX_SYSTEM_CICS":"MCMPLEX1.ABCLP20","path":"SYSLOG","rcd":"MC","ASID":"0043","systemName":"ABCLP20","JOBNUM":"STC02552","TIMESTAMP":"20292 15.05.38.160 -0400","host":"ABCLP20.svl.ibm.com","@version":"1","FLAGS":"00","seq":{"c":"14469","w":"4"},"DESCRIPTOR":"00000090","timeZone":"-0400","message":"MC,0043,20292 15.05.38.160 -0400,ABCLP20 ,STC02552,        ,80000000000000000000000000000000,00000090,DC33MSTR,00,\" DSNJ001I  -DC11 DSNJW307 CURRENT COPY 2 ACTIVE LOG\n DATA SET IS DSNAME=DSN11.LOGCOPY2.DS02,\n STARTRBA=00000000000C20358000,ENDRBA=00000000000C2078FFFF\"\n","inputsequence":"20201018190543703:000000","tags":"pipeline","SMFID":"ABCLP20 ","@timestamp":"2020-10-18T15:05:38.160-04:00","port":6911,"CONSOLE":"        ","sourceType":"syslog-console","SYSPLEX_SYSTEM_DB2":"MCMPLEX1.ABCLP20","sourceName":"ABCLP20-SYSLOG","SYSPLEX_SYSTEM":"MCMPLEX1.ABCLP20"},"sort":[1603047938160]}

Falcon LogScale (previously named Humio) raw data example

{"kubernetes.namespace_name":"myhumio","@timestamp":1603445464206,"kubernetes.host":"worker.host.example.com","kubernetes.container_hash":"docker.io/value/enterprise-kafka@sha256:bb3d4572b51490b7e9fc2431125ae963bf7cdc4cd30ff4575b1e529e3e53a1e6","kubernetes.docker_id":"39e5cba870732269940daa8378ee2eab52adffa93dd6c5ccf15246e70b115584","kubernetes.labels.app":"cp-kafka","kubernetes.container_name":"cp-kafka-broker","kubernetes.container_image":"docker.io/value/enterprise-kafka:5.2.1","kubernetes.labels.controller-revision-hash":"humio-example-kafka-66c4fb4d64","kubernetes.annotations.k8s_v1_cni_cncf_io/networks-status":"[{\n    \"name\": \"openshift-sdn\",\n    \"interface\": \"eth0\",\n    \"ips\": [\n        \"10.254.20.30\"\n    ],\n    \"dns\": {},\n    \"default-route\": [\n        \"10.254.20.1\"\n    ]\n}]","kubernetes.annotations.prometheus_io/scrape":"true","@timezone":"Z","@rawstring":"2020-10-23T09:31:04.206512759+00:00 stdout F [2020-10-23 09:31:04,206] TRACE [Controller id=2] Leader imbalance ratio for broker 2 is 0.0 (kafka.controller.KafkaController)","kubernetes.labels.statefulset_kubernetes_io/pod-name":"humio-example-kafka-2","@id":"yI1o4C0cV5i8oGU4TkcbavzO_587_147_1603445464","#type":"unparsed","kubernetes.annotations.openshift_io/scc":"restricted","kubernetes.labels.release":"humio","#repo":"kubernetes","kubernetes.pod_name":"humio-example-kafka-2","kubernetes.pod_id":"7a4b0b92-1254-4335-ae29-b1f548c8e21d","#humioAutoShard":"1","kubernetes.annotations.prometheus_io/port":"5556"}

Splunk raw data example

{"_bkt":"_internal~53~283A637C-D363-4549-A819-E8175BC73650","_cd":"53:2848711","_indextime":"1603220666","_raw":"127.0.0.1 - splunk-system-user [20/Oct/2020:12:04:26.272 -0700] \"GET /services/saved/searches?count=0 HTTP/1.1\" 200 1861412 - - - 162ms","_serial":"31263","_si":["abc.example.com","_internal"],"_sourcetype":"splunkd_access","_subsecond":".272","_time":"2020-10-20T12:04:26.272-07:00","host":"abc.example.com","index":"_internal","linecount":"1","source":"/root/mj/mjsplunk/splunk/var/log/splunk/splunkd_access.log","sourcetype":"splunkd_access","splunk_server":"abc.example.com"}

Normalized log data example

The following sample illustrates the result of mapping data from an external log source for use with IBM Cloud Pak for AIOps:

{
    "timestamp": 1655991641000,
    "instance_id": "calico-node-jn2d2",
    "application_group_id": "1000",
    "application_id": "1000",
    "features": [],
    "meta_features": [],
    "message": "[64] ipsets.go 254: Resyncing ipsets with dataplane. family=\"inet\"",
    "entities": {
        "pod": "calico-node-jn2d2",
        "cluster": null,
        "container": "calico-node",
        "node": "kube-bmgcm5td0stjujtqv8m0-ai4itsimula-wp16cpu-00002c34"
    },
    "type": "StandardLog"
}
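As an illustrative sketch only (not product code), the following Python shows how a custom integration might transform the Falcon LogScale raw record from the earlier example into this normalized format. The field choices mirror the field mapping example later in this topic; the application_group_id and application_id values are assumed to come from onboarding configuration.

from datetime import datetime, timezone

def normalize(raw: dict) -> dict:
    """Sketch: map a raw Falcon LogScale record to the normalized format."""
    ts_ms = int(raw["@timestamp"])  # already epoch milliseconds in the raw example
    utc = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return {
        "timestamp": ts_ms,
        "utc_timestamp": utc.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3],
        "instance_id": raw["kubernetes.container_name"],
        "application_group_id": "1000",  # assumed onboarding value
        "application_id": "1000",        # assumed onboarding value
        "features": [],
        "meta_features": [],
        "message": raw["@rawstring"],
        "entities": {
            "pod": raw.get("kubernetes.pod_name"),
            "cluster": None,
            "container": raw.get("kubernetes.container_name"),
            "node": raw.get("kubernetes.host"),
        },
        "type": "StandardLog",
    }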

Log anomaly normalized attributes

Required attributes expected as input for the pipeline

timestamp
    Epoch timestamp of the event in the log entry. The epoch time must be in milliseconds (13 digits), not in seconds (10 digits).

utc_timestamp
    The UTC timestamp of the event in the log entry, in yyyy-MM-dd HH:mm:ss.SSS format. The value of this field must be equivalent to the epoch value of the timestamp field.

instance_id
    Logs can be trained based on certain attributes, such as the application to which the log entry belongs, the host from which the log is generated, or the pod from which the log is generated. instance_id is the value of the attribute that is chosen as the key upon which the models are built. The attributes chosen as instance_id are specified in the ingest configuration during onboarding.

application_id
    Application from which the log entry was generated. It can be the same as instance_id if the instance_id is chosen to be the application.

features
    Contains log templates.

message
    Log message for the log entry. It can be an error message, a normal informational message, or a debug message. A good mix of message types yields the best results. For example, if the logs contain only error messages, or no error messages at all, the quality of the anomaly detection suffers.

entities
    Entities are provided for better event attribution. They provide clues to connect information from multiple data sources, such as events, logs, and incidents, and are also used to localize faulty components. Entities can be infrastructure-related attributes like node name, hostname, IP address, pod name, container name, or cluster name. These attributes are specified as a comma-separated list in the configuration during onboarding. The entities attribute must at least contain an empty object: "entities": {}.
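For example, the epoch value 1655991641000 (13 digits, milliseconds) in the timestamp attribute corresponds to a utc_timestamp of 2022-06-23 13:40:41.000. You can verify the equivalence with Python's standard library (a quick sketch for checking only):

from datetime import datetime, timezone

epoch_ms = 1655991641000  # 13 digits: milliseconds, not 10-digit seconds
utc = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
print(utc.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3])  # 2022-06-23 13:40:41.000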
Static attributes

application
    Application that the IBM Cloud Pak for AIOps instance is monitoring. Provided as part of the initial configuration during onboarding.

application_group_id
    Supports multi-tenancy. Provided as part of the initial configuration during onboarding.

type
    Hardcoded type of the log record. In the preceding normalized example, the value is "StandardLog".
Attributes analyzed and populated in the pipeline

meta_features
    Contains windowing details.

Examples

Review the following examples to see how log data is mapped.

Example: Falcon LogScale field mapping

  • message_field maps to @rawstring
  • log_entity_types maps to kubernetes.container_name
  • instance_id_field maps to kubernetes.container_name
  • timestamp_field maps to @timestamp

Figure: Falcon LogScale field mapping example

All of the values in this Falcon LogScale field mapping correspond to entries in the Falcon LogScale raw data example shown earlier in this topic.
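Put together, the mapping might look similar to the following JSON; the exact structure depends on your integration type, and the rolling_time value here is illustrative (see the field descriptions later in this section):

{
    "codec": "humio",
    "message_field": "@rawstring",
    "log_entity_types": "kubernetes.container_name",
    "instance_id_field": "kubernetes.container_name",
    "timestamp_field": "@timestamp",
    "rolling_time": 10
}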

Important: You must specify the corresponding codec in your field mapping when you create your integration. If you do not, the mapping fails. In the preceding example, the mapping specifies humio as the codec value.

You can define mappings when you create your integration. The basic building blocks of mappings are similar for most integration types. For logs, the codec, rolling_time, instance_id_field, log_entity_types, message_field, and resource_id mappings typically correspond to values that you might find in your log data. The following table describes typical mapping fields for logs:

Typical mapping values for log integrations

codec
    The type of integration for your incoming data. You must specify the codec value for non-log integrations too. For example, you must specify humio for a Falcon LogScale integration mapping, or servicenow for a ServiceNow integration.

rolling_time
    The recurring time interval to capture data from, in seconds. The default value is 10 seconds.

instance_id_field
    The field name in the incoming data that contains the application name. This value is used to identify where your log data is coming from. If you choose an overly specific value, you might flood the pipeline with irrelevant log data. Ideal candidates for this field include kubernetes.namespace_name, kubernetes.host, and kubernetes.container_name.

log_entity_types
    The field name in the incoming data that contains the entity to extract and analyze. The value provides information about the app or service that the log data comes from. Ideal candidates for this field include kubernetes.namespace_name, kubernetes.host, and kubernetes.container_name.

message_field
    The field name in the incoming data that contains the log message. For example, @rawstring.

resource_id
    The field name in the incoming data that contains the specific component that produces the log message. For example, if you deploy your application through a Kubernetes deployment with four replicas, then the resource_id field contains the specific pod name that produces the log message. This information helps to narrow down the component that must be analyzed when an anomaly is detected in your application.

Important: Be mindful when you select mapping values for your instance_id_field and log_entity_types fields. instance_id_field identifies the app or service that your messages come from; it acts almost like the application name in that respect. log_entity_types provides information about the app or service, like the name of the pod, the image used in the pod, the node of the pod, and more. If you choose values that are too specific, you might gather too much information from irrelevant pods or projects (namespaces).
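As a toy illustration of why specificity matters (assumed sample values, not product behavior), keying instance_id on the pod name splits one application's logs across replicas, while the container name keeps them together:

# Toy illustration: how the instance_id_field choice changes grouping.
# Pod names carry per-replica suffixes, so keying on them fragments one
# application's logs into many instance IDs.
records = [
    {"kubernetes.container_name": "cp-kafka-broker",
     "kubernetes.pod_name": "humio-example-kafka-1"},
    {"kubernetes.container_name": "cp-kafka-broker",
     "kubernetes.pod_name": "humio-example-kafka-2"},
]
print(len({r["kubernetes.container_name"] for r in records}))  # 1 instance ID
print(len({r["kubernetes.pod_name"] for r in records}))        # 2 instance IDs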

Example: Log anomaly output schema

{
    "application_group_id": "1",
    "application_id": "1",
    "timestamp":1655991641000,
    "utc_timestamp":"2022-06-23 13:40:41.000"
    "type": "log-anomaly-detection",
    "alert": {
      "alert_id": "6751a428-5827-11ea-9f3a-c4b301ca8457",
      "title": "NETWORK <*> end connection <*> (<*> <*> now open) NETWORK [listener] connection accepted from <*> #<*> (<*> connections now open)",
      "text": "NETWORK <*> end connection <*> (<*> <*> now open) NETWORK [listener] connection accepted from <*> #<*> (<*> connections now open)",
      "created_at_utc_timestamp": "2020-02-25T23:34:44.874301",
      "severity": 3,
      "source": {
        "source_name": "anomaly-detector",
        "source_url": "",
        "source_alert_id": "6751a428-5827-11ea-9f3a-c4b301ca8457",
        "source_team": {
          "id": "",
          "name": "",
          "web_url": ""
        },
        "source_application": {
          "id": "ts-assurance-mongo",
          "name": "ts-assurance-mongo",
          "web_url": ""
        }
      },
      "features": [
        {
          "id": "log_anomaly_data",
          "value": {
            "id": "6751a428-5827-11ea-9f3a-c4b301ca8457",
            "start_timestamp": 1578669070000,
            "end_timestamp": 1578669080000,
            "log_anomaly_data": {
              "detected_at": 1582702484874.0,
              "source_application_id": "ts-assurance-mongo",
              "log_anomaly_confidence": 1.0,
              "template_sequence": [
                "NETWORK [listener] connection accepted from <*> #<*> (<*> connections now open)",
                "NETWORK [listener] connection accepted from <*> #<*> (<*> connections now open)",
                "NETWORK <*> end connection <*> (<*> <*> now open)",
                "NETWORK <*> end connection <*> (<*> <*> now open)"
              ],
              "template_list": [
                "[ENTSV:ERRR] <*> <*> <*> <*> <*>",
                "Open Entities Serve Exception received. Max retries: <*>, Current retry <*>",
                "Exception during getNLQuestionClasses() call",
                "Exception invoking mmesh.EntityPredictionService/predictTokenizedDocument method for model ids <*>",
                "Error invoking mmesh.EntityPredictionService/predictTokenizedDocument method on model <*> INTERNAL: <ENT02360739W> Exception caught: Authentication failed.",
                "applyModel",
                "Error on query to NLU engine.",
                "Error occurred while processing a message v2 request"
              ],
              "count_vector": [
                2,
                2,
                0,
                0
              ]
            },
            "entity": {
              "pod": [
                {
                  "name": "ts-assurance-mongo-54b557bfbc-7txwh",
                  "counts": 4
                }
              ],
              "cluster": [
                {
                  "name": "null",
                  "counts": 4
                }
              ],
              "container": [
                {
                  "name": "ts-assurance-mongo",
                  "counts": 4
                }
              ],
              "node": [
                {
                  "name": "kube-bmgcm5td0stjujtqv8m0-ai4itsimula-wp16cpu-00002d54",
                  "counts": 4
                }
              ]
            }
          }
        }
      ],
      "resolved_at_utc_timestamp": null
    },
    "meta_features": [
      {}
    ]
  }

Example: Normalized output schema

The preceding example adheres to the normalized output schema for IBM Cloud Pak for AIOps, which can resemble the following schema.

{
    "$id": "incidents-schema.json",
    "$schema": "http://json-schema.org/draft-06/schema",
    "title": "incidents",
    "type": "object",
    "properties": {
        "application_group_id": {
            "description": "Unique identifier for the tenant which owns the application being monitored (defined by IBM Cloud Pak for AIOps)",
            "type": "string"
        },
        "application_id": {
            "description": "Unique identifier of the application being monitored (defined by IBM Cloud Pak for AIOps)",
            "type": "string"
        },
        "incident_id": {
            "description": "The unique id for this incident created by IBM Cloud Pak for AIOps",
            "type": "string"
        },
        "title": {
            "description": "The short title for the given incident (SN short_description)",
            "type": "string"
        },
        "description": {
            "description": "The description for the given incident (SN description)",
            "type": "string"
        },
        "created_at": {
            "description": "The timestamp the incident was created at (SN sys_created_on).",
            "type": "string"
        },
        "updated_at": {
            "description": "The timestamp the incident was last updated (SN sys_updated_on).",
            "type": "string"
        },
        "resolved_at": {
            "description": "The timestamp the incident was resolved (SN resolved_at).",
            "type": "string"
        },
        "closed_at": {
            "description": "The timestamp the incident was closed (SN closed_at).",
            "type": "string"
        },
        "started_at": {
            "description": "The timestamp the disruption (which caused this incident to be created) was started at (SN opened_at).",
            "type": "string"
        },
        "business_duration_ms": {
            "description": "The duration in ms the incident. (SN business_duration)",
            "type": "integer"
        },
        "severity": {
            "description": "The severity, as set by the client, of the incident on a scale from 1 (highest) to 5 (lowest) (SN severity)",
            "type": "integer",
            "enum": [1,2,3,4,5]
        },
        "priority": {
            "description": "The priority, as set by the helpdesk, of the incident on a scale from 1 (highest) to 5 (lowest) (SN priority)",
            "type": "integer",
            "enum": [1,2,3,4,5]
        },
        "impact": {
            "description": "The impact of the incident on a scale from 1 (highest) to 5 (lowest) (SN impact)",
            "type": "integer",
            "enum": [1,2,3,4,5]
        },
        "state": {
            "description": "The current state of the incident (SN state)",
            "type": "string",
            "enum": ["new", "in_progress", "resolved", "closed"]
        },
        "source": {
            "type": "object",
            "description": "The source that the incident was read from",
            "properties": {
                "source_name": {
                    "description": "The display name for the incident system this incident was read from (such as ServiceNow).",
                    "type": "string"
                },
                "source_url": {
                    "description": "The url for the incident system this incident was read from (such as https://....).",
                    "type": "string"
                },
                "source_incident_id": {
                    "description": "The id for this incident in the original source (SN number)",
                    "type": "string"
                },
                "source_application_id": {
                    "description": "Id of the application that had the disruption (SN cmdb_ci)",
                    "type": "string"
                }
            },
            "required": [
                "source_name",
                "source_incident_id"
            ]
        },
        "comments": {
            "description": "The set of comments and work notes for the incident",
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "comment_text": {
                        "description": "The text for the given comment (SN comments_and_work_notes).",
                        "type": "string"
                    }
                },
                "additionalProperties": false,
                "required": [
                    "comment_text"
                ]
            },
            "minItems": 0
        },
        "related_incidents": {
            "description": "The set of incidents that are related (SN parent_incident & child_incidents)",
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "source_incident_id": {
                        "description": "The id for the related incident in the source system.",
                        "type": "string"
                    },
                    "relationship": {
                        "description": "The relationship to this incident, namely parent child or similar",
                        "type": "string",
                        "enum": ["parent","child","similar"]
                    }
                },
                "additionalProperties": false,
                "required": [
                    "comment_text"
                ]
            },
            "minItems": 0
        },
        "resolution": {
            "type": "object",
            "description": "The source that the incident was read from",
            "properties": {
                "rca_id": {
                    "description": "The id of the rca (problem) ticket that was created in response to this incident (SN problem_id).",
                    "type": "string"
                },
                "resolution_summary": {
                    "description": "The summary of what was done to resolve the issue (SN close_notes).",
                    "type": "string"
                }
            },
            "required": [
                "resolution_summary"
            ]
        },
        "features": {
            "description": "The extended set of features related to the incident",
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                },
                "additionalProperties": true,
                "required": [
                ]
            },
            "minItems": 0
        },
        "required": [
            "application_group_id",
            "application_id",
            "incident_id",
            "created_at_utc_timestamp",
            "title",
            "description",
            "severity",
            "state",
            "source"
        ],
        "meta_features": {
            "description": "The extended set of features related to the message",
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                },
                "additionalProperties": true,
                "required": [ ]
            },
            "minItems": 0
        }
    },
    "required": [
        "application_group_id",
        "application_id",
        "incident_id",
        "created_at",
        "title",
        "description",
        "severity",
        "state",
        "source"
    ]
}
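As a sketch of how you might check a record against a schema like this before sending it, the following example assumes the open source Python jsonschema package (an assumption for illustration; it is not part of IBM Cloud Pak for AIOps) and a local copy of the schema saved as incidents-schema.json:

import json
from jsonschema import ValidationError, validate  # pip install jsonschema

with open("incidents-schema.json") as f:
    schema = json.load(f)

incident = {
    "application_group_id": "1",
    "application_id": "1",
    "incident_id": "6751a428-5827-11ea-9f3a-c4b301ca8457",
    "created_at": "2020-02-25T23:34:44.874301",
    "title": "Example incident",
    "description": "Example description for an illustrative incident",
    "severity": 3,
    "state": "new",
    "source": {
        "source_name": "anomaly-detector",
        "source_incident_id": "INC0000001"
    }
}

try:
    validate(instance=incident, schema=schema)
    print("Record is valid")
except ValidationError as err:
    print("Record is invalid:", err.message)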