Mapping event data for a Kafka integration

With a Kafka integration, you can gather data for both logs and events. If you are gathering event data, you can specify your own mappings to the standard IBM Cloud Pak for AIOps values.

By default, the Kafka integration applies a field mapping that is based on what the source data typically looks like. When you map source fields yourself, you replace or supplement those default values with your own.

Important: Values mapped to the alert.type.classification field in the Cloud Pak for AIOps data model are especially significant. Inappropriate values assigned to this field at the time of mapping have a negative impact on several downstream activities, such as Golden Signal classification, incident creation, and recommended policies. The classification should summarize the event type, for example, CPU Utilization, Node Down, or the actual metric name itself. The following example mappings are from an attribute in an Instana event payload:

classification = "Garbage Collection Activity High"
or 
classification = "Abnormally High backend commit duration"
or 
classification = "Container Memory Usage is getting closer to limit"

Sample mappings are available when setting up a generic webhook integration in Cloud Pak for AIOps. Click Load sample mapping on the Configure event mapping screen and select from the available use cases.

Mapping event grouping data

Event grouping receives streams of events from a Kafka integration and continuously groups those events in real time. This grouping process uses techniques such as deduplication and temporal clustering. The event grouping service interacts with other services to further provide localization, average severity, pattern recognition, diagnosis, and explanation.

The event grouping service takes normalized data through the integration from an event data source, such as PagerDuty. That data is then fitted into a single structure that the event grouping service uses to derive new information, such as a topology. Event grouping aims to reduce the number of events that you need to review, so that you can diagnose problems from the relevant events. This grouping increases the efficiency of handling incidents, and associates each event with other information (such as logs or metrics).
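To make the two named techniques concrete, the following Python sketch shows one simple way that deduplication and temporal clustering can work. It is an illustration only and is not the algorithm that the event grouping service uses. The field names dedup_key and timestamp, the function names, and the 300-second gap are assumptions chosen for the example.

```python
from typing import Dict, List

Event = Dict[str, object]


def deduplicate(events: List[Event]) -> List[Event]:
    """Keep only the earliest event for each dedup_key (hypothetical field)."""
    first_seen: Dict[object, Event] = {}
    for event in sorted(events, key=lambda e: e["timestamp"]):
        # setdefault stores the event only if the key has not been seen yet.
        first_seen.setdefault(event["dedup_key"], event)
    return list(first_seen.values())


def cluster_by_time(events: List[Event], max_gap_seconds: float = 300.0) -> List[List[Event]]:
    """Group events whose timestamps are at most max_gap_seconds apart from the previous event."""
    groups: List[List[Event]] = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if groups and event["timestamp"] - groups[-1][-1]["timestamp"] <= max_gap_seconds:
            groups[-1].append(event)  # close enough in time: same group
        else:
            groups.append([event])  # gap too large: start a new group
    return groups
```

Running deduplicate first and then cluster_by_time on the result gives a smaller set of groups to review, which is the general effect that event grouping aims for.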

Event raw data

To maintain the structure of the event data in the Elasticsearch database, you must map and transform raw event data into a normalized format. This normalized data can then be used in AI training.

The following raw JSON data shows example event data from a PagerDuty data source:

{"id": "PBDVRS5", "type": "alert", "summary": "[Kubernetes] Pods Available less than desired", "self": "https://api.pagerduty.com/alerts/PBDVRS5", "html_url": "https://example.pagerduty.com/alerts/PBDVRS5", "created_at": "2020-01-17T19:21:24-05:00", "status": "resolved", "resolved_at": "2020-01-17T19:34:22-05:00", "alert_key": "3014911", "suppressed": false, "service": {"id": "PTOQ55S", "type": "service_reference", "summary": "simulation", "self": "https://api.pagerduty.com/services/PTOQ55S", "html_url": "https://example.pagerduty.com/services/PTOQ55S"}, "severity": "critical", "incident": {"id": "P6ZLD21", "type": "incident_reference", "summary": "[#29918643] [Kubernetes] Pods Available less than desired", "self": "https://api.pagerduty.com/incidents/P6ZLD21", "html_url": "https://example.pagerduty.com/incidents/P6ZLD21"}, "first_trigger_log_entry": {"id": "RNSI4KGVTUAA2ZJ8IILWPF52PL", "type": "trigger_log_entry_reference", "summary": "Triggered through the API", "self": "https://api.pagerduty.com/log_entries/RNSI4KGVTUAA2ZJ8IILWPF52PL", "html_url": "https://example.pagerduty.com/alerts/PBDVRS5/log_entries/RNSI4KGVTUAA2ZJ8IILWPF52PL"}, "body": {"contexts": [{"href": "https://cloud-server.monitoring.cloud.example.com:443/#/alerts/343016", "text": "Link to Alert definition that triggered the event", "type": "link"}, {"href": "https://cloud-server.monitoring.cloud.example.com:443/#/events/notifications/l:604800/3014911/details", "text": "Troubleshoot Event", "type": "link"}], "details": {"Value": "0.95; 1.0", "UTC": "2020-01-18T00:19Z", "Timestamp": 1579306740000, "Subject": "[Kubernetes] Pods Available less than desired is Triggered on kubernetes.namespace.name = default and kubernetes.deployment.name = ts-admin-travel-service", "Source": "Sysdig", "Severity": "Low", "Segment": "kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'", "Scope": "kubernetes.deployment.name = 'ts-admin-travel-service' and 
kubernetes.namespace.name = 'default'", "Event ID": 3014911, "Condition": "avg(timeAvg(kubernetes.deployment.replicas.available)) < avg(timeAvg(kubernetes.deployment.replicas.desired))", "Body": "\n\n\nEvent Generated:\n\nSeverity:         Low\n    Metric:\n    kubernetes.deployment.replicas.desired = 1\nSegment:\n    kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'\nScope:\n    Everywhere\n\nTime:             01/18/2020 12:19 AM UTC\nState:            Triggered\nNotification URL: https://cloud-server.monitoring.cloud.example.com:443/api/oauth/openid/example/6f42245bn4ab5gc294c55b5785e74ab9/ba4b7qqa-1fe9-47d1-9355-li97d23a1855?redirectRoute&#x3D;/events/notifications/l:2419200/3014911/details\n\n------\n\nTriggered by Alert:\n\nName:         [Kubernetes] Pods Available less than desired\nDescription:  Not enough pods running for deployment\nTeam:         Monitor Operations\nScope:\n    Everywhere\nSegment by:   kubernetes.namespace.name, kubernetes.deployment.name\nWhen:         avg(timeAvg(kubernetes.deployment.replicas.available)) < avg(timeAvg(kubernetes.deployment.replicas.desired))\nFor at least: 10 min\nAlert URL:    https://cloud-server.monitoring.cloud.example.com:443/api/oauth/openid/example/6f58425be4ll4ab295c55b3238e74ab9/ba5n7gga-1fe9-47d1-9355-lp97d63a1855?redirectRoute&#x3D;/alerts/343016\n\n\n", "Alert name": "[Kubernetes] Pods Available less than desired", "Alert description": "Not enough pods running for deployment", "Alert ID": 343016}, "cef_details": {"client": "Sysdig", "client_url": "https://cloud-server.monitoring.cloud.example.com:443/#/events/notifications/l:604800/3014911/details", "contexts": [{"type": "link", "text": "Link to Alert definition that triggered the event", "href": "https://cloud-server.monitoring.cloud.example.com:443/#/alerts/343016"}, {"type": "link", "text": "Troubleshoot Event", "href": 
"https://cloud-server.monitoring.cloud.example.com:443/#/events/notifications/l:604800/3014911/details"}], "dedup_key": "3014911", "description": "[Kubernetes] Pods Available less than desired", "details": {"Value": "0.95; 1.0", "UTC": "2020-01-18T00:19Z", "Timestamp": 1579306740000, "Subject": "[Kubernetes] Pods Available less than desired is Triggered on kubernetes.namespace.name = default and kubernetes.deployment.name = ts-admin-travel-service", "Source": "Sysdig", "Severity": "Low", "Segment": "kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'", "Scope": "kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'", "Event ID": 3014911, "Condition": "avg(timeAvg(kubernetes.deployment.replicas.available)) < avg(timeAvg(kubernetes.deployment.replicas.desired))", "Body": "\n\n\nEvent Generated:\n\nSeverity:         Low\n    Metric:\n    kubernetes.deployment.replicas.desired = 1\nSegment:\n    kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'\nScope:\n    Everywhere\n\nTime:             01/18/2020 12:19 AM UTC\nState:            Triggered\nNotification URL: https://cloud-server.monitoring.cloud.example.com:443/api/oauth/openid/example/5gabc35be4ab4cc294c55b5785e74ab9/ba4b7nna-1fe9-47d1-9355-fm65d63a1855?redirectRoute&#x3D;/events/notifications/l:2419200/3014911/details\n\n------\n\nTriggered by Alert:\n\nName:         [Kubernetes] Pods Available less than desired\nDescription:  Not enough pods running for deployment\nTeam:         Monitor Operations\nScope:\n    Everywhere\nSegment by:   kubernetes.namespace.name, kubernetes.deployment.name\nWhen:         avg(timeAvg(kubernetes.deployment.replicas.available)) < avg(timeAvg(kubernetes.deployment.replicas.desired))\nFor at least: 10 min\nAlert URL:    
https://cloud-server.monitoring.cloud.example.com:443/api/oauth/openid/example/6f86155be4ab4cc294c55b5785e74ab9/ba4b7bba-1fe9-47d1-9355-ba97d63a1855?redirectRoute&#x3D;/alerts/343016\n\n\n", "Alert name": "[Kubernetes] Pods Available less than desired", "Alert description": "Not enough pods running for deployment", "Alert ID": 343016}, "message": "[Kubernetes] Pods Available less than desired", "mutations": [], "version": "1.0"}, "type": "alert_body"}, "integration": {"id": "PKKYNGU", "type": "generic_events_api_inbound_integration_reference", "summary": "Sysdig-ai4it", "self": "https://api.pagerduty.com/services/PTOQ55S/integrations/PKKYNGU", "html_url": "https://example.pagerduty.com/services/PTOQ55S/integrations/PKKYNGU"}, "privilege": null, "team": {"id": "PWY2HS1", "type": "team", "summary": "Project Zeno E2E", "self": "https://api.pagerduty.com/teams/PWY2HS1", "html_url": "https://example.pagerduty.com/teams/PWY2HS1"}}

Description of PagerDuty normalized attributes

The following table provides details about the attributes that are expected in the normalized data that results from mapping and transforming the raw PagerDuty data.

Normalized attributes for PagerDuty event data

| PagerDuty attribute | Normalized attribute | Description |
| --- | --- | --- |
| created_at | timestamp | Epoch timestamp of the event in the log entry |
| created_at | utc_timestamp | Timestamp of the event in the log entry, in the format "yyyy-mm-ddTHH:MM:SSZ" |
| summary | alert.title | Title of the entry created |
| summary | alert.text | Describes the condition and the affected managed object instance |
| created_at | alert.created_at_utc_timestamp | Time the entry was created |
| resolved_at | alert.resolved_at_utc_timestamp | Time the entry was resolved |
| severity | alert.severity | Indicates the severity level from 1 (indeterminate) to 5 (critical) |
| html_url | alert.source.source_url | URL of the event in PagerDuty |
| id | alert.source.source_alert_id | Internal PagerDuty ID assigned to the event |
| team.id | alert.source.source_team.id | Internal PagerDuty ID associated with the team that is assigned to the event. Team data is added to the original PagerDuty payload with an extra API call. |
| team.summary | alert.source.source_team.name | Name of the team that is assigned to the event |
| team.html_url | alert.source.source_team.web_url | URL of the team in PagerDuty |
| service.id | alert.source.source_application.id | Internal PagerDuty ID associated with the service that is assigned to the event |
| service.summary | alert.source.source_application.name | Name of the service that is assigned to the event |
| service.html_url | alert.source.source_application.web_url | URL of the service in PagerDuty |
| incident.id | alert.features.<id:pagerduty_incident>.value.id | Internal PagerDuty ID associated with the incident that is assigned to the event. features is an array; incident.id goes into the value.id field of the object with id:pagerduty_incident. |
| incident.summary | alert.features.<id:pagerduty_incident>.value.name | Name of the incident that is assigned to the event |
| incident.html_url | alert.features.<id:pagerduty_incident>.value.web_url | URL of the incident in PagerDuty |
| <entire raw PagerDuty JSON> | alert.features.<id:root>.value | The entire raw PagerDuty JSON goes into the value field of the object with id:root |
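The attribute mapping described above can be sketched as a small Python function. This is a minimal illustration under stated assumptions, not the transformation that the integration itself performs. The function name normalize_pagerduty_alert and the SEVERITY_LEVELS lookup are hypothetical. The source gives only the endpoints of the severity scale (1 is indeterminate, 5 is critical), so the intermediate values are assumptions. The sketch also assumes that the incident fields come from the incident object in the raw payload, as the sample payload suggests.

```python
from datetime import datetime, timezone

# Hypothetical lookup: only 1 (indeterminate) and 5 (critical) come from the
# documented scale; the values for info, warning, and error are assumptions.
SEVERITY_LEVELS = {"info": 2, "warning": 3, "error": 4, "critical": 5}


def normalize_pagerduty_alert(raw: dict) -> dict:
    """Map a raw PagerDuty alert (dict) to the normalized attribute layout."""
    # created_at carries a UTC offset, for example 2020-01-17T19:21:24-05:00.
    created = datetime.fromisoformat(raw["created_at"])
    team = raw.get("team") or {}
    service = raw.get("service") or {}
    incident = raw.get("incident") or {}
    return {
        "timestamp": created.timestamp(),  # epoch seconds
        "utc_timestamp": created.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "type": "alert",
        "alert": {
            "title": raw.get("summary"),
            "text": raw.get("summary"),
            "created_at_utc_timestamp": raw.get("created_at"),
            "resolved_at_utc_timestamp": raw.get("resolved_at"),
            # Unknown or missing severities fall back to 1 (indeterminate).
            "severity": SEVERITY_LEVELS.get(raw.get("severity"), 1),
            "source": {
                "source_name": "PagerDuty",
                "source_url": raw.get("html_url"),
                "source_alert_id": raw.get("id"),
                "source_team": {
                    "id": team.get("id"),
                    "name": team.get("summary"),
                    "web_url": team.get("html_url"),
                },
                "source_application": {
                    "id": service.get("id"),
                    "name": service.get("summary"),
                    "web_url": service.get("html_url"),
                },
            },
            "features": [
                {
                    "id": "pagerduty_incident",
                    "value": {
                        "id": incident.get("id"),
                        "name": incident.get("summary"),
                        "web_url": incident.get("html_url"),
                    },
                },
                # The entire raw payload is preserved under the root feature.
                {"id": "root", "value": raw},
            ],
        },
    }
```

Keeping the raw payload under the root feature means that no source information is lost, even for fields that have no normalized equivalent.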

Example of normalized PagerDuty event

{
  "application_group_id": "123abc",
  "application_id": "123abc",
  "timestamp": 1581531561.4586372,
  "utc_timestamp": "2020-02-12 18:19:21.458637",
  "type": "alert",
  "alert": {
    "alert_id": "7f83fb17-5a02-8c86-020d23cfe01c",
    "title": "A failure occurred invoking the provision callback on conversation",
    "text": "A failure occurred invoking the provision callback on conversation",
    "created_at_utc_timestamp": "2019-12-09T11:20:46-05:00",
    "resolved_at_utc_timestamp": null,
    "severity": 4,
    "source": {
      "source_name": "PagerDuty",
      "source_url": "https://<pagerdutyserver.com>/alerts/PBCAOOO",
      "source_alert_id": "PBCAOOO",
      "source_team": {
        "id": "P8FWLZZ",
        "name": "SRE - Assistant",
        "web_url": "https://<pagerdutyserver.com>/teams/P8FWLZZ"
      },
      "source_application": {
        "id": "P67C6KK",
        "name": "FirstTest",
        "web_url": "https://<pagerdutyserver.com>/services/P67C6KK"
      }
    },
    "features": [
      {
        "id": "pagerduty_incident",
        "value": {
          "id": "PGAC9NN",
          "name": "[#291477] A failure occurred invoking the provision callback on conversation",
          "web_url": "https://<pagerdutyserver.com>/incidents/PGAC9NN"
        }
      },
      {
        "id": "root",
        "value": {
          "id": "PBCAOOO",
          "type": "alert",
          "summary": "A failure occurred invoking the provision callback on conversation",
          "self": "https://<pagerdutyapi>/alerts/PBCAOOO",
          "html_url": "https://<pagerdutyserver.com>/alerts/PBCAOOO",
          "created_at": "2019-12-09T11:20:46-05:00",
          "status": "triggered",
          "resolved_at": null,
          "alert_key": "mni-dedup-key-2-2019-12-09-17:20:34",
          "suppressed": false,
          "service": {
            "id": "P67C6KK",
            "type": "service_reference",
            "summary": "FirstTest",
            "self": "https://<pagerdutyapi>/services/P67C6KK",
            "html_url": "https://<pagerdutyserver.com>/services/P67C6KK"
          },
          "severity": "info",
          "incident": {
            "id": "PGAC9NN",
            "type": "incident_reference",
            "summary": "[#29147766] A failure occurred invoking the provision callback on conversation",
            "self": "https://<pagerdutyapi>/incidents/PGAC9NN",
            "html_url": "https://<pagerdutyserver.com>/incidents/PGAC9NN"
          },
          "first_trigger_log_entry": {
            "id": "R4Y7P9Y8HTZ9VRPJ",
            "type": "trigger_log_entry_reference",
            "summary": "Triggered through the API",
            "self": "https://<pagerdutyapi>/log_entries/R4Y7P9Y8HTZ9VRPJ",
            "html_url": "https://<pagerdutyserver.com>/alerts/PBCAOOO/log_entries/R4Y7P9Y8HTZ9VRPJ"
          },
          "body": {
            "contexts": [],
            "details": null,
            "cef_details": {
              "client": "curl",
              "contexts": [],
              "dedup_key": "mni-dedup-key-2-2019-12-09-17:20:34",
              "description": "A failure occurred invoking the provision callback on conversation",
              "event_class": "Test",
              "message": "A failure occurred invoking the provision callback on conversation",
              "mutations": [],
              "service_group": "app-testing",
              "severity": "info",
              "source_component": "manual-test",
              "source_origin": "000.00.000.000",
              "version": "1.0"
            },
            "type": "alert_body"
          },
          "integration": {
            "id": "P02BNYY",
            "type": "events_api_v2_inbound_integration_reference",
            "summary": "AppCluster",
            "self": "https://<pagerdutyapi>/services/P67C6GI/integrations/P02BNYY",
            "html_url": "https://<pagerdutyserver.com>/services/P67C6GI/integrations/P02BNYY"
          },
          "privilege": null,
          "team": {
            "id": "P8FWLZZ",
            "type": "team",
            "summary": "SRE - Assistant",
            "self": "https://<pagerdutyapi>/teams/P8FWLZZ",
            "html_url": "https://<pagerdutyserver.com>/teams/P8FWLZZ"
          }
        }
      }
    ]
  },
  "meta_features": []
}
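Because features is an array of objects that are identified by their id field, reading a value back out requires a lookup by id rather than by position. The following Python sketch shows one way to do that. The helper name find_feature is hypothetical and is not part of any Cloud Pak for AIOps API.

```python
from typing import Any, Optional


def find_feature(event: dict, feature_id: str) -> Optional[Any]:
    """Return the value of the feature with the given id, or None if it is absent."""
    for feature in event.get("alert", {}).get("features", []):
        if feature.get("id") == feature_id:
            return feature.get("value")
    return None
```

For the normalized event shown above, find_feature(event, "pagerduty_incident") would return the object that holds the incident id, name, and web_url, and find_feature(event, "root") would return the original PagerDuty payload.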