Mapping event data for a Kafka integration
With a Kafka integration, you can gather both log data and event data. If you are gathering event data, you can specify your own mappings to the standard IBM Cloud Pak for AIOps values.
By default, the Kafka integration uses a field mapping that is based on what the source data typically looks like. When you map source fields, you replace or supplement those default values with other fields from your source data.
Important: Values mapped to the alert.type.classification field in the Cloud Pak for AIOps data model are especially significant. Inappropriate values assigned to this field at the time of mapping have a negative impact on several downstream activities, such as Golden Signal classification, incident creation, and recommended policies. The classification should summarize the event type, for example, CPU Utilization, Node Down, or the actual metric name itself. The following example mappings are from an attribute in an Instana event payload:
classification = "Garbage Collection Activity High"
or
classification = "Abnormally High backend commit duration"
or
classification = "Container Memory Usage is getting closer to limit"
Sample mappings are available when setting up a generic webhook integration in Cloud Pak for AIOps. Click Load sample mapping on the Configure event mapping screen and select from the available use cases.
Mapping event grouping data
Event grouping receives streams of events from a Kafka integration and continuously groups those events in real time. The grouping process uses techniques such as deduplication and temporal clustering. The event grouping service also interacts with other services to provide localization, average severity, pattern recognition, diagnosis, and explanation.
The event grouping service takes normalized data through the integration from an event data source, such as PagerDuty. That data is then fitted into a single structure that the event grouping service uses to derive new information, such as a topology. Event grouping aims to reduce the number of events so that you can diagnose problems by looking only at the relevant events. This grouping makes incident handling more efficient and associates each event with other information, such as logs or metrics.
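As a rough illustration of two of the techniques named above, the following Python sketch deduplicates events by a key and then clusters the remaining events by time. It is not the algorithm that the event grouping service uses. The field names `dedup_key` and `timestamp` and the 300-second window are assumptions made for this example only.

```python
def deduplicate(events, key="dedup_key"):
    """Keep only the first event seen for each deduplication key."""
    unique = {}
    for event in events:
        # setdefault stores the event only if the key has not been seen yet
        unique.setdefault(event[key], event)
    return list(unique.values())


def cluster_by_time(events, window_seconds=300):
    """Group events whose timestamp is within window_seconds of the previous event."""
    groups = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if groups and event["timestamp"] - groups[-1][-1]["timestamp"] <= window_seconds:
            groups[-1].append(event)  # close enough in time: same group
        else:
            groups.append([event])  # gap too large: start a new group
    return groups


# Hypothetical events; timestamps are seconds since the epoch.
events = [
    {"dedup_key": "a", "timestamp": 0},
    {"dedup_key": "a", "timestamp": 10},
    {"dedup_key": "b", "timestamp": 100},
    {"dedup_key": "c", "timestamp": 1000},
]
groups = cluster_by_time(deduplicate(events))
```

In this sketch, the repeated event with key `a` is dropped, and the remaining three events fall into two groups because the last event arrives more than 300 seconds after the one before it.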
Event raw data
To maintain the structure of the event data in the Elasticsearch database, you must map and transform raw event data into a normalized format. This normalized data can then be used in AI training.
The following raw JSON data shows example event data from a PagerDuty data source:
{"id": "PBDVRS5", "type": "alert", "summary": "[Kubernetes] Pods Available less than desired", "self": "https://api.pagerduty.com/alerts/PBDVRS5", "html_url": "https://example.pagerduty.com/alerts/PBDVRS5", "created_at": "2020-01-17T19:21:24-05:00", "status": "resolved", "resolved_at": "2020-01-17T19:34:22-05:00", "alert_key": "3014911", "suppressed": false, "service": {"id": "PTOQ55S", "type": "service_reference", "summary": "simulation", "self": "https://api.pagerduty.com/services/PTOQ55S", "html_url": "https://example.pagerduty.com/services/PTOQ55S"}, "severity": "critical", "incident": {"id": "P6ZLD21", "type": "incident_reference", "summary": "[#29918643] [Kubernetes] Pods Available less than desired", "self": "https://api.pagerduty.com/incidents/P6ZLD21", "html_url": "https://example.pagerduty.com/incidents/P6ZLD21"}, "first_trigger_log_entry": {"id": "RNSI4KGVTUAA2ZJ8IILWPF52PL", "type": "trigger_log_entry_reference", "summary": "Triggered through the API", "self": "https://api.pagerduty.com/log_entries/RNSI4KGVTUAA2ZJ8IILWPF52PL", "html_url": "https://example.pagerduty.com/alerts/PBDVRS5/log_entries/RNSI4KGVTUAA2ZJ8IILWPF52PL"}, "body": {"contexts": [{"href": "https://cloud-server.monitoring.cloud.example.com:443/#/alerts/343016", "text": "Link to Alert definition that triggered the event", "type": "link"}, {"href": "https://cloud-server.monitoring.cloud.example.com:443/#/events/notifications/l:604800/3014911/details", "text": "Troubleshoot Event", "type": "link"}], "details": {"Value": "0.95; 1.0", "UTC": "2020-01-18T00:19Z", "Timestamp": 1579306740000, "Subject": "[Kubernetes] Pods Available less than desired is Triggered on kubernetes.namespace.name = default and kubernetes.deployment.name = ts-admin-travel-service", "Source": "Sysdig", "Severity": "Low", "Segment": "kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'", "Scope": "kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'", "Event ID": 3014911, "Condition": "avg(timeAvg(kubernetes.deployment.replicas.available)) < avg(timeAvg(kubernetes.deployment.replicas.desired))", "Body": "\n\n\nEvent Generated:\n\nSeverity: Low\n Metric:\n kubernetes.deployment.replicas.desired = 1\nSegment:\n kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'\nScope:\n Everywhere\n\nTime: 01/18/2020 12:19 AM UTC\nState: Triggered\nNotification URL: https://cloud-server.monitoring.cloud.example.com:443/api/oauth/openid/example/6f42245bn4ab5gc294c55b5785e74ab9/ba4b7qqa-1fe9-47d1-9355-li97d23a1855?redirectRoute=/events/notifications/l:2419200/3014911/details\n\n------\n\nTriggered by Alert:\n\nName: [Kubernetes] Pods Available less than desired\nDescription: Not enough pods running for deployment\nTeam: Monitor Operations\nScope:\n Everywhere\nSegment by: kubernetes.namespace.name, kubernetes.deployment.name\nWhen: avg(timeAvg(kubernetes.deployment.replicas.available)) < avg(timeAvg(kubernetes.deployment.replicas.desired))\nFor at least: 10 min\nAlert URL: https://cloud-server.monitoring.cloud.example.com:443/api/oauth/openid/example/6f58425be4ll4ab295c55b3238e74ab9/ba5n7gga-1fe9-47d1-9355-lp97d63a1855?redirectRoute=/alerts/343016\n\n\n", "Alert name": "[Kubernetes] Pods Available less than desired", "Alert description": "Not enough pods running for deployment", "Alert ID": 343016}, "cef_details": {"client": "Sysdig", "client_url": "https://cloud-server.monitoring.cloud.example.com:443/#/events/notifications/l:604800/3014911/details", "contexts": [{"type": "link", "text": "Link to Alert definition that triggered the event", "href": "https://cloud-server.monitoring.cloud.example.com:443/#/alerts/343016"}, {"type": "link", "text": "Troubleshoot Event", "href": "https://cloud-server.monitoring.cloud.example.com:443/#/events/notifications/l:604800/3014911/details"}], "dedup_key": "3014911", "description": "[Kubernetes] Pods Available less than desired", "details": {"Value": "0.95; 1.0", "UTC": "2020-01-18T00:19Z", "Timestamp": 1579306740000, "Subject": "[Kubernetes] Pods Available less than desired is Triggered on kubernetes.namespace.name = default and kubernetes.deployment.name = ts-admin-travel-service", "Source": "Sysdig", "Severity": "Low", "Segment": "kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'", "Scope": "kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'", "Event ID": 3014911, "Condition": "avg(timeAvg(kubernetes.deployment.replicas.available)) < avg(timeAvg(kubernetes.deployment.replicas.desired))", "Body": "\n\n\nEvent Generated:\n\nSeverity: Low\n Metric:\n kubernetes.deployment.replicas.desired = 1\nSegment:\n kubernetes.deployment.name = 'ts-admin-travel-service' and kubernetes.namespace.name = 'default'\nScope:\n Everywhere\n\nTime: 01/18/2020 12:19 AM UTC\nState: Triggered\nNotification URL: https://cloud-server.monitoring.cloud.example.com:443/api/oauth/openid/example/5gabc35be4ab4cc294c55b5785e74ab9/ba4b7nna-1fe9-47d1-9355-fm65d63a1855?redirectRoute=/events/notifications/l:2419200/3014911/details\n\n------\n\nTriggered by Alert:\n\nName: [Kubernetes] Pods Available less than desired\nDescription: Not enough pods running for deployment\nTeam: Monitor Operations\nScope:\n Everywhere\nSegment by: kubernetes.namespace.name, kubernetes.deployment.name\nWhen: avg(timeAvg(kubernetes.deployment.replicas.available)) < avg(timeAvg(kubernetes.deployment.replicas.desired))\nFor at least: 10 min\nAlert URL: https://cloud-server.monitoring.cloud.example.com:443/api/oauth/openid/example/6f86155be4ab4cc294c55b5785e74ab9/ba4b7bba-1fe9-47d1-9355-ba97d63a1855?redirectRoute=/alerts/343016\n\n\n", "Alert name": "[Kubernetes] Pods Available less than desired", "Alert description": "Not enough pods running for deployment", "Alert ID": 343016}, "message": "[Kubernetes] Pods Available less than desired", "mutations": [], "version": "1.0"}, "type": "alert_body"}, "integration": {"id": "PKKYNGU", "type": "generic_events_api_inbound_integration_reference", "summary": "Sysdig-ai4it", "self": "https://api.pagerduty.com/services/PTOQ55S/integrations/PKKYNGU", "html_url": "https://example.pagerduty.com/services/PTOQ55S/integrations/PKKYNGU"}, "privilege": null, "team": {"id": "PWY2HS1", "type": "team", "summary": "Project Zeno E2E", "self": "https://api.pagerduty.com/teams/PWY2HS1", "html_url": "https://example.pagerduty.com/teams/PWY2HS1"}}
Description of PagerDuty normalized attributes
The following table provides details about the attributes that are expected in the normalized data that results from mapping and transforming the raw PagerDuty data.
PagerDuty attribute | Normalized attribute | Description |
---|---|---|
created_at | timestamp | Epoch timestamp of the event in the log entry |
created_at | utc_timestamp | Timestamp of the event in the log entry, in the format “yyyy-mm-ddTHH:MM:SSZ” |
summary | alert.title | Title of the entry created |
summary | alert.text | Describes the condition and the affected managed object instance |
created_at | alert.created_at_utc_timestamp | Time the entry was created |
resolved_at | alert.resolved_at_utc_timestamp | Time the entry was resolved |
severity | alert.severity | Indicates the severity level from 1 (indeterminate) to 5 (critical) |
html_url | alert.source.source_url | URL of the event in PagerDuty |
id | alert.source.source_alert_id | Internal PagerDuty ID assigned to the event |
team.id | alert.source.source_team.id | Internal PagerDuty ID associated with the team that is assigned to the event. Team data is added to the original PagerDuty payload with an extra API call |
team.summary | alert.source.source_team.name | Name of the team that is assigned to the event |
team.html_url | alert.source.source_team.web_url | URL of the team in PagerDuty |
service.id | alert.source.source_application.id | Internal PagerDuty ID associated with the service that is assigned to the event |
service.summary | alert.source.source_application.name | Name of the service that is assigned to the event |
service.html_url | alert.source.source_application.web_url | URL of the service in PagerDuty |
incident.id | alert.features.<id:pagerduty_incident>.value.id | Internal PagerDuty ID associated with the incident that is assigned to the event. features is an array; incident.id goes into the value.id field of the object with id:pagerduty_incident |
incident.summary | alert.features.<id:pagerduty_incident>.value.name | Name of the incident that is assigned to the event |
incident.html_url | alert.features.<id:pagerduty_incident>.value.web_url | URL of the incident in PagerDuty |
<entire raw PagerDuty JSON> | alert.features.<id:root>.value | The entire raw PagerDuty JSON goes into the value field of the object with id:root |
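The attribute mapping in the preceding table can be sketched as a small Python function. This is a minimal illustration under stated assumptions, not the transformation that the integration actually runs. The function and helper names are hypothetical, and the intermediate severity values in `SEVERITY_MAP` are assumptions: the table only states that the scale runs from 1 (indeterminate) to 5 (critical).

```python
from datetime import datetime, timezone

# Assumed mapping: only the endpoints (1 = indeterminate, 5 = critical)
# come from the table; the values in between are illustrative.
SEVERITY_MAP = {"critical": 5, "error": 4, "warning": 3, "info": 2}


def to_utc(iso_timestamp):
    """Parse an ISO 8601 timestamp that carries a UTC offset and convert it to UTC."""
    return datetime.fromisoformat(iso_timestamp).astimezone(timezone.utc)


def reference(obj):
    """Copy id, summary, and html_url from a PagerDuty reference object."""
    obj = obj or {}
    return {
        "id": obj.get("id"),
        "name": obj.get("summary"),
        "web_url": obj.get("html_url"),
    }


def normalize_pagerduty_alert(raw):
    """Build the normalized attributes listed in the table from a raw alert."""
    created = to_utc(raw["created_at"])
    return {
        "timestamp": created.timestamp(),
        "utc_timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "type": "alert",
        "alert": {
            "title": raw.get("summary"),
            "text": raw.get("summary"),
            "created_at_utc_timestamp": raw.get("created_at"),
            "resolved_at_utc_timestamp": raw.get("resolved_at"),
            # Unknown severities fall back to 1 (indeterminate); an assumption.
            "severity": SEVERITY_MAP.get(raw.get("severity"), 1),
            "source": {
                "source_name": "PagerDuty",
                "source_url": raw.get("html_url"),
                "source_alert_id": raw.get("id"),
                "source_team": reference(raw.get("team")),
                "source_application": reference(raw.get("service")),
            },
            "features": [
                {"id": "pagerduty_incident", "value": reference(raw.get("incident"))},
                {"id": "root", "value": raw},  # the entire raw payload
            ],
        },
    }


# A trimmed version of the raw PagerDuty example shown earlier.
sample = {
    "id": "PBDVRS5",
    "summary": "[Kubernetes] Pods Available less than desired",
    "html_url": "https://example.pagerduty.com/alerts/PBDVRS5",
    "created_at": "2020-01-17T19:21:24-05:00",
    "resolved_at": "2020-01-17T19:34:22-05:00",
    "severity": "critical",
    "service": {"id": "PTOQ55S", "summary": "simulation",
                "html_url": "https://example.pagerduty.com/services/PTOQ55S"},
    "incident": {"id": "P6ZLD21",
                 "summary": "[#29918643] [Kubernetes] Pods Available less than desired",
                 "html_url": "https://example.pagerduty.com/incidents/P6ZLD21"},
    "team": {"id": "PWY2HS1", "summary": "Project Zeno E2E",
             "html_url": "https://example.pagerduty.com/teams/PWY2HS1"},
}
normalized = normalize_pagerduty_alert(sample)
```

The sketch covers only the attributes in the table. Fields such as application_id, application_group_id, and alert_id, which appear in the normalized example, are not derived from the PagerDuty payload and are left out here.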
Example of normalized PagerDuty event
{
  "application_group_id": "123abc",
  "application_id": "123abc",
  "timestamp": 1581531561.4586372,
  "utc_timestamp": "2020-02-12 18:19:21.458637",
  "type": "alert",
  "alert": {
    "alert_id": "7f83fb17-5a02-8c86-020d23cfe01c",
    "title": "A failure occurred invoking the provision callback on conversation",
    "text": "A failure occurred invoking the provision callback on conversation",
    "created_at_utc_timestamp": "2019-12-09T11:20:46-05:00",
    "resolved_at_utc_timestamp": null,
    "severity": 4,
    "source": {
      "source_name": "PagerDuty",
      "source_url": "https://<pagerdutyserver.com>/alerts/PBCAOOO",
      "source_alert_id": "PBCAOOO",
      "source_team": {
        "id": "P8FWLZZ",
        "name": "SRE - Assistant",
        "web_url": "https://<pagerdutyserver.com>/teams/P8FWLZZ"
      },
      "source_application": {
        "id": "P67C6KK",
        "name": "FirstTest",
        "web_url": "https://<pagerdutyserver.com>/services/P67C6KK"
      }
    },
    "features": [
      {
        "id": "pagerduty_incident",
        "value": {
          "id": "PGAC9NN",
          "name": "[#291477] A failure occurred invoking the provision callback on conversation",
          "web_url": "https://<pagerdutyserver.com>/incidents/PGAC9NN"
        }
      },
      {
        "id": "root",
        "value": {
          "id": "PBCAOOO",
          "type": "alert",
          "summary": "A failure occurred invoking the provision callback on conversation",
          "self": "https://<pagerdutyapi>/alerts/PBCAOOO",
          "html_url": "https://<pagerdutyserver.com>/alerts/PBCAOOO",
          "created_at": "2019-12-09T11:20:46-05:00",
          "status": "triggered",
          "resolved_at": null,
          "alert_key": "mni-dedup-key-2-2019-12-09-17:20:34",
          "suppressed": false,
          "service": {
            "id": "P67C6KK",
            "type": "service_reference",
            "summary": "FirstTest",
            "self": "https://<pagerdutyapi>/services/P67C6KK",
            "html_url": "https://<pagerdutyserver.com>/services/P67C6KK"
          },
          "severity": "info",
          "incident": {
            "id": "PGAC9NN",
            "type": "incident_reference",
            "summary": "[#29147766] A failure occurred invoking the provision callback on conversation",
            "self": "https://<pagerdutyapi>/incidents/PGAC9NN",
            "html_url": "https://<pagerdutyserver.com>/incidents/PGAC9NN"
          },
          "first_trigger_log_entry": {
            "id": "R4Y7P9Y8HTZ9VRPJ",
            "type": "trigger_log_entry_reference",
            "summary": "Triggered through the API",
            "self": "https://<pagerdutyapi>/log_entries/R4Y7P9Y8HTZ9VRPJ",
            "html_url": "https://<pagerdutyserver.com>/alerts/PBCAOOO/log_entries/R4Y7P9Y8HTZ9VRPJ"
          },
          "body": {
            "contexts": [],
            "details": null,
            "cef_details": {
              "client": "curl",
              "contexts": [],
              "dedup_key": "mni-dedup-key-2-2019-12-09-17:20:34",
              "description": "A failure occurred invoking the provision callback on conversation",
              "event_class": "Test",
              "message": "A failure occurred invoking the provision callback on conversation",
              "mutations": [],
              "service_group": "app-testing",
              "severity": "info",
              "source_component": "manual-test",
              "source_origin": "000.00.000.000",
              "version": "1.0"
            },
            "type": "alert_body"
          },
          "integration": {
            "id": "P02BNYY",
            "type": "events_api_v2_inbound_integration_reference",
            "summary": "AppCluster",
            "self": "https://<pagerdutyapi>/services/P67C6GI/integrations/P02BNYY",
            "html_url": "https://<pagerdutyserver.com>/services/P67C6GI/integrations/P02BNYY"
          },
          "privilege": null,
          "team": {
            "id": "P8FWLZZ",
            "type": "team",
            "summary": "SRE - Assistant",
            "self": "https://<pagerdutyapi>/teams/P8FWLZZ",
            "html_url": "https://<pagerdutyserver.com>/teams/P8FWLZZ"
          }
        }
      }
    ]
  },
  "meta_features": []
}
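Because features is an array of objects that are identified by their id field, reading a value back out of a normalized event means searching the array rather than indexing by name. The following Python sketch shows one way to do that. The helper name `get_feature` is hypothetical and is not part of any Cloud Pak for AIOps API.

```python
def get_feature(event, feature_id):
    """Return the value of the first feature with the given id, or None."""
    for feature in event.get("alert", {}).get("features", []):
        if feature.get("id") == feature_id:
            return feature.get("value")
    return None


# A trimmed version of the normalized event shown above.
event = {
    "alert": {
        "features": [
            {"id": "pagerduty_incident", "value": {"id": "PGAC9NN"}},
            {"id": "root", "value": {"id": "PBCAOOO", "type": "alert"}},
        ]
    }
}
incident = get_feature(event, "pagerduty_incident")
raw_alert = get_feature(event, "root")
```

Here `incident` holds the PagerDuty incident reference and `raw_alert` holds the original raw payload. A feature id that is not present returns None.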