Creating Mezmo integrations
A Mezmo integration provides log data, which is used to establish a baseline of normal behavior and then identify anomalies. These anomalies can be correlated with other alerts and events, and published to your ChatOps interface to help you determine the cause and resolution of a problem.
For more information about working with Mezmo integrations, see the following sections:
- Creating Mezmo integrations
- Enabling Mezmo integrations
- Editing Mezmo integrations
- Deleting Mezmo integrations
For more information about HTTP headers for the various credential types, see HTTP headers for credential types.
Creating Mezmo integrations
The Mezmo integration type collects data from a Mezmo data source.
About this task
Before you create an integration, gather the following information:
- Load: To prevent the integration from placing an inordinate load on your data source and potentially impacting your logging operations, the integration connects to only one API with a default data frequency of 60 seconds. This restriction is controlled by the Sampling rate setting described in the Procedure section.
- Access: Mezmo data sources are cloud-based and use a service key to enable access.
- Data volume: Typical data volumes are as follows:
  - 5.5 MB (15000 lines) for 20 seconds in a development environment.
  - 49 logs (19 KB) in a 10-second interval, pulled within 30 seconds.
  - 10 logs (3.2 KB) in a 10-second interval, pulled within 30 seconds.
  - 195 logs (75 KB) in a 30-second interval, pulled within 60 seconds.
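Service-key access typically means authenticating each export request with that key. The following sketch builds such a request; the endpoint URL and the Basic-auth scheme (service key as user name, empty password) are assumptions for illustration only, so check the Mezmo API documentation for the actual details.

```python
import base64
from datetime import datetime, timezone

# Hypothetical endpoint for illustration; verify against the Mezmo API docs.
EXPORT_URL = "https://api.mezmo.com/v1/export"

def build_export_request(service_key, start, end, query=None):
    """Build the URL and headers for a time-windowed log export request.

    Assumes the service key is passed as the HTTP Basic auth user name
    with an empty password, and that the time window is expressed as
    millisecond epoch timestamps in `from`/`to` query parameters.
    """
    params = {
        "from": str(int(start.replace(tzinfo=timezone.utc).timestamp() * 1000)),
        "to": str(int(end.replace(tzinfo=timezone.utc).timestamp() * 1000)),
    }
    if query:
        params["query"] = query
    token = base64.b64encode(f"{service_key}:".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    qs = "&".join(f"{k}={v}" for k, v in params.items())
    return f"{EXPORT_URL}?{qs}", headers
```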
Procedure
To create a Mezmo integration, step through the following sections:
- Adding a Mezmo integration
- Specifying integration parameters
- Specifying field mapping
- Specifying how log data is collected for AI training
Adding a Mezmo integration
- Log in to the IBM Cloud Pak for AIOps console.
- Expand the navigation menu (four horizontal bars), then click Define > Integrations.
- On the Integrations page, click Add integration.
- From the list of available integrations, find and click the Mezmo tile.
  Note: If you do not immediately see the integration that you want to create, you can filter the tiles by type of integration. Click the type of integration that you want in the Category section.
- On the side panel, review the instructions and, when you are ready to continue, click Get started.
Specifying integration parameters
- On the Add integration page, enter the following integration information:
  - Name: The display name of your integration.
  - Description: An optional description for the integration.
  - URL: The hostname or IP address of the Mezmo API. To find the URL, log in to your Mezmo account or check the Mezmo instance in your IBM Cloud account. For more information, see the Log Analysis documentation about Endpoints.
  - Service key: The Mezmo service key. For more information, see Service keys.
    Figure. Create Mezmo integration
  - Certificate: An optional certificate that is used to verify the SSL/TLS connection to the REST service.
  - Filters: Defines subsets of data that are pulled from Mezmo. For example, `healthcheck (successful OR ping)` returns all of the log lines with the word `healthcheck` and without the word `successful`, and all of the log lines with the word `healthcheck` and with the word `ping`. For more information about Mezmo filtering, see the Search and Filter Log Data documentation.
  - Base parallelism: Select a value to specify the number of Flink jobs that can run in parallel. These jobs run to process and normalize the collected data. The default value is 1. However, a value higher than 1 is recommended so that you can process data in parallel. This value cannot exceed the total number of available free Flink slots. In a small environment, 16 Flink slots are available, while in a large environment, the maximum is 32 slots. If you are collecting historical data with this integration, you can set this value to be equal to the source parallelism.
  - Sampling rate: The rate (in seconds) at which data is pulled from the live source. The default value is 60.
  - JSON processing option: Select a JSON processing option.
    - None: The default option. The JSON is not processed or modified.
    - Flatten: This option flattens the JSON object by removing the opening and closing braces.
    - Filter: This option extracts the JSON object and replaces it with an empty string.
    For more information about the options, see Managing embedded JSON.
- Click Test connection.
  Figure. Test connection
  Note: To improve data throughput, you can increase the base parallelism value incrementally. For more information about maximum base parallelism for starter and production deployments, see Improving data streaming performance for log anomaly detection.
- Click Next to move to the next page.
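The effect of the three JSON processing options can be illustrated with a small sketch. This is not the integration's actual implementation; for simplicity, the regex below only handles a single, non-nested JSON object embedded in a log line.

```python
import re

# Matches one flat (non-nested) JSON object embedded in a log line.
_EMBEDDED_JSON = re.compile(r"\{[^{}]*\}")

def process_line(line, option="None"):
    """Sketch of the JSON processing options described above."""
    if option == "None":
        return line  # leave the embedded JSON untouched
    if option == "Flatten":
        # strip only the opening and closing braces, keep the key/value text
        return _EMBEDDED_JSON.sub(lambda m: m.group(0)[1:-1], line)
    if option == "Filter":
        # replace the embedded JSON object with an empty string
        return _EMBEDDED_JSON.sub("", line)
    raise ValueError(f"unknown option: {option}")
```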
Specifying field mapping
- On the Field mapping page, you can improve search performance by mapping fields from your implementation to standard fields within IBM Cloud Pak for AIOps.
  Figure. Field mapping
  - For more information about how field mappings are defined, see Mapping data from incoming sources.
  - For more information about using mappings to clean your data for use in IBM Cloud Pak for AIOps, see Cleaning mapped data using regular expressions.
The following code snippet displays an example of field mapping by using the supported format. When coding your mapping, use this example to help you.

```json
{
  "codec": "mezmo",
  "rolling_time": 10,
  "instance_id_field": "_app",
  "log_entity_types": "container",
  "message_field": "_line",
  "timestamp_field": "_ts"
}
```
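To illustrate what the mapping expresses, the following sketch projects a raw Mezmo record onto the standard field names. The function name is invented for illustration; the integration applies the mapping internally.

```python
# The mapping from the example above: each *_field entry names the Mezmo
# source field that feeds the corresponding standard field.
MAPPING = {
    "codec": "mezmo",
    "rolling_time": 10,
    "instance_id_field": "_app",
    "log_entity_types": "container",
    "message_field": "_line",
    "timestamp_field": "_ts",
}

def apply_mapping(record, mapping):
    """Project a raw Mezmo record onto standard field names (illustrative)."""
    return {
        "instance_id": record.get(mapping["instance_id_field"]),
        "message": record.get(mapping["message_field"]),
        "timestamp": record.get(mapping["timestamp_field"]),
    }
```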
You can also map from a nested field in the source data to a field in IBM Cloud Pak for AIOps data. For example, in the following code snippet, the line `"message_field": "_line.message",` maps the field `message_field` in IBM Cloud Pak for AIOps data to the nested field `_line.message` in the Mezmo data.

```json
{
  "rolling_time": 10,
  "instance_id_field": "_app",
  "log_entity_types": "container",
  "message_field": "_line.message",
  "timestamp_field": "_ts"
}
```
You can also create an OR relationship between multiple nested fields, where the OR relationship is expressed by using a semicolon. For example, in the following code snippet, the line `"message_field": "_line.message1;_line.message2;_line.message3",` maps the field `message_field` in IBM Cloud Pak for AIOps data to one of three possible nested fields in the Mezmo data: `_line.message1`, `_line.message2`, or `_line.message3`.
- The system first looks for `_line.message1` in the Mezmo data record, and if it finds this field, it uses it for the mapping.
- If it can't find `_line.message1` in the Mezmo data record, it looks for the next item, `_line.message2`, and if it finds it, it uses it for the mapping.
- If it can't find `_line.message2` in the Mezmo data record, it looks for the next item, `_line.message3`, and if it finds it, it uses it for the mapping.

```json
{
  "rolling_time": 10,
  "instance_id_field": "_app",
  "log_entity_types": "container",
  "message_field": "_line.message1;_line.message2;_line.message3",
  "timestamp_field": "_ts"
}
```
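The fallback order described above can be sketched as follows. The helper name is invented for illustration; the integration performs this resolution internally.

```python
def resolve_field(record, spec):
    """Resolve a mapping value like "_line.message1;_line.message2" against
    a raw record: try each semicolon-separated dotted path in order and
    return the value of the first path that is fully present.
    """
    for path in spec.split(";"):
        value = record
        for key in path.split("."):
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                break  # this path is missing a key; try the next one
        else:
            return value  # every key on this path was found
    return None
```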
- Click Next to move to the next page.
Specifying how log data is collected for AI training
- On the AI training and log data page, select how you want to manage collecting data for use in AI training and anomaly detection. Click the Data flow toggle to turn on data flow, and then select how you want to collect data:
  - Live data for continuous AI training and anomaly detection: A continuous collection of data from your integration is used to both train AI models and analyze your data for anomalous behavior.
    Note: After an initial installation, there is no data at all in the system. If you select this option, the two different log anomaly detection algorithms behave in the following ways:
    - Natural language log anomaly detection does not initially detect anomalies because no model has been trained. You can retrieve historical data (select Historical data for initial AI training) to speed up the retrieval of data to train on, or you can leave the Live data for continuous AI training and anomaly detection setting on. In the latter case, the system gathers training data live, and after a few days there is enough data to train a model. When this model is deployed, it detects anomalies as normal.
    - Statistical baseline log anomaly detection does not detect anomalies for the first 30 minutes of data collection because it does not yet have a baseline. After 30 minutes of live data collection, the baseline is automatically created. After that, it detects anomalies on an ongoing basis, while continuing to gather data and improve its model every 30 minutes.
  - Live data for initial AI training: A single set of training data that is used to define your AI model. Data collection takes place over a specified time period that starts when you create your integration.
    Note: Selecting this option causes the system to continue to collect data while the option is enabled; however, the data is collected for training only, and not for log anomaly detection. For more information about AI model training, including minimum and ideal data quantities, see Configuring AI training.
  - Historical data for initial AI training: A single set of training data that is used to define your AI model. You must specify a Start and End date, and the parallelism of your source data. Historical data is harvested from existing logs in your integration over a specified time period in the past.
    - Start date: Select a start date from the calendar.
      Note: The start date must not exceed 31 days from the present, as the maximum time period for historical data collection is 31 days. The recommended time period is two weeks.
    - End date: Select an end date from the calendar.
      Note: If you do not specify the end date, live data collection follows the historical data collection. If you do not want to set an end date, click Remove end date.
    - Source parallelism (1-50): Select a value to specify the number of requests that can run in parallel to collect data from the source. Generally, you can set the value to equal the number of days of data that you want to collect. When you are setting this value, consider the number of requests that the source allows in a minute. For example, if only 1-2 requests are allowed, set the value to be low.
  Figure. AI training
- Important: Keep in mind the following considerations when you select your data collection type:
  - Anomaly detection for your integration occurs only if you select Live data for continuous AI training and anomaly detection.
  - Different types of AI models have different requirements to properly train a model. Make sure that your settings satisfy the minimum data requirements. For more information about how much data you need to train different AI models, see Configuring AI training.
- Click Next.
- On the Resource requirements page, you can review the slot usage for your log integrations to see whether there are enough slots to fully support the integration for multizone high availability.
  Figure. Resource requirements
  If you set the Data collection toggle to On, you see the resource management overview.
  - If your current usage and other usage are less than the provisioned slots, but the HA slots exceed the provisioned slots, you can create the integration, but you see a warning that you do not have enough slots. The integration does not have multizone high availability.
  - If your projected usage exceeds the provisioned slots, you cannot create the integration because you do not have enough slots on your system for log data integrations.
  - If your total slots, including HA slots, are within the provisioned slots, the integration has multizone high availability.
  Note: HA operation assumes high availability for three zones.
  If you set the Data collection toggle to Off, you see a message stating that you need to enable log data collection to see the resource management overview. When data collection is off, no slots are used by that integration.
- Click Save.
You have created a Mezmo integration in your instance. After you create your integration, you must enable the data collection to connect your integration with the AI of IBM Cloud Pak for AIOps. For more information about enabling your integration, see Enabling Mezmo integrations.
To create more integrations (such as a ChatOps integration), see Configuring Integrations.
For more information about working with the insights provided by your integrations, see ChatOps insight management.
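The slot checks described on the Resource requirements page can be summarized in a small sketch. The function and return values are invented for illustration; the console performs this evaluation for you.

```python
def slot_status(current_usage, other_usage, ha_slots, provisioned):
    """Illustrative summary of the resource-requirements checks:
    projected usage is current plus other usage, and HA slots are the
    additional slots needed for multizone (three-zone) high availability.
    """
    projected = current_usage + other_usage
    if projected > provisioned:
        return "blocked"        # not enough slots to create the integration
    if projected + ha_slots > provisioned:
        return "created-no-ha"  # created with a warning, no multizone HA
    return "created-ha"         # enough headroom for multizone HA
```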
Enabling and disabling Mezmo integrations
If you did not enable your data collection during creation, you can enable your integration afterward. You can also disable a previously enabled integration the same way. If you selected Live data for initial AI training when you created your integration, you must disable the integration before AI model training. To enable or disable a created integration, complete the following steps:
- Log in to the IBM Cloud Pak for AIOps console.
- Expand the navigation menu (four horizontal bars), then click Define > Integrations.
- On the Manage integrations tab of the Integrations page, click the Mezmo integration type.
- Click the integration that you want to enable or disable.
- Go to the AI training and log data section. Set Data collection to On or Off to enable or disable data collection. Disabling data collection for an integration does not delete the integration.
You enabled or disabled your integration. For more information about deleting an integration, see Deleting Mezmo integrations.
Editing Mezmo integrations
After you create your integration, you can edit the integration. For example, if you specified Historical data for initial AI training but now want your integration to pull in live data for continuous monitoring, you can edit it. To edit an integration, complete the following steps:
- Log in to the IBM Cloud Pak for AIOps console.
- Expand the navigation menu (four horizontal bars), then click Define > Integrations.
- Click the Mezmo integration type on the Manage integrations tab of the Integrations page.
- On the Mezmo integrations page, click the name of the integration that you want to edit. Alternatively, you can click the options menu (three vertical dots) for the integration and click Edit. The integration configuration opens.
- Edit your integration as required. Click Save when you are done editing.
Your integration is now edited. If your integration was not previously enabled or disabled, you can enable or disable the integration directly from the interface. For more information about enabling and disabling your integration, see Enabling and disabling Mezmo integrations. For more information about deleting an integration, see Deleting Mezmo integrations.
Deleting Mezmo integrations
If you no longer need your Mezmo integration and want to not only disable it, but delete it entirely, you can delete the integration from the console.
Note: You must disable data collection before you delete your integration. For more information about disabling data collection, see Enabling and disabling Mezmo integrations.
To delete an integration, complete the following steps:
- Log in to the IBM Cloud Pak for AIOps console.
- Expand the navigation menu (four horizontal bars), then click Define > Integrations.
- Click the Mezmo integration type on the Manage integrations tab of the Integrations page.
- On the Mezmo integrations page, click the options menu (three vertical dots) for the integration that you want to delete and click Delete.
- Enter the name of the integration to confirm that you want to delete your integration. Then, click Delete.
Your integration is deleted.
Note: If you manually send Mezmo logs from an archive, the model performs as expected only on other logs that are sent from the archive. Mixing archived data and data from the active service instance results in lower quality models.