Creating ELK connections
Elasticsearch, Logstash, and Kibana (ELK) connections provide log data for anomaly detection in IBM Cloud Pak® for Watson AIOps. An ELK connection provides log data, which is used to establish a baseline of normal behavior and then identify anomalies. These anomalies can be correlated with other alerts and events, and published to your ChatOps interface to help you determine the cause and resolution of a problem.
For more information about working with ELK connections, see the following sections:
For more information about HTTP headers for the various credential types, see HTTP headers for credential types.
About this task
Before creating the connection, you should be aware of the following information.
-
Load: To prevent this connection from placing an inordinate load on your data source and potentially impacting your logging operations, the connection connects to only one API, with a default data frequency of 60 seconds. You can control this frequency by using the Sampling rate setting in the Procedure section.
-
Access: Custom data sources are cloud-based REST APIs. Access is configured by using the authentication methods that are specified in the Authentication type setting in the Procedure section.
-
Data volume: Data volume depends on the application, and is not a set value. Therefore, it does not appear in the settings.
Procedure
To create an ELK connection, complete the following steps:
-
Log in to IBM Cloud Pak Automation console.
-
Expand the navigation menu (four horizontal bars), then click Define > Data and tool connections.
-
On the Data and tool connections page, click Add connection.
-
From the list of available connections, find and click the ELK tile.
Note: If you do not immediately see the connection that you want to create, you can filter the tiles by type of connection. Click the type of connection that you want in the Category section.
-
On the side-panel, review the instructions and when ready to continue, click Connect.
-
On the Add connection page, define the general connection details:
-
Name: The display name of your connection.
-
Description: An optional description for the connection.
-
ELK service URL: The Elasticsearch host and public API port. The URL can also include the target index that IBM Cloud Pak for Watson AIOps uses to search for the data from your applications.
-
Kibana URL: Enter a URL for the service instance.
-
Authentication type: Select one of the following values:
- User ID/password: The Elasticsearch instance has a user ID and password as authentication. You must enter both in the connection configuration.
- API key: The Elasticsearch instance is authenticated with an API key.
- Token: The Elasticsearch instance is authenticated with a temporary token.
- None: The Elasticsearch instance has no authentication.
Note: If you selected API key as the Authentication type value, follow the steps in ELK connector steps to use the ApiKey authentication method.
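As an illustration only (the product UI handles this for you), the HTTP Authorization headers that typically correspond to these authentication types for an Elasticsearch endpoint can be sketched in Python. The credential values are placeholders, not real credentials:

```python
import base64

def basic_auth_header(user_id: str, password: str) -> str:
    # User ID/password: standard HTTP Basic authentication.
    token = base64.b64encode(f"{user_id}:{password}".encode()).decode()
    return f"Basic {token}"

def api_key_header(key_id: str, api_key: str) -> str:
    # API key: Elasticsearch expects base64("id:api_key") after the ApiKey scheme.
    token = base64.b64encode(f"{key_id}:{api_key}".encode()).decode()
    return f"ApiKey {token}"

def bearer_token_header(token: str) -> str:
    # Token: a temporary bearer token, for example from the Elasticsearch token API.
    return f"Bearer {token}"

# Placeholder credentials for illustration only.
print(basic_auth_header("elastic", "changeme"))
```

The None option sends no Authorization header at all.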
-
User ID: Enter a user ID for the connection.
-
Password: Enter a password for the connection.
-
Certificate (optional): Certificate used to verify the SSL/TLS connection to the REST service.
-
Filters (optional): A custom Boolean Query to filter the Elasticsearch request for your specific application, terms, keywords, or other filters.
-
Time zone (optional): The time zone in which your data is located. Times are converted from the system time relative to UTC. The default value is UTC.
-
Kibana port: The port of the Kibana instance that is on the same host as the Elasticsearch instance. If the Kibana instance is not exposed on the default 5601 port, change this value to the port where the Kibana instance is exposed.
-
Base parallelism: Select a value to specify the number of Flink jobs that can run in parallel. These jobs run to process and normalize the collected data. The default value is 1. However, a value higher than 1 is recommended so that data can be processed in parallel. This value cannot exceed the total number of available free Flink slots. In a small environment, 16 Flink slots are available, while in a large environment, the maximum is 32 slots. If you are collecting historical data with this connection, you can set this value equal to the source parallelism.
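As a sketch (not product code), the constraint on this setting can be expressed as a simple clamp against the free Flink slots in your environment; the slot counts below come from the small and large environment sizes described above:

```python
def choose_base_parallelism(requested: int, free_flink_slots: int) -> int:
    # Base parallelism must be at least 1 and cannot exceed
    # the total number of available free Flink slots.
    return max(1, min(requested, free_flink_slots))

print(choose_base_parallelism(8, 16))   # small environment: 16 slots available
print(choose_base_parallelism(40, 32))  # large environment: capped at 32 slots
```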
-
Sampling rate: The rate at which data is pulled from the live source (in seconds). The default value is 60.
-
JSON processing option: Select a JSON processing option.
- None: The default option. The JSON is not processed or modified.
- Flatten: This option flattens the JSON object by removing the opening and closing braces.
- Filter: This option extracts the JSON object and replaces it with an empty string.
- For more information about the options, see Managing embedded JSON.
Note: To improve data throughput, you can increase the base parallelism value incrementally. For more information about maximum base parallelism for starter and production deployments, see Improving data streaming performance for log anomaly detection.
Note: If you use the Filter option, do not use a timestamp in the filter query, because it causes a parsing error in the backend. For example, avoid a range clause such as the following:
"range": { "@timestamp": { "gte": "now-2m", "lt": "now" } }
Otherwise, the ELK filter can use any clauses as long as the fields and values that are specified in the filter are relevant to the target endpoint data set.
Filter debugging tip: You can first test the filter by using the curl command against the ELK endpoint. When you use the curl command, the timestamp range is optional in the ELK filter, depending on whether you want to narrow the search to a particular time period. If the query returns zero records with a 200 HTTP response (that is, no filter error), the query did not match the set of data that is scoped by the filter. You can tune the filter to get the needed results.
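For example, a minimal Python sketch that builds a Boolean query filter (the host, index, field name, and credentials below are placeholder assumptions, not values from your environment), verifies that it serializes to valid JSON, and prints an equivalent curl command you could run against your ELK endpoint:

```python
import json

# Placeholder Boolean query for the Filters field; the field name and value
# must match fields that exist in your target index.
filters = {
    "bool": {
        "must": [
            {"match": {"kubernetes.namespace": "my-app"}}
        ]
    }
}

# Wrap the filter in a search request body; size=1 keeps the test cheap.
body = {"query": filters, "size": 1}

# Hypothetical endpoint and credentials; substitute your ELK service URL and index.
curl_cmd = (
    "curl -s -u elastic:changeme "
    "-H 'Content-Type: application/json' "
    "-d '{}' https://elk.example.com:9200/my-index/_search".format(json.dumps(body))
)

# Sanity check: the request body round-trips as valid JSON.
assert json.loads(json.dumps(body)) == body
print(curl_cmd)
```

If the printed command returns hits, the same Boolean query (the value of `filters`) can be pasted into the Filters field.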
-
-
You can test your connection by clicking Test connection.
-
Click Next.
-
Enter Field Mapping information (Optional):
You can improve search performance by mapping the fields from your implementation fields to IBM Cloud Pak for Watson AIOps's standard fields. For more information about how field mappings are defined, see Mapping data from incoming sources. For more information about using mappings to clean your data for use in IBM Cloud Pak for Watson AIOps, see Cleaning mapped data that use regular expressions. Consider the supported data schema when you create your field mapping:
{
  "codec": "elk",
  "rolling_time": 10,
  "instance_id_field": "application_name",
  "log_entity_types": "kubernetes.pod_name",
  "message_field": "message",
  "timestamp_field": "@timestamp",
  "resource_id": "kubernetes.pod_name"
}
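To illustrate what such a mapping does, the following Python sketch applies the schema's field mappings to a sample log record. The record contents are invented for illustration; the actual normalization is performed by IBM Cloud Pak for Watson AIOps, not by user code:

```python
# Field mapping from the supported data schema above.
mapping = {
    "instance_id_field": "application_name",
    "message_field": "message",
    "timestamp_field": "@timestamp",
    "resource_id": "kubernetes.pod_name",
}

# A sample (invented) log record as it might arrive from Elasticsearch.
record = {
    "application_name": "checkout-service",
    "kubernetes.pod_name": "checkout-7d9f",
    "message": "payment request timed out",
    "@timestamp": "2023-05-01T12:00:00Z",
}

def normalize(record: dict, mapping: dict) -> dict:
    """Pick the mapped source fields into standard field names."""
    return {
        "instance_id": record[mapping["instance_id_field"]],
        "message": record[mapping["message_field"]],
        "timestamp": record[mapping["timestamp_field"]],
        "resource_id": record[mapping["resource_id"]],
    }

print(normalize(record, mapping))
```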
-
Click Next.
-
Enter AI training and log data (Optional):
Select how you want to manage collecting data for use in AI training and anomaly detection. Click the Data collection toggle to turn on data collection, then select how you want to collect data:
-
Live data for continuous AI training and anomaly detection: A continuous collection of data from your connection is used to both train AI models and analyze your data for anomalous behavior.
Note: After an initial installation, there is no data at all in the system. If you select this option, then the two different log anomaly detection algorithms behave in the following ways:
-
Natural language log anomaly detection does not initially detect anomalies as no model has been trained. You can retrieve historical data (select Historical data for initial AI training) to speed up the retrieval of data to train on, or you can leave the Live data for continuous AI training and anomaly detection setting on. In the latter case, the system gathers training data live and after a few days there is enough data to train a model. When this model is deployed, then it detects anomalies as normal.
-
Statistical baseline log anomaly detection does not detect anomalies for the first 30 minutes of data collection. This is because it does not have a baseline yet. After 30 minutes of live data collection the baseline is automatically created. After that it detects anomalies on an ongoing basis, while continuing to gather data and improve its model every 30 minutes.
-
-
Live data for initial AI training: A single set of training data used to define your AI model. Data collection takes place over a specified time period that starts when you create your connection.
Note: Selecting this option causes the system to continue to collect data while the option is enabled; however, the data is collected for training only, and not for log anomaly detection. For more information about AI model training, including minimum and ideal data quantities, see Configuring AI training.
-
Historical data for initial AI training: A single set of training data used to define your AI model. You need to give Start and End dates, and specify the parallelism of your source data. Historical data is harvested from existing logs in your connection over a specified time period in the past.
-
Start date: Select a start date from the calendar and enter the time in hh:mm (hours and minutes) format.
Note: The start date must not exceed 31 days from the present as the maximum time period for historical data collection is 31 days. The recommended time period is two weeks.
-
Time zone: Select your time zone from the dropdown list.
-
End date and time: Click Add end date and select an end date from the calendar and enter the time in hh:mm format.
Note: If you do not specify the end date, then live data collection follows the historical data collection. If you do not want to set an end date, click Remove end date.
-
Source parallelism (1-50): Select a value to specify the number of requests that can run in parallel to collect data from the source. Generally, you can set the value to equal the number of days of data that you want to collect. When you are setting this value, consider the number of requests that are allowed by the source in a minute. For example, if only 1-2 requests are allowed, set the value to be low.
-
Important: Keep in mind the following considerations when you select your data collection type:
- Anomaly detection for your connection occurs if you select Live data for continuous AI training and anomaly detection.
- Different types of AI models have different requirements to properly train a model. Make sure that your settings satisfy minimum data requirements. For more information about how much data you need to train different AI models, see Configuring AI training.
-
-
Click Done.
You created an ELK connection in your instance. After you create your connection, you must enable data collection to connect it with the AI of IBM Cloud Pak for Watson AIOps. For more information about enabling your connection, see Enabling ELK connections.
To create more connections (such as a ChatOps connection), see Configuring data and tool connections.
For more information about working with the insights provided by your connections, see ChatOps insight management.
Enabling and disabling ELK connections
If you didn't enable your data collection during creation, you can enable your connection afterward. You can also disable a previously enabled connection the same way. If you selected Live data for initial AI training when you created your connection, you must disable the connection before AI model training. To enable or disable a created connection, complete the following steps:
-
Log in to IBM Cloud Pak Automation console.
-
Expand the navigation menu (four horizontal bars), then click Define > Data and tool connections.
-
On the Manage connections tab of the Data and tool connections page, click the ELK connection type.
-
Click the connection that you want to enable or disable.
-
Go to the AI training and log data section. Set Data connection to On or Off to enable or disable data collection. Disabling data collection for a connection does not delete the connection.
You enabled or disabled your connection. For more information about deleting a connection, see Deleting ELK connections.
Editing ELK connections
After you create your connection, you can edit the connection. For example, if you specified Historical data for initial AI training but now want your connection to pull in live data for continuous monitoring, you can edit it. To edit a connection, complete the following steps:
-
Log in to IBM Cloud Pak Automation console.
-
Expand the navigation menu (four horizontal bars), then click Define > Data and tool connections.
-
Click the ELK connection type on the Manage connections tab of the Data and tool connections page.
-
On the ELK connections page, click the name of the connection that you want to edit. Alternatively, you can click the options menu (three vertical dots) for the connection and click Edit. The connection configuration opens.
-
Edit your connection as required. Click Save when you are done editing.
Your connection is now edited. If your connection was not previously enabled or disabled, you can enable or disable it directly from the interface. For more information about enabling and disabling your connection, see Enabling and disabling ELK connections. For more information about deleting a connection, see Deleting ELK connections.
Deleting ELK connections
If you no longer need your ELK connection and want to not only disable it, but delete it entirely, you can delete the connection from the console.
Note: You must disable data collection before deleting your connection. For more information about disabling data collection, see Enabling and disabling ELK connections.
To delete a connection, complete the following steps:
-
Log in to IBM Cloud Pak Automation console.
-
Expand the navigation menu (four horizontal bars), then click Define > Data and tool connections.
-
Click the ELK connection type on the Manage connections tab of the Data and tool connections page.
-
On the ELK connections page, click the options menu (three vertical dots) for the connection that you want to delete and click Delete.
-
Enter the name of the connection to confirm that you want to delete your connection. Then, click Delete.
Your connection is deleted.