Elasticsearch
The Elasticsearch destination writes data to an Elasticsearch cluster, including Elastic Cloud clusters and Amazon OpenSearch Service clusters (formerly Amazon Elasticsearch Service). For information about supported versions, see Supported Systems and Versions in the Data Collector documentation.
The destination uses the Elasticsearch HTTP module to access the Bulk API and write each record to Elasticsearch as a document.
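To illustrate what a Bulk API request body looks like, the following is a minimal sketch, not the destination's actual implementation. It assumes the standard newline-delimited JSON layout of the Elasticsearch Bulk API. The `bulk_body` helper, the `employees` index, and the sample documents are hypothetical.

```python
import json


def bulk_body(actions):
    """Serialize (operation, index, doc_id, source) tuples as Bulk API NDJSON.

    Hypothetical helper for illustration only.
    """
    lines = []
    for operation, index, doc_id, source in actions:
        meta = {"_index": index}
        if doc_id is not None:
            meta["_id"] = doc_id
        # Every action starts with a metadata line naming the operation.
        lines.append(json.dumps({operation: meta}))
        if operation == "update":
            # Updates wrap the partial document in a "doc" object.
            lines.append(json.dumps({"doc": source}))
        elif operation != "delete":
            # Index and create actions are followed by the document itself.
            lines.append(json.dumps(source))
        # Delete actions have no source line.
    # The Bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ("index", "employees", None, {"name": "Ada"}),
    ("update", "employees", "42", {"title": "Engineer"}),
    ("delete", "employees", "7", None),
])
print(body, end="")
```

Each record becomes one action in the request, so many records can be written in a single HTTP call.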
When you configure the Elasticsearch destination, you configure the HTTP URLs used to connect to the Elasticsearch cluster and specify whether security is enabled on the cluster. When Data Collector shares the same network as the Elasticsearch cluster, you can enter one or more node URLs and automatically detect additional Elasticsearch nodes on the cluster.
The Elasticsearch destination can use CRUD operations defined in the sdc.operation.type record header attribute to write data. You can define a default operation for records without the header attribute or value. You can also configure how to handle records with unsupported operations.
For information about Data Collector change data processing and a list of CDC-enabled origins, see Processing Changed Data.
You can add advanced Elasticsearch properties as needed. You can also use a connection to configure the destination.
Security
- Basic
- Use Basic authentication for Elasticsearch clusters outside of Amazon OpenSearch Service. With Basic authentication, the stage passes the Elasticsearch user name and password.
- AWS Signature V4
- Use AWS Signature V4 authentication for Elasticsearch clusters within Amazon OpenSearch Service. The stage must sign HTTP requests with Amazon Web Services credentials. For details, see the Amazon OpenSearch Service documentation. Use one of the following methods to sign with AWS credentials:
- Instance profile
- When the execution engine (Data Collector or Transformer) runs on an Amazon EC2 instance that has an associated instance profile, the engine uses the instance profile credentials to automatically authenticate with AWS.
- AWS access key pair
- When the execution engine does not run on an Amazon EC2 instance or when the EC2 instance doesn’t have an instance profile, you must specify the Access Key ID and Secret Access Key properties.
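As an illustration of the Basic option above, the following sketch shows how a user name and password are typically passed in an HTTP request. It assumes standard HTTP Basic authentication (RFC 7617). The `basic_auth_header` helper and the sample credentials are hypothetical and do not describe how the stage builds its requests internally.

```python
import base64


def basic_auth_header(user: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    # RFC 7617: base64-encode "user:password" and prefix it with "Basic ".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials for illustration only.
headers = basic_auth_header("es_user", "es_password")
print(headers["Authorization"].split(" ")[0])
```

Because Basic credentials are only encoded, not encrypted, they are normally sent over HTTPS.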
Time Basis and Time-Based Indexes
The time basis is the time used by the Elasticsearch destination to write records to time-based indexes. When indexes have no time component, you can ignore the time basis property.
You can use the time of processing or the time associated with the data as the time basis.
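For example, an index defined with datetime variables, such as the following, is a time-based index:
logs-${YYYY()}-${MM()}-${DD()}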
If you use the time of processing as the time basis, the destination writes records to indexes based on when it processes each record. If you use the time associated with the data, such as a transaction timestamp, then the destination writes records to the indexes based on that timestamp.
- Processing Time
- When you use processing time as the time basis, the destination writes to indexes based on the processing time and the index. To use the processing time as the time basis, use the following expression:
${time:now()}
This is the default time basis.
- Record Time
- When you use the time associated with a record as the time basis, you specify a date field in the record. The destination writes data to indexes based on the datetimes associated with the records.
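The difference between the two time bases can be sketched as follows. This is an illustration only: it mimics an index pattern of the form logs-${YYYY()}-${MM()}-${DD()} in Python rather than in the expression language, and the `resolve_index` helper and the `transaction_ts` field are hypothetical.

```python
from datetime import datetime, timezone


def resolve_index(time_basis: datetime) -> str:
    """Mimic logs-${YYYY()}-${MM()}-${DD()} for a given time basis."""
    return f"logs-{time_basis:%Y}-{time_basis:%m}-{time_basis:%d}"


# Hypothetical record with a date field holding a transaction timestamp.
record = {
    "id": 1,
    "transaction_ts": datetime(2023, 12, 31, 23, 59, tzinfo=timezone.utc),
}

# Processing time: the index depends on when the record is handled.
processing_index = resolve_index(datetime.now(timezone.utc))

# Record time: the index depends on the timestamp carried by the record.
record_index = resolve_index(record["transaction_ts"])

print(record_index)  # logs-2023-12-31
```

With record time, a late-arriving record still lands in the index for the day the transaction occurred. With processing time, it lands in the index for the day it was processed.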
Document IDs
When appropriate, you can specify an expression that defines the document ID. When you do not specify an expression, Elasticsearch generates IDs for each document.
When you configure the destination to perform create, update, or delete operations, you must define the document ID.
For example, to perform updates for documents with IDs based on the EmployeeID field, define the write operation as update and define the Document ID as follows:
${record:value('/EmployeeID')}
You can also optionally define a parent ID for each document to define a parent/child relationship between documents in the same index.
CRUD Operation Processing
The Elasticsearch destination can create, update, delete, or index data. The destination writes the records based on the CRUD operation defined in a CRUD operation header attribute or in operation-related stage properties.
The destination uses the header attribute and stage properties as follows:
- CRUD operation header attribute
- The destination looks for the CRUD operation in the sdc.operation.type record header attribute.
- Operation stage properties
- If there is no CRUD operation in the sdc.operation.type record header attribute, the destination uses the operation configured in the Default Operation property.
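The lookup order described above can be sketched as follows. This is a simplified illustration, not the destination's code. The header values are shown as plain operation names for readability, which is an assumption, since the actual encoding of sdc.operation.type may differ. The `on_unsupported` option names are also hypothetical stand-ins for the unsupported-operation handling property.

```python
OPERATION_HEADER = "sdc.operation.type"
SUPPORTED = {"create", "update", "delete", "index"}


def resolve_operation(header_attrs, default_operation, on_unsupported="use_default"):
    """Pick the operation for one record (hypothetical helper).

    Returns the operation name, or None when the record should be discarded.
    """
    value = header_attrs.get(OPERATION_HEADER)
    if not value:
        # No header attribute, or an empty value: fall back to the default.
        return default_operation
    if value in SUPPORTED:
        return value
    # The header names an operation the destination cannot perform.
    if on_unsupported == "use_default":
        return default_operation
    if on_unsupported == "discard":
        return None
    raise ValueError(f"unsupported operation: {value}")


print(resolve_operation({OPERATION_HEADER: "delete"}, "index"))  # delete
print(resolve_operation({}, "index"))                            # index
```

The header attribute takes precedence, so records from a CDC-enabled origin keep their own operation while other records use the configured default.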
Configuring an Elasticsearch Destination
Configure an Elasticsearch destination to write data to an Elasticsearch cluster.