Pulsar Producer
The Pulsar Producer destination writes data to topics in an Apache Pulsar cluster. The Pulsar Producer destination attaches to one or more topics and publishes messages to a Pulsar broker for processing. For information about supported versions, see Supported Systems and Versions.
When you configure a Pulsar Producer destination, you define the URL to connect to Pulsar, and you configure the Pulsar security features to use when connecting. You can also use a connection to configure the destination.
You define the topic to publish messages to. To publish to multiple topics, include an expression in the configured topic name. In addition, you define the schema that Pulsar uses to validate the messages that the destination writes to the topic.
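As a rough illustration of how a single configured topic name can resolve to several topics, the following Python sketch substitutes a record field into a topic name template. The `${record:value('/region')}` placeholder syntax, the `resolve_topic` helper, and the sample topic and field names are assumptions made for this example. They are not taken from this page, so verify the expression syntax that the stage accepts before relying on it.

```python
import re

# Hypothetical placeholder syntax: ${record:value('/fieldName')}
_PLACEHOLDER = re.compile(r"\$\{record:value\('/([^']+)'\)\}")


def resolve_topic(template, record):
    """Replace each placeholder in the topic name with the record's field value."""
    def substitute(match):
        field = match.group(1)
        if field not in record:
            raise KeyError(f"record has no field '{field}'")
        return str(record[field])

    return _PLACEHOLDER.sub(substitute, template)


# Sample records; the field names are illustrative only.
records = [
    {"region": "emea", "amount": 10},
    {"region": "apac", "amount": 7},
]

template = "persistent://public/default/orders-${record:value('/region')}"

# Each record resolves to its own topic name.
topics = [resolve_topic(template, r) for r in records]
```

A topic name without a placeholder passes through unchanged, which corresponds to the single-topic case.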
You can configure advanced properties as needed, such as the partition or compression type to use when publishing messages or whether the destination publishes messages asynchronously or synchronously.
For more information about Pulsar topics and producers, see the Apache Pulsar documentation.
Schema Properties
- Schema tab
- The schema specified on the Schema tab is used to validate messages written to Pulsar.
If the destination writes to a topic in a Pulsar namespace configured to enforce schema validation, then you must specify a schema on the Schema tab. The destination passes the schema to Pulsar. Then Pulsar uses the schema to validate written messages.
- Data Format tab
- The schema specified on the Data Format tab supports message processing.
If you configure the destination to write messages in Avro or XML format, then you can specify a schema on the Data Format tab. The destination uses the specified schema to format and write messages.
If you specify a schema on both tabs, then specify the same schema.
Security
Configure the Pulsar Producer destination to use the security features available in the Pulsar cluster.
The destination supports the following security features:
- TLS transport encryption
- The Pulsar cluster encrypts all traffic between the Pulsar server and the stage. The Pulsar server provides a key and certificate, which the stage uses to verify the server's identity. For details, see the Pulsar documentation on TLS transport encryption.
- TLS authentication
- The stage provides keys and certificates, which the Pulsar server uses to verify the stage's identity. TLS authentication requires TLS transport encryption. For details, see the Pulsar documentation on TLS authentication.
- JWT authentication
- The stage provides Pulsar a JSON Web Token (JWT), which identifies the stage and grants permission for some actions. JWT authentication requires TLS transport encryption. For details, see the Pulsar documentation on JWT authentication.
- OAuth authentication
- The stage provides Pulsar an OAuth 2.0 access token, which identifies the stage and associates the stage with a role. For details, see the Pulsar documentation on OAuth authentication.
Enabling TLS Transport Encryption
Enable TLS transport encryption to encrypt all traffic between the Pulsar server and the stage.
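Pulsar clients conventionally address a TLS-enabled broker with the `pulsar+ssl://` scheme and a plaintext broker with `pulsar://`. The following Python sketch checks which scheme a service URL uses. It assumes that the URL configured for the stage follows this convention; the `uses_tls` helper and the host names are hypothetical.

```python
from urllib.parse import urlparse


def uses_tls(service_url):
    """Return True when the service URL uses the TLS scheme."""
    scheme = urlparse(service_url).scheme
    if scheme not in ("pulsar", "pulsar+ssl"):
        raise ValueError(f"unexpected scheme: {scheme!r}")
    return scheme == "pulsar+ssl"


# Hypothetical broker addresses for illustration.
secure = uses_tls("pulsar+ssl://broker.example.com:6651")
plain = uses_tls("pulsar://broker.example.com:6650")
```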
Enabling TLS Authentication
Enable TLS authentication so that Pulsar can authenticate the stage with certificates.
- Enable TLS transport encryption.
- On the Security tab of the stage, select Enable Mutual Authentication.
- Create the client certificate and client private key PEM files for the stage to use.
For information about creating client certificates for Pulsar, see the Pulsar documentation.
- Store the client certificate and client private key PEM files created for the stage in the Data Collector resources directory, $SDC_RESOURCES.
- On the Security tab of the stage, enter the name of the client files in the Client Certificate PEM and Client Key PEM properties.
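Before entering the file names on the Security tab, it can help to confirm that both files are present in the resources directory and look like PEM files. The following Python sketch is one way to do that. The `check_pem` helper and the file names `client-cert.pem` and `client-key.pem` are hypothetical, and a temporary directory with placeholder content stands in for `$SDC_RESOURCES` so that the example is self-contained.

```python
import tempfile
from pathlib import Path

CERT_MARKER = "-----BEGIN CERTIFICATE-----"
KEY_MARKERS = (
    "-----BEGIN PRIVATE KEY-----",
    "-----BEGIN RSA PRIVATE KEY-----",
    "-----BEGIN EC PRIVATE KEY-----",
)


def check_pem(resources_dir, file_name, markers):
    """Return True if the file exists and begins with one of the PEM markers."""
    path = Path(resources_dir) / file_name
    if not path.is_file():
        return False
    return path.read_text().strip().startswith(tuple(markers))


# A temporary directory stands in for $SDC_RESOURCES in this sketch.
with tempfile.TemporaryDirectory() as resources:
    # Placeholder content only; these are not usable credentials.
    Path(resources, "client-cert.pem").write_text(
        CERT_MARKER + "\n...\n-----END CERTIFICATE-----\n"
    )
    Path(resources, "client-key.pem").write_text(
        KEY_MARKERS[0] + "\n...\n-----END PRIVATE KEY-----\n"
    )

    cert_ok = check_pem(resources, "client-cert.pem", (CERT_MARKER,))
    key_ok = check_pem(resources, "client-key.pem", KEY_MARKERS)
    missing_ok = check_pem(resources, "missing.pem", (CERT_MARKER,))
```

This check only inspects the file header. It does not verify that the certificate is valid or that the key matches it.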
Enabling JWT Authentication
Enable JWT authentication so that Pulsar can authenticate the stage with a JSON Web Token (JWT).
- Enable TLS transport encryption.
- On the Security tab, select Use JWT.
- In the Token property, enter the token string that represents a signed JWT for the stage.
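A signed JWT is a string of three base64url-encoded segments separated by periods: header, payload, and signature. The following Python sketch builds a demonstration token and then inspects it, which can be a quick sanity check before pasting a token into the Token property. The helper names, the `sub` value, and the HMAC secret are assumptions for illustration. A real token comes from your Pulsar administrator, and this sketch does not verify the signature.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data):
    """Encode bytes as unpadded base64url text."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(segment):
    """Decode unpadded base64url text back to bytes."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def make_demo_token(subject, secret):
    """Build an HS256-signed token for demonstration purposes only."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": subject}).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"


def inspect_token(token):
    """Decode the header and claims without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a signed JWT has exactly three dot-separated segments")
    return {
        "header": json.loads(_b64url_decode(parts[0])),
        "claims": json.loads(_b64url_decode(parts[1])),
    }


# Hypothetical subject and secret, for illustration only.
token = make_demo_token("sdc-pulsar-producer", b"demo-secret-not-for-production")
info = inspect_token(token)
```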
Enabling OAuth Authentication
Enable OAuth authentication so that Pulsar can authenticate the stage with an access token.
Data Formats
The Pulsar Producer destination writes data to Pulsar based on the data format that you select. You can use the following data formats:
- Avro
- The stage writes records based on the Avro schema. You can use one of the following methods to specify the location of the Avro schema definition:
- Binary
- The stage writes binary data to a single field in the record.
- Delimited
- The destination writes records as delimited data. When you use this data format, the root field must be list or list-map.
- JSON
- The destination writes records as JSON data. You can use one of the following formats:
- Array - Each file includes a single array. In the array, each element is a JSON representation of each record.
- Multiple objects - Each file includes multiple JSON objects. Each object is a JSON representation of a record.
- Protobuf
- The destination writes one record in a message. It uses the user-defined message type and the definition of the message type in the descriptor file to generate the message.
- SDC Record
- The destination writes records in the SDC Record data format.
- Text
- The destination writes data from a single text field to the destination system. When you configure the stage, you select the field to use.
- XML
- The destination creates a valid XML document for each record. The destination requires the record to have a single root field that contains the rest of the record data. For details and suggestions for how to accomplish this, see Record Structure Requirement.
The destination can include indentation to produce human-readable documents. It can also validate that the generated XML conforms to the specified schema definition. Records with invalid schemas are handled based on the error handling configured for the destination.
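To make two of the formats above more concrete, the following Python sketch serializes the same sample records as a JSON array, as multiple JSON objects, and as an XML document with a single root element. It only approximates the shape of the output and is not the destination's implementation. The sample records, the function names, the newline separator between JSON objects, and the `record` root element name are assumptions for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Sample records; the field names are illustrative only.
records = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]


def to_json_array(batch):
    """Array: a single JSON array in which each element is one record."""
    return json.dumps(batch)


def to_json_objects(batch):
    """Multiple objects: one JSON object per record (newline-separated here)."""
    return "\n".join(json.dumps(record) for record in batch)


def to_xml(record, root_name="record"):
    """Place all record fields under a single root element and serialize it."""
    root = ET.Element(root_name)
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


array_out = to_json_array(records)
objects_out = to_json_objects(records)
xml_out = to_xml(records[0])
# xml_out is '<record><id>1</id><name>alpha</name></record>'
```

Both JSON outputs parse back to the same records. The difference is whether a consumer reads one array or a sequence of separate objects.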
Configuring a Pulsar Producer
Configure a Pulsar Producer destination to write data to Pulsar topics.