Pulsar Producer
The Pulsar Producer target writes data to topics in an Apache Pulsar cluster. The target attaches to one or more topics and publishes messages to a Pulsar broker for processing. For information about supported versions, see Supported systems and versions.
When you configure a Pulsar Producer target, you define the URL to connect to Pulsar, and you configure the Pulsar security features to use when connecting. You can also use a connection to configure the target.
You define the topic to publish messages to. To publish to multiple topics, include an expression in the configured topic name. In addition, you define the schema that Pulsar uses to validate the messages that the target writes to the topic.
You can configure advanced properties as needed, such as the partition or compression type to use when publishing messages or whether the target publishes messages asynchronously or synchronously.
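As a rough illustration of how an expression in the topic name can route records to different topics, the following sketch resolves a simplified placeholder against each record. The `resolve_topic` helper, the sample records, and the `orders-` topic names are hypothetical, and the resolver handles only a single placeholder form that reads a top-level field. It is not the stage's actual expression evaluator.

```python
import re

# Matches a simplified placeholder of the form ${record:value('/field')}.
# Assumption: only top-level fields are supported in this sketch.
_PLACEHOLDER = re.compile(r"\$\{record:value\('/([^']+)'\)\}")


def resolve_topic(topic_template, record):
    """Replace each placeholder with the matching top-level field of the record."""
    def lookup(match):
        return str(record[match.group(1)])
    return _PLACEHOLDER.sub(lookup, topic_template)


# Hypothetical records and topic template, for illustration only.
records = [
    {"region": "eu", "payload": "a"},
    {"region": "us", "payload": "b"},
]
template = "persistent://public/default/orders-${record:value('/region')}"

# One resolved topic per record, so a single configured name can fan out to many topics.
topics = [resolve_topic(template, record) for record in records]
```

A topic name without a placeholder passes through unchanged, which corresponds to the single-topic case.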
Schema properties
- Schema tab
- The schema specified on the Schema tab is used to validate messages written to Pulsar.
If the target writes to a topic in a Pulsar namespace configured to enforce schema validation, then you must specify a schema on the Schema tab. The target passes the schema to Pulsar. Then Pulsar uses the schema to validate written messages.
- Data Format tab
- The schema specified on the Data Format tab supports message processing.
If you configure the target to write messages in Avro or XML format, then you can specify a schema on the Data Format tab. The target uses the specified schema to format and write messages.
If you specify a schema on both tabs, then specify the same schema.
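One way to check that the schemas entered on both tabs match is to compare them as parsed JSON rather than as raw text, so that whitespace and key order do not matter. The sketch below assumes both schemas are Avro schema definitions written as JSON. The `same_schema` helper and the `Order` schema are hypothetical examples, not part of the product.

```python
import json


def same_schema(schema_a: str, schema_b: str) -> bool:
    """Return True when two JSON schema definitions parse to the same structure.

    Key order and whitespace are ignored; the order of the fields list is not.
    """
    return json.loads(schema_a) == json.loads(schema_b)


# Hypothetical schema as it might be entered on the Schema tab.
schema_tab = """
{"type": "record", "name": "Order",
 "fields": [{"name": "id", "type": "long"},
            {"name": "region", "type": "string"}]}
"""

# The same schema with different formatting and key order.
data_format_tab = (
    '{"name":"Order","type":"record","fields":'
    '[{"name":"id","type":"long"},{"name":"region","type":"string"}]}'
)

# A schema that differs in one field type.
changed = (
    '{"name":"Order","type":"record","fields":'
    '[{"name":"id","type":"long"},{"name":"region","type":"bytes"}]}'
)

matches = same_schema(schema_tab, data_format_tab)
differs = same_schema(schema_tab, changed)
```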
Security
Configure the Pulsar Producer target to use the security features available in the Pulsar cluster.
About this task
The target supports the following security features:
- TLS transport encryption
- The Pulsar cluster encrypts all traffic between the Pulsar server and the stage. The Pulsar server provides a key and certificate, which the stage uses to verify the server's identity.
- TLS authentication
- The stage provides keys and certificates, which the Pulsar server uses to verify the stage's identity. TLS authentication requires TLS transport encryption.
- JWT authentication
- The stage provides Pulsar a JSON Web Token (JWT), which identifies the stage and grants permission for some actions. JWT authentication requires TLS transport encryption.
- OAuth authentication
- The stage provides Pulsar an OAuth 2.0 access token, which identifies the stage and associates the stage with a role.
Enabling TLS transport encryption
Enable TLS transport encryption to encrypt all traffic between the Pulsar server and the stage.
Procedure
Enabling TLS authentication
Enable TLS authentication so that Pulsar can authenticate the stage with certificates.
Procedure
- Enable TLS transport encryption.
- On the Security tab of the stage, select Enable Mutual Authentication.
- Create the client certificate and client private key PEM files for the stage to use.
- Store the client certificate and client private key PEM files created for the stage in the Data Collector resources directory, $SDC_RESOURCES.
- On the Security tab of the stage, enter the name of the client files in the Client Certificate PEM and Client Key PEM properties.
Enabling JWT authentication
Enable JWT authentication so that Pulsar can authenticate the stage with a JSON Web Token (JWT).
Before you begin
Procedure
- Enable TLS transport encryption.
- On the Security tab, select Use JWT.
- In the Token property, enter the token string that represents a signed JWT for the stage.
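For orientation, a signed JWT is a string of three base64url-encoded segments (header, payload, signature) joined by periods. The sketch below builds and verifies such a token with an HMAC-SHA256 signature using only the Python standard library. The secret, the `sub` claim value, and the helper names are illustrative assumptions. In practice the token is issued for the stage by whoever administers the Pulsar cluster, and the signing algorithm depends on how the cluster is configured.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_jwt_hs256(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(segments)
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(digest)


def verify_jwt_hs256(token: str, secret: bytes) -> bool:
    """Recompute the signature over header.payload and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(signature, b64url(digest))


def decode_segment(segment: str) -> dict:
    """Decode one header or payload segment back into a dict."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical subject and secret, for illustration only.
token = sign_jwt_hs256({"sub": "pulsar-producer-stage"}, b"demo-secret")
header_part, payload_part, signature_part = token.split(".")
```

The resulting `token` string is the kind of value that goes in the Token property. Because the token itself acts as a credential, it makes sense that JWT authentication requires TLS transport encryption.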
Enabling OAuth authentication
Enable OAuth authentication so that Pulsar can authenticate the stage with an access token.
Procedure
Data formats
The Pulsar Producer target writes data to Pulsar based on the data format that you select. You can use the following data formats:
- Avro
- The stage writes records based on the Avro schema. You can use one of the following methods to specify the location of the Avro schema definition:
- Binary
- The stage writes binary data to a single field in the record.
- Delimited
- The target writes records as delimited data. When you use this data format, the root field must be list or list-map.
- JSON
- The target writes records as JSON data. You can use one of the following formats:
- Array - Each file includes a single array. In the array, each element is a JSON representation of each record.
- Multiple objects - Each file includes multiple JSON objects. Each object is a JSON representation of a record.
- Protobuf
- The target writes one record in each message. It uses the user-defined message type and the definition of that message type in the descriptor file to generate the message.
- SDC Record
- The target writes records in the SDC Record data format.
- Text
- The target writes data from a single text field to the target system. When you configure the stage, you select the field to use.
- XML
- The target creates a valid XML document for each record. The target requires the record to have a single root field that contains the rest of the record data. For details and suggestions for how to accomplish this, see Record structure requirement.
The target can include indentation to produce human-readable documents. It can also validate that the generated XML conforms to the specified schema definition. Records with invalid schemas are handled based on the error handling that is configured for the target.
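The JSON and XML descriptions above can be sketched with the Python standard library. The example shows the difference between the Array and Multiple objects JSON formats, and what a single root field means for an XML record. The sample records, the function names, and the newline separator between JSON objects are assumptions made for illustration. The stage's actual serialization may differ in detail.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical records, for illustration only.
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]


def to_json_array(recs):
    """Array format: one JSON array whose elements are the records."""
    return json.dumps(recs)


def to_json_multiple_objects(recs):
    """Multiple objects format: one JSON object per record.

    Assumption: objects are separated by newlines in this sketch.
    """
    return "\n".join(json.dumps(rec) for rec in recs)


def to_xml_document(record):
    """Write one XML document from a record that has a single root field."""
    if len(record) != 1:
        raise ValueError("record must have exactly one root field")
    (root_name, body), = record.items()
    root = ET.Element(root_name)
    for field_name, value in body.items():
        child = ET.SubElement(root, field_name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


as_array = to_json_array(records)
as_objects = to_json_multiple_objects(records)

# The record is wrapped in a single root field, "order", before it is written as XML.
as_xml = to_xml_document({"order": {"id": 1, "name": "a"}})
```

A record with more than one top-level field raises an error in this sketch, which mirrors the requirement that the rest of the record data sits under one root field.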
Configuring a Pulsar Producer target
About this task
Configure a Pulsar Producer target to write data to Pulsar topics.