Enabling the partitioning of data topics for KCOPs

You can enable partitioning of data topics for all Kafka custom operation processors (KCOP) that are supported by the CDC Replication Engine for Kafka.

Partitioning of data topics can increase apply parallelism. This solution offers better performance than specifying the partitioner.class property in the kafkaproducer.properties file because it causes CDC Replication to use one Kafka producer per partition instead of one Kafka producer per topic.

Option 1
To have partitions assigned dynamically to the topic that you are writing to, add the PARTITION_AUTO property. CDC Replication automatically determines the appropriate number of partitions and assigns them to the topic. The topic must already exist for this property to work. To use this feature, you must also specify the path to your kafkaconsumer.properties file in kcops.properties by using the KAFKACONSUMER_PROPERTIES_PATH property. For example:
KAFKACONSUMER_PROPERTIES_PATH=<CDC-install-dir>/instance/<your-instance>/conf/kafkaconsumer.properties
PARTITION_AUTO=true
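The dependency between the two properties can be checked before a subscription is started. The following is a minimal sketch, not part of CDC Replication: the function name check_auto_partitioning is hypothetical, the properties are assumed to be already loaded into a dictionary, and treating the value of PARTITION_AUTO as case-insensitive is an assumption.

```python
def check_auto_partitioning(props):
    """Return a list of problems found in the PARTITION_AUTO settings.

    props is a dict of property names to string values, as read from a
    KCOP properties file. Hypothetical helper, for illustration only.
    """
    problems = []
    # Assumption: "true" in any letter case enables the feature.
    auto_enabled = props.get("PARTITION_AUTO", "").strip().lower() == "true"
    if auto_enabled and not props.get("KAFKACONSUMER_PROPERTIES_PATH", "").strip():
        problems.append(
            "PARTITION_AUTO=true requires KAFKACONSUMER_PROPERTIES_PATH"
        )
    return problems


complete = {
    "KAFKACONSUMER_PROPERTIES_PATH":
        "<CDC-install-dir>/instance/<your-instance>/conf/kafkaconsumer.properties",
    "PARTITION_AUTO": "true",
}
print(check_auto_partitioning(complete))                    # []
print(check_auto_partitioning({"PARTITION_AUTO": "true"}))  # one problem reported
```

The check covers only the relationship between the two properties. It does not verify that the topic exists, which is also required.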
Option 2
To specify that CDC Replication should distribute records over a given number of partitions, add the PARTITION_TOPIC_topic_name property to a KCOP properties file. For example:
PARTITION_TOPIC_data=3

This setting causes CDC Replication to write records to the partitions 0, 1, and 2 for the topic data. Other topics are not partitioned.

To specify the default number of partitions, add the PARTITION_DEFAULT property to a KCOP properties file. For example:

PARTITION_DEFAULT=3

This setting causes CDC Replication to write records to partitions 0, 1, and 2 for all data topics.

Both properties can be used at the same time, but PARTITION_TOPIC_* takes precedence over PARTITION_DEFAULT. For example, the following statements cause CDC Replication to write records to the partitions 0, 1, 2, 3, 4, and 5 for the topic data. For other data topics, records are written to the partitions 0, 1, and 2:

PARTITION_TOPIC_data=6
PARTITION_DEFAULT=3
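The precedence rule can be illustrated with a short sketch. It is not the CDC Replication implementation: the helper names parse_properties and partition_count are hypothetical, the parser handles only simple KEY=VALUE lines, and the topic name orders is an invented example.

```python
def parse_properties(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def partition_count(props, topic):
    """Return the configured partition count for topic, or None if unset."""
    specific = props.get("PARTITION_TOPIC_" + topic)
    if specific is not None:
        return int(specific)  # the topic-specific property takes precedence
    default = props.get("PARTITION_DEFAULT")
    return int(default) if default is not None else None


props = parse_properties("PARTITION_TOPIC_data=6\nPARTITION_DEFAULT=3\n")
print(partition_count(props, "data"))    # 6
print(partition_count(props, "orders"))  # 3
```

A return value of None corresponds to a topic that is not partitioned because neither property applies to it.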

To both remap and partition a data topic, use the remapped topic name in the PARTITION_TOPIC_ property. For example:

MAP_ALL=all-topic
PARTITION_TOPIC_all-topic=3

Records are assigned to partitions by hashing the record key. A record that does not have a key is assigned to a partition by round-robin. If more than one image builder thread is used, each thread has its own round-robin counter.
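The assignment rule can be sketched as follows. This is an illustration under stated assumptions, not the CDC Replication code: the class name SketchPartitioner is hypothetical, keys are assumed to be byte strings, and CRC32 stands in for the hash function, which is not specified here. One instance per image builder thread models the per-thread round-robin counter.

```python
import itertools
import zlib


class SketchPartitioner:
    """Hash keyed records and round-robin keyless ones (illustrative only)."""

    def __init__(self, num_partitions):
        if num_partitions < 1:
            raise ValueError("num_partitions must be at least 1")
        self.num_partitions = num_partitions
        # Each instance owns its counter, as each thread would.
        self._counter = itertools.count()

    def partition_for(self, key):
        if key is None:
            # No key: take the next partition in round-robin order.
            return next(self._counter) % self.num_partitions
        # Key present: CRC32 is a stand-in for the real hash function.
        return zlib.crc32(key) % self.num_partitions


thread_a = SketchPartitioner(3)
thread_b = SketchPartitioner(3)
print([thread_a.partition_for(None) for _ in range(5)])  # [0, 1, 2, 0, 1]
print(thread_b.partition_for(None))                      # 0
print(thread_a.partition_for(b"id=42") == thread_b.partition_for(b"id=42"))  # True
```

Because the hash depends only on the key, records with the same key go to the same partition regardless of which thread handles them, while keyless records are spread independently by each thread.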

Important:
  • The maximum number of Kafka producers that a subscription creates is equal to the number of source tables.
  • Setting the topic prefix in the Management Console by using Kafka properties changes the default topic naming convention. If you set the topic prefix without using topic remapping, ensure that you use the following format:
    PARTITION_TOPIC_topic-prefix.schema.table=int
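As an illustration of the format above, the following sketch builds the property line from its parts. The function name partition_property and the sample values myprefix, myschema, and mytable are hypothetical. Use the topic name exactly as it appears in your Kafka cluster.

```python
def partition_property(topic_prefix, schema, table, partitions):
    """Build a PARTITION_TOPIC_ line for a prefixed, non-remapped topic."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    # Format from the text: topic-prefix.schema.table
    topic = ".".join([topic_prefix, schema, table])
    return "PARTITION_TOPIC_{}={}".format(topic, partitions)


print(partition_property("myprefix", "myschema", "mytable", 3))
# PARTITION_TOPIC_myprefix.myschema.mytable=3
```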