Resilience and robustness

The CDC Replication Engine for Kafka has multiple levels of redundancy and resilience to ensure that your data is written to Kafka reliably and can be retrieved in the same order in which it was written on the source databases.

You can configure the engine for high availability by using the cold standby model. The replication binaries are installed on two separate Linux® hosts, with a shared disk between them. The replication instance (or instances) is created by using the active installation and is written to the shared disk. If the primary installation fails, it becomes the standby, and the former standby becomes the primary. Start the replication instance on the new primary installation, and it picks up where the failed installation left off. This failover can be scripted by using command-line utilities.

The next layer of redundancy is at the Kafka broker level. The CDC Replication Engine for Kafka can be configured to use multiple brokers. If the lead broker fails, the engine uses the configured broker list to find the next broker to try.
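For example, assuming the broker list is supplied to the Kafka producer through the standard bootstrap.servers property in the kafkaproducer.properties file (the host names below are placeholders), a multi-broker configuration might look like this:

    # Placeholder broker list; the engine can move on to the next broker in the list
    # if the one it is currently using becomes unavailable.
    bootstrap.servers=broker1.example.com:9092,broker2.example.com:9092,broker3.example.com:9092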

Kafka resilience is built into the CDC Replication Engine for Kafka by using Apache Kafka's native functionality. If the target Apache Kafka level is 0.10 or higher, you can configure the replication engine to use parameters in the kafkaproducer.properties file such as retry.backoff.ms, retries, and max.in.flight.requests.per.connection. For Apache Kafka 1.1.1 or higher, you can enable Apache Kafka's idempotence feature.
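As a minimal sketch, the following kafkaproducer.properties entries show how these parameters might be combined. The values are illustrative assumptions, not tuning recommendations:

    # Illustrative values only; tune retries and backoff for your environment.
    retries=5
    retry.backoff.ms=500
    max.in.flight.requests.per.connection=1
    # Requires Apache Kafka 1.1.1 or later; idempotence also requires acks=all.
    enable.idempotence=true
    acks=all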

The most powerful tool is the transactionally consistent consumer functionality. By including the Kafka transactionally consistent consumer library in your Java™ applications, you can recreate the order of operations in source transactions across multiple Kafka topics and partitions, and consume Kafka records that are free of duplicates.

When you include this library, the application makes calls to request records for a topic or set of topics that are part of a CDC Replication subscription. The Kafka transactionally consistent consumer provides the records along with the topic, partition, offset, operation ID, and operation bookmark for each record. The operation bookmark can be used to position the application that calls the Kafka transactionally consistent consumer. The application can then process these records in order and without duplicates.
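The library's exact classes and method signatures are not shown in this section; the following Java sketch uses hypothetical names (TransactionallyConsistentConsumer, ConsistentRecord, and their accessors) purely to illustrate the call pattern described above: request records for a subscription's topics, read the ordering metadata for each record, and save the operation bookmark so that the application can be positioned again later.

    // Sketch only: TransactionallyConsistentConsumer, ConsistentRecord, and the
    // methods called on them are hypothetical placeholders, not the library's real API.
    import java.util.Arrays;
    import java.util.List;

    public class SubscriptionReader {

        public static void main(String[] args) {
            // Topics that belong to one CDC Replication subscription (placeholder names).
            List<String> topics = Arrays.asList("SOURCEDB.SCHEMA.TABLE1",
                                                "SOURCEDB.SCHEMA.TABLE2");

            // Hypothetical constructor: connect to Kafka and subscribe to the topics.
            TransactionallyConsistentConsumer consumer =
                    new TransactionallyConsistentConsumer("broker1.example.com:9092", topics);

            while (true) {
                // Hypothetical poll: records come back in source-transaction order,
                // with duplicates already removed by the library.
                for (ConsistentRecord record : consumer.poll()) {
                    System.out.printf("topic=%s partition=%d offset=%d operationId=%s%n",
                            record.topic(), record.partition(), record.offset(),
                            record.operationId());

                    // Persist the operation bookmark so that, after a restart, the
                    // application can be positioned at the last processed operation.
                    saveBookmark(record.bookmark());
                }
            }
        }

        private static void saveBookmark(String bookmark) {
            // Application-specific persistence, for example a file or a database table.
        }
    }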

When you configure subscriptions to use the Kafka custom operation processor, you not only gain the flexibility to write in parallel to Kafka topics and partitions in whatever format suits your business needs, but you can also consume the data in the same transactionally consistent manner.