What happened in the Kafka community in November 2018?
Kafka Releases:
- 2.1.0: Dong Lin released this new version on November 21, 2018, and it contains a number of interesting new features:
  - Java 11 support: Apache Kafka is keeping up with the faster JVM release cycle. See the OpenJDK page for Java 11 for a list of new features.
  - Support for Zstandard compression: Zstd offers great compression ratios and is also really fast. Consequently, on most workloads, it can provide significant throughput improvements compared to Snappy, GZIP, and LZ4. See KIP-110 for more details.
  - Updated replication protocol: The work to strengthen the inter-broker protocol continued in this release. TLA+, a formal specification language, was used to model and verify the replication protocol.
  - A ton of client improvements: This includes better timeouts for Producers, access to AdminClient metrics, a new ACL for listing consumer groups, and DNS improvements for Kerberos and Cloud environments.
  - Improved Streams API: This includes a few new metrics, cleanups of the StoreSuppliers APIs, and better timestamp synchronization.
- 2.0.1: This release came out on November 9, 2018, and contains 51 fixes. Thanks to Manikumar Reddy for running this release.
Finally, the release plans on the wiki have the full details about each release: 2.0.1 and 2.1.0.
KIPs:
Last month, the community submitted 12 KIPs (KIP-387 to KIP-398). These are the ones that caught my eye:
KIP-388: Add observer interface to record request and response
This KIP proposes adding an interface to brokers to observe requests and responses. The goal is to give access to information that existing metrics do not expose, such as latency per topic or bytes transmitted per principal.
KIP-390: Allow fine-grained configuration for compression
Compression involves a tradeoff between speed and compressed size, so compression algorithms expose a compression level to adjust that tradeoff. This KIP’s goal is to allow users to specify the compression level they want when configuring compression in Producers or Brokers, instead of always using the default.
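To make the tradeoff concrete, here is a minimal Python sketch that compresses the same payload at several levels. It uses the standard-library zlib module, not Kafka's own compression code, and the payload and the chosen levels are illustrative assumptions. Actual sizes and timings depend heavily on the data.

```python
import time
import zlib
from typing import Tuple

# Hypothetical payload: repetitive JSON-like records, similar in spirit to a message batch.
payload = b"".join(
    b'{"user_id": %d, "event": "page_view", "region": "eu-west"}\n' % i
    for i in range(5000)
)


def compress_at_level(data: bytes, level: int) -> Tuple[int, float]:
    """Return (compressed size in bytes, elapsed seconds) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Round-trip check: compression must be lossless at every level.
    assert zlib.decompress(compressed) == data
    return len(compressed), elapsed


# Level 1 favors speed, level 9 favors size, level 6 is zlib's usual default.
results = {level: compress_at_level(payload, level) for level in (1, 6, 9)}

for level, (size, seconds) in results.items():
    print(f"level={level} size={size} bytes time={seconds * 1000:.2f} ms")
```

Higher levels generally spend more CPU time for a smaller output, which is why a single fixed default is not ideal for every workload.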
KIP-391: Allow producing with offsets for cluster replication
Replicating data between clusters is a very common operation. However, one difficulty is handling consumer offsets, as messages are likely to have different offsets in each cluster. This KIP proposes supporting the replication of messages together with their offsets. This would make it easy to replicate committed offsets and would allow consumers to rely on them even when switching clusters.
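The offset mismatch can be illustrated with a toy model. The sketch below is a simplified simulation in Python, not Kafka code: `Log`, `append`, and `append_at` are hypothetical names, and a partition is modeled as a dictionary from offset to message.

```python
class Log:
    """Toy partition log mapping offsets to messages (hypothetical model, not a Kafka API)."""

    def __init__(self, start_offset: int = 0):
        self.records = {}
        self.next_offset = start_offset

    def append(self, message: str) -> int:
        """Regular produce: the log assigns the next offset itself."""
        offset = self.next_offset
        self.records[offset] = message
        self.next_offset += 1
        return offset

    def append_at(self, offset: int, message: str) -> int:
        """Produce with an explicit offset, in the spirit of what the KIP proposes."""
        if offset < self.next_offset:
            raise ValueError("offsets must keep increasing")
        self.records[offset] = message
        self.next_offset = offset + 1
        return offset


# Source partition whose earliest retained offset is 100 (older data was deleted).
source = Log(start_offset=100)
for i in range(5):
    source.append(f"msg-{i}")

committed = 103  # a consumer group's committed offset in the source cluster

# Plain replication: the destination assigns fresh offsets starting at 0.
plain = Log()
for offset in sorted(source.records):
    plain.append(source.records[offset])

# Offset-preserving replication: each message keeps its source offset.
preserving = Log()
for offset in sorted(source.records):
    preserving.append_at(offset, source.records[offset])

print(source.records[committed])      # message the consumer would read next
print(plain.records.get(committed))   # the committed offset does not exist here
print(preserving.records[committed])  # same message as in the source
```

With plain replication, the committed offset has to be translated before a consumer can resume on the destination cluster. When offsets are preserved, it can be reused as-is.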
KIP-392: Allow consumers to fetch from closest replica
Currently, Producers and Consumers have to connect to partition leaders in order to send or receive messages. This KIP proposes giving Consumers some flexibility by allowing them to fetch messages from other replicas, as long as those replicas are in sync. In addition to potentially spreading client load across more brokers, this is especially important when a Kafka cluster spans multiple availability zones, because Consumers would be able to fetch from the closest replica.
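As a rough illustration of the idea, the sketch below picks a replica for a consumer based on a location label. It is a toy model in Python under stated assumptions: the `Replica` type, the `rack` label, and the `select_replica` function are hypothetical and do not reflect the interface defined in the KIP.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Replica:
    broker_id: int
    rack: str        # hypothetical location label, e.g. an availability zone
    is_leader: bool
    in_sync: bool


def select_replica(replicas: List[Replica], client_rack: str) -> Replica:
    """Prefer an in-sync replica in the client's rack; otherwise fall back to the leader."""
    for replica in replicas:
        if replica.in_sync and replica.rack == client_rack:
            return replica
    return next(r for r in replicas if r.is_leader)


replicas = [
    Replica(broker_id=1, rack="zone-a", is_leader=True, in_sync=True),
    Replica(broker_id=2, rack="zone-b", is_leader=False, in_sync=True),
    Replica(broker_id=3, rack="zone-c", is_leader=False, in_sync=False),
]

# In-sync follower in the consumer's own zone.
print(select_replica(replicas, "zone-b").broker_id)
# The local replica is out of sync, so the consumer falls back to the leader.
print(select_replica(replicas, "zone-c").broker_id)
```

Falling back to the leader keeps today's behavior as the default, so a consumer with no nearby in-sync replica is no worse off than before.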
Blogs:
- https://blogs.apache.org/kafka/entry/apache-kafka-supports-more-partitions
- https://medium.com/@andrew_schofield/does-apache-kafka-do-acid-transactions-647b207f3d0e