What happened in the Kafka community in November 2018?
2.1.0: Dong Lin released this new version on November 21, 2018. It contains a number of notable new features:
Java 11 support: Apache Kafka is keeping up with the faster JVM release cycle. See the OpenJDK page for Java 11 for a list of new features.
Support for Zstandard compression: Zstandard (zstd) combines high compression ratios with fast compression and decompression. On many workloads it can therefore improve throughput compared with Snappy, GZIP, and LZ4, although the gain depends on the data being sent. See KIP-110 for more details.
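As a rough illustration of how a producer opts in to Zstandard, the sketch below builds a set of producer properties using only the JDK. The setting name compression.type and the value zstd come from Kafka's producer configuration; the class name ZstdProducerConfig, the broker address, and the choice of string serializers are assumptions made for the example.

```java
import java.util.Properties;

final class ZstdProducerConfig {
    // Builds producer settings that request Zstandard compression.
    // The class and method names are hypothetical; the property keys are Kafka producer settings.
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // "zstd" is accepted as a compression type from Kafka 2.1.0 onward (KIP-110).
        props.setProperty("compression.type", "zstd");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = build("localhost:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These properties would then be passed to a KafkaProducer. Both the client and the brokers need to be on 2.1.0 or later for zstd to be accepted.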
Updated replication protocol: The work to strengthen the inter-broker protocol continued in this release. TLA+, a formal specification language, was used to model and verify the replication protocol.
KIP-390: Allow fine-grained configuration for compression
Compression involves a tradeoff between speed and compressed size. Most compression algorithms therefore expose a compression level that adjusts this tradeoff. The goal of this KIP is to let users specify the compression level they want when configuring compression in Producers or Brokers, instead of always using the default.
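To make the speed versus size tradeoff concrete, here is a small sketch that compresses the same payload at several levels. It does not use Kafka at all: it relies on the JDK's java.util.zip.Deflater, which implements the DEFLATE algorithm behind GZIP. The class name CompressionLevelDemo and the sample payload are invented for the example, and the timings it prints are only indicative.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class CompressionLevelDemo {
    // Compresses the input at the given level and returns the compressed size in bytes.
    static int compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buffer = new byte[4096];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end(); // release native resources
        }
    }

    // Builds a repetitive, JSON-like payload (made up for this example).
    static byte[] sampleData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            sb.append("{\"id\":").append(i)
              .append(",\"status\":\"ok\",\"region\":\"eu-west\"}\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleData();
        int[] levels = {Deflater.BEST_SPEED, Deflater.DEFAULT_COMPRESSION, Deflater.BEST_COMPRESSION};
        for (int level : levels) {
            long start = System.nanoTime();
            int size = compressedSize(data, level);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.printf("level=%d input=%d compressed=%d time=%dus%n",
                    level, data.length, size, micros);
        }
    }
}
```

Higher levels generally spend more CPU time to produce smaller output, which is the knob this KIP proposes to expose in Kafka's configuration.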
KIP-391: Allow producing with offsets for cluster replication
Replicating data between clusters is a very common operation. One difficulty, however, is handling consumer offsets, because the same message is likely to have a different offset in each cluster. This KIP proposes to support replicating messages together with their offsets. This would make it easy to replicate committed offsets and would allow consumers to rely on them even when switching clusters.
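The offset mismatch can be shown with a toy model. The sketch below is not Kafka code and does not reflect the API proposed in the KIP: ToyLog, appendWithOffset, and replicate are hypothetical names invented for this illustration. It copies a source log whose first retained offset is 100 into a fresh destination, once letting the destination assign offsets and once preserving the source offsets.

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetReplicationDemo {
    // A toy append-only log; not Kafka's actual implementation.
    static final class ToyLog {
        final List<Long> offsets = new ArrayList<>();
        final List<String> values = new ArrayList<>();
        long nextOffset;

        ToyLog(long startOffset) {
            this.nextOffset = startOffset;
        }

        // Regular produce: the log assigns the next offset itself.
        long append(String value) {
            long offset = nextOffset++;
            offsets.add(offset);
            values.add(value);
            return offset;
        }

        // Hypothetical produce-with-offset: the caller supplies the offset,
        // which must not move backwards.
        void appendWithOffset(long offset, String value) {
            if (offset < nextOffset) {
                throw new IllegalArgumentException("offset " + offset + " is below " + nextOffset);
            }
            offsets.add(offset);
            values.add(value);
            nextOffset = offset + 1;
        }

        boolean contains(long offset) {
            return offsets.contains(offset);
        }
    }

    // Copies every record from the source into a new, empty destination log.
    static ToyLog replicate(ToyLog source, boolean preserveOffsets) {
        ToyLog destination = new ToyLog(0);
        for (int i = 0; i < source.values.size(); i++) {
            if (preserveOffsets) {
                destination.appendWithOffset(source.offsets.get(i), source.values.get(i));
            } else {
                destination.append(source.values.get(i));
            }
        }
        return destination;
    }

    public static void main(String[] args) {
        // The source starts at offset 100, as if older records had been removed by retention.
        ToyLog source = new ToyLog(100);
        source.append("a");
        source.append("b");
        source.append("c");

        long committed = 101; // offset a consumer committed on the source cluster

        ToyLog plain = replicate(source, false);
        ToyLog preserving = replicate(source, true);
        System.out.println("plain copy offsets: " + plain.offsets
                + ", committed offset valid: " + plain.contains(committed));
        System.out.println("preserving copy offsets: " + preserving.offsets
                + ", committed offset valid: " + preserving.contains(committed));
    }
}
```

In the plain copy the committed offset points at nothing, so a consumer switching clusters would need some form of offset translation. In the offset-preserving copy it can resume directly.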
KIP-392: Allow consumers to fetch from closest replica
Currently, Producers and Consumers have to connect to partition leaders in order to send or receive messages. This KIP proposes to give Consumers some flexibility by allowing them to fetch messages from other replicas, as long as those replicas are in-sync. In addition to potentially spreading client load across more brokers, this is especially important when a Kafka cluster spans multiple availability zones, because Consumers would be able to fetch from the closest replica.
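One way to picture the selection rule is the toy sketch below. It is an assumption-laden illustration rather than the mechanism defined in the KIP: the Replica type, the zone labels, and the select method are all hypothetical. The rule it implements is to prefer an in-sync replica in the consumer's availability zone and to fall back to the partition leader otherwise.

```java
import java.util.Arrays;
import java.util.List;

final class ClosestReplicaDemo {
    // Minimal description of a replica; all fields are invented for this example.
    static final class Replica {
        final int brokerId;
        final String zone;
        final boolean leader;
        final boolean inSync;

        Replica(int brokerId, String zone, boolean leader, boolean inSync) {
            this.brokerId = brokerId;
            this.zone = zone;
            this.leader = leader;
            this.inSync = inSync;
        }
    }

    // Returns the first in-sync replica in the consumer's zone, or the leader if there is none.
    static Replica select(List<Replica> replicas, String consumerZone) {
        Replica leader = null;
        for (Replica replica : replicas) {
            if (replica.leader) {
                leader = replica;
            }
        }
        if (leader == null) {
            throw new IllegalStateException("partition has no leader");
        }
        for (Replica replica : replicas) {
            if (replica.inSync && replica.zone.equals(consumerZone)) {
                return replica;
            }
        }
        return leader;
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(
                new Replica(1, "zone-a", true, true),   // leader
                new Replica(2, "zone-b", false, true),  // in-sync follower
                new Replica(3, "zone-c", false, false)  // follower that has fallen behind
        );
        for (String zone : new String[] {"zone-a", "zone-b", "zone-c"}) {
            System.out.println("consumer in " + zone + " fetches from broker "
                    + select(replicas, zone).brokerId);
        }
    }
}
```

The fallback to the leader matters for the out-of-sync case: a consumer in the same zone as a lagging follower should still read from a replica that has all committed messages.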