Kafka Monthly Digest: November 2018

What happened in the Kafka community in November 2018?

Kafka Releases:

  • 2.1.0: Dong Lin released this new version on November 21, 2018, and it contains a number of interesting new features:
    • Java 11 support: Apache Kafka is keeping up with the faster JVM release cycle. See the OpenJDK page for Java 11 for a list of new features.
    • Support for Zstandard compression: Zstd offers great compression ratios and is also really fast. Consequently, on most workloads, it can provide significant throughput improvements compared to Snappy, GZIP, and LZ4. See KIP-110 for more details.
    • Updated replication protocol: The work to strengthen the inter-broker protocol continued in this release. TLA+, a formal specification language, was used to model and verify the replication protocol.
    • A ton of client improvements: This includes better timeouts for Producers, access to AdminClient metrics, new ACL for listing consumer groups, and DNS improvements for Kerberos and Cloud environments.
    • Improved Streams API: This includes a few new metrics, cleanups of StoreSuppliers APIs, and better timestamp synchronization.
  • 2.0.1: This release came out on November 9, 2018, and contains 51 fixes. Thanks to Manikumar Reddy for running this release.

Finally, you can access the release plan on the wiki for the full details about each release: 2.0.1 and 2.1.0.
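
To make the Zstandard support in 2.1.0 a little more concrete, here is a minimal sketch of a producer configuration that opts into it. The property names `bootstrap.servers` and `compression.type` are standard Kafka producer settings; the helper `producer_config` and the broker address are hypothetical and only serve the illustration. Note that, per KIP-110, brokers and consumers also need to be on a version that understands zstd.

```python
# Compression codecs accepted by the producer's compression.type setting as of Kafka 2.1.0.
SUPPORTED_CODECS = {"none", "gzip", "snappy", "lz4", "zstd"}


def producer_config(bootstrap_servers, compression="zstd"):
    """Build a dict of Kafka-style producer properties (hypothetical helper)."""
    if compression not in SUPPORTED_CODECS:
        raise ValueError("unsupported compression type: %s" % compression)
    return {
        "bootstrap.servers": bootstrap_servers,
        "compression.type": compression,
    }


# "broker-1:9092" is a placeholder address, not a real endpoint.
config = producer_config("broker-1:9092")
for key, value in sorted(config.items()):
    print("%s=%s" % (key, value))
```

A dict like this can be handed to any client that accepts Kafka-style property names, or written out as a properties file.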

KIPs:

Last month, the community submitted 12 KIPs (KIP-387 to KIP-398). These are the ones that caught my eye:

KIP-388: Add observer interface to record request and response
This KIP proposes adding an interface to brokers to observe requests and responses. The goal is to allow access to information that cannot be retrieved using existing metrics like latency per topic or bytes transmitted per principal.

KIP-390: Allow fine-grained configuration for compression
Compression operations involve a tradeoff between speed and compressed size. Therefore, compression algorithms expose a compression level that lets users adjust this tradeoff. This KIP's goal is to allow users to specify the compression level they want when configuring compression in Producers or Brokers, instead of always using the default.
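
The tradeoff behind this KIP can be observed with any codec that exposes a level. The sketch below uses Python's standard `zlib` module as a stand-in; it is not Kafka's code path, and the payload is synthetic data invented for the illustration. It compresses the same bytes at three levels and prints the resulting size and the time taken.

```python
import random
import time
import zlib

# Synthetic, compressible payload (an assumption for this demo, not real Kafka data).
rng = random.Random(0)
actions = ["click", "view", "purchase", "logout"]
payload = "".join(
    '{"user": "user%d", "action": "%s"}\n' % (rng.randint(0, 999), rng.choice(actions))
    for _ in range(20000)
).encode("utf-8")


def compress_at(level):
    """Compress the payload at the given zlib level; return (size in bytes, seconds)."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


print("original size=%d bytes" % len(payload))
for level in (1, 6, 9):  # 1 = fastest, 9 = smallest output, 6 = zlib's usual default
    size, seconds = compress_at(level)
    print("level=%d size=%d bytes time=%.2f ms" % (level, size, seconds * 1000))
```

Exact numbers depend on the machine and the data, which is the point of making the level configurable rather than fixed.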

KIP-391: Allow producing with offsets for cluster replication
Replicating data between clusters is a very common operation. However, one difficulty is handling consumer offsets, as messages are likely to have different offsets in each cluster. This KIP proposes to support replicating messages together with their offsets. This would enable you to easily replicate committed offsets and allow consumers to rely on them even when switching clusters.
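
The offset problem can be shown with a toy model. In the sketch below, a partition is just a list of (offset, value) pairs and `replicate` is a hypothetical function, not part of any Kafka API. Without offset preservation the target log numbers records from 0, so an offset committed against the source no longer points at the same record.

```python
def replicate(source, preserve_offsets):
    """Copy records into an empty target log, returned as an offset -> value map."""
    target = {}
    for position, (offset, value) in enumerate(source):
        # A fresh target partition assigns offsets 0, 1, 2, ... unless the
        # source offsets are carried over, as the KIP proposes.
        target_offset = offset if preserve_offsets else position
        target[target_offset] = value
    return target


# Hypothetical source partition whose older records were already removed by
# retention, so its first remaining offset is 1000.
source = [(1000, "a"), (1001, "b"), (1002, "c"), (1003, "d")]
committed = 1002  # next record the consumer would read in the source cluster

plain = replicate(source, preserve_offsets=False)
preserved = replicate(source, preserve_offsets=True)

print(plain.get(committed))      # None: offset 1002 does not exist in the target
print(preserved.get(committed))  # c: the committed offset still points at the same record
```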

KIP-392: Allow consumers to fetch from closest replica
Currently, Producers and Consumers have to connect to partition leaders in order to send or receive messages. This KIP proposes to give some flexibility to Consumers and allow them to fetch messages from other replicas as long as they are in-sync. In addition to potentially spreading client load on more brokers, this is especially important when a Kafka cluster is spanning multiple availability zones because Consumers would be able to fetch from the closest replica.
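
One way to picture the selection logic is the sketch below. It is only an illustration of the idea described above: the `Replica` type, its fields, the zone names, and `select_fetch_replica` are hypothetical and are not taken from the KIP. The consumer fetches from an in-sync replica in its own zone when one exists, and otherwise falls back to the partition leader.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Replica:
    broker_id: int
    rack: Optional[str]  # availability zone of the broker, if known
    is_leader: bool
    in_sync: bool


def select_fetch_replica(replicas: List[Replica], client_rack: Optional[str]) -> Replica:
    """Prefer an in-sync replica in the client's zone; otherwise use the leader."""
    leader = next(r for r in replicas if r.is_leader)
    if client_rack is None or leader.rack == client_rack:
        return leader
    for replica in replicas:
        if replica.in_sync and replica.rack == client_rack:
            return replica
    return leader


replicas = [
    Replica(broker_id=1, rack="zone-a", is_leader=True, in_sync=True),
    Replica(broker_id=2, rack="zone-b", is_leader=False, in_sync=True),
    Replica(broker_id=3, rack="zone-c", is_leader=False, in_sync=False),
]

print(select_fetch_replica(replicas, "zone-b").broker_id)  # 2: in-sync replica in the same zone
print(select_fetch_replica(replicas, "zone-c").broker_id)  # 1: local replica is out of sync
print(select_fetch_replica(replicas, None).broker_id)      # 1: no zone known, use the leader
```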

Blogs:


IBM Event Streams for Cloud is Apache Kafka-as-a-service for IBM Cloud.
