What happened in the Kafka community in March 2019?
Releases:
After three release candidates, Matthias J. Sax released Apache Kafka 2.2.0 on March 26, 2019. This new minor version contains a number of interesting features:
- The ability to re-authenticate SASL connections periodically (KIP-368)
- Hardened inter-broker protocol (KIP-380)
- The ability to separate Controller traffic from the data plane (KIP-291)
- Producer/AdminClient/Streams API improvements
- Improved consumer group management (KIP-289)
- Metrics without a value are now emitted as NaN instead of various values (KIP-386)
- Updated kafka-topics.sh tool: it now uses the AdminClient API and no longer requires access to ZooKeeper (KIP-377)
- All command line tools now accept the --help flag (KIP-374)
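The NaN change in KIP-386 matters for anyone post-processing metrics, because NaN never compares equal to anything (including itself) and silently poisons sums and averages. Here is a minimal sketch of how a monitoring script might skip such placeholders. The function name `summarize` and the sample readings are hypothetical and are not part of Kafka or any Kafka client library.

```python
import math


def summarize(samples):
    """Average the numeric samples, ignoring NaN placeholders.

    Returns NaN when no sample carries a value, so the "no value"
    convention is preserved for downstream consumers.
    """
    valid = [s for s in samples if not math.isnan(s)]
    if not valid:
        return float("nan")
    return sum(valid) / len(valid)


# Hypothetical readings collected from a metrics reporter;
# the middle one stands for a metric that has no value yet.
readings = [12.0, float("nan"), 18.0]
print(summarize(readings))  # 15.0
```

Filtering with `math.isnan` rather than an equality check is deliberate: `float("nan") == float("nan")` is false, so a comparison-based filter would let the placeholder through.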
KIPs:
Last month, the community submitted 10 KIPs (KIP-438 to KIP-447). These are the ones that caught my eye:
KIP-440: Extend Connect Converter to support headers
At the moment, Kafka Connect Converters don’t support message headers. In environments that rely heavily on headers, this makes it impossible to use Connect to import or export messages. This KIP aims to close this gap by allowing Converters to use headers.
KIP-443: Return to default segment.ms and segment.index.bytes in Streams repartition topics
In 2.0.0, the default configurations of Kafka Streams repartition topics changed in order to improve the performance of high-throughput applications. However, it turns out these settings are a bit too aggressive for low-throughput use cases. This KIP proposes removing these settings and using the default values instead.
KIP-444: Augment metrics for Kafka Streams
Operating Streams applications is currently relatively hard because several key metrics are missing. This KIP aims to address this, starting with a full review of all existing Streams metrics. The idea is to remove metrics that turned out not to be useful and to provide new metrics that allow better monitoring and ease debugging.
Blogs:
- https://medium.com/@gwenshapira/the-case-for-database-first-pipelines-f86240c69863
- https://www.memsql.com/blog/how-we-use-exactly-once-semantics-with-apache-kafka/