Apache Kafka + Apache Flink = Match Made in Heaven

Kai Waehner
12 min read · May 5, 2023

Apache Kafka and Apache Flink are increasingly joining forces to build innovative real-time stream processing applications. This blog post explores the benefits of combining both open-source frameworks, shows unique differentiators of Flink versus Kafka, and discusses when to use a Kafka-native streaming engine like Kafka Streams instead of Flink.

(Originally posted on Kai Waehner’s blog: “Apache Kafka + Apache Flink = Match Made in Heaven”.)

The tremendous adoption of Apache Kafka and Apache Flink

Apache Kafka has become the de facto standard for data streaming. At its core, Kafka combines messaging at any scale with distributed storage (a commit log), which provides durability, decoupling of applications, and replayability of historical data.
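To make the commit log idea concrete, here is a minimal in-memory sketch in Python. It is a toy model, not Kafka's API: the names `CommitLog`, `Consumer`, `poll`, and `seek` are hypothetical and chosen only for illustration, and a real Kafka topic is partitioned, replicated, and persisted to disk. The sketch only shows why an append-only log with per-consumer offsets gives you decoupling and replayability.

```python
from dataclasses import dataclass, field


@dataclass
class CommitLog:
    """Toy stand-in for one topic partition (hypothetical, not Kafka's API)."""
    records: list = field(default_factory=list)

    def append(self, value):
        """Append a record and return its offset (its position in the log)."""
        self.records.append(value)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records from offset; reading never deletes."""
        return self.records[offset:offset + max_records]


@dataclass
class Consumer:
    """Each consumer tracks its own offset, independent of other consumers."""
    log: CommitLog
    offset: int = 0

    def poll(self, max_records=10):
        """Read the next batch and advance this consumer's offset."""
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        """Move the offset, e.g. back to 0 to replay historical records."""
        self.offset = offset


log = CommitLog()
for payment in ["p1", "p2", "p3"]:
    log.append(payment)

fraud_check = Consumer(log)
reporting = Consumer(log)

first_pass = fraud_check.poll()   # reads all three records
late_reader = reporting.poll(2)   # a second consumer reads at its own pace
fraud_check.seek(0)               # rewind to the start of the log
replayed = fraud_check.poll()     # the same records are delivered again
```

Because reading does not remove records, the producer never needs to know who consumes the data or when, and any consumer can rewind its offset to reprocess history.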

Kafka also includes a stream processing engine, Kafka Streams. KSQL is a successful Kafka-native streaming SQL engine built on top of Kafka Streams. Both are fantastic tools. In parallel, Apache Flink has become a very successful stream processing engine.

The first prominent Kafka + Flink case study I remember is the fraud detection use case of ING Bank. The first publications appeared in 2017, i.e., over five years ago: “StreamING Machine Learning Models: How ING Adds Fraud Detection Models at Runtime with Apache Kafka and Apache Flink”. This is just one of many Kafka fraud detection case studies.

One of the most recent case studies I blogged about points in the same direction: “Why DoorDash migrated from Cloud-native Amazon SQS and Kinesis to Apache Kafka and Flink”.

The adoption of Kafka is already outstanding, and Flink is finding its way into more and more enterprises, very often in combination with Kafka. This article is not an introduction to Apache Kafka or Apache Flink. Instead, I explore why these two technologies are a perfect match for many use cases and when other Kafka-native tools are a more appropriate choice than Flink.

Top reasons Apache Flink is a perfect complementary technology for Kafka

Kai Waehner

Technology Evangelist — www.kai-waehner.de → Big Data Analytics, Data Streaming, Apache Kafka, Middleware, Microservices => linkedin.com/in/kaiwaehner