Is Apache Kafka really real-time? No, and yes!

Kai Waehner
10 min read · Jan 11, 2021

Is Apache Kafka really real-time? This is a question I get asked every week. Real-time is a great marketing term to describe how businesses can add value by processing data as fast as possible. Most software and product vendors use it these days, including messaging frameworks (e.g., IBM MQ, RabbitMQ), event streaming platforms (e.g., Apache Kafka, Confluent), data warehouse and analytics vendors (e.g., Spark, Snowflake, Elasticsearch), and security/SIEM products (e.g., Splunk). This blog post explores what “real-time” really means and how Apache Kafka and other messaging frameworks accomplish the mission of providing real-time data processing.

(Originally posted on Kai Waehner’s blog: “Apache Kafka is NOT hard real-time, but used everywhere in Manufacturing 4.0 and Industrial IoT”)

Definition: What is real-time?

Defining the term “real-time” is not easy. However, it is essential to agree on a definition before you start any discussion about this topic.

In general, real-time computing (sometimes called reactive computing) is the computer science term for hardware and software systems subject to a “real-time constraint”, for example, from event to system response. Real-time programs must guarantee a response within specified time constraints, often referred to as “deadlines”…
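To make the idea of a “deadline” concrete, here is a minimal Python sketch that measures the time from event to response and compares it with a time budget. The names `respond_within`, `DeadlineResult`, `fast_handler`, and `slow_handler` are hypothetical and exist only for this illustration, and the 20 ms budget is an arbitrary assumption. Note that the sketch only observes whether a deadline was met after the fact; it does not guarantee anything, which is exactly the gap between measuring latency and being a real-time system.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class DeadlineResult:
    """Outcome of one event-to-response measurement (hypothetical helper)."""
    latency_s: float   # measured time from event to response, in seconds
    deadline_s: float  # the time constraint the response had to meet

    @property
    def met(self) -> bool:
        # The deadline is met if the response arrived within the budget.
        return self.latency_s <= self.deadline_s


def respond_within(handler: Callable[[str], str], event: str,
                   deadline_s: float) -> DeadlineResult:
    """Run the handler on one event and check its latency against the deadline."""
    start = time.perf_counter()            # monotonic, high-resolution clock
    handler(event)                         # the "system response" to the event
    latency = time.perf_counter() - start
    return DeadlineResult(latency_s=latency, deadline_s=deadline_s)


def fast_handler(event: str) -> str:
    # Trivial work: normally finishes far below the assumed 20 ms budget.
    return event.upper()


def slow_handler(event: str) -> str:
    # Simulates slow processing: sleeps 50 ms, so a 20 ms deadline is missed.
    time.sleep(0.05)
    return event.upper()


if __name__ == "__main__":
    budget = 0.02  # assumed deadline of 20 ms, chosen only for illustration
    for handler in (fast_handler, slow_handler):
        result = respond_within(handler, "sensor-reading", budget)
        status = "met" if result.met else "missed"
        print(f"{handler.__name__}: {result.latency_s * 1000:.2f} ms, deadline {status}")
```

A general-purpose runtime like this one can report a missed deadline, but it cannot promise that the next response will be on time.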

Kai Waehner

Technology Evangelist — www.kai-waehner.de → Big Data Analytics, Data Streaming, Apache Kafka, Middleware, Microservices => linkedin.com/in/kaiwaehner