Serverless Kafka in a Cloud-native Data Lake Architecture

Kai Waehner
14 min read · Aug 6, 2021

Apache Kafka has become the de facto standard for processing data in motion. Kafka is open, flexible, and scalable. Unfortunately, that scalability makes operations a challenge for many teams. Ideally, teams can use a serverless Kafka SaaS offering and focus on business logic. However, hybrid scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden. This blog post explores how to leverage cloud-native and serverless Kafka offerings in a hybrid cloud architecture. We start from the perspective of data at rest with a data lake and explore its relation to data in motion with Kafka.

(Originally posted on Kai Waehner’s blog: “Serverless Kafka in a Cloud-native Data Lake Architecture”…)

Data at Rest — Still the Right Approach?

Data at rest means storing data in a database, data warehouse, or data lake. As a result, the data is processed too late in many use cases, even if a real-time streaming component (like Kafka) ingests it. The processing is still a web service call, SQL query, or map-reduce batch process away from delivering an answer to your problem.
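To make the distinction concrete, here is a minimal sketch in plain Python. It is an illustration only and uses no Kafka client: the `Payment` type, the two function names, and the threshold are all hypothetical. The first function queries events after they have been stored; the second reacts to each event as it arrives.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Payment:
    # Hypothetical event type, invented for this illustration.
    account: str
    amount: float


def detect_at_rest(stored: List[Payment], limit: float) -> List[str]:
    """Data at rest: events are stored first, then queried later in one batch."""
    return [p.account for p in stored if p.amount > limit]


def detect_in_motion(stream: Iterable[Payment], limit: float) -> Iterator[str]:
    """Data in motion: each event is checked as it arrives, before any storage."""
    for p in stream:
        if p.amount > limit:
            # The alert is yielded while later events are still pending.
            yield p.account


events = [Payment("a", 10.0), Payment("b", 5000.0), Payment("c", 20.0)]

# At rest: no result exists until the batch has landed and the query runs.
batch_alerts = detect_at_rest(events, limit=1000.0)

# In motion: the alert for "b" is available as soon as that event is seen,
# without waiting for "c" to arrive.
stream_alerts = detect_in_motion(iter(events), limit=1000.0)
first_alert = next(stream_alerts)
```

Both paths find the same account. The difference is when the answer becomes available: after the batch query runs, or at the moment the event is observed.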


Written by Kai Waehner

Technology Evangelist — www.kai-waehner.de → Big Data Analytics, Data Streaming, Apache Kafka, Middleware, Microservices => linkedin.com/in/kaiwaehner