Serverless Kafka in a Cloud-native Data Lake Architecture
Apache Kafka has become the de facto standard for processing data in motion. Kafka is open, flexible, and scalable. Unfortunately, that flexibility and scalability make operations a challenge for many teams. Ideally, teams can use a serverless Kafka SaaS offering and focus on business logic. However, hybrid scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden. This blog post explores how to leverage cloud-native and serverless Kafka offerings in a hybrid cloud architecture. We start from the perspective of data at rest with a data lake and explore its relation to data in motion with Kafka.
(Originally posted on Kai Waehner’s blog: “Serverless Kafka in a Cloud-native Data Lake Architecture”…)
Data at Rest — Still the Right Approach?
Data at rest means storing data in a database, data warehouse, or data lake. In many use cases, this means the data is processed too late, even if a real-time streaming component (like Kafka) ingests it. The result is still a web service call, SQL query, or map-reduce batch process away from answering your question.
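To make the difference concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a Kafka implementation: the sensor readings, the threshold, and the function names are hypothetical, an in-memory SQLite table stands in for the data store, and a plain generator stands in for a stream consumer.

```python
import sqlite3

# Hypothetical sensor readings and threshold, for illustration only.
EVENTS = [("sensor-1", 71.0), ("sensor-2", 98.5), ("sensor-1", 101.2)]
THRESHOLD = 95.0


def at_rest(events):
    """Store every event first, then find hot readings with a later query."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE readings (sensor TEXT, temp REAL)")
    db.executemany("INSERT INTO readings VALUES (?, ?)", events)
    db.commit()
    # Nothing is detected until someone (or a scheduled job) runs this query.
    rows = db.execute(
        "SELECT sensor, temp FROM readings WHERE temp > ? ORDER BY rowid",
        (THRESHOLD,),
    ).fetchall()
    db.close()
    return rows


def in_motion(events):
    """React to each event as it arrives, without waiting for a query."""
    for sensor, temp in events:
        if temp > THRESHOLD:
            yield (sensor, temp)


if __name__ == "__main__":
    print("at rest:  ", at_rest(EVENTS))
    print("in motion:", list(in_motion(EVENTS)))
```

Both functions find the same readings. The difference is when: the first only answers once the query runs, while the second can raise each alert the moment the event passes through.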