Member-only story

Kafka-native Machine Learning and Model Deployment

7 min readDec 14, 2020

(Originally posted on Kai Waehner’s blog: “Kafka-native Machine Learning and Model Deployment”)

Apache Kafka became the de facto standard for event streaming across the globe and industries. Machine Learning (ML) includes model training on historical data and model deployment for scoring and predictions. While training is mostly batch, scoring usually requires real-time capabilities at scale and reliability. Apache Kafka plays a key role in modern machine learning infrastructures. The next-generation architecture leverages a Kafka-native streaming model server instead of RPC (HTTP/gRPC) calls:

This blog post explores the architectures and trade-offs between three options for model deployment with Kafka: Embedded model into the Kafka application, model server and RPC, model server, and Kafka-native communication.

Kafka and Machine Learning Architecture

Model deployment is usually completely separated from model training (from the process and the technology perspective). The model training is often executed in elastic cloud infrastructure or data lakes such as Hadoop or Spark. Model scoring (i.e., doing the predictions) is usually a mission-critical workload with different uptime SLAs and latency requirements:

Kafka-native Machine Learning and Model Deployment

Kafka and Machine Learning Architecture

Written by Kai Waehner

No responses yet