Apache Kafka
A distributed event streaming platform.
Apache Kafka stores streams of events in a partitioned, replicated log. Producers append events to topics; consumers read independently at their own pace, with each topic split into partitions that allow horizontal scaling. The log-as-a-service model — events kept around in order for hours, days, or indefinitely, with multiple readers each at their own position — has made Kafka a foundational piece of modern data infrastructure.
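The model described above can be illustrated with a toy in-memory sketch. This is not Kafka's implementation or client API: the class PartitionedLog, its methods, and the reader names are hypothetical, and the CRC32 key hash is an assumption chosen for simplicity (Kafka's own partitioner uses a different hash). It only shows that appends go to a partition chosen by key, that reads do not remove events, and that each reader tracks its own position.

```python
from collections import defaultdict
from zlib import crc32


class PartitionedLog:
    """Toy stand-in for one topic; hypothetical, not Kafka's real API."""

    def __init__(self, num_partitions):
        # Each partition is an append-only list of (key, value) events.
        self.partitions = [[] for _ in range(num_partitions)]
        # offsets[reader][partition] is the next offset that reader will see.
        self.offsets = defaultdict(lambda: defaultdict(int))

    def append(self, key, value):
        """Append an event; the same key always lands in the same partition."""
        # Assumption: CRC32 is used here only to get a stable hash.
        partition = crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        return partition, len(self.partitions[partition]) - 1

    def poll(self, reader, partition, max_events=10):
        """Return unread events for this reader and advance only its offset."""
        start = self.offsets[reader][partition]
        batch = self.partitions[partition][start:start + max_events]
        self.offsets[reader][partition] = start + len(batch)
        return batch


log = PartitionedLog(num_partitions=3)
for i in range(5):
    log.append("user-42", f"click-{i}")
p, _ = log.append("user-42", "click-5")

fast = log.poll("analytics", p)            # reads everything available
slow = log.poll("audit", p, max_events=2)  # reads at its own pace
print(len(fast), len(slow))  # prints "6 2"
```

Because polling never deletes events, a reader that falls behind (here "audit") can catch up later without affecting any other reader.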
LinkedIn open-sourced Kafka in 2011, and the project moved to the Apache Software Foundation shortly after. Confluent, a company founded by several of Kafka's original developers, continues substantial commercial development on top of Apache Kafka and provides a hosted Kafka service. Older versions of Kafka relied on Apache ZooKeeper for cluster coordination; the more recent KRaft mode builds that coordination directly into Kafka using a Raft-based consensus protocol.
Kafka underpins a large share of data engineering work: ETL pipelines, real-time analytics, event-driven microservices, and metric streams. LinkedIn, Netflix, Uber, Twitter (X), Airbnb, and many smaller companies run Kafka clusters as the backbone of their data plane.
Install
Download from https://kafka.apache.org/downloads

Or via Docker:

    docker run -p 9092:9092 apache/kafka:latest
Authors
- Apache Software Foundation
- Confluent