Apache Cassandra
A wide-column distributed NoSQL database.
Apache Cassandra stores data across a peer-to-peer ring of nodes with no single point of failure. There is no master; every node is equal. Data is partitioned by hash of a chosen partition key, replicated across configurable numbers of nodes (typically three), and made available for read/write from any node in the cluster. The model is tunable consistency: each operation specifies how many replicas must acknowledge before it is considered successful.
Avinash Lakshman and Prashant Malik developed Cassandra at Facebook in 2008, and Facebook open-sourced it in 2009. The project moved to the Apache Software Foundation in 2010. The data model draws from Bigtable's wide-column structure but the distribution model owes more to Amazon's Dynamo paper.
Cassandra excels at write-heavy workloads that need to scale linearly across hundreds of machines while tolerating regular node and network failures. Apple, Netflix, eBay, GoDaddy, and many other large operators run Cassandra at the scale of hundreds of nodes per cluster. ScyllaDB is a high-performance C++ rewrite that speaks the same protocol.
Install
Debian/Ubuntu: see https://cassandra.apache.org/_/quickstart.html Or via Docker: docker run -p 9042:9042 cassandra:latest
Authors
- Apache Software Foundation