January 10, 2017
Notes on Apache Cassandra
These notes are based on a short intro.
1. Cassandra has a thing called the "partition key", which is a part of the _primary key_
Partition key -> consistent hashing function -> the resulting value determines which bucket (range of hashes) it fits into -> which node we need to talk to
2. CAP theorem tradeoffs. Cassandra is highly available (Availability) and partition tolerant (Partition tolerance), at the cost of strict Consistency
3. Replication. RF = 3 is the replication factor: each piece of data is stored on 3 nodes.
4. Consistency level (how many replica nodes must confirm that they have written the data): { All, Quorum, One }. It can also have a "Local" modifier in case of multiple data centers. We can replicate to different DCs, for example one for OLAP and one for OLTP, so that the different kinds of queries do not impact each other's performance
5. CL (consistency level) is set per query; RF (replication factor) is set per keyspace
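
To make points 1, 3 and 4 concrete, here is a minimal Python sketch of a token ring. It is a toy model, not Cassandra's implementation: the `Ring` class, the `token` and `quorum` helpers and the node names are all made up for illustration, MD5 stands in for the real partitioner hash, and each node gets a single token. Replicas are simply the next RF nodes clockwise on the ring, which roughly mirrors how a single-data-center setup is usually described.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


class Ring:
    """Toy token ring: one token per node, replicas are the next RF nodes clockwise."""

    def __init__(self, nodes, replication_factor=3):
        if replication_factor > len(nodes):
            raise ValueError("replication factor cannot exceed the number of nodes")
        self.rf = replication_factor
        # Each node owns the hash range that ends at its own token.
        self.entries = sorted((token(name), name) for name in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key: str):
        """Return the RF nodes responsible for this partition key."""
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.entries)
        return [
            self.entries[(start + i) % len(self.entries)][1]
            for i in range(self.rf)
        ]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
    print(ring.replicas("user:42"))  # three distinct node names
    print(quorum(ring.rf))           # 2 when RF = 3
```

With RF = 3 a quorum is 2 nodes, so a Quorum write followed by a Quorum read always overlaps in at least one replica (2 + 2 > 3). That overlap is why Quorum is the usual middle ground between One and All.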
Labels: apache cassandra, big data, cassandra, software architecture