Cassandra horizontal scaling
Overview
This document provides the insights of horizontally scaling cassandra cluster with the observations and recommendations from benchmarking different scenarios.
Tests and Observations
To test the scenarios, a keyspace with replication factor of 3 and a schema with all possible data types which are currently used in the platform is considered.
CREATE KEYSPACE test_keyspace_rf3 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
CREATE TABLE test_keyspace_rf3.user_quorum_rw (
id text PRIMARY KEY,
channel text,
countrycode text,
createdby text,
createddate text,
currentlogintime text,
dob text,
email text,
emailverified boolean,
firstname text,
flagsvalue int,
framework map<text, frozen<list<text>>>,
gender text,
grade list<text>,
language list<text>,
lastname text,
loginid text,
password text,
phone text,
phoneverified boolean,
profilevisibility map<text, text>,
status int,
updatedby text,
updateddate text,
userid text,
username text,
usertype text
);
CREATE INDEX inx_t3uq_userid ON test_keyspace_rf3.user_quorum_rw (userid);
CREATE INDEX inx_t3uq_loginid ON test_keyspace_rf3.user_quorum_rw (loginid);
CREATE INDEX inx_t3uq_email ON test_keyspace_rf3.user_quorum_rw (email);
CREATE INDEX inx_t3uq_status ON test_keyspace_rf3.user_quorum_rw (status);
CREATE INDEX inx_t3uq_phone ON test_keyspace_rf3.user_quorum_rw (phone);
CREATE INDEX inx_t3uq_username ON test_keyspace_rf3.user_quorum_rw (username);
1. Read and Write with QUORUM
Test was conducted on both 3 node cluster and 5 node cluster for reading data just after writing with same consistency level of QUORUM.
3Nodes Result:
No. of requests | 50M (1000 Threads) |
Throughput | 8098.5 per sec |
CPU usage | 89% Max |
Error | 0 |
5Nodes Result:
No. of requests | 100M (1000 Threads) |
Throughput | 11429.6/s |
CPU usage | 85% Max |
Error | 3 - For writes, 1700 - for reads |
Observation:
With 5nodes cluster, there is a considerable amount of increase in throughput, and a very negligible amount of errors.
Since the consistency is QUORUM, the data availability and consistency is met and this is the recommended consistency for both write and read.
2. Read with secondary index
Test was conducted on both 3 node cluster and 5 node cluster for reading data using secondary index, as defined in the above mentioned schema.
3Nodes Result:
No. of requests | 1.6M (1000 threads) |
Throughput | 7847.6 per sec |
CPU usage | 65% Max |
Error | 0 |
Avg response time | 124 ms |
5Nodes Result:
No. of requests | 1.6M (1000 threads) |
Throughput | 894.2 per sec |
CPU usage | 91% Max |
Error | 0 |
Avg response time | 1.1 s |
Observation:
As the number of nodes in the cluster increase, the performance of reads with secondary index decreases exponentially.
It is evident from the above tests, the average response time has increased drastically, as cassandra scans each node for the index value, thereby reducing the performance.
Reading from secondary index is not dependent on the consistency level.
It is not recommended to query data using secondary indices.
3. Read with QUORUM
Test was conducted on both 3 node cluster and 5 node cluster for reading data with QUROUM consistency.
3nodes result:
No. of requests | 50M (1000 threads) |
Throughput | 20099.2 per sec |
CPU usage | 70% Max |
Error | 0 |
5nodes result:
No. of requests | 50M (1000 threads) |
Throughput | 22683.9 per sec |
CPU usage | 59% Max |
Error | 0 |
Observation:
From the above results, with increased number of concurrency, we can get better throughput with 5 nodes cluster.
4. Write with QUORUM
Test was conducted on both 3 node cluster and 5 node cluster for writing data with QUROUM consistency.
3nodes result:
No. of requests | 50M (1000 threads) |
Throughput | 12804.8 per sec |
CPU usage | 88% Max |
Error | 0 |
5nodes result:
No. of requests | 50M (1000 threads) |
Throughput | 18014.7 per sec |
CPU usage | 91% Max |
Error | 0 |
Observation:
Writes with QUORUM gave better throughout with 5 nodes cluster.
Summary
On adding nodes to cassandra cluster, writes and reads with QUROUM consistency level is recommended.
Avoiding or not using secondary indices for querying data is highly recommended as there is an exponential drop in performance.