Cassandra horizontal scaling

Overview

This document summarizes the observations and recommendations from benchmarking a horizontally scaled Cassandra cluster under different scenarios.

Tests and Observations

To test the scenarios, we used a keyspace with a replication factor of 3 and a table schema covering all the data types currently used in the platform.

CREATE KEYSPACE test_keyspace_rf3 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

CREATE TABLE test_keyspace_rf3.user_quorum_rw (
    id text PRIMARY KEY,
    channel text,
    countrycode text,
    createdby text,
    createddate text,
    currentlogintime text,
    dob text,
    email text,
    emailverified boolean,
    firstname text,
    flagsvalue int,
    framework map<text, frozen<list<text>>>,
    gender text,
    grade list<text>,
    language list<text>,
    lastname text,
    loginid text,
    password text,
    phone text,
    phoneverified boolean,
    profilevisibility map<text, text>,
    status int,
    updatedby text,
    updateddate text,
    userid text,
    username text,
    usertype text
);
CREATE INDEX inx_t3uq_userid ON test_keyspace_rf3.user_quorum_rw (userid);
CREATE INDEX inx_t3uq_loginid ON test_keyspace_rf3.user_quorum_rw (loginid);
CREATE INDEX inx_t3uq_email ON test_keyspace_rf3.user_quorum_rw (email);
CREATE INDEX inx_t3uq_status ON test_keyspace_rf3.user_quorum_rw (status);
CREATE INDEX inx_t3uq_phone ON test_keyspace_rf3.user_quorum_rw (phone);
CREATE INDEX inx_t3uq_username ON test_keyspace_rf3.user_quorum_rw (username);

1. Read and Write with QUORUM

The test was run on both a 3-node and a 5-node cluster, reading each record immediately after writing it, with QUORUM consistency for both operations.
3-node result:

No. of requests	50M (1000 threads)
Throughput	8098.5 per sec
CPU usage	89% Max
Error	0

5-node result:

No. of requests	100M (1000 threads)
Throughput	11429.6 per sec
CPU usage	85% Max
Error	3 for writes, 1700 for reads

Observation:
The 5-node cluster delivered considerably higher throughput (11429.6 vs 8098.5 per sec) with a negligible error rate: 1703 errors in 100M requests, about 0.002%.
Because QUORUM is used for both writes and reads, the consistency and availability requirements are both met, so QUORUM is the recommended consistency level for both operations.
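The QUORUM reasoning above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the benchmark tooling; the function names are hypothetical. It relies on the standard Cassandra rule that a quorum is a majority of the replicas (derived from the replication factor, not from the number of nodes in the cluster) and that a read is guaranteed to overlap the latest write when read replicas plus write replicas exceed the replication factor.

```python
def quorum_size(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read replica set intersects every write replica set."""
    return write_acks + read_acks > replication_factor


rf = 3                      # replication factor of test_keyspace_rf3
q = quorum_size(rf)         # replicas needed per QUORUM request
tolerated_down = rf - q     # replicas that may be unavailable per partition

print(f"quorum={q}, tolerated replica failures={tolerated_down}")
print("QUORUM write + QUORUM read overlap:", read_overlaps_write(rf, q, q))
print("ONE write + ONE read overlap:", read_overlaps_write(rf, 1, 1))
```

Since the replication factor stays at 3 when the cluster grows from 3 to 5 nodes, the quorum stays at 2 replicas per request; the extra nodes spread the partitions rather than raising the per-request cost.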

2. Read with secondary index

The test was run on both a 3-node and a 5-node cluster, reading data through the secondary indices defined in the schema above.
3-node result:

No. of requests	1.6M (1000 threads)
Throughput	7847.6 per sec
CPU usage	65% Max
Error	0
Avg response time	124 ms

5-node result:

No. of requests	1.6M (1000 threads)
Throughput	894.2 per sec
CPU usage	91% Max
Error	0
Avg response time	1.1 s

Observation:
As the number of nodes in the cluster increases, the performance of reads through a secondary index drops sharply: throughput fell from 7847.6 to 894.2 per sec (roughly 8.8x) when going from 3 to 5 nodes.
Over the same change, the average response time rose from 124 ms to 1.1 s, because Cassandra has to query the nodes across the cluster for the indexed value.
This behavior of secondary index reads does not depend on the consistency level.
Querying data through secondary indices is therefore not recommended.
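To put numbers on this degradation, the short sketch below computes the throughput and latency ratios from the two result tables above. It only restates the reported figures; the variable names are illustrative. With just two cluster sizes measured, the ratios describe the size of the drop but cannot establish the shape of the curve.

```python
# Figures copied from the secondary index result tables above.
three_node = {"throughput_per_sec": 7847.6, "avg_response_ms": 124.0}
five_node = {"throughput_per_sec": 894.2, "avg_response_ms": 1100.0}  # 1.1 s

# How many times lower the throughput is on 5 nodes.
throughput_drop = three_node["throughput_per_sec"] / five_node["throughput_per_sec"]

# How many times higher the average response time is on 5 nodes.
latency_growth = five_node["avg_response_ms"] / three_node["avg_response_ms"]

print(f"Throughput is {throughput_drop:.1f}x lower on 5 nodes")
print(f"Average response time is {latency_growth:.1f}x higher on 5 nodes")
```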

3. Read with QUORUM

The test was run on both a 3-node and a 5-node cluster, reading data with QUORUM consistency.
3-node result:

No. of requests	50M (1000 threads)
Throughput	20099.2 per sec
CPU usage	70% Max
Error	0

5-node result:

No. of requests	50M (1000 threads)
Throughput	22683.9 per sec
CPU usage	59% Max
Error	0

Observation:
Under the same load (1000 threads), the 5-node cluster delivered higher read throughput (22683.9 vs 20099.2 per sec) at lower peak CPU usage (59% vs 70%).

4. Write with QUORUM

The test was run on both a 3-node and a 5-node cluster, writing data with QUORUM consistency.
3-node result:

No. of requests	50M (1000 threads)
Throughput	12804.8 per sec
CPU usage	88% Max
Error	0

5-node result:

No. of requests	50M (1000 threads)
Throughput	18014.7 per sec
CPU usage	91% Max
Error	0

Observation:
Writes with QUORUM gave better throughput on the 5-node cluster (18014.7 vs 12804.8 per sec).

Summary

When adding nodes to a Cassandra cluster, QUORUM consistency is recommended for both writes and reads.
Avoiding secondary indices for querying data is highly recommended: in these tests, throughput for secondary index reads dropped by roughly 8.8x when going from 3 to 5 nodes.
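For a side-by-side view, the sketch below computes the relative change in throughput from 3 to 5 nodes for each of the four scenarios, using only the figures reported above. The dictionary layout and names are illustrative. Note that the read-after-write runs used different request counts (50M vs 100M), so that comparison is indicative only.

```python
# Throughput in requests per sec, as (3-node, 5-node), from the tables above.
results = {
    "read after write (QUORUM)": (8098.5, 11429.6),
    "read via secondary index": (7847.6, 894.2),
    "read (QUORUM)": (20099.2, 22683.9),
    "write (QUORUM)": (12804.8, 18014.7),
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


changes = {name: percent_change(t3, t5) for name, (t3, t5) in results.items()}

# Growth in node count, for comparison with the throughput changes.
node_growth = percent_change(3, 5)

for name, change in changes.items():
    print(f"{name:28s} {change:+6.1f}%")
print(f"{'node count':28s} {node_growth:+6.1f}%")
```

All three QUORUM scenarios gain throughput, though by less than the growth in node count, while the secondary index scenario is the only one that regresses.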
