Partial Cassandra Writes
Overview
This document contains observation on Batch execution with different configuration for partial writes.
Tests and Observations
For testing different scenarios keyspace and table created
CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 2};
CREATE TABLE user(name text, id text, PRIMARY KEY (name));
In java
PreparedStatement preparedStatement = session.prepare("insert into test.user (id, name) values (?, ?)");
PreparedStatement preparedStatement2 = session.prepare("update test.user set id='id_updated-1' where name=?");
PreparedStatement preparedStatement3 = session.prepare("update test.user set id='id_updated-2' where name=?");
int i = 1;
while(i <= 1000) {
batchStatement.add(preparedStatement.bind("id_"+i, "user-" + i));
++i;
}
batchStatement.add(preparedStatement2.bind("user-1"));
batchStatement.add(preparedStatement3.bind("user-2"));
In cassandra.yaml file change timeout to verify partial write
write_request_timeout_in_ms: 10
The above query will throw exception
Observation
The data got inserted into user table and got updated as above batch execution.
The WriteTimeoutException.getWriteType() == BATCH, ensures data is inserted into the table
Approach
To handle above WriteTimeoutException
we can use BatchStatement.Type
Batch type of LOGGED, guarantees atomic insertion of data, and WriteTimeoutException.getWriteType() == BATCH, eventually written to the appropriate replicas and the developer doesn't have to do anything.
Thus, the above two checks handles partial write.