Overview
This document records observations on batch execution under different configurations, with a focus on partial writes.
Tests and Observations
To test the different scenarios, the following keyspace and table were created:
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
CREATE TABLE test.user (name text, id text, PRIMARY KEY (name));
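In Java, the batch was built and executed as follows:
// The default constructor creates a LOGGED batch.
BatchStatement batchStatement = new BatchStatement();
PreparedStatement preparedStatement = session.prepare("insert into test.user (id, name) values (?, ?)");
PreparedStatement preparedStatement2 = session.prepare("update test.user set id='id_updated-1' where name=?");
PreparedStatement preparedStatement3 = session.prepare("update test.user set id='id_updated-2' where name=?");
// Insert 1000 rows: user-1 .. user-1000.
int i = 1;
while (i <= 1000) {
    batchStatement.add(preparedStatement.bind("id_" + i, "user-" + i));
    ++i;
}
// Overwrite the id of the first two rows within the same batch.
batchStatement.add(preparedStatement2.bind("user-1"));
batchStatement.add(preparedStatement3.bind("user-2"));
session.execute(batchStatement);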
To provoke a partial write, lower the write timeout in the cassandra.yaml file:
write_request_timeout_in_ms: 10
With this timeout, executing the batch above throws the following exception:
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
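The "2 replica were required" figure in the message follows from the quorum rule: QUORUM needs floor(replication_factor / 2) + 1 acknowledgements, which for the replication factor of 2 used here means both replicas. The sketch below only illustrates that arithmetic; the class and method names (QuorumMath, quorum) are hypothetical and not part of the driver.

```java
// Illustrative arithmetic only: QuorumMath is a hypothetical helper, not a driver class.
final class QuorumMath {
    private QuorumMath() {}

    /** Replicas that must acknowledge a QUORUM write: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replicationFactor must be >= 1");
        }
        return replicationFactor / 2 + 1; // integer division rounds down
    }

    public static void main(String[] args) {
        // Print the QUORUM requirement for a few replication factors.
        for (int rf = 1; rf <= 5; rf++) {
            System.out.println("RF=" + rf + " -> QUORUM needs " + quorum(rf) + " replica(s)");
        }
    }
}
```

With a replication factor of 2, a single slow replica is therefore enough to miss QUORUM, which is why the 10 ms timeout triggers the exception so reliably in this test.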
Observation
Despite the exception, the rows were inserted into the user table and the two updates were applied, exactly as specified in the batch.
Approach
To handle the WriteTimeoutException above, we can set the batch type explicitly through BatchStatement.Type:
BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.UNLOGGED);
We ran the partial-write test with the following two batch types:
LOGGED
UNLOGGED
With both the LOGGED and the UNLOGGED batch type, the data is inserted and an exception is still thrown when too few replicas acknowledge the write in time. The application code therefore has to handle this exception.
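One possible way to handle the exception in code is a bounded retry. This is a minimal sketch under two assumptions: the batch is safe to re-execute (the statements in this test set fixed values and use no counters or list appends), and the timeout is transient. BatchRetry, WriteTimedOut and retryOnTimeout are hypothetical names introduced for illustration. WriteTimedOut stands in for the driver's WriteTimeoutException so that the example compiles without the driver on the classpath.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of the DataStax driver.
final class BatchRetry {
    private BatchRetry() {}

    /** Stand-in for com.datastax.driver.core.exceptions.WriteTimeoutException. */
    static final class WriteTimedOut extends RuntimeException {
        WriteTimedOut(String message) {
            super(message);
        }
    }

    /**
     * Runs the action and retries it when it times out, up to maxAttempts
     * attempts in total. Any other exception is propagated immediately.
     */
    static <T> T retryOnTimeout(Supplier<T> action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        WriteTimedOut last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (WriteTimedOut e) {
                // The write may already be applied on some replicas;
                // re-running is only safe for idempotent statements.
                last = e;
            }
        }
        throw last; // every attempt timed out
    }
}
```

In the test described here, the action would be the batch execution itself, for example a lambda wrapping session.execute(batchStatement), with the catch clause changed to the driver's WriteTimeoutException. Retrying without a delay keeps the sketch short; a real application would probably add a backoff between attempts.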
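The Cassandra documentation suggests the UNLOGGED batch type for better performance, because it skips the batch log. The trade-off is that an UNLOGGED batch spanning several partitions is not guaranteed to be applied atomically.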
For detailed analysis, please refer here.