...

Keyspace.Table name                        Columns to be migrated

sunbird_courses.course_batch               createddate, startdate, enddate, enrollmentenddate, updateddate
sunbird_courses.user_enrolments            enrolleddate
sunbird_courses.user_content_consumption   lastaccesstime, lastcompletedtime, lastupdatedtime
sunbird.page_management                    createddate
sunbird.page_section                       createddate, updateddate

Option 1: Migration using temp tables

  • For each of the tables listed above, create a temp table and migrate the data into it using a Spark script.

  • Drop the existing table and re-create it under the same name with the correct data types.

  • Migrate the data from the temp table to the newly created table.

  • The details related to code changes and steps to run the script are here.

...

Option 2: Migration using new columns

  • Add new columns with type timestamp.

  • Migrate the data from the older columns to the new columns.

  • Drop the older columns that hold text data.

  • Update APIs, jobs and data products to read data from and write data to new columns.
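
As a rough illustration of the schema side of Option 2, the sketch below generates the CQL statements for the add and drop steps. The "_ts" suffix for the new column names is purely hypothetical (this page does not define a naming scheme), and the data copy between the two steps is not shown. The table and column names are taken from the table at the top of this page.

```python
# Hypothetical naming: each new timestamp column is the old name plus "_ts".
NEW_SUFFIX = "_ts"

# Tables and their text date columns, as listed at the top of this page.
COLUMNS = {
    "sunbird_courses.course_batch": [
        "createddate", "startdate", "enddate", "enrollmentenddate", "updateddate",
    ],
    "sunbird_courses.user_enrolments": ["enrolleddate"],
    "sunbird_courses.user_content_consumption": [
        "lastaccesstime", "lastcompletedtime", "lastupdatedtime",
    ],
    "sunbird.page_management": ["createddate"],
    "sunbird.page_section": ["createddate", "updateddate"],
}


def add_column_statements(table, columns):
    """Step 1: add one timestamp column next to each text date column."""
    return [f"ALTER TABLE {table} ADD {col}{NEW_SUFFIX} timestamp;" for col in columns]


def drop_column_statements(table, columns):
    """Step 3: drop the old text columns once the data has been copied."""
    return [f"ALTER TABLE {table} DROP {col};" for col in columns]


if __name__ == "__main__":
    for table, columns in COLUMNS.items():
        for statement in add_column_statements(table, columns):
            print(statement)
```

The drop statements should only be run after the copied data has been verified and all readers and writers have switched to the new columns.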

Assumption

  • All dates are assumed to be in UTC; per the Cassandra documentation, UTC is preferred for timestamps. The datatype correction is done while migrating to the temp table.
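
To make the UTC assumption concrete, here is a minimal sketch of the text-to-timestamp conversion. The date formats below are assumptions, since this page does not state how the text dates are stored; they would need to be checked against the actual data. Values without an offset are treated as UTC, and values with an offset are normalised to UTC.

```python
from datetime import datetime, timezone

# Assumed text formats (not confirmed by this page); verify against real data.
ASSUMED_FORMATS = (
    "%Y-%m-%d %H:%M:%S:%f%z",  # e.g. 2020-01-15 10:30:00:000+0000
    "%Y-%m-%d %H:%M:%S",       # e.g. 2020-01-15 10:30:00
    "%Y-%m-%d",                # e.g. 2020-01-15
)


def to_utc_timestamp(text):
    """Convert a text date to a timezone-aware UTC datetime.

    Returns None for missing or blank values and raises ValueError
    when the value matches none of the assumed formats.
    """
    if text is None or not text.strip():
        return None
    value = text.strip()
    for fmt in ASSUMED_FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        if parsed.tzinfo is None:
            # No offset in the text: rely on the assumption that it is UTC.
            return parsed.replace(tzinfo=timezone.utc)
        # Offset present: normalise to UTC.
        return parsed.astimezone(timezone.utc)
    raise ValueError(f"Unrecognised date format: {text!r}")
```

Raising on unrecognised values, rather than writing nulls silently, makes it easier to find rows that break the UTC or format assumptions before the old data is dropped.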

Clarifications Required

  • Since the user_enrolments and user_content_consumption tables hold a large volume of data distributed across nodes, is migrating through a temp table a good approach?

...