Problem Statement
...
| operation-mode | workflow |
| --- | --- |
| upload | create and upload content |
| publish | create, upload, and publish content |
| link | create, upload, and publish content, then link it to a textbook |
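The three operation modes form a strict superset chain, which can be sketched as an ordered step mapping. This is an illustrative sketch only; the step names are assumptions, not identifiers from the actual service.

```python
# Hypothetical step names; "link" extends "publish", which extends "upload".
WORKFLOWS = {
    "upload": ["create", "upload"],
    "publish": ["create", "upload", "publish"],
    "link": ["create", "upload", "publish", "link-to-textbook"],
}

def steps_for(mode: str) -> list[str]:
    """Return the ordered pipeline steps for a given operation-mode."""
    try:
        return WORKFLOWS[mode]
    except KeyError:
        raise ValueError(f"unknown operation-mode: {mode}")
```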
Design
1. Validations
The following file-related validations are to be performed:
...
- Check whether the file conforms to the bulk content upload template (the template should be configurable)
- The number of rows in the file should be less than the maximum rows allowed (configurable)
- Duplicate check within the file. The key is Taxonomy (BGMS) + ContentName
2. Synchronous Processing
- Upload the CSV file to blob storage
...
| column | data to insert | remarks |
| --- | --- | --- |
| processid | processid | id from the master table |
| sequenceid | auto-generated sequence id | |
| createdon | current timestamp | |
| data | data in JSON format | |
| failureresult | JSON data + failure message | |
| iterationid | 0 | not used |
| lastupdatedon | last-updated timestamp, refreshed on each update | |
| status | possible values: queued, success, failed | |
| successresult | JSON data + success message | |
4. For the LINK operation-mode, get the draft hierarchy of the textbooks mentioned in the CSV and cache the dialCode-TextBookUnitDoId mapping in Redis
5. Push events to Kafka with the Textbook Id as the partition key for the LINK operation-mode. For the other operation modes, use the hashed value generated during the duplicate check as the partition key.
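The partition-key choice in step 5 can be sketched as below. Partitioning LINK events by textbook id keeps all rows touching one textbook on the same Kafka partition, so its draft hierarchy is updated serially; the record field names here are assumptions.

```python
import hashlib

def partition_key(operation_mode: str, record: dict) -> str:
    """Choose the Kafka partition key for a bulk-upload event.

    LINK mode partitions by textbook id so all rows for one textbook are
    consumed in order; other modes reuse the hash computed for the
    duplicate check (Taxonomy BGMS + ContentName).
    """
    if operation_mode == "link":
        return record["textbookId"]
    raw = "|".join([record["board"], record["grade"], record["medium"],
                    record["subject"], record["contentName"]])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```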
3. Asynchronous Processing - Samza
- Validation of mandatory fields
- Validate the DIAL code (first against the Redis cache; if not present, fetch the draft hierarchy from Cassandra and validate)
- Validate the file size and file format in Google Drive
- Validate the Taxonomy by creating the content - Hit the REST API
- Download the AppIcon from Google Drive
- Create an asset with downloaded image
- Update content with AppIcon image URL
- Download content file from Google Drive
- Upload content - Hit the REST API
- Publish the content - Hit the Java API
- Get draft hierarchy of the TextBook from Cassandra
- Get the metadata of the published content
- Update the draft hierarchy of the TextBook in Cassandra
- Update the status back in the LMS Cassandra bulk_upload_process_task table
- Retire the Content in case of any exception in the flow
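The steps above, together with the compensating "retire" step, amount to a try/except around the per-row pipeline: if any step after content creation fails, the partially created content is retired. A minimal sketch, assuming hypothetical helper methods on an `api` client object:

```python
def process_task(record: dict, api) -> tuple:
    """Process one bulk-upload row; retire the content on any failure.

    `api` is a hypothetical client exposing the REST/Java API calls named
    in the steps above; method names here are assumptions.
    """
    content_id = None
    try:
        api.validate(record)                       # mandatory fields, DIAL code, file checks
        content_id = api.create_content(record)    # taxonomy validated on create
        api.upload_content(content_id, record)
        if record.get("operation_mode") in ("publish", "link"):
            api.publish_content(content_id)
        if record.get("operation_mode") == "link":
            api.link_to_textbook(content_id, record)
        return ("success", content_id)
    except Exception:
        if content_id is not None:
            api.retire_content(content_id)         # compensate: retire partial content
        return ("failed", content_id)
```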
4. Scheduler
- A scheduler runs at periodic intervals to consolidate the results from the bulk_upload_process_task table and update the master table (bulk_upload_process) with success_count, failed_count, process_end_time, result_file_url and status
- When a process is marked as completed, the result file has to be generated, uploaded to the blob store, and its URL updated back in the bulk_upload_process table
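The consolidation step can be sketched as a simple aggregation over the per-row task statuses; a process is complete only once no task remains queued. Field names mirror the master-table columns listed above.

```python
from collections import Counter

def consolidate(task_statuses: list) -> dict:
    """Summarise per-row task statuses into the master-table update."""
    counts = Counter(task_statuses)
    done = counts["queued"] == 0
    return {
        "success_count": counts["success"],
        "failed_count": counts["failed"],
        "status": "completed" if done else "processing",
    }
```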
5. Status Check API
- Data from bulk_upload_process table to be served based on the processId
6. Status List API
- The userId of the user should be deduced from the Keycloak access token passed in the header.
- The statuses of all uploads done by the user have to be served from the bulk_upload_process table
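A Keycloak access token is a JWT, and Keycloak carries the user identifier in the `sub` claim of its payload. A minimal decoding sketch follows; note it only decodes the payload for illustration, whereas a real service must first verify the token's signature.

```python
import base64
import json

def user_id_from_token(token: str) -> str:
    """Extract the userId (`sub` claim) from a Keycloak access token.

    Sketch only: no signature verification or expiry check is done here.
    """
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64url padding
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["sub"]
```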
...
API Specifications
Bulk Content Upload API - POST - /v1/textbook/content/bulk/upload
Request Headers
| header | value |
| --- | --- |
| Content-Type | multipart/form-data |
| Authorization | Bearer {{api-key}} |
| x-authenticated-user-token | {{keycloak-token}} |
| x-channel-id | {{channel-identifier}} |
| x-framework-id | {{framework-identifier}} |
| x-hashtag-id | {{tenant-id}} |
| operation-mode | upload/publish/link |
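A client would assemble the headers above as shown in this sketch. The multipart boundary in Content-Type is normally set by the HTTP client library itself, so it is omitted here; the function name is a hypothetical helper, not part of the API.

```python
def upload_headers(api_key: str, user_token: str, channel: str,
                   framework: str, tenant: str, mode: str) -> dict:
    """Build the request headers for the bulk content upload API."""
    if mode not in ("upload", "publish", "link"):
        raise ValueError(f"invalid operation-mode: {mode}")
    return {
        "Authorization": f"Bearer {api_key}",
        "x-authenticated-user-token": user_token,
        "x-channel-id": channel,
        "x-framework-id": framework,
        "x-hashtag-id": tenant,
        "operation-mode": mode,
    }
```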
...
Bulk Content Upload Status Check API - GET - /v1/textbook/content/bulk/upload/status/:processId
Request Headers
| header | value |
| --- | --- |
| Accept | application/json |
| Authorization | Bearer {{api-key}} |
| x-authenticated-user-token | {{keycloak-token}} |
Response : Success Response - OK (200) - In Queue
...
Bulk Content Upload Status List API - GET - /v1/textbook/content/bulk/upload/status/list
Request Headers
| header | value |
| --- | --- |
| Accept | application/json |
| Authorization | Bearer {{api-key}} |
| x-authenticated-user-token | {{keycloak-token}} |
Response : Success Response - OK (200)
...