Problem Statement
...
- Bulk Content Upload is to be supported in Sunbird, with 3 operation modes
- API to be made available to check the real-time status of the bulk content upload process
- API to be made available to list the statuses of processes initiated by a user
operation-mode | workflow |
---|---|
upload | create-upload content |
publish | create-upload-publish content |
link | create-upload-publish content and link it to textbook |
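The table above can be read as a lookup from the operation-mode header value to an ordered list of workflow steps. The following is a minimal sketch of that lookup, not the actual Sunbird implementation; the names `OperationMode`, `WORKFLOW_STEPS` and `steps_for` are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class OperationMode(Enum):
    """The three operation modes listed in the table above."""
    UPLOAD = "upload"
    PUBLISH = "publish"
    LINK = "link"


# Ordered workflow steps per mode (hypothetical step names derived from the table).
WORKFLOW_STEPS: Dict[OperationMode, List[str]] = {
    OperationMode.UPLOAD: ["create", "upload"],
    OperationMode.PUBLISH: ["create", "upload", "publish"],
    OperationMode.LINK: ["create", "upload", "publish", "link"],
}


def steps_for(header_value: str) -> List[str]:
    """Resolve an operation-mode value to its workflow steps.

    Raises ValueError when the value is not one of the three known modes.
    """
    mode = OperationMode(header_value.strip().lower())
    return WORKFLOW_STEPS[mode]
```

Each mode is a strict extension of the previous one, so a later mode never skips an earlier step.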
Design
API Specifications
Bulk Content Upload API - POST - /v1/textbook/content/bulk/upload
Request Headers
...
Request Body
Code Block |
---|
content: [contentUploadFile.csv] |
Response : Success Response - OK (200)
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": null,
"status": "success",
"errmsg": null
},
"responseCode": "OK",
"result": {
"processId": "012813442982903808142"
}
} |
Response : Failure Response - BAD REQUEST (400) - Corrupt File
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": "CORRUPT_FILE",
"status": "CORRUPT_FILE",
"errmsg": "Bulk content upload failed due to corrupt file"
},
"responseCode": "CLIENT_ERROR",
"result": { }
} |
Response : Failure Response - BAD REQUEST (400) - Invalid File Format (Only CSV files are supported)
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": "INVALID_FILE_FORMAT",
"status": "INVALID_FILE_FORMAT",
"errmsg": "Bulk content upload failed due to invalid file format"
},
"responseCode": "CLIENT_ERROR",
"result": { }
} |
...
1. Validations
File-related validations:
- Validate the format of the file
- Validate that the file is readable
- Validate that the file has data
Data-related validations:
- Check that the file conforms to the bulk content upload template (the template should be configurable)
- The number of rows in the file must not exceed the maximum rows allowed (configurable)
- Duplicate check within the file. The key is Taxonomy (BGMS) + ContentName
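The validations above could be chained roughly as follows. This is a sketch under several assumptions: the template columns (BGMS is taken here to mean Board, Grade, Medium, Subject), the row limit, and the `EMPTY_FILE` and `DUPLICATE_ROWS` codes are hypothetical, and the row limit is treated as inclusive. The other error codes are the ones used in the API responses in this document.

```python
import csv
import io
from typing import Optional

# Hypothetical configuration; the real template and limit are meant to be configurable.
TEMPLATE_COLUMNS = ["Board", "Grade", "Medium", "Subject", "ContentName"]
MAX_ROWS = 3


def validate_csv(filename: str, payload: bytes) -> Optional[str]:
    """Return an error code for the first failed check, or None if all pass."""
    # File format check: only CSV files are supported.
    if not filename.lower().endswith(".csv"):
        return "INVALID_FILE_FORMAT"
    # Readability check: the payload must decode and parse as CSV.
    try:
        reader = csv.DictReader(io.StringIO(payload.decode("utf-8")))
        header = reader.fieldnames or []
        rows = list(reader)
    except (UnicodeDecodeError, csv.Error):
        return "CORRUPT_FILE"
    # The file must contain a header and at least one data row.
    if not header or not rows:
        return "EMPTY_FILE"  # hypothetical code, not in the API spec
    # Template check: the header must match the configured columns.
    if [h.strip() for h in header] != TEMPLATE_COLUMNS:
        return "INVALID_FILE_TEMPLATE"
    # Row limit check (treated as inclusive here; an assumption).
    if len(rows) > MAX_ROWS:
        return "MAX_ROW_COUNT_EXCEEDED"
    # Duplicate check on Taxonomy (BGMS) + ContentName.
    seen = set()
    for row in rows:
        key = tuple((row[c] or "").strip().lower() for c in TEMPLATE_COLUMNS)
        if key in seen:
            return "DUPLICATE_ROWS"  # hypothetical code, not in the API spec
        seen.add(key)
    return None
```

The checks run from cheapest to most expensive, so a wrong file extension is rejected before the payload is parsed.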
2. Synchronous Processing
1. Upload the CSV file to blob storage
2. Make an entry in bulk_upload_process table
column | data to insert | remarks |
---|---|---|
id | auto-generated unique id | processId |
createdby | uploader id | |
createdon | current timestamp | |
data | blobstore url of CSV file | |
failureresult | failedCount to be updated here | |
lastupdatedon | last updated timestamp to be updated here on each update | |
objecttype | content | |
organisationid | tenant id | |
processendtime | endTime - current timestamp to be inserted here while moving this process to completed state | |
processstarttime | startTime - current timestamp to be inserted here while moving this process to processing state | |
retrycount | 0 | Not used |
status | queued | status - possible values - queued, processing, completed |
storagedetails | report - blobstore url of result file | |
successresult | successCount to be updated here | |
taskcount | number of records in file | totalCount |
uploadedby | uploader id | |
uploadeddate | current timestamp | |
3. Make entries into bulk_upload_process_task table (One record per content)
column | data to insert | remarks |
---|---|---|
processid | processid | id from master table |
sequenceid | auto-generated sequence id | |
createdon | current timestamp | |
data | data in JSON format | |
failureresult | JSON data + failed message | |
iterationid | 0 | Not used |
lastupdatedon | last updated timestamp to be updated here on each update | |
status | possible values - queued, success, failed | |
successresult | JSON data + success message | |
4. For the LINK operation-mode, get the draft hierarchy of the Textbooks mentioned in the CSV and cache the dialCode-TextBookUnitDoId mapping in Redis
5. Push events to Kafka. For the LINK operation-mode, use the Textbook Id as the partition key. For the other operation modes, use the hashed value generated during the duplicate check as the partition key.
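The partition key choice in the last step can be sketched as below. The row field names (`textbookId`, `board`, `grade`, `medium`, `subject`, `contentName`) and the use of SHA-256 are assumptions for illustration; the document only states that a hashed value from the duplicate check is reused.

```python
import hashlib
from typing import Dict

# Hypothetical field names for the duplicate-check key: Taxonomy (BGMS) + ContentName.
KEY_FIELDS = ("board", "grade", "medium", "subject", "contentName")


def dedupe_hash(row: Dict[str, str]) -> str:
    """Hash the normalised duplicate-check key of a row (SHA-256 is an assumption)."""
    raw = "|".join(row[f].strip().lower() for f in KEY_FIELDS)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def partition_key(operation_mode: str, row: Dict[str, str]) -> str:
    """Pick the Kafka partition key for one content row."""
    if operation_mode == "link":
        # All rows of one textbook share a key, so they land on the same partition.
        return row["textbookId"]
    return dedupe_hash(row)
```

Keying LINK events by Textbook Id keeps all events for one textbook on a single partition, which lets a consumer apply updates to that textbook's draft hierarchy in order.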
3. Asynchronous Processing - Samza
- Validation of mandatory fields
- Validate DIAL code (first against redis-cache, if not present, get draft hierarchy from cassandra and validate)
- Validate the file size and file format in Google Drive
- Validate the Taxonomy by creating the content - Hit the REST API
- Download the AppIcon from Google Drive
- Create an asset with downloaded image
- Update content with AppIcon image URL
- Download content file from Google Drive
- Upload content - Hit the REST API
- Publish the content - Hit the Java API
- Get draft hierarchy of the TextBook from Cassandra
- Get the metadata of the published content
- Update the draft hierarchy of the TextBook in Cassandra
- Update the status back in the LMS Cassandra bulk_upload_process_task table
- Retire the Content in case of any exception in the flow
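The "retire on any exception" rule at the end of the list can be sketched as a wrapper around the individual steps. This is an illustrative outline only: `process_task`, the step callables and the `retire` callback are hypothetical, and the real job runs on Samza rather than as a plain function.

```python
from typing import Callable, Dict, List

Step = Callable[[dict, Dict[str, object]], None]


def process_task(task: dict, steps: List[Step], retire: Callable[[str], None]) -> dict:
    """Run the steps for one content row and return the task outcome.

    Each step receives the task data and a shared state dict. The step that
    creates the content is expected to set state["contentId"].
    """
    state: Dict[str, object] = {"contentId": None}
    try:
        for step in steps:
            step(task, state)
        return {"status": "success", "contentId": state["contentId"]}
    except Exception as exc:
        # Retire the content only if it was already created before the failure.
        if state["contentId"] is not None:
            retire(str(state["contentId"]))
        return {"status": "failed", "message": str(exc)}
```

The returned status uses the `success` and `failed` values listed for the bulk_upload_process_task table.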
4. Scheduler
- Scheduler to run at periodic intervals to consolidate the results from the bulk_upload_process_task table and update the master table (bulk_upload_process) with the success count (successresult), failed count (failureresult), process end time (processendtime), result file URL (storagedetails) and status
- When a process is marked as completed, the result file has to be generated and uploaded to the blobstore, and its URL updated back in the bulk_upload_process table
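The consolidation step can be sketched as a pure function over the task statuses of one process. The rule used here for deriving the master status (completed when no task is queued, processing when at least one task has finished, queued otherwise) is an assumption; the column names are taken from the bulk_upload_process table above.

```python
from collections import Counter
from typing import Iterable, Optional


def consolidate(task_statuses: Iterable[str], now: Optional[str] = None) -> dict:
    """Summarise task statuses into the fields updated on the master table."""
    counts = Counter(task_statuses)
    total = sum(counts.values())
    success, failed = counts["success"], counts["failed"]
    if total > 0 and counts["queued"] == 0:
        status = "completed"
    elif success + failed > 0:
        status = "processing"
    else:
        status = "queued"
    summary = {"successresult": success, "failureresult": failed, "status": status}
    if status == "completed":
        # The end time is recorded only when the process moves to completed.
        summary["processendtime"] = now
    return summary
```

Generating the result file and uploading it to the blobstore would follow only when the returned status is `completed`.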
5. Status Check API
- Data from bulk_upload_process table to be served based on the processId
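The status check responses in this document expose different fields depending on the process state: only the total count while queued, counts and start time while processing, and additionally end time and report once completed. A minimal sketch of that mapping from a bulk_upload_process row follows; `build_status_result` is a hypothetical name, and it assumes the stored URL in `storagedetails` is signed elsewhere before being returned as `report`.

```python
def build_status_result(row: dict) -> dict:
    """Build the "result" object of the status check response from a table row."""
    status = row["status"]  # stored as queued, processing or completed
    result = {
        "processId": row["id"],
        "status": status.capitalize(),
        "totalCount": row["taskcount"],
    }
    if status in ("processing", "completed"):
        result["successCount"] = row["successresult"]
        result["failedCount"] = row["failureresult"]
        result["startTime"] = row["processstarttime"]
    if status == "completed":
        result["endTime"] = row["processendtime"]
        result["report"] = row["storagedetails"]  # assumed to be signed by the caller
    return result
```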
6. Status List API
- The userId of the user should be derived from the Keycloak access token passed in the header.
- Statuses of all uploads done by the user have to be served from the bulk_upload_process table
...
API Specifications
Bulk Content Upload API - POST - /v1/textbook/content/bulk/upload
Request Headers
Content-Type | multipart/form-data |
Authorization | Bearer {{api-key}} |
x-authenticated-user-token | {{keycloak-token}} |
x-channel-id | {{channel-identifier}} |
x-framework-id | {{framework-identifier}} |
x-hashtag-id | {{tenant-id}} |
operation-mode | upload/publish/link |
Request Body
Code Block |
---|
content: [contentUploadFile.csv] |
Response : Success Response - OK (200)
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": null,
"status": "success",
"errmsg": null
},
"responseCode": "OK",
"result": {
"processId": "012813442982903808142"
}
} |
Response : Failure Response - BAD REQUEST (400) - Corrupt File
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": "CORRUPT_FILE",
"status": "CORRUPT_FILE",
"errmsg": "Bulk content upload failed due to corrupt file"
},
"responseCode": "CLIENT_ERROR",
"result": { }
} |
Response : Failure Response - BAD REQUEST (400) - Invalid File Format (Only CSV files are supported)
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": "INVALID_FILE_FORMAT",
"status": "INVALID_FILE_FORMAT",
"errmsg": "Bulk content upload failed due to invalid file format"
},
"responseCode": "CLIENT_ERROR",
"result": { }
} |
Response : Failure Response - BAD REQUEST (400) - Invalid File Template (Columns Missing)
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": "INVALID_FILE_TEMPLATE",
"status": "INVALID_FILE_TEMPLATE",
"errmsg": "Bulk content upload failed due to invalid file template"
},
"responseCode": "CLIENT_ERROR",
"result": { }
} |
Response : Failure Response - BAD REQUEST (400) - Too many rows
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": "MAX_ROW_COUNT_EXCEEDED",
"status": "MAX_ROW_COUNT_EXCEEDED",
"errmsg": "Max row count allowed is <config>"
},
"responseCode": "CLIENT_ERROR",
"result": { }
} |
...
Bulk Content Upload Status Check API - GET - /v1/textbook/content/bulk/upload/status/:processId
Request Headers
Accept | application/json |
Authorization | Bearer {{api-key}} |
x-authenticated-user-token | {{keycloak-token}} |
Response : Success Response - OK (200) - In Queue
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload.status",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": null,
"status": "success",
"errmsg": null
},
"responseCode": "OK",
"result": {
"processId": "012813442982903808142",
"status": "Queued",
"totalCount": 500
}
} |
Response : Success Response - OK (200) - In Progress
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload.status",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": null,
"status": "success",
"errmsg": null
},
"responseCode": "OK",
"result": {
"processId": "012813442982903808142",
"status": "Processing",
"totalCount": 500,
"successCount": 100,
"failedCount": 10,
"startTime": "2019-07-26 11:28:42:315+0000"
}
} |
Response : Success Response - OK (200) - Completed
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload.status",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": null,
"status": "success",
"errmsg": null
},
"responseCode": "OK",
"result": {
"processId": "012813442982903808142",
"status": "Completed",
"totalCount": 500,
"successCount": 450,
"failedCount": 50,
"startTime": "2019-07-26 11:28:42:315+0000",
"endTime": "2019-07-26 12:28:42:315+0000",
"report": "signedDownloadUrl"
}
} |
Response : Failure Response - RESOURCE NOT FOUND (404) - ProcessId not found
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload.status",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": "PROCESS_NOT_FOUND",
"status": "PROCESS_NOT_FOUND",
"errmsg": "Process Id xxx is not found in the system"
},
"responseCode": "RESOURCE_NOT_FOUND",
"result": { }
} |
...
Bulk Content Upload Status List API - GET - /v1/textbook/content/bulk/upload/status/list
Request Headers
Accept | application/json |
Authorization | Bearer {{api-key}} |
x-authenticated-user-token | {{keycloak-token}} |
Response : Success Response - OK (200)
Code Block |
---|
{
"id": "api.textbook.content.bulk.upload.status.list",
"ver": "v1",
"ts": "2019-07-26 11:28:42:315+0000",
"params": {
"resmsgid": null,
"msgid": "cf5b2e8e-70cf-401c-af29-980bc3151c67",
"err": null,
"status": "success",
"errmsg": null
},
"responseCode": "OK",
"result": {
"userId": "f61826aa-8e5d-4356-b8ad-edda92460750",
"uploads": [
{
"processId": "012813442982903808142",
"uploadedDate": "2018-12-12 14:25:27:466+0530",
"status": "Completed"
},
{
"processId": "012813442982903808143",
"uploadedDate": "2018-12-14 12:01:36:807+0530",
"status": "Processing"
}
]
}
} |