Background
This document describes the integration points for adding any new cloud service provider (CSP) to the Sunbird-Knowledge platform. It is current as of December 2022 (Sunbird-Knowlg release-5.2.0).
Sunbird-Knowlg has been verified with Azure, AWS and GCP (the GCP MediaService integration is pending).
Knowlg has the following Flink jobs, which interact with cloud storage for upload and download operations:
asset-enrichment
content-publish
qrcode-image-generator
live-node-publisher
knowlg-api-service transforms cloud-related metadata (e.g. downloadUrl, appIcon) but does not interact with cloud storage itself.
For example, the service converts a cloud-specific path (absolute path) to a cloud-neutral path (relative path) and vice versa.
To add support for another cloud storage provider (e.g. OCI) in the Knowlg components, follow the steps below:
knowlg-api-service:
Git Repos:
https://github.com/project-sunbird/knowledge-platform
https://github.com/project-sunbird/sunbird-learning-platform
Latest branch: release-5.2.0
The service needs only a configuration change so that it stores relative paths in the database on write operations and returns absolute paths for cloud-related metadata on read operations.
e.g:
If appIcon is given as an absolute URL (https://sunbirddevbbpublic.blob.core.windows.net/sunbird-content-staging/content/do_213687607599996928135/artifact/download-1.thumb.jpg) in the Content/Collection Create API, the service should store it as a relative path (CONTENT_STORAGE_BASE_PATH/content/do_213687607599996928135/artifact/download-1.thumb.jpg), and the Read API should return the absolute URL.
In the above example, the base URL of the storage account and the bucket name are replaced with the string value "CONTENT_STORAGE_BASE_PATH", which is configured in the cloudstorage_relative_path_prefix_content variable.
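The conversion described above can be sketched as follows. This is a minimal illustration, not the service's actual implementation: the function names to_relative and to_absolute and their parameters are hypothetical, and the base URL is assumed to include both the storage account and the bucket name, as in the example.

```python
from typing import Iterable

# Neutral prefix from the example above (assumed default for this sketch).
RELATIVE_PREFIX = "CONTENT_STORAGE_BASE_PATH"


def to_relative(url: str, valid_base_urls: Iterable[str],
                prefix: str = RELATIVE_PREFIX) -> str:
    """Write path: swap a known base URL (account + bucket) for the neutral prefix."""
    for base in valid_base_urls:
        base = base.rstrip("/")
        if url.startswith(base + "/"):
            return prefix + url[len(base):]
    return url  # not a recognised cloud URL, leave it untouched


def to_absolute(path: str, base_path: str,
                prefix: str = RELATIVE_PREFIX) -> str:
    """Read path: swap the neutral prefix for the configured base URL."""
    if path.startswith(prefix + "/"):
        return base_path.rstrip("/") + path[len(prefix):]
    return path  # already absolute or not a cloud path


if __name__ == "__main__":
    base = "https://sunbirddevbbpublic.blob.core.windows.net/sunbird-content-staging"
    icon = base + "/content/do_213687607599996928135/artifact/download-1.thumb.jpg"
    stored = to_relative(icon, [base])
    print(stored)
    print(to_absolute(stored, base))
```

Because only the prefix changes, pointing the read-side base path at a new provider's bucket is enough to make stored records resolve to the new storage account.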
Override the values of the following variables in the private devops repo (file path: ansible/inventory/<env_name>/Core/common.yml) for the new storage account:
cloud_storage_content_bucketname
cloudstorage_replace_absolute_path
cloudstorage_relative_path_prefix
cloudstorage_base_path
valid_cloudstorage_base_urls
Configuration File Reference:
https://github.com/project-sunbird/sunbird-devops/blob/b61a35fad0362ea7eb0bb688ff0bc12ffc811571/ansible/roles/stack-sunbird/templates/content-service_application.conf#L484
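Before deploying, it can help to confirm that every variable listed above is actually overridden. The sketch below assumes the inventory file has already been loaded into a Python dict (for example with a YAML parser); the function name missing_overrides and the placeholder values are hypothetical.

```python
# Variable names taken from the list above.
REQUIRED_SERVICE_OVERRIDES = (
    "cloud_storage_content_bucketname",
    "cloudstorage_replace_absolute_path",
    "cloudstorage_relative_path_prefix",
    "cloudstorage_base_path",
    "valid_cloudstorage_base_urls",
)


def missing_overrides(inventory: dict,
                      required=REQUIRED_SERVICE_OVERRIDES) -> list:
    """Return the required variable names that are absent or empty."""
    # A boolean False is a legitimate value, so only None, "" and [] count as empty.
    return [name for name in required
            if inventory.get(name) in (None, "", [])]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    inventory = {
        "cloud_storage_content_bucketname": "<bucket_name>",
        "cloudstorage_replace_absolute_path": True,
        "cloudstorage_relative_path_prefix": "CONTENT_STORAGE_BASE_PATH",
    }
    print(missing_overrides(inventory))
```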
After the configuration change, deploy the service.
Test the Content Create and Read APIs with metadata that contains a cloud path (e.g. appIcon).
The key integration points are in the following files:
Knowledge-Platform repo:
StorageService.scala → Integration point connecting to the org.sunbird.cloud.storage library. This Sunbird cloud-storage SDK has to support the new CSP.
Location of the Sunbird cloud-store-sdk version to upgrade:
https://github.com/project-sunbird/knowledge-platform/blob/release-5.2.0/platform-modules/mimetype-manager/pom.xml
<dependency>
  <groupId>org.sunbird</groupId>
  <artifactId>cloud-store-sdk</artifactId>
  <version>1.4.3</version>
</dependency>
flink jobs:
Git Repo:
https://github.com/project-sunbird/knowledge-platform-jobs
The jobs use cloud-storage-sdk for cloud storage operations, so the SDK first needs a code change to support the new cloud storage provider (e.g. OCI).
Link to cloud-storage-sdk git repo: https://github.com/project-sunbird/sunbird-cloud-storage-sdk/tree/scala-2.12-with-latest
Once the new cloud-storage-sdk version is published to Maven Central, update the version in the jobs-core module: https://github.com/project-sunbird/knowledge-platform-jobs/blob/e576a50834529c9984ca1cb5012f8ca0c59a5a29/jobs-core/pom.xml#L84
Override the values of the following variables in the private devops repo (file path: ansible/inventory/<env_name>/Knowledge-Platform/common.yml) for the new storage account:
cloud_service_provider
cloud_public_storage_accountname
cloud_public_storage_secret
cloud_public_storage_endpoint
cloud_storage_content_bucketname
cloud_storage_dial_bucketname
Configuration File Reference:
https://github.com/project-sunbird/sunbird-learning-platform/blob/59a59270b5419153b902b3d68165a8b5539f872e/kubernetes/helm_charts/datapipeline_jobs/values.j2#L731
cloudstorage_replace_absolute_path
cloudstorage_relative_path_prefix
cloudstorage_base_path
valid_cloudstorage_base_urls
Configuration File Reference:
https://github.com/project-sunbird/sunbird-learning-platform/blob/59a59270b5419153b902b3d68165a8b5539f872e/kubernetes/helm_charts/datapipeline_jobs/values.j2#L736
Build and deploy the jobs.
Test the Content/Collection publish workflow.
The Content/Collection should be published successfully.
Metadata that references a file in cloud storage should be accessible.
e.g. a downloadUrl pointing to a file should be downloadable.
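The accessibility check above can be sketched with a small script. It assumes the published content's metadata is available as a dict and that the files are publicly readable over HTTP; the function names head_ok and unreachable_fields, the default field list and the sample URL are hypothetical.

```python
import urllib.request
from typing import Callable, Dict, Iterable, List


def head_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True when a HEAD request to the URL answers with HTTP 2xx."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers malformed URLs.
        return False


def unreachable_fields(metadata: Dict[str, str],
                       fields: Iterable[str] = ("downloadUrl", "appIcon"),
                       check: Callable[[str], bool] = head_ok) -> List[str]:
    """List the cloud-related fields present in metadata whose URL is not reachable."""
    return [name for name in fields
            if name in metadata and not check(metadata[name])]


if __name__ == "__main__":
    # Replace with the metadata returned by the Read API for the published content.
    published = {"downloadUrl": "https://example.invalid/content/sample.ecar"}
    print(unreachable_fields(published))
```

The check function is passed in as a parameter so the same helper can be reused with a provider-specific client if the new storage account does not serve plain HTTP HEAD requests.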
Media service integration (video-streaming)