Knowlg - Domain Agnostic Data

Introduction

This wiki explains the current format of the absolute path stored in DBs (Neo4J, Cassandra, ES) and the proposed design to make it domain-agnostic.

Background & Problem Statement

Knowlg stores and reads URLs in several places. In particular, the metadata of functional objects holds references (URLs) to the data those objects represent.

For example, the content object in the Sunbird Knowlg building block stores the URLs of different files in its metadata to refer to the actual content.

Key Design Problems

  1. Replace absolute URLs in metadata with relative paths when storing in the DB.

  2. When reading the data, return the absolute path.

Design

Given the problem statement above, absolute paths should not be stored directly in the databases or anywhere else. Instead, configure a base path, trim it from each URL when storing, and prefix it again when reading.
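
The trim-on-write and prefix-on-read idea can be sketched as follows. This is a minimal illustration, not the Knowlg implementation: the base path value, the `URL_FIELDS` set, and the function names are all hypothetical, and it assumes that URL-bearing metadata keys are known in advance.

```python
# Hypothetical configured base path; a real deployment would read it from configuration.
BASE_PATH = "https://storage.example.org/content"
# Hypothetical set of metadata keys whose values are URLs.
URL_FIELDS = {"artifactUrl", "downloadUrl"}


def to_relative(url, base_path=BASE_PATH):
    """Strip the base path from a URL; leave URLs outside the base path untouched."""
    prefix = base_path.rstrip("/") + "/"
    return url[len(prefix):] if url.startswith(prefix) else url


def to_absolute(path, base_path=BASE_PATH):
    """Prefix the base path to a relative path; leave absolute URLs untouched."""
    if "://" in path:
        return path
    return base_path.rstrip("/") + "/" + path.lstrip("/")


def _apply(metadata, fn):
    # Transform only the configured URL fields; copy every other value as-is.
    return {
        key: fn(value) if key in URL_FIELDS and isinstance(value, str) else value
        for key, value in metadata.items()
    }


def on_write(metadata):
    """Metadata as it would be stored: URL fields trimmed to relative paths."""
    return _apply(metadata, to_relative)


def on_read(metadata):
    """Metadata as it would be returned: URL fields restored to absolute URLs."""
    return _apply(metadata, to_absolute)
```

With this shape, changing the domain only requires changing the configured base path, because the stored values never contain it.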

Solution 1:

Trim the URL to a relative path in EkStepTransactionEventHandler (in knowledge-platform-db-extentions) at storage time, after the logstash events have been generated. ES therefore still receives the absolute path, and no code changes are required for ES.
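
The ordering that Solution 1 relies on can be illustrated with a small sketch. It is plain Python and does not use the Neo4J API; `handle_transaction`, `event_log`, `graph_store`, the base path, and the field names are hypothetical stand-ins for the handler, the logstash event stream, and the graph DB.

```python
BASE_PATH = "https://storage.example.org/content/"  # hypothetical configured base path
URL_FIELDS = ("artifactUrl",)                        # hypothetical URL-bearing keys


def trim(props):
    """Return a copy of props with the base path removed from URL fields."""
    trimmed = {}
    for key, value in props.items():
        if key in URL_FIELDS and isinstance(value, str) and value.startswith(BASE_PATH):
            trimmed[key] = value[len(BASE_PATH):]
        else:
            trimmed[key] = value
    return trimmed


def handle_transaction(node_id, props, event_log, graph_store):
    # Step 1: generate the event first, from the untouched properties,
    # so downstream consumers (ES in the text above) see absolute URLs.
    event_log.append({"nodeId": node_id, "properties": dict(props)})
    # Step 2: only then trim the URLs for what gets stored in the graph.
    graph_store[node_id] = trim(props)
```

Because the event is built before trimming, the stored value and the indexed value differ only in the base path prefix.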

To return the absolute path when reading data from the DB, prefix the base path in the following classes, which differ by repo:

Solution 2:

Instead of trimming to a relative path in EkStepTransactionEventHandler, do the trimming in GraphService only. In this solution, however, ES data must be handled in the search-indexer job. Reading works the same way as in Solution 1.
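
A rough sketch of the Solution 2 flow follows. It assumes, as one possible reading of "handled in the search-indexer job", that ES should keep holding absolute paths as in Solution 1, so the indexer restores the base path before indexing. All names here (`graph_service_save`, `search_indexer`, `es_index`, the base path, and the field names) are hypothetical.

```python
BASE_PATH = "https://storage.example.org/content/"  # hypothetical configured base path
URL_FIELDS = ("artifactUrl",)                        # hypothetical URL-bearing keys


def trim(props):
    """Remove the base path from URL fields."""
    return {
        k: v[len(BASE_PATH):]
        if k in URL_FIELDS and isinstance(v, str) and v.startswith(BASE_PATH)
        else v
        for k, v in props.items()
    }


def expand(props):
    """Restore the base path on URL fields that hold relative paths."""
    return {
        k: BASE_PATH + v
        if k in URL_FIELDS and isinstance(v, str) and "://" not in v
        else v
        for k, v in props.items()
    }


def graph_service_save(node_id, props, graph_store, event_log):
    # Trimming happens in the service layer, before the DB is touched,
    # so the event derived from the stored data carries relative paths.
    trimmed = trim(props)
    graph_store[node_id] = trimmed
    event_log.append({"nodeId": node_id, "properties": dict(trimmed)})


def search_indexer(event, es_index):
    # Assumption: the indexer re-prefixes the base path while indexing.
    es_index[event["nodeId"]] = expand(event["properties"])
```

The trimming no longer depends on a DB-specific hook, at the cost of an extra step in the indexing job.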

Pros & Cons

Solution 1

Pros

  • Code changes are required only in the DB layer. No other code or logic changes are required.

  • Because the data is trimmed in TransactionEventHandler, ES data does not need to be handled separately.

Cons

  • If an adopter later wants to switch from Neo4J to another DB, they must handle the data trimming themselves, because it is part of TransactionEventHandler.

Solution 2

Pros

  • Code changes are required only in the DB layer. No other code or logic changes are required.

  • If an adopter later wants to switch from Neo4J to another DB, they do not need to do anything extra.

Cons

  • ES data needs to be handled in the search-indexer job while indexing the data.