
Usage Documentation

Introduction

This usage document describes how the product workflow is designed around the use cases of end users. It explains the flow of the application, starting from ingesting the datasets.

Purpose: This document explains how to ingest data and how to access the datasets.

Step-wise Ingestion process

Data can be ingested either through CSV import or through the ingestion APIs.

Using CSV import

API: ingestion/csv

HTTP Method: POST

Note: Before using this API, run the spec APIs, because the CSV import depends on the schema. The data is validated against that schema.

A. CSV import with Event

To ingest data for an event, first run the spec/event API. Then attach the event file to the API request body; the request takes three parameters, as shown in this example.

The parameters of the request body are:

1. file: the CSV file to be imported

2. ingestion_type: the type of ingestion

3. ingestion_name: the name of the event

When the POST request is made, the ingestion/event API is called internally and the CSV data is validated with the AJV validator. After successful validation, the data is written to another CSV file and stored in the input_files folder.
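As a rough sketch, such a request could be made with Python's requests library. The base URL, file name, and event name below are placeholders, not part of the product.

import requests

BASE_URL = "http://localhost:3000"  # assumed host and port of the ingestion service

# Post the event CSV to ingestion/csv as a multipart/form-data request.
with open("student_attendance.csv", "rb") as csv_file:    # placeholder event CSV
    response = requests.post(
        f"{BASE_URL}/ingestion/csv",
        files={"file": csv_file},                          # 1. the CSV file to import
        data={
            "ingestion_type": "event",                     # 2. type of ingestion
            "ingestion_name": "student_attendance",        # 3. name of the event (placeholder)
        },
    )
print(response.status_code, response.json())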

B. CSV import with Dataset

To ingest data for a dataset, attach the dataset CSV file to the request body as shown in the screenshot below; the ingestion/dataset API is called internally.

On hitting the POST API, the data is validated and an appropriate result with a success message is returned.
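A minimal sketch of the same call for a dataset follows; the base URL, file name, and dataset name are placeholders.

import requests

BASE_URL = "http://localhost:3000"  # assumed host and port

with open("attendance_by_school.csv", "rb") as csv_file:   # placeholder dataset CSV
    response = requests.post(
        f"{BASE_URL}/ingestion/csv",
        files={"file": csv_file},
        data={
            "ingestion_type": "dataset",                    # dataset ingestion
            "ingestion_name": "attendance_by_school",       # placeholder dataset name
        },
    )
print(response.status_code, response.json())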

C. CSV import with Dimension

Pass the CSV file in the request body along with the ingestion type and ingestion name for the dimension, as shown in the screenshot below. The ingestion/dimension API is called internally and the CSV files are generated in the input_files folder.

The data is validated, a success message is returned in the response, and a new CSV file is generated.
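For a dimension, the same pattern applies; the file and dimension name below are placeholders.

import requests

BASE_URL = "http://localhost:3000"  # assumed host and port

with open("school_details.csv", "rb") as csv_file:       # placeholder dimension CSV
    response = requests.post(
        f"{BASE_URL}/ingestion/csv",
        files={"file": csv_file},
        data={
            "ingestion_type": "dimension",                # dimension ingestion
            "ingestion_name": "school_details",           # placeholder dimension name
        },
    )
print(response.status_code, response.json())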

Using the Ingestion API

Note: Before using these APIs, run the spec APIs. The data in the request body is validated against the schema present in the database using the AJV validator.

A. Execution of Event API

API : ingestion/event

HTTP Method: POST

Before calling the ingestion/event API, the schema for the event name passed in the request body must be present in the database. The event data is then validated against that schema. After successful validation, the data is written to a CSV file and stored in the input-files folder.

The request body parameters are:

1. event_name: the name of the event

2. event: an array of objects containing the data
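A hedged sketch of such a request is shown below; the event name and record fields are illustrative only and must match the schema registered through spec/event.

import requests

BASE_URL = "http://localhost:3000"  # assumed host and port

payload = {
    "event_name": "student_attendance",   # must match a schema already registered via spec/event
    "event": [                            # array of objects holding the event data
        {"school_id": "1001", "date": "2023-01-10", "present": 42},   # illustrative fields
        {"school_id": "1002", "date": "2023-01-10", "present": 37},
    ],
}
response = requests.post(f"{BASE_URL}/ingestion/event", json=payload)
print(response.status_code, response.json())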

B. Execution of Dimension API

API: ingestion/dimension

HTTP Method: POST

Note: To call the ingestion/dimension API, the dimension schema for the particular dimension name must be present in the database.

The request body of the ingestion/dimension API is given in the screenshot below.

Note: The screenshot below contains sample data for the dimension name “school details”; the data will vary for different dimensions.

The POST API validates the request body with the AJV validator and, once the request body is validated, writes the data to a CSV file stored in the input-files folder.
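Since the screenshot is not reproduced here, the sketch below only assumes the body mirrors the event API, with a dimension name and an array of records; the field names and values are assumptions.

import requests

BASE_URL = "http://localhost:3000"  # assumed host and port

payload = {
    "dimension_name": "school_details",   # assumed field name; must match the spec/dimension schema
    "dimension": [                         # assumed field name: array of dimension records
        {"school_id": "1001", "school_name": "Govt High School", "district": "Sample"},  # illustrative
    ],
}
response = requests.post(f"{BASE_URL}/ingestion/dimension", json=payload)
print(response.status_code, response.json())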

C. Execution of Dataset API

API: ingestion/dataset

HTTP Method: POST

The request body for the API can be seen in the screenshot below.

The POST API validates the request body with the AJV validator and writes the data to a CSV file stored in the “input-files” folder.
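As with the other ingestion APIs, a hedged sketch of the request follows; the field names and values are assumptions standing in for the screenshot.

import requests

BASE_URL = "http://localhost:3000"  # assumed host and port

payload = {
    "dataset_name": "attendance_by_school",   # assumed field name; must match the spec/dataset schema
    "dataset": [                               # assumed field name: array of dataset records
        {"school_id": "1001", "date": "2023-01-10", "attendance_percent": 93.5},  # illustrative
    ],
}
response = requests.post(f"{BASE_URL}/ingestion/dataset", json=payload)
print(response.status_code, response.json())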

Execution of Schedule API

Note: To call the Schedule API, first run the spec/pipeline API.

The request body of the Schedule API is given in the screenshot below.

The schedule API helps to run the processor group at a particular time.

It takes scheduled_at and pipeline_name in the request body, where pipeline_name is the processor group name present in NiFi.
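A rough sketch of the call follows; the endpoint path, the scheduled_at value, and the pipeline name are assumptions, since the original screenshot is not shown here.

import requests

BASE_URL = "http://localhost:3000"  # assumed host and port

payload = {
    "pipeline_name": "student_attendance_pipeline",  # NiFi processor group name (placeholder)
    "scheduled_at": "0 0 1 * * ?",                    # placeholder schedule value
}
response = requests.post(f"{BASE_URL}/ingestion/schedule", json=payload)  # path assumed
print(response.status_code, response.json())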

The transformer will read the csv files and ingest the data into the database.

Execution of File status API

There are two types of file-status API:

A. GET file-status API

B. PUT file-status API

A. Execution of GET file-status API

API: ingestion/file-status

HTTP Method: GET

This API checks the status of an uploaded CSV file. Because the CSV data is validated against the schema, there may be a large volume of data to validate, so this API reports the file status of the uploaded CSV. If any errors are present in the CSV, the status of the file reflects them. The request body is:
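A minimal sketch of the status check is given below; the query parameter name and file name are assumptions, since the request details come from the screenshot.

import requests

BASE_URL = "http://localhost:3000"  # assumed host and port

response = requests.get(
    f"{BASE_URL}/ingestion/file-status",
    params={"filename": "student_attendance.csv"},  # assumed parameter name and placeholder file
)
print(response.status_code, response.json())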

B. Execution of PUT file-status API

API : ingestion/file-status

HTTP Method: PUT

This API updates the status of CSV files in the file_tracker table in the database. Once the CSV upload is successful, the files are stored in the input-files folder; this API is integrated to move those files to the processing folder and then to the archived-files folder. With the help of this API, the file status in the file_tracker table is updated to reflect whether the file is in the processing or archived state.

The request body for the API is:
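A hedged sketch of the update call follows; the field names and status value are assumptions based on the description above.

import requests

BASE_URL = "http://localhost:3000"  # assumed host and port

payload = {
    "filename": "student_attendance.csv",  # assumed field name and placeholder file
    "status": "Processing",                # assumed value, e.g. Processing or Archived
}
response = requests.put(f"{BASE_URL}/ingestion/file-status", json=payload)
print(response.status_code, response.json())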


Yaml File

YAML is a human-readable data serialization language often used to create configuration files for any programming language.

YAML file: https://github.com/Sunbird-cQube/spec-ms/blob/dev/spec.yaml. Copy all the content from this file and paste it into the Swagger Editor; with Swagger you can see the example request body and the expected response for each API.
