...

  • It reads the config.json file for the specific state from the ingest folder.

  • It then processes all the dimension grammars present in the dimensions folder.

  • The dimension grammars are stored in the “spec.dimensionGrammar” table, and the dimension tables are created in the dimensions schema.

  • It also looks for the data files that correspond to each dimension grammar file name and ingests the dimension data into the respective tables.

  • After the dimensions are ingested, the programs array in config.json is read and the event grammars are processed from the corresponding <program-name> folder. The event grammars are stored in the spec."EventGrammars" table.

  • The dataset grammars are also stored in the spec."datasetGrammars" table, and the dataset tables are created from the combinations of timeDimension, dimension and metric present in the event grammars.

  • In addition to the dataset combinations created above, the user can specify the combinations of dimensions for which datasets are created in the whitelisted array of each program in config.json.
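
The whitelisted array holds comma-separated dimension names, one entry per dataset combination. A minimal Python sketch of how such entries could be split into dimension combinations is given here. The function name whitelisted_combinations is hypothetical, and dropping repeated entries is an assumption made for illustration; the tool's actual handling may differ.

```python
from typing import List, Tuple


def whitelisted_combinations(program: dict) -> List[Tuple[str, ...]]:
    """Split each comma-separated whitelist entry into a tuple of dimension names.

    Hypothetical helper for illustration only; it is not part of the tool.
    """
    seen = set()
    combos = []
    for entry in program.get("dimensions", {}).get("whitelisted", []):
        combo = tuple(name.strip() for name in entry.split(",") if name.strip())
        # Assumption: an entry listed twice describes the same dataset, so keep it once.
        if combo and combo not in seen:
            seen.add(combo)
            combos.append(combo)
    return combos


# Program entry copied from the NAS block of config.json.
program = {
    "name": "NAS",
    "namespace": "nas",
    "dimensions": {
        "whitelisted": ["district,lo,subject,grade", "state,lo,subject,grade"],
        "blacklisted": [],
    },
}

combos = whitelisted_combinations(program)
for combo in combos:
    print(program["namespace"], combo)
```

Each tuple lists the dimensions that one dataset is grouped by.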

Config.json file

{
  "globals": {
    "onlyCreateWhitelisted": true
  },
  "dimensions": {
    "namespace": "dimensions",
    "fileNameFormat": "${dimensionName}.${index}.dimensions.data.csv",
    "input": {
      "files": "./ingest/JH/dimensions"
    }
  },
  "programs": [
    {
      "name": "DIKSHA",
      "namespace": "diksha",
      "description": "DIKSHA",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/diksha"
      },
      "./output": {
        "location": "./output/programs/diksha"
      },
      "dimensions": {
        "whitelisted": [
          "state,grade,subject,medium,board",
          "textbookdiksha,grade,subject,medium"
        ],
        "blacklisted": []
      }
    },
    {
      "name": "School Attendance",
      "namespace": "sch_att",
      "description": "School Attendance",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/school-attendance"
      },
      "./output": {
        "location": "./output/programs/school-attendance"
      },
      "dimensions": {
        "whitelisted": [
          "gender,district",
          "gender,block",
          "gender,cluster",
          "school,grade",
          "gender,school",
          "gender,school,grade",
          "schoolcategory,district",
          "schoolcategory,block",
          "schoolcategory,cluster"
        ],
        "blacklisted": []
      }
    },
    {
      "name": "PM Poshan",
      "namespace": "pm_poshan",
      "description": "PM Poshan",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/pm-poshan"
      },
      "./output": {
        "location": "./output/programs/pm-poshan"
      },
      "dimensions": {
        "whitelisted": [
          "district,categorypm"
        ],
        "blacklisted": []
      }
    },
    {
      "name": "NAS",
      "namespace": "nas",
      "description": "NAS",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/nas"
      },
      "./output": {
        "location": "./output/programs/nas"
      },
      "dimensions": {
        "whitelisted": [
          "district,lo,subject,grade",
          "state,lo,subject,grade"
        ],
        "blacklisted": []
      }
    },
    {
      "name": "UDISE",
      "namespace": "udise",
      "description": "UDISE",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/udise"
      },
      "./output": {
        "location": "./output/programs/udise"
      },
      "dimensions": {
        "whitelisted": [
          "district,categoryudise",
          "state,categoryudise"
        ],
        "blacklisted": []
      }
    },
    {
      "name": "PGI",
      "namespace": "pgi",
      "description": "PGI",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/pgi"
      },
      "./output": {
        "location": "./output/programs/pgi"
      },
      "dimensions": {
        "whitelisted": [
          "state,district,categorypgi",
          "state,categorypgi"
        ],
        "blacklisted": []
      }
    },
    {
      "name": "NISHTHA",
      "namespace": "nishtha",
      "description": "NISHTHA",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/nishtha"
      },
      "./output": {
        "location": "./output/programs/nishtha"
      },
      "dimensions": {
        "whitelisted": [
          "state,district,programnishtha",
          "state,programnishtha,coursenishtha",
          "state,programnishtha",
          "district,programnishtha"
        ],
        "blacklisted": []
      }
    },
    {
      "name": "Student Progression",
      "namespace": "student_progression",
      "description": "Student Progression",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/student-progression"
      },
      "./output": {
        "location": "./output/programs/student-progression"
      },
      "dimensions": {
        "whitelisted": [
          "school,academicyear"
        ],
        "blacklisted": []
      }
    },
    {
      "name": "School Infrastructure",
      "namespace": "school_infra",
      "description": "School Infrastructure",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/school-infra"
      },
      "./output": {
        "location": "./output/programs/school-infra"
      },
      "dimensions": {
        "whitelisted": [
          "school,academicyear"
        ],
        "blacklisted": []
      }
    },
    {
      "name": "Student Assessment",
      "namespace": "assessment",
      "description": "Student Assessment",
      "shouldIngestToDB": true,
      "input": {
        "files": "./ingest/JH/programs/student-assessment"
      },
      "./output": {
        "location": "./output/programs/student-assessment"
      },
      "dimensions": {
        "whitelisted": [
          "exam,grade,academicyear,subject,lo,school",
          "state,lo,subject,grade",
          "district,subject,grade"
        ],
        "blacklisted": []
      }
    }
  ]
}
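
To show how the fields above might be consumed, here is a minimal Python sketch that parses a trimmed copy of the configuration, expands fileNameFormat, and lists the programs flagged for ingestion. The helper names, the dimension name "district" and the index value 1 are hypothetical. That shouldIngestToDB acts as a filter is an assumption based on the key name, not something this page confirms.

```python
import json
from string import Template

# Trimmed copy of config.json with a single program, for illustration.
CONFIG_TEXT = """
{
  "globals": {"onlyCreateWhitelisted": true},
  "dimensions": {
    "namespace": "dimensions",
    "fileNameFormat": "${dimensionName}.${index}.dimensions.data.csv",
    "input": {"files": "./ingest/JH/dimensions"}
  },
  "programs": [
    {
      "name": "PM Poshan",
      "namespace": "pm_poshan",
      "shouldIngestToDB": true,
      "input": {"files": "./ingest/JH/programs/pm-poshan"},
      "dimensions": {"whitelisted": ["district,categorypm"], "blacklisted": []}
    }
  ]
}
"""

config = json.loads(CONFIG_TEXT)


def dimension_file_name(config: dict, dimension_name: str, index: int) -> str:
    """Fill the ${dimensionName} and ${index} placeholders of fileNameFormat."""
    template = Template(config["dimensions"]["fileNameFormat"])
    return template.substitute(dimensionName=dimension_name, index=index)


def programs_to_ingest(config: dict) -> list:
    """Return (namespace, input folder) pairs for programs with shouldIngestToDB set.

    Assumption: the flag decides whether a program is ingested.
    """
    return [
        (program["namespace"], program["input"]["files"])
        for program in config["programs"]
        if program.get("shouldIngestToDB")
    ]


example_file = dimension_file_name(config, "district", 1)  # hypothetical values
ingest_list = programs_to_ingest(config)
print(example_file)
print(ingest_list)
```

Python's string.Template uses the same ${name} placeholder syntax as fileNameFormat, so the pattern can be expanded without custom parsing.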

yarn cli ingest-data: This command ingests the data into the dataset tables for all the programs. It also provides an option to ingest the data for a particular program.

...