Introduction:
This document describes the design for generating dashboard data from Druid and exporting it to cloud storage. It consists of two main parts:
- Request Report API - used to request a new report.
- Report Data Generator - a Spark job that runs on a schedule, generates data from Druid, and exports it to cloud storage for each submitted request.
Request Report API:
- Inputs:
- * Channel Id
- * Time Interval - for example: Last 7 Days, Last Month, Current Month
- * JSON Report Config
- Outputs:
- * Saves a row to the Cassandra platform_db.job_request table with the report-id, job_name "druid-reports", and status "SUBMITTED"
- * Builds the Druid query from the inputs and saves it to Azure as "report-id.txt"
- * Saves "report-id.json" to Azure
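
The query-building step above can be sketched as follows. This is a minimal illustration, not the actual implementation. The shape of the JSON report config (keys "dataSource", "dimensions", "metrics"), the "channel" dimension name, the choice of a groupBy query with longSum aggregations, and the exact meaning of each time-interval label (for example, "Last Month" as the previous calendar month) are all assumptions made for the example.

```python
from datetime import date, timedelta


def resolve_interval(label, today):
    """Map a time-interval label to an ISO-8601 "start/end" string.

    The end date is exclusive. The label semantics are assumptions.
    """
    if label == "Last 7 Days":
        start, end = today - timedelta(days=7), today
    elif label == "Last Month":
        end = today.replace(day=1)                        # first day of current month
        start = (end - timedelta(days=1)).replace(day=1)  # first day of previous month
    elif label == "Current Month":
        start, end = today.replace(day=1), today + timedelta(days=1)
    else:
        raise ValueError(f"unsupported time interval: {label}")
    return f"{start.isoformat()}/{end.isoformat()}"


def build_druid_query(channel_id, interval_label, report_config, today):
    """Build a Druid native groupBy query as a dict.

    report_config keys and the "channel" dimension are hypothetical.
    """
    return {
        "queryType": "groupBy",
        "dataSource": report_config["dataSource"],
        "granularity": "all",
        "intervals": [resolve_interval(interval_label, today)],
        "dimensions": report_config["dimensions"],
        "aggregations": [
            {"type": "longSum", "name": metric, "fieldName": metric}
            for metric in report_config["metrics"]
        ],
        "filter": {"type": "selector", "dimension": "channel", "value": channel_id},
    }
```

Passing `today` explicitly keeps the interval logic deterministic, so the same request always produces the same stored query text. The resulting dict can be serialized with `json.dumps` before it is saved as "report-id.txt".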
Report Data Generator:
- Searches the Cassandra platform_db.job_request table for rows with job_name = "druid-reports" and status = "SUBMITTED"
- Executes the Druid query for each report in the resulting list
- Saves the query response JSON to Azure as "report-data/report-id.json"
- Updates the URL path in "report-id.json" to point to the saved report data
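
The generator steps above can be sketched as a single loop. This is a simplified illustration under stated assumptions: plain Python structures stand in for the Cassandra table and the Azure container, `execute_query` is a hypothetical callable standing in for the Druid client, the stored query text is assumed to be JSON, "report-id" in the file names is assumed to be a placeholder for the actual id, and the `data_url` field name is invented for the example.

```python
import json


def run_report_generator(job_requests, storage, execute_query):
    """Process every submitted "druid-reports" request.

    job_requests:  iterable of dicts standing in for platform_db.job_request rows
    storage:       dict standing in for the Azure container (path -> text)
    execute_query: hypothetical callable, takes a query dict, returns the response
    """
    processed = []
    for row in job_requests:
        # Step 1: keep only submitted druid-reports requests.
        if row["job_name"] != "druid-reports" or row["status"] != "SUBMITTED":
            continue
        report_id = row["report_id"]

        # Step 2: load the stored query and execute it.
        query = json.loads(storage[f"{report_id}.txt"])
        response = execute_query(query)

        # Step 3: save the query response under report-data/.
        data_path = f"report-data/{report_id}.json"
        storage[data_path] = json.dumps(response)

        # Step 4: record the data location in the report file.
        report = json.loads(storage[f"{report_id}.json"])
        report["data_url"] = data_path  # hypothetical field name
        storage[f"{report_id}.json"] = json.dumps(report)

        processed.append(report_id)
    return processed
```

Injecting the storage and query client as parameters keeps the loop testable without a live Cassandra, Azure, or Druid instance. In the scheduled Spark job these stand-ins would be replaced by the real connectors.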