Design for Automated Reports from Druid

Introduction:

This document describes the detailed design for generating the data behind the portal dashboard reports from Druid and exporting that data to cloud storage. The design has two parts:

Request Report API - used to request a new report.
Report Data Generator - a Spark job that runs on a schedule; for each submitted request, it queries Druid and exports the resulting data to cloud storage.

Request Report API:

Inputs:

Channel Id
Time Interval - for example, Last 7 Days, Last Month, Current Month
JSON Report Config
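
The relative Time Interval labels have to be turned into concrete dates before a Druid query can be built. The following is a minimal sketch of one way to do that, not the actual implementation. The function name resolve_interval is hypothetical, and the exact boundaries of each label (for instance, whether "Last 7 Days" includes today) are assumptions, since this document does not define them. The output uses the ISO-8601 start/end form that Druid accepts for intervals, with the end date exclusive.

```python
from datetime import date, timedelta


def resolve_interval(label: str, today: date) -> str:
    """Turn a relative label into an ISO-8601 'start/end' string (end exclusive)."""
    first_of_month = today.replace(day=1)
    if label == "Last 7 Days":
        # Assumption: the seven full days before today, excluding today.
        start, end = today - timedelta(days=7), today
    elif label == "Last Month":
        # One day before the first of this month always falls in the previous month.
        start = (first_of_month - timedelta(days=1)).replace(day=1)
        end = first_of_month
    elif label == "Current Month":
        # Assumption: month to date, including today.
        start, end = first_of_month, today + timedelta(days=1)
    else:
        raise ValueError(f"Unsupported time interval: {label}")
    return f"{start.isoformat()}/{end.isoformat()}"
```

Passing today in as an argument, instead of calling date.today() inside the function, keeps the result reproducible when a scheduled run is retried or tested.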

Outputs:

Save the report-id to the Cassandra platform_db.job_request table with job_name "druid-reports" and status "SUBMITTED"
Build the Druid query from the inputs and save it to Azure as "report-id.txt"
Save "report-id.json" to Azure
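
The three output steps above can be sketched as follows. This is an illustration under stated assumptions, not the actual API code: a plain list stands in for the Cassandra table and a dict stands in for Azure storage, so the sketch runs without either service. The row keys (report_id, job_name, status), the config key dataSource, the dimension name channel, and the count aggregation are all hypothetical. The sketch also assumes that "report-id.json" holds the submitted JSON Report Config, which this document does not state explicitly. The query itself uses the field names of a Druid native timeseries query.

```python
import json
import uuid


def build_druid_query(channel_id: str, interval: str, report_config: dict) -> dict:
    """Build a Druid native timeseries query from the request inputs."""
    return {
        "queryType": "timeseries",
        "dataSource": report_config["dataSource"],  # hypothetical config key
        "intervals": [interval],  # ISO-8601 "start/end" string
        "granularity": report_config.get("granularity", "all"),
        # Hypothetical dimension name used to restrict the report to one channel.
        "filter": {"type": "selector", "dimension": "channel", "value": channel_id},
        "aggregations": [{"type": "count", "name": "total_count"}],
    }


def request_report(channel_id, interval, report_config, job_table, blob_store):
    """Register a report request and store its query and config; return the report id."""
    report_id = str(uuid.uuid4())
    # Step 1: record the request as SUBMITTED (list stands in for job_request).
    job_table.append(
        {"report_id": report_id, "job_name": "druid-reports", "status": "SUBMITTED"}
    )
    # Step 2: build the query and store it as "<report-id>.txt".
    query = build_druid_query(channel_id, interval, report_config)
    blob_store[f"{report_id}.txt"] = json.dumps(query)
    # Step 3: store "<report-id>.json" (assumed here to be the report config).
    blob_store[f"{report_id}.json"] = json.dumps(report_config)
    return report_id
```

Storing the fully built query at request time means the scheduled job only has to read and execute it, without needing the original inputs.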

Report Data Generator:

Search the Cassandra platform_db.job_request table for rows with job_name = "druid-reports" and status = "SUBMITTED"
Execute the Druid query for each report in the resulting list
Save the query response JSON to Azure as "report-data/report-id.json"
Update the URL path in "report-id.json"
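
The generator steps above can be sketched as a single loop. As a simplification, this sketch is plain Python and not a Spark job: a list stands in for the Cassandra table, a dict stands in for Azure storage, and the Druid call is passed in as a function so the logic can be exercised without a cluster. The row keys and the dataUrl field written into "report-id.json" are hypothetical names; this document only says that a URL path is updated.

```python
import json


def run_report_generator(job_table, blob_store, execute_query):
    """Process every submitted druid-reports request; return the report ids handled."""
    # Step 1: pick the requests that belong to this job and are still pending.
    pending = [
        row for row in job_table
        if row["job_name"] == "druid-reports" and row["status"] == "SUBMITTED"
    ]
    processed = []
    for row in pending:
        report_id = row["report_id"]
        # Step 2: load the stored query and run it against Druid.
        query = json.loads(blob_store[f"{report_id}.txt"])
        result = execute_query(query)
        # Step 3: save the response under report-data/.
        data_path = f"report-data/{report_id}.json"
        blob_store[data_path] = json.dumps(result)
        # Step 4: point "<report-id>.json" at the generated data.
        meta = json.loads(blob_store[f"{report_id}.json"])
        meta["dataUrl"] = data_path  # hypothetical field name
        blob_store[f"{report_id}.json"] = json.dumps(meta)
        processed.append(report_id)
    return processed
```

The steps listed here do not name a status that a request moves to after processing, so the sketch leaves the job_request rows unchanged.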