...

ReportName
ChannelIdDate
GenerationDateTimestamp



Job Scheduler Engine:




  • Input:

- The reports in the druid_reports_configuration Cassandra table whose cron_expression falls within the current day of execution.

  • Algorithm:

- The data availability check uses the following two parameters:

    1. Kafka indexing lag: a value of 0 indicates no lag in Druid ingestion.

    2. Druid segment count: a threshold range is derived from previous segment counts, and the segment count for the day should fall within that range.

Reports are submitted for execution once both criteria are satisfied.
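
The two criteria above can be sketched as a single check. This is a minimal illustration, not the engine's actual implementation: the function names, the `tolerance` parameter, and the rule of deriving the threshold range from the mean of previous counts are all assumptions, since the page does not say how the range is computed.

```python
from statistics import mean


def segment_threshold(previous_counts, tolerance=0.2):
    """Return a (low, high) range around the mean of previous daily segment counts.

    ASSUMPTION: the mean +/- tolerance rule is hypothetical; the page only
    says the range is based on previous segment counts.
    """
    if not previous_counts:
        raise ValueError("need at least one previous segment count")
    avg = mean(previous_counts)
    return avg * (1 - tolerance), avg * (1 + tolerance)


def is_data_available(kafka_lag, segment_count, previous_counts, tolerance=0.2):
    """Both criteria must hold: zero Kafka indexing lag and a segment count in range."""
    low, high = segment_threshold(previous_counts, tolerance)
    return kafka_lag == 0 and low <= segment_count <= high
```

With previous counts of 24 per day, a day with 24 segments and no lag passes the check, while any non-zero lag or a count of 10 segments does not.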

  • Output:

- The reports are submitted for execution to the platform_db.job_request Cassandra table with status=SUBMITTED and job_name=druid-reports-<report-id>.
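
As a rough sketch of the submitted record, the helper below builds only the two fields named above. The function name `build_job_request` is hypothetical, and the remaining columns of platform_db.job_request are not shown on this page, so they are deliberately left out rather than guessed.

```python
def build_job_request(report_id):
    """Build the fields of a job request that this page specifies.

    ASSUMPTION: a plain dict stands in for the Cassandra row; any other
    columns of platform_db.job_request are omitted because they are not
    described here.
    """
    return {
        "status": "SUBMITTED",
        "job_name": f"druid-reports-{report_id}",
    }
```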

Disable Report API:

  • Input:

...

Set of requests, i.e. all records in platform_db.job_request where status=SUBMITTED and job_name starts with druid-reports
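
The selection rule above could look like the following when applied to rows already read from platform_db.job_request. This is only an illustration: rows are modelled as plain dicts, and the function name and the choice to filter in application code are assumptions.

```python
JOB_NAME_PREFIX = "druid-reports"


def select_pending_report_requests(rows):
    """Keep rows with status SUBMITTED whose job_name starts with the prefix.

    ASSUMPTION: each row is a dict with "status" and "job_name" keys.
    """
    return [
        row
        for row in rows
        if row.get("status") == "SUBMITTED"
        and str(row.get("job_name") or "").startswith(JOB_NAME_PREFIX)
    ]
```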

  • Output:

- The report data file is saved in Azure in the specified format.
- The platform_db.job_request table is updated with the job status, and the output file details are updated in platform_db.druid_reports_configuration.

...