...
Configure Report API:
Input parameters
Parameter | Mandatory | Description | Comments |
---|---|---|---|
report_name | Yes | Name of the report | |
query_engine | Yes | Data Source | DRUID, CASSANDRA, ELASTICSEARCH |
execution_frequency | Yes | Report generation frequency | DAILY, WEEKLY, MONTHLY |
channel_id | No | ChannelId for filtering | |
report_interval | Yes | Date range for queries | |
query | Yes | Query to be executed | |
output_format | Yes | Output format of the report | json, csv |
output_file_pattern | No | Report output filename pattern | Supported Placeholders are: |
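The parameter table above can be turned into a small pre-submission check. The following is a minimal sketch, not the API's own validation: the function name `validate_report_request` is hypothetical, the case-insensitive comparison of enumerated values is an assumption (the table lists `DRUID` while the sample request uses `druid`), and the query is accepted under either `query` (as in the table) or `query_json` (as in the sample request).

```python
from typing import Dict, List

# Mandatory parameters and allowed values, copied from the table above.
MANDATORY = (
    "report_name",
    "query_engine",
    "execution_frequency",
    "report_interval",
    "output_format",
)
ALLOWED = {
    "query_engine": {"DRUID", "CASSANDRA", "ELASTICSEARCH"},
    "execution_frequency": {"DAILY", "WEEKLY", "MONTHLY"},
    "output_format": {"JSON", "CSV"},
}


def validate_report_request(request: Dict) -> List[str]:
    """Return a list of problems found in the request; empty means it passed."""
    errors = []
    for name in MANDATORY:
        if request.get(name) in (None, ""):
            errors.append(f"missing mandatory parameter: {name}")
    # Assumption: the query may arrive as `query` or as `query_json`.
    if not (request.get("query") or request.get("query_json")):
        errors.append("missing mandatory parameter: query")
    for name, allowed in ALLOWED.items():
        value = request.get(name)
        if value in (None, ""):
            continue  # already reported as missing above
        # Assumption: enumerated values are compared case-insensitively.
        if str(value).upper() not in allowed:
            errors.append(f"invalid value for {name}: {value!r}")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one response.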
Request Object
```json
{
  "id": "sunbird.analytics.report.submit",
  "ver": "1.0",
  "ts": "2019-03-07T12:40:40+05:30",
  "params": {
    "msgid": "4406df37-cd54-4d8a-ab8d-3939e0223580",
    "client_key": "analytics-team"
  },
  "request": {
    "channel_id": "in.ekstep",
    "report_name": "avg_collection_downloads",
    "query_engine": "druid",
    "execution_frequency": "DAILY",
    "report_interval": "LAST_7_DAYS",
    "output_format": "json",
    "query_json": {
      "queryType": "groupBy",
      "dataSource": "telemetry-events",
      "granularity": "day",
      "dimensions": [ "eid" ],
      "aggregations": [
        { "type": "count", "name": "context_did", "fieldName": "context_did" }
      ],
      "filter": {
        "type": "and",
        "fields": [
          { "type": "selector", "name": "eid", "fieldName": "IMPRESSION" },
          { "type": "selector", "name": "edata_type", "fieldName": "detail" },
          { "type": "selector", "name": "edata_pageid", "fieldName": "collection-detail" },
          { "type": "selector", "name": "context_pdata_id", "fieldName": "prod.diksha.app" }
        ]
      },
      "postAggregations": [
        {
          "type": "arithmetic",
          "name": "avg__edata_value",
          "fn": "/",
          "fields": [
            { "type": "fieldAccess", "name": "total_edata_value", "fieldName": "total_edata_value" },
            { "type": "fieldAccess", "name": "rows", "fieldName": "rows" }
          ]
        }
      ],
      "intervals": [ "2019-02-20T00:00:00.000/2019-01-27T23:59:59.000" ]
    }
  }
}
```
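As an illustration of how a client might assemble the outer envelope of the request object, here is a minimal sketch. The helper name `build_submit_envelope` is hypothetical; the `id`, `ver`, and field layout are taken from the sample request, while generating `msgid` as a random UUID and `ts` as the local time with a UTC offset are assumptions based on how the sample values look.

```python
import json
import uuid
from datetime import datetime
from typing import Dict


def build_submit_envelope(request: Dict, client_key: str) -> Dict:
    """Wrap report parameters in the envelope shown in the sample request."""
    return {
        "id": "sunbird.analytics.report.submit",
        "ver": "1.0",
        # Assumption: ts is an ISO-8601 timestamp with a UTC offset.
        "ts": datetime.now().astimezone().isoformat(timespec="seconds"),
        "params": {
            # Assumption: msgid is a freshly generated UUID per request.
            "msgid": str(uuid.uuid4()),
            "client_key": client_key,
        },
        "request": request,
    }


if __name__ == "__main__":
    envelope = build_submit_envelope(
        {"report_name": "avg_collection_downloads", "output_format": "json"},
        client_key="analytics-team",
    )
    print(json.dumps(envelope, indent=2))
```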
- Output:
...
Job Scheduler Engine:
- Input:
  - A list of reports in the druid_reports_configuration Cassandra table whose cron_expression falls within the current day of execution.
- Algorithm:
...
  - Data availability check has the following 2 criteria:
    1. Kafka indexing lag: check for 0 lag in Druid ingestion.
    2. Druid segments count: segments should have been created for the previous day.
...
  - Reports based on telemetry-events are submitted for execution once both criteria are satisfied.
...
  - Reports based on summary-events are submitted for execution once the 2nd criterion alone is satisfied, or alternatively after a check for files in Azure.
- Output:
  - The reports are submitted for execution to the platform_db.job_request Cassandra table with status=SUBMITTED and job_name=druid-reports-<report-id>.
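The scheduler's decision rule and output can be sketched as follows. This is an illustration under stated assumptions, not the engine's implementation: the `Availability` class, the function names, and the `report_id` / `data_source` keys are hypothetical, the lag and segment counts are assumed to have been fetched elsewhere, the alternative Azure file check for summary-events is left out, and only the two job_request fields named above (`status`, `job_name`) are produced.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Availability:
    kafka_indexing_lag: int      # criterion 1: current Druid ingestion lag
    previous_day_segments: int   # criterion 2: segments created for the previous day


def is_ready(data_source: str, availability: Availability) -> bool:
    """Apply the data availability rules for one report's data source."""
    segments_ok = availability.previous_day_segments > 0
    if data_source == "telemetry-events":
        # Both criteria must hold.
        return availability.kafka_indexing_lag == 0 and segments_ok
    if data_source == "summary-events":
        # Only the 2nd criterion is required (Azure file check not modelled here).
        return segments_ok
    # Assumption: other data sources are not covered by the rules above.
    raise ValueError(f"no availability rule for data source: {data_source}")


def to_job_request(report_id: str) -> Dict[str, str]:
    """Build the two job_request fields named in the Output section."""
    return {"job_name": f"druid-reports-{report_id}", "status": "SUBMITTED"}


def schedule(reports: List[Dict[str, str]], availability: Availability) -> List[Dict[str, str]]:
    """Return job_request rows for every report whose data is available."""
    return [
        to_job_request(report["report_id"])
        for report in reports
        if is_ready(report["data_source"], availability)
    ]
```

With a non-zero Kafka indexing lag, for example, a telemetry-events report is held back while a summary-events report is still submitted, provided the previous day's segments exist.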
...