Introduction
This wiki describes the architecture that enables the reporting framework to operate at scale. It discusses the high-level design problems to be solved and introduces the proposed architecture.
Key Design Problems
TBA
...
Reporting Architecture
...
Druid Architecture
...
...
Druid Data
...
Telemetry data models for Druid
...
Model
Raw Telemetry
No. | Dimension in Druid | Field in Telemetry | Description | Data Type |
---|---|---|---|---|
1 | eid | eid | Event Id | String |
2 | syncts | syncts | Sync Timestamp | Long |
 | @timestamp | @timestamp | | String |
3 | actor_id | actor.id | Actor Id of the event | String |
4 | actor_type | actor.type | Type of the actor | String |
5 | channel_id | context.channel | Channel Id | String |
6 | producer_id | context.pdata.id | Producer Id | String |
7 | producer_pid | context.pdata.pid | Producer Process Id | String |
8 | context_env | context.env | Context Environment | String |
9 | sid | context.sid | Session Id | String |
10 | did | context.did | Device Id | String |
11 | context_cdata_type | context.cdata.type | Correlation Data Type | String |
12 | context_cdata_id | context.cdata.id | Correlation Data Id | Array[String] |
13 | object_id | object.id | Content Id | String |
14 | object_type | object.type | Content Type | String |
15 | object_version | object.ver | Content Version | String |
16 | tags | tags | Tags | Array[String] |
17 | edata_type | edata.type | Event type | String |
18 | edata_subtype | edata.subtype | Event subtype | String |
19 | edata_mode | edata.mode | START event Mode of start | String |
20 | edata_pageid | edata.pageid | Unique pageid | String |
21 | edata_uri | edata.uri | IMPRESSION event Relative URI of the content | String |
22 | edata_id | edata.id | Event data Id | String |
23 | edata_duration | edata.duration | Duration of the event | String |
24 | edata_index | edata.index | ASSESS event Index of the question within a content | String |
25 | edata_pass | edata.pass | ASSESS event Field to identify pass or fail for assessments | String |
26 | edata_score | edata.score | ASSESS event Assessment score | Double |
27 | edata_resvalues | edata.resvalues | ASSESS event Assessment results | Array[Object] |
28 | edata_item_id | edata.item.id | ASSESS event Assessment item id | String |
29 | edata_item_title | edata.item.title | ASSESS event Assessment item title | String |
30 | edata_item_maxscore | edata.item.maxscore | ASSESS event Assessment item max score | Double |
31 | edata_target_id | edata.target.id | ASSESS event Assessment item target id | String |
32 | edata_target_type | edata.target.type | ASSESS event Assessment item target type | String |
33 | edata_rating | edata.rating | FEEDBACK event Ratings | String |
34 | edata_comments | edata.comments | FEEDBACK event Comments | String |
35 | edata_dir | edata.dir | SHARE event direction | String |
36 | edata_items_id | edata.items.id | SHARE event shared item ids | String |
37 | edata_items_type | edata.items.type | SHARE item types | String |
38 | edata_items_origin_id | edata.items.origin.id | SHARE event source id | String |
39 | edata_items_origin_type | edata.items.origin.type | SHARE event source type | String |
40 | edata_items_to_id | edata.items.to.id | SHARE event destination id | String |
41 | edata_items_to_type | edata.items.to.type | SHARE event destination type | String |
42 | edata_state | edata.state | AUDIT event current state | String |
43 | edata_prevstate | edata.prevstate | AUDIT event previous state | String |
 | edata_error | edata.err | ERROR event error code | String |
 | edata_error_type | edata.errtype | ERROR event error type | String |
 | edata_message | edata.message | LOG event message | String |
 | edata_level | edata.level | LOG event log level | |
44 | edata_size | edata.size | SEARCH event result size | Integer |
45 | edata_filters_dialcodes | edata.filters.dialcodes | SEARCH event List of dialcodes | Array[String] |
46 | dloc_state | ldata.state | State location information for the device | String |
47 | dloc_state_code | ldata.state_code | State ISO code information for the device | String |
48 | dloc_city | ldata.city | City location information for the device | String |
49 | dloc_country_code | ldata.country_code | Country ISO code information for the device | String |
50 | dloc_country | ldata.country | Country location information for the device | String |
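One way to realise this mapping at ingestion time is Druid's `flattenSpec`, which pulls nested JSON fields up into top-level dimensions. The sketch below generates `flattenSpec` field entries from a few rows of the table above. The helper name `flatten_fields` and the selection of rows are illustrative assumptions; this wiki does not state how the flattening is actually configured.

```python
import json

# A few (Druid dimension, telemetry field) pairs copied from the table above.
DIMENSIONS = [
    ("eid", "eid"),
    ("actor_id", "actor.id"),
    ("producer_id", "context.pdata.id"),
    ("edata_item_maxscore", "edata.item.maxscore"),
]


def flatten_fields(dimensions):
    """Build Druid flattenSpec field entries (hypothetical helper)."""
    fields = []
    for name, path in dimensions:
        if "." in path:
            # Nested field: address it with a JSONPath expression.
            fields.append({"type": "path", "name": name, "expr": "$." + path})
        else:
            # Top-level field: read it directly from the event root.
            fields.append({"type": "root", "name": name})
    return fields


flatten_spec = {"useFieldDiscovery": False, "fields": flatten_fields(DIMENSIONS)}
print(json.dumps(flatten_spec, indent=2))
```

Generating the entries from the table keeps the dimension names and field paths in one place, so a change to the model does not have to be repeated by hand in the ingestion spec.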
Summary Events
...
No. | Dimension in Druid | Field in Summary event | Description | Data Type |
---|---|---|---|---|
1 | eid | eid | Event Id | String |
2 | ver | ver | Version | String |
3 | syncts | syncts | Sync timestamp | Long |
4 | uid | uid | User Id | String |
5 | context_date_range_from | context.date_range.from | Start Date for the summary | String |
6 | context_date_range_to | context.date_range.to | End Date for the summary | String |
7 | context_rollup_l1 | context.rollup.l1 | Context level1 rollup | String |
8 | context_rollup_l2 | context.rollup.l2 | Context level2 rollup | String |
9 | context_rollup_l3 | context.rollup.l3 | Context level3 rollup | String |
10 | context_rollup_l4 | context.rollup.l4 | Context level4 rollup | String |
11 | channel_id | dimensions.channel | Channel Id as dimension from raw telemetry | String |
12 | device_id | dimensions.did | Device Id as dimension from raw telemetry | String |
13 | producer_id | dimensions.pdata.id | Producer Id as dimension from raw telemetry | String |
14 | producer_pid | dimensions.pdata.pid | Producer Process Id as dimension from raw telemetry | String |
15 | session_id | dimensions.sid | Session Id as dimension | String |
16 | session_type | dimensions.type | Type of summary | String |
17 | session_mode | dimensions.mode | Mode of action in the session | String |
18 | object_id | object.id | Content Id | String |
19 | object_type | object.type | Content Type | String |
21 | object_version | object.ver | Content version | String |
22 | object_rollup_l1 | object.rollup.l1 | Object level1 rollup | String |
23 | object_rollup_l2 | object.rollup.l2 | Object level2 rollup | String |
24 | object_rollup_l3 | object.rollup.l3 | Object level3 rollup | String |
25 | object_rollup_l4 | object.rollup.l4 | Object level4 rollup | String |
26 | tags | tags | Tags attached to a summary event | Array[String] |
27 | time_spent | edata.eks.time_spent | Time spent in the session excluding idle time | String |
28 | time_difference | edata.eks.time_diff | Total time in a session including idle time | String |
29 | interaction_count | edata.eks.interact_events_count | Total count of interact events in a session | Long |
30 | summary_env | edata.eks.env_summary.env | High level env within the app (content, domain, resources, community) | String |
31 | summary_env_count | edata.eks.env_summary.count | Count of times the environment has been visited | Integer |
32 | summary_env_time_spent | edata.eks.env_summary.time_spent | Time spent per env | Double |
33 | summary_page_id | edata.eks.page_summary.id | Page id | String |
34 | summary_page_type | edata.eks.page_summary.type | Type of page e.g. view/edit | String |
35 | summary_page_visit_count | edata.eks.page_summary.visit_count | Number of times each page was visited | String |
36 | summary_page_time_spent | edata.eks.page_summary.time_spent | Time taken per page | Double |
37 | item_responses_item_id | edata.eks.item_responses.itemId | Question Id passed in the ASSESS event | String |
38 | item_responses_time_spent | edata.eks.item_responses.timeSpent | Time spent in seconds from ASSESS event | String |
39 | item_responses_pass | edata.eks.item_responses.pass | Pass response for a question from ASSESS event | String |
40 | item_responses_score | edata.eks.item_responses.score | Score from ASSESS event | Array[Integer] |
41 | item_responses_max_score | edata.eks.item_responses.maxScore | Max Score from ASSESS event | Array[Integer] |
42 | item_responses_timestamp | edata.eks.item_responses.time_stamp | Timestamp for each response from ASSESS event | String |
43 | dloc_state | ldata.state | State location information for the device | String |
44 | dloc_state_code | ldata.state_code | State ISO code information for the device | String |
45 | dloc_city | ldata.city | City location information for the device | String |
46 | dloc_country_code | ldata.country_code | Country ISO code information for the device | String |
47 | dloc_country | ldata.country | Country location information for the device | String |
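To illustrate what the table above means for a single event, the following sketch follows each dotted field path through a nested summary event and emits one flat row keyed by the Druid dimension name. The sample event and the helper names `lookup` and `flatten` are made up for illustration, and only a handful of rows from the table are included.

```python
from functools import reduce

# A few (Druid dimension -> summary field) pairs copied from the table above.
SUMMARY_DIMENSIONS = {
    "eid": "eid",
    "channel_id": "dimensions.channel",
    "producer_id": "dimensions.pdata.id",
    "time_spent": "edata.eks.time_spent",
    "dloc_state": "ldata.state",
}


def lookup(event, dotted_path):
    """Follow a dotted path through nested dicts; None if any key is missing."""
    def step(node, key):
        return node.get(key) if isinstance(node, dict) else None
    return reduce(step, dotted_path.split("."), event)


def flatten(event, dimensions=SUMMARY_DIMENSIONS):
    """Return one flat row keyed by Druid dimension name."""
    return {name: lookup(event, path) for name, path in dimensions.items()}


# Made-up sample event, for illustration only.
sample = {
    "eid": "SAMPLE_SUMMARY",
    "dimensions": {"channel": "channel-1", "pdata": {"id": "sample.app"}},
    "edata": {"eks": {"time_spent": 42.5}},
}
row = flatten(sample)
print(row)
```

Fields that are absent from the event (here `ldata.state`) come out as `None` rather than raising an error, which is one possible way to handle events that lack location data.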
Aggregates
Granularity → DAY
Druid field name | Druid source field | Aggregate Type |
---|---|---|
total_interactions | interaction_count | SUM |
total_time_spent | time_spent | SUM |
total_sessions | mid | COUNT |
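The rollup described by the aggregates above can be sketched in plain Python to show what one DAY-granularity row contains. This is only an illustration of the arithmetic, not how Druid performs it. It assumes that `syncts` is an epoch timestamp in milliseconds, that days are bucketed in UTC, that `time_spent` has already been parsed to a number, and that every input row carries one `mid`, so counting rows is the same as counting `mid` values.

```python
from collections import defaultdict
from datetime import datetime, timezone


def rollup_by_day(rows):
    """Aggregate summary rows into one bucket per UTC day (illustrative only)."""
    totals = defaultdict(
        lambda: {"total_interactions": 0, "total_time_spent": 0.0, "total_sessions": 0}
    )
    for row in rows:
        # Assumption: syncts is epoch milliseconds.
        day = datetime.fromtimestamp(row["syncts"] / 1000, tz=timezone.utc).date().isoformat()
        bucket = totals[day]
        bucket["total_interactions"] += row["interaction_count"]  # SUM
        bucket["total_time_spent"] += row["time_spent"]           # SUM
        bucket["total_sessions"] += 1                             # COUNT of mid
    return dict(totals)


# Made-up rows: two sessions on 2019-01-01 and one on 2019-01-02 (UTC).
rows = [
    {"mid": "m1", "syncts": 1546300800000, "interaction_count": 3, "time_spent": 10.5},
    {"mid": "m2", "syncts": 1546304400000, "interaction_count": 2, "time_spent": 4.5},
    {"mid": "m3", "syncts": 1546387200000, "interaction_count": 7, "time_spent": 1.0},
]
daily = rollup_by_day(rows)
print(daily)
```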
...
Report JSON Spec
JSON Schema
```
[{
  id: String, // Required. Report ID.
  label: String, // Required. Report label (shown as the menu entry).
  title: String, // Optional. Report title. Defaults to the report label.
  description: String, // Optional. Report description. May contain HTML.
  dataSource: String, // Required. Location of the data source for the report. Can be an expression. For ex: /<report_id>/{{channel}}/report.json
  charts: [{ // Optional
    datasets: [{
      data: Array[Number], // Required if `dataExpr` is not provided. Data points to show in the chart.
      dataExpr: String, // Required if `data` is not provided. Expression pointing to the data in dataSource. For ex: {{data.noOfDownloads}}
      label: String // Required. Label to display on the chart.
    }],
    labels: Array[String], // Required if `labelsExpr` is not provided. Labels to show on the x-axis.
    labelsExpr: String, // Required if `labels` is not provided. Expression pointing to the data in dataSource. For ex: {{data.Date}}
    chartType: String, // Optional. Defaults to line. Available types: line, bar, radar, pie, polarArea and doughnut.
    colors: Array[String], // Optional. Color to show for each dataset. Defaults to ["#024F9D"].
    options: { // Optional. Display options. For the full set of options, see https://valor-software.com/ng2-charts/
      responsive: Boolean, // Defaults to true
      ...
    },
    legend: Boolean // Optional. Whether to show the legend. Defaults to true, positioned at the top.
  }],
  table: { // Optional
    "columns": Array[String], // Required if `columnsExpr` is not provided. Columns to show.
    "values": Array[Array[String]], // Required if `valuesExpr` is not provided. Column data.
    "columnsExpr": String, // Required if `columns` is not provided. Expression pointing to the data in dataSource. For ex: {{keys}}
    "valuesExpr": String // Required if `values` is not provided. Expression pointing to the data in dataSource. For ex: {{tableData}}
  },
  downloadUrl: String // Location to download the data as CSV
}]
```
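The required and "either/or" rules in the schema above can be checked mechanically. The sketch below is one possible validator, written under the assumption that a report spec has been loaded into a Python dict; the function name `validate_report` and the wording of the messages are hypothetical and not part of the reporting framework.

```python
def validate_report(report):
    """Return a list of problems found in one report spec (hypothetical helper)."""
    problems = []

    # Top-level fields marked "Required" in the schema.
    for key in ("id", "label", "dataSource"):
        if not report.get(key):
            problems.append(f"missing required field: {key}")

    def need_one_of(obj, literal, expr, where):
        # The schema lets each value be given inline or as an expression.
        if literal not in obj and expr not in obj:
            problems.append(f"{where}: provide {literal} or {expr}")

    for i, chart in enumerate(report.get("charts", [])):
        need_one_of(chart, "labels", "labelsExpr", f"charts[{i}]")
        for j, dataset in enumerate(chart.get("datasets", [])):
            where = f"charts[{i}].datasets[{j}]"
            need_one_of(dataset, "data", "dataExpr", where)
            if "label" not in dataset:
                problems.append(f"{where}: missing required field: label")

    table = report.get("table")
    if table is not None:
        need_one_of(table, "columns", "columnsExpr", "table")
        need_one_of(table, "values", "valuesExpr", "table")

    return problems


# Deliberately incomplete spec: no dataSource, and a dataset without data.
spec = {
    "id": "usage",
    "label": "Usage",
    "charts": [{"datasets": [{"label": "# of downloads"}], "labelsExpr": "{{data.Date}}"}],
}
for problem in validate_report(spec):
    print(problem)
```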
The following is an example spec for the general usage report:
```
{
  id: "usage",
  label: "Diksha Usage Report",
  title: "Diksha Usage Report",
  description: "The report provides a quick summary of the data analysed by the analytics team to track progress of Diksha across states. This report will be used to consolidate insights using various metrics on which Diksha is currently being mapped and will be shared on a weekly basis. The first section of the report will provide a snapshot of the overall health of the Diksha App. This will be followed by individual state sections that provide state-wise status of Diksha",
  dataSource: "/usage/$state/report.json",
  charts: [
    {
      datasets: [{
        dataExpr: "{{data.Number_of_downloads}}",
        label: "# of downloads"
      }],
      labelsExpr: "{{data.Date}}",
      chartType: "line"
    },
    {
      datasets: [{
        dataExpr: "{{data.Number_of_succesful_scans}}",
        label: "# of successful scans"
      }],
      labelsExpr: "{{data.Date}}",
      chartType: "bar"
    }
  ],
  table: {
    "columnsExpr": "{{key}}",
    "valuesExpr": "{{tableData}}"
  },
  downloadUrl: "<report_id>/$state/$timeFilter.csv"
}
```
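The `dataExpr` and `labelsExpr` values in this example point into the JSON fetched from `dataSource`. A minimal sketch of how such expressions might be resolved is shown below. It assumes that an expression is a single `{{dotted.path}}` and that `report.json` holds one array per field under `data`; neither the resolution rules nor the shape of `report.json` is specified in this wiki, and the sample data and the helper names `resolve` and `render_chart` are made up.

```python
import re

# Assumption: an expression is exactly one {{dotted.path}} placeholder.
EXPR = re.compile(r"^\{\{\s*([\w.]+)\s*\}\}$")


def resolve(expr, source):
    """Look up the dotted path named by an expression in the data source."""
    match = EXPR.match(expr)
    if match is None:
        raise ValueError(f"not a {{{{path}}}} expression: {expr!r}")
    node = source
    for key in match.group(1).split("."):
        node = node[key]
    return node


def render_chart(chart, source):
    """Replace labelsExpr/dataExpr with concrete values; inline values win."""
    return {
        "chartType": chart.get("chartType", "line"),  # schema default is line
        "labels": chart["labels"] if "labels" in chart else resolve(chart["labelsExpr"], source),
        "datasets": [
            {
                "label": ds["label"],
                "data": ds["data"] if "data" in ds else resolve(ds["dataExpr"], source),
            }
            for ds in chart["datasets"]
        ],
    }


# Made-up stand-in for the contents of report.json.
source = {"data": {"Date": ["2019-01-01", "2019-01-02"], "Number_of_downloads": [10, 12]}}
chart = {
    "datasets": [{"dataExpr": "{{data.Number_of_downloads}}", "label": "# of downloads"}],
    "labelsExpr": "{{data.Date}}",
    "chartType": "line",
}
rendered = render_chart(chart, source)
print(rendered)
```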