Background
The portal landing pages should serve dashboards with relevant metrics in order to communicate the reach and impact of the DIKSHA program. The dashboards are required for the MHRD launch in Jan 2019.
Problem Statement
The dashboard requires various static and dynamically computed metrics. This design brainstorm needs to define a JSON data structure to serve the data for the various dashboard requirements. The metrics that need to be computed for the portal dashboards are defined below. In addition, the dashboard metrics will have to support drill-down by the State to which the data belongs.
- Total number of unique devices till date across portal and mobile app that have accessed Diksha.
- Total number of content play sessions till date across portal and mobile app.
- Total time spent on Diksha (i.e. total session time) till date across portal and mobile app.
Solution
...
Daily Summaries
...
- noOfUniqueDevices: This metric tracks the total number of unique devices that have accessed Diksha. It can be computed by filtering the WorkflowSummary data for events with dimensions.pdata.id = "prod.diksha.app" and dimensions.type = "app", and computing the distinct count of devices.
- totalContentPlaySessions: This metric tracks the total number of content play sessions. It can be computed by filtering for events with dimensions.pdata.id = "prod.diksha.app", dimensions.type = "content" and dimensions.mode = "play", and counting the matching summary events (one per play session).
- totalTimeSpent: This metric tracks the total time spent on Diksha. It can be computed by filtering for events with dimensions.pdata.id = "prod.diksha.app" and summing the edata.eks.time_spent field. The time_spent field is in seconds. (See the sketch after this list.)
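A minimal sketch of the daily computation in Python, assuming the day's WorkflowSummary events are available as parsed dicts and that the device identifier is carried in dimensions.did (an assumption; the actual pipeline runs as an analytics data product over the telemetry store):

def daily_summary(events):
    """Compute the day's dashboard metrics from WorkflowSummary events."""
    app_events = [e for e in events
                  if e["dimensions"]["pdata"]["id"] == "prod.diksha.app"]

    # noOfUniqueDevices: distinct devices across "app" sessions
    # (device id assumed to be in dimensions.did)
    unique_devices = {e["dimensions"]["did"] for e in app_events
                      if e["dimensions"]["type"] == "app"}

    # totalContentPlaySessions: one WorkflowSummary event per play session
    play_sessions = [e for e in app_events
                     if e["dimensions"]["type"] == "content"
                     and e["dimensions"].get("mode") == "play"]

    # totalTimeSpent: edata.eks.time_spent is in seconds
    total_ts = sum(e["edata"]["eks"]["time_spent"] for e in app_events)

    return {
        "noOfUniqueDevices": len(unique_devices),
        "totalContentPlaySessions": len(play_sessions),
        "totalTimeSpent": total_ts,
    }

A sample daily summary event emitted by the pipeline: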
{
  "eid":"ME_DASHBOARD_SUMMARY",
  "ets":1535417736822,
  "syncts":1535390695986,
  "ver":"1.0",
  "mid":"25791B1E895129968CBECD7A39C0822E",
  "uid":"b58cef32-ce0e-4968-b834-550ee677ac6c",
  "context":{
    "pdata":{
      "id":"AnalyticsDataPipeline",
      "ver":"1.0",
      "model":"DashboardSummary"
    },
    "granularity":"DAY",
    "date_range":{
      "from":1535390585259,
      "to":1535390684744
    }
  },
  "dimensions":{
    "pdata":{
      "id":"prod.diksha.app"
    }
  },
  "edata":{
    "eks":{
      "start_time":1535390585259,
      "end_time":1535390684744,
      "noOfUniqueDevices":100,
      "totalContentPlaySessions":200,
      "totalTimeSpent":500.96,
      "telemetryVersion":"3.0"
    }
  }
}
Cumulative Summaries
Cumulative summaries can be generated using the data from the workflow_usage_summary_fact Cassandra table.
- noOfUniqueDevices: This metric can be computed by filtering records with d_period = 0 (cumulative) and d_device_id = 'all', and summing up the m_total_devices_count field.
- totalContentPlaySessions: This metric can be computed by filtering records with d_period = 0 (cumulative), d_type = 'content' and d_mode = 'play', and summing up the m_total_sessions field.
- totalTimeSpent: This metric can be computed by filtering records with d_period = 0 (cumulative) and d_device_id = 'all', and summing up the m_total_ts field.
- totalDigitalContentPublished: This metric can be computed from the search API. We need to query the composite search API (/composite/v3/search) with a filter on contentType = 'Resource' to get the count of the total number of live content. (See the sketch after this list.)
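A minimal sketch of these lookups in Python, using the DataStax cassandra-driver and requests. The contact point, keyspace, API host, and the exact search request/response shapes are assumptions, not confirmed values:

import requests
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("platform_db")  # keyspace assumed

def cumulative(query):
    # ALLOW FILTERING is needed because the filters are not on the
    # partition key; SUM requires Cassandra 3.0+
    return session.execute(query).one()[0]

no_of_unique_devices = cumulative(
    "SELECT SUM(m_total_devices_count) FROM workflow_usage_summary_fact "
    "WHERE d_period = 0 AND d_device_id = 'all' ALLOW FILTERING")

total_content_play_sessions = cumulative(
    "SELECT SUM(m_total_sessions) FROM workflow_usage_summary_fact "
    "WHERE d_period = 0 AND d_type = 'content' AND d_mode = 'play' "
    "ALLOW FILTERING")

total_time_spent = cumulative(
    "SELECT SUM(m_total_ts) FROM workflow_usage_summary_fact "
    "WHERE d_period = 0 AND d_device_id = 'all' ALLOW FILTERING")

# totalDigitalContentPublished via the composite search API; the host,
# request body and result.count field are assumed shapes
resp = requests.post(
    "https://diksha.gov.in/api/composite/v3/search",
    json={"request": {"filters": {"contentType": ["Resource"],
                                  "status": ["Live"]}}})
total_digital_content_published = resp.json()["result"]["count"]

A sample cumulative summary event: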
{ "eid":"ME_DASHBOARD_CUMULATIVE_SUMMARY", "ets":1535417736822, "syncts":1535390695986, "ver":"1.0", "mid":"25791B1E895129968CBECD7A39C0822E", "uid":"b58cef32-ce0e-4968-b834-550ee677ac6c", "context":metrics_summary":{ "pdatanoOfUniqueDevices":{ "id":"AnalyticsDataPipeline", 100, "ver":"1.0", "model":"DashboardSummary" }, "granularity":"CUMULATIVE" }totalDigitalContentPublished":400, "edata":{ "eks":{ "start_time":1535390585259, "end_time":1535390684744, "noOfUniqueDevices":100, "totalContentPlaySessions":200, "totalTimeSpent":500.96, "telemetryVersion":"3.0" } } } |
...
The cumulative summary files will be uploaded to cloud storage using Secor. The Sunbird platform team will pull the cumulative summary file from the cloud storage. A new file will be generated every day with the date appended to the file name, as in the sketch below.
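As an illustration only, a consumer could construct the day's file path like this (the bucket, prefix, and exact naming pattern are hypothetical, not the platform's confirmed convention):

from datetime import date

def cumulative_summary_path(day: date,
                            bucket: str = "telemetry-data-store",
                            prefix: str = "dashboard-summary") -> str:
    # Hypothetical pattern, e.g. telemetry-data-store/dashboard-summary/2018-11-20.json
    return f"{bucket}/{prefix}/{day.isoformat()}.json"

print(cumulative_summary_path(date(2018, 11, 20)))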