Background
The portal landing pages should serve dashboards with relevant metrics in order to communicate the reach and impact of the DIKSHA program. The dashboards are required for the MHRD launch in Jan 2019.
Problem Statement
The dashboard requires various static and dynamically computed metrics. This design brainstorm needs to define a JSON data structure to serve the data for the various dashboard requirements. The metrics that need to be computed for the portal dashboards are defined below:
- Total number of unique devices that have accessed Diksha till date, across the portal and mobile app.
- Total number of content play sessions till date, across the portal and mobile app.
- Total time spent on Diksha (i.e. total session time) till date, across the portal and mobile app.
Solution
Daily Summaries
A data product needs to be created to compute daily summaries of the metrics defined above and serve them to the portal dashboards. The required metrics can be computed using the logic below; the JSON data structure that captures them is shown after the list, followed by a sketch of the computation.
- noOfUniqueDevices: This metric tracks the total number of unique devices that have accessed Diksha. This can be computed by filtering events from the WorkflowSummary data that have dimensions.pdata.id = "prod.diksha.app" and dimensions.type = "app", and computing the distinct count of devices.
- totalContentPlaySessions: This metric tracks the total number of content play sessions. This can be computed by filtering events that have dimensions.pdata.id = "prod.diksha.app", dimensions.type = "content" and dimensions.mode = "play", and counting the filtered events (each such event corresponds to one content play session).
- totalTimeSpent: This metric tracks the total time spent on Diksha. This can be computed by filtering events that have dimensions.pdata.id = "prod.diksha.app" and aggregating the edata.eks.time_spent field. The time_spent field is in seconds.
{ "eid":"ME_DASHBOARD_SUMMARY", "ets":1535417736822, "syncts":1535390695986, "ver":"1.0", "mid":"25791B1E895129968CBECD7A39C0822E", "uid":"b58cef32-ce0e-4968-b834-550ee677ac6c", "context":{ "pdata":{ "id":"AnalyticsDataPipeline", "ver":"1.0", "model":"DashboardSummary" }, "granularity":"DAY", "date_range":{ "from":1535390585259, "to":1535390684744 } }, "dimensions":{ "pdata":{ "id":"prod.diksha.app" } }, "edata":{ "eks":{ "start_time":1535390585259, "end_time":1535390684744, "noOfUniqueDevices":100, "totalContentPlaySessions":200, "totalTimeSpent":500.96, "telemetryVersion":"3.0" } } }
Cumulative Summaries
Cumulative summaries can be generated using the data from the workflow_usage_summary_fact Cassandra table.
- noOfUniqueDevices: This metric can be computed by filtering records with d_period = 0 (cumulative) and d_device_id = 'all', and summing the m_total_devices_count field.
- totalContentPlaySessions: This metric can be computed by filtering records with d_period = 0 (cumulative), d_type = 'content' and d_mode = 'play', and summing the m_total_sessions field.
- totalTimeSpent: This metric can be computed by filtering records with d_period = 0 (cumulative) and d_device_id = 'all', and summing the m_total_ts field.
{ "eid":"ME_DASHBOARD_CUMULATIVE_SUMMARY", "ets":1535417736822, "syncts":1535390695986, "ver":"1.0", "mid":"25791B1E895129968CBECD7A39C0822E", "uid":"b58cef32-ce0e-4968-b834-550ee677ac6c", "context":{ "pdata":{ "id":"AnalyticsDataPipeline", "ver":"1.0", "model":"DashboardSummary" }, "granularity":"CUMULATIVE" }, "edata":{ "eks":{ "noOfUniqueDevices":100, "totalContentPlaySessions":200, "totalTimeSpent":500.96, "telemetryVersion":"3.0" } } }
Both the daily summary and cumulative summary files will be uploaded to cloud storage. The Sunbird platform team will pull the cumulative summary file from cloud storage. A new file will be generated every day with the date appended to the file name.
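For illustration, a small sketch of the dated file-name convention is given below; the file name prefix, local output path and the upload step itself are assumptions and would be replaced by the pipeline's configured cloud storage client.

import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object SummaryFileWriter {
  def main(args: Array[String]): Unit = {
    // Assumption: the convention is "<prefix>-<yyyy-MM-dd>.json"; the prefix and output
    // path are placeholders.
    val date = LocalDate.now().format(DateTimeFormatter.ISO_LOCAL_DATE)
    val fileName = s"dashboard-cumulative-summary-$date.json"

    val summaryJson = """{"eid":"ME_DASHBOARD_CUMULATIVE_SUMMARY"}""" // serialized summary event
    Files.write(Paths.get("/tmp", fileName), summaryJson.getBytes(StandardCharsets.UTF_8))

    // The file would then be uploaded to the configured cloud storage container using the
    // platform's storage client (not shown here).
  }
}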