[Data Product] Device summarizer

Summary

  • Type - Device Summary - summary of each device per day
  • Granularity - DAY
  • Dimensions - period, did & channel
  • Computation Level - Level 1 (computed from raw telemetry)
  • Frequency - Runs Daily

Purpose

The Device Summarizer computes a daily summary for each device (did) and channel.

Inputs

  • Raw Telemetry: SEARCH & INTERACT events
  • ME_WORKFLOW_SUMMARY

Output

  • ME_DEVICE_SUMMARY
{
    "eid" :"ME_DEVICE_SUMMARY",
    "ets" : Long, // Event generation time in epoch
    "syncts" : Long, // Event sync time in epoch
    "ver" : "1.0",
    "mid" : String, // Unique message id for the device usage for specific sync date
    "context" : {
        "pdata" : {
            "id": "AnalyticsDataPipeline",
            "model" : "DeviceSummary",
            "ver" : "1.0"
        },
        "granularity" : "DAY",
        "date_range" : {
            "from" : Long,
            "to" : Long
        }
    },
    "dimensions" : {
        "period" : Int,
        "channel": String,
        "did" : String // Device id
    },
    "edata" : {
        "eks" :{
            "total_ts": Double, // Total time spent in seconds excluding idle time.
            "total_launches": Long, // Total launches/visits.
            "contents_played": Int, // Total content played
            "unique_contents_played": Int, // Distinct content played
            "dial_stats": {
                "total_count": Int, // QR codes scanned
                "success_count": Int, // QR scans successful
                "failed_count": Int // QR scans failed
            },
            "content_downloads": Int // Content downloaded
        }
    }
}

Algorithm Design

  • Group By did and channel

| Field | Computation | Remarks |
|---|---|---|
| did | context.did | |
| period | Integer value of the DAY | |
| total_ts | Sum of edata.eks.time_spent over all ME_WORKFLOW_SUMMARY events | |
| context.date_range.from (first_access) | context.date_range.from of the first ME_WORKFLOW_SUMMARY event | |
| context.date_range.to (last_access) | context.date_range.to of the last ME_WORKFLOW_SUMMARY event | |
| total_launches | Count of ME_WORKFLOW_SUMMARY events with dimensions.type = app | |
| contents_played | Count of ME_WORKFLOW_SUMMARY events with dimensions.type = content and dimensions.mode = play | |
| unique_contents_played | Count of unique object.id values from content play ME_WORKFLOW_SUMMARY events | |
| dial_stats.total_count | Count of SEARCH events where edata.filters.dialcodes exists | |
| dial_stats.success_count | Count of SEARCH events where edata.filters.dialcodes exists and edata.size > 0 | |
| dial_stats.failed_count | Count of SEARCH events where edata.filters.dialcodes exists and edata.size = 0 | |
| content_downloads | Count of INTERACT events with edata.subType = "ContentDownload-Success" | For portal, it will be 0 |
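
The computations above can be sketched in plain Python as follows. This is an illustrative sketch, not the pipeline's actual implementation. The function and helper names (summarize_devices, _get, _key) are hypothetical, events are assumed to be already parsed into dicts, and the period is assumed to be passed in by the daily job as an integer. The source only states that did comes from context.did; reading channel from context.channel, and falling back to dimensions for both, is an assumption. "First" and "last" ME_WORKFLOW_SUMMARY events are interpreted here as ordered by context.date_range.from.

```python
from collections import defaultdict


def _get(d, path, default=None):
    """Read a dotted path such as 'edata.eks.time_spent' from nested dicts."""
    for part in path.split("."):
        if not isinstance(d, dict) or part not in d:
            return default
        d = d[part]
    return d


def _key(event):
    # Assumption: did/channel live under context, with dimensions as a fallback.
    did = _get(event, "context.did") or _get(event, "dimensions.did")
    channel = _get(event, "context.channel") or _get(event, "dimensions.channel")
    return did, channel


def summarize_devices(period, workflow_summaries, raw_events):
    """Group events by (did, channel) and compute the per-device daily metrics."""
    groups = defaultdict(lambda: {"wfs": [], "raw": []})
    for e in workflow_summaries:
        groups[_key(e)]["wfs"].append(e)
    for e in raw_events:
        groups[_key(e)]["raw"].append(e)

    summaries = []
    for (did, channel), g in groups.items():
        # Assumption: first/last event is decided by context.date_range.from.
        wfs = sorted(g["wfs"], key=lambda e: _get(e, "context.date_range.from", 0))
        plays = [e for e in wfs
                 if _get(e, "dimensions.type") == "content"
                 and _get(e, "dimensions.mode") == "play"]
        # DIAL scans: SEARCH events that carry edata.filters.dialcodes.
        dial = [e for e in g["raw"]
                if e.get("eid") == "SEARCH"
                and _get(e, "edata.filters.dialcodes") is not None]
        # Assumption: a missing edata.size is treated as 0 (a failed scan).
        success = sum(1 for e in dial if _get(e, "edata.size", 0) > 0)
        failed = sum(1 for e in dial if _get(e, "edata.size", 0) == 0)
        downloads = sum(1 for e in g["raw"]
                        if e.get("eid") == "INTERACT"
                        and _get(e, "edata.subType") == "ContentDownload-Success")
        summaries.append({
            "dimensions": {"period": period, "channel": channel, "did": did},
            "date_range": {
                "from": _get(wfs[0], "context.date_range.from") if wfs else None,
                "to": _get(wfs[-1], "context.date_range.to") if wfs else None,
            },
            "eks": {
                "total_ts": float(sum(_get(e, "edata.eks.time_spent", 0) for e in wfs)),
                "total_launches": sum(1 for e in wfs
                                      if _get(e, "dimensions.type") == "app"),
                "contents_played": len(plays),
                "unique_contents_played": len({_get(e, "object.id") for e in plays}),
                "dial_stats": {
                    "total_count": len(dial),
                    "success_count": success,
                    "failed_count": failed,
                },
                "content_downloads": downloads,
            },
        })
    return summaries
```

The sketch returns only the dimensions, date_range and eks portions of each summary. Wrapping them in the full ME_DEVICE_SUMMARY envelope (eid, ets, syncts, mid, pdata) is left out because the source does not describe how those fields are generated.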