cQube Vision

Executive Summary

  • cQube is a domain-agnostic, packaged solution for enabling scheme monitoring for public programs

  • cQube offers packaging, deployment ease, and infra optimization, compared to off-the-shelf tech

  • Core proposition is its ability to provide out-of-the-box usability leveraging domain-specific configurations

  • The configurations layer will equip cQube with pre-defined indicators, insights, and suggested actions (nudges)

  • cQube for education (Ed) will accelerate adoption of VSK as envisioned under MoE’s program

  • The Ed configurations, along with programmatic SOPs will constitute a VSK Playbook as an additional public good

cQube Vision and Value Prop

cQube is envisioned as a ready-to-use/pre-packaged, configurable, and extendable DPG solution to enable observability and action towards effective policy implementation in education and other sectors, involving various stakeholders across govt, society, and private sectors.

There are several challenges that government program owners face while leveraging technology for monitoring public programs

...

  • A domain-agnostic product packaging of ingestion, processing and visualization layer in a manner that is:

    • Easy to deploy with minimal engineering staff

    • Lite deployment (minimal infrastructure requirements optimized to program needs)

    • Built for scale (to be scale tested with ~500k users)

    • Leverages reference schemas and spec-compliant API based implementation

    • Brownfield ready with modular components interoperable with existing systems

  • Domain-specific configuration layer allowing out-of-the-box contextualization of indicators, insights, role based nudges and actions

  • We aim to start with cQube Ed; configurations for other sectors can be made available going forward

cQube architecture

The current architecture of cQube has certain challenges which affect the adopter experience

...

  • Spec-based architecture has been created for cQube Ed v5.0 with API-first design consideration

  • There are 5 blocks: Ingestion > Processing > Storage > Visualisation > Insight & Action Adapter

  • All the blocks can be used individually and as an end-to-end solution by the state adopters

cQube Ed (for India)

VSK was initiated by MoE as a program to improve monitoring of schemes / programs in education

...

| Actor | Problem |
| --- | --- |
| National Body (e.g., NCERT) | Is not able to collate data easily from states to monitor programs nationally. E.g., NCERT wants to analyze NISHTHA compliance across states, but data is incomplete |
| Department Head | Does not know how all programs are impacting the key department goals. E.g., DG cannot gauge whether mentoring compliance is improving NIPUN competencies |
| Program Owner (e.g., MDM incharge) | Unable to monitor effective delivery. E.g., State MDM incharge doesn’t know which districts are behind in ration procurement |
| | Unable to measure intended outcomes. E.g., State MDM incharge cannot gauge impact of MDM program on student attendance |
| Admin Officer (e.g., Block Edu Officer) | Is not aware of key indicators (metrics) to be monitored and improved. E.g., Is called to a district review, but does not know on which KPIs he will get questioned |
| | Does not know who are the accountable actors that can improve key indicators. E.g., Only 40% of teachers in the block completed DIKSHA training, but how to improve? |
| School Head / Principal | Unable to determine the course of action if any indicator for the school goes off. E.g., Principal isn’t aware which subjects are lowering the Class 10th Board pass rates |

cQube layered with education-specific configurations (cQube Ed) can accelerate state VSK adoption

  • A VSK Playbook will define common indicators, roles, insights and suggested actions relevant across states

  • The Playbook will serve as an independent public good, enabling VSK compliance independent of tech.

  • cQube Ed will provide an end-to-end solution to enable VSK and come with:

VSK Specs


Playbook Reference Schema APIs

  • Schemas to capture data, insights, actions

  • NDEAR compliant APIs for enabling interoperability

cQube Ed Software


Data Ingestion and Processing

  • Ingestion of input data from existing (e.g., DIKSHA, SARAL, ODK) or new systems

  • Ingestion of data from new data sources through configurable schemas

  • Processing ingested data into aggregated indicators

Data Output (out of box plugins for Visualization, Insights, Nudges)

  • Mobile app with personalized role based dashboards and nudges

  • Out-of-box visualizations with dashboarding tools

  • Precomputed insights downloadable weekly as reports for key role actors

  • Ability to add new programs to generate insights and visualizations

Admin Capabilities

  • Role mapping to indicators for visibility and accountability

  • Weekly report templates and dashboard configurations

  • Configure and automate in-app nudges (as a plugin)

cQube Ed Configuration Layer

The cQube Ed configurations layer will allow cQube to be VSK ready for states to begin ‘using’

...

  • VSK Playbook will serve as a public good for enabling monitoring of ed focused schemes / programs

  • VSK playbook to be created by Samagra (Q3) in discussion with states, housed within MoE

  • The Playbook can be evolved and contributed to by state entities, other ed ecosystem partners and MoE

  • cQube Ed has multiple deployments; further enhancements are needed to make it VSK Playbook ready

  • Today, cQube is being leveraged by MoE (NVSK) and Jharkhand, with more states expected

  • Overall, the goal is to enable all states to have the VSK Playbook implemented before the next academic year

  • Certain enhancements are needed to make it VSK playbook ready

  • There are some functionalities & use cases that cQube Ed (out-of-box) will not support

  • cQube Ed will continue storing aggregated data; user-level indicators & nudges will not be possible

E.g., viewing admin officer wise school monitoring visit compliance in a given month

  • For predefined indicators, cQube Ed will not ingest data in any schema other than the one specified

  • cQube Ed will allow limited operations to be performed on the datasets exposed (SQL layer)

  • States will be able to adopt cQube Ed based on the evolution of their own brownfield systems

  • States that have existing data input, processing and visualization solutions

    • Can configure their existing systems based on VSK playbook

    • Or can connect their data input to cQube and get processing, visualization, nudges out of the box

    • Or can connect their data input and visualization to cQube and get processing, nudges out of the box

  • States that don’t have existing data input, processing and visualization solutions

    • Can set up cQube through an SI and get data processing, visualization and nudges out of the box

    • Can leverage NDEAR reference data input tools - SARAL, Shiksha

  • The high level architecture for cQube Ed has been finalized

  • Spec-based architecture has been created for cQube Ed v5.0 with API-first design consideration

  • There are 5 blocks: Ingestion > Processing > Storage > Visualisation > Insight & Action Adapter

  • All the blocks can be used individually and as an end-to-end solution by the state adopters

Problem Statement

Governments usually collect a large amount of administrative data (especially in the education context), mostly for routine reporting and compliance purposes. Many countries have recently begun finding more advanced and useful ways of leveraging these data sets to allocate resources better, measure success, and improve government-administered programs' efficiency. However, a lot of state governments in the country still operate on anecdotal rather than data-backed decision making due to some key challenges:

...

These challenges hinder state governments in implementing effective review and monitoring processes, which in turn affects the efficiency of government-administered programs.

cQube Ed

cQube is envisioned as a ready-to-use/pre-packaged, configurable, and extendable solution to enable observability and action towards effective policy implementation in education and other sectors, involving various stakeholders across govt, society, and private sectors

...

cQube can be extended to other domains as well, similar to education through a domain specific configuration layer.

Design Principles:

cQube is based on the following design principles:

  1. Solution: cQube is neither a tool nor a platform. It is a ready-to-use/pre-packaged, configurable, and extendable solution to enable observability and action towards effective policy implementation in education and other sectors, involving various stakeholders across govt, society, and private sectors

  2. Education-specific: A pre-packaged solution of cQube for education, cQube Ed, comes with a set of predefined actionable indicators and insights which are specific to education. For example, the metrics could relate to attendance, enrolment, assessments, etc. The schema for data ingestion will also be defined for edu-specific indicators.

  3. Based in Indian Context: cQube is based in the Indian context, implying that the jurisdictions and hierarchies will be defined accordingly. For example, the hierarchy for cQube Ed will be as follows - State > District > Block > Cluster > School > Class.

Use Cases

The major use cases envisaged to be unlocked through cQube Ed are as follows:

...

Solution: cQube Ed will allow the state admins to add an indicator or an insight within a day through an intuitive and easy-to-use admin console.

1. Terminologies

  1. Adopter - Leadership / Decision Makers who plan to leverage, or already leverage, cQube Ed

  2. Domain - Area of interest and expertise that we are working on. Eg: Health, Education, etc.

  3. Input Sources - Any input data sources that adopters use. Eg: Database (MIS, SQL, NoSQL) and Google Sheets / Excel / CSV, adopter applications

  4. Management Information System (MIS) - A data store that houses all the raw data of adopter organizations to generate events

  5. Ingestion - The system that all data from input sources reaches as the first step

  6. Adapter - Generates events from MIS / any other input source and pushes them to cQube Ed through an API

  7. Dataset - High-level data which is computed by aggregating events. It is a data representation of the indicator. Datasets are persistent within cQube Ed. A dataset is created for at least one indicator. This has been explained in detail with examples in this section

  8. Indicator - A visual representation of a dataset(s). Eg: District-wise average attendance %.

  9. Event - A data structure that records an occurrence at a particular time for an entity (eg: school, etc). It is a combination of simple data types (eg: integer, varchar, etc.). An event should always contain a column/set of columns that helps you calculate the Indicator. A table with a timestamp doesn’t necessarily mean that it is an event; it should contribute to either aggregation or filtering of the dataset. This has been explained in detail with examples in this section. Additional details are in this section.

  10. Allowed Data Types - SQL compliant data types found and supported across most RDBMS implementations.

    1. Numeric data types such as int, tinyint, bigint, float, real, etc.

    2. Date and Time data types such as Date, Time, Datetime, etc.

    3. Character and String data types such as char, varchar, text, etc.

    4. Unicode character string data types, for example nchar, nvarchar, ntext, etc.

  11. PII - Personally Identifiable Information

  12. Data Processing - Processing involves:

    1. Transformation of Events to data that updates datasets - Transformation happens through a transformer: f(eventDetails, eventSchema, datasetSchema, dimensionConfig) = [array of columns]

    2. Updating datasets

...

  1. Dimension - Dimensions describe events. [image] This has been explained in detail with examples in this section

  2. Transformer - Operation / Function being performed on an event & dimension to process them into a dataset. This has been explained in detail with examples in this section

  3. Visualization - A functional block that focuses on rendering data

  4. Charts - A single sheet of information in the form of a table, graph, or diagram

  5. Dashboard - A collection of charts

  6. Dashboard Organizer - A WYSIWYG editor that allows the placement of charts

  7. Insight - An actionable comprehension of certain data and visualizations

  8. Action Adapters - A type of written communication to disseminate information in order to encourage or persuade someone to do something in a certain way

  9. Plugins - An external or internal component that adds functionality to vanilla cQube Ed. Since cQube Ed follows an API-first design and its APIs are exposed, additional functionality is added by creating a plugin.
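
To make the relationship between the terms above concrete, here is a minimal Python sketch of how school-level dataset rows and a school dimension could be rolled up into the "District-wise average attendance %" indicator. The names `DatasetRow`, `SCHOOL_DIMENSION` and `district_average_attendance`, as well as the sample values, are hypothetical and are not taken from the cQube Ed specs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRow:
    """One aggregated row of a hypothetical attendance dataset."""
    school_id: int
    present: int  # students marked present (the "sum" column)
    total: int    # students expected (the "count" column)


# Hypothetical dimension: describes each school by the district it belongs to.
SCHOOL_DIMENSION = {101: "District A", 102: "District A", 201: "District B"}


def district_average_attendance(rows, school_dimension):
    """Roll school-level rows up to a district-level average attendance %."""
    present = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        district = school_dimension[row.school_id]
        present[district] += row.present
        total[district] += row.total
    # Skip districts with no expected students to avoid dividing by zero.
    return {d: round(100 * present[d] / total[d], 1) for d in total if total[d] > 0}


rows = [DatasetRow(101, 20, 50), DatasetRow(102, 45, 50), DatasetRow(201, 30, 40)]
indicator = district_average_attendance(rows, SCHOOL_DIMENSION)
print(indicator)
```

The dataset stays at school granularity while the dimension supplies the hierarchy, so the same rows could be rolled up to block or cluster level by swapping the dimension mapping.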

2. Use Cases [1]

| # | Persona | # | Use case (epics) |
| --- | --- | --- | --- |
| 1 | Deployer | 1.1 | As a deployer, I can install cQube Ed seamlessly (at a single click), select domain & set up data ingestion, processing and visualization pipeline |
| | | 1.2 | As a deployer, I can define and ingest spec-compliant state data into cQube Ed to generate actionable insights |
| | | 1.3 | As a deployer, I can process the events to generate datasets |
| | | 1.4 | As a deployer, I can visualize any dataset to generate charts for the program dashboard |
| | | 1.5 | As a deployer, I can analyze if the cQube Ed instance is running well or not |
| 2 | Admin | 2.1 | As an admin, I can choose the insight to be shown on the dashboards to selected users |
| | | 2.2 | As an admin, I can request an additional insight to be shown on the dashboards to selected users |
| | | 2.3 | As an admin, I can set up program dashboards using the insights generated from datasets and provide access to multiple users |
| | | 2.4 | As an admin, I can configure nudges to decide what nudge has to be sent to whom |
| 3 | User | 3.1 | As a user, I can view the program dashboard to identify potential actions |
| | | 3.2 | As a user, I can view and receive nudges based on data insights |
| | | 3.3 | As a user, I can share the insights from the dashboards with stakeholders on other channels |

Detailed user stories for each use case have been linked here.

3. Design Considerations

...

5. Specifications (WIP)

| Specs | Links |
| --- | --- |
| Event | Link |
| Dimension | Link |
| Dataset | Link |
| Transformer | Link |
| Indicator | Link |
| Charts | Link (Dashlet Spec++) |
| Dashboards | Link |

...

| District | Average Attendance % |
| --- | --- |
| <District Name> | <Average attendance % with color coding> |

From the Indicators, datasets are defined. There will be at least one dataset for an Indicator being visualized. For example, for the Indicator mentioned above, the dataset will have the following columns:

...

In order to create this dataset, data will need to be ingested from the state databases. The state will send the data to cQube Ed as per the defined event spec. For example, the event in this case will be attendance. The following is the schema for attendance, i.e., the format in which the state will share its data (states can leverage adapters to convert their MIS / database data into this format):

...

The next step is the transformation (processing) that converts these events and dimensions into a dataset. For the above example:

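INSERT INTO dataset_attendance (date, sum, count, average, schoolId)
VALUES ('2022-11-13', 20, 50, 40, 101)
ON CONFLICT ON CONSTRAINT dataset_attendance_unique_date_schoolId DO UPDATE
SET
  -- Qualify existing-row columns with the table name to avoid ambiguity with EXCLUDED
  count = dataset_attendance.count + 50,
  sum = dataset_attendance.sum + 20,
  -- average is kept as a percentage, matching the inserted value (20 / 50 = 40%)
  average = (dataset_attendance.sum + 20) * 100.0 / (dataset_attendance.count + 50)
WHERE dataset_attendance.schoolId = 101;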

This processing will result in the following dataset:

...

  • Data is emitted/pulled through input sources.

  • The adapter then converts the emitted data into one or more events to make it compliant with the event spec. An adapter can also ingest datasets directly and push them to cQube Ed (using the dataset spec). An adapter is custom logic that sits inside the state data center and monitors it for new events.

...

  • Events can also be sourced using an SDK that becomes part of the state applications.

  • To ensure backpressure handling, no loss of input events, and throughput, the API will be backed by a lightweight event bus (rather than a resource-intensive solution like Kafka) with offsets and message retention as first-class citizens.

  • Since the events are not processed/validated in sync with the API, the admin for input sources should be notified in case of any issues.

  • Events are first-class citizens in cQube Ed and hence can be directly ingested without further changes. Since the API is expected to handle a large number of events, they can either be aggregated at the adapter level or at the processing level.

  • The event bus ensures that:

    • The events are only being stored until they are processed. The SLA for processing has been kept at 4 hours. (Back pressure management)

    • Since cQube Ed will become domain agnostic, the events for each domain will be created upfront as part of the domain solution in a similar manner as shown in the example above. Events are created using an event spec and pushed upstream to the adapter to enforce them at runtime.

    • Similar events can be aggregated, which helps with optimization.

  • Data can be ingested in real time (at rates below 1k events/second) in the form of events for aggregate entities, e.g., classroom attendance at that moment. Events can also be uploaded in batches.

  • The state can set a frequency for data updates in the ingestion spec; each scheduled update runs by truncating and re-inserting all data
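
The offset, retention and backpressure behaviour described above can be sketched as follows. This is an in-memory, single-consumer illustration only; the class `EventBus`, its methods and the `max_pending` limit are hypothetical names, and a real event bus would persist events rather than hold them in a Python `deque`.

```python
from collections import deque


class EventBus:
    """In-memory sketch of an event bus with offsets and retention."""

    def __init__(self, max_pending):
        self.max_pending = max_pending  # backpressure threshold
        self._events = deque()          # retained (not yet processed) events
        self._base_offset = 0           # offset of the oldest retained event

    def publish(self, event):
        """Append an event and return its offset; refuse when the buffer is full."""
        if len(self._events) >= self.max_pending:
            raise BufferError("bus is full; the producer should retry later")
        self._events.append(event)
        return self._base_offset + len(self._events) - 1

    def poll(self, max_events):
        """Return up to max_events (offset, event) pairs without removing them."""
        batch = list(self._events)[:max_events]
        return [(self._base_offset + i, e) for i, e in enumerate(batch)]

    def commit(self, offset):
        """Drop every event up to and including offset once it has been processed."""
        while self._events and self._base_offset <= offset:
            self._events.popleft()
            self._base_offset += 1


bus = EventBus(max_pending=3)
offsets = [bus.publish({"school_id": 101, "present": p}) for p in (20, 22, 19)]
try:
    bus.publish({"school_id": 101, "present": 25})
    accepted = True
except BufferError:
    accepted = False  # the fourth event is pushed back to the producer

batch = bus.poll(max_events=2)
bus.commit(batch[-1][0])  # events 0 and 1 are processed and can be dropped
remaining = bus.poll(max_events=10)
```

Because events are removed only on `commit`, a crash between `poll` and `commit` would replay the batch instead of losing it, which is the "no loss of input events" property in miniature.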

...

  1. Transformation of Events to data that updates datasets - Transformation happens through a transformer.
    f(eventDetails, eventSchema, datasetSchema, dimensionConfig) = [array of columns]

  2. Updating datasets - Datasets are updated through an UPSERT to the data store.
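
As a rough illustration of the transformer signature above, the following Python sketch maps one event to the ordered column values of a dataset row. The shapes chosen for the schemas and the dimension config (a field-to-type mapping, a list of column names, and a `key`/`values` dictionary) are assumptions made for this example, not the formats defined in the cQube Ed specs.

```python
def transform(event_details, event_schema, dataset_schema, dimension_config):
    """Map one event to the column values of one dataset row."""
    # Validate the event against its schema (field name -> expected Python type).
    for field, expected_type in event_schema.items():
        if not isinstance(event_details.get(field), expected_type):
            raise ValueError(f"invalid or missing field: {field}")

    # Enrich the event with the dimension attributes of the entity it refers to.
    key = event_details[dimension_config["key"]]
    enriched = {**event_details, **dimension_config["values"].get(key, {})}

    # Emit values in the column order declared by the dataset schema.
    return [enriched[column] for column in dataset_schema]


event = {"date": "2022-11-13", "school_id": 101, "present": 20, "total": 50}
event_schema = {"date": str, "school_id": int, "present": int, "total": int}
dataset_schema = ["date", "district", "school_id", "present", "total"]
dimension_config = {"key": "school_id", "values": {101: {"district": "District A"}}}

columns = transform(event, event_schema, dataset_schema, dimension_config)
print(columns)
```

The returned list is what the second step would hand to the UPSERT against the data store.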

...

As explained in the earlier sections, here is an example of a transformer.

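INSERT INTO dataset_attendance (date, sum, count, average, schoolId)
VALUES ('2022-11-13', 20, 50, 40, 101)
ON CONFLICT ON CONSTRAINT dataset_attendance_unique_date_schoolId DO UPDATE
SET
  -- Qualify existing-row columns with the table name to avoid ambiguity with EXCLUDED
  count = dataset_attendance.count + 50,
  sum = dataset_attendance.sum + 20,
  -- average is kept as a percentage, matching the inserted value (20 / 50 = 40%)
  average = (dataset_attendance.sum + 20) * 100.0 / (dataset_attendance.count + 50)
WHERE dataset_attendance.schoolId = 101;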
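
The same upsert can be tried end to end with Python's built-in `sqlite3` module. In this sketch SQLite only stands in for whichever SQL-compatible store a deployment uses, the column is named `school_id` rather than `schoolId`, and the helper `apply_event` is hypothetical. It assumes a SQLite build with upsert support (3.24 or later).

```python
import sqlite3

# Parameterised form of the transformer: the literals become bind parameters.
UPSERT = """
INSERT INTO dataset_attendance (date, school_id, sum, count, average)
VALUES (:date, :school_id, :present, :total, :present * 100.0 / :total)
ON CONFLICT (date, school_id) DO UPDATE SET
    sum = sum + excluded.sum,
    count = count + excluded.count,
    average = (sum + excluded.sum) * 100.0 / (count + excluded.count)
"""


def apply_event(conn, date, school_id, present, total):
    """Fold one attendance event into the dataset row for (date, school)."""
    conn.execute(
        UPSERT,
        {"date": date, "school_id": school_id, "present": present, "total": total},
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE dataset_attendance (
        date TEXT NOT NULL,
        school_id INTEGER NOT NULL,
        sum INTEGER NOT NULL,
        count INTEGER NOT NULL,
        average REAL NOT NULL,
        UNIQUE (date, school_id)
    )"""
)

apply_event(conn, "2022-11-13", 101, 20, 50)  # first event inserts the row
apply_event(conn, "2022-11-13", 101, 30, 50)  # second event updates it in place
row = conn.execute("SELECT sum, count, average FROM dataset_attendance").fetchone()
print(row)
```

In SQLite, unqualified column names on the right-hand side of `DO UPDATE SET` refer to the existing row and `excluded.*` to the incoming values, so the average is recomputed from the combined totals.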

Datasets

Current Implementation -

...

Proposed Implementation -

  • Datasets will be maintained on a SQL-compatible data store.

  • The datasets will be JSON-compliant, hence enabling visualization using existing chart types on cQube

  • The datasets will also be SQL-compliant, hence enabling

    • External system to build features faster over cQube Ed (Ecosystem play)

    • Existing SQL-based charting tools, data processing tools, etc.

  • Datasets will be of two types:

    • Auto Datasets - generated automatically based on the domain spec.

    • Custom Datasets - allows for an extension on the existing ones if the spec doesn’t solve for it; can be generated at runtime by using well-defined SQL constructs [1], [2]; custom dataset creation is not part of the pipeline and managed separately for ease of use. Custom datasets can also be created by the states from the auto datasets. (as part of v6.0)
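
One way to picture a custom dataset derived from an auto dataset is a SQL view. The sketch below uses Python's built-in `sqlite3` module purely for illustration; the table layout, the view name `custom_low_attendance_schools`, the 50% threshold and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stand-in for an auto dataset generated from the domain spec.
    CREATE TABLE dataset_attendance (school_id INTEGER, sum INTEGER, count INTEGER);
    INSERT INTO dataset_attendance VALUES (101, 20, 50), (102, 45, 50), (201, 30, 40);

    -- Custom dataset: schools whose attendance is below 50%.
    CREATE VIEW custom_low_attendance_schools AS
    SELECT school_id, sum * 100.0 / count AS average
    FROM dataset_attendance
    WHERE sum * 100.0 / count < 50;
    """
)

rows = conn.execute(
    "SELECT school_id, average FROM custom_low_attendance_schools ORDER BY school_id"
).fetchall()
print(rows)
```

Keeping the custom dataset as a view over the auto dataset means it never needs its own ingestion path, which is in line with custom dataset creation being managed outside the pipeline.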

...

  • Rule-based Action generation: Some actions will be generated based on some rules applied to the visualizations. For example, if assessment performance is less than x %, then show y pedagogical recommendations to the teacher.

  • Anomaly detection: Anomalies will be detected in the data, based on which visualization has been created. For example, assessment scores submitted by a teacher.

  • Correlation and Causation on factors: Correlation and Causation can be established on a metric and factors affecting it to generate relations. For example, does attendance affect the assessment performance of students in a classroom?

  • Auto-generated / Templated narrative generation: Text-based narratives can be created based on the insights generated to ease decision-making for the users. For example, the narrative can tell the district officer to conduct a review of officers of x blocks on delivery of textbooks as there is a delay in the same in these blocks.

  • Rule-based Nudges: Nudges can be sent through UCI based on some set rules on the visualizations created on cQube Ed. For example, an admin can nudge a district officer to take reviews on student assessments as the overall district performance has been going down. These will be enabled by plugins that a state can create to be able to implement this functionality.

  • Simulator: A simulator can be created for interaction with the data to make decisions. For example, a district officer can simulate improvement in assessment performance if textbooks are delivered to schools by x date. (Here is an example of how this will work.)
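
Rule-based action generation, the first item in the list above, can be sketched as a list of predicates evaluated against an entity's metrics. The `Rule` class, the metric names and the thresholds (40% and 75%) are placeholders chosen for this example; the source leaves the actual values open ("x %").

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A hypothetical rule: when applies(metrics) is true, suggest the action."""
    name: str
    applies: Callable[[dict], bool]
    action: str


RULES = [
    Rule(
        "low_assessment",
        lambda m: m["assessment_pct"] < 40,  # placeholder threshold
        "Show pedagogical recommendations to the teacher",
    ),
    Rule(
        "low_attendance",
        lambda m: m["attendance_pct"] < 75,  # placeholder threshold
        "Nudge the block officer to review attendance",
    ),
]


def generate_actions(metrics, rules):
    """Return the suggested actions of every rule that matches the metrics."""
    return [rule.action for rule in rules if rule.applies(metrics)]


actions = generate_actions({"assessment_pct": 35, "attendance_pct": 82}, RULES)
print(actions)
```

Keeping rules as data rather than hard-coded branches would let a state add or tune rules through configuration, in the spirit of the plugin approach described for nudges.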

...

  • State will have a config-driven admin dashboard for ingestion of events, dimensions, and datasets

  • This dashboard will be further extended for visualization configs such as: 2.3.4, 2.3.5, 2.3.6, 2.3.7, 2.3.8, 2.3.9.

  • The implementation of the admin console can be done using form schema and react-admin.

9. Installation

  1. The state deployer will be able to install cQube Ed by running a single command with minimal specs. That will also include a one-time setup for ingestion, processing, and visualization blocks.

  2. The state deployer then selects the domain during cQube Ed installation.

  3. The state deployer is then also able to install helper modules like adapter & Anonymized and Aggregated event store for certain use cases if required.

...

There will be minor changes to the software requirements currently being followed (https://cqube.sunbird.org/use/software-requirements). The idea is to continue building within the existing hardware constraints, but with the ability to scale horizontally if needed.

...

For ingestion, TypeScript & NestJS (a framework built on top of Express.js) will be used.

...

An entire domain for cQube Ed can be modeled as a class which will include instances of all of the items mentioned above - Event, Dimension, Dataset, Transformer, Indicator, Charts, Dashboard, Actions. The domain can be modified after instantiation using the inbuilt APIs. An example would look like this. The domain config is used by the starter script to initialize a cQube Ed instance.
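
A minimal Python sketch of such a domain class is shown below. It is an illustration under assumptions, not the linked example: the `Domain` class, its `add` and `to_config` methods, and the shape of each spec (plain dictionaries) are hypothetical.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Domain:
    """Hypothetical container for every spec that makes up one domain."""
    name: str
    events: dict = field(default_factory=dict)
    dimensions: dict = field(default_factory=dict)
    datasets: dict = field(default_factory=dict)
    transformers: dict = field(default_factory=dict)
    indicators: dict = field(default_factory=dict)
    charts: dict = field(default_factory=dict)
    dashboards: dict = field(default_factory=dict)
    actions: dict = field(default_factory=dict)

    def add(self, kind, name, spec):
        """Register or replace one spec after instantiation."""
        registry = getattr(self, kind, None)
        if not isinstance(registry, dict):
            raise ValueError(f"unknown spec kind: {kind}")
        registry[name] = spec

    def to_config(self):
        """Serialise the domain to a plain dict a starter script could consume."""
        return asdict(self)


education = Domain(name="education")
education.add(
    "events",
    "attendance",
    {"date": "date", "school_id": "int", "present": "int", "total": "int"},
)
education.add("indicators", "district_avg_attendance", {"dataset": "dataset_attendance"})
config = education.to_config()
print(sorted(config))
```

Holding every spec in one object gives the starter script a single source of truth and makes a second domain (e.g., health) a matter of instantiating another `Domain`.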

Deployment

  1. Networking - Existing network architectures can be reused without changes.

  2. Automated Deployment

    1. Using Ansible scripts on Jenkins on physical machines

    2. Using Ansible scripts on Jenkins on k8s

...

  • The doc builds over and above what is already shared here.

  • There are security concerns over access of data as a SQL source - this will be handled through row/column level granular permissions.

  • PII Management - No PII is stored as part of cQube Ed in the original form.
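
The source does not say how PII is kept out of cQube Ed in its original form. One possible approach, shown here purely as an assumption, is to replace PII fields with keyed hashes (HMAC-SHA256) before events leave the adapter. The function names, field names and key below are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Replace a PII value with a keyed, non-reversible token."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_pii(record, pii_fields, secret_key):
    """Return a copy of the record with every PII field pseudonymized."""
    return {
        key: pseudonymize(str(value), secret_key) if key in pii_fields else value
        for key, value in record.items()
    }


# Demo key only; a real key would be managed as a secret inside the state data center.
KEY = b"demo-key-not-for-production"
record = {"student_name": "Asha", "school_id": 101, "present": True}
safe = strip_pii(record, {"student_name"}, KEY)
print(safe)
```

The same input always yields the same token, so records can still be grouped per individual upstream without the original value ever being stored.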

...