...

  1. Question: A question is a measuring construct used to assess a learner's learning outcome. In simple terms, a question is what is given to a student for the purpose of assessing the student's proficiency in a concept. Example: "Explain the benefits of groundwater recharging."
  2. Question Set: A question set is a collection of questions that share a common characteristic. That common characteristic is defined in terms of the question metadata. Example: the set of all one-mark questions.
  3. Question Paper: An array (collection) of Question Sets, where additional conditions/restrictions are placed on the individual member Question Sets. For example: ten one-mark questions, followed by five five-mark questions, all of which must have appeared in previous exams.
  4. Answer: A worked-out response, in sufficient detail, that satisfies what is asked. In simple terms, it is what a student provides in response to a question. It could be just the correct option in a Multiple Choice Question (MCQ), a single word or phrase in a Fill-In-The-Blank (FTB) question, or a 150-word essay in a creative writing question.
  5. Marking Scheme: A Marking Scheme is a set of hints or key points, along with a grading scheme, that an evaluator looks for when grading assessments. A Marking Scheme is primarily meant for the teacher (evaluator), not the student. (A minimal schema sketch of these constructs follows this list.)
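A minimal sketch of these constructs, with hypothetical field names; the actual QML-compliant schema may differ:

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class Question:
        body: str                     # e.g. "Explain the benefits of groundwater recharging."
        metadata: Dict[str, str]      # e.g. {"marks": "1", "difficulty": "easy"}
        answer: str = ""              # correct option, word/phrase, or essay text
        marking_scheme: List[str] = field(default_factory=list)  # key points for the evaluator

    @dataclass
    class QuestionSet:
        criteria: Dict[str, str]      # the shared characteristic, e.g. {"marks": "1"}
        questions: List[Question] = field(default_factory=list)

    @dataclass
    class QuestionPaper:
        sections: List[QuestionSet] = field(default_factory=list)  # ordered sets with per-set restrictions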

...

  1. Extraction Phase - Convert unstructured data to structured data: Questions, Answers, Marking Schemes, and Metadata may be available in different forms such as PDFs, Word documents, scanned PDFs, and possibly even databases. Ingest all such (unstructured) data, extract the relevant data, and write it against a suitable QML-compliant schema. Extraction against the schema may be human-generated or algorithm-derived. Source data is now ingested into the platform and is ready for auto-curation (validation).
  2. Auto Curation (Pre-Validation) - Generate quality metadata for the extracted data: Some of the extracted data may require validation by an SME. However, validating all extracted data might be too cumbersome for a human in the loop. In this phase, additional machine-derived quality metadata is used to prioritize the effort required by the human in the loop.
  3. Curation (Validation) - Validate the extracted data: An expert or a designated curator can validate the extracted data and change its status to one of three states: accepted (ready for publishing), modified (and ready for publishing), or rejected (not suitable for publishing). At the end of this phase, the extracted data is curated, and the publishable data is available for downstream consumption.
  4. Ingestion - Ingest the curated data: Once data is in a published form, write it to a DB like Cassandra. (A sketch of this four-phase flow follows this list.)
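A minimal sketch of this four-phase flow; the helper functions are stubs and the record structure is an assumption, not the platform's actual API:

    def extract(source):
        # 1. Extraction: map an unstructured source onto a QML-compliant record
        return {"question": source, "dimensions": {}, "status": "draft"}

    def score_quality(record):
        # 2. Auto Curation: machine-derived confidence in the extraction (0.0 to 1.0)
        return 0.5

    def review(record):
        # 3. Curation: an SME returns "accepted", "modified", or "rejected"
        return "accepted"

    def run_pipeline(raw_sources, store):
        records = [extract(s) for s in raw_sources]
        for r in records:
            r["quality"] = score_quality(r)
        # Surface low-confidence records to the human in the loop first
        for r in sorted(records, key=lambda rec: rec["quality"]):
            if review(r) in ("accepted", "modified"):
                r["status"] = "published"
                store.append(r)   # 4. Ingestion: in practice, write to a DB like Cassandra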

Note: If we choose to bypass the auto-curation phase, we might still show all questions in draft mode. That is then a policy decision.

Question Set Creation Flow:

...

Pre-requisites: Assume that the Textbook Framework and Question Metadata are aligned at the Taxonomy level. In addition, two data points are required to create an Energized Question Bank: 1) a Textbook Spine, and 2) Question Set Logic (called a Blueprint from now on). Once these two pieces are available, a Textbook can be created with links to Question Sets.
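For concreteness, a hypothetical Textbook Spine CSV might look like the following; the actual column set is not specified in this document:

    import csv

    # Hypothetical spine rows: one row per Textbook node
    SPINE_ROWS = [
        "board,medium,grade,subject,chapter",
        "CBSE,English,10,Science,Chemical Reactions and Equations",
        "CBSE,English,10,Science,Life Processes",
    ]

    spine = list(csv.DictReader(SPINE_ROWS))
    print(spine[0]["chapter"])   # -> Chemical Reactions and Equations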

...

  1. Automated Textbook Creation:
    1. Provide the Textbook Spine as a CSV, or create these spines for every Framework available in the platform. Textbook Spines can be created via APIs or via the Portal UI.
    2. Provide a list of available pre-baked Question Set Intent Blueprints.
    3. For every entry in the Spine and every Blueprint in the preselected list, fire the query, get the question sets, create the resources, and link them (sketched after this list).
    4. Once an energized question bank is created, a user can edit it.
  2. Semi Automated: A Sunbird Adoption Team can create a Textbook via APIs or the Portal UI. From the backend, the EQB can be auto-created and made available for editing. A simple UI component can be developed to edit the Intent configuration and select or deselect items in the result set.
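A minimal sketch of step 1(c), under assumed names; the search, resource-creation, and linking calls are stand-ins passed as parameters, not the platform's actual interfaces:

    def energize_textbook(spine, blueprints, search, create_resource, link):
        # For every spine entry and every pre-baked Blueprint: fire the query,
        # wrap the results as a Question Set resource, and link it to the node.
        for entry in spine:
            for blueprint in blueprints:
                query = {**blueprint["filters"], **entry}   # intent + location
                questions = search(query)
                if questions:
                    question_set = create_resource(blueprint["name"], questions)
                    link(entry, question_set)

    energize_textbook(
        spine=[{"grade": "10", "subject": "Science", "chapter": "Life Processes"}],
        blueprints=[{"name": "One-mark practice set", "filters": {"marks": 1}}],
        search=lambda q: ["q1", "q2"],   # stand-in for the platform search API
        create_resource=lambda name, qs: {"name": name, "questions": qs},
        link=lambda entry, qset: print("linked", qset["name"], "to", entry["chapter"]),
    )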


Schema and Architecture Details:

...

  1. A schema is defined for the source type (its dimensions are specified)
  2. Information is extracted against the "meaning" of each dimension. Example: "difficulty" is a dimension of the question; it needs to be extracted if it is not available in a consumable form
  3. By default, all dimensions are in the draft state.
  4. During the pre-validation phase, a set of rules (specified by a human or learnt by the system) is applied to help validate the extraction step. For example:
    1. If a dimension is filled by a machine, the confidence score is a derived score that can be used to prioritize tasks for the human in the loop
    2. We suspect that the image presented in the question is not legible or crosses its boundaries, but there is no easy way to fix it automatically. Present the image to the human in the loop and ask whether it is presentable or not.
    3. We (the system) may think that a Marking Scheme is suitable for presentation as an Answer. Let the human in the loop validate it.
  5. The human in the loop prioritizes the validation tasks based on "interest". A Maths teacher can select only Maths questions and provide only Answers.
    1. Different kinds of reviewers can focus on specific validation tasks. Not everybody needs to do everything
  6. Eventually, every record either remains in the draft state or moves to the published or rejected state
    1. Every individual dimension goes through the same states, except that, unlike at the record level, an individual dimension can end only in the published or rejected state. (A sketch of this prioritization follows this list.)
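A sketch of how rules 4(a) and 5 could combine into a single work queue; the record layout and field names are assumptions:

    def validation_queue(records, subject=None):
        # Reviewers pick tasks by "interest" (a Maths teacher sees only Maths);
        # within that, the least confident machine-filled dimensions come first.
        tasks = []
        for rec in records:
            if subject and rec.get("subject") != subject:
                continue
            for name, dim in rec["dimensions"].items():
                if dim["state"] == "draft":
                    tasks.append((dim.get("confidence", 0.0), rec["id"], name))
        return sorted(tasks)

    records = [
        {"id": "q1", "subject": "Maths",
         "dimensions": {"difficulty": {"state": "draft", "confidence": 0.3},
                        "marks": {"state": "published", "confidence": 0.9}}},
        {"id": "q2", "subject": "Science",
         "dimensions": {"difficulty": {"state": "draft", "confidence": 0.7}}},
    ]
    print(validation_queue(records, subject="Maths"))   # -> [(0.3, 'q1', 'difficulty')]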

...

  1. Taxonomy Terms can be specified via multiple entry points
    1. A Sunbird Adoption Team can create the Textbook Spine
    2. via CSV
  2. The Question Set Purpose is exposed via a JSON configuration. It can be provided in multiple ways (a hypothetical example follows this list)
    1. Several Blueprints are provided with pre-filled values.
    2. The JSON data can be viewed as-is via a custom UI in the new Portal
    3. A User can edit the JSON data in the UI to customize it
  3. The results of the query can be edited by the User. This allows full customization.
    1. Bulk-accept
    2. Reject individual items
    3. Add individual items (so that the query can be modified, new results fetched, and interesting items added to the collection)
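A hypothetical Blueprint configuration, in the spirit of the "JSON-ified filtering criteria" mentioned under TBD below; the field names are assumptions, not a finalized schema:

    import json

    blueprint = {
        "name": "One-mark practice set",
        "filters": {
            "marks": 1,
            "questionType": ["MCQ", "FTB"],
            "appearedInExam": True
        },
        "limit": 10
    }

    print(json.dumps(blueprint, indent=2))   # what a pre-baked Blueprint might serialize to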

TBD

  1. telemetry event structures (during curation)
    1. to improve the auto-curation process
  2. telemetry event structures (during set acceptance/modification)
    1. to improve query fulfillment
  3. intent specification schema
    1. similar to the Plug-n-Play analytics JSON-ified filtering criteria

5th June Scope

  1. Preparatory Question Sets and Exam Question Sets will be made available for
    1. Grades 9 and 10
    2. Seven Subjects (Maths, Science, Social Science, Hindi, English and Sanskrit)
    3. Coverage and quality will vary depending on the curation effort required and the quality of the source data
  2. Auto Curation (Validation) metrics will be developed to reduce the effort required of the Human-in-the-Loop. This is seen as a general ML infrastructure capability (in particular, a Reinforcement Learning environment)
    1. A few rules will be developed to identify quality tags (Manual effort)
    2. A few algorithms will be developed to identify quality tags
  3. Extracted Taxonomy terms from the CBSE Question Bank will be aligned with NCERT terms (Manual effort)
  4. Sample Question Set Intents will be created (Manual effort)
  5. Auto Textbook Creation: given a Textbook Spine and a list of Blueprints, Textbook creation will be automated
  6. The ability to modify the Intent and select from the result set will be developed
  7. Support for passive consumption of Question-Answer pairs. There will not be any evaluation of answers or interactions on the questions; they will be treated like normal resource types (not assessment resources)

Beyond June Scope

  1. 16 core subjects, grades 1-12
  2. QML implementation; support for additional interaction types
  3. Support for Exemplary Answers (actual Answers written by students, captured as scanned images for OCR)
  4. Support for ingesting previous exam paper questions

Design Direction(?)

Treat Textbook as a Map.

  1. Every learning service is then actually a location-based Service.
    1. In a Map, we can ask for
      1. Find restaurants nearby
      2. Find the shortest route from A to B
      3. Find a shopping mall nearby
    2. If we treat Textbook as a Map
      1. Find teaching material for this location
      2. Find practice material for this location
      3. Find previous exam questions for this location
      4. Tell me how others are doing here
      5. Tell me what others are facing difficulty with here
      6. How can I get from here (this concept) to another location (the next chapter)?
  2. It unifies the design across all verticals in DIKSHA. It is like building an Uber, an Ola, or a Swiggy on top of Google Maps.
  3. Lessens the burden on non-expert Users of providing exact location details (Taxonomy terms)
    1. In order to ask for services, a user does not need to provide an exact address (house number, street name, nearest landmark, PIN code)
    2. He simply drops a pin
  4. A user simply selects a Kindle-like Textbook from the library (by supplying just four fields: board, medium, grade, subject)
    1. The User need not bother about choosing a Chapter, Topic, or anything of that sort.
    2. A User provides them implicitly while browsing the Textbook. The current location in the Kindle-like book gives away the exact details
  5. At this time, Users are using Taxonomy terms as pointers. But in the platform, we interpret them both as pointers and as what they mean. With the Textbook-as-a-Map metaphor, we don't even have to ask for the pointers. All the complexity of figuring out the meaning of the location can be handled by the backend
  6. Providing resources from one medium in another medium is like transforming Cartesian coordinates to polar coordinates. The User always remains in their own zone. (A small sketch of this idea follows.)
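To make the metaphor concrete, a minimal sketch of turning a dropped pin into service queries; all names and content types here are illustrative assumptions, not the platform's actual API:

    def services_at(pin):
        # `pin` is the user's current location in the Textbook; browsing
        # supplies it, so the user never types taxonomy terms.
        return {
            "teaching_material": {"contentType": "Resource", **pin},
            "practice_material": {"contentType": "PracticeQuestionSet", **pin},
            "previous_exam_items": {"contentType": "QuestionSet", "source": "exam", **pin},
        }

    pin = {"board": "CBSE", "medium": "English", "grade": "10",
           "subject": "Science", "chapter": "Life Processes"}
    print(services_at(pin)["practice_material"])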