Optimizing Representations and Policies for Question Sequencing using Reinforcement Learning

Aqil Zainal Azhar, Avi Segal, Kobi Gal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper studies the use of Reinforcement Learning (RL) policies for optimizing the sequencing of online learning materials for students. Our approach provides an end-to-end pipeline for automatically deriving and evaluating robust representations of students’ interactions and policies for content sequencing in online educational settings. We conduct training and evaluation offline, using a publicly available dataset of diverse online learning activities from tens of thousands of students. We study the influence of the state representation on the performance of the resulting policy and on its robustness to perturbations in the environment dynamics induced by stronger and weaker learners. We show that ‘bigger may not be better’: increasing the complexity of the state space does not necessarily lead to better performance, as measured by expected future reward. We describe two methods for offline evaluation of the policy, based on importance sampling and Monte Carlo policy evaluation. This work is a first step towards optimizing representations when designing policies for sequencing educational content that can be used in the real world.
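
As a rough illustration of the importance-sampling idea named in the abstract, the sketch below estimates the expected discounted return of a target policy from episodes logged under a different behavior policy. It is a generic textbook estimator (ordinary, per-trajectory importance sampling), not the authors' implementation; the function name, the tabular policy arrays, the discount factor, and the toy episodes are all assumptions made for this example.

```python
import numpy as np


def importance_sampling_estimate(trajectories, target_policy, behavior_policy, gamma=0.9):
    """Ordinary importance-sampling estimate of the target policy's expected return.

    trajectories: list of episodes, each a list of (state, action, reward) tuples
        logged while following behavior_policy (hypothetical log format).
    target_policy, behavior_policy: arrays of shape (n_states, n_actions) holding
        action probabilities; behavior_policy must be non-zero for logged actions.
    """
    estimates = []
    for episode in trajectories:
        weight = 1.0       # product of per-step probability ratios
        discounted = 0.0   # discounted return observed in the log
        for t, (state, action, reward) in enumerate(episode):
            weight *= target_policy[state, action] / behavior_policy[state, action]
            discounted += (gamma ** t) * reward
        estimates.append(weight * discounted)
    return float(np.mean(estimates))


# Toy data (assumed): 2 states, 2 actions, uniform behavior policy.
behavior = np.array([[0.5, 0.5], [0.5, 0.5]])
target = np.array([[0.0, 1.0], [0.5, 0.5]])
episodes = [
    [(0, 1, 1.0), (1, 0, 0.0)],
    [(0, 0, 0.0), (1, 1, 1.0)],
]

print(importance_sampling_estimate(episodes, target, behavior))
```

When the target and behavior policies coincide, every weight is 1 and the estimate reduces to the average logged return. Ordinary importance sampling is unbiased but can have high variance on long episodes, which is one reason it is often paired with a second evaluation method.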

Original language: American English
Title of host publication: Proceedings of the 15th International Conference on Educational Data Mining, EDM 2022
Publisher: International Educational Data Mining Society
ISBN (Electronic): 9781733673631
State: Published - 1 Jan 2022
Event: 15th International Conference on Educational Data Mining, EDM 2022 - Hybrid, Durham, United Kingdom
Duration: 24 Jul 2022 - 27 Jul 2022

Publication series

Name: Proceedings of the 15th International Conference on Educational Data Mining, EDM 2022

Conference

Conference: 15th International Conference on Educational Data Mining, EDM 2022
Country/Territory: United Kingdom
City: Hybrid, Durham
Period: 24/07/22 - 27/07/22

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Information Systems
