Researchers who build intelligent tutoring systems (ITS) would like to know which pieces of educational content are most effective at promoting learning among their students. Randomized controlled experiments are often used to determine which content produces more learning in an ITS. While these experiments are powerful, they are often very costly to set up and run. The majority of the data collected in many ITSs consists of answers to a finite set of questions on a given skill, often presented in a random sequence. We propose a Bayesian method to detect which questions produce the most learning in such randomly sequenced data. We confine our analysis to random sequences of four questions. A simulation study with synthetic students was run to investigate the validity of the method and the bounds on which differences in learning probability could be reliably detected with various numbers of users. Finally, real tutor data from random-sequence problem sets was analyzed. Results of the simulation data analysis showed that th...
Zachary A. Pardos, Neil T. Heffernan