TY - JOUR
T1 - Are MOOC Learning Analytics Results Trustworthy? With Fake Learners, They Might Not Be!
AU - Alexandron, Giora
AU - Yoo, Lisa Y.
AU - Ruiperez-Valiente, Jose A.
AU - Lee, Sunbok
AU - Pritchard, David E.
N1 - GA’s research is supported by the Israeli Ministry of Science and Technology under project no. 713257.
PY - 2019/12
Y1 - 2019/12
AB - The rich data that Massive Open Online Course (MOOC) platforms collect on the behavior of millions of users provide a unique opportunity to study human learning and to develop data-driven methods that can address the needs of individual learners. This type of research falls into the emerging field of learning analytics. However, learning analytics research tends to ignore the reliability of results based on MOOC data, which are typically noisy and generated by a largely anonymous crowd of learners. This paper provides evidence that, because of their number and aberrant behavior, users who abuse the anonymity and open nature of MOOCs, for example by setting up multiple accounts, can significantly bias learning analytics in MOOCs. We identify these users, denoted fake learners, using dedicated algorithms. The methodology for measuring the bias caused by fake learners' activity combines the ideas of Replication Research and Sensitivity Analysis. We replicate two highly cited learning analytics studies with and without fake learners' data and compare the results. In one study, the results were relatively stable against fake learners; in the other, removing the fake learners' data significantly changed the results. These findings raise concerns regarding the reliability of learning analytics in MOOCs and highlight the need to develop more robust, generalizable, and verifiable research methods.
UR - http://www.scopus.com/inward/record.url?scp=85068820660&partnerID=8YFLogxK
U2 - https://doi.org/10.1007/s40593-019-00183-1
DO - https://doi.org/10.1007/s40593-019-00183-1
M3 - Article
SN - 1560-4292
VL - 29
SP - 484
EP - 506
JO - International Journal of Artificial Intelligence in Education
JF - International Journal of Artificial Intelligence in Education
IS - 4
ER -