TY - GEN
T1 - Use of Machine Learning Methods in the Assessment of Programming Assignments
AU - Tarcsay, Botond
AU - Vasić, Jelena
AU - Perez-Tellez, Fernando
N1 - Publisher Copyright: © 2022, Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
AB - Programming has become an important skill in today’s world and is taught widely both in traditional and online settings. Educators need to grade increasing numbers of student submissions. Unit testing can contribute to the automation of the grading process; however, it cannot assess the structure, or partial correctness, which are needed for finely differentiated grading. This paper builds on previous research that investigated several machine learning models for determining the correctness of source code. It was found that some such models can be successful, provided that the code samples used for fitting and prediction fulfil the same sets of requirements (corresponding to coding assignments). The hypothesis investigated in this paper is that code samples can be grouped by similarity of the requirements that they fulfil and that models built with samples of code from such a group can be used for determining the quality of new samples that belong to the same group, even if they do not correspond to the same coding assignment, which would make for a much more useful predictive model in practice. The investigation involved ten different machine learning algorithms used on over four hundred thousand student code submissions and it confirmed the hypothesis.
KW - Applied Machine Learning for Code Assessment
KW - Automated Grading
KW - Student Programming Code Grading
UR - https://www.scopus.com/pages/publications/85139012635
U2 - 10.1007/978-3-031-16270-1_13
DO - 10.1007/978-3-031-16270-1_13
M3 - Conference contribution
AN - SCOPUS:85139012635
SN - 9783031162695
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 151
EP - 159
BT - Text, Speech, and Dialogue - 25th International Conference, TSD 2022, Proceedings
A2 - Sojka, Petr
A2 - Horák, Aleš
A2 - Kopeček, Ivan
A2 - Pala, Karel
PB - Springer Science and Business Media Deutschland GmbH
T2 - 25th International Conference on Text, Speech, and Dialogue, TSD 2022
Y2 - 6 September 2022 through 9 September 2022
ER -