Authors:
Elijah Chou (1); Davide Fossati (1) and Arnon Hershkovitz (2)
Affiliations:
(1) Computer Science Department, Emory University, Atlanta, GA, U.S.A.
(2) Science and Technology Education Department, Tel Aviv University, Tel Aviv, Israel
Keyword(s):
Originality, Creativity, Educational Data Mining, Tree Edit Distance, Computer Science Education.
Abstract:
We propose a novel approach to measuring student originality in computer programming. We collected two sets of programming problems in Java and Python, along with the solutions submitted by multiple students. We parsed the students’ code into abstract syntax trees and calculated the distances between code submissions within each problem group using a tree edit distance algorithm. We estimated each student’s originality as the normalized average distance between their code and the other students’ code. Pearson correlation analysis revealed a negative correlation between students’ coding performance (i.e., the degree of correctness of their code) and their programming originality. Further analysis comparing state (features of the problem set) and trait (features of the students) for this measure revealed a correlation with trait and no correlation with state. This suggests that we are likely measuring some trait that a student has, possibly originality, and not some coincidental feature of our problem set. We also examined the validity of our proposed measure by observing the agreement between human graders and our measure in ranking the originality of pairs of code.
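The pipeline described in the abstract (parse to abstract syntax trees, compute pairwise tree edit distances within a problem group, then average and normalize) can be sketched as follows. This is a minimal illustration for Python submissions only, not the authors' implementation: the abstract does not name the tree edit distance algorithm, the edit costs, or the normalization, so the sketch assumes unit costs, a simple memoized forest recurrence, and division by the largest average distance in the group. The function names `to_tree`, `forest_distance`, and `originality_scores` are hypothetical.

```python
import ast
from functools import lru_cache


def to_tree(node):
    """Convert an ast node into a hashable (label, children) tuple."""
    children = tuple(to_tree(child) for child in ast.iter_child_nodes(node))
    return (type(node).__name__, children)


def size(tree):
    """Number of nodes in a (label, children) tree."""
    return 1 + sum(size(child) for child in tree[1])


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Ordered edit distance between two forests (tuples of trees).

    Assumption: insert, delete, and relabel each cost 1.
    """
    if not f:
        return sum(size(t) for t in g)  # insert everything left in g
    if not g:
        return sum(size(t) for t in f)  # delete everything left in f
    (v_label, v_kids), (w_label, w_kids) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + v_kids, g) + 1
    # Insert the rightmost root of g.
    insert = forest_distance(f, g[:-1] + w_kids) + 1
    # Match the two rightmost roots, relabeling if the labels differ.
    match = (forest_distance(v_kids, w_kids)
             + forest_distance(f[:-1], g[:-1])
             + (v_label != w_label))
    return min(delete, insert, match)


def originality_scores(sources):
    """Average distance of each submission to the others, scaled to [0, 1].

    Assumption: normalization divides by the largest average in the group.
    """
    if len(sources) < 2:
        raise ValueError("need at least two submissions to compare")
    trees = [to_tree(ast.parse(src)) for src in sources]
    averages = []
    for i, tree in enumerate(trees):
        distances = [forest_distance((tree,), (other,))
                     for j, other in enumerate(trees) if j != i]
        averages.append(sum(distances) / len(distances))
    top = max(averages)
    return [avg / top if top else 0.0 for avg in averages]
```

Under these assumptions, a submission that is structurally far from the rest of its group scores close to 1, while submissions that share a common structure score lower. The recurrence is adequate for short student programs; larger inputs would call for a dedicated algorithm such as Zhang-Shasha.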