I am a teacher and would like to use the data.table package in R correctly to automatically grade student answers in a log file, i.e. add a column called correct that is 1 if the student's answer to a particular question is a correct answer to that question, and 0 otherwise. I can do this easily if each question has only one answer, but I am getting tripped up when a question has multiple possible answers (questions and their possible correct answers are stored in another table).
Below is a MWE:
library(data.table)
set.seed(123)
question_table <- data.table(id = c(1, 1, 2, 2, 3, 4),
                             correct_ans = sample(1:4, 6, replace = TRUE))
log <- data.table(student = sample(letters[1:3], 10, replace = TRUE),
                  question_id = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4),
                  student_answer = c(2, 4, 1, 3, 2, 4, 4, 5, 2, 1))
My question is: what is the correct data.table way to use ifelse in j, especially if we depend on another table?
log[,correct:=ifelse(student_answer %in%
question_table[log$question_id %in% id]$correct_ans,1,0)]
As can be seen below, question 1 and 2 both have multiple possible correct answers.
> question_table
id correct_ans
1: 1 2
2: 1 4
3: 2 2
4: 2 4
5: 3 4
6: 4 1
While the correct column is calculated without errors, something isn't right: e.g. in row 3, student b answers question 1 with 1 and is awarded a correct score, even though he answered incorrectly. Only some entries of the correct column are off, which leads me to believe there is something I am not getting about how variables are scoped.
> log
student question_id student_answer correct
1: b 1 2 1
2: c 1 4 1
3: b 1 1 1 <- ?
4: b 2 3 0
5: c 2 2 1
6: b 2 4 1
7: c 3 4 1
8: b 3 5 0
9: a 4 2 1 <- ?
10: c 4 1 1
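The pattern in the correct column above is consistent with every answer being checked against the pooled correct answers of all questions, rather than only those of its own question. A minimal base-R sketch that reproduces the column under that assumption (values are copied from the printed tables; the names all_correct and pooled are made up for illustration):

```r
# Student answers in log order, copied from the printed log table.
student_answer <- c(2, 4, 1, 3, 2, 4, 4, 5, 2, 1)

# correct_ans pooled over every question, copied from question_table.
all_correct <- c(2, 4, 2, 4, 4, 1)

# %in% is vectorised over the whole column and knows nothing about
# which question each answer belongs to.
pooled <- as.integer(student_answer %in% all_correct)
pooled
# [1] 1 1 1 0 1 1 1 0 1 1
```

Rows 3 and 9 come out as 1 here because 1 and 2 are correct answers to some question, just not to questions 1 and 4 respectively.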
I considered making a helper column with the correct answer in the log table by joining with question_table, but that does not work since the key is not unique in the latter.
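One possible way around the non-unique key is to join on the question and the answer together, so that a log row matches only when its (question, answer) pair exists in question_table. The following is a minimal sketch of that idea using a data.table update join, not necessarily the canonical approach. The table values are hard-coded from the printed output so the result does not depend on the random number generator, and the student column is left out for brevity:

```r
library(data.table)

# Values copied from the printed tables above.
question_table <- data.table(id          = c(1, 1, 2, 2, 3, 4),
                             correct_ans = c(2, 4, 2, 4, 4, 1))
log <- data.table(question_id    = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4),
                  student_answer = c(2, 4, 1, 3, 2, 4, 4, 5, 2, 1))

# Start with every row marked incorrect.
log[, correct := 0L]

# Update join: rows of log whose (question_id, student_answer) pair
# matches an (id, correct_ans) pair in question_table are set to 1.
log[question_table,
    on = .(question_id = id, student_answer = correct_ans),
    correct := 1L]

log$correct
# [1] 1 1 0 0 1 1 1 0 0 1
```

Because the match is on both columns, it does not matter that id alone is repeated in question_table, and no ifelse is needed in j.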
Any and all help would be appreciated. Thanks in advance.