This project investigated whether techniques from Artificial Intelligence can be used to generate mark schemes for online assessment automatically. Because of its unusually large student body, the Open University holds large collections of student responses to short questions that have already been marked as part of assessment. We used a Machine Learning technique called Inductive Logic Programming to attempt to generate mark schemes from a series of marked student responses to Level 1 Science questions. The system appears to produce plausible results in the first instance without requiring additional human input. For example, one clause of a typical mark scheme generated by the machine learner might read:
correct_response(X) :-
    term_in_response(X, wind),
    term_in_response(X, desert).
which should be interpreted as meaning “response X is correct if it contains the word wind and the word desert.” The learned rules are based on keyword matching, stemmed keywords (for example, using chang to match change or changing), and the order in which terms appear in the response.
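To make the three kinds of test concrete, the following is a minimal Python sketch of how such a rule could be evaluated against a response. It is an illustration only, not the project's implementation: the function names mirror the predicates in the clause above, term_precedes is a hypothetical name for an ordering test, and simple prefix matching is assumed as a crude stand-in for the stemming used in the report.

```python
import re


def tokenize(response):
    """Lowercase the response and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", response.lower())


def term_in_response(response, term):
    """True if any word in the response starts with the given term.

    Assumption: prefix matching approximates stemming, so 'chang'
    matches both 'change' and 'changing'.
    """
    return any(word.startswith(term) for word in tokenize(response))


def term_precedes(response, first, second):
    """True if a word matching `first` occurs before a word matching `second`.

    Hypothetical helper illustrating an order-of-appearance test.
    """
    words = tokenize(response)
    first_pos = [i for i, w in enumerate(words) if w.startswith(first)]
    second_pos = [i for i, w in enumerate(words) if w.startswith(second)]
    if not first_pos or not second_pos:
        return False
    return min(first_pos) < max(second_pos)


def correct_response(response):
    """Python analogue of the example clause: needs both 'wind' and 'desert'."""
    return (term_in_response(response, "wind")
            and term_in_response(response, "desert"))
```

Prefix matching is deliberately simplistic: it would also let wind match window, which a real stemmer would avoid. This is one reason a generated rule may still need editing by a mark scheme author.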
The performance of the system was evaluated by comparing the accuracy of the generated rules (subject to subsequent editing) against a mark scheme written for the Natural Language based system currently used by the University, and a set of marking patterns written in Java, developed within the University.
All three systems demonstrated similar marking accuracy. On most questions, accuracy was better than 95%, suggesting that each of these systems may be suitable for writing mark schemes for use in student assessment. The comparisons also yielded some valuable insights into the kinds of question that are more or less suitable for automatic free-text marking.
The benefit of the Machine Learning approach is that initial mark schemes can be proposed to a mark scheme author to support them in creating the final mark scheme for online deployment. Because the rules are written in a readable logical language (as illustrated above), the rule sets can be developed, read and maintained by educators who are not themselves trained in computing. It also appears that the rule proposer may significantly reduce the time needed to build mark schemes compared with building them by hand, although this claim is currently unquantified. Our experience demonstrates that the output of the system can also support educators who are authoring with any of the existing systems used by the University for this style of free-text assessment.
Willis, A. (2010) COLMSCT Final Report 'Inductive Logic Programming to Support Automatic Marking of Student Answers in Free Text.'
