Archive for August, 2011

Open-ended and multiple-choice versions of the same test

Friday, August 19th, 2011

I’ve just read an excellent paper. It’s rather old – indeed, so old that I might have been one of the ‘first year secondary school pupils’ involved in the evaluation (though I don’t think I was)! The full reference is:

Bishop, A.J., Knapp, T.R. & McIntyre, D.I. (1969) A comparison of the results of open-ended and multiple-choice versions of a mathematics test. International Journal of Educational Science, 3, 147-54.

The first thing they did was to produce a multiple-choice version of a mathematics test by using, as distractors, actual wrong answers commonly given to the same questions in open-ended form. (more…)
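Out of interest, here’s a minimal sketch (in Python, with invented data and an invented function name – the paper itself predates anything like this) of how one might harvest the commonest wrong answers from open-ended responses for use as distractors:

    from collections import Counter

    def build_distractors(responses, correct_answer, n_distractors=3):
        # Keep only the wrong answers, then take the most frequent ones.
        wrong = [r for r in responses if r != correct_answer]
        return [answer for answer, _ in Counter(wrong).most_common(n_distractors)]

    # Invented example: open-ended responses to "What is 0.1 x 0.2?"
    responses = ["0.02", "0.2", "0.02", "2", "0.2", "0.3", "0.2", "0.02"]
    print(build_distractors(responses, correct_answer="0.02"))
    # -> ['0.2', '2', '0.3'] : the commonest wrong answers become the distractors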

The testing effect

Tuesday, August 16th, 2011

This will be my final post that picks up a theme from CAA 2011, but the potential implications of this one are massive. For the past few weeks I have been trying to get my head around the significance of the ideas I was introduced to by John Kleeman’s presentation ‘Recent cognitive psychology research shows strongly that quizzes help retain learning’. I’m ashamed to admit that the ideas John was presenting were mostly new to me. They chime with a lot of what we do at the UK Open University in encouraging students to learn actively, but they go further. Thank you John! (more…)

Are we assessing what we think we are?

Tuesday, August 9th, 2011

In the past week (when I should have been working at Open University summer school, but got sent home ill) I haven’t felt up to doing a great deal, but I have managed quite a lot of reading. I’ve also tried to get a deeper understanding of some of the concepts in assessment which I once thought I understood – but the more I learn the less I feel I know. Validity is one of those concepts. (more…)

Poor quality assessment – inescapable and memorable

Tuesday, August 9th, 2011

David Boud famously said ‘Students can, with difficulty, escape from the effects of poor teaching, they cannot (by definition if they want to graduate) escape the effects of poor assessment.’

Boud, D. (1995) Assessment and learning: contradictory or complementary? In P. Knight (ed.) Assessment for learning in higher education. Kogan Page in association with SEDA. p. 35.

Poor assessment is also memorable. (more…)

It’s just not cricket

Wednesday, August 3rd, 2011

First of all, for the benefit of those who are not native speakers of English, I ought to explain the meaning of the phrase ‘It’s just not cricket’. The game of cricket carries connotations of being something that is played by ‘gentlemen’ (probably on a village green) who wouldn’t dream of cheating – so if something is unfair, biased or involves cheating it is ‘just not cricket’. This post is about bias and unfairness in assessment.

[Image: © Copyright Martin Addison, licensed for reuse under a Creative Commons Licence]

It is right that we do all we reasonably can to remove unfair bias (e.g. by not using idiomatic language like ‘it’s just not cricket’…) but how far down this road it is appropriate to go is a matter of some debate. Some of my colleagues love to make their tutor-marked assignment questions interesting and, by coincidence, some of them are rather fond of cricket…so we end up with all sorts of energy considerations relating to cricket balls and wickets. Is that fair? Are students who don’t know about cricket disadvantaged? Or perhaps it is the students who know lots about cricket who are disadvantaged – they may try to interpret the question too deeply. In situations like this, I insist on including an explanation of the terms used, but I sometimes wonder if we’d be better off not trying to make assessment ‘interesting’ in this way. The sport gets in the way of the physics. (more…)

Does a picture paint a thousand words?

Tuesday, August 2nd, 2011

One of the things that Matt Haigh looked at when considering the impact of item format (see previous post) was whether the presence of a picture made the question easier or harder. He started with a very simple multiple-choice item:

Which of the following is a valid argument for using nuclear power stations?

  • for maximum efficiency, they have to be sited on the coast
  • they have high decommissioning costs
  • they use a renewable energy source
  • they do not produce gases that pollute the atmosphere

All the students received this question, but half had a version which showed a photograph of a nuclear power station. Not surprisingly, the students liked the presence of the photograph. The version with the photograph also had a slightly lower item difficulty (when calculated under either the Classical Test Theory or the Item Response Theory paradigm), but not significantly so.
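For anyone unfamiliar with the jargon, here is a minimal sketch of the Classical Test Theory calculation (my own illustration with invented data, not figures from the study): item facility is simply the proportion of students answering correctly, so a ‘less difficult’ item shows up as a higher facility value.

    def facility(scores):
        # Classical Test Theory item facility: proportion of students
        # answering correctly (1 = correct, 0 = wrong).
        # Higher facility = easier (less difficult) item.
        return sum(scores) / len(scores)

    # Invented illustrative data, not figures from the study
    with_photo = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
    without_photo = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]

    print(facility(with_photo))     # 0.8 – the slightly easier version
    print(facility(without_photo))  # 0.6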

In the light of aspects of my work that I’ve already described briefly (it’s the dreaded sandstone question again – see Helpful and unhelpful feedback: a story of Sandstone), it is perhaps surprising that the presence of the photograph does not confuse people and so make the question more difficult. (more…)

The impact of item format

Tuesday, August 2nd, 2011

One of the things I’ve found time and time again in my investigations into student engagement with e-assessment is that little things can make a difference. The research done by Matt Haigh of Cambridge Assessment into the impact of question format, which I’ve heard Matt speak about a couple of times, most recently at CAA 2011, was therefore well overdue. It’s hard to believe that so few people have done work in this area.

Matt compared the difficulty (as measured by performance on the questions) of ten pairs of question types – e.g. with or without a picture, drag and drop vs tick box, drag and drop vs drop-down selection, multiple-choice with only a single selection allowed vs multiple-choice with multiple selections enabled – when administered to 112 students at secondary schools in England. In each case the actual question asked was identical. The quantitative evaluation was followed by focus group discussions.

This work is very relevant to what we do at the OU (since, for example, we use drop-down selection as the replacement for drag and drop questions for students who need to use a screen reader to attempt the questions). Happily, Matt’s main conclusion was that the variations of item format explored had very little impact on difficulty – even where there appeared to be some difference, it was not statistically significant. The focus group discussions led to general insight into what makes a question difficult (not surprisingly, ‘lack of clarity’ came top) and also to some suggested explanations for the observed differences, and lack of differences, in difficulty between the parallel forms of the questions.
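For the record, here is a minimal sketch (my own illustration with invented figures – not Matt’s actual analysis, which I haven’t seen in detail) of one standard way to check whether a difference in facility between two parallel forms of an item is statistically significant: a two-proportion z-test.

    from math import sqrt

    def two_proportion_z(correct_a, n_a, correct_b, n_b):
        # z-test for the difference between two proportions (item facilities).
        p_a, p_b = correct_a / n_a, correct_b / n_b
        p_pool = (correct_a + correct_b) / (n_a + n_b)
        se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
        return (p_a - p_b) / se

    # Invented figures: assume 56 students saw each form (112 in total)
    z = two_proportion_z(correct_a=40, n_a=56, correct_b=35, n_b=56)
    print(round(z, 2))  # about 1.0; |z| < 1.96, so not significant at the 5% level

With ten pairs of items being compared, a cautious analyst would also want to correct for multiple comparisons.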

I’d very much like to do some work in this area myself, looking at the impact of item format on our rather different (and vast) student population. I’d also like to observe people doing questions in parallel formats, to see what clues that might give.

Automatically generated questions

Tuesday, August 2nd, 2011

In describing a presentation by Margit Hofler of the Institute for Information Systems and Computer Media at Graz University of Technology, Austria, the CAA 2011 Conference Chair Denise Whitelock used the words ‘holy grail’, and this is certainly interesting and cutting-edge stuff. The work is described in the paper ‘Investigating automatically and manually generated questions to support self-directed learning’ by Margit and her colleagues at http://caaconference.co.uk/proceedings/

An ‘enhanced automatic question creator’ has been used to create questions from a piece of text, and the content quality of 120 automatically created test items has been compared with that of 290 items created by students. (more…)