It’s just not cricket

First of all, for the benefit of those who are not native speakers of English, I ought to explain the meaning of the phrase ‘It’s just not cricket’. The game of cricket carries connotations of being something that is played by ‘gentlemen’ (probably on a village green) who wouldn’t dream of cheating – so if something is unfair, biased or involves cheating, it is ‘just not cricket’. This post is about bias and unfairness in assessment.

 © Copyright Martin Addison and licensed for reuse under this Creative Commons Licence

It is right that we do all we reasonably can to remove unfair bias (e.g. by not using idiomatic language like ‘it’s just not cricket’…) but how far down this road it is appropriate to go is a matter of some debate. Some of my colleagues love to make their tutor-marked assignment questions interesting and, by coincidence, some of them are rather fond of cricket…so we end up with all sorts of energy considerations relating to cricket balls and wickets. Is that fair? Are students who don’t know about cricket disadvantaged? Or perhaps it is the students who know lots about cricket who are disadvantaged – they may try to interpret the question too deeply. In situations like this, I insist on including an explanation of the terms used, but I sometimes wonder if we’d be better not trying to make assessment ‘interesting’ in this way. The sport gets in the way of the physics. Continue reading

Posted in bias, fairness | Leave a comment

Does a picture paint a thousand words?

One of the things that Matt Haigh looked at when considering the impact of item format (see previous post) was whether the presence of a picture made the question easier or harder. He started with a very simple multiple choice item:

Which of the following is a valid argument for using nuclear power stations?

  • for maximum efficiency, they have to be sited on the coast
  • they have high decommissioning costs
  • they use a renewable energy source
  • they do not produce gases that pollute the atmosphere

All the students received this question, but half had a version which showed a photograph of a nuclear power station. Not surprisingly, the students liked the presence of the photograph. The version with the photograph also had a slightly lower item difficulty (when calculated under either the Classical Test Theory or Item Response Theory paradigm), but not significantly so.
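For readers unfamiliar with the jargon: in Classical Test Theory, an item’s difficulty is simply the proportion of candidates who answer it correctly (confusingly, a higher value means an easier item). A minimal sketch, with made-up response data – the function name and the figures are mine, not Matt’s:

```python
# Hypothetical illustration: in Classical Test Theory (CTT), item
# difficulty (the "p-value") is the proportion of correct responses.
def ctt_difficulty(responses):
    """responses: list of 1 (correct) / 0 (incorrect) for one item."""
    return sum(responses) / len(responses)

# Invented response data for the two versions of the question.
with_photo = [1, 1, 0, 1, 1, 1, 0, 1]
without_photo = [1, 0, 0, 1, 1, 0, 0, 1]

print(ctt_difficulty(with_photo))     # 0.75 (higher p = easier item)
print(ctt_difficulty(without_photo))  # 0.5
```

With samples this small the difference would of course not be statistically significant, which is exactly the point of the real study’s finding.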

Given aspects of my work that I’ve already described briefly (it’s the dreaded sandstone question again – see Helpful and unhelpful feedback : a story of Sandstone), it is perhaps surprising that the presence of the photograph does not confuse people and so make the question more difficult. Continue reading

Posted in item format, picture, question difficulty | Leave a comment

The impact of item format

One of the things I’ve found time and time again in my investigations into student engagement with e-assessment is that little things can make a difference. Therefore the research done by Matt Haigh of Cambridge Assessment into the impact of question format, which I’ve heard Matt speak about a couple of times, most recently at CAA 2011, was well overdue. It’s hard to believe that so few people have done work in this area.

Matt compared the difficulty (as measured by performance on the questions) of ten pairs of question formats, e.g. with or without a picture, drag and drop vs tick box, drag and drop vs drop-down selection, multiple-choice with only single selection allowed vs multiple-choice with multiple selections enabled, when administered to 112 students at secondary schools in England. In each case the actual question asked was identical. The quantitative evaluation was followed by focus group discussions.

This work is very relevant to what we do at the OU (since, for example, we use drop-down selection as the replacement for drag and drop questions for students who need to use a screen reader to attempt the questions). Happily, Matt’s main conclusion was that the variations of item format explored had very little impact on difficulty – even when there appeared to be some difference, it was not statistically significant. The focus group discussions yielded general insights into what makes a question difficult (not surprisingly, ‘lack of clarity’ came top) and also some suggested explanations for the differences, and lack of differences, in difficulty between the parallel forms of the questions.

I’d very much like to do some work in this area myself, looking at the impact of item format on our rather different (and vast) student population. I’d also like to observe people doing questions in parallel formats, to see what clues that might give.

Posted in conferences, item format, question difficulty | 1 Comment

Automatically generated questions

In describing a presentation by Margit Hofler of the Institute for Information Systems and Computer Media at Graz University of Technology, Austria, the CAA 2011 Conference Chair Denise Whitelock used the words ‘holy grail’, and this is certainly interesting and cutting-edge stuff. The work is described in the paper ‘Investigating automatically and manually generated questions to support self-directed learning’ by Margit and her colleagues at http://caaconference.co.uk/proceedings/

An ‘enhanced automatic question creator’ has been used to create questions from a piece of text, and the content quality of 120 automatically created test items has been compared with 290 items created by students. Continue reading

Posted in Automatically generated questions, conferences | Leave a comment

Assessing Open University students – at residential schools and otherwise

I was due to be tutoring at the Open University residential school SXR103 Practising Science at the University of Sussex (shown left; the crane is a reminder of the huge amount of building work that is going on) for two weeks this summer. Unfortunately flu and laryngitis forced me to bail out after just one week. It was an excellent week, even if teaching is a bit challenging when you have no voice! I’m extremely grateful to my colleagues, especially Keith, Chris and Anne-Marie, for helping to cover for me, and to Dave for staying on an extra week in my place.

So now I’m home with unexpected time to catch up on some reading, writing and reflection (though my thinking may be even more befuddled than usual!). I’ve thought previously that the assessment-related issues confronting those of us who work at the UK Open University, with its ‘supported distance learning’, are not that dissimilar to those confronting colleagues at conventional ‘face to face’ universities. Now I’m beginning to wonder. Continue reading

Posted in Open University, Residential schools | Leave a comment

Happy birthday blog!

It seems hard to believe that I’ve been blogging on assessment, especially e-assessment, and especially the impact of e-assessment on students, for a year now.

Even more amazing is the fact that there is still so much I want to say. Assessment and e-assessment have been growth areas for the past 20 years or so (huge numbers of papers have been written and huge numbers of conference presentations given). In many ways we know so much…and yet we know so little. I’m not an expert, just an amateur pottering around the edges of a large, overpopulated and sometimes contested territory. I find it difficult to get my head around many of the concepts. Continue reading

Posted in Uncategorized | Leave a comment

Answer matching for short-answer questions: simple but not that simple

In describing our use of (simple) PMatch for answer matching for short-answer free-text questions, I may have made it sound too simple. I’ll give two examples of the sorts of things you need to consider:

Firstly, consider the question shown on the left. I’m not going to say whether the answer given is correct or incorrect, but note that the answer ‘Kinetic energy is converted to gravitational (potential) energy’ includes exactly the same words – and responses of both types are commonly received from real students. So word order matters.

The other thing to take care with is negatives. As I’ve said before, it isn’t that students are trying to trick the system. However, responses that would be correct were it not for the fact that they contain the word ‘not’ are surprisingly common. So answer matching needs to be able to deal with negation. Continue reading
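To make the two pitfalls concrete, here is a minimal sketch of my own (this is not PMatch syntax, just an illustration) of answer matching that respects word order and rejects negated responses. The keyword list and example responses are invented:

```python
import re

# A toy matcher illustrating two pitfalls in short-answer matching:
# word order and negation. Not the real PMatch rules.
def matches(response, keywords):
    """True only if the keywords appear in the given order
    and the response contains no negation."""
    text = response.lower()
    # Reject negated answers: "...is NOT converted to..." must fail.
    if re.search(r"\bnot\b|n't\b|\bnever\b", text):
        return False
    pos = 0
    for word in keywords:
        pos = text.find(word, pos)  # each keyword must appear after the last
        if pos == -1:
            return False
        pos += len(word)
    return True

ordered = ["gravitational", "converted", "kinetic"]
print(matches("Gravitational energy is converted to kinetic energy", ordered))      # True
print(matches("Kinetic energy is converted to gravitational energy", ordered))      # False: same words, wrong order
print(matches("Gravitational energy is not converted to kinetic energy", ordered))  # False: negated
```

A real matcher of course needs much more than this (synonyms, spelling tolerance, negation scope), which is rather the point of the post.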

Posted in short-answer free text questions | Leave a comment

Can online selected response questions really provide useful formative feedback?

The title of this post comes from the title of a thoughtful paper from John Dermo and Liz Carpenter at CAA 2011. In his presentation, John asked whether automated e-feedback can create ‘moments of contingency’ (Black & Wiliam 2009). This is something I’ve reflected on a lot – in some senses the ideas seem worlds apart. Continue reading

Posted in conferences, feedback | Leave a comment

Are you sure?

For various reasons I’ve been thinking a lot recently about confidence-based marking. (Tony Gardner-Medwin, who does most of the work in this area, also calls it ‘certainty-based marking’.) The principle is that you get most marks for a correct response that you are sure is right, fewer for a correct response that you are not sure about. But at the opposite end of the scale, you tend to get a penalty for an incorrect response that you were sure was right. Continue reading
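The principle is easiest to see as a little table of marks. The numbers below follow the certainty-based scheme as I understand it from Gardner-Medwin’s work (certainty levels 1–3, with a steep penalty for confident errors) – do check his published papers before relying on the exact figures:

```python
# A sketch of a certainty-based mark scheme. Certainty C = 1 (unsure)
# to 3 (sure); marks as I understand Gardner-Medwin's scheme, but the
# exact figures should be checked against his published work.
MARKS = {
    1: (1, 0),   # low certainty:  1 if correct, no penalty if wrong
    2: (2, -2),  # mid certainty:  2 if correct, -2 if wrong
    3: (3, -6),  # high certainty: 3 if correct, -6 if wrong
}

def cbm_score(correct, certainty):
    reward, penalty = MARKS[certainty]
    return reward if correct else penalty

print(cbm_score(correct=True, certainty=3))   # 3: sure and right
print(cbm_score(correct=False, certainty=3))  # -6: sure but wrong
print(cbm_score(correct=False, certainty=1))  # 0: unsure and wrong
```

The asymmetry is deliberate: it only pays to claim high certainty if you genuinely are likely to be right, which is what makes students reflect on how sure they really are.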

Posted in confidence-based marking | 2 Comments

Let students not technology be the driver

Just home from CAA 2011 (the International Computer Assisted Assessment Conference in Southampton). The attendance was quite low (probably a victim of the current economic climate) but the conference was good, with some very thoughtful presentations and extremely useful conversations. I’ll post more in the coming days, but for the moment I’ll just reflect on John Dermo’s summary at the end of the conference. John had used wordle.net to create a word cloud from the papers. Amazingly the ‘biggest’ (i.e. most common) word in the cloud was ‘student’, whilst ‘technology’ was tiny.

I did occasionally feel that some presenters were still seeing technology as a solution in need of a problem (and also seeing ‘evaluation’ as something that we do to convince others that what we’re doing is the sensible way forward – surely honest evaluation has to accept that our use of technology might not always be the best solution). However, the overall focus on students not technology was refreshing. Hurrah!

Posted in conferences | Leave a comment