Are we assessing what we think we are?

In the past week (when I should have been working at Open University summer school, but got sent home ill) I haven’t felt up to doing a great deal, but I have managed quite a lot of reading. I’ve also tried to get a deeper understanding of some of the concepts in assessment which I once thought I understood – but the more I learn the less I feel I know. Validity is one of those concepts. So those of you who are experts, please ignore this use of my blog to get my own thoughts straight.

At one level validity is simple – are you assessing what you think you are? (Sadly, I think the answer is often ‘no’.) As a physicist I spend quite a lot of time considering measurement uncertainties, which can be classified as random or systematic. I think it is probably fair to say that increasing reliability is akin to reducing random uncertainties (which can be done, though not necessarily easily) whilst increasing validity is akin to reducing systematic uncertainties (much more slippery!). The problem is that it is very difficult to know if you are measuring or assessing the right thing.
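A small simulation can make the analogy concrete. The sketch below is illustrative only: the ‘true’ score, the size of the bias and the spread of the noise are all made-up numbers, and `simulate_scores` is a hypothetical helper rather than anything from a real assessment. It shows that averaging many measurements shrinks the random error, while the systematic offset stays put.

```python
import random
import statistics


def simulate_scores(true_value, bias, noise_sd, n, seed=0):
    """Return n simulated measurements of true_value.

    Each measurement carries the same systematic offset (bias)
    plus independent Gaussian noise (the random uncertainty).
    All parameter values used below are assumptions for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]


true_score = 60.0  # hypothetical 'real' ability on some scale

few = simulate_scores(true_score, bias=5.0, noise_sd=10.0, n=5)
many = simulate_scores(true_score, bias=5.0, noise_sd=10.0, n=5000)

# Error of the averaged result relative to the true value.
error_few = statistics.mean(few) - true_score
error_many = statistics.mean(many) - true_score

print(f"mean error with 5 measurements:    {error_few:.2f}")
print(f"mean error with 5000 measurements: {error_many:.2f}")
```

With many measurements the mean error settles close to the bias of 5 rather than close to zero. More data buys reliability, but nothing in the data itself reveals the offset, which is why the systematic side (validity) is the slippery one.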

In a similar way to the word ‘assessment’ itself (see my earlier posting Adjectives of assessment), the word ‘validity’ seems frequently to follow an adjective, i.e. it comes in different flavours. I’ll just describe my understanding of a few of the types of validity here. Some of the definitions are taken from other people, but I have edited them to reflect my own understanding.

Concurrent validity – the correlation of a new test with existing tests which purport to measure the same thing (but note that old and new may correlate but neither be valid).

Construct validity – if the test is intended to measure, e.g., ‘verbal reasoning’ or ‘numeracy’, is that what it is actually measuring?

Content validity – do the questions match the contents and learning outcomes of the syllabus?

Convergent validity is demonstrated when different measures of the same trait correlate highly.

Discriminant validity describes the degree to which the measure does not correlate with other measures that it theoretically should not be correlated with.

Face validity – the acceptability of the test items to both test user and subject.

Predictive validity – relevant when a test is used to make predictions. It is expressed as a correlation between the test score and a measure of the degree of success in the predicted field.
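Several of the definitions above (concurrent, convergent, discriminant and predictive validity) come down to a correlation between two sets of scores. As a minimal sketch, here is a Pearson correlation coefficient computed by hand. The scores are invented for illustration, and `pearson` is just a helper written for this example. The second half illustrates the caveat under concurrent validity: if both tests share the same systematic offset, the correlation is unchanged, so a high correlation alone cannot show that either test is valid.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for the same eight students on an existing test
# and on a new test that purports to measure the same thing.
old_test = [42, 55, 61, 48, 70, 66, 53, 59]
new_test = [45, 58, 60, 50, 74, 69, 51, 62]

r = pearson(old_test, new_test)
print(f"correlation between old and new test: {r:.3f}")

# Add the same systematic offset to both tests: every score is now
# 'wrong' by 10 marks, yet the correlation does not change.
shifted_old = [s + 10 for s in old_test]
shifted_new = [s + 10 for s in new_test]

r_shifted = pearson(shifted_old, shifted_new)
print(f"correlation after shifting both:      {r_shifted:.3f}")
```

For predictive validity the same calculation would apply, with the second list replaced by some later measure of success. What counts as a ‘high enough’ correlation is a judgement the code cannot make.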
