# Victorian clergymen

This is more ‘rant of the day’ than ‘quote of the day’ but I’d like to start with a quote from my own ‘Maths for Science’ (though I’m indebted to my co-author Pat Murphy who actually wrote this bit):

“It is extremely important to appreciate that even a statistically significant correlation between two variables does not prove that changes in one variable cause changes in the other variable.

Correlation does not imply causality.

A time-honoured, but probably apocryphal, example often cited to illustrate this point is the statistically significant positive correlation reported for the late 19th Century between the number of clergymen in England and the consumption of alcoholic spirits. Both the increased number of the clergymen and the increased consumption of spirits can presumably be attributed to population growth (which is therefore regarded as a ‘confounding variable’) rather than the increase in the number of clergymen being the cause of the increased consumption of spirits or vice versa.”

Jordan, S., Ross, S. and Murphy, P. (2013) Maths for Science. Oxford: Oxford University Press. p. 302.
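The confounding effect described in the quote can be illustrated with a small simulation. This is only a sketch with invented numbers, not historical data: the function names, the population figures and the noise levels are all assumptions made for illustration. Clergy numbers and spirit consumption are each generated from population alone, so neither influences the other, yet the raw figures still correlate strongly.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_confounding(n_years=40, seed=1):
    """Return (raw correlation, per-head correlation) for invented data."""
    rng = random.Random(seed)
    # Hypothetical population growing steadily each year (arbitrary units).
    population = [20 + 0.3 * year for year in range(n_years)]
    # Each quantity depends on population plus its own independent noise.
    # Neither quantity depends on the other.
    clergy = [1.0 * p + rng.gauss(0, 1) for p in population]
    spirits = [2.0 * p + rng.gauss(0, 2) for p in population]
    raw = pearson(clergy, spirits)
    # Crude adjustment for the confounder: work per head of population.
    clergy_per_head = [c / p for c, p in zip(clergy, population)]
    spirits_per_head = [s / p for s, p in zip(spirits, population)]
    adjusted = pearson(clergy_per_head, spirits_per_head)
    return raw, adjusted
```

Calling `raw, adjusted = simulate_confounding()` should give a raw correlation close to 1 and a per-head correlation that is much weaker, because dividing by population removes the shared driver. Dividing by population is a deliberately crude adjustment; a real analysis would more likely use regression or partial correlation.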

Now, my fellow educational researchers, have you understood that point? Correlation does not imply causality. In the past week I have read two papers, both published in respectable peer-reviewed journals and one much cited (including, I’m sad to say, by one publication on which I’m a co-author), which make the mistake of assuming that an intervention has been the cause of an effect.

In particular, if you offer students some sort of non-compulsory practice quiz, those who do the quiz will do better on the module’s summative assessment. We hope that the quiz has helped them, and maybe it has – but we can’t prove this just from the fact that they have done better in a piece of assessed work. What we mustn’t forget is that it is the keen, well-motivated students who do the non-compulsory activities – and these students are more likely to do well in the summative assessment, for all sorts of reasons (they may actually have studied the course materials for a start…).
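To make the self-selection problem concrete, here is a minimal sketch in Python. Everything in it is assumed for illustration: the function name, the hidden ‘motivation’ trait, and the numbers used for exam scores. The key point is that the quiz is given a true effect of zero by default, yet the students who chose to take it still score higher on average.

```python
import random


def simulate_optional_quiz(n_students=10_000, quiz_effect=0.0, seed=0):
    """Return mean exam score of quiz-takers minus mean score of non-takers."""
    rng = random.Random(seed)
    took, skipped = [], []
    for _ in range(n_students):
        # Hidden trait that we cannot observe directly (hypothetical).
        motivation = rng.gauss(0, 1)
        # Keener students are more likely to attempt the optional quiz.
        takes_quiz = motivation + rng.gauss(0, 1) > 0
        # Exam score depends on motivation and noise, plus the quiz's
        # true effect, which is zero unless the caller says otherwise.
        score = 60 + 8 * motivation + rng.gauss(0, 5)
        if takes_quiz:
            score += quiz_effect
            took.append(score)
        else:
            skipped.append(score)
    return sum(took) / len(took) - sum(skipped) / len(skipped)
```

With the default `quiz_effect=0.0`, `simulate_optional_quiz()` should still return a clearly positive gap of several marks. That gap comes entirely from who chose to do the quiz, not from anything the quiz did.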

One of the papers I’ve just mentioned tried to justify the causal claim by saying that the quiz was particularly effective for “weaker” students. The trouble is that a little investigation showed me that this claim made the logical inconsistency even worse! Firstly, it assumed that weaker students are less well motivated. That may be true, but no proof was offered. Secondly, I was puzzled about where the data came from and discovered that the author was using the score on the first quiz that a student did, be that formative or summative, as an indicator of student ability. But students try harder when the mark counts, and their mark on a summative assignment is very likely to be higher for that reason alone. The whole argument is flawed. Oh dear…

I am deliberately not naming the papers I’m referring to, partly because I’m not brave enough and partly because I fear there are many similar cases out there. Please can we all try a little harder not to make claims unless we are sure that we can justify them.


### 3 Responses to Victorian clergymen

1. Toni Soto says:

Hi Sally,

I’m a Spanish Secondary Science teacher eager to read every post you write on your blog. I know (correct me if I’m wrong) that you’re an expert in e-assessment, especially in the Science and Maths fields. That’s why I follow you on Twitter and read your blog.

I fully agree with “Correlation does not imply causality” but at the same time I wonder how difficult it is to isolate variables to test a hypothesis when dealing with human behaviour in general and with human learning in particular.

I’ve been using Moodle as an extra tool to improve my teaching quality for some years. In my opinion, quizzing is a powerful tool for improving learning, especially when it can provide ‘good’ feedback in real time. When I began to use Moodle, quizzes were used for summative assessment, but for some years I have been spending time improving my question bank with useful feedback for the correct treatment of mistakes. I’m very happy with this ‘migration’ from a summative to a formative assessment strategy.

At the same time I realize that Moodle is collecting tons of data about the learning activity of our students. The term ‘Learning analytics’ is everywhere and it’s true that there are mountains of data to dig into and search for patterns to improve the teaching-learning processes.

My question is: What areas/fields in ‘Learning analytics’ can we define so that we could safely isolate variables without being afraid of ‘Correlation does not imply causality’? Where to focus? I’d like to ‘research’ (not as a professional researcher …I’m a teacher) in this area because I’ve got data and I can design new activities to get more ‘useful’ data.

Forgive me if my question makes no sense at all. It came to my mind immediately after reading your post and I decided to leave a reply here.

Anyway, thank you again for sharing your knowledge and keep posting… I’ll keep reading!

Best regards from Vigo (Spain)

Toni Soto

2. Tim Hunt says:

How to design experiments to minimise the chance that confounding variables prevent you from detecting true correlations is something that the medical profession worked out some decades ago. They use double-blind randomised controlled trials. The double-blind aspect is easy for pills. Not so easy for physical therapies like massage, say – and education suffers from that problem.

Anyway, a really good introduction to this subject (and a really good read for other reasons) is ‘Bad Science’ by Ben Goldacre. The sort-of follow-up book ‘Bad Pharma’ is also good.

3. Sally Jordan says:

Thank you Toni and Tim for your insightful comments. I don’t really feel that I’m an expert on anything – I come at all of this from a standpoint of wanting to do the best for our students and also of honestly evaluating what we do. So, all I can offer are some further ramblings (and you have given me ideas for future posts!).

As I said on Twitter, we certainly must not be put off from attempting robust evaluation. If there is one thing that annoys me more than people claiming causal relationships without evidence, it is people assuming that what students say they do is the same as what they actually do! That’s where learning analytics has such strength – you can monitor actual engagement at either the cohort or the individual student level, and use it to improve learning. I don’t know if you have seen:
Redecker, C., & Johannessen, Ø. (2013) Changing Assessment—Towards a New Assessment Paradigm Using ICT. European Journal of Education, 48(1), 79-96.
I find what they say very exciting – more on that to follow!

Returning to evaluation of our current practice, the issue is that teachers (including me) tend to feel that, when we make a change in our practice it is somehow morally wrong to only make it available to half of the group. So, if the new activity is not compulsory, we have the situation where the keen students do the new activity and also do better in the exam. I don’t think it is really fair to compare the behaviour of these two groups of students. If we are lucky (especially if the activity is compulsory) then we may get an overall increase in performance from one year to the next. You can never be absolutely sure that this has not come about for some other reason, but it is a more reliable measure than comparing the students who do a non-compulsory activity with those who don’t. I do think Moodle quizzes can improve engagement and therefore lead to improvements.

Tim is right that the best option would be to use proper randomised testing – half the group gets offered the quiz and half doesn’t. Some of the work that has evaluated the ‘Testing effect’ has used that approach (half the group got a quiz, half got to read the book for longer) but even this work has mostly taken place in a ‘laboratory’ rather than a proper classroom. Real education is messy. But it can’t be as messy as medicine and they make it work. Lots to think about here!
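The randomised design described above (half the group gets offered the quiz and half doesn’t) can be sketched as a simulation. As before, this is only an illustration under assumed numbers: the function name, the hidden ‘motivation’ trait and the score model are all hypothetical. Because a coin flip decides who gets the quiz, motivation is balanced across the two groups on average, so the difference in mean scores estimates the quiz’s true effect.

```python
import random


def simulate_randomised_quiz(n_students=10_000, quiz_effect=0.0, seed=0):
    """Return mean exam score of the quiz group minus the control group."""
    rng = random.Random(seed)
    quiz_group, control_group = [], []
    for _ in range(n_students):
        # Hidden trait that still affects exam performance (hypothetical).
        motivation = rng.gauss(0, 1)
        # Allocation is by coin flip, so it cannot depend on motivation.
        gets_quiz = rng.random() < 0.5
        score = 60 + 8 * motivation + rng.gauss(0, 5)
        if gets_quiz:
            score += quiz_effect
            quiz_group.append(score)
        else:
            control_group.append(score)
    return (sum(quiz_group) / len(quiz_group)
            - sum(control_group) / len(control_group))
```

With `quiz_effect=0.0` the returned difference should be close to zero, and with `quiz_effect=3.0` it should be close to 3, up to sampling noise. The sketch assumes every student in the quiz group actually does the quiz, which a real classroom would not guarantee.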