Although NLP techniques exist for measuring statistical similarity between texts (particularly longer documents), I think that some semantic reasoning is required for more fine-grained analyses. I am interested in investigating systems that assign very lightweight representations to texts (such as Robust Minimal Recursion Semantics) and that can then be used to find the concepts shared by texts which are stylistically very different: for example, an article in a popular science magazine such as Scientific American and its source material in the original scientific journal (e.g. PNAS).
You should have a good undergraduate or master's degree in Computing or a related subject, and a strong interest in how computers can be used to understand natural language. Some understanding of symbolic machine learning or automated reasoning would also be a good foundation for this project.
This project could also be suitable if you come from an area such as linguistics, provided you can show sufficiently advanced programming skills.
The techniques used in the following paper could form the basis for this project:
Willis, Alistair (2015). Using NLP to support scalable assessment of short free text responses. In: Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 243–253. https://oro.open.ac.uk/43458/