City Research Online

Automatic language ability assessment method based on natural language processing

Nnamoko, N. ORCID: 0000-0002-5064-2621, Karaminis, T. ORCID: 0000-0003-2977-5451, Procter, J. , Barrowclough, J. ORCID: 0000-0003-0902-5098 & Korkontzelos, I. ORCID: 0000-0001-8052-2471 (2024). Automatic language ability assessment method based on natural language processing. Natural Language Processing Journal, 8, article number 100094. doi: 10.1016/j.nlp.2024.100094

Abstract

Background and Objectives:
The Wechsler Abbreviated Scales of Intelligence second edition (WASI-II) is a standardised assessment tool that is widely used to assess cognitive ability in clinical, research, and educational settings. In one of the components of this assessment, referred to as the Vocabulary task, the assessed individuals are presented with words (called stimulus items), and asked to explain what each word means. Their responses are hand-scored based on a list of pre-rated sample responses [0-Point (poor), 1-Point (moderate), or 2-Point (excellent)] that is provided in the accompanying manual of WASI-II. This scoring method is time-consuming, and scoring of responses that do not fully match the pre-rated ones may vary between individual scorers. In this study, we aim to use natural language processing techniques to automate the scoring procedure and make it more time-efficient and reliable (objective).

Methods:
Utilising five different word embeddings (Word2vec, Global Vectors, Bidirectional Encoder Representations from Transformers, Generative Pre-trained Transformer 2, and Embeddings from Language Models), we transformed stimulus items and pre-rated responses from the WASI-II Vocabulary task into machine-readable vectors. We measured distance with cosine similarity, evaluating each model against a rational-expectations hypothesis that vector representations for stimuli should align closely with 2-Point responses and diverge from 0-Point responses. Assessment involved frequency of consistent representation and the Pearson correlation coefficient, examining overall consistency with the manual’s ranking across all items and sample responses.
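
The evaluation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors are made-up stand-ins for real embeddings, the helper names (`cosine_similarity`, `pearson`, `is_consistent`) are hypothetical, and the consistency rule used here (mean similarity to 2-Point responses exceeds mean similarity to 0-Point responses) is an assumed reading of the hypothesis stated in the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def is_consistent(stimulus_vec, responses):
    """Assumed rule: 2-Point responses sit closer to the stimulus
    than 0-Point responses, on average.

    `responses` is a list of (vector, manual_score) pairs.
    """
    def mean_sim(score):
        sims = [cosine_similarity(stimulus_vec, vec)
                for vec, s in responses if s == score]
        return sum(sims) / len(sims)

    return mean_sim(2) > mean_sim(0)


# Toy data: hypothetical vectors, NOT taken from any embedding model
# or from the WASI-II manual.
stimulus = [1.0, 0.0, 0.0]
responses = [
    ([0.9, 0.1, 0.0], 2),  # stands in for a 2-Point sample response
    ([0.5, 0.5, 0.0], 1),  # stands in for a 1-Point sample response
    ([0.0, 0.2, 1.0], 0),  # stands in for a 0-Point sample response
]

consistent = is_consistent(stimulus, responses)
manual_scores = [score for _, score in responses]
similarities = [cosine_similarity(stimulus, vec) for vec, _ in responses]
r = pearson(manual_scores, similarities)

print("consistent with manual ranking:", consistent)
print("Pearson r between manual scores and similarities:", round(r, 3))
```

In a real setting the toy vectors would be replaced by embeddings of the stimulus word and of each sample response. Counting how many stimulus items satisfy the consistency rule would give a frequency measure of the kind reported in the Results.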

Results:
The Word2vec model showed the highest consistency with the WASI-II manual (frequency = 20 out of 27; Pearson correlation coefficient = 0.61) while Bidirectional Encoder Representations from Transformers was the worst performing model (frequency = 5 out of 27; Pearson correlation coefficient = 0.05). The consistency of these two models with the WASI-II manual differed significantly, Z = 2.282, p = 0.022.
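
The abstract does not name the test behind the reported Z and p values. As a hedged worked example, the figures are numerically consistent with a two-tailed Fisher r-to-z comparison of two correlations, under the assumptions that each correlation is based on n = 27 items and that the two samples are treated as independent. The function name `fisher_z_compare` is hypothetical.

```python
import math


def fisher_z_compare(r1, n1, r2, n2):
    """Two-tailed Fisher r-to-z test for the difference between two
    correlation coefficients, treating the samples as independent."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-tailed normal p-value
    return z, p


# Correlations from the abstract; n = 27 for both is an assumption.
z, p = fisher_z_compare(0.61, 27, 0.05, 27)
print(f"Z = {z:.3f}, p = {p:.3f}")
```

Under these assumptions the calculation reproduces the reported values to three decimal places, which suggests, but does not confirm, that a test of this kind was used.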

Conclusions:
Our results showed that the scoring of the WASI-II Vocabulary task can be automated with moderate accuracy relying upon off-the-shelf embedding models. These results are promising, and could be improved further by considering alternative vector dimensions, similarity metrics, and data preprocessing techniques to those used in this study.

Publication Type: Article
Publisher Keywords: Cognitive assessment, Natural Language Processing, Language ability test, Cosine similarity, WASI-II, Word embedding
Subjects: B Philosophy. Psychology. Religion > BF Psychology
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
R Medicine > RC Internal medicine > RC0321 Neuroscience. Biological psychiatry. Neuropsychiatry
Departments: School of Health & Psychological Sciences
School of Health & Psychological Sciences > Psychology
Text - Published Version
Available under License Creative Commons: Attribution International Public License 4.0.
