Korochkina, Maria, Marelli, Marco and Rastle, Kathleen (2026). Morphemes in the wild: Modelling affix learning from the noisy landscape of natural text. Journal of Memory and Language, 148.
Abstract
Morphological knowledge serves as a powerful heuristic for vocabulary growth and contributes significantly to the speed and efficiency of reading. While research has long sought to explain how the knowledge of derivational morphology is acquired, previous approaches have struggled to capture the nuanced and complex ways in which derivational morphemes are used in written language, particularly that these morphemes contribute to meaning in a graded manner and that noise introduced by misleading forms (e.g., deliver) can impede learning. Our approach builds on earlier insights but moves beyond them by combining a large-scale analysis of vocabulary used in 1,200 popular books with computational modelling to explore how learning of derivational affixes may occur from text containing naturally occurring noise. We use a compositional distributional semantic model to investigate what can be learned about the meanings of individual English prefixes and suffixes through reading and evaluate the model’s performance against data from 120 adults in a lexical processing task. Our findings demonstrate that, despite the presence of noise, natural text contains sufficient structure to support the extraction of core affix semantics, and that readers are attuned to the complex patterns that shape affix use in the wild. This work contributes a new dimension to a more principled and psychologically grounded account of morpheme learning, and we discuss both this contribution and the broader insights it offers for language research.
| Publication DOI: | https://doi.org/10.1016/j.jml.2026.104746 |
|---|---|
| Divisions: | College of Health & Life Sciences > School of Psychology; College of Health & Life Sciences > Aston Institute of Health & Neurodevelopment (AIHN); College of Health & Life Sciences; Aston University (General) |
| Funding Information: | MK and KR were supported by a research grant from the Economic and Social Research Council, United Kingdom (ES/W002310/1). MM was supported by a research grant from the European Union (ERC-COG-2022, BraveNewWord, 101087053). The views and opinions expressed in this article are those of the authors only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. |
| Additional Information: | Copyright © 2026 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/). |
| Uncontrolled Keywords: | Computational modelling, Distributional semantics, Learning, Lexical statistics, Morphology, Popular books, Reading, Neuropsychology and Physiological Psychology, Language and Linguistics, Experimental and Cognitive Psychology, Linguistics and Language, Artificial Intelligence |
| Publication ISSN: | 1096-0821 |
| Data Access Statement: | All data, code, and materials associated with this article are available on this project’s page on the Open Science Framework: https://osf.io/sf2bh/ |
| Last Modified: | 03 Feb 2026 08:07 |
| Date Deposited: | 03 Feb 2026 08:07 |
| Related URLs: | https://osf.io/sf2bh (Related URL); https://linking ... 749596X26000161 (Publisher URL); http://www.scop ... tnerID=8YFLogxK (Scopus URL) |
| PURE Output Type: | Article |
| Published Date: | 2026-04-01 |
| Published Online Date: | 2026-01-17 |
| Accepted Date: | 2026-01-13 |
| Authors: | Korochkina, Maria (0000-0002-8017-7855); Marelli, Marco; Rastle, Kathleen |