Exploring Part of Speech (POS)-tag sequences in a large-scale learner corpus of L2 English: A developmental perspective

Abstract

This research explores the POS-tag sequences that shape the transition from upper-intermediate (CEFR B2) to near-native proficiency (CEFR C2) in a corpus of essays (n=32,410) from the Cambridge Learner Corpus. Gilquin (2018) and others have shown that POS-tag sequences offer a holistic approach to extracting the most commonly used patterns without starting from an a priori set of words and word sequences. Using corpus linguistics informed by usage-based theories of language learning, this paper examines the frequency and distribution of 4-slot POS-tag sequences in L2 English writing, drawing on the taxonomy of pattern grammar (Francis et al. 1996, 1998; Hunston & Francis, 2000). Findings point to the presence of both core and emergent POS-tag sequences in learner language at the two proficiency levels analysed. These sequences suggest that dynamic language restructuring processes are at work as learners become more proficient and re-evaluate their understanding of frequency and distribution in English. The paper thus provides evidence of how language competence develops as proficiency increases, and contributes new evidence to our understanding of the development of L2 writing in EFL contexts.
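As a rough illustration of what counting the frequency and distribution of 4-slot POS-tag sequences can involve, the following minimal Python sketch counts contiguous four-tag sequences in texts that have already been tagged. It is not the authors' procedure: the function names, the Penn Treebank-style tags, the toy data, and the simple set-based reading of "core" (attested at both levels) and "emergent" (attested only at the higher level) are all assumptions made for the example.

```python
from collections import Counter


def pos_ngrams(tags, n=4):
    """Return every contiguous n-slot sequence in a list of POS tags."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def sequence_profile(texts, n=4):
    """Count POS-tag sequences over a collection of tagged texts.

    Returns two Counters: total occurrences of each sequence (frequency)
    and the number of texts containing it (a simple distribution measure).
    """
    freq = Counter()
    text_range = Counter()
    for tags in texts:
        grams = pos_ngrams(tags, n)
        freq.update(grams)
        text_range.update(set(grams))  # count each text at most once
    return freq, text_range


# Invented toy data: each inner list is one essay reduced to its POS tags.
b2_texts = [
    ["PRP", "VBP", "DT", "NN", "IN", "NN"],
    ["DT", "NN", "VBZ", "DT", "NN", "IN", "NN"],
]
c2_texts = [
    ["DT", "NN", "IN", "NN", "VBZ", "JJ"],
]

b2_freq, b2_range = sequence_profile(b2_texts)
c2_freq, c2_range = sequence_profile(c2_texts)

# Assumed, simplified definitions for illustration only.
core = set(b2_freq) & set(c2_freq)      # sequences attested at both levels
emergent = set(c2_freq) - set(b2_freq)  # sequences attested only at C2
```

A study at corpus scale would also normalise counts by corpus size and apply frequency or dispersion thresholds before comparing levels; the sketch omits both.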

Publication DOI: https://doi.org/10.3366/cor.2024.0297
Divisions: College of Business and Social Sciences > School of Social Sciences & Humanities > English Languages and Applied Linguistics
Additional Information: Copyright © Edinburgh University Press. This is an Accepted Manuscript of an article published by Edinburgh University Press in Corpora. The Version of Record is available online at: https://doi.org/10.3366/cor.2024.0297
Uncontrolled Keywords: CEFR, language proficiency development, learner corpora, pattern grammar, POS tag sequence, POS n-grams, usage-based
Publication ISSN: 1755-1676
Last Modified: 25 Nov 2024 08:39
Date Deposited: 05 May 2023 08:25
Related URLs: https://www.eup ... 6/cor.2024.0297 (Publisher URL)
http://www.scop ... tnerID=8YFLogxK (Scopus URL)
PURE Output Type: Article
Published Date: 2024-04
Accepted Date: 2023-02-07
Authors: Lim, Joyce Dong Ok (ORCID Profile 0000-0003-3779-4521)
Mark, Geraldine
Pérez-Paredes, Pascual
O'Keeffe, Anne

Version: Accepted Version
