Investigating coreference use in L2 German: A corpus study

Abstract

Coreference use influences writing fluency in a second language (L2) (Tian et al., 2021), text comprehension (e.g., Berkemeyer, 1994), and the accuracy of a text (e.g., Bui, 2022). Cohesion in general, and coreference use more specifically, is affected by language proficiency (Yang & Sun, 2012) as well as by the native language (L1) of an L2 writer, since students tend to rely on L1 strategies to create cohesive texts, and these may differ from the strategies used in the L2 (Roberts et al., 2008). This challenge has been documented in studies of L2 English (e.g., Grüter et al., 2017; He, 2020; Kang, 2009). For example, He (2020) investigated the use of coreference by L1 Chinese writers and found differences compared with L1 English texts, such as an underuse of demonstrative references. In stark contrast to L2 English, research on coreference in L2 German has been scarce to date. The few existing studies focus on specific coreference types, such as possessives (e.g., Fabricius-Hansen et al., 2021) and pronominal adverbs (Belz, 2005; Strobl, 2019). No study to date gives a quantitative overview of L2 German coreference use that covers, among other aspects, reference types and coreferential relations.

Our study aims to fill this gap by furthering research into coreference in L2 German writing, focusing on L2 writers with L1 Dutch. The analysis is based on the Belgisches Deutschkorpus (Beldeko) (Strobl & Wedig, 2023). Beldeko consists of 301 summaries written by advanced students of L2 German in an academic writing course. The corpus has been pre-processed and automatically annotated with part-of-speech tags and lemmas. Coreference was manually annotated with a newly developed annotation system that combines categories from different frameworks (e.g., Becher, 2011; Kunz, 2010; Reznicek, 2013), such as antecedent types, coreferential expressions, degree of coreference explicitness, and coreferential relations. This exhaustive annotation system also makes it possible to calculate the length of coreference chains.

The analysis of the corpus in the statistical software R revealed pronouns to be the most frequently used type of reference (34%), closely followed by repetitions of proper nouns (31%). Demonstrative pronouns were used less often. Most coreferential relations were anaphoric, with almost no cataphoric relations found in the corpus. Additionally, most coreferential relations were inter-sentential. The average coreference chain consisted of 3.2 elements. We will present the results of this first analysis of coreference use in L2 German written by L1 Dutch students, comparing them with coreference patterns in L1 and translated German (e.g., Kunz et al., 2021) and discussing potential influences of the students' L1.

References

Becher, V. (2011). Explicitation and implicitation in translation: A corpus-based study of English-German and German-English translations of business texts. Unpublished doctoral dissertation, Universität Hamburg.

Belz, J. A. (2005). Corpus-driven characterizations of pronominal da-compound use by learners and native speakers of German. Die Unterrichtspraxis/Teaching German, 38(1), 44–60. https://doi.org/10.1111/j.1756-1221.2005.tb00041.x

Berkemeyer, V. C. (1994). Anaphoric Resolution and Text Comprehension for Readers of German. Die Unterrichtspraxis/Teaching German, 27(2), 15–22. https://doi.org/10.2307/3530982

Bui, H. P. (2022). Vietnamese EFL Students’ Use and Misconceptions of Cohesive Devices in Writing. SAGE Open, 12(3). https://doi.org/10.1177/21582440221126993

Fabricius-Hansen, C., Pitz, A. P., & Torgersen, H. A. T. (2021). Lexical interference in non-native resolution of possessives? Oslo Studies in Language, 12(2), 25–63. https://doi.org/10.5617/osla.8955

Grüter, T., Rohde, H., & Schafer, A. J. (2017). Coreference and discourse coherence in L2: The roles of grammatical aspect and referential form. Linguistic Approaches to Bilingualism, 7(2), 199–229. https://doi.org/10.1075/lab.15011.gru

He, Z. (2020). Cohesion in Academic Writing: A Comparison of Essays in English Written by L1 and L2 University Students. Theory and Practice in Language Studies, 10(7), 761–770. https://doi.org/10.17507/tpls.1007.06

Kang, J. Y. (2009). Referencing in a Second Language: Korean EFL Learners’ Cohesive Use of References in Written Narrative Discourse. Discourse Processes, 46(5), 439–466. https://doi.org/10.1080/01638530902959638

Kunz, K. (2010). Variation in English and German nominal coreference: a study of political essays. Peter Lang.

Kunz, K., Lapshinova-Koltunski, E., Martínez, J. M. M., Menzel, K., & Steiner, E. (2021). GECCo - German-English Contrasts in Cohesion. De Gruyter Mouton. https://doi.org/10.1515/9783110711073

Reznicek, M. (2013). Linguistische Annotation von Nichtstandardvarietäten — Guidelines und „Best Practices”: Guidelines Koreferenz: Version 1.01. F-AG 7: Angewandte Sprachwissenschaft, Computerlinguistik Kurationsprojekt 2. https://www.linguistik.hu-berlin.de/de/institut/professuren/korpuslinguistik/forschung/nosta-d/nosta-d-cor-1.1

Roberts, L., Gullberg, M., & Indefrey, P. (2008). Online pronoun resolution in L2 discourse - L1 influence and general learner effects. Studies in Second Language Acquisition, 30, 333–357. https://doi.org/10.1017/S0272263108080480

Strobl, C. (2019). Darum sind Pronominaladverbien eine Herausforderung für Deutschlerner. Germanistische Mitteilungen, 45(1&2). https://doi.org/10.33675/GM/2019/1&2/11

Strobl, C., & Wedig, H. (2023). Beldeko Summary Corpus v1.1.0. http://hdl.handle.net/20.500.12124/68

Tian, Y., Kim, M., Crossley, S., & Wan, Q. (2021). Cohesive devices as an indicator of L2 students’ writing fluency. Reading and Writing. https://doi.org/10.1007/s11145-021-10229-3

Yang, W., & Sun, Y. (2012). The use of cohesive devices in argumentative writing by Chinese EFL learners at different proficiency levels. Linguistics and Education, 23(1), 31–48. https://doi.org/10.1016/j.linged.2011.09.004