
An investigation into the rater cognition of novice raters and the impact of cognitive attributes when assessing speaking

Research output: Thesis › Doctoral Thesis

Published

Standard

An investigation into the rater cognition of novice raters and the impact of cognitive attributes when assessing speaking. / Eberharter, Kathrin.
Lancaster University, 2021. 412 p.

Vancouver

Eberharter K. An investigation into the rater cognition of novice raters and the impact of cognitive attributes when assessing speaking. Lancaster University, 2021. 412 p. doi: 10.17635/lancaster/thesis/1378

Bibtex

@phdthesis{186f51b4c3b241778fd89b3d1645df41,
title = "An investigation into the rater cognition of novice raters and the impact of cognitive attributes when assessing speaking",
abstract = "Examinations of language proficiency routinely include the assessment of speaking, which still largely necessitates the use of human raters. However, variability in rating quality is a well-established phenomenon and makes rating a fundamental validity concern (Kane, 1992, 2006). Despite increased efforts to investigate rater cognition to better understand and mitigate rater effects (Bejar, 2012), research in language testing is yet to fully engage with the field of decision research (Baker, 2012; Purpura, 2013). Findings from this literature emphasize how complex decision tasks are shaped by factors such as processing capacities, perception, deliberate and automated thinking, and metacognitive control (Newell and Br{\"o}der, 2008). The purpose of this study was to investigate how novice raters use an analytic rating scale and to explore whether decision-making style, cognitive style, working memory capacity and executive function influence rating quality and rating behaviour. 39 pre-service English teachers rated a set of speaking performances (N=30) and completed two psychological questionnaires as well as a battery of cognitive tests. Rating behaviours were captured through JavaScript embedded in the online rating form. Data analysis first established measures of rating quality and scale use through a series of Many-Facets Rasch Measurement (MFRM) analyses. Next, relationships between individual attributes and measures of rater quality and behaviour were explored in a series of correlational analyses. Finally, the handwritten notes and self-report data from four selected raters were accumulated and explored to further enhance understanding of the rating process. Findings showed that there were considerable individual differences among the raters regarding rating quality and behaviours. Of all the variables included, decision-making style displayed the strongest associations with rating quality and behaviour, suggesting a relationship between intuitive and flexible processing and more successful rating. The four case studies highlighted a need to address cognitive load and directing of attention in rater training for speaking assessment.",
author = "Kathrin Eberharter",
year = "2021",
doi = "10.17635/lancaster/thesis/1378",
language = "English",
publisher = "Lancaster University",
school = "Lancaster University",
}

RIS

TY - THES

T1 - An investigation into the rater cognition of novice raters and the impact of cognitive attributes when assessing speaking

AU - Eberharter, Kathrin

PY - 2021

Y1 - 2021

N2 - Examinations of language proficiency routinely include the assessment of speaking, which still largely necessitates the use of human raters. However, variability in rating quality is a well-established phenomenon and makes rating a fundamental validity concern (Kane, 1992, 2006). Despite increased efforts to investigate rater cognition to better understand and mitigate rater effects (Bejar, 2012), research in language testing is yet to fully engage with the field of decision research (Baker, 2012; Purpura, 2013). Findings from this literature emphasize how complex decision tasks are shaped by factors such as processing capacities, perception, deliberate and automated thinking, and metacognitive control (Newell and Bröder, 2008). The purpose of this study was to investigate how novice raters use an analytic rating scale and to explore whether decision-making style, cognitive style, working memory capacity and executive function influence rating quality and rating behaviour. 39 pre-service English teachers rated a set of speaking performances (N=30) and completed two psychological questionnaires as well as a battery of cognitive tests. Rating behaviours were captured through JavaScript embedded in the online rating form. Data analysis first established measures of rating quality and scale use through a series of Many-Facets Rasch Measurement (MFRM) analyses. Next, relationships between individual attributes and measures of rater quality and behaviour were explored in a series of correlational analyses. Finally, the handwritten notes and self-report data from four selected raters were accumulated and explored to further enhance understanding of the rating process. Findings showed that there were considerable individual differences among the raters regarding rating quality and behaviours. Of all the variables included, decision-making style displayed the strongest associations with rating quality and behaviour, suggesting a relationship between intuitive and flexible processing and more successful rating. The four case studies highlighted a need to address cognitive load and directing of attention in rater training for speaking assessment.

AB - Examinations of language proficiency routinely include the assessment of speaking, which still largely necessitates the use of human raters. However, variability in rating quality is a well-established phenomenon and makes rating a fundamental validity concern (Kane, 1992, 2006). Despite increased efforts to investigate rater cognition to better understand and mitigate rater effects (Bejar, 2012), research in language testing is yet to fully engage with the field of decision research (Baker, 2012; Purpura, 2013). Findings from this literature emphasize how complex decision tasks are shaped by factors such as processing capacities, perception, deliberate and automated thinking, and metacognitive control (Newell and Bröder, 2008). The purpose of this study was to investigate how novice raters use an analytic rating scale and to explore whether decision-making style, cognitive style, working memory capacity and executive function influence rating quality and rating behaviour. 39 pre-service English teachers rated a set of speaking performances (N=30) and completed two psychological questionnaires as well as a battery of cognitive tests. Rating behaviours were captured through JavaScript embedded in the online rating form. Data analysis first established measures of rating quality and scale use through a series of Many-Facets Rasch Measurement (MFRM) analyses. Next, relationships between individual attributes and measures of rater quality and behaviour were explored in a series of correlational analyses. Finally, the handwritten notes and self-report data from four selected raters were accumulated and explored to further enhance understanding of the rating process. Findings showed that there were considerable individual differences among the raters regarding rating quality and behaviours. Of all the variables included, decision-making style displayed the strongest associations with rating quality and behaviour, suggesting a relationship between intuitive and flexible processing and more successful rating. The four case studies highlighted a need to address cognitive load and directing of attention in rater training for speaking assessment.

U2 - 10.17635/lancaster/thesis/1378

DO - 10.17635/lancaster/thesis/1378

M3 - Doctoral Thesis

PB - Lancaster University

ER -