Electronic data

  • Lestari & Brunfaut 2023

    Final published version, 2.28 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License



Operationalizing the reading-into-writing construct in analytic rating scales: Effects of different approaches on rating

Research output: Contribution to Journal/Magazine > Journal article > peer-review

Journal publication date: 31/07/2023
Journal: Language Testing
Issue number: 3
Number of pages: 39
Pages (from-to): 684-722
Publication status: Published
Early online date: 20/03/2023
Original language: English


Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating scales may affect rating quality, and by extension score inferences and uses. Using two different analytic rating scales as proxies for two approaches to reading-into-writing construct operationalization, this study investigated the extent to which these approaches affect rating reliability and consistency. Twenty raters rated a set of reading-into-writing performances twice, each time using a different analytic rating scale, and completed post-rating questionnaires. The findings resulting from our convergent explanatory mixed-method research design show that both analytic rating scales functioned well, further supporting the use of analytic rating scales for scoring reading-into-writing. Raters reported that either type of analytic rating scale prompted them to attend to the reading-related aspects of reading-into-writing, although rating these aspects remained more challenging than judging writing-related aspects. The two scales differed, however, in the extent to which they led raters to uniform interpretations of performance difficulty levels. This study has implications for reading-into-writing scale design and rater training.