Final published version, 3.33 MB, PDF document
Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
Final published version
Licence: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Annotating speaker stance in discourse
T2 - the Brexit Blog Corpus
AU - Simaki, Vasiliki
AU - Paradis, Carita
AU - Skeppstedt, Maria
AU - Sahlgren, Magnus
AU - Kucher, Kostiantyn
AU - Kerren, Andreas
PY - 2020/10/1
Y1 - 2020/10/1
N2 - The aim of this study is to explore the possibility of identifying speaker stance in discourse, provide an analytical resource for it and an evaluation of the level of agreement across speakers. We also explore to what extent language users agree about what kind of stances are expressed in natural language use or whether their interpretations diverge. In order to perform this task, a comprehensive cognitive-functional framework of ten stance categories was developed based on previous work on speaker stance in the literature. A corpus of opinionated texts was compiled, the Brexit Blog Corpus (BBC). An analytical protocol and interface (Active Learning and Visual Analytics) for the annotations was set up and the data were independently annotated by two annotators. The annotation procedure, the annotation agreements and the co-occurrence of more than one stance in the utterances are described and discussed. The careful, analytical annotation process has returned satisfactory inter- and intra-annotation agreement scores, resulting in a gold standard corpus, the final version of the BBC.
AB - The aim of this study is to explore the possibility of identifying speaker stance in discourse, provide an analytical resource for it and an evaluation of the level of agreement across speakers. We also explore to what extent language users agree about what kind of stances are expressed in natural language use or whether their interpretations diverge. In order to perform this task, a comprehensive cognitive-functional framework of ten stance categories was developed based on previous work on speaker stance in the literature. A corpus of opinionated texts was compiled, the Brexit Blog Corpus (BBC). An analytical protocol and interface (Active Learning and Visual Analytics) for the annotations was set up and the data were independently annotated by two annotators. The annotation procedure, the annotation agreements and the co-occurrence of more than one stance in the utterances are described and discussed. The careful, analytical annotation process has returned satisfactory inter- and intra-annotation agreement scores, resulting in a gold standard corpus, the final version of the BBC.
KW - text annotation
KW - blog post texts
KW - modality
KW - evaluation
KW - positioning
U2 - 10.1515/cllt-2016-0060
DO - 10.1515/cllt-2016-0060
M3 - Journal article
JO - Corpus Linguistics and Linguistic Theory
JF - Corpus Linguistics and Linguistic Theory
SN - 1613-7027
ER -