Accepted author manuscript, 61 KB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Social differentiation in the use of English vocabulary: some analyses of the conversational component of the British National Corpus
AU - Rayson, Paul
AU - Leech, Geoffrey
AU - Hodges, Mary
PY - 1997
Y1 - 1997
N2 - In this article we undertake selective quantitative analyses of the demographically-sampled spoken English component of the British National Corpus (for brevity, referred to here as the Conversational Corpus). This is a subcorpus of c.4.5 million words, in which speakers and respondents are identified by such factors as gender, age, social group and geographical region. Using a corpus analysis tool developed at Lancaster University, we undertake a comparison of the vocabulary of speakers, highlighting those differences which are marked by a very high chi-squared value of difference between different sectors of the corpus according to gender, age and social group. A fourth variable, that of geographical region of the United Kingdom, is not investigated in this article, although it remains a promising subject for future research. (As background we also briefly examine differences between spoken and written material in the British National Corpus (BNC).) This study is illustrative of the potentiality of the Conversational Corpus for future corpus-based research on social differentiation in the use of language. There are evident limitations, including (a) the reliance on vocabulary frequency lists, and (b) the simplicity of the transcription system employed for the spoken part of the BNC. The conclusion of the article considers future advances in the research paradigm illustrated here.
AB - In this article we undertake selective quantitative analyses of the demographically-sampled spoken English component of the British National Corpus (for brevity, referred to here as the Conversational Corpus). This is a subcorpus of c.4.5 million words, in which speakers and respondents are identified by such factors as gender, age, social group and geographical region. Using a corpus analysis tool developed at Lancaster University, we undertake a comparison of the vocabulary of speakers, highlighting those differences which are marked by a very high chi-squared value of difference between different sectors of the corpus according to gender, age and social group. A fourth variable, that of geographical region of the United Kingdom, is not investigated in this article, although it remains a promising subject for future research. (As background we also briefly examine differences between spoken and written material in the British National Corpus (BNC).) This study is illustrative of the potentiality of the Conversational Corpus for future corpus-based research on social differentiation in the use of language. There are evident limitations, including (a) the reliance on vocabulary frequency lists, and (b) the simplicity of the transcription system employed for the spoken part of the BNC. The conclusion of the article considers future advances in the research paradigm illustrated here.
KW - British National Corpus
KW - spoken English vocabulary frequency
KW - chi-squared test
U2 - 10.1075/ijcl.2.1.07ray
DO - 10.1075/ijcl.2.1.07ray
M3 - Journal article
VL - 2
SP - 133
EP - 152
JO - International Journal of Corpus Linguistics
JF - International Journal of Corpus Linguistics
SN - 1569-9811
IS - 1
ER -