Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - The BE06 Corpus of British English and recent language change.
AU - Baker, Paul
PY - 2009
Y1 - 2009
N2 - This paper describes the BE06 Corpus, a one million word reference corpus of general written British English that was designed to be comparable to the Brown family of corpora. After providing a description of the Brown sampling frame, and giving the rationale for building a new corpus, the process of building the BE06 is elaborated upon, with reference to collecting previously published texts from internet sources, defining "British" authors and enabling accessibility of the corpus. Three studies of lexical frequency using BE06 and comparable corpora (LOB, FLOB and BLOB) are then carried out. These involve a comparison of the 20 most frequent lexical items, an examination of pronoun usage, and an investigation of keywords derived from comparing the 1991 FLOB corpus with the BE06. The paper ends with a critical evaluation of the worth of using the same sampling frame for linguistic studies of diachronic variation.
AB - This paper describes the BE06 Corpus, a one million word reference corpus of general written British English that was designed to be comparable to the Brown family of corpora. After providing a description of the Brown sampling frame, and giving the rationale for building a new corpus, the process of building the BE06 is elaborated upon, with reference to collecting previously published texts from internet sources, defining "British" authors and enabling accessibility of the corpus. Three studies of lexical frequency using BE06 and comparable corpora (LOB, FLOB and BLOB) are then carried out. These involve a comparison of the 20 most frequent lexical items, an examination of pronoun usage, and an investigation of keywords derived from comparing the 1991 FLOB corpus with the BE06. The paper ends with a critical evaluation of the worth of using the same sampling frame for linguistic studies of diachronic variation.
KW - corpus building
U2 - 10.1075/ijcl.14.3.02bak
DO - 10.1075/ijcl.14.3.02bak
M3 - Journal article
VL - 14
SP - 312
EP - 337
JO - International Journal of Corpus Linguistics
JF - International Journal of Corpus Linguistics
SN - 1569-9811
IS - 3
ER -