Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Short term diachronic shifts in part-of-speech frequencies: a comparison of the tagged LOB and F-LOB corpora.
AU - Mair, C.
AU - Hundt, M.
AU - Leech, Geoffrey
AU - Smith, N.
PY - 2003
Y1 - 2003
N2 - The paper presents a comparison of tag frequencies in two matching one-million word reference corpora of British standard English, the 1961 LOB-corpus and its 1991 “clone” produced at Freiburg. Both corpora were tagged using a version of the CLAWS part-of-speech-tagger developed at Lancaster, and part of the material was post-edited manually in Freiburg to assess the accuracy of the automatic procedure. The comparison of tag frequencies is an essential complement to work on recent linguistic change carried out on the untagged material, because this work has been based on the – so far unverified – assumption that tag frequencies have remained constant over the thirty-year period in question. In addition, the paper discusses some common and partly contradictory claims about the prevalence of a “nominal” style in present-day written English. It is shown that while part-of-speech frequencies have not remained constant over the period investigated, the shifts are usually not big enough to invalidate the results obtained in analyses of the untagged material. With regard to style, the material shows a significant rise in the frequency of nouns, which, however, is not paralleled by a corresponding decrease in verbs.
AB - The paper presents a comparison of tag frequencies in two matching one-million word reference corpora of British standard English, the 1961 LOB-corpus and its 1991 “clone” produced at Freiburg. Both corpora were tagged using a version of the CLAWS part-of-speech-tagger developed at Lancaster, and part of the material was post-edited manually in Freiburg to assess the accuracy of the automatic procedure. The comparison of tag frequencies is an essential complement to work on recent linguistic change carried out on the untagged material, because this work has been based on the – so far unverified – assumption that tag frequencies have remained constant over the thirty-year period in question. In addition, the paper discusses some common and partly contradictory claims about the prevalence of a “nominal” style in present-day written English. It is shown that while part-of-speech frequencies have not remained constant over the period investigated, the shifts are usually not big enough to invalidate the results obtained in analyses of the untagged material. With regard to style, the material shows a significant rise in the frequency of nouns, which, however, is not paralleled by a corresponding decrease in verbs.
U2 - 10.1075/ijcl.7.2.05mai
DO - 10.1075/ijcl.7.2.05mai
M3 - Journal article
VL - 7
SP - 245
EP - 264
JO - International Journal of Corpus Linguistics
JF - International Journal of Corpus Linguistics
SN - 1569-9811
IS - 2
ER -