Electronic data

  • Applied Linguistics-2015-Brezina-1-22

    Rights statement: © Oxford University Press 2013 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

    Final published version, 190 KB, PDF document

    Available under license: CC BY

Links

Text available via DOI: 10.1093/applin/amt018

Is there a core general vocabulary?: introducing the New General Service List

Research output: Contribution to journal › Journal article

Published

Standard

Is there a core general vocabulary? introducing the New General Service List. / Brezina, Vaclav; Gablasova, Dana.

In: Applied Linguistics, Vol. 36, No. 1, 2015, p. 1-22.

Bibtex

@article{25d4d43382824b7f93beb8f42b4fc021,
title = "Is there a core general vocabulary?: introducing the New General Service List",
abstract = "The current study presents a New General Service List (new-GSL), which is a result of robust comparison of four language corpora (LOB, BNC, BE06, and EnTenTen12) of the total size of over 12 billion running words. The four corpora were selected to represent a variety of corpus sizes and approaches to representativeness and sampling. In particular, the study investigates the lexical overlap among the corpora in the top 3,000 words based on the average reduced frequency (ARF), which is a measure that takes into consideration both frequency and dispersion of lexical items. The results show that there exists a stable vocabulary core of 2,122 items (70.7%) among the four corpora. Moreover, these vocabulary items occur with comparable ranks in the individual wordlists. In producing the new-GSL, the core vocabulary items were combined with new items frequently occurring in the corpora representing current language use (BE06 and EnTenTen12). The final product of the study, the new-GSL, consists of 2,494 lemmas and covers between 80.1 and 81.7 per cent of the text in the source corpora.",
author = "Vaclav Brezina and Dana Gablasova",
note = "{\textcopyright} Oxford University Press 2013 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.",
year = "2015",
doi = "10.1093/applin/amt018",
language = "English",
volume = "36",
pages = "1--22",
journal = "Applied Linguistics",
issn = "0142-6001",
publisher = "Oxford University Press",
number = "1"
}

RIS

TY - JOUR

T1 - Is there a core general vocabulary?

T2 - introducing the New General Service List

AU - Brezina, Vaclav

AU - Gablasova, Dana

N1 - © Oxford University Press 2013 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

PY - 2015

Y1 - 2015

AB - The current study presents a New General Service List (new-GSL), which is a result of robust comparison of four language corpora (LOB, BNC, BE06, and EnTenTen12) of the total size of over 12 billion running words. The four corpora were selected to represent a variety of corpus sizes and approaches to representativeness and sampling. In particular, the study investigates the lexical overlap among the corpora in the top 3,000 words based on the average reduced frequency (ARF), which is a measure that takes into consideration both frequency and dispersion of lexical items. The results show that there exists a stable vocabulary core of 2,122 items (70.7%) among the four corpora. Moreover, these vocabulary items occur with comparable ranks in the individual wordlists. In producing the new-GSL, the core vocabulary items were combined with new items frequently occurring in the corpora representing current language use (BE06 and EnTenTen12). The final product of the study, the new-GSL, consists of 2,494 lemmas and covers between 80.1 and 81.7 per cent of the text in the source corpora.

U2 - 10.1093/applin/amt018

DO - 10.1093/applin/amt018

M3 - Journal article

VL - 36

SP - 1

EP - 22

JO - Applied Linguistics

JF - Applied Linguistics

SN - 0142-6001

IS - 1

ER -