Electronic data

  • Hanks_et_al_Abstract

    Accepted author manuscript, 76.9 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Building LANA-CASE, a spoken corpus of American English conversation: Challenges and innovations in corpus compilation

Research output: Contribution to Journal/Magazine > Journal article > peer-review

E-pub ahead of print

Standard

Building LANA-CASE, a spoken corpus of American English conversation: Challenges and innovations in corpus compilation. / Hanks, Elizabeth; McEnery, Anthony; Egbert, Jesse et al.
In: Research in Corpus Linguistics, Vol. 12, No. 2, 03.03.2024, p. 24-44.

Vancouver

Hanks E, McEnery A, Egbert J, Larsson T, Biber D, Reppen R et al. Building LANA-CASE, a spoken corpus of American English conversation: Challenges and innovations in corpus compilation. Research in Corpus Linguistics. 2024 Mar 3;12(2):24-44. Epub 2024 Mar 3. doi: 10.32714/ricl.12.02.03

Author

Hanks, Elizabeth; McEnery, Anthony; Egbert, Jesse et al. / Building LANA-CASE, a spoken corpus of American English conversation: Challenges and innovations in corpus compilation. In: Research in Corpus Linguistics. 2024; Vol. 12, No. 2. pp. 24-44.

Bibtex

@article{926fe0efdd50448f9bac2ee44217ef6b,
title = "Building LANA-CASE, a spoken corpus of American English conversation: Challenges and innovations in corpus compilation",
abstract = "The Lancaster-Northern Arizona Corpus of Spoken American English (LANA-CASE) is a collaborative project between Lancaster University and Northern Arizona University to create a publicly available, large-scale corpus of American English conversation. In this article, we describe the design of LANA-CASE in terms of the challenges that have arisen and how these have been addressed – including decisions related to operationalizing the domain, sampling the data, recruiting participants, and selecting instruments for data collection. In addressing these challenges, we were able to draw on and further develop strategies established in the creation of other spoken corpora (including the British English counterpart to LANA-CASE, the Spoken British National Corpus 2014) as well as to implement recent theoretical and technical innovations related to each step. We hope that this discussion can inform future projects focused on the design and construction of spoken corpora.",
author = "Elizabeth Hanks and Anthony McEnery and Jesse Egbert and Tove Larsson and Douglas Biber and Randi Reppen and Paul Baker and Vaclav Brezina and Gavin Brookes and Isobelle Clarke and Raffaella Bottini",
year = "2024",
month = mar,
day = "3",
doi = "10.32714/ricl.12.02.03",
language = "English",
volume = "12",
pages = "24--44",
journal = "Research in Corpus Linguistics",
number = "2",

}

RIS

TY - JOUR

T1 - Building LANA-CASE, a spoken corpus of American English conversation

T2 - Challenges and innovations in corpus compilation

AU - Hanks, Elizabeth

AU - McEnery, Anthony

AU - Egbert, Jesse

AU - Larsson, Tove

AU - Biber, Douglas

AU - Reppen, Randi

AU - Baker, Paul

AU - Brezina, Vaclav

AU - Brookes, Gavin

AU - Clarke, Isobelle

AU - Bottini, Raffaella

PY - 2024/3/3

Y1 - 2024/3/3

AB - The Lancaster-Northern Arizona Corpus of Spoken American English (LANA-CASE) is a collaborative project between Lancaster University and Northern Arizona University to create a publicly available, large-scale corpus of American English conversation. In this article, we describe the design of LANA-CASE in terms of the challenges that have arisen and how these have been addressed – including decisions related to operationalizing the domain, sampling the data, recruiting participants, and selecting instruments for data collection. In addressing these challenges, we were able to draw on and further develop strategies established in the creation of other spoken corpora (including the British English counterpart to LANA-CASE, the Spoken British National Corpus 2014) as well as to implement recent theoretical and technical innovations related to each step. We hope that this discussion can inform future projects focused on the design and construction of spoken corpora.

U2 - 10.32714/ricl.12.02.03

DO - 10.32714/ricl.12.02.03

M3 - Journal article

VL - 12

SP - 24

EP - 44

JO - Research in Corpus Linguistics

JF - Research in Corpus Linguistics

IS - 2

ER -