A large-scale and PCR-referenced vocal audio dataset for COVID-19

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

A large-scale and PCR-referenced vocal audio dataset for COVID-19. / Budd, Jobie; Baker, Kieran; Karoune, Emma et al.
In: Scientific Data, Vol. 11, No. 1, 700, 27.06.2024.

Harvard

Budd, J, Baker, K, Karoune, E, Coppock, H, Patel, S, Payne, R, Tendero Cañadas, A, Titcomb, A, Hurley, D, Egglestone, S, Butler, L, Mellor, J, Nicholson, G, Kiskin, I, Koutra, V, Jersakova, R, McKendry, RA, Diggle, P, Richardson, S, Schuller, BW, Gilmour, S, Pigoli, D, Roberts, S, Packham, J, Thornley, T & Holmes, C 2024, 'A large-scale and PCR-referenced vocal audio dataset for COVID-19', Scientific Data, vol. 11, no. 1, 700. https://doi.org/10.1038/s41597-024-03492-w

APA

Budd, J., Baker, K., Karoune, E., Coppock, H., Patel, S., Payne, R., Tendero Cañadas, A., Titcomb, A., Hurley, D., Egglestone, S., Butler, L., Mellor, J., Nicholson, G., Kiskin, I., Koutra, V., Jersakova, R., McKendry, R. A., Diggle, P., Richardson, S., ... Holmes, C. (2024). A large-scale and PCR-referenced vocal audio dataset for COVID-19. Scientific Data, 11(1), Article 700. https://doi.org/10.1038/s41597-024-03492-w

Vancouver

Budd J, Baker K, Karoune E, Coppock H, Patel S, Payne R et al. A large-scale and PCR-referenced vocal audio dataset for COVID-19. Scientific Data. 2024 Jun 27;11(1):700. doi: 10.1038/s41597-024-03492-w

Author

Budd, Jobie ; Baker, Kieran ; Karoune, Emma et al. / A large-scale and PCR-referenced vocal audio dataset for COVID-19. In: Scientific Data. 2024 ; Vol. 11, No. 1.

Bibtex

@article{fd84c719b7e3435790e7efba8b5e79b1,
title = "A large-scale and PCR-referenced vocal audio dataset for COVID-19",
abstract = "The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the {\textquoteleft}Speak up and help beat coronavirus{\textquoteright} digital survey alongside demographic, symptom and self-reported respiratory condition data. Digital survey submissions were linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,565 of 72,999 participants and 24,105 of 25,706 positive cases. Respiratory symptoms were reported by 45.6% of participants. This dataset has additional potential uses for bioacoustics research, with 11.3% participants self-reporting asthma, and 27.2% with linked influenza PCR test results.",
author = "Jobie Budd and Kieran Baker and Emma Karoune and Harry Coppock and Selina Patel and Richard Payne and {Tendero Ca{\~n}adas}, Ana and Alexander Titcomb and David Hurley and Sabrina Egglestone and Lorraine Butler and Jonathon Mellor and George Nicholson and Ivan Kiskin and Vasiliki Koutra and Radka Jersakova and McKendry, {Rachel A.} and Peter Diggle and Sylvia Richardson and Schuller, {Bj{\"o}rn W.} and Steven Gilmour and Davide Pigoli and Stephen Roberts and Josef Packham and Tracey Thornley and Chris Holmes",
year = "2024",
month = jun,
day = "27",
doi = "10.1038/s41597-024-03492-w",
language = "English",
volume = "11",
journal = "Scientific Data",
issn = "2052-4463",
publisher = "Nature Publishing Group",
number = "1",
pages = "700",
}

RIS

TY - JOUR

T1 - A large-scale and PCR-referenced vocal audio dataset for COVID-19

AU - Budd, Jobie

AU - Baker, Kieran

AU - Karoune, Emma

AU - Coppock, Harry

AU - Patel, Selina

AU - Payne, Richard

AU - Tendero Cañadas, Ana

AU - Titcomb, Alexander

AU - Hurley, David

AU - Egglestone, Sabrina

AU - Butler, Lorraine

AU - Mellor, Jonathon

AU - Nicholson, George

AU - Kiskin, Ivan

AU - Koutra, Vasiliki

AU - Jersakova, Radka

AU - McKendry, Rachel A.

AU - Diggle, Peter

AU - Richardson, Sylvia

AU - Schuller, Björn W.

AU - Gilmour, Steven

AU - Pigoli, Davide

AU - Roberts, Stephen

AU - Packham, Josef

AU - Thornley, Tracey

AU - Holmes, Chris

PY - 2024/6/27

Y1 - 2024/6/27

AB - The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the ‘Speak up and help beat coronavirus’ digital survey alongside demographic, symptom and self-reported respiratory condition data. Digital survey submissions were linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,565 of 72,999 participants and 24,105 of 25,706 positive cases. Respiratory symptoms were reported by 45.6% of participants. This dataset has additional potential uses for bioacoustics research, with 11.3% participants self-reporting asthma, and 27.2% with linked influenza PCR test results.

U2 - 10.1038/s41597-024-03492-w

DO - 10.1038/s41597-024-03492-w

M3 - Journal article

VL - 11

JO - Scientific Data

JF - Scientific Data

SN - 2052-4463

IS - 1

M1 - 700

ER -