Rights statement: This is the author’s version of a work that was accepted for publication in Speech Communication. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Speech Communication, 132, 2021 DOI: 10.1016/j.specom.2021.05.006
Accepted author manuscript, 1.21 MB, PDF document
Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Articulation Rates' Inter-Correlations And Discriminating Powers In An English Speech Corpus
AU - Plug, Leendert
AU - Lennon, Robert
AU - Gold, Erica
PY - 2021/9/30
Y1 - 2021/9/30
N2 - Studies that quantify speech tempo on acoustic grounds typically use one of various rate measures. The availability of multiple measurement techniques yields ‘researcher degrees of freedom’ which call the robustness of generalisations across studies into question. However, explicit assessments of the possible impact of researchers’ choices amongst the available measures are rare. In this study we attempt such an assessment by comparing the distributions of five common rate measures―canonical and surface syllable and phone rates, and CV segment rate―calculated over fluent stretches of unscripted speech produced by 100 English speakers. We assess the measures’ inter-correlations across the corpus as a whole as well as in relevant data samples to simulate multiple analysis scenarios. We also report on deletion rates in our corpus, as they determine the relationship between canonical and surface rates; we assess the impact on rate figures of variable assumptions as to what constitutes deletion; and we compare the measures’ discriminating powers in a forensic analysis context using Bayesian likelihood ratios. Our results suggest that in a sizeable English corpus with normal deletion rates, the five rates are closely inter-correlated and have similar discriminating powers; decisions as to the segmental make-up of canonical forms also have limited impact on distributions. Therefore, for common analytical purposes and forensic applications the choice between these measures is unlikely to substantially affect outcomes.
KW - articulation rate
KW - speaker comparison
KW - correlations
U2 - 10.1016/j.specom.2021.05.006
DO - 10.1016/j.specom.2021.05.006
M3 - Journal article
VL - 132
SP - 40
EP - 54
JO - Speech Communication
JF - Speech Communication
SN - 0167-6393
ER -