Developing Speech Rhythm Analysis for Forensic Voice Comparison

Research output: Thesis › Doctoral Thesis

Published

Standard

Developing Speech Rhythm Analysis for Forensic Voice Comparison. / Carroll, Luke.
Lancaster University, 2025. 330 p.

Vancouver

Carroll L. Developing Speech Rhythm Analysis for Forensic Voice Comparison. Lancaster University, 2025. 330 p. doi: 10.17635/lancaster/thesis/2641

Bibtex

@phdthesis{b1bb54e2435249f389dfde709e895688,
title = "Developing Speech Rhythm Analysis for Forensic Voice Comparison",
abstract = "Forensic voice comparison (FVC) involves the comparison of a criminal recording (e.g., a threatening phone call), and a known suspect sample (e.g., a police interview). It is the role of an expert forensic analyst to advise the trier of fact (e.g., judge or jury) on the likelihood that the two samples include the same or different speakers. To do this, the expert will carry out an assessment of the similarity of the speech characteristics in the criminal recording and the suspect sample. Speech rhythm has been proposed as a feature that could contribute to FVC, but there is not yet a structured analysis framework that practitioners can exploit in forensic casework. When an analyst suspects a speaker{\textquoteright}s speech rhythm is relevant to an analysis, it is usually only described at an impressionistic level. Using both production and perception experiments, the present research explores whether there are acoustic and auditory cues that could capture speech rhythm and subsequently be used to discriminate between speakers in forensic casework. The production experiments revealed that there was very little discriminatory power in syllabic duration, intensity and f₀ measurements across spontaneous, content-mismatched utterances. However, there does appear to be some speaker discriminatory value in applying these same measurements to, so-called, “frequently occurring speech units” (e.g., “er”, “erm”, “yes” and “no”). The perception experiments aimed to determine whether listeners (expert and non-expert) can make meaningful speaker identification assessments when presented with delexicalised speech samples that foreground the rhythmic attributes of speech. Results revealed that expert listeners were better than non-expert listeners in making correct speaker identification assessments, with those who had expertise in forensic phonetics generally performing better than those who did not. The findings from these experiments give promise to the prospect of developing a perceptual (auditory) rhythm framework which can be used in forensic casework.",
keywords = "Speech rhythm, Forensic voice comparison, Spontaneous speech, Perceptual framework",
author = "Luke Carroll",
year = "2025",
month = feb,
day = "3",
doi = "10.17635/lancaster/thesis/2641",
language = "English",
publisher = "Lancaster University",
school = "Lancaster University",

}

RIS

TY - THES

T1 - Developing Speech Rhythm Analysis for Forensic Voice Comparison

AU - Carroll, Luke

PY - 2025/2/3

Y1 - 2025/2/3

N2 - Forensic voice comparison (FVC) involves the comparison of a criminal recording (e.g., a threatening phone call), and a known suspect sample (e.g., a police interview). It is the role of an expert forensic analyst to advise the trier of fact (e.g., judge or jury) on the likelihood that the two samples include the same or different speakers. To do this, the expert will carry out an assessment of the similarity of the speech characteristics in the criminal recording and the suspect sample. Speech rhythm has been proposed as a feature that could contribute to FVC, but there is not yet a structured analysis framework that practitioners can exploit in forensic casework. When an analyst suspects a speaker’s speech rhythm is relevant to an analysis, it is usually only described at an impressionistic level. Using both production and perception experiments, the present research explores whether there are acoustic and auditory cues that could capture speech rhythm and subsequently be used to discriminate between speakers in forensic casework. The production experiments revealed that there was very little discriminatory power in syllabic duration, intensity and f₀ measurements across spontaneous, content-mismatched utterances. However, there does appear to be some speaker discriminatory value in applying these same measurements to, so-called, “frequently occurring speech units” (e.g., “er”, “erm”, “yes” and “no”). The perception experiments aimed to determine whether listeners (expert and non-expert) can make meaningful speaker identification assessments when presented with delexicalised speech samples that foreground the rhythmic attributes of speech. Results revealed that expert listeners were better than non-expert listeners in making correct speaker identification assessments, with those who had expertise in forensic phonetics generally performing better than those who did not. The findings from these experiments give promise to the prospect of developing a perceptual (auditory) rhythm framework which can be used in forensic casework.

KW - Speech rhythm

KW - Forensic voice comparison

KW - Spontaneous speech

KW - Perceptual framework

U2 - 10.17635/lancaster/thesis/2641

DO - 10.17635/lancaster/thesis/2641

M3 - Doctoral Thesis

PB - Lancaster University

ER -