Exploring pause fillers in conversational speech for forensic phonetics: Findings in a Spanish cohort including twins

Abstract

Pause fillers occur naturally in conversational speech and have recently attracted interest for forensic applications. We extracted pause fillers from the conversational speech of 54 speakers, including twins, whose voices are often perceptually similar. Overall, 872 tokens of the sound [e:] were extracted (7-33 tokens per speaker) and objectively characterised using 315 acoustic measures. We used a Random Forest (RF) classifier and tested its performance with a leave-one-sample-out scheme, obtaining probabilistic estimates of binary class membership that denote whether a query token belongs to a given speaker. We report results using the Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC). When the RF was presented with at least 20 tokens in the training phase for each of the two classes, we observed AUC values in the range 0.71-0.98. These findings have important implications for the potential of pause fillers as an additional objective tool in forensic speaker verification.
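
The evaluation scheme summarised in the abstract (hold out one token, train on the rest, score the held-out token, then summarise all scores with the AUC) might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the Random Forest is replaced by a simple nearest-centroid scorer so that the example needs nothing beyond the Python standard library, and the names `roc_auc`, `centroid_score`, `leave_one_out_scores` and the toy data are all hypothetical.

```python
from typing import Callable, List

Features = List[float]


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def centroid(rows: List[Features]) -> Features:
    """Per-feature mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def squared_distance(a: Features, b: Features) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid_score(train_X: List[Features], train_y: List[int],
                   query: Features) -> float:
    """Stand-in scorer (NOT a Random Forest): a score in [0, 1] that grows
    as the query moves closer to the class-1 centroid than to class 0."""
    target = centroid([x for x, y in zip(train_X, train_y) if y == 1])
    other = centroid([x for x, y in zip(train_X, train_y) if y == 0])
    d_target = squared_distance(query, target)
    d_other = squared_distance(query, other)
    total = d_target + d_other
    return 0.5 if total == 0 else d_other / total


def leave_one_out_scores(
    X: List[Features],
    y: List[int],
    fit_score: Callable[[List[Features], List[int], Features], float],
) -> List[float]:
    """Score each sample with a model trained on all the other samples."""
    scores = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        scores.append(fit_score(train_X, train_y, X[i]))
    return scores


# Made-up one-dimensional "acoustic measures"; label 1 marks the target speaker.
toy_X = [[0.0], [0.2], [0.1], [1.0], [1.2], [0.9]]
toy_y = [0, 0, 0, 1, 1, 1]

toy_scores = leave_one_out_scores(toy_X, toy_y, centroid_score)
toy_auc = roc_auc(toy_y, toy_scores)
print(f"toy AUC: {toy_auc:.2f}")
```

In practice the scorer passed to `leave_one_out_scores` would be a trained Random Forest returning a class-membership probability for the held-out token; the leave-one-out loop and the AUC computation would stay the same.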

Publication
Proceedings of ICPRS 2017: 8th International Conference on Pattern Recognition Systems, 11-13 July 2017, Madrid, Spain. Published by IET Digital Library, pp. 32-37. ISBN 978-1-78561-652-5