To hesitate is to be proficient: acoustics and speech perception of filled pauses in spontaneous L2 Hindi speech by L1 Assamese speakers
Source
28th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA 2025)
Date Issued
2025-11-12
Author(s)
Ravi, Sridevi
Chakraborty, Joyshree
Sarmah, Priyankoo
Abstract
In this work, we present an acoustic analysis of filled pauses (FPs) produced by 24 Assamese L1 speakers of L2 Hindi, examining their relationship with perceived proficiency. Spontaneous Hindi speech was elicited and rated for proficiency by 20 native Hindi listeners on a 1–5 scale. A total of 667 FPs were extracted and analysed for type, vowel quality (F1, F2), duration, and speaker gender. By studying speech interaction in L2 speakers through FPs, we aim to address a central question in the discourse surrounding FPs: are they speaker-specific, which would make them an important tool for forensic science, or language-specific? Results indicate a systematic proficiency-linked shift: high-rated speakers predominantly produced uhh, the central hesitation vowel in Hindi, while low-rated speakers employed a wider repertoire, including aah, the Assamese central vowel. Vowel space analysis showed that with increasing proficiency, aah tokens were fronted and centralized, reducing their acoustic distance from uhh and indicating accommodation to L2 sounds. Temporal analysis revealed that high-rated speakers produced significantly shorter and less variable FPs, whereas lower-rated speakers exhibited longer durations. These findings suggest that FPs are structured, language-specific cues that index both linguistic background and proficiency. From an applied perspective, the findings underscore the role of hesitation acoustics in enhancing automatic speech recognition.
