To hesitate is to be proficient: acoustics and speech perception of filled pauses in spontaneous L2 Hindi speech by L1 Assamese speakers

Source
28th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA 2025)
Date Issued
2025-11-12
Author(s)
Ravi, Sridevi
Chakraborty, Joyshree
Sarmah, Priyankoo
DOI
10.1109/O-COCOSDA68185.2025.11385092
Abstract
In this work, we present an acoustic analysis of filled pauses (FPs) produced by 24 L1 Assamese speakers of L2 Hindi and examine their relationship with perceived proficiency. Spontaneous Hindi speech was elicited and rated for proficiency by 20 native Hindi listeners on a 1-5 scale. A total of 667 FPs were extracted and analysed for type, vowel quality (F1, F2), duration, and speaker gender. By studying the speech of L2 speakers through FPs, we aim to address a pertinent question in the discourse surrounding FPs: are they speaker-specific, and therefore a potentially important tool for forensic science, or language-specific? Results indicate a systematic proficiency-linked shift: high-rated speakers predominantly produced uhh, the central hesitation vowel in Hindi, while low-rated speakers employed a wider repertoire, including aah, the Assamese central vowel. Vowel space analysis showed that with increasing proficiency, aah tokens were fronted and centralized, reducing their acoustic distance from uhh and indicating accommodation to L2 sounds. Temporal analysis revealed that high-rated speakers produced significantly shorter and less variable FPs, whereas lower-rated speakers exhibited longer durations. These findings suggest that FPs are structured, language-specific cues that index both linguistic background and proficiency. From an applied perspective, the findings underscore the role of hesitation acoustics in enhancing automatic speech recognition.
URI
https://repository.iitgn.ac.in/handle/IITG2025/34682