dc.contributor.author |
Jana, Rituparna |
|
dc.contributor.author |
Chandra, Akash |
|
dc.contributor.author |
George, Nithin V. |
|
dc.contributor.author |
Chakraborty, Arup Lal |
|
dc.coverage.spatial |
Czech Republic |
|
dc.date.accessioned |
2025-06-06T12:12:06Z |
|
dc.date.available |
2025-06-06T12:12:06Z |
|
dc.date.issued |
2025-04-07 |
|
dc.identifier.citation |
Jana, Rituparna; Chandra, Akash; George, Nithin V. and Chakraborty, Arup Lal, "Speech enhancement in FBG-based throat microphones: a tailored long short-term memory recurrent neural network approach", in the SPIE Optics + Optoelectronics 2025, Prague, CZ, Apr. 07-10, 2025. |
|
dc.identifier.uri |
https://doi.org/10.1117/12.3056379 |
|
dc.identifier.uri |
https://repository.iitgn.ac.in/handle/123456789/11502 |
|
dc.description.abstract |
Fiber Bragg grating (FBG)-based throat microphones offer superior background noise suppression, which makes them ideal for wearable automatic speech recognition (ASR) devices. However, achieving naturalness and intelligibility remains challenging due to the low-pass filtering effects of tissue and bones. This study presents a deep learning framework using a long short-term memory (LSTM) recurrent neural network for speech enhancement in FBG microphones. It also explores the impact of microphone placement and speaker sex on ASR performance. The microphone, designed with a 1530.12 nm prestrained FBG, captured vocal vibrations from six participants reciting Harvard sentences. An LSTM model trained with spectral mapping restored high-frequency components, improving the non-intrusive short-time objective intelligibility (NI-STOI) score by 2%. Both character error rate (CER) and NI-STOI score were significantly better at the lower throat position, emphasizing the importance of optimal microphone placement. Speaker sex, however, had no significant effect on CER or intelligibility. |
|
dc.description.statementofresponsibility |
by Rituparna Jana, Akash Chandra, Nithin V. George and Arup Lal Chakraborty |
|
dc.language.iso |
en_US |
|
dc.publisher |
Society of Photo-Optical Instrumentation Engineers (SPIE) |
|
dc.subject |
FBG sensor |
|
dc.subject |
Throat microphone |
|
dc.subject |
LSTM-RNN |
|
dc.subject |
Speech enhancement |
|
dc.title |
Speech enhancement in FBG-based throat microphones: a tailored long short-term memory recurrent neural network approach |
|
dc.type |
Conference Paper |
|
dc.relation.journal |
SPIE Optics + Optoelectronics 2025 |
|