Automatic speech recognition for code-mixed Indian languages

Kumar, Shivam; Akhtar, Md. Shad (Advisor)

URI: http://repository.iiitd.edu.in/xmlui/handle/123456789/1928

Date: 2025-05

Abstract:

Code-mixing presents significant challenges for Automatic Speech Recognition (ASR), especially for Indian languages, because of homophone ambiguity, the difficulty of identifying domain-specific words, and data scarcity. Traditional ASR models struggle with these complexities and often fail to differentiate between phonetically similar words in multilingual contexts. To address this, we propose CLEAR, a novel rescoring model that integrates descriptive prompting with LLM-based rescoring, and we analyze the impact of n-best hypotheses across multiple beam widths. CLEAR improves ASR performance, achieving an S-WER of 26.9, a P-WER of 26.46, and a T-WER of 25.04, which are improvements of 6.9%, 13.47%, and 4.42%, respectively, over the best baseline, TDNN. CLEAR also yields a 13.56% S-WER reduction over fine-tuned Whisper without extensive pretraining. These findings demonstrate that CLEAR effectively resolves homophone ambiguities and refines transcriptions. In addition to improving transcription accuracy, CLEAR introduces a principled framework for handling ambiguous hypotheses in low-resource, script-mixed speech. CLEAR is a generic framework that can be adopted for languages other than Hindi. This work lays the foundation for more linguistically aware ASR systems tailored to multilingual societies.
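The general idea of rescoring an n-best list and measuring word error rate can be illustrated with a minimal Python sketch. This is not the CLEAR implementation: the function names, the linear score interpolation, the `lm_weight` parameter, the toy scorer, and the example sentences are all assumptions made for illustration, and the toy scorer merely stands in for an LLM-based score. The S-WER, P-WER, and T-WER variants reported above are not reproduced here; only plain word-level WER is shown.

```python
from typing import Callable, List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, substitution)
        prev = curr
    return prev[-1] / len(ref)


def rescore_nbest(
    nbest: List[Tuple[str, float]],
    lm_score: Callable[[str], float],
    lm_weight: float = 0.5,
) -> str:
    """Return the hypothesis with the best interpolated score.

    nbest holds (text, asr_log_score) pairs. lm_score is a hypothetical
    callable standing in for a language-model score of the text.
    """
    def combined(item: Tuple[str, float]) -> float:
        text, asr_score = item
        return (1.0 - lm_weight) * asr_score + lm_weight * lm_score(text)

    return max(nbest, key=combined)[0]


def toy_lm_score(text: str) -> float:
    """Toy stand-in scorer (assumption): prefers 'mail' in this context."""
    return 0.0 if "mail" in text.split() else -2.0


# Hypothetical 2-best list with a homophone confusion ("male" vs "mail").
nbest = [
    ("main kal mail bhejunga", -1.0),
    ("main kal male bhejunga", -0.9),
]
reference = "main kal mail bhejunga"
first_pass = max(nbest, key=lambda item: item[1])[0]
rescored = rescore_nbest(nbest, toy_lm_score)
```

In this toy case the first-pass decoder would pick the "male" hypothesis on acoustic score alone, while the interpolated score selects the "mail" hypothesis; `word_error_rate(reference, first_pass)` and `word_error_rate(reference, rescored)` can then be compared to see the effect.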


