<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel rdf:about="http://repository.iiitd.edu.in/xmlui/handle/123456789/1785">
<title>Year-2025</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/1785</link>
<description>Year-2025</description>
<items>
<rdf:Seq>
<rdf:li rdf:resource="http://repository.iiitd.edu.in/xmlui/handle/123456789/1928"/>
<rdf:li rdf:resource="http://repository.iiitd.edu.in/xmlui/handle/123456789/1927"/>
<rdf:li rdf:resource="http://repository.iiitd.edu.in/xmlui/handle/123456789/1926"/>
<rdf:li rdf:resource="http://repository.iiitd.edu.in/xmlui/handle/123456789/1923"/>
</rdf:Seq>
</items>
<dc:date>2026-05-05T13:58:57Z</dc:date>
</channel>
<item rdf:about="http://repository.iiitd.edu.in/xmlui/handle/123456789/1928">
<title>Automatic speech recognition for code-mixed Indian languages</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/1928</link>
<description>Automatic speech recognition for code-mixed Indian languages
Kumar, Shivam; Akhtar, Md. Shad (Advisor)
Code-mixing presents significant challenges for Automatic Speech Recognition (ASR), especially for Indian languages, due to homophone ambiguity, domain-specific word identification, and data scarcity. Traditional ASR models struggle with these complexities, often failing to differentiate between phonetically similar words in multilingual contexts. To address this, we propose CLEAR, a novel rescoring model that integrates descriptive prompting and LLM-based rescoring while analyzing the impact of n-best hypotheses across multiple beam widths. CLEAR enhances ASR performance, achieving an S-WER of 26.9, P-WER of 26.46, and T-WER of 25.04—improving by 6.9%, 13.47%, and 4.42%, respectively, over the best baseline, i.e., TDNN. These findings demonstrate that CLEAR effectively resolves homophone ambiguities and refines transcriptions, leading to a 13.56% S-WER reduction over fine-tuned Whisper without extensive pretraining. In addition to improving transcription accuracy, CLEAR introduces a principled framework for handling ambiguous hypotheses in low-resource, script-mixed speech. CLEAR is a generic framework that can be adopted for multiple languages apart from Hindi. This work sets the foundation for more linguistically aware ASR systems tailored for multilingual societies.
</description>
<dc:date>2025-05-01T00:00:00Z</dc:date>
</item>
<item rdf:about="http://repository.iiitd.edu.in/xmlui/handle/123456789/1927">
<title>Simulating distributed ML training under heterogeneous network infrastructure</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/1927</link>
<description>Simulating distributed ML training under heterogeneous network infrastructure
Temura, Arjun; Shah, Rinku (Advisor)
There has been an increasing demand to train ML models, particularly large language models (LLMs), on multiple GPUs to reduce training time and cost. However, making the correct training configuration choice (for example, the number of GPUs, parallelism technique, and network topology) to ensure minimal training time and maximum resource utilisation remains challenging. Distributed ML simulators help users with capacity planning and with selecting optimal configuration knobs before training. However, state-of-the-art simulators assume homogeneous compute and network infrastructure. Distributed ML training infrastructure frequently consists of heterogeneous hardware, arising from generational shifts in devices or resource sharing in cloud environments. Several training plans have been introduced in the last few years to make the best use of the available heterogeneous hardware and improve training performance. However, there are no simulation tools that mimic realistic training environments for these heterogeneity-aware training strategies. Generally, heterogeneity-aware training optimisations construct guided training plans that account for compute or network heterogeneity. Therefore, we design a heterogeneity-aware distributed ML training simulator that supports both compute and network heterogeneity. As part of our preliminary analysis, we study GPU communication flows for popular LLMs (GPT, Mixtral) on existing simulation frameworks under realistic training configurations with network heterogeneity. We observe an improvement in the completion time of the median flows under heterogeneous configurations during training. Additionally, we develop ideas for effective model partitioning strategies in light of heterogeneous compute. Finally, we briefly discuss the additional abstractions required for our simulator to leverage heterogeneous hardware effectively.
</description>
<dc:date>2025-05-21T00:00:00Z</dc:date>
</item>
<item rdf:about="http://repository.iiitd.edu.in/xmlui/handle/123456789/1926">
<title>Workload-aware in-network cryptographic primitives for FPGA NIC</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/1926</link>
<description>Workload-aware in-network cryptographic primitives for FPGA NIC
Peer, Aditya; Shah, Rinku (Advisor)
With increasing low-latency and high-throughput requirements across emerging applications (for example, 5G/6G and the cloud), it has become imperative to offload compute-intensive tasks such as cryptographic processing to specialized accelerators. Given that ASIC-based cryptographic accelerators hinder flexibility and are unsuitable for dynamic workloads, cloud providers (for example, AWS, Azure, Alibaba, and Google) and telecom operators use FPGA-based accelerators. The state-of-the-art FPGA-based accelerators are designed for high throughput or power efficiency, and they scale by replicating the high-throughput or power-efficient cryptographic cores, which may not be an optimal design for a given workload. We propose the concept of an “Asymmetric Cryptographic Core” to optimize CPU utilization by offloading cryptographic operations. Unlike traditional symmetric cores, our design introduces multiple variants of a specific cryptographic core, each optimized for different performance characteristics such as throughput, power efficiency, and resource usage. These core variants are deployed on an FPGA and dynamically selected based on the real-time network workload distribution. This approach enables more efficient use of FPGA resources and delivers improved performance under varying workload conditions. We implemented the Rocca-S algorithm on an FPGA board and designed small, medium, and large variants of the Rocca-S core. These variants are optimised in terms of throughput, power efficiency, and resource usage. To scale these multiple cryptographic cores and process data streams in parallel, we implemented a load balancer that decides which cryptographic core each data packet is scheduled to. The choice of asymmetric cryptographic cores and scheduling policies depends on the deployer’s requirements and varying workload conditions, prioritising either the throughput, power efficiency, or resource utilisation of the system.
The results showed that combinations of asymmetric cryptographic cores performed comparably to combinations of symmetric cores in terms of throughput, resource efficiency, and power efficiency; moreover, as the workload distribution varied over time, we observed that the preferred choice between asymmetric and symmetric cores also changed with respect to throughput, power efficiency, and resource efficiency.
</description>
<dc:date>2025-05-21T00:00:00Z</dc:date>
</item>
<item rdf:about="http://repository.iiitd.edu.in/xmlui/handle/123456789/1923">
<title>Parameterized complexity of computing twin width of a graph</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/1923</link>
<description>Parameterized complexity of computing twin width of a graph
Gusain, Mayank; Majumdar, Diptapriyo (Advisor)
This thesis explores the design of efficient algorithms for determining the twin-width of a graph from a parameterized complexity perspective. Twin-width is a graph parameter that describes a decomposition of the graph through the lens of a contraction sequence. It measures the similarity of a graph to a cluster graph, with smaller values of twin-width corresponding to more regular and structured graphs. However, determining the exact value of twin-width for a given graph is NP-complete. The thesis focuses on fixed-parameter tractable (FPT) algorithms, which solve instances of the problem efficiently when the parameter is small, and on identifying suitable parameters that lead to meaningful hierarchy results. It presents a detailed analysis of computing twin-width under various structural graph parameters and their relationship to twin-width, including twin cover number, vertex cover number, neighborhood diversity, edge deletion distance to a cluster graph, and restricted modular partitions. The thesis makes several contributions to the field, including the development of FPT algorithms for computing twin-width parameterized by these graph parameters. These algorithms are complemented with reduction rules that simplify the input graph, making the algorithms more efficient in practice. The thesis also discusses the implications of these results for the parameterized complexity of other graph problems and identifies several directions for future research. Overall, the thesis aims to better understand the practical implications of using twin-width in real-world scenarios while acknowledging its computational limitations.
</description>
<dc:date>2025-06-01T00:00:00Z</dc:date>
</item>
</rdf:RDF>
