<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
<title>MTech Theses</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/15" rel="alternate"/>
<subtitle/>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/15</id>
<updated>2026-04-14T10:00:12Z</updated>
<dc:date>2026-04-14T10:00:12Z</dc:date>
<entry>
<title>KG-Scout : a policy driven knowledge-graph retrieval framework to mitigate factual inaccuracies of large language model</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/1875" rel="alternate"/>
<author>
<name>Chakraborty, Sourav</name>
</author>
<author>
<name>Akhtar, Md. Shad (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/1875</id>
<updated>2026-04-13T22:00:25Z</updated>
<published>2025-08-01T00:00:00Z</published>
<summary type="text">KG-Scout : a policy driven knowledge-graph retrieval framework to mitigate factual inaccuracies of large language model
Chakraborty, Sourav; Akhtar, Md. Shad (Advisor)
Large Language Models (LLMs) have rapidly advanced in their ability to answer questions and perform complex reasoning tasks. However, they often generate factual inaccuracies and hallucinations because they have little or no access to up-to-date factual knowledge. To mitigate this, researchers often augment LLMs with factual information from external sources, such as knowledge graphs (KGs). However, most existing KG-based retrieval-augmented generation (RAG) systems suffer from a key limitation: triplet retrieval from KGs either relies on simplistic distance metrics or heuristics, or is tightly coupled with reasoning, which makes it difficult to optimize both retrieval and reasoning. To address this limitation, we propose KG-Scout, a reinforcement learning (RL)-based policy network that decouples retrieval from reasoning, enabling the selection of triplets that are both semantically aligned with the query and structurally important in the KG. Our approach operates in two key stages: (1) extracting a subgraph using topic entities and computing Personalized PageRank (PPR) scores for its nodes, and (2) employing our policy network to select the most valuable triplets from this subgraph based on learned relevance scores. To make this process more efficient, candidate triplets are first filtered by cosine similarity with the query before the policy network considers them. Using the retrieved triplets, smaller pretrained LLMs such as LLAMA-3.1-8b outperform several complex LLM-based baselines on the WebQSP and CWQ benchmarks.
</summary>
<dc:date>2025-08-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Audio spoofing detection via hybrid feature integration</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/1829" rel="alternate"/>
<author>
<name>Singh, Barneet</name>
</author>
<author>
<name>Abrol, Vinayak (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/1829</id>
<updated>2026-04-03T22:00:24Z</updated>
<published>2025-05-01T00:00:00Z</published>
<summary type="text">Audio spoofing detection via hybrid feature integration
Singh, Barneet; Abrol, Vinayak (Advisor)
This thesis explores advanced techniques for audio spoofing detection. Given the emergence of high-quality deepfake generation techniques and the vulnerabilities of automatic speaker verification (ASV) systems, robust countermeasures are essential. We investigate state-of-the-art deep learning models, including ECAPA-TDNN, ResNet, and TitaNet, as well as self-supervised learning (SSL) models such as Wav2Vec2, WavLM, and UniSpeech. Experiments are conducted on datasets from the ASVspoof 2021 and 2024 challenges. Our approach introduces a hybrid integration of handcrafted features with SSL-based embeddings, demonstrating notable improvements in Equal Error Rate (EER) and minimum Detection Cost Function (minDCF). Data augmentation strategies are also evaluated for their effect on robustness. Results indicate that hybrid systems combining engineered and learned features outperform standalone models and offer practical insights for developing next-generation anti-spoofing solutions.
</summary>
<dc:date>2025-05-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>A journey down the federated valleys</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/1828" rel="alternate"/>
<author>
<name>Tyagi, Somya</name>
</author>
<author>
<name>Chatterjee, Bapi (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/1828</id>
<updated>2026-04-03T22:00:23Z</updated>
<published>2025-06-01T00:00:00Z</published>
<summary type="text">A journey down the federated valleys
Tyagi, Somya; Chatterjee, Bapi (Advisor)
Federated learning (FL) enables collaborative model training across distributed clients while preserving data privacy. However, the choice of optimizer on both the client and server sides significantly impacts training efficiency and model performance, especially under non-IID data distributions. Although numerous optimizers exist, the lack of strong, consistent empirical evidence specific to federated environments makes it difficult to identify the most effective one. Consequently, practitioners often rely on intuition and prior experience when choosing optimizers. This study provides comprehensive insights and practical guidelines for optimizer selection in federated learning frameworks. Beyond standard empirical risk minimization, min-max optimization is a fundamental framework in machine learning for modeling adversarial and robust problems, and its utility extends beyond traditional ML applications into econometrics and causal inference. One notable application is the Generalized Method of Moments (GMM), a widely used technique for causal effect estimation via Instrumental Variables (IV) analysis, which has practical applications in important areas such as healthcare and consumer economics. For IV analysis in high-dimensional settings, GMM with deep neural networks offers an efficient approach. If the data is sourced from scattered, decentralized clients, federated learning is a natural fit for training the models while preserving data privacy. However, to our knowledge, no federated algorithm for either GMM or IV analysis exists to date. This study also presents a method for federated instrumental variables analysis (FedIV) via the federated deep generalized method of moments (FedDeepGMM) for non-IID data. We characterize an equilibrium of a federated zero-sum game to show that it consistently estimates the local moment conditions of every participating client.
Extensive experiments demonstrate the efficacy of the proposed algorithm.
</summary>
<dc:date>2025-06-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Motion forecasting of surrounding agents for autonomous vehicles</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/1827" rel="alternate"/>
<author>
<name>Fayaz Lone, Junaid</name>
</author>
<author>
<name>Anand, Saket (Advisor)</name>
</author>
<author>
<name>Sanjit, Kaul (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/1827</id>
<updated>2026-04-02T22:00:25Z</updated>
<published>2025-05-01T00:00:00Z</published>
<summary type="text">Motion forecasting of surrounding agents for autonomous vehicles
Fayaz Lone, Junaid; Anand, Saket (Advisor); Sanjit, Kaul (Advisor)
Motion forecasting of surrounding agents is fundamental for autonomous systems navigating complex, dynamic environments. This capability enables autonomous vehicles and robots to anticipate the future trajectories of vehicles, pedestrians, and other moving entities. Recent models leverage large-scale datasets to learn spatiotemporal patterns, but many of them focus on single-agent prediction. This poses a problem because, as the number of agents in a scene increases, the computational time scales linearly due to the redundant re-encoding of features, such as static or HD maps, that could otherwise be shared across agents. Moreover, some models struggle to accurately capture interactions between agents, limiting their ability to understand the influence agents have on one another. They also encounter challenges with variable sequence lengths: models using GRUs or RNNs can lose context over long sequences, and improper handling of padding timesteps can diminish encoding quality. We also explore monocular motion forecasting in the context of traffic surveillance. In this setting, the goal is to forecast the future positions of agents from monocular camera footage by leveraging monocular depth estimation. This approach typically operates without access to high-definition (HD) maps and instead relies solely on the historical motion of agents, essentially performing map-free motion forecasting. In this work, we propose a new approach that addresses challenges such as multi-agent forecasting, agent-agent interaction, and monocular-camera-based motion forecasting. We evaluate our autonomous driving model on two benchmark datasets, Argoverse 2 and nuScenes, under standard evaluation metrics: MinFDE, MinADE, and MR. For monocular motion forecasting, we evaluate our method on the BrnoComp dataset.
</summary>
<dc:date>2025-05-01T00:00:00Z</dc:date>
</entry>
</feed>
