<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
<title>Computer Science and Engineering</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/1" rel="alternate"/>
<subtitle>CSE</subtitle>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/1</id>
<updated>2026-04-10T22:47:57Z</updated>
<dc:date>2026-04-10T22:47:57Z</dc:date>
<entry>
<title>Audio spoofing detection via hybrid feature integration</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/1829" rel="alternate"/>
<author>
<name>Singh, Barneet</name>
</author>
<author>
<name>Abrol, Vinayak (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/1829</id>
<updated>2026-04-03T22:00:24Z</updated>
<published>2025-05-01T00:00:00Z</published>
<summary type="text">Audio spoofing detection via hybrid feature integration
Singh, Barneet; Abrol, Vinayak (Advisor)
This thesis explores advanced techniques in the field of audio spoofing detection. With the emergence of high-quality deepfake generation techniques and the vulnerabilities in automatic speaker verification (ASV) systems, robust countermeasures are essential. We investigate state-of-the-art deep learning models including ECAPA-TDNN, ResNet, TitaNet, and self-supervised learning (SSL) models such as Wav2Vec2, WavLM, and UniSpeech. Experiments are conducted on datasets from the ASVspoof 2021 and 2024 challenges. Our approach introduces a hybrid integration of handcrafted features with SSL-based embeddings, demonstrating notable improvements in Equal Error Rate (EER) and minimum Detection Cost Function (minDCF). Data augmentation strategies are also evaluated for enhancing robustness. Results indicate that hybrid systems combining engineered and learned features outperform standalone models and offer practical insights for developing next-generation anti-spoofing solutions.
</summary>
<dc:date>2025-05-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>A journey down the federated valleys</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/1828" rel="alternate"/>
<author>
<name>Tyagi, Somya</name>
</author>
<author>
<name>Chatterjee, Bapi (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/1828</id>
<updated>2026-04-03T22:00:23Z</updated>
<published>2025-06-01T00:00:00Z</published>
<summary type="text">A journey down the federated valleys
Tyagi, Somya; Chatterjee, Bapi (Advisor)
Federated learning (FL) enables collaborative model training across distributed clients while preserving data privacy. However, the choice of optimizer on both the client and server sides significantly impacts training efficiency and model performance, especially under non-IID data distributions. Despite the existence of numerous optimizers, the absence of strong, consistent empirical evidence specific to federated environments makes it challenging to identify the most effective optimizer. Consequently, practitioners often rely on intuition and prior experience when choosing optimizers. This study provides comprehensive insights and practical guidelines for optimizer selection in federated learning frameworks. Beyond standard empirical risk minimization, min-max optimization is a fundamental framework in machine learning for modeling adversarial and robust problems, and its utility extends beyond traditional ML applications into econometrics and causal inference. One notable application is the Generalized Method of Moments (GMM), a widely used technique for causal effect estimation via Instrumental Variables (IV) analysis, which finds practical applications in important areas such as healthcare and consumer economics. For IV analysis in high-dimensional settings, GMM using deep neural networks offers an efficient approach. If the data is sourced from scattered, decentralized clients, federated learning is a natural fit for training the models while promising data privacy. However, to our knowledge, no federated algorithm for either GMM or IV analysis exists to date. This study also presents a method for federated instrumental variables analysis (FedIV) via the federated deep generalized method of moments (FedDeepGMM) for non-IID data. We characterize an equilibrium of a federated zero-sum game to show that it consistently estimates the local moment conditions of every participating client.
Extensive experiments demonstrate the efficacy of the proposed algorithm.
</summary>
<dc:date>2025-06-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Motion forecasting of surrounding agents for autonomous vehicles</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/1827" rel="alternate"/>
<author>
<name>Fayaz Lone, Junaid</name>
</author>
<author>
<name>Anand, Saket (Advisor)</name>
</author>
<author>
<name>Sanjit, Kaul (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/1827</id>
<updated>2026-04-02T22:00:25Z</updated>
<published>2025-05-01T00:00:00Z</published>
<summary type="text">Motion forecasting of surrounding agents for autonomous vehicles
Fayaz Lone, Junaid; Anand, Saket (Advisor); Sanjit, Kaul (Advisor)
Motion forecasting of surrounding agents is fundamental for autonomous systems navigating complex, dynamic environments. This capability enables autonomous vehicles and robots to anticipate the future trajectories of vehicles, pedestrians, and other moving entities. Recent models leverage large-scale datasets to learn spatiotemporal patterns. Many existing models focus on single-agent prediction. This poses a problem because, as the number of agents in a scene increases, the computational time scales linearly due to the redundant re-encoding of features (such as static or HD maps) that could otherwise be shared across agents. Moreover, some models struggle to accurately capture interactions between agents, limiting their ability to understand the influence they have on one another. They also encounter challenges with variable sequence lengths: as the sequence length increases, models using GRUs or RNNs can lose context, and improper handling of padding timesteps can diminish encoding quality. We also explore monocular motion forecasting in the context of traffic surveillance. In this setting, the goal is to forecast the future positions of agents using monocular camera footage by leveraging monocular depth estimation. This approach typically operates without access to high-definition (HD) maps and instead relies solely on the historical motion of agents, essentially performing map-free motion forecasting. In this work, we propose a new approach that addresses challenges such as multi-agent forecasting, agent-agent interaction, and monocular-camera-based motion forecasting. We evaluate our autonomous driving model on two benchmark datasets, Argoverse 2 and nuScenes, under standard evaluation metrics: MinFDE, MinADE, and MR. For monocular motion forecasting, we evaluate our method on the BrnoComp dataset.
</summary>
<dc:date>2025-05-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Implementation of live video streaming using multipath over QUIC</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/1790" rel="alternate"/>
<author>
<name>Sood, Aruba</name>
</author>
<author>
<name>Bhattacharya, Arani (Advisor)</name>
</author>
<author>
<name>Maity, Mukulika (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/1790</id>
<updated>2025-12-29T22:00:11Z</updated>
<published>2025-05-01T00:00:00Z</published>
<summary type="text">Implementation of live video streaming using multipath over QUIC
Sood, Aruba; Bhattacharya, Arani (Advisor); Maity, Mukulika (Advisor)
The increasing demand for live online classes, especially in remote and underserved areas, underscores the importance of providing a seamless and high-quality experience to support effective learning. These real-time, bandwidth-intensive applications pose significant challenges for current cellular networks in terms of maintaining consistent bandwidth, low latency, and minimal stalls. An earlier system, COMPACT, tackled these challenges by using a content-aware streaming strategy while leveraging multiple devices, each with its own cellular connection, to cooperatively stream video. COMPACT splits the video into foreground and background regions using independently encoded tiles and streams them over different paths based on network estimates. The original implementation of COMPACT used SCTP (Stream Control Transmission Protocol) for multipath support. The key disadvantage of this version was that SCTP is often blocked by firewalls and is not supported by Android. This thesis, therefore, extends the original implementation of COMPACT to utilize QUIC, a modern transport protocol offering built-in multiplexing, reduced latency, and improved congestion control. We adapt COMPACT’s scheduling and streaming logic to work efficiently over QUIC and evaluate the system using realistic network traces. Our results demonstrate that COMPACT over QUIC retains the benefits of multipath collaboration while offering better compatibility with today’s internet infrastructure.
</summary>
<dc:date>2025-05-01T00:00:00Z</dc:date>
</entry>
</feed>
