DSpace Collection: Year-2024

http://repository.iiitd.edu.in/xmlui/handle/123456789/1654
Mon, 13 Jul 2026 15:51:37 GMT

Title: Ship marine strategy database access using natural language: an application of LLM-based text-to-SQL model
http://repository.iiitd.edu.in/xmlui/handle/123456789/1965
Authors: Ghorai, Arunoday; Goyal, Vikram (Advisor)
Abstract: With the growing reliance on relational databases across industries, the ability to efficiently query and extract information from a structured database has become a crucial skill. However, the complexity of SQL syntax creates a barrier for non-technical users, limiting their ability to interact with databases effectively. Natural Language to SQL (NL-to-SQL) query generation plays a critical role in bridging the gap between non-technical users and relational databases, enabling intuitive data interaction without any need for SQL expertise. This thesis first explores various Text-to-SQL approaches, leveraging both proprietary models such as OpenAI’s GPT-4 and open-source models such as RESDSQL, focusing on their performance across benchmark datasets such as Spider, CoSQL and SParC. Additionally, two datasets, MORD and CMEC, are prepared from real-world use cases to highlight unique challenges such as hierarchical data structures, string matching operations, and privacy issues. The MORD dataset was queried using GPT-4 integrated with LangChain to showcase natural language interaction with data and the usability of proprietary models without any tuning on the domain-specific dataset. Meanwhile, the CMEC dataset is privately curated and access to it must remain confidential, so we use open-source models such as RESDSQL that run on a local server in order to minimize data leakage.
The dataset is pre-processed into a relational schema, and RESDSQL is fine-tuned on curated NL-to-SQL pairs to improve performance. String matching techniques are applied to prepare better prompts, further improving the results generated by the model.
Sun, 01 Dec 2024 00:00:00 GMT

Title: Detection to interpretation: advancing tabular data processing with multimodal AI
http://repository.iiitd.edu.in/xmlui/handle/123456789/1788
Authors: Bhuyan, Pijush; Shah, Rajiv Ratn (Advisor)
Abstract: Tables are the most common form of structured data found in documents. Proper interpretation of such raw tabular data by computer systems remains an open challenge. We take a deep dive into document intelligence, which includes table detection, table reconstruction, and table structure interpretation by AI models. First, we address domain adaptation in table detection. Pre-trained table detection models perform poorly when the target domain differs from the source. We resolve this by building a domain-invariant table detection dataset into which we inject additional noisy synthetic detection data. Empirical tests show that a detection model trained on this synthetic data suffers a significantly smaller drop in performance when tested on out-of-distribution datasets. Following this, we build a fast yet efficient end-to-end pipeline for Table-OCR. It reconstructs the table structure and content from raw detection crops and converts them into a computer-storable text format. Finally, we design a comprehensive benchmark suite to test the table structure understanding capabilities and limitations of existing Large Language Models (LLMs) and Vision Language Models (VLMs) using both text and image modalities. The vision component of VLMs is found to be a bottleneck in multimodal table interpretability.
We work with a lightweight yet efficient model-agnostic adapter module that injects positional information into the image modality through positional embeddings during model training. We also design a novel pre-training task for image-text alignment for open-source VLMs and study the change in model performance while interpreting visual tabular data. We further study the feasibility and future scope of true multimodal table understanding: interpreting tabular data from both image and text modalities for reasoning.
Sat, 21 Dec 2024 00:00:00 GMT

Title: Optimising serialisation for cloud applications
http://repository.iiitd.edu.in/xmlui/handle/123456789/1695
Authors: Nayak, Siddharth; Shah, Rinku (Advisor)
Abstract: Serialisation latency is a significant concern in modern cloud applications that leverage the microservice paradigm. A cloud service request typically traverses a sequence of microservices across nodes, increasing latency due to (de)serialisation overhead at every hop. The serialisation process comprises memory allocation, data encoding, and data copy. Observations from existing benchmarking results show that the data copy operation dominates the overall (de)serialisation cost. Existing serialisation libraries follow a two-copy technique: (1) the application copies the encoded data into a serialised buffer, and (2) the serialised data is copied to the NIC’s device memory. To reduce serialisation latency, researchers have proposed (1) kernel bypass techniques that eliminate data copy, (2) hardware acceleration solutions, and (3) wire format optimisations. However, kernel bypass solutions have security concerns and cannot be deployed in public cloud networks, and hardware acceleration solutions depend on specialised hardware.
We propose the design and implementation of a one-copy serialisation library that leverages the scatter-gather I/O technique provided by the standard POSIX library for data movement. Our solution does not require special hardware support or any specialised network stack. Because our design relies on the standard Linux network stack, it avoids the security concerns of kernel bypass and is usable in both public and private clouds.
Tue, 21 May 2024 00:00:00 GMT

Title: Programmable proxy for microservice communication in multi-cloud environments
http://repository.iiitd.edu.in/xmlui/handle/123456789/1693
Authors: Pathak, Shambhavi; Shah, Rinku (Advisor)
Abstract: As enterprises migrate from single-cloud to multi-cloud architectures, they encounter challenges due to geographical dispersion, varying WAN characteristics, and diverse cloud policies. These challenges demand that developers re-evaluate their communication strategies to ensure reliability, security, and compliance. In response, our work presents a solution, Programmable Proxy, designed to address the dynamic nature of multi-cloud environments. This thesis focuses on switching between the HTTP and MQTT communication protocols; the proxy facilitates real-time adaptation based on service location, packet characteristics, and request types. Through this approach, Programmable Proxy optimizes communication mechanisms in alignment with evolving deployment requirements, enabling enhanced performance across diverse cloud infrastructures. This study contributes to the advancement of flexible, adaptive microservice architectures in the context of multi-cloud environments.
Tue, 21 May 2024 00:00:00 GMT
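
Illustrative note on the Programmable Proxy abstract: the abstract says the proxy switches between HTTP and MQTT based on service location, packet characteristics, and request types, but it does not give the decision rules. The sketch below is one possible shape for such a per-request selector, not the thesis's implementation. Every name in it (Request, choose_protocol, SMALL_PAYLOAD_BYTES), the field layout, and the 1024-byte cutoff are assumptions made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Protocol(Enum):
    HTTP = "http"
    MQTT = "mqtt"


@dataclass(frozen=True)
class Request:
    # Hypothetical fields: the abstract names only the three categories
    # (service location, packet characteristics, request type).
    same_cloud: bool     # is the destination service in the caller's cloud?
    payload_bytes: int   # size of the message body
    kind: str            # "query" (request/response) or "event" (one-way)


# Assumed cutoff for a "small" message; not a value from the thesis.
SMALL_PAYLOAD_BYTES = 1024


def choose_protocol(req: Request) -> Protocol:
    """Pick a protocol for one request using illustrative rules."""
    # Request/response traffic keeps HTTP semantics.
    if req.kind == "query":
        return Protocol.HTTP
    # Small one-way events crossing clouds go over MQTT publish/subscribe.
    if not req.same_cloud and req.payload_bytes <= SMALL_PAYLOAD_BYTES:
        return Protocol.MQTT
    # Everything else falls back to HTTP.
    return Protocol.HTTP
```

For example, choose_protocol(Request(same_cloud=False, payload_bytes=200, kind="event")) selects MQTT under these assumed rules, while the same event sent within a single cloud stays on HTTP. A real proxy would presumably load such rules from configuration so they can change with deployment requirements.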