Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1965
Full metadata record
DC Field | Value | Language
dc.contributor.author | Ghorai, Arunoday | -
dc.contributor.author | Goyal, Vikram (Advisor) | -
dc.date.accessioned | 2026-04-30T13:37:20Z | -
dc.date.available | 2026-04-30T13:37:20Z | -
dc.date.issued | 2024-12 | -
dc.identifier.uri | http://repository.iiitd.edu.in/xmlui/handle/123456789/1965 | -
dc.description.abstract | With the growing reliance on relational databases across industries, the ability to efficiently query and extract information from a structured database has become a crucial skill. However, the complexity of SQL syntax creates a barrier for non-technical users, limiting their ability to interact with databases effectively. Natural Language to SQL (NL-to-SQL) query generation plays a critical role in bridging the gap between non-technical users and relational databases, enabling intuitive data interaction without any need for SQL expertise. This thesis first explores various Text-to-SQL approaches, leveraging both proprietary models like OpenAI’s GPT-4 and open-source models like RESDSQL, focusing on their performance across benchmark datasets like Spider, CoSQL and SPARC. Additionally, two datasets, MORD and CMEC, are prepared from real-world use cases to highlight unique challenges such as hierarchical data structures, string matching operations, and privacy issues. The MORD dataset was queried using GPT-4 integrated with LangChain to showcase natural language interaction with data and the usability of proprietary models without any tuning to a domain-specific dataset. The CMEC dataset, meanwhile, is privately curated and access to it must remain confidential, so we use open-source models like RESDSQL that run on a local server in order to minimize leakage. The dataset is pre-processed into a relational schema, and RESDSQL is fine-tuned on curated NL-to-SQL pairs to improve performance. String matching techniques are applied to prepare better prompts in order to further enhance the results generated by the model. | en_US
dc.language.iso | en_US | en_US
dc.publisher | IIIT-Delhi | en_US
dc.subject | Ship Marine Strategy | en_US
dc.subject | SQL Models | en_US
dc.subject | CoSQL and SPARC | en_US
dc.title | Ship marine strategy database access using natural language: an application of LLM-based text-to-SQL model | en_US
dc.type | Thesis | en_US
Appears in Collections: Year-2024

Files in This Item:
File | Description | Size | Format
Mtech_Thesis_MT23023.pdf | | 873.81 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.