Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1677
Title: Patterns and problems : analyzing document layouts and improving mathematical reasoning of LLMs
Authors: Gupta, Mohit
Shah, Rajiv Ratn (Advisor)
Keywords: Large Language Models (LLMs)
domain adaptation
NLP related tasks
Issue Date: 8-May-2024
Publisher: IIIT-Delhi
Abstract: Documents play a pivotal role in conveying information and serve as carriers of knowledge across domains. Their importance lies in their ability to encapsulate ideas, facts, and insights, thereby facilitating communication and record-keeping. Document analysis is the systematic examination of documents to extract meaningful insights, patterns, or information. Although current document analysis tools have made progress, they still face limited accuracy, scalability issues, and difficulty handling diverse document formats, which highlights the need for more robust tools. This research addresses some of these difficulties in two parts. First, it presents solutions for domain-adaptation-based document layout detection and for detecting and recognizing tables within document images. Second, although Large Language Models (LLMs) achieve state-of-the-art results on extensive and complex NLP tasks, they sometimes fail to solve basic mathematical reasoning tasks. To address this, I propose an extensive mathematical dataset for training LLMs to enhance their mathematical reasoning capabilities, and an efficient approach for solving physics problems using Reinforcement Learning with Human and AI Feedback.
URI: http://repository.iiitd.edu.in/xmlui/handle/123456789/1677
Appears in Collections: Year-2024

Files in This Item:
File: MohitGupta_M_Tech_Thesis___MT22112.pdf
Size: 11.05 MB
Format: Adobe PDF

