Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1890
Full metadata record
DC Field: Value (Language)
dc.contributor.author: Popat, Harsh Parimal
dc.contributor.author: Mital, Harshil
dc.contributor.author: Shah, Rajiv Ratn (Advisor)
dc.date.accessioned: 2026-04-15T14:21:40Z
dc.date.available: 2026-04-15T14:21:40Z
dc.date.issued: 2024-11-27
dc.identifier.uri: http://repository.iiitd.edu.in/xmlui/handle/123456789/1890
dc.description.abstract: Over the course of the semester we worked on VertexVQA4k, a comprehensive multimodal dataset designed for secondary-level geometry education, drawing from Indian curricula. The dataset, containing approximately 4,000 geometric image-caption and question-answer pairs, emphasizes Numerical Answer Questions and Theorem Proving Questions, thereby broadening the scope and educational significance of multimodal numerical reasoning in Large Language Models (LLMs). VertexVQA4k distinguishes itself from existing geometry datasets by providing dual solution approaches for each problem, aiming to enhance problem-solving skills and model comprehension. The paper details the meticulous dataset extraction and augmentation processes, including diagram description generation and solution regeneration, to improve the capabilities of multimodal LLMs in geometric problem-solving. The paper also explores hallucination in Large Vision Language Models (LVLMs) and proposes mitigation strategies. Furthermore, it delves into image captioning, stressing the importance of generating meaningful visual representations and coherent captions. The study concludes with an evaluation of the dataset and models, underscoring the efficacy of VertexVQA4k in advancing multimodal learning and reasoning in the LLMs. (en_US)
dc.language.iso: en_US (en_US)
dc.publisher: IIIT-Delhi (en_US)
dc.subject: Maths Reasoning (en_US)
dc.subject: Multimodal Dataset (en_US)
dc.subject: Large Vision Language models (en_US)
dc.title: VertexVQA4k: enhancing large language model proficiency: advanced datasets for solving complex geometric problems (en_US)
dc.type: Other (en_US)
Appears in Collections: Year-2024

Files in This Item:
File: 2021048_2021050_BTP_Report_end - Harsh Parimal Popat.pdf
Description: Restricted Access
Size: 3.1 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.