Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1464
Title: Visual question answering
Authors: Garg, Prakrit
Shah, Rajiv Ratn (Advisor)
Keywords: Visual Question Answering
Public Transport
DTC Buses
Image Captioning
OCR models
BLIP-2
Optical Character Recognition
EasyOCR
MMOCR
Issue Date: 29-Nov-2023
Publisher: IIIT-Delhi
Abstract: Visual Question Answering (VQA) has applications in various emerging areas, ranging from medical imagery to video surveillance and assistance [1]. The VQA problem is a central component of the Artificial General Intelligence problem, i.e., creating a machine that can understand or learn any intellectual task that a human being can [2]. Geman et al. (2015) have also suggested using the VQA problem as a Visual Turing Test [2]. Our research explores the application of VQA in public transport. VQA can provide an interactive and accessible way for individuals with visual impairments or other disabilities to receive information about public transport. By using image recognition and natural language processing, VQA systems can describe surroundings, read signage, and provide real-time updates, making public transportation more accessible. The work began with capturing images of public transport, mainly DTC buses. We then used advanced image captioning models such as BLIP-2 ViT-G FlanT5 XL, together with robust OCR models like EasyOCR, PaddleOCR, and MMOCR, to capture information and extract text precisely from the images. The primary objective is to create an integrative system that combines the strengths of both image captioning and OCR technologies to enhance the understanding and accessibility of public transport information.
URI: http://repository.iiitd.edu.in/xmlui/handle/123456789/1464
Appears in Collections: Year-2023

Files in This Item:
File: BTP_Report_Research__Version_1937_ - Prakrit Garg.pdf (Restricted Access)
Size: 2.59 MB
Format: Adobe PDF

