Show simple item record

dc.contributor.author Garg, Prakrit
dc.contributor.author Shah, Rajiv Ratn (Advisor)
dc.date.accessioned 2024-05-15T11:00:57Z
dc.date.available 2024-05-15T11:00:57Z
dc.date.issued 2023-11-29
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/1464
dc.description.abstract Visual Question Answering (VQA) has applications in a range of emerging areas, from medical imagery to video surveillance and assistance [1]. The VQA problem is a central component of the Artificial General Intelligence problem, i.e., creating a machine that can understand or learn any intellectual task that a human being can [2]. Geman et al. (2015) have also suggested using the VQA problem as a Visual Turing Test [2]. Our research explores the application of VQA in public transport. VQA can provide an interactive and accessible way for individuals with visual impairments or other disabilities to receive information about public transport. By using image recognition and natural language processing, VQA systems can describe surroundings, read signage, and provide real-time updates, making public transportation more accessible. The work started by capturing images of public transport, mainly DTC buses. It uses advanced image captioning models such as BLIP-2 ViT-G FlanT5 XL to capture information, and robust OCR models like EasyOCR, PaddleOCR, and MMOCR to extract text precisely from images. The primary objective is to create an integrative system that combines the strengths of both image captioning and OCR technologies to enhance the understanding and accessibility of public transport information. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Visual Question Answering en_US
dc.subject Public Transport en_US
dc.subject DTC Buses en_US
dc.subject Image Captioning en_US
dc.subject OCR models en_US
dc.subject BLIP-2 en_US
dc.subject Optical Character Recognition en_US
dc.subject EasyOCR en_US
dc.subject MMOCR en_US
dc.title Visual question answering en_US
dc.type Other en_US