Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1750
Title: Multi-modal fusion transformer for understanding digital advertisements
Authors: Khurana, Varun
Shah, Rajiv Ratn (Advisor)
Keywords: Digital advertisements
Digital marketing
Multi-modal content understanding
Advertisement understanding
Transformer
Cross attention
Issue Date: 10-May-2023
Publisher: IIIT-Delhi
Abstract: In today’s world, digital-born media, especially advertisements, have a substantial influence on our daily lives, from persuading us to buy particular brands to creating awareness about a social or environmental cause. This work proposes LearnAd, a learning method for the challenging task of understanding advertisements. Marketing graphics such as advertisements are digitally born, multimodal (containing both text and visual content), and employ rhetorical devices such as emotions, symbolism, and slogans to convey meaning. In contrast, most current work in visual content understanding targets camera-shot images and does not translate well to marketing graphics. To address this gap, we propose using human content interaction patterns, in the form of eye movements, to fine-tune the understanding of a Vision Transformer (ViT). This helps LearnAd, a multimodal transformer-based cross-attention model, achieve state-of-the-art results on three advertisement understanding tasks: generation of the action that an ad persuades a user to take and the reason it provides for the action (the what-why of the ad), prediction of the sentiment of the advertisement image, and prediction of its topic. Despite the lack of real customer gaze patterns over marketing images, LearnAd achieves these results with the help of generated human saliency patterns.
URI: http://repository.iiitd.edu.in/xmlui/handle/123456789/1750
Appears in Collections: Year-2023

Files in This Item:
File: BTP_Report_2019124_Winter_2023.pdf
Access: Restricted
Size: 9.48 MB
Format: Adobe PDF

