Abstract:
Multimodal sarcasm explanation is a challenging natural language understanding task in which a model must recognize the semantic incongruity in a sarcastic message and resolve it to explain the message's implicit meaning. Processing multiple modalities is essential for this task because people often draw on cues from several different sources when interpreting sarcasm. The aim of this research was to explore different avenues of research in multimodal sarcasm analysis and to identify a promising one to pursue. This report details the work done so far, including a literature review and a baseline model implementation. That work has led to the direction currently being explored: using reinforcement learning techniques such as reinforcement learning from human feedback (RLHF) and reinforcement learning from AI feedback (RLAIF) to train multimodal sarcasm explanation models.