Abstract:
Fake news refers to intentionally and verifiably false stories created to manipulate people's perceptions of reality. Fake news is destructive: it has been used to influence voting decisions and to spread hatred against religions, organizations, or individuals, resulting in violence or even death. It has also become a means to stir up and intensify social conflict. Fake news has existed for a long time, but what changed was the rise of Web 2.0 technologies and social media, which broadened communication horizons. Social media emerged as a multidisciplinary tool for exchanging information and ideas. However, there are always two sides to a coin, and social media is no exception. On the positive side, social media helps users generate content, which forms the backbone of mass interaction. The negative impact, however, is significantly more profound. First, the availability of the Internet and smartphones at nominal prices, together with the low entry barriers on such platforms, has given fake news a vast audience and allowed it to spread rapidly and widely. Second, social media platforms lack centralized gatekeeping to regulate the volume of generated content; as a result, online users fall prey to misleading stories, and individuals tend to accept information that supports their ideologies, preventing them from making rational decisions. Third, one can gain monetary benefits from such platforms by engaging the audience. Users are drawn to sensational and controversial content, so manipulators tend to generate fake news that attracts attention and engagement and is therefore more likely to spread on such platforms. It is thus essential to understand the nature of fake news spreading online, devise new technologies to combat it, analyze current detection methods, and improve intuitive understanding among online readers.
Hence, this PhD thesis addresses three fundamental challenges. First, we focus on devising different methods to Identify, i.e., detect, fake news online by extracting different feature sets from the given information; by designing foundational detection mechanisms, our work accelerates research innovation. Second, our research closely Inspects fake stories from two perspectives. From the information point of view, one can inspect fabricated content to identify the patterns of false stories disseminating over the web, the modalities used to create the fabricated content, and the platforms used for dissemination. To study these changing dynamics of fake news, we select India as the region of study and build an extensive dataset to aid researchers in investigating such issues. From the model point of view, we inspect the detection mechanisms used in prior work and their generalizability to other datasets. Third, the thesis suggests Intervention techniques to help internet users broaden their comprehension of fake news, and we discuss potential practical implications for social media platform owners and policymakers.
To address the first part of the thesis, we design different multimodal fake news detection baselines. Typically, a news article consists of a headline, content, a top image, and other corresponding images. We begin by designing SpotFake, a multimodal framework for fake news detection. Our proposed solution identifies fake news without relying on any additional subtasks. It exploits both the textual and visual features of an article: language models such as BERT learn contextual representations of the text, while image features are learned from VGG-19 pre-trained on the ImageNet dataset.
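As an illustration of this late-fusion principle, the following minimal sketch combines BERT text features with VGG-19 image features in a single binary classifier. It assumes PyTorch, torchvision, and HuggingFace Transformers; the class name, layer sizes, and concatenation-based fusion are illustrative choices, not the published SpotFake implementation.

import torch
import torch.nn as nn
from torchvision import models
from transformers import BertModel, BertTokenizer

class MultimodalFakeNewsClassifier(nn.Module):
    """Illustrative late-fusion model: BERT text features + VGG-19 image features."""
    def __init__(self, hidden_dim=256):
        super().__init__()
        # Text branch: pre-trained BERT; the [CLS] token serves as the article representation.
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        for p in self.bert.parameters():          # frozen here only to keep the sketch light
            p.requires_grad = False
        # Visual branch: VGG-19 pre-trained on ImageNet; the penultimate layer gives 4096-d features.
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
        self.vgg_backbone = nn.Sequential(vgg.features, vgg.avgpool, nn.Flatten(1),
                                          *list(vgg.classifier.children())[:-1])
        # Fusion and classification head (sizes are illustrative assumptions).
        self.text_proj = nn.Linear(768, hidden_dim)
        self.image_proj = nn.Linear(4096, hidden_dim)
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(hidden_dim, 2),             # two classes: real vs. fake
        )

    def forward(self, input_ids, attention_mask, image):
        text_out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        text_feat = self.text_proj(text_out.last_hidden_state[:, 0])    # [CLS] embedding
        image_feat = self.image_proj(self.vgg_backbone(image))
        fused = torch.cat([text_feat, image_feat], dim=-1)              # simple concatenation fusion
        return self.classifier(fused)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = MultimodalFakeNewsClassifier()
enc = tokenizer("Breaking: miracle cure discovered!", return_tensors="pt",
                truncation=True, max_length=64)
logits = model(enc["input_ids"], enc["attention_mask"], torch.randn(1, 3, 224, 224))

In practice, the input image would be resized to 224x224 and normalized with ImageNet statistics, and both branches could be fine-tuned jointly with a cross-entropy objective; the sketch freezes BERT only for compactness.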
Our proposed method outperforms the baselines by an average margin of 6% accuracy. Next, we present SpotFake+: A Multimodal Framework for Fake News Detection via Transfer Learning. It is a multimodal approach that leverages transfer learning to capture semantic and contextual information from news articles and their associated images, achieving better fake news detection performance. SpotFake+ is one of the first attempts at multimodal fake news detection on a dataset consisting of full-length articles. Next, we observed that most research has focused on detecting fake news by leveraging information from both modalities while ignoring the multiple visual signals present in a news sample. To address this, we propose Inter-modality Discordance for Multimodal Fake News Detection. The proposed method leverages information from multiple images in tandem with the text modality to perform multimodal fake news detection. The number of images varies from sample to sample, and our method accommodates this variation efficiently. We adopt an inter-modality discordance rationale, and the proposed model effectively captures the intra- and inter-modality relationships among the different modalities. Lastly, we observed that existing research captures high-level information from different modalities and models them jointly to reach a decision. Given multiple input modalities, we hypothesize that not all modalities may be equally responsible for decision-making. Hence, we present Leveraging Intra and Inter Modality Relationship for Multimodal Fake News Detection. Here, we design a novel architecture that effectively identifies and suppresses information from weaker modalities and extracts relevant information from the strong modality on a per-sample basis. We also capture the intra-modality relationship by first generating fragments of a modality and then learning fine-grained salient representations from these fragments.
In the first part of the thesis, we make numerous attempts to design methods that can effectively identify fake news. In the process, however, we observed that state-of-the-art methods report near-perfect performance yet fail to cope with the changing dynamics of fake news. The reasons could be twofold: the issue may reside in the information itself, or the designed method may be incapable of extracting the informative signals. Hence, in the second part of the thesis, we inspect fake news from two perspectives. From an information viewpoint, we study the changing dynamics of fake news over time. We selected India as the region from which to derive conclusions, as little effort has been made to study the menace of fake news in India. To this end, we built an extensive dataset, FactDrill: A Data Repository of Fact-Checked Social Media Content to Study Fake News Incidents in India. Using this dataset, one can investigate the changing dynamics of fake news in a multilingual setting in India. The resource aids in examining fake news at its core, i.e., investigating the different kinds of stories being disseminated, the modalities or combinations thereof used to create the fabricated content, and the platforms used for dissemination.
From a model viewpoint, we examine the apparent discrepancy between current research and real applications. We hypothesize that the performance claims of the current state of the art are significantly overestimated. The overestimation may stem from systematic biases in the datasets, with models exploiting these biases rather than genuine cues for detection. We conduct experiments that investigate the prior literature from the input-data perspective, studying statistical bias in the datasets. Our findings indicate that, although the reported performances are impressive, leveraging multiple modalities to detect fake news is far from solved (a minimal text-only probe illustrating this kind of analysis is sketched at the end of this abstract).
The final section of the thesis focuses on developing intervention strategies that enable readers to identify fake news. We design SachBoloPls, a system that validates news on Twitter in real time. It is an effort to curb the proliferation of already-debunked fake news online, make audiences aware of fact-checking organizations, and educate them about false viral claims. SachBoloPls consists of three independent components that can be extended to other social media and instant messaging platforms such as Instagram, WhatsApp, Facebook, and Telegram. The proposed prototype can also incorporate regional languages, making it a viable tool in the fight against fake news across India. Designing effective interventions can encourage social media users to exercise caution while reading or disseminating news online. Lastly, we discuss potential practical implications for social media platform owners and policymakers.
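Returning to the model-viewpoint analysis above, the sketch below illustrates one way such an input-data probe can be run: a text-only baseline is trained and its scores compared against published multimodal results. It assumes scikit-learn and pandas; the file name and column names are hypothetical, and this is only one example of a bias probe, not the exact experimental protocol of the thesis.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical CSV with one news sample per row and columns "text" and "label".
df = pd.read_csv("fake_news_dataset.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, stratify=df["label"], random_state=42)

# Text-only baseline: TF-IDF n-grams fed to logistic regression, with no visual input at all.
text_only = make_pipeline(TfidfVectorizer(max_features=50000, ngram_range=(1, 2)),
                          LogisticRegression(max_iter=1000))
text_only.fit(X_train, y_train)
preds = text_only.predict(X_test)

print(f"text-only accuracy: {accuracy_score(y_test, preds):.3f}")
print(f"text-only macro-F1: {f1_score(y_test, preds, average='macro'):.3f}")
# If these scores approach the published multimodal numbers on the same split,
# the dataset likely contains unimodal shortcuts rather than genuine cross-modal cues.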