IIIT-Delhi Institutional Repository

Content moderation across multiple platforms with capsule networks and co-training

dc.contributor.author Agarwal, Vani
dc.contributor.author Buduru, Arun Balaji (Advisor)
dc.contributor.author Kumaraguru, Ponnurangam (Advisor)
dc.date.accessioned 2020-05-31T15:14:16Z
dc.date.available 2020-05-31T15:14:16Z
dc.date.issued 2019-05
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/809
dc.description.abstract Social media systems provide a platform for users to freely express their thoughts and opinions. Although this property offers incredible and unique communication opportunities, it also brings important challenges. Content that constitutes hate speech, abuse, or harmful intent often proliferates on online platforms. Since problematic content reduces the health of a platform and negatively affects user experience, communities have terms of usage or community norms in place, which, when violated by a user, lead to moderation action against that user by the platform. Unfortunately, the scale at which these platforms operate makes manual content moderation nearly impossible, leading to the need for automated or semi-automated content moderation systems. Multiple methods exist for understanding the prevalence and impact of such content, including supervised machine learning and deep learning models. Despite the vast interest in the theme and the wide popularity of some methods, it is unclear which model is most suitable for a given platform, since there have been few benchmarking efforts for moderated content. To that end, we compare existing approaches used for automatic moderation of multimodal content on five online platforms: Twitter, Reddit, Wikipedia, Quora, and Whisper. In addition to investigating existing approaches, we propose a novel Capsule Network-based method that performs better due to its ability to understand hierarchical patterns. In practical scenarios, labeling large-scale data to train new models for a different domain or platform is a cumbersome task. Therefore, we enrich our existing pre-trained model with a minimal number of labeled examples from a different domain to create a co-trained model for the new domain. We perform a cross-platform analysis using different models to identify which model performs better.
Finally, we analyze all methods, both qualitatively and quantitatively, to gain a deeper understanding of model performance, concluding that our method shows an increase of 10% in average precision. We also find that the co-trained models perform well despite having less training data and may be considered a cost-effective solution. en_US
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Content moderation en_US
dc.subject Capsule network en_US
dc.subject Co-training en_US
dc.title Content moderation across multiple platforms with capsule networks and co-training en_US
dc.type Thesis en_US

