Abstract:
This project report investigates the factors underlying hate speech through an in-depth literature analysis of psychological studies, and identifies intent and implication as critical aspects shaping hate speech dynamics. The main goal of the study is to enhance existing hate speech detection models by incorporating these two factors. An initial analysis of retrieval models highlights the difficulty of achieving balanced counterspeech, which motivates an exploration of generative models. Contrary to expectations, the generative models do not deliver the expected improvement in performance. The report analyzes these findings in detail and evaluates the different frameworks employed in the research process.