Abstract:
In a diverse country like India, socio-economic factors such as religion, caste, language, and income, along with physical and professional attributes, play a vital role in the search for a spouse. With the surge in Internet connectivity, online matrimonial websites have become hugely popular for meeting this need. Most users registered on these portals genuinely intend to find a life partner; however, the portals also attract a small number of profiles created without that intention. Such profiles are known as fake profiles. They degrade the user experience and cause revenue loss for the online matrimony business. To study this problem, we take India’s leading matrimony site as a use case and examine how fake and genuine accounts differ in behaviour, edit history, and profile information. In this thesis, we present a machine learning based approach to identifying fake profiles on online matrimony sites. Because labelled examples of fake users are scarce, we frame the task as an anomaly detection problem and use an autoencoder, a model widely used for anomaly detection. We capture each user’s behaviour, profile information, and edit history as features and train the autoencoder to reconstruct the features of genuine profiles only. At prediction time, the autoencoder yields a small reconstruction error for genuine profiles and a much larger reconstruction error for fake profiles, which allows the fake profiles to be detected. The proposed system achieves 91.76% accuracy with 90.2% recall for the fake class. To the best of our knowledge, this is the first study on detecting fake profiles in the online matrimony domain.
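
The reconstruction-based detection described above can be illustrated with a minimal sketch. It is not the system evaluated in this thesis: the synthetic six-feature data, the `MIX` matrix, the single tanh hidden layer, the learning rate, and the 95th-percentile threshold are all assumptions chosen for illustration. The autoencoder is trained only on genuine-like samples, and a profile is flagged as fake when its reconstruction error exceeds a threshold derived from the training errors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: genuine profiles have 6 features driven by 2 latent
# factors; fake profiles ignore that structure. Real features would come
# from behaviour, profile, and edit-history logs.
MIX = np.array([[1.0, 0.5, -0.5, 0.0, 1.0, 0.0],
                [0.0, 1.0, 0.5, -1.0, 0.0, 0.5]])

def sample_genuine(n):
    latent = rng.normal(size=(n, 2))
    return latent @ MIX + 0.05 * rng.normal(size=(n, 6))

def sample_fake(n):
    return rng.normal(scale=3.0, size=(n, 6))

train = sample_genuine(500)
mean, std = train.mean(axis=0), train.std(axis=0)

def scale(x):
    # Standardize with statistics of the genuine training set only.
    return (x - mean) / std

def train_autoencoder(x, hidden=2, lr=0.01, epochs=2000):
    """Fit a tanh-encoder, linear-decoder autoencoder by gradient descent."""
    n, d = x.shape
    w_enc = 0.1 * rng.normal(size=(d, hidden))
    w_dec = 0.1 * rng.normal(size=(hidden, d))
    for _ in range(epochs):
        h = np.tanh(x @ w_enc)
        grad_out = 2.0 * (h @ w_dec - x) / n           # d(loss)/d(reconstruction)
        grad_dec = h.T @ grad_out
        grad_h = (grad_out @ w_dec.T) * (1.0 - h**2)   # backprop through tanh
        grad_enc = x.T @ grad_h
        w_enc -= lr * grad_enc
        w_dec -= lr * grad_dec
    return w_enc, w_dec

def reconstruction_error(x, w_enc, w_dec):
    recon = np.tanh(x @ w_enc) @ w_dec
    return ((recon - x) ** 2).sum(axis=1)

w_enc, w_dec = train_autoencoder(scale(train))

# Assumed decision rule: flag anything above the 95th percentile of
# reconstruction errors seen on the genuine training profiles.
threshold = np.percentile(reconstruction_error(scale(train), w_enc, w_dec), 95)

def is_fake(x):
    return reconstruction_error(scale(x), w_enc, w_dec) > threshold

genuine_flagged = is_fake(sample_genuine(200)).mean()
fake_flagged = is_fake(sample_fake(200)).mean()
print(f"genuine flagged: {genuine_flagged:.2%}, fake flagged: {fake_flagged:.2%}")
```

Setting the threshold from genuine data alone mirrors the premise that labelled fake examples are scarce; in practice the percentile would be tuned on whatever labelled profiles are available.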