dc.description.abstract |
YouTube is one of the most popular and largest video sharing websites (with social networking
features) on the Internet. A significant percentage of videos uploaded to YouTube contain
objectionable content and violate YouTube community guidelines. YouTube contains several
copyright-violating videos, commercial spam, videos promoting hate and extremism, vulgar and
pornographic material, and privacy-invading content. This is primarily due to the low publication
barrier and anonymity. We present an approach to identify privacy-invading harassment and
misdemeanour videos by mining the video metadata. We divide the problem into three sub-problems:
vulgar video detection, detection of abuse and violence in public places, and detection of ragging
videos in schools and colleges. We conduct a characterization study on a training dataset by downloading several
videos using the YouTube API and manually annotating the dataset. We define several discriminatory
features for recognizing the target class objects. We employ a one-class classifier approach
to detect objectionable videos and frame the problem as a recognition problem. Our empirical
analysis on the test dataset reveals that linguistic features (presence of certain terms and people in
the title and description of the main and related videos), popularity-based features, and the duration
and category of videos can be used to predict the video type. We validate our hypothesis by conducting
a series of experiments on an evaluation dataset acquired from YouTube. Empirical results reveal
that the accuracy of the proposed approach is more than 80%, demonstrating the effectiveness of the
approach. |
en_US |