IIIT-Delhi Institutional Repository

Unraveling representations for face recognition : from handcrafted to deep learning

dc.contributor.author Goswami, Gaurav
dc.contributor.author Singh, Richa (Advisor)
dc.contributor.author Vatsa, Mayank (Advisor)
dc.date.accessioned 2018-12-10T07:28:07Z
dc.date.available 2018-12-10T07:28:07Z
dc.date.issued 2018-11
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/702
dc.description.abstract Automatic face recognition in unconstrained environments is a popular and challenging research problem. As recognition algorithms have improved, the focus has shifted from addressing individual covariates to performing face recognition in truly unconstrained scenarios. Face databases such as YouTube Faces and the Point and Shoot Challenge simultaneously capture a wide array of challenges, including pose, expression, illumination, resolution, and occlusion. In general, every face recognition algorithm relies on some form of feature extraction to succinctly represent the most important characteristics of face images, so that machine learning techniques can distinguish the face images of one individual from those of others. This dissertation proposes novel feature extraction and fusion paradigms, along with improvements to existing methodologies, to address the challenge of unconstrained face recognition. It also presents a novel methodology to improve the robustness of such algorithms in a generalizable manner. We begin by addressing the challenge of utilizing face data captured by consumer-level RGB-D devices to improve face recognition performance without increasing operational cost. Images captured by such devices are of poorer quality than those from specialized 3D sensors. To address this, we propose a novel feature descriptor based on the entropy of RGB-D faces, along with a saliency feature obtained from the 2D face. Geometric facial attributes are also extracted from the depth image, and face recognition is performed by fusing the descriptor and attribute match scores. While score-level fusion does increase the robustness of the overall framework, it cannot exploit the additional information present at the feature level.
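As an illustration of the entropy component of the descriptor above, the sketch below computes block-wise Shannon entropy of an 8-bit grayscale image. This is a minimal sketch, not the thesis implementation: the block size, function name, and per-channel handling (the thesis descriptor operates on RGB-D data) are illustrative assumptions.

```python
import numpy as np

def block_entropy(img, block=8):
    """Shannon entropy (bits) of each non-overlapping block of an 8-bit image.

    Low-entropy blocks are nearly uniform; high-entropy blocks carry more
    texture, which is the intuition behind entropy-based face descriptors.
    """
    h, w = img.shape
    feats = []
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            patch = img[i:i + block, j:j + block]
            # Intensity histogram over the 256 possible 8-bit values.
            hist = np.bincount(patch.ravel(), minlength=256).astype(float)
            p = hist / hist.sum()
            p = p[p > 0]  # drop empty bins so log2 is defined
            feats.append(-np.sum(p * np.log2(p)))
    return np.array(feats)
```

Concatenating the per-block entropies yields a fixed-length vector that can be matched or fused with other features.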
To address this challenge, we need a feature-level fusion algorithm that can combine multiple features while preserving as much of this information as possible before the score computation stage. To accomplish this, we propose the Group Sparse Representation based Classifier (GSRC), which removes the need for a separate feature-level fusion mechanism and integrates multiple features seamlessly into classification. We also propose a kernelized extension of the GSRC that further improves its ability to separate classes with high inter-class similarity.

We next address the problem of efficiently using large amounts of video data for face recognition. A single video contains hundreds of frames; however, not all of them carry useful features for face recognition, and some may even degrade performance. With this in mind, we propose a novel face verification algorithm that first selects feature-rich frames from a video sequence using the discrete wavelet transform and entropy computation. Frame selection is followed by learning a joint representation with the proposed deep learning architecture, a combination of a stacked denoising sparse autoencoder and a deep Boltzmann machine. A multilayer neural network is used as the classifier to obtain the verification decision.

Currently, most highly accurate face recognition algorithms rely on deep learning based feature extraction. Such networks have been shown in the literature to be vulnerable to engineered adversarial attacks. We show that non-learning based, image-level distortions can also adversely affect the performance of these algorithms. We capitalize on how some of these errors propagate through the network to devise detection and mitigation methodologies that improve the real-world robustness of deep network based face recognition.
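The core idea behind sparse-representation classification of the GSRC family can be sketched as follows: dictionary columns (training faces) are grouped by class, a group-lasso problem is solved by proximal gradient descent, and the probe is assigned to the class whose coefficients best reconstruct it. This is a rough illustration under assumed defaults (regularization weight, iteration count, solver choice), not the thesis's GSRC implementation, which further fuses multiple feature types.

```python
import numpy as np

def gsrc_classify(X, labels, y, lam=0.1, iters=300):
    """Group-sparse representation classification (illustrative sketch).

    X: (d, n) dictionary with L2-normalized training faces as columns.
    labels: (n,) class label of each column.  y: (d,) probe feature.
    Solves 0.5*||y - Xw||^2 + lam * sum_g ||w_g||_2 (groups = classes)
    by proximal gradient, then picks the class with smallest residual.
    """
    n = X.shape[1]
    w = np.zeros(n)
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of grad
    classes = np.unique(labels)
    for _ in range(iters):
        grad = X.T @ (X @ w - y)            # gradient of the fit term
        z = w - step * grad
        for c in classes:                   # group soft-thresholding step
            idx = labels == c
            norm = np.linalg.norm(z[idx])
            if norm > 0:
                z[idx] *= max(0.0, 1.0 - step * lam / norm)
        w = z
    # Assign to the class whose coefficients best reconstruct the probe.
    resid = {c: np.linalg.norm(y - X[:, labels == c] @ w[labels == c])
             for c in classes}
    return min(resid, key=resid.get)
```

The group penalty encourages the solver to explain the probe using columns from as few classes as possible, which is what makes the residual a meaningful classification score.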
The proposed algorithm requires no re-training of existing networks and is not specific to a particular type of network. We evaluate the generalizability and efficacy of the approach by testing it with multiple networks and distortions, and observe favorable results that are consistently better than existing methodologies in all test cases. en_US
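The DWT-plus-entropy frame-selection stage described in the abstract can be sketched as follows, using a hand-rolled one-level Haar transform as a stand-in for a generic discrete wavelet transform and the entropy of the detail subbands as a feature-richness score. The scoring rule (summing subband entropies) and the histogram bin count are assumptions for illustration, not the thesis's exact criterion.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar transform: approximation + 3 detail subbands."""
    img = img[: img.shape[0] // 2 * 2, : img.shape[1] // 2 * 2].astype(float)
    a = (img[0::2, :] + img[1::2, :]) / 2   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2   # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2      # approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2      # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2      # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2      # diagonal detail
    return ll, (lh, hl, hh)

def subband_entropy(band, bins=64):
    """Shannon entropy of a subband's coefficient histogram."""
    hist, _ = np.histogram(band, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def select_frames(frames, k):
    """Indices of the k frames with the highest detail-subband entropy."""
    scores = []
    for f in frames:
        _, details = haar_dwt2(f)
        scores.append(sum(subband_entropy(b) for b in details))
    return sorted(np.argsort(scores)[-k:])
```

Blurry or near-uniform frames produce nearly empty detail subbands and score low, so only texture-rich frames are passed on to the representation-learning stage.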
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Face recognition en_US
dc.subject Deep learning en_US
dc.title Unraveling representations for face recognition : from handcrafted to deep learning en_US
dc.type Thesis en_US

