IIIT-Delhi Institutional Repository

Deep learning based multitask learning for first-person videos

Show simple item record

dc.contributor.author Gupta, Divam
dc.contributor.author Arora, Chetan (Advisor)
dc.date.accessioned 2018-09-25T07:38:00Z
dc.date.available 2018-09-25T07:38:00Z
dc.date.issued 2016-07-18
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/678
dc.description.abstract First-person videos captured from wearable cameras are growing in popularity. Standard algorithms developed for third-person videos often do not work for such egocentric videos because of the drastic change in camera perspective as well as the unavailability of common cues such as the actor's pose. In the last few years, researchers have developed various deep neural network models for a variety of first-person tasks, such as action detection, object classification, hand detection, and pose classification. These models are often constrained by the limited amount of annotated training data as well as the inherently wide variation in egocentric tasks and contexts. In this paper we propose a multi-task learning framework that allows the model to learn various egocentric cues automatically by explicitly training for multiple egocentric tasks together. Joint training allows cues from the different tasks to fuse with each other and exploits the training samples available for each task. We show that our approach simultaneously improves upon state-of-the-art accuracy on all the trained tasks. We also show that the proposed model extends easily to newer tasks with scarce data. en_US
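The abstract describes joint training of a single model on several egocentric tasks so that per-task cues fuse in shared features. As a minimal illustrative sketch (not the thesis's actual architecture; all dimensions, names, and the two example heads are hypothetical), a multitask network can be written as a shared feature extractor feeding task-specific heads, with the per-task losses summed into one joint training objective:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
D_IN, D_SHARED = 128, 64
N_ACTIONS, N_OBJECTS = 10, 5  # e.g. action-detection and object-classification heads

# Shared backbone weights plus one output head per task.
W_shared = rng.normal(0, 0.1, (D_IN, D_SHARED))
W_action = rng.normal(0, 0.1, (D_SHARED, N_ACTIONS))
W_object = rng.normal(0, 0.1, (D_SHARED, N_OBJECTS))

def forward(x):
    """Shared ReLU features fuse cues across tasks; each head predicts one task."""
    h = np.maximum(x @ W_shared, 0.0)
    return h @ W_action, h @ W_object  # per-task logits

def joint_loss(action_logits, object_logits, y_action, y_object):
    """Sum of per-task cross-entropies: every sample trains the shared backbone."""
    def xent(logits, y):
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        return -np.log(p[np.arange(len(y)), y]).mean()
    return xent(action_logits, y_action) + xent(object_logits, y_object)

x = rng.normal(size=(4, D_IN))  # a mini-batch of frame features
a_logits, o_logits = forward(x)
loss = joint_loss(a_logits, o_logits,
                  rng.integers(0, N_ACTIONS, 4),
                  rng.integers(0, N_OBJECTS, 4))
```

Because both heads backpropagate through `W_shared`, training samples from either task improve the shared representation, which is the mechanism the abstract credits for handling tasks with scarce data.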
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Image analysis en_US
dc.subject Machine learning en_US
dc.subject Video analysis en_US
dc.title Deep learning based multitask learning for first-person videos en_US
dc.type Other en_US
