IIIT-Delhi Institutional Repository

Deep learning based multitask learning for first-person videos

Show simple item record

dc.contributor.author Gupta, Divam
dc.contributor.author Arora, Chetan (Advisor)
dc.date.accessioned 2018-09-25T07:38:00Z
dc.date.available 2018-09-25T07:38:00Z
dc.date.issued 2016-07-18
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/678
dc.description.abstract First-person videos captured from wearable cameras are growing in popularity. Standard algorithms developed for third-person videos often do not work for such egocentric videos because of the drastic change in camera perspective as well as the unavailability of common cues such as the actor's pose. In the last few years, researchers have developed various deep neural network models for a variety of first-person tasks, such as action detection, object classification, hand detection, and pose classification. These models are often constrained by the limited amount of annotated training data as well as the inherently wide variation in egocentric tasks and contexts. In this paper we propose a multi-task learning framework that allows the model to learn various egocentric cues automatically by explicitly training for multiple egocentric tasks together. Joint training allows cues from the different tasks to fuse with each other and exploits the training samples available for each task. We show that our approach simultaneously improves upon state-of-the-art accuracy on all the trained tasks. We also show that the proposed model extends easily to newer tasks with scarce data. en_US
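The abstract describes joint training of a single model on several egocentric tasks so that per-task cues fuse in shared features. As a minimal illustrative sketch (not the thesis's actual architecture; all dimensions, names, and the two example heads are hypothetical), a multitask network can be written as a shared feature extractor feeding task-specific heads, with the per-task losses summed into one joint training objective:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
D_IN, D_SHARED = 128, 64
N_ACTIONS, N_OBJECTS = 10, 5  # e.g. action-detection and object-classification heads

# Shared backbone weights plus one output head per task.
W_shared = rng.normal(0, 0.1, (D_IN, D_SHARED))
W_action = rng.normal(0, 0.1, (D_SHARED, N_ACTIONS))
W_object = rng.normal(0, 0.1, (D_SHARED, N_OBJECTS))

def forward(x):
    """Shared ReLU features fuse cues across tasks; each head predicts one task."""
    h = np.maximum(x @ W_shared, 0.0)
    return h @ W_action, h @ W_object  # per-task logits

def joint_loss(action_logits, object_logits, y_action, y_object):
    """Sum of per-task cross-entropies: every sample trains the shared backbone."""
    def xent(logits, y):
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        return -np.log(p[np.arange(len(y)), y]).mean()
    return xent(action_logits, y_action) + xent(object_logits, y_object)

x = rng.normal(size=(4, D_IN))  # a mini-batch of frame features
a_logits, o_logits = forward(x)
loss = joint_loss(a_logits, o_logits,
                  rng.integers(0, N_ACTIONS, 4),
                  rng.integers(0, N_OBJECTS, 4))
```

Because both heads backpropagate through `W_shared`, training samples from either task improve the shared representation, which is the mechanism the abstract credits for handling tasks with scarce data.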
dc.language.iso en_US en_US
dc.publisher IIIT-Delhi en_US
dc.subject Image analysis en_US
dc.subject Machine learning en_US
dc.subject Video analysis en_US
dc.title Deep learning based multitask learning for first-person videos en_US
dc.type Other en_US
