IIIT-Delhi Institutional Repository

Coupled deep learning for multi-modal retrieval

dc.contributor.author Keswani, Sumit
dc.contributor.author Singh, Richa (Advisor)
dc.contributor.author Vatsa, Mayank (Advisor)
dc.date.accessioned 2017-11-14T09:08:21Z
dc.date.available 2017-11-14T09:08:21Z
dc.date.issued 2017-04
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/588
dc.description.abstract In the past few years, cross-modal information retrieval has drawn much attention due to the significant growth of multimodal data. It takes one modality of data as the query and retrieves relevant data of other modalities; for example, a user can submit a text query to retrieve relevant images or videos. Since the query and its retrieved results may belong to different modalities, measuring the content similarity between different modalities of data remains a challenge. Existing solutions project data from the different modalities into a common latent space and then learn an independent mapping from one modality to another. In this paper, we propose a novel fully-coupled deep learning architecture that effectively exploits the inter-modal and intra-modal associations in heterogeneous data. The proposed learning objective captures the correlations between cross-modal data while preserving the intra-modal relationships. We also propose a training method that uses expectation maximization for learning the mapping function from one modality to the other. The proposed training method is memory efficient: large training datasets can be split into mini-batches for parameter updates. en_US
dc.language.iso en_US en_US
dc.subject Coupled deep learning en_US
dc.subject Alternate minimization en_US
dc.subject Information retrieval en_US
dc.title Coupled deep learning for multi-modal retrieval en_US
dc.type Other en_US
