IIIT-Delhi Institutional Repository

Coupled deep learning for multi-modal retrieval

dc.contributor.author Keswani, Sumit
dc.contributor.author Singh, Richa (Advisor)
dc.contributor.author Vatsa, Mayank (Advisor)
dc.date.accessioned 2017-11-14T09:08:21Z
dc.date.available 2017-11-14T09:08:21Z
dc.date.issued 2017-04
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/588
dc.description.abstract In the past few years, cross-modal information retrieval has drawn much attention due to the significant growth of multimodal data. It takes one modality of data as the query and retrieves relevant data of other modalities; for example, a user can submit a text query to retrieve relevant images or videos. Since the query and its retrieved results may belong to different modalities, measuring the content similarity between different modalities of data remains a challenge. Existing solutions project data from the different modalities into a common latent space and then learn an independent mapping from one modality to another. In this paper, we propose a novel fully-coupled deep learning architecture that effectively exploits the inter-modal and intra-modal associations in heterogeneous data. The proposed learning objective captures the correlations between cross-modal data while preserving the intra-modal relationships. We also propose a training method that uses expectation maximization for learning the mapping function from one modality to the other. The proposed training method is memory efficient: large training datasets can be split into mini-batches for parameter updates. en_US
dc.language.iso en_US en_US
dc.subject Coupled deep learning en_US
dc.subject Alternate minimization en_US
dc.subject Information retrieval en_US
dc.title Coupled deep learning for multi-modal retrieval en_US
dc.type Other en_US
