Learning Cross-modality Similarity for Multinomial Data

Many applications involve multiple modalities, such as text and images, that describe the problem of interest. In order to leverage the information present in all the modalities, one must model the relationships between them. While some techniques have been proposed to tackle this problem, they are either restricted to words describing visual objects only, or require full correspondences between the different modalities. As a consequence, they are unable to handle more realistic scenarios where a narrative text is only loosely related to an image, and where only a few image-text pairs are available. In this paper, we propose a model that addresses both of these challenges. Our model can be seen as a Markov random field of topic models, which connects the documents based on their similarity. As a consequence, the topics learned with our model are shared across connected documents, thus encoding the relations between different modalities. We demonstrate the effectiveness of our model for im...
Added 11 Dec 2011
Updated 11 Dec 2011
Type Conference
Year 2011
Where ICCV
Authors Yangqing Jia, Mathieu Salzmann, Trevor Darrell
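
The abstract describes the model only at a high level: documents from different modalities are linked in a Markov random field, and linked documents share topics. As a rough illustration of the underlying idea (scoring cross-modality similarity through shared multinomial topic distributions), the Python sketch below infers a topic mixture for a text document and for an image represented as a bag of visual words, then compares the two mixtures. The inference routine, the symmetric-KL score, and every name in the code are illustrative assumptions, not the paper's actual model.

# Toy sketch, not the authors' implementation: each document, whether text or
# a bag of visual words, is summarized by a multinomial distribution over K
# topics assumed to be shared across modalities; cross-modality similarity is
# then a symmetric divergence between those topic mixtures.
import numpy as np

def topic_distribution(word_counts, topic_word, alpha=0.1, n_iters=50):
    """Infer a document's topic mixture with simple EM-style updates.

    word_counts : (V,) counts of words or visual words in the document.
    topic_word  : (K, V) per-topic word probabilities (assumed given).
    """
    K = topic_word.shape[0]
    theta = np.full(K, 1.0 / K)
    for _ in range(n_iters):
        # responsibility of each topic for each word type
        resp = theta[:, None] * topic_word                 # (K, V)
        resp /= resp.sum(axis=0, keepdims=True) + 1e-12
        theta = resp @ word_counts + alpha                 # soft counts plus prior
        theta /= theta.sum()
    return theta

def cross_modal_similarity(theta_a, theta_b):
    """Symmetric KL-based similarity between two topic mixtures."""
    eps = 1e-12
    kl = lambda p, q: np.sum(p * np.log((p + eps) / (q + eps)))
    return np.exp(-0.5 * (kl(theta_a, theta_b) + kl(theta_b, theta_a)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, V_text, V_img = 5, 100, 80
    # Hypothetical shared topics, one word distribution per modality.
    topics_text = rng.dirichlet(np.ones(V_text), size=K)
    topics_img = rng.dirichlet(np.ones(V_img), size=K)
    text_doc = rng.integers(0, 5, size=V_text)
    img_doc = rng.integers(0, 5, size=V_img)
    s = cross_modal_similarity(
        topic_distribution(text_doc, topics_text),
        topic_distribution(img_doc, topics_img),
    )
    print(f"cross-modality similarity: {s:.3f}")

In the paper's setting, the coupling between a text document and an image would come from the Markov random field connecting them during learning, rather than from a post-hoc divergence as in this toy example.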