We propose a semi-supervised model that segments and annotates images using very few labeled images and a large unaligned text corpus to relate image regions to text labels. Given photos of a sports event, all that is needed to produce a pixel-level labeling of objects and background is a set of newspaper articles about that sport and one to five labeled images. Our model is motivated by the observation that words in text corpora share certain context and feature similarities with visual objects. We describe images using visual words, a new region-based representation. The proposed model is based on kernelized canonical correlation analysis, which finds a mapping between visual and textual words by projecting them into a latent meaning space. Kernels are derived from context and adjective features within the respective visual and textual domains. We apply our method to a challenging dataset and rely on articles from the New York Times for textual features. Our model outperforms the state of the art.
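To make the core projection step concrete, the following is a minimal sketch of regularized kernelized CCA in the dual, not the paper's implementation: it assumes paired visual/textual training samples, uses linear kernels on randomly generated stand-in feature vectors in place of the context and adjective features, and the names `kcca`, `center_kernel`, and the `reg` parameter are hypothetical. Maximizing the correlation α'KxKyβ under the normalizations α'(Kx²+regI)α = β'(Ky²+regI)β = 1 leads to a symmetric generalized eigenproblem, whose top eigenvectors give the dual coefficients that project both modalities into a shared latent meaning space.

```python
import numpy as np
from scipy.linalg import eigh

def center_kernel(K):
    """Double-center a kernel matrix: K <- H K H with H = I - 11'/n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca(Kx, Ky, n_components=10, reg=1e-3):
    """Regularized kernel CCA (dual form).

    Solves the generalized eigenproblem
        [0      Kx Ky] [a]       [Kx^2 + reg*I   0           ] [a]
        [Ky Kx  0    ] [b] = rho [0              Ky^2 + reg*I] [b]
    and returns dual coefficients (alpha, beta) for the top components.
    """
    n = Kx.shape[0]
    Z = np.zeros((n, n))
    A = np.block([[Z, Kx @ Ky], [Ky @ Kx, Z]])
    B = np.block([[Kx @ Kx + reg * np.eye(n), Z],
                  [Z, Ky @ Ky + reg * np.eye(n)]])
    vals, vecs = eigh(A, B)                 # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_components]   # keep the most correlated directions
    return top[:n], top[n:]                 # alpha (visual), beta (textual)

# Toy usage with correlated random features standing in for the real ones.
rng = np.random.default_rng(0)
Xv = rng.normal(size=(50, 20))              # visual-word feature vectors
Xt = Xv @ rng.normal(size=(20, 30))         # correlated textual-word features
Kx = center_kernel(Xv @ Xv.T)               # linear kernels as a stand-in
Ky = center_kernel(Xt @ Xt.T)
alpha, beta = kcca(Kx, Ky, n_components=5)
vis_latent = Kx @ alpha                     # visual words in the latent space
txt_latent = Ky @ beta                      # textual words in the latent space
# A visual word can then be annotated with its nearest textual word in this space.
```

Once both vocabularies live in the same latent space, annotation reduces to nearest-neighbor matching between projected visual words and projected text labels; in our unaligned setting the few labeled images supply the correspondences needed to fit this mapping.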