Using Language to Learn Structured Appearance Models for Image Annotation

Abstract: Given an unstructured collection of captioned images of cluttered scenes featuring a variety of objects, our goal is to simultaneously learn the names and appearances of the objects. Only a small fraction of local features within any given image are associated with a particular caption word, and captions may contain irrelevant words not associated with any image object. We propose a novel algorithm that uses the repetition of feature neighborhoods across training images and a measure of correspondence with caption words to learn meaningful feature configurations (representing named objects). We also introduce a graph-based appearance model that captures some of the structure of an object by encoding the spatial relationships among the local visual features. In an iterative procedure we use language (the words) to drive a perceptual grouping process that assembles an appearance model for a named object. Results of applying our method to three data sets in a variety of condi...

Michael Jamieson, Afsaneh Fazly, Suzanne Stevenson

Appearance Model | Caption Words | Image | PAMI 2010

Post Info

Added	29 Jan 2011
Updated	29 Jan 2011
Type	Journal
Year	2010
Where	PAMI
Authors	Michael Jamieson, Afsaneh Fazly, Suzanne Stevenson, Sven J. Dickinson, Sven Wachsmuth
