
Learning Visual Compound Models from Parallel Image-Text Datasets

Abstract. In this paper, we propose a new approach to learning structured visual compound models from shape-based feature descriptions. We use captioned text to drive the grouping of boundary fragments detected in an image. In the learning framework, we transfer several techniques from computational linguistics to the visual domain and build on previous work in image annotation. A statistical translation model is used to establish links between caption words and image elements. Compounds are then iteratively built up using a mutual information measure. Relations between compound elements are extracted automatically and increase the discriminability of the visual models. We show results on different synthetic and realistic datasets to validate our approach.
Type Conference
Year 2008
Where DAGM
Publisher Springer
Authors Jan Moringen, Sven Wachsmuth, Sven J. Dickinson, Suzanne Stevenson