Mining Multiple Visual Appearances of Semantics for Image Annotation



This paper investigates the problem of learning the visual semantics of keyword categories for automatic image annotation. Supervised learning algorithms that learn only a single concept point per category are limited in their effectiveness for image annotation. We propose using data mining techniques to mine multiple concepts, each of which may consist of one or more visual parts, to capture the diverse visual appearances of a single keyword category. For training, we use the Apriori principle to efficiently mine a set of frequent blobsets that capture the semantics of a rich and diverse visual category. Each concept is ranked by a discriminative or diverse density measure. For testing, we propose a level-sensitive matching scheme to rank words for an unannotated image. Our approach is effective, scales better during training and testing, and is efficient in both learning and annotation.

Key words: Image Annotation, Multiple-Instance Learning, Apriori
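The Apriori-based training step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each training image of a keyword category is already represented as a set of integer blob ids (a hypothetical encoding), and the function name `mine_frequent_blobsets` and the `min_support` parameter are made up for illustration. It covers only the frequent-blobset mining; the concept ranking and the level-sensitive matching described in the abstract are not shown.

```python
from itertools import combinations


def mine_frequent_blobsets(images, min_support):
    """Return {frozenset of blob ids: support count} for every frequent blobset.

    images      -- iterable of sets of blob ids (assumed encoding, one set per image)
    min_support -- minimum number of images that must contain a blobset
    """
    transactions = [frozenset(img) for img in images]

    # Level 1: count how many images contain each single blob.
    counts = {}
    for t in transactions:
        for blob in t:
            key = frozenset([blob])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-blobsets into k-blobset candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step (Apriori principle): every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count support only for the surviving candidates.
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        frequent.update(current)
        k += 1

    return frequent


if __name__ == "__main__":
    toy_images = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 2, 4}]
    for blobset, support in sorted(
        mine_frequent_blobsets(toy_images, 2).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(blobset), support)
```

The pruning step is what keeps the search tractable: a blobset is only counted against the images once all of its smaller subsets are known to be frequent, so most of the exponentially many combinations are never examined.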

Hung-Khoon Tan, Chong-Wah Ngo
