For effective retrieval of visual information, statistical learning plays a pivotal role. In this context, statistical learning faces at least two major mathematical challenges: scarcity of training data and imbalance of training classes. We present these challenges and outline our methods for addressing them: active learning, recursive subspace co-training, adaptive dimensionality reduction, class-boundary alignment, and quasi-bagging.

1 Overview

The principal design goal of a visual information retrieval system is to return data (images or video clips) that accurately match users' query concepts. To achieve this goal, the system must first comprehend a user's query concept thoroughly, and then accurately find data in the low-level input space that match the concept. Statistical learning techniques can help achieve this goal via two complementary avenues: semantic annotation and query-concept learning. Semantic annotation provides visual data with semanti...
Edward Y. Chang, Beitao Li, Gang Wu, Kingshy Goh