Most image retrieval systems only allow a fragment of text or an example image as a query. Most users have more complex information needs that are not easily expressed in either of these forms. This paper proposes a model based on the Inference Network framework from information retrieval that employs a powerful query language that allows structured query operators, term weighting, and the combination of text and images within a query. The model uses non-parametric methods to estimate probabilities within the inference network. Image annotation and retrieval results are reported and compared against other published systems and illustrative structured and weighted query results are given to show the power of the query language. The resulting system both performs well and is robust compared to existing approaches.
Donald Metzler, R. Manmatha