This paper discusses the problem of knowledge discovery in image databases with particular focus on the issues which arise when absolute ground truth is not available. It is often the case that a user exploring a large database is in search of items that are not easy to define completely. One way to circumvent this problem is to ask the user to specify examples of the item of interest. However, labelling of specific items may in itself be inconsistent, whether it is multiple labels from a single labeller at different times or labels from different labellers. The paper discusses issues which arise in terms of elicitation of subjective probabilistic opinion, estimation of basic scientific parameters of interest given probabilistic labels, learning from probabilistic labels, and effective evaluation of both user and algorithm performance in the absence of ground truth. The problem of searching the Magellan image data set in order to automatically locate and catalog small volcanoes on the...
Padhraic Smyth, Michael C. Burl, Usama M. Fayyad,