We propose a method for measuring the quality of a grouping result, based on the following observation: a better grouping result provides more information about the true, unknown grouping. The amount of information is evaluated by an automatic procedure that, guided by the given hypothesized grouping, generates (homogeneity) queries about the true grouping and answers them using an oracle. The process terminates once the queries suffice to specify the true grouping. The number of queries is a measure of the hypothesis's non-informativeness. A relation between the query count and the (probabilistically characterized) uncertainty of the true grouping is established and experimentally supported. The proposed information-based quality measure is free from arbitrary choices, treats different types of grouping errors uniformly, and does not favor any algorithm. We also found that it approximates human judgment better than other methods and gives better results when used to optimize a seg...
Erik A. Engbers, Michael Lindenbaum, Arnold W. M.
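
The query-counting idea in the abstract can be illustrated with a small sketch. It is not the authors' procedure: the bisection splitting strategy, the representative-based merging step, and all names (`make_oracle`, `split_homogeneous`, `query_count`) are assumptions made here for illustration. A homogeneity query asks whether a set of elements lies entirely within one true group, and the oracle answers from the ground truth.

```python
def make_oracle(truth):
    """Build a homogeneity oracle over a ground-truth labeling (hypothetical helper).

    Returns (ask, calls): ask(items) is True iff all items share one true
    label, and calls["n"] counts how many queries have been asked.
    """
    calls = {"n": 0}

    def ask(items):
        calls["n"] += 1
        return len({truth[i] for i in items}) == 1

    return ask, calls


def split_homogeneous(items, ask):
    """Bisect a set until every piece is confirmed homogeneous.

    Bisection is an assumed strategy, chosen only for simplicity.
    """
    if len(items) == 1:
        return [items]  # a singleton is trivially homogeneous: no query needed
    if ask(items):
        return [items]
    mid = len(items) // 2
    return split_homogeneous(items[:mid], ask) + split_homogeneous(items[mid:], ask)


def query_count(hypothesis, truth):
    """Count the queries needed to recover the true grouping from a hypothesis."""
    ask, calls = make_oracle(truth)

    # Step 1: refine each hypothesized group into homogeneous pieces.
    pieces = []
    for group in hypothesis:
        pieces.extend(split_homogeneous(list(group), ask))

    # Step 2: merge pieces of the same true group. Since every piece is
    # homogeneous, comparing one representative per piece is sufficient.
    merged = []
    for piece in pieces:
        for target in merged:
            if ask([target[0], piece[0]]):
                target.extend(piece)
                break
        else:
            merged.append(list(piece))

    return calls["n"], merged


truth = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
good_count, _ = query_count([[0, 1, 2], [3, 4, 5]], truth)    # matches the truth
poor_count, _ = query_count([[0, 3], [1, 4], [2, 5]], truth)  # mixes the groups
print(good_count, poor_count)
```

Under these assumptions, the hypothesis that matches the true grouping needs fewer queries than the one that mixes the groups, which is the sense in which a lower query count indicates a more informative hypothesis.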