We study solutions to the problem of evaluating image similarity in the context of content-based image retrieval (CBIR). Retrieval is formulated as a classification problem, where the goal is to minimize the probability of retrieval error. We show that this formulation establishes a common ground for comparing similarity functions, exposes the assumptions hidden behind most of those in common use, enables a critical analysis of their relative merits, and determines the retrieval scenarios for which each is best suited. We conclude that most current similarity functions are sub-optimal special cases of the Bayesian criterion that results from explicit minimization of the error probability.