The visual appearance of an image is closely associated with its low-level features. Identifying the set of features that best characterizes an image is useful for tasks such as content-based image indexing and retrieval. In this paper, we present a method that simultaneously models and clusters large sets of images and their low-level visual features. We first construct a computational energy function suited to co-clustering images and their features, and then develop a stochastic algorithm based on the Hopfield model to optimize it. We apply the method to cluster digital color photographs and present results that demonstrate its usefulness and effectiveness.
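
To make the idea of energy-based co-clustering concrete, the following is a minimal illustrative sketch, not the formulation used in this paper. It assumes a nonnegative image-by-feature matrix `A`, a hypothetical modularity-style energy that rewards placing an image and a feature in the same cluster when their association exceeds what the row and column totals would predict, and an annealed stochastic update in the spirit of a multi-state (Potts-type) Hopfield network. The names `co_cluster` and `sample_label`, the energy itself, and the temperature schedule are all assumptions made for illustration.

```python
import numpy as np


def sample_label(field, t, rng):
    """Draw a label with Boltzmann probability proportional to exp(field / t)."""
    logits = (field - field.max()) / t  # shift by the max for numerical stability
    p = np.exp(logits)
    p /= p.sum()
    return rng.choice(len(field), p=p)


def co_cluster(A, k, sweeps=200, t_start=1.0, t_end=0.01, seed=0):
    """Jointly assign rows (images) and columns (features) of A to k clusters.

    Assumed energy (illustrative only):
        E = -sum_{i,j} B[i, j] * [rows[i] == cols[j]]
    where B is A centered by its expected value under independent row and
    column totals.
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    n, m = A.shape
    B = A - np.outer(A.sum(axis=1), A.sum(axis=0)) / A.sum()

    rows = rng.integers(k, size=n)  # random initial image labels
    cols = rng.integers(k, size=m)  # random initial feature labels

    # Geometric cooling: early sweeps explore, late sweeps settle.
    for t in np.geomspace(t_start, t_end, sweeps):
        for i in rng.permutation(n):
            # Local field: total affinity of image i to the features in each cluster.
            field = np.array([B[i, cols == c].sum() for c in range(k)])
            rows[i] = sample_label(field, t, rng)
        for j in rng.permutation(m):
            # Local field: total affinity of feature j to the images in each cluster.
            field = np.array([B[rows == c, j].sum() for c in range(k)])
            cols[j] = sample_label(field, t, rng)

    energy = -B[rows[:, None] == cols[None, :]].sum()
    return rows, cols, energy
```

On a matrix with clear block structure, such as `np.kron(np.eye(2), np.ones((3, 4)))`, calling `co_cluster(A, k=2)` should place each block of images in the same cluster as the features it uses. Real image collections would need a feature extraction step and a more carefully designed energy, neither of which this sketch attempts.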