This paper proposes a general max-margin learning framework for distance-based clustering. To this end, we formulate clustering as a high-order energy minimization problem with latent variables and train the resulting model with a dual decomposition approach. The framework can learn a broad class of distance functions, determines the number of clusters automatically at test time, and is computationally efficient. As an additional contribution, we show how the method generalizes to the training of a broad class of models that are important in computer vision: arbitrary high-order latent CRFs. Experimental results verify the effectiveness of the framework.
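To make concrete how an energy formulation can select the number of clusters on its own, the following is a minimal sketch and not the method of this paper. It assumes an exemplar-based energy: each point pays the distance to its nearest chosen center, and each chosen center pays a fixed penalty. The names `clustering_energy`, `best_clustering`, `abs_dist`, and `penalty` are hypothetical. The distance is fixed by hand rather than learned, and exhaustive search stands in for dual decomposition, so the sketch is only usable on a handful of points.

```python
from itertools import combinations


def clustering_energy(points, centers, dist, penalty):
    """Energy of using the points indexed by `centers` as cluster exemplars.

    Assumed form (illustrative only): every point pays the distance to its
    nearest center, and every center pays a constant `penalty`.
    """
    assignment_cost = sum(
        min(dist(p, points[q]) for q in centers) for p in points
    )
    return assignment_cost + penalty * len(centers)


def best_clustering(points, dist, penalty):
    """Brute-force minimizer over all non-empty sets of centers.

    The number of clusters is not given as input; it is whatever size the
    minimum-energy set of centers happens to have.
    """
    n = len(points)
    best_energy, best_centers = None, None
    for k in range(1, n + 1):
        for centers in combinations(range(n), k):
            energy = clustering_energy(points, centers, dist, penalty)
            if best_energy is None or energy < best_energy:
                best_energy, best_centers = energy, centers
    return best_energy, best_centers


def abs_dist(a, b):
    """Hand-picked 1-D distance; a learned distance would replace this."""
    return abs(a - b)


# Two well-separated groups on the real line.
points = [0.0, 1.0, 10.0, 11.0]
energy, centers = best_clustering(points, abs_dist, penalty=2.0)
num_clusters = len(centers)
```

With a small penalty the minimizer keeps one center per group, while a much larger penalty makes a single cluster cheaper. In this toy setting the trade-off between assignment cost and the per-center penalty is what fixes the number of clusters.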