We derive the clustering problem from first principles showing that the goal of achieving a probabilistic, or ”hard”, multi class clustering result is equivalent to the algebraic problem of a completely positive factorization under a doubly stochastic constraint. We show that spectral clustering, normalized cuts, kernel K-means and the various normalizations of the associated affinity matrix are particular instances and approximations of this general principle. We propose an efficient algorithm for achieving a completely positive factorization and extend the basic clustering scheme to situations where partial label information is available.