Abstract. A combinatorial random variable is a discrete random variable defined over a combinatorial set (e.g., a power set of a given set). In this paper we introduce combinatorial Markov random fields (Comrafs), which are Markov random fields in which some of the nodes are combinatorial random variables. We argue that Comrafs are powerful models for unsupervised and semi-supervised learning. We put Comrafs in perspective by showing their relationship with several existing models. Since it can be problematic to apply existing inference techniques for graphical models to Comrafs, we design two simple and efficient inference algorithms specific to Comrafs, both based on combinatorial optimization. We show that even such simple algorithms consistently and significantly outperform Latent Dirichlet Allocation (LDA) on a document clustering task. We then present Comraf models for semi-supervised clustering and transfer learning that demonstrate superior results in comparison to an existi...
Ron Bekkerman, Mehran Sahami, Erik G. Learned-Mill