Sciweavers

JMLR
2010

PAC-Bayesian Analysis of Co-clustering and Beyond

We derive PAC-Bayesian generalization bounds for supervised and unsupervised learning models based on clustering, such as co-clustering, matrix tri-factorization, graphical models, graph clustering, and pairwise clustering. We begin with the analysis of co-clustering, which is a widely used approach to the analysis of data matrices. We distinguish between two tasks in matrix data analysis: discriminative prediction of the missing entries in data matrices and estimation of the joint probability distribution of row and column variables in co-occurrence matrices. We derive PAC-Bayesian generalization bounds for the expected out-of-sample performance of co-clustering-based solutions for these two tasks. The analysis yields regularization terms that were absent in the previous formulations of co-clustering. The bounds suggest that the expected performance of co-clustering is governed by a trade-off between its empirical performance and the mutual information preserved by the cluster variabl...
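The trade-off mentioned in the abstract can be illustrated with a small sketch. This is not the bound from the paper, whose exact form is not given here. It only assumes that the regularizer is a mutual information between a row (or column) variable and its cluster variable, computed from a soft assignment q(c|x). The names `mutual_information`, `penalized_score`, and the parameter `weight` are hypothetical and introduced for illustration only.

```python
import math


def mutual_information(p_x, q_c_given_x):
    """Return I(X; C) in nats.

    p_x          -- marginal distribution over rows, a list of probabilities
    q_c_given_x  -- soft cluster assignment, q_c_given_x[i][c] = q(c | x_i)
    Both inputs are assumptions of this sketch, not objects defined in the paper.
    """
    n_clusters = len(q_c_given_x[0])
    # Marginal over clusters: q(c) = sum_x p(x) q(c | x)
    q_c = [
        sum(p_x[i] * q_c_given_x[i][c] for i in range(len(p_x)))
        for c in range(n_clusters)
    ]
    mi = 0.0
    for i, px in enumerate(p_x):
        for c in range(n_clusters):
            joint = px * q_c_given_x[i][c]
            if joint > 0:  # terms with zero probability contribute nothing
                mi += joint * math.log(q_c_given_x[i][c] / q_c[c])
    return mi


def penalized_score(empirical_loss, mi_terms, weight):
    """Hypothetical objective: empirical loss plus a weighted sum of
    mutual-information terms. The weight is an assumed free parameter."""
    return empirical_loss + weight * sum(mi_terms)
```

Under these assumptions, a hard split of four equally likely rows into two equal clusters gives I(X; C) = log 2, while an assignment that ignores the row gives 0, so the penalty grows with how much the clusters retain about the individual rows.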
Yevgeny Seldin, Naftali Tishby
Added 19 May 2011
Updated 19 May 2011
Type Journal