Unsupervised Learning with Permuted Data

We consider the problem of unsupervised learning from a matrix of data vectors where in each row the observed values are randomly permuted in an unknown fashion. Such problems arise naturally in areas such as computer vision and text modeling where measurements need not be in correspondence with the correct features. We provide a general theoretical characterization of the difficulty of "unscrambling" the values of the rows for such problems and relate the optimal error rate to the well-known concept of the Bayes classification error rate. For known parametric distributions we derive closed-form expressions for the optimal error rate that provide insight into what makes this problem difficult in practice. Finally, we show how the Expectation-Maximization procedure can be used to simultaneously estimate both a probabilistic model for the features and a distribution over the correspondence of the row values.
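The EM idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes independent Gaussian features, treats the unknown permutation of each row as a latent variable, and enumerates all c! candidate permutations, which is only feasible for a handful of columns. The function name `em_permuted` and its parameters are hypothetical, chosen for illustration.

```python
import itertools
import numpy as np

def em_permuted(X, n_iter=50):
    """EM sketch for rows whose values were shuffled by an unknown permutation.

    Assumed model (not taken from the paper): feature j ~ N(mu[j], var[j]),
    independently, and each row is reordered by a permutation drawn from an
    unknown distribution `prior` over all c! permutations.
    """
    n, c = X.shape
    perms = np.array(list(itertools.permutations(range(c))))  # shape (P, c)
    # X_un[i, k, j] = X[i, perms[k, j]]: row i reordered under candidate k
    X_un = X[:, perms]                                         # shape (n, P, c)

    # Initialise from sorted rows so the features start out distinguishable.
    mu = np.sort(X, axis=1).mean(axis=0)
    var = np.full(c, X.var())
    prior = np.full(len(perms), 1.0 / len(perms))

    for _ in range(n_iter):
        # E-step: posterior over candidate permutations for every row.
        log_lik = -0.5 * ((X_un - mu) ** 2 / var
                          + np.log(2 * np.pi * var)).sum(axis=2)
        log_post = log_lik + np.log(prior)
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted updates of the feature model
        # and of the distribution over permutations.
        mu = np.einsum('ik,ikj->j', resp, X_un) / n
        var = np.einsum('ik,ikj->j', resp, (X_un - mu) ** 2) / n
        var = np.maximum(var, 1e-6)           # guard against collapse
        prior = np.maximum(resp.mean(axis=0), 1e-12)
        prior /= prior.sum()

    return mu, var, prior, perms, resp
```

Calling `em_permuted(X)` on an `(n, c)` array returns the estimated feature means and variances, the estimated distribution over permutations, the list of candidate permutations, and the per-row posterior `resp`. Taking the most probable permutation in each row of `resp` gives an "unscrambling" of that row. How often that guess is wrong depends on how much the feature distributions overlap, which is the quantity the abstract relates to the Bayes classification error rate. The model is identifiable only up to a relabeling of the features, so the recovered column order is arbitrary.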

Sergey Kirshner, Sridevi Parise, Padhraic Smyth

Classification Error Rate | ICML 2003 | Machine Learning | Optimal Error Rate | Row Values

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2003
Where	ICML
Authors	Sergey Kirshner, Sridevi Parise, Padhraic Smyth
