Generalization error bounds in semi-supervised classification under the cluster assumption

Download hal.archives-ouvertes.fr

We consider semi-supervised classification, in which part of the available data is unlabeled. Unlabeled data can help with the classification problem when we make an assumption relating the behavior of the regression function to that of the marginal distribution. Seeger (2000) proposed the well-known "cluster assumption" as a reasonable one. We propose a mathematical formulation of this assumption and a method, based on density level set estimation, that takes advantage of it to achieve fast rates of convergence in both the number of unlabeled examples and the number of labeled examples.
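To make the idea concrete, here is a minimal one-dimensional sketch of how a cluster assumption can be exploited through density level sets. It is an illustration only, not the procedure from the paper: the Gaussian kernel density estimate, the majority vote within each cluster, and all names and parameters (`bandwidth`, `level`, `link_radius`, `fit_cluster_classifier`) are assumptions chosen for the example. Unlabeled points estimate the high-density regions, and the few labeled points then decide one label per region.

```python
import math
from collections import Counter

def gaussian_kde(points, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from `points`."""
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)
    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm
    return density

def level_set_clusters(unlabeled, density, level, link_radius):
    """Keep points whose estimated density is at least `level`, then link
    sorted neighbours closer than `link_radius` into interval clusters."""
    kept = sorted(x for x in unlabeled if density(x) >= level)
    clusters = []
    for x in kept:
        if clusters and x - clusters[-1][-1] <= link_radius:
            clusters[-1].append(x)
        else:
            clusters.append([x])
    return [(c[0], c[-1]) for c in clusters]

def fit_cluster_classifier(unlabeled, labeled, bandwidth=0.3, level=0.05, link_radius=0.5):
    """Assumed rule: every point of a cluster gets the majority label of the
    labeled examples that fall in that cluster."""
    density = gaussian_kde(unlabeled, bandwidth)
    intervals = level_set_clusters(unlabeled, density, level, link_radius)
    votes = [Counter() for _ in intervals]
    for x, y in labeled:
        for i, (lo, hi) in enumerate(intervals):
            if lo <= x <= hi:
                votes[i][y] += 1
    labels = [v.most_common(1)[0][0] if v else None for v in votes]
    def predict(x):
        for (lo, hi), label in zip(intervals, labels):
            if lo <= x <= hi:
                return label
        return None  # outside every estimated cluster: no prediction
    return predict

# Toy data (hypothetical): two well-separated groups of unlabeled points.
unlabeled = [-1.0 + 0.1 * i for i in range(21)] + [4.0 + 0.1 * i for i in range(21)]
# Few labeled points; the "b" at 0.7 is outvoted inside the first cluster.
labeled = [(-0.5, "a"), (0.2, "a"), (0.7, "b"), (4.6, "b"), (5.1, "b")]

predict = fit_cluster_classifier(unlabeled, labeled)
print(predict(0.9), predict(5.5), predict(2.5))
```

In this toy setup the unlabeled sample determines where the clusters are, so the labeled sample only has to settle one vote per cluster. Points in the low-density gap between the groups receive no prediction.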

Philippe Rigollet

Classification Problem | CORR 2006 | Education | Unlabeled Data | Well-known Cluster Assumption |

Post Info

Added	11 Dec 2010
Updated	11 Dec 2010
Type	Journal
Year	2006
Where	CORR
Authors	Philippe Rigollet

Sciweavers
