JMLR
2010

Multiclass-Multilabel Classification with More Classes than Examples

We discuss multiclass-multilabel classification problems in which the set of classes is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples per class. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of class frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambi...
Ofer Dekel, Ohad Shamir
Added 19 May 2011
Updated 19 May 2011
Type Journal
Year 2010
Where JMLR
Authors Ofer Dekel, Ohad Shamir
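
The two-stage approach in the abstract (take the output of any initial classifier, then remove harmful labels from it) can be illustrated with a minimal sketch. This is not the authors' algorithm: the removal rule used here (drop a label when it was predicted wrongly more often than correctly on a labeled sample) is an assumption chosen for illustration, and the names `learn_harmful_labels` and `prune` are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence, Set

Label = Hashable


def learn_harmful_labels(
    predicted: Sequence[Iterable[Label]],
    true: Sequence[Set[Label]],
) -> Set[Label]:
    """Return labels the initial classifier predicts wrongly more often than correctly.

    Assumed rule (illustrative only): a label is "harmful" when its count of
    false positives exceeds its count of true positives on the given sample.
    """
    hits: Counter = Counter()    # label predicted and actually present
    misses: Counter = Counter()  # label predicted but not present
    for pred, gold in zip(predicted, true):
        for label in pred:
            if label in gold:
                hits[label] += 1
            else:
                misses[label] += 1
    # Only labels with at least one false positive can be harmful.
    return {label for label in misses if misses[label] > hits[label]}


def prune(prediction: Iterable[Label], harmful: Set[Label]) -> Set[Label]:
    """Second stage: remove the harmful labels from one prediction."""
    return set(prediction) - harmful


# Stage 1 output (from any initial classifier) and ground truth on a small sample.
predicted = [{"a", "b"}, {"a", "c"}, {"b"}]
true = [{"a"}, {"a"}, {"c"}]

harmful = learn_harmful_labels(predicted, true)
pruned = prune({"a", "b"}, harmful)
```

Because the second stage only deletes labels, it can be layered on top of any initial classifier without retraining it; the cost is that labels missed by the first stage are never recovered.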