Classification is a well-established operation in text mining. Given a set of labels A and a set D_A of training documents tagged with these labels, a classifier learns to assign labels to unlabeled test documents. Suppose a different set of labels B is also available, together with a set of documents D_B marked with labels from B. If A and B have some semantic overlap, can the availability of D_B help us build a better classifier for A, and vice versa? We answer this question in the affirmative by proposing cross-training: a new approach to semi-supervised learning in the presence of multiple label sets. We give distributional and discriminative algorithms for cross-training and show, through extensive experiments, that cross-training can discover and exploit probabilistic relations between two taxonomies for more accurate classification.

Categories and Subject Descriptors: I.2.6 [Artificial Intelligence]: Learning; I.5.2 [Pattern Recognition]: Design Methodology - classifier design and...
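The abstract does not spell out the algorithms, but the discriminative idea can be sketched as feature augmentation: a classifier trained on D_B supplies its predicted label distribution over B as extra features when training the classifier for A. The sketch below is a minimal, hypothetical illustration of one such round using scikit-learn; the function name cross_train and all variable names are illustrative assumptions, not the paper's actual method or code.

```python
# A minimal sketch (not the paper's exact algorithm) of one round of
# discriminative cross-training. Each document set is assumed to be a
# list of raw text strings; all names here are hypothetical.
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def cross_train(docs_a, labels_a, docs_b, labels_b, test_docs):
    # Shared bag-of-words representation for both document sets.
    vec = TfidfVectorizer().fit(docs_a + docs_b)
    Xa, Xb, Xt = (vec.transform(d) for d in (docs_a, docs_b, test_docs))

    # Step 1: learn taxonomy B from its own training set D_B.
    clf_b = LogisticRegression(max_iter=1000).fit(Xb, labels_b)

    # Step 2: augment D_A's feature vectors with the B-classifier's
    # predicted label distribution, exposing the semantic overlap
    # between the two label sets as additional features.
    Xa_aug = hstack([Xa, clf_b.predict_proba(Xa)], format="csr")
    Xt_aug = hstack([Xt, clf_b.predict_proba(Xt)], format="csr")

    # Step 3: train the taxonomy A classifier on the augmented features
    # and label the test documents.
    clf_a = LogisticRegression(max_iter=1000).fit(Xa_aug, labels_a)
    return clf_a.predict(Xt_aug)
```

A symmetric round (augmenting D_B with the A-classifier's predictions) would exploit the overlap in the other direction, in the spirit of the "and vice versa" question the abstract poses.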