Error-driven generalist+experts (edge): a multi-stage ensemble framework for text categorization

We introduce a multi-stage ensemble framework, Error-Driven Generalist+Expert or Edge, for improved classification on large-scale text categorization problems. Edge first trains a generalist, capable of classifying under all classes, to deliver a reasonably accurate initial category ranking for a given instance. Edge then computes a confusion graph for the generalist and allocates learning resources to train experts on relatively small groups of classes that the generalist tends to systematically confuse with one another. The experts' votes, when invoked on a given instance, yield a reranking of the classes, thereby correcting the errors of the generalist. Our evaluations show improved classification and ranking performance on several large-scale text categorization datasets. Edge is particularly efficient when the underlying learners are efficient. Our study of confusion graphs is also of independent interest.

Categories and Subject Descriptors H.3.3 [Information ...
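The pipeline described in the abstract (generalist, confusion graph, expert groups, reranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the undirected edge counting, the `min_count` threshold, the use of connected components to form expert groups, the additive vote combination, and the toy labels are all assumptions made here for illustration.

```python
from collections import Counter, defaultdict

def confusion_graph(true_labels, predicted_labels):
    """Count generalist mistakes per unordered pair of classes (assumed edge weight)."""
    edges = Counter()
    for t, p in zip(true_labels, predicted_labels):
        if t != p:
            # Treat confusion as undirected: hockey->baseball and baseball->hockey
            # both add weight to the same edge.
            edges[frozenset((t, p))] += 1
    return edges

def expert_groups(edges, min_count=2):
    """Group classes joined by edges of weight >= min_count (connected components).

    Connected components are an assumed grouping rule, not the paper's method.
    """
    neighbors = defaultdict(set)
    for pair, count in edges.items():
        if count >= min_count:
            a, b = tuple(pair)
            neighbors[a].add(b)
            neighbors[b].add(a)
    groups, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups

def rerank(generalist_scores, expert_votes, weight=1.0):
    """Add weighted expert votes to generalist scores; return classes, best first."""
    combined = dict(generalist_scores)
    for cls, vote in expert_votes.items():
        combined[cls] = combined.get(cls, 0.0) + weight * vote
    # Sort by descending score, breaking ties alphabetically.
    return sorted(combined, key=lambda c: (-combined[c], c))

# Toy held-out data (hypothetical class names and predictions).
true_labels = ["hockey", "hockey", "baseball", "baseball", "autos", "space", "autos"]
predicted = ["baseball", "baseball", "hockey", "baseball", "autos", "space", "space"]

edges = confusion_graph(true_labels, predicted)
groups = expert_groups(edges, min_count=2)

# An expert trained on the hockey/baseball group votes for "hockey".
ranking = rerank({"baseball": 0.5, "hockey": 0.4, "autos": 0.1}, {"hockey": 0.3})
print(groups, ranking)
```

In this toy run only the hockey/baseball pair is confused often enough to earn an expert, and that expert's vote moves "hockey" ahead of the generalist's top choice. A real system would replace the hard-coded scores with outputs from trained classifiers.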

Jian Huang 0002, Omid Madani, C. Lee Giles

CIKM 2008 | Confusion Graphs | Improved Classification | Information Management | Large-scale Text Categorization

Post Info

Added	12 Oct 2010
Updated	12 Oct 2010
Type	Conference
Year	2008
Where	CIKM
Authors	Jian Huang 0002, Omid Madani, C. Lee Giles
