This work focuses on algorithms which learn from examples to perform multiclass text and speech categorization tasks. Our approach is based on a new and improved family of boosting algorithms. We describe in detail an implementation, called BoosTexter, of the new boosting algorithms for text categorization tasks. We present results comparing the performance of BoosTexter and a number of other text-categorization algorithms on a variety of tasks. We conclude by describing the application of our system to automatic call-type identification from unconstrained spoken customer responses.
Robert E. Schapire, Yoram Singer