Sciweavers

ICDM
2008
IEEE

Classifying High-Dimensional Text and Web Data Using Very Short Patterns

In this paper, we propose the "Democratic Classifier", a simple, democracy-inspired pattern-based classification algorithm that uses very short patterns for classification and does not rely on a minimum support threshold. Borrowing ideas from democracy, our training phase allows each training instance to vote for an equal number of candidate size-2 patterns. Just as voters in a democratic election select candidates by considering their qualifications, their prior contributions at the constituency and territory levels, and the voters' own perception of them, the training instances select patterns by balancing the local, class, and global significance of each pattern. In addition, we respect "each voter's opinion" by simultaneously adding shared patterns to all applicable classes and then applying a novel power-law-based weighting scheme, instead of making binary decisions on these patterns. Results of experiments performed o...
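The voting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the significance measures or the form of the power law, so the scoring formula, the exponent `alpha`, the parameter `votes_per_instance`, and the function names `train` and `classify` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def train(instances, votes_per_instance=2, alpha=2.0):
    """Toy voting-based trainer. instances: list of (token_list, label).

    Returns {pattern: {label: weight}}, where a pattern is a sorted token pair.
    All formulas below are illustrative assumptions, not the paper's.
    """
    class_support = defaultdict(Counter)  # label -> pattern -> instance count
    global_support = Counter()            # pattern -> instance count
    class_size = Counter()                # label -> number of instances
    for tokens, label in instances:
        class_size[label] += 1
        for pattern in combinations(sorted(set(tokens)), 2):
            class_support[label][pattern] += 1
            global_support[pattern] += 1

    votes = defaultdict(Counter)          # pattern -> label -> vote count
    for tokens, label in instances:
        tf = Counter(tokens)

        def score(pattern):
            a, b = pattern
            local = (tf[a] + tf[b]) / len(tokens)  # assumed local significance
            in_class = class_support[label][pattern] / class_size[label]
            purity = class_support[label][pattern] / global_support[pattern]
            return local * in_class * purity       # assumed way of balancing

        # Every instance casts the same number of votes, for its best pairs.
        ranked = sorted(combinations(sorted(tf), 2), key=lambda p: (-score(p), p))
        for pattern in ranked[:votes_per_instance]:
            votes[pattern][label] += 1

    # A shared pattern is kept in every class that voted for it; an assumed
    # power-law weight replaces a binary keep-or-drop decision.
    model = {}
    for pattern, by_label in votes.items():
        total = sum(by_label.values())
        model[pattern] = {lab: (n / total) ** alpha for lab, n in by_label.items()}
    return model


def classify(model, tokens):
    """Sum the class weights of all known size-2 patterns found in tokens."""
    scores = Counter()
    for pattern in combinations(sorted(set(tokens)), 2):
        for label, weight in model.get(pattern, {}).items():
            scores[label] += weight
    return scores.most_common(1)[0][0] if scores else None
```

Note that no minimum support threshold appears anywhere in the sketch: a pattern enters the model only because some instance voted for it, which mirrors the property the abstract emphasizes.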
Hassan H. Malik, John R. Kender
Added 30 May 2010
Updated 30 May 2010
Type Conference
Year 2008
Where ICDM
Authors Hassan H. Malik, John R. Kender