Sciweavers

SIGIR
2003
ACM

Single n-gram stemming

Stemming can improve retrieval accuracy, but stemmers are language-specific. Character n-gram tokenization achieves many of the benefits of stemming in a language-independent way, but its use incurs a performance penalty. We demonstrate that selection of a single n-gram as a pseudo-stem for a word can be an effective and efficient language-neutral approach for some languages.
Categories and Subject Descriptors: H.3.1 [Information Systems]: Information Storage and Retrieval – content analysis and indexing.
General Terms: Algorithms
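To make the idea of a single n-gram pseudo-stem concrete, here is a minimal Python sketch. The abstract does not say how the n-gram is chosen or what length is used, so the sketch makes several assumptions: n = 4, whitespace tokenization, and picking the n-gram with the lowest document frequency in the collection (ties go to the leftmost n-gram). The function names `char_ngrams`, `document_frequencies`, and `pseudo_stem` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def char_ngrams(word, n=4):
    """Return the character n-grams of a word; words of length <= n yield themselves."""
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def document_frequencies(docs, n=4):
    """Count, for each n-gram, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        seen = set()
        for word in doc.lower().split():
            seen.update(char_ngrams(word, n))
        df.update(seen)  # each n-gram counts once per document
    return df


def pseudo_stem(word, df, n=4):
    """Assumed selection rule: the word's rarest n-gram, leftmost on ties."""
    grams = char_ngrams(word.lower(), n)
    return min(grams, key=lambda g: df[g])


# Toy collection, purely illustrative.
docs = ["walking talking", "walking jumping", "walked home"]
df = document_frequencies(docs)

print(pseudo_stem("talking", df))  # -> talk
print(pseudo_stem("talked", df))   # -> talk
```

Each word is indexed under one term rather than under all of its n-grams, which is where the efficiency gain over full n-gram tokenization would come from. On a collection this small the rarest n-gram is not always the intuitive stem ("walking" maps to "alki" here), so the selection rule only becomes meaningful with realistic collection statistics.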
James Mayfield, Paul McNamee
Added: 05 Jul 2010
Updated: 05 Jul 2010
Type: Conference
Year: 2003
Where: SIGIR
Authors: James Mayfield, Paul McNamee