IPM
2007

s-grams: Defining generalized n-grams for information retrieval

For European languages, n-grams have proved to be a cost-effective alternative to morphological processing during indexing, and they have been studied and analyzed extensively using CLEF data. We adapted this work for our experiments on n-grams in the Marathi language. Our experiments indicate that 4-grams produce the best results among n-grams of different lengths. We also find that n-gram-based retrieval improves on plain word-based retrieval for Marathi, which is a morphologically rich language. We obtain a MAP (Mean Average Precision) score of 35.79% for n-gram-based indexing, against a baseline MAP score of 23.94%.
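To illustrate what indexing on 4-grams instead of whole words can look like, here is a minimal Python sketch. It is not the authors' implementation: the function names `char_ngrams` and `ngram_terms`, the whitespace tokenization, the lowercasing, and the choice to keep short words whole are all assumptions made for illustration. It also slices by Unicode code point, which is a simplification for Devanagari text, where a visible character may span several code points.

```python
def char_ngrams(word, n=4):
    """Return the overlapping character n-grams of a single word.

    Assumption: words no longer than n are kept whole rather than padded.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_terms(text, n=4):
    """Turn a text into index terms by splitting each word into n-grams.

    Assumption: words are separated by whitespace and n-grams do not
    cross word boundaries.
    """
    terms = []
    for word in text.lower().split():
        terms.extend(char_ngrams(word, n))
    return terms


if __name__ == "__main__":
    print(char_ngrams("retrieval"))
    print(ngram_terms("Word based"))
```

Because inflected forms of the same stem share most of their n-grams, a query and a document can match on these terms without any morphological analysis, which is the trade-off the abstract describes.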
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2007
Where IPM
Authors Anni Järvelin, Antti Järvelin, Kalervo Järvelin