
IPM
2007

s-grams: Defining generalized n-grams for information retrieval

For European languages, n-gram indexing has proved to be a cost-effective alternative to morphological processing at indexing time, and it has been studied and analyzed extensively using CLEF data. We adapted this work for our experiments on n-grams in the Marathi language. Our experiments indicate that 4-grams produce the best results among n-grams of different lengths. We also find that n-gram-based retrieval improves over plain word-based retrieval for Marathi, which is a morphologically rich language. We obtain a MAP (Mean Average Precision) score of 35.79% for n-gram-based indexing against a baseline MAP score of 23.94%.
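
As a rough illustration of the two ideas the abstract relies on, the sketch below splits words into overlapping character 4-grams for indexing and computes Mean Average Precision over ranked result lists. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`char_ngrams`, `index_terms`, `average_precision`, `mean_average_precision`) are hypothetical, words shorter than n are assumed to be indexed whole, and n-grams are taken over Unicode code points, which for Devanagari text may not coincide with user-perceived characters.

```python
def char_ngrams(word, n=4):
    """Return the overlapping character n-grams of a word.

    Assumption: a word no longer than n is kept whole as a single term.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def index_terms(text, n=4):
    """Lowercase, split on whitespace, and replace each word by its n-grams."""
    terms = []
    for word in text.lower().split():
        terms.extend(char_ngrams(word, n))
    return terms


def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query.

    ranked_ids: document ids in ranked order.
    relevant_ids: set of ids judged relevant for the query.
    """
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant_ids)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    print(char_ngrams("retrieval"))
    print(index_terms("Information retrieval"))
    print(mean_average_precision([(["d1", "d2", "d3"], {"d1", "d3"})]))
```

Indexing both documents and queries with `index_terms` lets inflected forms of a word share most of their terms without a stemmer or morphological analyzer, which is the trade-off the abstract describes for morphologically rich languages.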
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2007
Where IPM
Authors Anni Järvelin, Antti Järvelin, Kalervo Järvelin