

Light stemming approaches for the French, Portuguese, German and Hungarian languages

This paper describes and evaluates various general stemming approaches for the French, Portuguese (Brazilian), German and Hungarian languages. Based on the CLEF test collections, we demonstrate that light stemmers perform well for the French, Portuguese and Hungarian languages, and reasonably well for the German language. Variations in mean average precision among the different stemming approaches are also evaluated, and they are sometimes found to be statistically significant.
Categories and Subject Descriptors: H.3.1 [Content Analysis and Indexing]: Indexing methods; Linguistic processing. H.3.3 [Information Search and Retrieval]: Retrieval models. H.3.4 [Systems and Software]: Performance evaluation.
General Terms: Algorithms, Measurement, Performance.
Keywords: Stemming for French, Portuguese, German, Hungarian; stemmer, natural language processing.
Jacques Savoy
Added 14 Jun 2010
Updated 14 Jun 2010
Type Conference
Year 2006
Where SAC
Authors Jacques Savoy