

TSD
2010
Springer


Comparison of Different Lemmatization Approaches through the Means of Information Retrieval Performance


Download www.kky.zcu.cz

This paper presents a quantitative performance analysis of two different approaches to the lemmatization of Czech text data. The first is based on a manually prepared dictionary of lemmas and a set of derivation rules, while the second is based on automatic inference of the dictionary and the rules from training data. The comparison is done by evaluating the mean Generalized Average Precision (mGAP) measure of the lemmatized documents and search queries in a set of information retrieval (IR) experiments. Such a method is suitable for an efficient and rather reliable comparison of lemmatization performance, since correct lemmatization has proven to be crucial for IR effectiveness in highly inflected languages. Moreover, the proposed indirect comparison of the lemmatizers circumvents the need for manually lemmatized test data, which are hard to obtain and also face the problem of incompatible sets of lemmas across different systems.
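The indirect comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two lemma tables, the three toy Czech documents, the overlap-based ranking, and the relevance judgments are hypothetical, and plain mean average precision is used as a simple stand-in for mGAP, which the abstract names but does not define. It is not the authors' system.

```python
from typing import Callable, Dict, List, Set

Lemmatizer = Callable[[str], str]


def make_dict_lemmatizer(table: Dict[str, str]) -> Lemmatizer:
    """Build a lemmatizer from a lookup table; unknown words pass through lowercased."""
    return lambda word: table.get(word.lower(), word.lower())


def lemmatize_text(text: str, lemmatize: Lemmatizer) -> List[str]:
    """Whitespace-tokenize the text and lemmatize every token."""
    return [lemmatize(token) for token in text.split()]


def rank(query: str, docs: Dict[str, str], lemmatize: Lemmatizer) -> List[str]:
    """Rank documents by the number of query lemmas they contain (zero-score docs dropped)."""
    query_lemmas = set(lemmatize_text(query, lemmatize))
    scores = {
        doc_id: len(query_lemmas & set(lemmatize_text(text, lemmatize)))
        for doc_id, text in docs.items()
    }
    hits = [doc_id for doc_id, score in scores.items() if score > 0]
    return sorted(hits, key=lambda doc_id: (-scores[doc_id], doc_id))


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Standard average precision of one ranked list (stand-in for GAP)."""
    if not relevant:
        return 0.0
    found, total = 0, 0.0
    for position, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            found += 1
            total += found / position
    return total / len(relevant)


def mean_average_precision(
    queries: Dict[str, Set[str]], docs: Dict[str, str], lemmatize: Lemmatizer
) -> float:
    """Average the per-query average precision over all queries."""
    values = [
        average_precision(rank(query, docs, lemmatize), relevant)
        for query, relevant in queries.items()
    ]
    return sum(values) / len(values)


# Hypothetical toy collection and relevance judgments (not from the paper).
DOCS = {"d1": "města rostou", "d2": "psi běhají", "d3": "ve městě"}
QUERIES = {"město": {"d1", "d3"}, "pes": {"d2"}}

# Two hypothetical lemmatizers: one with fuller coverage, one with gaps.
lemmatizer_a = make_dict_lemmatizer({"města": "město", "městě": "město", "psi": "pes"})
lemmatizer_b = make_dict_lemmatizer({"města": "město"})

if __name__ == "__main__":
    for name, lemmatizer in [("A", lemmatizer_a), ("B", lemmatizer_b)]:
        print(name, mean_average_precision(QUERIES, DOCS, lemmatizer))
```

The sketch never compares the lemmas themselves against a gold standard. Only the retrieval score differs between the two runs, which is why the two lemmatizers may use incompatible lemma inventories.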

Jakub Kanis, Lucie Skorkovská


Czech Text Data | Generalized Average Precision | Quantitative Performance Analysis | Signal Processing | TSD 2010


Related Content

» Restricted inflectional form generation in management of morphological keyword variation

» Recommender Systems by means of Information Retrieval

» CLEFIP 2010 Prior Art Retrieval Using the Different Sections in Patent Documents

» A Comparison of Navigation Techniques Across Different Types of OffScreen Navigation Tasks

» NLP for Shallow Question Answering of Legal Documents Using Graphs

» Information retrieval system evaluation effort sensitivity and reliability

» Computing Information Retrieval Performance Measures Efficiently in the Presence of Tied S...

» A study of learning a merge model for multilingual information retrieval

» Performance evaluation and optimization for contentbased image retrieval

Post Info

Added	15 Feb 2011
Updated	15 Feb 2011
Type	Journal
Year	2010
Where	TSD
Authors	Jakub Kanis, Lucie Skorkovská
