Sciweavers

CICLING 2007, Springer

Morphological Disambiguation of Turkish Text with Perceptron Algorithm

Abstract. This paper describes the application of the perceptron algorithm to the morphological disambiguation of Turkish text. Turkish has a productive derivational morphology. Because of the ambiguity this complex morphology causes, a word may have multiple morphological parses, each with a different stem or sequence of morphemes. The methodology is based on ranking with the perceptron algorithm, which has been successful in some NLP tasks in English. We use a baseline statistical trigram-based model from previous work to enumerate an n-best list of candidate morphological parse sequences for each sentence. We then apply the perceptron algorithm to rerank the n-best list using a set of 23 features. The perceptron trained to do morphological disambiguation improves the accuracy of the baseline model from 93.61% to 96.80%. When we train the perceptron as a POS tagger, the accuracy is 98.27%. The Turkish morphological disambiguation and POS tagging results that we obtained are the best report...
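The reranking step described in the abstract can be illustrated with a minimal sketch of a plain perceptron reranker. This is not the authors' implementation: the feature names (`baseline_logprob`, `tag_bigram=...`), the toy candidate lists, and the function names are all hypothetical, and the paper's actual 23 features are not listed here. Each training example is assumed to be an n-best list of candidates (as sparse feature dictionaries) plus the index of the correct candidate.

```python
from collections import defaultdict


def score(weights, features):
    """Dot product of a sparse feature dict with the weight vector."""
    return sum(weights[name] * value for name, value in features.items())


def rerank(weights, candidates):
    """Return the index of the highest-scoring candidate (ties go to the first)."""
    return max(range(len(candidates)), key=lambda i: score(weights, candidates[i]))


def train(examples, epochs=5):
    """Plain perceptron training over n-best lists.

    examples: list of (candidates, gold_index) pairs, where each candidate
    is a sparse feature dict. The feature design is an assumption of this sketch.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for candidates, gold in examples:
            predicted = rerank(weights, candidates)
            if predicted != gold:
                # Move the weights toward the correct candidate...
                for name, value in candidates[gold].items():
                    weights[name] += value
                # ...and away from the wrongly preferred one.
                for name, value in candidates[predicted].items():
                    weights[name] -= value
    return weights


# Hypothetical toy data: two sentences, each with a 2-best list.
toy_examples = [
    ([{"baseline_logprob": -2.0, "tag_bigram=Noun+Verb": 1.0},
      {"baseline_logprob": -2.5, "tag_bigram=Adj+Noun": 1.0}], 1),
    ([{"baseline_logprob": -1.0, "tag_bigram=Adj+Noun": 1.0},
      {"baseline_logprob": -3.0, "tag_bigram=Noun+Verb": 1.0}], 0),
]

weights = train(toy_examples)
for candidates, gold in toy_examples:
    print(rerank(weights, candidates) == gold)
```

Including the baseline model's score as one feature lets the reranker fall back on the baseline ranking when the other features are uninformative. This is a common design choice in reranking setups and is assumed here, not taken from the abstract.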
Hasim Sak, Tunga Güngör, Murat Saraclar
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where CICLING
Authors Hasim Sak, Tunga Güngör, Murat Saraclar