Combining Stochastic and Rule-Based Methods for Disambiguation in Agglutinative Languages

In this paper we present the results of combining stochastic and rule-based disambiguation methods applied to the Basque language. The methods we have used for disambiguation are the Constraint Grammar formalism and an HMM-based tagger developed within the MULTEXT project. As Basque is an agglutinative language, a morphological analyser is needed to attach all possible readings to each word. Then, CG rules are applied using all the morphological features, and this process decreases the morphological ambiguity of texts. Finally, we use the MULTEXT project tools to select just one of the remaining possible tags. Using only the stochastic method, the error rate is about 14%, but the accuracy may be increased by about 2% by enriching the lexicon with the unknown words. When both methods are combined, the error rate of the whole process is 3.5%. Considering that the training corpus is quite small, that the HMM model is a first order one and that Constraint Grammar of Basque language is still in ...
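
The three stages described in the abstract (attach all readings, prune them with rules, then pick one tag statistically) can be sketched as follows. This is a minimal illustration and not the authors' system: the English toy lexicon, the single pruning rule, the probability tables, and all function names are hypothetical, and it uses neither the real Constraint Grammar formalism nor the MULTEXT tools.

```python
import math

# Hypothetical toy lexicon standing in for a morphological analyser:
# each word maps to every tag it could carry.
LEXICON = {
    "the": {"DET"},
    "can": {"NOUN", "VERB", "AUX"},
    "rusts": {"NOUN", "VERB"},
}

# Hypothetical first-order HMM parameters (made-up numbers, not from the paper).
TRANS = {
    ("<s>", "DET"): 0.6, ("<s>", "NOUN"): 0.3, ("<s>", "VERB"): 0.1,
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.05, ("DET", "AUX"): 0.05,
    ("NOUN", "VERB"): 0.6, ("NOUN", "NOUN"): 0.3, ("NOUN", "AUX"): 0.1,
    ("AUX", "VERB"): 0.8, ("VERB", "NOUN"): 0.5,
}
EMIT = {
    ("the", "DET"): 1.0,
    ("can", "NOUN"): 0.2, ("can", "VERB"): 0.1, ("can", "AUX"): 0.7,
    ("rusts", "NOUN"): 0.1, ("rusts", "VERB"): 0.9,
}


def analyse(words):
    """Stage 1: attach all possible readings to each word."""
    return [set(LEXICON.get(w, {"NOUN", "VERB"})) for w in words]


def apply_constraints(readings):
    """Stage 2: one illustrative rule in the spirit of Constraint Grammar.

    Discard verbal readings right after an unambiguous determiner,
    but never remove the last remaining reading of a word.
    """
    out = [set(r) for r in readings]
    for i in range(1, len(out)):
        if out[i - 1] == {"DET"}:
            kept = out[i] - {"VERB", "AUX"}
            if kept:
                out[i] = kept
    return out


def viterbi(words, readings):
    """Stage 3: first-order HMM decoding restricted to the surviving tags."""
    best = {"<s>": (0.0, [])}  # tag -> (log score, best path ending in tag)
    for word, candidates in zip(words, readings):
        nxt = {}
        for tag in sorted(candidates):
            emit = math.log(EMIT.get((word, tag), 1e-6))
            score, path = max(
                ((s + math.log(TRANS.get((prev, tag), 1e-3)) + emit, p)
                 for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            nxt[tag] = (score, path + [tag])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


def tag(words):
    """Run the full pipeline: analyse, prune with rules, decode with the HMM."""
    return viterbi(words, apply_constraints(analyse(words)))
```

Calling `tag(["the", "can", "rusts"])` runs all three stages in order. The point of the design is that the rule stage only narrows the candidate sets, so the statistical stage has fewer tags to choose from at each position.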

Nerea Ezeiza, Iñaki Alegria, Jose Maria Arr

ACL 1998 | ACL 2007 | Constraint Grammar | MULTEXT Project | Rule-based Disambiguation Methods |

» The Benefit of Stochastic PP Attachment to a RuleBased Parser

» Abstracting the Differential Semantics of RuleBased Models Exact and Automated Model Reduc...

» Serial Combination of Rules and Statistics A Case Study in Czech Tagging

Post Info

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	1998
Where	ACL
Authors	Nerea Ezeiza, Iñaki Alegria, Jose Maria Arriola, Ruben Urizar, Itziar Aduriz
