EMNLP
2007

Determining Case in Arabic: Learning Complex Linguistic Behavior Requires Complex Linguistic Features

This paper discusses the automatic determination of case in Arabic. This task is an important part of, and a major source of errors in, the full diacritization of Arabic. Using a gold-standard syntactic tree, we obtain an error rate of about 4.2%, with a machine-learning-based system outperforming a system that uses hand-written rules. A careful error analysis suggests that when we account for annotation errors in the gold standard, the error rate drops to 0.9%, with the hand-written rules outperforming the machine-learning-based system.
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Where EMNLP
Authors Nizar Habash, Ryan Gabbard, Owen Rambow, Seth Kulick, Mitchell P. Marcus