Determining Case in Arabic: Learning Complex Linguistic Behavior Requires Complex Linguistic Features

This paper discusses automatic determination of case in Arabic. This task is an important part of, and a major source of errors in, full diacritization of Arabic. We use a gold-standard syntactic tree and obtain an error rate of about 4.2%, with a machine-learning-based system outperforming a system using hand-written rules. A careful error analysis suggests that when we account for annotation errors in the gold standard, the error rate drops to 0.9%, with the hand-written rules outperforming the machine-learning-based system.

Nizar Habash, Ryan Gabbard, Owen Rambow, Seth Kulick, Mitchell P. Marcus

Careful Error Analysis | EMNLP 2007 | Error Rate | Hand-written Rules | Natural Language Processing

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2007
Where	EMNLP
Authors	Nizar Habash, Ryan Gabbard, Owen Rambow, Seth Kulick, Mitchell P. Marcus
