Sciweavers

LREC
2010

Romanian Zero Pronoun Distribution: A Comparative Study

Anaphora resolution remains a challenging research field in natural language processing, which still lacks an algorithm that correctly resolves anaphoric pronouns. Anaphoric zero pronouns pose an even greater challenge because they are not lexically realised, so they must be identified before they can be resolved. This paper reports on the distribution of zero pronouns in Romanian across four genres: encyclopaedic, legal, literary, and news-wire texts. For this purpose, the RoZP corpus was created; it contains almost 50,000 tokens and 800 manually annotated zero pronouns. The distribution patterns are compared across genres, and exceptional cases are presented to inform the development of a future zero pronoun identification and resolution algorithm. The evaluation results show that zero pronouns occur frequently in Romanian and that their distribution depends largely on the genre. Additionally, possible features are re...
Claudiu Mihaila, Iustina Ilisei, Diana Inkpen
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC