This paper discusses the influence of the corpus on the automatic identification of proper names in texts. Techniques developed for the newswire genre are generally not sufficient...
Inclusions from other languages can be a significant source of errors for monolingual parsers. We show this for English inclusions, which are sufficiently frequent to present a ...
Bagging and boosting, two effective machine learning techniques, are applied to natural language parsing. Experiments using these techniques with a trainable statistical parser ar...
We describe here a method for automatically identifying word sense variation in a dated collection of historical books in a large digital library. By leveraging a small set of kno...
This paper proposes an approach to improve word alignment for languages with scarce resources using bilingual corpora of other language pairs. To perform word alignment between la...