The mwetoolkit is a tool for automatic extraction of Multiword Expressions (MWEs) from monolingual corpora. It both generates and validates MWE candidates. The generation is based...
Carlos Ramisch, Aline Villavicencio, Christian Boi...
Word Sense Disambiguation (WSD) often relies on a context model or vector constructed from the words that co-occur with the target word within the same text windows. In most cases...
Bernard Brosseau-Villeneuve, Jian-Yun Nie, Noriko ...
This paper presents a set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese. A large speech corpus produced by a single speaker is used, and the speech out...
Today there is a relatively large body of work on automatic acquisition of lexicosyntactical preferences (subcategorization) from corpora. Various techniques have been developed t...
Digital libraries have untapped potential for supporting language teaching and learning. Although the Internet at large is widely used for language education, it has critical disad...