We describe cross language retrieval experiments using Amharic queries and English language document collection from our participation in the bilingual ad hoc track at the CLEF 20...
Bootstrapping has a tendency, called semantic drift, to select instances unrelated to the seed instances as the iteration proceeds. We demonstrate the semantic drift of bootstrapp...
Graph-based semi-supervised learning has recently emerged as a promising approach to data-sparse learning problems in natural language processing. All graph-based algorithms rely ...
We present new direct data analysis showing that dynamically-built context-dependent phrasal translation lexicons are more useful resources for phrase-based statistical machine tr...
The paper presents Bulgarian National Corpus project (BulNC) - a large-scale, representative, online available corpus of Bulgarian. The BulNC is also a monolingual general corpus,...