Parallel web pages are an important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....
An unsupervised method for word sense disambiguation using a bilingual comparable corpus was developed. First, it extracts statistically significant pairs of related words from th...
Paraphrase patterns are useful in paraphrase recognition and generation. In this paper, we present a pivot approach for extracting paraphrase patterns from bilingual parallel corp...
We present an unsupervised word segmentation model for machine translation. The model uses existing monolingual segmentation techniques and models the joint distribution over sour...
Increasing demand for new language resources for recent EU members and acceding countries has in turn initiated the development of different language tools and resources, such ...
Sanja Seljan, Marko Tadic, Zeljko Agic, Jan Snajde...