We describe a new unsupervised approach for synonymy discovery by aligning paraphrases in monolingual domain corpora. For that purpose, we identify phrasal terms that convey most ...
The lack of parallel corpora and linguistic resources for many languages and domains is one of the major obstacles to the further advancement of automated translation. A possible...
Marcis Pinnis, Radu Ion, Dan Stefanescu, Fangzhong...
Parallel corpora are indispensable resources for a variety of multilingual natural language processing tasks. This paper presents a technique for fully automatic construction of c...
We created a simple gold standard for English-Hungarian NP-level alignment, Orwell's 1984, by manually verifying the automatically generated NP chunking and manually aligning ...
Parallel web pages are an important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....