Sciweavers

» On the Use of Comparable Corpora to Improve SMT performance
ACL
2006
13 years 9 months ago
Exploiting Comparable Corpora and Bilingual Dictionaries for Cross-Language Text Categorization
Cross-language Text Categorization is the task of assigning semantic classes to documents written in a target language (e.g. English) while the system is trained using labeled doc...
Alfio Massimiliano Gliozzo, Carlo Strapparava
IJCNLP
2005
Springer
14 years 1 month ago
Inversion Transduction Grammar Constraints for Mining Parallel Sentences from Quasi-Comparable Corpora
We present a new implication of Wu’s (1997) Inversion Transduction Grammar (ITG) Hypothesis, on the problem of retrieving truly parallel sentence translations from larg...
Dekai Wu, Pascale Fung
ISPASS
2005
IEEE
14 years 1 month ago
Performance Characterization of Java Applications on SMT Processors
As Java is emerging as one of the major programming languages in software development, studying how Java applications behave on recent SMT processors is of great interest. This pa...
Wei Huang, Jiang Lin, Zhao Zhang, J. Morris Chang
COLING
2010
13 years 2 months ago
An Empirical Study on Web Mining of Parallel Data
This paper presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
Gum-Won Hong, Chi-Ho Li, Ming Zhou, Hae-Chang Rim
ACL
2006
13 years 9 months ago
Extracting Parallel Sub-Sentential Fragments from Non-Parallel Corpora
We present a novel method for extracting parallel sub-sentential fragments from comparable, non-parallel bilingual corpora. By analyzing potentially similar sentence pairs using a...
Dragos Stefan Munteanu, Daniel Marcu