Sciweavers

COLING
2010

An Empirical Study on Web Mining of Parallel Data

Download www.aclweb.org

This paper presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract parallel sentences. This paper attempts the much more challenging task of directly searching the Web for high-quality sentence pairs. We tackle the problem by formulating good search queries using "Learning to Rank" and by filtering noisy document pairs using IBM Model 1 alignment. End-to-end evaluation shows that the proposed approach significantly improves the performance of statistical machine translation.
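
The abstract names IBM Model 1 alignment as the filter for noisy pairs but gives no details, so the following is only a minimal sketch of how such a filter could work, not the authors' implementation. It trains standard IBM Model 1 lexical probabilities with EM and scores a candidate pair by its length-normalized log-probability. The function names, the `floor` smoothing value, the toy corpus, and the idea of thresholding the score are all assumptions made for illustration.

```python
import math
from collections import defaultdict

NULL = "<NULL>"  # empty source word that any target word may align to


def train_ibm1(pairs, iterations=10):
    """EM training of IBM Model 1 probabilities t[(f, e)] = P(target f | source e).

    pairs: list of (source_tokens, target_tokens) tuples.
    """
    tgt_vocab = {f for _, tgt in pairs for f in tgt}
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) alignments
        total = defaultdict(float)  # expected count of e aligning to anything
        for src, tgt in pairs:
            src_n = [NULL] + list(src)
            for f in tgt:
                z = sum(t[(f, e)] for e in src_n)  # normalizer for this f
                for e in src_n:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalize expected counts per source word
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def avg_log_prob(src, tgt, t, floor=1e-9):
    """Per-target-word average of log P(f | src) under Model 1.

    `floor` (an assumed value) keeps unseen words from producing log(0).
    """
    src_n = [NULL] + list(src)
    lp = 0.0
    for f in tgt:
        p = sum(t.get((f, e), 0.0) for e in src_n) / len(src_n)
        lp += math.log(max(p, floor))
    return lp / max(len(tgt), 1)


# Toy corpus, purely illustrative.
toy = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "book"], ["ein", "buch"]),
]
t = train_ibm1(toy)
good = avg_log_prob(["the", "house"], ["das", "haus"], t)
bad = avg_log_prob(["a", "book"], ["das", "haus"], t)
print(good, bad)
```

In a filter of this kind, pairs whose score falls below some threshold would be discarded; that threshold would have to be tuned on held-out data and is not specified in the abstract.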

Gum-Won Hong, Chi-Ho Li, Ming Zhou, Hae-Chang Rim

COLING 2010 | Computational Linguistics | Nonparallel Corpora | Parallel Corpora | Statistical Machine Translation

Related Content

» An Intelligent Web Agent to Mine Bilingual Parallel Pages via Automatic Discovery of URL P...

» Empirical Bayesian data mining for discovering patterns in postmarketing drug safety

» Adaptive Parallel Sentences Mining from Web Bilingual News Collection

» An Empirical Evaluation of a Distributed Clustering-Based Index for Metric Space Databases

» An Empirical Study of Similarity Search in Stock Data

» A DOM Tree Alignment Model for Mining Parallel Data from the Web

» Exploiting Parallel Texts for Word Sense Disambiguation: An Empirical Study

» Movie Review Mining a Comparison between Supervised and Unsupervised Classification Approa...

» Growing parallel paths for entity-page discovery

Post Info

Added	13 May 2011
Updated	13 May 2011
Type	Journal
Year	2010
Where	COLING
Authors	Gum-Won Hong, Chi-Ho Li, Ming Zhou, Hae-Chang Rim
