Search Sciweavers | Sciweavers

6 search results - page 1 / 2

EMNLP
2008

Improved Sentence Alignment on Parallel Web Pages Using a Stochastic Tree Alignment Model

15 years 8 months ago

Download www.aclweb.org

Parallel web pages are an important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....

Lei Shi, Ming Zhou

ACL
2006

A DOM Tree Alignment Model for Mining Parallel Data from the Web

15 years 8 months ago

Download research.microsoft.com

This paper presents a new web mining scheme for parallel data acquisition. Based on the Document Object Model (DOM), a web page is represented as a DOM tree. Then a DOM tree align...

Lei Shi, Cheng Niu, Ming Zhou, Jianfeng Gao

NAACL
2010

Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment

15 years 4 months ago

Download research.microsoft.com

The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...

Jason R. Smith, Chris Quirk, Kristina Toutanova

ACL
2009

Mining Bilingual Data from the Web with Adaptively Learnt Patterns

15 years 4 months ago

Download www.aclweb.org

Mining bilingual data (including bilingual sentences and terms) from the Web can benefit many NLP applications, such as machine translation and cross language information retrie...

Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu, ...

COLING
2010

An Empirical Study on Web Mining of Parallel Data

15 years 1 month ago

Download www.aclweb.org

This paper presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...

Gum-Won Hong, Chi-Ho Li, Ming Zhou, Hae-Chang Rim
